04 Aug, 2016

1 commit

  • When searching for a suitable node that should be used for inserting a new
    register, which does not fall within the range of any existing node, we not
    only look for nodes which are directly adjacent to the new register, but
    also for nodes within a certain proximity. This is done to avoid creating
    lots of small nodes with just a few registers' spacing in between, which
    would increase memory usage as well as tree traversal time.

    This means there might be multiple node candidates which fall within the
    proximity range of the new register. If we choose the first node we
    encounter, under certain register insertion patterns it is possible to end
    up with overlapping ranges. This will break order in the rbtree and can
    cause the cached register value to become corrupted.

    E.g. take the simplified example where the proximity range is 2 and the
    register insertion sequence is 1, 4, 2, 3, 5.
    * Insert of register 1 creates a new node; this is the root of the rbtree.
    * Insert of register 4 creates a new node, which is inserted to the right
      of the root.
    * Insert of register 2 gets added to the first node.
    * Insert of register 3 gets added to the first node.
    * Insert of register 5 also gets added to the first node, since this is
      the first node encountered and it is within the proximity range.
    Now there are two overlapping nodes: the first covers registers 1-5, the
    second covers register 4.

    To avoid this always choose the node that is closest to the new register.
    This will ensure that nodes will not overlap. The tree traversal is still
    done as a binary search, we just don't stop at the first node found. So the
    complexity of the algorithm stays within the same order.

    Ideally if a new register is in the range of two adjacent blocks those
    blocks should be merged, but that is a much more invasive change and left
    for later.

    The issue was initially introduced in commit 472fdec7380c ("regmap: rbtree:
    Reduce number of nodes, take 2"), but became much more exposed by commit
    6399aea629b0 ("regmap: rbtree: When adding a reg do a bsearch for target
    node") which changed the order in which nodes are looked-up.

    Fixes: 6399aea629b0 ("regmap: rbtree: When adding a reg do a bsearch for target node")
    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Mark Brown

    Lars-Peter Clausen
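
    A minimal sketch of the closest-node selection described in this entry,
    written as plain C over a sorted array standing in for the rbtree; the
    names (reg_node, node_distance, find_best_node) are invented for
    illustration and are not the actual regcache-rbtree code:

        #include <stddef.h>

        struct reg_node {
                unsigned int base_reg;  /* first register covered by the node */
                unsigned int blklen;    /* number of registers covered */
        };

        /* distance from reg to the node's range; 0 if reg falls inside it */
        static unsigned int node_distance(const struct reg_node *n,
                                          unsigned int reg)
        {
                if (reg < n->base_reg)
                        return n->base_reg - reg;
                if (reg > n->base_reg + n->blklen - 1)
                        return reg - (n->base_reg + n->blklen - 1);
                return 0;
        }

        /*
         * Binary search over sorted, non-overlapping nodes: keep the closest
         * candidate within 'prox' instead of stopping at the first hit.
         */
        static struct reg_node *find_best_node(struct reg_node *nodes,
                                               size_t count, unsigned int reg,
                                               unsigned int prox)
        {
                struct reg_node *best = NULL;
                unsigned int best_dist = prox + 1;
                size_t lo = 0, hi = count;

                while (lo < hi) {
                        size_t mid = lo + (hi - lo) / 2;
                        unsigned int dist = node_distance(&nodes[mid], reg);

                        if (dist < best_dist) {
                                best = &nodes[mid];
                                best_dist = dist;
                        }
                        if (dist == 0)
                                break;          /* reg is inside this node */
                        if (reg < nodes[mid].base_reg)
                                hi = mid;       /* keep searching to the left */
                        else
                                lo = mid + 1;   /* keep searching to the right */
                }
                return best;
        }

    With the insertion sequence 1, 4, 2, 3, 5 from the example, the look-up for
    register 5 now picks the node starting at 4 (distance 1) rather than the
    node holding 1-3, so the ranges never overlap.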
     

06 Jan, 2016

1 commit


20 Nov, 2015

2 commits


16 Nov, 2015

1 commit

  • A binary search is much more efficient than iterating
    over the rbtree in ascending order, which is what the
    current code does.

    During initialisation the reg defaults are written to the
    cache in a large chunk and these are always sorted in the
    ascending order so for this situation ideally we should have
    iterated the rbtree in descending order.

    But at runtime the drivers may write into the cache in any
    random order, so this patch uses a bsearch to give optimal
    runtime performance. At initialisation time, when the reg
    defaults are written, a binary search still performs much
    better than iterating in ascending order, which is what the
    current code was doing.

    Signed-off-by: Nikesh Oswal
    Signed-off-by: Mark Brown

    Nikesh Oswal
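
    A hedged sketch of the look-up this patch switches to, as plain C with a
    simplified node layout (base_reg, top_reg, left and right are illustrative
    fields, not the real struct regcache_rbtree_node): descend the tree and
    compare the register against each node's range, giving O(log n) behaviour
    instead of an ascending walk.

        struct cache_node {
                unsigned int base_reg;            /* lowest register in the block */
                unsigned int top_reg;             /* highest register in the block */
                struct cache_node *left, *right;
        };

        static struct cache_node *lookup_node(struct cache_node *root,
                                              unsigned int reg)
        {
                struct cache_node *node = root;

                while (node) {
                        if (reg < node->base_reg)
                                node = node->left;   /* register is below this block */
                        else if (reg > node->top_reg)
                                node = node->right;  /* register is above this block */
                        else
                                return node;         /* register falls inside this block */
                }
                return NULL;  /* no existing node covers this register */
        }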
     

29 Jul, 2015

1 commit

  • When inserting a new register into a block, the present bitmap size is
    increased using krealloc. krealloc does not clear the additionally
    allocated memory, leaving it filled with random values. Result is that
    some registers are considered cached even though this is not the case.

    Fix the problem by clearing the additionally allocated memory. Also, if
    the bitmap size does not increase, do not reallocate the bitmap at all
    to reduce overhead.

    Fixes: 3f4ff561bc88 ("regmap: rbtree: Make cache_present bitmap per node")
    Signed-off-by: Guenter Roeck
    Signed-off-by: Mark Brown
    Cc: stable@vger.kernel.org

    Guenter Roeck
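
    A sketch of the fix using plain realloc() and memset() in place of
    krealloc(); grow_present_bitmap and its arguments are made-up names, but
    the two points are the same ones the commit makes: skip the reallocation
    when the bitmap does not actually grow, and zero the newly added bytes
    because the allocator leaves them uninitialised.

        #include <stdlib.h>
        #include <string.h>

        static int grow_present_bitmap(unsigned long **bitmap, size_t *cur_bytes,
                                       size_t new_bytes)
        {
                unsigned long *tmp;

                if (new_bytes <= *cur_bytes)
                        return 0;       /* nothing to do, keep the old bitmap */

                tmp = realloc(*bitmap, new_bytes);
                if (!tmp)
                        return -1;

                /* realloc leaves the added region uninitialised; clear it */
                memset((char *)tmp + *cur_bytes, 0, new_bytes - *cur_bytes);

                *bitmap = tmp;
                *cur_bytes = new_bytes;
                return 0;
        }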
     

08 Mar, 2015

1 commit

  • When inserting a new register into a block at the lower end the present
    bitmap is currently shifted in the wrong direction. The effect of this is
    that the bitmap becomes corrupted and registers which are present might be
    reported as not present and vice versa.

    Fix this by shifting left rather than right.

    Fixes: 472fdec7380c ("regmap: rbtree: Reduce number of nodes, take 2")
    Reported-by: Daniel Baluta
    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Mark Brown
    Cc: stable@vger.kernel.org

    Lars-Peter Clausen
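
    A toy model of the shift direction, assuming bit i of the bitmap describes
    register base_reg + i and using a single word in place of the real
    multi-word bitmap (the helper name is invented): prepending registers at
    the lower end of a block must move every existing present bit to a higher
    position.

        #include <assert.h>

        static unsigned long shift_present_bits(unsigned long present,
                                                unsigned int inserted_at_front)
        {
                return present << inserted_at_front;
        }

        int main(void)
        {
                /* block covers regs 10..13; regs 10 and 12 cached: bits 0 and 2 */
                unsigned long present = 0x5;

                /* two registers (8 and 9) are inserted before the block */
                present = shift_present_bits(present, 2);

                /* regs 10 and 12 are now described by bits 2 and 4 */
                assert(present == 0x14);
                return 0;
        }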
     

20 Oct, 2014

1 commit

  • If the include headers aren't sorted alphabetically, then the
    logical choice is to append new ones; however, that creates a
    lot of potential for conflicts or duplicates because every change
    will then add new includes in the same location.

    Signed-off-by: Xiubo Li
    Signed-off-by: Mark Brown

    Xiubo Li
     

26 Aug, 2014

1 commit

  • Commit 6cfec04bcc05 ("regmap: Separate regmap dev initialization") moved the
    regmap debugfs initialization after regcache initialization. This means
    that the regmap debugfs directory is not created yet when the cache
    initialization runs and so any debugfs files registered by the regcache are
    created in the debugfs root directory rather than the debugfs directory of
    the regmap instance. Fix this by adding a separate callback for the
    regcache debugfs initialization which will be called after the parent
    debugfs entry has been created.

    Fixes: 6cfec04bcc05 ("regmap: Separate regmap dev initialization")
    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Mark Brown
    Cc: stable@vger.kernel.org

    Lars-Peter Clausen
     

14 Apr, 2014

1 commit


04 Sep, 2013

1 commit

  • Pull regmap updates from Mark Brown:
    "A quiet release for regmap, some cleanups, fixes and:

    - Improved node coalescing for rbtree, reducing memory usage and
    improving performance during syncs.
    - Support for registering multiple register patches.
    - A quirk for handling interrupts that need to be clear when masked
    in regmap-irq"

    * tag 'regmap-v3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
    regmap: rbtree: Make cache_present bitmap per node
    regmap: rbtree: Reduce number of nodes, take 2
    regmap: rbtree: Simplify adjacent node look-up
    regmap: debugfs: Fix continued read from registers file
    regcache-rbtree: Fix reg_stride != 1
    regmap: Allow multiple patches to be registered
    regmap: regcache: allow read-only regs to be cached
    regmap: fix regcache_reg_present() for empty cache
    regmap: core: allow a virtual range to cover its own data window
    regmap: irq: document mask/wake_invert flags
    regmap: irq: make flags bool and put them in a bitfield
    regmap: irq: Allow to acknowledge masked interrupts during initialization
    regmap: Provide __acquires/__releases annotations

    Linus Torvalds
     

29 Aug, 2013

3 commits

  • With devices which have a dense and small register map placed at a large
    offset, the global cache_present bitmap imposes a huge memory overhead. Making
    the cache_present per rbtree node avoids the issue and easily reduces the memory
    footprint by a factor of ten. For devices with a more sparse map or without a
    large base register offset the memory usage might increase slightly by a few
    bytes, but not significantly. E.g. for a device which has ~50 registers at
    offset 0x4000 the memory footprint of the register cache goes down from 2496
    bytes to 175 bytes.

    Moving the bitmap to a per node basis means that the handling of the bitmap is
    now cache implementation specific and can no longer be managed by the core. The
    regcache_sync_block() function is extended by an additional parameter so that the
    cache implementation can tell the core which registers in the block are set and
    which are not. The parameter is optional and if NULL the core assumes that all
    registers are set. The rbtree cache also needs to implement its own drop
    callback instead of relying on the core to handle this.

    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Mark Brown

    Lars-Peter Clausen
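
    A sketch of the idea behind the extra parameter, with invented names
    (sync_block, reg_is_present) rather than the real regcache_sync_block()
    signature: a NULL bitmap means every register in the block is set, while a
    per-node bitmap lets the sync loop skip the holes.

        #include <stdbool.h>
        #include <stddef.h>

        #define BITS_PER_WORD (8 * sizeof(unsigned long))

        static bool reg_is_present(const unsigned long *present, unsigned int idx)
        {
                if (!present)
                        return true;    /* no bitmap: treat all registers as set */
                return (present[idx / BITS_PER_WORD] >> (idx % BITS_PER_WORD)) & 1UL;
        }

        static void sync_block(const unsigned int *cache, unsigned int base_reg,
                               unsigned int blklen, const unsigned long *present,
                               void (*write_reg)(unsigned int reg, unsigned int val))
        {
                for (unsigned int i = 0; i < blklen; i++) {
                        if (!reg_is_present(present, i))
                                continue;       /* hole in the block, skip it */
                        write_reg(base_reg + i, cache[i]);
                }
        }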
     
  • Support for reducing the number of nodes and memory consumption of the rbtree
    cache by allowing for small unused holes in the node's register cache block was
    initially added in commit 0c7ed856 ("regmap: Cut down on the average # of nodes
    in the rbtree cache"). But the commit had problems and so its effect was
    reverted again in commit 4e67fb5 ("regmap: rbtree: Fix overlapping rbnodes.").
    This patch brings back the feature of reducing the average number of nodes,
    which will speed up node look-up, while at the same time also reducing the memory
    usage of the rbtree cache. This patch takes a slightly different approach than
    the original patch though. It modifies the adjacent node look-up to not only
    consider nodes that are just one to the left or the right of the register but
    any node that falls in a certain range around the register. The range is
    calculated based on how much memory it would take to allocate a new node
    compared to how much memory it takes adding a set of unused registers to an
    existing node. E.g. if a node takes up 24 bytes and each register in a block
    uses 1 byte the range will be from the register address - 24 to the register
    address + 24. If we find a node that falls within this range it is cheaper or as
    expensive to add the register to the existing node and have a couple of unused
    registers in the node's cache compared to allocating a new node.

    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Mark Brown

    Lars-Peter Clausen
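
    The trade-off as a few lines of arithmetic, using the numbers from the
    example above (24 bytes of node overhead, 1 byte per cached register); the
    variable names are illustrative only.

        #include <stdio.h>

        int main(void)
        {
                unsigned int node_overhead = 24;    /* e.g. sizeof(*rbnode) */
                unsigned int cache_word_size = 1;   /* bytes per cached register */
                unsigned int reg = 100;             /* register being inserted */

                /* padding this many registers costs no more than a new node */
                unsigned int dist = node_overhead / cache_word_size;

                printf("consider existing nodes covering %u..%u\n",
                       reg > dist ? reg - dist : 0, reg + dist);
                return 0;
        }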
     
  • A register which is adjacent to a node will either be to the left of the first
    register or to the right of the last register. It will not be within the node's range,
    so there is no point in checking for each register cached by the node whether
    the new register is next to it. It is sufficient to check whether the register
    comes before the first register or after the last register of the node.

    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Mark Brown

    Lars-Peter Clausen
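
    The simplified test as a stand-alone helper, assuming a register stride of
    1 and invented names: only the two ends of the block need to be compared
    against the new register.

        #include <stdbool.h>

        static bool reg_adjacent_to_block(unsigned int reg, unsigned int base_reg,
                                          unsigned int blklen)
        {
                unsigned int last_reg = base_reg + blklen - 1;

                /* just left of the first or just right of the last register */
                return (base_reg > 0 && reg == base_reg - 1) || reg == last_reg + 1;
        }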
     

27 Aug, 2013

1 commit

  • There are a couple of calculations, which convert between register addresses and
    block indices, in regcache_rbtree_sync() and regcache_rbtree_node_alloc() which
    assume that reg_stride is 1. This will break the rb cache for configurations
    which do not use a reg_stride of 1.

    Also rename 'base' in regcache_rbtree_sync() to 'start' to avoid confusion with
    'base_reg'.

    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Mark Brown

    Lars-Peter Clausen
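
    A sketch of the two conversions the fix makes stride-aware; these are
    stand-alone helpers with invented names, not the kernel functions, shown
    with reg_stride passed explicitly.

        /* position of a register inside its block, e.g. reg 0x18 in a block
         * starting at 0x10 with a stride of 4 -> index 2 */
        static unsigned int reg_to_index(unsigned int reg, unsigned int base_reg,
                                         unsigned int reg_stride)
        {
                return (reg - base_reg) / reg_stride;
        }

        /* and back from a block index to a register address */
        static unsigned int index_to_reg(unsigned int idx, unsigned int base_reg,
                                         unsigned int reg_stride)
        {
                return base_reg + idx * reg_stride;
        }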
     

22 Aug, 2013

1 commit

  • Avoid overlapping register regions by making the initial blklen of a new
    node 1. If a register write occurs to a yet uncached register that is
    lower than but near an existing node's base_reg, a new node is created
    and its blklen is set to an arbitrary value (sizeof(*rbnode)). That may
    cause this node to overlap with another node. Those nodes should be merged,
    but this merge doesn't happen yet, so this patch at least makes the initial
    blklen small enough to avoid hitting the wrong node, which may otherwise
    lead to severe breakage.

    Signed-off-by: David Jander
    Signed-off-by: Mark Brown
    Cc: stable@vger.kernel.org

    David Jander
     

30 Jun, 2013

1 commit


02 Jun, 2013

1 commit


24 May, 2013

1 commit

  • The parameter passed to the regmap lock/unlock callbacks needs to be
    map->lock_arg, regcache passes just map. This works fine in the case that no
    custom locking callbacks are used since in this case map->lock_arg equals map,
    but will break when custom locking callbacks are used. The issue was introduced
    in commit 0d4529c5 ("regmap: make lock/unlock functions customizable") and is
    fixed by this patch.

    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Mark Brown

    Lars-Peter Clausen
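
    A sketch of the distinction, with an invented struct layout: with the
    default locking, lock_arg happens to point back at the map itself, so
    passing the map works by accident; with custom callbacks only lock_arg is
    correct.

        struct regmap_model {
                void (*lock)(void *arg);
                void (*unlock)(void *arg);
                void *lock_arg;         /* what the callbacks actually expect */
        };

        static void cache_operation(struct regmap_model *map)
        {
                map->lock(map->lock_arg);    /* correct with default and custom locking */
                /* ... touch the cache ... */
                map->unlock(map->lock_arg);  /* map->lock(map) only worked when lock_arg == map */
        }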
     

23 May, 2013

1 commit

  • The parameter passed to the regmap lock/unlock callbacks needs to be
    map->lock_arg, regcache passes just map. This works fine in the case that no
    custom locking callbacks are used, since in this case map->lock_arg equals map,
    but will break when custom locking callbacks are used. The issue was introduced
    in commit 0d4529c5 ("regmap: make lock/unlock functions customizable") and is
    fixed by this patch.

    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Mark Brown

    Lars-Peter Clausen
     

12 May, 2013

2 commits


16 Apr, 2013

1 commit


30 Mar, 2013

2 commits

  • The idea of holding blocks of registers in device format is shared between
    at least the rbtree and lzo cache formats, so split out the loop that does
    the sync from the rbtree code so that optimisations on it can be reused.

    Signed-off-by: Mark Brown
    Reviewed-by: Dimitris Papastamos

    Mark Brown
     
  • The idea of maintaining a bitmap of present registers is something that
    can usefully be used by other cache types that maintain blocks of cached
    registers so move the code out of the rbtree cache and into the generic
    regcache code.

    Refactor the interface slightly as we go to wrap the set bit and enlarge
    bitmap operations (since we never do one without the other) and make it
    more robust for reads of uncached registers by bounds checking before we
    look at the bitmap.

    Signed-off-by: Mark Brown
    Reviewed-by: Dimitris Papastamos

    Mark Brown
     

27 Mar, 2013

2 commits

  • This will bring no meaningful benefit by itself, it is done as a separate
    commit to aid bisection if there are problems with the following commits
    adding support for coalescing adjacent writes.

    Signed-off-by: Mark Brown

    Mark Brown
     
  • This patch aims to bring down the average number of nodes
    in the rbtree cache and increase the average number of registers
    per node. This should improve general lookup and traversal times.
    This is achieved by setting the minimum size of a block within the
    rbnode to the size of the rbnode itself. This will essentially
    cache possibly non-existent registers so to combat this scenario,
    we keep a separate bitmap in memory which keeps track of which registers
    exist. The memory overhead of this change is likely in the order of
    ~5-10%, possibly less depending on the register file layout. On my test
    system with a bitmap of ~4300 bits and a relatively sparse register
    layout, the memory requirements for the entire cache did not increase
    (the node count dropped to about 50% of the original, which compensated
    for the extra bitmap).

    A second patch that can be built on top of this can look at the
    ratio `sizeof(*rbnode) / map->cache_word_size' in order to suitably
    adjust the block length of each block.

    Signed-off-by: Dimitris Papastamos
    Signed-off-by: Mark Brown

    Dimitris Papastamos
     

14 Mar, 2013

1 commit

  • The last register block, which falls into the specified range, is not handled
    correctly. The formula which calculates the number of registers which should be
    synced is inverted (and off by one). E.g. if all registers in that block should
    be synced only one is synced, and if only one should be synced all (but one) are
    synced. To calculate the number of registers that need to be synced we need to
    subtract the number of the first register in the block from the max register
    number and add one. This patch updates the code accordingly.

    The issue was introduced in commit ac8d91c ("regmap: Supply ranges to the sync
    operations").

    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Mark Brown
    Cc: stable@vger.kernel.org

    Lars-Peter Clausen
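
    The corrected count as a one-line helper plus a worked example (names
    invented): for a block whose first register in the range is 0x10 and a
    maximum register of 0x1f, 0x1f - 0x10 + 1 = 16 registers must be synced.

        static unsigned int regs_to_sync(unsigned int first_reg_in_block,
                                         unsigned int max_reg)
        {
                return max_reg - first_reg_in_block + 1;
        }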
     

13 Mar, 2013

1 commit


04 Mar, 2013

2 commits


10 Apr, 2012

2 commits

  • regmap_config.reg_stride is introduced. All extant register addresses
    are a multiple of this value. Users of serial-oriented regmap busses will
    typically set this to 1. Users of the MMIO regmap bus will typically set
    this based on the value size of their registers, in bytes, so 4 for a
    32-bit register.

    Throughout the regmap code, actual register addresses are used. Wherever
    the register address is used to index some array of values, the address
    is divided by the stride to determine the index, or vice-versa. Error-
    checking is added to all entry-points for register address data to ensure
    that register addresses actually satisfy the specified stride. The MMIO
    bus ensures that the specified stride is large enough for the register
    size.

    Signed-off-by: Stephen Warren
    Signed-off-by: Mark Brown

    Stephen Warren
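
    A sketch of the two rules in plain C (stand-alone helpers, not the regmap
    entry points themselves): register addresses must be an exact multiple of
    reg_stride, and the stride converts an address into an array index.

        #include <stdbool.h>

        /* reject register addresses that are not a multiple of the stride */
        static bool reg_aligned(unsigned int reg, unsigned int reg_stride)
        {
                return (reg % reg_stride) == 0;
        }

        /* index into a value array: e.g. reg 8 with a stride of 4 -> index 2 */
        static unsigned int reg_index(unsigned int reg, unsigned int reg_stride)
        {
                return reg / reg_stride;
        }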
     
  • Mark Brown
     

08 Apr, 2012

1 commit

  • Pull two more small regmap fixes from Mark Brown:
    - Now that we have users for it that aren't running Android, it turns out
    that regcache_sync_region() is much more useful to drivers if it's
    exported for use by modules. Who knew?
    - Make sure we don't divide by zero when doing debugfs dumps of
    rbtrees, not visible up until now because everything was providing at
    least some cache on startup.

    * tag 'regmap-3.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
    regmap: prevent division by zero in rbtree_show
    regmap: Export regcache_sync_region()

    Linus Torvalds
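
    The shape of the rbtree_show guard, as a hedged sketch rather than the
    exact patch: when the cache holds no nodes yet, skip the division instead
    of computing an average over zero nodes.

        static unsigned int average_registers_per_node(unsigned int total_registers,
                                                       unsigned int nodes)
        {
                return nodes ? total_registers / nodes : 0;
        }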
     

06 Apr, 2012

1 commit


05 Apr, 2012

1 commit


01 Apr, 2012

1 commit

  • The code currently passes the register offset in the current block to
    regcache_lookup_reg. This works fine as long as there is only one block with a
    base register of 0, but in all other cases it will look up the default for a
    wrong register, which can cause unnecessary register writes. This patch fixes
    it by passing the actual register number to regcache_lookup_reg.

    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Mark Brown
    Cc:

    Lars-Peter Clausen
     

25 Mar, 2012

1 commit

  • Pull avoidance patches from Paul Gortmaker:
    "Nearly every subsystem has some kind of header with a proto like:

    void foo(struct device *dev);

    and yet there is no reason for most of these guys to care about the
    sub fields within the device struct. This allows us to significantly
    reduce the scope of headers including headers. For this instance, a
    reduction of about 40% is achieved by replacing the include with the
    simple fact that the device is some kind of a struct.

    Unlike the much larger module.h cleanup, this one is simply two
    commits. One to fix the implicit users, and then one
    to delete the device.h includes from the linux/include/ dir wherever
    possible."

    * tag 'device-for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
    device.h: audit and cleanup users in main include dir
    device.h: cleanup users outside of linux/include (C files)

    Linus Torvalds
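
    The pattern the series relies on, as a two-line sketch: a header that only
    passes struct device pointers around never needs the full definition, so a
    forward declaration can replace the device.h include.

        struct device;                   /* instead of #include <linux/device.h> */

        void foo(struct device *dev);    /* the prototype still compiles */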
     

14 Mar, 2012

1 commit


12 Mar, 2012

1 commit

  • For files that are actively using linux/device.h, make sure
    that they call it out. This will allow us to clean up some
    of the implicit uses of linux/device.h within include/*
    without introducing build regressions.

    Yes, this was created by "cheating" -- i.e. the headers were
    cleaned up, and then the fallout was found and fixed, and then
    the two commits were reordered. This ensures we don't introduce
    build regressions into the git history.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker