24 Mar, 2012

40 commits

  • Add two changes that improve the performance of x86 systems

    1. Replace the main loop with an incrementing counter. This change
    improves the performance of the selftest by about 5-6% on Nehalem CPUs.
    The apparent reason is that the compiler can use the loop index to
    perform an indexed memory access. This is reported to make performance
    worse on PowerPC CPUs.

    2. Replace the rem_len loop with an incrementing counter. This change
    improves the performance of the selftest, which has more than the usual
    number of occurrences, by about 1-2% on x86 CPUs. In actual workloads
    the length is most often a multiple of 4 bytes and this code does not
    get executed as often, if at all. Again, this change is reported to make
    performance worse on PowerPC.
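
The two loop shapes being compared can be sketched as follows. This is a minimal user-space illustration, not the code from lib/crc32.c: the function names fold_ptr() and fold_idx() and the rotate-and-xor body are assumptions chosen only to keep the example short. Both functions compute the same value; only the way the buffer is walked differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pointer-stepping form: the pointer itself carries the loop state. */
static uint32_t fold_ptr(const uint32_t *b, size_t len)
{
	uint32_t acc = 0;

	assert(b != NULL || len == 0);
	for (; len; --len)
		acc = ((acc << 1) | (acc >> 31)) ^ *b++;
	return acc;
}

/*
 * Incrementing-counter form: the base pointer stays fixed and the
 * compiler is free to emit an indexed load for b[i].
 */
static uint32_t fold_idx(const uint32_t *b, size_t len)
{
	uint32_t acc = 0;
	size_t i;

	assert(b != NULL || len == 0);
	for (i = 0; i < len; i++)
		acc = ((acc << 1) | (acc >> 31)) ^ b[i];
	return acc;
}
```

Which form is faster depends on the compiler and the CPU, which is consistent with the opposite results reported for x86 and PowerPC above.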

    [djwong@us.ibm.com: Minor changelog tweaks]
    Signed-off-by: Bob Pearson
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Pearson
     
  • Add slicing-by-8 algorithm to the existing slicing-by-4 algorithm. This
    consists of:

    - extend largest BITS size from 32 to 64
    - extend tables from tab[4][256] to up to tab[8][256]
    - add code for the inner loop
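
A minimal sketch of what a slicing-by-8 inner loop looks like for the little-endian (reflected) CRC-32 polynomial. This is a textbook formulation written for user space, not the kernel's implementation: the names slice8_init(), load_le32() and crc32_le_slice8() are assumptions, and the tables are built at run time here rather than generated at build time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CRC32_POLY_LE 0xEDB88320u	/* reflected CRC-32 polynomial */

static uint32_t tab[8][256];

/* Build tab[0] bit by bit, then derive each further slice from the last. */
static void slice8_init(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t c = i;

		for (int bit = 0; bit < 8; bit++)
			c = (c & 1) ? (c >> 1) ^ CRC32_POLY_LE : c >> 1;
		tab[0][i] = c;
	}
	for (int k = 1; k < 8; k++)
		for (uint32_t i = 0; i < 256; i++)
			tab[k][i] = (tab[k - 1][i] >> 8) ^
				    tab[0][tab[k - 1][i] & 0xff];
}

/* Assemble a 32-bit little-endian word without alignment assumptions. */
static uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static uint32_t crc32_le_slice8(uint32_t crc, const unsigned char *p,
				size_t len)
{
	assert(p != NULL || len == 0);

	/* Main loop: consume 8 bytes per step with 8 table lookups. */
	while (len >= 8) {
		uint32_t lo = load_le32(p) ^ crc;
		uint32_t hi = load_le32(p + 4);

		crc = tab[7][lo & 0xff] ^ tab[6][(lo >> 8) & 0xff] ^
		      tab[5][(lo >> 16) & 0xff] ^ tab[4][lo >> 24] ^
		      tab[3][hi & 0xff] ^ tab[2][(hi >> 8) & 0xff] ^
		      tab[1][(hi >> 16) & 0xff] ^ tab[0][hi >> 24];
		p += 8;
		len -= 8;
	}
	/* Tail: fall back to one byte per step. */
	while (len--)
		crc = tab[0][(crc ^ *p++) & 0xff] ^ (crc >> 8);
	return crc;
}
```

The function applies no initial or final inversion, so a caller who wants the conventional CRC-32 value seeds it with 0xFFFFFFFF and inverts the result.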

    [djwong@us.ibm.com: Minor changelog tweaks]
    Signed-off-by: Bob Pearson
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Pearson
     
  • crc32.c provides a choice of one of several algorithms for computing the
    LSB and MSB versions of the CRC32 checksum based on the parameters
    CRC_LE_BITS and CRC_BE_BITS.

    In the original version the values 1, 2, 4 and 8 respectively selected
    versions of the algorithm that computed the CRC 1, 2, 4 and 32 bits at a
    time.

    This patch series adds a new version that computes the CRC 64 bits at a
    time. To make things easier to understand the parameter has been
    reinterpreted to actually stand for the number of bits processed in each
    step of the algorithm so that the old value 8 has been replaced with the
    value 32.

    This also allows us to add a widely used CRC algorithm, the Sarwate
    algorithm, that computes the CRC 8 bits at a time.
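
For reference, the Sarwate (byte-at-a-time, single 256-entry table) algorithm can be sketched like this for the little-endian polynomial. It is a standalone illustration rather than the kernel code; the names sarwate_init() and sarwate_crc32_le() are made up for the example, and the table is filled at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t sarwate_tab[256];

/* Precompute the CRC of every possible byte value. */
static void sarwate_init(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t c = i;

		for (int bit = 0; bit < 8; bit++)
			c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
		sarwate_tab[i] = c;
	}
}

/* One table lookup and one shift per input byte. */
static uint32_t sarwate_crc32_le(uint32_t crc, const unsigned char *p,
				 size_t len)
{
	assert(p != NULL || len == 0);

	while (len--)
		crc = sarwate_tab[(crc ^ *p++) & 0xff] ^ (crc >> 8);
	return crc;
}
```

As written, the function does no pre- or post-inversion; seeding with 0xFFFFFFFF and inverting the result yields the conventional CRC-32 value.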

    [djwong@us.ibm.com: Minor changelog tweaks]
    Signed-off-by: Bob Pearson
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Pearson
     
  • crc32.c in its original version freely mixed u32, __le32 and __be32 types
    which caused warnings from sparse with __CHECK_ENDIAN__. This patch fixes
    these by forcing the types to u32.

    [djwong@us.ibm.com: Minor changelog tweaks]
    Signed-off-by: Bob Pearson
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Pearson
     
  • Misc cleanup of lib/crc32.c and related files.

    - remove unnecessary header files.

    - straighten out some convoluted ifdef's

    - rewrite some references to 2 dimensional arrays as 1 dimensional
    arrays to make them correct. I.e. replace tab[i] with tab[0][i].

    - a few trivial whitespace changes

    - fix a warning in gen_crc32tables.c caused by a mismatch in the type of
    the pointer passed to output table. Since the table is only used at
    kernel compile time, it is simpler to make the table big enough to hold
    the largest column size used. One cannot make the column size smaller
    in output_table because it has to be used by both the le and be tables
    and they can have different column sizes.

    [djwong@us.ibm.com: Minor changelog tweaks]
    Signed-off-by: Bob Pearson
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Pearson
     
  • Replace the unit test provided in crc32.c, which doesn't have a makefile
    and doesn't compile with current headers, with a simpler self test
    routine that also gives a measure of performance and runs at module init
    time. The self test option can be enabled through a configuration
    option CONFIG_CRC32_SELFTEST.

    The test stresses the pre and post loops and is thus not very realistic
    since actual uses will likely have addresses and lengths that are at
    least 4 byte aligned. However, the main loop is long enough so that the
    performance is dominated by that loop.

    The expected values for crc32_le and crc32_be were generated with the
    original version of crc32.c using CRC_LE_BITS = 8 and CRC_BE_BITS = 8.
    These values were then used to check all the values of the BITS
    parameters in both the original and new versions.

    The performance results show some variability from run to run in spite
    of attempts to both warm the cache and reduce the amount of OS noise by
    limiting interrupts during the test. To get comparable results and to
    analyse options with respect to performance, the best time reported over
    a small sample of runs has been taken.

    [djwong@us.ibm.com: Minor changelog tweaks]
    Signed-off-by: Bob Pearson
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Pearson
     
  • Move a long comment from lib/crc32.c to Documentation/crc32.txt where it
    will more likely get read.

    Edited the resulting document to add an explanation of the slicing-by-n
    algorithm.

    [djwong@us.ibm.com: minor changelog tweaks]
    [akpm@linux-foundation.org: fix typo, per George]
    Signed-off-by: George Spelvin
    Signed-off-by: Bob Pearson
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Pearson
     
  • This patchset (re)uses Bob Pearson's crc32 slice-by-8 code to stamp out
    a software crc32c implementation. It removes the crc32c implementation
    in crypto/ in favor of using the stamped-out one in lib/. There is also
    a change to Kconfig so that the kernel builder can pick an
    implementation best suited for the hardware.

    The motivation for this patchset is that I am working on adding full
    metadata checksumming to ext4. As far as performance impact of adding
    checksumming goes, I see nearly no change with a standard mail server
    ffsb simulation. On a test that involves only file creation and
    deletion and extent tree writes, I see a drop of about 50 percent with
    the current kernel crc32c implementation; this improves to a drop of
    about 20 percent with the enclosed crc32c code.

    For typical workloads this new implementation doesn't help much because
    metadata is usually a small fraction of total IO. However, when we are
    doing IO that is almost all metadata (such as rm -rf'ing a tree), then
    this patch speeds up the operation substantially.

    Incidentally, given that iscsi, sctp, and btrfs also use crc32c, this
    patchset should improve their speed as well. I have not yet quantified
    that, however. This latest submission combines Bob's patches from late
    August 2011 with mine so that they can be one coherent patch set.
    Please excuse my inability to combine some of the patches; I've been
    advised to leave Bob's patches alone and build atop them instead. :/

    Since the last posting, I've also collected some crc32c test results on
    a bunch of different x86/powerpc/sparc platforms. The results can be
    viewed here: http://goo.gl/sgt3i ; the "crc32-kern-le" and "crc32c"
    columns describe the performance of the kernel's current crc32 and
    crc32c software implementations. The "crc32c-by8-le" column shows
    crc32c performance with this patchset applied. I expect crc32
    performance to be roughly the same.

    The two _boost columns at the right side of the spreadsheet show how much
    faster the new implementation is than the old one. As you can see, crc32
    rises substantially, and crc32c experiences a huge increase.

    This patch:

    - remove trailing whitespace from lib/crc32.c
    - remove trailing whitespace from lib/crc32defs.h

    [djwong@us.ibm.com: changelog tweaks]
    Signed-off-by: Bob Pearson
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Pearson
     
  • checkpatch already makes an exception to the 80-column rule for quoted
    strings, and Documentation/CodingStyle recommends not splitting quoted
    strings across lines, because it breaks the ability to grep for the
    string. Rather than just permitting this, actively warn about quoted
    strings split across lines.

    Test case:

    void context(void)
    {
            struct { unsigned magic; const char *strdata; } foo[] = {
                    { 42, "these strings"
                          "do not produce warnings" },
                    { 256, "though perhaps"
                           "they should" },
            };
            pr_err("this string"
                   " should produce a warning\n");
            pr_err("this multi-line string\n"
                   "should not produce a warning\n");
            asm ("this asm\n\t"
                 "should not produce a warning");
    }

    Results of checkpatch on that test case:

    WARNING: quoted string split across lines
    + " should produce a warning\n");

    total: 0 errors, 1 warnings, 15 lines checked

    Signed-off-by: Josh Triplett
    Acked-by: Joe Perches
    Cc: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josh Triplett
     
  • Add blank lines between a few tests, remove an extraneous one.

    Signed-off-by: Joe Perches
    Cc: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Using yield() is generally wrong. Warn on its use.

    Signed-off-by: Joe Perches
    Cc: Andy Whitcroft
    Cc: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Add some more subjective --strict tests.

    Add a test for block comments that start with a blank line followed only
    by a line with just the comment block initiator. Prefer a blank line
    followed by /* comment...

    Add a test for unnecessary spaces after a cast.

    Add a test for symmetric uses of braces in if/else blocks.
    If one branch needs braces, then all branches should use braces.
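
A small illustration of the style these --strict tests favour. The function clamp_level() and the counter demo_clipped are invented for the example; whether checkpatch stays silent on this exact snippet has not been verified here.

```c
#include <assert.h>

static int demo_clipped;	/* hypothetical counter, for illustration only */

/*
 * Block comment whose text starts on the line after the opener.
 * The first branch below needs braces because it holds two statements,
 * so the second branch is braced as well.
 */
static unsigned char clamp_level(int level, int max)
{
	assert(max >= 0 && max <= 255);

	if (level > max) {
		demo_clipped++;
		level = max;
	} else if (level < 0) {
		level = 0;
	}

	/* No space between the cast and its operand. */
	return (unsigned char)level;
}
```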

    Signed-off-by: Joe Perches
    Cc: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Add [] to the type extensions. Fixes false positives on:

    .attrs = (struct attribute *[]) {

    Signed-off-by: Andy Whitcroft
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
     
  • With any very high precedence operator it is not necessary to enforce
    additional parentheses around simple negated expressions. This prevents
    us from requesting further parentheses around the following:

    #define PMEM_IS_FREE(id, index) !(pmem[id].bitmap[index].allocated)

    For now add logical and bitwise not and unary minus.

    Signed-off-by: Andy Whitcroft
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
     
  • Adjacent strings indicate concatenation, therefore look at identifiers
    directly adjacent to literal strings as strings too. This allows us to
    better detect the form below and accept it as a simple constant:

    #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
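
To see why the macro body is a simple string constant, here is a user-space sketch of the concatenation. KBUILD_MODNAME is normally supplied by the kernel build system; the value "demo" below, the demo_info() wrapper and demo_report() are assumptions made only so the example compiles outside the kernel.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: in a kernel build this comes from kbuild, not the source. */
#define KBUILD_MODNAME "demo"

/* Adjacent string literals are concatenated by the compiler. */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/* Hypothetical stand-in for the kernel's pr_info(). */
#define demo_info(fmt, ...) printf(pr_fmt(fmt), __VA_ARGS__)

/* After preprocessing this is the single literal "demo: ready\n". */
static const char demo_banner[] = pr_fmt("ready\n");

static int demo_report(int leds)
{
	assert(leds >= 0);
	return demo_info("%d leds registered\n", leds);
}
```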

    Signed-off-by: Andy Whitcroft
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
     
  • Signed-off-by: Andy Whitcroft
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
     
  • Handle the [ A ... B ] form deeper into a definition, for example:

    static const unsigned char pci_irq_swizzle[2][PCI_MAX_DEVICES] = {
    {0, 0, 0, 0, 0, 0, 0, 27, 27, [9 ... PCI_MAX_DEVICES - 1] = 0 },
    {0, 0, 0, 0, 0, 0, 0, 29, 29, [9 ... PCI_MAX_DEVICES - 1] = 0 },
    };

    Reported-by: Marek Vasut
    Signed-off-by: Andy Whitcroft
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
     
  • Fix checkpatch.pl when both -q and --ignore are given and prevent it from
    printing a

    NOTE: Ignored message types: blah

    message.

    E.g., if I use -q --ignore PREFER_PACKED,PREFER_ALIGNED, I see:

    NOTE: Ignored message types: PREFER_ALIGNED PREFER_PACKED

    It makes no sense to print this when -q is given.

    Signed-off-by: Artem Bityutskiy
    Cc: Andy Whitcroft
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Artem Bityutskiy
     
  • Argument alignment across multiple lines should match the open
    parenthesis.

    Logical continuations should be at the end of the previous line, not the
    start of a new line.

    These are not required by CodingStyle so make the tests active only when
    using --strict.

    Improved by some examples from Bruce Allen.
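
A short example of the layout these two tests ask for. The function in_window() is invented for illustration, and the snippet has not been run through checkpatch here.

```c
#include <assert.h>

/* The continuation argument lines up with the column after the '('. */
static int in_window(long value, long low, long high,
		     long margin)
{
	assert(low <= high);

	/* The && ends the first line instead of starting the second. */
	return value >= low - margin &&
	       value <= high + margin;
}
```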

    Signed-off-by: Joe Perches
    Cc: "Bruce W. Allen"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • It's equivalent to __printf, so prefer __scanf.

    Signed-off-by: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Introduce prio_set_parent() to abstract the operation which is used to
    attach the node to its parent.

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiao Guangrong
     
  • In the current code, the deleted nodes are recorded from first to last.
    Actually, we can directly attach these nodes to the 'node' we will
    insert, as its left child, which makes the code more readable.

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiao Guangrong
     
  • Introduce iter_walk_down()/iter_walk_up() to remove the common code
    between prio_tree_left() and prio_tree_right().

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiao Guangrong
     
  • Remove the code since 'node' has already been initialized at the
    beginning of the function.

    Signed-off-by: Xiao Guangrong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiao Guangrong
     
  • - Generate a 64-bit pattern more efficiently

    memchr_inv needs to generate a 64-bit pattern filled with a target
    character. The operation can be done in a more efficient way.

    - Don't call the slow check_bytes() if the memory area is 64-bit aligned

    memchr_inv compares contiguous 64-bit words with the 64-bit pattern as
    much as possible. The rest of the region is checked by check_bytes(),
    which scans byte by byte. Unfortunately, the first 64-bit word is
    unexpectedly scanned by check_bytes() even if the memory area is aligned
    to a 64-bit boundary.

    Both changes were originally suggested by Eric Dumazet.
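
Both ideas can be sketched in user-space C as below. This is an illustration under stated assumptions, not the kernel's memchr_inv(): the names fill_pattern64(), unaligned_prefix(), scan_bytes() and memchr_inv_sketch() are invented, and scan_bytes() merely plays the role that check_bytes() has in the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replicate one byte into all eight lanes of a 64-bit word. */
static uint64_t fill_pattern64(uint8_t c)
{
	uint64_t v = c;

	v |= v << 8;
	v |= v << 16;
	v |= v << 32;
	return v;
}

/* Bytes to scan before the next 8-byte boundary; 0 when already aligned. */
static size_t unaligned_prefix(uintptr_t addr)
{
	size_t rem = addr % 8;

	return rem ? 8 - rem : 0;
}

/* Slow path: return the first byte that differs from c, or NULL. */
static const unsigned char *scan_bytes(const unsigned char *p, uint8_t c,
				       size_t n)
{
	for (; n; n--, p++)
		if (*p != c)
			return p;
	return NULL;
}

static const void *memchr_inv_sketch(const void *start, uint8_t c,
				     size_t bytes)
{
	const unsigned char *p = start;
	const unsigned char *hit;
	uint64_t pattern = fill_pattern64(c);
	size_t prefix;

	assert(start != NULL);

	/* An aligned start yields prefix == 0, so the slow scan is skipped. */
	prefix = unaligned_prefix((uintptr_t)p);
	if (prefix > bytes)
		prefix = bytes;
	hit = scan_bytes(p, c, prefix);
	if (hit)
		return hit;
	p += prefix;
	bytes -= prefix;

	/* Fast path: compare whole 64-bit words against the pattern. */
	for (; bytes >= 8; p += 8, bytes -= 8) {
		uint64_t word;

		memcpy(&word, p, sizeof(word));
		if (word != pattern)
			return scan_bytes(p, c, 8);
	}
	return scan_bytes(p, c, bytes);
}
```

The function returns NULL when every byte equals c, and otherwise a pointer to the first byte that does not.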

    Signed-off-by: Akinobu Mita
    Suggested-by: Eric Dumazet
    Cc: Brian Norris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • After moving some core functions to led-core.c, led-class.c can be built
    as a module again.

    Signed-off-by: Bryan Wu
    Acked-by: Richard Purdie
    Acked-by: Linus Walleij
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bryan Wu
     
  • Improve the readability by moving the code setting gen_config to one
    place.

    [akpm@linux-foundation.org: fix some patch skew]
    Signed-off-by: Axel Lin
    Cc: Shreshtha Kumar Sahu
    Cc: "Milo(Woogyom) Kim"
    Cc: Richard Purdie
    Acked-by: Linus Walleij
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Axel Lin
     
  • Signed-off-by: Axel Lin
    Cc: Peter Meerwald
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Axel Lin
     
  • Use 'pdata' rather than 'pltfm' in lm3530_init_registers().

    Signed-off-by: Milo(Woogyom) Kim
    Cc: Linus Walleij
    Cc: Richard Purdie
    Cc: Axel Lin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kim, Milo
     
  • LM3530_ALS_ZONE_REG is a read-only register.
    Writing to this register is not necessary.

    Signed-off-by: Milo(Woogyom) Kim
    Cc: Linus Walleij
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kim, Milo
     
  • * add 'struct lm3530_pwm_data' in the platform data
    The pwm data consists of the platform-specific functions which generate
    the pwm. The pwm data is only valid when brightness is in pwm input mode.
    Functions should be implemented by the pwm driver.
    pwm_set_intensity() : set the duty of the pwm.
    pwm_get_intensity() : get the current brightness.

    * brightness control by pwm
    If the control mode is pwm, then brightness is changed by the duty of
    the pwm. So the pwm platform function should be called in
    lm3530_brightness_set().

    * do not update brightness register when pwm input mode
    In pwm input mode, brightness register is not used.
    If any value is updated in this register, then the led will be off.

    * when input mode is changed, set duty of pwm to 0 if unnecessary.
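
The dispatch described in the points above might look roughly like the following user-space sketch. Everything prefixed with demo_ is hypothetical, and the callback signatures are an assumption: the changelog names pwm_set_intensity() and pwm_get_intensity() but does not spell out their parameters.

```c
#include <assert.h>
#include <stddef.h>

enum demo_mode { DEMO_MODE_MANUAL, DEMO_MODE_PWM };

/* Assumed callback signatures; the changelog only names the functions. */
struct demo_pwm_data {
	void (*pwm_set_intensity)(int brightness, int max_brightness);
	int (*pwm_get_intensity)(int max_brightness);
};

struct demo_led {
	enum demo_mode mode;
	int max_brightness;
	int brightness_reg;	/* stand-in for the chip's brightness register */
	struct demo_pwm_data pwm;
};

static int demo_last_duty = -1;

/* Example platform callback that just records the requested duty. */
static void demo_pwm_set(int brightness, int max_brightness)
{
	(void)max_brightness;
	demo_last_duty = brightness;
}

static void demo_brightness_set(struct demo_led *led, int brightness)
{
	assert(led != NULL);

	if (brightness > led->max_brightness)
		brightness = led->max_brightness;

	if (led->mode == DEMO_MODE_PWM) {
		/* PWM input mode: leave the brightness register alone. */
		if (led->pwm.pwm_set_intensity)
			led->pwm.pwm_set_intensity(brightness,
						   led->max_brightness);
		return;
	}
	led->brightness_reg = brightness;
}
```

Keeping the register write out of the PWM branch mirrors the point above that updating the brightness register in pwm input mode turns the LED off.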

    Signed-off-by: Milo(Woogyom) Kim
    Cc: Linus Walleij
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kim, Milo
     
  • To get members of lm3530_data, use 'struct led_classdev' rather than
    'struct i2c_client'.

    [akpm@linux-foundation.org: fix 80-column fixes more nicely]
    Signed-off-by: Milo(Woogyom) Kim
    Cc: Linus Walleij
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kim, Milo
     
  • Only 7 bits are used for updating the brightness. (register address :
    A0h) So the max_brightness property of lm3530 should be set to 127.

    On initializing registers, maximum initial brightness is limited to
    'max_brightness'.

    Division-by-two is removed on updating the brightness. This arithmetic is
    not necessary because the range of brightness is 0 ~ 127.

    Signed-off-by: Milo(Woogyom) Kim
    Cc: Linus Walleij
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kim, Milo
     
  • Direct usage of the asm include has long been deprecated by the
    introduction of gpiolib.

    Signed-off-by: Mark Brown
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mark Brown
     
  • Driver for the PCA9633 I2C chip supporting four LEDs and 255 brightness
    levels.

    [akpm@linux-foundation.org: fix kcalloc() call]
    [axel.lin@gmail.com: fix kcalloc parameters swapped]
    Signed-off-by: Peter Meerwald
    Signed-off-by: Axel Lin
    Cc: Lars-Peter Clausen
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Meerwald
     
  • Saves ~50 bytes text and speeds things up.

    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Fix it by assigning the lp5521_read return value.

    Signed-off-by: srinidhi kasagar
    Cc: Milo(Woogyom) Kim
    Cc: Linus Walleij
    Cc: Arun MURTHY
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Srinidhi KASAGAR
     
  • For better readability, values of the LP5521_REG_ENABLE register were
    redefined. Additional definitions: LP5521_ENABLE_DEFAULT and
    LP5521_ENABLE_RUN_PROGRAM.

    Use the definition rather than a hard-coded value.
    : 0x3F -> 'LP5521_CMD_DIRECT'

    Signed-off-by: Milo(Woogyom) Kim
    Acked-by: Linus Walleij
    Cc: Arun MURTHY
    Cc: Srinidhi Kasagar
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kim, Milo
     
  • The lp5521 has an autonomous operation mode without external control.
    Using lp5521_platform_data, various led patterns can be configured.
    To support this feature, new functions and a device attribute are
    added.

    Structure of lp5521_led_pattern: 3 channels are supported - red, green
    and blue. The pattern(s) of each channel and the number of pattern(s) are
    defined in the platform data. Pattern data are hex codes which
    include pattern commands such as set pwm, wait, ramp up/down, branch
    and so on.

    Pattern mode functions:
    * lp5521_clear_program_memory
    Before running a new led pattern, program memory should be cleared.
    * lp5521_write_program_memory
    Pattern data is written to the program memory via i2c.
    * lp5521_get_pattern
    Get the predefined pattern from the platform data.
    * lp5521_run_led_pattern
    Stop the current pattern or run a new pattern.
    Transition time is required between different operation modes.

    Device attribute - 'led_pattern': To load specific led pattern, new device
    attribute is added.

    When the lp5521 driver is unloaded, stop current led pattern mode.

    Documentation updated : description about how to define the led patterns
    and example.

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Milo(Woogyom) Kim
    Acked-by: Linus Walleij
    Cc: Arun MURTHY
    Cc: Srinidhi Kasagar
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kim, Milo
     
  • The value of CONFIG register(Addr 08h) is configurable. For supporting
    this feature, update_config is added in the platform data. If
    'update_config' is not defined, the default value is 'LP5521_PWRSAVE_EN |
    LP5521_CP_MODE_AUTO | LP5521_R_TO_BATT'.

    To define the CONFIG register in the platform data, the bit definitions
    were moved to the header file.

    Documentation updated : description about 'update_config' and example.

    Signed-off-by: Milo(Woogyom) Kim
    Acked-by: Linus Walleij
    Cc: Arun MURTHY
    Cc: Srinidhi Kasagar
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kim, Milo