05 Jan, 2011

2 commits

  • /proc/diskstats can display strange output such as the following.

    $ cat /proc/diskstats |grep sda
    8 0 sda 90524 7579 102154 20464 0 0 0 0 0 14096 20089
    8 1 sda1 19085 1352 21841 4209 0 0 0 0 4294967064 15689 4293424691
                                           ~~~~~~~~~~
    8 2 sda2 71252 3624 74891 15950 0 0 0 0 232 23995 1562390
    8 3 sda3 54 487 2188 92 0 0 0 0 0 88 92
    8 4 sda4 4 0 8 0 0 0 0 0 0 0 0
    8 5 sda5 81 2027 2130 138 0 0 0 0 0 87 137

    The cause is incorrect accounting of hd_struct->in_flight when a bio is merged,
    by ELEVATOR_FRONT_MERGE, into a request that belongs to a different partition.

    The detailed root cause is as follows.

    Assume that there are two partitions, sda1 and sda2.

    1. A request for sda2 is in the request_queue. Hence sda1's hd_struct->in_flight
    is 0 and sda2's is 1.

         | hd_struct->in_flight
    ---------------------------
    sda1 |          0
    sda2 |          1
    ---------------------------

    2. A bio that belongs to sda1 is issued and is merged into the request from
    step 1 by ELEVATOR_FRONT_MERGE. The first sector of the request moves from
    the sda2 region to the sda1 region. However, neither partition's
    hd_struct->in_flight is changed.

         | hd_struct->in_flight
    ---------------------------
    sda1 |          0
    sda2 |          1
    ---------------------------

    3. The request finishes and blk_account_io_done() is called. Because the
    partition is looked up again from the request's first sector, sda1's
    hd_struct->in_flight, not sda2's, is decremented.

         | hd_struct->in_flight
    ---------------------------
    sda1 |         -1
    sda2 |          1
    ---------------------------

    The patch fixes the problem by caching the partition lookup
    inside the request structure, hence making sure that the increment
    and decrement will always happen on the same partition struct. This
    also speeds up IO with accounting enabled, since it cuts down on
    the number of lookups we have to do.

    Also add a refcount to struct hd_struct to keep the partition in
    memory as long as users exist. We use kref_test_and_get() to ensure
    we don't add a reference to a partition which is going away.

    Signed-off-by: Jerome Marchand
    Signed-off-by: Yasuaki Ishimatsu
    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe

    Jerome Marchand
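
The accounting scenario in this commit message can be replayed in user space. What follows is a minimal sketch, not the kernel code: `struct part`, `struct req`, `lookup()`, `run_scenario()` and `as_u32()` are hypothetical stand-ins, and the sector ranges are made up. It compares a fresh partition lookup at completion time with a partition cached in the request, and assumes the counter ends up printed as an unsigned 32-bit value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for hd_struct and request; not the kernel types. */
struct part {
    const char *name;
    uint64_t start, end;     /* sector range [start, end) */
    int in_flight;
};

struct req {
    uint64_t sector;         /* first sector of the request */
    struct part *part;       /* partition remembered when accounting started */
};

struct result { int sda1, sda2; };

/* Map a sector to the partition that contains it. */
static struct part *lookup(struct part *tbl, size_t n, uint64_t sector)
{
    for (size_t i = 0; i < n; i++)
        if (sector >= tbl[i].start && sector < tbl[i].end)
            return &tbl[i];
    return NULL;
}

/* Start accounting: look up once and cache the result in the request. */
static void account_start(struct part *tbl, size_t n, struct req *rq)
{
    rq->part = lookup(tbl, n, rq->sector);
    assert(rq->part != NULL);
    rq->part->in_flight++;
}

/* Old behaviour: look the partition up again from the current first sector. */
static void account_done_lookup(struct part *tbl, size_t n, struct req *rq)
{
    struct part *p = lookup(tbl, n, rq->sector);
    assert(p != NULL);
    p->in_flight--;
}

/* Fixed behaviour: decrement the partition that was incremented. */
static void account_done_cached(struct req *rq)
{
    rq->part->in_flight--;
}

/* Replay steps 1 to 3 and report both counters. */
static struct result run_scenario(int use_cached)
{
    struct part tbl[2] = {
        { "sda1", 0, 100, 0 },
        { "sda2", 100, 200, 0 },
    };
    struct req rq = { 150, NULL };   /* step 1: request starts in sda2 */
    struct result res;

    account_start(tbl, 2, &rq);
    rq.sector = 90;                  /* step 2: front merge moves the start into sda1 */
    if (use_cached)
        account_done_cached(&rq);    /* step 3, with the fix */
    else
        account_done_lookup(tbl, 2, &rq); /* step 3, without the fix */

    res.sda1 = tbl[0].in_flight;
    res.sda2 = tbl[1].in_flight;
    return res;
}

/* A negative counter reinterpreted as an unsigned 32-bit value. */
static uint32_t as_u32(int v)
{
    return (uint32_t)v;
}
```

With the fresh lookup, `run_scenario(0)` leaves sda1 at -1 and sda2 at 1; with the cached partition, `run_scenario(1)` leaves both at 0. `as_u32(-232)` yields 4294967064, the sda1 value in the output above, while sda2 shows the matching 232.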
     
  • Add the kref_test_and_get() function, which atomically adds a reference only if
    the refcount is not zero. This prevents adding a reference to an object that is
    already being removed.

    Signed-off-by: Jerome Marchand
    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe

    Jerome Marchand
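
The "add a reference only if the refcount is not zero" semantics can be sketched with C11 atomics. This is a user-space illustration under stated assumptions, not the kernel implementation: `struct ref`, `ref_init()`, `ref_test_and_get()` and `ref_put()` are hypothetical names, and a compare-and-exchange loop stands in for whatever primitive the kernel uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space analogue of a reference counter. */
struct ref {
    atomic_int count;
};

/* A new object starts with one reference held by its creator. */
static void ref_init(struct ref *r)
{
    atomic_init(&r->count, 1);
}

/* Take a reference only if the count is still non-zero. */
static bool ref_test_and_get(struct ref *r)
{
    int old = atomic_load(&r->count);

    while (old != 0) {
        /* On failure, old is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
            return true;
    }
    return false;   /* the object is already on its way out */
}

/* Drop a reference; returns true when the last reference went away. */
static bool ref_put(struct ref *r)
{
    int prev = atomic_fetch_sub(&r->count, 1);

    assert(prev > 0);
    return prev == 1;
}
```

A caller that gets `false` from `ref_test_and_get()` must treat the object as gone and not touch it; a plain unconditional increment could not make that distinction.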
     

03 Jan, 2011

2 commits


21 Dec, 2010

1 commit


20 Dec, 2010

1 commit

  • Commit a8adbe3 forgot to remove the return variable; kill it.

    drivers/block/loop.c: In function 'lo_splice_actor':
    drivers/block/loop.c:398: warning: unused variable 'ret'
    [...]
    fs/nfsd/vfs.c: In function 'nfsd_splice_actor':
    fs/nfsd/vfs.c:848: warning: unused variable 'ret'

    Reported-by: Stephen Rothwell
    Signed-off-by: Jens Axboe

    Jens Axboe
     

17 Dec, 2010

4 commits


13 Dec, 2010

1 commit


01 Dec, 2010

2 commits


28 Nov, 2010

1 commit


16 Nov, 2010

26 commits

  • Signed-off-by: Kiyoshi Ueda
    Signed-off-by: Jun'ichi Nomura
    Signed-off-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Mike Snitzer
     
  • Jens Axboe
     
  • Jens Axboe
     
  • Linus Torvalds
     
  • The addition of CONFIG_SECURITY_DMESG_RESTRICT resulted in a build
    failure when CONFIG_PRINTK=n. This is because the capabilities code
    which used the new option was built even though the variable in question
    didn't exist.

    The patch here fixes this by moving the capabilities checks out of the
    LSM and into the caller. All (known) LSMs should have been calling the
    capabilities hook already so it actually makes the code organization
    better to eliminate the hook altogether.

    Signed-off-by: Eric Paris
    Acked-by: James Morris
    Signed-off-by: Linus Torvalds

    Eric Paris
     
  • …/git/tmlind/linux-omap-2.6

    * 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
    arm: omap1: devices: need to return with a value
    OMAP1: camera.h: add missing include
    omap: dma: Add read-back to DMA interrupt handler to avoid spurious interrupts
    OMAP2: Devkit8000: Fix mmc regulator failure

    Linus Torvalds
     
  • * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
    hwmon: (w83795) Check for BEEP pin availability
    hwmon: (w83795) Clear intrusion alarm immediately
    hwmon: (w83795) Read the intrusion state properly
    hwmon: (w83795) Print the actual temperature channels as sources
    hwmon: (w83795) List all usable temperature sources
    hwmon: (w83795) Expose fan control method
    hwmon: (w83795) Fix fan control mode attributes
    hwmon: (lm95241) Check validity of input values
    hwmon: Change mail address of Hans J. Koch

    Linus Torvalds
     
  • * 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
    i2c: Sanity checks on adapter registration
    i2c: Mark i2c_adapter.id as deprecated
    i2c: Drivers shouldn't include <linux/i2c-id.h>
    i2c: Delete unused adapter IDs
    i2c: Remove obsolete cleanup for clientdata

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
    PCI: sysfs: fix printk warnings
    PCI: fix pci_bus_alloc_resource() hang, prefer positive decode
    PCI: read current power state at enable time
    PCI: fix size checks for mmap() on /proc/bus/pci files
    x86/PCI: coalesce overlapping host bridge windows
    PCI hotplug: ibmphp: Add check to prevent reading beyond mapped area

    Linus Torvalds
     
  • Make sure I2C adapters being registered have the required struct
    fields set. If they don't, problems will happen later.

    Signed-off-by: Jean Delvare

    Jean Delvare
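
A registration-time sanity check of this kind might look like the sketch below. It is an assumption-laden illustration, not the i2c core code: `struct adapter`, `struct algo` and `adapter_check()` are hypothetical, and the choice of a name and an algorithm pointer as the "required fields" is a guess made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the adapter structures. */
struct algo {
    int (*xfer)(void);
};

struct adapter {
    char name[48];
    const struct algo *algo;
};

/* Reject adapters with missing fields before they are registered. */
static int adapter_check(const struct adapter *adap)
{
    assert(adap != NULL);

    if (adap->name[0] == '\0')
        return -EINVAL;      /* no name: fail early instead of later */
    if (adap->algo == NULL)
        return -EINVAL;      /* no algorithm: transfers could never work */
    return 0;
}
```

Failing at registration keeps the error close to the driver that caused it, rather than surfacing as a crash on first use.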
     
  • It's about time to make it clear that i2c_adapter.id is deprecated.
    Hopefully this will remind the last user to move over to a different
    strategy.

    Signed-off-by: Jean Delvare
    Acked-by: Jarod Wilson
    Acked-by: Mauro Carvalho Chehab
    Acked-by: Hans Verkuil

    Jean Delvare
     
  • Drivers don't need to include <linux/i2c-id.h>, especially not when
    they don't use anything that header file provides.

    Signed-off-by: Jean Delvare
    Cc: Michael Hunold
    Acked-by: Mauro Carvalho Chehab

    Jean Delvare
     
  • Delete unused I2C adapter IDs. Special cases are:

    * I2C_HW_B_RIVA was still set in driver rivafb, however no other
      driver is ever looking for this value, so we can safely remove it.
    * I2C_HW_B_HDPVR is used in staging driver lirc_zilog, however no
      adapter ID is ever set to this value, so the code in question never
      runs. As the code additionally expects that I2C_HW_B_HDPVR may not
      be defined, we can delete it now and let the lirc_zilog driver
      maintainer rewrite this piece of code.

    Big thanks to Hans Verkuil for doing all the hard work :)

    Signed-off-by: Jean Delvare
    Acked-by: Jarod Wilson
    Acked-by: Mauro Carvalho Chehab
    Acked-by: Hans Verkuil

    Jean Delvare
     
  • A few new i2c drivers that clear the clientdata pointer on exit have come
    into the kernel. This is now obsolete, so fix it and hope the word will spread.

    Signed-off-by: Wolfram Sang
    Acked-by: Alan Cox
    Acked-by: Guennadi Liakhovetski
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: Jean Delvare

    Wolfram Sang
     
  • Move the logging bits from kernel.h into printk.h so that
    there is a bit more logical separation of the generic from
    the printk logging specific parts.

    Signed-off-by: Joe Perches
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • The fix in commit 6b4e81db2552 ("i8k: Tell gcc that *regs gets
    clobbered"), which added "+m (*regs)" to work around gcc miscompiling
    i8k.c, caused register pressure problems and a build failure.

    Changing the 'asm' statement to 'asm volatile' instead should prevent
    that and works around the gcc bug as well, so we can remove the "+m".

    [ Background on the gcc bug: a memory clobber fails to mark the function
    the asm resides in as non-pure (aka "__attribute__((const))"), so if
    the function does nothing else that triggers the non-pure logic, gcc
    will think that that function has no side effects at all. As a result,
    callers will be mis-compiled.

    Adding the "+m" made gcc see that it's not a pure function, and so
    does "asm volatile". The problem was never really the need to mark
    "*regs" as changed, since the memory clobber did that part - the
    problem was just a bug in the gcc "pure" function analysis - Linus ]

    Signed-off-by: Jim Bos
    Acked-by: Jakub Jelinek
    Cc: Andi Kleen
    Cc: Andreas Schwab
    Signed-off-by: Linus Torvalds

    Jim Bos
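
The combination described here, an asm statement marked volatile with a "memory" clobber and no "+m" operand, can be sketched as follows. This is an illustration only, not the i8k code: `struct regs` and `bump()` are hypothetical, the `incl` instruction is chosen just to have a visible side effect, and a plain C fallback is used on non-x86 targets. `__asm__ __volatile__` is the GCC/Clang spelling of `asm volatile` that also works in strict ISO modes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register block, loosely modelled on a struct passed to asm. */
struct regs {
    unsigned int eax;
    unsigned int ebx;
};

/* Increment regs->eax from inline assembly. */
static int bump(struct regs *regs)
{
    assert(regs != NULL);
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /*
     * volatile: the statement has side effects and must not be dropped.
     * "memory": memory may have changed, so cached values are reloaded.
     * The pointer is only an input; no "+m" (*regs) operand is needed.
     */
    __asm__ __volatile__("incl (%0)" : : "r"(regs) : "memory");
#else
    regs->eax++;             /* portable fallback for other targets */
#endif
    return 0;
}
```

After `bump(&r)` the caller sees the updated `r.eax`, because the compiler may not assume the structure is unchanged across the asm statement.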
     
  • On the W83795ADG, there's a single pin for BEEP and OVT#, so you
    can't have both. Check the configuration and don't create beep
    attributes when the BEEP pin is not available.

    The W83795G has a dedicated BEEP pin so the functionality is always
    available there.

    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • When asked to clear the intrusion alarm, do so immediately. We have to
    invalidate the cache to make sure the new status will be read. But we
    also have to read from the status register once to clear the pending
    alarm, as writing to CLR_CHS surprisingly won't clear it automatically.

    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • We can't read the intrusion state from the real-time alarm registers
    as we do for all other alarm flags, because real-time alarm bits don't
    stick (by definition) and the intrusion state has to stick until
    explicitly cleared (otherwise it has little value.)

    So we have to use the interrupt status register instead, which is read
    from the same address but with a configuration bit flipped in another
    register.

    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • Don't expose raw register values to user-space. Decode and encode
    temperature channels selected as temperature sources as needed.

    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • Temperature sources are not correlated directly with temperature
    channels. A look-up table is required to find out which temperature
    sources can be used depending on which temperature channels (both
    analog and digital) are enabled.

    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • Expose fan control method (DC vs. PWM) using the standard sysfs
    attributes. I've made it read-only as the board should be wired for
    a given mode, the BIOS should have set up the chip for this mode, and
    you shouldn't have to change it. But it would be easy enough to make
    it changeable if someone comes up with a use case.

    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • There were two bugs:
    * Speed cruise mode was improperly reported for all fans but fan1.
    * Fan control method (PWM vs. DC) was mixed with the control mode.
      It will be added back as a separate attribute, as per the standard
      sysfs interface.

    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • This clears the following build-time warnings I was seeing:

    drivers/hwmon/lm95241.c: In function "set_interval":
    drivers/hwmon/lm95241.c:132:15: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
    drivers/hwmon/lm95241.c: In function "set_max2":
    drivers/hwmon/lm95241.c:278:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
    drivers/hwmon/lm95241.c: In function "set_max1":
    drivers/hwmon/lm95241.c:277:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
    drivers/hwmon/lm95241.c: In function "set_min2":
    drivers/hwmon/lm95241.c:249:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
    drivers/hwmon/lm95241.c: In function "set_min1":
    drivers/hwmon/lm95241.c:248:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
    drivers/hwmon/lm95241.c: In function "set_type2":
    drivers/hwmon/lm95241.c:220:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
    drivers/hwmon/lm95241.c: In function "set_type1":
    drivers/hwmon/lm95241.c:219:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result

    This also fixes a small race in set_interval() as a side effect: by
    working with a temporary local variable we prevent data->interval from
    being accessed at a time it contains the interval value in the wrong
    unit.

    Signed-off-by: Jean Delvare
    Cc: Davide Rizzo

    Jean Delvare
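
The two points of this commit, checking the parse result and converting in a local variable before storing, can be sketched in user space. This is a hedged illustration, not the lm95241 driver: `parse_long_strict()`, `set_interval()`, `struct chip_data` and `TICKS_PER_SEC` are hypothetical, and standard `strtol()` stands in for the kernel helper.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define TICKS_PER_SEC 100    /* hypothetical stand-in for the kernel tick rate */

struct chip_data {
    long interval;           /* always stored in ticks, never in milliseconds */
};

/* Parse a whole base-10 string (one trailing newline allowed). */
static int parse_long_strict(const char *s, long *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (errno == ERANGE)
        return -ERANGE;
    if (end == s)
        return -EINVAL;      /* no digits at all */
    if (*end == '\n')
        end++;
    if (*end != '\0')
        return -EINVAL;      /* trailing garbage */
    *out = v;
    return 0;
}

/* Accept an interval in milliseconds and store it in ticks. */
static int set_interval(struct chip_data *data, const char *buf)
{
    long val;
    int err = parse_long_strict(buf, &val);

    if (err)
        return err;          /* the return value is no longer ignored */
    if (val < 0 || val > LONG_MAX / TICKS_PER_SEC)
        return -EINVAL;

    /* Convert in the local first; data->interval never holds milliseconds. */
    val = val * TICKS_PER_SEC / 1000;
    data->interval = val;
    return 0;
}
```

Because the shared field is written once, with the final value, a concurrent reader cannot observe the interval in the wrong unit, and invalid input leaves the previous setting untouched.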
     
  • My old mail address doesn't exist anymore. This changes all occurrences
    to my new address.

    Signed-off-by: Hans J. Koch
    Signed-off-by: Jean Delvare

    Hans J. Koch
     
  • o Allow hierarchical cgroup creation for blkio controller

    o Currently we disallow it as both the io controller policies (throttling
    as well as proportional bandwidth) do not support hierarchical accounting
    and control. But the flip side is that the blkio controller cannot be used with
    libvirt, as libvirt creates a cgroup hierarchy deeper than 1 level.

    //libvirt/qemu/

    o So this patch will allow creation of a cgroup hierarchy, but at the backend
    everything will be treated as flat. So if somebody creates a hierarchy
    like the following:

              root
             /    \
         test1    test2
           |
         test3

    CFQ and throttling will practically treat all groups as being at the same level.

                 pivot
            /    |     \     \
        root  test1  test2  test3

    o Once we have actual support for hierarchical accounting and control,
    we can introduce another cgroup tunable file, "blkio.use_hierarchy",
    which will be 0 by default; if the user wants to enforce hierarchical
    control, it can be set to 1. This way there should not be any
    ABI problems down the line.

    o The only not-so-pretty part is the introduction of the extra file "use_hierarchy"
    down the line. Kame-san had mentioned that hierarchical accounting is
    expensive in the memory controller, hence they keep it off by default. I
    suspect the same will be the case for the IO controller, as for each IO
    completion we shall have to account IO through the hierarchy up to the root.
    If so, then it is probably not a bad idea to introduce this extra
    file so that it will be used only when somebody needs it, and some people
    might enable hierarchy only in part of the hierarchy.

    o This is basically how the memory controller also uses "use_hierarchy", and
    they also allowed creation of hierarchies when actual backend support
    was not available.

    Signed-off-by: Vivek Goyal
    Acked-by: Balbir Singh
    Reviewed-by: Gui Jianfeng
    Reviewed-by: Ciju Rajan K
    Tested-by: Ciju Rajan K
    Signed-off-by: Jens Axboe

    Vivek Goyal
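
The "flat" treatment can be illustrated with a small proportional-share calculation. This is a sketch under stated assumptions, not CFQ or throttling code: `struct group` and `flat_share_permille()` are hypothetical, the weights are made-up numbers, and a simple weight-over-total ratio stands in for the real scheduling policy. The point is only that the parent link plays no part in the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical group record; parent is an index, -1 for the root group. */
struct group {
    const char *name;
    int parent;
    unsigned int weight;
};

/*
 * Flat treatment: parent links are ignored and every group competes as a
 * peer, so a group's share is its weight over the sum of all weights.
 * The result is in parts per thousand.
 */
static unsigned int flat_share_permille(const struct group *g, size_t n, size_t i)
{
    unsigned long total = 0;

    for (size_t k = 0; k < n; k++)
        total += g[k].weight;
    assert(i < n && total > 0);
    return (unsigned int)(1000UL * g[i].weight / total);
}
```

For the hierarchy in the commit message (root, test1, test2, and test3 nested under test1), four equal weights give each group 250 per thousand, test3 included, even though it sits one level deeper.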