30 Sep, 2010

7 commits


29 Sep, 2010

11 commits

  • I have been seeing occasional pauses in transaction throughput up to
    30s long under heavy parallel workloads. The only notable thing was
    that the xfsaild was trying to be active during the pauses, but
    making no progress. It was running exactly 20 times a second (on the
    50ms no-progress backoff), and the number of pushbuf events was
    constant across this time as well. IOWs, the xfsaild appeared to be
    stuck on buffers that it could not push out.

    Further investigation indicated that it was trying to push out inode
    buffers that were pinned and/or locked. The xfsbufd was also getting
    woken at the same frequency (by the xfsaild, no doubt) to push out
    delayed write buffers. The xfsbufd was not making any progress
    because all the buffers in the delwri queue were pinned. This
    scan-and-make-no-progress dance went on in the trace for some
    seconds, before the xfssyncd came along and issued a log force, and
    then things started going again.

    However, I noticed something strange about the log force - there
    were way too many IO's issued. 516 log buffers were written, to be
    exact. That added up to 129MB of log IO, which got me very
    interested because it's almost exactly 25% of the size of the log.
    The delayed logging code is supposed to aggregate the minimum of 25%
    of the log or 8MB worth of changes before flushing. That's what
    really puzzled me - why did a log force write 129MB instead of only
    8MB?

    Essentially what has happened is that no CIL pushes had occurred
    since the previous tail push which cleared out 25% of the log space.
    That caused all the new transactions to block because there wasn't
    log space for them, but they kick the xfsaild to push the tail.
    However, the xfsaild was not making progress because there were
    buffers it could not lock and flush, and the xfsbufd could not flush
    them because they were pinned. As a result, both the xfsaild and the
    xfsbufd could not move the tail of the log forward without the CIL
    first committing.

    The cause of the problem was that the background CIL push, which
    should happen when 8MB of aggregated changes have been committed, is
    being held off by the concurrent transaction commit load. The
    background push does a down_write_trylock() which will fail if there
    is a concurrent transaction commit holding the push lock in read
    mode. With 8 CPUs all doing transactions as fast as they can, there
    were enough concurrent transaction commits to hold off the background
    push until tail-pushing could no longer free log space, and the halt
    would occur.

    It should be noted that there is no reason why it would halt at 25%
    of log space used by a single CIL checkpoint. This bug could
    definitely violate the "no transaction should be larger than half
    the log" requirement and hence result in corruption if the system
    crashed under heavy load. This sort of bug is exactly the reason why
    delayed logging was tagged as experimental....

    The fix is to start blocking background pushes once the threshold
    has been exceeded. Rework the threshold calculations to keep the
    amount of log space a CIL checkpoint can use to below that of the
    AIL push threshold to avoid the problem completely.

    Signed-off-by: Dave Chinner
    Reviewed-by: Alex Elder
    Reviewed-by: Christoph Hellwig

    Dave Chinner
     
  • In max8925_irq_sync_unlock(), irq control bit is set at the same time.
    Zero means enabling irq, and one means disabling irq.

    The original code is:
    irq_chg[0] &= irq_data->enable;

    It should be changed to:
    irq_chg[0] &= ~irq_data->enable;

    Otherwise, the irq control bits are messed up.

    Signed-off-by: Kevin Liu
    Signed-off-by: Haojian Zhuang
    Signed-off-by: Samuel Ortiz

    Kevin Liu
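
The effect of the one-character change above can be checked with plain bit arithmetic. This is a standalone sketch, not the driver code: `unmask_buggy()` and `unmask_fixed()` are hypothetical helpers that operate on a single 8-bit register value, following the convention stated in the message that a 0 bit enables an interrupt and a 1 bit disables it.

```c
#include <stdint.h>

/* Original expression: keeps only the bits being enabled, so the
 * requested interrupts stay disabled (1) and all others become
 * enabled (0). */
static uint8_t unmask_buggy(uint8_t mask, uint8_t enable)
{
    return mask & enable;
}

/* Corrected expression: clears exactly the bits being enabled and
 * leaves every other bit of the mask untouched. */
static uint8_t unmask_fixed(uint8_t mask, uint8_t enable)
{
    return mask & (uint8_t)~enable;
}
```

Starting from a fully disabled register (0xFF) and enabling bit 2 (0x04), the original expression yields 0x04 while the corrected one yields 0xFB.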
     
  • The driver was originally tested with an additional patch which
    made this unneeded but that patch had issues and got lost on the
    way to mainline, causing problems when the errors are reported.

    Signed-off-by: Mark Brown
    Signed-off-by: Samuel Ortiz
    Cc: stable@kernel.org

    Mark Brown
     
  • Linus Torvalds
     
  • When caching is disabled on the MN10300 arch, the sys_cacheflush()
    function is removed by conditional stuff in the makefiles, but is still
    referred to by the syscall table.

    Provide a null version that just returns 0 when caching is disabled (or
    -EINVAL if the arguments are silly).

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
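
A null implementation of the kind described above might look like the following user-space sketch. `cacheflush_stub()` is a hypothetical name, and the check shown (an end address below the start address) is only an assumption about what "silly" arguments means; the message itself does not spell it out.

```c
#include <errno.h>

/* Stand-in for a cache flush call when there is no cache to flush:
 * validate the range, then report success without doing any work. */
static long cacheflush_stub(unsigned long start, unsigned long end)
{
    if (end < start)
        return -EINVAL; /* assumed meaning of "silly" arguments */
    return 0;           /* nothing to flush */
}
```

Keeping the argument check means callers see the same error for a bad range whether or not caching is compiled in.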
     
  • Tssk. Apparently Al hadn't checked commit c52c2ddc1dfa ("alpha: switch
    osf_sigprocmask() to use of sigprocmask()") at all. It doesn't compile.

    Fixed as per suggestions from Michael Cree.

    Reported-by: Michael Cree
    Cc: Al Viro
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
    ahci: fix module refcount breakage introduced by libahci split

    Linus Torvalds
     
  • libata depends on scsi_host_template for module reference counting and
    sht's should be owned by each low level driver. During libahci split,
    the sht was left with libahci.ko leaving the actual low level drivers
    not reference counted. This made ahci and ahci_platform always
    unloadable even while they're being actively used.

    Fix it by defining AHCI_SHT() macro in ahci.h and defining a sht for
    each low level ahci driver.

    stable: only applicable to 2.6.35.

    Signed-off-by: Tejun Heo
    Reported-by: Pedro Francisco
    Tested-by: Michael Tokarev
    Cc: stable@kernel.org
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/staging:
    hwmon (coretemp): Fix build breakage if SMP is undefined

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
    PCI: fix pci_resource_alignment prototype

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
    tcp: Fix >4GB writes on 64-bit.
    net/9p: Mount only matching virtio channels
    de2104x: fix ethtool
    tproxy: check for transparent flag in ip_route_newports
    ipv6: add IPv6 to neighbour table overflow warning
    tcp: fix TSO FACK loss marking in tcp_mark_head_lost
    3c59x: fix regression from patch "Add ethtool WOL support"
    ipv6: add a missing unregister_pernet_subsys call
    s390: use free_netdev(netdev) instead of kfree()
    sgiseeq: use free_netdev(netdev) instead of kfree()
    rionet: use free_netdev(netdev) instead of kfree()
    ibm_newemac: use free_netdev(netdev) instead of kfree()
    smsc911x: Add MODULE_ALIAS()
    net: reset skb queue mapping when rx'ing over tunnel
    br2684: fix scheduling while atomic
    de2104x: fix TP link detection
    de2104x: fix power management
    de2104x: disable autonegotiation on broken hardware
    net: fix a lockdep splat
    e1000e: 82579 do not gate auto config of PHY by hardware during nominal use
    ...

    Linus Torvalds
     

28 Sep, 2010

22 commits