18 Oct, 2011

2 commits

  • There's a lock inversion between the cputimer->lock and rq->lock;
    notably the two callchains involved are:

    update_rlimit_cpu()
    sighand->siglock
    set_process_cpu_timer()
    cpu_timer_sample_group()
    thread_group_cputimer()
    cputimer->lock
    thread_group_cputime()
    task_sched_runtime()
    ->pi_lock
    rq->lock

    scheduler_tick()
    rq->lock
    task_tick_fair()
    update_curr()
    account_group_exec()
    cputimer->lock

    Here the first one enables a CLOCK_PROCESS_CPUTIME_ID timer, and
    the second one keeps it up to date.

    This problem was introduced by e8abccb7193 ("posix-cpu-timers: Cure
    SMP accounting oddities").

    Cure the problem by removing the cputimer->lock and rq->lock nesting.
    This leaves concurrent enablers doing duplicate work, but the time
    wasted should be on the same order as the time otherwise wasted
    spinning on the lock, and the greater-than assignment filter should
    ensure we preserve monotonicity.

    Reported-by: Dave Jones
    Reported-by: Simon Kirby
    Signed-off-by: Peter Zijlstra
    Cc: stable@kernel.org
    Cc: Linus Torvalds
    Cc: Martin Schwidefsky
    Link: http://lkml.kernel.org/r/1318928713.21167.4.camel@twins
    Signed-off-by: Thomas Gleixner

    Peter Zijlstra
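
    The "greater-than assignment filter" mentioned above can be pictured
    with a small userspace sketch. This is not the kernel code: the struct,
    the field types and merge_sample() are hypothetical, and all locking is
    left out. It only shows why duplicate work by concurrent enablers is
    harmless when a stored value is only ever replaced by a larger one.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Hypothetical stand-in for the shared thread-group counters. */
    struct group_sample {
        uint64_t utime;
        uint64_t stime;
        uint64_t sum_exec_runtime;
    };

    /* Store b into *a only if it is larger, so *a never moves backwards. */
    static void update_gt(uint64_t *a, uint64_t b)
    {
        if (b > *a)
            *a = b;
    }

    /*
     * Merge a freshly computed sample into the shared counters.
     * A stale or duplicate sample cannot lower any field.
     */
    static void merge_sample(struct group_sample *shared,
                             const struct group_sample *fresh)
    {
        update_gt(&shared->utime, fresh->utime);
        update_gt(&shared->stime, fresh->stime);
        update_gt(&shared->sum_exec_runtime, fresh->sum_exec_runtime);
    }
    ```

    Two enablers that race may both compute a sample and both call
    merge_sample(); whichever arrives second either raises a field or
    leaves it alone.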
     
  • Linus Torvalds
     

17 Oct, 2011

2 commits

  • The size is always valid, but variable-length arrays generate worse code
    for no good reason (unless the function happens to be inlined and the
    compiler sees the length for the simple constant it is).

    Also, there seems to be some code generation problem on POWER, where
    Henrik Bakken reports that register r28 can get corrupted under some
    subtle circumstances (interrupt happening at the wrong time?). That all
    indicates some seriously broken compiler issues, but since variable
    length arrays are bad regardless, there's little point in trying to
    chase it down.

    "Just don't do that, then".

    Reported-by: Henrik Grindal Bakken
    Cc: Benjamin Herrenschmidt
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
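
    As a generic illustration of the "don't use a variable-length array"
    advice (the commit text does not show the affected code, so the
    function name, the RELEASE_MAX bound and the buffer use below are all
    assumptions), a scratch buffer can be given a fixed compile-time size
    and the runtime length clamped to it:

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical fixed upper bound, including the terminating NUL. */
    #define RELEASE_MAX 65

    /*
     * Copy at most len bytes of src into dst as a NUL-terminated string.
     * The scratch buffer has a constant size instead of "char buf[len]",
     * so the stack frame layout is known at compile time.
     * Returns the number of bytes copied, not counting the NUL.
     */
    static size_t copy_release(char *dst, size_t dst_size,
                               const char *src, size_t len)
    {
        char buf[RELEASE_MAX];
        size_t src_len = strlen(src);
        size_t n = len;

        if (dst_size == 0)
            return 0;
        if (n > sizeof(buf) - 1)
            n = sizeof(buf) - 1;
        if (n > dst_size - 1)
            n = dst_size - 1;
        if (n > src_len)
            n = src_len;

        memcpy(buf, src, n);
        buf[n] = '\0';
        memcpy(dst, buf, n + 1);
        return n;
    }
    ```

    The clamp replaces the implicit promise that the length is "always
    valid", and the compiler no longer has to emit dynamic stack
    adjustment code.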
     
  • * 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm:
    ARM: 7128/1: vic: Don't write to the read-only register VIC_IRQ_STATUS
    ARM: 7122/1: localtimer: add header linux/errno.h explicitly
    ARM: 7117/1: perf: fix HW_CACHE_* events on Cortex-A9
    ARM: 7113/1: mm: Align bank start to MAX_ORDER_NR_PAGES

    Linus Torvalds
     

15 Oct, 2011

4 commits


14 Oct, 2011

8 commits


13 Oct, 2011

4 commits


12 Oct, 2011

3 commits

  • Currently we have a few issues with the way the workqueue code is used to
    implement AIL pushing:

    - it accidentally uses the same workqueue as the syncer action, and thus
    can be prevented from running if there are enough sync actions active
    in the system.
    - it doesn't use the HIGHPRI flag to queue at the head of the queue of
    work items

    At this point I'm not confident enough in getting all the workqueue flags and
    tweaks right to provide a perfectly reliable execution context for AIL
    pushing, which is the most important piece in XFS to make forward progress
    when the log fills.

    Revert to using a kthread per filesystem, which fixes all the above issues
    at the cost of having a task struct and stack around for each mounted
    filesystem. In addition this also gives us much better ways to diagnose
    any issues involving hung AIL pushing and removes a small amount of code.

    Signed-off-by: Christoph Hellwig
    Reported-by: Stefan Priebe
    Tested-by: Stefan Priebe
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     
  • We need to check for pinned buffers even in .iop_pushbuf given that inode
    items flush into the same buffers that may be pinned directly due to operations
    on the unlinked inode list operating directly on buffers. To do this add a
    return value to .iop_pushbuf that tells the AIL push about this and use
    the existing log force mechanisms to unpin it.

    Signed-off-by: Christoph Hellwig
    Reported-by: Stefan Priebe
    Tested-by: Stefan Priebe
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     
  • If an item was locked we should not update xa_last_pushed_lsn and thus skip
    it when restarting the AIL scan as we need to be able to lock and write it
    out as soon as possible. Otherwise heavy lock contention might starve AIL
    pushing too easily, especially given the larger backoff once we moved
    xa_last_pushed_lsn all the way to the target lsn.

    Signed-off-by: Christoph Hellwig
    Reported-by: Stefan Priebe
    Tested-by: Stefan Priebe
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

11 Oct, 2011

7 commits

  • The btrfs file defrag code will loop through the extents and
    force COW on them. But if there is a concurrent truncate in the middle of
    the defrag, it might end up defragging the same range over and over
    again.

    The problem is that writepage won't go through and do anything on pages
    past i_size, so the cow won't happen, so the file will appear to still
    be fragmented. defrag will end up hitting the same extents again and
    again.

    In the worst case, the truncate can actually live lock with the defrag
    because the defrag keeps creating new ordered extents which the truncate
    code keeps waiting on.

    The fix here is to make defrag check for i_size inside the main loop,
    instead of just once before the looping starts.

    Signed-off-by: Chris Mason

    Chris Mason
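
    The difference between checking the size once and checking it inside
    the loop can be shown with a toy userspace model. Nothing here is btrfs
    code: struct fake_file, fake_i_size() and defrag_file() are
    hypothetical, and the "truncate" is simulated by shrinking the size on
    a chosen size check.

    ```c
    #include <assert.h>
    #include <stdint.h>

    /* Toy file whose size shrinks on a chosen size check (0 = never). */
    struct fake_file {
        uint64_t size;           /* current size in bytes */
        unsigned reads;          /* number of size checks so far */
        unsigned truncate_at;    /* size check on which the truncate lands */
        uint64_t truncated_size; /* size after the simulated truncate */
    };

    /* Read the current size, applying the simulated truncate if due. */
    static uint64_t fake_i_size(struct fake_file *f)
    {
        f->reads++;
        if (f->truncate_at && f->reads == f->truncate_at)
            f->size = f->truncated_size;
        return f->size;
    }

    /*
     * Walk the file in chunk-sized steps, re-reading the size on every
     * iteration so the walk stops as soon as the file has been truncated.
     * Returns the number of chunks that were processed.
     */
    static unsigned defrag_file(struct fake_file *f, uint64_t chunk)
    {
        unsigned done = 0;
        uint64_t pos;

        if (chunk == 0)
            return 0;
        for (pos = 0; pos < fake_i_size(f); pos += chunk)
            done++; /* real code would force COW on this range */
        return done;
    }
    ```

    With a 100-byte file, 10-byte chunks and a truncate to 20 bytes landing
    on the third size check, the loop stops after two chunks instead of
    walking ranges that are now past the end of the file.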
     
  • This UML breakage:

    linux-2.6.30.1[3800] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb9c498 ax:ffffffffff600000 si:0 di:606790
    linux-2.6.30.1[3856] vsyscall fault (exploit attempt?) ip:ffffffffff600000 cs:33 sp:7fbfb13168 ax:ffffffffff600000 si:0 di:606790

    Is caused by commit 3ae36655 ("x86-64: Rework vsyscall emulation and add
    vsyscall= parameter") - the vsyscall emulation code is not fully cooked
    yet as UML relies on some rather fragile SIGSEGV semantics.

    Linus suggested in https://lkml.org/lkml/2011/8/9/376 to default
    to vsyscall=native for now; this patch implements that.

    Signed-off-by: Adrian Bunk
    Acked-by: Andrew Lutomirski
    Cc: H. Peter Anvin
    Link: http://lkml.kernel.org/r/20111005214047.GE14406@localhost.pp.htv.fi
    Signed-off-by: Ingo Molnar

    Adrian Bunk
     
  • Follow these steps:

    # mount -o autodefrag /dev/sda7 /mnt
    # dd if=/dev/urandom of=/mnt/tmp bs=200K count=1
    # sync
    # dd if=/dev/urandom of=/mnt/tmp bs=8K count=1 conv=notrunc

    and then it'll go into a loop: writeback -> defrag -> writeback ...

    It's because writeback writes [8K, 200K] and then writes [0, 8K].

    I tried to make writeback know if the pages are dirtied by defrag,
    but the patch was a bit intrusive. Here I simply set writeback_index
    when we defrag a file.

    Signed-off-by: Li Zefan
    Signed-off-by: Chris Mason

    Li Zefan
     
  • Due to the 16 bit access to mscan registers there's too much data copied to
    the zero initialized CAN frame when having an odd number of bytes to copy.
    This patch ensures that only the requested bytes are copied by using an
    8 bit access for the remaining byte.

    Reported-by: Andre Naujoks
    Signed-off-by: Oliver Hartkopp
    Signed-off-by: Wolfgang Grandegger
    Signed-off-by: David S. Miller

    Wolfgang Grandegger
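
    The copy pattern described above (16-bit accesses for full pairs, one
    8-bit access for a trailing odd byte) can be sketched in userspace as
    follows. This is not the driver code: the register block is modelled
    as a plain byte array, copy_frame_data() is a hypothetical name, and
    the memcpy() of two bytes merely stands in for a 16-bit register read.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    /*
     * Copy exactly len bytes from a simulated register block into dst.
     * Full pairs are moved two bytes at a time; if len is odd, the last
     * byte is moved on its own so nothing beyond len is written to dst.
     */
    static void copy_frame_data(uint8_t *dst, const uint8_t *regs, size_t len)
    {
        size_t i;

        for (i = 0; i + 1 < len; i += 2) {
            uint16_t w;

            memcpy(&w, regs + i, sizeof(w)); /* stands in for a 16-bit read */
            memcpy(dst + i, &w, sizeof(w));
        }
        if (len & 1)
            dst[i] = regs[i];                /* 8-bit access for the odd byte */
    }
    ```

    Copying three bytes into a zeroed frame leaves the fourth byte zero,
    whereas rounding the loop up to whole 16-bit words would also have
    copied the neighbouring register byte.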
     
  • ipv6_gro_receive() doesn't update the protocol ops after pulling
    the ext headers. It looks like a typo.

    Signed-off-by: Zheng Yan
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Yan, Zheng
     
  • In some NPAR configurations the FCoE and iSCSI L2 clients
    get the same id;
    in this case the FCoE ring will be non-functional.

    Signed-off-by: Dmitry Kravkov
    Signed-off-by: Eilon Greenstein
    Signed-off-by: David S. Miller

    Dmitry Kravkov
     
  • The doorbell register was being unconditionally swapped. In x86, that
    meant it was being swapped to BE and written to the descriptor and to
    memory, depending on the case of blue frame support or writing to
    doorbell register. On PPC, this meant it was being swapped to LE and
    then swapped back to BE while writing to the register. But in the blue
    frame case, it was being written as LE to the descriptor.

    The fix is not to swap doorbell unconditionally, write it to the
    register as BE and convert it to BE when writing it to the descriptor.

    Signed-off-by: Thadeu Lima de Souza Cascardo
    Reported-by: Richard Hendrickson
    Cc: Eli Cohen
    Cc: Yevgeny Petrilin
    Cc: Benjamin Herrenschmidt
    Signed-off-by: David S. Miller

    Thadeu Lima de Souza Cascardo
     

10 Oct, 2011

6 commits

  • * git://git.samba.org/sfrench/cifs-2.6:
    [CIFS] Fix first time message on mount, ntlmv2 upgrade delayed to 3.2

    Linus Torvalds
     
  • * 'fixes' of git://git.linaro.org/people/arnd/arm-soc:
    ARM: mach-ux500: enable fix for ARM errata 754322
    ARM: OMAP: musb: Remove a redundant omap4430_phy_init call in usb_musb_init
    ARM: OMAP: Fix i2c init for twl4030
    ARM: OMAP4: MMC: fix power and audio issue, decouple USBC1 from MMC1

    Linus Torvalds
     
  • This fixes a compilation error in cpu-tegra.c which was introduced in
    dc8d966bccde ("ARM: convert PCI defines to variables") which removed the
    now obsolete mach/hardware.h from the mach-tegra subtree.

    Signed-off-by: Marc Dietrich
    Signed-off-by: Olof Johansson
    Cc: Sergei Shtylyov
    Signed-off-by: Linus Torvalds

    Marc Dietrich
     
  • * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
    drm/radeon/kms: use hardcoded dig encoder to transmitter mapping for DCE4.1
    drm/radeon/kms: fix dp_detect handling for DP bridge chips
    drm/radeon/kms: retry aux transactions if there are status flags

    Linus Torvalds
     
  • A couple of changes to the Tegra maintainership setup:

    I'm very glad to bring Stephen Warren on board as a maintainer. The
    work he has done so far is excellent, and the fact that he works for
    Nvidia means he has long-term interest in the platform.

    Erik Gilling did an astounding amount of work on getting things up and
    running but has been a silent partner on the maintainership side for a
    while, and is stepping down. Thanks for your contributions so far, Erik.

    Finally, update the git URL since I'll take over running the main repo
    for a while.

    Overall maintainership model isn't changing much at this time: We'll all
    three review patches as appropriate, and one of us will collect the main
    repo (me at this time).

    Signed-off-by: Olof Johansson
    Cc: Erik Gilling
    Acked-by: Colin Cross
    Acked-by: Stephen Warren
    Signed-off-by: Linus Torvalds

    Olof Johansson
     
  • * 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: (29 commits)
    MIPS: Call oops_enter, oops_exit in die
    staging/octeon: Software should check the checksum of no tcp/udp packets
    MIPS: Octeon: Enable C0_UserLocal probing.
    MIPS: No branches in delay slots for huge pages in handle_tlbl
    MIPS: Don't clobber CP0_STATUS value for CONFIG_MIPS_MT_SMTC
    MIPS: Octeon: Select CONFIG_HOLES_IN_ZONE
    MIPS: PM: Use struct syscore_ops instead of sysdevs for PM (v2)
    MIPS: Compat: Use 32-bit wrapper for compat_sys_futex.
    MIPS: Do not use EXTRA_CFLAGS
    MIPS: Alchemy: DB1200: Disable cascade IRQ in handler
    SERIAL: Lantiq: Set timeout in uart_port
    MIPS: Lantiq: Fix setting the PCI bus speed on AR9
    MIPS: Lantiq: Fix external interrupt sources
    MIPS: tlbex: Fix build error in R3000 code.
    MIPS: Alchemy: Include Au1100 in PM code.
    MIPS: Alchemy: Fix typo in MAC0 registration
    MIPS: MSP71xx: Fix build error.
    MIPS: Handle __put_user() sleeping.
    MIPS: Allow forced irq threading
    MIPS: i8259: Mark cascade interrupt non-threaded
    ...

    Linus Torvalds
     

09 Oct, 2011

1 commit


08 Oct, 2011

1 commit


07 Oct, 2011

2 commits