22 Sep, 2010

3 commits

  • Inodes of devices such as /dev/zero can get dirty, for example via the
    utime(2) syscall or due to an atime update. The backing device of such
    inodes (zero_bdi, etc.) is however unable to handle dirty inodes, and thus
    __mark_inode_dirty complains. In fact, an inode should rather be dirtied
    against the backing device of the filesystem holding it. This is generally
    a good rule, except for filesystems such as 'bdev' or 'mtd_inodefs'. Inodes
    in these pseudofilesystems are referenced from ordinary filesystem
    inodes and carry the mapping with the real data of the device. Thus for
    these inodes we have to use inode->i_mapping->backing_dev_info, as we have
    done so far. We distinguish these filesystems by checking whether
    sb->s_bdi points to a non-trivial backing device or not.

    Example: Assume we have an ext3 filesystem on /dev/sda1 mounted on /.
    There's a device inode A described by a path "/dev/sdb" on this
    filesystem. This inode will be dirtied against backing device "8:0"
    after this patch. bdev filesystem contains block device inode B coupled
    with our inode A. When someone modifies a page of /dev/sdb, it's B that
    gets dirtied and the dirtying happens against the backing device "8:16".
    Thus both inodes get filed to a correct bdi list.
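
    The selection rule described above can be sketched in userspace C. This
    is only an illustration under stated assumptions, not the patch itself:
    the structures are simplified stand-ins for the kernel ones, and the
    "trivial" flag and the pick_bdi() helper are hypothetical names
    introduced here.

    ```c
    #include <stdio.h>
    #include <stddef.h>

    /* Hypothetical, simplified stand-ins for the kernel structures. */
    struct backing_dev_info { const char *name; int trivial; };
    struct super_block      { struct backing_dev_info *s_bdi; };
    struct address_space    { struct backing_dev_info *backing_dev_info; };
    struct inode {
        struct super_block   *i_sb;
        struct address_space *i_mapping;
    };

    /*
     * Assumption: "trivial" flags a placeholder bdi that cannot hold dirty
     * inodes, as used by pseudofilesystems such as 'bdev'.
     */
    static struct backing_dev_info *pick_bdi(const struct inode *inode)
    {
        struct backing_dev_info *sb_bdi = inode->i_sb->s_bdi;

        /* Pseudofilesystem: fall back to the bdi of the inode's mapping. */
        if (sb_bdi == NULL || sb_bdi->trivial)
            return inode->i_mapping->backing_dev_info;
        /* Ordinary filesystem: dirty against the filesystem's own bdi. */
        return sb_bdi;
    }

    int main(void)
    {
        struct backing_dev_info sda  = { "8:0", 0 };
        struct backing_dev_info sdb  = { "8:16", 0 };
        struct backing_dev_info noop = { "noop", 1 };

        struct super_block ext3_sb = { &sda };
        struct super_block bdev_sb = { &noop };
        struct address_space a_map = { &noop };
        struct address_space b_map = { &sdb };

        struct inode a = { &ext3_sb, &a_map };  /* "/dev/sdb" on ext3 */
        struct inode b = { &bdev_sb, &b_map };  /* block device inode in bdev */

        printf("A -> %s\n", pick_bdi(&a)->name);
        printf("B -> %s\n", pick_bdi(&b)->name);
        return 0;
    }
    ```

    With the setup from the example, inode A resolves to "8:0" and inode B
    to "8:16", matching the bdi lists named in the text.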

    Cc: stable@kernel.org
    Signed-off-by: Jan Kara
    Signed-off-by: Jens Axboe

    Jan Kara
     
  • These devices don't do any writeback, but their device inodes can still
    get dirty. Mark the bdi appropriately so that the bdi code does the right
    thing and files the inodes to the lists of the bdi carrying the device
    inodes.

    Cc: stable@kernel.org
    Signed-off-by: Jan Kara
    Signed-off-by: Jens Axboe

    Jan Kara
     
  • Properly initialize this backing dev info so that writeback code does not
    barf when getting to it e.g. via sb->s_bdi.

    Cc: stable@kernel.org
    Signed-off-by: Jan Kara
    Signed-off-by: Jens Axboe

    Jan Kara
     

21 Sep, 2010

3 commits

  • Mike reported a kernel crash when a USB key hotplug is performed while
    the kernel threads are not in the root cgroup and are running in one of
    the child cgroups of the blkio controller.

    BUG: unable to handle kernel NULL pointer dereference at 0000002c
    IP: [] cfq_get_queue+0x232/0x412
    *pde = 00000000
    Oops: 0000 [#1] PREEMPT
    last sysfs file: /sys/devices/pci0000:00/0000:00:1d.7/usb2/2-1/2-1:1.0/host3/scsi_host/host3/uevent

    [..]
    Pid: 30039, comm: scsi_scan_3 Not tainted 2.6.35.2-fg.roam #1 Volvi2 /Aspire 4315
    EIP: 0060:[] EFLAGS: 00010086 CPU: 0
    EIP is at cfq_get_queue+0x232/0x412
    EAX: f705f9c0 EBX: e977abac ECX: 00000000 EDX: 00000000
    ESI: f00da400 EDI: f00da4ec EBP: e977a800 ESP: dff8fd00
    DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
    Process scsi_scan_3 (pid: 30039, ti=dff8e000 task=f6b6c9a0 task.ti=dff8e000)
    Stack:
    00000000 00000000 00000001 01ff0000 f00da508 00000000 f00da524 f00da540
    e7994940 dd631750 f705f9c0 e977a820 e977ac44 f00da4d0 00000001 f6b6c9a0
    00000010 00008010 0000000b 00000000 00000001 e977a800 dd76fac0 00000246
    Call Trace:
    [] ? cfq_set_request+0x228/0x34c
    [] ? cfq_set_request+0x0/0x34c
    [] ? elv_set_request+0xf/0x1c
    [] ? get_request+0x1ad/0x22f
    [] ? get_request_wait+0x1f/0x11a
    [] ? kvasprintf+0x33/0x3b
    [] ? scsi_execute+0x1d/0x103
    [] ? scsi_execute_req+0x58/0x83
    [] ? scsi_probe_and_add_lun+0x188/0x7c2
    [] ? attribute_container_add_device+0x15/0xfa
    [] ? kobject_get+0xf/0x13
    [] ? get_device+0x10/0x14
    [] ? scsi_alloc_target+0x217/0x24d
    [] ? __scsi_scan_target+0x95/0x480
    [] ? dequeue_entity+0x14/0x1fe
    [] ? update_curr+0x165/0x1ab
    [] ? update_curr+0x165/0x1ab
    [] ? scsi_scan_channel+0x4a/0x76
    [] ? scsi_scan_host_selected+0x77/0xad
    [] ? do_scan_async+0x0/0x11a
    [] ? do_scsi_scan_host+0x51/0x56
    [] ? do_scan_async+0x0/0x11a
    [] ? do_scan_async+0xe/0x11a
    [] ? do_scan_async+0x0/0x11a
    [] ? kthread+0x5e/0x63
    [] ? kthread+0x0/0x63
    [] ? kernel_thread_helper+0x6/0x10
    Code: 44 24 1c 54 83 44 24 18 54 83 fa 03 75 94 8b 06 c7 86 64 02 00 00 01 00 00 00 83 e0 03 09 f0 89 06 8b 44 24 28 8b 90 58 01 00 00 42 2c 85 c0 75 03 8b 42 08 8d 54 24 48 52 8d 4c 24 50 51 68
    EIP: [] cfq_get_queue+0x232/0x412 SS:ESP 0068:dff8fd00
    CR2: 000000000000002c
    ---[ end trace 9a88306573f69b12 ]---

    The problem here is that we don't have bdi->dev information available when
    thread does some IO. Hence when dev_name() tries to access bdi->dev, it
    crashes.

    This problem does not happen if kernel threads are in root group as root
    group is statically allocated at device initialization time and we don't
    hit this piece of code.

    Fix it by delaying the filling of major and minor number information of
    device in blk_group. Initially a blk_group is created with 0 as device
    information and this information is filled later once some more IO comes
    in from same group.
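
    The delayed-fill idea can be sketched in userspace C as follows. This is
    a minimal illustration, not the actual cfq/blkio code: the structures,
    the has_dev_info flag and the helper names are hypothetical. Only the
    packing of major and minor numbers follows the kernel's MKDEV() layout
    (major shifted left by 20 bits).

    ```c
    #include <stdio.h>

    /* Hypothetical, simplified stand-in for the kernel's per-cgroup group. */
    struct blk_group {
        unsigned int dev;   /* packed major:minor, 0 means "not known yet" */
    };

    /* Hypothetical device; has_dev_info is 0 while it is still being set up. */
    struct device_state {
        int has_dev_info;
        unsigned int major, minor;
    };

    static unsigned int pack_dev(unsigned int major, unsigned int minor)
    {
        return (major << 20) | minor;   /* same layout as the kernel's MKDEV() */
    }

    /* A new group starts without any device information. */
    static void group_init(struct blk_group *g)
    {
        g->dev = 0;
    }

    /* Called on every IO from the group; fills dev once the info exists. */
    static void group_note_io(struct blk_group *g, const struct device_state *d)
    {
        if (g->dev == 0 && d->has_dev_info)
            g->dev = pack_dev(d->major, d->minor);
    }

    int main(void)
    {
        struct blk_group grp;
        struct device_state usb = { 0, 8, 16 };  /* hot-plugged, not ready yet */

        group_init(&grp);
        group_note_io(&grp, &usb);               /* early IO: dev stays 0 */
        printf("early: dev=%u\n", grp.dev);

        usb.has_dev_info = 1;                    /* registration finished */
        group_note_io(&grp, &usb);               /* later IO fills it in */
        printf("later: major=%u minor=%u\n", grp.dev >> 20, grp.dev & 0xfffffu);
        return 0;
    }
    ```

    The point of the design is that nothing dereferences the device record
    at group creation time; the lookup only happens once the information is
    known to be present.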

    Reported-by: Mike Kazantsev
    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     
  • This bug was introduced in 7b6d91daee5cac6402186ff224c3af39d79f4a0e
    "block: unify flags for struct bio and struct request"

    Cc: Boaz Harrosh
    Signed-off-by: Benny Halevy
    Signed-off-by: Jens Axboe

    Benny Halevy
     
  • The "h->scatter_list" is allocated inside a for loop. If any of those
    allocations fail, then the rest of the list is uninitialized data. When
    we free it we should start from the top and free backwards so that we
    don't call kfree() on uninitialized pointers.

    Also, if the allocation for "h->scatter_list" fails then we would get an
    Oops here. I should have noticed this when I sent: 4ee69851c "cciss:
    handle allocation failure." but I didn't. Sorry about that.
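
    The unwinding pattern described above can be sketched in userspace C.
    This is an illustration under stated assumptions, not the cciss code: it
    uses malloc()/free() in place of the kernel allocators, and alloc_list(),
    free_list() and the fail_at parameter (which simulates a failed
    allocation) are hypothetical names introduced here.

    ```c
    #include <stdio.h>
    #include <stdlib.h>

    /*
     * Allocate n buffers of the given size. fail_at >= 0 simulates an
     * allocation failure at that index (demo-only assumption).
     */
    static char **alloc_list(int n, size_t size, int fail_at)
    {
        /* malloc() leaves the pointer array uninitialized on purpose. */
        char **list = malloc((size_t)n * sizeof(*list));
        int i;

        if (!list)
            return NULL;            /* nothing to unwind yet */

        for (i = 0; i < n; i++) {
            list[i] = (i == fail_at) ? NULL : malloc(size);
            if (!list[i])
                goto unwind;
        }
        return list;

    unwind:
        /* Free backwards from the last successful slot; never touch slots > i. */
        while (--i >= 0)
            free(list[i]);
        free(list);
        return NULL;
    }

    static void free_list(char **list, int n)
    {
        while (--n >= 0)
            free(list[n]);
        free(list);
    }

    int main(void)
    {
        char **list = alloc_list(4, 64, 2);   /* third allocation "fails" */
        printf("with failure: %s\n", list ? "allocated" : "cleaned up");

        list = alloc_list(4, 64, -1);         /* no simulated failure */
        if (list) {
            puts("without failure: allocated");
            free_list(list, 4);
        }
        return 0;
    }
    ```

    Because the unwind loop starts at the failing index and counts down, it
    only ever passes pointers to free() that were actually returned by a
    successful allocation.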

    Signed-off-by: Dan Carpenter
    Signed-off-by: Jens Axboe

    Dan Carpenter
     

20 Sep, 2010

5 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6:
    alpha: deal with multiple simultaneously pending signals
    alpha: fix a 14 years old bug in sigreturn tracing
    alpha: unb0rk sigsuspend() and rt_sigsuspend()
    alpha: belated ERESTART_RESTARTBLOCK race fix
    alpha: Shift perf event pending work earlier in timer interrupt
    alpha: wire up fanotify and prlimit64 syscalls
    alpha: kill big kernel lock
    alpha: fix build breakage in asm/cacheflush.h
    alpha: remove unnecessary cast from void* in assignment.
    alpha: Use static const char * const where possible

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6:
    ide: Fix ordering of procfs registry.

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
    dca: disable dca on IOAT ver.3.0 multiple-IOH platforms
    netpoll: Disable IRQ around RCU dereference in netpoll_rx
    sctp: Do not reset the packet during sctp_packet_config().
    net/llc: storing negative error codes in unsigned short
    MAINTAINERS: move atlx discussions to netdev
    drivers/net/cxgb3/cxgb3_main.c: prevent reading uninitialized stack memory
    drivers/net/eql.c: prevent reading uninitialized stack memory
    drivers/net/usb/hso.c: prevent reading uninitialized memory
    xfrm: dont assume rcu_read_lock in xfrm_output_one()
    r8169: Handle rxfifo errors on 8168 chips
    3c59x: Remove atomic context inside vortex_{set|get}_wol
    tcp: Prevent overzealous packetization by SWS logic.
    net: RPS needs to depend upon USE_GENERIC_SMP_HELPERS
    phylib: fix PAL state machine restart on resume
    net: use rcu_barrier() in rollback_registered_many
    bonding: correctly process non-linear skbs
    ipv4: enable getsockopt() for IP_NODEFRAG
    ipv4: force_igmp_version ignored when a IGMPv3 query received
    ppp: potential NULL dereference in ppp_mp_explode()
    net/llc: make opt unsigned in llc_ui_setsockopt()
    ...

    Linus Torvalds
     
  • …git/kgene/linux-samsung

    * 's5p-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
    ARM: S3C64XX: Add IORESOURCE_IRQ_HIGHLEVEL flag to dm9000 on mach-real6410
    ARM: S3C64XX: Fix coding style errors on mach-real6410
    ARM: S3C64XX: Prototype SPI devices
    ARM: S3C64XX: Fix dev-spi build
    ARM: SAMSUNG: Fix on s5p_gpio_[get,set]_drvstr
    ARM: SAMSUNG: Fix on drive strength value
    ARM: S5PV210: Add FIMC clocks
    ARM: S5PV210: Reduce the iodesc length of systimer
    ARM: S5PV210: Update I2C-1 Clock Register Property.
    ARM: S5P: Decrease IO Registers memory region size on FIMC
    ARM: S5P: Fix DMA coherent mask for FIMC

    Linus Torvalds
     
  • Coda's REQ_* defines were renamed to avoid clashes with the block layer
    (commit 4aeefdc69f7b: "coda: fixup clash with block layer REQ_*
    defines").

    However one was missed and response messages are no longer matched with
    requests and waiting threads are no longer woken up. This patch fixes
    this.

    Signed-off-by: Jan Harkes
    [ Also fixed up whitespace while at it -Linus ]
    Signed-off-by: Linus Torvalds

    Jan Harkes
     

19 Sep, 2010

10 commits

  • Unlike the other targets, alpha sets _one_ sigframe and
    buggers off until the next syscall/interrupt, even if
    more signals are pending. It leads to quite a few unpleasant
    inconsistencies, starting with SIGSEGV potentially arriving
    not where it should and including e.g. mess with sigsuspend();
    consider two pending signals blocked until sigsuspend()
    unblocks them. We pick the first one; then, if we are hit
    by interrupt while in the handler, we process the second one
    as well. If we are not, and if no syscalls had been made,
    we get out of the first handler and leave the second signal
    pending; normally sigreturn() would've picked it anyway, but
    here it starts with restoring the original mask and voila -
    the second signal is blocked again. On everything else we
    get both delivered consistently.

    It's actually easy to fix; the only thing to watch out for
    is prevention of double syscall restart. Fortunately, the
    idea I've nicked from arm fix by rmk works just fine...

    Testcase demonstrating the behaviour in question; on alpha
    we get one or both flags set (usually one), on everything
    else both are always set.
    #include <signal.h>
    #include <stdio.h>

    int had1, had2;
    void f1(int sig) { had1 = 1; }
    void f2(int sig) { had2 = 1; }

    int main(void)
    {
        sigset_t set1, set2;
        sigemptyset(&set1);
        sigemptyset(&set2);
        sigaddset(&set2, 1);
        sigaddset(&set2, 2);
        signal(1, f1);
        signal(2, f2);
        /* block both signals, raise them, then unblock both at once */
        sigprocmask(SIG_SETMASK, &set2, NULL);
        raise(1);
        raise(2);
        sigsuspend(&set1);
        printf("had1:%d had2:%d\n", had1, had2);
    }

    Tested-by: Michael Cree
    Signed-off-by: Al Viro
    Signed-off-by: Matt Turner

    Al Viro
     
  • The way sigreturn() is implemented on alpha breaks PTRACE_SYSCALL,
    all the way back to 1.3.95, when alpha grew PTRACE_SYSCALL support.

    What happens is direct return to ret_from_syscall, in order to bypass
    mangling of a3 (error indicator) and prevent other mutilations of
    registers (e.g. by syscall restart). That's fine, but... the entire
    TIF_SYSCALL_TRACE codepath is kept separate on alpha and post-syscall
    stopping/notifying the tracer is after the syscall. And the normal
    path we are forcibly switching to doesn't have it.

    So we end up with *one* stop in traced sigreturn() vs. two in other
    syscalls. And yes, strace is visibly broken by that; try to strace
    the following
    #include <signal.h>
    #include <unistd.h>

    void f(int sig) {}

    int main(void)
    {
        signal(SIGHUP, f);
        raise(SIGHUP);
        write(1, "eeeek\n", 6);
    }
    and watch the show. The
    close(1) = 405
    in the end of strace output is coming from return value of write() (6 ==
    __NR_close on alpha) and syscall number of exit_group() (__NR_exit_group ==
    405 there).

    The fix is fairly simple - the only thing we end up missing is the call
    of syscall_trace() and we can tell whether we'd been called from the
    SYSCALL_TRACE path by checking ra value. Since we are setting the
    switch_stack up (that's what sys_sigreturn() does), we have the right
    environment for calling syscall_trace() - just before we call
    undo_switch_stack() and return. Since undo_switch_stack() will overwrite
    s0 anyway, we can use it to store the result of "has it been called from
    SYSCALL_TRACE path?" check. The same thing applies in rt_sigreturn().

    Tested-by: Michael Cree
    Signed-off-by: Al Viro
    Signed-off-by: Matt Turner

    Al Viro
     
  • Old code used to set regs->r0 and regs->r19 to force the right
    return value. Leaving that after switch to ERESTARTNOHAND
    was a Bad Idea(tm), since now that screws the restart - if we
    hit the case when get_signal_to_deliver() returns 0, we will
    step back to syscall insn, with v0 set to EINTR and a3 to 1.
    The latter won't matter, since EINTR is 4, aka __NR_write.

    Testcase:

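    #define _GNU_SOURCE
    #include <signal.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void)
    {
        sigset_t mask;
        sigemptyset(&mask);
        sigaddset(&mask, SIGCONT);
        sigprocmask(SIG_SETMASK, &mask, NULL);
        kill(0, SIGCONT);
        /* on a broken kernel the restarted syscall runs as write(1, ...) */
        syscall(__NR_sigsuspend, 1, "b0rken\n", 7);
    }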

    results on alpha in immediate message to stdout...

    Fix is obvious; moreover, since we don't need regs anymore, we can
    switch to normal prototypes for these guys and lose the wrappers.
    Even better, rt_sigsuspend() is identical to generic version in
    kernel/signal.c now.

    Tested-by: Michael Cree
    Signed-off-by: Al Viro
    Signed-off-by: Matt Turner

    Al Viro
     
  • same thing as had been done on other targets back in 2003 -
    move setting ->restart_block.fn into {rt_,}sigreturn().

    Tested-by: Michael Cree
    Signed-off-by: Al Viro
    Signed-off-by: Matt Turner

    Al Viro
     
  • Pending work from the performance event subsystem is executed in
    the timer interrupt. This patch shifts the call to
    perf_event_do_pending() before the call to update_process_times()
    as the latter may call back into the perf event subsystem and it
    is prudent to have the pending work executed first.

    Signed-off-by: Michael Cree
    Signed-off-by: Matt Turner

    Michael Cree
     
  • The 2.6.36-rc kernel added three new system calls:
    fanotify_init, fanotify_mark, and prlimit64. This
    patch wires them up on Alpha.

    Built and booted on an XP900. Untested beyond that.

    Signed-off-by: Mikael Pettersson
    Signed-off-by: Matt Turner

    Mikael Pettersson
     
  • All uses of the BKL on alpha are totally bogus, nothing
    is really protected by this. Remove the remaining users
    so we don't have to mark alpha as 'depends on BKL'.

    Signed-off-by: Arnd Bergmann
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: linux-alpha@vger.kernel.org
    Signed-off-by: Matt Turner

    Arnd Bergmann
     
  • Alpha SMP flush_icache_user_range() is implemented as an inline
    function inside include/asm/cacheflush.h. It dereferences @current
    but doesn't include linux/sched.h and thus causes build failure if
    linux/sched.h wasn't included previously. Fix it by including the
    needed header file explicitly.

    Signed-off-by: Tejun Heo
    Reported-by: Stephen Rothwell
    Signed-off-by: Matt Turner

    Tejun Heo
     
  • Acked-by: Jan-Benedict Glaw
    Signed-off-by: matt mooney
    Signed-off-by: Matt Turner

    matt mooney
     
  • Acked-by: Richard Henderson
    Signed-off-by: Joe Perches
    Signed-off-by: Matt Turner

    Joe Perches
     

18 Sep, 2010

12 commits

  • Direct Cache Access is not supported on IOAT ver.3.0 multiple-IOH platforms.
    This patch blocks registration of DCA providers when multiple IOHs are
    detected with IOAT ver.3.0.

    Signed-off-by: Maciej Sosnowski
    Signed-off-by: David S. Miller

    Sosnowski, Maciej
     
  • Add IORESOURCE_IRQ_HIGHLEVEL irq flag to dm9000 driver
    platform data in board mach-real6410.

    Signed-off-by: Darius Augulis
    [kgene.kim@samsung.com: minor title fix]
    Signed-off-by: Kukjin Kim

    Darius Augulis
     
  • Fix errors reported by checkpatch.pl script

    Signed-off-by: Darius Augulis
    [kgene.kim@samsung.com: minor title fix]
    Signed-off-by: Kukjin Kim

    Darius Augulis
     
  • Avoids build warnings due to the undeclared non-statics.

    Signed-off-by: Mark Brown
    Signed-off-by: Kukjin Kim

    Mark Brown
     
  • We cannot use rcu_dereference_bh safely in netpoll_rx as we may
    be called with IRQs disabled. We could however simply disable
    IRQs as that too causes BH to be disabled and is safe in either
    case.

    Thanks to John Linville for discovering this bug and providing
    a patch.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • sctp_packet_config() is called when getting the packet ready
    for appending of chunks. The function should not touch the
    current state, since it's possible to ping-pong between two
    transports when sending, and that can result in packet corruption
    followed by an skb overflow crash.

    Reported-by: Thomas Dreibholz
    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
    ALSA: pcm - Fix race with proc files
    ALSA: pcm - Fix unbalanced pm_qos_request
    ALSA: HDA: Enable internal speaker on Dell M101z
    ALSA: patch_nvhdmi.c: Fix supported sample rate list.
    sound: Remove pr_ uses of KERN_
    ALSA: hda - Add quirk for Toshiba C650D using a Conexant CX20585
    ALSA: hda_intel: ALSA HD Audio patch for Intel Patsburg DeviceIDs

    Linus Torvalds
     
  • * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
    hwmon: (lm95241) Replace rate sysfs attribute with update_interval
    hwmon: (adm1031) Replace update_rate sysfs attribute with update_interval
    hwmon: (w83627ehf) Use proper exit sequence
    hwmon: (emc1403) Remove unnecessary hwmon_device_unregister
    hwmon: (f75375s) Do not overwrite values read from registers
    hwmon: (f75375s) Shift control mode to the correct bit position
    hwmon: New subsystem maintainers
    hwmon: (lis3lv02d) Prevent NULL pointer dereference

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
    GFS2: gfs2_logd should be using interruptible waits

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
    firewire: nosy: fix build when CONFIG_FIREWIRE=N
    firewire: ohci: activate cycle timer register quirk on Ricoh chips

    Linus Torvalds
     
  • * 'for-linus' of git://neil.brown.name/md:
    md: fix v1.x metadata update when a disk is missing.
    md: call md_update_sb even for 'external' metadata arrays.

    Linus Torvalds
     
  • If a signal hits us outside of a syscall and another gets delivered
    when we are in sigreturn (e.g. because it had been in sa_mask for
    the first one and got sent to us while we'd been in the first handler),
    we have a chance of returning from the second handler to location one
    insn prior to where we ought to return. If r0 happens to contain -513
    (-ERESTARTNOINTR), sigreturn will get confused into doing restart
    syscall song and dance.

    Incredible joy to debug, since it manifests as random, infrequent and
    very hard to reproduce double execution of instructions in userland
    code...

    The fix is simple - mark it "don't bother with restarts" in wrapper,
    i.e. set r8 to 0 in sys_sigreturn and sys_rt_sigreturn wrappers,
    suppressing the syscall restart handling on return from these guys.
    They can't legitimately return a restart-worthy error anyway.

    Testcase:
    #include <unistd.h>
    #include <signal.h>
    #include <stdlib.h>
    #include <sys/time.h>
    #include <errno.h>

    void f(int n)
    {
        __asm__ __volatile__(
            "ldr r0, [%0]\n"
            "b 1f\n"
            "b 2f\n"
            "1:b .\n"
            "2:\n" : : "r"(&n));
    }

    void handler1(int sig) { }
    void handler2(int sig) { raise(1); }
    void handler3(int sig) { exit(0); }

    int main(void)
    {
        struct sigaction s = {.sa_handler = handler2};
        struct itimerval t1 = { .it_value = {1} };
        struct itimerval t2 = { .it_value = {2} };

        signal(1, handler1);

        sigemptyset(&s.sa_mask);
        sigaddset(&s.sa_mask, 1);
        sigaction(SIGALRM, &s, NULL);

        signal(SIGVTALRM, handler3);

        setitimer(ITIMER_REAL, &t1, NULL);
        setitimer(ITIMER_VIRTUAL, &t2, NULL);

        f(-513); /* -ERESTARTNOINTR */

        write(1, "buggered\n", 9);
        return 1;
    }

    Signed-off-by: Al Viro
    Acked-by: Russell King
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Al Viro
     

17 Sep, 2010

7 commits