18 Feb, 2011

40 commits

  • commit 862f0982eadcea0e114576c57ea426d3d51a69a6 upstream.

    FEC_MMFR_OP_WRITE should be used rather than FEC_MMFR_OP_READ in
    an MDIO write operation.

    It's probably a typo introduced by commit:

    e6b043d512fa8d9a3801bf5d72bfa3b8fc3b3cc8
    netdev/fec.c: add phylib supporting to enable carrier detection (v2)

    Signed-off-by: Shawn Guo
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Shawn Guo
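
    As a rough illustration of why the opcode matters, the sketch below
    builds a Clause 22 MDIO management frame in user-space C. The macro and
    function names are invented for this example and are not copied from
    fec.c; the field layout assumes the standard Clause 22 frame (start,
    opcode, PHY address, register address, turnaround, data).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative Clause 22 MDIO frame fields (all names are hypothetical). */
#define TOY_MDIO_ST        (1u << 30)   /* start of frame: 01 */
#define TOY_MDIO_OP_READ   (2u << 28)   /* opcode 10 = read   */
#define TOY_MDIO_OP_WRITE  (1u << 28)   /* opcode 01 = write  */
#define TOY_MDIO_PA(v)     (((uint32_t)(v) & 0x1f) << 23)
#define TOY_MDIO_RA(v)     (((uint32_t)(v) & 0x1f) << 18)
#define TOY_MDIO_TA        (2u << 16)   /* turnaround: 10     */
#define TOY_MDIO_DATA(v)   ((uint32_t)(v) & 0xffff)

/* A write frame must carry the WRITE opcode and the 16-bit payload. */
static uint32_t toy_mdio_write_frame(unsigned phy, unsigned reg, uint16_t val)
{
    return TOY_MDIO_ST | TOY_MDIO_OP_WRITE | TOY_MDIO_PA(phy) |
           TOY_MDIO_RA(reg) | TOY_MDIO_TA | TOY_MDIO_DATA(val);
}

/* A read frame carries the READ opcode and no payload. */
static uint32_t toy_mdio_read_frame(unsigned phy, unsigned reg)
{
    return TOY_MDIO_ST | TOY_MDIO_OP_READ | TOY_MDIO_PA(phy) |
           TOY_MDIO_RA(reg) | TOY_MDIO_TA;
}
```

    Using the read opcode in the write helper would produce a frame that
    the PHY interprets as a read, so the payload would never be stored.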
     
  • commit 09c9d4c9b6a2b5909ae3c6265e4cd3820b636863 upstream.

    Revert commit 224cb3e981f1b2f9f93dbd49eaef505d17d894c2
    dm: Call blk_abort_queue on failed paths

    Multipath began to use blk_abort_queue() to allow for
    lower latency path deactivation. This was found to
    cause list corruption:

    the cmd gets blk_abort_queued/timedout run on it and the scsi eh
    somehow is able to complete and run scsi_queue_insert while
    scsi_request_fn is still trying to process the request.

    https://www.redhat.com/archives/dm-devel/2010-November/msg00085.html

    Signed-off-by: Mike Snitzer
    Signed-off-by: Alasdair G Kergon
    Cc: Mike Anderson
    Cc: Mike Christie
    Signed-off-by: Greg Kroah-Hartman

    Mike Snitzer
     
  • commit c217649bf2d60ac119afd71d938278cffd55962b upstream.

    No longer needlessly hold md->bdev->bd_inode->i_mutex when changing the
    size of a DM device. This additional locking is unnecessary because
    i_size_write() is already protected by the existing critical section in
    dm_swap_table(). DM already has a reference on md->bdev so the
    associated bd_inode may be changed without lifetime concerns.

    A negative side-effect of having held md->bdev->bd_inode->i_mutex was
    that a concurrent DM device resize and flush (via fsync) would deadlock.
    Dropping md->bdev->bd_inode->i_mutex eliminates this potential for
    deadlock. The following reproducer no longer deadlocks:
    https://www.redhat.com/archives/dm-devel/2009-July/msg00284.html

    Signed-off-by: Mike Snitzer
    Signed-off-by: Mikulas Patocka
    Signed-off-by: Alasdair G Kergon
    Signed-off-by: Greg Kroah-Hartman

    Mike Snitzer
     
  • commit 8d661f1e462d50bd83de87ee628aaf820ce3c66c upstream.

    The buffer size mask is defined in include/linux/ieee80211.h. As per
    the IEEE spec, bit 6 to bit 15 of the block ack parameter represent
    the buffer size, so the bitmask should be 0xFFC0.

    Signed-off-by: Amitkumar Karwar
    Signed-off-by: Bing Zhao
    Reviewed-by: Johannes Berg
    Signed-off-by: John W. Linville
    Signed-off-by: Greg Kroah-Hartman

    Amitkumar Karwar
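
    The bit arithmetic can be checked with a small user-space sketch. The
    macro and function names below are illustrative only and are not the
    kernel's identifiers; the sketch simply assumes that the buffer size
    occupies bits 6 to 15, as the message states.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_BA_BUF_SIZE_MASK   0xFFC0u  /* bits 6..15 set, bits 0..5 clear */
#define TOY_BA_BUF_SIZE_SHIFT  6

/* Extract the 10-bit buffer size field from a block ack parameter word. */
static uint16_t toy_ba_buf_size(uint16_t param)
{
    return (uint16_t)((param & TOY_BA_BUF_SIZE_MASK) >> TOY_BA_BUF_SIZE_SHIFT);
}
```

    A mask with any of bits 6 to 15 cleared would silently drop part of
    the field for some buffer sizes.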
     
  • commit 3f0d3d016d89a5efb8b926d4707eb21fa13f3d27 upstream.

    Some Lenovos have TPMs that require a quirk to function correctly. This can
    be autodetected by checking whether the device has a _HID of INTC0102. This
    is an invalid PNPid, and as such is discarded by the pnp layer - however
    it's still present in the ACPI code, so we can pull it out that way. This
    means that the quirk won't be automatically applied on non-ACPI systems,
    but without ACPI we don't have any way to identify the chip anyway so I
    don't think that's a great concern.

    Signed-off-by: Matthew Garrett
    Acked-by: Rajiv Andrade
    Tested-by: Jiri Kosina
    Tested-by: Andy Isaacson
    Signed-off-by: James Morris
    Signed-off-by: Greg Kroah-Hartman

    Matthew Garrett
     
  • commit 415103f9932d45f7927f4b17e3a9a13834cdb9a1 upstream.

    selinux_inode_init_security computes transition SIDs even for filesystems
    that use mount point labeling. It shouldn't do that. It should just use
    the mount point label always and no matter what.

    This causes 2 problems. 1) it makes file creation slower than it needs to be
    since we calculate the transition sid and 2) it allows files to be created
    with a different label than the mount point!

    # id -Z
    staff_u:sysadm_r:sysadm_t:s0-s0:c0.c1023
    # sesearch --type --class file --source sysadm_t --target tmp_t
    Found 1 semantic te rules:
    type_transition sysadm_t tmp_t : file user_tmp_t;

    # mount -o loop,context="system_u:object_r:tmp_t:s0" /tmp/fs /mnt/tmp

    # ls -lZ /mnt/tmp
    drwx------. root root system_u:object_r:tmp_t:s0 lost+found
    # touch /mnt/tmp/file1
    # ls -lZ /mnt/tmp
    -rw-r--r--. root root staff_u:object_r:user_tmp_t:s0 file1
    drwx------. root root system_u:object_r:tmp_t:s0 lost+found

    Whoops, we have a mount point labeled filesystem tmp_t with a user_tmp_t
    labeled file!

    Signed-off-by: Eric Paris
    Reviewed-by: James Morris
    Signed-off-by: Greg Kroah-Hartman

    Eric Paris
     
  • commit 350e4f31e0eaf56dfc3b328d24a11bdf42a41fb8 upstream.

    Commit 2f90b865 added two new netlink message types to the netlink route
    socket. SELinux has hooks to define if netlink messages are allowed to
    be sent or received, but it did not know about these two new message
    types. By default we allow such actions so no one likely noticed. This
    patch adds the proper definitions and thus proper permissions
    enforcement.

    Signed-off-by: Eric Paris
    Cc: James Morris
    Signed-off-by: Greg Kroah-Hartman

    Eric Paris
     
  • commit 3fc5e98d8cf85e0d77fc597b49e9268dff67400e upstream.

    In construct_alloc_key(), up_write() is called in the error path if
    __key_link_begin() fails, but this is incorrect as __key_link_begin() only
    returns with the nominated keyring locked if it returns successfully.

    Without this patch, you might see the following in dmesg:

    =====================================
    [ BUG: bad unlock balance detected! ]
    -------------------------------------
    mount.cifs/5769 is trying to release lock (&key->sem) at:
    [] request_key_and_link+0x263/0x3fc
    but there are no more locks to release!

    other info that might help us debug this:
    3 locks held by mount.cifs/5769:
    #0: (&type->s_umount_key#41/1){+.+.+.}, at: [] sget+0x278/0x3e7
    #1: (&ret_buf->session_mutex){+.+.+.}, at: [] cifs_get_smb_ses+0x35a/0x443 [cifs]
    #2: (root_key_user.cons_lock){+.+.+.}, at: [] request_key_and_link+0x10a/0x3fc

    stack backtrace:
    Pid: 5769, comm: mount.cifs Not tainted 2.6.37-rc6+ #1
    Call Trace:
    [] ? request_key_and_link+0x263/0x3fc
    [] print_unlock_inbalance_bug+0xca/0xd5
    [] lock_release_non_nested+0xc1/0x263
    [] ? request_key_and_link+0x263/0x3fc
    [] ? request_key_and_link+0x263/0x3fc
    [] lock_release+0x17d/0x1a4
    [] up_write+0x23/0x3b
    [] request_key_and_link+0x263/0x3fc
    [] ? cifs_get_spnego_key+0x61/0x21f [cifs]
    [] request_key+0x41/0x74
    [] cifs_get_spnego_key+0x200/0x21f [cifs]
    [] CIFS_SessSetup+0x55d/0x1273 [cifs]
    [] cifs_setup_session+0x90/0x1ae [cifs]
    [] cifs_get_smb_ses+0x37f/0x443 [cifs]
    [] cifs_mount+0x1aa1/0x23f3 [cifs]
    [] ? alloc_debug_processing+0xdb/0x120
    [] ? cifs_get_spnego_key+0x1ef/0x21f [cifs]
    [] cifs_do_mount+0x165/0x2b3 [cifs]
    [] vfs_kern_mount+0xaf/0x1dc
    [] do_kern_mount+0x4d/0xef
    [] do_mount+0x6f4/0x733
    [] sys_mount+0x88/0xc2
    [] system_call_fastpath+0x16/0x1b

    Reported-by: Jeff Layton
    Signed-off-by: David Howells
    Reviewed-and-Tested-by: Jeff Layton
    Signed-off-by: Linus Torvalds
    Cc: James Morris
    Signed-off-by: Greg Kroah-Hartman

    David Howells
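
    The locking contract described above can be modelled in user space.
    This is only a sketch: struct toy_lock and the toy_* functions are
    hypothetical stand-ins, not the keyring code. It shows an error path
    that returns without unlocking because the "begin" step takes the lock
    only when it succeeds.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_lock { int held; };   /* hypothetical stand-in for a rwsem */

/* Takes the lock only on success, mirroring the contract described above. */
static int toy_link_begin(struct toy_lock *l, bool fail)
{
    if (fail)
        return -1;               /* error: the lock was NOT taken */
    l->held++;
    return 0;
}

static void toy_unlock(struct toy_lock *l)
{
    l->held--;
}

static int toy_construct(struct toy_lock *l, bool fail)
{
    int ret = toy_link_begin(l, fail);

    if (ret < 0)
        return ret;              /* no unlock here: nothing was locked */

    /* ... work that needs the lock would go here ... */
    toy_unlock(l);
    return 0;
}
```

    Calling toy_unlock() on the failure path would drive the counter
    negative, which is the user-space analogue of the "bad unlock balance"
    report.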
     
  • commit 9b29050f8f75916f974a2d231ae5d3cd59792296 upstream.

    The current TPM TIS driver in git discards the timeout values returned
    from the TPM. The check of the response packet needs to consider that
    the return_code field is 0 on success and the size of the expected
    packet is equivalent to the header size + u32 length indicator for the
    TPM_GetCapability() result + 3 timeout indicators of type u32.

    I am also adding a sysfs entry 'timeouts' showing the timeouts that are
    being used.

    Signed-off-by: Stefan Berger
    Tested-by: Guillaume Chazarain
    Signed-off-by: Rajiv Andrade
    Signed-off-by: Greg Kroah-Hartman

    Stefan Berger
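
    The size check described above is plain arithmetic, sketched here in
    user-space C. The names are hypothetical, the 10-byte header (u16 tag,
    u32 length, u32 return code) is an assumption about the TPM 1.2 packet
    layout, and the number of u32 values is a parameter so that the count
    quoted in the message (3) is not hard-coded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed TPM 1.2 header: u16 tag + u32 length + u32 return code. */
#define TOY_TPM_HDR_SIZE 10u

/* Header + u32 length indicator + n_values u32 entries. */
static size_t toy_expected_reply_len(size_t n_values)
{
    return TOY_TPM_HDR_SIZE + sizeof(uint32_t) + n_values * sizeof(uint32_t);
}

/* A reply is usable only if it reports success and has the expected size. */
static int toy_reply_ok(uint32_t return_code, size_t len, size_t n_values)
{
    return return_code == 0 && len == toy_expected_reply_len(n_values);
}
```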
     
  • commit c4ff4b829ef9e6353c0b133b7adb564a68054979 upstream.

    If duration variable value is 0 at this point, it's because
    chip->vendor.duration wasn't filled by tpm_get_timeouts() yet.
    This patch then sets the lowest timeout, just to give enough
    time for tpm_get_timeouts() to succeed later.

    This fix avoids long boot times in case another entity attempts
    to send commands to the TPM when the TPM isn't accessible.

    Signed-off-by: Rajiv Andrade
    Signed-off-by: James Morris
    Signed-off-by: Greg Kroah-Hartman

    Rajiv Andrade
     
  • commit 0ca7a5b9ac5d301845dd6382ff25a699b6263a81 upstream.

    Fixes the following kernel oops in nilfs_setup_super() which could
    arise if one of two super-blocks is unavailable.

    > BUG: unable to handle kernel NULL pointer dereference at (null)
    > Pid: 3529, comm: mount.nilfs2 Not tainted 2.6.37 #1 /
    > EIP: 0060:[] EFLAGS: 00010202 CPU: 3
    > EIP is at memcpy+0xc/0x1b
    > Call Trace:
    > [] ? nilfs_setup_super+0x6c/0xa5 [nilfs2]
    > [] ? nilfs_get_root_dentry+0x81/0xcb [nilfs2]
    > [] ? nilfs_mount+0x4f9/0x62c [nilfs2]
    > [] ? kstrdup+0x36/0x3f
    > [] ? nilfs_mount+0x0/0x62c [nilfs2]
    > [] ? vfs_kern_mount+0x4d/0x12c
    > [] ? get_fs_type+0x76/0x8f
    > [] ? do_kern_mount+0x33/0xbf
    > [] ? do_mount+0x2ed/0x714
    > [] ? copy_mount_options+0x28/0xfc
    > [] ? sys_mount+0x72/0xaf
    > [] ? syscall_call+0x7/0xb

    Reported-by: Wakko Warner
    Signed-off-by: Ryusuke Konishi
    Tested-by: Wakko Warner
    LKML-Reference:
    Signed-off-by: Greg Kroah-Hartman

    Ryusuke Konishi
     
  • commit f87f928882d080eaec8b0d76aecff003d664697d upstream.

    This patch fixes 32 bit legacy paging with NPT enabled. The
    mmu_check_root call on the top-level of the loop causes
    root_gfn to take values (in the tdp_enabled path) which are
    outside of guest memory. So the mmu_check_root call fails at
    some point in the loop iteration, causing the guest to
    triple-fault.
    This patch changes the mmu_check_root calls to the places
    where they are really necessary. As a side-effect it
    introduces a check for the root of a pae page table too.

    Signed-off-by: Joerg Roedel
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Greg Kroah-Hartman

    Joerg Roedel
     
  • commit c093b8b46c5f0dd12d799f0d6a3b579863df72f6 upstream.

    We use the physical address instead of the base gfn for the four
    PAE page directories we use in unpaged mode. When the guest accesses
    an address above 1GB that is backed by a large host page, a BUG_ON()
    in kvm_mmu_set_gfn() triggers.

    Resolves: https://bugzilla.kernel.org/show_bug.cgi?id=21962
    Reported-and-tested-by: Nicolas Prochazka
    Signed-off-by: Avi Kivity
    Cc: Marcelo Tosatti
    Signed-off-by: Greg Kroah-Hartman

    Avi Kivity
     
  • commit a0272630bb594b4eac03a79e77957df7dad8eade upstream.

    isr_ack is never initialized. So, until the first PIC reset, interrupts
    may fail to be injected. This can cause Windows XP to fail to boot, as
    reported in the fallout from the fix to
    https://bugzilla.kernel.org/show_bug.cgi?id=21962.

    Reported-and-tested-by: Nicolas Prochazka
    Signed-off-by: Avi Kivity
    Signed-off-by: Greg Kroah-Hartman

    Avi Kivity
     
  • commit e91ece5590b3c728624ab57043fc7a05069c604a upstream.

    md_make_request was calling bio_sectors() for part_stat_add
    after it was calling the make_request function. This is
    bad because the make_request function can free the bio and
    because the bi_size field can change around.

    The fix here was suggested by Jens Axboe. It saves the
    sector count before the make_request call. I hit this
    with CONFIG_DEBUG_PAGEALLOC turned on while trying to break
    his pretty fusionio card.

    Signed-off-by: Chris Mason
    Signed-off-by: NeilBrown
    Signed-off-by: Greg Kroah-Hartman

    Chris Mason
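
    The pattern of the fix, reading a field before handing the object to
    code that may free it, can be sketched in user space. struct toy_bio
    and the toy_* functions are hypothetical and only model the ordering;
    they are not the md code.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_bio { unsigned int sectors; };   /* stand-in for struct bio */

static unsigned long toy_accounted_sectors;

/* May free the request, like a make_request function that completes it. */
static void toy_make_request(struct toy_bio *bio)
{
    free(bio);
}

static void toy_submit(struct toy_bio *bio)
{
    unsigned int sectors = bio->sectors;    /* read while bio is still valid */

    toy_make_request(bio);                  /* bio must not be touched after this */
    toy_accounted_sectors += sectors;       /* account using the saved copy */
}
```

    Reading bio->sectors after toy_make_request() would be a use-after-free,
    which is the kind of bug a debugging allocator tends to expose.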
     
  • commit 77c5fd19075d299fe820bb59bb21b0b113676e20 upstream.

    pata_mpc52xx supports BMDMA but inherits ata_sff_port_ops which
    triggers BUG_ON() when a DMA command is issued. Fix it.

    Signed-off-by: Tejun Heo
    Reported-by: Roman Fietze
    Cc: Sergei Shtylyov
    Signed-off-by: Jeff Garzik
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • commit bf2cb0dab8c97f00a71875d9b13dbac17a2f47ca upstream.

    When a RAID6 is converted to a RAID5, the extra drive should
    be discarded. However it isn't due to a typo in a comparison.

    This bug was introduced in commit e93f68a1fc6 in 2.6.35-rc4
    and is suitable for any -stable since then.

    As the extra drive is not removed, the 'degraded' counter is wrong and
    so the RAID5 will not respond correctly to a subsequent failure.

    Signed-off-by: NeilBrown
    Signed-off-by: Greg Kroah-Hartman

    NeilBrown
     
  • commit 0ca69886a8273ac1350143d562280bfcbe4760dc upstream.

    When an md device is in the process of coming on line it is possible
    for an IO request (typically a partition table probe) to get through
    before the array is fully initialised, which can cause unexpected
    behaviour (e.g. a crash).

    So explicitly record when the array is ready for IO and don't allow IO
    through until then.

    There is no possibility for a similar problem when the array is going
    off-line as there must only be one 'open' at that time, and it is busy
    off-lining the array and so cannot send IO requests. So no memory
    barrier is needed in md_stop()

    This has been a bug since commit 409c57f3801 in 2.6.30 which
    introduced md_make_request. Before then, each personality would
    register its own make_request_fn when it was ready.
    This is suitable for any stable kernel from 2.6.30.y onwards.

    Signed-off-by: NeilBrown
    Reported-by: "Hawrylewicz Czarnowski, Przemyslaw"
    Signed-off-by: Greg Kroah-Hartman

    NeilBrown
     
  • commit 6c9879101442b08581e8a0e3ae6b7f643a78fd63 upstream.

    commit 589a594be1fb (2.6.37-rc4) fixed a problem where md_thread would
    sometimes call the ->run function at a bad time.

    If an error is detected during array start up after the md_thread has
    been started, the md_thread is killed. This resulted in the ->run
    function being called once. However the array may not be in a state
    that it is safe to call ->run.

    However the fix imposed meant that ->run was not called on a timeout.
    This means that when an array goes idle, bitmap bits do not get
    cleared promptly. While the array is busy the bits will still be
    cleared when appropriate so this is not very serious. There is no
    risk to data.

    Change the test so that we only avoid calling ->run when the thread
    is being stopped. This more explicitly addresses the problem situation.

    This is suitable for 2.6.37-stable and any -stable kernel to which
    589a594be1fb was applied.

    Signed-off-by: NeilBrown
    Signed-off-by: Greg Kroah-Hartman

    NeilBrown
     
  • commit bf572541ab44240163eaa2d486b06f306a31d45a upstream.

    Commit 1a855a0606 (2.6.37-rc4) fixed a problem where devices were
    re-added when they shouldn't be but caused a regression in a less
    common case that means sometimes devices cannot be re-added when they
    should be.

    In particular, when re-adding a device to an array without metadata
    we should always access the device, but after the above commit we
    didn't.

    This patch sets the In_sync flag in that case so that the re-add
    succeeds.

    This patch is suitable for any -stable kernel to which 1a855a0606 was
    applied.

    Signed-off-by: NeilBrown
    Signed-off-by: Greg Kroah-Hartman

    NeilBrown
     
  • commit 6dc19899958e420a931274b94019e267e2396d3e upstream.

    I noticed a failure where we hit the following WARN_ON in
    generic_smp_call_function_interrupt:

    if (!cpumask_test_and_clear_cpu(cpu, data->cpumask))
            continue;

    data->csd.func(data->csd.info);

    refs = atomic_dec_return(&data->refs);
    WARN_ON(refs < 0);

    [race timeline, only partially recoverable:]
    - cpumask sees and clears bit in cpumask
    - might be using old or new fn!
    - decrements refs below 0
    - set data->refs (too late!)

    The important thing to note is since the interrupt handler walks a
    potentially stale call_function.queue without any locking, then another
    cpu can view the percpu *data structure at any time, even when the owner
    is in the process of initialising it.

    The following test case hits the WARN_ON 100% of the time on my PowerPC
    box (having 128 threads does help :)

    #include <linux/module.h>
    #include <linux/init.h>
    #include <linux/smp.h>
    #include <linux/workqueue.h>

    #define ITERATIONS 100

    /* Empty IPI handler: only the cross-call machinery is exercised. */
    static void do_nothing_ipi(void *dummy)
    {
    }

    static void do_ipis(struct work_struct *dummy)
    {
            int i;

            for (i = 0; i < ITERATIONS; i++)
                    smp_call_function(do_nothing_ipi, NULL, 1);

            printk(KERN_DEBUG "cpu %d finished\n", smp_processor_id());
    }

    static struct work_struct work[NR_CPUS];

    static int __init testcase_init(void)
    {
            int cpu;

            /* Queue one work item on every online CPU. */
            for_each_online_cpu(cpu) {
                    INIT_WORK(&work[cpu], do_ipis);
                    schedule_work_on(cpu, &work[cpu]);
            }

            return 0;
    }

    static void __exit testcase_exit(void)
    {
    }

    module_init(testcase_init)
    module_exit(testcase_exit)
    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Anton Blanchard");

    I tried to fix it by ordering the read and the write of ->cpumask and
    ->refs. In doing so I missed a critical case but Paul McKenney was able
    to spot my bug thankfully :) To ensure we aren't viewing previous
    iterations the interrupt handler needs to read ->refs then ->cpumask then
    ->refs _again_.

    Thanks to Milton Miller and Paul McKenney for helping to debug this issue.

    [miltonm@bga.com: add WARN_ON and BUG_ON, remove extra read of refs before initial read of mask that doesn't help (also noted by Peter Zijlstra), adjust comments, hopefully clarify scenario ]
    [miltonm@bga.com: remove excess tests]
    Signed-off-by: Anton Blanchard
    Signed-off-by: Milton Miller
    Cc: Ingo Molnar
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Anton Blanchard
     
  • commit 20d9600cb407b0b55fef6ee814b60345c6f58264 upstream.

    When using devices that support max_segments > BIO_MAX_PAGES (256), direct
    IO tries to allocate a bio with more pages than allowed, which leads to an
    oops in dio_bio_alloc(). Clamp the request to the supported maximum, and
    change dio_bio_alloc() to reflect that bio_alloc() will always return a
    bio when called with __GFP_WAIT and a valid number of vectors.

    [akpm@linux-foundation.org: remove redundant BUG_ON()]
    Signed-off-by: David Dillow
    Reviewed-by: Jeff Moyer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    David Dillow
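
    The clamping step amounts to a bounded minimum. The sketch below uses
    hypothetical names and takes the limit of 256 from the message above;
    it is not the direct-IO code itself.

```c
#include <assert.h>

#define TOY_BIO_MAX_PAGES 256   /* limit quoted in the message above */

/* Never ask for more vectors than a single bio can hold. */
static int toy_clamp_nr_pages(int nr_pages)
{
    return nr_pages > TOY_BIO_MAX_PAGES ? TOY_BIO_MAX_PAGES : nr_pages;
}
```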
     
  • commit 2550326ac7a062fdfc204f9a3b98bdb9179638fc upstream.

    Fix collision with kernel-supplied #define:

    drivers/video/backlight/88pm860x_bl.c:24:1: warning: "CURRENT_MASK" redefined
    arch/x86/include/asm/page_64_types.h:6:1: warning: this is the location of the previous definition

    Signed-off-by: Randy Dunlap
    Cc: Haojian Zhuang
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Randy Dunlap
     
  • commit dd3cb633078fb12e06ce6cebbdfbf55a7562e929 upstream.

    This fixes parsing of the device invariants (MAC address)
    for PCMCIA SSB devices.

    ssb_pcmcia_do_get_invariants expects an iv pointer as data
    argument.

    Tested-by: dylan cristiani
    Signed-off-by: Michael Buesch
    Signed-off-by: John W. Linville
    Signed-off-by: Greg Kroah-Hartman

    Michael Büsch
     
  • commit d2478521afc20227658a10a8c5c2bf1a2aa615b3 upstream.

    This patch fixes an OOPS triggered when calling modprobe ipmi_si a
    second time after the first modprobe returned without finding any ipmi
    devices. This can happen if you reload the module after having the
    first module load fail. The driver was not deregistering from PNP in
    that case.

    Peter Huewe originally reported this problem and supplied a fix. I have a
    different patch based on Linus' suggestion that cleans things up a bit
    more.

    Cc: openipmi-developer@lists.sourceforge.net
    Reviewed-by: Peter Huewe
    Cc: Randy Dunlap
    Signed-off-by: Corey Minyard
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Corey Minyard
     
  • commit 9ffdc6c37df131f89d52001e0ef03091b158826f upstream.

    Signed-off-by: Marcin Slusarz
    [ add {}'s to fix a warning ]
    Signed-off-by: Don Zickus
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Marcin Slusarz
     
  • commit 397357666de6b5b6adb5fa99f9758ec8cf30ac34 upstream.

    If it was not possible to enable watchdog for any cpu, switch
    watchdog_enabled back to 0, because it's visible via
    kernel.watchdog sysctl.

    Signed-off-by: Marcin Slusarz
    Signed-off-by: Don Zickus
    Cc: Peter Zijlstra
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Marcin Slusarz
     
  • commit fbea668498e93bb38ac9226c7af9120a25957375 upstream.

    Remove the broken line wrapping handling in pdc_iodc_print().
    It is broken in 3 ways:
    - It doesn't keep track of the current screen position; it just
      assumes that the new buffer will be printed at the beginning of the
      screen.
    - It doesn't take into account that non-printable characters won't
      increase the current position on the screen.
    - And last but not least, it triggers a kernel panic if a backspace
      is the first char in the provided buffer:

    Backtrace:
    [] pdc_console_write+0x44/0x78
    [] pdc_console_tty_write+0x20/0x38
    [] n_tty_write+0x2a4/0x550
    [] tty_write+0x1e0/0x2d8
    [] vfs_write+0xb8/0x188
    [] sys_write+0x68/0xb8
    [] syscall_exit+0x0/0x14

    Most terminals handle the line wrapping just fine. I've confirmed that
    it works correctly on a C8000 with both vga and serial output.

    Signed-off-by: Guy Martin
    Signed-off-by: James Bottomley
    Signed-off-by: Greg Kroah-Hartman

    Guy Martin
     
  • commit 6044565af458e7fa6e748bff437ecc49dea88d79 upstream.

    Regression since commit 10389536742c, "firewire: core: check for 1394a
    compliant IRM, fix inaccessibility of Sony camcorder":

    The camcorder Canon MV5i generates lots of bus resets when asynchronous
    requests are sent to it (e.g. Config ROM read requests or FCP Command
    write requests) if the camcorder is not root node. This causes drop-
    outs in videos or makes the camcorder entirely inaccessible.
    https://bugzilla.redhat.com/show_bug.cgi?id=633260

    Fix this by allowing any Canon device, even if it is a pre-1394a IRM
    like the MV5i, to remain root node (if it is at least Cycle Master
    capable). With the FireWire controller cards that I tested, MV5i always
    becomes root node when plugged in and left to its own devices.

    Reported-by: Ralf Lange
    Signed-off-by: Stefan Richter
    Signed-off-by: Greg Kroah-Hartman

    Stefan Richter
     
  • commit 1f1936ff3febf38d582177ea319eaa278f32c91f upstream.

    Some of those functions try to adjust the CPU features, for example
    to remove NAP support on some revisions. However, they seem to use
    r5 as an index into the CPU table entry, which might have been right
    a long time ago but no longer is. r4 is the right register to use.

    This probably caused some odd behaviours on some PowerMac variants
    using 750cx or 7455 processor revisions.

    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Greg Kroah-Hartman

    Benjamin Herrenschmidt
     
  • commit 429f4d8d20b91e4a6c239f951c06a56a6ac22957 upstream.

    When converting to the new cpumask code I screwed up:

    - if (cpu_isset(cpu, numa_cpumask_lookup_table[node])) {
    - cpu_clear(cpu, numa_cpumask_lookup_table[node]);
    + if (cpumask_test_cpu(cpu, node_to_cpumask_map[node])) {
    + cpumask_set_cpu(cpu, node_to_cpumask_map[node]);

    This was introduced in commit 25863de07af9 (powerpc/cpumask: Convert NUMA code
    to new cpumask API)

    Fix it.

    Signed-off-by: Anton Blanchard
    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Greg Kroah-Hartman

    Anton Blanchard
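
    The difference between the two calls is easy to see with a toy bitmask.
    The type and helpers below are hypothetical user-space stand-ins for
    the cpumask API; they only show that the removal path has to clear the
    bit it just tested.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t toy_mask;   /* one bit per CPU, up to 64 CPUs */

static int toy_test_cpu(int cpu, toy_mask m)
{
    return (int)((m >> cpu) & 1);
}

static void toy_set_cpu(int cpu, toy_mask *m)
{
    *m |= (toy_mask)1 << cpu;
}

static void toy_clear_cpu(int cpu, toy_mask *m)
{
    *m &= ~((toy_mask)1 << cpu);
}

/* Remove a CPU from a node's mask: the removal path must clear, not set. */
static void toy_unmap_cpu(int cpu, toy_mask *node_mask)
{
    if (toy_test_cpu(cpu, *node_mask))
        toy_clear_cpu(cpu, node_mask);
}
```

    With toy_set_cpu() in place of toy_clear_cpu(), the function would be a
    no-op and the CPU would stay in the node's mask.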
     
  • commit 57cdfdf829a850a317425ed93c6a576c9ee6329c upstream.

    Spinlocks on shared processor partitions use H_YIELD to notify the
    hypervisor we are waiting on another virtual CPU. Unfortunately this means
    the hcall tracepoints can recurse.

    The patch below adds a percpu depth and checks it on both the entry and
    exit hcall tracepoints.

    Signed-off-by: Anton Blanchard
    Acked-by: Steven Rostedt
    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Greg Kroah-Hartman

    Anton Blanchard
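
    A depth counter of this kind can be sketched in user space. Everything
    below is hypothetical: a single global replaces the per-CPU variable,
    and toy_spin_wait() merely models a spinlock that issues another hcall
    while the tracepoint is running.

```c
#include <assert.h>

static unsigned int toy_trace_depth;  /* per-CPU in the kernel; one global here */
static unsigned int toy_events;       /* number of events actually recorded */

static void toy_spin_wait(void);

static void toy_trace_hcall_entry(void)
{
    if (toy_trace_depth)
        return;                 /* already inside a tracepoint: do not recurse */

    toy_trace_depth++;
    toy_events++;
    toy_spin_wait();            /* tracing may take a lock, which may hcall again */
    toy_trace_depth--;
}

static void toy_spin_wait(void)
{
    toy_trace_hcall_entry();    /* models an H_YIELD issued from a contended lock */
}
```

    Without the depth check, the two functions would call each other
    without bound.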
     
  • commit b2e0861e51f2961954330dcafe6d148ee3ab5cff upstream.

    In order to prevent the fsl_dma driver from claiming the DMA channels that the
    P1022DS audio driver needs, the compatible properties for those nodes must say
    "fsl,ssi-dma-channel" instead of "fsl,eloplus-dma-channel".

    Signed-off-by: Timur Tabi
    Signed-off-by: Kumar Gala
    Signed-off-by: Greg Kroah-Hartman

    Timur Tabi
     
  • commit 02a8f01b5a9f396d0327977af4c232d0f94c45fd upstream.

    Commit 7667aa0630407bc07dc38dcc79d29cc0a65553c1 added logic to wait for
    the last queue of the group to become busy (have at least one request),
    so that the group does not lose out for not being continuously
    backlogged. The commit did not check for the condition that the last
    queue already has some requests. As a result, if the queue already has
    requests, wait_busy is set. Later on, cfq_select_queue() checks the
    flag, and decides that since the queue has a request now and wait_busy
    is set, the queue is expired. This results in early expiration of the
    queue.

    This patch fixes the problem by adding a check to see if queue already
    has requests. If it does, wait_busy is not set. As a result, time slices
    do not expire early.

    The queues with more than one request are usually buffered writers.
    Testing shows improvement in isolation between buffered writers.

    Signed-off-by: Justin TerAvest
    Reviewed-by: Gui Jianfeng
    Acked-by: Vivek Goyal
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Justin TerAvest
     
  • commit 795abaf1e4e188c4171e3cd3dbb11a9fcacaf505 upstream.

    Commit c0e69a5bbc6f ("klist.c: bit 0 in pointer can't be used as flag")
    intended to make sure that all klist objects were at least pointer size
    aligned, but used the constant "4" which only works on 32-bit.

    Use "sizeof(void *)" which is correct in all cases.

    Signed-off-by: David S. Miller
    Acked-by: Jesper Nilsson
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    David Miller
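
    The alignment requirement can be illustrated in user space. The struct
    and helpers are hypothetical, and the aligned attribute is a GCC/Clang
    extension; the point is only that pointer-size alignment keeps bit 0 of
    every object address clear, so it can carry a flag.

```c
#include <assert.h>
#include <stdint.h>

/* Force at least pointer-size alignment on every target (GCC/Clang). */
struct toy_klist {
    int dummy;
} __attribute__((aligned(sizeof(void *))));

/* Store a flag in bit 0 of the pointer value. */
static uintptr_t toy_tag(struct toy_klist *k)
{
    return (uintptr_t)k | 1u;
}

/* Strip the flag to recover the original pointer. */
static struct toy_klist *toy_untag(uintptr_t v)
{
    return (struct toy_klist *)(v & ~(uintptr_t)1);
}
```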
     
  • commit 88f5acf88ae6a9778f6d25d0d5d7ec2d57764a97 upstream.

    Commit aa45484 ("calculate a better estimate of NR_FREE_PAGES when memory
    is low") noted that watermarks were based on the vmstat NR_FREE_PAGES. To
    avoid synchronization overhead, these counters are maintained on a per-cpu
    basis and drained both periodically and when a per-cpu delta exceeds a
    threshold. On large CPU systems, the difference between the estimate and
    real value of NR_FREE_PAGES can be very high. The system can get into a
    case where pages are allocated far below the min watermark potentially
    causing livelock issues. The commit solved the problem by taking a better
    reading of NR_FREE_PAGES when memory was low.

    Unfortunately, as reported by Shaohua Li this accurate reading can consume a
    large amount of CPU time on systems with many sockets due to cache line
    bouncing. This patch takes a different approach. For large machines
    where counter drift might be unsafe and while kswapd is awake, the per-cpu
    thresholds for the target pgdat are reduced to limit the level of drift to
    what should be a safe level. This incurs a performance penalty in heavy
    memory pressure by a factor that depends on the workload and the machine
    but the machine should function correctly without accidentally exhausting
    all memory on a node. There is an additional cost when kswapd wakes and
    sleeps but the event is not expected to be frequent - in Shaohua's test
    case, there was one recorded sleep and wake event at least.

    To ensure that kswapd wakes up, a safe version of zone_watermark_ok() is
    introduced that takes a more accurate reading of NR_FREE_PAGES when called
    from wakeup_kswapd, when deciding whether it is really safe to go back to
    sleep in sleeping_prematurely() and when deciding if a zone is really
    balanced or not in balance_pgdat(). We are still using an expensive
    function but limiting how often it is called.

    When the test case is reproduced, the time spent in the watermark
    functions is reduced. The following report is on the percentage of time
    cumulatively spent in the functions zone_nr_free_pages(),
    zone_watermark_ok(), __zone_watermark_ok(), zone_watermark_ok_safe(),
    zone_page_state_snapshot(), zone_page_state().

    vanilla 11.6615%
    disable-threshold 0.2584%

    David said:

    : We had to pull aa454840 "mm: page allocator: calculate a better estimate
    : of NR_FREE_PAGES when memory is low and kswapd is awake" from 2.6.36
    : internally because tests showed that it would cause the machine to stall
    : as the result of heavy kswapd activity. I merged it back with this fix as
    : it is pending in the -mm tree and it solves the issue we were seeing, so I
    : definitely think this should be pushed to -stable (and I would seriously
    : consider it for 2.6.37 inclusion even at this late date).

    Signed-off-by: Mel Gorman
    Reported-by: Shaohua Li
    Reviewed-by: Christoph Lameter
    Tested-by: Nicolas Bareil
    Cc: David Rientjes
    Cc: Kyle McMartin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Mel Gorman
     
  • commit a34650f0f1ca589cda09c48cb62baf15e680a247 upstream.

    The bfin_sdh driver allocates the wrong size for the private data
    in the mmc_host. The first parameter of mmc_alloc_host should be
    the size of the local driver struct rather than the common mmc_host.

    Signed-off-by: Sonic Zhang
    Signed-off-by: Mike Frysinger
    Signed-off-by: Chris Ball
    Signed-off-by: Greg Kroah-Hartman

    Sonic Zhang
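
    The sizing rule can be modelled with a flexible array member. The
    structures and functions below are hypothetical user-space stand-ins,
    not the MMC core API; they assume an allocator that reserves room for
    the common host plus a caller-specified private area.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_host {                  /* hypothetical common host structure */
    int id;
    unsigned long private_data[];  /* driver-private area follows the host */
};

struct toy_drv {                   /* hypothetical driver-private structure */
    int dma_ch;
    char name[32];
};

/* "extra" is the size of the driver's private data, not of the host itself. */
static struct toy_host *toy_alloc_host(size_t extra)
{
    return calloc(1, sizeof(struct toy_host) + extra);
}

/* Return the start of the driver-private area. */
static void *toy_priv(struct toy_host *h)
{
    return h->private_data;
}
```

    Passing sizeof(struct toy_host) as "extra" would reserve the wrong
    amount of space, and writes to the private structure could run past
    the allocation whenever the driver structure is the larger of the two.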
     
  • commit 01c88e2d6b7330c0cc5867fe2297e7d826e1337d upstream.

    Commit 4b53433468 ("memcg: clean up try_charge main loop") removed the
    cancellation of the memory charge in the case where the memory charge
    succeeds but the mem+swap charge fails.

    This leaks usage of memory. Fix it.

    Signed-off-by: KAMEZAWA Hiroyuki
    Reviewed-by: Johannes Weiner
    Acked-by: Daisuke Nishimura
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    KAMEZAWA Hiroyuki
     
  • commit b0a2679d27408d97ce31e5f800b44227d3388b84 upstream.

    Disable the initrd if the passed address already overlaps the reserved
    region. This avoids oopses on Netwinders when NeTTrom tells the kernel
    that an initrd is located at mem+4MB, but this overlaps the BSS,
    resulting in the kernel's in-use BSS being freed.

    This should be applied to v2.6.37-stable.

    Signed-off-by: Russell King
    Signed-off-by: Greg Kroah-Hartman

    Russell King
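
    The overlap test behind this fix is a standard interval check. The
    sketch below uses hypothetical names and half-open address ranges; it
    does not reproduce the ARM setup code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Half-open ranges [a, a + alen) and [b, b + blen). */
static bool toy_ranges_overlap(uint64_t a, uint64_t alen,
                               uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}

/* True if the initrd may be used, false if it must be disabled. */
static bool toy_initrd_usable(uint64_t initrd, uint64_t size,
                              uint64_t resv, uint64_t resv_size)
{
    return size != 0 && !toy_ranges_overlap(initrd, size, resv, resv_size);
}
```

    For example, an initrd at 4MB that is 1MB long overlaps a reserved
    region covering 3MB to 5MB, so it would be rejected.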
     
  • commit aa5bd67dcfdf9af34c7fa36ebc87d4e1f7e91873 upstream.

    Since check_prlimit_permission always fails in the case of SUID/SGID
    processes, such processes are not able to read or set their own limits.
    This commit changes this by assuming that a process can always read/change
    its own limits.

    Signed-off-by: Kacper Kornet
    Acked-by: Jiri Slaby
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Kacper Kornet