04 Jan, 2012

2 commits

  • Greg Kroah-Hartman
     
  • commit 3b87487ac5008072f138953b07505a7e3493327f upstream.

    This reverts commit de28f25e8244c7353abed8de0c7792f5f883588c.

    It results in resume problems for various people. See for example

    http://thread.gmane.org/gmane.linux.kernel/1233033
    http://thread.gmane.org/gmane.linux.kernel/1233389
    http://thread.gmane.org/gmane.linux.kernel/1233159
    http://thread.gmane.org/gmane.linux.kernel/1227868/focus=1230877

    and the fedora and ubuntu bug reports

    https://bugzilla.redhat.com/show_bug.cgi?id=767248
    https://bugs.launchpad.net/ubuntu/+source/linux/+bug/904569

    which got bisected down to the stable version of this commit.

    Reported-by: Jonathan Nieder
    Reported-by: Phil Miller
    Reported-by: Philip Langdale
    Reported-by: Tim Gardner
    Cc: Thomas Gleixner
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Linus Torvalds
     

22 Dec, 2011

38 commits

  • Greg Kroah-Hartman
     
  • commit 82e14e8bdd88b69018fe757192b01dd98582905e upstream.

    For cards that have two or more DAIs, snd_soc_resume's loop over all
    DAIs ends up calling schedule_work(deferred_resume_work) once per DAI.
    Since this is the same work item each time, the 2nd and subsequent
    calls return 0 (work item already queued), and trigger the dev_err
    message below stating that a work item may have been lost.

    Solve this by adjusting the loop to simply calculate whether to run the
    resume work immediately or defer it, and then call schedule work (or not)
    one time based on that.
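
    A minimal sketch of the "decide in the loop, act once afterwards" shape described above. The per-DAI flag and all names here are hypothetical stand-ins, not the actual ASoC structures:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of the decision; the caller acts on it exactly once. */
enum resume_action { RESUME_NOW, RESUME_DEFERRED };

/*
 * Walk all DAIs and only *compute* whether the resume work has to run
 * immediately. Nothing is scheduled inside the loop, so a card with
 * several DAIs can no longer queue the same work item more than once.
 */
static enum resume_action decide_resume(const bool *dai_needs_immediate,
                                        size_t num_dais)
{
    bool immediate = false;
    size_t i;

    for (i = 0; i < num_dais; i++) {
        if (dai_needs_immediate[i])
            immediate = true;
    }
    return immediate ? RESUME_NOW : RESUME_DEFERRED;
}
```

    The caller would then either run the resume work directly or queue it a single time, depending on the returned value.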

    Note: This has not been tested in mainline, but only in chromeos-2.6.38;
    mainline doesn't support suspend/resume on Tegra, nor does the mainline
    Tegra ASoC driver contain multiple DAIs. It has been compile-checked in
    mainline.

    Signed-off-by: Stephen Warren
    Acked-by: Liam Girdwood
    Signed-off-by: Mark Brown
    Signed-off-by: Greg Kroah-Hartman

    Stephen Warren
     
  • commit 02a551c9755b799579e0a093bcc99b80b4dc1453 upstream.

    Huawei use the product code HUAWEI_PRODUCT_E353 (0x1506) for a
    number of different devices, which each can appear with a number
    of different descriptor sets. Different types of interfaces
    can be identified by looking at the subclass and protocol fields.

    Subclass 1 protocol 8 is actually the data interface of a CDC
    ECM set, with subclass 1 protocol 9 as the control interface.
    Neither supports serial data communication, and therefore neither
    can be supported by this driver.

    At the same time, add a few other sets which appear if the
    device is configured in "Windows mode" using this modeswitch
    message:
    55534243000000000000000000000011060000000100000000000000000000

    Signed-off-by: Bjørn Mork
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Bjørn Mork
     
  • commit 414b591fd16655871e9f5592a55368b10a3ccc30 upstream.

    This patch adds the controlling interfaces for the Huawei E398.

    Thanks to Bjørn Mork for extracting the interface
    numbers from the windows driver.

    Signed-off-by: Alex Hermann
    Signed-off-by: Greg Kroah-Hartman

    Alex Hermann
     
  • commit 6abff5dc4d5a2c90e597137ce8987e7fd439259b upstream.

    Add USB IDs for Motorola H24 HSPA USB module.

    Signed-off-by: Krzysztof Hałasa
    Acked-by: Oliver Neukum
    Signed-off-by: Greg Kroah-Hartman

    Krzysztof Hałasa
     
  • commit 935a9fee51c945b8942be2d7b4bae069167b4886 upstream.

    On one system with UEFI/iBFT, the kernel does not detect the iBFT while
    the iscsi_ibft module is loading.

    Root cause: on x86 (UEFI), we call find_ibft_region() much earlier,
    specifically in setup_arch(), before ACPI is enabled.

    Split the ACPI checking code out and call it later.

    By that time the ACPI iBFT is already permanently mapped with ioremap,
    so isa_virt_to_bus() would derive the wrong physical address from the
    correct virtual address. We can simply skip printing that physical address.

    For the legacy case, print the found address early.

    -v2: update comments and description according to Konrad.
    -v3: fix a problem with the module use case that was found by Konrad.
    -v4: use acpi_get_table() instead of acpi_table_parse() to handle the module use case that was found by Konrad again.
    Signed-off-by: Yinghai Lu
    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: Greg Kroah-Hartman

    Yinghai Lu
     
  • commit cd5cfce856684e13b9b57d46b78bb827e9c4da3c upstream.

    Fixes:
    https://bugs.freedesktop.org/show_bug.cgi?id=43739

    Signed-off-by: Alex Deucher
    Signed-off-by: Dave Airlie
    Signed-off-by: Greg Kroah-Hartman

    Alex Deucher
     
  • commit c7caf4d4c56aee40b995f5858ccf1c814f3d2da2 upstream.

    Add USB ID for Sitecom WLA-2000 v1.001 WLAN.

    Reported-and-tested-by: Roland Gruber
    Signed-off-by: Larry Finger
    Signed-off-by: Greg Kroah-Hartman

    Larry Finger
     
  • commit 48706d0a91583d08c56e7ef2a7602d99c8d4133f upstream.

    Fix two bugs in fuse_retrieve():

    - retrieving more than one page would yield repeated instances of the
    first page

    - if more than FUSE_MAX_PAGES_PER_REQ pages were requested, then the
    request page array would overflow

    fuse_retrieve() was added in 2.6.36 and these bugs had been there since the
    beginning.
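
    Both bugs come down to loop bookkeeping when filling a fixed-size page array. The sketch below is an illustration only: SKETCH_MAX_PAGES, collect_pages() and the use of plain page indices instead of page pointers are assumptions, not the FUSE code.

```c
#include <stddef.h>

/* Hypothetical array capacity standing in for FUSE_MAX_PAGES_PER_REQ. */
#define SKETCH_MAX_PAGES 32

/*
 * Fill 'out' with consecutive page indices starting at 'first_index'.
 * Returns how many slots were filled.
 */
static size_t collect_pages(unsigned long first_index, size_t wanted,
                            unsigned long *out)
{
    unsigned long index = first_index;
    size_t n = 0;

    /* Bound the loop by the array capacity so it cannot overflow. */
    while (n < wanted && n < SKETCH_MAX_PAGES) {
        out[n++] = index;
        /* Advance the index so each slot refers to a different page. */
        index++;
    }
    return n;
}
```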

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Greg Kroah-Hartman

    Miklos Szeredi
     
  • commit 5a0dc7365c240795bf190766eba7a27600be3b3e upstream.

    We need to zero out the part of a page that lies beyond EOF before setting
    it uptodate; otherwise, mapread or write will see non-zero data beyond EOF.
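
    As an illustration of the idea (not the ext4 implementation), the tail of the page that holds EOF can be cleared like this. The page size constant and the function name are assumptions made for the sketch:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical page size used only for this sketch. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Zero every byte of 'page' that lies at or beyond the file size.
 * 'page_start' is the file offset of the first byte of the page.
 */
static void zero_page_tail(unsigned char *page,
                           unsigned long long page_start,
                           unsigned long long i_size)
{
    size_t valid;

    /* Page lies completely inside the file: nothing to clear. */
    if (i_size >= page_start + SKETCH_PAGE_SIZE)
        return;

    /* Number of bytes in this page that are still before EOF. */
    valid = i_size > page_start ? (size_t)(i_size - page_start) : 0;
    memset(page + valid, 0, SKETCH_PAGE_SIZE - valid);
}
```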

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

    Yongqiang Yang
     
  • commit 13a79a4741d37fda2fbafb953f0f301dc007928f upstream.

    If there is an unwritten but clean buffer in a page and there is a
    dirty buffer after the buffer, then mpage_submit_io does not write the
    dirty buffer out. As a result, da_writepages loops forever.

    This patch fixes the problem by checking dirty flag.

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

    Yongqiang Yang
     
  • commit ea51d132dbf9b00063169c1159bee253d9649224 upstream.

    If the pte mapping in generic_perform_write() is unmapped between
    iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic(), the
    "copied" parameter to ->end_write can be zero. ext4 couldn't cope with
    it with delayed allocations enabled. This skips the i_disksize
    enlargement logic if copied is zero and no new data was appended to
    the inode.

    gdb> bt
    #0 0xffffffff811afe80 in ext4_da_should_update_i_disksize (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2467
    #1 ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
    #2 0xffffffff810d97f1 in generic_perform_write (iocb=, iov=, nr_segs=, pos=0x108000, ppos=0xffff88001e26be40, count=, written=0x0) at mm/filemap.c:2440
    #3 generic_file_buffered_write (iocb=, iov=, nr_segs=, pos=0x108000, ppos=0xffff88001e26be40, count=, written=0x0) at mm/filemap.c:2482
    #4 0xffffffff810db5d1 in __generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, ppos=0xffff88001e26be40) at mm/filemap.c:2600
    #5 0xffffffff810db853 in generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=, pos=) at mm/filemap.c:2632
    #6 0xffffffff811a71aa in ext4_file_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, pos=0x108000) at fs/ext4/file.c:136
    #7 0xffffffff811375aa in do_sync_write (filp=0xffff88003f606a80, buf=, len=, ppos=0xffff88001e26bf48) at fs/read_write.c:406
    #8 0xffffffff81137e56 in vfs_write (file=0xffff88003f606a80, buf=0x1ec2960, count=0x4000, pos=0xffff88001e26bf48) at fs/read_write.c:435
    #9 0xffffffff8113816c in sys_write (fd=, buf=0x1ec2960, count=0x4000) at fs/read_write.c:487
    #10
    #11 0x00007f120077a390 in __brk_reservation_fn_dmi_alloc__ ()
    #12 0x0000000000000000 in ?? ()
    gdb> print offset
    $22 = 0xffffffffffffffff
    gdb> print idx
    $23 = 0xffffffff
    gdb> print inode->i_blkbits
    $24 = 0xc
    gdb> up
    #1 ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
    2512 if (ext4_da_should_update_i_disksize(page, end)) {
    gdb> print start
    $25 = 0x0
    gdb> print end
    $26 = 0xffffffffffffffff
    gdb> print pos
    $27 = 0x108000
    gdb> print new_i_size
    $28 = 0x108000
    gdb> print ((struct ext4_inode_info *)((char *)inode-((int)(&((struct ext4_inode_info *)0)->vfs_inode))))->i_disksize
    $29 = 0xd9000
    gdb> down
    2467 for (i = 0; i < idx; i++)
    gdb> print i
    $30 = 0xd44acbee

    This is 100% reproducible with some autonuma development code tuned in
    a very aggressive manner (not normal way even for knumad) which does
    "exotic" changes to the ptes. It wouldn't normally trigger but I don't
    see why it can't happen normally if the page is added to swap cache in
    between the two faults leading to "copied" being zero (which then
    hangs in ext4). So it should be fixed. Especially possible with lumpy
    reclaim (albeit disabled if compaction is enabled) as that would
    ignore the young bits in the ptes.
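
    The values printed in the gdb session (start = 0x0, end = 0xffffffffffffffff) are what an unsigned wraparound would produce when nothing was copied. The sketch below assumes the end offset is computed as start + copied - 1; that formula, the page size constant and the helper names are assumptions for illustration, not quotations from fs/ext4/inode.c.

```c
#include <stdbool.h>

/* Hypothetical page size (0x1000 matches the len shown in the backtrace). */
#define SKETCH_PAGE_SIZE 0x1000UL

/*
 * Offset of the last byte written within the page. With a page-aligned
 * pos and copied == 0 this wraps to the maximum unsigned long value.
 */
static unsigned long write_end_offset(unsigned long pos, unsigned long copied)
{
    unsigned long start = pos & (SKETCH_PAGE_SIZE - 1);

    return start + copied - 1;
}

/* Guard in the spirit of the fix: if nothing was copied, skip the check. */
static bool should_consider_disksize_update(unsigned long copied)
{
    return copied != 0;
}
```

    Calling write_end_offset(0x108000, 0) yields the all-ones value, which is why any loop bounded by a value derived from it runs for a very long time.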

    Signed-off-by: Andrea Arcangeli
    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

    Andrea Arcangeli
     
  • commit fc6cb1cda5db7b2d24bf32890826214b857c728e upstream.

    /proc/mounts was showing the mount option [no]init_inode_table when
    the correct mount option that will be accepted by parse_options() is
    [no]init_itable.

    Signed-off-by: "Theodore Ts'o"
    Signed-off-by: Greg Kroah-Hartman

    Theodore Ts'o
     
  • commit d3db728125c4470a2d061ac10fa7395e18237263 upstream.

    d312ae878b6a "xen: use maximum reservation to limit amount of usable RAM"
    clamped the total amount of RAM to the current maximum reservation. This is
    correct for dom0 but is not correct for guest domains. To boot a guest
    "pre-ballooned" (e.g. with memory=1G but maxmem=2G) and so allow for
    future memory expansion, the guest must derive max_pfn from the e820 provided by
    the toolstack and not the current maximum reservation (which can reflect only
    the current maximum, not the guest lifetime max). The existing algorithm
    already behaves correctly if we do not artificially limit the maximum
    number of pages for the guest case.

    For a guest booted with maxmem=512, memory=128 this results in:
    [ 0.000000] BIOS-provided physical RAM map:
    [ 0.000000] Xen: 0000000000000000 - 00000000000a0000 (usable)
    [ 0.000000] Xen: 00000000000a0000 - 0000000000100000 (reserved)
    -[ 0.000000] Xen: 0000000000100000 - 0000000008100000 (usable)
    -[ 0.000000] Xen: 0000000008100000 - 0000000020800000 (unusable)
    +[ 0.000000] Xen: 0000000000100000 - 0000000020800000 (usable)
    ...
    [ 0.000000] NX (Execute Disable) protection: active
    [ 0.000000] DMI not present or invalid.
    [ 0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
    [ 0.000000] e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
    -[ 0.000000] last_pfn = 0x8100 max_arch_pfn = 0x1000000
    +[ 0.000000] last_pfn = 0x20800 max_arch_pfn = 0x1000000
    [ 0.000000] initial memory mapped : 0 - 027ff000
    [ 0.000000] Base memory trampoline at [c009f000] 9f000 size 4096
    -[ 0.000000] init_memory_mapping: 0000000000000000-0000000008100000
    -[ 0.000000] 0000000000 - 0008100000 page 4k
    -[ 0.000000] kernel direct mapping tables up to 8100000 @ 27bb000-27ff000
    +[ 0.000000] init_memory_mapping: 0000000000000000-0000000020800000
    +[ 0.000000] 0000000000 - 0020800000 page 4k
    +[ 0.000000] kernel direct mapping tables up to 20800000 @ 26f8000-27ff000
    [ 0.000000] xen: setting RW the range 27e8000 - 27ff000
    [ 0.000000] 0MB HIGHMEM available.
    -[ 0.000000] 129MB LOWMEM available.
    -[ 0.000000] mapped low ram: 0 - 08100000
    -[ 0.000000] low ram: 0 - 08100000
    +[ 0.000000] 520MB LOWMEM available.
    +[ 0.000000] mapped low ram: 0 - 20800000
    +[ 0.000000] low ram: 0 - 20800000

    With this change "xl mem-set 512M" will successfully increase the
    guest RAM (by reducing the balloon).

    There is no change for dom0.

    Reported-and-Tested-by: George Shuklin
    Signed-off-by: Ian Campbell
    Reviewed-by: David Vrabel
    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: Greg Kroah-Hartman

    Ian Campbell
     
  • commit 355840e7a7e56bb2834fd3b0da64da5465f8aeaa upstream.

    commit a847627709b3402163d99f7c6fda4a77bcd6b51b in linux-3.0.9
    attempted to backport this to 3.0 but only made one change where two
    were necessary. This adds the second change.

    This bug was introduced in 415e72d034c50520ddb7ff79e7d1792c1306f0c9
    which was in 2.6.36.

    There is a small window of time between when a device fails and when
    it is removed from the array. During this time we might still read
    from it, but we won't write to it - so it is possible that we could
    read stale data.

    We didn't need the test of 'Faulty' before because the test on
    In_sync is sufficient. Since we started allowing reads from the early
    part of non-In_sync devices we need a test on Faulty too.
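
    A rough sketch of the resulting read-eligibility test. The structure, its field names and the idea of a per-device "recovered up to" offset are hypothetical simplifications for illustration, not the md data structures:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one array member. */
struct member_sketch {
    bool faulty;               /* device has failed, not yet removed */
    bool in_sync;              /* device is fully up to date */
    uint64_t recovery_offset;  /* sectors below this are already rebuilt */
};

/* May sectors [sector, sector + nr) be read from this member? */
static bool can_read_from(const struct member_sketch *m,
                          uint64_t sector, uint64_t nr)
{
    /* A failed device no longer receives writes, so its data may be stale. */
    if (m->faulty)
        return false;
    if (m->in_sync)
        return true;
    /* Not in sync: only the early, already recovered part is readable. */
    return sector + nr <= m->recovery_offset;
}
```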

    This is suitable for any kernel from 2.6.36 onwards, though the patch
    might need a bit of tweaking in 3.0 and earlier.

    Signed-off-by: NeilBrown
    Signed-off-by: Greg Kroah-Hartman

    NeilBrown
     
  • commit 859f57ca00805e6c482eef1a7ab073097d02c8ca upstream.

    [slightly different from the upstream version because of a previous cleanup]

    Currently xfs_attr_inactive causes a synchronous transaction if we are
    removing a file that has any extents allocated to the attribute fork, and
    thus makes XFS extremely slow at removing files with out-of-line extended
    attributes. The code looks like a relic from the days before the busy
    extent list, but with the busy extent list we avoid reusing data and attr
    extents that have been freed but not committed yet, so this code is just
    as superfluous as the synchronous transactions for data blocks.

    Signed-off-by: Christoph Hellwig
    Reported-by: Bernd Schubert
    Reviewed-by: Dave Chinner
    Signed-off-by: Alex Elder
    Signed-off-by: Greg Kroah-Hartman

    Christoph Hellwig
     
  • commit c29f7d457ac63311feb11928a866efd2fe153d74 upstream.

    The i_ino field in the VFS inode is of type unsigned long and thus can't
    hold the full 64-bit inode number on 32-bit kernels. We have the full
    inode number in the XFS inode, so use that one for nfs exports. Note
    that I've also switched the 32-bit file handles types to it, just to make
    the code more consistent and copy & paste errors less likely to happen.
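
    The truncation can be shown with fixed-width types. This is only a sketch: the function name is hypothetical, and uint32_t stands in for unsigned long on a 32-bit kernel.

```c
#include <stdint.h>

/*
 * What a 32-bit 'unsigned long' i_ino would retain of a 64-bit inode
 * number: only the low 32 bits survive the assignment.
 */
static uint32_t ino_as_seen_in_32bit_i_ino(uint64_t full_ino)
{
    return (uint32_t)full_ino;
}
```

    Two distinct 64-bit inode numbers such as 0x80 and 0x100000080 map to the same truncated value, so a file handle built from the truncated number cannot identify the inode reliably.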

    Reported-by: Guoquan Yang
    Reported-by: Hank Peng
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Ben Myers
    Signed-off-by: Greg Kroah-Hartman

    Christoph Hellwig
     
  • This is for stable kernel branch 3.0 only. Previous and later versions
    have different code paths and are not affected by this bug.

    This is the same fix as "hwmon: (coretemp) Fix oops on driver load"
    but for the CPU offlining case. Sorry for missing it at first.

    Signed-off-by: Jean Delvare
    Cc: Durgadoss R
    Acked-by: Guenter Roeck
    Cc: Fenghua Yu
    Signed-off-by: Greg Kroah-Hartman

    Jean Delvare
     
  • commit 434a964daa14b9db083ce20404a4a2add54d037a upstream.

    Clement Lecigne reports a filesystem which causes a kernel oops in
    hfs_find_init() trying to dereference sb->ext_tree which is NULL.

    This proves to be because the filesystem has a corrupted MDB extent
    record, where the extents file does not fit into the first three extents
    in the file record (the first blocks).

    In hfs_get_block() when looking up the blocks for the extent file
    (HFS_EXT_CNID), it fails the first blocks special case, and falls
    through to the extent code (which ultimately calls hfs_find_init())
    which is in the process of being initialised.

    Hfs avoids this scenario by always having the extents b-tree fitting
    into the first blocks (the extents B-tree can't have overflow extents).

    The fix is to check at mount time that the B-tree fits into first
    blocks, i.e. fail if HFS_I(inode)->alloc_blocks >=
    HFS_I(inode)->first_blocks.

    Note, the existing commit 47f365eb57573 ("hfs: fix oops on mount with
    corrupted btree extent records") becomes subsumed into this as a special
    case, but only for the extents B-tree (HFS_EXT_CNID). It is perfectly
    acceptable for the catalog B-tree file to grow beyond three extents,
    with the remaining extent descriptors in the extents overflow.

    This fixes CVE-2011-2203

    Reported-by: Clement LECIGNE
    Signed-off-by: Phillip Lougher
    Cc: Jeff Mahoney
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Cc: Moritz Mühlenhoff
    Signed-off-by: Greg Kroah-Hartman

    Phillip Lougher
     
  • commit 1a51410abe7d0ee4b1d112780f46df87d3621043 upstream.

    Ok, this isn't optimal, since it means that 'iotop' needs admin
    capabilities, and we may have to work on this some more. But at the
    same time it is very much not acceptable to let anybody just read
    anybody else's IO statistics quite at this level.

    Use of the GENL_ADMIN_PERM suggested by Johannes Berg as an alternative
    to checking the capabilities by hand.

    Reported-by: Vasiliy Kulikov
    Cc: Johannes Berg
    Acked-by: Balbir Singh
    Signed-off-by: Linus Torvalds
    Cc: Moritz Mühlenhoff
    Signed-off-by: Greg Kroah-Hartman

    Linus Torvalds
     
  • commit 8762202dd0d6e46854f786bdb6fb3780a1625efe upstream.

    I hit a J_ASSERT(blocknr != 0) failure in cleanup_journal_tail() when
    mounting a fsfuzzed ext3 image. It turns out that the corrupted ext3
    image has s_first = 0 in journal superblock, and the 0 is passed to
    journal->j_head in journal_reset(), then to blocknr in
    cleanup_journal_tail(), in the end the J_ASSERT failed.

    So validate s_first after reading journal superblock from disk in
    journal_get_superblock() to ensure s_first is valid.
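
    A sketch of such a validation on a raw superblock buffer. It assumes s_first is a big-endian 32-bit field occupying bytes 20 to 23 of the journal superblock, which is consistent with the reproduction script below zeroing the byte at offset 23; the helper names are hypothetical and the real check may be stricter.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit value from a byte buffer. */
static uint32_t be32_at(const uint8_t *buf, size_t off)
{
    return ((uint32_t)buf[off] << 24) | ((uint32_t)buf[off + 1] << 16) |
           ((uint32_t)buf[off + 2] << 8) | (uint32_t)buf[off + 3];
}

/* Assumed offset of s_first within the on-disk journal superblock. */
#define SKETCH_S_FIRST_OFFSET 20

/* Reject a superblock whose first log block is 0 before it is used. */
static bool journal_first_block_valid(const uint8_t *jsb)
{
    return be32_at(jsb, SKETCH_S_FIRST_OFFSET) != 0;
}
```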

    The following script could reproduce it:

    fstype=ext3
    blocksize=1024
    img=$fstype.img
    offset=0
    found=0
    magic="c0 3b 39 98"

    dd if=/dev/zero of=$img bs=1M count=8
    mkfs -t $fstype -b $blocksize -F $img
    filesize=`stat -c %s $img`
    while [ $offset -lt $filesize ]
    do
    if od -j $offset -N 4 -t x1 $img | grep -i "$magic";then
    echo "Found journal: $offset"
    found=1
    break
    fi
    offset=`echo "$offset+$blocksize" | bc`
    done

    if [ $found -ne 1 ];then
    echo "Magic \"$magic\" not found"
    exit 1
    fi

    dd if=/dev/zero of=$img seek=$(($offset+23)) conv=notrunc bs=1 count=1

    mkdir -p ./mnt
    mount -o loop $img ./mnt

    Cc: Jan Kara
    Signed-off-by: Eryu Guan
    Signed-off-by: "Theodore Ts'o"
    Cc: Moritz Mühlenhoff
    Signed-off-by: Greg Kroah-Hartman

    Eryu Guan
     
  • commit 2ded6e6a94c98ea453a156748cb7fabaf39a76b9 upstream.

    When HPET is operating in RTC mode, the TN_ENABLE bit on timer1
    controls whether the HPET or the RTC delivers interrupts to irq8. When
    the system goes into suspend, the RTC driver sends a signal to the
    HPET driver so that the HPET releases control of irq8, allowing the
    RTC to wake the system from suspend. The switchover is accomplished by
    a write to the HPET configuration registers which currently only
    occurs while servicing the HPET interrupt.

    On some systems, I have seen the system suspend before an HPET
    interrupt occurs, preventing the write to the HPET configuration
    register and leaving the HPET in control of the irq8. As the HPET is
    not active during suspend, it does not generate a wake signal and RTC
    alarms do not work.

    This patch forces the HPET driver to immediately transfer control of
    the irq8 channel to the RTC instead of waiting until the next
    interrupt event.

    Signed-off-by: Mark Langsdorf
    Link: http://lkml.kernel.org/r/20111118153306.GB16319@alberich.amd.com
    Tested-by: Andreas Herrmann
    Signed-off-by: Andreas Herrmann
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Mark Langsdorf
     
  • commit e58f516ff4730c4047c3f104b061f7a03e9a263c upstream.

    When we can't configure the dma channel we want to fall
    back to PIO. We do this by setting host->do_dma to zero.
    This does not work as do_dma is used to see whether dma
    can be used for the current transfer. Instead, we have
    to set host->dma to NULL.

    Signed-off-by: Sascha Hauer
    Signed-off-by: Chris Ball
    Signed-off-by: Greg Kroah-Hartman

    Sascha Hauer
     
  • commit 0b57d7602b68f7b2786b2f0e22da39cbd4139a95 upstream.

    wait_for_completion_interruptible_timeout() may return negative value.
    In this case, checking if (t > 0) will return true if t is unsigned.
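
    The pitfall can be reproduced in isolation. In this sketch the function names are hypothetical and the negative value is passed in directly instead of coming from a real wait:

```c
#include <stdbool.h>

/* Buggy variant: the result is stored in an unsigned variable. */
static bool completed_buggy(long wait_result)
{
    unsigned long t = (unsigned long)wait_result; /* negative wraps around */

    return t > 0; /* true even for an error return */
}

/* Fixed variant: keep the result signed so errors stay negative. */
static bool completed_fixed(long wait_result)
{
    long t = wait_result;

    return t > 0;
}
```

    With an error return such as -512 the buggy variant reports success while the fixed variant does not; both agree for a timeout (0) and for a positive remaining time.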

    Signed-off-by: Axel Lin
    Acked-by: Lars-Peter Clausen
    Signed-off-by: Guenter Roeck
    Signed-off-by: Greg Kroah-Hartman

    Axel Lin
     
  • commit 13c07b0286d340275f2d97adf085cecda37ede37 upstream.

    Exactly like roundup_pow_of_two(1), the rounddown version was buggy for
    the case of a compile-time constant '1' argument. Probably because it
    originated from the same code, sharing history with the roundup version
    from before the bugfix (for that one, see commit 1a06a52ee1b0: "Fix
    roundup_pow_of_two(1)").

    However, unlike the roundup version, the fix for rounddown is to just
    remove the broken special case entirely. It's simply not needed - the
    generic code

    1UL << ilog2(n)

    does the right thing for the constant '1' argument too. The only reason
    roundup needed that special case was because rounding up does so by
    subtracting one from the argument (and then adding one to the result)
    causing the obvious problems with "ilog2(0)".

    But rounddown doesn't do any of that, since ilog2() naturally truncates
    (ie "rounds down") to the right rounded down value. And without the
    ilog2(0) case, there's no reason for the special case that had the wrong
    value.

    tl;dr: rounddown_pow_of_two(1) should be 1, not 0.
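
    The reasoning can be checked with a small user-space sketch. The *_sketch functions below are hypothetical stand-ins written for illustration; they are not the kernel macros from include/linux/log2.h.

```c
#include <assert.h>

/* Stand-in for ilog2(): index of the highest set bit of n. */
static unsigned int ilog2_sketch(unsigned long n)
{
    unsigned int log = 0;

    assert(n != 0); /* ilog2(0) has no meaningful value */
    while (n >>= 1)
        log++;
    return log;
}

/* Rounding down needs no special case: ilog2 already truncates. */
static unsigned long rounddown_pow_of_two_sketch(unsigned long n)
{
    return 1UL << ilog2_sketch(n);
}

/* Rounding up works on n - 1, so n == 1 must avoid ilog2(0). */
static unsigned long roundup_pow_of_two_sketch(unsigned long n)
{
    if (n == 1)
        return 1;
    return 1UL << (ilog2_sketch(n - 1) + 1);
}
```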

    Acked-by: Dmitry Torokhov
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Linus Torvalds
     
  • Upstream commit d305a6557b2c4dca0110f05ffe745b1ef94adb80.

    If an addBA response comes in just after addba_resp_timer has
    expired, mac80211 will still accept it and try to open the
    aggregation session. This confuses drivers and in some cases
    even crashes them.

    This patch fixes the race condition and makes sure that if
    addba_resp_timer has expired, the addBA response is no longer
    accepted and we do not try to open a half-closed session.

    Signed-off-by: Nikolay Martynov
    [some adjustments]
    Signed-off-by: Johannes Berg
    Signed-off-by: John W. Linville
    Signed-off-by: Greg Kroah-Hartman

    Nikolay Martynov
     
  • commit 34a5b4b6af104cf18eb50748509528b9bdbc4036 upstream.

    The ht40 setting should not change after association unless there is a
    channel switch.

    This fixes a problem where the uCode asserts because the driver
    sends invalid information that confuses the uCode.

    Here is the firmware assert message:
    kernel: iwlagn 0000:03:00.0: Microcode SW error detected. Restarting 0x82000000.
    kernel: iwlagn 0000:03:00.0: Loaded firmware version: 17.168.5.3 build 42301
    kernel: iwlagn 0000:03:00.0: Start IWL Error Log Dump:
    kernel: iwlagn 0000:03:00.0: Status: 0x000512E4, count: 6
    kernel: iwlagn 0000:03:00.0: 0x00002078 | ADVANCED_SYSASSERT
    kernel: iwlagn 0000:03:00.0: 0x00009514 | uPc
    kernel: iwlagn 0000:03:00.0: 0x00009496 | branchlink1
    kernel: iwlagn 0000:03:00.0: 0x00009496 | branchlink2
    kernel: iwlagn 0000:03:00.0: 0x0000D1F2 | interruptlink1
    kernel: iwlagn 0000:03:00.0: 0x00000000 | interruptlink2
    kernel: iwlagn 0000:03:00.0: 0x01008035 | data1
    kernel: iwlagn 0000:03:00.0: 0x0000C90F | data2
    kernel: iwlagn 0000:03:00.0: 0x000005A7 | line
    kernel: iwlagn 0000:03:00.0: 0x5080B520 | beacon time
    kernel: iwlagn 0000:03:00.0: 0xCC515AE0 | tsf low
    kernel: iwlagn 0000:03:00.0: 0x00000003 | tsf hi
    kernel: iwlagn 0000:03:00.0: 0x00000000 | time gp1
    kernel: iwlagn 0000:03:00.0: 0x29703BF0 | time gp2
    kernel: iwlagn 0000:03:00.0: 0x00000000 | time gp3
    kernel: iwlagn 0000:03:00.0: 0x000111A8 | uCode version
    kernel: iwlagn 0000:03:00.0: 0x000000B0 | hw version
    kernel: iwlagn 0000:03:00.0: 0x00480303 | board version
    kernel: iwlagn 0000:03:00.0: 0x09E8004E | hcmd
    kernel: iwlagn 0000:03:00.0: CSR values:
    kernel: iwlagn 0000:03:00.0: (2nd byte of CSR_INT_COALESCING is CSR_INT_PERIODIC_REG)
    kernel: iwlagn 0000:03:00.0: CSR_HW_IF_CONFIG_REG: 0X00480303
    kernel: iwlagn 0000:03:00.0: CSR_INT_COALESCING: 0X0000ff40
    kernel: iwlagn 0000:03:00.0: CSR_INT: 0X00000000
    kernel: iwlagn 0000:03:00.0: CSR_INT_MASK: 0X00000000
    kernel: iwlagn 0000:03:00.0: CSR_FH_INT_STATUS: 0X00000000
    kernel: iwlagn 0000:03:00.0: CSR_GPIO_IN: 0X00000030
    kernel: iwlagn 0000:03:00.0: CSR_RESET: 0X00000000
    kernel: iwlagn 0000:03:00.0: CSR_GP_CNTRL: 0X080403c5
    kernel: iwlagn 0000:03:00.0: CSR_HW_REV: 0X000000b0
    kernel: iwlagn 0000:03:00.0: CSR_EEPROM_REG: 0X07d60ffd
    kernel: iwlagn 0000:03:00.0: CSR_EEPROM_GP: 0X90000001
    kernel: iwlagn 0000:03:00.0: CSR_OTP_GP_REG: 0X00030001
    kernel: iwlagn 0000:03:00.0: CSR_GIO_REG: 0X00080044
    kernel: iwlagn 0000:03:00.0: CSR_GP_UCODE_REG: 0X000093bb
    kernel: iwlagn 0000:03:00.0: CSR_GP_DRIVER_REG: 0X00000000
    kernel: iwlagn 0000:03:00.0: CSR_UCODE_DRV_GP1: 0X00000000
    kernel: iwlagn 0000:03:00.0: CSR_UCODE_DRV_GP2: 0X00000000
    kernel: iwlagn 0000:03:00.0: CSR_LED_REG: 0X00000078
    kernel: iwlagn 0000:03:00.0: CSR_DRAM_INT_TBL_REG: 0X88214dd2
    kernel: iwlagn 0000:03:00.0: CSR_GIO_CHICKEN_BITS: 0X27800200
    kernel: iwlagn 0000:03:00.0: CSR_ANA_PLL_CFG: 0X00000000
    kernel: iwlagn 0000:03:00.0: CSR_HW_REV_WA_REG: 0X0001001a
    kernel: iwlagn 0000:03:00.0: CSR_DBG_HPET_MEM_REG: 0Xffff0010
    kernel: iwlagn 0000:03:00.0: FH register values:
    kernel: iwlagn 0000:03:00.0: FH_RSCSR_CHNL0_STTS_WPTR_REG: 0X21316d00
    kernel: iwlagn 0000:03:00.0: FH_RSCSR_CHNL0_RBDCB_BASE_REG: 0X021479c0
    kernel: iwlagn 0000:03:00.0: FH_RSCSR_CHNL0_WPTR: 0X00000060
    kernel: iwlagn 0000:03:00.0: FH_MEM_RCSR_CHNL0_CONFIG_REG: 0X80819104
    kernel: iwlagn 0000:03:00.0: FH_MEM_RSSR_SHARED_CTRL_REG: 0X000000fc
    kernel: iwlagn 0000:03:00.0: FH_MEM_RSSR_RX_STATUS_REG: 0X07030000
    kernel: iwlagn 0000:03:00.0: FH_MEM_RSSR_RX_ENABLE_ERR_IRQ2DRV: 0X00000000
    kernel: iwlagn 0000:03:00.0: FH_TSSR_TX_STATUS_REG: 0X07ff0001
    kernel: iwlagn 0000:03:00.0: FH_TSSR_TX_ERROR_REG: 0X00000000
    kernel: iwlagn 0000:03:00.0: Start IWL Event Log Dump: display last 20 entries
    kernel: ------------[ cut here ]------------
    WARNING: at net/mac80211/util.c:1208 ieee80211_reconfig+0x1f1/0x407()
    kernel: Hardware name: 4290W4H
    kernel: Pid: 1896, comm: kworker/0:0 Not tainted 3.1.0 #2
    kernel: Call Trace:
    kernel: [] ? warn_slowpath_common+0x73/0x87
    kernel: [] ? ieee80211_reconfig+0x1f1/0x407
    kernel: [] ? ieee80211_recalc_smps_work+0x32/0x32
    kernel: [] ? ieee80211_restart_work+0x7e/0x87
    kernel: [] ? process_one_work+0x1c8/0x2e3
    kernel: [] ? worker_thread+0x17a/0x23a
    kernel: [] ? manage_workers.clone.18+0x15b/0x15b
    kernel: [] ? manage_workers.clone.18+0x15b/0x15b
    kernel: [] ? kthread+0x7a/0x82
    kernel: [] ? kernel_thread_helper+0x4/0x10
    kernel: [] ? kthread_flush_work_fn+0x11/0x11
    kernel: [] ? gs_change+0xb/0xb

    Reported-by: Udo Steinberg
    Signed-off-by: Wey-Yi Guy
    Signed-off-by: John W. Linville
    Signed-off-by: Greg Kroah-Hartman

    Wey-Yi Guy
     
  • commit a855b84c3d8c73220d4d3cd392a7bee7c83de70e upstream.

    Percpu allocator recorded the cpus which map to the first and last
    units in pcpu_first/last_unit_cpu respectively and used them to
    determine the address range of a chunk - e.g. it assumed that the
    first unit has the lowest address in a chunk while the last unit has
    the highest address.

    This simply isn't true. Groups in a chunk can have arbitrary positive
    or negative offsets from the previous one and there is no guarantee
    that the first unit occupies the lowest offset while the last one the
    highest.

    Fix it by actually comparing unit offsets to determine cpus occupying
    the lowest and highest offsets. Also, rename pcpu_first/last_unit_cpu
    to pcpu_low/high_unit_cpu to avoid confusion.
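
    A sketch of the comparison, assuming a plain array that holds one signed unit offset per cpu. The structure and function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Indices of the cpus whose units sit at the lowest and highest offsets. */
struct unit_range_sketch {
    size_t low_cpu;
    size_t high_cpu;
};

/*
 * Compare the offsets themselves instead of assuming that the first unit
 * is the lowest and the last unit is the highest.
 */
static struct unit_range_sketch find_low_high(const long *unit_off,
                                              size_t nr_cpus)
{
    struct unit_range_sketch r = { 0, 0 };
    size_t cpu;

    assert(nr_cpus > 0);
    for (cpu = 1; cpu < nr_cpus; cpu++) {
        if (unit_off[cpu] < unit_off[r.low_cpu])
            r.low_cpu = cpu;
        if (unit_off[cpu] > unit_off[r.high_cpu])
            r.high_cpu = cpu;
    }
    return r;
}
```

    For offsets such as { 4096, 0, -8192, 12288 } the lowest unit belongs to cpu 2 and the highest to cpu 3, neither of which is the first or last unit in mapping order by necessity.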

    The chunk address range is used to flush cache on vmalloc area
    map/unmap and decide whether a given address is in the first chunk by
    per_cpu_ptr_to_phys() and the bug was discovered by invalid
    per_cpu_ptr_to_phys() translation for crash_note.

    Kudos to Dave Young for tracking down the problem.

    Signed-off-by: Tejun Heo
    Reported-by: WANG Cong
    Reported-by: Dave Young
    Tested-by: Dave Young
    LKML-Reference:
    Signed-off-by: Thomas Renninger
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • commit 4399c8bf2b9093696fa8160d79712e7346989c46 upstream.

    If target_level == 0, current code breaks out of the while-loop if
    SUPERPAGE bit is set. We should also break out if PTE is not present.
    If we don't do this, KVM calls to iommu_iova_to_phys() will cause
    pfn_to_dma_pte() to create mapping for 4KiB pages.

    Signed-off-by: Allen Kay
    Signed-off-by: David Woodhouse
    Signed-off-by: Youquan Song
    Signed-off-by: Greg Kroah-Hartman

    Allen Kay
     
  • commit 8140a95d228efbcd64d84150e794761a32463947 upstream.

    Set the dmar->iommu_superpage field to the smallest common denominator
    of super page sizes supported by all active VT-d engines. Initialize
    this field in intel_iommu_domain_init() API so intel_iommu_map() API
    will be able to use iommu_superpage field to determine the appropriate
    super page size to use.

    Signed-off-by: Allen Kay
    Signed-off-by: David Woodhouse
    Signed-off-by: Youquan Song
    Signed-off-by: Greg Kroah-Hartman

    Allen Kay
     
  • commit 292827cb164ad00cc7689a21283b1261c0b6daed upstream.

    iommu_unmap() API expects IOMMU drivers to return the actual page order
    of the address being unmapped. Previous code was just returning page
    order passed in from the caller. This patch fixes this problem.

    Signed-off-by: Allen Kay
    Signed-off-by: David Woodhouse
    Signed-off-by: Youquan Song
    Signed-off-by: Greg Kroah-Hartman

    Allen Kay
     
  • commit 9b5cd7f37e1e018432111333e2a67f78ba41edfe upstream.

    SBC-3 says:

    A TRANSFER LENGTH field set to zero specifies that 256 logical
    blocks shall be written. Any other value specifies the number
    of logical blocks that shall be written.

    The old code was always just returning the value in the TRANSFER LENGTH
    byte. Fix this to return 256 if the byte is 0.
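
    In code the rule is a one-line special case. The sketch assumes a 6-byte CDB in which TRANSFER LENGTH is byte 4; the function name is hypothetical:

```c
#include <stdint.h>

/* Number of logical blocks requested by a 6-byte CDB. */
static uint32_t transfer_length_6(const uint8_t *cdb)
{
    /* A TRANSFER LENGTH byte of zero means 256 blocks, not zero. */
    return cdb[4] ? cdb[4] : 256;
}
```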

    Signed-off-by: Roland Dreier
    Signed-off-by: Nicholas Bellinger
    Signed-off-by: Greg Kroah-Hartman

    Roland Dreier
     
  • commit 02125a826459a6ad142f8d91c5b6357562f96615 upstream.

    __d_path() API is asking for trouble and in case of apparmor d_namespace_path()
    getting just that. The root cause is that when __d_path() misses the root
    it had been told to look for, it stores the location of the most remote ancestor
    in *root. Without grabbing references. Sure, at the moment of call it had
    been pinned down by what we have in *path. And if we raced with umount -l, we
    could have very well stopped at vfsmount/dentry that got freed as soon as
    prepend_path() dropped vfsmount_lock.

    It is safe to compare these pointers with pre-existing (and known to be still
    alive) vfsmount and dentry, as long as all we are asking is "is it the same
    address?". Dereferencing is not safe and apparmor ended up stepping into
    that. d_namespace_path() really wants to examine the place where we stopped,
    even if it's not connected to our namespace. As the result, it looked
    at ->d_sb->s_magic of a dentry that might've been already freed by that point.
    All other callers had been careful enough to avoid that, but it's really
    a bad interface - it invites that kind of trouble.

    The fix is fairly straightforward, even though it's bigger than I'd like:
    * prepend_path() root argument becomes const.
    * __d_path() is never called with NULL/NULL root. It was a kludge
    to start with. Instead, we have an explicit function - d_absolute_path().
    Same as __d_path(), except that it doesn't get root passed and stops where
    it stops. apparmor and tomoyo are using it.
    * __d_path() returns NULL on path outside of root. The main
    caller is show_mountinfo() and that's precisely what we pass root for - to
    skip those outside chroot jail. Those who don't want that can (and do)
    use d_path().
    * __d_path() root argument becomes const. Everyone agrees, I hope.
    * apparmor does *NOT* try to use __d_path() or any of its variants
    when it sees that path->mnt is an internal vfsmount. In that case it's
    definitely not mounted anywhere and dentry_path() is exactly what we want
    there. Handling of sysctl()-triggered weirdness is moved to that place.
    * if apparmor is asked to do pathname relative to chroot jail
    and __d_path() tells it it's not in that jail, the sucker just calls
    d_absolute_path() instead. That's the other remaining caller of __d_path(),
    BTW.
    * seq_path_root() does _NOT_ return -ENAMETOOLONG (it's stupid anyway -
    the normal seq_file logics will take care of growing the buffer and redoing
    the call of ->show() just fine). However, if it gets path not reachable
    from root, it returns SEQ_SKIP. The only caller adjusted (i.e. stopped
    ignoring the return value as it used to do).

    Reviewed-by: John Johansen
    ACKed-by: John Johansen
    Signed-off-by: Al Viro
    Signed-off-by: Greg Kroah-Hartman

    Al Viro
     
  • commit 1368edf0647ac112d8cfa6ce47257dc950c50f5c upstream.

    Commit f5252e00 ("mm: avoid null pointer access in vm_struct via
    /proc/vmallocinfo") adds newly allocated vm_structs to the vmlist after
    it is fully initialised. Unfortunately, it did not check that
    __vmalloc_area_node() successfully populated the area. In the event of
    allocation failure, the vmalloc area is freed but the pointer to freed
    memory is inserted into the vmlist leading to a crash later in
    get_vmalloc_info().

    This patch adds a check for __vmalloc_area_node() failure within
    __vmalloc_node_range(). It does not use "goto fail" as in the previous
    error path as a warning was already displayed by __vmalloc_area_node()
    before it called vfree in its failure path.

    Credit goes to Luciano Chavez for doing all the real work of identifying
    exactly where the problem was.

    Signed-off-by: Mel Gorman
    Reported-by: Luciano Chavez
    Tested-by: Luciano Chavez
    Reviewed-by: Rik van Riel
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Mel Gorman
     
  • commit d021563888312018ca65681096f62e36c20e63cc upstream.

    setup_zone_migrate_reserve() expects that zone->start_pfn starts at
    pageblock_nr_pages aligned pfn otherwise we could access beyond an
    existing memblock resulting in the following panic if
    CONFIG_HOLES_IN_ZONE is not configured and we do not check pfn_valid:

    IP: [] setup_zone_migrate_reserve+0xcd/0x180
    *pdpt = 0000000000000000 *pde = f000ff53f000ff53
    Oops: 0000 [#1] SMP
    Pid: 1, comm: swapper Not tainted 3.0.7-0.7-pae #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
    EIP: 0060:[] EFLAGS: 00010006 CPU: 0
    EIP is at setup_zone_migrate_reserve+0xcd/0x180
    EAX: 000c0000 EBX: f5801fc0 ECX: 000c0000 EDX: 00000000
    ESI: 000c01fe EDI: 000c01fe EBP: 00140000 ESP: f2475f58
    DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
    Process swapper (pid: 1, ti=f2474000 task=f2472cd0 task.ti=f2474000)
    Call Trace:
    [] __setup_per_zone_wmarks+0xec/0x160
    [] setup_per_zone_wmarks+0xf/0x20
    [] init_per_zone_wmark_min+0x27/0x86
    [] do_one_initcall+0x2b/0x160
    [] kernel_init+0xbe/0x157
    [] kernel_thread_helper+0x6/0xd
    Code: a5 39 f5 89 f7 0f 46 fd 39 cf 76 40 8b 03 f6 c4 08 74 32 eb 91 90 89 c8 c1 e8 0e 0f be 80 80 2f 86 c0 8b 14 85 60 2f 86 c0 89 c8 82 b4 12 00 00 c1 e0 05 03 82 ac 12 00 00 8b 00 f6 c4 08 0f
    EIP: [] setup_zone_migrate_reserve+0xcd/0x180 SS:ESP 0068:f2475f58
    CR2: 00000000000012b4

    We crashed in pageblock_is_reserved() when accessing pfn 0xc0000 because
    highstart_pfn = 0x36ffe.

    The issue was introduced in 3.0-rc1 by 6d3163ce ("mm: check if any page
    in a pageblock is reserved before marking it MIGRATE_RESERVE").

    Make sure that start_pfn is always aligned to pageblock_nr_pages to
    ensure that pfn_valid is always called at the start of each pageblock.
    Architectures with holes in pageblocks will be correctly handled by
    pfn_valid_within in pageblock_is_reserved.

    Signed-off-by: Michal Hocko
    Signed-off-by: Mel Gorman
    Tested-by: Dang Bo
    Reviewed-by: KAMEZAWA Hiroyuki
    Cc: Andrea Arcangeli
    Cc: David Rientjes
    Cc: Arve Hjønnevåg
    Cc: KOSAKI Motohiro
    Cc: John Stultz
    Cc: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Michal Hocko
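
    Aligning a start pfn to the pageblock size, as the commit above
    describes, amounts to rounding up to a multiple of a power of two. The
    sketch below is illustrative only: the helper name is hypothetical and
    the pageblock size of 1024 pages is an assumed value, since the real
    pageblock_nr_pages depends on the kernel configuration.

```c
#include <assert.h>

/* Assumed value for the example; the real one is configuration dependent. */
#define EXAMPLE_PAGEBLOCK_NR_PAGES 1024UL

/* Round pfn up to the next multiple of nr_pages (must be a power of two). */
static unsigned long pfn_round_up(unsigned long pfn, unsigned long nr_pages)
{
    assert(nr_pages != 0 && (nr_pages & (nr_pages - 1)) == 0);
    return (pfn + nr_pages - 1) & ~(nr_pages - 1);
}
```

    Taking the highstart_pfn value from the report as sample input,
    pfn_round_up(0x36ffe, EXAMPLE_PAGEBLOCK_NR_PAGES) gives 0x37000, so a
    walk in pageblock-sized steps starts on a block boundary instead of in
    the middle of a block.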
     
  • commit d68fb11c3dae75c8331538dcf083a65e697cc034 upstream.

    The clock_getres() function must return the resolution in the timespec
    argument and return 0 for success.

    Signed-off-by: Thomas Gleixner
    Acked-by: John Stultz
    Cc: Richard Cochran
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
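
    The contract stated in the commit above (resolution goes into the
    timespec argument, return value is 0 on success) can be sketched with a
    toy callback. The function name and the 1 ns resolution are assumptions
    for illustration; this is not the driver's implementation.

```c
#define _POSIX_C_SOURCE 200809L /* struct timespec under strict C modes */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Illustrative resolution; a real clock would report its own value. */
#define EXAMPLE_RESOLUTION_NS 1L

/*
 * Hypothetical getres callback: the resolution is written to *tp and the
 * return value only signals success (0), it never carries the resolution.
 */
static int example_clock_getres(struct timespec *tp)
{
    assert(tp != NULL);
    tp->tv_sec = 0;
    tp->tv_nsec = EXAMPLE_RESOLUTION_NS;
    return 0;
}
```

    A callback that returned the resolution as its return value and left
    *tp untouched would look like a failure to callers that only test for
    a zero return.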
     
  • commit 58a84aa92723d1ac3e1cc4e3b0ff49291663f7e1 upstream.

    Commit 70b50f94f1644 ("mm: thp: tail page refcounting fix") keeps all
    page_tail->_count zero at all times. But the current kernel does not
    set page_tail->_count to zero if a 1GB page is utilized. So when an
    IOMMU 1GB page is used by KVM, it will result in a kernel oops because a
    tail page's _count does not equal zero.

    kernel BUG at include/linux/mm.h:386!
    invalid opcode: 0000 [#1] SMP
    Call Trace:
    gup_pud_range+0xb8/0x19d
    get_user_pages_fast+0xcb/0x192
    ? trace_hardirqs_off+0xd/0xf
    hva_to_pfn+0x119/0x2f2
    gfn_to_pfn_memslot+0x2c/0x2e
    kvm_iommu_map_pages+0xfd/0x1c1
    kvm_iommu_map_memslots+0x7c/0xbd
    kvm_iommu_map_guest+0xaa/0xbf
    kvm_vm_ioctl_assigned_device+0x2ef/0xa47
    kvm_vm_ioctl+0x36c/0x3a2
    do_vfs_ioctl+0x49e/0x4e4
    sys_ioctl+0x5a/0x7c
    system_call_fastpath+0x16/0x1b
    RIP gup_huge_pud+0xf2/0x159

    Signed-off-by: Youquan Song
    Reviewed-by: Andrea Arcangeli
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Youquan Song
     
  • commit b6999b19120931ede364fa3b685e698a61fed31d upstream.

    With the 3.2-rc kernel, IOMMU 2M pages in KVM work. But when I tried
    to use IOMMU 1GB pages in KVM, I encountered an oops and the 1GB page
    failed to be used.

    The root cause is that 1GB page allocation calls gup_huge_pud() while 2M
    page allocation calls gup_huge_pmd(). If compound pages are used and the
    page is a tail page, gup_huge_pmd() increases _mapcount to record that
    the tail page is mapped, while gup_huge_pud() does not.

    So when the mapped page is released, it will result in a kernel oops
    because the page is not marked mapped.

    This patch adds tail page handling for compound pages to the 1GB huge
    page path, matching what the 2M page path already does.

    Reproduce like:
    1. Add grub boot option: hugepagesz=1G hugepages=8
    2. mount -t hugetlbfs -o pagesize=1G hugetlbfs /dev/hugepages
    3. qemu-kvm -m 2048 -hda os-kvm.img -cpu kvm64 -smp 4 -mem-path /dev/hugepages
    -net none -device pci-assign,host=07:00.1

    kernel BUG at mm/swap.c:114!
    invalid opcode: 0000 [#1] SMP
    Call Trace:
    put_page+0x15/0x37
    kvm_release_pfn_clean+0x31/0x36
    kvm_iommu_put_pages+0x94/0xb1
    kvm_iommu_unmap_memslots+0x80/0xb6
    kvm_assign_device+0xba/0x117
    kvm_vm_ioctl_assigned_device+0x301/0xa47
    kvm_vm_ioctl+0x36c/0x3a2
    do_vfs_ioctl+0x49e/0x4e4
    sys_ioctl+0x5a/0x7c
    system_call_fastpath+0x16/0x1b
    RIP put_compound_page+0xd4/0x168

    Signed-off-by: Youquan Song
    Reviewed-by: Andrea Arcangeli
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Youquan Song