16 Dec, 2011

3 commits

  • …kernel/git/konrad/xen

    * 'stable/for-linus-fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen/swiotlb: Use page alignment for early buffer allocation.
    xen: only limit memory map to maximum reservation for domain 0.

    Linus Torvalds
     
  • This fixes an odd bug found on a Dell PowerEdge 1850/0RC130
    (BIOS A05 01/09/2006) where all of the modules doing pci_set_dma_mask
    would fail with:

    ata_piix 0000:00:1f.1: enabling device (0005 -> 0007)
    ata_piix 0000:00:1f.1: can't derive routing for PCI INT A
    ata_piix 0000:00:1f.1: BMDMA: failed to set dma mask, falling back to PIO

    The issue was that the Xen-SWIOTLB was allocated such that the end of
    the buffer was straddling a page (and was also above 4GB). The fix,
    spotted by Kalev Leonid, was to piggyback on git commit
    e79f86b2ef9c0a8c47225217c1018b7d3d90101c "swiotlb: Use page alignment
    for early buffer allocation", which says:

    We could call free_bootmem_late() if swiotlb is not used, and
    it will shrink to page alignment.

    So alloc them with page alignment at first, to avoid losing two pages

    And doing that fixes the outstanding issue.

    CC: stable@kernel.org
    Suggested-by: "Kalev, Leonid"
    Reported-and-Tested-by: "Taylor, Neal E"
    Signed-off-by: Konrad Rzeszutek Wilk

    Konrad Rzeszutek Wilk
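
The page-alignment idea behind the xen/swiotlb entry above can be sketched in a few lines of C. This is only an illustration, not the kernel's code: the `sketch_` names and the 4 KiB page size are assumptions made for the example. It rounds a buffer size up to a whole number of pages and checks whether the end of a buffer lands in the middle of a page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumption: 4 KiB pages */

/* Round a byte count up to the next multiple of the page size. */
static uint64_t sketch_page_align(uint64_t bytes)
{
    /* Guard against wrap-around in the addition below. */
    assert(bytes <= UINT64_MAX - (SKETCH_PAGE_SIZE - 1));
    return (bytes + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* True when the range [start, start + bytes) ends in the middle of a page. */
static bool sketch_end_straddles_page(uint64_t start, uint64_t bytes)
{
    return ((start + bytes) & (SKETCH_PAGE_SIZE - 1)) != 0;
}
```

With a page-aligned start, a 5000-byte buffer ends mid-page, while the same request rounded up with `sketch_page_align()` ends exactly on a page boundary.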
     
  • d312ae878b6a "xen: use maximum reservation to limit amount of usable RAM"
    clamped the total amount of RAM to the current maximum reservation. This is
    correct for dom0 but is not correct for guest domains. In order to boot a guest
    "pre-ballooned" (e.g. with memory=1G but maxmem=2G) in order to allow for
    future memory expansion the guest must derive max_pfn from the e820 provided by
    the toolstack and not the current maximum reservation (which can reflect only
    the current maximum, not the guest lifetime max). The existing algorithm
    already behaves correctly if we do not artificially limit the maximum
    number of pages for the guest case.

    For a guest booted with maxmem=512, memory=128 this results in:
    [ 0.000000] BIOS-provided physical RAM map:
    [ 0.000000] Xen: 0000000000000000 - 00000000000a0000 (usable)
    [ 0.000000] Xen: 00000000000a0000 - 0000000000100000 (reserved)
    -[ 0.000000] Xen: 0000000000100000 - 0000000008100000 (usable)
    -[ 0.000000] Xen: 0000000008100000 - 0000000020800000 (unusable)
    +[ 0.000000] Xen: 0000000000100000 - 0000000020800000 (usable)
    ...
    [ 0.000000] NX (Execute Disable) protection: active
    [ 0.000000] DMI not present or invalid.
    [ 0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
    [ 0.000000] e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
    -[ 0.000000] last_pfn = 0x8100 max_arch_pfn = 0x1000000
    +[ 0.000000] last_pfn = 0x20800 max_arch_pfn = 0x1000000
    [ 0.000000] initial memory mapped : 0 - 027ff000
    [ 0.000000] Base memory trampoline at [c009f000] 9f000 size 4096
    -[ 0.000000] init_memory_mapping: 0000000000000000-0000000008100000
    -[ 0.000000] 0000000000 - 0008100000 page 4k
    -[ 0.000000] kernel direct mapping tables up to 8100000 @ 27bb000-27ff000
    +[ 0.000000] init_memory_mapping: 0000000000000000-0000000020800000
    +[ 0.000000] 0000000000 - 0020800000 page 4k
    +[ 0.000000] kernel direct mapping tables up to 20800000 @ 26f8000-27ff000
    [ 0.000000] xen: setting RW the range 27e8000 - 27ff000
    [ 0.000000] 0MB HIGHMEM available.
    -[ 0.000000] 129MB LOWMEM available.
    -[ 0.000000] mapped low ram: 0 - 08100000
    -[ 0.000000] low ram: 0 - 08100000
    +[ 0.000000] 520MB LOWMEM available.
    +[ 0.000000] mapped low ram: 0 - 20800000
    +[ 0.000000] low ram: 0 - 20800000

    With this change "xl mem-set 512M" will successfully increase the
    guest RAM (by reducing the balloon).

    There is no change for dom0.

    Reported-and-Tested-by: George Shuklin
    Signed-off-by: Ian Campbell
    Cc: stable@kernel.org
    Reviewed-by: David Vrabel
    Signed-off-by: Konrad Rzeszutek Wilk

    Ian Campbell
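
The numbers in the boot log of the entry above can be checked with a little arithmetic. The sketch below is hypothetical (the `sketch_` helpers are not kernel functions, and 4 KiB pages are assumed): it converts the end of the usable range to a page frame number and to MiB, and models the rule that only dom0 is clamped to the current maximum reservation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical model of the rule: clamp to the reservation only for dom0. */
static uint64_t sketch_usable_pages(uint64_t e820_pages,
                                    uint64_t max_reservation_pages,
                                    bool is_dom0)
{
    if (is_dom0 && max_reservation_pages < e820_pages)
        return max_reservation_pages;
    return e820_pages;
}

static uint64_t sketch_bytes_to_pfn(uint64_t bytes)
{
    return bytes >> SKETCH_PAGE_SHIFT;
}

static uint64_t sketch_bytes_to_mib(uint64_t bytes)
{
    return bytes >> 20;
}

/* Cross-check the figures quoted in the log excerpt. */
static void sketch_check_log_numbers(void)
{
    assert(sketch_bytes_to_pfn(0x20800000ULL) == 0x20800); /* new last_pfn */
    assert(sketch_bytes_to_pfn(0x8100000ULL) == 0x8100);   /* old last_pfn */
    assert(sketch_bytes_to_mib(0x20800000ULL) == 520);     /* new LOWMEM */
    assert(sketch_bytes_to_mib(0x8100000ULL) == 129);      /* old LOWMEM */
}
```

In this model a guest keeps all 0x20800 pages from its e820 map, while dom0 with the same inputs would still be limited to 0x8100.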
     

15 Dec, 2011

4 commits


14 Dec, 2011

16 commits

  • Fixes:
    https://bugs.freedesktop.org/show_bug.cgi?id=43739

    Signed-off-by: Alex Deucher
    Cc: stable@kernel.org
    Signed-off-by: Dave Airlie

    Alex Deucher
     
  • The label 'out_bdi' should be followed by bdi_destroy() instead of
    fput() which should be after the 'out_fput' label.

    If bdi_setup_and_register() fails then jump to the 'out_fput' label
    instead of the 'out_bdi' one.

    If fget(data.info_fd) fails then jump to the previously fixed 'out_bdi'
    label to call bdi_destroy() otherwise the bdi object will not be
    destroyed.

    Compile tested only.

    Signed-off-by: Djalal Harouni
    Signed-off-by: Al Viro

    Djalal Harouni
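
The label ordering described in the entry above follows the usual C "goto unwind" pattern: cleanup labels appear in reverse order of acquisition, and each failure jumps to the label that undoes exactly what has been set up so far. The sketch below is a stand-alone illustration under that assumption; `struct sketch_state` and `sketch_setup()` are hypothetical, and the booleans merely stand in for the file reference and the bdi object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping so the unwind order can be observed. */
struct sketch_state {
    bool file_held; /* stands in for a reference taken with fget() */
    bool bdi_ready; /* stands in for a successfully registered bdi */
};

static int sketch_setup(struct sketch_state *s, bool bdi_ok, bool second_ok)
{
    int err;

    assert(s != NULL);
    s->file_held = true;       /* step 1: take the file reference */

    if (!bdi_ok) {             /* step 2 failed: only the file to undo */
        err = -1;
        goto out_fput;
    }
    s->bdi_ready = true;

    if (!second_ok) {          /* step 3 failed: undo the bdi, then the file */
        err = -2;
        goto out_bdi;
    }
    return 0;

out_bdi:
    s->bdi_ready = false;      /* stands in for bdi_destroy() */
out_fput:
    s->file_held = false;      /* stands in for fput() */
    return err;
}
```

Because `out_bdi` falls through into `out_fput`, a late failure releases both resources and an early failure releases only the one already held.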
     
  • We need to zero out the part of a page which is beyond EOF before setting
    uptodate; otherwise, mapread or write will see non-zero data beyond EOF.

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Yongqiang Yang
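
As a rough illustration of the entry above, zeroing the tail of a page past EOF amounts to a bounded `memset`. The helper below is a user-space sketch with a hypothetical name, operating on a plain byte buffer rather than a kernel page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Zero every byte of the buffer that lies at or beyond valid_len. */
static void sketch_zero_past_eof(unsigned char *page, size_t page_size,
                                 size_t valid_len)
{
    assert(page != NULL);
    if (valid_len < page_size)
        memset(page + valid_len, 0, page_size - valid_len);
}
```

Bytes before `valid_len` are left untouched, so only stale data beyond the end of the file is cleared.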
     
  • If a file is fallocated on a hole, map->m_lblk + map->m_len may be greater
    than ee_block + ee_len.

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Yongqiang Yang
     
  • If a page has been read into memory and never been written, it has no
    buffers, but we should handle the page in truncate or punch hole.

    VFS code of writing operations has handled holes correctly, so this
    patch removes the code handling holes in writing operations.

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Yongqiang Yang
     
  • If there is an unwritten but clean buffer in a page and there is a
    dirty buffer after the buffer, then mpage_submit_io does not write the
    dirty buffer out. As a result, da_writepages loops forever.

    This patch fixes the problem by checking dirty flag.

    Signed-off-by: Yongqiang Yang
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Yongqiang Yang
     
  • If the pte mapping in generic_perform_write() is unmapped between
    iov_iter_fault_in_readable() and iov_iter_copy_from_user_atomic(), the
    "copied" parameter to ->end_write can be zero. ext4 couldn't cope with
    it with delayed allocations enabled. This skips the i_disksize
    enlargement logic if copied is zero and no new data was appended to
    the inode.

    gdb> bt
    #0 0xffffffff811afe80 in ext4_da_should_update_i_disksize (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x1\
    08000, len=0x1000, copied=0x0, page=0xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2467
    #1 ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
    xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
    #2 0xffffffff810d97f1 in generic_perform_write (iocb=, iov=, nr_segs=, pos=0x108000, ppos=0xffff88001e26be40, count=, written=0x0) at mm/filemap.c:2440
    #3 generic_file_buffered_write (iocb=, iov=, nr_segs=, p\
    os=0x108000, ppos=0xffff88001e26be40, count=, written=0x0) at mm/filemap.c:2482
    #4 0xffffffff810db5d1 in __generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, ppos=0\
    xffff88001e26be40) at mm/filemap.c:2600
    #5 0xffffffff810db853 in generic_file_aio_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=, pos=) at mm/filemap.c:2632
    #6 0xffffffff811a71aa in ext4_file_write (iocb=0xffff88001e26bde8, iov=0xffff88001e26bec8, nr_segs=0x1, pos=0x108000) a\
    t fs/ext4/file.c:136
    #7 0xffffffff811375aa in do_sync_write (filp=0xffff88003f606a80, buf=, len=, \
    ppos=0xffff88001e26bf48) at fs/read_write.c:406
    #8 0xffffffff81137e56 in vfs_write (file=0xffff88003f606a80, buf=0x1ec2960, count=0x4\
    000, pos=0xffff88001e26bf48) at fs/read_write.c:435
    #9 0xffffffff8113816c in sys_write (fd=, buf=0x1ec2960, count=0x\
    4000) at fs/read_write.c:487
    #10
    #11 0x00007f120077a390 in __brk_reservation_fn_dmi_alloc__ ()
    #12 0x0000000000000000 in ?? ()
    gdb> print offset
    $22 = 0xffffffffffffffff
    gdb> print idx
    $23 = 0xffffffff
    gdb> print inode->i_blkbits
    $24 = 0xc
    gdb> up
    #1 ext4_da_write_end (file=0xffff88003f606a80, mapping=0xffff88001d3824e0, pos=0x108000, len=0x1000, copied=0x0, page=0\
    xffffea0000d792e8, fsdata=0x0) at fs/ext4/inode.c:2512
    2512 if (ext4_da_should_update_i_disksize(page, end)) {
    gdb> print start
    $25 = 0x0
    gdb> print end
    $26 = 0xffffffffffffffff
    gdb> print pos
    $27 = 0x108000
    gdb> print new_i_size
    $28 = 0x108000
    gdb> print ((struct ext4_inode_info *)((char *)inode-((int)(&((struct ext4_inode_info *)0)->vfs_inode))))->i_disksize
    $29 = 0xd9000
    gdb> down
    2467 for (i = 0; i < idx; i++)
    gdb> print i
    $30 = 0xd44acbee

    This is 100% reproducible with some autonuma development code tuned in
    a very aggressive manner (not normal way even for knumad) which does
    "exotic" changes to the ptes. It wouldn't normally trigger but I don't
    see why it can't happen normally if the page is added to swap cache in
    between the two faults leading to "copied" being zero (which then
    hangs in ext4). So it should be fixed. Especially possible with lumpy
    reclaim (albeit disabled if compaction is enabled) as that would
    ignore the young bits in the ptes.

    Signed-off-by: Andrea Arcangeli
    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Andrea Arcangeli
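
The gdb values in the entry above (`end = 0xffffffffffffffff`, `idx = 0xffffffff`) are what unsigned wrap-around produces when `copied` is zero at the start of a page. The sketch below reproduces just that arithmetic with fixed-width types; the `sketch_` helpers and the simple `copied != 0` guard are assumptions for illustration, not the actual ext4 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumption: 4 KiB pages */

/* In-page offset of the last byte written: start + copied - 1.
 * With start == 0 and copied == 0 this wraps to the maximum value. */
static uint64_t sketch_last_byte(uint64_t pos, uint64_t copied)
{
    uint64_t start = pos & (SKETCH_PAGE_SIZE - 1);
    return start + copied - 1;
}

/* Block index within the page, truncated to 32 bits. */
static uint32_t sketch_block_index(uint64_t offset, unsigned int blkbits)
{
    assert(blkbits < 64);
    return (uint32_t)(offset >> blkbits);
}

/* Hypothetical guard: skip the size check when nothing was copied. */
static bool sketch_should_check_disksize(uint64_t copied)
{
    return copied != 0;
}
```

With `pos = 0x108000` and `copied = 0`, the last-byte offset wraps and the resulting block index is far larger than the number of buffers in a page, which matches the runaway loop shown in the trace.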
     
  • * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    Revert "x86, efi: Calling __pa() with an ioremap()ed address is invalid"
    x86, efi: Make efi_call_phys_{prelog,epilog} CONFIG_RELOCATABLE-aware

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    ceph: add missing spin_unlock at ceph_mdsc_build_path()
    ceph: fix SEEK_CUR, SEEK_SET regression
    crush: fix mapping calculation when force argument doesn't exist
    ceph: use i_ceph_lock instead of i_lock
    rbd: remove buggy rollback functionality
    rbd: return an error when an invalid header is read
    ceph: fix rasize reporting by ceph_show_options

    Linus Torvalds
     
  • * 'writeback-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux:
    writeback: set max_pause to lowest value on zero bdi_dirty
    writeback: permit through good bdi even when global dirty exceeded
    writeback: comment on the bdi dirty threshold
    fs: Make write(2) interruptible by a fatal signal
    writeback: Fix issue on make htmldocs

    Linus Torvalds
     
  • one of the paths was missing spin_unlock

    Signed-off-by: Yehuda Sadeh

    Yehuda Sadeh
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • same story as with ubifs

    Signed-off-by: Al Viro

    Al Viro
     
  • doing that before you are ready to handle mount() is a Bad Idea(tm)...

    Signed-off-by: Al Viro

    Al Viro
     
  • * 'fixes' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm:
    ARM: 7204/1: arch/arm/kernel/setup.c: initialize arm_dma_zone_size earlier
    ARM: 7185/1: perf: don't assign platform_device on unsupported CPUs
    ARM: 7187/1: fix unwinding for XIP kernels
    ARM: 7186/1: fix Kconfig issue with PHYS_OFFSET and !MMU

    Linus Torvalds
     
  • Commit 06222e491e663dac939f04b125c9dc52126a75c4 got the if wrong so that
    it always evaluates as true. This is semantically harmless, but makes
    SEEK_CUR and SEEK_SET needlessly query the server.

    Rewrite the if to explicitly enumerate the cases we DO need a valid i_size
    to make this code less fragile.

    Reported-by: Roel Kluin
    Signed-off-by: Sage Weil

    Sage Weil
     

13 Dec, 2011

12 commits

  • Fix race between lseek(fd, 0, SEEK_CUR) and read/write. This was fixed in
    generic code by commit 5b6f1eb97d (vfs: lseek(fd, 0, SEEK_CUR) race condition).

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • The test in fuse_file_llseek() "not SEEK_CUR or not SEEK_SET" always evaluates
    to true.

    This was introduced in 3.1 by commit 06222e49 (fs: handle SEEK_HOLE/SEEK_DATA
    properly in all fs's that define their own llseek) and changed the behavior of
    SEEK_CUR and SEEK_SET to always retrieve the file attributes. This is a
    performance regression.

    Fix the test so that it makes sense.

    Signed-off-by: Miklos Szeredi
    CC: stable@vger.kernel.org
    CC: Josef Bacik
    CC: Al Viro

    Roel Kluin
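
The "not SEEK_CUR or not SEEK_SET" test from the entry above is a classic tautology: no value can equal both constants, so at least one inequality always holds. The sketch below contrasts that shape with one possible corrected form; the function names are hypothetical and the real patches may be written differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h> /* SEEK_SET, SEEK_CUR, SEEK_END */

/* Broken shape: true for every value of whence. */
static bool sketch_needs_attrs_buggy(int whence)
{
    return whence != SEEK_CUR || whence != SEEK_SET;
}

/* One corrected shape: true only when whence is neither constant. */
static bool sketch_needs_attrs_fixed(int whence)
{
    return whence != SEEK_CUR && whence != SEEK_SET;
}

/* Small self-check showing the difference. */
static void sketch_seek_selfcheck(void)
{
    assert(sketch_needs_attrs_buggy(SEEK_SET));  /* the regression */
    assert(!sketch_needs_attrs_fixed(SEEK_SET));
    assert(sketch_needs_attrs_fixed(SEEK_END));
}
```

Enumerating the cases that do need fresh attributes, rather than negating the ones that do not, avoids this kind of slip.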
     
  • Fix two bugs in fuse_retrieve():

    - retrieving more than one page would yield repeated instances of the
    first page

    - if more than FUSE_MAX_PAGES_PER_REQ pages were requested, then the
    request page array would overflow

    fuse_retrieve() was added in 2.6.36 and these bugs had been there since the
    beginning.

    Signed-off-by: Miklos Szeredi
    CC: stable@vger.kernel.org

    Miklos Szeredi
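
Both bugs named in the entry above come down to loop bookkeeping: the page index has to advance on every iteration, and the count has to be clamped to the size of the array. The following sketch shows that shape on a plain array of indices. It is not the FUSE code; `SKETCH_MAX_PAGES` is a made-up stand-in for FUSE_MAX_PAGES_PER_REQ and its value is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_PAGES 32 /* hypothetical array capacity */

/* Store consecutive page indices starting at first; return how many fit. */
static size_t sketch_collect_pages(unsigned long first, size_t wanted,
                                   unsigned long out[SKETCH_MAX_PAGES])
{
    /* Clamp the request so the array cannot overflow. */
    size_t n = wanted < SKETCH_MAX_PAGES ? wanted : SKETCH_MAX_PAGES;
    unsigned long index = first;

    assert(out != NULL);
    for (size_t i = 0; i < n; i++) {
        out[i] = index;
        index++; /* advance so each slot refers to a different page */
    }
    return n;
}
```

A request for 100 pages fills only the 32 available slots, and each slot holds a distinct index.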
     
  • Exactly like roundup_pow_of_two(1), the rounddown version was buggy for
    the case of a compile-time constant '1' argument. Probably because it
    originated from the same code, sharing history with the roundup version
    from before the bugfix (for that one, see commit 1a06a52ee1b0: "Fix
    roundup_pow_of_two(1)").

    However, unlike the roundup version, the fix for rounddown is to just
    remove the broken special case entirely. It's simply not needed - the
    generic code

    1UL << ilog2(n)

    does the right thing for the constant '1' argument too. The only reason
    roundup needed that special case was because rounding up does so by
    subtracting one from the argument (and then adding one to the result)
    causing the obvious problems with "ilog2(0)".

    But rounddown doesn't do any of that, since ilog2() naturally truncates
    (ie "rounds down") to the right rounded down value. And without the
    ilog2(0) case, there's no reason for the special case that had the wrong
    value.

    tl;dr: rounddown_pow_of_two(1) should be 1, not 0.

    Acked-by: Dmitry Torokhov
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
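
The reasoning in the entry above is easy to check with a portable stand-in for `ilog2()`. The sketch below uses a simple shift loop instead of the kernel macro, and all `sketch_` names are hypothetical. It shows why rounding down needs no special case for 1, while rounding up does, because rounding up goes through `n - 1`.

```c
#include <assert.h>

/* floor(log2(n)) for n >= 1, computed with a shift loop. */
static unsigned int sketch_ilog2(unsigned long n)
{
    unsigned int log = 0;

    assert(n != 0); /* log2(0) is undefined */
    while (n >>= 1)
        log++;
    return log;
}

/* Rounding down: the truncating log already gives the right answer. */
static unsigned long sketch_rounddown_pow_of_two(unsigned long n)
{
    return 1UL << sketch_ilog2(n);
}

/* Rounding up uses n - 1, so n == 1 must be handled before ilog2(0). */
static unsigned long sketch_roundup_pow_of_two(unsigned long n)
{
    if (n == 1)
        return 1;
    return 1UL << (sketch_ilog2(n - 1) + 1);
}
```

For an argument of 1 both helpers return 1; for 5 they return 4 and 8 respectively.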
     
  • * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
    hwmon: (jz4740) Staticise jz4740_hwmon_driver
    hwmon: (jz4740) fix signedness bug

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
    mmc: core: Fix deadlock when the CONFIG_MMC_UNSAFE_RESUME is not defined
    mmc: sdhci-s3c: Remove old and misprototyped suspend operations
    mmc: tmio: fix clock gating on platforms with a .set_pwr() method
    mmc: sh_mmcif: fix clock gating on platforms with a .down_pwr() method
    mmc: core: Fix typo at mmc_card_sleep
    mmc: core: Fix power_off_notify during suspend
    mmc: core: Fix setting power notify state variable for non-eMMC
    mmc: core: Add quirk for long data read time
    mmc: Add module.h include to sdhci-cns3xxx.c
    mmc: mxcmmc: fix falling back to PIO
    mmc: omap_hsmmc: DMA unmap only once in case of MMC error

    Linus Torvalds
     
  • /proc/mounts was showing the mount option [no]init_inode_table when
    the correct mount option that will be accepted by parse_options() is
    [no]init_itable.

    Signed-off-by: "Theodore Ts'o"
    Cc: stable@kernel.org

    Theodore Ts'o
     
  • This hangs my MacBook Air at boot time; I get no console
    messages at all. I reverted this on top of -rc5 and my machine
    boots again.

    This reverts commit e8c7106280a305e1ff2a3a8a4dfce141469fb039.

    Signed-off-by: Matt Fleming
    Signed-off-by: Keith Packard
    Acked-by: H. Peter Anvin
    Cc: Matthew Garrett
    Cc: Zhang Rui
    Cc: Huang Ying
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/r/1321621751-3650-1-git-send-email-matt@console
    Signed-off-by: Ingo Molnar

    Keith Packard
     
  • If the force argument isn't valid, we should continue calculating a
    mapping as if it weren't specified.

    Signed-off-by: Sage Weil

    Sage Weil
     
  • It is not used outside this driver so no need to make the symbol global.

    Signed-off-by: Axel Lin
    Acked-by: Lars-Peter Clausen
    Signed-off-by: Guenter Roeck

    Axel Lin
     
  • wait_for_completion_interruptible_timeout() may return a negative value.
    In this case, checking if (t > 0) will return true if t is unsigned.

    Signed-off-by: Axel Lin
    Acked-by: Lars-Peter Clausen
    Cc: stable@kernel.org (3.0+)
    Signed-off-by: Guenter Roeck

    Axel Lin
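
The signedness problem in the entry above can be reproduced in user space. In the sketch below, `sketch_wait_interrupted()` is a hypothetical stand-in for a wait call that was interrupted and returned a negative error code; 512 is the value Linux uses for ERESTARTSYS. Storing that result in an unsigned variable turns it into a very large positive number, so a `> 0` test wrongly reports success.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_ERESTARTSYS 512 /* numeric value of ERESTARTSYS on Linux */

/* Hypothetical stand-in for an interrupted wait: returns a negative code. */
static long sketch_wait_interrupted(void)
{
    return -SKETCH_ERESTARTSYS;
}

/* Buggy shape: the negative result wraps to a huge unsigned value. */
static bool sketch_completed_buggy(void)
{
    unsigned long t = (unsigned long)sketch_wait_interrupted();
    return t > 0;
}

/* Fixed shape: keep the result signed so the error stays negative. */
static bool sketch_completed_fixed(void)
{
    long t = sketch_wait_interrupted();
    return t > 0;
}

/* Small self-check showing the difference. */
static void sketch_signedness_selfcheck(void)
{
    assert(sketch_completed_buggy());  /* error mistaken for success */
    assert(!sketch_completed_fixed()); /* error correctly detected */
}
```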
     
  • Commit 1939dd84b3 ("ext4: cleanup ext4_ext_grow_indepth code") added a
    reference to ext4_extent_header.eh_depth, but forgot to pass the value
    read through le16_to_cpu. The result is a crash on big-endian
    machines, such as this crash on a POWER7 server:

    attempt to access beyond end of device
    sda8: rw=0, want=776392648163376, limit=168558560
    Unable to handle kernel paging request for data at address 0x6b6b6b6b6b6b6bcb
    Faulting instruction address: 0xc0000000001f5f38
    cpu 0x14: Vector: 300 (Data Access) at [c000001bd1aaecf0]
    pc: c0000000001f5f38: .__brelse+0x18/0x60
    lr: c0000000002e07a4: .ext4_ext_drop_refs+0x44/0x80
    sp: c000001bd1aaef70
    msr: 9000000000009032
    dar: 6b6b6b6b6b6b6bcb
    dsisr: 40000000
    current = 0xc000001bd15b8010
    paca = 0xc00000000ffe4600
    pid = 19911, comm = flush-8:0
    enter ? for help
    [c000001bd1aaeff0] c0000000002e07a4 .ext4_ext_drop_refs+0x44/0x80
    [c000001bd1aaf090] c0000000002e0c58 .ext4_ext_find_extent+0x408/0x4c0
    [c000001bd1aaf180] c0000000002e145c .ext4_ext_insert_extent+0x2bc/0x14c0
    [c000001bd1aaf2c0] c0000000002e3fb8 .ext4_ext_map_blocks+0x628/0x1710
    [c000001bd1aaf420] c0000000002b2974 .ext4_map_blocks+0x224/0x310
    [c000001bd1aaf4d0] c0000000002b7f2c .mpage_da_map_and_submit+0xbc/0x490
    [c000001bd1aaf5a0] c0000000002b8688 .write_cache_pages_da+0x2c8/0x430
    [c000001bd1aaf720] c0000000002b8b28 .ext4_da_writepages+0x338/0x670
    [c000001bd1aaf8d0] c000000000157280 .do_writepages+0x40/0x90
    [c000001bd1aaf940] c0000000001ea830 .writeback_single_inode+0xe0/0x530
    [c000001bd1aafa00] c0000000001eb680 .writeback_sb_inodes+0x210/0x300
    [c000001bd1aafb20] c0000000001ebc84 .__writeback_inodes_wb+0xd4/0x140
    [c000001bd1aafbe0] c0000000001ebfec .wb_writeback+0x2fc/0x3e0
    [c000001bd1aafce0] c0000000001ed770 .wb_do_writeback+0x2f0/0x300
    [c000001bd1aafdf0] c0000000001ed848 .bdi_writeback_thread+0xc8/0x340
    [c000001bd1aafed0] c0000000000c5494 .kthread+0xb4/0xc0
    [c000001bd1aaff90] c000000000021f48 .kernel_thread+0x54/0x70

    This is due to getting ext_depth(inode) == 0x101 and therefore running
    off the end of the path array in ext4_ext_drop_refs into following
    unallocated structures.

    This fixes it by adding the necessary le16_to_cpu.

    Signed-off-by: Paul Mackerras
    Signed-off-by: "Theodore Ts'o"

    Paul Mackerras
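
The basic mechanism behind the entry above is that a 16-bit little-endian on-disk field looks byte-swapped when a big-endian CPU loads it without conversion. The sketch below models both reads on a two-byte array so that it behaves the same on any host; the helper names are hypothetical, and it does not attempt to reproduce the exact 0x101 value from the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 16-bit little-endian field byte by byte (host independent). */
static uint16_t sketch_le16_decode(const uint8_t b[2])
{
    assert(b != NULL);
    return (uint16_t)(b[0] | (b[1] << 8));
}

/* Model of what a big-endian CPU sees when it loads the same two bytes
 * directly, with no byte-order conversion. */
static uint16_t sketch_raw_load_big_endian(const uint8_t b[2])
{
    assert(b != NULL);
    return (uint16_t)((b[0] << 8) | b[1]);
}
```

For an on-disk depth of 1 (bytes `01 00`), the converted read yields 1, while the unconverted big-endian read yields 0x0100, a depth far beyond any real extent tree.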
     

12 Dec, 2011

2 commits


11 Dec, 2011

3 commits