07 Mar, 2015

40 commits

  • commit 1a4bcf470c886b955adf36486f4c86f2441d85cb upstream.

    We have a scenario where after the fsync log replay we can lose file data
    that had been previously fsync'ed if we added a hard link for our inode
    and after that we sync'ed the fsync log (for example by fsync'ing some
    other file or directory).

    This is because when adding a hard link we updated the inode item in the
    log tree with an i_size value of 0. At that point the new inode item was
    in memory only and a subsequent fsync log replay would not make us lose
    the file data. However if after adding the hard link we sync the log tree
    to disk, by fsync'ing some other file or directory for example, we ended
    up losing the file data after log replay, because the inode item in the
    persisted log tree had an i_size of zero.

    This is easy to reproduce, and the following excerpt from my test for
    xfstests shows this:

    _scratch_mkfs >> $seqres.full 2>&1
    _init_flakey
    _mount_flakey

    # Create one file with data and fsync it.
    # This made the btrfs fsync log persist the data and the inode metadata with
    # a correct inode->i_size (4096 bytes).
    $XFS_IO_PROG -f -c "pwrite -S 0xaa -b 4K 0 4K" -c "fsync" \
    $SCRATCH_MNT/foo | _filter_xfs_io

    # Now add one hard link to our file. This made the btrfs code update the fsync
    # log, in memory only, with an inode metadata having a size of 0.
    ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link

    # Now force persistence of the fsync log to disk, for example, by fsyncing some
    # other file.
    touch $SCRATCH_MNT/bar
    $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/bar

    # Before a power loss or crash, we could read the 4Kb of data from our file as
    # expected.
    echo "File content before:"
    od -t x1 $SCRATCH_MNT/foo

    # Simulate a crash/power loss.
    _load_flakey_table $FLAKEY_DROP_WRITES
    _unmount_flakey

    _load_flakey_table $FLAKEY_ALLOW_WRITES
    _mount_flakey

    # After the fsync log replay, because the fsync log had a value of 0 for our
    # inode's i_size, we could no longer read the 4Kb of data that we previously
    # wrote and fsync'ed. The size of the file became 0 after the fsync log replay.
    echo "File content after:"
    od -t x1 $SCRATCH_MNT/foo

    Another alternative test, that doesn't need to fsync an inode in the same
    transaction it was created, is:

    _scratch_mkfs >> $seqres.full 2>&1
    _init_flakey
    _mount_flakey

    # Create our test file with some data.
    $XFS_IO_PROG -f -c "pwrite -S 0xaa -b 8K 0 8K" \
    $SCRATCH_MNT/foo | _filter_xfs_io

    # Make sure the file is durably persisted.
    sync

    # Append some data to our file, to increase its size.
    $XFS_IO_PROG -f -c "pwrite -S 0xcc -b 4K 8K 4K" \
    $SCRATCH_MNT/foo | _filter_xfs_io

    # Fsync the file, so from this point on if a crash/power failure happens, our
    # new data is guaranteed to be there next time the fs is mounted.
    $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo

    # Add one hard link to our file. This made btrfs write into the in memory fsync
    # log a special inode with generation 0 and an i_size of 0 too. Note that this
    # didn't update the inode in the fsync log on disk.
    ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link

    # Now make sure the in memory fsync log is durably persisted.
    # Creating and fsync'ing another file will do it.
    touch $SCRATCH_MNT/bar
    $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/bar

    # As expected, before the crash/power failure, we should be able to read the
    # 12Kb of file data.
    echo "File content before:"
    od -t x1 $SCRATCH_MNT/foo

    # Simulate a crash/power loss.
    _load_flakey_table $FLAKEY_DROP_WRITES
    _unmount_flakey

    _load_flakey_table $FLAKEY_ALLOW_WRITES
    _mount_flakey

    # After mounting the fs again, the fsync log was replayed.
    # The btrfs fsync log replay code didn't update the i_size of the persisted
    # inode because the inode item in the log had a special generation with a
    # value of 0 (and it couldn't know the correct i_size, since that inode item
    # had a 0 i_size too). This made the last 4Kb of file data inaccessible and
    # effectively lost.
    echo "File content after:"
    od -t x1 $SCRATCH_MNT/foo

    This isn't a new issue/regression. This problem has been around since the
    log tree code was added in 2008:

    Btrfs: Add a write ahead tree log to optimize synchronous operations
    (commit e02119d5a7b4396c5a872582fddc8bd6d305a70a)

    Test cases for xfstests follow soon.

    Signed-off-by: Filipe Manana
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    Filipe Manana
     
  • commit 381cf6587f8a8a8e981bc0c1aaaa8859b51dc756 upstream.

    If btrfs_find_item is called with NULL path it allocates one locally but
    does not free it. Affected paths are inserting an orphan item for a file
    and for a subvol root.

    Move the path allocation to the callers.
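
    The ownership rule described above can be sketched as follows. This is a simplified model, not the btrfs code: struct path, path_alloc(), path_free(), find_item() and find_orphan_item() are hypothetical stand-ins, and live_paths exists only to make a leak visible.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct btrfs_path. */
struct path {
    int slot;
};

int live_paths; /* allocations not yet freed, for illustration only */

struct path *path_alloc(void)
{
    struct path *p = calloc(1, sizeof(*p));

    if (p)
        live_paths++;
    return p;
}

void path_free(struct path *p)
{
    if (p) {
        live_paths--;
        free(p);
    }
}

/* The lookup helper never allocates: it only uses the caller's path. */
int find_item(struct path *p, int key)
{
    if (!p)
        return -1; /* a NULL path is a caller bug, not a request to allocate */
    p->slot = key;
    return 0;
}

/* The caller owns the path and frees it on every exit route. */
int find_orphan_item(int key)
{
    struct path *p = path_alloc();
    int ret;

    if (!p)
        return -12; /* -ENOMEM */
    ret = find_item(p, key);
    path_free(p);
    return ret;
}
```

    With allocation and release in the same function, the helper has no hidden allocation for a caller to forget about.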

    Fixes: 3f870c289900 ("btrfs: expand btrfs_find_item() to include find_orphan_item functionality")
    Signed-off-by: David Sterba
    Signed-off-by: Greg Kroah-Hartman

    David Sterba
     
  • commit 5efa0490cc94aee06cd8d282683e22a8ce0a0026 upstream.

    This has been confusing people for too long, the message is really just
    informative.

    Signed-off-by: David Sterba
    Signed-off-by: Chris Mason
    Signed-off-by: Greg Kroah-Hartman

    David Sterba
     
  • commit 7eb71e0351fbb1b242ae70abb7bb17107fe2f792 upstream.

    It turns out it's possible to get __remove_osd() called twice on the
    same OSD. That doesn't sit well with rb_erase() - depending on the
    shape of the tree we can get a NULL dereference, a soft lockup or
    a random crash at some point in the future as we end up touching freed
    memory. One scenario that I was able to reproduce is as follows:

    con_fault_finish()
    osd_reset()

    ceph_osdc_handle_map()

    kick_requests()

    reset_changed_osds()
    __reset_osd()
    __remove_osd()

    __kick_osd_requests()
    __reset_osd()
    __remove_osd()
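
    One common way to make a removal routine safe against a second call is to record whether the object is still linked. The sketch below is an assumption for illustration, not the actual libceph change: it uses a singly linked list and a hypothetical linked flag, whereas the kernel keeps OSDs in an rbtree.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified OSD record kept on a singly linked list. */
struct osd {
    int id;
    bool linked;        /* true while the OSD is on the list */
    struct osd *next;
};

struct osd_client {
    struct osd *head;
    int removals;       /* counts real unlink operations */
};

void insert_osd(struct osd_client *c, struct osd *o)
{
    o->next = c->head;
    c->head = o;
    o->linked = true;
}

/* Idempotent removal: a second call on the same OSD is a no-op. */
void remove_osd(struct osd_client *c, struct osd *o)
{
    struct osd **pp;

    if (!o->linked)
        return;
    for (pp = &c->head; *pp; pp = &(*pp)->next) {
        if (*pp == o) {
            *pp = o->next;
            break;
        }
    }
    o->next = NULL;
    o->linked = false;
    c->removals++;
}
```

    Calling remove_osd() twice on the same record unlinks it once and leaves the rest of the list untouched.
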
    Signed-off-by: Ilya Dryomov
    Reviewed-by: Sage Weil
    Reviewed-by: Alex Elder
    Signed-off-by: Greg Kroah-Hartman

    Ilya Dryomov
     
  • commit 4690555e13c48fef07f2762f6b0cd6b181e326d0 upstream.

    Since kernel 3.14 the backlight control has been broken on various Samsung
    Atom based netbooks. This has been bisected and this problem happens since
    commit b35684b8fa94 ("drm/i915: do full backlight setup at enable time")

    This has been reported and discussed in detail here:
    http://lists.freedesktop.org/archives/intel-gfx/2014-July/049395.html

    Unfortunately no-one has been able to fix this. This only affects Samsung
    Atom netbooks, and the Linux kernel and the BIOS of those laptops have never
    worked well together. All affected laptops already have a quirk to avoid using
    the standard acpi-video interface and instead use the samsung specific SABI
    interface which samsung-laptop uses. It seems that recent fixes to the i915
    driver have also broken backlight control through the SABI interface.

    The intel_backlight driver OTOH works fine, and also allows for finer grained
    backlight control. So add a new use_native_backlight quirk, and replace the
    broken_acpi_video quirk with this quirk for affected models. This new quirk
    disables acpi-video as before and also stops samsung-laptop from registering
    the SABI based samsung_laptop backlight interface, leaving only the working
    intel_backlight interface.

    This commit enables this new quirk for 3 models which are known to be affected,
    chances are that it needs to be used on other models too.

    BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1094948 # N145P
    BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1115713 # N250P
    Reported-by: Bertrik Sikken # N150P
    Cc: stable@vger.kernel.org # 3.16
    Signed-off-by: Hans de Goede
    Signed-off-by: Darren Hart
    Signed-off-by: Greg Kroah-Hartman

    Hans de Goede
     
  • commit 164c24063a3eadee11b46575c5482b2f1417be49 upstream.

    sm->offset may be wrong even when the magic is right, because the offset is not covered by a CRC.

    Badness at c00c7580 [verbose debug info unavailable]
    NIP: c00c7580 LR: c00c718c CTR: 00000014
    REGS: df07bb40 TRAP: 0700 Not tainted (2.6.34.13-WR4.3.0.0_standard)
    MSR: 00029000 CR: 22084f84 XER: 00000000
    TASK = df84d6e0[908] 'mount' THREAD: df07a000
    GPR00: 00000001 df07bbf0 df84d6e0 00000000 00000001 00000000 df07bb58 00000041
    GPR08: 00000041 c0638860 00000000 00000010 22084f88 100636c8 df814ff8 00000000
    GPR16: df84d6e0 dfa558cc c05adb90 00000048 c0452d30 00000000 000240d0 000040d0
    GPR24: 00000014 c05ae734 c05be2e0 00000000 00000001 00000000 00000000 c05ae730
    NIP [c00c7580] __alloc_pages_nodemask+0x4d0/0x638
    LR [c00c718c] __alloc_pages_nodemask+0xdc/0x638
    Call Trace:
    [df07bbf0] [c00c718c] __alloc_pages_nodemask+0xdc/0x638 (unreliable)
    [df07bc90] [c00c7708] __get_free_pages+0x20/0x48
    [df07bca0] [c00f4a40] __kmalloc+0x15c/0x1ec
    [df07bcd0] [c01fc880] jffs2_scan_medium+0xa58/0x14d0
    [df07bd70] [c01ff38c] jffs2_do_mount_fs+0x1f4/0x6b4
    [df07bdb0] [c020144c] jffs2_do_fill_super+0xa8/0x260
    [df07bdd0] [c020230c] jffs2_fill_super+0x104/0x184
    [df07be00] [c0335814] get_sb_mtd_aux+0x9c/0xec
    [df07be20] [c033596c] get_sb_mtd+0x84/0x1e8
    [df07be60] [c0201ed0] jffs2_get_sb+0x1c/0x2c
    [df07be70] [c0103898] vfs_kern_mount+0x78/0x1e8
    [df07bea0] [c0103a58] do_kern_mount+0x40/0x100
    [df07bec0] [c011fe90] do_mount+0x240/0x890
    [df07bf10] [c0120570] sys_mount+0x90/0xd8
    [df07bf40] [c00110d8] ret_from_syscall+0x0/0x4

    === Exception: c01 at 0xff61a34
    LR = 0x100135f0
    Instruction dump:
    38800005 38600000 48010f41 4bfffe1c 4bfc2d15 4bfffe8c 72e90200 4082fc28
    3d20c064 39298860 8809000d 68000001 2f800000 419efc0c 38000001
    mount: mounting /dev/mtdblock3 on /common failed: Input/output error

    Signed-off-by: Chen Jie
    Signed-off-by: Andrew Morton
    Signed-off-by: David Woodhouse
    Signed-off-by: Greg Kroah-Hartman

    Chen Jie
     
  • commit 0c510cc83bdbaac8406f4f7caef34f4da0ba35ea upstream.

    When DRAM errors occur on memory controllers after EDAC_MAX_MCS (16),
    the kernel fatally dereferences unallocated structures, see splat below;
    this occurs on at least NumaConnect systems.

    Fix by checking if a memory controller info structure was found.
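
    A minimal sketch of that check, assuming a fixed-size table of controller structures. MAX_MCS mirrors EDAC_MAX_MCS from the text; mcis, lookup_mci() and decode_error() are hypothetical names, not the driver's own.

```c
#include <stddef.h>

#define MAX_MCS 16  /* mirrors EDAC_MAX_MCS from the commit message */

struct mem_ctl_info {
    int node_id;
};

struct mem_ctl_info *mcis[MAX_MCS];

/* Returns NULL for controllers that were never allocated. */
struct mem_ctl_info *lookup_mci(unsigned int node_id)
{
    if (node_id >= MAX_MCS)
        return NULL;
    return mcis[node_id];
}

/* Returns 0 if the error was decoded, -1 if it had to be skipped. */
int decode_error(unsigned int node_id)
{
    struct mem_ctl_info *mci = lookup_mci(node_id);

    if (!mci)
        return -1;  /* nothing found: bail out instead of dereferencing */
    return mci->node_id == (int)node_id ? 0 : -1;
}
```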

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000320
    IP: [] decode_bus_error+0x2f/0x2b0
    PGD 2f8b5a3067 PUD 2f8b5a2067 PMD 0
    Oops: 0000 [#2] SMP
    Modules linked in:
    CPU: 224 PID: 11930 Comm: stream_c.exe.gn Tainted: G D 3.19.0 #1
    Hardware name: Supermicro H8QGL/H8QGL, BIOS 3.5b 01/28/2015
    task: ffff8807dbfb8c00 ti: ffff8807dd16c000 task.ti: ffff8807dd16c000
    RIP: 0010:[] [] decode_bus_error+0x2f/0x2b0
    RSP: 0000:ffff8907dfc03c48 EFLAGS: 00010297
    RAX: 0000000000000001 RBX: 9c67400010080a13 RCX: 0000000000001dc6
    RDX: 000000001dc61dc6 RSI: ffff8907dfc03df0 RDI: 000000000000001c
    RBP: ffff8907dfc03ce8 R08: 0000000000000000 R09: 0000000000000022
    R10: ffff891fffa30380 R11: 00000000001cfc90 R12: 0000000000000008
    R13: 0000000000000000 R14: 000000000000001c R15: 00009c6740001000
    FS: 00007fa97ee18700(0000) GS:ffff8907dfc00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000320 CR3: 0000003f889b8000 CR4: 00000000000407e0
    Stack:
    0000000000000000 ffff8907dfc03df0 0000000000000008 9c67400010080a13
    000000000000001c 00009c6740001000 ffff8907dfc03c88 ffffffff810e4f9a
    ffff8907dfc03ce8 ffffffff81b375b9 0000000000000000 0000000000000010
    Call Trace:

    ? vprintk_default
    ? printk
    amd_decode_mce
    notifier_call_chain
    atomic_notifier_call_chain
    mce_log
    machine_check_poll
    mce_timer_fn
    ? mce_cpu_restart
    call_timer_fn.isra.29
    run_timer_softirq
    __do_softirq
    irq_exit
    smp_apic_timer_interrupt
    apic_timer_interrupt

    ? down_read_trylock
    __do_page_fault
    ? __schedule
    do_page_fault
    page_fault

    Signed-off-by: Daniel J Blueman
    Link: http://lkml.kernel.org/r/1424144078-24589-1-git-send-email-daniel@numascale.com
    [ Boris: massage commit message ]
    Signed-off-by: Borislav Petkov
    Signed-off-by: Greg Kroah-Hartman

    Daniel J Blueman
     
  • commit 11249e73992981e31fd50e7231da24fad68e3320 upstream.

    d0585cd815fa ("sb_edac: Claim a different PCI device") changed the
    probing of sb_edac to look for PCI device 0x3ca0:

    3f:0e.0 System peripheral: Intel Corporation Xeon E5/Core i7 Processor Home Agent (rev 07)
    00: 86 80 a0 3c 00 00 00 00 07 00 80 08 00 00 80 00
    ...

    but we're matching for 0x3ca8, i.e. PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_TA
    in sbridge_probe() therefore the probing fails.

    Changing it to probe for 0x3ca0 (PCI_DEVICE_ID_INTEL_SBRIDGE_IMC_HA0),
    i.e., the 14.0 device, fixes the issue and driver loads successfully
    again:

    [ 2449.013120] EDAC DEBUG: sbridge_init:
    [ 2449.017029] EDAC sbridge: Seeking for: PCI ID 8086:3ca0
    [ 2449.022368] EDAC DEBUG: sbridge_get_onedevice: Detected 8086:3ca0
    [ 2449.028498] EDAC sbridge: Seeking for: PCI ID 8086:3ca0
    [ 2449.033768] EDAC sbridge: Seeking for: PCI ID 8086:3ca8
    [ 2449.039028] EDAC DEBUG: sbridge_get_onedevice: Detected 8086:3ca8
    [ 2449.045155] EDAC sbridge: Seeking for: PCI ID 8086:3ca8
    ...

    Add a debug printk while at it to be able to catch the failure in the
    future and dump driver version on successful load.

    Fixes: d0585cd815fa ("sb_edac: Claim a different PCI device")
    Acked-by: Aristeu Rozanski
    Cc: Tony Luck
    Acked-by: Andy Lutomirski
    Acked-by: Mauro Carvalho Chehab
    Signed-off-by: Borislav Petkov
    Signed-off-by: Greg Kroah-Hartman

    Borislav Petkov
     
  • commit d1901ef099c38afd11add4cfb3312c02ef21ec4a upstream.

    When a drive is marked write-mostly it should only be the
    target of reads if there is no other option.

    This behaviour was broken by

    commit 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
    md/raid1: read balance chooses idlest disk for SSD

    which causes a write-mostly device to be *preferred* in some cases.

    Restore correct behaviour by checking and setting
    best_dist_disk and best_pending_disk rather than best_disk.

    We only need to test one of these as they are both changed
    from -1 to >=0 at the same time.

    As we leave min_pending and best_dist unchanged, any non-write-mostly
    device will appear better than the write-mostly device.
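
    The selection rule can be illustrated with a reduced model of read balancing that considers only pending requests. This is a sketch under that assumption, not the raid1 read_balance() function; struct mirror and choose_read_disk() are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

struct mirror {
    bool usable;            /* in sync and not faulty */
    bool write_mostly;
    unsigned int pending;   /* in-flight requests */
};

/* Returns the index of the disk to read from, or -1 if none is usable. */
int choose_read_disk(const struct mirror *m, int n)
{
    unsigned int min_pending = UINT_MAX;
    int best_pending_disk = -1;
    int i;

    for (i = 0; i < n; i++) {
        if (!m[i].usable)
            continue;
        if (m[i].write_mostly) {
            /* Remember it only as a last resort and leave min_pending
             * untouched, so any normal disk still looks better. */
            if (best_pending_disk < 0)
                best_pending_disk = i;
            continue;
        }
        if (m[i].pending < min_pending) {
            min_pending = m[i].pending;
            best_pending_disk = i;
        }
    }
    return best_pending_disk;
}
```

    An idle write-mostly disk loses to a busy normal disk, and is returned only when no normal disk is usable.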

    Reported-by: Tomáš Hodek
    Reported-by: Dark Penguin
    Signed-off-by: NeilBrown
    Link: http://marc.info/?l=linux-raid&m=135982797322422
    Fixes: 9dedf60313fa4dddfd5b9b226a0ef12a512bf9dc
    Signed-off-by: Greg Kroah-Hartman

    Tomáš Hodek
     
  • commit 26ac107378c4742978216be1005b7291b799c7b2 upstream.

    Commit a7854487cd7128a30a7f4f5259de9f67d5efb95f:
    md: When RAID5 is dirty, force reconstruct-write instead of read-modify-write.

    Causes an RCW cycle to be forced even when the array is degraded.
    A degraded array cannot support RCW as that requires reading all data
    blocks, and one may be missing.

    Forcing an RCW when it is not possible causes a live-lock and the code
    spins, repeatedly deciding to do something that cannot succeed.

    So change the condition to only force RCW on non-degraded arrays.

    Reported-by: Manibalan P
    Bisected-by: Jes Sorensen
    Tested-by: Jes Sorensen
    Signed-off-by: NeilBrown
    Fixes: a7854487cd7128a30a7f4f5259de9f67d5efb95f
    Signed-off-by: Greg Kroah-Hartman

    NeilBrown
     
  • commit 48536c9195ae8c2a00fd8f400bac72ab613feaab upstream.

    Commit f6edb53c4993ffe92ce521fb449d1c146cea6ec2 converted the probe to
    a CPU wide event first (pid == -1). For kernels that do not support
    the PERF_FLAG_FD_CLOEXEC flag the probe fails with EINVAL. Since this
    errno is not handled pid is not reset to 0 and the subsequent use of
    pid = -1 as an argument brings in an additional failure path if
    perf_event_paranoid > 0:

    $ perf record -- sleep 1
    perf_event_open(..., 0) failed unexpectedly with error 13 (Permission denied)
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.007 MB /tmp/perf.data (11 samples) ]

    Also, ensure the fd of the confirmation check is closed and comment why
    pid = -1 is used.

    Needs to go to 3.18 stable tree as well.

    Signed-off-by: Adrian Hunter
    Based-on-patch-by: David Ahern
    Acked-by: David Ahern
    Cc: David Ahern
    Link: http://lkml.kernel.org/r/54EC610C.8000403@intel.com
    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Greg Kroah-Hartman

    Adrian Hunter
     
  • commit d4a19eb3b15a4ba98f627182f48d5bc0cffae670 upstream.

    We have two race conditions in the probe code which could lead to a null
    pointer dereference in the interrupt handler.

    The interrupt handler accesses the clockevent device, which may not yet be
    registered.

    The first race condition happens when the interrupt handler gets registered before
    the interrupts get disabled. The second race condition happens when the
    interrupts get enabled, but the clockevent device is not yet registered.

    Fix that by disabling the interrupts before we register the interrupt and
    enabling the interrupts after the clockevent device has been registered.

    Reported-by: Gongbae Park
    Signed-off-by: Matthias Brugger
    Signed-off-by: Daniel Lezcano
    Signed-off-by: Greg Kroah-Hartman

    Matthias Brugger
     
  • commit c2996cb29bfb73927a79dc96e598a718e843f01a upstream.

    The KSTK_EIP() and KSTK_ESP() macros should return the user program
    counter (PC) and stack pointer (A0StP) of the given task. These are used
    to determine which VMA corresponds to the user stack in
    /proc/<pid>/maps, and for the user PC & A0StP in /proc/<pid>/stat.

    However for Meta the PC & A0StP from the task's kernel context are used,
    resulting in broken output. For example in the following /proc/<pid>/maps
    output, the 3afff000-3b021000 VMA should be described as the stack:

    # cat /proc/self/maps
    ...
    100b0000-100b1000 rwxp 00000000 00:00 0 [heap]
    3afff000-3b021000 rwxp 00000000 00:00 0

    And in the following /proc/<pid>/stat output, the PC is in kernel code
    (1074234964 = 0x40078654) and the A0StP is in the kernel heap
    (1335981392 = 0x4fa17550):

    # cat /proc/self/stat
    51 (cat) R ... 1335981392 1074234964 ...

    Fix the definitions of KSTK_EIP() and KSTK_ESP() to use
    task_pt_regs(tsk)->ctx rather than (tsk)->thread.kernel_context. This
    gets the registers from the user context stored after the thread info at
    the base of the kernel stack, which is from the last entry into the
    kernel from userland, regardless of where in the kernel the task may
    have been interrupted, which results in the following more correct
    /proc/<pid>/maps output:

    # cat /proc/self/maps
    ...
    0800b000-08070000 r-xp 00000000 00:02 207 /lib/libuClibc-0.9.34-git.so
    ...
    100b0000-100b1000 rwxp 00000000 00:00 0 [heap]
    3afff000-3b021000 rwxp 00000000 00:00 0 [stack]

    And /proc/<pid>/stat now correctly reports the PC in libuClibc
    (134320308 = 0x80190b4) and the A0StP in the [stack] region (989864576 =
    0x3b002280):

    # cat /proc/self/stat
    51 (cat) R ... 989864576 134320308 ...

    Reported-by: Alexey Brodkin
    Reported-by: Vineet Gupta
    Signed-off-by: James Hogan
    Cc: linux-metag@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman

    James Hogan
     
  • commit dfcc70a8c868fe03276fa59864149708fb41930b upstream.

    For filesystems without separate project quota inode field in the
    superblock we just reuse project quota file for group quotas (and vice
    versa) if project quota file is allocated and we need group quota file.
    When we reuse the file, quota structures on disk suddenly have wrong
    type stored in d_flags though. Nobody really cares about this (although
    structure type reported to userspace was wrong as well) except
    that after commit 14bf61ffe6ac (quota: Switch ->get_dqblk() and
    ->set_dqblk() to use bytes as space units) assertion in
    xfs_qm_scall_getquota() started to trigger on xfs/106 test (apparently I
    was testing without XFS_DEBUG so I didn't notice when submitting the
    above commit).

    Fix the problem by properly resetting ddq->d_flags when running quotacheck
    for a quota file.

    Reported-by: Al Viro
    Signed-off-by: Jan Kara
    Reviewed-by: Dave Chinner
    Signed-off-by: Dave Chinner
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
     
  • commit 2f97c20e5f7c3582c7310f65a04465bfb0fd0e85 upstream.

    The gpio_chip operations receive a pointer to the gpio_chip struct, which is
    contained in the driver's private struct, yet the container_of call in those
    functions points to the mfd struct defined in include/linux/mfd/tps65912.h.
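
    To illustrate why the containing type matters, here is a small userspace sketch of container_of. struct my_gpio_data and chip_to_regmap_id() are hypothetical and only loosely modelled on a driver-private struct that embeds a gpio_chip.

```c
#include <stddef.h>

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct gpio_chip {
    int ngpio;
};

/* Hypothetical driver-private struct that embeds the gpio_chip. */
struct my_gpio_data {
    int regmap_id;              /* stands in for the link to the MFD core */
    struct gpio_chip gpio_chip;
};

/* A gpio_chip callback gets a pointer to the embedded member and must
 * walk back to the struct that actually contains it. */
int chip_to_regmap_id(struct gpio_chip *gc)
{
    struct my_gpio_data *data =
        container_of(gc, struct my_gpio_data, gpio_chip);

    return data->regmap_id;
}
```

    Naming any other struct type in container_of would subtract the wrong offset and yield a pointer into unrelated memory.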

    Signed-off-by: Nicolas Saenz Julienne
    Signed-off-by: Linus Walleij
    Signed-off-by: Greg Kroah-Hartman

    Nicolas Saenz Julienne
     
  • commit 9cf75e9e4ddd587ac12e88e8751c358b7b27e95f upstream.

    The change:

    7b8792bbdffdff3abda704f89c6a45ea97afdc62
    gpiolib: of: Correct error handling in of_get_named_gpiod_flags

    assumed that only one gpio-chip is registered per of-node.
    Some drivers register more than one chip per of-node, so
    adjust the matching function of_gpiochip_find_and_xlate to
    not stop looking for chips if a node-match is found and
    the translation fails.
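
    The adjusted matching behaviour can be sketched as a loop that keeps searching after a failed translation. The types and helpers here (struct chip, chip_xlate(), find_and_xlate()) are hypothetical simplifications, not the gpiolib API.

```c
#include <stddef.h>

struct chip {
    int node;                   /* hypothetical device-tree node id */
    int first_line, nlines;     /* node-relative lines owned by this chip */
    int gpio_base;              /* global number of first_line */
};

/* Translate a node-relative line, or return -1 if this chip does not own it. */
int chip_xlate(const struct chip *c, int line)
{
    if (line < c->first_line || line >= c->first_line + c->nlines)
        return -1;
    return c->gpio_base + (line - c->first_line);
}

/* Returns the global GPIO number, or -1 if no chip on the node owns the line. */
int find_and_xlate(const struct chip *chips, size_t n, int node, int line)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int gpio;

        if (chips[i].node != node)
            continue;
        gpio = chip_xlate(&chips[i], line);
        if (gpio >= 0)
            return gpio;
        /* Node matched but translation failed: another chip
         * registered on the same node may still own the line. */
    }
    return -1;
}
```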

    Fixes: 7b8792bbdffd ("gpiolib: of: Correct error handling in of_get_named_gpiod_flags")
    Signed-off-by: Hans Holmberg
    Acked-by: Alexandre Courbot
    Tested-by: Robert Jarzmik
    Tested-by: Tyler Hall
    Signed-off-by: Linus Walleij
    Signed-off-by: Greg Kroah-Hartman

    Hans Holmberg
     
  • commit 9d42d48a342aee208c1154696196497fdc556bbf upstream.

    The native (64-bit) sigval_t union contains sival_int (32-bit) and
    sival_ptr (64-bit). When a compat application invokes a syscall that
    takes a sigval_t value (as part of a larger structure, e.g.
    compat_sys_mq_notify, compat_sys_timer_create), the compat_sigval_t
    union is converted to the native sigval_t with sival_int overlapping
    with either the least or the most significant half of sival_ptr,
    depending on endianness. When the corresponding signal is delivered to a
    compat application, on big endian the current (compat_uptr_t)sival_ptr
    cast always returns 0 since sival_int corresponds to the top part of
    sival_ptr. This patch fixes copy_siginfo_to_user32() so that sival_int
    is copied to the compat_siginfo_t structure.
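
    The overlap can be modelled in userspace with two unions. This is a sketch, not the arm64 code: a uint64_t stands in for the 64-bit sival_ptr, and from_compat(), to_compat() and to_compat_truncating() are hypothetical helpers.

```c
#include <stdint.h>
#include <string.h>

/* Simplified models of the native (64-bit) and compat (32-bit) unions. */
union native_sigval {
    int32_t  sival_int;
    uint64_t sival_ptr;     /* integer stand-in for the 64-bit pointer */
};

union compat_sigval {
    int32_t  sival_int;
    uint32_t sival_ptr;
};

/* What a compat syscall effectively stores: only sival_int is written. */
union native_sigval from_compat(union compat_sigval c)
{
    union native_sigval n;

    memset(&n, 0, sizeof(n));
    n.sival_int = c.sival_int;
    return n;
}

/* Endian-safe conversion back: copy sival_int directly. */
union compat_sigval to_compat(union native_sigval n)
{
    union compat_sigval c;

    c.sival_int = n.sival_int;
    return c;
}

/* The problematic variant: truncating sival_ptr keeps its low half,
 * which is where sival_int lives only on little endian. */
uint32_t to_compat_truncating(union native_sigval n)
{
    return (uint32_t)n.sival_ptr;
}
```

    On a big-endian machine to_compat_truncating() returns 0 for any value stored through sival_int, while to_compat() round-trips the value on either byte order.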

    Reported-by: Bamvor Jian Zhang
    Tested-by: Bamvor Jian Zhang
    Signed-off-by: Catalin Marinas
    Signed-off-by: Greg Kroah-Hartman

    Catalin Marinas
     
  • commit a52d209336f8fc7483a8c7f4a8a7d2a8e1692a6c upstream.

    Since the removal of CONFIG_REGULATOR_DUMMY option, the touchscreen stopped
    working. This patch enables the "replacement" for REGULATOR_DUMMY and
    allows the touchscreen to work even though there is no regulator for "vcc".

    Signed-off-by: Martin Vajnar
    Signed-off-by: Robert Jarzmik
    Signed-off-by: Greg Kroah-Hartman

    Martin Vajnar
     
  • commit 428d53be5e7468769d4e7899cca06ed5f783a6e1 upstream.

    We have to delete the allocated interrupt info if __inject_vm() fails.

    Otherwise user space can keep flooding kvm with floating interrupts and
    provoke more and more memory leaks.

    Reported-by: Dominik Dingel
    Reviewed-by: Dominik Dingel
    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Greg Kroah-Hartman

    David Hildenbrand
     
  • commit 8e2207cdd087ebb031e9118d1fd0902c6533a5e5 upstream.

    If a vm with no VCPUs is created, the injection of a floating irq
    leads to an endless loop in the kernel.

    Let's skip the search for a destination VCPU for a floating irq if no
    VCPUs were created.
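
    A reduced model of the round-robin search shows why the guard is needed. struct vm, MAX_VCPUS and pick_floating_irq_target() are hypothetical names, and the sketch assumes that a non-zero online_vcpus means at least one slot is present.

```c
#include <stdbool.h>

#define MAX_VCPUS 8

struct vm {
    int online_vcpus;
    bool present[MAX_VCPUS];
    int next_rr_cpu;        /* round-robin cursor */
};

/* Returns the destination VCPU index, or -1 when there is none to wake. */
int pick_floating_irq_target(struct vm *vm)
{
    int cpu;

    if (vm->online_vcpus == 0)
        return -1;  /* without this the search below would never end */

    do {
        cpu = vm->next_rr_cpu;
        vm->next_rr_cpu = (vm->next_rr_cpu + 1) % MAX_VCPUS;
    } while (!vm->present[cpu]);

    return cpu;
}
```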

    Reviewed-by: Dominik Dingel
    Reviewed-by: Cornelia Huck
    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Greg Kroah-Hartman

    David Hildenbrand
     
  • commit 0ac96caf0f9381088c673a16d910b1d329670edf upstream.

    The hrtimer that handles the wait with enabled timer interrupts
    should not be disturbed by changes of the host time.

    This patch changes our hrtimer to be based on a monotonic clock.

    Signed-off-by: David Hildenbrand
    Acked-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Greg Kroah-Hartman

    David Hildenbrand
     
  • commit 2d00f759427bb3ed963b60f570830e9eca7e1c69 upstream.

    Patch 0759d0681cae ("KVM: s390: cleanup handle_wait by reusing
    kvm_vcpu_block") changed the way pending guest clock comparator
    interrupts are detected. It was assumed that as soon as the hrtimer
    wakes up, the condition for the guest ckc is satisfied.

    This is however only true as long as adjclock() doesn't speed
    up the monotonic clock. Reason is that the hrtimer is based on
    CLOCK_MONOTONIC, the guest clock comparator detection is based
    on the raw TOD clock. If CLOCK_MONOTONIC runs faster than the
    TOD clock, the hrtimer wakes the target VCPU up too early and
    the target VCPU will not detect any pending interrupts, therefore
    going back to sleep. It will never be woken up again because the
    hrtimer has finished. The VCPU is stuck.

    As a quick fix, we have to forward the hrtimer until the guest
    clock comparator is really due, to guarantee properly timed wake
    ups.

    As the hrtimer callback might be triggered on another cpu, we
    have to make sure that the timer is really stopped and not currently
    executing the callback on another cpu. This can happen if the vcpu
    thread is scheduled onto another physical cpu, but the timer base
    is not migrated. So let's use hrtimer_cancel instead of try_to_cancel.

    A proper fix might be to introduce a RAW based hrtimer.

    Reported-by: Christian Borntraeger
    Signed-off-by: David Hildenbrand
    Acked-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Greg Kroah-Hartman

    David Hildenbrand
     
  • commit 23b133bdc452aa441fcb9b82cbf6dd05cfd342d0 upstream.

    Check length of extended attributes and allocation descriptors when
    loading inodes from disk. Otherwise corrupted filesystems could confuse
    the code and make the kernel oops.
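
    A sketch of this kind of validation, assuming the inode header, extended attributes and allocation descriptors must all fit in one block. struct raw_inode and inode_lengths_valid() are hypothetical and do not mirror the UDF on-disk layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of the on-disk fields involved. */
struct raw_inode {
    uint32_t header_len;    /* size of the fixed file entry header */
    uint32_t len_eattr;     /* length of extended attributes */
    uint32_t len_alloc;     /* length of allocation descriptors */
};

/* Both variable-length areas must fit inside the block holding the inode.
 * Each comparison subtracts from the remaining space, so no sum of
 * untrusted values can wrap around. */
bool inode_lengths_valid(const struct raw_inode *ri, uint32_t blocksize)
{
    if (ri->header_len > blocksize)
        return false;
    if (ri->len_eattr > blocksize - ri->header_len)
        return false;
    if (ri->len_alloc > blocksize - ri->header_len - ri->len_eattr)
        return false;
    return true;
}
```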

    Reported-by: Carl Henrik Lunde
    Signed-off-by: Jan Kara
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
     
  • commit 79144954278d4bb5989f8b903adcac7a20ff2a5a upstream.

    Store blocksize in a local variable in udf_fill_inode() since it is used
    a lot of times.

    Signed-off-by: Jan Kara
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
     
  • commit ed4cbc81addbc076b016c5b979fd1a02f0897f0a upstream.

    activate_mm() and switch_mm() call get_new_mmu_context() which in turn
    can enable the HTW before the entryhi is changed with the new ASID.
    Since the latter will enable the HTW in local_flush_tlb_all(),
    then there is a small timing window where the HTW is running with the
    new ASID but with an old pgd since the TLBMISS_HANDLER_SETUP_PGD
    hasn't assigned a new one yet. In order to prevent that, we introduce a
    simple htw counter to avoid starting HTW accidentally due to nested
    htw_{start,stop}() sequences. Moreover, since various IPI calls can
    enforce TLB flushing operations on a different core, such an operation
    may interrupt another htw_{stop,start} in progress leading to inconsistent
    updates of the htw_seq variable. In order to avoid that, we disable the
    interrupts whenever we update that variable.
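
    The nesting counter can be modelled as follows. This is an illustrative sketch only: struct htw_state is hypothetical, hw_enabled stands in for the hardware enable bit, and the interrupt masking mentioned above is reduced to a comment.

```c
#include <stdbool.h>

/* Hypothetical model of the per-CPU walker state. */
struct htw_state {
    unsigned int seq;   /* nesting depth of htw_stop() calls */
    bool hw_enabled;    /* stands in for the hardware enable bit */
};

void htw_stop(struct htw_state *s)
{
    /* The kernel would disable local interrupts around this update. */
    s->hw_enabled = false;
    s->seq++;
}

void htw_start(struct htw_state *s)
{
    /* Only the outermost start re-enables the walker. */
    if (s->seq > 0 && --s->seq == 0)
        s->hw_enabled = true;
}
```

    With two nested stop/start pairs, the inner htw_start() leaves the walker off and only the outer one turns it back on.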

    Signed-off-by: Markos Chandras
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/9118/
    Signed-off-by: Ralf Baechle
    Signed-off-by: Greg Kroah-Hartman

    Markos Chandras
     
  • commit 06f34e1c28f3608b0ce5b310e41102d3fe7b65a1 upstream.

    We used to calculate page address differently in 2 cases:

    1. In virt_to_page(x) we do
    --->8---
    mem_map + (x - CONFIG_LINUX_LINK_BASE) >> PAGE_SHIFT
    --->8---

    2. In pte_page(x) we do
    --->8---
    mem_map + (pte_val(x) - PAGE_OFFSET) >> PAGE_SHIFT
    --->8---

    That leads to problems in case PAGE_OFFSET != CONFIG_LINUX_LINK_BASE -
    different pages will be selected depending on where and how we calculate
    page address.

    In particular in the STAR 9000853582 when gdb attempted to read memory
    of another process it got improper page in get_user_pages() because this
    is exactly one of the places where we search for a page by pte_page().

    The fix is trivial - we need to calculate page address similarly in both
    cases.
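
    The mismatch is easy to reproduce with plain arithmetic. In this sketch the page size and addresses are made up, and which base the real fix settled on is not assumed: the only point is that both lookups must use the same one.

```c
#include <stdint.h>

#define PAGE_SHIFT 13u  /* illustrative: 8 KiB pages */

/* Index into mem_map for an address, relative to the given base.
 * Note the parentheses: in C, '+' binds tighter than '>>', so the
 * subtraction and shift must be grouped before adding to mem_map. */
uint32_t page_index(uint32_t addr, uint32_t base)
{
    return (addr - base) >> PAGE_SHIFT;
}

/* Both lookups agree only if they are handed the same base. */
int lookups_agree(uint32_t addr, uint32_t virt_base, uint32_t pte_base)
{
    return page_index(addr, virt_base) == page_index(addr, pte_base);
}
```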

    Signed-off-by: Alexey Brodkin
    Signed-off-by: Vineet Gupta
    Signed-off-by: Greg Kroah-Hartman

    Alexey Brodkin
     
  • commit 5f1437f61a0b351d25b528c159360da3d5e8c77b upstream.

    When the UART is in DMA receive mode (RDMAS set) and one character
    just arrived while another interrupt is handled (e.g. TX), the RDRF
    (receiver data register full flag) is set due to the water level of
    1. But since the DMA will take care of this character, there is no
    need to handle it by calling lpuart_prepare_rx. Handling it leads to
    adding the RX timeout timer twice:

    [ 74.336698] Kernel BUG at 80053070 [verbose debug info unavailable]
    [ 74.342999] Internal error: Oops - BUG: 0 [#1] ARM
    [ 74.347817] Modules linked in:
    [ 74.350926] CPU: 0 PID: 0 Comm: swapper Not tainted 3.19.0-rc3-00001-g39d78e2 #1788
    [ 74.358617] Hardware name: Freescale Vybrid VF610 (Device Tree)
    [ 74.364563] task: 807a7678 ti: 8079c000 task.ti: 8079c000
    [ 74.370002] PC is at add_timer+0x24/0x28
    [ 74.373960] LR is at lpuart_int+0x15c/0x3d8
    [ 74.378171] pc : [] lr : [] psr: a0010193
    [ 74.378171] sp : 8079de10 ip : 8079de20 fp : 8079de1c
    [ 74.389694] r10: 807d44c0 r9 : 8688c300 r8 : 00000013
    [ 74.394943] r7 : 20010193 r6 : 00000000 r5 : 000000a0 r4 : 86997210
    [ 74.401498] r3 : ffffa7da r2 : 80817868 r1 : 86997210 r0 : 86997344
    [ 74.408052] Flags: NzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
    [ 74.415489] Control: 10c5387d Table: 8611c059 DAC: 00000015
    [ 74.421265] Process swapper (pid: 0, stack limit = 0x8079c230)
    ...

    Solve this by only executing the receiver path (lpuart_prepare_rx) if
    the DMA receive mode (RDMAS) is not set. Also, make sure the flag is
    cleared on initialization, in case it has been left set.

    This can be best reproduced using UART as a serial console, then
    running top while dd'ing data into the terminal.

    Signed-off-by: Stefan Agner
    Signed-off-by: Greg Kroah-Hartman

    Stefan Agner
     
  • commit 4a8588a1cf867333187d9ff071e6fbdab587d194 upstream.

    If the serial port gets closed while a RX transfer is in progress,
    the timer might fire after the serial port shutdown finished. This
    leads to a NULL pointer dereference:

    [ 7.508324] Unable to handle kernel NULL pointer dereference at virtual address 00000000
    [ 7.516590] pgd = 86348000
    [ 7.519445] [00000000] *pgd=86179831, *pte=00000000, *ppte=00000000
    [ 7.526145] Internal error: Oops: 17 [#1] ARM
    [ 7.530611] Modules linked in:
    [ 7.533876] CPU: 0 PID: 123 Comm: systemd Not tainted 3.19.0-rc3-00004-g5b11ea7 #1778
    [ 7.541827] Hardware name: Freescale Vybrid VF610 (Device Tree)
    [ 7.547862] task: 861c3400 ti: 86ac8000 task.ti: 86ac8000
    [ 7.553392] PC is at lpuart_timer_func+0x24/0xf8
    [ 7.558127] LR is at lpuart_timer_func+0x20/0xf8
    [ 7.562857] pc : [] lr : [] psr: 600b0113
    [ 7.562857] sp : 86ac9b90 ip : 86ac9b90 fp : 86ac9bbc
    [ 7.574467] r10: 80817180 r9 : 80817b98 r8 : 80817998
    [ 7.579803] r7 : 807acee0 r6 : 86989000 r5 : 00000100 r4 : 86997210
    [ 7.586444] r3 : 86ac8000 r2 : 86ac9bc0 r1 : 86997210 r0 : 00000000
    [ 7.593085] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
    [ 7.600341] Control: 10c5387d Table: 86348059 DAC: 00000015
    [ 7.606203] Process systemd (pid: 123, stack limit = 0x86ac8230)

    Set up the timer on UART startup, which allows deleting the timer
    unconditionally on shutdown. This also avoids re-initializing the
    timer on each transfer.

    Signed-off-by: Stefan Agner
    Signed-off-by: Greg Kroah-Hartman

    Stefan Agner
     
  • commit 29183a70b0b828500816bd794b3fe192fce89f73 upstream.

    Additional validation of adjtimex freq values, to avoid
    potential multiplication overflows, was added in commit
    5e5aeb4367b (time: adjtimex: Validate the ADJ_FREQUENCY values).

    Unfortunately the patch used LONG_MAX/MIN instead of
    LLONG_MAX/MIN. That was fine on 64-bit systems, but on
    32-bit systems the much smaller limits caused false
    positives, making most direct frequency adjustments fail
    with EINVAL.

    ntpd only does direct frequency adjustments at startup, so
    the issue was not as easily observed there, but other time
    sync applications like ptpd and chrony were more affected by
    the bug.

    See bugs:

    https://bugzilla.kernel.org/show_bug.cgi?id=92481
    https://bugzilla.redhat.com/show_bug.cgi?id=1188074

    This patch changes the checks to use LLONG_MAX for
    clarity. Additionally, the checks are disabled on 32-bit
    systems, since LLONG_MAX/PPM_SCALE is always larger than
    the 32-bit long freq value, so multiplication overflows
    aren't possible there.
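
    The bounds reasoning above can be sketched in user-space C. This is
    an illustration under stated assumptions, not the kernel's code: the
    function names are hypothetical, a 32-bit long is modelled with
    int32_t, and the scale factor value (65536000) is assumed for the
    example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative scale factor; this value is an assumption of the sketch. */
#define SKETCH_PPM_SCALE 65536000LL

/* Overly strict variant: the bound is derived from a 32-bit long's range,
 * so only |freq| <= 32 passes and ordinary adjustments are rejected. */
bool sketch_freq_ok_long32(int32_t freq)
{
    return freq >= INT32_MIN / SKETCH_PPM_SCALE &&
           freq <= INT32_MAX / SKETCH_PPM_SCALE;
}

/* Intended variant: the bound is derived from the range of the 64-bit
 * product, so it only rejects values whose scaling would overflow. */
bool sketch_freq_ok_llong(int64_t freq)
{
    return freq >= LLONG_MIN / SKETCH_PPM_SCALE &&
           freq <= LLONG_MAX / SKETCH_PPM_SCALE;
}

/* The multiplication the check is meant to protect. */
int64_t sketch_scale_freq(int64_t freq)
{
    assert(sketch_freq_ok_llong(freq));
    return freq * SKETCH_PPM_SCALE;
}
```

    With these definitions, a frequency value of 100 fails the bound
    derived from a 32-bit long but passes the LLONG-based one. No value
    that fits in 32 bits can fail the LLONG-based bound, which is why the
    check can be skipped entirely on 32-bit systems.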

    Reported-by: Josh Boyer
    Reported-by: George Joseph
    Tested-by: George Joseph
    Signed-off-by: John Stultz
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Sasha Levin
    Link: http://lkml.kernel.org/r/1423553436-29747-1-git-send-email-john.stultz@linaro.org
    [ Prettified the changelog and the comments a bit. ]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    John Stultz
     
  • commit df0036d117e6c9df36324e517728e33543065f9a upstream.

    There was a follow-on replacement patch for the prior
    "kgdb: Timeout if secondary CPUs ignore the roundup".

    See: https://lkml.org/lkml/2015/1/7/442

    This patch is the delta vs the patch that was committed upstream:
    * Fix an off-by-one error in kdb_cpu().
    * Replace NR_CPUS with CONFIG_NR_CPUS to tell checkpatch that we
    really want a static limit.
    * Removed the "KGDB: " prefix from the pr_crit() in debug_core.c
    (kgdb-next contains a patch which introduced pr_fmt() to this file,
    so the tag will now be applied automatically).

    Cc: Daniel Thompson
    Signed-off-by: Jason Wessel
    Signed-off-by: Greg Kroah-Hartman

    Jason Wessel
     
  • commit f7d4ca8bbfda23b4f1eae9b6757ff64166b093d5 upstream.

    Currently, when kdb traps printk messages, the raw log level prefix
    (consisting of '\001' followed by a numeral) does not get stripped off
    before the message is issued to the various I/O handlers supported by
    kdb. This causes annoying visual noise as well as causing problems
    grepping for ^. It is also a change of behaviour compared to normal
    printk() usage. For example, <SysRq>-h ends up with different output to
    that of kdb's "sr h".

    This patch addresses the problem by stripping log levels from messages
    before they are issued to the I/O handlers. printk(), which can also
    act as an I/O handler in some cases, is special-cased: if the caller
    provided a log level, then the prefix is preserved when the message is
    sent to printk().
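
    The stripping step can be sketched as a small stand-alone C helper. It
    is a minimal illustration, not the patch itself: the name
    sketch_skip_loglevel is hypothetical, and the prefix is assumed to be
    exactly the two bytes described above ('\001' followed by a numeral).
    The printk() special case is not modelled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Start-of-header byte that introduces a log level prefix. */
#define SKETCH_SOH '\001'

/* Return a pointer past a leading "\001<digit>" prefix, or msg itself
 * when no such prefix is present. The string is not modified. */
const char *sketch_skip_loglevel(const char *msg)
{
    assert(msg != NULL);
    if (msg[0] == SKETCH_SOH && isdigit((unsigned char)msg[1]))
        return msg + 2;
    return msg;
}
```

    An I/O handler would then be given sketch_skip_loglevel(msg) rather
    than msg, so the non-printable prefix never reaches the console.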

    The addition of non-printable characters to the output of kdb commands is a
    regression, albeit an extremely old one, introduced by commit
    04d2c8c83d0e ("printk: convert the format for KERN_ to a 2 byte
    pattern"). Note also that this patch does *not* restore the original
    behaviour from v3.5. Instead it makes printk() from within a kdb command
    display the message without any prefix (i.e. like printk() normally does).

    Signed-off-by: Daniel Thompson
    Cc: Joe Perches
    Signed-off-by: Jason Wessel
    Signed-off-by: Greg Kroah-Hartman

    Daniel Thompson
     
  • commit 146755923262037fc4c54abc28c04b1103f3cc51 upstream.

    The output of the KDB 'summary' command should report MemTotal, MemFree
    and Buffers in kB. The current code reports them in units of pages.

    A K(x) macro for this conversion is defined in the code, but not used.

    This patch applies the macro to convert the values to kB.
    Please include me on Cc on replies. I do not subscribe to linux-kernel.
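
    The page-to-kB conversion can be sketched as follows. The macro body
    is not shown in the message above, so the form used here is an
    assumption, as are the SKETCH_ names and the 4 KiB page size.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size of 4 KiB (2^12 bytes); real systems may differ. */
#define SKETCH_PAGE_SHIFT 12

/* Pages -> kB: a page holds 2^(PAGE_SHIFT - 10) kB, so shift left by that. */
#define SKETCH_K(x) ((x) << (SKETCH_PAGE_SHIFT - 10))

/* Convert a page count to kB, guarding against overflow of the shift. */
uint64_t sketch_pages_to_kb(uint64_t pages)
{
    assert(pages <= (UINT64_MAX >> (SKETCH_PAGE_SHIFT - 10)));
    return SKETCH_K(pages);
}
```

    Under the 4 KiB assumption, 256 pages correspond to 1024 kB, so a
    report in pages understates the kB figure by a factor of four.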

    Signed-off-by: Jay Lan
    Signed-off-by: Jason Wessel
    Signed-off-by: Greg Kroah-Hartman

    Jay Lan
     
  • commit 165235180ff61f0012ea68a299e46daec43dcaa7 upstream.

    mvebu_armada375_smp_wa_init is only used on armada 375 but is defined
    for all mvebu machines. As it calls a function that is only provided
    sometimes, this can result in a link error:

    arch/arm/mach-mvebu/built-in.o: In function `mvebu_armada375_smp_wa_init':
    :(.text+0x228): undefined reference to `mvebu_setup_boot_addr_wa'

    To solve this, we can just change the existing #ifdef around the
    function to also check for Armada375 SMP platforms.

    Signed-off-by: Arnd Bergmann
    Fixes: 305969fb6292 ("ARM: mvebu: use the common function for Armada 375 SMP workaround")
    Cc: Andrew Lunn
    Cc: Jason Cooper
    Cc: Gregory Clement
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     
  • commit 95fcedb027a27f32bf2434f9271635c380e57fb5 upstream.

    The vexpress tc2 power management code calls mcpm_loopback, which
    is only available if ARM_CPU_SUSPEND is enabled, otherwise we
    get a link error:

    arch/arm/mach-vexpress/built-in.o: In function `tc2_pm_init':
    arch/arm/mach-vexpress/tc2_pm.c:389: undefined reference to `mcpm_loopback'

    This explicitly selects ARM_CPU_SUSPEND like other platforms that
    need it.

    Signed-off-by: Arnd Bergmann
    Fixes: 3592d7e002438 ("ARM: 8082/1: TC2: test the MCPM loopback during boot")
    Acked-by: Nicolas Pitre
    Acked-by: Liviu Dudau
    Cc: Kevin Hilman
    Cc: Sudeep Holla
    Cc: Lorenzo Pieralisi
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     
  • commit ff34cae5b4fc7a84113d7c7e8611ba87a7c31dba upstream.

    A recent cleanup rearranged the Kconfig file for mach-bcm and
    accidentally dropped the dependency on ARCH_MULTI_V7, which
    makes it possible to now build the two mobile SoC platforms
    on an ARMv6-only kernel, resulting in a lot of Kconfig
    warnings like

    warning: ARCH_BCM_MOBILE selects ARM_ERRATA_775420 which has unmet direct dependencies (CPU_V7)

    and which of course cannot work on any machine.

    This puts back the dependencies as before.

    Signed-off-by: Arnd Bergmann
    Fixes: 64e74aa788f99 ("ARM: mach-bcm: ARCH_BCM_MOBILE: remove one level of menu from Kconfig")
    Acked-by: Florian Fainelli
    Acked-by: Scott Branden
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     
  • commit a1ad3b94a7661b643fef2efbc6fc217bd148f462 upstream.

    The automatic CPU power state machine for B15 CPUs does not work
    reliably as-is. This patch implements a manual sequence in software to
    replace it.

    This was tested successfully with over 10,000 hotplug cycles of
    something like this:

    echo 0 > /sys/devices/system/cpu/cpu1/online
    echo 1 > /sys/devices/system/cpu/cpu1/online

    whereas the existing sequence often locks up after a few hundred cycles.

    Fixes: 62639c2f5332 ("ARM: brcmstb: reintroduce SMP support")
    Acked-by: Gregory Fong
    Signed-off-by: Brian Norris
    Signed-off-by: Florian Fainelli
    Signed-off-by: Greg Kroah-Hartman

    Brian Norris
     
  • commit baad2dc49c5d970ea881d92981a1b76c94a7b7a1 upstream.

    Add regulator_has_full_constraints() call to spitz board file to let
    regulator core know that we do not have any additional regulators left.
    This lets it substitute unprovided regulators with dummy ones.

    This fixes the following warnings that can be seen on spitz if
    regulators are enabled:

    ads7846 spi2.0: unable to get regulator: -517
    spi spi2.0: Driver ads7846 requests probe deferral

    Signed-off-by: Dmitry Eremin-Solenikov
    Acked-by: Mark Brown
    Signed-off-by: Robert Jarzmik
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Eremin-Solenikov
     
  • commit 9bc78f32c2e430aebf6def965b316aa95e37a20c upstream.

    Add regulator_has_full_constraints() call to poodle board file to let
    regulator core know that we do not have any additional regulators left.
    This lets it substitute unprovided regulators with dummy ones.

    This fixes the following warnings that can be seen on poodle if
    regulators are enabled:

    ads7846 spi1.0: unable to get regulator: -517
    spi spi1.0: Driver ads7846 requests probe deferral
    wm8731 0-001b: Failed to get supply 'AVDD': -517
    wm8731 0-001b: Failed to request supplies: -517
    wm8731 0-001b: ASoC: failed to probe component -517

    Signed-off-by: Dmitry Eremin-Solenikov
    Acked-by: Mark Brown
    Signed-off-by: Robert Jarzmik
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Eremin-Solenikov
     
  • commit 271e80176aae4e5b481f4bb92df9768c6075bbca upstream.

    Add regulator_has_full_constraints() call to corgi board file to let
    regulator core know that we do not have any additional regulators left.
    This lets it substitute unprovided regulators with dummy ones.

    This fixes the following warnings that can be seen on corgi if
    regulators are enabled:

    ads7846 spi1.0: unable to get regulator: -517
    spi spi1.0: Driver ads7846 requests probe deferral
    wm8731 0-001b: Failed to get supply 'AVDD': -517
    wm8731 0-001b: Failed to request supplies: -517
    wm8731 0-001b: ASoC: failed to probe component -517
    corgi-audio corgi-audio: ASoC: failed to instantiate card -517

    Signed-off-by: Dmitry Eremin-Solenikov
    Acked-by: Mark Brown
    Signed-off-by: Robert Jarzmik
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Eremin-Solenikov
     
  • commit 19e3ae6b4f07a87822c1c9e7ed99d31860e701af upstream.

    The vcs device's poll/fasync support relies on the vt notifier to signal
    changes to the screen content. Notifier invocations were missing for
    changes that come through the selection interface, though. Fix that.

    Tested with BRLTTY 5.2.

    Signed-off-by: Nicolas Pitre
    Cc: Dave Mielke
    Signed-off-by: Greg Kroah-Hartman

    Nicolas Pitre