10 Dec, 2015

2 commits

  • commit 6b2a3d628aa752f0ab825fc6d4d07b09e274d1c1 upstream.

    The data to audit/record is in the 'from' buffer (i.e., the input
    read buffer).

    Fixes: 72586c6061ab ("n_tty: Fix auditing support for cannonical mode")
    Cc: Miloslav Trmač
    Signed-off-by: Peter Hurley
    Acked-by: Laura Abbott
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     
  • commit db27a7a37aa0b1f8b373f8b0fb72a2ccaafb85b7 upstream.

    Let's provide a function to lookup a VCPU by id.

    Reviewed-by: Christian Borntraeger
    Reviewed-by: Dominik Dingel
    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger
    [split patch from refactoring patch]
    Signed-off-by: Greg Kroah-Hartman

    David Hildenbrand
     

27 Oct, 2015

6 commits

  • commit ee296d7c5709440f8abd36b5b65c6b3e388538d9 upstream.

    They just call file_inode and then the corresponding *_inode_file_wait
    function. Just make them static inlines instead.

    Signed-off-by: Jeff Layton
    Cc: William Dauchy
    Signed-off-by: Greg Kroah-Hartman

    Jeff Layton
     
  • commit 29d01b22eaa18d8b46091d3c98c6001c49f78e4a upstream.

    Allow callers to pass in an inode instead of a filp.

    Signed-off-by: Jeff Layton
    Reviewed-by: "J. Bruce Fields"
    Tested-by: "J. Bruce Fields"
    Cc: William Dauchy
    Signed-off-by: Greg Kroah-Hartman

    Jeff Layton
     
  • commit fe32d3cd5e8eb0f82e459763374aa80797023403 upstream.

    These functions check should_resched() before unlocking spinlock/bh-enable:
    preempt_count always non-zero => should_resched() always returns false.
    cond_resched_lock() worked iff spin_needbreak is set.

    This patch adds argument "preempt_offset" to should_resched().

    preempt_count offset constants for that:

    PREEMPT_DISABLE_OFFSET - offset after preempt_disable()
    PREEMPT_LOCK_OFFSET - offset after spin_lock()
    SOFTIRQ_DISABLE_OFFSET - offset after local_bh_disable()
    SOFTIRQ_LOCK_OFFSET - offset after spin_lock_bh()

    Signed-off-by: Konstantin Khlebnikov
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Alexander Graf
    Cc: Boris Ostrovsky
    Cc: David Vrabel
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: bdb438065890 ("sched: Extract the basic add/sub preempt_count modifiers")
    Link: http://lkml.kernel.org/r/20150715095204.12246.98268.stgit@buzz
    Signed-off-by: Ingo Molnar
    Signed-off-by: Mike Galbraith
    Signed-off-by: Greg Kroah-Hartman

    Konstantin Khlebnikov
     
  • commit 90b62b5129d5cb50f62f40e684de7a1961e57197 upstream.

    "CHECK" suggests it's only used as a comparison mask. But now it's used
    further as a config-conditional preempt disabler offset. Let's
    disambiguate this name.

    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1431441711-29753-4-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Mike Galbraith
    Signed-off-by: Greg Kroah-Hartman

    Frederic Weisbecker
     
  • [ Upstream commit 31b33dfb0a144469dd805514c9e63f4993729a48 ]

    Earlier patch 6ae459bda tried to detect a void checksum-partial
    skb by comparing the pull length to the checksum offset. But it does
    not work for all cases since the checksum offset depends on
    updates to skb->data.

    The following patch fixes it by validating the checksum start offset
    after the skb->data pointer is updated. A negative checksum start
    offset means there is no need to checksum.

    Fixes: 6ae459bda ("skbuff: Fix skb checksum flag on skb pull")
    Reported-by: Andrew Vagin
    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Pravin B Shelar
     
  • [ Upstream commit 6ae459bdaaeebc632b16e54dcbabb490c6931d61 ]

    A VXLAN device can receive an skb with checksum partial. But the checksum
    offset could be in the outer header, which is pulled on receive. This results
    in a negative checksum offset for the skb. Such an skb can cause the assert
    failure in skb_checksum_help(). The following patch fixes the bug by setting
    checksum-none while pulling the outer header.

    The following is the kernel panic message from an old kernel hitting the bug.

    ------------[ cut here ]------------
    kernel BUG at net/core/dev.c:1906!
    RIP: 0010:[] skb_checksum_help+0x144/0x150
    Call Trace:

    [] queue_userspace_packet+0x408/0x470 [openvswitch]
    [] ovs_dp_upcall+0x5d/0x60 [openvswitch]
    [] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch]
    [] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch]
    [] ovs_vport_receive+0x2a/0x30 [openvswitch]
    [] vxlan_rcv+0x53/0x60 [openvswitch]
    [] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch]
    [] udp_queue_rcv_skb+0x2dc/0x3b0
    [] __udp4_lib_rcv+0x1cf/0x6c0
    [] udp_rcv+0x1a/0x20
    [] ip_local_deliver_finish+0xdd/0x280
    [] ip_local_deliver+0x88/0x90
    [] ip_rcv_finish+0x10d/0x370
    [] ip_rcv+0x235/0x300
    [] __netif_receive_skb+0x55d/0x620
    [] netif_receive_skb+0x80/0x90
    [] virtnet_poll+0x555/0x6f0
    [] net_rx_action+0x134/0x290
    [] __do_softirq+0xa8/0x210
    [] call_softirq+0x1c/0x30
    [] do_softirq+0x65/0xa0
    [] irq_exit+0x8e/0xb0
    [] do_IRQ+0x63/0xe0
    [] common_interrupt+0x6e/0x6e

    Reported-by: Anupam Chanda
    Signed-off-by: Pravin B Shelar
    Acked-by: Tom Herbert
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Pravin B Shelar
     

23 Oct, 2015

3 commits

  • commit 4bacc9c9234c7c8eec44f5ed4e960d9f96fa0f01 upstream.

    Make file->f_path always point to the overlay dentry so that the path in
    /proc/pid/fd is correct and to ensure that label-based LSMs have access to the
    overlay as well as the underlay (path-based LSMs probably don't need it).

    Using my union testsuite to set things up, before the patch I see:

    [root@andromeda union-testsuite]# bash 5 /a/foo107
    [root@andromeda union-testsuite]# stat /mnt/a/foo107
    ...
    Device: 23h/35d Inode: 13381 Links: 1
    ...
    [root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
    ...
    Device: 23h/35d Inode: 13381 Links: 1
    ...

    After the patch:

    [root@andromeda union-testsuite]# bash 5 /mnt/a/foo107
    [root@andromeda union-testsuite]# stat /mnt/a/foo107
    ...
    Device: 23h/35d Inode: 40346 Links: 1
    ...
    [root@andromeda union-testsuite]# stat -L /proc/$$/fd/5
    ...
    Device: 23h/35d Inode: 40346 Links: 1
    ...

    Note the change in where /proc/$$/fd/5 points to in the ls command. It was
    pointing to /a/foo107 (which doesn't exist) and now points to /mnt/a/foo107
    (which is correct).

    The inode accessed, however, is the lower layer. The union layer is on device
    25h/37d and the upper layer on 24h/36d.

    Signed-off-by: David Howells
    Signed-off-by: Al Viro
    Cc: "Kamata, Munehisa"
    Signed-off-by: Greg Kroah-Hartman

    David Howells
     
  • commit d31911b9374a76560d2c8ea4aa6ce5781621e81d upstream.

    Currently one mrq->data may be mapped by dma_map_sg() twice
    when the mmc subsystem prepares more than one new request, and the
    following log shows up:
    sdhci[sdhci_pre_dma_transfer] invalid cookie: 24, next-cookie 25

    In this condition, mrq->data maps a dma-memory(1) in sdhci_pre_req
    the first time, and maps another dma-memory(2) in sdhci_prepare_data
    the second time. But the driver only unmaps dma-memory(2), and
    dma-memory(1) is never unmapped, which causes the DMA memory leak.

    This patch uses another method to map the DMA memory for mrq->data,
    which fixes this DMA memory leak.

    Fixes: 348487cb28e6 ("mmc: sdhci: use pipeline mmc requests to improve performance")
    Reported-and-tested-by: Jiri Slaby
    Signed-off-by: Haibo Chen
    Signed-off-by: Ulf Hansson
    Signed-off-by: Jiri Slaby
    Signed-off-by: Greg Kroah-Hartman

    Haibo Chen
     
  • commit b7f76ea2ef6739ee484a165ffbac98deb855d3d3 upstream.

    Signed-off-by: Jann Horn
    Reviewed-by: Andy Lutomirski
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Jann Horn
     

30 Sep, 2015

3 commits

  • commit 841df7df196237ea63233f0f9eaa41db53afd70f upstream.

    Commit 6f6a6fda2945 "jbd2: fix ocfs2 corrupt when updating journal
    superblock fails" changed jbd2_cleanup_journal_tail() to return EIO
    when the journal is aborted. That makes logic in
    jbd2_log_do_checkpoint() bail out which is fine, except that
    jbd2_journal_destroy() expects jbd2_log_do_checkpoint() to always make
    progress in cleaning the journal. Without it jbd2_journal_destroy()
    just loops forever.

    Fix jbd2_journal_destroy() to clean up journal checkpoint lists if
    jbd2_log_do_checkpoint() fails with an error.

    Reported-by: Eryu Guan
    Tested-by: Eryu Guan
    Fixes: 6f6a6fda294506dfe0e3e0a253bb2d2923f28f0a
    Signed-off-by: Jan Kara
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
     
  • commit 0fdea1e8a2853f79d39b8555cc9de16a7e0ab26f upstream.

    Commit 718ba5b87343 moved the responsibility for unlocking the socket to
    xs_tcp_setup_socket, meaning that the socket will be unlocked before we
    know that it has finished trying to connect. The following patch is based on
    an initial patch by Russell King to ensure that we delay clearing the
    XPRT_CONNECTING flag until we either know that we failed to initiate
    a connection attempt, or the connection attempt itself failed.

    Fixes: 718ba5b87343 ("SUNRPC: Add helpers to prevent socket create from racing")
    Reported-by: Russell King
    Tested-by: Russell King
    Tested-by: Benjamin Coddington
    Signed-off-by: Trond Myklebust
    Signed-off-by: Greg Kroah-Hartman

    Trond Myklebust
     
  • commit 2f064f3485cd29633ad1b3cfb00cc519509a3d72 upstream.

    Commit c48a11c7ad26 ("netvm: propagate page->pfmemalloc to skb") added
    checks for page->pfmemalloc to __skb_fill_page_desc():

    if (page->pfmemalloc && !page->mapping)
            skb->pfmemalloc = true;

    It assumes page->mapping == NULL implies that page->pfmemalloc can be
    trusted. However, __delete_from_page_cache() can set page->mapping
    to NULL and leave page->index value alone. Due to being in union, a
    non-zero page->index will be interpreted as true page->pfmemalloc.

    So the assumption is invalid if the networking code can see such a page.
    And it seems it can. We have encountered this with an NFS over loopback
    setup when such a page is attached to a new skbuf. There is no copying
    going on in this case, so the page confuses __skb_fill_page_desc, which
    interprets the index as the pfmemalloc flag. The network stack drops
    packets that have been allocated using the reserves unless they are to
    be queued on sockets handling the swapping. The drop is what happens
    here, and it leads to hangs when the NFS client waits for a response
    from the server that has been dropped and thus never arrives.

    The struct page is already heavily packed so rather than finding another
    hole to put it in, let's do a trick instead. We can reuse the index
    again but define it to an impossible value (-1UL). This is the page
    index so it should never see the value that large. Replace all direct
    users of page->pfmemalloc by page_is_pfmemalloc which will hide this
    nastiness from unspoiled eyes.

    The information will get lost if somebody wants to use page->index
    obviously but that was the case before and the original code expected
    that the information should be persisted somewhere else if that is
    really needed (e.g. what SLAB and SLUB do).

    [akpm@linux-foundation.org: fix blooper in slub]
    Fixes: c48a11c7ad26 ("netvm: propagate page->pfmemalloc to skb")
    Signed-off-by: Michal Hocko
    Debugged-by: Vlastimil Babka
    Debugged-by: Jiri Bohac
    Cc: Eric Dumazet
    Cc: David Miller
    Acked-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Michal Hocko
     

22 Sep, 2015

4 commits

  • commit a068acf2ee77693e0bf39d6e07139ba704f461c3 upstream.

    Many file systems that implement the show_options hook fail to correctly
    escape their output which could lead to unescaped characters (e.g. new
    lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files. This
    could lead to confusion, spoofed entries (resulting in things like
    systemd issuing false d-bus "mount" notifications), and who knows what
    else. This looks like it would only be the root user stepping on
    themselves, but it's possible weird things could happen in containers or
    in other situations with delegated mount privileges.

    Here's an example using overlay with setuid fusermount trusting the
    contents of /proc/mounts (via the /etc/mtab symlink). Imagine the use
    of "sudo" is something more sneaky:

    $ BASE="ovl"
    $ MNT="$BASE/mnt"
    $ LOW="$BASE/lower"
    $ UP="$BASE/upper"
    $ WORK="$BASE/work/ 0 0
    none /proc fuse.pwn user_id=1000"
    $ mkdir -p "$LOW" "$UP" "$WORK"
    $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
    $ cat /proc/mounts
    none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
    none /proc fuse.pwn user_id=1000 0 0
    $ fusermount -u /proc
    $ cat /proc/mounts
    cat: /proc/mounts: No such file or directory

    This fixes the problem by adding new seq_show_option and
    seq_show_option_n helpers, and updating the vulnerable show_option
    handlers to use them as needed. Some, like SELinux, need to be open
    coded due to unusual existing escape mechanisms.

    [akpm@linux-foundation.org: add lost chunk, per Kees]
    [keescook@chromium.org: seq_show_option should be using const parameters]
    Signed-off-by: Kees Cook
    Acked-by: Serge Hallyn
    Acked-by: Jan Kara
    Acked-by: Paul Moore
    Cc: J. R. Okajima
    Signed-off-by: Kees Cook
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Kees Cook
     
  • commit 5d0ddfebb93069061880fc57ee4ba7246bd1e1ee upstream.

    Nick Meier reported a regression with HyperV that "
    After rebooting the VM, the following messages are logged in syslog
    when trying to load the tulip driver:
    tulip: Linux Tulip drivers version 1.1.15 (Feb 27, 2007)
    tulip: 0000:00:0a.0: PCI INT A: failed to register GSI
    tulip: Cannot enable tulip board #0, aborting
    tulip: probe of 0000:00:0a.0 failed with error -16
    Errors occur in 3.19.0 kernel
    Works in 3.17 kernel.
    "

    According to the ACPI dump file posted by Nick at
    https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1440072

    The ACPI MADT table includes an interrupt source override entry for
    ACPI SCI:
    [236h 0566 1] Subtable Type : 02
    [237h 0567 1] Length : 0A
    [238h 0568 1] Bus : 00
    [239h 0569 1] Source : 09
    [23Ah 0570 4] Interrupt : 00000009
    [23Eh 0574 2] Flags (decoded below) : 000D
    Polarity : 1
    Trigger Mode : 3

    And in DSDT table, we have _PRT method to define PCI interrupts, which
    eventually goes to:
    Name (PRSA, ResourceTemplate ()
    {
        IRQ (Level, ActiveLow, Shared, )
            {3,4,5,7,9,10,11,12,14,15}
    })
    Name (PRSB, ResourceTemplate ()
    {
        IRQ (Level, ActiveLow, Shared, )
            {3,4,5,7,9,10,11,12,14,15}
    })
    Name (PRSC, ResourceTemplate ()
    {
        IRQ (Level, ActiveLow, Shared, )
            {3,4,5,7,9,10,11,12,14,15}
    })
    Name (PRSD, ResourceTemplate ()
    {
        IRQ (Level, ActiveLow, Shared, )
            {3,4,5,7,9,10,11,12,14,15}
    })

    According to the MADT and DSDT tables, IRQ 9 may be used for:
    1) ACPI SCI in level, high mode
    2) PCI legacy IRQ in level, low mode
    So there's a conflict in polarity setting for IRQ 9.

    Prior to commit cd68f6bd53cf ("x86, irq, acpi: Get rid of special
    handling of GSI for ACPI SCI"), ACPI SCI was handled specially and
    there was no check for conflicts between ACPI SCI and PCI legacy IRQ.
    And it seems that the HyperV hypervisor doesn't make use of the
    polarity configuration in the IOAPIC entry, so it just worked.

    Commit cd68f6bd53cf gets rid of the special handling of ACPI SCI,
    and then the pin attribute checking code discloses the conflicts
    between ACPI SCI and PCI legacy IRQ on a HyperV virtual machine,
    and rejects the request to assign IRQ9 to PCI devices.

    So penalize legacy IRQ used by ACPI SCI and mark it unusable if ACPI
    SCI attributes conflict with PCI IRQ attributes.

    Please refer to following links for more information:
    https://bugzilla.kernel.org/show_bug.cgi?id=101301
    https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1440072

    Fixes: cd68f6bd53cf ("x86, irq, acpi: Get rid of special handling of GSI for ACPI SCI")
    Reported-and-tested-by: Nick Meier
    Acked-by: Thomas Gleixner
    Signed-off-by: Jiang Liu
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Greg Kroah-Hartman

    Jiang Liu
     
  • commit 932c435caba8a2ce473a91753bad0173269ef334 upstream.

    Add a dev_flags bit, PCI_DEV_FLAGS_VPD_REF_F0, to access VPD through
    function 0 to provide VPD access on other functions. This is for hardware
    devices that provide copies of the same VPD capability registers in
    multiple functions. Because the kernel expects that each function has its
    own registers, both the locking and the state tracking are affected by VPD
    accesses to different functions.

    On such devices for example, if a VPD write is performed on function 0,
    *any* later attempt to read VPD from any other function of that device will
    hang. This has to do with how the kernel tracks the expected value of the
    F bit per function.

    Concurrent accesses to different functions of the same device can not only
    hang but also corrupt both read and write VPD data.

    When hangs occur, typically the error message:

    vpd r/w failed. This is likely a firmware bug on this device.

    will be seen.

    Never set this bit on function 0 or there will be an infinite recursion.

    Signed-off-by: Mark Rustad
    Signed-off-by: Bjorn Helgaas
    Acked-by: Alexander Duyck
    Signed-off-by: Greg Kroah-Hartman

    Mark Rustad
     
  • commit c689a923c867eac40ed3826c1d9328edea8b6bc7 upstream.

    Add inverse unit conversion macro to convert from standard IIO units to
    units that might be used by some devices.

    Those are useful in combination with scale factors that are specified as
    IIO_VAL_FRACTIONAL. Typically the denominator for those specifications will
    contain the maximum raw value the sensor will generate and the numerator
    the value it maps to in a specific unit. Sometimes datasheets specify those
    in different units than the standard IIO units (e.g. degree/s instead of
    rad/s) and so we need to do a unit conversion.

    From a mathematical point of view it does not make a difference whether we
    apply the unit conversion to the numerator or the inverse unit conversion
    to the denominator since (x / y) / z = x / (y * z). But as the denominator
    is typically a larger value and we are rounding both the numerator and
    denominator to integer values, using the latter method gives us better
    precision (e.g. the relative error is smaller if we round 8000.3 to 8000
    rather than rounding 8.3 to 8).

    This is where the inverse unit conversion macros will be used.

    Marked for stable as used by some upcoming fixes.

    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Jonathan Cameron
    Signed-off-by: Greg Kroah-Hartman

    Lars-Peter Clausen
     

14 Sep, 2015

3 commits

  • commit b7560de198222994374c1340a389f12d5efb244a upstream.

    This helper is required for irq chips which do not implement an
    irq_set_type callback and need to call down the irq domain hierarchy
    for the actual trigger type change.

    This helper is required to fix further wreckage caused by the
    conversion of TI OMAP to hierarchical irq domains and therefore tagged
    for stable.

    [ tglx: Massaged changelog ]

    Signed-off-by: Grygorii Strashko
    Cc: Sudeep Holla
    Cc: stable@vger.kernel.org # 4.1
    Link: http://lkml.kernel.org/r/1439554830-19502-3-git-send-email-grygorii.strashko@ti.com
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Grygorii Strashko
     
  • commit 74a80d67b8316eb3fbeb73dafc060a5a0a708587 upstream.

    This reverts commit 42b966fbf35da9c87f08d98f9b8978edf9e717cf.

    As implemented, ACS-4 sense reporting for ATA devices bypasses error
    diagnosis and handling in libata degrading EH behavior significantly.
    Revert the related changes for now.

    Signed-off-by: Tejun Heo
    Cc: Hannes Reinecke
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • commit 84ded2f8e7dda336fc2fb3570726ceb3b3b3590f upstream.

    This reverts commit fe7173c206de63fc28475ee6ae42ff95c05692de.

    As implemented, ACS-4 sense reporting for ATA devices bypasses error
    diagnosis and handling in libata degrading EH behavior significantly.
    Revert the related changes for now.

    ATA_ID_COMMAND_SET_3/4 constants are not reverted as they're used by
    later changes.

    Signed-off-by: Tejun Heo
    Cc: Hannes Reinecke
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     

17 Aug, 2015

1 commit

  • commit 5f867db63473f32cce1b868e281ebd42a41f8fad upstream.

    Commit 66507c7bc8895f0da6b ("mtd: nand: Add support to use nand_base
    poi databuf as bounce buffer") added a flag NAND_USE_BOUNCE_BUFFER
    using the same bit value as the existing NAND_BUSWIDTH_AUTO.

    Cc: Kamal Dasu
    Fixes: 66507c7bc8895f0da6b ("mtd: nand: Add support to use nand_base
    poi databuf as bounce buffer")
    Signed-off-by: Scott Wood
    Signed-off-by: Brian Norris
    Signed-off-by: Greg Kroah-Hartman

    Scott Wood
     

11 Aug, 2015

3 commits

  • commit 4c62360d7562a20c996836d163259c87d9378120 upstream.

    The memory error record structure includes as its first field a
    bitmask of which subsequent fields are valid. This allows new fields
    to be added to the structure while keeping compatibility with older
    software that parses these records. This mechanism was used between
    versions 2.2 and 2.3 to add four new fields, growing the size of the
    structure from 73 bytes to 80. But Linux just added all the new
    fields so this test:
    if (gdata->error_data_length >= sizeof(*mem_err))
            cper_print_mem(newpfx, mem_err);
    else
            goto err_section_too_small;
    now makes Linux complain about old format records being too short.

    Add a definition for the old format of the structure and use that
    for the minimum size check. Pass the actual size to cper_print_mem()
    so it can sanity check the validation_bits field to ensure that if
    a BIOS using the old format sets bits as if it were new, we won't
    access fields beyond the end of the structure.

    Signed-off-by: Tony Luck
    Signed-off-by: Matt Fleming
    Signed-off-by: Greg Kroah-Hartman

    Luck, Tony
     
  • commit e3eea1404f5ff7a2ceb7b5e7ba412a6fd94f2935 upstream.

    Commit 4104d326b670 ("ftrace: Remove global function list and call function
    directly") simplified the ftrace code by removing the global_ops list with a
    new design. But this cleanup also broke the filtering of PIDs that are added
    to the set_ftrace_pid file.

    Add back the proper hooks to have pid filtering working once again.

    Reported-by: Matt Fleming
    Reported-by: Richard Weinberger
    Tested-by: Matt Fleming
    Signed-off-by: Steven Rostedt
    Signed-off-by: Greg Kroah-Hartman

    Steven Rostedt (Red Hat)
     
  • commit d3b58c47d330de8c29898fe9746f7530408f8a59 upstream.

    Commit 514ac99c64b "can: fix multiple delivery of a single CAN frame for
    overlapping CAN filters" requires the skb->tstamp to be set to check for
    identical CAN skbs.

    Because user space applications did not necessarily require timestamping,
    this timestamp was not always generated, which led to commit 36c01245eb8
    "can: fix loss of CAN frames in raw_rcv" - which forces the timestamp to be
    set in all CAN related skbuffs by introducing several __net_timestamp() calls.

    This forces e.g. out of tree drivers which are not using alloc_can{,fd}_skb()
    to add __net_timestamp() after skbuff creation to prevent the frame loss fixed
    in mainline Linux.

    This patch removes the timestamp dependency and uses an atomic counter to
    create a unique identifier together with the skbuff pointer.

    Btw: the new skbcnt element introduced in struct can_skb_priv has to be
    initialized with zero in out-of-tree drivers which are not using
    alloc_can{,fd}_skb() too.

    Signed-off-by: Oliver Hartkopp
    Signed-off-by: Marc Kleine-Budde
    Signed-off-by: Greg Kroah-Hartman

    Oliver Hartkopp
     

04 Aug, 2015

11 commits

  • commit 764ad8ba8cd4c6f836fca9378f8c5121aece0842 upstream.

    The current buffer is much too small if you have a relatively long
    hostname. Bring it up to the size of the one that SETCLIENTID has.

    Reported-by: Michael Skralivetsky
    Signed-off-by: Jeff Layton
    Signed-off-by: Trond Myklebust
    Signed-off-by: Greg Kroah-Hartman

    Jeff Layton
     
  • commit 496e7ce2a46562938edcb74f65b26068ee8895f6 upstream.

    If GPIOLIB=n:

    drivers/leds/leds-gpio.c: In function ‘gpio_leds_create’:
    drivers/leds/leds-gpio.c:187: error: implicit declaration of function ‘devm_get_gpiod_from_child’
    drivers/leds/leds-gpio.c:187: warning: assignment makes pointer from integer without a cast

    Add dummies for fwnode_get_named_gpiod() and devm_get_gpiod_from_child()
    for the !GPIOLIB case to fix this.

    Signed-off-by: Geert Uytterhoeven
    Fixes: 40b7318319281b1b ("gpio: Support for unified device properties interface")
    Acked-by: Alexandre Courbot
    Acked-by: Linus Walleij
    Signed-off-by: Bryan Wu
    Signed-off-by: Greg Kroah-Hartman

    Geert Uytterhoeven
     
  • commit c8fff7bc5bba6bd59cad40441c189c4efe7190f6 upstream.

    Node 0 might be offline just like any other NUMA node;
    in this case the kernel cannot handle memory allocation and crashes.

    Signed-off-by: Konstantin Khlebnikov
    Fixes: 0c3f061c195c ("of: implement of_node_to_nid as a weak function")
    Signed-off-by: Grant Likely
    Signed-off-by: Greg Kroah-Hartman

    Konstantin Khlebnikov
     
  • commit b86a50c3b5414eafdbee7f34af4a201a4a7817c2 upstream.

    Cleanup commit 73679e508201 ("compiler-intel.h: Remove duplicate
    definition") removed the double definition of __memory_barrier()
    intrinsics.

    However, in doing so, it also removed the preceding #undef barrier by
    accident, meaning, the actual barrier() macro from compiler-gcc.h with
    inline asm is still in place as __GNUC__ is provided.

    Subsequently, barrier() can never be defined as __memory_barrier() from
    compiler.h since it already has a definition in place and if we trust
    the comment in compiler-intel.h, ecc doesn't support gcc specific asm
    statements.

    I don't have an ecc at hand (unsure if that's still used in the field?)
    and only found this by accident during code review; a revert of that
    cleanup would be the simplest option.

    Fixes: 73679e508201 ("compiler-intel.h: Remove duplicate definition")
    Signed-off-by: Daniel Borkmann
    Reviewed-by: Pranith Kumar
    Cc: Pranith Kumar
    Cc: H. Peter Anvin
    Cc: mancha security
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Daniel Borkmann
     
  • commit 0294112ee3135fbd15eaa70015af8283642dd970 upstream.

    This effectively reverts the following three commits:

    7bc10388ccdd ACPI / resources: free memory on error in add_region_before()
    0f1b414d1907 ACPI / PNP: Avoid conflicting resource reservations
    b9a5e5e18fbf ACPI / init: Fix the ordering of acpi_reserve_resources()

    (commit b9a5e5e18fbf introduced regressions some of which, but not
    all, were addressed by commit 0f1b414d1907 and commit 7bc10388ccdd
    was a fixup on top of the latter) and causes ACPI fixed hardware
    resources to be reserved at the fs_initcall_sync stage of system
    initialization.

    The story is as follows. First, a boot regression was reported due
    to an apparent resource reservation ordering change after a commit
    that shouldn't lead to such changes. Investigation led to the
    conclusion that the problem happened because acpi_reserve_resources()
    was executed at the device_initcall() stage of system initialization
    which wasn't strictly ordered with respect to driver initialization
    (and with respect to the initialization of the pcieport driver in
    particular), so a random change causing the device initcalls to be
    run in a different order might break things.

    The response to that was to attempt to run acpi_reserve_resources()
    as soon as we knew that ACPI would be in use (commit b9a5e5e18fbf).
    However, that turned out to be too early, because it caused resource
    reservations made by the PNP system driver to fail on at least one
    system and that failure was addressed by commit 0f1b414d1907.

    That fix still turned out to be insufficient, though, because
    calling acpi_reserve_resources() before the fs_initcall stage of
    system initialization caused a boot regression to happen on the
    eCAFE EC-800-H20G/S netbook. That meant that we only could call
    acpi_reserve_resources() at the fs_initcall initialization stage
    or later, but then we might just as well call it after the PNP
    initialization in which case commit 0f1b414d1907 wouldn't be
    necessary any more.

    For this reason, the changes made by commit 0f1b414d1907 are reverted
    (along with a memory leak fixup on top of that commit), the changes
    made by commit b9a5e5e18fbf that went too far are reverted too and
    acpi_reserve_resources() is changed into fs_initcall_sync, which
    will cause it to be executed after the PNP subsystem initialization
    (which is an fs_initcall) and before device initcalls (including
    the pcieport driver initialization) which should avoid the initial
    issue.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=100581
    Link: http://marc.info/?t=143092384600002&r=1&w=2
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=99831
    Link: http://marc.info/?t=143389402600001&r=1&w=2
    Fixes: b9a5e5e18fbf "ACPI / init: Fix the ordering of acpi_reserve_resources()"
    Reported-by: Roland Dreier
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Greg Kroah-Hartman

    Rafael J. Wysocki
     
  • commit af34d637637eabaf49406eb35c948cd51ba262a6 upstream.

    Since no longer limiting max_sectors to BLK_DEF_MAX_SECTORS (commit 34b48db66e08),
    data corruption may occur on ST380013AS drive configured on 82801JI (ICH10 Family)
    SATA controller. This patch will allow the driver to limit max_sectors as before:

    # cat /sys/block/sdb/queue/max_sectors_kb
    512

    I was able to double the max_sectors_kb value up to 16384 on linux-4.2.0-rc2
    before seeing corruption, but seems safer to use previous limit. Without this
    patch max_sectors_kb will be 32767.

    tj: Minor comment update.

    Reported-by: Jeff Moyer
    Signed-off-by: David Milburn
    Signed-off-by: Tejun Heo
    Fixes: 34b48db66e08 ("block: remove artifical max_hw_sectors cap")
    Signed-off-by: Greg Kroah-Hartman

    David Milburn
     
  • commit 71d126fd28de2d4d9b7b2088dbccd7ca62fad6e0 upstream.

    Some devices lose data on TRIM whether queued or not. This patch adds
    a horkage to disable TRIM.

    tj: Collapsed unnecessary if() nesting.

    Signed-off-by: Arne Fitzenreiter
    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Arne Fitzenreiter
     
  • commit 5d3abf8ff67f49271a42c0f7fa4f20f9e046bf0e upstream.

    Some devices advertise support for the READ/WRITE LOG DMA EXT commands
    but fail when we try to issue them. This can lead to queued TRIM being
    unintentionally disabled since the relevant feature flag is located in a
    general purpose log page.

    Fall back to unqueued READ LOG EXT if the DMA variant fails while
    reading a log page.
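
A minimal sketch of the fallback described above, written as a user-space model. The struct, the enum values and the function names are hypothetical and do not correspond to libata symbols; the only behaviour taken from the changelog is "try the DMA variant, fall back to the unqueued command if it fails".

```c
#include <stdbool.h>

enum log_cmd { CMD_READ_LOG_EXT = 0, CMD_READ_LOG_DMA_EXT = 1 };

/* Hypothetical device model: what it advertises versus what actually works. */
struct model_dev {
    bool advertises_dma_log;
    bool dma_log_works;
};

/* Returns 0 on success, -1 if the device fails the command. */
static int issue_cmd(const struct model_dev *dev, enum log_cmd cmd)
{
    if (cmd == CMD_READ_LOG_DMA_EXT && !dev->dma_log_works)
        return -1;
    return 0;
}

/* Returns the command that ended up reading the log page, or -1 on failure. */
static int log_cmd_used(bool advertises_dma_log, bool dma_log_works)
{
    struct model_dev dev = { advertises_dma_log, dma_log_works };

    if (dev.advertises_dma_log &&
        issue_cmd(&dev, CMD_READ_LOG_DMA_EXT) == 0)
        return CMD_READ_LOG_DMA_EXT;

    /* Either DMA was never advertised or it failed: use the plain command. */
    if (issue_cmd(&dev, CMD_READ_LOG_EXT) == 0)
        return CMD_READ_LOG_EXT;
    return -1;
}
```

A device that advertises the DMA variant but fails it still gets its log page read in this model, so a feature flag stored in that page would not be lost.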

    Signed-off-by: Martin K. Petersen
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Martin K. Petersen
     
  • commit 6f6a6fda294506dfe0e3e0a253bb2d2923f28f0a upstream.

    If updating the journal superblock fails after the journal data has
    been flushed, the error is dropped and the caller is misled into
    treating this as the normal case. In ocfs2, the checkpoint is then
    treated as successful and the other node can get the lock to update.
    Since sb_start still points to the old log block, journal recovery by
    the other node will rewrite the journal data. The new updates are thus
    overwritten and ocfs2 is corrupted. So in the above case we have to
    return the error; ocfs2_commit_cache will then handle the error and
    prevent the other node from updating first. Only after recovering the
    journal can it do the new updates.

    The issue discussion mail can be found at:
    https://oss.oracle.com/pipermail/ocfs2-devel/2015-June/010856.html
    http://comments.gmane.org/gmane.comp.file-systems.ext4/48841

    [ Fixed bug in patch which allowed a non-negative error return from
    jbd2_cleanup_journal_tail() to leak out of jbd2_journal_flush(); this
    was causing xfstests ext4/306 to fail. -- Ted ]
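
The two points above (propagate the failure, but do not let a non-negative status leak out) can be sketched as follows. This is a user-space model under stated assumptions: the function names are hypothetical, and the convention that the cleanup step returns a negative errno on failure, 0 on success and 1 when there was nothing to do is assumed for the sketch.

```c
#include <errno.h>

/* Models the tail cleanup step. Assumed convention:
 * <0 = error, 0 = tail updated, 1 = nothing to do. */
static int cleanup_tail_model(int outcome)
{
    return outcome;
}

/* Behaviour described as the bug: the failure is dropped. */
static int flush_ignoring_error(int outcome)
{
    (void)cleanup_tail_model(outcome);
    return 0;
}

/* Behaviour described as the fix: real errors reach the caller,
 * while the positive "nothing to do" status is folded into success. */
static int flush_propagating_error(int outcome)
{
    int err = cleanup_tail_model(outcome);

    if (err < 0)
        return err;
    return 0;
}
```

A caller that checks the return value can then refuse to treat the checkpoint as successful when flush_propagating_error() reports -EIO.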

    Reported-by: Yiwen Jiang
    Signed-off-by: Joseph Qi
    Signed-off-by: Theodore Ts'o
    Tested-by: Yiwen Jiang
    Cc: Junxiao Bi
    Signed-off-by: Greg Kroah-Hartman

    Joseph Qi
     
  • commit bd7ade3cd9b0850264306f5c2b79024a417b6396 upstream.

    sb_getblk() is used during ext4 (and possibly other FSes) writeback
    paths. Sometimes such paths require allocating memory while guaranteeing
    that the allocation won't block. Currently, however, there is no way
    to pass user flags to sb_getblk, which could lead to deadlocks.

    This patch implements sb_getblk_gfp, whose only difference is that it
    accepts user-provided GFP flags.
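
The shape of such a pair of helpers can be sketched in user-space C. Everything here is hypothetical: the flag bits are not the kernel's GFP values, the function names are made up for the model, and the "lookup" merely returns the flags it would allocate with so that the difference is observable.

```c
/* Hypothetical flag bits for this model only. */
#define MODEL_GFP_MOVABLE 0x1u
#define MODEL_GFP_NOFS    0x2u

/* Variant taking caller-provided flags; returns the flags it would use. */
static unsigned int getblk_model_gfp(unsigned long block, unsigned int gfp)
{
    (void)block;  /* the block number is irrelevant to this sketch */
    return gfp;
}

/* Original-style entry point: the allocation flags are fixed by the helper. */
static unsigned int getblk_model(unsigned long block)
{
    return getblk_model_gfp(block, MODEL_GFP_MOVABLE);
}
```

A writeback path that must not recurse into the filesystem would call the _gfp variant with a restrictive flag, while existing callers keep the default.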

    Signed-off-by: Nikolay Borisov
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Nikolay Borisov
     
  • commit 1e25aa9641e8f3fa39cd5e46b4afcafd7f12a44b upstream.

    By default all the sensors are in the runtime suspended state (the
    lowest power state). During the Linux suspend process, all runtime
    suspended devices are resumed and then suspended. Once runtime PM was
    introduced for HID sensors, this caused all sensors to power up and
    added delay to the suspend time. The opposite happens during the
    resume process.

    To fix this, we power up the sensors only when the request is issued
    by the user (raw or triggered). This way, when runtime resume calls
    for powerup, it simply returns because the request does not match the
    user requested state.
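
A minimal sketch of the "only honour requests that match the user requested state" idea, as a user-space model. The struct and function names are hypothetical and are not the HID sensor driver's API.

```c
#include <stdbool.h>

/* Hypothetical sensor model. */
struct sensor_model {
    bool user_requested_on;  /* last state asked for by the user */
    bool powered;            /* whether the sensor is powered up */
};

/* Common power path: ignore requests that the user did not ask for. */
static void set_power(struct sensor_model *s, bool on)
{
    if (on != s->user_requested_on)
        return;
    s->powered = on;
}

/* User-initiated request (e.g. a raw or triggered read). */
static void user_request(struct sensor_model *s, bool on)
{
    s->user_requested_on = on;
    set_power(s, on);
}

/* Runtime resume always asks for power-up. */
static void runtime_resume(struct sensor_model *s)
{
    set_power(s, true);
}

/* Demonstration helper: is the sensor powered after a runtime resume? */
static bool powered_after_runtime_resume(bool user_wants_on)
{
    struct sensor_model s = { false, false };

    if (user_wants_on)
        user_request(&s, true);
    runtime_resume(&s);
    return s.powered;
}
```

In this model a sensor that nobody is using stays powered down across a runtime resume, which is what avoids the extra delay.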

    Note this is a regression fix as the increase in suspend / resume
    times can be substantial (report of 8 seconds on Len's laptop!)

    Signed-off-by: Srinivas Pandruvada
    Tested-by: Len Brown
    Signed-off-by: Jonathan Cameron
    Signed-off-by: Greg Kroah-Hartman

    Srinivas Pandruvada
     

22 Jul, 2015

4 commits

  • commit 3a9ad0b4fdcd57f775d3615004c8c64c021a9e7d upstream.

    David Ahern reported that d63e2e1f3df9 ("sparc/PCI: Clip bridge windows
    to fit in upstream windows") fails to boot on sparc/T5-8:

    pci 0000:06:00.0: reg 0x184: can't handle BAR above 4GB (bus address 0x110204000)

    The problem is that sparc64 assumed that dma_addr_t only needed to hold DMA
    addresses, i.e., bus addresses returned via the DMA API (dma_map_single(),
    etc.), while the PCI core assumed dma_addr_t could hold *any* bus address,
    including raw BAR values. On sparc64, all DMA addresses fit in 32 bits, so
    dma_addr_t is a 32-bit type. However, BAR values can be 64 bits wide, so
    they don't fit in a dma_addr_t. d63e2e1f3df9 added new checking that
    tripped over this mismatch.

    Add pci_bus_addr_t, which is wide enough to hold any PCI bus address,
    including both raw BAR values and DMA addresses. This will be 64 bits
    on 64-bit platforms and on platforms with a 64-bit dma_addr_t. Then
    dma_addr_t only needs to be wide enough to hold addresses from the DMA API.
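
The width mismatch can be illustrated with the bus address from the error message above. This is a user-space sketch: the two typedef names are stand-ins chosen for the example (a 32-bit type modelling a narrow dma_addr_t and a 64-bit type modelling pci_bus_addr_t), and the helper functions are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t narrow_addr_t;  /* models a 32-bit dma_addr_t */
typedef uint64_t wide_addr_t;    /* models a 64-bit pci_bus_addr_t */

/* True if the bus address survives a round trip through the narrow type. */
static bool fits_in_narrow(wide_addr_t addr)
{
    return (wide_addr_t)(narrow_addr_t)addr == addr;
}

/* What is left of the address once it is stored in the narrow type. */
static narrow_addr_t truncate_to_narrow(wide_addr_t addr)
{
    return (narrow_addr_t)addr;
}
```

For the address 0x110204000 the upper bit is lost and only 0x10204000 remains, so a 32-bit type cannot represent that BAR value.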

    [bhelgaas: changelog, bugzilla, Kconfig to ensure pci_bus_addr_t is at
    least as wide as dma_addr_t, documentation]
    Fixes: d63e2e1f3df9 ("sparc/PCI: Clip bridge windows to fit in upstream windows")
    Fixes: 23b13bc76f35 ("PCI: Fail safely if we can't handle BARs larger than 4GB")
    Link: http://lkml.kernel.org/r/CAE9FiQU1gJY1LYrxs+ma5LCTEEe4xmtjRG0aXJ9K_Tsu+m9Wuw@mail.gmail.com
    Link: http://lkml.kernel.org/r/1427857069-6789-1-git-send-email-yinghai@kernel.org
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=96231
    Reported-by: David Ahern
    Tested-by: David Ahern
    Signed-off-by: Yinghai Lu
    Signed-off-by: Bjorn Helgaas
    Acked-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Yinghai Lu
     
  • commit 0824965140fff1bf640a987dc790d1594a8e0699 upstream.

    Refine the mechanism introduced by commit f244d8b623da ("ACPIPHP / radeon /
    nouveau: Fix VGA switcheroo problem related to hotplug") to propagate the
    ignore_hotplug setting of the device to its parent bridge in case hotplug
    notifications related to the graphics adapter switching are given for the
    bridge rather than for the device itself (they need to be ignored in both
    cases).

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=61891
    Link: https://bugs.freedesktop.org/show_bug.cgi?id=88927
    Fixes: b440bde74f04 ("PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device")
    Reported-and-tested-by: tiagdtd-lava
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: Greg Kroah-Hartman

    Rafael J. Wysocki
     
  • commit 8a8c35fadfaf55629a37ef1a8ead1b8fb32581d2 upstream.

    Beginning at commit d52d3997f843 ("ipv6: Create percpu rt6_info"), the
    following INFO splat is logged:

    ===============================
    [ INFO: suspicious RCU usage. ]
    4.1.0-rc7-next-20150612 #1 Not tainted
    -------------------------------
    kernel/sched/core.c:7318 Illegal context switch in RCU-bh read-side critical section!
    other info that might help us debug this:
    rcu_scheduler_active = 1, debug_locks = 0
    3 locks held by systemd/1:
    #0: (rtnl_mutex){+.+.+.}, at: [] rtnetlink_rcv+0x1f/0x40
    #1: (rcu_read_lock_bh){......}, at: [] ipv6_add_addr+0x62/0x540
    #2: (addrconf_hash_lock){+...+.}, at: [] ipv6_add_addr+0x184/0x540
    stack backtrace:
    CPU: 0 PID: 1 Comm: systemd Not tainted 4.1.0-rc7-next-20150612 #1
    Hardware name: TOSHIBA TECRA A50-A/TECRA A50-A, BIOS Version 4.20 04/17/2014
    Call Trace:
    dump_stack+0x4c/0x6e
    lockdep_rcu_suspicious+0xe7/0x120
    ___might_sleep+0x1d5/0x1f0
    __might_sleep+0x4d/0x90
    kmem_cache_alloc+0x47/0x250
    create_object+0x39/0x2e0
    kmemleak_alloc_percpu+0x61/0xe0
    pcpu_alloc+0x370/0x630

    Additional backtrace lines are truncated. In addition, the above splat
    is followed by several "BUG: sleeping function called from invalid
    context at mm/slub.c:1268" outputs. As suggested by Martin KaFai Lau,
    these are the clue to the fix. Routine kmemleak_alloc_percpu() always
    uses GFP_KERNEL for its allocations, whereas it should follow the gfp
    from its callers.
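
The difference between a hook that hard-codes its allocation flags and one that follows its caller can be sketched as below. This is a user-space model: the flag values and function names are hypothetical, and "atomic" simply stands for a context in which sleeping is not allowed.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this model only. */
#define MODEL_MAY_SLEEP  0x1u
#define MODEL_GFP_KERNEL MODEL_MAY_SLEEP  /* may sleep */
#define MODEL_GFP_ATOMIC 0x0u             /* must not sleep */

/* True if an allocation with these flags could sleep where it must not. */
static bool alloc_would_sleep_in_atomic(unsigned int gfp, bool in_atomic)
{
    return in_atomic && (gfp & MODEL_MAY_SLEEP) != 0;
}

/* Behaviour described as the bug: the tracking hook ignores the caller's flags. */
static bool track_fixed_flags(unsigned int caller_gfp, bool in_atomic)
{
    (void)caller_gfp;
    return alloc_would_sleep_in_atomic(MODEL_GFP_KERNEL, in_atomic);
}

/* Behaviour described as the fix: the hook uses the caller's flags. */
static bool track_caller_flags(unsigned int caller_gfp, bool in_atomic)
{
    return alloc_would_sleep_in_atomic(caller_gfp, in_atomic);
}
```

With a non-sleeping caller in atomic context, only the fixed-flags variant reports a violation in this model.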

    Reviewed-by: Catalin Marinas
    Reviewed-by: Kamalesh Babulal
    Acked-by: Martin KaFai Lau
    Signed-off-by: Larry Finger
    Cc: Martin KaFai Lau
    Cc: Catalin Marinas
    Cc: Tejun Heo
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Larry Finger
     
  • commit 7f1a57fdd6cb6e7be2ed31878a34655df38e1861 upstream.

    Don't call the power_supply_changed() from power_supply_register() when
    parent is still probing because it may lead to accessing parent too
    early.

    In bq27x00_battery this caused a NULL pointer exception because the
    uevent of power_supply_changed called back the get_property() method
    provided by the driver. The get_property() method accessed a pointer
    which should be returned by power_supply_register().

    Starting from bq27x00_battery_probe():
      di->bat = power_supply_register()
        power_supply_changed()
          kobject_uevent()
            power_supply_uevent()
              power_supply_show_property()
                power_supply_get_property()
                  bq27x00_battery_get_property()
                    dereference of di->bat which is NULL here

    The dereference of di->bat (value returned by power_supply_register())
    is the currently visible problem. However calling back the methods
    provided by driver before ending the probe may lead to accessing other
    driver-related data which is not yet initialized.

    The call to power_supply_changed() is postponed until probing ends,
    i.e. until the mutex of the parent device is released.
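
The effect of postponing the notification can be sketched with a small user-space model. All names are hypothetical, a return value of -1 stands in for the NULL dereference, and the explicit probe_finished() step is an assumption of the sketch rather than a description of how the kernel defers the call.

```c
#include <stddef.h>

struct supply_model;

/* Driver-private data; `bat` is only set once registration has returned. */
struct driver_data {
    struct supply_model *bat;
    int last_result;  /* 1 = callback never ran, 0 = ok, -1 = NULL access */
};

struct supply_model {
    struct driver_data *drv;
    int changed_pending;
};

/* Driver callback: needs di->bat, which registration has yet to return. */
static int get_property(const struct driver_data *di)
{
    return di->bat ? 0 : -1;
}

static void notify_changed(struct supply_model *s)
{
    s->drv->last_result = get_property(s->drv);
}

static struct supply_model *register_supply(struct supply_model *s,
                                            struct driver_data *di, int defer)
{
    s->drv = di;
    s->changed_pending = defer;
    if (!defer)
        notify_changed(s);  /* runs before the caller can store the handle */
    return s;
}

/* Runs once probing is over and the handle has been stored. */
static void probe_finished(struct supply_model *s)
{
    if (s->changed_pending) {
        s->changed_pending = 0;
        notify_changed(s);
    }
}

/* Demonstration helper: result seen by the driver callback during probe. */
static int probe_model(int defer)
{
    struct supply_model s;
    struct driver_data di = { NULL, 1 };

    di.bat = register_supply(&s, &di, defer);
    probe_finished(&s);
    return di.last_result;
}
```

Calling probe_model(0) reproduces the early callback on an unset pointer, while probe_model(1) runs the callback only after di.bat has been assigned.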

    Reported-by: H. Nikolaus Schaller
    Signed-off-by: Krzysztof Kozlowski
    Fixes: 297d716f6260 ("power_supply: Change ownership from driver to core")
    Tested-By: Dr. H. Nikolaus Schaller
    Signed-off-by: Sebastian Reichel
    Signed-off-by: Greg Kroah-Hartman

    Krzysztof Kozlowski