29 Nov, 2016

1 commit


28 Nov, 2016

3 commits


27 Nov, 2016

8 commits

  • Pull vfs splice fix from Al Viro.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fix default_file_splice_read()

    Linus Torvalds
     
  • Botched calculation of number of pages. As the result,
    we were dropping pieces when doing splice to pipe from
    e.g. 9p.

    Reported-by: Alexei Starovoitov
    Tested-by: Alexei Starovoitov
    Signed-off-by: Al Viro

    Al Viro
     
  • Pull i2c fixes from Wolfram Sang:
    "Here is a revert and two bugfixes for the I2C designware driver.

    Please note that we are still hunting down a regression for the
    i2c-octeon driver. While there is a fix pending, we have unclear
    feedback from the testers currently. An rc8 would be quite helpful
    for this case"

    * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
    Revert "i2c: designware: do not disable adapter after transfer"
    i2c: designware: fix rx fifo depth tracking
    i2c: designware: report short transfers

    Linus Torvalds
     
  • Pull ARM fix from Russell King:
    "This resolves the ksyms issues by reverting the commit which
    introduced the breakage"

    There was what I consider to be a better fix, but it's late in the rc
    game, so I'll take the revert.

    * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
    Revert "arm: move exports to definitions"

    Linus Torvalds
     
  • Pull networking fixes from David Miller:

    1) Fix leak in fsl/fman driver, from Dan Carpenter.

    2) Call flow dissector initcall earlier than any networking driver can
    register and start to use it, from Eric Dumazet.

    3) Some dup header fixes from Geliang Tang.

    4) TIPC link monitoring compat fix from Jon Paul Maloy.

    5) Link changes require EEE re-negotiation in bcm_sf2 driver, from
    Florian Fainelli.

    6) Fix bogus handle ID passed into tfilter_notify_chain(), from Roman
    Mashak.

    7) Fix dump size calculation in rtnl_calcit(), from Zhang Shengju.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
    tipc: resolve connection flow control compatibility problem
    mvpp2: use correct size for memset
    net/mlx5: drop duplicate header delay.h
    net: ieee802154: drop duplicate header delay.h
    ibmvnic: drop duplicate header seq_file.h
    fsl/fman: fix a leak in tgec_free()
    net: ethtool: don't require CAP_NET_ADMIN for ETHTOOL_GLINKSETTINGS
    tipc: improve sanity check for received domain records
    tipc: fix compatibility bug in link monitoring
    net: ethernet: mvneta: Remove IFF_UNICAST_FLT which is not implemented
    dwc_eth_qos: drop duplicate headers
    net sched filters: fix filter handle ID in tfilter_notify_chain()
    net: dsa: bcm_sf2: Ensure we re-negotiate EEE during after link change
    bnxt: do not busy-poll when link is down
    udplite: call proper backlog handlers
    ipv6: bump genid when the IFA_F_TENTATIVE flag is clear
    net/mlx4_en: Free netdev resources under state lock
    net: revert "net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit"
    rtnetlink: fix the wrong minimal dump size getting from rtnl_calcit()
    bnxt_en: Fix a VXLAN vs GENEVE issue
    ...

    Linus Torvalds
     
  • Pull libnvdimm fixes from Dan Williams:

    - Fix a crash that occurs at driver initialization if the memory region
    is already busy (request_mem_region() fails).

    - Fix a vma validation check that mistakenly allows a private device-
    dax mapping to be established. Device-dax explicitly forbids private
    mappings so it can guarantee a given fault granularity and backing
    memory type.

    Both of these fixes have soaked in -next and are tagged for -stable.

    * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    device-dax: fail all private mapping attempts
    device-dax: check devm_nsio_enable() return value

    Linus Torvalds
     
  • Pull KVM fixes from Radim Krčmář:
    "Four fixes for bugs found by syzkaller on x86, all for stable"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    KVM: x86: check for pic and ioapic presence before use
    KVM: x86: fix out-of-bounds accesses of rtc_eoi map
    KVM: x86: drop error recovery in em_jmp_far and em_ret_far
    KVM: x86: fix out-of-bounds access in lapic

    Linus Torvalds
     
  • Pull powerpc fixes from Michael Ellerman:
    "Fixes marked for stable:
    - Set missing wakeup bit in LPCR on POWER9
    - Fix the early OPAL console wrappers
    - Fixup kernel read only mapping

    Fixes for code merged this cycle:
    - Fix missing CRCs, add more asm-prototypes.h declarations"

    * tag 'powerpc-4.9-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
    powerpc/mm: Fixup kernel read only mapping
    powerpc/boot: Fix the early OPAL console wrappers
    powerpc: Fix missing CRCs, add more asm-prototypes.h declarations
    powerpc: Set missing wakeup bit in LPCR on POWER9

    Linus Torvalds
     

26 Nov, 2016

22 commits

  • In commit 10724cc7bb78 ("tipc: redesign connection-level flow control")
    we replaced the previous message based flow control with one based on
    1k blocks. In order to ensure backwards compatibility the mechanism
    falls back to using message as base unit when it senses that the peer
    doesn't support the new algorithm. The default flow control window,
    i.e., how many units can be sent before the sender blocks and waits
    for an acknowledge (aka advertisement) is 512. This was tested against
    the previous version, which uses an acknowledge frequency of one ack per
    256 received messages, and found to work fine.

    However, we missed the fact that versions older than Linux 3.15 use an
    acknowledge frequency of 512, which is exactly the limit where a 4.6+
    sender will stop and wait for acknowledge. This would also work fine if
    it weren't for the fact that if the first sent message on a 4.6+ server
    side is an empty SYNACK, this one is also counted as a sent message,
    while it is not counted as a received message on a legacy 3.15-receiver.
    This leads to the sender always being one step ahead of the receiver, a
    scenario causing the sender to block after 512 sent messages, while the
    receiver only has registered 511 read messages. Hence, the legacy
    receiver is not triggered to send an acknowledge, with a permanently
    blocked sender as result.

    We solve this deadlock by simply allowing the sender to send one more
    message before it blocks, i.e., by making a minimal change to the
    condition used for determining connection congestion.

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
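
The deadlock described in the TIPC flow control commit above can be modelled in a few lines. This is a minimal sketch, not the TIPC code: the function name, the `extra` parameter, and the counters are hypothetical, and the model assumes a window of 512 messages and a legacy receiver that acknowledges once per 512 received messages, as stated in the commit message.

```c
#include <stdbool.h>

/* Illustrative model only; all names here are hypothetical. */
#define WINDOW   512  /* sender's flow control window, in messages */
#define ACK_FREQ 512  /* legacy receiver acks once per 512 received messages */

/*
 * Returns true if the sender manages to send `total` data messages,
 * false if it blocks forever waiting for an acknowledge.
 * `extra` is the number of additional messages the sender may send
 * before it treats the connection as congested (0 = old check, 1 = fixed).
 */
bool sender_completes(int total, int extra)
{
    int sent_unacked = 1;  /* the empty SYNACK is counted by the sender */
    int rcv_unacked = 0;   /* ... but not by the legacy receiver */
    int delivered = 0;

    while (delivered < total) {
        if (sent_unacked >= WINDOW + extra)
            return false;  /* congested, and no ack is on its way */
        sent_unacked++;
        rcv_unacked++;
        delivered++;
        if (rcv_unacked == ACK_FREQ) {
            /* receiver acknowledges; sender forgets the acked messages */
            sent_unacked -= ACK_FREQ;
            rcv_unacked = 0;
        }
    }
    return true;
}
```

In this model, `sender_completes(2000, 0)` stalls after 511 data messages because the receiver is one message short of its acknowledge threshold, while `sender_completes(2000, 1)` runs to completion.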
     
  • gcc-7 detects a short memset in mvpp2, introduced in the original
    merge of the driver:

    drivers/net/ethernet/marvell/mvpp2.c: In function 'mvpp2_cls_init':
    drivers/net/ethernet/marvell/mvpp2.c:3296:2: error: 'memset' used with length equal to number of elements without multiplication by element size [-Werror=memset-elt-size]

    The result seems to be that we write uninitialized data into the
    flow table registers, although we did not get any warning about
    that uninitialized data usage.

    Using sizeof() lets us initialize the entire array instead.

    Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
    Signed-off-by: Arnd Bergmann
    Signed-off-by: David S. Miller

    Arnd Bergmann
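
The short memset described in the mvpp2 commit above is easy to reproduce outside the kernel. This is a minimal sketch under stated assumptions: `FLOW_WORDS` and `stale_bytes_after_clear()` are hypothetical names, and the array is pre-filled with a known pattern so that the bytes the short memset misses can be counted (in the driver they would simply be uninitialized).

```c
#include <stddef.h>
#include <string.h>

#define FLOW_WORDS 3  /* hypothetical number of data words per entry */

/*
 * Fill a word array with a nonzero pattern, clear the first `len` bytes,
 * and return how many bytes still hold stale (nonzero) data.
 * `len` must not exceed the size of the array in bytes.
 */
size_t stale_bytes_after_clear(size_t len)
{
    unsigned int data[FLOW_WORDS];
    const unsigned char *p = (const unsigned char *)data;
    size_t i, stale = 0;

    memset(data, 0xff, sizeof(data));
    memset(data, 0, len);

    for (i = 0; i < sizeof(data); i++)
        if (p[i] != 0)
            stale++;
    return stale;
}
```

Passing the element count (`FLOW_WORDS`) as the length leaves most of the array untouched, whereas passing the array size in bytes (`FLOW_WORDS * sizeof(unsigned int)`) clears all of it.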
     
  • Drop duplicate header delay.h from mlx5/core/main.c.

    Signed-off-by: Geliang Tang
    Acked-by: Matan Barak
    Acked-by: Saeed Mahameed
    Signed-off-by: David S. Miller

    Geliang Tang
     
  • Drop duplicate header delay.h from adf7242.c.

    Signed-off-by: Geliang Tang
    Acked-by: Stefan Schmidt
    Signed-off-by: David S. Miller

    Geliang Tang
     
  • Drop duplicate header seq_file.h from ibmvnic.c.

    Signed-off-by: Geliang Tang
    Signed-off-by: David S. Miller

    Geliang Tang
     
  • We set "tgec->cfg" to NULL before passing it to kfree(). There is no
    need to set it to NULL at all. Let's just delete it.

    Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support")
    Signed-off-by: Dan Carpenter
    Signed-off-by: David S. Miller

    Dan Carpenter
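
The leak fixed in tgec_free() above follows a common pattern: a pointer is cleared before it is handed to the free routine, so the free becomes a no-op. This is a minimal userspace sketch using malloc()/free() in place of the kernel allocator; the struct, the allocation counter, and all helper names are hypothetical.

```c
#include <stdlib.h>

struct mac {            /* hypothetical stand-in for the driver's MAC object */
    void *cfg;
};

static int live_allocs; /* number of allocations not yet freed */

static void *counted_alloc(size_t n)
{
    void *p = malloc(n);

    if (p)
        live_allocs++;
    return p;
}

static void counted_free(void *p)
{
    if (p)
        live_allocs--;
    free(p);            /* free(NULL) is a no-op, like kfree(NULL) */
}

/* Returns the number of allocations leaked by the teardown path. */
int leaked_allocations(int clear_before_free)
{
    struct mac m;
    void *saved;
    int leaked;

    live_allocs = 0;
    m.cfg = counted_alloc(64);
    saved = m.cfg;

    if (clear_before_free)
        m.cfg = NULL;   /* the bug: the pointer is lost before the free */
    counted_free(m.cfg);

    leaked = live_allocs;
    if (leaked)
        free(saved);    /* release it anyway so the demo itself stays clean */
    return leaked;
}
```

Calling `leaked_allocations(1)` reproduces the buggy ordering, and `leaked_allocations(0)` corresponds to the fix of simply not clearing the pointer first.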
     
  • The ETHTOOL_GLINKSETTINGS command is deprecating the ETHTOOL_GSET
    command and likewise it shouldn't require the CAP_NET_ADMIN capability.

    Signed-off-by: Miroslav Lichvar
    Signed-off-by: David S. Miller

    Miroslav Lichvar
     
  • In commit 35c55c9877f8 ("tipc: add neighbor monitoring framework") we
    added a data area to the link monitor STATE messages under the
    assumption that previous versions did not use any such data area.

    For versions older than Linux 4.3 this assumption is not correct. In
    those versions, all STATE messages sent out from a node inadvertently
    contain a 16 byte data area containing a string, a leftover from
    previous RESET messages which were using this during the setup phase.
    This string serves no purpose in STATE messages, and should not be there.

    Unfortunately, this data area is delivered to the link monitor
    framework, where a sanity check catches that it is not a correct domain
    record, and drops it. It also issues a rate limited warning about the
    event.

    Since such events occur much more frequently than anticipated, we now
    choose to remove the warning in order to not fill the kernel log with
    useless contents. We also make the sanity check stricter, to further
    reduce the risk that such data is inadvertently admitted.

    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     
  • commit 817298102b0b ("tipc: fix link priority propagation") introduced a
    compatibility problem between TIPC versions newer than Linux 4.6 and
    those older than Linux 4.4. In versions later than 4.4, link STATE
    messages only contain a non-zero link priority value when the sender
    wants the receiver to change its priority. This has the effect that the
    receiver resets itself in order to apply the new priority. This works
    well, and is consistent with the said commit.

    However, in versions older than 4.4 a valid link priority is present in
    all sent link STATE messages, leading to cyclic link establishment and
    reset on the 4.6+ node.

    We fix this by adding a test that the received value should not only
    be valid, but also differ from the current value in order to cause the
    receiving link endpoint to reset.

    Reported-by: Amar Nv
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
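
The added test in the link priority commit above can be expressed as a small predicate. This is a sketch only: the function names and `MAX_LINK_PRIO` are hypothetical, and it assumes that "valid" means a non-zero priority within an upper bound.

```c
#include <stdbool.h>

#define MAX_LINK_PRIO 31u  /* hypothetical upper bound for a valid priority */

/* Old check: any valid received priority triggers a link reset. */
bool reset_needed_old(unsigned int cur, unsigned int rcvd)
{
    (void)cur;  /* the current priority was not considered */
    return rcvd != 0 && rcvd <= MAX_LINK_PRIO;
}

/* Fixed check: the received priority must also differ from the current one. */
bool reset_needed_fixed(unsigned int cur, unsigned int rcvd)
{
    return rcvd != 0 && rcvd <= MAX_LINK_PRIO && rcvd != cur;
}
```

With a peer that repeats its unchanged priority in every STATE message, the old predicate asks for a reset each time, while the fixed one only does so when the value actually changes.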
     
  • The mvneta driver advertises it supports IFF_UNICAST_FLT. However, it
    actually does not. The hardware probably does support it, but there is
    no code to configure the filter. As a quick and simple fix, remove the
    flag. This will cause the core to fall back to promiscuous mode.

    Signed-off-by: Andrew Lunn
    Fixes: b50b72de2f2f ("net: mvneta: enable features before registering the driver")
    Signed-off-by: David S. Miller

    Andrew Lunn
     
  • Pull parisc fixes from Helge Deller:
    "On parisc we were still seeing occasional random segmentation faults
    and memory corruption on SMP machines. Dave Anglin then looked again
    at the TLB related code and found two issues in the PCI DMA and
    generic TLB flush functions.

    Then, in our startup code we had some timing of the cache and TLB
    functions to calculate a threshold when to use a complete TLB/cache
    flush or just to flush a specific range. This code produced a race
    with newly started CPUs and thus led to occasional kernel crashes
    (due to stale TLB/cache entries). The patch by Dave fixes this issue
    by flushing the local caches before starting secondary CPUs and by
    removing the race.

    The last problem fixed by this series is that we quite often suffered
    from hung tasks and self-detected stalls on the CPUs. It was somehow
    clear that this was related to the (in v4.7) newly introduced cr16
    clocksource and the own implementation of sched_clock(). I replaced
    the open-coded sched_clock() function and switched to the generic
    sched_clock() implementation which seems to have fixed this issue as
    well.

    All patches have been successfully tested on a variety of machines,
    including our debian buildd servers.

    All patches (besides the small pr_cont fix) are tagged for stable
    releases"

    * 'parisc-4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
    parisc: Also flush data TLB in flush_icache_page_asm
    parisc: Fix race in pci-dma.c
    parisc: Switch to generic sched_clock implementation
    parisc: Fix races in parisc_setup_cache_timing()
    parisc: Fix printk continuations in system detection

    Linus Torvalds
     
  • Pull keys fixes from James Morris:
    "From David:

    - Fix mpi_powm()'s handling of a number with a zero exponent
    [CVE-2016-8650].

    Integrate my and Andrey's patches for mpi_powm() and use
    mpi_resize() instead of RESIZE_IF_NEEDED() - the latter adds a
    duplicate check into the execution path of a trivial case we
    don't normally expect to be taken.

    - Fix double free in X.509 error handling"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    mpi: Fix NULL ptr dereference in mpi_powm() [ver #3]
    X.509: Fix double free in x509_cert_parse() [ver #3]

    Linus Torvalds
     
  • CONFIG_MODVERSIONS has been broken for pretty much the whole 4.9 series,
    and quite frankly, nobody has cared very deeply. We absolutely know how
    to fix it, and it's not _complicated_, but it's not exactly pretty
    either.

    This oneliner fixes it without the ugliness, and allows for further
    future cleanups.

    "We've secretly replaced their regular MODVERSIONS with nothing at
    all, let's see if they notice"

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Pull ACPI fixes from Rafael Wysocki:
    "Two ACPI fixes for 4.9-rc7.

    One of them reverts a recent ACPI commit that attempted to improve
    reboot/power-off on some systems, but introduced problems elsewhere,
    and the other one fixes kernel builds with the new WDAT watchdog
    driver enabled in some configurations.

    Specifics:

    - Revert the recent commit that caused the ACPI _PTS method to be
    executed in the power-off/reboot code path (as per the
    specification) in an attempt to improve things on some systems
    (apparently expecting _PTS to be executed in that code path), but
    broke power-off/reboot on at least one other machine (Rafael
    Wysocki).

    - Fix kernel builds with the new WDAT watchdog driver enabled in some
    configurations by explicitly selecting WATCHDOG_CORE when enabling
    the WDAT watchdog driver (Mika Westerberg)"

    * tag 'acpi-4.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    watchdog: wdat_wdt: Select WATCHDOG_CORE
    Revert "ACPI: Execute _PTS before system reboot"

    Linus Torvalds
     
  • Following the kernel Bugzilla discussion during the Kernel Summit
    (https://lwn.net/Articles/705245/), add bug tracking system location
    entry type (B) to MAINTAINERS and populate it for several subsystems
    known to be using the kernel BZ actively (and add the upstream BZ for
    ACPICA too).

    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • This reverts commit 0317e6c0f1dc1ba86b8d9dccc010c5e77b8355fa.

    Srinivas recently reported that the touchscreen and touchpad stopped
    working on a Haswell based machine in the Linux 4.9-rc series, with
    timeout errors from i2c_designware:

    [ 16.508013] i2c_designware INT33C3:00: controller timed out
    [ 16.508302] i2c_hid i2c-MSFT0001:02: failed to change power setting.
    [ 17.532016] i2c_designware INT33C3:00: controller timed out
    [ 18.556022] i2c_designware INT33C3:00: controller timed out
    [ 18.556315] i2c_hid i2c-ATML1000:00: failed to retrieve report from device.

    I managed to reproduce similar errors on another Haswell based machine
    where touchscreen initialization fails in roughly 1/5 to 1/2 of boots.
    Since the root cause for these errors is not clear yet and debugging is
    ongoing, it's better to revert this commit as we are near release.

    Reported-by: Srinivas Pandruvada
    Signed-off-by: Jarkko Nikula
    Signed-off-by: Wolfram Sang

    Jarkko Nikula
     
  • * acpi-sleep-fixes:
    Revert "ACPI: Execute _PTS before system reboot"

    * acpi-wdat-fixes:
    watchdog: wdat_wdt: Select WATCHDOG_CORE

    Rafael J. Wysocki
     
  • …ux/kernel/git/mkl/linux-can

    Marc Kleine-Budde says:

    ====================
    pull-request: can 2016-11-23

    this is a pull request for net/master.

    The patch by Oliver Hartkopp for the broadcast manager (bcm) fixes the
    CAN-FD support, which may cause an out-of-bounds access otherwise.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • Drop duplicate headers types.h and delay.h from dwc_eth_qos.c.

    Signed-off-by: Geliang Tang
    Signed-off-by: David S. Miller

    Geliang Tang
     
  • Pull MFD fixes from Lee Jones:
    "Received a couple of last minute fixes for v4.9.

    The patches from Viresh are fixing issues displayed in KernelCI"

    * tag 'mfd-fixes-4.9.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
    mfd: wm8994-core: Don't use managed regulator bulk get API
    mfd: wm8994-core: Disable regulators before removing them
    mfd: syscon: Support native-endian regmaps

    Linus Torvalds
     
  • Pull media fix from Mauro Carvalho Chehab:
    "Fix for the firmware load logic of the tuner-xc2028 driver"

    * tag 'media/v4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
    xc2028: Fix use-after-free bug properly

    Linus Torvalds
     
  • Pull drm fixes from Dave Airlie:
    "Seems to be quietening down nicely, a few mediatek, one exynos and one
    hdlcd fix, along with two amd fixes"

    * tag 'drm-fixes-for-v4.9-rc7' of git://people.freedesktop.org/~airlied/linux:
    gpu/drm/exynos/exynos_hdmi - Unmap region obtained by of_iomap
    drm/mediatek: fix null pointer dereference
    drm/mediatek: fixed the calc method of data rate per lane
    drm/mediatek: fix a typo of DISP_OD_CFG to OD_RELAYMODE
    drm/radeon: fix power state when port pm is unavailable (v2)
    drm/amdgpu: fix power state when port pm is unavailable
    drm/arm: hdlcd: fix plane base address update
    drm/amd/powerplay: avoid out of bounds access on array ps.

    Linus Torvalds
     

25 Nov, 2016

6 commits

  • This is the second issue I noticed in reviewing the parisc TLB code.

    The fic instruction may use either the instruction or data TLB in
    flushing the instruction cache. Thus, on machines with a split TLB, we
    should also flush the data TLB after setting up the temporary alias
    registers.

    Although this has no functional impact, I changed the pdtlb and pitlb
    instructions to consistently use the index register %r0. These
    instructions do not support integer displacements.

    Tested on rp3440 and c8000.

    Signed-off-by: John David Anglin
    Cc: # v3.16+
    Signed-off-by: Helge Deller

    John David Anglin
     
  • We are still troubled by occasional random segmentation faults and
    memory corruption on SMP machines. This causes quite a few
    package builds to fail on the Debian buildd machines for parisc. When
    gcc-6 failed to build three times in a row, I looked again at the TLB
    related code. I found a couple of issues. This is the first.

    In general, we need to ensure page table updates and corresponding TLB
    purges are atomic. The attached patch fixes an instance in pci-dma.c
    where the page table update was not guarded by the TLB lock.

    Tested on rp3440 and c8000. So far, no further random segmentation
    faults have been observed.

    Signed-off-by: John David Anglin
    Cc: # v3.16+
    Signed-off-by: Helge Deller

    John David Anglin
     
  • Drop the open-coded sched_clock() function and replace it by the provided
    GENERIC_SCHED_CLOCK implementation. We have seen quite some hung tasks in the
    past, which seem to be fixed by this patch.

    Signed-off-by: Helge Deller
    Cc: # v4.7+
    Signed-off-by: Helge Deller

    Helge Deller
     
  • Helge reported to me the following startup crash:

    [ 0.000000] Linux version 4.8.0-1-parisc64-smp (debian-kernel@lists.debian.org) (gcc version 5.4.1 20161019 (GCC) ) #1 SMP Debian 4.8.7-1 (2016-11-13)
    [ 0.000000] The 64-bit Kernel has started...
    [ 0.000000] Kernel default page size is 4 KB. Huge pages enabled with 1 MB physical and 2 MB virtual size.
    [ 0.000000] Determining PDC firmware type: System Map.
    [ 0.000000] model 9000/785/J5000
    [ 0.000000] Total Memory: 2048 MB
    [ 0.000000] Memory: 2018528K/2097152K available (9272K kernel code, 3053K rwdata, 1319K rodata, 1024K init, 840K bss, 78624K reserved, 0K cma-reserved)
    [ 0.000000] virtual kernel memory layout:
    [ 0.000000] vmalloc : 0x0000000000008000 - 0x000000003f000000 (1007 MB)
    [ 0.000000] memory : 0x0000000040000000 - 0x00000000c0000000 (2048 MB)
    [ 0.000000] .init : 0x0000000040100000 - 0x0000000040200000 (1024 kB)
    [ 0.000000] .data : 0x0000000040b0e000 - 0x0000000040f533e0 (4372 kB)
    [ 0.000000] .text : 0x0000000040200000 - 0x0000000040b0e000 (9272 kB)
    [ 0.768910] Brought up 1 CPUs
    [ 0.992465] NET: Registered protocol family 16
    [ 2.429981] Releasing cpu 1 now, hpa=fffffffffffa2000
    [ 2.635751] CPU(s): 2 out of 2 PA8500 (PCX-W) at 440.000000 MHz online
    [ 2.726692] Setting cache flush threshold to 1024 kB
    [ 2.729932] Not-handled unaligned insn 0x43ffff80
    [ 2.798114] Setting TLB flush threshold to 140 kB
    [ 2.928039] Unaligned handler failed, ret = -1
    [ 3.000419] _______________________________
    [ 3.000419] < Your System ate a SPARC! Gah! >
    [ 3.000419] -------------------------------
    [ 3.000419] \ ^__^
    [ 3.000419] (__)\ )\/\
    [ 3.000419] U ||----w |
    [ 3.000419] || ||
    [ 9.340055] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0-1-parisc64-smp #1 Debian 4.8.7-1
    [ 9.448082] task: 00000000bfd48060 task.stack: 00000000bfd50000
    [ 9.528040]
    [ 10.760029] IASQ: 0000000000000000 0000000000000000 IAOQ: 000000004025d154 000000004025d158
    [ 10.868052] IIR: 43ffff80 ISR: 0000000000340000 IOR: 000001ff54150960
    [ 10.960029] CPU: 1 CR30: 00000000bfd50000 CR31: 0000000011111111
    [ 11.052057] ORIG_R28: 000000004021e3b4
    [ 11.100045] IAOQ[0]: irq_exit+0x94/0x120
    [ 11.152062] IAOQ[1]: irq_exit+0x98/0x120
    [ 11.208031] RP(r2): irq_exit+0xb8/0x120
    [ 11.256074] Backtrace:
    [ 11.288067] [] cpu_startup_entry+0x1e4/0x598
    [ 11.368058] [] smp_callin+0x2c0/0x2f0
    [ 11.436308] [] update_curr+0x18c/0x2d0
    [ 11.508055] [] dequeue_entity+0x2c0/0x1030
    [ 11.584040] [] set_next_entity+0x80/0xd30
    [ 11.660069] [] pick_next_task_fair+0x614/0x720
    [ 11.740085] [] __schedule+0x394/0xa60
    [ 11.808054] [] schedule+0x88/0x118
    [ 11.876039] [] rescuer_thread+0x4d4/0x5b0
    [ 11.948090] [] kthread+0x1ec/0x248
    [ 12.016053] [] end_fault_vector+0x20/0xc0
    [ 12.092239] [] _switch_to_ret+0x0/0xf40
    [ 12.164044]
    [ 12.184036] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0-1-parisc64-smp #1 Debian 4.8.7-1
    [ 12.244040] Backtrace:
    [ 12.244040] [] show_stack+0x68/0x80
    [ 12.244040] [] dump_stack+0xec/0x168
    [ 12.244040] [] die_if_kernel+0x25c/0x430
    [ 12.244040] [] handle_unaligned+0xb48/0xb50
    [ 12.244040]
    [ 12.632066] ---[ end trace 9ca05a7215c7bbb2 ]---
    [ 12.692036] Kernel panic - not syncing: Attempted to kill the idle task!

    We have the insn 0x43ffff80 in IIR but from IAOQ we should have:
    4025d150: 0f f3 20 df ldd,s r19(r31),r31
    4025d154: 0f 9f 00 9c ldw r31(ret0),ret0
    4025d158: bf 80 20 58 cmpb,*<> r0,ret0,4025d18c

    Cpu0 has just completed running parisc_setup_cache_timing:

    [ 2.429981] Releasing cpu 1 now, hpa=fffffffffffa2000
    [ 2.635751] CPU(s): 2 out of 2 PA8500 (PCX-W) at 440.000000 MHz online
    [ 2.726692] Setting cache flush threshold to 1024 kB
    [ 2.729932] Not-handled unaligned insn 0x43ffff80
    [ 2.798114] Setting TLB flush threshold to 140 kB
    [ 2.928039] Unaligned handler failed, ret = -1

    From the backtrace, cpu1 is in smp_callin:

    void __init smp_callin(void)
    {
    int slave_id = cpu_now_booting;

    smp_cpu_init(slave_id);
    preempt_disable();

    flush_cache_all_local(); /* start with known state */
    flush_tlb_all_local(NULL);

    local_irq_enable(); /* Interrupts have been off until now */

    cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);

    So, it has just flushed its caches and the TLB. It would seem either the
    flushes in parisc_setup_cache_timing or smp_callin have corrupted kernel
    memory.

    The attached patch reworks parisc_setup_cache_timing to remove the races
    in setting the cache and TLB flush thresholds. It also corrects the
    number of bytes flushed in the TLB calculation.

    The patch flushes the cache and TLB on cpu0 before starting the
    secondary processors so that they are started from a known state.

    Tested with a few reboots on c8000.

    Signed-off-by: John David Anglin
    Cc: # v3.18+
    Signed-off-by: Helge Deller

    John David Anglin
     
  • The kernel WARNs and then crashes today if wm8994_device_init() fails
    after calling devm_regulator_bulk_get().

    That happens because there are multiple devices involved here and the
    order in which managed resources are freed isn't correct.

    The regulators are added as children of wm8994->dev. Whereas,
    devm_regulator_bulk_get() receives wm8994->dev as the device, though it
    gets the same regulators which were added as children of wm8994->dev
    earlier.

    During failures, the children are removed first and the core eventually
    calls regulator_unregister() for them. As regulator_put() was never done
    for them (opposite of devm_regulator_bulk_get()), the kernel WARNs at

    WARN_ON(rdev->open_count);

    And eventually it crashes from debugfs_remove_recursive().

    --------x------------------x----------------

    wm8994 3-001a: Device is not a WM8994, ID is 0
    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 1 at /mnt/ssd/all/work/repos/devel/linux/drivers/regulator/core.c:4072 regulator_unregister+0xc8/0xd0
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc6-00154-g54fe84cbd50b #41
    Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
    [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
    [] (show_stack) from [] (dump_stack+0x88/0x9c)
    [] (dump_stack) from [] (__warn+0xe8/0x100)
    [] (__warn) from [] (warn_slowpath_null+0x20/0x28)
    [] (warn_slowpath_null) from [] (regulator_unregister+0xc8/0xd0)
    [] (regulator_unregister) from [] (release_nodes+0x16c/0x1dc)
    [] (release_nodes) from [] (__device_release_driver+0x8c/0x110)
    [] (__device_release_driver) from [] (device_release_driver+0x1c/0x28)
    [] (device_release_driver) from [] (bus_remove_device+0xd8/0x104)
    [] (bus_remove_device) from [] (device_del+0x10c/0x218)
    [] (device_del) from [] (platform_device_del+0x1c/0x88)
    [] (platform_device_del) from [] (platform_device_unregister+0xc/0x20)
    [] (platform_device_unregister) from [] (mfd_remove_devices_fn+0x5c/0x64)
    [] (mfd_remove_devices_fn) from [] (device_for_each_child_reverse+0x4c/0x78)
    [] (device_for_each_child_reverse) from [] (mfd_remove_devices+0x20/0x30)
    [] (mfd_remove_devices) from [] (wm8994_device_init+0x2ac/0x7f0)
    [] (wm8994_device_init) from [] (i2c_device_probe+0x178/0x1fc)
    [] (i2c_device_probe) from [] (driver_probe_device+0x214/0x2c0)
    [] (driver_probe_device) from [] (__driver_attach+0xac/0xb0)
    [] (__driver_attach) from [] (bus_for_each_dev+0x68/0x9c)
    [] (bus_for_each_dev) from [] (bus_add_driver+0x1a0/0x218)
    [] (bus_add_driver) from [] (driver_register+0x78/0xf8)
    [] (driver_register) from [] (i2c_register_driver+0x34/0x84)
    [] (i2c_register_driver) from [] (do_one_initcall+0x40/0x170)
    [] (do_one_initcall) from [] (kernel_init_freeable+0x15c/0x1fc)
    [] (kernel_init_freeable) from [] (kernel_init+0x8/0x114)
    [] (kernel_init) from [] (ret_from_fork+0x14/0x3c)
    ---[ end trace 0919d3d0bc998260 ]---

    [snip..]

    Unable to handle kernel NULL pointer dereference at virtual address 00000078
    pgd = c0004000
    [00000078] *pgd=00000000
    Internal error: Oops: 5 [#1] PREEMPT SMP ARM
    Modules linked in:
    CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 4.8.0-rc6-00154-g54fe84cbd50b #41
    Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
    task: ee874000 task.stack: ee878000
    PC is at down_write+0x14/0x54
    LR is at debugfs_remove_recursive+0x30/0x150

    [snip..]

    [] (down_write) from [] (debugfs_remove_recursive+0x30/0x150)
    [] (debugfs_remove_recursive) from [] (_regulator_put+0x24/0xac)
    [] (_regulator_put) from [] (regulator_put+0x1c/0x2c)
    [] (regulator_put) from [] (release_nodes+0x16c/0x1dc)
    [] (release_nodes) from [] (driver_probe_device+0xec/0x2c0)
    [] (driver_probe_device) from [] (__driver_attach+0xac/0xb0)
    [] (__driver_attach) from [] (bus_for_each_dev+0x68/0x9c)
    [] (bus_for_each_dev) from [] (bus_add_driver+0x1a0/0x218)
    [] (bus_add_driver) from [] (driver_register+0x78/0xf8)
    [] (driver_register) from [] (i2c_register_driver+0x34/0x84)
    [] (i2c_register_driver) from [] (do_one_initcall+0x40/0x170)
    [] (do_one_initcall) from [] (kernel_init_freeable+0x15c/0x1fc)
    [] (kernel_init_freeable) from [] (kernel_init+0x8/0x114)
    [] (kernel_init) from [] (ret_from_fork+0x14/0x3c)
    Code: e1a04000 f590f000 e3a03001 e34f3fff (e1902f9f)
    ---[ end trace 0919d3d0bc998262 ]---

    --------x------------------x----------------

    Fix the kernel warnings and crashes by using regulator_bulk_get()
    instead of devm_regulator_bulk_get() and explicitly freeing the supplies
    in exit paths.

    Tested on Exynos 5250, dual core ARM A15 machine.

    Signed-off-by: Viresh Kumar
    Acked-by: Charles Keepax
    Signed-off-by: Lee Jones

    Viresh Kumar
     
  • The order in which resources were freed in wm8994_device_exit() isn't
    correct. The regulators are removed before they are disabled.

    Fix it by reordering code a bit, which makes it the exact opposite of
    wm8994_device_init() as well.

    Signed-off-by: Viresh Kumar
    Acked-by: Charles Keepax
    Signed-off-by: Lee Jones

    Viresh Kumar