14 May, 2016

7 commits

  • Commits 190aa4304de6 (Add AMD Mullins platform support) and
    cca118fa2a0a94 (Add AMD Carrizo platform support) enabled the
    driver on a lot more devices, but the following commit missed
    a single location in the code when checking if the SB800 register
    offsets should be used. This leads to the wrong register being
    written which in turn causes ACPI to go haywire.

    Fix this by introducing a helper function to check for the new
    register layout and use this consistently.

    https://bugzilla.kernel.org/show_bug.cgi?id=114201
    https://bugzilla.redhat.com/show_bug.cgi?id=1329910
    Fixes: bdecfcdb5461 (sp5100_tco: fix the device check for SB800
    and later chipsets)
    Cc: stable@vger.kernel.org (4.5+)
    Signed-off-by: Lucas Stach
    Signed-off-by: Guenter Roeck
    Signed-off-by: Wim Van Sebroeck

    Lucas Stach
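
The idea of routing every layout decision through one helper, as the sp5100_tco fix above describes, can be sketched in plain C. Everything named here (the struct, the device ID, the revision threshold and the two offsets) is a hypothetical placeholder and not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device description; field names and values are made up. */
struct chip_id {
	uint16_t device;
	uint8_t  revision;
};

#define HYPOTHETICAL_LEGACY_DEVICE  0x1234
#define HYPOTHETICAL_FIRST_NEW_REV  0x40
#define HYPOTHETICAL_OLD_CTRL_REG   0x69
#define HYPOTHETICAL_NEW_CTRL_REG   0x48

/* Single place that decides which register layout applies. */
static inline bool uses_new_reg_layout(const struct chip_id *id)
{
	assert(id != NULL);
	return !(id->device == HYPOTHETICAL_LEGACY_DEVICE &&
		 id->revision < HYPOTHETICAL_FIRST_NEW_REV);
}

/* Every caller asks the helper instead of repeating the comparison. */
static inline unsigned int wdt_ctrl_offset(const struct chip_id *id)
{
	return uses_new_reg_layout(id) ? HYPOTHETICAL_NEW_CTRL_REG
				       : HYPOTHETICAL_OLD_CTRL_REG;
}
```

Because the comparison lives in one function, a later change to the device check cannot leave a stale copy behind in another code path.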
     
  • lockdep reports the following circular locking dependency.

    ======================================================
    [ INFO: possible circular locking dependency detected ]
    4.6.0-rc3-00191-gfabf418 #162 Not tainted
    -------------------------------------------------------
    systemd/1 is trying to acquire lock:
    ((&(&wd_data->work)->work)){+.+...}, at: [] flush_work+0x0/0x280

    but task is already holding lock:

    (&wd_data->lock){+.+...}, at: [] watchdog_release+0x18/0x190

    which lock already depends on the new lock.
    the existing dependency chain (in reverse order) is:

    -> #1 (&wd_data->lock){+.+...}:
    [] mutex_lock_nested+0x64/0x4a8
    [] watchdog_ping_work+0x18/0x4c
    [] process_one_work+0x1ac/0x500
    [] worker_thread+0x38/0x554
    [] kthread+0xf4/0x108
    [] ret_from_fork+0x14/0x24

    -> #0 ((&(&wd_data->work)->work)){+.+...}:
    [] lock_acquire+0x70/0x90
    [] flush_work+0x4c/0x280
    [] __cancel_work_timer+0x9c/0x1e0
    [] watchdog_release+0x3c/0x190
    [] __fput+0x80/0x1c8
    [] task_work_run+0x94/0xc8
    [] do_work_pending+0x8c/0xb4
    [] slow_work_pending+0xc/0x20

    other info that might help us debug this:
    Possible unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(&wd_data->lock);
                                   lock((&(&wd_data->work)->work));
                                   lock(&wd_data->lock);
      lock((&(&wd_data->work)->work));

    *** DEADLOCK ***

    1 lock held by systemd/1:

    stack backtrace:
    CPU: 2 PID: 1 Comm: systemd Not tainted 4.6.0-rc3-00191-gfabf418 #162
    Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
    [] (unwind_backtrace) from [] (show_stack+0x10/0x14)
    [] (show_stack) from [] (dump_stack+0xa8/0xd4)
    [] (dump_stack) from [] (print_circular_bug+0x214/0x334)
    [] (print_circular_bug) from [] (check_prevs_add+0x4dc/0x8e8)
    [] (check_prevs_add) from [] (__lock_acquire+0xc6c/0x14ec)
    [] (__lock_acquire) from [] (lock_acquire+0x70/0x90)
    [] (lock_acquire) from [] (flush_work+0x4c/0x280)
    [] (flush_work) from [] (__cancel_work_timer+0x9c/0x1e0)
    [] (__cancel_work_timer) from [] (watchdog_release+0x3c/0x190)
    [] (watchdog_release) from [] (__fput+0x80/0x1c8)
    [] (__fput) from [] (task_work_run+0x94/0xc8)
    [] (task_work_run) from [] (do_work_pending+0x8c/0xb4)
    [] (do_work_pending) from [] (slow_work_pending+0xc/0x20)

    Turns out the call to cancel_delayed_work_sync() in watchdog_release()
    is not necessary and can be dropped. If the worker is no longer necessary,
    the subsequent call to watchdog_update_worker() will cancel it. If it is
    already running, it won't do anything, since the worker function checks
    if it needs to ping the watchdog or not.

    Reported-by: Clemens Gruber
    Tested-by: Clemens Gruber
    Fixes: 11d7aba9ceb7 ("watchdog: imx2: Convert to use infrastructure triggered keepalives")
    Signed-off-by: Guenter Roeck
    Signed-off-by: Wim Van Sebroeck
    Cc: stable

    Guenter Roeck
     
  • Let's have balanced round brackets.

    Signed-off-by: Wolfram Sang
    Signed-off-by: Guenter Roeck
    Signed-off-by: Wim Van Sebroeck

    Wolfram Sang
     
  • Adjust documentation to match latest kernel module parameters.

    Signed-off-by: Nigel Croxon
    Signed-off-by: Guenter Roeck
    Signed-off-by: Wim Van Sebroeck

    Nigel Croxon
     
  • The IMX6 watchdog supports assertion of a signal (WDOG_B) which
    can be pinmux'd to an external pin. This is typically used for boards that
    have PMICs in control of the IMX6 power rails. In fact, failure to use
    such an external reset on boards with external PMICs can result in various
    hangs due to the IMX6 not being fully reset [1] as well as the board failing
    to reset because its PMIC has not been reset to provide adequate voltage for
    the CPU when coming out of reset at 800MHz.

    This uses a new device-tree property 'fsl,ext-reset-output' to indicate the
    board has such a reset and to cause the watchdog to be configured to assert
    WDOG_B instead of an internal reset both on a watchdog timeout and in
    system_restart.

    [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-March/333689.html

    Cc: Fabio Estevam
    Cc: Lucas Stach
    Cc: Stefan Roese
    Cc: Iain Paton
    Cc: Sascha Hauer
    Signed-off-by: Tim Harvey
    Reviewed-by: Lucas Stach
    Acked-by: Shawn Guo
    Tested-by: Akshay Bhat
    Signed-off-by: Guenter Roeck
    Signed-off-by: Wim Van Sebroeck

    Tim Harvey
     
  • When performing a suspend operation, the kernel brings all of the
    non-boot CPUs offline, calling the hot plug notifiers with the flag,
    CPU_TASKS_FROZEN, set in the action code. Similarly, during resume,
    the CPUs are brought back online, but again the notifiers have the
    FROZEN flag set.

    While very few drivers really need to treat suspend/resume
    specially, this driver unintentionally ignores the notifications.

    This patch changes the driver to disable the watchdog interrupt
    whenever the CPU goes offline, and to enable it whenever the CPU goes
    back online. As a result, the suspended state is no longer a special
    case that leaves the watchdog active.

    Signed-off-by: Richard Cochran
    Cc: linux-watchdog@vger.kernel.org
    Signed-off-by: Guenter Roeck
    Signed-off-by: Wim Van Sebroeck

    Richard Cochran
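
A minimal user-space sketch of the approach described in the commit above: mask off CPU_TASKS_FROZEN so that suspend/resume follows the same path as ordinary hotplug. The numeric action codes are assumed to mirror the 4.6-era header, and the choice of CPU_ONLINE/CPU_DEAD, the state variable and the function name are hypothetical illustrations rather than the driver's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for the hotplug action codes; treat them as illustrative. */
#define CPU_ONLINE        0x0002
#define CPU_DEAD          0x0007
#define CPU_TASKS_FROZEN  0x0010

/* Hypothetical watchdog interrupt state for one CPU. */
static bool wdt_irq_enabled = true;

/* Strip the FROZEN flag so suspend/resume is handled like plain hotplug. */
static inline void wdt_cpu_notify(unsigned long action)
{
	/* The flag must not overlap the base action codes. */
	assert((CPU_TASKS_FROZEN & (CPU_ONLINE | CPU_DEAD)) == 0);

	switch (action & ~(unsigned long)CPU_TASKS_FROZEN) {
	case CPU_ONLINE:
		wdt_irq_enabled = true;   /* CPU is back: re-enable the interrupt */
		break;
	case CPU_DEAD:
		wdt_irq_enabled = false;  /* CPU is gone: disable the interrupt */
		break;
	default:
		break;
	}
}
```

Without the mask, an action such as `CPU_DEAD | CPU_TASKS_FROZEN` would fall through to the default case and be ignored, which is the behaviour the changelog calls unintentional.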
     
  • The Qualcomm watchdog timer block reports if the system was reset by the
    watchdog. Pass the information to user space.

    Reviewed-by: Grant Grundler
    Tested-by: Grant Grundler
    Signed-off-by: Guenter Roeck
    Signed-off-by: Wim Van Sebroeck

    Guenter Roeck
     

09 May, 2016

1 commit


08 May, 2016

3 commits

  • Pull misc driver fixes from Greg KH:
    "Here are three small fixes for some driver problems that were
    reported. Full details in the shortlog below.

    All of these have been in linux-next with no reported issues"

    * tag 'char-misc-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
    nvmem: mxs-ocotp: fix buffer overflow in read
    Drivers: hv: vmbus: Fix signaling logic in hv_need_to_signal_on_read()
    misc: mic: Fix for double fetch security bug in VOP driver

    Linus Torvalds
     
  • Pull IIO driver fixes from Greg KH:
    "It's really just IIO drivers here, some small fixes that resolve some
    'crash on boot' errors that have shown up in the -rc series, and other
    bugfixes that are required.

    All have been in linux-next with no reported problems"

    * tag 'staging-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
    iio: imu: mpu6050: Fix name/chip_id when using ACPI
    iio: imu: mpu6050: fix possible NULL dereferences
    iio:adc:at91-sama5d2: Repair crash on module removal
    iio: ak8975: fix maybe-uninitialized warning
    iio: ak8975: Fix NULL pointer exception on early interrupt

    Linus Torvalds
     
  • Pull USB fixes from Greg KH:
    "Here are some last-remaining fixes for USB drivers to resolve issues
    that have shown up in testing. And two new device ids as well.

    All of these have been in linux-next with no reported issues"

    * tag 'usb-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
    Revert "USB / PM: Allow USB devices to remain runtime-suspended when sleeping"
    usb: musb: jz4740: fix error check of usb_get_phy()
    Revert "usb: musb: musb_host: Enable HCD_BH flag to handle urb return in bottom half"
    usb: musb: gadget: nuke endpoint before setting its descriptor to NULL
    USB: serial: cp210x: add Straizona Focusers device ids
    USB: serial: cp210x: add ID for Link ECU

    Linus Torvalds
     

07 May, 2016

14 commits

  • Pull ARM fixes from Russell King:
    "These are a number of updates to fix a few problems found in the ARM
    nommu code over the last couple of years, caused mostly by changes on
    the mmu side"

    * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
    ARM: 8573/1: domain: move {set,get}_domain under config guard
    ARM: 8572/1: nommu: change memory reserve for the vectors
    ARM: 8571/1: nommu: fix PMSAv7 setup

    Linus Torvalds
     
  • Pull media fixes from Mauro Carvalho Chehab:

    - deadlock fixes on driver probe at exynos4-is and s3c-camif drivers

    - a build breakage if media controller is enabled and USB or PCI is
    built as module.

    * tag 'media/v4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
    [media] media-device: fix builds when USB or PCI is compiled as module
    [media] media: s3c-camif: fix deadlock on driver probe()
    [media] media: exynos4-is: fix deadlock on driver probe

    Linus Torvalds
     
  • Pull libata fixes from Tejun Heo:
    "An ahci driver addition and updates to ahci port enable handling for
    some platform devices"

    * 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
    ata: add AMD Seattle platform driver
    ARM: dts: apq8064: add ahci ports-implemented mask
    ata: ahci-platform: Add ports-implemented DT bindings.
    libahci: save port map for forced port map

    Linus Torvalds
     
  • Pull rdma fix from Doug Ledford:
    "Fix for max sector calculation in iSER"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
    IB/iser: Fix max_sectors calculation

    Linus Torvalds
     
  • Pull writeback fix from Jens Axboe:
    "Just a single fix for domain aware writeback, fixing a regression that
    can cause balance_dirty_pages() to keep looping while not getting any
    work done"

    * 'for-linus' of git://git.kernel.dk/linux-block:
    writeback: Fix performance regression in wb_over_bg_thresh()

    Linus Torvalds
     
  • Pull x86 fixes from Ingo Molnar:
    "This contains two fixes: a boot fix for older SGI/UV systems, and an
    APIC calibration fix"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO
    x86/platform/UV: Bring back the call to map_low_mmrs in uv_system_init

    Linus Torvalds
     
  • Pull power management and ACPI fixes from Rafael Wysocki:
    "Fixes for problems introduced or discovered recently (intel_pstate,
    sti-cpufreq, ARM64 cpuidle, Operating Performance Points framework,
    generic device properties framework) and one fix for a hotplug-related
    deadlock in ACPICA that's been there forever, but is nasty enough.

    Specifics:

    - Fix for a recent regression in the intel_pstate driver causing it
    to fail to restore the HWP (HW-managed P-states) configuration of
    the boot CPU after suspend-to-RAM (Rafael Wysocki).

    - Fix for two recent regressions in the intel_pstate driver, one that
    can trigger a divide by zero if the driver is accessed via sysfs
    before it manages to take the first sample and one causing it to
    fail to update a structure field used in a trace point, so the
    information coming from it is less useful (Rafael Wysocki).

    - Fix for a problem in the sti-cpufreq driver introduced during the
    4.5 cycle that causes it to break CPU PM in multi-platform kernels
    by registering cpufreq-dt (which subsequently doesn't work)
    unconditionally and preventing the driver that would actually work
    from registering (Sudeep Holla).

    - Stable-candidate fix for an ARM64 cpuidle issue causing idle state
    usage counters to be incorrectly updated for idle states that were
    not entered due to errors (James Morse).

    - Fix for a recently introduced issue in the OPP (Operating
    Performance Points) framework causing it to print bogus error
    messages for missing optional regulators (Viresh Kumar).

    - Fix for a recently introduced issue in the generic device
    properties framework that may cause it to attempt to dereference an
    invalid pointer in some cases (Heikki Krogerus).

    - Fix for a deadlock in the ACPICA core that may be triggered by
    device (eg Thunderbolt) hotplug (Prarit Bhargava)"

    * tag 'pm+acpi-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    PM / OPP: Remove useless check
    ACPICA: Dispatcher: Update thread ID for recursive method calls
    intel_pstate: Fix intel_pstate_get()
    cpufreq: intel_pstate: Fix HWP on boot CPU after system resume
    cpufreq: st: enable selective initialization based on the platform
    ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
    device property: Avoid potential dereferences of invalid pointers

    Linus Torvalds
     
  • Pull scheduler fix from Ingo Molnar:
    "This contains a single fix that fixes a nohz tick stopping bug when
    mixed-policy SCHED_FIFO and SCHED_RR tasks are present on a runqueue"

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    nohz/full, sched/rt: Fix missed tick-reenabling bug in sched_can_stop_tick()

    Linus Torvalds
     
  • Pull perf fixes from Ingo Molnar:
    "This tree contains two fixes: new Intel CPU model numbers and an
    AMD/iommu uncore PMU driver fix"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf/x86/amd/iommu: Do not register a task ctx for uncore like PMUs
    perf/x86: Add model numbers for Kabylake CPUs

    Linus Torvalds
     
  • Pull EFI fixes from Ingo Molnar:
    "This tree contains three fixes: a console spam fix, a file pattern fix
    and a sysfb_efi fix for a bug that triggered on older ThinkPads"

    * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/sysfb_efi: Fix valid BAR address range check
    x86/efi-bgrt: Switch all pr_err() to pr_notice() for invalid BGRT
    MAINTAINERS: Remove asterisk from EFI directory names

    Linus Torvalds
     
  • Pull parisc fix from Helge Deller:
    "Patch from Dmitry V Levin to fix a kernel crash when a straced process
    calls the (invalid) syscall which is equal to value of __NR_Linux_syscalls"

    * 'parisc-4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
    parisc: fix a bug when syscall number of tracee is __NR_Linux_syscalls

    Linus Torvalds
     
  • Pull ARC fixes from Vineet Gupta:
    "Late in the cycle, but this has fixes for couple of issues: a PAE40
    boot crash and Arnd spotting lack of barriers in BE io-accessors.

    The 3rd patch for enabling highmem in low physical mem ;-) honestly is
    more than a "fix" but it's been in the works for some time, seems to be
    stable in testing and enables 2 of our customers to go forward with
    4.6 kernel.

    - Fix for PTE truncation in PAE40 builds
    - Fix for big endian IO accessors lacking IO barrier
    - Allow HIGHMEM to work with low physical addresses"

    * tag 'arc-4.6-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
    ARC: support HIGHMEM even without PAE40
    ARC: Fix PAE40 boot failures due to PTE truncation
    ARC: Add missing io barriers to io{read,write}{16,32}be()

    Linus Torvalds
     
  • Pull powerpc fix from Michael Ellerman:
    "Fix bad inline asm constraint in create_zero_mask() from Anton
    Blanchard"

    * tag 'powerpc-4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
    powerpc: Fix bad inline asm constraint in create_zero_mask()

    Linus Torvalds
     
  • Pull drm fixes from Dave Airlie:
    "Fixes for i915, amdgpu/radeon and imx.

    The IMX fix is for an autoloading regression found in Fedora. The
    radeon fixes are the same fix to amdgpu/radeon to avoid a hardware
    lockup in some circumstances with a bad mode, and a double free bug I
    took a few hours chasing down the other morning.

    The i915 fixes are across the board, all stable material, and fixing
    some hangs and suspend/resume issues, along with a live status
    regression"

    * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
    gpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading
    drm/amdgpu: make sure vertical front porch is at least 1
    drm/radeon: make sure vertical front porch is at least 1
    drm/amdgpu: set metadata pointer to NULL after freeing.
    drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW
    drm/i915: Fake HDMI live status
    drm/i915: Fix eDP low vswing for Broadwell
    drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
    drm/i915: Fix system resume if PCI device remained enabled
    drm/i915: Avoid stalling on pending flips for legacy cursor updates

    Linus Torvalds
     

06 May, 2016

15 commits

  • Do not load one entry beyond the end of the syscall table when the
    syscall number of a traced process equals __NR_Linux_syscalls.
    A similar bug with regular processes was fixed by commit 3bb457af4fa8
    ("[PARISC] Fix bug when syscall nr is __NR_Linux_syscalls").

    This bug was found by the strace test suite.

    Cc: stable@vger.kernel.org
    Signed-off-by: Dmitry V. Levin
    Acked-by: Helge Deller
    Signed-off-by: Helge Deller

    Dmitry V. Levin
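
The off-by-one described in the parisc commit above can be illustrated with a small C sketch. The table, its size and the function names are hypothetical stand-ins; the real check lives in the architecture's syscall entry code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table size, standing in for __NR_Linux_syscalls. */
#define NR_SYSCALLS 4

typedef long (*syscall_fn)(void);

static long sys_stub(void) { return 0; }

static const syscall_fn syscall_table[NR_SYSCALLS] = {
	sys_stub, sys_stub, sys_stub, sys_stub,
};

/* Valid indices are 0 .. NR_SYSCALLS - 1, so nr == NR_SYSCALLS is out of range. */
static inline syscall_fn lookup_syscall(unsigned long nr)
{
	/* A "nr > NR_SYSCALLS" test would let nr == NR_SYSCALLS through. */
	if (nr >= NR_SYSCALLS)
		return NULL;
	assert(syscall_table[nr] != NULL);
	return syscall_table[nr];
}
```

The boundary value itself is the interesting case: it is the first index past the table, so it has to be rejected together with everything larger.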
     
  • * pm-opp-fixes:
    PM / OPP: Remove useless check

    * pm-cpufreq-fixes:
    intel_pstate: Fix intel_pstate_get()
    cpufreq: intel_pstate: Fix HWP on boot CPU after system resume
    cpufreq: st: enable selective initialization based on the platform

    * pm-cpuidle-fixes:
    ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value

    Rafael J. Wysocki
     
  • * acpica-fixes:
    ACPICA: Dispatcher: Update thread ID for recursive method calls

    * device-properties-fixes:
    device property: Avoid potential dereferences of invalid pointers

    Rafael J. Wysocki
     
  • Currently we read the TSC ratio: ratio = (MSR_PLATFORM_INFO >> 8) & 0x1f;

    Thus we get bits 8-12 of MSR_PLATFORM_INFO; however, according to the SDM
    (35.5), the ratio bits are bits 8-15.

    Ignoring the upper bits can result in an incorrect tsc ratio, which causes the
    TSC calibration and the Local APIC timer frequency to be incorrect.

    Fix this problem by masking 0xff instead.

    [ tglx: Massaged changelog ]

    Fixes: 7da7c1561366 "x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCs"
    Signed-off-by: Chen Yu
    Cc: "Rafael J. Wysocki"
    Cc: stable@vger.kernel.org
    Cc: Bin Gao
    Cc: Len Brown
    Link: http://lkml.kernel.org/r/1462505619-5516-1-git-send-email-yu.c.chen@intel.com
    Signed-off-by: Thomas Gleixner

    Chen Yu
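
The effect of the narrower mask in the TSC ratio commit above can be seen in a small user-space sketch. Only the shift by 8 and the two masks (0x1f and 0xff) come from the changelog; the function names and sample values are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Old extraction: keeps only bits 8-12 of the MSR value. */
static inline uint32_t tsc_ratio_5bit(uint64_t platform_info)
{
	return (uint32_t)((platform_info >> 8) & 0x1f);
}

/* Fixed extraction: keeps bits 8-15, the full field named in the changelog. */
static inline uint32_t tsc_ratio_8bit(uint64_t platform_info)
{
	return (uint32_t)((platform_info >> 8) & 0xff);
}

/* Hypothetical self-check: both agree while the ratio fits in 5 bits. */
static inline void tsc_ratio_selfcheck(void)
{
	assert(tsc_ratio_5bit(0x1700) == tsc_ratio_8bit(0x1700));
}
```

With a hypothetical ratio of 36 (0x24) in the field, the 5-bit mask yields 4, so any frequency derived from it would be off by a factor of nine.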
     
  • Merge fixes from Andrew Morton:
    "14 fixes"

    * emailed patches from Andrew Morton:
    byteswap: try to avoid __builtin_constant_p gcc bug
    lib/stackdepot: avoid to return 0 handle
    mm: fix kcompactd hang during memory offlining
    modpost: fix module autoloading for OF devices with generic compatible property
    proc: prevent accessing /proc/<pid>/environ until it's ready
    mm/zswap: provide unique zpool name
    mm: thp: kvm: fix memory corruption in KVM with THP enabled
    MAINTAINERS: fix Rajendra Nayak's address
    mm, cma: prevent nr_isolated_* counters from going negative
    mm: update min_free_kbytes from khugepaged after core initialization
    huge pagecache: mmap_sem is unlocked when truncation splits pmd
    rapidio/mport_cdev: fix uapi type definitions
    mm: memcontrol: let v2 cgroups follow changes in system swappiness
    mm: thp: correct split_huge_pages file permission

    Linus Torvalds
     
  • Apparently patchwork ended up truncating the full name.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Pull libnvdimm fixes from Dan Williams:

    - a fix for the persistent memory 'struct page' driver. The
    implementation overlooked the fact that pages are allocated in 2MB
    units leading to -ENOMEM when establishing some configurations.

    It's tagged for -stable as the problem was introduced with the
    initial implementation in 4.5.

    - The new "error status translation" routine, introduced with the 4.6
    updates to the nfit driver, missed a necessary path in
    acpi_nfit_ctl().

    The end result is that we are falsely assuming commands complete
    successfully when the embedded status says otherwise.

    * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    nfit: fix translation of command status results
    libnvdimm, pfn: fix memmap reservation sizing

    Linus Torvalds
     
  • This is another attempt to avoid a regression in wwn_to_u64() after that
    started using get_unaligned_be64(), which in turn ran into a bug on
    gcc-4.9 through 6.1.

    The regression got introduced due to the combination of two separate
    workarounds (commits e3bde9568d99: "include/linux/unaligned: force
    inlining of byteswap operations" and ef3fb2422ffe: "scsi: fc: use
    get/put_unaligned64 for wwn access") that each try to sidestep distinct
    problems with gcc behavior (code growth and increased stack usage).

    Unfortunately after both have been applied, a more serious gcc bug has
    been uncovered, leading to incorrect object code that discards part of a
    function and causes undefined behavior.

    As part of this problem is how __builtin_constant_p gets evaluated on an
    argument passed by reference into an inline function, this avoids the
    use of __builtin_constant_p() for all architectures that set
    CONFIG_ARCH_USE_BUILTIN_BSWAP. Most architectures do not set
    ARCH_SUPPORTS_OPTIMIZED_INLINING, which means they probably do not
    suffer from the problem in the qla2xxx driver, but they might still run
    into it elsewhere.

    Both of the original workarounds were only merged in the 4.6 kernel, and
    the bug that is fixed by this patch should only appear if both are
    there, so we probably don't need to backport the fix. On the other
    hand, it works by simplifying the code path and should not have any
    negative effects.

    [arnd@arndb.de: fix older gcc warnings]
    (http://lkml.kernel.org/r/12243652.bxSxEgjgfk@wuerfel)
    Link: https://lkml.org/lkml/headers/2016/4/12/1103
    Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122
    Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232
    Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646
    Fixes: e3bde9568d99 ("include/linux/unaligned: force inlining of byteswap operations")
    Fixes: ef3fb2422ffe ("scsi: fc: use get/put_unaligned64 for wwn access")
    Link: http://lkml.kernel.org/r/1780465.XdtPJpi8Tt@wuerfel
    Signed-off-by: Arnd Bergmann
    Reviewed-by: Josh Poimboeuf
    Tested-by: Josh Poimboeuf # on gcc-5.3
    Tested-by: Quinn Tran
    Cc: Martin Jambor
    Cc: "Martin K. Petersen"
    Cc: James Bottomley
    Cc: Denys Vlasenko
    Cc: Thomas Graf
    Cc: Peter Zijlstra
    Cc: David Rientjes
    Cc: Ingo Molnar
    Cc: Himanshu Madhani
    Cc: Jan Hubicka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
     
  • Recently, we started allowing stack traces whose hash value is 0 to be
    saved. As a result, stackdepot could return a 0 handle even on success,
    and users of stackdepot could not tell success from failure. This patch
    adds one bit to the handle and sets it in every valid handle, so a valid
    handle is never 0 and a 0 handle reliably indicates failure.

    Fixes: 33334e25769c ("lib/stackdepot.c: allow the stack trace hash to be zero")
    Link: http://lkml.kernel.org/r/1462252403-1106-1-git-send-email-iamjoonsoo.kim@lge.com
    Signed-off-by: Joonsoo Kim
    Cc: Alexander Potapenko
    Cc: Andrey Ryabinin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
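
The "valid bit" idea from the stackdepot commit above can be sketched as follows. The layout (bit 31 as the marker, 31 payload bits) and all names are assumptions chosen for illustration; the kernel's handle encoding may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: bit 31 marks a valid handle, bits 0-30 carry the payload. */
#define HANDLE_VALID_BIT  (UINT32_C(1) << 31)
#define HANDLE_PAYLOAD    (HANDLE_VALID_BIT - 1)

/* Build a handle; the valid bit guarantees the result is never 0. */
static inline uint32_t make_handle(uint32_t payload)
{
	assert((payload & ~HANDLE_PAYLOAD) == 0); /* payload must fit in 31 bits */
	return payload | HANDLE_VALID_BIT;
}

/* 0 is reserved for failure, so a plain non-zero test is enough. */
static inline int handle_is_valid(uint32_t handle)
{
	return handle != 0;
}

/* Recover the payload by masking the valid bit off again. */
static inline uint32_t handle_payload(uint32_t handle)
{
	return handle & HANDLE_PAYLOAD;
}
```

Even a payload of 0 now produces a non-zero handle, which is exactly the case that was previously indistinguishable from failure.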
     
  • Assume memory47 is the last online block left in node1. This will hang:

    # echo offline > /sys/devices/system/node/node1/memory47/state

    After a couple of minutes, the following pops up in dmesg:

    INFO: task bash:957 blocked for more than 120 seconds.
    Not tainted 4.6.0-rc6+ #6
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    bash D ffff8800b7adbaf8 0 957 951 0x00000000
    Call Trace:
    schedule+0x35/0x80
    schedule_timeout+0x1ac/0x270
    wait_for_completion+0xe1/0x120
    kthread_stop+0x4f/0x110
    kcompactd_stop+0x26/0x40
    __offline_pages.constprop.28+0x7e6/0x840
    offline_pages+0x11/0x20
    memory_block_action+0x73/0x1d0
    memory_subsys_offline+0x47/0x60
    device_offline+0x86/0xb0
    store_mem_state+0xda/0xf0
    dev_attr_store+0x18/0x30
    sysfs_kf_write+0x37/0x40
    kernfs_fop_write+0x11d/0x170
    __vfs_write+0x37/0x120
    vfs_write+0xa9/0x1a0
    SyS_write+0x55/0xc0
    entry_SYSCALL_64_fastpath+0x1a/0xa4

    kcompactd is waiting for kcompactd_max_order > 0 when it's woken up to
    actually exit. Check kthread_should_stop() to break out of the wait.

    Fixes: 698b1b306 ("mm, compaction: introduce kcompactd").
    Reported-by: Reza Arbab
    Tested-by: Reza Arbab
    Cc: Andrea Arcangeli
    Cc: "Kirill A. Shutemov"
    Cc: Rik van Riel
    Cc: Joonsoo Kim
    Cc: Mel Gorman
    Cc: David Rientjes
    Cc: Michal Hocko
    Cc: Johannes Weiner
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
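
The wake-up condition described in the kcompactd commit above can be modelled as a simple predicate. This is a sketch under the assumption that the thread sleeps until the predicate becomes true; the struct and the `should_stop` field are hypothetical stand-ins for the per-node data and kthread_should_stop().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-node state the thread waits on. */
struct node_state {
	int  kcompactd_max_order; /* > 0 when compaction work is pending */
	bool should_stop;         /* models kthread_should_stop() */
};

/* Wake-up condition: pending work, or a request to exit. */
static inline bool kcompactd_wake_condition(const struct node_state *n)
{
	assert(n != NULL);
	return n->kcompactd_max_order > 0 || n->should_stop;
}
```

If the predicate tested only `kcompactd_max_order > 0`, a stop request with no pending work would leave the thread asleep, matching the hang shown in the trace.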
     
  • Since the wildcard at the end of OF module aliases is gone, autoloading
    of modules that don't match a device's last (most generic) compatible
    value fails.

    For example the CODA960 VPU on i.MX6Q has the SoC specific compatible
    "fsl,imx6q-vpu" and the generic compatible "cnm,coda960". Since the
    driver currently only works with knowledge about the SoC specific
    integration, it doesn't list "cnm,coda960" in the module device table.

    This results in the device compatible
    "of:NvpuTCfsl,imx6q-vpuCcnm,coda960" not matching the module alias
    "of:N*T*Cfsl,imx6q-vpu" anymore, whereas before commit 2f632369ab79
    ("modpost: don't add a trailing wildcard for OF module aliases") it
    matched the module alias "of:N*T*Cfsl,imx6q-vpu*".

    This patch adds two module aliases for each compatible, one without the
    wildcard and one with "C*" appended.

    $ modinfo coda | grep imx6q
    alias: of:N*T*Cfsl,imx6q-vpuC*
    alias: of:N*T*Cfsl,imx6q-vpu

    Fixes: 2f632369ab79 ("modpost: don't add a trailing wildcard for OF module aliases")
    Link: http://lkml.kernel.org/r/1462203339-15340-1-git-send-email-p.zabel@pengutronix.de
    Signed-off-by: Philipp Zabel
    Cc: Javier Martinez Canillas
    Cc: Brian Norris
    Cc: Sjoerd Simons
    Cc: Rusty Russell
    Cc: Greg Kroah-Hartman
    Cc: [4.5+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Philipp Zabel
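
The matching behaviour described in the modpost commit above can be reproduced in user space, assuming alias lookup behaves like shell-style glob matching. The sketch models that with POSIX fnmatch(); the wrapper name is hypothetical, and the strings are the ones quoted in the changelog.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Return 1 if the device's modalias string matches the module alias pattern. */
static inline int alias_matches(const char *pattern, const char *modalias)
{
	assert(pattern != NULL && modalias != NULL);
	return fnmatch(pattern, modalias, 0) == 0;
}
```

Checked against "of:NvpuTCfsl,imx6q-vpuCcnm,coda960", the alias without a trailing wildcard fails to match, while both the old "...vpu*" form and the new "...vpuC*" form match. Appending "C*" rather than a bare "*" keeps the pattern from matching a different compatible that merely shares the prefix.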
     
  • If /proc/<pid>/environ gets read before the envp[] array is fully set up
    in create_{aout,elf,elf_fdpic,flat}_tables(), we might end up trying to
    read more bytes than are actually written, as env_start will already be
    set but env_end will still be zero, making the range calculation
    underflow, allowing to read beyond the end of what has been written.

    Fix this as it is done for /proc/<pid>/cmdline by testing env_end for
    zero. It is, apparently, intentionally set last in create_*_tables().

    This bug was found by the PaX size_overflow plugin that detected the
    arithmetic underflow of 'this_len = env_end - (env_start + src)' when
    env_end is still zero.

    The expected consequence is that userland trying to access
    /proc/<pid>/environ of a not yet fully set up process may get
    inconsistent data as we're in the middle of copying in the environment
    variables.

    Fixes: https://forums.grsecurity.net/viewtopic.php?f=3&t=4363
    Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=116461
    Signed-off-by: Mathias Krause
    Cc: Emese Revfy
    Cc: Pax Team
    Cc: Al Viro
    Cc: Mateusz Guzik
    Cc: Alexey Dobriyan
    Cc: Cyrill Gorcunov
    Cc: Jarod Wilson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathias Krause
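
The underflow described in the environ commit above, and the zero check that avoids it, can be sketched in user space. The struct and function are hypothetical; only the field names env_start and env_end and the expression env_end - (env_start + src) come from the changelog.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two fields named in the changelog. */
struct env_range {
	unsigned long env_start;
	unsigned long env_end;   /* 0 until the environment is fully set up */
};

/*
 * Return how many bytes may be read at offset src, or 0 if nothing is
 * readable yet. Without the env_end check, env_end - (env_start + src)
 * wraps around to a huge unsigned value while env_end is still 0.
 */
static inline unsigned long env_readable(const struct env_range *r,
					 unsigned long src)
{
	assert(r != NULL);
	if (r->env_end == 0)                  /* not ready yet: nothing to read */
		return 0;
	if (r->env_start + src >= r->env_end) /* offset at or past the end */
		return 0;
	return r->env_end - (r->env_start + src);
}
```

Because the arithmetic is unsigned, the unchecked subtraction does not go negative; it produces a very large length, which is what would allow reading beyond what has been written.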
     
  • Instead of using "zswap" as the name for all zpools created, add an
    atomic counter and use "zswap%x" with the counter number for each zpool
    created, to provide a unique name for each new zpool.

    As zsmalloc, one of the zpool implementations, requires/expects a unique
    name for each pool created, zswap should provide a unique name. The
    zsmalloc pool creation does not fail if a new pool with a conflicting
    name is created, unless CONFIG_ZSMALLOC_STAT is enabled; in that case,
    zsmalloc pool creation fails with -ENOMEM. Then zswap will be unable to
    change its compressor parameter if its zpool is zsmalloc; it also will
    be unable to change its zpool parameter back to zsmalloc, if it has any
    existing old zpool using zsmalloc with page(s) in it. Attempts to
    change the parameters will result in failure to create the zpool. This
    changes zswap to provide a unique name for each zpool creation.

    Fixes: f1c54846ee45 ("zswap: dynamic pool creation")
    Signed-off-by: Dan Streetman
    Reported-by: Sergey Senozhatsky
    Reviewed-by: Sergey Senozhatsky
    Cc: Dan Streetman
    Cc: Minchan Kim
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Streetman
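
The naming scheme from the zswap commit above (an atomic counter formatted with "zswap%x") can be sketched with C11 atomics. The counter variable and function name are hypothetical, and starting the numbering at 1 is an assumption made for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counter; each call hands out the next value. */
static atomic_uint pool_counter;

/* Write a unique "zswap%x" name into buf and return its length. */
static inline int next_pool_name(char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	/* fetch_add returns the old value, so add 1 to number pools from 1. */
	unsigned int id = atomic_fetch_add(&pool_counter, 1) + 1;
	return snprintf(buf, len, "zswap%x", id);
}
```

Two consecutive calls produce different names, so a pool implementation that rejects duplicate names would no longer see a conflict.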
     
  • After the THP refcounting change, obtaining a compound page from
    get_user_pages() no longer allows us to assume the entire compound page
    is immediately mappable from a secondary MMU.

    A secondary MMU doesn't want to call get_user_pages() more than once for
    each compound page, in order to know if it can map the whole compound
    page. So a secondary MMU needs to know from a single get_user_pages()
    invocation when it can map immediately the entire compound page to avoid
    a flood of unnecessary secondary MMU faults and spurious
    atomic_inc()/atomic_dec() (pages don't have to be pinned by MMU notifier
    users).

    Ideally instead of the page->_mapcount < 1 check, get_user_pages()
    should return the granularity of the "page" mapping in the "mm" passed
    to get_user_pages(). However, it's a non-trivial change to pass the "pmd"
    status belonging to the "mm" walked by get_user_pages up the stack (up
    to the caller of get_user_pages). So the fix just checks if there is
    not a single pte mapping on the page returned by get_user_pages, and in
    turn if the caller can assume that the whole compound page is mapped in
    the current "mm" (in a pmd_trans_huge()). In such case the entire
    compound page is safe to map into the secondary MMU without additional
    get_user_pages() calls on the surrounding tail/head pages. In addition
    to being faster, not having to run other get_user_pages() calls also
    reduces the memory footprint of the secondary MMU fault in case the pmd
    split happened as result of memory pressure.

    Without this fix after a MADV_DONTNEED (like invoked by QEMU during
    postcopy live migration or ballooning) or after generic swapping (with a
    failure in split_huge_page() that would only result in pmd splitting and
    not a physical page split), KVM would map the whole compound page into
    the shadow pagetables, despite regular faults or userfaults (like
    UFFDIO_COPY) may map regular pages into the primary MMU as result of the
    pte faults, leading to the guest mode and userland mode going out of
    sync and not working on the same memory at all times.

    Any other secondary MMU notifier manager (KVM is just one of the many
    MMU notifier users) will need the same information if it doesn't want to
    run a flood of get_user_pages_fast and it can support multiple
    granularity in the secondary MMU mappings, so I think it is justified to
    be exposed not just to KVM.

    The other option would be to move transparent_hugepage_adjust to
    mm/huge_memory.c but that currently has all kind of KVM data structures
    in it, so it's definitely not cut-and-paste work, and I couldn't do a
    fix as clean as this one for 4.6.

    Signed-off-by: Andrea Arcangeli
    Cc: "Dr. David Alan Gilbert"
    Cc: "Kirill A. Shutemov"
    Cc: "Li, Liang Z"
    Cc: Amit Shah
    Cc: Paolo Bonzini
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • Signed-off-by: Eric Engestrom
    Cc: Rajendra Nayak
    Cc: Afzal Mohammed
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Engestrom