10 Feb, 2014

8 commits

  • Linus Torvalds
     
  • Pull SELinux fixes from James Morris.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
    SELinux: Fix kernel BUG on empty security contexts.
    selinux: add SOCK_DIAG_BY_FAMILY to the list of netlink message types

    Linus Torvalds
     
  • Pull vfs fixes from Al Viro:
    "A couple of fixes, both -stable fodder. The O_SYNC bug is fairly
    old..."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fix a kmap leak in virtio_console
    fix O_SYNC|O_APPEND syncing the wrong range on write()

    Linus Torvalds
     
  • James Morris
     
  • While we are at it, don't do kmap() under kmap_atomic(), *especially*
    for a page we'd allocated with GFP_KERNEL. It's spelled "page_address",
    and had that been more than that, we'd have real trouble - kmap_high()
    can block, and doing that while holding kmap_atomic() is a Bad Idea(tm).

    Signed-off-by: Al Viro

    Al Viro
     
  • It actually goes back to 2004 ([PATCH] Concurrent O_SYNC write support)
    when sync_page_range() had been introduced; generic_file_write{,v}() correctly
    synced
    pos_after_write - written .. pos_after_write - 1
    but generic_file_aio_write() synced
    pos_before_write .. pos_before_write + written - 1
    instead. Which is not the same thing with O_APPEND, obviously.
    A couple of years later the correct variant had been killed off when
    everything switched to use of generic_file_aio_write().

    All users of generic_file_aio_write() are affected, and the same bug
    has been copied into other instances of ->aio_write().

    The fix is trivial; the only subtle point is that generic_write_sync()
    ought to be inlined to avoid calculations useless for the majority of
    calls.

    Signed-off-by: Al Viro

    Al Viro
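
The two ranges described in the O_SYNC|O_APPEND commit above can be compared with a little arithmetic. This is a minimal sketch, not kernel code: the function names and the sample numbers are made up for illustration, and it assumes that with O_APPEND the data lands at the end of the file no matter which position the caller passed in.

```python
def range_from_pos_before(pos_before_write, written):
    """Range derived from the position known before the write (the buggy variant)."""
    return (pos_before_write, pos_before_write + written - 1)


def range_from_pos_after(pos_after_write, written):
    """Range derived from the position after the write (the correct variant)."""
    return (pos_after_write - written, pos_after_write - 1)


# Hypothetical numbers: an 8192-byte file opened with O_APPEND, and a caller
# whose file position is still 0 when it writes 100 bytes.
file_size = 8192
caller_pos = 0
written = 100

# Assumption of this model: O_APPEND puts the data at the end of the file.
pos_after_write = file_size + written

buggy = range_from_pos_before(caller_pos, written)
fixed = range_from_pos_after(pos_after_write, written)

print("synced by the buggy variant:", buggy)    # (0, 99)
print("range that was actually written:", fixed)  # (8192, 8291)
```

Without O_APPEND the two functions return the same range, which would explain why the discrepancy went unnoticed for so long.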
     
  • Pull btrfs fixes from Chris Mason:
    "This is a small collection of fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    Btrfs: fix data corruption when reading/updating compressed extents
    Btrfs: don't loop forever if we can't run because of the tree mod log
    btrfs: reserve no transaction units in btrfs_ioctl_set_features
    btrfs: commit transaction after setting label and features
    Btrfs: fix assert screwup for the pending move stuff

    Linus Torvalds
     
  • Pull perf fixes from Ingo Molnar:
    "Tooling fixes, mostly related to the KASLR fallout, but also other
    fixes"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf buildid-cache: Check relocation when checking for existing kcore
    perf tools: Adjust kallsyms for relocated kernel
    perf tests: No need to set up ref_reloc_sym
    perf symbols: Prevent the use of kcore if the kernel has moved
    perf record: Get ref_reloc_sym from kernel map
    perf machine: Set up ref_reloc_sym in machine__create_kernel_maps()
    perf machine: Add machine__get_kallsyms_filename()
    perf tools: Add kallsyms__get_function_start()
    perf symbols: Fix symbol annotation for relocated kernel
    perf tools: Fix include for non x86 architectures
    perf tools: Fix AAAAARGH64 memory barriers
    perf tools: Demangle kernel and kernel module symbols too
    perf/doc: Remove mention of non-existent set_perf_event_pending() from design.txt

    Linus Torvalds
     

09 Feb, 2014

10 commits

  • When using a mix of compressed file extents and prealloc extents, it
    is possible to fill a page of a file with random, garbage data from
    some unrelated previous use of the page, instead of a sequence of zeroes.

    A simple sequence of steps to get into such case, taken from the test
    case I made for xfstests, is:

    _scratch_mkfs
    _scratch_mount "-o compress-force=lzo"
    $XFS_IO_PROG -f -c "pwrite -S 0x06 -b 18670 266978 18670" $SCRATCH_MNT/foobar
    $XFS_IO_PROG -c "falloc 26450 665194" $SCRATCH_MNT/foobar
    $XFS_IO_PROG -c "truncate 542872" $SCRATCH_MNT/foobar
    $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar

    This results in the following file items in the fs tree:

    item 4 key (257 INODE_ITEM 0) itemoff 15879 itemsize 160
    inode generation 6 transid 6 size 542872 block group 0 mode 100600
    item 5 key (257 INODE_REF 256) itemoff 15863 itemsize 16
    inode ref index 2 namelen 6 name: foobar
    item 6 key (257 EXTENT_DATA 0) itemoff 15810 itemsize 53
    extent data disk byte 0 nr 0 gen 6
    extent data offset 0 nr 24576 ram 266240
    extent compression 0
    item 7 key (257 EXTENT_DATA 24576) itemoff 15757 itemsize 53
    prealloc data disk byte 12849152 nr 241664 gen 6
    prealloc data offset 0 nr 241664
    item 8 key (257 EXTENT_DATA 266240) itemoff 15704 itemsize 53
    extent data disk byte 12845056 nr 4096 gen 6
    extent data offset 0 nr 20480 ram 20480
    extent compression 2
    item 9 key (257 EXTENT_DATA 286720) itemoff 15651 itemsize 53
    prealloc data disk byte 13090816 nr 405504 gen 6
    prealloc data offset 0 nr 258048

    The on disk extent at offset 266240 (which corresponds to 1 single disk block),
    contains 5 compressed chunks of file data. Each of the first 4 compress 4096
    bytes of file data, while the last one only compresses 3024 bytes of file data.
    Therefore a read into the file region [285648 ; 286720[ (length = 4096 - 3024 =
    1072 bytes) should always return zeroes (our next extent is a prealloc one).

    The solution here is for the compression code path to zero the remaining (untouched)
    bytes of the last page it uncompressed data into, as the information about how
    much space the file data consumes in the last page is not known in the upper layer
    fs/btrfs/extent_io.c:__do_readpage(). In __do_readpage we were correctly zeroing
    the remainder of the page but only if it corresponds to the last page of the inode
    and if the inode's size is not a multiple of the page size.

    This would cause not only returning random data on reads, but also permanently
    storing random data when updating parts of the region that should be zeroed.
    For the example above, it means updating a single byte in the region [285648 ; 286720[
    would store that byte correctly but also store random data on disk.

    A test case for xfstests follows soon.

    Signed-off-by: Filipe David Borba Manana
    Signed-off-by: Chris Mason

    Filipe David Borba Manana
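
The region that must read back as zeroes in the example above can be recomputed from the extent layout. A minimal sketch, assuming a 4096-byte page size and an extent that starts on a page boundary; `zero_tail` is a hypothetical helper, not a btrfs function.

```python
PAGE_SIZE = 4096  # assumption: page size used in the commit's example


def zero_tail(extent_file_offset, chunk_lengths, page_size=PAGE_SIZE):
    """Return (start, end) of the bytes in the last page that hold no file data.

    extent_file_offset: file offset where the compressed extent starts.
    chunk_lengths: uncompressed length of each compressed chunk.
    The end offset is exclusive.
    """
    data_end = extent_file_offset + sum(chunk_lengths)
    used_in_last_page = data_end % page_size
    if used_in_last_page == 0:
        return (data_end, data_end)  # last page is full, nothing to zero
    page_end = data_end - used_in_last_page + page_size
    return (data_end, page_end)


# Numbers from the commit message: extent at file offset 266240,
# four chunks of 4096 bytes and a last chunk of 3024 bytes.
start, end = zero_tail(266240, [4096, 4096, 4096, 4096, 3024])
print(start, end, end - start)  # 285648 286720 1072
```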
     
  • A user reported a 100% cpu hang with my new delayed ref code. Turns out I
    forgot to increase the count check when we can't run a delayed ref because of
    the tree mod log. If we can't run any delayed refs during this there is no
    point in continuing to look, and we need to break out. Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
     
  • Modifications to the superblock, added in patch "btrfs: add ioctls to
    query/change feature bits online", don't need to reserve metadata blocks
    when starting a transaction.

    Signed-off-by: David Sterba
    Signed-off-by: Chris Mason

    David Sterba
     
  • The set_fslabel ioctl uses btrfs_end_transaction, which means it's
    possible that the change will be lost if the system crashes, same for
    the newly set features. Let's use btrfs_commit_transaction instead.

    Signed-off-by: Jeff Mahoney
    Signed-off-by: David Sterba
    Signed-off-by: Chris Mason

    Jeff Mahoney
     
  • Wang noticed that he was failing btrfs/030 even though Filipe and I couldn't
    reproduce. Turns out this is because Wang didn't have CONFIG_BTRFS_ASSERT set,
    which meant that a key part of Filipe's original patch was not being built in.
    This appears to be a mess up with merging Filipe's patch as it does not exist in
    his original patch. Fix this by changing how we make sure del_waiting_dir_move
    asserts that it did not error and take the function out of the ifdef check.
    This makes btrfs/030 pass with the assert on or off. Thanks,

    Signed-off-by: Josef Bacik
    Reviewed-by: Filipe Manana
    Signed-off-by: Chris Mason

    Josef Bacik
     
  • Pull pinctrl fixes from Linus Walleij:
    "First round of pin control fixes for v3.14:

    - Protect pinctrl_list_add() with the proper mutex. This was
    identified by Red Hat. It caused nasty locking warnings and was
    root-caused by Stanislaw Gruszka.

    - Avoid adding dangerous debugfs files when either half of the
    subsystem is unused: pinmux or pinconf.

    - Various fixes to various drivers: locking, hardware particulars, DT
    parsing, error codes"

    * tag 'pinctrl-v3.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
    pinctrl: tegra: return correct error type
    pinctrl: do not init debugfs entries for unimplemented functionalities
    pinctrl: protect pinctrl_list add
    pinctrl: sirf: correct the pin index of ac97_pins group
    pinctrl: imx27: fix offset calculation in imx_read_2bit
    pinctrl: vt8500: Change devicetree data parsing
    pinctrl: imx27: fix wrong offset to ICONFB
    pinctrl: at91: use locked variant of irq_set_handler

    Linus Torvalds
     
  • Pull irq fix from Thomas Gleixner:
    "Add a missing Kconfig dependency"

    * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    genirq: Generic irq chip requires IRQ_DOMAIN

    Linus Torvalds
     
  • Pull x86 fixes from Peter Anvin:
    "Quite a varied little collection of fixes. Most of them are
    relatively small or isolated; the biggest one is Mel Gorman's fixes
    for TLB range flushing.

    A couple of AMD-related fixes (including not crashing when given an
    invalid microcode image) and fix a crash when compiled with gcov"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86, microcode, AMD: Unify valid container checks
    x86, hweight: Fix BUG when booting with CONFIG_GCOV_PROFILE_ALL=y
    x86/efi: Allow mapping BGRT on x86-32
    x86: Fix the initialization of physnode_map
    x86, cpu hotplug: Fix stack frame warning in check_irq_vectors_for_cpu_disable()
    x86/intel/mid: Fix X86_INTEL_MID dependencies
    arch/x86/mm/srat: Skip NUMA_NO_NODE while parsing SLIT
    mm, x86: Revisit tlb_flushall_shift tuning for page flushes except on IvyBridge
    x86: mm: change tlb_flushall_shift for IvyBridge
    x86/mm: Eliminate redundant page table walk during TLB range flushing
    x86/mm: Clean up inconsistencies when flushing TLB ranges
    mm, x86: Account for TLB flushes only when debugging
    x86/AMD/NB: Fix amd_set_subcaches() parameter type
    x86/quirks: Add workaround for AMD F16h Erratum792
    x86, doc, kconfig: Fix dud URL for Microcode data

    Linus Torvalds
     
  • Pull jfs fix from David Kleikamp:
    "Fix regression"

    * tag 'jfs-3.14-rc2' of git://github.com/kleikamp/linux-shaggy:
    jfs: fix generic posix ACL regression

    Linus Torvalds
     
  • I missed a couple of errors in reviewing the patches converting jfs
    to use the generic posix ACL function. Setting ACLs currently
    fails with -EOPNOTSUPP.

    Signed-off-by: Dave Kleikamp
    Reported-by: Michael L. Semon
    Reviewed-by: Christoph Hellwig

    Dave Kleikamp
     

08 Feb, 2014

15 commits

  • On archs like S390 or um this driver cannot build nor work.
    Make it depend on HAS_IOMEM to bypass build failures.

    drivers/built-in.o: In function `dw_wdt_drv_probe':
    drivers/watchdog/dw_wdt.c:302: undefined reference to `devm_ioremap_resource'

    Signed-off-by: Richard Weinberger
    Signed-off-by: Wim Van Sebroeck

    Richard Weinberger
     
  • Pull driver core fix from Greg KH:
    "Here is a single kernfs fix to resolve a much-reported lockdep issue
    with the removal of entries in sysfs"

    * tag 'driver-core-3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
    kernfs: make kernfs_deactivate() honor KERNFS_LOCKDEP flag

    Linus Torvalds
     
  • Pull ceph fixes from Sage Weil:
    "There is an RBD fix for a crash due to the immutable bio changes, an
    error path fix, and a locking fix in the recent redirect support"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    libceph: do not dereference a NULL bio pointer
    libceph: take map_sem for read in handle_reply()
    libceph: factor out logic from ceph_osdc_start_request()
    libceph: fix error handling in ceph_osdc_init()

    Linus Torvalds
     
  • Pull arm64 fixes from Catalin Marinas:
    - Relax VDSO alignment requirements so that the kernel-picked one (4K)
    does not conflict with the dynamic linker's one (64K)
    - VDSO gettimeofday fix
    - Barrier fixes for atomic operations and cache flushing
    - TLB invalidation when overriding early page mappings during boot
    - Wired up new 32-bit arm (compat) syscalls
    - LSM_MMAP_MIN_ADDR when COMPAT is enabled
    - defconfig update
    - Clean-up (comments, pgd_alloc).

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: defconfig: Expand default enabled features
    arm64: asm: remove redundant "cc" clobbers
    arm64: atomics: fix use of acquire + release for full barrier semantics
    arm64: barriers: allow dsb macro to take option parameter
    security: select correct default LSM_MMAP_MIN_ADDR on arm on arm64
    arm64: compat: Wire up new AArch32 syscalls
    arm64: vdso: update wtm fields for CLOCK_MONOTONIC_COARSE
    arm64: vdso: fix coarse clock handling
    arm64: simplify pgd_alloc
    arm64: fix typo: s/SERRROR/SERROR/
    arm64: Invalidate the TLB when replacing pmd entries during boot
    arm64: Align CMA sizes to PAGE_SIZE
    arm64: add DSB after icache flush in __flush_icache_all()
    arm64: vdso: prevent ld from aligning PT_LOAD segments to 64k

    Linus Torvalds
     
  • Pull MIPS updates from Ralf Baechle:
    "Three minor patches. All have sat in -next for a few days"

    * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
    MIPS: fpu.h: Fix build when CONFIG_BUG is not set
    MIPS: Wire up sched_setattr/sched_getattr syscalls
    MIPS: Alchemy: Fix DB1100 GPIO registration

    Linus Torvalds
     
  • Pull media fixes from Mauro Carvalho Chehab:
    "A series of small fixes. Mostly driver ones. There is one core
    regression fix on a patch that was meant to fix some race issues on
    vb2, but that actually caused more harm than good. So, we're just
    reverting it for now"

    * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
    [media] adv7842: Composite free-run platfrom-data fix
    [media] v4l2-dv-timings: fix GTF calculation
    [media] hdpvr: Fix memory leak in debug
    [media] af9035: add ID [2040:f900] Hauppauge WinTV-MiniStick 2
    [media] mxl111sf: Fix compile when CONFIG_DVB_USB_MXL111SF is unset
    [media] mxl111sf: Fix unintentional garbage stack read
    [media] cx24117: use a valid dev pointer for dev_err printout
    [media] cx24117: remove dead code in always 'false' if statement
    [media] update Michael Krufky's email address
    [media] vb2: Check if there are buffers before streamon
    [media] Revert "[media] videobuf_vm_{open,close} race fixes"
    [media] go7007-loader: fix usb_dev leak
    [media] media: bt8xx: add missing put_device call
    [media] exynos4-is: Compile in fimc-lite runtime PM callbacks conditionally
    [media] exynos4-is: Compile in fimc runtime PM callbacks conditionally
    [media] exynos4-is: Fix error paths in probe() for !pm_runtime_enabled()
    [media] s5p-jpeg: Fix wrong NV12 format parameters
    [media] s5k5baf: allow to handle arbitrary long i2c sequences

    Linus Torvalds
     
  • Pull hwmon fixes from Guenter Roeck:
    "Fix PMBus driver problem with some multi-page voltage sensors and fix
    da9055 interrupt initialization"

    * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
    hwmon: (da9055) Remove use of regmap_irq_get_virq()
    hwmon: (pmbus) Support per-page exponent in linear mode

    Linus Torvalds
     
  • Pull ACPI and power management fixes from Rafael Wysocki:
    "These include a fix for a recent ACPI hotplug regression, four
    concurrency related fixes and one PCI device removal fix for
    ACPI-based PCI hotplug (ACPIPHP), intel_pstate fix that should go into
    stable, three simple ACPI cleanups and a new entry for the ACPI video
    blacklist.

    Specifics:

    - Fix for a recent ACPI hotplug regression causing a NULL pointer
    dereference to occur while handling ACPI eject notifications for
    already ejected devices. From Toshi Kani.

    - Four concurrency-related fixes for ACPIPHP. Two of them add
    missing locking and the other two fix race conditions related to
    reference counting.

    - ACPIPHP fix to avoid NULL pointer dereferences during device
    removal involving Virtual Functions.

    - intel_pstate fix to make it compute the percentage of time the CPU
    is busy properly. From Dirk Brandewie.

    - Removal of two unnecessary NULL pointer checks in ACPI code and a
    fix for sscanf() format string from Dan Carpenter and Luis G.F.

    - New ACPI video blacklist entry for HP EliteBook Revolve 810 from
    Mika Westerberg"

    * tag 'pm+acpi-3.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    ACPI / hotplug: Fix panic on eject to ejected device
    ACPI / battery: Fix incorrect sscanf() string in acpi_battery_init_alarm()
    ACPI / proc: remove unneeded NULL check
    ACPI / utils: remove a pointless NULL check
    ACPI / video: Add HP EliteBook Revolve 810 to the blacklist
    intel_pstate: Take core C0 time into account for core busy calculation
    ACPI / hotplug / PCI: Fix bridge removal race vs dock events
    ACPI / hotplug / PCI: Fix bridge removal race in handle_hotplug_event()
    ACPI / hotplug / PCI: Scan root bus under the PCI rescan-remove lock
    ACPI / hotplug / PCI: Move PCI rescan-remove locking to hotplug_event()
    ACPI / hotplug / PCI: Remove entries from bus->devices in reverse order

    Linus Torvalds
     
  • Commit f38a5181d9f3 ("ceph: Convert to immutable biovecs") introduced
    a NULL pointer dereference, which broke rbd in -rc1. Fix it.

    Cc: Kent Overstreet
    Signed-off-by: Ilya Dryomov
    Reviewed-by: Sage Weil

    Ilya Dryomov
     
  • * Avoid WARN_ON() when mapping BGRT on Baytrail (EFI 32-bit).

    Signed-off-by: H. Peter Anvin

    H. Peter Anvin
     
  • Handling redirect replies requires both map_sem and request_mutex.
    Taking map_sem unconditionally near the top of handle_reply() avoids
    possible race conditions that arise from releasing request_mutex to be
    able to acquire map_sem in redirect reply case. (Lock ordering is:
    map_sem, request_mutex, crush_mutex.)

    Signed-off-by: Ilya Dryomov
    Reviewed-by: Sage Weil

    Ilya Dryomov
     
  • Factor out logic from ceph_osdc_start_request() into a new helper,
    __ceph_osdc_start_request(). ceph_osdc_start_request() now amounts to
    taking locks and calling __ceph_osdc_start_request().

    Signed-off-by: Ilya Dryomov
    Reviewed-by: Sage Weil

    Ilya Dryomov
     
  • FPGA implementations of the Cortex-A57 and Cortex-A53 are now available
    in the form of the SMM-A57 and SMM-A53 Soft Macrocell Models (SMMs) for
    Versatile Express. As these attach to a Motherboard Express V2M-P1 it
    would be useful to have support for some V2M-P1 peripherals enabled by
    default.

    Additionally a couple of features have been introduced since the last
    defconfig update (CMA, jump labels) that would be good to have enabled
    by default to ensure they are build and boot tested.

    This patch updates the arm64 defconfig to enable support for these
    devices and features. The arm64 Kconfig is modified to select
    HAVE_PATA_PLATFORM, which is required to enable support for the
    CompactFlash controller on the V2M-P1.

    A few options which don't need to appear in defconfig are trimmed:

    * BLK_DEV - selected by default
    * EXPERIMENTAL - otherwise gone from the kernel
    * MII - selected by drivers which require it
    * USB_SUPPORT - selected by default

    Signed-off-by: Mark Rutland
    Signed-off-by: Catalin Marinas

    Mark Rutland
     
  • cbnz/tbnz don't update the condition flags, so remove the "cc" clobbers
    from inline asm blocks that only use these instructions to implement
    conditional branches.

    Signed-off-by: Will Deacon
    Signed-off-by: Catalin Marinas

    Will Deacon
     
  • Linux requires a number of atomic operations to provide full barrier
    semantics, that is no memory accesses after the operation can be
    observed before any accesses up to and including the operation in
    program order.

    On arm64, these operations have been incorrectly implemented as follows:

    // A, B, C are independent memory locations

    <Access [A]>

    // atomic_op (B)
    1: ldaxr x0, [B] // Exclusive load with acquire
    <op(B)>
    stlxr w1, x0, [B] // Exclusive store with release
    cbnz w1, 1b

    <Access [C]>

    The assumption here being that two half barriers are equivalent to a
    full barrier, so the only permitted ordering would be A -> B -> C
    (where B is the atomic operation involving both a load and a store).

    Unfortunately, this is not the case by the letter of the architecture
    and, in fact, the accesses to A and C are permitted to pass their
    nearest half barrier resulting in orderings such as Bl -> A -> C -> Bs
    or Bl -> C -> A -> Bs (where Bl is the load-acquire on B and Bs is the
    store-release on B). This is a clear violation of the full barrier
    requirement.

    The simple way to fix this is to implement the same algorithm as ARMv7
    using explicit barriers:

    // atomic_op (B)
    dmb ish // Full barrier
    1: ldxr x0, [B] // Exclusive load
    <op(B)>
    stxr w1, x0, [B] // Exclusive store
    cbnz w1, 1b
    dmb ish // Full barrier

    but this has the undesirable effect of introducing *two* full barrier
    instructions. A better approach is actually the following, non-intuitive
    sequence:

    // atomic_op (B)
    1: ldxr x0, [B] // Exclusive load
    <op(B)>
    stlxr w1, x0, [B] // Exclusive store with release
    cbnz w1, 1b
    dmb ish // Full barrier

    The simple observations here are:

    - The dmb ensures that no subsequent accesses (e.g. the access to C)
    can enter or pass the atomic sequence.

    - The dmb also ensures that no prior accesses (e.g. the access to A)
    can pass the atomic sequence.

    - Therefore, no prior access can pass a subsequent access, or
    vice-versa (i.e. A is strictly ordered before C).

    - The stlxr ensures that no prior access can pass the store component
    of the atomic operation.

    The only tricky part remaining is the ordering between the ldxr and the
    access to A, since the absence of the first dmb means that we're now
    permitting re-ordering between the ldxr and any prior accesses.

    From an (arbitrary) observer's point of view, there are two scenarios:

    1. We have observed the ldxr. This means that if we perform a store to
    [B], the ldxr will still return older data. If we can observe the
    ldxr, then we can potentially observe the permitted re-ordering
    with the access to A, which is clearly an issue when compared to
    the dmb variant of the code. Thankfully, the exclusive monitor will
    save us here since it will be cleared as a result of the store and
    the ldxr will retry. Notice that any use of a later memory
    observation to imply observation of the ldxr will also imply
    observation of the access to A, since the stlxr/dmb ensure strict
    ordering.

    2. We have not observed the ldxr. This means we can perform a store
    and influence the later ldxr. However, that doesn't actually tell
    us anything about the access to [A], so we've not lost anything
    here either when compared to the dmb variant.

    This patch implements this solution for our barriered atomic operations,
    ensuring that we satisfy the full barrier requirements where they are
    needed.

    Cc: Peter Zijlstra
    Signed-off-by: Will Deacon
    Signed-off-by: Catalin Marinas

    Will Deacon
     

07 Feb, 2014

7 commits

  • Remove use of regmap_irq_get_virq() in driver probe which was
    conflicting with use of platform_get_irq_byname().
    platform_get_irq_byname() already returns the VIRQ number due
    to MFD core translation so using regmap_irq_get_virq() on that
    returned value results in an incorrect IRQ being requested.
    The driver probes then fail because of this.

    Signed-off-by: Adam Thomson
    Signed-off-by: Guenter Roeck

    Adam Thomson
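
The double translation described in the da9055 commit above can be shown with a toy model. Everything here is an assumption made for illustration: the linear mapping, the base value 200, the resource name and the `_model` function names are hypothetical and only mimic the roles of regmap_irq_get_virq() and platform_get_irq_byname().

```python
VIRQ_BASE = 200  # hypothetical base of the virtual IRQ range for this chip


def regmap_irq_get_virq_model(chip_irq):
    """Toy stand-in: translate a chip-local IRQ index into a virtual IRQ."""
    return VIRQ_BASE + chip_irq


def platform_get_irq_byname_model(name):
    """Toy stand-in: the MFD core has already translated the resource to a VIRQ."""
    chip_irqs = {"HWMON": 1}  # hypothetical resource name and chip-local index
    return regmap_irq_get_virq_model(chip_irqs[name])


# Correct: use the value returned for the named resource directly.
correct_virq = platform_get_irq_byname_model("HWMON")

# Buggy: translate a value that is already a VIRQ a second time.
wrong_virq = regmap_irq_get_virq_model(correct_virq)

print(correct_virq, wrong_virq)  # 201 401
```

In this model the second translation yields a number that does not correspond to the device's interrupt at all, which matches the commit's description of an incorrect IRQ being requested.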
     
  • * acpi-cleanup:
    ACPI / battery: Fix incorrect sscanf() string in acpi_battery_init_alarm()
    ACPI / proc: remove unneeded NULL check
    ACPI / utils: remove a pointless NULL check

    * acpi-video:
    ACPI / video: Add HP EliteBook Revolve 810 to the blacklist

    Rafael J. Wysocki
     
  • * pm-cpufreq:
    intel_pstate: Take core C0 time into account for core busy calculation

    Rafael J. Wysocki
     
  • * acpi-pci-hotplug:
    ACPI / hotplug / PCI: Fix bridge removal race vs dock events
    ACPI / hotplug / PCI: Fix bridge removal race in handle_hotplug_event()
    ACPI / hotplug / PCI: Scan root bus under the PCI rescan-remove lock
    ACPI / hotplug / PCI: Move PCI rescan-remove locking to hotplug_event()
    ACPI / hotplug / PCI: Remove entries from bus->devices in reverse order

    * acpi-hotplug:
    ACPI / hotplug: Fix panic on eject to ejected device

    Rafael J. Wysocki
     
  • Merge a bunch of fixes from Andrew Morton:
    "Commit 579f82901f6f ("swap: add a simple detector for inappropriate
    swapin readahead") is a feature. No probs if you decide to defer it
    until the next merge window.

    It has been sitting in my tree for over a year because of my dislike
    of all the magic numbers, but recent discussion with Hugh has made me
    give up"

    * emailed patches from Andrew Morton:
    mm: __set_page_dirty uses spin_lock_irqsave instead of spin_lock_irq
    arch/x86/mm/numa.c: fix array index overflow when synchronizing nid to memblock.reserved.
    arch/x86/mm/numa.c: initialize numa_kernel_nodes in numa_clear_kernel_node_hotplug()
    mm: __set_page_dirty_nobuffers() uses spin_lock_irqsave() instead of spin_lock_irq()
    mm/swap: fix race on swap_info reuse between swapoff and swapon
    swap: add a simple detector for inappropriate swapin readahead
    ocfs2: free allocated clusters if error occurs after ocfs2_claim_clusters
    Documentation/kernel-parameters.txt: fix memmap= language

    Linus Torvalds
     
  • Using spin_{un}lock_irq is dangerous if the caller has disabled interrupts.
    During aio buffer migration, it is possible to see the following
    call stack.

    aio_migratepage [disable interrupt]
    migrate_page_copy
    clear_page_dirty_for_io
    set_page_dirty
    __set_page_dirty_buffers
    __set_page_dirty
    spin_lock_irq

    This means the current aio migration can deadlock. spin_lock_irqsave
    is a safer alternative and we should use it.

    Signed-off-by: KOSAKI Motohiro
    Reported-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
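
The difference between the two locking variants in the commit above can be modelled by tracking only the interrupt-enable state. This is a toy sketch, not kernel code: the `Cpu` class and `run` helper are hypothetical, the method names are borrowed from the kernel API for readability, and the lock itself is not modelled.

```python
class Cpu:
    """Toy CPU that only tracks whether local interrupts are enabled."""

    def __init__(self):
        self.irqs_enabled = True

    def local_irq_disable(self):
        self.irqs_enabled = False

    def spin_lock_irq(self):
        self.irqs_enabled = False

    def spin_unlock_irq(self):
        # Unconditionally re-enables interrupts, whatever the earlier state was.
        self.irqs_enabled = True

    def spin_lock_irqsave(self):
        flags = self.irqs_enabled  # remember the state on entry
        self.irqs_enabled = False
        return flags

    def spin_unlock_irqrestore(self, flags):
        self.irqs_enabled = flags  # restore the state seen on entry


def run(use_irqsave):
    """Return the interrupt state after the inner critical section ends."""
    cpu = Cpu()
    cpu.local_irq_disable()  # the outer caller has interrupts disabled
    if use_irqsave:
        flags = cpu.spin_lock_irqsave()
        cpu.spin_unlock_irqrestore(flags)
    else:
        cpu.spin_lock_irq()
        cpu.spin_unlock_irq()
    return cpu.irqs_enabled


print(run(use_irqsave=False))  # True: interrupts came back on behind the caller's back
print(run(use_irqsave=True))   # False: the caller's state is preserved
```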
     
  • The following path will cause an array out-of-bounds access.

    memblock_add_region() will always set nid in memblock.reserved to
    MAX_NUMNODES. In numa_register_memblks(), after we set all nid to
    correct values in memblock.reserved, we called setup_node_data(), and
    used memblock_alloc_nid() to allocate memory, with nid set to
    MAX_NUMNODES.

    The nodemask_t type can be seen as a bit array. And the index is 0 ~
    MAX_NUMNODES-1.

    After that, when we call node_set() in numa_clear_kernel_node_hotplug(),
    the nodemask_t got an index of value MAX_NUMNODES, which is out of [0 ~
    MAX_NUMNODES-1].

    See below:

    numa_init()
    |---> numa_register_memblks()
    | |---> memblock_set_node(memory) set correct nid in memblock.memory
    | |---> memblock_set_node(reserved) set correct nid in memblock.reserved
    | |......
    | |---> setup_node_data()
    | |---> memblock_alloc_nid() here, nid is set to MAX_NUMNODES (1024)
    |......
    |---> numa_clear_kernel_node_hotplug()
    |---> node_set() here, we have an index 1024, and overflowed

    This patch moves nid setting to numa_clear_kernel_node_hotplug() to fix
    this problem.

    Reported-by: Dave Jones
    Signed-off-by: Tang Chen
    Tested-by: Gu Zheng
    Cc: David Rientjes
    Tested-by: Dave Jones
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tang Chen
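
The index overflow described above can be reproduced with a small bit-array model. A minimal sketch under stated assumptions: MAX_NUMNODES is 1024 as in the call graph, words are taken to be 64 bits wide, and the `NodeMask` class is hypothetical. The explicit range check exists only in this model, to make the bad index visible.

```python
MAX_NUMNODES = 1024  # value shown in the commit's call graph
BITS_PER_LONG = 64   # assumption: 64-bit words


class NodeMask:
    """Toy bit array with valid indices 0 .. MAX_NUMNODES - 1."""

    def __init__(self):
        self.words = [0] * (MAX_NUMNODES // BITS_PER_LONG)

    def node_set(self, nid):
        word = nid // BITS_PER_LONG
        if word >= len(self.words):
            # This check is part of the model only.
            raise IndexError(f"nid {nid} is outside 0..{MAX_NUMNODES - 1}")
        self.words[word] |= 1 << (nid % BITS_PER_LONG)

    def node_isset(self, nid):
        return bool(self.words[nid // BITS_PER_LONG] >> (nid % BITS_PER_LONG) & 1)


mask = NodeMask()
mask.node_set(MAX_NUMNODES - 1)  # last valid index
print(mask.node_isset(MAX_NUMNODES - 1))  # True

try:
    mask.node_set(MAX_NUMNODES)  # the nid left behind by memblock_alloc_nid()
    overflowed = False
except IndexError:
    overflowed = True
print(overflowed)  # True
```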