Eric Lee / smarc-fsl-linux-kernel

10 Feb, 2016

3 commits

2773b0e9b [media] add media controller support to videobuf2-dvb ... Browse Code »

Allow devices to pass an optional argument to register the DVB
driver at the media controller.

Signed-off-by: Mauro Carvalho Chehab

Mauro Carvalho Chehab
2016-02-10 17:23:41 +0800
de3907877 [media] em2xx: use v4l2_mc_create_media_graph() ... Browse Code »

Now that the core has a function to create the media graph,
we can get rid of the specialized code at em28xx.

Signed-off-by: Mauro Carvalho Chehab

Mauro Carvalho Chehab
2016-02-10 17:23:41 +0800
54d0dbac1 [media] v4l2-mc: add a generic function to create the media graph ... Browse Code »

The em28xx_v4l2_create_media_graph() is almost generic enough to
be at the core, as an ancillary function. Make it even more generic,
by getting rid of em28xx-specific code, relying only at the
media_device, in order to discover all entities found on PC-customer's
hardware and add it at the V4L2 core.

Signed-off-by: Mauro Carvalho Chehab

Mauro Carvalho Chehab
2016-02-10 17:23:40 +0800

09 Feb, 2016

1 commit

85e91f80c Merge tag 'v4.5-rc3' into patchwork ... Browse Code »

Linux 4.5-rc3

* tag 'v4.5-rc3': (644 commits)
Linux 4.5-rc3
epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT
radix-tree: fix oops after radix_tree_iter_retry
MAINTAINERS: trim the file triggers for ABI/API
dax: dirty inode only if required
thp: make deferred_split_scan() work again
mm: replace vma_lock_anon_vma with anon_vma_lock_read/write
ocfs2/dlm: clear refmap bit of recovery lock while doing local recovery cleanup
um: asm/page.h: remove the pte_high member from struct pte_t
mm, hugetlb: don't require CMA for runtime gigantic pages
mm/hugetlb: fix gigantic page initialization/allocation
mm: downgrade VM_BUG in isolate_lru_page() to warning
mempolicy: do not try to queue pages from !vma_migratable()
mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress
vmstat: make vmstat_update deferrable
mm, vmstat: make quiet_vmstat lighter
mm/Kconfig: correct description of DEFERRED_STRUCT_PAGE_INIT
memblock: don't mark memblock_phys_mem_size() as __init
dump_stack: avoid potential deadlocks
mm: validate_mm browse_rb SMP race condition
...

Mauro Carvalho Chehab
2016-02-09 18:56:42 +0800

08 Feb, 2016

3 commits

388f7b1d6 Linux 4.5-rc3 Browse Code »

Linus Torvalds
2016-02-08 07:38:30 +0800
c17dfb019 Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc ... Browse Code »

Pull ARM SoC fixes from Olof Johansson:
"The first real batch of fixes for this release cycle, so there are a
few more than usual.

Most of these are fixes and tweaks to board support (DT bugfixes,
etc). I've also picked up a couple of small cleanups that seemed
innocent enough that there was little reason to wait (const/
__initconst and Kconfig deps).

Quite a bit of the changes on OMAP were due to fixes to no longer
write to rodata from assembly when ARM_KERNMEM_PERMS was enabled, but
there were also other fixes.

Kirkwood had a bunch of gpio fixes for some boards. OMAP had RTC
fixes on OMAP5, and Nomadik had changes to MMC parameters in DT.

All in all, mostly the usual mix of various fixes"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (46 commits)
ARM: multi_v7_defconfig: enable DW_WATCHDOG
ARM: nomadik: fix up SD/MMC DT settings
ARM64: tegra: Add chosen node for tegra132 norrin
ARM: realview: use "depends on" instead of "if" after prompt
ARM: tango: use "depends on" instead of "if" after prompt
ARM: tango: use const and __initconst for smp_operations
ARM: realview: use const and __initconst for smp_operations
bus: uniphier-system-bus: revive tristate prompt
arm64: dts: Add missing DMA Abort interrupt to Juno
bus: vexpress-config: Add missing of_node_put
ARM: dts: am57xx: sbc-am57x: correct Eth PHY settings
ARM: dts: am57xx: cl-som-am57x: fix CPSW EMAC pinmux
ARM: dts: am57xx: sbc-am57x: fix UART3 pinmux
ARM: dts: am57xx: cl-som-am57x: update SPI Flash frequency
ARM: dts: am57xx: cl-som-am57x: set HOST mode for USB2
ARM: dts: am57xx: sbc-am57x: fix SB-SOM EEPROM I2C address
ARM: dts: LogicPD Torpedo: Revert Duplicative Entries
ARM: dts: am437x: pixcir_tangoc: use correct flags for irq types
ARM: dts: am4372: fix irq type for arm twd and global timer
ARM: dts: at91: sama5d4 xplained: fix phy0 IRQ type
...

Linus Torvalds
2016-02-08 07:23:20 +0800
63fee123d Merge branch 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration ... Browse Code »

Pull mailbox fixes from Jassi Brar:

- fix getting element from the pcc-channels array by simply indexing
into it

- prevent building mailbox-test driver for archs that don't have IOMEM

* 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
mailbox: Fix dependencies for !HAS_IOMEM archs
mailbox: pcc: fix channel calculation in get_pcc_channel()

Linus Torvalds
2016-02-08 07:17:47 +0800

07 Feb, 2016

2 commits

46df55cee Merge tag 'usb-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb ... Browse Code »

Pull USB fixes from Greg KH:
"Here are some USB fixes for 4.5-rc3.

The usual, xhci fixes for reported issues, combined with some small
gadget driver fixes, and a MAINTAINERS file update. All have been in
linux-next with no reported issues"

* tag 'usb-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
xhci: harden xhci_find_next_ext_cap against device removal
xhci: Fix list corruption in urb dequeue at host removal
usb: host: xhci-plat: fix NULL pointer in probe for device tree case
usb: xhci-mtk: fix AHB bus hang up caused by roothubs polling
usb: xhci-mtk: fix bpkts value of LS/HS periodic eps not behind TT
usb: xhci: apply XHCI_PME_STUCK_QUIRK to Intel Broxton-M platforms
usb: xhci: set SSIC port unused only if xhci_suspend succeeds
usb: xhci: add a quirk bit for ssic port unused
usb: xhci: handle both SSIC ports in PME stuck quirk
usb: dwc3: gadget: set the OTG flag in dwc3 gadget driver.
Revert "xhci: don't finish a TD if we get a short-transfer event mid TD"
MAINTAINERS: fix my email address
usb: dwc2: Fix probe problem on bcm2835
Revert "usb: dwc2: Move reset into dwc2_get_hwparams()"
usb: musb: ux500: Fix NULL pointer dereference at system PM
usb: phy: mxs: declare variable with initialized value
usb: phy: msm: fix error handling in probe.

Linus Torvalds
2016-02-07 14:14:46 +0800
dacd53c80 Merge tag 'staging-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging ... Browse Code »

Pull staging and IIO driver fixes from Greg KH:
"Here are some IIO and staging driver fixes for 4.5-rc3.

All of them, except one, are for IIO drivers, and one is for a speakup
driver fix caused by some earlier patches, to resolve a reported build
failure"

* tag 'staging-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
Staging: speakup: Fix allyesconfig build on mn10300
iio: dht11: Use boottime
iio: ade7753: avoid uninitialized data
iio: pressure: mpl115: fix temperature offset sign
iio: imu: Fix dependencies for !HAS_IOMEM archs
staging: iio: Fix dependencies for !HAS_IOMEM archs
iio: adc: Fix dependencies for !HAS_IOMEM archs
iio: inkern: fix a NULL dereference on error
iio:adc:ti_am335x_adc Fix buffered mode by identifying as software buffer.
iio: light: acpi-als: Report data as processed
iio: dac: mcp4725: set iio name property in sysfs
iio: add HAS_IOMEM dependency to VF610_ADC
iio: add IIO_TRIGGER dependency to STK8BA50
iio: proximity: lidar: correct return value
iio-light: Use a signed return type for ltr501_match_samp_freq()

Linus Torvalds
2016-02-07 14:13:16 +0800

06 Feb, 2016

31 commits

5af9c2e19 Merge branch 'akpm' (patches from Andrew) ... Browse Code »

Merge fixes from Andrew Morton:
"22 fixes"

* emailed patches from Andrew Morton : (22 commits)
epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT
radix-tree: fix oops after radix_tree_iter_retry
MAINTAINERS: trim the file triggers for ABI/API
dax: dirty inode only if required
thp: make deferred_split_scan() work again
mm: replace vma_lock_anon_vma with anon_vma_lock_read/write
ocfs2/dlm: clear refmap bit of recovery lock while doing local recovery cleanup
um: asm/page.h: remove the pte_high member from struct pte_t
mm, hugetlb: don't require CMA for runtime gigantic pages
mm/hugetlb: fix gigantic page initialization/allocation
mm: downgrade VM_BUG in isolate_lru_page() to warning
mempolicy: do not try to queue pages from !vma_migratable()
mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress
vmstat: make vmstat_update deferrable
mm, vmstat: make quiet_vmstat lighter
mm/Kconfig: correct description of DEFERRED_STRUCT_PAGE_INIT
memblock: don't mark memblock_phys_mem_size() as __init
dump_stack: avoid potential deadlocks
mm: validate_mm browse_rb SMP race condition
m32r: fix build failure due to SMP and MMU
...

Linus Torvalds
2016-02-06 12:20:07 +0800
5d6a6a75e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client ... Browse Code »

Pull Ceph fixes from Sage Weil:
"We have a few wire protocol compatibility fixes, ports of a few recent
CRUSH mapping changes, and a couple error path fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
libceph: MOSDOpReply v7 encoding
libceph: advertise support for TUNABLES5
crush: decode and initialize chooseleaf_stable
crush: add chooseleaf_stable tunable
crush: ensure take bucket value is valid
crush: ensure bucket id is valid before indexing buckets array
ceph: fix snap context leak in error path
ceph: checking for IS_ERR instead of NULL

Linus Torvalds
2016-02-06 11:52:57 +0800
9b108828e Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux ... Browse Code »

Pull drm fixes from Dave Airlie:
"Fixes all over the place:

- amdkfd: two static checker fixes
- mst: a bunch of static checker and spec/hw interaction fixes
- amdgpu: fix Iceland hw properly, and some fiji bugs, along with
some write-combining fixes.
- exynos: some regression fixes
- adv7511: fix some EDID reading issues"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (38 commits)
drm/dp/mst: deallocate payload on port destruction
drm/dp/mst: Reverse order of MST enable and clearing VC payload table.
drm/dp/mst: move GUID storage from mgr, port to only mst branch
drm/dp/mst: change MST detection scheme
drm/dp/mst: Calculate MST PBN with 31.32 fixed point
drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil
drm/mst: Add range check for max_payloads during init
drm/mst: Don't ignore the MST PBN self-test result
drm: fix missing reference counting decrease
drm/amdgpu: disable uvd and vce clockgating on Fiji
drm/amdgpu: remove exp hardware support from iceland
drm/amdgpu: load MEC ucode manually on iceland
drm/amdgpu: don't load MEC2 on topaz
drm/amdgpu: drop topaz support from gmc8 module
drm/amdgpu: pull topaz gmc bits into gmc_v7
drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above
drm/amdgpu: iceland use CI based MC IP
drm/amdgpu: move gmc7 support out of CIK dependency
drm/amdgpu/gfx7: enable cp inst/reg error interrupts
drm/amdgpu/gfx8: enable cp inst/reg error interrupts
...

Linus Torvalds
2016-02-06 11:38:15 +0800
22f60701d Merge tag 'pm+acpi-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm ... Browse Code »

Pull power management and ACPI fixes from Rafael Wysocki:
"These are: a fix for a recently introduced false-positive warnings
about PM domain pointers being changed inappropriately (harmless but
annoying), an MCH size workaround quirk for one more platform, a
compiler warning fix (generic power domains framework), an ACPI LPSS
(Intel SoCs) driver fixup and a cleanup of the ACPI CPPC core code.

Specifics:

- PM core fix to avoid false-positive warnings generated when the
pm_domain field is cleared for a device that appears to be bound to
a driver (Rafael Wysocki).

- New MCH size workaround quirk for Intel Haswell-ULT (Josh Boyer).

- Fix for an "unused function" compiler warning in the generic power
domains framework (Ulf Hansson).

- Fixup for the ACPI driver for Intel SoCs (acpi-lpss) to set the PM
domain pointer of a device properly in one place that was
overlooked by a recent PM core update (Andy Shevchenko).

- Removal of a redundant function declaration in the ACPI CPPC core
code (Timur Tabi)"

* tag 'pm+acpi-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: Avoid false-positive warnings in dev_pm_domain_set()
PM / Domains: Silence compiler warning for an unused function
ACPI / CPPC: remove redundant mbox_send_message() declaration
ACPI / LPSS: set PM domain via helper setter
PNP: Add Haswell-ULT to Intel MCH size workaround

Linus Torvalds
2016-02-06 10:11:23 +0800
b6a515c8a epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT ... Browse Code »

In the current implementation of the EPOLLEXCLUSIVE flag (added for
4.5-rc1), if epoll waiters create different POLL* sets and register them
as exclusive against the same target fd, the current implementation will
stop waking any further waiters once it finds the first idle waiter.
This means that waiters could miss wakeups in certain cases.

For example, when we wake up a pipe for reading we do:
wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM); So if
one epoll set or epfd is added to pipe p with POLLIN and a second set
epfd2 is added to pipe p with POLLRDNORM, only epfd may receive the
wakeup since the current implementation will stop after it finds any
intersection of events with a waiter that is blocked in epoll_wait().

We could potentially address this by requiring all epoll waiters that
are added to p be required to pass the same set of POLL* events. IE the
first EPOLL_CTL_ADD that passes EPOLLEXCLUSIVE establishes the set POLL*
flags to be used by any other epfds that are added as EPOLLEXCLUSIVE.
However, I think it might be somewhat confusing interface as we would
have to reference count the number of users for that set, and so
userspace would have to keep track of that count, or we would need a
more involved interface. It also adds some shared state that we'd have
store somewhere. I don't think anybody will want to bloat
__wait_queue_head for this.

I think what we could do instead, is to simply restrict EPOLLEXCLUSIVE
such that it can only be specified with EPOLLIN and/or EPOLLOUT. So
that way if the wakeup includes 'POLLIN' and not 'POLLOUT', we can stop
once we hit the first idle waiter that specifies the EPOLLIN bit, since
any remaining waiters that only have 'POLLOUT' set wouldn't need to be
woken. Likewise, we can do the same thing if 'POLLOUT' is in the wakeup
bit set and not 'POLLIN'. If both 'POLLOUT' and 'POLLIN' are set in the
wake bit set (there is at least one example of this I saw in fs/pipe.c),
then we just wake the entire exclusive list. Having both 'POLLOUT' and
'POLLIN' both set should not be on any performance critical path, so I
think that's ok (in fs/pipe.c its in pipe_release()). We also continue
to include EPOLLERR and EPOLLHUP by default in any exclusive set. Thus,
the user can specify EPOLLERR and/or EPOLLHUP but is not required to do
so.

Since epoll waiters may be interested in other events as well besides
EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP, these can still be added by
doing a 'dup' call on the target fd and adding that as one normally
would with EPOLL_CTL_ADD. Since I think that the POLLIN and POLLOUT
events are what we are interest in balancing, I think that the 'dup'
thing could perhaps be added to only one of the waiter threads.
However, I think that EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP should be
sufficient for the majority of use-cases.

Since EPOLLEXCLUSIVE is intended to be used with a target fd shared
among multiple epfds, where between 1 and n of the epfds may receive an
event, it does not satisfy the semantics of EPOLLONESHOT where only 1
epfd would get an event. Thus, it is not allowed to be specified in
conjunction with EPOLLEXCLUSIVE.

EPOLL_CTL_MOD is also not allowed if the fd was previously added as
EPOLLEXCLUSIVE. It seems with the limited number of flags to not be as
interesting, but this could be relaxed at some further point.

Signed-off-by: Jason Baron
Tested-by: Madars Vitolins
Cc: Michael Kerrisk
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Al Viro
Cc: Eric Wong
Cc: Jonathan Corbet
Cc: Andy Lutomirski
Cc: Hagen Paul Pfeifer
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jason Baron
2016-02-06 10:10:40 +0800
732042821 radix-tree: fix oops after radix_tree_iter_retry ... Browse Code »

Helper radix_tree_iter_retry() resets next_index to the current index.
In following radix_tree_next_slot current chunk size becomes zero. This
isn't checked and it tries to dereference null pointer in slot.

Tagged iterator is fine because retry happens only at slot 0 where tag
bitmask in iter->tags is filled with single bit.

Fixes: 46437f9a554f ("radix-tree: fix race in gang lookup")
Signed-off-by: Konstantin Khlebnikov
Cc: Matthew Wilcox
Cc: Hugh Dickins
Cc: Ohad Ben-Cohen
Cc: Jeremiah Mahler
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Konstantin Khlebnikov
2016-02-06 10:10:40 +0800
b14fd334f MAINTAINERS: trim the file triggers for ABI/API ... Browse Code »

Commit ea8f8fc8631 ("MAINTAINERS: add linux-api for review of API/ABI
changes") added file triggers for various paths that likely indicated
API/ABI changes. However, catching all changes in Documentation/ABI/
and include/uapi/ produces a large volume of mail to linux-api, rather
than only API/ABI changes. Drop those two entries, but leave
include/linux/syscalls.h and kernel/sys_ni.c to catch syscall-related
changes.

[josh@joshtriplett.org: redid changelog]
Signed-off-by: Michael Kerrisk
Acked-by: Shuah khan
Cc: Josh Triplett
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michael Kerrisk (man-pages)
2016-02-06 10:10:40 +0800
d2b2a28e6 dax: dirty inode only if required ... Browse Code »

Signed-off-by: Dmitry Monakhov
Reviewed-by: Jan Kara
Reviewed-by: Ross Zwisler
Cc: Matthew Wilcox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Dmitry Monakhov
2016-02-06 10:10:40 +0800
ae026204a thp: make deferred_split_scan() work again ... Browse Code »

We need to iterate over split_queue, not local empty list to get
anything split from the shrinker.

Fixes: e3ae19535c66 ("thp: limit number of object to scan on deferred_split_scan()")
Signed-off-by: Kirill A. Shutemov
Cc: Andrea Arcangeli
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kirill A. Shutemov
2016-02-06 10:10:40 +0800
12352d3ca mm: replace vma_lock_anon_vma with anon_vma_lock_read/write ... Browse Code »

Sequence vma_lock_anon_vma() - vma_unlock_anon_vma() isn't safe if
anon_vma appeared between lock and unlock. We have to check anon_vma
first or call anon_vma_prepare() to be sure that it's here. There are
only few users of these legacy helpers. Let's get rid of them.

This patch fixes anon_vma lock imbalance in validate_mm(). Write lock
isn't required here, read lock is enough.

And reorders expand_downwards/expand_upwards: security_mmap_addr() and
wrapping-around check don't have to be under anon vma lock.

Link: https://lkml.kernel.org/r/CACT4Y+Y908EjM2z=706dv4rV6dWtxTLK9nFg9_7DhRMLppBo2g@mail.gmail.com
Signed-off-by: Konstantin Khlebnikov
Reported-by: Dmitry Vyukov
Acked-by: Kirill A. Shutemov
Cc: Andrea Arcangeli
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Konstantin Khlebnikov
2016-02-06 10:10:40 +0800
c95a51807 ocfs2/dlm: clear refmap bit of recovery lock while doing local recovery cleanup ... Browse Code »

When recovery master down, dlm_do_local_recovery_cleanup() only remove
the $RECOVERY lock owned by dead node, but do not clear the refmap bit.
Which will make umount thread falling in dead loop migrating $RECOVERY
to the dead node.

Signed-off-by: xuejiufei
Reviewed-by: Joseph Qi
Cc: Mark Fasheh
Cc: Joel Becker
Cc: Junxiao Bi
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

xuejiufei
2016-02-06 10:10:40 +0800
012a4163b um: asm/page.h: remove the pte_high member from struct pte_t ... Browse Code »

Commit 16da306849d0 ("um: kill pfn_t") introduced a compile warning for
defconfig (SUBARCH=i386):

arch/um/kernel/skas/mmu.c:38:206:
warning: right shift count >= width of type [-Wshift-count-overflow]

Aforementioned patch changes the definition of the phys_to_pfn() macro
from

((pfn_t) ((p) >> PAGE_SHIFT))

to

((p) >> PAGE_SHIFT)

This effectively changes the phys_to_pfn() expansion's type from
unsigned long long to unsigned long.

Through the callchain init_stub_pte() => mk_pte(), the expansion of
phys_to_pfn() is (indirectly) fed into the 'phys' argument of the
pte_set_val(pte, phys, prot) macro, eventually leading to

(pte).pte_high = (phys) >> 32;

This results in the warning from above.

Since UML only deals with 32 bit addresses, the upper 32 bits from
'phys' used to be always zero anyway. Also, all page protection flags
defined by UML don't use any bits beyond bit 9. Since the contents of a
PTE are defined within architecture scope only, the ->pte_high member
can be safely removed.

Remove the ->pte_high member from struct pte_t.
Rename ->pte_low to ->pte.
Adapt the pte helper macros in arch/um/include/asm/page.h.

Noteworthy is the pte_copy() macro where a smp_wmb() gets dropped. This
write barrier doesn't seem to be paired with any read barrier though and
thus, was useless anyway.

Fixes: 16da306849d0 ("um: kill pfn_t")
Signed-off-by: Nicolai Stange
Cc: Dan Williams
Cc: Richard Weinberger
Cc: Nicolai Stange
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nicolai Stange
2016-02-06 10:10:40 +0800
080fe2068 mm, hugetlb: don't require CMA for runtime gigantic pages ... Browse Code »

Commit 944d9fec8d7a ("hugetlb: add support for gigantic page allocation
at runtime") has added the runtime gigantic page allocation via
alloc_contig_range(), making this support available only when CONFIG_CMA
is enabled. Because it doesn't depend on MIGRATE_CMA pageblocks and the
associated infrastructure, it is possible with few simple adjustments to
require only CONFIG_MEMORY_ISOLATION instead of full CONFIG_CMA.

After this patch, alloc_contig_range() and related functions are
available and used for gigantic pages with just CONFIG_MEMORY_ISOLATION
enabled. Note CONFIG_CMA selects CONFIG_MEMORY_ISOLATION. This allows
supporting runtime gigantic pages without the CMA-specific checks in
page allocator fastpaths.

Signed-off-by: Vlastimil Babka
Cc: Luiz Capitulino
Cc: Kirill A. Shutemov
Cc: Zhang Yanfei
Cc: Yasuaki Ishimatsu
Cc: Joonsoo Kim
Cc: Naoya Horiguchi
Cc: Mel Gorman
Cc: Davidlohr Bueso
Cc: Hillf Danton
Cc: Mike Kravetz
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vlastimil Babka
2016-02-06 10:10:40 +0800
b4330afbe mm/hugetlb: fix gigantic page initialization/allocation ... Browse Code »

Attempting to preallocate 1G gigantic huge pages at boot time with
"hugepagesz=1G hugepages=1" on the kernel command line will prevent
booting with the following:

kernel BUG at mm/hugetlb.c:1218!

When mapcount accounting was reworked, the setting of
compound_mapcount_ptr in prep_compound_gigantic_page was overlooked. As
a result, the validation of mapcount in free_huge_page fails.

The "BUG_ON" checks in free_huge_page were also changed to
"VM_BUG_ON_PAGE" to assist with debugging.

Fixes: 53f9263baba69 ("mm: rework mapcount accounting to enable 4k mapping of THPs")
Signed-off-by: Mike Kravetz
Signed-off-by: Naoya Horiguchi
Acked-by: Kirill A. Shutemov
Acked-by: David Rientjes
Tested-by: Vlastimil Babka
Cc: "Aneesh Kumar K.V"
Cc: Jerome Marchand
Cc: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Mike Kravetz
2016-02-06 10:10:40 +0800
cf2a82ee4 mm: downgrade VM_BUG in isolate_lru_page() to warning ... Browse Code »

Calling isolate_lru_page() is wrong and shouldn't happen, but it not
nessesary fatal: the page just will not be isolated if it's not on LRU.

Let's downgrade the VM_BUG_ON_PAGE() to WARN_RATELIMIT().

Signed-off-by: Kirill A. Shutemov
Cc: Dmitry Vyukov
Cc: Vlastimil Babka
Cc: David Rientjes
Cc: Naoya Horiguchi
Acked-by: Michal Hocko
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kirill A. Shutemov
2016-02-06 10:10:40 +0800
77bf45e78 mempolicy: do not try to queue pages from !vma_migratable() ... Browse Code »

Maybe I miss some point, but I don't see a reason why we try to queue
pages from non migratable VMAs.

This testcase steps on VM_BUG_ON_PAGE() in isolate_lru_page():

#include
#include
#include
#include
#include

#define SIZE 0x2000

int foo;

int main()
{
int fd;
char *p;
unsigned long mask = 2;

fd = open("/dev/sg0", O_RDWR);
p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
/* Faultin pages */
foo = p[0] + p[0x1000];
mbind(p, SIZE, MPOL_BIND, &mask, 4, MPOL_MF_MOVE | MPOL_MF_STRICT);
return 0;
}

The only case when we can queue pages from such VMA is MPOL_MF_STRICT
plus MPOL_MF_MOVE or MPOL_MF_MOVE_ALL for VMA which has pages on LRU,
but gfp mask is not sutable for migaration (see mapping_gfp_mask() check
in vma_migratable()). That's looks like a bug to me.

Let's filter out non-migratable vma at start of queue_pages_test_walk()
and go to queue_pages_pte_range() only if MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL flag is set.

Signed-off-by: Kirill A. Shutemov
Signed-off-by: Dmitry Vyukov
Cc: Vlastimil Babka
Cc: David Rientjes
Cc: Naoya Horiguchi
Cc: Michal Hocko
Cc: Johannes Weiner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Kirill A. Shutemov
2016-02-06 10:10:40 +0800
564e81a57 mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress ... Browse Code »

Jan Stancek has reported that system occasionally hanging after "oom01"
testcase from LTP triggers OOM. Guessing from a result that there is a
kworker thread doing memory allocation and the values between "Node 0
Normal free:" and "Node 0 Normal:" differs when hanging, vmstat is not
up-to-date for some reason.

According to commit 373ccbe59270 ("mm, vmstat: allow WQ concurrency to
discover memory reclaim doesn't make any progress"), it meant to force
the kworker thread to take a short sleep, but it by error used
schedule_timeout(1). We missed that schedule_timeout() in state
TASK_RUNNING doesn't do anything.

Fix it by using schedule_timeout_uninterruptible(1) which forces the
kworker thread to take a short sleep in order to make sure that vmstat
is up-to-date.

Fixes: 373ccbe59270 ("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress")
Signed-off-by: Tetsuo Handa
Reported-by: Jan Stancek
Acked-by: Michal Hocko
Cc: Tejun Heo
Cc: Cristopher Lameter
Cc: Joonsoo Kim
Cc: Arkadiusz Miskiewicz
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Tetsuo Handa
2016-02-06 10:10:40 +0800
ccde8bd40 vmstat: make vmstat_update deferrable ... Browse Code »

Commit 0eb77e988032 ("vmstat: make vmstat_updater deferrable again and
shut down on idle") made vmstat_shepherd deferrable. vmstat_update
itself is still useing standard timer which might interrupt idle task.
This is possible because "mm, vmstat: make quiet_vmstat lighter" removed
cancel_delayed_work from the quiet_vmstat.

Change vmstat_work to use DEFERRABLE_WORK to prevent from pointless
wakeups from the idle context.

Acked-by: Christoph Lameter
Signed-off-by: Michal Hocko
Cc: Mike Galbraith
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michal Hocko
2016-02-06 10:10:40 +0800
f01f17d37 mm, vmstat: make quiet_vmstat lighter ... Browse Code »

Mike has reported a considerable overhead of refresh_cpu_vm_stats from
the idle entry during pipe test:

12.89% [kernel] [k] refresh_cpu_vm_stats.isra.12
4.75% [kernel] [k] __schedule
4.70% [kernel] [k] mutex_unlock
3.14% [kernel] [k] __switch_to

This is caused by commit 0eb77e988032 ("vmstat: make vmstat_updater
deferrable again and shut down on idle") which has placed quiet_vmstat
into cpu_idle_loop. The main reason here seems to be that the idle
entry has to get over all zones and perform atomic operations for each
vmstat entry even though there might be no per cpu diffs. This is a
pointless overhead for _each_ idle entry.

Make sure that quiet_vmstat is as light as possible.

First of all it doesn't make any sense to do any local sync if the
current cpu is already set in oncpu_stat_off because vmstat_update puts
itself there only if there is nothing to do.

Then we can check need_update which should be a cheap way to check for
potential per-cpu diffs and only then do refresh_cpu_vm_stats.

The original patch also did cancel_delayed_work which we are not doing
here. There are two reasons for that. Firstly cancel_delayed_work from
idle context will blow up on RT kernels (reported by Mike):

CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.5.0-rt3 #7
Hardware name: MEDION MS-7848/MS-7848, BIOS M7848W08.20C 09/23/2013
Call Trace:
dump_stack+0x49/0x67
___might_sleep+0xf5/0x180
rt_spin_lock+0x20/0x50
try_to_grab_pending+0x69/0x240
cancel_delayed_work+0x26/0xe0
quiet_vmstat+0x75/0xa0
cpu_idle_loop+0x38/0x3e0
cpu_startup_entry+0x13/0x20
start_secondary+0x114/0x140

And secondly, even on !RT kernels it might add some non trivial overhead
which is not necessary. Even if the vmstat worker wakes up and preempts
idle then it will be most likely a single shot noop because the stats
were already synced and so it would end up on the oncpu_stat_off anyway.
We just need to teach both vmstat_shepherd and vmstat_update to stop
scheduling the worker if there is nothing to do.

[mgalbraith@suse.de: cancel pending work of the cpu_stat_off CPU]
Signed-off-by: Michal Hocko
Reported-by: Mike Galbraith
Acked-by: Christoph Lameter
Signed-off-by: Mike Galbraith
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michal Hocko
2016-02-06 10:10:40 +0800
1ce221036 mm/Kconfig: correct description of DEFERRED_STRUCT_PAGE_INIT ... Browse Code »

The description mentions kswapd threads, while the deferred struct page
initialization is actually done by one-off "pgdatinitX" threads.

Fix the description so that potentially users are not confused about
pgdatinit threads using CPU after boot instead of kswapd.

Signed-off-by: Vlastimil Babka
Acked-by: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Vlastimil Babka
2016-02-06 10:10:40 +0800
1f1ffb8a1 memblock: don't mark memblock_phys_mem_size() as __init ... Browse Code »

At the moment memblock_phys_mem_size() is marked as __init, and so is
discarded after boot. This is different from most of the memblock
functions which are marked __init_memblock, and are only discarded after
boot if memory hotplug is not configured.

To allow for upcoming code which will need memblock_phys_mem_size() in
the hotplug path, change it from __init to __init_memblock.

Signed-off-by: David Gibson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Gibson
2016-02-06 10:10:40 +0800
d7ce36924 dump_stack: avoid potential deadlocks ... Browse Code »

Some servers experienced fatal deadlocks because of a combination of
bugs, leading to multiple cpus calling dump_stack().

The checksumming bug was fixed in commit 34ae6a1aa054 ("ipv6: update
skb->csum when CE mark is propagated").

The second problem is a faulty locking in dump_stack()

CPU1 runs in process context and calls dump_stack(), grabs dump_lock.

CPU2 receives a TCP packet under softirq, grabs socket spinlock, and
call dump_stack() from netdev_rx_csum_fault().

dump_stack() spins on atomic_cmpxchg(&dump_lock, -1, 2), since
dump_lock is owned by CPU1

While dumping its stack, CPU1 is interrupted by a softirq, and happens
to process a packet for the TCP socket locked by CPU2.

CPU1 spins forever in spin_lock() : deadlock

Stack trace on CPU1 looked like :

NMI backtrace for cpu 1
RIP: _raw_spin_lock+0x25/0x30
...
Call Trace:

tcp_v6_rcv+0x243/0x620
ip6_input_finish+0x11f/0x330
ip6_input+0x38/0x40
ip6_rcv_finish+0x3c/0x90
ipv6_rcv+0x2a9/0x500
process_backlog+0x461/0xaa0
net_rx_action+0x147/0x430
__do_softirq+0x167/0x2d0
call_softirq+0x1c/0x30
do_softirq+0x3f/0x80
irq_exit+0x6e/0xc0
smp_call_function_single_interrupt+0x35/0x40
call_function_single_interrupt+0x6a/0x70

printk+0x4d/0x4f
printk_address+0x31/0x33
print_trace_address+0x33/0x3c
print_context_stack+0x7f/0x119
dump_trace+0x26b/0x28e
show_trace_log_lvl+0x4f/0x5c
show_stack_log_lvl+0x104/0x113
show_stack+0x42/0x44
dump_stack+0x46/0x58
netdev_rx_csum_fault+0x38/0x3c
__skb_checksum_complete_head+0x6e/0x80
__skb_checksum_complete+0x11/0x20
tcp_rcv_established+0x2bd5/0x2fd0
tcp_v6_do_rcv+0x13c/0x620
sk_backlog_rcv+0x15/0x30
release_sock+0xd2/0x150
tcp_recvmsg+0x1c1/0xfc0
inet_recvmsg+0x7d/0x90
sock_recvmsg+0xaf/0xe0
___sys_recvmsg+0x111/0x3b0
SyS_recvmsg+0x5c/0xb0
system_call_fastpath+0x16/0x1b

Fixes: b58d977432c8 ("dump_stack: serialize the output from dump_stack()")
Signed-off-by: Eric Dumazet
Cc: Alex Thorlton
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Eric Dumazet
2016-02-06 10:10:40 +0800
acf128d04 mm: validate_mm browse_rb SMP race condition ... Browse Code »

The mmap_sem for reading in validate_mm called from expand_stack is not
enough to prevent the argumented rbtree rb_subtree_gap information to
change from under us because expand_stack may be running from other
threads concurrently which will hold the mmap_sem for reading too.

The argumented rbtree is updated with vma_gap_update under the
page_table_lock so use it in browse_rb() too to avoid false positives.

Signed-off-by: Andrea Arcangeli
Reported-by: Dmitry Vyukov
Tested-by: Dmitry Vyukov
Cc: Konstantin Khlebnikov
Cc: Oleg Nesterov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Andrea Arcangeli
2016-02-06 10:10:40 +0800
af1ddcb5c m32r: fix build failure due to SMP and MMU ... Browse Code »

One of the randconfig build failed with the error:

arch/m32r/kernel/smp.c: In function 'smp_flush_tlb_mm':
arch/m32r/kernel/smp.c:283:20: error: subscripted value is neither array nor pointer nor vector
mmc = &mm->context[cpu_id];
^
arch/m32r/kernel/smp.c: In function 'smp_flush_tlb_page':
arch/m32r/kernel/smp.c:353:20: error: subscripted value is neither array nor pointer nor vector
mmc = &mm->context[cpu_id];
^
arch/m32r/kernel/smp.c: In function 'smp_invalidate_interrupt':
arch/m32r/kernel/smp.c:479:41: error: subscripted value is neither array nor pointer nor vector
unsigned long *mmc = &flush_mm->context[cpu_id];

It turned out that CONFIG_SMP was defined but CONFIG_MMU was not
defined. But arch/m32r/include/asm/mmu.h only defines mm_context_t as
an array when both CONFIG_SMP and CONFIG_MMU are defined. And
arch/m32r/kernel/smp.c is always using context as an array. So without
MMU SMP can not work.

Signed-off-by: Sudip Mukherjee
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sudip Mukherjee
2016-02-06 10:10:40 +0800
9c5a05bc3 block: fix pfn_mkwrite() DAX fault handler ... Browse Code »

Previously the pfn_mkwrite() fault handler for raw block devices called
bldev_dax_fault() -> __dax_fault() to do a full DAX page fault.

Really what the pfn_mkwrite() fault handler needs to do is call
dax_pfn_mkwrite() to make sure that the radix tree entry for the given
PTE is marked as dirty so that a follow-up fsync or msync call will
flush it durably to media.

Fixes: 5a023cdba50c ("block: enable dax for raw block devices")
Signed-off-by: Ross Zwisler
Cc: Alexander Viro
Cc: Dan Williams
Cc: Dave Chinner
Reviewed-by: Jan Kara
Cc: Matthew Wilcox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Ross Zwisler
2016-02-06 10:10:40 +0800
823dd3224 signals: avoid random wakeups in sigsuspend() ... Browse Code »

A random wakeup can get us out of sigsuspend() without TIF_SIGPENDING
being set.

Avoid that by making sure we were signaled, like sys_pause() does.

Signed-off-by: Sasha Levin
Acked-by: Oleg Nesterov
Acked-by: Peter Zijlstra (Intel)
Cc: Dmitry Vyukov
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Sasha Levin
2016-02-06 10:10:40 +0800
79e2f8dd5 Merge branches 'pm-core' and 'pm-domains' ... Browse Code »

* pm-core:
PM: Avoid false-positive warnings in dev_pm_domain_set()
ACPI / LPSS: set PM domain via helper setter

* pm-domains:
PM / Domains: Silence compiler warning for an unused function

Rafael J. Wysocki
2016-02-06 07:34:01 +0800
929823040 Merge branches 'pnp' and 'acpi-cppc' ... Browse Code »

* pnp:
PNP: Add Haswell-ULT to Intel MCH size workaround

* acpi-cppc:
ACPI / CPPC: remove redundant mbox_send_message() declaration

Rafael J. Wysocki
2016-02-06 07:33:52 +0800
dd6f86af3 Merge tag 'media/v4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media ... Browse Code »

Pull media fixes from Mauro Carvalho Chehab:
- vb2: fix a vb2_thread regression and DVB read() breakages
- vsp1: fix compilation and links creation
- s5k6a3: Fix VIDIOC_SUBDEV_G_FMT ioctl for TRY format
- exynos4-is: fix a build issue, format negotiation and sensor detection
- Fix a regression with pvrusb2 and ir-kbd-i2c
- atmel-isi: fix debug message which only show the first format
- tda1004x: fix a tuning bug if G_PROPERTY is called too early
- saa7134-alsa: fix a bug at device unbinding/driver removal
- Fix build of one driver if !HAS_DMA
- soc_camera: cleanup control device on async_unbind

* tag 'media/v4.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] saa7134-alsa: Only frees registered sound cards
[media] vb2-core: call threadio->fnc() if !VB2_BUF_STATE_ERROR
[media] vb2: fix nasty vb2_thread regression
[media] tda1004x: only update the frontend properties if locked
[media] media: i2c: Don't export ir-kbd-i2c module alias
[media] exynos4-is: make VIDEO_SAMSUNG_EXYNOS4_IS tristate
[media] media: Kconfig: add dependency of HAS_DMA
[media] exynos4-is: Wait for 100us before opening sensor
[media] exynos4-is: Open shouldn't fail when sensor entity is not linked
[media] s5k6a3: Fix VIDIOC_SUBDEV_G_FMT ioctl for TRY format
[media] exynos4-is: fix a format string bug
[media] drivers/media: vsp1_video: fix compile error
[media] atmel-isi: fix debug message which only show the first format
[media] soc_camera: cleanup control device on async_unbind
[media] v4l: vsp1: Fix wrong entities links creation

Linus Torvalds
2016-02-06 04:46:38 +0800
ea5a273c7 Merge tag 'sound-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound ... Browse Code »

Pull sound fixes from Takashi Iwai:
"This was a busy week and I had to prepare a pile of duct tapes for the
bugs reported by syzkaller fuzzer in wide range of ALSA core APIs:
timer, rawmidi, sequencer, and PCM OSS emulation. Let's see how many
other holes we need to plug.

Besides that, a few usual boring stuff, HD- and USB-audio quirks, have
been added"

* tag 'sound-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: timer: Fix leftover link at closing
ALSA: seq: Fix lockdep warnings due to double mutex locks
ALSA: rawmidi: Fix race at copying & updating the position
ALSA: rawmidi: Make snd_rawmidi_transmit() race-free
ALSA: hda - Add fixup for Mac Mini 7,1 model
ALSA: hda/realtek - Support headset mode for ALC225
ALSA: hda/realtek - Support Dell headset mode for ALC225
ALSA: hda/realtek - New codec support of ALC225
ALSA: timer: Sync timer deletion at closing the system timer
ALSA: timer: Fix link corruption due to double start or stop
ALSA: seq: Fix yet another races among ALSA timer accesses
ALSA: pcm: Fix potential deadlock in OSS emulation
ALSA: rawmidi: Remove kernel WARNING for NULL user-space buffer check
ALSA: seq: Fix race at closing in virmidi driver
ALSA: emu10k1: correctly handling failed thread creation
ALSA: usb-audio: Add quirk for Microsoft LifeCam HD-6000
ALSA: usb-audio: Add native DSD support for PS Audio NuWave DAC
ALSA: usb-audio: Fix OPPO HA-1 vendor ID

Linus Torvalds
2016-02-06 04:34:44 +0800
ed1741b77 Merge git://www.linux-watchdog.org/linux-watchdog ... Browse Code »

Pull watchdog fixes from Wim Van Sebroeck:
"This fixes several Kconfig dependencies, a compilation warning in
pcwd_usb, a failure to abort the sp805 wdt after a ping and the
max63xx wdt's MODULE_LICENSE"

* git://www.linux-watchdog.org/linux-watchdog:
watchdog: Fix dependencies for !HAS_IOMEM archs
watchdog: imgdpc: select WATCHDOG_CORE
watchdog: tango: rename ARCH_TANGOX to ARCH_TANGO
watchdog: pcwd_usb: fix compilation warning
watchdog: sp805: ping fails to abort wdt reset
watchdog: max63xx: make module's license marker match the header

Linus Torvalds
2016-02-06 03:20:15 +0800