Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

06 Feb, 2015

2 commits

8a71cc4d0 quota: Switch ->get_dqblk() and ->set_dqblk() to use bytes as space units ... Browse Code »

commit 14bf61ffe6ac54afcd1e888a4407fe16054483db upstream.

Currently ->get_dqblk() and ->set_dqblk() use struct fs_disk_quota which
tracks space limits and usage in 512-byte blocks. However VFS quotas
track usage in bytes (as some filesystems require that) and we need to
somehow pass this information. Upto now it wasn't a problem because we
didn't do any unit conversion (thus VFS quota routines happily stuck
number of bytes into d_bcount field of struct fd_disk_quota). Only if
you tried to use Q_XGETQUOTA or Q_XSETQLIM for VFS quotas (or Q_GETQUOTA
/ Q_SETQUOTA for XFS quotas), you got bogus results. Hardly anyone
tried this but reportedly some Samba users hit the problem in practice.
So when we want interfaces compatible we need to fix this.

We bite the bullet and define another quota structure used for passing
information from/to ->get_dqblk()/->set_dqblk. It's somewhat sad we have
to have more conversion routines in fs/quota/quota.c and another copying
of quota structure slows down getting of quota information by about 2%
but it seems cleaner than overloading e.g. units of d_bcount to bytes.

Reviewed-by: Christoph Hellwig
Signed-off-by: Jan Kara
Signed-off-by: Greg Kroah-Hartman

Jan Kara
2015-02-06 14:36:09 +0800
da7f8de96 vm: add VM_FAULT_SIGSEGV handling support ... Browse Code »

commit 33692f27597fcab536d7cbbcc8f52905133e4aa7 upstream.

The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
"you should SIGSEGV" error, because the SIGSEGV case was generally
handled by the caller - usually the architecture fault handler.

That results in lots of duplication - all the architecture fault
handlers end up doing very similar "look up vma, check permissions, do
retries etc" - but it generally works. However, there are cases where
the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.

In particular, when accessing the stack guard page, libsigsegv expects a
SIGSEGV. And it usually got one, because the stack growth is handled by
that duplicated architecture fault handler.

However, when the generic VM layer started propagating the error return
from the stack expansion in commit fee7e49d4514 ("mm: propagate error
from stack expansion even for guard page"), that now exposed the
existing VM_FAULT_SIGBUS result to user space. And user space really
expected SIGSEGV, not SIGBUS.

To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
duplicate architecture fault handlers about it. They all already have
the code to handle SIGSEGV, so it's about just tying that new return
value to the existing code, but it's all a bit annoying.

This is the mindless minimal patch to do this. A more extensive patch
would be to try to gather up the mostly shared fault handling logic into
one generic helper routine, and long-term we really should do that
cleanup.

Just from this patch, you can generally see that most architectures just
copied (directly or indirectly) the old x86 way of doing things, but in
the meantime that original x86 model has been improved to hold the VM
semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
"newer" things, so it would be a good idea to bring all those
improvements to the generic case and teach other architectures about
them too.

Reported-and-tested-by: Takashi Iwai
Tested-by: Jan Engelhardt
Acked-by: Heiko Carstens # "s390 still compiles and boots"
Cc: linux-arch@vger.kernel.org
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Linus Torvalds
2015-02-06 14:36:01 +0800

30 Jan, 2015

8 commits

f2efa8653 crypto: prefix module autoloading with "crypto-" ... Browse Code »

commit 5d26a105b5a73e5635eae0629b42fa0a90e07b7b upstream.

This prefixes all crypto module loading with "crypto-" so we never run
the risk of exposing module auto-loading to userspace via a crypto API,
as demonstrated by Mathias Krause:

https://lkml.org/lkml/2013/3/4/70

Signed-off-by: Kees Cook
Signed-off-by: Herbert Xu
Signed-off-by: Greg Kroah-Hartman

Kees Cook
2015-01-30 09:40:46 +0800
c4302b5e4 ACPI / PM: Do not disable wakeup GPEs that have not been enabled ... Browse Code »

commit 175f8e2650f7ca6b33d338be3ccc1c00e89594ea upstream.

In some cases acpi_device_wakeup() may be called to ensure wakeup
power to be off for a given device even though that device's wakeup
GPE has not been enabled so far. It calls acpi_disable_gpe() on a
GPE that's not enabled and this causes ACPICA to return the AE_LIMIT
status code from that call which then is reported as an error by the
ACPICA's debug facilities (if enabled). This may lead to a fair
amount of confusion, so introduce a new ACPI device wakeup flag
to store the wakeup GPE status and avoid disabling wakeup GPEs
that have not been enabled.

Reported-and-tested-by: Venkat Raghavulu
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Greg Kroah-Hartman

Rafael J. Wysocki
2015-01-30 09:40:46 +0800
513c66820 mm: get rid of radix tree gfp mask for pagecache_get_page ... Browse Code »

commit 45f87de57f8fad59302fd263dd81ffa4843b5b24 upstream.

Commit 2457aec63745 ("mm: non-atomically mark page accessed during page
cache allocation where possible") has added a separate parameter for
specifying gfp mask for radix tree allocations.

Not only this is less than optimal from the API point of view because it
is error prone, it is also buggy currently because
grab_cache_page_write_begin is using GFP_KERNEL for radix tree and if
fgp_flags doesn't contain FGP_NOFS (mostly controlled by fs by
AOP_FLAG_NOFS flag) but the mapping_gfp_mask has __GFP_FS cleared then
the radix tree allocation wouldn't obey the restriction and might
recurse into filesystem and cause deadlocks. This is the case for most
filesystems unfortunately because only ext4 and gfs2 are using
AOP_FLAG_NOFS.

Let's simply remove radix_gfp_mask parameter because the allocation
context is same for both page cache and for the radix tree. Just make
sure that the radix tree gets only the sane subset of the mask (e.g. do
not pass __GFP_WRITE).

Long term it is more preferable to convert remaining users of
AOP_FLAG_NOFS to use mapping_gfp_mask instead and simplify this
interface even further.

Reported-by: Dave Chinner
Signed-off-by: Michal Hocko
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Michal Hocko
2015-01-30 09:40:46 +0800
5f4d5404e time: settimeofday: Validate the values of tv from user ... Browse Code »

commit 6ada1fc0e1c4775de0e043e1bd3ae9d065491aa5 upstream.

An unvalidated user input is multiplied by a constant, which can result in
an undefined behaviour for large values. While this is validated later,
we should avoid triggering undefined behaviour.

Cc: Thomas Gleixner
Cc: Ingo Molnar
Signed-off-by: Sasha Levin
[jstultz: include trivial milisecond->microsecond correction noticed
by Andy]
Signed-off-by: John Stultz
Signed-off-by: Greg Kroah-Hartman

Sasha Levin
2015-01-30 09:40:44 +0800
b01f14468 PCI: Add flag for devices where we can't use bus reset ... Browse Code »

commit f331a859e0ee5a898c1f47596eddad4c4f02d657 upstream.

Enable a mechanism for devices to quirk that they do not behave when
doing a PCI bus reset. We require a modest level of spec compliant
behavior in order to do a reset, for instance the device should come
out of reset without throwing errors and PCI config space should be
accessible after reset. This is too much to ask for some devices.

Link: http://lkml.kernel.org/r/20140923210318.498dacbd@dualc.maya.org
Signed-off-by: Alex Williamson
Signed-off-by: Bjorn Helgaas
Signed-off-by: Greg Kroah-Hartman

Alex Williamson
2015-01-30 09:40:44 +0800
9f2e98fd1 PCI: Add pci_claim_bridge_resource() to clip window if necessary ... Browse Code »

commit 8505e729a2f6eb0803ff943a15f133dd10afff3a upstream.

Add pci_claim_bridge_resource() to claim a PCI-PCI bridge window. This is
like regular pci_claim_resource(), except that if we fail to claim the
window, we check to see if we can reduce the size of the window and try
again.

This is for scenarios like this:

pci_bus 0000:00: root bus resource [mem 0xc0000000-0xffffffff]
pci 0000:00:01.0: bridge window [mem 0xbdf00000-0xddefffff 64bit pref]
pci 0000:01:00.0: reg 0x10: [mem 0xc0000000-0xcfffffff pref]

The 00:01.0 window is illegal: it starts before the host bridge window, so
we have to assume the [0xbdf00000-0xbfffffff] region is inaccessible. We
can make it legal by clipping it to [mem 0xc0000000-0xddefffff 64bit pref].

Previously we discarded the 00:01.0 window and tried to reassign that part
of the hierarchy from scratch. That is a problem because Linux doesn't
always assign things optimally. For example, in this case, BIOS put the
01:00.0 device in a prefetchable window below 4GB, but after 5b28541552ef,
Linux puts the prefetchable window above 4GB where the 32-bit 01:00.0
device can't use it.

Clipping the 00:01.0 window is less intrusive than completely reassigning
things and is sufficient to let us use most of the BIOS configuration. Of
course, it's possible that devices below 00:01.0 will no longer fit. If
that's the case, we'll have to reassign things. But that's a separate
problem.

[bhelgaas: changelog, split into separate patch]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=85491
Reported-by: Marek Kordik
Fixes: 5b28541552ef ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources")
Signed-off-by: Yinghai Lu
Signed-off-by: Bjorn Helgaas
Signed-off-by: Greg Kroah-Hartman

Yinghai Lu
2015-01-30 09:40:43 +0800
99bde8fc1 libata: allow sata_sil24 to opt-out of tag ordered submission ... Browse Code »

commit 72dd299d5039a336493993dcc63413cf31d0e662 upstream.

Ronny reports: https://bugzilla.kernel.org/show_bug.cgi?id=87101
"Since commit 8a4aeec8d "libata/ahci: accommodate tag ordered
controllers" the access to the harddisk on the first SATA-port is
failing on its first access. The access to the harddisk on the
second port is working normal.

When reverting the above commit, access to both harddisks is working
fine again."

Maintain tag ordered submission as the default, but allow sata_sil24 to
continue with the old behavior.

Cc: Tejun Heo
Reported-by: Ronny Hegewald
Signed-off-by: Dan Williams
Signed-off-by: Tejun Heo
Signed-off-by: Greg Kroah-Hartman

Dan Williams
2015-01-30 09:40:42 +0800
b47d1db63 can: m_can: tag current CAN FD controllers as non-ISO ... Browse Code »

commit 6cfda7fbebe8a4fd33ea5722fa0212f98f643c35 upstream.

During the CAN FD standardization process within the ISO it turned out that
the failure detection capability has to be improved.

The CAN in Automation organization (CiA) defined the already implemented CAN
FD controllers as 'non-ISO' and the upcoming improved CAN FD controllers as
'ISO' compliant. See at http://www.can-cia.com/index.php?id=1937

Finally there will be three types of CAN FD controllers in the future:

1. ISO compliant (fixed)
2. non-ISO compliant (fixed, like the M_CAN IP v3.0.1 in m_can.c)
3. ISO/non-ISO CAN FD controllers (switchable, like the PEAK USB FD)

So the current M_CAN driver for the M_CAN IP v3.0.1 has to expose its non-ISO
implementation by setting the CAN_CTRLMODE_FD_NON_ISO ctrlmode at startup.
As this bit cannot be switched at configuration time CAN_CTRLMODE_FD_NON_ISO
must not be set in ctrlmode_supported of the current M_CAN driver.

Signed-off-by: Oliver Hartkopp
Signed-off-by: Marc Kleine-Budde
Signed-off-by: Greg Kroah-Hartman

Oliver Hartkopp
2015-01-30 09:40:42 +0800

28 Jan, 2015

4 commits

a27d8a231 genirq: Prevent proc race against freeing of irq descriptors ... Browse Code »

commit c291ee622165cb2c8d4e7af63fffd499354a23be upstream.

Since the rework of the sparse interrupt code to actually free the
unused interrupt descriptors there exists a race between the /proc
interfaces to the irq subsystem and the code which frees the interrupt
descriptor.

CPU0 CPU1
show_interrupts()
desc = irq_to_desc(X);
free_desc(desc)
remove_from_radix_tree();
kfree(desc);
raw_spinlock_irq(&desc->lock);

/proc/interrupts is the only interface which can actively corrupt
kernel memory via the lock access. /proc/stat can only read from freed
memory. Extremly hard to trigger, but possible.

The interfaces in /proc/irq/N/ are not affected by this because the
removal of the proc file is serialized in procfs against concurrent
readers/writers. The removal happens before the descriptor is freed.

For architectures which have CONFIG_SPARSE_IRQ=n this is a non issue
as the descriptor is never freed. It's merely cleared out with the irq
descriptor lock held. So any concurrent proc access will either see
the old correct value or the cleared out ones.

Protect the lookup and access to the irq descriptor in
show_interrupts() with the sparse_irq_lock.

Provide kstat_irqs_usr() which is protecting the lookup and access
with sparse_irq_lock and switch /proc/stat to use it.

Document the existing kstat_irqs interfaces so it's clear that the
caller needs to take care about protection. The users of these
interfaces are either not affected due to SPARSE_IRQ=n or already
protected against removal.

Fixes: 1f5a5b87f78f "genirq: Implement a sane sparse_irq allocator"
Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman

Thomas Gleixner
2015-01-28 00:29:37 +0800
937723cf8 uapi/linux/target_core_user.h: fix headers_install.sh badness ... Browse Code »

commit 3875f15207f9ecb3f24a8e91e7ad196899139595 upstream.

scripts/headers_install.sh will transform __packed to
__attribute__((packed)), so the #ifndef is not necessary.
(and, in fact, it's problematic, because we'll end up with the header
containing:
#ifndef __attribute__((packed))
#define __attribu...
and so forth.)

Signed-off-by: Kyle McMartin
Signed-off-by: Nicholas Bellinger
Signed-off-by: Greg Kroah-Hartman

Kyle McMartin
2015-01-28 00:29:37 +0800
12d5e0bb5 net: Generalize ndo_gso_check to ndo_features_check ... Browse Code »

[ Upstream commit 5f35227ea34bb616c436d9da47fc325866c428f3 ]

GSO isn't the only offload feature with restrictions that
potentially can't be expressed with the current features mechanism.
Checksum is another although it's a general issue that could in
theory apply to anything. Even if it may be possible to
implement these restrictions in other ways, it can result in
duplicate code or inefficient per-packet behavior.

This generalizes ndo_gso_check so that drivers can remove any
features that don't make sense for a given packet, similar to
netif_skb_features(). It also converts existing driver
restrictions to the new format, completing the work that was
done to support tunnel protocols since the issues apply to
checksums as well.

By actually removing features from the set that are used to do
offloading, it solves another problem with the existing
interface. In these cases, GSO would run with the original set
of features and not do anything because it appears that
segmentation is not required.

CC: Tom Herbert
CC: Joe Stringer
CC: Eric Dumazet
CC: Hayes Wang
Signed-off-by: Jesse Gross
Acked-by: Tom Herbert
Fixes: 04ffcb255f22 ("net: Add ndo_gso_check")
Tested-by: Hayes Wang
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

Jesse Gross
2015-01-28 00:29:33 +0800
b7efdb64a in6: fix conflict with glibc ... Browse Code »

[ Upstream commit 6d08acd2d32e3e877579315dc3202d7a5f336d98 ]

Resolve conflicts between glibc definition of IPV6 socket options
and those defined in Linux headers. Looks like earlier efforts to
solve this did not cover all the definitions.

It resolves warnings during iproute2 build.
Please consider for stable as well.

Signed-off-by: Stephen Hemminger
Acked-by: Hannes Frederic Sowa
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman

stephen hemminger
2015-01-28 00:29:33 +0800

16 Jan, 2015

7 commits

c03aed64c mm: propagate error from stack expansion even for guard page ... Browse Code »

commit fee7e49d45149fba60156f5b59014f764d3e3728 upstream.

Jay Foad reports that the address sanitizer test (asan) sometimes gets
confused by a stack pointer that ends up being outside the stack vma
that is reported by /proc/maps.

This happens due to an interaction between RLIMIT_STACK and the guard
page: when we do the guard page check, we ignore the potential error
from the stack expansion, which effectively results in a missing guard
page, since the expected stack expansion won't have been done.

And since /proc/maps explicitly ignores the guard page (commit
d7824370e263: "mm: fix up some user-visible effects of the stack guard
page"), the stack pointer ends up being outside the reported stack area.

This is the minimal patch: it just propagates the error. It also
effectively makes the guard page part of the stack limit, which in turn
measn that the actual real stack is one page less than the stack limit.

Let's see if anybody notices. We could teach acct_stack_growth() to
allow an extra page for a grow-up/grow-down stack in the rlimit test,
but I don't want to add more complexity if it isn't needed.

Reported-and-tested-by: Jay Foad
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Linus Torvalds
2015-01-16 22:59:57 +0800
a78e877e9 mm: protect set_page_dirty() from ongoing truncation ... Browse Code »

commit 2d6d7f98284648c5ed113fe22a132148950b140f upstream.

Tejun, while reviewing the code, spotted the following race condition
between the dirtying and truncation of a page:

__set_page_dirty_nobuffers() __delete_from_page_cache()
if (TestSetPageDirty(page))
page->mapping = NULL
if (PageDirty())
dec_zone_page_state(page, NR_FILE_DIRTY);
dec_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);
if (page->mapping)
account_page_dirtied(page)
__inc_zone_page_state(page, NR_FILE_DIRTY);
__inc_bdi_stat(mapping->backing_dev_info, BDI_RECLAIMABLE);

which results in an imbalance of NR_FILE_DIRTY and BDI_RECLAIMABLE.

Dirtiers usually lock out truncation, either by holding the page lock
directly, or in case of zap_pte_range(), by pinning the mapcount with
the page table lock held. The notable exception to this rule, though,
is do_wp_page(), for which this race exists. However, do_wp_page()
already waits for a locked page to unlock before setting the dirty bit,
in order to prevent a race where clear_page_dirty() misses the page bit
in the presence of dirty ptes. Upgrade that wait to a fully locked
set_page_dirty() to also cover the situation explained above.

Afterwards, the code in set_page_dirty() dealing with a truncation race
is no longer needed. Remove it.

Reported-by: Tejun Heo
Signed-off-by: Johannes Weiner
Acked-by: Kirill A. Shutemov
Reviewed-by: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Greg Kroah-Hartman

Johannes Weiner
2015-01-16 22:59:57 +0800
a4a59e583 Revert "mac80211: Fix accounting of the tailroom-needed counter" ... Browse Code »

commit 1e359a5de861a57aa04d92bb620f52a5c1d7f8b1 upstream.

This reverts commit ca34e3b5c808385b175650605faa29e71e91991b.

It turns out that the p54 and cw2100 drivers assume that there's
tailroom even when they don't say they really need it. However,
there's currently no way for them to explicitly say they do need
it, so for now revert this.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=90331.

Fixes: ca34e3b5c808 ("mac80211: Fix accounting of the tailroom-needed counter")
Reported-by: Christopher Chavez
Bisected-by: Larry Finger
Debugged-by: Christian Lamparter
Signed-off-by: Johannes Berg
Signed-off-by: Greg Kroah-Hartman

Johannes Berg
2015-01-16 22:59:56 +0800
cb64628f0 Drivers: hv: util: make struct hv_do_fcopy match Hyper-V host messages ... Browse Code »

commit 31d4ea1a093fcf668d5f95af44b8d41488bdb7ec upstream.

An attempt to fix fcopy on i586 (bc5a5b0 Drivers: hv: util: Properly pack the data
for file copy functionality) led to a regression on x86_64 (and actually didn't fix
i586 breakage). Fcopy messages from Hyper-V host come in the following format:

struct do_fcopy_hdr | 36 bytes
0000 | 4 bytes
offset | 8 bytes
size | 4 bytes
data | 6144 bytes

On x86_64 struct hv_do_fcopy matched this format without ' __attribute__((packed))'
and on i586 adding ' __attribute__((packed))' to it doesn't change anything. Keep
the structure packed and add padding to match re reality. Tested both i586 and x86_64
on Hyper-V Server 2012 R2.

Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
Signed-off-by: Greg Kroah-Hartman

Vitaly Kuznetsov
2015-01-16 22:59:53 +0800
2ce1eeb34 tracing/sched: Check preempt_count() for current when reading task->state ... Browse Code »

commit aee4e5f3d3abb7a2239dd02f6d8fb173413fd02f upstream.

When recording the state of a task for the sched_switch tracepoint a check of
task_preempt_count() is performed to see if PREEMPT_ACTIVE is set. This is
because, technically, a task being preempted is really in the TASK_RUNNING
state, and that is what should be recorded when tracing a sched_switch,
even if the task put itself into another state (it hasn't scheduled out
in that state yet).

But with the change to use per_cpu preempt counts, the
task_thread_info(p)->preempt_count is no longer used, and instead
task_preempt_count(p) is used.

The problem is that this does not use the current preempt count but a stale
one from a previous sched_switch. The task_preempt_count(p) uses
saved_preempt_count and not preempt_count(). But for tracing sched_switch,
if p is current, we really want preempt_count().

I hit this bug when I was tracing sleep and the call from do_nanosleep()
scheduled out in the "RUNNING" state.

sleep-4290 [000] 537272.259992: sched_switch: sleep:4290 [120] R ==> swapper/0:0 [120]
sleep-4290 [000] 537272.260015: kernel_stack:
=> __schedule (ffffffff8150864a)
=> schedule (ffffffff815089f8)
=> do_nanosleep (ffffffff8150b76c)
=> hrtimer_nanosleep (ffffffff8108d66b)
=> SyS_nanosleep (ffffffff8108d750)
=> return_to_handler (ffffffff8150e8e5)
=> tracesys_phase2 (ffffffff8150c844)

After a bit of hair pulling, I found that the state was really
TASK_INTERRUPTIBLE, but the saved_preempt_count had an old PREEMPT_ACTIVE
set and caused the sched_switch tracepoint to show it as RUNNING.

Link: http://lkml.kernel.org/r/20141210174428.3cb7542a@gandalf.local.home

Acked-by: Ingo Molnar
Cc: Peter Zijlstra
Fixes: 01028747559a "sched: Create more preempt_count accessors"
Signed-off-by: Steven Rostedt
Signed-off-by: Greg Kroah-Hartman

Steven Rostedt (Red Hat)
2015-01-16 22:59:52 +0800
c7ba2d794 pstore-ram: Allow optional mapping with pgprot_noncached ... Browse Code »

commit 027bc8b08242c59e19356b4b2c189f2d849ab660 upstream.

On some ARMs the memory can be mapped pgprot_noncached() and still
be working for atomic operations. As pointed out by Colin Cross
, in some cases you do want to use
pgprot_noncached() if the SoC supports it to see a debug printk
just before a write hanging the system.

On ARMs, the atomic operations on strongly ordered memory are
implementation defined. So let's provide an optional kernel parameter
for configuring pgprot_noncached(), and use pgprot_writecombine() by
default.

Cc: Arnd Bergmann
Cc: Rob Herring
Cc: Randy Dunlap
Cc: Anton Vorontsov
Cc: Colin Cross
Cc: Olof Johansson
Cc: Russell King
Acked-by: Kees Cook
Signed-off-by: Tony Lindgren
Signed-off-by: Tony Luck
Signed-off-by: Greg Kroah-Hartman

Tony Lindgren
2015-01-16 22:59:47 +0800
ad233513d powerpc: add little endian flag to syscall_get_arch() ... Browse Code »

commit 63f13448d81c910a284b096149411a719cbed501 upstream.

Since both ppc and ppc64 have LE variants which are now reported by uname, add
that flag (__AUDIT_ARCH_LE) to syscall_get_arch() and add AUDIT_ARCH_PPC64LE
variant.

Without this, perf trace and auditctl fail.

Mainline kernel reports ppc64le (per a058801) but there is no matching
AUDIT_ARCH_PPC64LE.

Since 32-bit PPC LE is not supported by audit, don't advertise it in
AUDIT_ARCH_PPC* variants.

See:
https://www.redhat.com/archives/linux-audit/2014-August/msg00082.html
https://www.redhat.com/archives/linux-audit/2014-December/msg00004.html

Signed-off-by: Richard Guy Briggs
Acked-by: Paul Moore
Signed-off-by: Michael Ellerman
Signed-off-by: Greg Kroah-Hartman

Richard Guy Briggs
2015-01-16 22:59:46 +0800

09 Jan, 2015

4 commits

3d7c0c1f6 audit: restore AUDIT_LOGINUID unset ABI ... Browse Code »

commit 041d7b98ffe59c59fdd639931dea7d74f9aa9a59 upstream.

A regression was caused by commit 780a7654cee8:
audit: Make testing for a valid loginuid explicit.
(which in turn attempted to fix a regression caused by e1760bd)

When audit_krule_to_data() fills in the rules to get a listing, there was a
missing clause to convert back from AUDIT_LOGINUID_SET to AUDIT_LOGINUID.

This broke userspace by not returning the same information that was sent and
expected.

The rule:
auditctl -a exit,never -F auid=-1
gives:
auditctl -l
LIST_RULES: exit,never f24=0 syscall=all
when it should give:
LIST_RULES: exit,never auid=-1 (0xffffffff) syscall=all

Tag it so that it is reported the same way it was set. Create a new
private flags audit_krule field (pflags) to store it that won't interact with
the public one from the API.

Signed-off-by: Richard Guy Briggs
Signed-off-by: Paul Moore
Signed-off-by: Greg Kroah-Hartman

Richard Guy Briggs
2015-01-09 02:30:27 +0800
4a7215f13 userns: Add a knob to disable setgroups on a per user namespace basis ... Browse Code »

commit 9cc46516ddf497ea16e8d7cb986ae03a0f6b92f8 upstream.

- Expose the knob to user space through a proc file /proc//setgroups

A value of "deny" means the setgroups system call is disabled in the
current processes user namespace and can not be enabled in the
future in this user namespace.

A value of "allow" means the segtoups system call is enabled.

- Descendant user namespaces inherit the value of setgroups from
their parents.

- A proc file is used (instead of a sysctl) as sysctls currently do
not allow checking the permissions at open time.

- Writing to the proc file is restricted to before the gid_map
for the user namespace is set.

This ensures that disabling setgroups at a user namespace
level will never remove the ability to call setgroups
from a process that already has that ability.

A process may opt in to the setgroups disable for itself by
creating, entering and configuring a user namespace or by calling
setns on an existing user namespace with setgroups disabled.
Processes without privileges already can not call setgroups so this
is a noop. Prodcess with privilege become processes without
privilege when entering a user namespace and as with any other path
to dropping privilege they would not have the ability to call
setgroups. So this remains within the bounds of what is possible
without a knob to disable setgroups permanently in a user namespace.

Signed-off-by: "Eric W. Biederman"
Signed-off-by: Greg Kroah-Hartman

Eric W. Biederman
2015-01-09 02:30:26 +0800
d5c3ebc43 userns: Don't allow setgroups until a gid mapping has been setablished ... Browse Code »

commit 273d2c67c3e179adb1e74f403d1e9a06e3f841b5 upstream.

setgroups is unique in not needing a valid mapping before it can be called,
in the case of setgroups(0, NULL) which drops all supplemental groups.

The design of the user namespace assumes that CAP_SETGID can not actually
be used until a gid mapping is established. Therefore add a helper function
to see if the user namespace gid mapping has been established and call
that function in the setgroups permission check.

This is part of the fix for CVE-2014-8989, being able to drop groups
without privilege using user namespaces.

Reviewed-by: Andy Lutomirski
Signed-off-by: "Eric W. Biederman"
Signed-off-by: Greg Kroah-Hartman

Eric W. Biederman
2015-01-09 02:30:25 +0800
e726c9a0a groups: Consolidate the setgroups permission checks ... Browse Code »

commit 7ff4d90b4c24a03666f296c3d4878cd39001e81e upstream.

Today there are 3 instances of setgroups and due to an oversight their
permission checking has diverged. Add a common function so that
they may all share the same permission checking code.

This corrects the current oversight in the current permission checks
and adds a helper to avoid this in the future.

A user namespace security fix will update this new helper, shortly.

Signed-off-by: "Eric W. Biederman"
Signed-off-by: Greg Kroah-Hartman

Eric W. Biederman
2015-01-09 02:30:25 +0800

17 Dec, 2014

1 commit

679829c2e move d_rcu from overlapping d_child to overlapping d_alias ... Browse Code »

commit 946e51f2bf37f1656916eb75bd0742ba33983c28 upstream.

Signed-off-by: Al Viro
Signed-off-by: Greg Kroah-Hartman

Al Viro
2014-12-17 01:39:06 +0800

05 Dec, 2014

1 commit

d0747f10e uapi: fix to export linux/vm_sockets.h ... Browse Code »

A typo "header=y" was introduced by commit 7071cf7fc435 ("uapi: add
missing network related headers to kbuild").

Signed-off-by: Masahiro Yamada
Cc: Stephen Hemminger
Cc: David Miller
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Masahiro Yamada
2014-12-05 07:28:40 +0800

29 Nov, 2014

2 commits

889106387 Merge tag 'staging-3.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging ... Browse Code »

Pull staging/IIO driver fixes from Greg KH:
"Here are some staging and IIO driver fixes for 3.18-rc7 that resolve a
number of reported issues, and a new device id for a staging wireless
driver.

All of these have been in linux-next"

* tag 'staging-3.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: r8188eu: Add new device ID for DLink GO-USB-N150
staging: r8188eu: Fix scheduling while atomic error introduced in commit fadbe0cd
iio: accel: bmc150: set low default thresholds
iio: accel: bmc150: Fix iio_event_spec direction
iio: accel: bmc150: Send x, y and z motion separately
iio: accel: bmc150: Error handling when mode set fails
iio: gyro: bmg160: Fix iio_event_spec direction
iio: gyro: bmg160: Send x, y and z motion separately
iio: gyro: bmg160: Don't let interrupt mode to be open drain
iio: gyro: bmg160: Error handling when mode set fails
iio: adc: men_z188_adc: Add terminating entry for men_z188_ids
iio: accel: kxcjk-1013: Fix kxcjk10013_set_range
iio: Fix IIO_EVENT_CODE_EXTRACT_DIR bit mask

Linus Torvalds
2014-11-29 08:08:09 +0800
16cf45c09 Merge tag 'sound-3.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound ... Browse Code »

Pull sound fixes from Takashi Iwai:
"No excitement, here are only minor fixes: an endian fix for the new
DSD format we added in 3.18, a fix for HP mute LED, and a fix for
Native Instrument quirk"

* tag 'sound-3.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: pcm: Add big-endian DSD sample formats and fix XMOS DSD sample format
ALSA: hda - One more HP machine needs to change mute led quirk
ALSA: usb-audio: Use snd_usb_ctl_msg() for Native Instruments quirk

Linus Torvalds
2014-11-29 05:54:53 +0800

28 Nov, 2014

1 commit

8e8459719 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Pull networking fixes from David Miller:
"Several small fixes here:

1) Don't crash in tg3 driver when the number of tx queues has been
configured to be different from the number of rx queues. From
Thadeu Lima de Souza Cascardo.

2) VLAN filter not disabled properly in promisc mode in ixgbe driver,
from Vlad Yasevich.

3) Fix OOPS on dellink op in VTI tunnel driver, from Xin Long.

4) IPV6 GRE driver WCCP code checks skb->protocol for ETH_P_IP
instead of ETH_P_IPV6, whoops. From Yuri Chislov.

5) Socket matching in ping driver is buggy when packet AF does not
match socket's AF. Fix from Jane Zhou.

6) Fix checksum calculation errors in VXLAN due to where the
udp_tunnel6_xmit_skb() helper gets it's saddr/daddr from. From
Alexander Duyck.

7) Fix 5G detection problem in rtlwifi driver, from Larry Finger.

8) Fix NULL deref in tcp_v{4,6}_send_reset, from Eric Dumazet.

9) Various missing netlink attribute verifications in bridging code,
from Thomas Graf.

10) tcp_recvmsg() unconditionally calls ipv4 ip_recv_error even for
ipv6 sockets, whoops. Fix from Willem de Bruijn"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (29 commits)
net-timestamp: make tcp_recvmsg call ipv6_recv_error for AF_INET6 socks
bridge: Sanitize IFLA_EXT_MASK for AF_BRIDGE:RTM_GETLINK
bridge: Add missing policy entry for IFLA_BRPORT_FAST_LEAVE
net: Check for presence of IFLA_AF_SPEC
net: Validate IFLA_BRIDGE_MODE attribute length
bridge: Validate IFLA_BRIDGE_FLAGS attribute length
stmmac: platform: fix default values of the filter bins setting
net/mlx4_core: Limit count field to 24 bits in qp_alloc_res
net: dsa: bcm_sf2: reset switch prior to initialization
net: dsa: bcm_sf2: fix unmapping registers in case of errors
tg3: fix ring init when there are more TX than RX channels
tcp: fix possible NULL dereference in tcp_vX_send_reset()
rtlwifi: Change order in device startup
rtlwifi: rtl8821ae: Fix 5G detection problem
Revert "netfilter: conntrack: fix race in __nf_conntrack_confirm against get_next_corpse"
vxlan: Fix boolean flip in VXLAN_F_UDP_ZERO_CSUM6_[TX|RX]
ip6_udp_tunnel: Fix checksum calculation
net-timestamp: Fix a documentation typo
net/ping: handle protocol mismatching scenario
af_packet: fix sparse warning
...

Linus Torvalds
2014-11-28 10:05:05 +0800

27 Nov, 2014

2 commits

f4713a3df net-timestamp: make tcp_recvmsg call ipv6_recv_error for AF_INET6 socks ... Browse Code »
13

TCP timestamping introduced MSG_ERRQUEUE handling for TCP sockets.
If the socket is of family AF_INET6, call ipv6_recv_error instead
of ip_recv_error.

This change is more complex than a single branch due to the loadable
ipv6 module. It reuses a pre-existing indirect function call from
ping. The ping code is safe to call, because it is part of the core
ipv6 module and always present when AF_INET6 sockets are active.

Fixes: 4ed2d765 (net-timestamp: TCP timestamping)
Signed-off-by: Willem de Bruijn

----

It may also be worthwhile to add WARN_ON_ONCE(sk->family == AF_INET6)
to ip_recv_error.
Signed-off-by: David S. Miller

Willem de Bruijn
2014-11-27 04:45:04 +0800
8bb9b9a00 Merge tag 'iio-fixes-for-3.18c' of git://git.kernel.org/pub/scm/linux/kernel/git… ... Browse Code »

…/jic23/iio into staging-linus

Jonathan writes:

Third set of IIO fixes for the 3.18 cycle.

Most of these are fairly standard little fixes, a bmc150 and bmg160 patch
is to make an ABI change to indicated a specific axis in an event rather
than the generic option in the original drivers. As both of these drivers
are new in this cycle it would be ideal to push this minor change through
even though it isn't strictly a fix. A couple of other 'fixes' change
defaults for some settings on these new drivers to more intuitive calues.
Looks like some useful feedback has been coming in for this driver
since it was applied.

* IIO_EVENT_CODE_EXTRACT_DIR bit mask was wrong and has been for a while
0xCF clearly doesn't give a contiguous bitmask.
* kxcjk-1013 range setting was failing to mask out the previous value
in the register and hence was 'enable only'.
* men_z188 device id table wasn't null terminated.
* bmg160 and bmc150 both failed to correctly handling an error in mode
setting.
* bmg160 and bmc150 both had a bug in setting the event direction in the
event spec (leads to an attribute name being incorrect)
* bmg160 defaulted to an open drain output for the interrupt - as a default
this obviously only works with some interrupt chips - hence change the
default to push-pull (note this is a new driver so we aren't going to
cause any regressions with this change).
* bmc150 had an unintuitive default for the rate of change (motion detector)
so change it to 0 (new driver so change of default won't cause any
regressions).

Greg Kroah-Hartman
2014-11-27 03:06:36 +0800

26 Nov, 2014

3 commits

d3fccc7ef kvm: fix kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn() ... Browse Code »

This reverts commit 85c8555ff0 ("KVM: check for !is_zero_pfn() in
kvm_is_mmio_pfn()") and renames the function to kvm_is_reserved_pfn.

The problem being addressed by the patch above was that some ARM code
based the memory mapping attributes of a pfn on the return value of
kvm_is_mmio_pfn(), whose name indeed suggests that such pfns should
be mapped as device memory.

However, kvm_is_mmio_pfn() doesn't do quite what it says on the tin,
and the existing non-ARM users were already using it in a way which
suggests that its name should probably have been 'kvm_is_reserved_pfn'
from the beginning, e.g., whether or not to call get_page/put_page on
it etc. This means that returning false for the zero page is a mistake
and the patch above should be reverted.

Signed-off-by: Ard Biesheuvel
Signed-off-by: Paolo Bonzini

Ard Biesheuvel
2014-11-26 21:40:45 +0800
d1ca00078 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc ... Browse Code »

Pull powerpc fixes from Ben Herrenschmidt:
"This series fix a nasty issue with radeon adapters on powerpc servers,
it's all CC'ed stable and has the relevant maintainers ack's/reviews.

Basically, some (radeon) adapters have issues with MSI addresses above
1T (only support 40-bits). We had powerpc specific quirk but it only
listed a specific revision of an adapter that we shipped with our
machines and didn't properly handle the audio function which some
distros enable nowadays.

So we made the quirk generic and fixed both the graphic and audio
drivers properly to use it.

Without that, ppc64 server machines will crash at boot with a radeon
adapter.

Note: This has been brewing for a while, it just needed a last respin
which got delayed due to us moving ozlabs to a new location in town
and other such things taking priority"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/pci: Remove unused force_32bit_msi quirk
powerpc/pseries: Honor the generic "no_64bit_msi" flag
powerpc/powernv: Honor the generic "no_64bit_msi" flag
sound/radeon: Move 64-bit MSI quirk from arch to driver
gpu/radeon: Set flag to indicate broken 64-bit MSI
PCI/MSI: Add device flag indicating that 64-bit MSIs don't work
ALSA: hda - Limit 40bit DMA for AMD HDMI controllers

Linus Torvalds
2014-11-26 10:43:21 +0800
a7e90924e Merge tag 'clk-fixes-for-linus' of https://git.linaro.org/people/mike.turquette/linux ... Browse Code »

Pull clock fixes from Mike Turquette:
"The fixes for the clock framework are all regressions in drivers, plus
a single fix in one of the basic clock templates. No fixes to the
core this time around.

As with most clock driver fixes these run the gamut from fixing a
build warning to fixing wrecked memory timings, with a little USB
tossed in for fun"

* tag 'clk-fixes-for-linus' of https://git.linaro.org/people/mike.turquette/linux:
clk: pxa: fix pxa27x CCCR bit usage
clk-divider: Fix READ_ONLY when divider > 1
clk: qcom: Fix duplicate rbcpr clock name
clk: at91: usb: fix at91sam9x5 recalc, round and set rate
clk: at91: usb: fix at91rm9200 round and set rate

Linus Torvalds
2014-11-26 09:52:56 +0800

24 Nov, 2014

3 commits

f144d1496 PCI/MSI: Add device flag indicating that 64-bit MSIs don't work ... Browse Code »
5

This can be set by quirks/drivers to be used by the architecture code
that assigns the MSI addresses.

We additionally add verification in the core MSI code that the values
assigned by the architecture do satisfy the limitation in order to fail
gracefully if they don't (ie. the arch hasn't been updated to deal with
that quirk yet).

Signed-off-by: Benjamin Herrenschmidt
CC:
Acked-by: Bjorn Helgaas

Benjamin Herrenschmidt
2014-11-24 11:11:34 +0800
9f2e0f637 Merge branch 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu ... Browse Code »

Pull percpu fix from Tejun Heo:
"This contains one patch to fix a race condition which can lead to
percpu_ref using a percpu pointer which is corrupted with a set DEAD
bit. The bug was introduced while separating out the ATOMIC mode flag
from the DEAD flag. The fix is pretty straight forward.

I just committed the patch to the percpu tree but am sending out the
pull request early as I'll be on vacation for a week. The patch
should be fairly safe and while the latency will be higher I'll be
checking emails"

* 'for-3.18-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu-ref: fix DEAD flag contamination of percpu pointer

Linus Torvalds
2014-11-24 03:33:49 +0800
4aab3b5b3 percpu-ref: fix DEAD flag contamination of percpu pointer ... Browse Code »

While decoupling ATOMIC and DEAD flags, f47ad4578461 ("percpu_ref:
decouple switching to percpu mode and reinit") updated
__ref_is_percpu() so that it only tests ATOMIC flag to determine
whether the ref is in percpu mode or not; however, while DEAD implies
ATOMIC, the two flags are set separately during percpu_ref_kill() and
if __ref_is_percpu() races percpu_ref_kill(), it may see DEAD w/o
ATOMIC. Because __ref_is_percpu() returns @ref->percpu_count_ptr
value verbatim as the percpu pointer after testing ATOMIC, the pointer
may now be contaminated with the DEAD flag.

This can be fixed by clearing the flag bits before returning the
pointer which was the fix proposed by Shaohua; however, as DEAD
implies ATOMIC, we can just test for both flags at once and avoid the
explicit masking.

Update __ref_is_percpu() so that it tests that both ATOMIC and DEAD
are clear before returning @ref->percpu_count_ptr as the percpu
pointer.

Signed-off-by: Tejun Heo
Reported-and-Reviewed-by: Shaohua Li
Link: http://lkml.kernel.org/r/995deb699f5b873c45d667df4add3b06f73c2c25.1416638887.git.shli@kernel.org
Fixes: f47ad4578461 ("percpu_ref: decouple switching to percpu mode and reinit")

Tejun Heo
2014-11-24 01:36:06 +0800

22 Nov, 2014

2 commits

8a84e01e1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Pull networking fixes from David Miller:

1) Fix BUG when decrypting empty packets in mac80211, from Ronald Wahl.

2) nf_nat_range is not fully initialized and this is copied back to
userspace, from Daniel Borkmann.

3) Fix read past end of b uffer in netfilter ipset, also from Dan
Carpenter.

4) Signed integer overflow in ipv4 address mask creation helper
inet_make_mask(), from Vincent BENAYOUN.

5) VXLAN, be2net, mlx4_en, and qlcnic need ->ndo_gso_check() methods to
properly describe the device's capabilities, from Joe Stringer.

6) Fix memory leaks and checksum miscalculations in openvswitch, from
Pravin B SHelar and Jesse Gross.

7) FIB rules passes back ambiguous error code for unreachable routes,
making behavior confusing for userspace. Fix from Panu Matilainen.

8) ieee802154fake_probe() doesn't release resources properly on error,
from Alexey Khoroshilov.

9) Fix skb_over_panic in add_grhead(), from Daniel Borkmann.

10) Fix access of stale slave pointers in bonding code, from Nikolay
Aleksandrov.

11) Fix stack info leak in PPP pptp code, from Mathias Krause.

12) Cure locking bug in IPX stack, from Jiri Bohac.

13) Revert SKB fclone memory freeing optimization that is racey and can
allow accesses to freed up memory, from Eric Dumazet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (71 commits)
tcp: Restore RFC5961-compliant behavior for SYN packets
net: Revert "net: avoid one atomic operation in skb_clone()"
virtio-net: validate features during probe
cxgb4 : Fix DCB priority groups being returned in wrong order
ipx: fix locking regression in ipx_sendmsg and ipx_recvmsg
openvswitch: Don't validate IPv6 label masks.
pptp: fix stack info leak in pptp_getname()
brcmfmac: don't include linux/unaligned/access_ok.h
cxgb4i : Don't block unload/cxgb4 unload when remote closes TCP connection
ipv6: delete protocol and unregister rtnetlink when cleanup
net/mlx4_en: Add VXLAN ndo calls to the PF net device ops too
bonding: fix curr_active_slave/carrier with loadbalance arp monitoring
mac80211: minstrel_ht: fix a crash in rate sorting
vxlan: Inline vxlan_gso_check().
can: m_can: update to support CAN FD features
can: m_can: fix incorrect error messages
can: m_can: add missing delay after setting CCCR_INIT bit
can: m_can: fix not set can_dlc for remote frame
can: m_can: fix possible sleep in napi poll
can: m_can: add missing message RAM initialization
...

Linus Torvalds
2014-11-22 09:20:36 +0800
9a7e4f563 Merge tag 'sound-3.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound ... Browse Code »

Pull sound fixes from Takashi Iwai:
"This batch ended up as a relatively high volume due to pending ASoC
fixes. But most of fixes there are trivial and/or device- specific
fixes and quirks, so safe to apply. The only (ASoC) core fixes are
the DPCM race fix and the machine-driver matching fix for
componentization"

* tag 'sound-3.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - fix the mic mute led problem for Latitude E5550
ALSA: hda - move DELL_WMI_MIC_MUTE_LED to the tail in the quirk chain
ASoC: wm_adsp: Avoid attempt to free buffers that might still be in use
ALSA: usb-audio: Set the Control Selector to SU_SELECTOR_CONTROL for UAC2
ALSA: usb-audio: Add ctrl message delay quirk for Marantz/Denon devices
ASoC: sgtl5000: Fix SMALL_POP bit definition
ASoC: cs42l51: re-hook of_match_table pointer
ASoC: rt5670: change dapm routes of PLL connection
ASoC: rt5670: correct the incorrect default values
ASoC: samsung: Add MODULE_DEVICE_TABLE for Snow
ASoC: max98090: Correct pclk divisor settings
ASoC: dpcm: Fix race between FE/BE updates and trigger
ASoC: Fix snd_soc_find_dai() matching component by name
ASoC: rsnd: remove unsupported PAUSE flag
ASoC: fsi: remove unsupported PAUSE flag
ASoC: rt5645: Mark RT5645_TDM_CTRL_3 as readable
ASoC: rockchip-i2s: fix infinite loop in rockchip_snd_rxctrl
ASoC: es8328-i2c: Fix i2c_device_id name field in es8328_id
ASoC: fsl_asrc: Add reg_defaults for regmap to fix kernel dump

Linus Torvalds
2014-11-22 09:11:56 +0800