Eric Lee / smarc-ti-linux-kernel | Embedian Git Server

10 Jan, 2015

1 commit

aa9291355 Merge tag 'for_linus-3.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb ... Browse Code »

Pull kgdb/kdb fixes from Jason Wessel:
"These have been around since 3.17 and in kgdb-next for the last 9
weeks and some will go back to -stable.

Summary of changes:

Cleanups
- kdb: Remove unused command flags, repeat flags and KDB_REPEAT_NONE

Fixes
- kgdb/kdb: Allow access on a single core, if a CPU round up is
deemed impossible, which will allow inspection of the now "trashed"
kernel
- kdb: Add enable mask for the command groups
- kdb: access controls to restrict sensitive commands"

* tag 'for_linus-3.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
kernel/debug/debug_core.c: Logging clean-up
kgdb: timeout if secondary CPUs ignore the roundup
kdb: Allow access to sensitive commands to be restricted by default
kdb: Add enable mask for groups of commands
kdb: Categorize kdb commands (similar to SysRq categorization)
kdb: Remove KDB_REPEAT_NONE flag
kdb: Use KDB_REPEAT_* values as flags
kdb: Rename kdb_register_repeat() to kdb_register_flags()
kdb: Rename kdb_repeat_t to kdb_cmdflags_t, cmd_repeat to cmd_flags
kdb: Remove currently unused kdbtab_t->cmd_flags

Linus Torvalds
2015-01-10 12:51:10 +0800

09 Jan, 2015

1 commit

3245d6aca exit: fix race between wait_consider_task() and wait_task_zombie() ... Browse Code »

wait_consider_task() checks EXIT_ZOMBIE after EXIT_DEAD/EXIT_TRACE and
both checks can fail if we race with EXIT_ZOMBIE -> EXIT_DEAD/EXIT_TRACE
change in between, gcc needs to reload p->exit_state after
security_task_wait(). In this case ->notask_error will be wrongly
cleared and do_wait() can hang forever if it was the last eligible
child.

Many thanks to Arne who carefully investigated the problem.

Note: this bug is very old but it was pure theoretical until commit
b3ab03160dfa ("wait: completely ignore the EXIT_DEAD tasks"). Before
this commit "-O2" was probably enough to guarantee that compiler won't
read ->exit_state twice.

Signed-off-by: Oleg Nesterov
Reported-by: Arne Goedeke
Tested-by: Arne Goedeke
Cc: [3.15+]
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Oleg Nesterov
2015-01-09 07:10:51 +0800

01 Jan, 2015

1 commit

5e0f872c7 Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit ... Browse Code »

Pull audit fix from Paul Moore:
"One audit patch to resolve a panic/oops when recording filenames in
the audit log, see the mail archive link below.

The fix isn't as nice as I would like, as it involves an allocate/copy
of the filename, but it solves the problem and the overhead should
only affect users who have configured audit rules involving file
names.

We'll revisit this issue with future kernels in an attempt to make
this suck less, but in the meantime I think this fix should go into
the next release of v3.19-rcX.

[ https://marc.info/?t=141986927600001&r=1&w=2 ]"

* 'upstream' of git://git.infradead.org/users/pcmoore/audit:
audit: create private file name copies when auditing inodes

Linus Torvalds
2015-01-01 06:52:18 +0800

31 Dec, 2014

1 commit

2c90331cf Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Pull networking fixes from David Miller:

1) Fix double SKB free in bluetooth 6lowpan layer, from Jukka Rissanen.

2) Fix receive checksum handling in enic driver, from Govindarajulu
Varadarajan.

3) Fix NAPI poll list corruption in virtio_net and caif_virtio, from
Herbert Xu. Also, add code to detect drivers that have this mistake
in the future.

4) Fix doorbell endianness handling in mlx4 driver, from Amir Vadai.

5) Don't clobber IP6CB() before xfrm6_policy_check() is called in TCP
input path,f rom Nicolas Dichtel.

6) Fix MPLS action validation in openvswitch, from Pravin B Shelar.

7) Fix double SKB free in vxlan driver, also from Pravin.

8) When we scrub a packet, which happens when we are switching the
context of the packet (namespace, etc.), we should reset the
secmark. From Thomas Graf.

9) ->ndo_gso_check() needs to do more than return true/false, it also
has to allow the driver to clear netdev feature bits in order for
the caller to be able to proceed properly. From Jesse Gross.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits)
genetlink: A genl_bind() to an out-of-range multicast group should not WARN().
netlink/genetlink: pass network namespace to bind/unbind
ne2k-pci: Add pci_disable_device in error handling
bonding: change error message to debug message in __bond_release_one()
genetlink: pass multicast bind/unbind to families
netlink: call unbind when releasing socket
netlink: update listeners directly when removing socket
genetlink: pass only network namespace to genl_has_listeners()
netlink: rename netlink_unbind() to netlink_undo_bind()
net: Generalize ndo_gso_check to ndo_features_check
net: incorrect use of init_completion fixup
neigh: remove next ptr from struct neigh_table
net: xilinx: Remove unnecessary temac_property in the driver
net: phy: micrel: use generic config_init for KSZ8021/KSZ8031
net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding
openvswitch: fix odd_ptr_err.cocci warnings
Bluetooth: Fix accepting connections when not using mgmt
Bluetooth: Fix controller configuration with HCI_QUIRK_INVALID_BDADDR
brcmfmac: Do not crash if platform data is not populated
ipw2200: select CFG80211_WEXT
...

Linus Torvalds
2014-12-31 02:45:47 +0800

30 Dec, 2014

1 commit

fcf22d826 audit: create private file name copies when auditing inodes ... Browse Code »

Unfortunately, while commit 4a928436 ("audit: correctly record file
names with different path name types") fixed a problem where we were
not recording filenames, it created a new problem by attempting to use
these file names after they had been freed. This patch resolves the
issue by creating a copy of the filename which the audit subsystem
frees after it is done with the string.

At some point it would be nice to resolve this issue with refcounts,
or something similar, instead of having to allocate/copy strings, but
that is almost surely beyond the scope of a -rcX patch so we'll defer
that for later. On the plus side, only audit users should be impacted
by the string copying.

Reported-by: Toralf Foerster
Signed-off-by: Paul Moore

Paul Moore
2014-12-30 22:26:21 +0800

27 Dec, 2014

1 commit

023e2cfa3 netlink/genetlink: pass network namespace to bind/unbind ... Browse Code »

Netlink families can exist in multiple namespaces, and for the most
part multicast subscriptions are per network namespace. Thus it only
makes sense to have bind/unbind notifications per network namespace.

To achieve this, pass the network namespace of a given client socket
to the bind/unbind functions.

Also do this in generic netlink, and there also make sure that any
bind for multicast groups that only exist in init_net is rejected.
This isn't really a problem if it is accepted since a client in a
different namespace will never receive any notifications from such
a group, but it can confuse the family if not rejected (it's also
possible to silently (without telling the family) accept it, but it
would also have to be ignored on unbind so families that take any
kind of action on bind/unbind won't do unnecessary work for invalid
clients like that.

Signed-off-by: Johannes Berg
Signed-off-by: David S. Miller

Johannes Berg
2014-12-27 16:07:50 +0800

24 Dec, 2014

2 commits

66b3f4f0a Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit ... Browse Code »

Pull audit fixes from Paul Moore:
"Four patches to fix various problems with the audit subsystem, all are
fairly small and straightforward.

One patch fixes a problem where we weren't using the correct gfp
allocation flags (GFP_KERNEL regardless of context, oops), one patch
fixes a problem with old userspace tools (this was broken for a
while), one patch fixes a problem where we weren't recording pathnames
correctly, and one fixes a problem with PID based filters.

In general I don't think there is anything controversial with this
patchset, and it fixes some rather unfortunate bugs; the allocation
flag one can be particularly scary looking for users"

* 'upstream' of git://git.infradead.org/users/pcmoore/audit:
audit: restore AUDIT_LOGINUID unset ABI
audit: correctly record file names with different path name types
audit: use supplied gfp_mask from audit_buffer in kauditd_send_multicast_skb
audit: don't attempt to lookup PIDs when changing PID filtering audit rules

Linus Torvalds
2014-12-24 10:13:16 +0800
041d7b98f audit: restore AUDIT_LOGINUID unset ABI ... Browse Code »
3

A regression was caused by commit 780a7654cee8:
audit: Make testing for a valid loginuid explicit.
(which in turn attempted to fix a regression caused by e1760bd)

When audit_krule_to_data() fills in the rules to get a listing, there was a
missing clause to convert back from AUDIT_LOGINUID_SET to AUDIT_LOGINUID.

This broke userspace by not returning the same information that was sent and
expected.

The rule:
auditctl -a exit,never -F auid=-1
gives:
auditctl -l
LIST_RULES: exit,never f24=0 syscall=all
when it should give:
LIST_RULES: exit,never auid=-1 (0xffffffff) syscall=all

Tag it so that it is reported the same way it was set. Create a new
private flags audit_krule field (pflags) to store it that won't interact with
the public one from the API.

Cc: stable@vger.kernel.org # v3.10-rc1+
Signed-off-by: Richard Guy Briggs
Signed-off-by: Paul Moore

Richard Guy Briggs
2014-12-24 05:40:18 +0800

23 Dec, 2014

1 commit

4a9284360 audit: correctly record file names with different path name types ... Browse Code »
13

There is a problem with the audit system when multiple audit records
are created for the same path, each with a different path name type.
The root cause of the problem is in __audit_inode() when an exact
match (both the path name and path name type) is not found for a
path name record; the existing code creates a new path name record,
but it never sets the path name in this record, leaving it NULL.
This patch corrects this problem by assigning the path name to these
newly created records.

There are many ways to reproduce this problem, but one of the
easiest is the following (assuming auditd is running):

# mkdir /root/tmp/test
# touch /root/tmp/test/567
# auditctl -a always,exit -F dir=/root/tmp/test
# touch /root/tmp/test/567

Afterwards, or while the commands above are running, check the audit
log and pay special attention to the PATH records. A faulty kernel
will display something like the following for the file creation:

type=SYSCALL msg=audit(1416957442.025:93): arch=c000003e syscall=2
success=yes exit=3 ... comm="touch" exe="/usr/bin/touch"
type=CWD msg=audit(1416957442.025:93): cwd="/root/tmp"
type=PATH msg=audit(1416957442.025:93): item=0 name="test/"
inode=401409 ... nametype=PARENT
type=PATH msg=audit(1416957442.025:93): item=1 name=(null)
inode=393804 ... nametype=NORMAL
type=PATH msg=audit(1416957442.025:93): item=2 name=(null)
inode=393804 ... nametype=NORMAL

While a patched kernel will show the following:

type=SYSCALL msg=audit(1416955786.566:89): arch=c000003e syscall=2
success=yes exit=3 ... comm="touch" exe="/usr/bin/touch"
type=CWD msg=audit(1416955786.566:89): cwd="/root/tmp"
type=PATH msg=audit(1416955786.566:89): item=0 name="test/"
inode=401409 ... nametype=PARENT
type=PATH msg=audit(1416955786.566:89): item=1 name="test/567"
inode=393804 ... nametype=NORMAL

This issue was brought up by a number of people, but special credit
should go to hujianyang@huawei.com for reporting the problem along
with an explanation of the problem and a patch. While the original
patch did have some problems (see the archive link below), it did
demonstrate the problem and helped kickstart the fix presented here.

* https://lkml.org/lkml/2014/9/5/66

Reported-by: hujianyang
Signed-off-by: Paul Moore
Acked-by: Richard Guy Briggs

Paul Moore
2014-12-23 01:27:39 +0800

21 Dec, 2014

1 commit

5d6a54688 Merge tag 'pm-config-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm ... Browse Code »

Pull CONFIG_PM_RUNTIME elimination from Rafael Wysocki:
"This removes the last few uses of CONFIG_PM_RUNTIME introduced
recently and makes that config option finally go away.

CONFIG_PM will be available directly from the menu now and also it
will be selected automatically if CONFIG_SUSPEND or CONFIG_HIBERNATION
is set"

* tag 'pm-config-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: Eliminate CONFIG_PM_RUNTIME
tty: 8250_omap: Replace CONFIG_PM_RUNTIME with CONFIG_PM
sound: sst-haswell-pcm: Replace CONFIG_PM_RUNTIME with CONFIG_PM
spi: Replace CONFIG_PM_RUNTIME with CONFIG_PM

Linus Torvalds
2014-12-21 05:37:44 +0800

20 Dec, 2014

6 commits

54dc77d97 audit: use supplied gfp_mask from audit_buffer in kauditd_send_multicast_skb ... Browse Code »

Eric Paris explains: Since kauditd_send_multicast_skb() gets called in
audit_log_end(), which can come from any context (aka even a sleeping context)
GFP_KERNEL can't be used. Since the audit_buffer knows what context it should
use, pass that down and use that.

See: https://lkml.org/lkml/2014/12/16/542

BUG: sleeping function called from invalid context at mm/slab.c:2849
in_atomic(): 1, irqs_disabled(): 0, pid: 885, name: sulogin
2 locks held by sulogin/885:
#0: (&sig->cred_guard_mutex){+.+.+.}, at: [] prepare_bprm_creds+0x28/0x8b
#1: (tty_files_lock){+.+.+.}, at: [] selinux_bprm_committing_creds+0x55/0x22b
CPU: 1 PID: 885 Comm: sulogin Not tainted 3.18.0-next-20141216 #30
Hardware name: Dell Inc. Latitude E6530/07Y85M, BIOS A15 06/20/2014
ffff880223744f10 ffff88022410f9b8 ffffffff916ba529 0000000000000375
ffff880223744f10 ffff88022410f9e8 ffffffff91063185 0000000000000006
0000000000000000 0000000000000000 0000000000000000 ffff88022410fa38
Call Trace:
[] dump_stack+0x50/0xa8
[] ___might_sleep+0x1b6/0x1be
[] __might_sleep+0x119/0x128
[] cache_alloc_debugcheck_before.isra.45+0x1d/0x1f
[] kmem_cache_alloc+0x43/0x1c9
[] __alloc_skb+0x42/0x1a3
[] skb_copy+0x3e/0xa3
[] audit_log_end+0x83/0x100
[] ? avc_audit_pre_callback+0x103/0x103
[] common_lsm_audit+0x441/0x450
[] slow_avc_audit+0x63/0x67
[] avc_has_perm+0xca/0xe3
[] inode_has_perm+0x5a/0x65
[] selinux_bprm_committing_creds+0x98/0x22b
[] security_bprm_committing_creds+0xe/0x10
[] install_exec_creds+0xe/0x79
[] load_elf_binary+0xe36/0x10d7
[] search_binary_handler+0x81/0x18c
[] do_execveat_common.isra.31+0x4e3/0x7b7
[] do_execve+0x1f/0x21
[] SyS_execve+0x25/0x29
[] stub_execve+0x69/0xa0

Cc: stable@vger.kernel.org #v3.16-rc1
Reported-by: Valdis Kletnieks
Signed-off-by: Richard Guy Briggs
Tested-by: Valdis Kletnieks
Signed-off-by: Paul Moore

Richard Guy Briggs
2014-12-20 07:37:56 +0800
3640dcfa4 audit: don't attempt to lookup PIDs when changing PID filtering audit rules ... Browse Code »

Commit f1dc4867 ("audit: anchor all pid references in the initial pid
namespace") introduced a find_vpid() call when adding/removing audit
rules with PID/PPID filters; unfortunately this is problematic as
find_vpid() only works if there is a task with the associated PID
alive on the system. The following commands demonstrate a simple
reproducer.

# auditctl -D
# auditctl -l
# autrace /bin/true
# auditctl -l

This patch resolves the problem by simply using the PID provided by
the user without any additional validation, e.g. no calls to check to
see if the task/PID exists.

Cc: stable@vger.kernel.org # 3.15
Cc: Richard Guy Briggs
Signed-off-by: Paul Moore
Acked-by: Eric Paris
Reviewed-by: Richard Guy Briggs

Paul Moore
2014-12-20 07:35:53 +0800
464ed18eb PM: Eliminate CONFIG_PM_RUNTIME ... Browse Code »
5

Having switched over all of the users of CONFIG_PM_RUNTIME to use
CONFIG_PM directly, turn the latter into a user-selectable option
and drop the former entirely from the tree.

Signed-off-by: Rafael J. Wysocki
Reviewed-by: Ulf Hansson
Acked-by: Kevin Hilman

Rafael J. Wysocki
2014-12-20 05:55:06 +0800
4bb9374e0 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull NOHZ update from Thomas Gleixner:
"Remove the call into the nohz idle code from the fake 'idle' thread in
the powerclamp driver along with the export of those functions which
was smuggeled in via the thermal tree. People have tried to hack
around it in the nohz core code, but it just violates all rightful
assumptions of that code about the only valid calling context (i.e.
the proper idle task).

The powerclamp trainwreck will still work, it just wont get the
benefit of long idle sleeps"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/powerclamp: Remove tick_nohz_idle abuse

Linus Torvalds
2014-12-20 05:29:20 +0800
ac88ee3b6 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull irq core fix from Thomas Gleixner:
"A single fix plugging a long standing race between proc/stat and
proc/interrupts access and freeing of interrupt descriptors"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Prevent proc race against freeing of irq descriptors

Linus Torvalds
2014-12-20 05:26:08 +0800
88a57667f Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull perf fixes and cleanups from Ingo Molnar:
"A kernel fix plus mostly tooling fixes, but also some tooling
restructuring and cleanups"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
perf: Fix building warning on ARM 32
perf symbols: Fix use after free in filename__read_build_id
perf evlist: Use roundup_pow_of_two
tools: Adopt roundup_pow_of_two
perf tools: Make the mmap length autotuning more robust
tools: Adopt rounddown_pow_of_two and deps
tools: Adopt fls_long and deps
tools: Move bitops.h from tools/perf/util to tools/
tools: Introduce asm-generic/bitops.h
tools lib: Move asm-generic/bitops/find.h code to tools/include and tools/lib
tools: Whitespace prep patches for moving bitops.h
tools: Move code originally from asm-generic/atomic.h into tools/include/asm-generic/
tools: Move code originally from linux/log2.h to tools/include/linux/
tools: Move __ffs implementation to tools/include/asm-generic/bitops/__ffs.h
perf evlist: Do not use hard coded value for a mmap_pages default
perf trace: Let the perf_evlist__mmap autosize the number of pages to use
perf evlist: Improve the strerror_mmap method
perf evlist: Clarify sterror_mmap variable names
perf evlist: Fixup brown paper bag on "hint" for --mmap-pages cmdline arg
perf trace: Provide a better explanation when mmap fails
...

Linus Torvalds
2014-12-20 05:15:24 +0800

19 Dec, 2014

3 commits

a5fd9733a tick/powerclamp: Remove tick_nohz_idle abuse ... Browse Code »
3

commit 4dbd27711cd9 "tick: export nohz tick idle symbols for module
use" was merged via the thermal tree without an explicit ack from the
relevant maintainers.

The exports are abused by the intel powerclamp driver which implements
a fake idle state from a sched FIFO task. This causes all kinds of
wreckage in the NOHZ core code which rightfully assumes that
tick_nohz_idle_enter/exit() are only called from the idle task itself.

Recent changes in the NOHZ core lead to a failure of the powerclamp
driver and now people try to hack completely broken and backwards
workarounds into the NOHZ core code. This is completely unacceptable
and just papers over the real problem. There are way more subtle
issues lurking around the corner.

The real solution is to fix the powerclamp driver by rewriting it with
a sane concept, but that's beyond the scope of this.

So the only solution for now is to remove the calls into the core NOHZ
code from the powerclamp trainwreck along with the exports.

Fixes: d6d71ee4a14a "PM: Introduce Intel PowerClamp Driver"
Signed-off-by: Thomas Gleixner
Cc: Preeti U Murthy
Cc: Viresh Kumar
Cc: Frederic Weisbecker
Cc: Fengguang Wu
Cc: Frederic Weisbecker
Cc: Pan Jacob jun
Cc: LKP
Cc: Peter Zijlstra
Cc: Zhang Rui
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412181110110.17382@nanos
Signed-off-by: Thomas Gleixner

Thomas Gleixner
2014-12-19 21:05:52 +0800
d790be386 Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux ... Browse Code »

Pull module updates from Rusty Russell:
"The exciting thing here is the getting rid of stop_machine on module
removal. This is possible by using a simple atomic_t for the counter,
rather than our fancy per-cpu counter: it turns out that no one is
doing a module increment per net packet, so the slowdown should be in
the noise"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
param: do not set store func without write perm
params: cleanup sysfs allocation
kernel:module Fix coding style errors and warnings.
module: Remove stop_machine from module unloading
module: Replace module_ref with atomic_t refcnt
lib/bug: Use RCU list ops for module_bug_list
module: Unlink module with RCU synchronizing instead of stop_machine
module: Wait for RCU synchronizing before releasing a module

Linus Torvalds
2014-12-19 12:55:41 +0800
c0f486fde Merge tag 'pm+acpi-3.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm ... Browse Code »

Pull more ACPI and power management updates from Rafael Wysocki:
"These are regression fixes (leds-gpio, ACPI backlight driver,
operating performance points library, ACPI device enumeration
messages, cpupower tool), other bug fixes (ACPI EC driver, ACPI device
PM), some cleanups in the operating performance points (OPP)
framework, continuation of CONFIG_PM_RUNTIME elimination, a couple of
minor intel_pstate driver changes, a new MAINTAINERS entry for it and
an ACPI fan driver change needed for better support of thermal
management in user space.

Specifics:

- Fix a regression in leds-gpio introduced by a recent commit that
inadvertently changed the name of one of the properties used by the
driver (Fabio Estevam).

- Fix a regression in the ACPI backlight driver introduced by a
recent fix that missed one special case that had to be taken into
account (Aaron Lu).

- Drop the level of some new kernel messages from the ACPI core
introduced by a recent commit to KERN_DEBUG which they should have
used from the start and drop some other unuseful KERN_ERR messages
printed by ACPI (Rafael J Wysocki).

- Revert an incorrect commit modifying the cpupower tool (Prarit
Bhargava).

- Fix two regressions introduced by recent commits in the OPP library
and clean up some existing minor issues in that code (Viresh
Kumar).

- Continue to replace CONFIG_PM_RUNTIME with CONFIG_PM throughout the
tree (or drop it where that can be done) in order to make it
possible to eliminate CONFIG_PM_RUNTIME (Rafael J Wysocki, Ulf
Hansson, Ludovic Desroches).

There will be one more "CONFIG_PM_RUNTIME removal" batch after this
one, because some new uses of it have been introduced during the
current merge window, but that should be sufficient to finally get
rid of it.

- Make the ACPI EC driver more robust against race conditions related
to GPE handler installation failures (Lv Zheng).

- Prevent the ACPI device PM core code from attempting to disable
GPEs that it has not enabled which confuses ACPICA and makes it
report errors unnecessarily (Rafael J Wysocki).

- Add a "force" command line switch to the intel_pstate driver to
make it possible to override the blacklisting of some systems in
that driver if needed (Ethan Zhao).

- Improve intel_pstate code documentation and add a MAINTAINERS entry
for it (Kristen Carlson Accardi).

- Make the ACPI fan driver create cooling device interfaces witn
names that reflect the IDs of the ACPI device objects they are
associated with, except for "generic" ACPI fans (PNP ID "PNP0C0B").

That's necessary for user space thermal management tools to be able
to connect the fans with the parts of the system they are supposed
to be cooling properly. From Srinivas Pandruvada"

* tag 'pm+acpi-3.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
MAINTAINERS: add entry for intel_pstate
ACPI / video: update the skip case for acpi_video_device_in_dod()
power / PM: Eliminate CONFIG_PM_RUNTIME
NFC / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
SCSI / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
ACPI / EC: Fix unexpected ec_remove_handlers() invocations
Revert "tools: cpupower: fix return checks for sysfs_get_idlestate_count()"
tracing / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
x86 / PM: Replace CONFIG_PM_RUNTIME in io_apic.c
PM: Remove the SET_PM_RUNTIME_PM_OPS() macro
mmc: atmel-mci: use SET_RUNTIME_PM_OPS() macro
PM / Kconfig: Replace PM_RUNTIME with PM in dependencies
ARM / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
sound / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
phy / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
video / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
tty / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
spi: Replace CONFIG_PM_RUNTIME with CONFIG_PM
ACPI / PM: Do not disable wakeup GPEs that have not been enabled
ACPI / utils: Drop error messages from acpi_evaluate_reference()
...

Linus Torvalds
2014-12-19 12:28:33 +0800

18 Dec, 2014

2 commits

b0a65b0cc param: do not set store func without write perm ... Browse Code »

When a module_param is defined without DAC write permissions, it can
still be changed at runtime and updated. Drivers using a 0444 permission
may be surprised that these values can still be changed.

For drivers that want to allow updates, any S_IW* flag will set the
"store" function as before. Drivers without S_IW* flags will have the
"store" function unset, unforcing a read-only value. Drivers that wish
neither "store" nor "get" can continue to use "0" for perms to stay out
of sysfs entirely.

Old behavior:
# cd /sys/module/snd/parameters
# ls -l
total 0
-r--r--r-- 1 root root 4096 Dec 11 13:55 cards_limit
-r--r--r-- 1 root root 4096 Dec 11 13:55 major
-r--r--r-- 1 root root 4096 Dec 11 13:55 slots
# cat major
116
# echo -1 > major
-bash: major: Permission denied
# chmod u+w major
# echo -1 > major
# cat major
-1

New behavior:
...
# chmod u+w major
# echo -1 > major
-bash: echo: write error: Input/output error

Signed-off-by: Kees Cook
Signed-off-by: Rusty Russell

Kees Cook
2014-12-18 10:08:51 +0800
87c31b39a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace ... Browse Code »

Pull user namespace related fixes from Eric Biederman:
"As these are bug fixes almost all of thes changes are marked for
backporting to stable.

The first change (implicitly adding MNT_NODEV on remount) addresses a
regression that was created when security issues with unprivileged
remount were closed. I go on to update the remount test to make it
easy to detect if this issue reoccurs.

Then there are a handful of mount and umount related fixes.

Then half of the changes deal with the a recently discovered design
bug in the permission checks of gid_map. Unix since the beginning has
allowed setting group permissions on files to less than the user and
other permissions (aka ---rwx---rwx). As the unix permission checks
stop as soon as a group matches, and setgroups allows setting groups
that can not later be dropped, results in a situtation where it is
possible to legitimately use a group to assign fewer privileges to a
process. Which means dropping a group can increase a processes
privileges.

The fix I have adopted is that gid_map is now no longer writable
without privilege unless the new file /proc/self/setgroups has been
set to permanently disable setgroups.

The bulk of user namespace using applications even the applications
using applications using user namespaces without privilege remain
unaffected by this change. Unfortunately this ix breaks a couple user
space applications, that were relying on the problematic behavior (one
of which was tools/selftests/mount/unprivileged-remount-test.c).

To hopefully prevent needing a regression fix on top of my security
fix I rounded folks who work with the container implementations mostly
like to be affected and encouraged them to test the changes.

> So far nothing broke on my libvirt-lxc test bed. :-)
> Tested with openSUSE 13.2 and libvirt 1.2.9.
> Tested-by: Richard Weinberger

> Tested on Fedora20 with libvirt 1.2.11, works fine.
> Tested-by: Chen Hanxiao

> Ok, thanks - yes, unprivileged lxc is working fine with your kernels.
> Just to be sure I was testing the right thing I also tested using
> my unprivileged nsexec testcases, and they failed on setgroup/setgid
> as now expected, and succeeded there without your patches.
> Tested-by: Serge Hallyn

> I tested this with Sandstorm. It breaks as is and it works if I add
> the setgroups thing.
> Tested-by: Andy Lutomirski # breaks things as designed :("

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
userns: Unbreak the unprivileged remount tests
userns; Correct the comment in map_write
userns: Allow setting gid_maps without privilege when setgroups is disabled
userns: Add a knob to disable setgroups on a per user namespace basis
userns: Rename id_map_mutex to userns_state_mutex
userns: Only allow the creator of the userns unprivileged mappings
userns: Check euid no fsuid when establishing an unprivileged uid mapping
userns: Don't allow unprivileged creation of gid mappings
userns: Don't allow setgroups until a gid mapping has been setablished
userns: Document what the invariant required for safe unprivileged mappings.
groups: Consolidate the setgroups permission checks
mnt: Clear mnt_expire during pivot_root
mnt: Carefully set CL_UNPRIVILEGED in clone_mnt
mnt: Move the clear of MNT_LOCKED from copy_tree to it's callers.
umount: Do not allow unmounting rootfs.
umount: Disallow unprivileged mount force
mnt: Update unprivileged remount test
mnt: Implicitly add MNT_NODEV on remount when it was implicitly added by mount

Linus Torvalds
2014-12-18 04:31:40 +0800

17 Dec, 2014

2 commits

603ba7e41 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs ... Browse Code »

Pull vfs pile #2 from Al Viro:
"Next pile (and there'll be one or two more).

The large piece in this one is getting rid of /proc/*/ns/* weirdness;
among other things, it allows to (finally) make nameidata completely
opaque outside of fs/namei.c, making for easier further cleanups in
there"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
coda_venus_readdir(): use file_inode()
fs/namei.c: fold link_path_walk() call into path_init()
path_init(): don't bother with LOOKUP_PARENT in argument
fs/namei.c: new helper (path_cleanup())
path_init(): store the "base" pointer to file in nameidata itself
make default ->i_fop have ->open() fail with ENXIO
make nameidata completely opaque outside of fs/namei.c
kill proc_ns completely
take the targets of /proc/*/ns/* symlinks to separate fs
bury struct proc_ns in fs/proc
copy address of proc_ns_ops into ns_common
new helpers: ns_alloc_inum/ns_free_inum
make proc_ns_operations work with struct ns_common * instead of void *
switch the rest of proc_ns_operations to working with &...->ns
netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns
make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns
common object embedded into various struct ....ns

Linus Torvalds
2014-12-17 07:53:03 +0800
a7c180aa7 Merge tag 'trace-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace ... Browse Code »

Pull tracing updates from Steven Rostedt:
"As the merge window is still open, and this code was not as complex as
I thought it might be. I'm pushing this in now.

This will allow Thomas to debug his irq work for 3.20.

This adds two new features:

1) Allow traceopoints to be enabled right after mm_init().

By passing in the trace_event= kernel command line parameter,
tracepoints can be enabled at boot up. For debugging things like
the initialization of interrupts, it is needed to have tracepoints
enabled very early. People have asked about this before and this
has been on my todo list. As it can be helpful for Thomas to debug
his upcoming 3.20 IRQ work, I'm pushing this now. This way he can
add tracepoints into the IRQ set up and have users enable them when
things go wrong.

2) Have the tracepoints printed via printk() (the console) when they
are triggered.

If the irq code locks up or reboots the box, having the tracepoint
output go into the kernel ring buffer is useless for debugging.
But being able to add the tp_printk kernel command line option
along with the trace_event= option will have these tracepoints
printed as they occur, and that can be really useful for debugging
early lock up or reboot problems.

This code is not that intrusive and it passed all my tests. Thomas
tried them out too and it works for his needs.

Link: http://lkml.kernel.org/r/20141214201609.126831471@goodmis.org"

* tag 'trace-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Add tp_printk cmdline to have tracepoints go to printk()
tracing: Move enabling tracepoints to just after rcu_init()

Linus Torvalds
2014-12-17 04:53:59 +0800

16 Dec, 2014

1 commit

988adfdff Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux ... Browse Code »

Pull drm updates from Dave Airlie:
"Highlights:

- AMD KFD driver merge

This is the AMD HSA interface for exposing a lowlevel interface for
GPGPU use. They have an open source userspace built on top of this
interface, and the code looks as good as it was going to get out of
tree.

- Initial atomic modesetting work

The need for an atomic modesetting interface to allow userspace to
try and send a complete set of modesetting state to the driver has
arisen, and been suffering from neglect this past year. No more,
the start of the common code and changes for msm driver to use it
are in this tree. Ongoing work to get the userspace ioctl finished
and the code clean will probably wait until next kernel.

- DisplayID 1.3 and tiled monitor exposed to userspace.

Tiled monitor property is now exposed for userspace to make use of.

- Rockchip drm driver merged.

- imx gpu driver moved out of staging

Other stuff:

- core:
panel - MIPI DSI + new panels.
expose suggested x/y properties for virtual GPUs

- i915:
Initial Skylake (SKL) support
gen3/4 reset work
start of dri1/ums removal
infoframe tracking
fixes for lots of things.

- nouveau:
tegra k1 voltage support
GM204 modesetting support
GT21x memory reclocking work

- radeon:
CI dpm fixes
GPUVM improvements
Initial DPM fan control

- rcar-du:
HDMI support added
removed some support for old boards
slave encoder driver for Analog Devices adv7511

- exynos:
Exynos4415 SoC support

- msm:
a4xx gpu support
atomic helper conversion

- tegra:
iommu support
universal plane support
ganged-mode DSI support

- sti:
HDMI i2c improvements

- vmwgfx:
some late fixes.

- qxl:
use suggested x/y properties"

* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (969 commits)
drm: sti: fix module compilation issue
drm/i915: save/restore GMBUS freq across suspend/resume on gen4
drm: sti: correctly cleanup CRTC and planes
drm: sti: add HQVDP plane
drm: sti: add cursor plane
drm: sti: enable auxiliary CRTC
drm: sti: fix delay in VTG programming
drm: sti: prepare sti_tvout to support auxiliary crtc
drm: sti: use drm_crtc_vblank_{on/off} instead of drm_vblank_{on/off}
drm: sti: fix hdmi avi infoframe
drm: sti: remove event lock while disabling vblank
drm: sti: simplify gdp code
drm: sti: clear all mixer control
drm: sti: remove gpio for HDMI hot plug detection
drm: sti: allow to change hdmi ddc i2c adapter
drm/doc: Document drm_add_modes_noedid() usage
drm/i915: Remove '& 0xffff' from the mask given to WA_REG()
drm/i915: Invert the mask and val arguments in wa_add() and WA_REG()
drm: Zero out DRM object memory upon cleanup
drm/i915/bdw: Fix the write setting up the WIZ hashing mode
...

Linus Torvalds
2014-12-16 07:52:01 +0800

15 Dec, 2014

3 commits

0daa23029 tracing: Add tp_printk cmdline to have tracepoints go to printk() ... Browse Code »

Add the kernel command line tp_printk option that will have tracepoints
that are active sent to printk() as well as to the trace buffer.

Passing "tp_printk" will activate this. To turn it off, the sysctl
/proc/sys/kernel/tracepoint_printk can have '0' echoed into it. Note,
this only works if the cmdline option is used. Echoing 1 into the sysctl
file without the cmdline option will have no affect.

Note, this is a dangerous option. Having high frequency tracepoints send
their data to printk() can possibly cause a live lock. This is another
reason why this is only active if the command line option is used.

Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412121539300.16494@nanos

Suggested-by: Thomas Gleixner
Tested-by: Thomas Gleixner
Acked-by: Thomas Gleixner
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2014-12-15 23:17:38 +0800
5f893b263 tracing: Move enabling tracepoints to just after rcu_init() ... Browse Code »
13

Enabling tracepoints at boot up can be very useful. The tracepoint
can be initialized right after RCU has been. There's no need to
wait for the early_initcall() to be called. That's too late for some
things that can use tracepoints for debugging. Move the logic to
enable tracepoints out of the initcalls and into init/main.c to
right after rcu_init().

This also allows trace_printk() to be used early too.

Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412121539300.16494@nanos
Link: http://lkml.kernel.org/r/20141214164104.307127356@goodmis.org

Reviewed-by: Paul E. McKenney
Suggested-by: Thomas Gleixner
Tested-by: Thomas Gleixner
Acked-by: Thomas Gleixner
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2014-12-15 23:16:50 +0800
37da7bbbe Merge tag 'tty-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty ... Browse Code »

Pull tty/serial driver updates from Greg KH:
"Here's the big tty/serial driver update for 3.19-rc1.

There are a number of TTY core changes/fixes in here from Peter Hurley
that have all been teted in linux-next for a long time now. There are
also the normal serial driver updates as well, full details in the
changelog below"

* tag 'tty-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (219 commits)
serial: pxa: hold port.lock when reporting modem line changes
tty-hvsi_lib: Deletion of an unnecessary check before the function call "tty_kref_put"
tty: Deletion of unnecessary checks before two function calls
n_tty: Fix read_buf race condition, increment read_head after pushing data
serial: of-serial: add PM suspend/resume support
Revert "serial: of-serial: add PM suspend/resume support"
Revert "serial: of-serial: fix up PM ops on no_console_suspend and port type"
serial: 8250: don't attempt a trylock if in sysrq
serial: core: Add big-endian iotype
serial: samsung: use port->fifosize instead of hardcoded values
serial: samsung: prefer to use fifosize from driver data
serial: samsung: fix style problems
serial: samsung: wait for transfer completion before clock disable
serial: icom: fix error return code
serial: tegra: clean up tty-flag assignments
serial: Fix io address assign flow with Fintek PCI-to-UART Product
serial: mxs-auart: fix tx_empty against shift register
serial: mxs-auart: fix gpio change detection on interrupt
serial: mxs-auart: Fix mxs_auart_set_ldisc()
serial: 8250_dw: Use 64-bit access for OCTEON.
...

Linus Torvalds
2014-12-15 07:23:32 +0800

14 Dec, 2014

11 commits

caf292ae5 Merge branch 'for-3.19/core' of git://git.kernel.dk/linux-block ... Browse Code »

Pull block driver core update from Jens Axboe:
"This is the pull request for the core block IO changes for 3.19. Not
a huge round this time, mostly lots of little good fixes:

- Fix a bug in sysfs blktrace interface causing a NULL pointer
dereference, when enabled/disabled through that API. From Arianna
Avanzini.

- Various updates/fixes/improvements for blk-mq:

- A set of updates from Bart, mostly fixing buts in the tag
handling.

- Cleanup/code consolidation from Christoph.

- Extend queue_rq API to be able to handle batching issues of IO
requests. NVMe will utilize this shortly. From me.

- A few tag and request handling updates from me.

- Cleanup of the preempt handling for running queues from Paolo.

- Prevent running of unmapped hardware queues from Ming Lei.

- Move the kdump memory limiting check to be in the correct
location, from Shaohua.

- Initialize all software queues at init time from Takashi. This
prevents a kobject warning when CPUs are brought online that
weren't online when a queue was registered.

- Single writeback fix for I_DIRTY clearing from Tejun. Queued with
the core IO changes, since it's just a single fix.

- Version X of the __bio_add_page() segment addition retry from
Maurizio. Hope the Xth time is the charm.

- Documentation fixup for IO scheduler merging from Jan.

- Introduce (and use) generic IO stat accounting helpers for non-rq
drivers, from Gu Zheng.

- Kill off artificial limiting of max sectors in a request from
Christoph"

* 'for-3.19/core' of git://git.kernel.dk/linux-block: (26 commits)
bio: modify __bio_add_page() to accept pages that don't start a new segment
blk-mq: Fix uninitialized kobject at CPU hotplugging
blktrace: don't let the sysfs interface remove trace from running list
blk-mq: Use all available hardware queues
blk-mq: Micro-optimize bt_get()
blk-mq: Fix a race between bt_clear_tag() and bt_get()
blk-mq: Avoid that __bt_get_word() wraps multiple times
blk-mq: Fix a use-after-free
blk-mq: prevent unmapped hw queue from being scheduled
blk-mq: re-check for available tags after running the hardware queue
blk-mq: fix hang in bt_get()
blk-mq: move the kdump check to blk_mq_alloc_tag_set
blk-mq: cleanup tag free handling
blk-mq: use 'nr_cpu_ids' as highest CPU ID count for hwq cpu map
blk: introduce generic io stat accounting help function
blk-mq: handle the single queue case in blk_mq_hctx_next_cpu
genhd: check for int overflow in disk_expand_part_tbl()
blk-mq: add blk_mq_free_hctx_request()
blk-mq: export blk_mq_free_request()
blk-mq: use get_cpu/put_cpu instead of preempt_disable/preempt_enable
...

Linus Torvalds
2014-12-14 06:14:23 +0800
8f4385d59 Merge tag 'trace-seq-buf-3.19-v2' of git://git.kernel.org/pub/scm/linux/kernel/g… ... Browse Code »

…it/rostedt/linux-trace

Pull tracing fixlet from Steven Rostedt:
"Remove unnecessary preempt_disable in printk()"

* tag 'trace-seq-buf-3.19-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
printk: Do not disable preemption for accessing printk_func

Linus Torvalds
2014-12-14 06:04:41 +0800
a99abce2d Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit ... Browse Code »

Pull audit updates from Paul Moore:
"Two small patches from the audit next branch; only one of which has
any real significant code changes, the other is simply a MAINTAINERS
update for audit.

The single code patch is pretty small and rather straightforward, it
changes the audit "version" number reported to userspace from an
integer to a bitmap which is used to indicate the functionality of the
running kernel. This really doesn't have much impact on the kernel,
but it will make life easier for the audit userspace folks.

Thankfully we were still on a version number which allowed us to do
this without breaking userspace"

* 'upstream' of git://git.infradead.org/users/pcmoore/audit:
audit: convert status version to a feature bitmap
audit: add Paul Moore to the MAINTAINERS entry

Linus Torvalds
2014-12-14 05:41:28 +0800
0809ab69a fsnotify: unify inode and mount marks handling ... Browse Code »

There's a lot of common code in inode and mount marks handling. Factor it
out to a common helper function.

Signed-off-by: Jan Kara
Cc: Eric Paris
Cc: Heinrich Schuchardt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Jan Kara
2014-12-14 04:42:53 +0800
957e3facd gcov: enable GCOV_PROFILE_ALL from ARCH Kconfigs ... Browse Code »

Following the suggestions from Andrew Morton and Stephen Rothwell,
Dont expand the ARCH list in kernel/gcov/Kconfig. Instead,
define a ARCH_HAS_GCOV_PROFILE_ALL bool which architectures
can enable.

set ARCH_HAS_GCOV_PROFILE_ALL on Architectures where it was
previously allowed + ARM64 which I tested.

Signed-off-by: Riku Voipio
Cc: Peter Oberparleiter
Cc: Stephen Rothwell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Riku Voipio
2014-12-14 04:42:51 +0800
d5393955c kexec: remove unnecessary KERN_ERR from kexec.c ... Browse Code »

Remove unnecessary KERN_ERR from pr_err() within kexec.c.

Signed-off-by: Masanari Iida
Acked-by: Vivek Goyal
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Masanari Iida
2014-12-14 04:42:51 +0800
51f39a1f0 syscalls: implement execveat() system call ... Browse Code »
13

This patchset adds execveat(2) for x86, and is derived from Meredydd
Luff's patch from Sept 2012 (https://lkml.org/lkml/2012/9/11/528).

The primary aim of adding an execveat syscall is to allow an
implementation of fexecve(3) that does not rely on the /proc filesystem,
at least for executables (rather than scripts). The current glibc version
of fexecve(3) is implemented via /proc, which causes problems in sandboxed
or otherwise restricted environments.

Given the desire for a /proc-free fexecve() implementation, HPA suggested
(https://lkml.org/lkml/2006/7/11/556) that an execveat(2) syscall would be
an appropriate generalization.

Also, having a new syscall means that it can take a flags argument without
back-compatibility concerns. The current implementation just defines the
AT_EMPTY_PATH and AT_SYMLINK_NOFOLLOW flags, but other flags could be
added in future -- for example, flags for new namespaces (as suggested at
https://lkml.org/lkml/2006/7/11/474).

Related history:
- https://lkml.org/lkml/2006/12/27/123 is an example of someone
realizing that fexecve() is likely to fail in a chroot environment.
- http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=514043 covered
documenting the /proc requirement of fexecve(3) in its manpage, to
"prevent other people from wasting their time".
- https://bugzilla.redhat.com/show_bug.cgi?id=241609 described a
problem where a process that did setuid() could not fexecve()
because it no longer had access to /proc/self/fd; this has since
been fixed.

This patch (of 4):

Add a new execveat(2) system call. execveat() is to execve() as openat()
is to open(): it takes a file descriptor that refers to a directory, and
resolves the filename relative to that.

In addition, if the filename is empty and AT_EMPTY_PATH is specified,
execveat() executes the file to which the file descriptor refers. This
replicates the functionality of fexecve(), which is a system call in other
UNIXen, but in Linux glibc it depends on opening "/proc/self/fd/" (and
so relies on /proc being mounted).

The filename fed to the executed program as argv[0] (or the name of the
script fed to a script interpreter) will be of the form "/dev/fd/"
(for an empty filename) or "/dev/fd//", effectively
reflecting how the executable was found. This does however mean that
execution of a script in a /proc-less environment won't work; also, script
execution via an O_CLOEXEC file descriptor fails (as the file will not be
accessible after exec).

Based on patches by Meredydd Luff.

Signed-off-by: David Drysdale
Cc: Meredydd Luff
Cc: Shuah Khan
Cc: "Eric W. Biederman"
Cc: Andy Lutomirski
Cc: Alexander Viro
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Kees Cook
Cc: Arnd Bergmann
Cc: Rich Felker
Cc: Christoph Hellwig
Cc: Michael Kerrisk
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

David Drysdale
2014-12-14 04:42:51 +0800
9a92a6ce6 stacktrace: introduce snprint_stack_trace for buffer output ... Browse Code »

Current stacktrace only have the function for console output. page_owner
that will be introduced in following patch needs to print the output of
stacktrace into the buffer for our own output format so so new function,
snprint_stack_trace(), is needed.

Signed-off-by: Joonsoo Kim
Cc: Mel Gorman
Cc: Johannes Weiner
Cc: Minchan Kim
Cc: Dave Hansen
Cc: Michal Nazarewicz
Cc: Jungsoo Son
Cc: Ingo Molnar
Cc: Joonsoo Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joonsoo Kim
2014-12-14 04:42:48 +0800
4a23717a2 uprobes: share the i_mmap_rwsem ... Browse Code »

Both register and unregister call build_map_info() in order to create the
list of mappings before installing or removing breakpoints for every mm
which maps file backed memory. As such, there is no reason to hold the
i_mmap_rwsem exclusively, so share it and allow concurrent readers to
build the mapping data.

Signed-off-by: Davidlohr Bueso
Acked-by: Srikar Dronamraju
Acked-by: "Kirill A. Shutemov"
Cc: Oleg Nesterov
Acked-by: Hugh Dickins
Acked-by: Peter Zijlstra (Intel)
Cc: Rik van Riel
Acked-by: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2014-12-14 04:42:45 +0800
c8c06efa8 mm: convert i_mmap_mutex to rwsem ... Browse Code »

The i_mmap_mutex is a close cousin of the anon vma lock, both protecting
similar data, one for file backed pages and the other for anon memory. To
this end, this lock can also be a rwsem. In addition, there are some
important opportunities to share the lock when there are no tree
modifications.

This conversion is straightforward. For now, all users take the write
lock.

[sfr@canb.auug.org.au: update fremap.c]
Signed-off-by: Davidlohr Bueso
Reviewed-by: Rik van Riel
Acked-by: "Kirill A. Shutemov"
Acked-by: Hugh Dickins
Cc: Oleg Nesterov
Acked-by: Peter Zijlstra (Intel)
Cc: Srikar Dronamraju
Acked-by: Mel Gorman
Signed-off-by: Stephen Rothwell
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2014-12-14 04:42:45 +0800
83cde9e8b mm: use new helper functions around the i_mmap_mutex ... Browse Code »

Convert all open coded mutex_lock/unlock calls to the
i_mmap_[lock/unlock]_write() helpers.

Signed-off-by: Davidlohr Bueso
Acked-by: Rik van Riel
Acked-by: "Kirill A. Shutemov"
Acked-by: Hugh Dickins
Cc: Oleg Nesterov
Acked-by: Peter Zijlstra (Intel)
Cc: Srikar Dronamraju
Acked-by: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Davidlohr Bueso
2014-12-14 04:42:45 +0800

13 Dec, 2014

2 commits

c291ee622 genirq: Prevent proc race against freeing of irq descriptors ... Browse Code »
3

Since the rework of the sparse interrupt code to actually free the
unused interrupt descriptors there exists a race between the /proc
interfaces to the irq subsystem and the code which frees the interrupt
descriptor.

CPU0 CPU1
show_interrupts()
desc = irq_to_desc(X);
free_desc(desc)
remove_from_radix_tree();
kfree(desc);
raw_spinlock_irq(&desc->lock);

/proc/interrupts is the only interface which can actively corrupt
kernel memory via the lock access. /proc/stat can only read from freed
memory. Extremly hard to trigger, but possible.

The interfaces in /proc/irq/N/ are not affected by this because the
removal of the proc file is serialized in procfs against concurrent
readers/writers. The removal happens before the descriptor is freed.

For architectures which have CONFIG_SPARSE_IRQ=n this is a non issue
as the descriptor is never freed. It's merely cleared out with the irq
descriptor lock held. So any concurrent proc access will either see
the old correct value or the cleared out ones.

Protect the lookup and access to the irq descriptor in
show_interrupts() with the sparse_irq_lock.

Provide kstat_irqs_usr() which is protecting the lookup and access
with sparse_irq_lock and switch /proc/stat to use it.

Document the existing kstat_irqs interfaces so it's clear that the
caller needs to take care about protection. The users of these
interfaces are either not affected due to SPARSE_IRQ=n or already
protected against removal.

Fixes: 1f5a5b87f78f "genirq: Implement a sane sparse_irq allocator"
Signed-off-by: Thomas Gleixner
Cc: stable@vger.kernel.org

Thomas Gleixner
2014-12-13 20:33:07 +0800
798bc6d8d tracing / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM ... Browse Code »

After commit b2b49ccbdd54 (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is
selected) PM_RUNTIME is always set if PM is set, so files that are
build conditionally if CONFIG_PM_RUNTIME is set may now be build
if CONFIG_PM is set.

Replace CONFIG_PM_RUNTIME with CONFIG_PM in kernel/trace/Makefile
for this reason.

Signed-off-by: Rafael J. Wysocki
Acked-by: Steven Rostedt <rostedt@goodmis.org.

Rafael J. Wysocki
2014-12-13 09:23:30 +0800