Doug / smarc-fsl-linux-kernel | Embedian Git Server

31 Aug, 2013

1 commit

a8787645e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net ... Browse Code »

Pull networking fixes from David Miller:

1) There was a simplification in the ipv6 ndisc packet sending
attempted here, which avoided using memory accounting on the
per-netns ndisc socket for sending NDISC packets. It did fix some
important issues, but it causes regressions so it gets reverted here
too. Specifically, the problem with this change is that the IPV6
output path really depends upon there being a valid skb->sk
attached.

The reason we want to do this change in some form when we figure out
how to do it right, is that if a device goes down the ndisc_sk
socket send queue will fill up and block NDISC packets that we want
to send to other devices too. That's really bad behavior.

Hopefully Thomas can come up with a better version of this change.

2) Fix a severe TCP performance regression by reverting a change made
to dev_pick_tx() quite some time ago. From Eric Dumazet.

3) TIPC returns wrongly signed error codes, fix from Erik Hugne.

4) Fix OOPS when doing IPSEC over ipv4 tunnels due to orphaning the
skb->sk too early. Fix from Li Hongjun.

5) RAW ipv4 sockets can use the wrong routing key during lookup, from
Chris Clark.

6) Similar to #1 revert an older change that tried to use plain
alloc_skb() for SYN/ACK TCP packets, this broke the netfilter owner
mark which needs to see the skb->sk for such frames. From Phil
Oester.

7) BNX2x driver bug fixes from Ariel Elior and Yuval Mintz,
specifically in the handling of virtual functions.

8) IPSEC path error propagations to sockets is not done properly when
we have v4 in v6, and v6 in v4 type rules. Fix from Hannes Frederic
Sowa.

9) Fix missing channel context release in mac80211, from Johannes Berg.

10) Fix network namespace handing wrt. SCM_RIGHTS, from Andy
Lutomirski.

11) Fix usage of bogus NAPI weight in jme, netxen, and ps3_gelic
drivers. From Michal Schmidt.

12) Hopefully a complete and correct fix for the genetlink dump locking
and module reference counting. From Pravin B Shelar.

13) sk_busy_loop() must do a cpu_relax(), from Eliezer Tamir.

14) Fix handling of timestamp offset when restoring a snapshotted TCP
socket. From Andrew Vagin.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
net: fec: fix time stamping logic after napi conversion
net: bridge: convert MLDv2 Query MRC into msecs_to_jiffies for max_delay
mISDN: return -EINVAL on error in dsp_control_req()
net: revert 8728c544a9c ("net: dev_pick_tx() fix")
Revert "ipv6: Don't depend on per socket memory for neighbour discovery messages"
ipv4 tunnels: fix an oops when using ipip/sit with IPsec
tipc: set sk_err correctly when connection fails
tcp: tcp_make_synack() should use sock_wmalloc
bridge: separate querier and query timer into IGMP/IPv4 and MLD/IPv6 ones
ipv6: Don't depend on per socket memory for neighbour discovery messages
ipv4: sendto/hdrincl: don't use destination address found in header
tcp: don't apply tsoffset if rcv_tsecr is zero
tcp: initialize rcv_tstamp for restored sockets
net: xilinx: fix memleak
net: usb: Add HP hs2434 device to ZLP exception table
net: add cpu_relax to busy poll loop
net: stmmac: fixed the pbl setting with DT
genl: Hold reference on correct module while netlink-dump.
genl: Fix genl dumpit() locking.
xfrm: Fix potential null pointer dereference in xdst_queue_output
...

Linus Torvalds
2013-08-31 08:43:17 +0800

30 Aug, 2013

1 commit

41615e811 Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup ... Browse Code »

Pull cgroup fix from Tejun Heo:
"During the percpu reference counting update which was merged during
v3.11-rc1, the cgroup destruction path was updated so that a cgroup in
the process of dying may linger on the children list, which was
necessary as the cgroup should still be included in child/descendant
iteration while percpu ref is being killed.

Unfortunately, I forgot to update cgroup destruction path accordingly
and cgroup destruction may fail spuriously with -EBUSY due to
lingering dying children even when there's no live child left - e.g.
"rmdir parent/child parent" will usually fail.

This can be easily fixed by iterating through the children list to
verify that there's no live child left. While this is very late in
the release cycle, this bug is very visible to userland and I believe
the fix is relatively safe.

Thanks Hugh for spotting and providing fix for the issue"

* 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: fix rmdir EBUSY regression in 3.11

Linus Torvalds
2013-08-30 08:03:48 +0800

29 Aug, 2013

3 commits

bb78a92f4 cgroup: fix rmdir EBUSY regression in 3.11 ... Browse Code »

On 3.11-rc we are seeing cgroup directories left behind when they should
have been removed. Here's a trivial reproducer:

cd /sys/fs/cgroup/memory
mkdir parent parent/child; rmdir parent/child parent
rmdir: failed to remove `parent': Device or resource busy

It's because cgroup_destroy_locked() (step 1 of destruction) leaves
cgroup on parent's children list, letting cgroup_offline_fn() (step 2 of
destruction) remove it; but step 2 is run by work queue, which may not
yet have removed the children when parent destruction checks the list.

Fix that by checking through a non-empty list of children: if every one
of them has already been marked CGRP_DEAD, then it's safe to proceed:
those children are invisible to userspace, and should not obstruct rmdir.

(I didn't see any reason to keep the cgrp->children checks under the
unrelated css_set_lock, so moved them out.)

tj: Flattened nested ifs a bit and updated comment so that it's
correct on both for-3.11-fixes and for-3.12.

Signed-off-by: Hugh Dickins
Signed-off-by: Tejun Heo

Hugh Dickins
2013-08-29 23:05:07 +0800
b22ce2785 workqueue: cond_resched() after processing each work item ... Browse Code »

If !PREEMPT, a kworker running work items back to back can hog CPU.
This becomes dangerous when a self-requeueing work item which is
waiting for something to happen races against stop_machine. Such
self-requeueing work item would requeue itself indefinitely hogging
the kworker and CPU it's running on while stop_machine would wait for
that CPU to enter stop_machine while preventing anything else from
happening on all other CPUs. The two would deadlock.

Jamie Liu reports that this deadlock scenario exists around
scsi_requeue_run_queue() and libata port multiplier support, where one
port may exclude command processing from other ports. With the right
timing, scsi_requeue_run_queue() can end up requeueing itself trying
to execute an IO which is asked to be retried while another device has
an exclusive access, which in turn can't make forward progress due to
stop_machine.

Fix it by invoking cond_resched() after executing each work item.

Signed-off-by: Tejun Heo
Reported-by: Jamie Liu
References: http://thread.gmane.org/gmane.linux.kernel/1552567
Cc: stable@vger.kernel.org
--
kernel/workqueue.c | 9 +++++++++
1 file changed, 9 insertions(+)

Tejun Heo
2013-08-29 21:19:28 +0800
84a78a650 timer_list: correct the iterator for timer_list ... Browse Code »

Correct an issue with /proc/timer_list reported by Holger.

When reading from the proc file with a sufficiently small buffer, 2k so
not really that small, there was one could get hung trying to read the
file a chunk at a time.

The timer_list_start function failed to account for the possibility that
the offset was adjusted outside the timer_list_next.

Signed-off-by: Nathan Zimmer
Reported-by: Holger Hans Peter Freyther
Cc: John Stultz
Cc: Thomas Gleixner
Cc: Berke Durak
Cc: Jeff Layton
Tested-by: Al Viro
Cc: # 3.10.x
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Nathan Zimmer
2013-08-29 10:26:38 +0800

28 Aug, 2013

1 commit

c2b1df2eb Rename nsproxy.pid_ns to nsproxy.pid_ns_for_children ... Browse Code »

nsproxy.pid_ns is *not* the task's pid namespace. The name should clarify
that.

This makes it more obvious that setns on a pid namespace is weird --
it won't change the pid namespace shown in procfs.

Signed-off-by: Andy Lutomirski
Reviewed-by: "Eric W. Biederman"
Signed-off-by: David S. Miller

Andy Lutomirski
2013-08-28 01:52:52 +0800

24 Aug, 2013

1 commit

e2982a04e Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup ... Browse Code »

Pull cgroup fix from Tejun Heo:
"A late fix for cgroup.

This fixes a behavior regression visible to userland which was created
by a commit merged during -rc1. While the behavior change isn't too
likely to be noticeable, the fix is relatively low risk and we'll need
to backport it through -stable anyway if the bug gets released"

* 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cpuset: fix a regression in validating config change

Linus Torvalds
2013-08-24 01:58:50 +0800

21 Aug, 2013

1 commit

1c09b195d cpuset: fix a regression in validating config change ... Browse Code »

It's not allowed to clear masks of a cpuset if there're tasks in it,
but it's broken:

# mkdir /cgroup/sub
# echo 0 > /cgroup/sub/cpuset.cpus
# echo 0 > /cgroup/sub/cpuset.mems
# echo $$ > /cgroup/sub/tasks
# echo > /cgroup/sub/cpuset.cpus
(should fail)

This bug was introduced by commit 88fa523bff295f1d60244a54833480b02f775152
("cpuset: allow to move tasks to empty cpusets").

tj: Dropped temp bool variables and nestes the conditionals directly.

Signed-off-by: Li Zefan
Signed-off-by: Tejun Heo

Li Zefan
2013-08-21 20:40:27 +0800

20 Aug, 2013

2 commits

e91dade52 Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull timer fixes from Ingo Molnar:
"Three small fixlets"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
nohz: fix compile warning in tick_nohz_init()
nohz: Do not warn about unstable tsc unless user uses nohz_full
sched_clock: Fix integer overflow

Linus Torvalds
2013-08-20 00:17:35 +0800
2203547f8 kernel: fix new kernel-doc warning in wait.c ... Browse Code »

Fix new kernel-doc warnings in kernel/wait.c:

Warning(kernel/wait.c:374): No description found for parameter 'p'
Warning(kernel/wait.c:374): Excess function parameter 'word' description in 'wake_up_atomic_t'
Warning(kernel/wait.c:374): Excess function parameter 'bit' description in 'wake_up_atomic_t'

Signed-off-by: Randy Dunlap
Cc: David Howells
Signed-off-by: Linus Torvalds

Randy Dunlap
2013-08-20 00:08:54 +0800

18 Aug, 2013

1 commit

50e37ccea Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup ... Browse Code »

Pull cgroup fix from Tejun Heo:
"This contains one patch to fix the return value of cpuset's cgroups
interface function, which used to always return -ENODEV for the writes
on the 'memory_pressure_enabled' file"

* 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cpuset: fix the return value of cpuset_write_u64()

Linus Torvalds
2013-08-18 23:51:28 +0800

17 Aug, 2013

1 commit

2d2843e61 Merge tag 'pm-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm ... Browse Code »

Pull power management fix from Rafael Wysocki:
"The removal of delayed_work_pending() checks from kernel/power/qos.c
done in 3.9 introduced a deadlock in pm_qos_work_fn().

Fix from Stephen Boyd"

* tag 'pm-3.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout()

Linus Torvalds
2013-08-17 00:59:00 +0800

15 Aug, 2013

1 commit

f1d6e17f5 Merge branch 'akpm' (patches from Andrew Morton) ... Browse Code »

Merge a bunch of fixes from Andrew Morton.

* emailed patches from Andrew Morton :
fs/proc/task_mmu.c: fix buffer overflow in add_page_map()
arch: *: Kconfig: add "kernel/Kconfig.freezer" to "arch/*/Kconfig"
ocfs2: fix null pointer dereference in ocfs2_dir_foreach_blk_id()
x86 get_unmapped_area(): use proper mmap base for bottom-up direction
ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page
ocfs2: Revert 40bd62e to avoid regression in extended allocation
drivers/rtc/rtc-stmp3xxx.c: provide timeout for potentially endless loop polling a HW bit
hugetlb: fix lockdep splat caused by pmd sharing
aoe: adjust ref of head for compound page tails
microblaze: fix clone syscall
mm: save soft-dirty bits on file pages
mm: save soft-dirty bits on swapped pages
memcg: don't initialize kmem-cache destroying work for root caches

Linus Torvalds
2013-08-15 01:04:43 +0800

14 Aug, 2013

3 commits

dfa9771a7 microblaze: fix clone syscall ... Browse Code »

Fix inadvertent breakage in the clone syscall ABI for Microblaze that
was introduced in commit f3268edbe6fe ("microblaze: switch to generic
fork/vfork/clone").

The Microblaze syscall ABI for clone takes the parent tid address in the
4th argument; the third argument slot is used for the stack size. The
incorrectly-used CLONE_BACKWARDS type assigned parent tid to the 3rd
slot.

This commit restores the original ABI so that existing userspace libc
code will work correctly.

All kernel versions from v3.8-rc1 were affected.

Signed-off-by: Michal Simek
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Michal Simek
2013-08-14 08:57:48 +0800
28fbc8b6a Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull scheduler fixes from Ingo Molnar:
"Docbook fixes that make 99% of the diffstat, plus a oneliner fix"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Ensure update_cfs_shares() is called for parents of continuously-running tasks
sched: Fix some kernel-doc warnings

Linus Torvalds
2013-08-14 07:58:17 +0800
40fea92ff PM / QoS: Fix workqueue deadlock when using pm_qos_update_request_timeout() ... Browse Code »

pm_qos_update_request_timeout() updates a qos and then schedules
a delayed work item to bring the qos back down to the default
after the timeout. When the work item runs, pm_qos_work_fn() will
call pm_qos_update_request() and deadlock because it tries to
cancel itself via cancel_delayed_work_sync(). Future callers of
that qos will also hang waiting to cancel the work that is
canceling itself. Let's extract the little bit of code that does
the real work of pm_qos_update_request() and call it from the
work function so that we don't deadlock.

Before ed1ac6e (PM: don't use [delayed_]work_pending()) this didn't
happen because the work function wouldn't try to cancel itself.

Signed-off-by: Stephen Boyd
Reviewed-by: Tejun Heo
Cc: 3.9+ # 3.9+
Signed-off-by: Rafael J. Wysocki

Stephen Boyd
2013-08-14 06:42:05 +0800

13 Aug, 2013

4 commits

e0acd0a68 sched: fix the theoretical signal_wake_up() vs schedule() race ... Browse Code »

This is only theoretical, but after try_to_wake_up(p) was changed
to check p->state under p->pi_lock the code like

__set_current_state(TASK_INTERRUPTIBLE);
schedule();

can miss a signal. This is the special case of wait-for-condition,
it relies on try_to_wake_up/schedule interaction and thus it does
not need mb() between __set_current_state() and if(signal_pending).

However, this __set_current_state() can move into the critical
section protected by rq->lock, now that try_to_wake_up() takes
another lock we need to ensure that it can't be reordered with
"if (signal_pending(current))" check inside that section.

The patch is actually one-liner, it simply adds smp_wmb() before
spin_lock_irq(rq->lock). This is what try_to_wake_up() already
does by the same reason.

We turn this wmb() into the new helper, smp_mb__before_spinlock(),
for better documentation and to allow the architectures to change
the default implementation.

While at it, kill smp_mb__after_lock(), it has no callers.

Perhaps we can also add smp_mb__before/after_spinunlock() for
prepare_to_wait().

Signed-off-by: Oleg Nesterov
Acked-by: Peter Zijlstra
Signed-off-by: Linus Torvalds

Oleg Nesterov
2013-08-13 23:19:26 +0800
a903f0865 cpuset: fix the return value of cpuset_write_u64() ... Browse Code »

Writing to this file always returns -ENODEV:

# echo 1 > cpuset.memory_pressure_enabled
-bash: echo: write error: No such device

Signed-off-by: Li Zefan
Cc: # 3.9+
Signed-off-by: Tejun Heo

Li Zefan
2013-08-13 22:54:40 +0800
278225588 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull w/w mutex deadlock injection fix from Ingo Molnar.

This bug made the CONFIG_DEBUG_WW_MUTEX_SLOWPATH=y option largely
useless, but wouldn't affect normal users.

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
mutex: Fix w/w mutex deadlock injection

Linus Torvalds
2013-08-13 03:01:28 +0800
ae920eb24 Merge branch 'fortglx/3.11/time' of git://git.linaro.org/people/jstultz/linux into timers/urgent ... Browse Code »

Pull small fix for v3.11 from John Stultz.

Signed-off-by: Ingo Molnar

Ingo Molnar
2013-08-13 00:08:23 +0800

09 Aug, 2013

1 commit

8742f229b userns: limit the maximum depth of user_namespace->parent chain ... Browse Code »

Ensure that user_namespace->parent chain can't grow too much.
Currently we use the hardroded 32 as limit.

Reported-by: Andy Lutomirski
Signed-off-by: Oleg Nesterov
Signed-off-by: Linus Torvalds

Oleg Nesterov
2013-08-09 04:11:39 +0800

08 Aug, 2013

1 commit

b7bc9e7d8 Merge tag 'trace-fixes-3.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/gi… ... Browse Code »

…t/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
"Oleg Nesterov has been working hard in closing all the holes that can
lead to race conditions between deleting an event and accessing an
event debugfs file. This included a fix to the debugfs system (acked
by Greg Kroah-Hartman). We think that all the holes have been patched
and hopefully we don't find more. I haven't marked all of them for
stable because I need to examine them more to figure out how far back
some of the changes need to go.

Along the way, some other fixes have been made. Alexander Z Lam fixed
some logic where the wrong buffer was being modifed.

Andrew Vagin found a possible corruption for machines that actually
allocate cpumask, as a reference to one was being zeroed out by
mistake.

Dhaval Giani found a bad prototype when tracing is not configured.

And I not only had some changes to help Oleg, but also finally fixed a
long standing bug that Dave Jones and others have been hitting, where
a module unload and reload can cause the function tracing accounting
to get screwed up"

* tag 'trace-fixes-3.11-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix reset of time stamps during trace_clock changes
tracing: Make TRACE_ITER_STOP_ON_FREE stop the correct buffer
tracing: Fix trace_dump_stack() proto when CONFIG_TRACING is not set
tracing: Fix fields of struct trace_iterator that are zeroed by mistake
tracing/uprobes: Fail to unregister if probe event files are in use
tracing/kprobes: Fail to unregister if probe event files are in use
tracing: Add comment to describe special break case in probe_remove_event_call()
tracing: trace_remove_event_call() should fail if call/file is in use
debugfs: debugfs_remove_recursive() must not rely on list_empty(d_subdirs)
ftrace: Check module functions being traced on reload
ftrace: Consolidate some duplicate code for updating ftrace ops
tracing: Change remove_event_file_dir() to clear "d_subdirs"->i_private
tracing: Introduce remove_event_file_dir()
tracing: Change f_start() to take event_mutex and verify i_private != NULL
tracing: Change event_filter_read/write to verify i_private != NULL
tracing: Change event_enable/disable_read() to verify i_private != NULL
tracing: Turn event/id->i_private into call->event.type

Linus Torvalds
2013-08-08 04:01:30 +0800

07 Aug, 2013

5 commits

8e28921e5 Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup ... Browse Code »

Pull cgroup fix from Tejun Heo:
"Fix for a minor memory leak bug in the cgroup init failure path"

* 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: fix a leak when percpu_ref_init() fails

Linus Torvalds
2013-08-07 04:59:28 +0800
4264bc137 Merge branch 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq ... Browse Code »

Pull two workqueue fixes from Tejun Heo:
"A lockdep notation update so that nested work_on_cpu() invocations
don't lead to spurious lockdep warnings and fix for an unbound attr
bug which made what's shown in sysfs deviate from the actual ones.
Both patches have pretty limited scope"

* 'for-3.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: copy workqueue_attrs with all fields
workqueue: allow work_on_cpu() to be called recursively

Linus Torvalds
2013-08-07 04:58:34 +0800
2cfe6c4ac printk: Fix return of braille_register_console() ... Browse Code »

Some of my configs I test with have CONFIG_A11Y_BRAILLE_CONSOLE set.
When I started testing against v3.11-rc4 my console went bonkers. Using
ktest to bisect the issue, it came down to:

commit bbeddf52a "printk: move braille console support into separate
braille.[ch] files"

Looking into the patch I found the problem. It's with the return of
braille_register_console(). As anything other than NULL is considered a
failure.

But for those of us that have CONFIG_A11Y_BRAILLE_CONSOLE set but do not
define a "brl" or "brl=" on the command line, we still may want a
console that those with sight can still use.

Return NULL (success) if "brl" or "brl=" is not on the console line.

Signed-off-by: Steven Rostedt
Acked-by: Joe Perches
Cc: Andrew Morton
Signed-off-by: Linus Torvalds

Steven Rostedt
2013-08-07 04:18:12 +0800
35114fcbe Revert "ptrace: PTRACE_DETACH should do flush_ptrace_hw_breakpoint(child)" ... Browse Code »

This reverts commit fab840fc2d542fabcab903db8e03589a6702ba5f.

This commit even has the test-case to prove that the tracee
can be killed by SIGTRAP if the debugger does not remove the
breakpoints before PTRACE_DETACH.

However, this is exactly what wineserver deliberately does,
set_thread_context() calls PTRACE_ATTACH + PTRACE_DETACH just
for PTRACE_POKEUSER(DR*) in between.

So we should revert this fix and document that PTRACE_DETACH
should keep the breakpoints.

Reported-by: Felipe Contreras
Signed-off-by: Oleg Nesterov
Signed-off-by: Linus Torvalds

Oleg Nesterov
2013-08-07 04:16:32 +0800
6160968ce userns: unshare_userns(&cred) should not populate cred on failure ... Browse Code »

unshare_userns(new_cred) does *new_cred = prepare_creds() before
create_user_ns() which can fail. However, the caller expects that
it doesn't need to take care of new_cred if unshare_userns() fails.

We could change the single caller, sys_unshare(), but I think it
would be more clean to avoid the side effects on failure, so with
this patch unshare_userns() does put_cred() itself and initializes
*new_cred only if create_user_ns() succeeeds.

Cc: stable@vger.kernel.org
Signed-off-by: Oleg Nesterov
Reviewed-by: Andy Lutomirski
Signed-off-by: Linus Torvalds

Oleg Nesterov
2013-08-07 04:13:24 +0800

03 Aug, 2013

4 commits

9457158bb tracing: Fix reset of time stamps during trace_clock changes ... Browse Code »

Fixed two issues with changing the timestamp clock with trace_clock:

- The global buffer was reset on instance clock changes. Change this to pass
the correct per-instance buffer
- ftrace_now() is used to set buf->time_start in tracing_reset_online_cpus().
This was incorrect because ftrace_now() used the global buffer's clock to
return the current time. Change this to use buffer_ftrace_now() which
returns the current time for the correct per-instance buffer.

Also removed tracing_reset_current() because it is not used anywhere

Link: http://lkml.kernel.org/r/1375493777-17261-2-git-send-email-azl@google.com

Cc: Vaibhav Nagarnaik
Cc: David Sharp
Cc: Alexander Z Lam
Cc: stable@vger.kernel.org # 3.10
Signed-off-by: Alexander Z Lam
Signed-off-by: Steven Rostedt

Alexander Z Lam
2013-08-03 10:40:09 +0800
711e12437 tracing: Make TRACE_ITER_STOP_ON_FREE stop the correct buffer ... Browse Code »

Releasing the free_buffer file in an instance causes the global buffer
to be stopped when TRACE_ITER_STOP_ON_FREE is enabled. Operate on the
correct buffer.

Link: http://lkml.kernel.org/r/1375493777-17261-1-git-send-email-azl@google.com

Cc: Vaibhav Nagarnaik
Cc: David Sharp
Cc: Alexander Z Lam
Cc: stable@vger.kernel.org # 3.10
Signed-off-by: Alexander Z Lam
Signed-off-by: Steven Rostedt

Alexander Z Lam
2013-08-03 10:39:29 +0800
ed5467da0 tracing: Fix fields of struct trace_iterator that are zeroed by mistake ... Browse Code »

tracing_read_pipe zeros all fields bellow "seq". The declaration contains
a comment about that, but it doesn't help.

The first field is "snapshot", it's true when current open file is
snapshot. Looks obvious, that it should not be zeroed.

The second field is "started". It was converted from cpumask_t to
cpumask_var_t (v2.6.28-4983-g4462344), in other words it was
converted from cpumask to pointer on cpumask.

Currently the reference on "started" memory is lost after the first read
from tracing_read_pipe and a proper object will never be freed.

The "started" is never dereferenced for trace_pipe, because trace_pipe
can't have the TRACE_FILE_ANNOTATE options.

Link: http://lkml.kernel.org/r/1375463803-3085183-1-git-send-email-avagin@openvz.org

Cc: stable@vger.kernel.org # 2.6.30
Signed-off-by: Andrew Vagin
Signed-off-by: Steven Rostedt

Andrew Vagin
2013-08-03 10:28:41 +0800
1fe0135b9 Merge tag 'pm+acpi-3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm ... Browse Code »

Pull ACPI and power management fixes from Rafael Wysocki:

- Revert two cpuidle commits added during the 3.8 development cycle
that turn out to have introduced a significant performance regression
as requested by Jeremy Eder.

- The recent patches that made the freezer less heavy-weight introduced
a regression causing user-space-driven hibernation using the ioctl()
interface to block indefinitely when the hibernate process executes
try_to_freeze(). Fix from Colin Cross addresses this by adding a
process flag to mark the hibernate/suspend process to inform the
freezer that that process should be ignored.

- One of the recent cpufreq reverts uncovered a problem in the core
causing the cpufreq driver module refcount to become negative after a
system suspend-resume cycle. Fix from Rafael J Wysocki.

- The evaluation of the ACPI battery _BIX method has never worked
correctly, because the commit that added support for it forgot to
take the "Revision" field in the return package into account. As a
result, the reading of battery info doesn't work at all on some
systems, which is addressed by a fix from Lan Tianyu.

* tag 'pm+acpi-3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
freezer: set PF_SUSPEND_TASK flag on tasks that call freeze_processes
ACPI / battery: Fix parsing _BIX return value
cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
Revert "cpuidle: Quickly notice prediction failure for repeat mode"
Revert "cpuidle: Quickly notice prediction failure in general case"

Linus Torvalds
2013-08-03 03:21:32 +0800

02 Aug, 2013

1 commit

c6c2401d8 tracing/uprobes: Fail to unregister if probe event files are in use ... Browse Code »

Uprobes suffer the same problem that kprobes have. There's a race between
writing to the "enable" file and removing the probe. The probe checks for
it being in use and if it is not, goes about deleting the probe and the
event that represents it. But the problem with that is, after it checks
if it is in use it can be enabled, and the deletion of the event (access
to the probe) will fail, as it is in use. But the uprobe will still be
deleted. This is a problem as the event can reference the uprobe that
was deleted.

The fix is to remove the event first, and check to make sure the event
removal succeeds. Then it is safe to remove the probe.

When the event exists, either ftrace or perf can enable the probe and
prevent the event from being removed.

Link: http://lkml.kernel.org/r/20130704034038.991525256@goodmis.org

Acked-by: Oleg Nesterov
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2013-08-02 06:25:50 +0800

01 Aug, 2013

8 commits

2865a8fb4 workqueue: copy workqueue_attrs with all fields ... Browse Code »

$echo '0' > /sys/bus/workqueue/devices/xxx/numa
$cat /sys/bus/workqueue/devices/xxx/numa

I got 1. It should be 0, the reason is copy_workqueue_attrs() called
in apply_workqueue_attrs() doesn't copy no_numa field.

Fix it by making copy_workqueue_attrs() copy ->no_numa too. This
would also make get_unbound_pool() set a pool's ->no_numa attribute
according to the workqueue attributes used when the pool was created.
While harmelss, as ->no_numa isn't a pool attribute, this is a bit
confusing. Clear it explicitly.

tj: Updated description and comments a bit.

Signed-off-by: Shaohua Li
Signed-off-by: Tejun Heo
Cc: stable@vger.kernel.org

Shaohua Li
2013-08-01 20:36:40 +0800
40c325926 tracing/kprobes: Fail to unregister if probe event files are in use ... Browse Code »

When a probe is being removed, it cleans up the event files that correspond
to the probe. But there is a race between writing to one of these files
and deleting the probe. This is especially true for the "enable" file.

CPU 0 CPU 1
----- -----

fd = open("enable",O_WRONLY);

probes_open()
release_all_trace_probes()
unregister_trace_probe()
if (trace_probe_is_enabled(tp))
return -EBUSY

write(fd, "1", 1)
__ftrace_set_clr_event()
call->class->reg()
(kprobe_register)
enable_trace_probe(tp)

__unregister_trace_probe(tp);
list_del(&tp->list)
unregister_probe_event(tp) class->unreg
(kprobe_register)
disable_trace_probe(tp) ] probes_open+0x3b/0xa7
PGD 7808a067 PUD 0
Oops: 0000 [#1] PREEMPT SMP
Dumping ftrace buffer:
---------------------------------
Modules linked in: ipt_MASQUERADE sunrpc ip6t_REJECT nf_conntrack_ipv6
CPU: 1 PID: 2070 Comm: test-kprobe-rem Not tainted 3.11.0-rc3-test+ #47
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
task: ffff880077756440 ti: ffff880076e52000 task.ti: ffff880076e52000
RIP: 0010:[] [] probes_open+0x3b/0xa7
RSP: 0018:ffff880076e53c38 EFLAGS: 00010203
RAX: 0000000500000001 RBX: ffff88007844f440 RCX: 0000000000000003
RDX: 0000000000000003 RSI: 0000000000000003 RDI: ffff880076e52000
RBP: ffff880076e53c58 R08: ffff880076e53bd8 R09: 0000000000000000
R10: ffff880077756440 R11: 0000000000000006 R12: ffffffff810dee35
R13: ffff880079250418 R14: 0000000000000000 R15: ffff88007844f450
FS: 00007f87a276f700(0000) GS:ffff88007d480000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000005000000f9 CR3: 0000000077262000 CR4: 00000000000007e0
Stack:
ffff880076e53c58 ffffffff81219ea0 ffff88007844f440 ffffffff810dee35
ffff880076e53ca8 ffffffff81130f78 ffff8800772986c0 ffff8800796f93a0
ffffffff81d1b5d8 ffff880076e53e04 0000000000000000 ffff88007844f440
Call Trace:
[] ? security_file_open+0x2c/0x30
[] ? unregister_trace_probe+0x4b/0x4b
[] do_dentry_open+0x162/0x226
[] finish_open+0x46/0x54
[] do_last+0x7f6/0x996
[] ? inode_permission+0x42/0x44
[] path_openat+0x232/0x496
[] do_filp_open+0x3a/0x8a
[] ? __alloc_fd+0x168/0x17a
[] do_sys_open+0x70/0x102
[] ? trace_hardirqs_on_caller+0x160/0x197
[] SyS_open+0x1e/0x20
[] system_call_fastpath+0x16/0x1b
Code: e5 41 54 53 48 89 f3 48 83 ec 10 48 23 56 78 48 39 c2 75 6c 31 f6 48 c7
RIP [] probes_open+0x3b/0xa7
RSP
CR2: 00000005000000f9
---[ end trace 35f17d68fc569897 ]---

The unregister_trace_probe() must be done first, and if it fails it must
fail the removal of the kprobe.

Several changes have already been made by Oleg Nesterov and Masami Hiramatsu
to allow moving the unregister_probe_event() before the removal of
the probe and exit the function if it fails. This prevents the tp
structure from being used after it is freed.

Link: http://lkml.kernel.org/r/20130704034038.819592356@goodmis.org

Acked-by: Masami Hiramatsu
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2013-08-01 10:25:46 +0800
19788a900 Merge branch 'akpm' (patches from Andrew Morton) ... Browse Code »

Merge more patches from Andrew Morton:
"A bunch of fixes.

Plus Joe's printk move and rework. It's not a -rc3 thing but now
would be a nice time to offload it, while things are quiet. I've been
sitting on it all for a couple of weeks, no issues"

* emailed patches from Andrew Morton :
vmpressure: make sure there are no events queued after memcg is offlined
vmpressure: do not check for pending work to prevent from new work
vmpressure: change vmpressure::sr_lock to spinlock
printk: rename struct log to struct printk_log
printk: use pointer for console_cmdline indexing
printk: move braille console support into separate braille.[ch] files
printk: add console_cmdline.h
printk: move to separate directory for easier modification
drivers/rtc/rtc-twl.c: fix: rtcX/wakealarm attribute isn't created
mm: zbud: fix condition check on allocation size
thp, mm: avoid PageUnevictable on active/inactive lru lists
mm/swap.c: clear PageActive before adding pages onto unevictable list
arch/x86/platform/ce4100/ce4100.c: include reboot.h
mm: sched: numa: fix NUMA balancing when !SCHED_DEBUG
rapidio: fix use after free in rio_unregister_scan()
.gitignore: ignore *.lz4 files
MAINTAINERS: dynamic debug: Jason's not there...
dmi_scan: add comments on dmi_present() and the loop in dmi_scan_machine()
ocfs2/refcounttree: add the missing NULL check of the return value of find_or_create_page()
mm: mempolicy: fix mbind_range() && vma_adjust() interaction

Linus Torvalds
2013-08-01 08:52:04 +0800
62e32ac35 printk: rename struct log to struct printk_log ... Browse Code »

Rename the struct to enable moving portions of
printk.c to separate files.

The rename changes output of /proc/vmcoreinfo.

Signed-off-by: Joe Perches
Cc: Samuel Thibault
Cc: Ming Lei
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2013-08-01 05:41:03 +0800
23475408c printk: use pointer for console_cmdline indexing ... Browse Code »

Make the code a bit more compact by always using a pointer for the active
console_cmdline.

Move overly indented code to correct indent level.

Signed-off-by: Joe Perches
Cc: Samuel Thibault
Cc: Ming Lei
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2013-08-01 05:41:03 +0800
bbeddf52a printk: move braille console support into separate braille.[ch] files ... Browse Code »

Create files with prototypes and static inlines for braille support. Make
braille_console functions return 1 on success.

Corrected CONFIG_A11Y_BRAILLE_CONSOLE=n _braille_console_setup
return value to NULL.

Signed-off-by: Joe Perches
Reviewed-by: Samuel Thibault
Cc: Ming Lei
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2013-08-01 05:41:03 +0800
d197c43d0 printk: add console_cmdline.h ... Browse Code »

Add an include file for the console_cmdline struct so that the braille
console driver can be separated.

Signed-off-by: Joe Perches
Cc: Samuel Thibault
Cc: Ming Lei
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2013-08-01 05:41:03 +0800
b9ee979e9 printk: move to separate directory for easier modification ... Browse Code »

Make it easier to break up printk into bite-sized chunks.

Remove printk path/filename from comment.

Signed-off-by: Joe Perches
Cc: Samuel Thibault
Cc: Ming Lei
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Joe Perches
2013-08-01 05:41:03 +0800