12 Jan, 2015

3 commits

  • Pull scheduler fixes from Ingo Molnar:
    "Misc fixes: group scheduling corner case fix, two deadline scheduler
    fixes, effective_load() overflow fix, nested sleep fix, 6144 CPUs
    system fix"

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/fair: Fix RCU stall upon -ENOMEM in sched_create_group()
    sched/deadline: Avoid double-accounting in case of missed deadlines
    sched/deadline: Fix migration of SCHED_DEADLINE tasks
    sched: Fix odd values in effective_load() calculations
    sched, fanotify: Deal with nested sleeps
    sched: Fix KMALLOC_MAX_SIZE overflow during cpumask allocation

    Linus Torvalds
     
  • Pull perf fixes from Ingo Molnar:
    "Mostly tooling fixes, but also some kernel side fixes: uncore PMU
    driver fix, user regs sampling fix and an instruction decoder fix that
    unbreaks PEBS precise sampling"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf/x86/uncore/hsw-ep: Handle systems with only two SBOXes
    perf/x86_64: Improve user regs sampling
    perf: Move task_pt_regs sampling into arch code
    x86: Fix off-by-one in instruction decoder
    perf hists browser: Fix segfault when showing callchain
    perf callchain: Free callchains when hist entries are deleted
    perf hists: Fix children sort key behavior
    perf diff: Fix to sort by baseline field by default
    perf list: Fix --raw-dump option
    perf probe: Fix crash in dwarf_getcfi_elf
    perf probe: Fix to fall back to find probe point in symbols
    perf callchain: Append callchains only when requested
    perf ui/tui: Print backtrace symbols when segfault occurs
    perf report: Show progress bar for output resorting

    Linus Torvalds
     
  • Pull locking fixes from Ingo Molnar:
    "A liblockdep fix and a mutex_unlock() mutex-debugging fix"

    * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    mutex: Always clear owner field upon mutex_unlock()
    tools/liblockdep: Fix debug_check thinko in mutex destroy

    Linus Torvalds
     

10 Jan, 2015

1 commit

  • Pull kgdb/kdb fixes from Jason Wessel:
    "These have been around since 3.17 and in kgdb-next for the last 9
    weeks and some will go back to -stable.

    Summary of changes:

    Cleanups
    - kdb: Remove unused command flags, repeat flags and KDB_REPEAT_NONE

    Fixes
    - kgdb/kdb: Allow access on a single core, if a CPU round up is
    deemed impossible, which will allow inspection of the now "trashed"
    kernel
    - kdb: Add enable mask for the command groups
    - kdb: access controls to restrict sensitive commands"

    * tag 'for_linus-3.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/kgdb:
    kernel/debug/debug_core.c: Logging clean-up
    kgdb: timeout if secondary CPUs ignore the roundup
    kdb: Allow access to sensitive commands to be restricted by default
    kdb: Add enable mask for groups of commands
    kdb: Categorize kdb commands (similar to SysRq categorization)
    kdb: Remove KDB_REPEAT_NONE flag
    kdb: Use KDB_REPEAT_* values as flags
    kdb: Rename kdb_register_repeat() to kdb_register_flags()
    kdb: Rename kdb_repeat_t to kdb_cmdflags_t, cmd_repeat to cmd_flags
    kdb: Remove currently unused kdbtab_t->cmd_flags

    Linus Torvalds
     

09 Jan, 2015

7 commits

  • Currently if DEBUG_MUTEXES is enabled, the mutex->owner field is only
    cleared iff debug_locks is active. This exposes a race to other users of
    the field where the mutex->owner may be still set to a stale value,
    potentially upsetting mutex_spin_on_owner() among others.

    References: https://bugs.freedesktop.org/show_bug.cgi?id=87955
    Signed-off-by: Chris Wilson
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Davidlohr Bueso
    Cc: Daniel Vetter
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1420540175-30204-1-git-send-email-chris@chris-wilson.co.uk
    Signed-off-by: Ingo Molnar

    Chris Wilson
     
  • When alloc_fair_sched_group() in sched_create_group() fails,
    free_sched_group() is called, and free_fair_sched_group() is called by
    free_sched_group(). Since destroy_cfs_bandwidth() is called by
    free_fair_sched_group() without calling init_cfs_bandwidth(),
    RCU stall occurs at hrtimer_cancel():

    INFO: rcu_sched self-detected stall on CPU { 1} (t=60000 jiffies g=13074 c=13073 q=0)
    Task dump for CPU 1:
    (fprintd) R running task 0 6249 1 0x00000088
    ...
    Call Trace:
    [] sched_show_task+0xa8/0x110
    [] dump_cpu_task+0x3d/0x50
    [] rcu_dump_cpu_stacks+0x90/0xd0
    [] rcu_check_callbacks+0x491/0x700
    [] update_process_times+0x4b/0x80
    [] tick_sched_handle.isra.20+0x36/0x50
    [] tick_sched_timer+0x42/0x70
    [] __run_hrtimer+0x69/0x1a0
    [] ? tick_sched_handle.isra.20+0x50/0x50
    [] hrtimer_interrupt+0xef/0x230
    [] local_apic_timer_interrupt+0x3b/0x70
    [] smp_apic_timer_interrupt+0x45/0x60
    [] apic_timer_interrupt+0x6d/0x80
    [] ? lock_hrtimer_base.isra.23+0x18/0x50
    [] ? __kmalloc+0x211/0x230
    [] hrtimer_try_to_cancel+0x22/0xd0
    [] ? __kmalloc+0x211/0x230
    [] hrtimer_cancel+0x22/0x30
    [] free_fair_sched_group+0x25/0xd0
    [] free_sched_group+0x16/0x40
    [] sched_create_group+0x4b/0x80
    [] sched_autogroup_create_attach+0x43/0x1c0
    [] sys_setsid+0x7c/0x110
    [] system_call_fastpath+0x12/0x17

    Check whether init_cfs_bandwidth() was called before calling
    destroy_cfs_bandwidth().

    Signed-off-by: Tetsuo Handa
    [ Move the check into destroy_cfs_bandwidth() to aid compilability. ]
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Paul Turner
    Cc: Ben Segall
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/201412252210.GCC30204.SOMVFFOtQJFLOH@I-love.SAKURA.ne.jp
    Signed-off-by: Ingo Molnar

    Tetsuo Handa
     
  • The dl_runtime_exceeded() function is supposed to ckeck if
    a SCHED_DEADLINE task must be throttled, by checking if its
    current runtime is
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Juri Lelli
    Cc:
    Cc: Dario Faggioli
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1418813432-20797-3-git-send-email-luca.abeni@unitn.it
    Signed-off-by: Ingo Molnar

    Luca Abeni
     
  • According to global EDF, tasks should be migrated between runqueues
    without checking if their scheduling deadlines and runtimes are valid.
    However, SCHED_DEADLINE currently performs such a check:
    a migration happens doing:

    deactivate_task(rq, next_task, 0);
    set_task_cpu(next_task, later_rq->cpu);
    activate_task(later_rq, next_task, 0);

    which ends up calling dequeue_task_dl(), setting the new CPU, and then
    calling enqueue_task_dl().

    enqueue_task_dl() then calls enqueue_dl_entity(), which calls
    update_dl_entity(), which can modify scheduling deadline and runtime,
    breaking global EDF scheduling.

    As a result, some of the properties of global EDF are not respected:
    for example, a taskset {(30, 80), (40, 80), (120, 170)} scheduled on
    two cores can have unbounded response times for the third task even
    if 30/80+40/80+120/170 = 1.5809 < 2

    This can be fixed by invoking update_dl_entity() only in case of
    wakeup, or if this is a new SCHED_DEADLINE task.

    Signed-off-by: Luca Abeni
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Juri Lelli
    Cc:
    Cc: Dario Faggioli
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1418813432-20797-2-git-send-email-luca.abeni@unitn.it
    Signed-off-by: Ingo Molnar

    Luca Abeni
     
  • In effective_load, we have (long w * unsigned long tg->shares) / long W,
    when w is negative, it is cast to unsigned long and hence the product is
    insanely large. Fix this by casting tg->shares to long.

    Reported-by: Sasha Levin
    Signed-off-by: Yuyang Du
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Dave Jones
    Cc: Andrey Ryabinin
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20141219002956.GA25405@intel.com
    Signed-off-by: Ingo Molnar

    Yuyang Du
     
  • On x86_64, at least, task_pt_regs may be only partially initialized
    in many contexts, so x86_64 should not use it without extra care
    from interrupt context, let alone NMI context.

    This will allow x86_64 to override the logic and will supply some
    scratch space to use to make a cleaner copy of user regs.

    Tested-by: Jiri Olsa
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Stephane Eranian
    Cc: chenggang.qcg@taobao.com
    Cc: Wu Fengguang
    Cc: Namhyung Kim
    Cc: Mike Galbraith
    Cc: Arjan van de Ven
    Cc: David Ahern
    Cc: Arnaldo Carvalho de Melo
    Cc: Catalin Marinas
    Cc: Jean Pihet
    Cc: Linus Torvalds
    Cc: Mark Salter
    Cc: Russell King
    Cc: Will Deacon
    Cc: linux-arm-kernel@lists.infradead.org
    Link: http://lkml.kernel.org/r/e431cd4c18c2e1c44c774f10758527fb2d1025c4.1420396372.git.luto@amacapital.net
    Signed-off-by: Ingo Molnar

    Andy Lutomirski
     
  • wait_consider_task() checks EXIT_ZOMBIE after EXIT_DEAD/EXIT_TRACE and
    both checks can fail if we race with EXIT_ZOMBIE -> EXIT_DEAD/EXIT_TRACE
    change in between, gcc needs to reload p->exit_state after
    security_task_wait(). In this case ->notask_error will be wrongly
    cleared and do_wait() can hang forever if it was the last eligible
    child.

    Many thanks to Arne who carefully investigated the problem.

    Note: this bug is very old but it was pure theoretical until commit
    b3ab03160dfa ("wait: completely ignore the EXIT_DEAD tasks"). Before
    this commit "-O2" was probably enough to guarantee that compiler won't
    read ->exit_state twice.

    Signed-off-by: Oleg Nesterov
    Reported-by: Arne Goedeke
    Tested-by: Arne Goedeke
    Cc: [3.15+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

01 Jan, 2015

1 commit

  • Pull audit fix from Paul Moore:
    "One audit patch to resolve a panic/oops when recording filenames in
    the audit log, see the mail archive link below.

    The fix isn't as nice as I would like, as it involves an allocate/copy
    of the filename, but it solves the problem and the overhead should
    only affect users who have configured audit rules involving file
    names.

    We'll revisit this issue with future kernels in an attempt to make
    this suck less, but in the meantime I think this fix should go into
    the next release of v3.19-rcX.

    [ https://marc.info/?t=141986927600001&r=1&w=2 ]"

    * 'upstream' of git://git.infradead.org/users/pcmoore/audit:
    audit: create private file name copies when auditing inodes

    Linus Torvalds
     

31 Dec, 2014

1 commit

  • Pull networking fixes from David Miller:

    1) Fix double SKB free in bluetooth 6lowpan layer, from Jukka Rissanen.

    2) Fix receive checksum handling in enic driver, from Govindarajulu
    Varadarajan.

    3) Fix NAPI poll list corruption in virtio_net and caif_virtio, from
    Herbert Xu. Also, add code to detect drivers that have this mistake
    in the future.

    4) Fix doorbell endianness handling in mlx4 driver, from Amir Vadai.

    5) Don't clobber IP6CB() before xfrm6_policy_check() is called in TCP
    input path,f rom Nicolas Dichtel.

    6) Fix MPLS action validation in openvswitch, from Pravin B Shelar.

    7) Fix double SKB free in vxlan driver, also from Pravin.

    8) When we scrub a packet, which happens when we are switching the
    context of the packet (namespace, etc.), we should reset the
    secmark. From Thomas Graf.

    9) ->ndo_gso_check() needs to do more than return true/false, it also
    has to allow the driver to clear netdev feature bits in order for
    the caller to be able to proceed properly. From Jesse Gross.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits)
    genetlink: A genl_bind() to an out-of-range multicast group should not WARN().
    netlink/genetlink: pass network namespace to bind/unbind
    ne2k-pci: Add pci_disable_device in error handling
    bonding: change error message to debug message in __bond_release_one()
    genetlink: pass multicast bind/unbind to families
    netlink: call unbind when releasing socket
    netlink: update listeners directly when removing socket
    genetlink: pass only network namespace to genl_has_listeners()
    netlink: rename netlink_unbind() to netlink_undo_bind()
    net: Generalize ndo_gso_check to ndo_features_check
    net: incorrect use of init_completion fixup
    neigh: remove next ptr from struct neigh_table
    net: xilinx: Remove unnecessary temac_property in the driver
    net: phy: micrel: use generic config_init for KSZ8021/KSZ8031
    net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding
    openvswitch: fix odd_ptr_err.cocci warnings
    Bluetooth: Fix accepting connections when not using mgmt
    Bluetooth: Fix controller configuration with HCI_QUIRK_INVALID_BDADDR
    brcmfmac: Do not crash if platform data is not populated
    ipw2200: select CFG80211_WEXT
    ...

    Linus Torvalds
     

30 Dec, 2014

1 commit

  • Unfortunately, while commit 4a928436 ("audit: correctly record file
    names with different path name types") fixed a problem where we were
    not recording filenames, it created a new problem by attempting to use
    these file names after they had been freed. This patch resolves the
    issue by creating a copy of the filename which the audit subsystem
    frees after it is done with the string.

    At some point it would be nice to resolve this issue with refcounts,
    or something similar, instead of having to allocate/copy strings, but
    that is almost surely beyond the scope of a -rcX patch so we'll defer
    that for later. On the plus side, only audit users should be impacted
    by the string copying.

    Reported-by: Toralf Foerster
    Signed-off-by: Paul Moore

    Paul Moore
     

27 Dec, 2014

1 commit

  • Netlink families can exist in multiple namespaces, and for the most
    part multicast subscriptions are per network namespace. Thus it only
    makes sense to have bind/unbind notifications per network namespace.

    To achieve this, pass the network namespace of a given client socket
    to the bind/unbind functions.

    Also do this in generic netlink, and there also make sure that any
    bind for multicast groups that only exist in init_net is rejected.
    This isn't really a problem if it is accepted since a client in a
    different namespace will never receive any notifications from such
    a group, but it can confuse the family if not rejected (it's also
    possible to silently (without telling the family) accept it, but it
    would also have to be ignored on unbind so families that take any
    kind of action on bind/unbind won't do unnecessary work for invalid
    clients like that.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

24 Dec, 2014

2 commits

  • Pull audit fixes from Paul Moore:
    "Four patches to fix various problems with the audit subsystem, all are
    fairly small and straightforward.

    One patch fixes a problem where we weren't using the correct gfp
    allocation flags (GFP_KERNEL regardless of context, oops), one patch
    fixes a problem with old userspace tools (this was broken for a
    while), one patch fixes a problem where we weren't recording pathnames
    correctly, and one fixes a problem with PID based filters.

    In general I don't think there is anything controversial with this
    patchset, and it fixes some rather unfortunate bugs; the allocation
    flag one can be particularly scary looking for users"

    * 'upstream' of git://git.infradead.org/users/pcmoore/audit:
    audit: restore AUDIT_LOGINUID unset ABI
    audit: correctly record file names with different path name types
    audit: use supplied gfp_mask from audit_buffer in kauditd_send_multicast_skb
    audit: don't attempt to lookup PIDs when changing PID filtering audit rules

    Linus Torvalds
     
  • A regression was caused by commit 780a7654cee8:
    audit: Make testing for a valid loginuid explicit.
    (which in turn attempted to fix a regression caused by e1760bd)

    When audit_krule_to_data() fills in the rules to get a listing, there was a
    missing clause to convert back from AUDIT_LOGINUID_SET to AUDIT_LOGINUID.

    This broke userspace by not returning the same information that was sent and
    expected.

    The rule:
    auditctl -a exit,never -F auid=-1
    gives:
    auditctl -l
    LIST_RULES: exit,never f24=0 syscall=all
    when it should give:
    LIST_RULES: exit,never auid=-1 (0xffffffff) syscall=all

    Tag it so that it is reported the same way it was set. Create a new
    private flags audit_krule field (pflags) to store it that won't interact with
    the public one from the API.

    Cc: stable@vger.kernel.org # v3.10-rc1+
    Signed-off-by: Richard Guy Briggs
    Signed-off-by: Paul Moore

    Richard Guy Briggs
     

23 Dec, 2014

2 commits

  • When allocating space for load_balance_mask, in sched_init, when
    CPUMASK_OFFSTACK is set, we've managed to spill over
    KMALLOC_MAX_SIZE on our 6144 core machine. The patch below
    breaks up the allocations so that they don't overflow the max
    alloc size. It also allocates the masks on the the node from
    which they'll most commonly be accessed, to minimize remote
    accesses on NUMA machines.

    Suggested-by: George Beshers
    Signed-off-by: Alex Thorlton
    Cc: George Beshers
    Cc: Russ Anderson
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/1418928270-148543-1-git-send-email-athorlton@sgi.com
    Signed-off-by: Ingo Molnar

    Alex Thorlton
     
  • There is a problem with the audit system when multiple audit records
    are created for the same path, each with a different path name type.
    The root cause of the problem is in __audit_inode() when an exact
    match (both the path name and path name type) is not found for a
    path name record; the existing code creates a new path name record,
    but it never sets the path name in this record, leaving it NULL.
    This patch corrects this problem by assigning the path name to these
    newly created records.

    There are many ways to reproduce this problem, but one of the
    easiest is the following (assuming auditd is running):

    # mkdir /root/tmp/test
    # touch /root/tmp/test/567
    # auditctl -a always,exit -F dir=/root/tmp/test
    # touch /root/tmp/test/567

    Afterwards, or while the commands above are running, check the audit
    log and pay special attention to the PATH records. A faulty kernel
    will display something like the following for the file creation:

    type=SYSCALL msg=audit(1416957442.025:93): arch=c000003e syscall=2
    success=yes exit=3 ... comm="touch" exe="/usr/bin/touch"
    type=CWD msg=audit(1416957442.025:93): cwd="/root/tmp"
    type=PATH msg=audit(1416957442.025:93): item=0 name="test/"
    inode=401409 ... nametype=PARENT
    type=PATH msg=audit(1416957442.025:93): item=1 name=(null)
    inode=393804 ... nametype=NORMAL
    type=PATH msg=audit(1416957442.025:93): item=2 name=(null)
    inode=393804 ... nametype=NORMAL

    While a patched kernel will show the following:

    type=SYSCALL msg=audit(1416955786.566:89): arch=c000003e syscall=2
    success=yes exit=3 ... comm="touch" exe="/usr/bin/touch"
    type=CWD msg=audit(1416955786.566:89): cwd="/root/tmp"
    type=PATH msg=audit(1416955786.566:89): item=0 name="test/"
    inode=401409 ... nametype=PARENT
    type=PATH msg=audit(1416955786.566:89): item=1 name="test/567"
    inode=393804 ... nametype=NORMAL

    This issue was brought up by a number of people, but special credit
    should go to hujianyang@huawei.com for reporting the problem along
    with an explanation of the problem and a patch. While the original
    patch did have some problems (see the archive link below), it did
    demonstrate the problem and helped kickstart the fix presented here.

    * https://lkml.org/lkml/2014/9/5/66

    Reported-by: hujianyang
    Signed-off-by: Paul Moore
    Acked-by: Richard Guy Briggs

    Paul Moore
     

21 Dec, 2014

1 commit

  • Pull CONFIG_PM_RUNTIME elimination from Rafael Wysocki:
    "This removes the last few uses of CONFIG_PM_RUNTIME introduced
    recently and makes that config option finally go away.

    CONFIG_PM will be available directly from the menu now and also it
    will be selected automatically if CONFIG_SUSPEND or CONFIG_HIBERNATION
    is set"

    * tag 'pm-config-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    PM: Eliminate CONFIG_PM_RUNTIME
    tty: 8250_omap: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    sound: sst-haswell-pcm: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    spi: Replace CONFIG_PM_RUNTIME with CONFIG_PM

    Linus Torvalds
     

20 Dec, 2014

6 commits

  • Eric Paris explains: Since kauditd_send_multicast_skb() gets called in
    audit_log_end(), which can come from any context (aka even a sleeping context)
    GFP_KERNEL can't be used. Since the audit_buffer knows what context it should
    use, pass that down and use that.

    See: https://lkml.org/lkml/2014/12/16/542

    BUG: sleeping function called from invalid context at mm/slab.c:2849
    in_atomic(): 1, irqs_disabled(): 0, pid: 885, name: sulogin
    2 locks held by sulogin/885:
    #0: (&sig->cred_guard_mutex){+.+.+.}, at: [] prepare_bprm_creds+0x28/0x8b
    #1: (tty_files_lock){+.+.+.}, at: [] selinux_bprm_committing_creds+0x55/0x22b
    CPU: 1 PID: 885 Comm: sulogin Not tainted 3.18.0-next-20141216 #30
    Hardware name: Dell Inc. Latitude E6530/07Y85M, BIOS A15 06/20/2014
    ffff880223744f10 ffff88022410f9b8 ffffffff916ba529 0000000000000375
    ffff880223744f10 ffff88022410f9e8 ffffffff91063185 0000000000000006
    0000000000000000 0000000000000000 0000000000000000 ffff88022410fa38
    Call Trace:
    [] dump_stack+0x50/0xa8
    [] ___might_sleep+0x1b6/0x1be
    [] __might_sleep+0x119/0x128
    [] cache_alloc_debugcheck_before.isra.45+0x1d/0x1f
    [] kmem_cache_alloc+0x43/0x1c9
    [] __alloc_skb+0x42/0x1a3
    [] skb_copy+0x3e/0xa3
    [] audit_log_end+0x83/0x100
    [] ? avc_audit_pre_callback+0x103/0x103
    [] common_lsm_audit+0x441/0x450
    [] slow_avc_audit+0x63/0x67
    [] avc_has_perm+0xca/0xe3
    [] inode_has_perm+0x5a/0x65
    [] selinux_bprm_committing_creds+0x98/0x22b
    [] security_bprm_committing_creds+0xe/0x10
    [] install_exec_creds+0xe/0x79
    [] load_elf_binary+0xe36/0x10d7
    [] search_binary_handler+0x81/0x18c
    [] do_execveat_common.isra.31+0x4e3/0x7b7
    [] do_execve+0x1f/0x21
    [] SyS_execve+0x25/0x29
    [] stub_execve+0x69/0xa0

    Cc: stable@vger.kernel.org #v3.16-rc1
    Reported-by: Valdis Kletnieks
    Signed-off-by: Richard Guy Briggs
    Tested-by: Valdis Kletnieks
    Signed-off-by: Paul Moore

    Richard Guy Briggs
     
  • Commit f1dc4867 ("audit: anchor all pid references in the initial pid
    namespace") introduced a find_vpid() call when adding/removing audit
    rules with PID/PPID filters; unfortunately this is problematic as
    find_vpid() only works if there is a task with the associated PID
    alive on the system. The following commands demonstrate a simple
    reproducer.

    # auditctl -D
    # auditctl -l
    # autrace /bin/true
    # auditctl -l

    This patch resolves the problem by simply using the PID provided by
    the user without any additional validation, e.g. no calls to check to
    see if the task/PID exists.

    Cc: stable@vger.kernel.org # 3.15
    Cc: Richard Guy Briggs
    Signed-off-by: Paul Moore
    Acked-by: Eric Paris
    Reviewed-by: Richard Guy Briggs

    Paul Moore
     
  • Having switched over all of the users of CONFIG_PM_RUNTIME to use
    CONFIG_PM directly, turn the latter into a user-selectable option
    and drop the former entirely from the tree.

    Signed-off-by: Rafael J. Wysocki
    Reviewed-by: Ulf Hansson
    Acked-by: Kevin Hilman

    Rafael J. Wysocki
     
  • Pull NOHZ update from Thomas Gleixner:
    "Remove the call into the nohz idle code from the fake 'idle' thread in
    the powerclamp driver along with the export of those functions which
    was smuggeled in via the thermal tree. People have tried to hack
    around it in the nohz core code, but it just violates all rightful
    assumptions of that code about the only valid calling context (i.e.
    the proper idle task).

    The powerclamp trainwreck will still work, it just wont get the
    benefit of long idle sleeps"

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    tick/powerclamp: Remove tick_nohz_idle abuse

    Linus Torvalds
     
  • Pull irq core fix from Thomas Gleixner:
    "A single fix plugging a long standing race between proc/stat and
    proc/interrupts access and freeing of interrupt descriptors"

    * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    genirq: Prevent proc race against freeing of irq descriptors

    Linus Torvalds
     
  • Pull perf fixes and cleanups from Ingo Molnar:
    "A kernel fix plus mostly tooling fixes, but also some tooling
    restructuring and cleanups"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
    perf: Fix building warning on ARM 32
    perf symbols: Fix use after free in filename__read_build_id
    perf evlist: Use roundup_pow_of_two
    tools: Adopt roundup_pow_of_two
    perf tools: Make the mmap length autotuning more robust
    tools: Adopt rounddown_pow_of_two and deps
    tools: Adopt fls_long and deps
    tools: Move bitops.h from tools/perf/util to tools/
    tools: Introduce asm-generic/bitops.h
    tools lib: Move asm-generic/bitops/find.h code to tools/include and tools/lib
    tools: Whitespace prep patches for moving bitops.h
    tools: Move code originally from asm-generic/atomic.h into tools/include/asm-generic/
    tools: Move code originally from linux/log2.h to tools/include/linux/
    tools: Move __ffs implementation to tools/include/asm-generic/bitops/__ffs.h
    perf evlist: Do not use hard coded value for a mmap_pages default
    perf trace: Let the perf_evlist__mmap autosize the number of pages to use
    perf evlist: Improve the strerror_mmap method
    perf evlist: Clarify sterror_mmap variable names
    perf evlist: Fixup brown paper bag on "hint" for --mmap-pages cmdline arg
    perf trace: Provide a better explanation when mmap fails
    ...

    Linus Torvalds
     

19 Dec, 2014

3 commits

  • commit 4dbd27711cd9 "tick: export nohz tick idle symbols for module
    use" was merged via the thermal tree without an explicit ack from the
    relevant maintainers.

    The exports are abused by the intel powerclamp driver which implements
    a fake idle state from a sched FIFO task. This causes all kinds of
    wreckage in the NOHZ core code which rightfully assumes that
    tick_nohz_idle_enter/exit() are only called from the idle task itself.

    Recent changes in the NOHZ core lead to a failure of the powerclamp
    driver and now people try to hack completely broken and backwards
    workarounds into the NOHZ core code. This is completely unacceptable
    and just papers over the real problem. There are way more subtle
    issues lurking around the corner.

    The real solution is to fix the powerclamp driver by rewriting it with
    a sane concept, but that's beyond the scope of this.

    So the only solution for now is to remove the calls into the core NOHZ
    code from the powerclamp trainwreck along with the exports.

    Fixes: d6d71ee4a14a "PM: Introduce Intel PowerClamp Driver"
    Signed-off-by: Thomas Gleixner
    Cc: Preeti U Murthy
    Cc: Viresh Kumar
    Cc: Frederic Weisbecker
    Cc: Fengguang Wu
    Cc: Frederic Weisbecker
    Cc: Pan Jacob jun
    Cc: LKP
    Cc: Peter Zijlstra
    Cc: Zhang Rui
    Cc: stable@vger.kernel.org
    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412181110110.17382@nanos
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Pull module updates from Rusty Russell:
    "The exciting thing here is the getting rid of stop_machine on module
    removal. This is possible by using a simple atomic_t for the counter,
    rather than our fancy per-cpu counter: it turns out that no one is
    doing a module increment per net packet, so the slowdown should be in
    the noise"

    * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
    param: do not set store func without write perm
    params: cleanup sysfs allocation
    kernel:module Fix coding style errors and warnings.
    module: Remove stop_machine from module unloading
    module: Replace module_ref with atomic_t refcnt
    lib/bug: Use RCU list ops for module_bug_list
    module: Unlink module with RCU synchronizing instead of stop_machine
    module: Wait for RCU synchronizing before releasing a module

    Linus Torvalds
     
  • Pull more ACPI and power management updates from Rafael Wysocki:
    "These are regression fixes (leds-gpio, ACPI backlight driver,
    operating performance points library, ACPI device enumeration
    messages, cpupower tool), other bug fixes (ACPI EC driver, ACPI device
    PM), some cleanups in the operating performance points (OPP)
    framework, continuation of CONFIG_PM_RUNTIME elimination, a couple of
    minor intel_pstate driver changes, a new MAINTAINERS entry for it and
    an ACPI fan driver change needed for better support of thermal
    management in user space.

    Specifics:

    - Fix a regression in leds-gpio introduced by a recent commit that
    inadvertently changed the name of one of the properties used by the
    driver (Fabio Estevam).

    - Fix a regression in the ACPI backlight driver introduced by a
    recent fix that missed one special case that had to be taken into
    account (Aaron Lu).

    - Drop the level of some new kernel messages from the ACPI core
    introduced by a recent commit to KERN_DEBUG which they should have
    used from the start and drop some other unuseful KERN_ERR messages
    printed by ACPI (Rafael J Wysocki).

    - Revert an incorrect commit modifying the cpupower tool (Prarit
    Bhargava).

    - Fix two regressions introduced by recent commits in the OPP library
    and clean up some existing minor issues in that code (Viresh
    Kumar).

    - Continue to replace CONFIG_PM_RUNTIME with CONFIG_PM throughout the
    tree (or drop it where that can be done) in order to make it
    possible to eliminate CONFIG_PM_RUNTIME (Rafael J Wysocki, Ulf
    Hansson, Ludovic Desroches).

    There will be one more "CONFIG_PM_RUNTIME removal" batch after this
    one, because some new uses of it have been introduced during the
    current merge window, but that should be sufficient to finally get
    rid of it.

    - Make the ACPI EC driver more robust against race conditions related
    to GPE handler installation failures (Lv Zheng).

    - Prevent the ACPI device PM core code from attempting to disable
    GPEs that it has not enabled which confuses ACPICA and makes it
    report errors unnecessarily (Rafael J Wysocki).

    - Add a "force" command line switch to the intel_pstate driver to
    make it possible to override the blacklisting of some systems in
    that driver if needed (Ethan Zhao).

    - Improve intel_pstate code documentation and add a MAINTAINERS entry
    for it (Kristen Carlson Accardi).

    - Make the ACPI fan driver create cooling device interfaces witn
    names that reflect the IDs of the ACPI device objects they are
    associated with, except for "generic" ACPI fans (PNP ID "PNP0C0B").

    That's necessary for user space thermal management tools to be able
    to connect the fans with the parts of the system they are supposed
    to be cooling properly. From Srinivas Pandruvada"

    * tag 'pm+acpi-3.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
    MAINTAINERS: add entry for intel_pstate
    ACPI / video: update the skip case for acpi_video_device_in_dod()
    power / PM: Eliminate CONFIG_PM_RUNTIME
    NFC / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    SCSI / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    ACPI / EC: Fix unexpected ec_remove_handlers() invocations
    Revert "tools: cpupower: fix return checks for sysfs_get_idlestate_count()"
    tracing / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    x86 / PM: Replace CONFIG_PM_RUNTIME in io_apic.c
    PM: Remove the SET_PM_RUNTIME_PM_OPS() macro
    mmc: atmel-mci: use SET_RUNTIME_PM_OPS() macro
    PM / Kconfig: Replace PM_RUNTIME with PM in dependencies
    ARM / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    sound / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    phy / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    video / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    tty / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    spi: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    ACPI / PM: Do not disable wakeup GPEs that have not been enabled
    ACPI / utils: Drop error messages from acpi_evaluate_reference()
    ...

    Linus Torvalds
     

18 Dec, 2014

2 commits

  • When a module_param is defined without DAC write permissions, it can
    still be changed at runtime and updated. Drivers using a 0444 permission
    may be surprised that these values can still be changed.

    For drivers that want to allow updates, any S_IW* flag will set the
    "store" function as before. Drivers without S_IW* flags will have the
    "store" function unset, unforcing a read-only value. Drivers that wish
    neither "store" nor "get" can continue to use "0" for perms to stay out
    of sysfs entirely.

    Old behavior:
    # cd /sys/module/snd/parameters
    # ls -l
    total 0
    -r--r--r-- 1 root root 4096 Dec 11 13:55 cards_limit
    -r--r--r-- 1 root root 4096 Dec 11 13:55 major
    -r--r--r-- 1 root root 4096 Dec 11 13:55 slots
    # cat major
    116
    # echo -1 > major
    -bash: major: Permission denied
    # chmod u+w major
    # echo -1 > major
    # cat major
    -1

    New behavior:
    ...
    # chmod u+w major
    # echo -1 > major
    -bash: echo: write error: Input/output error

    Signed-off-by: Kees Cook
    Signed-off-by: Rusty Russell

    Kees Cook
     
  • Pull user namespace related fixes from Eric Biederman:
    "As these are bug fixes almost all of thes changes are marked for
    backporting to stable.

    The first change (implicitly adding MNT_NODEV on remount) addresses a
    regression that was created when security issues with unprivileged
    remount were closed. I go on to update the remount test to make it
    easy to detect if this issue reoccurs.

    Then there are a handful of mount and umount related fixes.

    Then half of the changes deal with the a recently discovered design
    bug in the permission checks of gid_map. Unix since the beginning has
    allowed setting group permissions on files to less than the user and
    other permissions (aka ---rwx---rwx). As the unix permission checks
    stop as soon as a group matches, and setgroups allows setting groups
    that can not later be dropped, results in a situtation where it is
    possible to legitimately use a group to assign fewer privileges to a
    process. Which means dropping a group can increase a processes
    privileges.

    The fix I have adopted is that gid_map is now no longer writable
    without privilege unless the new file /proc/self/setgroups has been
    set to permanently disable setgroups.

    The bulk of user namespace using applications even the applications
    using applications using user namespaces without privilege remain
    unaffected by this change. Unfortunately this ix breaks a couple user
    space applications, that were relying on the problematic behavior (one
    of which was tools/selftests/mount/unprivileged-remount-test.c).

    To hopefully prevent needing a regression fix on top of my security
    fix I rounded folks who work with the container implementations mostly
    like to be affected and encouraged them to test the changes.

    > So far nothing broke on my libvirt-lxc test bed. :-)
    > Tested with openSUSE 13.2 and libvirt 1.2.9.
    > Tested-by: Richard Weinberger

    > Tested on Fedora20 with libvirt 1.2.11, works fine.
    > Tested-by: Chen Hanxiao

    > Ok, thanks - yes, unprivileged lxc is working fine with your kernels.
    > Just to be sure I was testing the right thing I also tested using
    > my unprivileged nsexec testcases, and they failed on setgroup/setgid
    > as now expected, and succeeded there without your patches.
    > Tested-by: Serge Hallyn

    > I tested this with Sandstorm. It breaks as is and it works if I add
    > the setgroups thing.
    > Tested-by: Andy Lutomirski # breaks things as designed :("

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    userns: Unbreak the unprivileged remount tests
    userns; Correct the comment in map_write
    userns: Allow setting gid_maps without privilege when setgroups is disabled
    userns: Add a knob to disable setgroups on a per user namespace basis
    userns: Rename id_map_mutex to userns_state_mutex
    userns: Only allow the creator of the userns unprivileged mappings
    userns: Check euid no fsuid when establishing an unprivileged uid mapping
    userns: Don't allow unprivileged creation of gid mappings
    userns: Don't allow setgroups until a gid mapping has been setablished
    userns: Document what the invariant required for safe unprivileged mappings.
    groups: Consolidate the setgroups permission checks
    mnt: Clear mnt_expire during pivot_root
    mnt: Carefully set CL_UNPRIVILEGED in clone_mnt
    mnt: Move the clear of MNT_LOCKED from copy_tree to it's callers.
    umount: Do not allow unmounting rootfs.
    umount: Disallow unprivileged mount force
    mnt: Update unprivileged remount test
    mnt: Implicitly add MNT_NODEV on remount when it was implicitly added by mount

    Linus Torvalds
     

17 Dec, 2014

2 commits

  • Pull vfs pile #2 from Al Viro:
    "Next pile (and there'll be one or two more).

    The large piece in this one is getting rid of /proc/*/ns/* weirdness;
    among other things, it allows to (finally) make nameidata completely
    opaque outside of fs/namei.c, making for easier further cleanups in
    there"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    coda_venus_readdir(): use file_inode()
    fs/namei.c: fold link_path_walk() call into path_init()
    path_init(): don't bother with LOOKUP_PARENT in argument
    fs/namei.c: new helper (path_cleanup())
    path_init(): store the "base" pointer to file in nameidata itself
    make default ->i_fop have ->open() fail with ENXIO
    make nameidata completely opaque outside of fs/namei.c
    kill proc_ns completely
    take the targets of /proc/*/ns/* symlinks to separate fs
    bury struct proc_ns in fs/proc
    copy address of proc_ns_ops into ns_common
    new helpers: ns_alloc_inum/ns_free_inum
    make proc_ns_operations work with struct ns_common * instead of void *
    switch the rest of proc_ns_operations to working with &...->ns
    netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns
    make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns
    common object embedded into various struct ....ns

    Linus Torvalds
     
  • Pull tracing updates from Steven Rostedt:
    "As the merge window is still open, and this code was not as complex as
    I thought it might be. I'm pushing this in now.

    This will allow Thomas to debug his irq work for 3.20.

    This adds two new features:

    1) Allow traceopoints to be enabled right after mm_init().

    By passing in the trace_event= kernel command line parameter,
    tracepoints can be enabled at boot up. For debugging things like
    the initialization of interrupts, it is needed to have tracepoints
    enabled very early. People have asked about this before and this
    has been on my todo list. As it can be helpful for Thomas to debug
    his upcoming 3.20 IRQ work, I'm pushing this now. This way he can
    add tracepoints into the IRQ set up and have users enable them when
    things go wrong.

    2) Have the tracepoints printed via printk() (the console) when they
    are triggered.

    If the irq code locks up or reboots the box, having the tracepoint
    output go into the kernel ring buffer is useless for debugging.
    But being able to add the tp_printk kernel command line option
    along with the trace_event= option will have these tracepoints
    printed as they occur, and that can be really useful for debugging
    early lock up or reboot problems.

    This code is not that intrusive and it passed all my tests. Thomas
    tried them out too and it works for his needs.

    Link: http://lkml.kernel.org/r/20141214201609.126831471@goodmis.org"

    * tag 'trace-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Add tp_printk cmdline to have tracepoints go to printk()
    tracing: Move enabling tracepoints to just after rcu_init()

    Linus Torvalds
     

16 Dec, 2014

1 commit

  • Pull drm updates from Dave Airlie:
    "Highlights:

    - AMD KFD driver merge

    This is the AMD HSA interface for exposing a lowlevel interface for
    GPGPU use. They have an open source userspace built on top of this
    interface, and the code looks as good as it was going to get out of
    tree.

    - Initial atomic modesetting work

    The need for an atomic modesetting interface to allow userspace to
    try and send a complete set of modesetting state to the driver has
    arisen, and been suffering from neglect this past year. No more,
    the start of the common code and changes for msm driver to use it
    are in this tree. Ongoing work to get the userspace ioctl finished
    and the code clean will probably wait until next kernel.

    - DisplayID 1.3 and tiled monitor exposed to userspace.

    Tiled monitor property is now exposed for userspace to make use of.

    - Rockchip drm driver merged.

    - imx gpu driver moved out of staging

    Other stuff:

    - core:
    panel - MIPI DSI + new panels.
    expose suggested x/y properties for virtual GPUs

    - i915:
    Initial Skylake (SKL) support
    gen3/4 reset work
    start of dri1/ums removal
    infoframe tracking
    fixes for lots of things.

    - nouveau:
    tegra k1 voltage support
    GM204 modesetting support
    GT21x memory reclocking work

    - radeon:
    CI dpm fixes
    GPUVM improvements
    Initial DPM fan control

    - rcar-du:
    HDMI support added
    removed some support for old boards
    slave encoder driver for Analog Devices adv7511

    - exynos:
    Exynos4415 SoC support

    - msm:
    a4xx gpu support
    atomic helper conversion

    - tegra:
    iommu support
    universal plane support
    ganged-mode DSI support

    - sti:
    HDMI i2c improvements

    - vmwgfx:
    some late fixes.

    - qxl:
    use suggested x/y properties"

    * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (969 commits)
    drm: sti: fix module compilation issue
    drm/i915: save/restore GMBUS freq across suspend/resume on gen4
    drm: sti: correctly cleanup CRTC and planes
    drm: sti: add HQVDP plane
    drm: sti: add cursor plane
    drm: sti: enable auxiliary CRTC
    drm: sti: fix delay in VTG programming
    drm: sti: prepare sti_tvout to support auxiliary crtc
    drm: sti: use drm_crtc_vblank_{on/off} instead of drm_vblank_{on/off}
    drm: sti: fix hdmi avi infoframe
    drm: sti: remove event lock while disabling vblank
    drm: sti: simplify gdp code
    drm: sti: clear all mixer control
    drm: sti: remove gpio for HDMI hot plug detection
    drm: sti: allow to change hdmi ddc i2c adapter
    drm/doc: Document drm_add_modes_noedid() usage
    drm/i915: Remove '& 0xffff' from the mask given to WA_REG()
    drm/i915: Invert the mask and val arguments in wa_add() and WA_REG()
    drm: Zero out DRM object memory upon cleanup
    drm/i915/bdw: Fix the write setting up the WIZ hashing mode
    ...

    Linus Torvalds
     

15 Dec, 2014

3 commits

  • Add the kernel command line tp_printk option that will have tracepoints
    that are active sent to printk() as well as to the trace buffer.

    Passing "tp_printk" will activate this. To turn it off, the sysctl
    /proc/sys/kernel/tracepoint_printk can have '0' echoed into it. Note,
    this only works if the cmdline option is used. Echoing 1 into the sysctl
    file without the cmdline option will have no affect.

    Note, this is a dangerous option. Having high frequency tracepoints send
    their data to printk() can possibly cause a live lock. This is another
    reason why this is only active if the command line option is used.

    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412121539300.16494@nanos

    Suggested-by: Thomas Gleixner
    Tested-by: Thomas Gleixner
    Acked-by: Thomas Gleixner
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Enabling tracepoints at boot up can be very useful. The tracepoint
    can be initialized right after RCU has been. There's no need to
    wait for the early_initcall() to be called. That's too late for some
    things that can use tracepoints for debugging. Move the logic to
    enable tracepoints out of the initcalls and into init/main.c to
    right after rcu_init().

    This also allows trace_printk() to be used early too.

    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412121539300.16494@nanos
    Link: http://lkml.kernel.org/r/20141214164104.307127356@goodmis.org

    Reviewed-by: Paul E. McKenney
    Suggested-by: Thomas Gleixner
    Tested-by: Thomas Gleixner
    Acked-by: Thomas Gleixner
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Pull tty/serial driver updates from Greg KH:
    "Here's the big tty/serial driver update for 3.19-rc1.

    There are a number of TTY core changes/fixes in here from Peter Hurley
    that have all been teted in linux-next for a long time now. There are
    also the normal serial driver updates as well, full details in the
    changelog below"

    * tag 'tty-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (219 commits)
    serial: pxa: hold port.lock when reporting modem line changes
    tty-hvsi_lib: Deletion of an unnecessary check before the function call "tty_kref_put"
    tty: Deletion of unnecessary checks before two function calls
    n_tty: Fix read_buf race condition, increment read_head after pushing data
    serial: of-serial: add PM suspend/resume support
    Revert "serial: of-serial: add PM suspend/resume support"
    Revert "serial: of-serial: fix up PM ops on no_console_suspend and port type"
    serial: 8250: don't attempt a trylock if in sysrq
    serial: core: Add big-endian iotype
    serial: samsung: use port->fifosize instead of hardcoded values
    serial: samsung: prefer to use fifosize from driver data
    serial: samsung: fix style problems
    serial: samsung: wait for transfer completion before clock disable
    serial: icom: fix error return code
    serial: tegra: clean up tty-flag assignments
    serial: Fix io address assign flow with Fintek PCI-to-UART Product
    serial: mxs-auart: fix tx_empty against shift register
    serial: mxs-auart: fix gpio change detection on interrupt
    serial: mxs-auart: Fix mxs_auart_set_ldisc()
    serial: 8250_dw: Use 64-bit access for OCTEON.
    ...

    Linus Torvalds
     

14 Dec, 2014

3 commits

  • Pull block driver core update from Jens Axboe:
    "This is the pull request for the core block IO changes for 3.19. Not
    a huge round this time, mostly lots of little good fixes:

    - Fix a bug in sysfs blktrace interface causing a NULL pointer
    dereference, when enabled/disabled through that API. From Arianna
    Avanzini.

    - Various updates/fixes/improvements for blk-mq:

    - A set of updates from Bart, mostly fixing buts in the tag
    handling.

    - Cleanup/code consolidation from Christoph.

    - Extend queue_rq API to be able to handle batching issues of IO
    requests. NVMe will utilize this shortly. From me.

    - A few tag and request handling updates from me.

    - Cleanup of the preempt handling for running queues from Paolo.

    - Prevent running of unmapped hardware queues from Ming Lei.

    - Move the kdump memory limiting check to be in the correct
    location, from Shaohua.

    - Initialize all software queues at init time from Takashi. This
    prevents a kobject warning when CPUs are brought online that
    weren't online when a queue was registered.

    - Single writeback fix for I_DIRTY clearing from Tejun. Queued with
    the core IO changes, since it's just a single fix.

    - Version X of the __bio_add_page() segment addition retry from
    Maurizio. Hope the Xth time is the charm.

    - Documentation fixup for IO scheduler merging from Jan.

    - Introduce (and use) generic IO stat accounting helpers for non-rq
    drivers, from Gu Zheng.

    - Kill off artificial limiting of max sectors in a request from
    Christoph"

    * 'for-3.19/core' of git://git.kernel.dk/linux-block: (26 commits)
    bio: modify __bio_add_page() to accept pages that don't start a new segment
    blk-mq: Fix uninitialized kobject at CPU hotplugging
    blktrace: don't let the sysfs interface remove trace from running list
    blk-mq: Use all available hardware queues
    blk-mq: Micro-optimize bt_get()
    blk-mq: Fix a race between bt_clear_tag() and bt_get()
    blk-mq: Avoid that __bt_get_word() wraps multiple times
    blk-mq: Fix a use-after-free
    blk-mq: prevent unmapped hw queue from being scheduled
    blk-mq: re-check for available tags after running the hardware queue
    blk-mq: fix hang in bt_get()
    blk-mq: move the kdump check to blk_mq_alloc_tag_set
    blk-mq: cleanup tag free handling
    blk-mq: use 'nr_cpu_ids' as highest CPU ID count for hwq cpu map
    blk: introduce generic io stat accounting help function
    blk-mq: handle the single queue case in blk_mq_hctx_next_cpu
    genhd: check for int overflow in disk_expand_part_tbl()
    blk-mq: add blk_mq_free_hctx_request()
    blk-mq: export blk_mq_free_request()
    blk-mq: use get_cpu/put_cpu instead of preempt_disable/preempt_enable
    ...

    Linus Torvalds
     
  • …it/rostedt/linux-trace

    Pull tracing fixlet from Steven Rostedt:
    "Remove unnecessary preempt_disable in printk()"

    * tag 'trace-seq-buf-3.19-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    printk: Do not disable preemption for accessing printk_func

    Linus Torvalds
     
  • Pull audit updates from Paul Moore:
    "Two small patches from the audit next branch; only one of which has
    any real significant code changes, the other is simply a MAINTAINERS
    update for audit.

    The single code patch is pretty small and rather straightforward, it
    changes the audit "version" number reported to userspace from an
    integer to a bitmap which is used to indicate the functionality of the
    running kernel. This really doesn't have much impact on the kernel,
    but it will make life easier for the audit userspace folks.

    Thankfully we were still on a version number which allowed us to do
    this without breaking userspace"

    * 'upstream' of git://git.infradead.org/users/pcmoore/audit:
    audit: convert status version to a feature bitmap
    audit: add Paul Moore to the MAINTAINERS entry

    Linus Torvalds