20 Dec, 2014

2 commits

  • Pull irq core fix from Thomas Gleixner:
    "A single fix plugging a long standing race between proc/stat and
    proc/interrupts access and freeing of interrupt descriptors"

    * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    genirq: Prevent proc race against freeing of irq descriptors

    Linus Torvalds
     
  • Pull perf fixes and cleanups from Ingo Molnar:
    "A kernel fix plus mostly tooling fixes, but also some tooling
    restructuring and cleanups"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (39 commits)
    perf: Fix building warning on ARM 32
    perf symbols: Fix use after free in filename__read_build_id
    perf evlist: Use roundup_pow_of_two
    tools: Adopt roundup_pow_of_two
    perf tools: Make the mmap length autotuning more robust
    tools: Adopt rounddown_pow_of_two and deps
    tools: Adopt fls_long and deps
    tools: Move bitops.h from tools/perf/util to tools/
    tools: Introduce asm-generic/bitops.h
    tools lib: Move asm-generic/bitops/find.h code to tools/include and tools/lib
    tools: Whitespace prep patches for moving bitops.h
    tools: Move code originally from asm-generic/atomic.h into tools/include/asm-generic/
    tools: Move code originally from linux/log2.h to tools/include/linux/
    tools: Move __ffs implementation to tools/include/asm-generic/bitops/__ffs.h
    perf evlist: Do not use hard coded value for a mmap_pages default
    perf trace: Let the perf_evlist__mmap autosize the number of pages to use
    perf evlist: Improve the strerror_mmap method
    perf evlist: Clarify sterror_mmap variable names
    perf evlist: Fixup brown paper bag on "hint" for --mmap-pages cmdline arg
    perf trace: Provide a better explanation when mmap fails
    ...

    Linus Torvalds
     

19 Dec, 2014

2 commits

  • Pull module updates from Rusty Russell:
    "The exciting thing here is the getting rid of stop_machine on module
    removal. This is possible by using a simple atomic_t for the counter,
    rather than our fancy per-cpu counter: it turns out that no one is
    doing a module increment per net packet, so the slowdown should be in
    the noise"

    * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
    param: do not set store func without write perm
    params: cleanup sysfs allocation
    kernel:module Fix coding style errors and warnings.
    module: Remove stop_machine from module unloading
    module: Replace module_ref with atomic_t refcnt
    lib/bug: Use RCU list ops for module_bug_list
    module: Unlink module with RCU synchronizing instead of stop_machine
    module: Wait for RCU synchronizing before releasing a module

    Linus Torvalds
     
  • Pull more ACPI and power management updates from Rafael Wysocki:
    "These are regression fixes (leds-gpio, ACPI backlight driver,
    operating performance points library, ACPI device enumeration
    messages, cpupower tool), other bug fixes (ACPI EC driver, ACPI device
    PM), some cleanups in the operating performance points (OPP)
    framework, continuation of CONFIG_PM_RUNTIME elimination, a couple of
    minor intel_pstate driver changes, a new MAINTAINERS entry for it and
    an ACPI fan driver change needed for better support of thermal
    management in user space.

    Specifics:

    - Fix a regression in leds-gpio introduced by a recent commit that
    inadvertently changed the name of one of the properties used by the
    driver (Fabio Estevam).

    - Fix a regression in the ACPI backlight driver introduced by a
    recent fix that missed one special case that had to be taken into
    account (Aaron Lu).

    - Drop the level of some new kernel messages from the ACPI core
    introduced by a recent commit to KERN_DEBUG which they should have
    used from the start and drop some other unuseful KERN_ERR messages
    printed by ACPI (Rafael J Wysocki).

    - Revert an incorrect commit modifying the cpupower tool (Prarit
    Bhargava).

    - Fix two regressions introduced by recent commits in the OPP library
    and clean up some existing minor issues in that code (Viresh
    Kumar).

    - Continue to replace CONFIG_PM_RUNTIME with CONFIG_PM throughout the
    tree (or drop it where that can be done) in order to make it
    possible to eliminate CONFIG_PM_RUNTIME (Rafael J Wysocki, Ulf
    Hansson, Ludovic Desroches).

    There will be one more "CONFIG_PM_RUNTIME removal" batch after this
    one, because some new uses of it have been introduced during the
    current merge window, but that should be sufficient to finally get
    rid of it.

    - Make the ACPI EC driver more robust against race conditions related
    to GPE handler installation failures (Lv Zheng).

    - Prevent the ACPI device PM core code from attempting to disable
    GPEs that it has not enabled which confuses ACPICA and makes it
    report errors unnecessarily (Rafael J Wysocki).

    - Add a "force" command line switch to the intel_pstate driver to
    make it possible to override the blacklisting of some systems in
    that driver if needed (Ethan Zhao).

    - Improve intel_pstate code documentation and add a MAINTAINERS entry
    for it (Kristen Carlson Accardi).

    - Make the ACPI fan driver create cooling device interfaces witn
    names that reflect the IDs of the ACPI device objects they are
    associated with, except for "generic" ACPI fans (PNP ID "PNP0C0B").

    That's necessary for user space thermal management tools to be able
    to connect the fans with the parts of the system they are supposed
    to be cooling properly. From Srinivas Pandruvada"

    * tag 'pm+acpi-3.19-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (32 commits)
    MAINTAINERS: add entry for intel_pstate
    ACPI / video: update the skip case for acpi_video_device_in_dod()
    power / PM: Eliminate CONFIG_PM_RUNTIME
    NFC / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    SCSI / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    ACPI / EC: Fix unexpected ec_remove_handlers() invocations
    Revert "tools: cpupower: fix return checks for sysfs_get_idlestate_count()"
    tracing / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    x86 / PM: Replace CONFIG_PM_RUNTIME in io_apic.c
    PM: Remove the SET_PM_RUNTIME_PM_OPS() macro
    mmc: atmel-mci: use SET_RUNTIME_PM_OPS() macro
    PM / Kconfig: Replace PM_RUNTIME with PM in dependencies
    ARM / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    sound / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    phy / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    video / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    tty / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    spi: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    ACPI / PM: Do not disable wakeup GPEs that have not been enabled
    ACPI / utils: Drop error messages from acpi_evaluate_reference()
    ...

    Linus Torvalds
     

18 Dec, 2014

2 commits

  • When a module_param is defined without DAC write permissions, it can
    still be changed at runtime and updated. Drivers using a 0444 permission
    may be surprised that these values can still be changed.

    For drivers that want to allow updates, any S_IW* flag will set the
    "store" function as before. Drivers without S_IW* flags will have the
    "store" function unset, unforcing a read-only value. Drivers that wish
    neither "store" nor "get" can continue to use "0" for perms to stay out
    of sysfs entirely.

    Old behavior:
    # cd /sys/module/snd/parameters
    # ls -l
    total 0
    -r--r--r-- 1 root root 4096 Dec 11 13:55 cards_limit
    -r--r--r-- 1 root root 4096 Dec 11 13:55 major
    -r--r--r-- 1 root root 4096 Dec 11 13:55 slots
    # cat major
    116
    # echo -1 > major
    -bash: major: Permission denied
    # chmod u+w major
    # echo -1 > major
    # cat major
    -1

    New behavior:
    ...
    # chmod u+w major
    # echo -1 > major
    -bash: echo: write error: Input/output error

    Signed-off-by: Kees Cook
    Signed-off-by: Rusty Russell

    Kees Cook
     
  • Pull user namespace related fixes from Eric Biederman:
    "As these are bug fixes almost all of thes changes are marked for
    backporting to stable.

    The first change (implicitly adding MNT_NODEV on remount) addresses a
    regression that was created when security issues with unprivileged
    remount were closed. I go on to update the remount test to make it
    easy to detect if this issue reoccurs.

    Then there are a handful of mount and umount related fixes.

    Then half of the changes deal with the a recently discovered design
    bug in the permission checks of gid_map. Unix since the beginning has
    allowed setting group permissions on files to less than the user and
    other permissions (aka ---rwx---rwx). As the unix permission checks
    stop as soon as a group matches, and setgroups allows setting groups
    that can not later be dropped, results in a situtation where it is
    possible to legitimately use a group to assign fewer privileges to a
    process. Which means dropping a group can increase a processes
    privileges.

    The fix I have adopted is that gid_map is now no longer writable
    without privilege unless the new file /proc/self/setgroups has been
    set to permanently disable setgroups.

    The bulk of user namespace using applications even the applications
    using applications using user namespaces without privilege remain
    unaffected by this change. Unfortunately this ix breaks a couple user
    space applications, that were relying on the problematic behavior (one
    of which was tools/selftests/mount/unprivileged-remount-test.c).

    To hopefully prevent needing a regression fix on top of my security
    fix I rounded folks who work with the container implementations mostly
    like to be affected and encouraged them to test the changes.

    > So far nothing broke on my libvirt-lxc test bed. :-)
    > Tested with openSUSE 13.2 and libvirt 1.2.9.
    > Tested-by: Richard Weinberger

    > Tested on Fedora20 with libvirt 1.2.11, works fine.
    > Tested-by: Chen Hanxiao

    > Ok, thanks - yes, unprivileged lxc is working fine with your kernels.
    > Just to be sure I was testing the right thing I also tested using
    > my unprivileged nsexec testcases, and they failed on setgroup/setgid
    > as now expected, and succeeded there without your patches.
    > Tested-by: Serge Hallyn

    > I tested this with Sandstorm. It breaks as is and it works if I add
    > the setgroups thing.
    > Tested-by: Andy Lutomirski # breaks things as designed :("

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    userns: Unbreak the unprivileged remount tests
    userns; Correct the comment in map_write
    userns: Allow setting gid_maps without privilege when setgroups is disabled
    userns: Add a knob to disable setgroups on a per user namespace basis
    userns: Rename id_map_mutex to userns_state_mutex
    userns: Only allow the creator of the userns unprivileged mappings
    userns: Check euid no fsuid when establishing an unprivileged uid mapping
    userns: Don't allow unprivileged creation of gid mappings
    userns: Don't allow setgroups until a gid mapping has been setablished
    userns: Document what the invariant required for safe unprivileged mappings.
    groups: Consolidate the setgroups permission checks
    mnt: Clear mnt_expire during pivot_root
    mnt: Carefully set CL_UNPRIVILEGED in clone_mnt
    mnt: Move the clear of MNT_LOCKED from copy_tree to it's callers.
    umount: Do not allow unmounting rootfs.
    umount: Disallow unprivileged mount force
    mnt: Update unprivileged remount test
    mnt: Implicitly add MNT_NODEV on remount when it was implicitly added by mount

    Linus Torvalds
     

17 Dec, 2014

2 commits

  • Pull vfs pile #2 from Al Viro:
    "Next pile (and there'll be one or two more).

    The large piece in this one is getting rid of /proc/*/ns/* weirdness;
    among other things, it allows to (finally) make nameidata completely
    opaque outside of fs/namei.c, making for easier further cleanups in
    there"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    coda_venus_readdir(): use file_inode()
    fs/namei.c: fold link_path_walk() call into path_init()
    path_init(): don't bother with LOOKUP_PARENT in argument
    fs/namei.c: new helper (path_cleanup())
    path_init(): store the "base" pointer to file in nameidata itself
    make default ->i_fop have ->open() fail with ENXIO
    make nameidata completely opaque outside of fs/namei.c
    kill proc_ns completely
    take the targets of /proc/*/ns/* symlinks to separate fs
    bury struct proc_ns in fs/proc
    copy address of proc_ns_ops into ns_common
    new helpers: ns_alloc_inum/ns_free_inum
    make proc_ns_operations work with struct ns_common * instead of void *
    switch the rest of proc_ns_operations to working with &...->ns
    netns: switch ->get()/->put()/->install()/->inum() to working with &net->ns
    make mntns ->get()/->put()/->install()/->inum() work with &mnt_ns->ns
    common object embedded into various struct ....ns

    Linus Torvalds
     
  • Pull tracing updates from Steven Rostedt:
    "As the merge window is still open, and this code was not as complex as
    I thought it might be. I'm pushing this in now.

    This will allow Thomas to debug his irq work for 3.20.

    This adds two new features:

    1) Allow traceopoints to be enabled right after mm_init().

    By passing in the trace_event= kernel command line parameter,
    tracepoints can be enabled at boot up. For debugging things like
    the initialization of interrupts, it is needed to have tracepoints
    enabled very early. People have asked about this before and this
    has been on my todo list. As it can be helpful for Thomas to debug
    his upcoming 3.20 IRQ work, I'm pushing this now. This way he can
    add tracepoints into the IRQ set up and have users enable them when
    things go wrong.

    2) Have the tracepoints printed via printk() (the console) when they
    are triggered.

    If the irq code locks up or reboots the box, having the tracepoint
    output go into the kernel ring buffer is useless for debugging.
    But being able to add the tp_printk kernel command line option
    along with the trace_event= option will have these tracepoints
    printed as they occur, and that can be really useful for debugging
    early lock up or reboot problems.

    This code is not that intrusive and it passed all my tests. Thomas
    tried them out too and it works for his needs.

    Link: http://lkml.kernel.org/r/20141214201609.126831471@goodmis.org"

    * tag 'trace-3.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Add tp_printk cmdline to have tracepoints go to printk()
    tracing: Move enabling tracepoints to just after rcu_init()

    Linus Torvalds
     

16 Dec, 2014

1 commit

  • Pull drm updates from Dave Airlie:
    "Highlights:

    - AMD KFD driver merge

    This is the AMD HSA interface for exposing a lowlevel interface for
    GPGPU use. They have an open source userspace built on top of this
    interface, and the code looks as good as it was going to get out of
    tree.

    - Initial atomic modesetting work

    The need for an atomic modesetting interface to allow userspace to
    try and send a complete set of modesetting state to the driver has
    arisen, and been suffering from neglect this past year. No more,
    the start of the common code and changes for msm driver to use it
    are in this tree. Ongoing work to get the userspace ioctl finished
    and the code clean will probably wait until next kernel.

    - DisplayID 1.3 and tiled monitor exposed to userspace.

    Tiled monitor property is now exposed for userspace to make use of.

    - Rockchip drm driver merged.

    - imx gpu driver moved out of staging

    Other stuff:

    - core:
    panel - MIPI DSI + new panels.
    expose suggested x/y properties for virtual GPUs

    - i915:
    Initial Skylake (SKL) support
    gen3/4 reset work
    start of dri1/ums removal
    infoframe tracking
    fixes for lots of things.

    - nouveau:
    tegra k1 voltage support
    GM204 modesetting support
    GT21x memory reclocking work

    - radeon:
    CI dpm fixes
    GPUVM improvements
    Initial DPM fan control

    - rcar-du:
    HDMI support added
    removed some support for old boards
    slave encoder driver for Analog Devices adv7511

    - exynos:
    Exynos4415 SoC support

    - msm:
    a4xx gpu support
    atomic helper conversion

    - tegra:
    iommu support
    universal plane support
    ganged-mode DSI support

    - sti:
    HDMI i2c improvements

    - vmwgfx:
    some late fixes.

    - qxl:
    use suggested x/y properties"

    * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (969 commits)
    drm: sti: fix module compilation issue
    drm/i915: save/restore GMBUS freq across suspend/resume on gen4
    drm: sti: correctly cleanup CRTC and planes
    drm: sti: add HQVDP plane
    drm: sti: add cursor plane
    drm: sti: enable auxiliary CRTC
    drm: sti: fix delay in VTG programming
    drm: sti: prepare sti_tvout to support auxiliary crtc
    drm: sti: use drm_crtc_vblank_{on/off} instead of drm_vblank_{on/off}
    drm: sti: fix hdmi avi infoframe
    drm: sti: remove event lock while disabling vblank
    drm: sti: simplify gdp code
    drm: sti: clear all mixer control
    drm: sti: remove gpio for HDMI hot plug detection
    drm: sti: allow to change hdmi ddc i2c adapter
    drm/doc: Document drm_add_modes_noedid() usage
    drm/i915: Remove '& 0xffff' from the mask given to WA_REG()
    drm/i915: Invert the mask and val arguments in wa_add() and WA_REG()
    drm: Zero out DRM object memory upon cleanup
    drm/i915/bdw: Fix the write setting up the WIZ hashing mode
    ...

    Linus Torvalds
     

15 Dec, 2014

3 commits

  • Add the kernel command line tp_printk option that will have tracepoints
    that are active sent to printk() as well as to the trace buffer.

    Passing "tp_printk" will activate this. To turn it off, the sysctl
    /proc/sys/kernel/tracepoint_printk can have '0' echoed into it. Note,
    this only works if the cmdline option is used. Echoing 1 into the sysctl
    file without the cmdline option will have no affect.

    Note, this is a dangerous option. Having high frequency tracepoints send
    their data to printk() can possibly cause a live lock. This is another
    reason why this is only active if the command line option is used.

    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412121539300.16494@nanos

    Suggested-by: Thomas Gleixner
    Tested-by: Thomas Gleixner
    Acked-by: Thomas Gleixner
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Enabling tracepoints at boot up can be very useful. The tracepoint
    can be initialized right after RCU has been. There's no need to
    wait for the early_initcall() to be called. That's too late for some
    things that can use tracepoints for debugging. Move the logic to
    enable tracepoints out of the initcalls and into init/main.c to
    right after rcu_init().

    This also allows trace_printk() to be used early too.

    Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1412121539300.16494@nanos
    Link: http://lkml.kernel.org/r/20141214164104.307127356@goodmis.org

    Reviewed-by: Paul E. McKenney
    Suggested-by: Thomas Gleixner
    Tested-by: Thomas Gleixner
    Acked-by: Thomas Gleixner
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • Pull tty/serial driver updates from Greg KH:
    "Here's the big tty/serial driver update for 3.19-rc1.

    There are a number of TTY core changes/fixes in here from Peter Hurley
    that have all been teted in linux-next for a long time now. There are
    also the normal serial driver updates as well, full details in the
    changelog below"

    * tag 'tty-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (219 commits)
    serial: pxa: hold port.lock when reporting modem line changes
    tty-hvsi_lib: Deletion of an unnecessary check before the function call "tty_kref_put"
    tty: Deletion of unnecessary checks before two function calls
    n_tty: Fix read_buf race condition, increment read_head after pushing data
    serial: of-serial: add PM suspend/resume support
    Revert "serial: of-serial: add PM suspend/resume support"
    Revert "serial: of-serial: fix up PM ops on no_console_suspend and port type"
    serial: 8250: don't attempt a trylock if in sysrq
    serial: core: Add big-endian iotype
    serial: samsung: use port->fifosize instead of hardcoded values
    serial: samsung: prefer to use fifosize from driver data
    serial: samsung: fix style problems
    serial: samsung: wait for transfer completion before clock disable
    serial: icom: fix error return code
    serial: tegra: clean up tty-flag assignments
    serial: Fix io address assign flow with Fintek PCI-to-UART Product
    serial: mxs-auart: fix tx_empty against shift register
    serial: mxs-auart: fix gpio change detection on interrupt
    serial: mxs-auart: Fix mxs_auart_set_ldisc()
    serial: 8250_dw: Use 64-bit access for OCTEON.
    ...

    Linus Torvalds
     

14 Dec, 2014

11 commits

  • Pull block driver core update from Jens Axboe:
    "This is the pull request for the core block IO changes for 3.19. Not
    a huge round this time, mostly lots of little good fixes:

    - Fix a bug in sysfs blktrace interface causing a NULL pointer
    dereference, when enabled/disabled through that API. From Arianna
    Avanzini.

    - Various updates/fixes/improvements for blk-mq:

    - A set of updates from Bart, mostly fixing buts in the tag
    handling.

    - Cleanup/code consolidation from Christoph.

    - Extend queue_rq API to be able to handle batching issues of IO
    requests. NVMe will utilize this shortly. From me.

    - A few tag and request handling updates from me.

    - Cleanup of the preempt handling for running queues from Paolo.

    - Prevent running of unmapped hardware queues from Ming Lei.

    - Move the kdump memory limiting check to be in the correct
    location, from Shaohua.

    - Initialize all software queues at init time from Takashi. This
    prevents a kobject warning when CPUs are brought online that
    weren't online when a queue was registered.

    - Single writeback fix for I_DIRTY clearing from Tejun. Queued with
    the core IO changes, since it's just a single fix.

    - Version X of the __bio_add_page() segment addition retry from
    Maurizio. Hope the Xth time is the charm.

    - Documentation fixup for IO scheduler merging from Jan.

    - Introduce (and use) generic IO stat accounting helpers for non-rq
    drivers, from Gu Zheng.

    - Kill off artificial limiting of max sectors in a request from
    Christoph"

    * 'for-3.19/core' of git://git.kernel.dk/linux-block: (26 commits)
    bio: modify __bio_add_page() to accept pages that don't start a new segment
    blk-mq: Fix uninitialized kobject at CPU hotplugging
    blktrace: don't let the sysfs interface remove trace from running list
    blk-mq: Use all available hardware queues
    blk-mq: Micro-optimize bt_get()
    blk-mq: Fix a race between bt_clear_tag() and bt_get()
    blk-mq: Avoid that __bt_get_word() wraps multiple times
    blk-mq: Fix a use-after-free
    blk-mq: prevent unmapped hw queue from being scheduled
    blk-mq: re-check for available tags after running the hardware queue
    blk-mq: fix hang in bt_get()
    blk-mq: move the kdump check to blk_mq_alloc_tag_set
    blk-mq: cleanup tag free handling
    blk-mq: use 'nr_cpu_ids' as highest CPU ID count for hwq cpu map
    blk: introduce generic io stat accounting help function
    blk-mq: handle the single queue case in blk_mq_hctx_next_cpu
    genhd: check for int overflow in disk_expand_part_tbl()
    blk-mq: add blk_mq_free_hctx_request()
    blk-mq: export blk_mq_free_request()
    blk-mq: use get_cpu/put_cpu instead of preempt_disable/preempt_enable
    ...

    Linus Torvalds
     
  • …it/rostedt/linux-trace

    Pull tracing fixlet from Steven Rostedt:
    "Remove unnecessary preempt_disable in printk()"

    * tag 'trace-seq-buf-3.19-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    printk: Do not disable preemption for accessing printk_func

    Linus Torvalds
     
  • Pull audit updates from Paul Moore:
    "Two small patches from the audit next branch; only one of which has
    any real significant code changes, the other is simply a MAINTAINERS
    update for audit.

    The single code patch is pretty small and rather straightforward, it
    changes the audit "version" number reported to userspace from an
    integer to a bitmap which is used to indicate the functionality of the
    running kernel. This really doesn't have much impact on the kernel,
    but it will make life easier for the audit userspace folks.

    Thankfully we were still on a version number which allowed us to do
    this without breaking userspace"

    * 'upstream' of git://git.infradead.org/users/pcmoore/audit:
    audit: convert status version to a feature bitmap
    audit: add Paul Moore to the MAINTAINERS entry

    Linus Torvalds
     
  • There's a lot of common code in inode and mount marks handling. Factor it
    out to a common helper function.

    Signed-off-by: Jan Kara
    Cc: Eric Paris
    Cc: Heinrich Schuchardt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Following the suggestions from Andrew Morton and Stephen Rothwell,
    Dont expand the ARCH list in kernel/gcov/Kconfig. Instead,
    define a ARCH_HAS_GCOV_PROFILE_ALL bool which architectures
    can enable.

    set ARCH_HAS_GCOV_PROFILE_ALL on Architectures where it was
    previously allowed + ARM64 which I tested.

    Signed-off-by: Riku Voipio
    Cc: Peter Oberparleiter
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Riku Voipio
     
  • Remove unnecessary KERN_ERR from pr_err() within kexec.c.

    Signed-off-by: Masanari Iida
    Acked-by: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Masanari Iida
     
  • This patchset adds execveat(2) for x86, and is derived from Meredydd
    Luff's patch from Sept 2012 (https://lkml.org/lkml/2012/9/11/528).

    The primary aim of adding an execveat syscall is to allow an
    implementation of fexecve(3) that does not rely on the /proc filesystem,
    at least for executables (rather than scripts). The current glibc version
    of fexecve(3) is implemented via /proc, which causes problems in sandboxed
    or otherwise restricted environments.

    Given the desire for a /proc-free fexecve() implementation, HPA suggested
    (https://lkml.org/lkml/2006/7/11/556) that an execveat(2) syscall would be
    an appropriate generalization.

    Also, having a new syscall means that it can take a flags argument without
    back-compatibility concerns. The current implementation just defines the
    AT_EMPTY_PATH and AT_SYMLINK_NOFOLLOW flags, but other flags could be
    added in future -- for example, flags for new namespaces (as suggested at
    https://lkml.org/lkml/2006/7/11/474).

    Related history:
    - https://lkml.org/lkml/2006/12/27/123 is an example of someone
    realizing that fexecve() is likely to fail in a chroot environment.
    - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=514043 covered
    documenting the /proc requirement of fexecve(3) in its manpage, to
    "prevent other people from wasting their time".
    - https://bugzilla.redhat.com/show_bug.cgi?id=241609 described a
    problem where a process that did setuid() could not fexecve()
    because it no longer had access to /proc/self/fd; this has since
    been fixed.

    This patch (of 4):

    Add a new execveat(2) system call. execveat() is to execve() as openat()
    is to open(): it takes a file descriptor that refers to a directory, and
    resolves the filename relative to that.

    In addition, if the filename is empty and AT_EMPTY_PATH is specified,
    execveat() executes the file to which the file descriptor refers. This
    replicates the functionality of fexecve(), which is a system call in other
    UNIXen, but in Linux glibc it depends on opening "/proc/self/fd/" (and
    so relies on /proc being mounted).

    The filename fed to the executed program as argv[0] (or the name of the
    script fed to a script interpreter) will be of the form "/dev/fd/"
    (for an empty filename) or "/dev/fd//", effectively
    reflecting how the executable was found. This does however mean that
    execution of a script in a /proc-less environment won't work; also, script
    execution via an O_CLOEXEC file descriptor fails (as the file will not be
    accessible after exec).

    Based on patches by Meredydd Luff.

    Signed-off-by: David Drysdale
    Cc: Meredydd Luff
    Cc: Shuah Khan
    Cc: "Eric W. Biederman"
    Cc: Andy Lutomirski
    Cc: Alexander Viro
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Kees Cook
    Cc: Arnd Bergmann
    Cc: Rich Felker
    Cc: Christoph Hellwig
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Drysdale
     
  • Current stacktrace only have the function for console output. page_owner
    that will be introduced in following patch needs to print the output of
    stacktrace into the buffer for our own output format so so new function,
    snprint_stack_trace(), is needed.

    Signed-off-by: Joonsoo Kim
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Cc: Minchan Kim
    Cc: Dave Hansen
    Cc: Michal Nazarewicz
    Cc: Jungsoo Son
    Cc: Ingo Molnar
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • Both register and unregister call build_map_info() in order to create the
    list of mappings before installing or removing breakpoints for every mm
    which maps file backed memory. As such, there is no reason to hold the
    i_mmap_rwsem exclusively, so share it and allow concurrent readers to
    build the mapping data.

    Signed-off-by: Davidlohr Bueso
    Acked-by: Srikar Dronamraju
    Acked-by: "Kirill A. Shutemov"
    Cc: Oleg Nesterov
    Acked-by: Hugh Dickins
    Acked-by: Peter Zijlstra (Intel)
    Cc: Rik van Riel
    Acked-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davidlohr Bueso
     
  • The i_mmap_mutex is a close cousin of the anon vma lock, both protecting
    similar data, one for file backed pages and the other for anon memory. To
    this end, this lock can also be a rwsem. In addition, there are some
    important opportunities to share the lock when there are no tree
    modifications.

    This conversion is straightforward. For now, all users take the write
    lock.

    [sfr@canb.auug.org.au: update fremap.c]
    Signed-off-by: Davidlohr Bueso
    Reviewed-by: Rik van Riel
    Acked-by: "Kirill A. Shutemov"
    Acked-by: Hugh Dickins
    Cc: Oleg Nesterov
    Acked-by: Peter Zijlstra (Intel)
    Cc: Srikar Dronamraju
    Acked-by: Mel Gorman
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davidlohr Bueso
     
  • Convert all open coded mutex_lock/unlock calls to the
    i_mmap_[lock/unlock]_write() helpers.

    Signed-off-by: Davidlohr Bueso
    Acked-by: Rik van Riel
    Acked-by: "Kirill A. Shutemov"
    Acked-by: Hugh Dickins
    Cc: Oleg Nesterov
    Acked-by: Peter Zijlstra (Intel)
    Cc: Srikar Dronamraju
    Acked-by: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davidlohr Bueso
     

13 Dec, 2014

3 commits

  • Since the rework of the sparse interrupt code to actually free the
    unused interrupt descriptors there exists a race between the /proc
    interfaces to the irq subsystem and the code which frees the interrupt
    descriptor.

    CPU0 CPU1
    show_interrupts()
    desc = irq_to_desc(X);
    free_desc(desc)
    remove_from_radix_tree();
    kfree(desc);
    raw_spinlock_irq(&desc->lock);

    /proc/interrupts is the only interface which can actively corrupt
    kernel memory via the lock access. /proc/stat can only read from freed
    memory. Extremly hard to trigger, but possible.

    The interfaces in /proc/irq/N/ are not affected by this because the
    removal of the proc file is serialized in procfs against concurrent
    readers/writers. The removal happens before the descriptor is freed.

    For architectures which have CONFIG_SPARSE_IRQ=n this is a non issue
    as the descriptor is never freed. It's merely cleared out with the irq
    descriptor lock held. So any concurrent proc access will either see
    the old correct value or the cleared out ones.

    Protect the lookup and access to the irq descriptor in
    show_interrupts() with the sparse_irq_lock.

    Provide kstat_irqs_usr() which is protecting the lookup and access
    with sparse_irq_lock and switch /proc/stat to use it.

    Document the existing kstat_irqs interfaces so it's clear that the
    caller needs to take care about protection. The users of these
    interfaces are either not affected due to SPARSE_IRQ=n or already
    protected against removal.

    Fixes: 1f5a5b87f78f "genirq: Implement a sane sparse_irq allocator"
    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org

    Thomas Gleixner
     
  • After commit b2b49ccbdd54 (PM: Kconfig: Set PM_RUNTIME if PM_SLEEP is
    selected) PM_RUNTIME is always set if PM is set, so files that are
    build conditionally if CONFIG_PM_RUNTIME is set may now be build
    if CONFIG_PM is set.

    Replace CONFIG_PM_RUNTIME with CONFIG_PM in kernel/trace/Makefile
    for this reason.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Steven Rostedt <rostedt@goodmis.org.

    Rafael J. Wysocki
     
  • Pull trivial tree update from Jiri Kosina:
    "Usual stuff: documentation updates, printk() fixes, etc"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (24 commits)
    intel_ips: fix a type in error message
    cpufreq: cpufreq-dt: Move newline to end of error message
    ps3rom: fix error return code
    treewide: fix typo in printk and Kconfig
    ARM: dts: bcm63138: change "interupts" to "interrupts"
    Replace mentions of "list_struct" to "list_head"
    kernel: trace: fix printk message
    scsi: mpt2sas: fix ioctl in comment
    zbud, zswap: change module author email
    clocksource: Fix 'clcoksource' typo in comment
    arm: fix wording of "Crotex" in CONFIG_ARCH_EXYNOS3 help
    gpio: msm-v1: make boolean argument more obvious
    usb: Fix typo in usb-serial-simple.c
    PCI: Fix comment typo 'COMFIG_PM_OPS'
    powerpc: Fix comment typo 'CONIFG_8xx'
    powerpc: Fix comment typos 'CONFiG_ALTIVEC'
    clk: st: Spelling s/stucture/structure/
    isci: Spelling s/stucture/structure/
    usb: gadget: zero: Spelling s/infrastucture/infrastructure/
    treewide: Fix company name in module descriptions
    ...

    Linus Torvalds
     

12 Dec, 2014

10 commits

  • Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Pull cgroup update from Tejun Heo:
    "cpuset got simplified a bit. cgroup core got a fix on unified
    hierarchy and grew some effective css related interfaces which will be
    used for blkio support for writeback IO traffic which is currently
    being worked on"

    * 'for-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
    cgroup: implement cgroup_get_e_css()
    cgroup: add cgroup_subsys->css_e_css_changed()
    cgroup: add cgroup_subsys->css_released()
    cgroup: fix the async css offline wait logic in cgroup_subtree_control_write()
    cgroup: restructure child_subsys_mask handling in cgroup_subtree_control_write()
    cgroup: separate out cgroup_calc_child_subsys_mask() from cgroup_refresh_child_subsys_mask()
    cpuset: lock vs unlock typo
    cpuset: simplify cpuset_node_allowed API
    cpuset: convert callback_mutex to a spinlock

    Linus Torvalds
     
  • Pull workqueue update from Tejun Heo:
    "Work items which may be involved in memory reclaim path may be
    executed by the rescuer under memory pressure. When a rescuer gets
    activated, it processes whatever are on the pending list and then goes
    back to sleep until the manager kicks it again which involves 100ms
    delay.

    This is problematic for self-requeueing work items or the ones running
    on ordered workqueues as there always is only one work item on the
    pending list when the rescuer kicks in. The execution of that work
    item produces more to execute but the rescuer won't see them until
    after the said 100ms has passed, so such workqueues would only execute
    one work item every 100ms under prolonged memory pressure, which BTW
    may be being prolonged due to the slow execution.

    Neil wrote up a patch which fixes this issue by keeping the rescuer
    working as long as the target workqueue is busy but doesn't have
    enough workers"

    * 'for-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    workqueue: allow rescuer thread to do more work.
    workqueue: invert the order between pool->lock and wq_mayday_lock
    workqueue: cosmetic update in rescuer_thread()

    Linus Torvalds
     
  • Pull percpu updates from Tejun Heo:
    "Nothing interesting. A patch to convert the remaining __get_cpu_var()
    users, another to fix non-critical off-by-one in an assertion and a
    cosmetic conversion to lockless_dereference() in percpu-ref.

    The back-merge from mainline is to receive lockless_dereference()"

    * 'for-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
    percpu: Replace smp_read_barrier_depends() with lockless_dereference()
    percpu: Convert remaining __get_cpu_var uses in 3.18-rcX
    percpu: off by one in BUG_ON()

    Linus Torvalds
     
  • Here's a batch of i915 fixes for 3.19.

    * tag 'drm-intel-next-fixes-2014-12-11' of git://anongit.freedesktop.org/drm-intel:
    drm/i915: save/restore GMBUS freq across suspend/resume on gen4
    drm/i915: Remove '& 0xffff' from the mask given to WA_REG()
    drm/i915: Invert the mask and val arguments in wa_add() and WA_REG()
    drm/i915/bdw: Fix the write setting up the WIZ hashing mode
    drm/i915: Don't complain about stolen conflicts on gen3
    drm/i915: resume MST after reading back hw state
    drm/i915: Handle inaccurate time conversion issues
    drm/i915: compute wait_ioctl timeout correctly
    drm/i915: don't always do full mode sets when infoframes are enabled

    Dave Airlie
     
  • Pull s390 updates from Martin Schwidefsky:
    "The most notable change for this pull request is the ftrace rework
    from Heiko. It brings a small performance improvement and the ground
    work to support a new gcc option to replace the mcount blocks with a
    single nop.

    Two new s390 specific system calls are added to emulate user space
    mmio for PCI, an artifact of the how PCI memory is accessed.

    Two patches for the memory management with changes to common code.
    For KVM mm_forbids_zeropage is added which disables the empty zero
    page for an mm that is used by a KVM process. And an optimization,
    pmdp_get_and_clear_full is added analog to ptep_get_and_clear_full.

    Some micro optimization for the cmpxchg and the spinlock code.

    And as usual bug fixes and cleanups"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (46 commits)
    s390/cputime: fix 31-bit compile
    s390/scm_block: make the number of reqs per HW req configurable
    s390/scm_block: handle multiple requests in one HW request
    s390/scm_block: allocate aidaw pages only when necessary
    s390/scm_block: use mempool to manage aidaw requests
    s390/eadm: change timeout value
    s390/mm: fix memory leak of ptlock in pmd_free_tlb
    s390: use local symbol names in entry[64].S
    s390/ptrace: always include vector registers in core files
    s390/simd: clear vector register pointer on fork/clone
    s390: translate cputime magic constants to macros
    s390/idle: convert open coded idle time seqcount
    s390/idle: add missing irq off lockdep annotation
    s390/debug: avoid function call for debug_sprintf_*
    s390/kprobes: fix instruction copy for out of line execution
    s390: remove diag 44 calls from cpu_relax()
    s390/dasd: retry partition detection
    s390/dasd: fix list corruption for sleep_on requests
    s390/dasd: fix infinite term I/O loop
    s390/dasd: remove unused code
    ...

    Linus Torvalds
     
  • It is important that all maps are less than PAGE_SIZE
    or else setting the last byte of the buffer to '0'
    could write off the end of the allocated storage.

    Correct the misleading comment.

    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • Now that setgroups can be disabled and not reenabled, setting gid_map
    without privielge can now be enabled when setgroups is disabled.

    This restores most of the functionality that was lost when unprivileged
    setting of gid_map was removed. Applications that use this functionality
    will need to check to see if they use setgroups or init_groups, and if they
    don't they can be fixed by simply disabling setgroups before writing to
    gid_map.

    Cc: stable@vger.kernel.org
    Reviewed-by: Andy Lutomirski
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • - Expose the knob to user space through a proc file /proc//setgroups

    A value of "deny" means the setgroups system call is disabled in the
    current processes user namespace and can not be enabled in the
    future in this user namespace.

    A value of "allow" means the segtoups system call is enabled.

    - Descendant user namespaces inherit the value of setgroups from
    their parents.

    - A proc file is used (instead of a sysctl) as sysctls currently do
    not allow checking the permissions at open time.

    - Writing to the proc file is restricted to before the gid_map
    for the user namespace is set.

    This ensures that disabling setgroups at a user namespace
    level will never remove the ability to call setgroups
    from a process that already has that ability.

    A process may opt in to the setgroups disable for itself by
    creating, entering and configuring a user namespace or by calling
    setns on an existing user namespace with setgroups disabled.
    Processes without privileges already can not call setgroups so this
    is a noop. Prodcess with privilege become processes without
    privilege when entering a user namespace and as with any other path
    to dropping privilege they would not have the ability to call
    setgroups. So this remains within the bounds of what is possible
    without a knob to disable setgroups permanently in a user namespace.

    Cc: stable@vger.kernel.org
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     
  • Pull networking updates from David Miller:

    1) New offloading infrastructure and example 'rocker' driver for
    offloading of switching and routing to hardware.

    This work was done by a large group of dedicated individuals, not
    limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend,
    Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu

    2) Start making the networking operate on IOV iterators instead of
    modifying iov objects in-situ during transfers. Thanks to Al Viro
    and Herbert Xu.

    3) A set of new netlink interfaces for the TIPC stack, from Richard
    Alpe.

    4) Remove unnecessary looping during ipv6 routing lookups, from Martin
    KaFai Lau.

    5) Add PAUSE frame generation support to gianfar driver, from Matei
    Pavaluca.

    6) Allow for larger reordering levels in TCP, which are easily
    achievable in the real world right now, from Eric Dumazet.

    7) Add a variable of napi_schedule that doesn't need to disable cpu
    interrupts, from Eric Dumazet.

    8) Use a doubly linked list to optimize neigh_parms_release(), from
    Nicolas Dichtel.

    9) Various enhancements to the kernel BPF verifier, and allow eBPF
    programs to actually be attached to sockets. From Alexei
    Starovoitov.

    10) Support TSO/LSO in sunvnet driver, from David L Stevens.

    11) Allow controlling ECN usage via routing metrics, from Florian
    Westphal.

    12) Remote checksum offload, from Tom Herbert.

    13) Add split-header receive, BQL, and xmit_more support to amd-xgbe
    driver, from Thomas Lendacky.

    14) Add MPLS support to openvswitch, from Simon Horman.

    15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen
    Klassert.

    16) Do gro flushes on a per-device basis using a timer, from Eric
    Dumazet. This tries to resolve the conflicting goals between the
    desired handling of bulk vs. RPC-like traffic.

    17) Allow userspace to ask for the CPU upon what a packet was
    received/steered, via SO_INCOMING_CPU. From Eric Dumazet.

    18) Limit GSO packets to half the current congestion window, from Eric
    Dumazet.

    19) Add a generic helper so that all drivers set their RSS keys in a
    consistent way, from Eric Dumazet.

    20) Add xmit_more support to enic driver, from Govindarajulu
    Varadarajan.

    21) Add VLAN packet scheduler action, from Jiri Pirko.

    22) Support configurable RSS hash functions via ethtool, from Eyal
    Perry.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits)
    Fix race condition between vxlan_sock_add and vxlan_sock_release
    net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header
    net/mlx4: Add support for A0 steering
    net/mlx4: Refactor QUERY_PORT
    net/mlx4_core: Add explicit error message when rule doesn't meet configuration
    net/mlx4: Add A0 hybrid steering
    net/mlx4: Add mlx4_bitmap zone allocator
    net/mlx4: Add a check if there are too many reserved QPs
    net/mlx4: Change QP allocation scheme
    net/mlx4_core: Use tasklet for user-space CQ completion events
    net/mlx4_core: Mask out host side virtualization features for guests
    net/mlx4_en: Set csum level for encapsulated packets
    be2net: Export tunnel offloads only when a VxLAN tunnel is created
    gianfar: Fix dma check map error when DMA_API_DEBUG is enabled
    cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call
    net: fec: only enable mdio interrupt before phy device link up
    net: fec: clear all interrupt events to support i.MX6SX
    net: fec: reset fep link status in suspend function
    net: sock: fix access via invalid file descriptor
    net: introduce helper macro for_each_cmsghdr
    ...

    Linus Torvalds
     

11 Dec, 2014

4 commits

  • As printk_func will either be the default function, or a per_cpu function
    for the current CPU, there's no reason to disable preemption to access
    it from printk. That's because if the printk_func is not the default
    then the caller had better disabled preemption as they were the one to
    change it.

    Link: http://lkml.kernel.org/r/CA+55aFz5-_LKW4JHEBoWinN9_ouNcGRWAF2FUA35u46FRN-Kxw@mail.gmail.com

    Suggested-by: Linus Torvalds
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     
  • We allow PMU driver to change the cpu on which the event
    should be installed to. This happened in patch:

    e2d37cd213dc ("perf: Allow the PMU driver to choose the CPU on which to install events")

    This patch also forces all the group members to follow
    the currently opened events cpu if the group happened
    to be moved.

    This and the change of event->cpu in perf_install_in_context()
    function introduced in:

    0cda4c023132 ("perf: Introduce perf_pmu_migrate_context()")

    forces group members to change their event->cpu,
    if the currently-opened-event's PMU changed the cpu
    and there is a group move.

    Above behaviour causes problem for breakpoint events,
    which uses event->cpu to touch cpu specific data for
    breakpoints accounting. By changing event->cpu, some
    breakpoints slots were wrongly accounted for given
    cpu.

    Vinces's perf fuzzer hit this issue and caused following
    WARN on my setup:

    WARNING: CPU: 0 PID: 20214 at arch/x86/kernel/hw_breakpoint.c:119 arch_install_hw_breakpoint+0x142/0x150()
    Can't find any breakpoint slot
    [...]

    This patch changes the group moving code to keep the event's
    original cpu.

    Reported-by: Vince Weaver
    Signed-off-by: Jiri Olsa
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Vince Weaver
    Cc: Yan, Zheng
    Cc:
    Link: http://lkml.kernel.org/r/1418243031-20367-3-git-send-email-jolsa@kernel.org
    Signed-off-by: Ingo Molnar

    Jiri Olsa
     
  • Pull ACPI and power management updates from Rafael Wysocki:
    "This time we have some more new material than we used to have during
    the last couple of development cycles.

    The most important part of it to me is the introduction of a unified
    interface for accessing device properties provided by platform
    firmware. It works with Device Trees and ACPI in a uniform way and
    drivers using it need not worry about where the properties come from
    as long as the platform firmware (either DT or ACPI) makes them
    available. It covers both devices and "bare" device node objects
    without struct device representation as that turns out to be necessary
    in some cases. This has been in the works for quite a few months (and
    development cycles) and has been approved by all of the relevant
    maintainers.

    On top of that, some drivers are switched over to the new interface
    (at25, leds-gpio, gpio_keys_polled) and some additional changes are
    made to the core GPIO subsystem to allow device drivers to manipulate
    GPIOs in the "canonical" way on platforms that provide GPIO
    information in their ACPI tables, but don't assign names to GPIO lines
    (in which case the driver needs to do that on the basis of what it
    knows about the device in question). That also has been approved by
    the GPIO core maintainers and the rfkill driver is now going to use
    it.

    Second is support for hardware P-states in the intel_pstate driver.
    It uses CPUID to detect whether or not the feature is supported by the
    processor in which case it will be enabled by default. However, it
    can be disabled entirely from the kernel command line if necessary.

    Next is support for a platform firmware interface based on ACPI
    operation regions used by the PMIC (Power Management Integrated
    Circuit) chips on the Intel Baytrail-T and Baytrail-T-CR platforms.
    That interface is used for manipulating power resources and for
    thermal management: sensor temperature reporting, trip point setting
    and so on.

    Also the ACPI core is now going to support the _DEP configuration
    information in a limited way. Basically, _DEP it supposed to reflect
    off-the-hierarchy dependencies between devices which may be very
    indirect, like when AML for one device accesses locations in an
    operation region handled by another device's driver (usually, the
    device depended on this way is a serial bus or GPIO controller). The
    support added this time is sufficient to make the ACPI battery driver
    work on Asus T100A, but it is general enough to be able to cover some
    other use cases in the future.

    Finally, we have a new cpufreq driver for the Loongson1B processor.

    In addition to the above, there are fixes and cleanups all over the
    place as usual and a traditional ACPICA update to a recent upstream
    release.

    As far as the fixes go, the ACPI LPSS (Low-power Subsystem) driver for
    Intel platforms should be able to handle power management of the DMA
    engine correctly, the cpufreq-dt driver should interact with the
    thermal subsystem in a better way and the ACPI backlight driver should
    handle some more corner cases, among other things.

    On top of the ACPICA update there are fixes for race conditions in the
    ACPICA's interrupt handling code which might lead to some random and
    strange looking failures on some systems.

    In the cleanups department the most visible part is the series of
    commits targeted at getting rid of the CONFIG_PM_RUNTIME configuration
    option. That was triggered by a discussion regarding the generic
    power domains code during which we realized that trying to support
    certain combinations of PM config options was painful and not really
    worth it, because nobody would use them in production anyway. For
    this reason, we decided to make CONFIG_PM_SLEEP select
    CONFIG_PM_RUNTIME and that lead to the conclusion that the latter
    became redundant and CONFIG_PM could be used instead of it. The
    material here makes that replacement in a major part of the tree, but
    there will be at least one more batch of that in the second part of
    the merge window.

    Specifics:

    - Support for retrieving device properties information from ACPI _DSD
    device configuration objects and a unified device properties
    interface for device drivers (and subsystems) on top of that. As
    stated above, this works with Device Trees and ACPI and allows
    device drivers to be written in a platform firmware (DT or ACPI)
    agnostic way. The at25, leds-gpio and gpio_keys_polled drivers are
    now going to use this new interface and the GPIO subsystem is
    additionally modified to allow device drivers to assign names to
    GPIO resources returned by ACPI _CRS objects (in case _DSD is not
    present or does not provide the expected data). The changes in
    this set are mostly from Mika Westerberg, Rafael J Wysocki, Aaron
    Lu, and Darren Hart with some fixes from others (Fabio Estevam,
    Geert Uytterhoeven).

    - Support for Hardware Managed Performance States (HWP) as described
    in Volume 3, section 14.4, of the Intel SDM in the intel_pstate
    driver. CPUID is used to detect whether or not the feature is
    supported by the processor. If supported, it will be enabled
    automatically unless the intel_pstate=no_hwp switch is present in
    the kernel command line. From Dirk Brandewie.

    - New Intel Broadwell-H ID for intel_pstate (Dirk Brandewie).

    - Support for firmware interface based on ACPI operation regions used
    by the PMIC chips on the Intel Baytrail-T and Baytrail-T-CR
    platforms for power resource control and thermal management (Aaron
    Lu).

    - Limited support for retrieving off-the-hierarchy dependencies
    between devices from ACPI _DEP device configuration objects and
    deferred probing support for the ACPI battery driver based on the
    _DEP information to make that driver work on Asus T100A (Lan
    Tianyu).

    - New cpufreq driver for the Loongson1B processor (Kelvin Cheung).

    - ACPICA update to upstream revision 20141107 which only affects
    tools (Bob Moore).

    - Fixes for race conditions in the ACPICA's interrupt handling code
    and in the ACPI code related to system suspend and resume (Lv Zheng
    and Rafael J Wysocki).

    - ACPI core fix for an RCU-related issue in the ioremap() regions
    management code that slowed down significantly after CPUs had been
    allowed to enter idle states even if they'd had RCU callbakcs
    queued and triggered some problems in certain proprietary graphics
    driver (and elsewhere). The fix replaces synchronize_rcu() in that
    code with synchronize_rcu_expedited() which makes the issue go
    away. From Konstantin Khlebnikov.

    - ACPI LPSS (Low-Power Subsystem) driver fix to handle power
    management of the DMA engine included into the LPSS correctly. The
    problem is that the DMA engine doesn't have ACPI PM support of its
    own and it simply is turned off when the last LPSS device having
    ACPI PM support goes into D3cold. To work around that, the PM
    domain used by the ACPI LPSS driver is redesigned so at least one
    device with ACPI PM support will be on as long as the DMA engine is
    in use. From Andy Shevchenko.

    - ACPI backlight driver fix to avoid using it on "Win8-compatible"
    systems where it doesn't work and where it was used by default by
    mistake (Aaron Lu).

    - Assorted minor ACPI core fixes and cleanups from Tomasz Nowicki,
    Sudeep Holla, Huang Rui, Hanjun Guo, Fabian Frederick, and Ashwin
    Chaugule (mostly related to the upcoming ARM64 support).

    - Intel RAPL (Running Average Power Limit) power capping driver fixes
    and improvements including new processor IDs (Jacob Pan).

    - Generic power domains modification to power up domains after
    attaching devices to them to meet the expectations of device
    drivers and bus types assuming devices to be accessible at probe
    time (Ulf Hansson).

    - Preliminary support for controlling device clocks from the generic
    power domains core code and modifications of the ARM/shmobile
    platform to use that feature (Ulf Hansson).

    - Assorted minor fixes and cleanups of the generic power domains core
    code (Ulf Hansson, Geert Uytterhoeven).

    - Assorted minor fixes and cleanups of the device clocks control code
    in the PM core (Geert Uytterhoeven, Grygorii Strashko).

    - Consolidation of device power management Kconfig options by making
    CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and removing the latter
    which is now redundant (Rafael J Wysocki and Kevin Hilman). That
    is the first batch of the changes needed for this purpose.

    - Core device runtime power management support code cleanup related
    to the execution of callbacks (Andrzej Hajda).

    - cpuidle ARM support improvements (Lorenzo Pieralisi).

    - cpuidle cleanup related to the CPUIDLE_FLAG_TIME_VALID flag and a
    new MAINTAINERS entry for ARM Exynos cpuidle (Daniel Lezcano and
    Bartlomiej Zolnierkiewicz).

    - New cpufreq driver callback (->ready) to be executed when the
    cpufreq core is ready to use a given policy object and cpufreq-dt
    driver modification to use that callback for cooling device
    registration (Viresh Kumar).

    - cpufreq core fixes and cleanups (Viresh Kumar, Vince Hsu, James
    Geboski, Tomeu Vizoso).

    - Assorted fixes and cleanups in the cpufreq-pcc, intel_pstate,
    cpufreq-dt, pxa2xx cpufreq drivers (Lenny Szubowicz, Ethan Zhao,
    Stefan Wahren, Petr Cvek).

    - OPP (Operating Performance Points) framework modification to allow
    OPPs to be removed too and update of a few cpufreq drivers
    (cpufreq-dt, exynos5440, imx6q, cpufreq) to remove OPPs (added
    during initialization) on driver removal (Viresh Kumar).

    - Hibernation core fixes and cleanups (Tina Ruchandani and Markus
    Elfring).

    - PM Kconfig fix related to CPU power management (Pankaj Dubey).

    - cpupower tool fix (Prarit Bhargava)"

    * tag 'pm+acpi-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (120 commits)
    i2c-omap / PM: Drop CONFIG_PM_RUNTIME from i2c-omap.c
    dmaengine / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    tools: cpupower: fix return checks for sysfs_get_idlestate_count()
    drivers: sh / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    e1000e / igb / PM: Eliminate CONFIG_PM_RUNTIME
    MMC / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    MFD / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    misc / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    media / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    input / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    leds: leds-gpio: Fix multiple instances registration without 'label' property
    iio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    hsi / OMAP / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    i2c-hid / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    drm / exynos / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    gpio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    hwrandom / exynos / PM: Use CONFIG_PM in #ifdef
    block / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM
    USB / PM: Drop CONFIG_PM_RUNTIME from the USB core
    PM: Merge the SET*_RUNTIME_PM_OPS() macros
    ...

    Linus Torvalds
     
  • Pull nmi-safe seq_buf printk update from Steven Rostedt:
    "This code is a fork from the trace-3.19 pull as it needed the
    trace_seq clean ups from that branch.

    This code solves the issue of performing stack dumps from NMI context.
    The issue is that printk() is not safe from NMI context as if the NMI
    were to trigger when a printk() was being performed, the NMI could
    deadlock from the printk() internal locks. This has been seen in
    practice.

    With lots of review from Petr Mladek, this code went through several
    iterations, and we feel that it is now at a point of quality to be
    accepted into mainline.

    Here's what is contained in this patch set:

    - Creates a "seq_buf" generic buffer utility that allows a descriptor
    to be passed around where functions can write their own "printk()"
    formatted strings into it. The generic version was pulled out of
    the trace_seq() code that was made specifically for tracing.

    - The seq_buf code was change to model the seq_file code. I have a
    patch (not included for 3.19) that converts the seq_file.c code
    over to use seq_buf.c like the trace_seq.c code does. This was
    done to make sure that seq_buf.c is compatible with seq_file.c. I
    may try to get that patch in for 3.20.

    - The seq_buf.c file was moved to lib/ to remove it from being
    dependent on CONFIG_TRACING.

    - The printk() was updated to allow for a per_cpu "override" of the
    internal calls. That is, instead of writing to the console, a call
    to printk() may do something else. This made it easier to allow
    the NMI to change what printk() does in order to call dump_stack()
    without needing to update that code as well.

    - Finally, the dump_stack from all CPUs via NMI code was converted to
    use the seq_buf code. The caller to trigger the NMI code would
    wait till all the NMIs finished, and then it would print the
    seq_buf data to the console safely from a non NMI context

    One added bonus is that this code also makes the NMI dump stack work
    on PREEMPT_RT kernels. As printk() includes sleeping locks on
    PREEMPT_RT, printk() only writes to console if the console does not
    use any rt_mutex converted spin locks. Which a lot do"

    * tag 'trace-seq-buf-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    x86/nmi: Fix use of unallocated cpumask_var_t
    printk/percpu: Define printk_func when printk is not defined
    x86/nmi: Perform a safe NMI stack trace on all CPUs
    printk: Add per_cpu printk func to allow printk to be diverted
    seq_buf: Move the seq_buf code to lib/
    seq-buf: Make seq_buf_bprintf() conditional on CONFIG_BINARY_PRINTF
    tracing: Add seq_buf_get_buf() and seq_buf_commit() helper functions
    tracing: Have seq_buf use full buffer
    seq_buf: Add seq_buf_can_fit() helper function
    tracing: Add paranoid size check in trace_printk_seq()
    tracing: Use trace_seq_used() and seq_buf_used() instead of len
    tracing: Clean up tracing_fill_pipe_page()
    seq_buf: Create seq_buf_used() to find out how much was written
    tracing: Add a seq_buf_clear() helper and clear len and readpos in init
    tracing: Convert seq_buf fields to be like seq_file fields
    tracing: Convert seq_buf_path() to be like seq_path()
    tracing: Create seq_buf layer in trace_seq

    Linus Torvalds