15 Jan, 2016

1 commit

  • In include/asm-generic/sections.h:

    /*
    * Usage guidelines:
    * _text, _data: architecture specific, don't use them in
    * arch-independent code
    * [_stext, _etext]: contains .text.* sections, may also contain
    * .rodata.*
    * and/or .init.* sections

    _text is not guaranteed across architectures. Architectures such as ARM
    may reuse parts which are not actually text and erroneously trigger a bug.
    Switch to using _stext which is guaranteed to contain text sections.
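
    As a rough illustration of the kind of check this affects, here is a
    minimal user-space sketch of a "does this address lie in text?" test.
    In the kernel the bounds come from the linker-provided _stext and
    _etext symbols; the struct, the function name, and the half-open
    interval below are assumptions made for the example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the linker-provided section bounds. */
struct text_bounds {
    uintptr_t stext;   /* start of text, models _stext */
    uintptr_t etext;   /* end of text (treated as exclusive), models _etext */
};

/* Return true when addr lies in [stext, etext). */
static bool addr_in_text(const struct text_bounds *b, uintptr_t addr)
{
    assert(b->stext <= b->etext);   /* bounds must be ordered */
    return addr >= b->stext && addr < b->etext;
}
```

    Anchoring the lower bound at _stext rather than _text keeps the range
    limited to what the guidelines above describe as text sections.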

    Came out of https://lkml.kernel.org/g/

    Signed-off-by: Laura Abbott
    Reviewed-by: Kees Cook
    Cc: Russell King
    Cc: Arnd Bergmann
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Laura Abbott
     

14 Jan, 2016

1 commit

  • Pull libnvdimm updates from Dan Williams:
    "The bulk of this has appeared in -next and independently received a
    build success notification from the kbuild robot. The 'for-4.5/block-
    dax' topic branch was rebased over the weekend to drop the "block
    device end-of-life" rework that Al would like to see re-implemented
    with a notifier, and to address bug reports against the badblocks
    integration.

    There is pending feedback against "libnvdimm: Add a poison list and
    export badblocks" received last week. Linda identified some localized
    fixups that we will handle incrementally.

    Summary:

    - Media error handling: The 'badblocks' implementation that
    originated in md-raid is up-levelled to a generic capability of a
    block device. This initial implementation is limited to being
    consulted in the pmem block-i/o path. Later, 'badblocks' will be
    consulted when creating dax mappings.

    - Raw block device dax: For virtualization and other cases that want
    large contiguous mappings of persistent memory, add the capability
    to dax-mmap a block device directly.

    - Increased /dev/mem restrictions: Add an option to treat all
    io-memory as IORESOURCE_EXCLUSIVE, i.e. disable /dev/mem access
    while a driver is actively using an address range. This behavior
    is controlled via the new CONFIG_IO_STRICT_DEVMEM option and can be
    overridden by the existing "iomem=relaxed" kernel command line
    option.

    - Miscellaneous fixes include a 'pfn'-device huge page alignment fix,
    block device shutdown crash fix, and other small libnvdimm fixes"

    * tag 'libnvdimm-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (32 commits)
    block: kill disk_{check|set|clear|alloc}_badblocks
    libnvdimm, pmem: nvdimm_read_bytes() badblocks support
    pmem, dax: disable dax in the presence of bad blocks
    pmem: fail io-requests to known bad blocks
    libnvdimm: convert to statically allocated badblocks
    libnvdimm: don't fail init for full badblocks list
    block, badblocks: introduce devm_init_badblocks
    block: clarify badblocks lifetime
    badblocks: rename badblocks_free to badblocks_exit
    libnvdimm, pmem: move definition of nvdimm_namespace_add_poison to nd.h
    libnvdimm: Add a poison list and export badblocks
    nfit_test: Enable DSMs for all test NFITs
    md: convert to use the generic badblocks code
    block: Add badblock management for gendisks
    badblocks: Add core badblock management code
    block: fix del_gendisk() vs blkdev_ioctl crash
    block: enable dax for raw block devices
    block: introduce bdev_file_inode()
    restrict /dev/mem to idle io memory ranges
    arch: consolidate CONFIG_STRICT_DEVM in lib/Kconfig.debug
    ...

    Linus Torvalds
     

13 Jan, 2016

5 commits

  • Pull tracing updates from Steven Rostedt:
    "Not much new with tracing for this release. Mostly just clean ups and
    minor fixes.

    Here's what else is new:

    - A new TRACE_EVENT_FN_COND macro, combining both _FN and _COND for
    those that want both.

    - New selftest to test the instance create and delete

    - Better debug output when ftrace fails"

    * tag 'trace-v4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (24 commits)
    ftrace: Fix the race between ftrace and insmod
    ftrace: Add infrastructure for delayed enabling of module functions
    x86: ftrace: Fix the comments for ftrace_modify_code_direct()
    tracing: Fix comment to use tracing_on over tracing_enable
    metag: ftrace: Fix the comments for ftrace_modify_code
    sh: ftrace: Fix the comments for ftrace_modify_code()
    ia64: ftrace: Fix the comments for ftrace_modify_code()
    ftrace: Clean up ftrace_module_init() code
    ftrace: Join functions ftrace_module_init() and ftrace_init_module()
    tracing: Introduce TRACE_EVENT_FN_COND macro
    tracing: Use seq_buf_used() in seq_buf_to_user() instead of len
    bpf: Constify bpf_verifier_ops structure
    ftrace: Have ftrace_ops_get_func() handle RCU and PER_CPU flags too
    ftrace: Remove use of control list and ops
    ftrace: Fix output of enabled_functions for showing tramp
    ftrace: Fix a typo in comment
    ftrace: Show all tramps registered to a record on ftrace_bug()
    ftrace: Add variable ftrace_expected for archs to show expected code
    ftrace: Add new type to distinguish what kind of ftrace_bug()
    tracing: Update cond flag when enabling or disabling a trigger
    ...

    Linus Torvalds
     
  • Pull networking updates from David Miller:

    1) Support busy polling generically, for all NAPI drivers. From Eric
    Dumazet.

    2) Add byte/packet counter support to nft_ct, from Florian Westphal.

    3) Add RSS/XPS support to mvneta driver, from Gregory Clement.

    4) Implement IPV6_HDRINCL socket option for raw sockets, from Hannes
    Frederic Sowa.

    5) Add support for T6 adapter to cxgb4 driver, from Hariprasad Shenai.

    6) Add support for VLAN device bridging to mlxsw switch driver, from
    Ido Schimmel.

    7) Add driver for Netronome NFP4000/NFP6000, from Jakub Kicinski.

    8) Provide hwmon interface to mlxsw switch driver, from Jiri Pirko.

    9) Reorganize wireless drivers into per-vendor directories just like we
    do for ethernet drivers. From Kalle Valo.

    10) Provide a way for administrators to "destroy" connected sockets via
    the SOCK_DESTROY socket netlink diag operation. From Lorenzo Colitti.

    11) Add support to add/remove multicast routes via netlink, from Nikolay
    Aleksandrov.

    12) Make TCP keepalive settings per-namespace, from Nikolay Borisov.

    13) Add forwarding and packet duplication facilities to nf_tables, from
    Pablo Neira Ayuso.

    14) Dead route support in MPLS, from Roopa Prabhu.

    15) TSO support for thunderx chips, from Sunil Goutham.

    16) Add driver for IBM's System i/p VNIC protocol, from Thomas Falcon.

    17) Rationalize, consolidate, and more completely document the checksum
    offloading facilities in the networking stack. From Tom Herbert.

    18) Support aborting an ongoing scan in mac80211/cfg80211, from
    Vidyullatha Kanchanapally.

    19) Use per-bucket spinlock for bpf hash facility, from Tom Leiming.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1375 commits)
    net: bnxt: always return values from _bnxt_get_max_rings
    net: bpf: reject invalid shifts
    phonet: properly unshare skbs in phonet_rcv()
    dwc_eth_qos: Fix dma address for multi-fragment skbs
    phy: remove an unneeded condition
    mdio: remove an unneed condition
    mdio_bus: NULL dereference on allocation error
    net: Fix typo in netdev_intersect_features
    net: freescale: mac-fec: Fix build error from phy_device API change
    net: freescale: ucc_geth: Fix build error from phy_device API change
    bonding: Prevent IPv6 link local address on enslaved devices
    IB/mlx5: Add flow steering support
    net/mlx5_core: Export flow steering API
    net/mlx5_core: Make ipv4/ipv6 location more clear
    net/mlx5_core: Enable flow steering support for the IB driver
    net/mlx5_core: Initialize namespaces only when supported by device
    net/mlx5_core: Set priority attributes
    net/mlx5_core: Connect flow tables
    net/mlx5_core: Introduce modify flow table command
    net/mlx5_core: Managing root flow table
    ...

    Linus Torvalds
     
  • Pull crypto update from Herbert Xu:
    "Algorithms:
    - Add RSA padding algorithm

    Drivers:
    - Add GCM mode support to atmel
    - Add atmel support for SAMA5D2 devices
    - Add cipher modes to talitos
    - Add rockchip driver for rk3288
    - Add qat support for C3XXX and C62X"

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (103 commits)
    crypto: hifn_795x, picoxcell - use ablkcipher_request_cast
    crypto: qat - fix SKU definiftion for c3xxx dev
    crypto: qat - Fix random config build issue
    crypto: ccp - use to_pci_dev and to_platform_device
    crypto: qat - Rename dh895xcc mmp firmware
    crypto: 842 - remove WARN inside printk
    crypto: atmel-aes - add debug facilities to monitor register accesses.
    crypto: atmel-aes - add support to GCM mode
    crypto: atmel-aes - change the DMA threshold
    crypto: atmel-aes - fix the counter overflow in CTR mode
    crypto: atmel-aes - fix atmel-ctr-aes driver for RFC 3686
    crypto: atmel-aes - create sections to regroup functions by usage
    crypto: atmel-aes - fix typo and indentation
    crypto: atmel-aes - use SIZE_IN_WORDS() helper macro
    crypto: atmel-aes - improve performances of data transfer
    crypto: atmel-aes - fix atmel_aes_remove()
    crypto: atmel-aes - remove useless AES_FLAGS_DMA flag
    crypto: atmel-aes - reduce latency of DMA completion
    crypto: atmel-aes - remove unused 'err' member of struct atmel_aes_dev
    crypto: atmel-aes - rework crypto request completion
    ...

    Linus Torvalds
     
  • Pull misc vfs updates from Al Viro:
    "All kinds of stuff. That probably should've been 5 or 6 separate
    branches, but by the time I'd realized how large and mixed that bag
    had become it had been too close to -final to play with rebasing.

    Some fs/namei.c cleanups there, memdup_user_nul() introduction and
    switching open-coded instances, burying long-dead code, whack-a-mole
    of various kinds, several new helpers for ->llseek(), assorted
    cleanups and fixes from various people, etc.
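
    For readers unfamiliar with the pattern that memdup_user_nul()
    replaces, here is a user-space analogue: duplicate len bytes and
    append a NUL terminator so the result can be treated as a string.
    The name memdup_nul() is hypothetical, and unlike the kernel helper
    this sketch copies from ordinary memory and reports failure with
    NULL.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy len bytes from src into a fresh buffer and NUL-terminate it.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static char *memdup_nul(const void *src, size_t len)
{
    char *p;

    assert(src != NULL);
    if (len == SIZE_MAX)        /* len + 1 would overflow */
        return NULL;

    p = malloc(len + 1);
    if (!p)
        return NULL;

    memcpy(p, src, len);
    p[len] = '\0';              /* the step open-coded callers tend to repeat */
    return p;
}
```

    Folding the allocation, the copy, and the terminator into one helper
    is what lets the open-coded instances be switched over.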

    One piece probably deserves special mention - Neil's
    lookup_one_len_unlocked(). Similar to lookup_one_len(), but gets
    called without ->i_mutex and tries to avoid ever taking it. That, of
    course, means that it's not useful for any directory modifications,
    but things like getting inode attributes in nfsd readdirplus are fine
    with that. I really should've asked for a moratorium on lookup-related
    changes this cycle, but since I hadn't done that early enough... I
    *am* asking for that for the coming cycle, though - I'm going to try
    and get conversion of i_mutex to rwsem with ->lookup() done under lock
    taken shared.

    There will be a patch closer to the end of the window, along the lines
    of the one Linus had posted last May - mechanical conversion of
    ->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/
    inode_is_locked()/inode_lock_nested(). To quote Linus back then:

    -----
    | This is an automated patch using
    |
    | sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/'
    | sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/'
    | sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[ ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/'
    | sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/'
    | sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/'
    |
    | with a very few manual fixups
    -----
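
    The effect of that conversion can be sketched in user space. The
    struct mutex below is a toy, single-threaded stand-in (not the kernel
    type), and inode_lock_nested() is left out for brevity; the point is
    only that callers stop naming ->i_mutex directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy single-threaded stand-in for the kernel's struct mutex. */
struct mutex { bool locked; };

static void mutex_lock(struct mutex *m)      { assert(!m->locked); m->locked = true; }
static void mutex_unlock(struct mutex *m)    { assert(m->locked); m->locked = false; }
static bool mutex_is_locked(struct mutex *m) { return m->locked; }
static bool mutex_trylock(struct mutex *m)
{
    if (m->locked)
        return false;
    m->locked = true;
    return true;
}

/* Reduced inode: only the lock matters for this sketch. */
struct inode { struct mutex i_mutex; };

/* Wrappers matching the right-hand sides of the sed expressions. */
static inline void inode_lock(struct inode *inode)      { mutex_lock(&inode->i_mutex); }
static inline void inode_unlock(struct inode *inode)    { mutex_unlock(&inode->i_mutex); }
static inline bool inode_trylock(struct inode *inode)   { return mutex_trylock(&inode->i_mutex); }
static inline bool inode_is_locked(struct inode *inode) { return mutex_is_locked(&inode->i_mutex); }
```

    Once every caller goes through the wrappers, the lock type behind
    them can change (for instance to an rwsem, as mentioned above)
    without touching the call sites again.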

    I'm going to send that once the ->i_mutex-affecting stuff in -next
    gets mostly merged (or when Linus says he's about to stop taking
    merges)"

    * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    nfsd: don't hold i_mutex over userspace upcalls
    fs:affs:Replace time_t with time64_t
    fs/9p: use fscache mutex rather than spinlock
    proc: add a reschedule point in proc_readfd_common()
    logfs: constify logfs_block_ops structures
    fcntl: allow to set O_DIRECT flag on pipe
    fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE
    fs: xattr: Use kvfree()
    [s390] page_to_phys() always returns a multiple of PAGE_SIZE
    nbd: use ->compat_ioctl()
    fs: use block_device name vsprintf helper
    lib/vsprintf: add %*pg format specifier
    fs: use gendisk->disk_name where possible
    poll: plug an unused argument to do_poll
    amdkfd: don't open-code memdup_user()
    cdrom: don't open-code memdup_user()
    rsxx: don't open-code memdup_user()
    mtip32xx: don't open-code memdup_user()
    [um] mconsole: don't open-code memdup_user_nul()
    [um] hostaudio: don't open-code memdup_user()
    ...

    Linus Torvalds
     
  • Pull iov_iter infrastructure updates from Al Viro:
    "A couple of iov_iter updates"

    * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    iov_iter: export import_single_range()
    iov_iter: constify {csum_and_,}copy_to_iter()

    Linus Torvalds
     

12 Jan, 2016

4 commits

  • Pull MMC updates from Ulf Hansson:
    "MMC core:
    - Optimize boot time by detecting cards simultaneously
    - Make runtime resume default behavior for MMC/SD
    - Enable MMC/SD/SDIO devices to suspend/resume asynchronously
    - Allow more than 8 partitions per card
    - Introduce MMC_CAP2_NO_SDIO to prevent unsupported SDIO commands
    - Support the standard DT wakeup-source property
    - Fix driver strength switching for HS200 and HS400
    - Fix switch command timeout
    - Fix invalid vdd in voltage switch power cycle for SDIO

    MMC host:
    - sdhci: Restore behavior when setting VDD via external regulator
    - sdhci: A couple of changes/fixes related to the dma support
    - sdhci-tegra: Add Tegra210 support
    - sdhci-tegra: Support for UHS-I cards including tuning support
    - sdhci-of-at91: Add PM support
    - sh_mmcif: Rework dma channel handling
    - mvsdio: Delete platform data code path"

    * tag 'mmc-v4.5' of git://git.linaro.org/people/ulf.hansson/mmc: (52 commits)
    mmc: dw_mmc: remove the unused quirks
    mmc: sdhci-pci: use to_pci_dev()
    mmc: cb710: use to_platform_device()
    mmc: tegra: use correct accessor for misc ctrl register
    mmc: tegra: enable UHS-I modes
    mmc: tegra: implement UHS tuning
    mmc: tegra: disable SPI_MODE_CLKEN
    mmc: tegra: implement module external clock change
    mmc: sdhci: restore behavior when setting VDD via external regulator
    mmc: It is not an error for the card to be removed while suspended
    mmc: block: Allow more than 8 partitions per card
    mmc: core: Optimize boot time by detecting cards simultaneously
    mmc: dw_mmc: use resource_size_t to store physical address
    mmc: core: fix __mmc_switch timeout caused by preempt
    mmc: usdhi6rol0: handle NULL data in timeout
    mmc: of_mmc_spi: Add IRQF_ONESHOT to interrupt flags
    mmc: mediatek: change some dev_err to dev_dbg
    mmc: enable MMC/SD/SDIO device to suspend/resume asynchronously
    mmc: sdhci: Fix sdhci_runtime_pm_bus_on/off()
    mmc: sdhci: 64-bit DMA actually has 4-byte alignment
    ...

    Linus Torvalds
     
  • Pull workqueue update from Tejun Heo:
    "Workqueue changes for v4.5. One cleanup patch and three to improve
    the debuggability.

    Workqueue now has a stall detector which dumps workqueue state if any
    worker pool hasn't made forward progress over a certain amount of time
    (30s by default) and also triggers a warning if a workqueue which can
    be used in memory reclaim path tries to wait on something which can't
    be.

    These should make workqueue hangs a lot easier to debug."
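
    The stall condition described above boils down to a timestamp
    comparison. A minimal sketch, assuming a hypothetical pool_state
    record that stores when the pool last made progress; none of these
    names are taken from the workqueue code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STALL_THRESHOLD_SECS 30   /* default quoted in the text above */

/* Hypothetical model of a worker pool: only the timestamp matters here. */
struct pool_state {
    uint64_t last_progress_secs;  /* when the pool last made forward progress */
};

/* A pool counts as stalled once it has been idle for more than the threshold. */
static bool pool_stalled(const struct pool_state *p, uint64_t now_secs,
                         uint64_t threshold_secs)
{
    assert(now_secs >= p->last_progress_secs);   /* time must not run backwards */
    return now_secs - p->last_progress_secs > threshold_secs;
}
```

    A periodic checker would call pool_stalled() for each pool and dump
    state for any pool that returns true.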

    * 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    workqueue: simplify the apply_workqueue_attrs_locked()
    workqueue: implement lockup detector
    watchdog: introduce touch_softlockup_watchdog_sched()
    workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue

    Linus Torvalds
     
  • Pull perf updates from Ingo Molnar:
    "Kernel side changes:

    - Intel Knights Landing support. (Harish Chegondi)

    - Intel Broadwell-EP uncore PMU support. (Kan Liang)

    - Core code improvements. (Peter Zijlstra.)

    - Event filter, LBR and PEBS fixes. (Stephane Eranian)

    - Enable cycles:pp on Intel Atom. (Stephane Eranian)

    - Add cycles:ppp support for Skylake. (Andi Kleen)

    - Various x86 NMI overhead optimizations. (Andi Kleen)

    - Intel PT enhancements. (Takao Indoh)

    - AMD cache events fix. (Vince Weaver)

    Tons of tooling changes:

    - Show random perf tool tips in the 'perf report' bottom line
    (Namhyung Kim)

    - perf report now defaults to --group if the perf.data file has
    grouped events, try it with:

    # perf record -e '{cycles,instructions}' -a sleep 1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 1.093 MB perf.data (1247 samples) ]
    # perf report
    # Samples: 1K of event 'anon group { cycles, instructions }'
    # Event count (approx.): 1955219195
    #
    # Overhead Command Shared Object Symbol

    2.86% 0.22% swapper [kernel.kallsyms] [k] intel_idle
    1.05% 0.33% firefox libxul.so [.] js::SetObjectElement
    1.05% 0.00% kworker/0:3 [kernel.kallsyms] [k] gen6_ring_get_seqno
    0.88% 0.17% chrome chrome [.] 0x0000000000ee27ab
    0.65% 0.86% firefox libxul.so [.] js::ValueToId
    0.64% 0.23% JS Helper libxul.so [.] js::SplayTree::splay
    0.62% 1.27% firefox libxul.so [.] js::GetIterator
    0.61% 1.74% firefox libxul.so [.] js::NativeSetProperty
    0.61% 0.31% firefox libxul.so [.] js::SetPropertyByDefining

    - Introduce the 'perf stat record/report' workflow:

    Generate perf.data files from 'perf stat', to tap into the
    scripting capabilities perf has instead of defining a 'perf stat'
    specific scripting support to calculate event ratios, etc.

    Simple example:

    $ perf stat record -e cycles usleep 1

    Performance counter stats for 'usleep 1':

    1,134,996 cycles

    0.000670644 seconds time elapsed

    $ perf stat report

    Performance counter stats for '/home/acme/bin/perf stat record -e cycles usleep 1':

    1,134,996 cycles

    0.000670644 seconds time elapsed

    $

    It generates PERF_RECORD_ userspace records to store the details:

    $ perf report -D | grep PERF_RECORD
    0xf0 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 27637
    0x118 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535
    0x12a [0x40]: PERF_RECORD_STAT_CONFIG
    0x16a [0x30]: PERF_RECORD_STAT
    -1 -1 0x19a [0x40]: PERF_RECORD_MMAP -1/0: [0xffffffff81000000(0x1f000000) @ 0xffffffff81000000]: x [kernel.kallsyms]_text
    0x1da [0x18]: PERF_RECORD_STAT_ROUND
    [acme@ssdandy linux]$

    An effort was made so that perf.data files generated like this do
    not produce cryptic messages when processed by older tools.

    The 'perf script' bits need rebasing, will go up later.

    - Make command line options always available, even when they depend
    on some feature being enabled, warning the user about use of such
    options (Wang Nan)

    - Support hw breakpoint events (mem:0xAddress) in the default output
    mode in 'perf script' (Wang Nan)

    - Fixes and improvements for supporting annotating ARM binaries,
    support ARM call and jump instructions, more work needed to have
    arch specific stuff separated into tools/perf/arch/*/annotate/
    (Russell King)

    - Add initial 'perf config' command, for now just with a --list
    command to show the contents of the configuration file in use and a
    basic man page describing its format; commands for doing edits and
    detailed documentation are being reviewed and proof-read. (Taeung
    Song)

    - Allow BPF scriptlets to specify arguments to be fetched using DWARF
    info, using a prologue generated at compile/build time (He Kuang,
    Wang Nan)

    - Allow attaching BPF scriptlets to module symbols (Wang Nan)

    - Allow attaching BPF scriptlets to userspace code using uprobe (Wang
    Nan)

    - BPF programs can now specify 'perf probe' tunables via their section
    names, separating key=val values using semicolons (Wang Nan)

    Testing some of these new BPF features:

    Use case: get callchains when receiving SSL packets, filter them in the
    kernel, at an arbitrary place.

    # cat ssl.bpf.c
    #define SEC(NAME) __attribute__((section(NAME), used))

    struct pt_regs;

    SEC("func=__inet_lookup_established hnum")
    int func(struct pt_regs *ctx, int err, unsigned short port)
    {
            return err == 0 && port == 443;
    }

    char _license[] SEC("license") = "GPL";
    int _version SEC("version") = LINUX_VERSION_CODE;
    #
    # perf record -a -g -e ssl.bpf.c
    ^C[ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.787 MB perf.data (3 samples) ]
    # perf script | head -30
    swapper 0 [000] 58783.268118: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
    8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
    896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
    8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
    855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
    8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
    8572a8 process_backlog (/lib/modules/4.3.0+/build/vmlinux)
    856b11 net_rx_action (/lib/modules/4.3.0+/build/vmlinux)
    2a284b __do_softirq (/lib/modules/4.3.0+/build/vmlinux)
    2a2ba3 irq_exit (/lib/modules/4.3.0+/build/vmlinux)
    96b7a4 do_IRQ (/lib/modules/4.3.0+/build/vmlinux)
    969807 ret_from_intr (/lib/modules/4.3.0+/build/vmlinux)
    2dede5 cpu_startup_entry (/lib/modules/4.3.0+/build/vmlinux)
    95d5bc rest_init (/lib/modules/4.3.0+/build/vmlinux)
    1163ffa start_kernel ([kernel.vmlinux].init.text)
    11634d7 x86_64_start_reservations ([kernel.vmlinux].init.text)
    1163623 x86_64_start_kernel ([kernel.vmlinux].init.text)

    qemu-system-x86 9178 [003] 58785.792417: perf_bpf_probe:func: (ffffffff816a0f60) hnum=0x1bb
    8a0f61 __inet_lookup_established (/lib/modules/4.3.0+/build/vmlinux)
    896def ip_rcv_finish (/lib/modules/4.3.0+/build/vmlinux)
    8976c2 ip_rcv (/lib/modules/4.3.0+/build/vmlinux)
    855eba __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
    8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
    856660 netif_receive_skb_internal (/lib/modules/4.3.0+/build/vmlinux)
    8566ec netif_receive_skb_sk (/lib/modules/4.3.0+/build/vmlinux)
    430a br_handle_frame_finish ([bridge])
    48bc br_handle_frame ([bridge])
    855f44 __netif_receive_skb_core (/lib/modules/4.3.0+/build/vmlinux)
    8565d8 __netif_receive_skb (/lib/modules/4.3.0+/build/vmlinux)
    #

    - Use 'perf probe' various options to list functions, see what
    variables can be collected at any given point, experiment first
    collecting without a filter, then filter, use it together with
    'perf trace', 'perf top', with or without callchains, if it
    explodes, please tell us!

    - Introduce a new callchain mode: "folded", that will list per line
    representations of all callchains for a given histogram entry,
    facilitating 'perf report' output processing by other tools, such
    as Brendan Gregg's flamegraph tools (Namhyung Kim)

    E.g:

    # perf report | grep -v ^# | head
    18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
    |
    ---cpu_startup_entry
    |
    |--12.07%--start_secondary
    |
    --6.30%--rest_init
    start_kernel
    x86_64_start_reservations
    x86_64_start_kernel
    #

    Becomes, in "folded" mode:

    # perf report -g folded | grep -v ^# | head -5
    18.37% 0.00% swapper [kernel.kallsyms] [k] cpu_startup_entry
    12.07% cpu_startup_entry;start_secondary
    6.30% cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
    16.90% 0.00% swapper [kernel.kallsyms] [k] call_cpuidle
    11.23% call_cpuidle;cpu_startup_entry;start_secondary
    5.67% call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
    16.90% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter
    11.23% cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
    5.67% cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel
    15.12% 0.00% swapper [kernel.kallsyms] [k] cpuidle_enter_state
    #

    The user can also select one of "count", "period" or "percent" as
    the first column.
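
    The "folded" representation is simply the frames of one callchain
    joined with ';'. A minimal sketch of that joining step follows; the
    function name and signature are made up for illustration and are not
    part of perf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Join n frame names with ';' into out (capacity outsz bytes).
 * Returns 0 on success, -1 if the folded line does not fit.
 */
static int fold_callchain(const char *const *frames, size_t n,
                          char *out, size_t outsz)
{
    size_t used = 0;

    assert(out != NULL && outsz > 0);
    out[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        /* Prefix every frame except the first with the separator. */
        int w = snprintf(out + used, outsz - used, "%s%s",
                         i ? ";" : "", frames[i]);
        if (w < 0 || (size_t)w >= outsz - used)
            return -1;      /* truncated or encoding error */
        used += (size_t)w;
    }
    return 0;
}
```

    One line per callchain is what makes the output easy to feed to
    line-oriented tools such as the flamegraph scripts mentioned above.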

    ... and lots of infrastructure enhancements, plus fixes and other
    changes, features I failed to list - see the shortlog and the git log
    for details"

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (271 commits)
    perf evlist: Add --trace-fields option to show trace fields
    perf record: Store data mmaps for dwarf unwind
    perf libdw: Check for mmaps also in MAP__VARIABLE tree
    perf unwind: Check for mmaps also in MAP__VARIABLE tree
    perf unwind: Use find_map function in access_dso_mem
    perf evlist: Remove perf_evlist__(enable|disable)_event functions
    perf evlist: Make perf_evlist__open() open evsels with their cpus and threads (like perf record does)
    perf report: Show random usage tip on the help line
    perf hists: Export a couple of hist functions
    perf diff: Use perf_hpp__register_sort_field interface
    perf tools: Add overhead/overhead_children keys defaults via string
    perf tools: Remove list entry from struct sort_entry
    perf tools: Include all tools/lib directory for tags/cscope/TAGS targets
    perf script: Align event name properly
    perf tools: Add missing headers in perf's MANIFEST
    perf tools: Do not show trace command if it's not compiled in
    perf report: Change default to use event group view
    perf top: Decay periods in callchains
    tools lib: Move bitmap.[ch] from tools/perf/ to tools/{lib,include}/
    tools lib: Sync tools/lib/find_bit.c with the kernel
    ...

    Linus Torvalds
     
  • Pull locking updates from Ingo Molnar:
    "So we have a laundry list of locking subsystem changes:

    - continuing barrier API and code improvements

    - futex enhancements

    - atomics API improvements

    - pvqspinlock enhancements: in particular lock stealing and adaptive
    spinning

    - qspinlock micro-enhancements"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op
    futex: Cleanup the goto confusion in requeue_pi()
    futex: Remove pointless put_pi_state calls in requeue()
    futex: Document pi_state refcounting in requeue code
    futex: Rename free_pi_state() to put_pi_state()
    futex: Drop refcount if requeue_pi() acquired the rtmutex
    locking/barriers, arch: Remove ambiguous statement in the smp_store_mb() documentation
    lcoking/barriers, arch: Use smp barriers in smp_store_release()
    locking/cmpxchg, arch: Remove tas() definitions
    locking/pvqspinlock: Queue node adaptive spinning
    locking/pvqspinlock: Allow limited lock stealing
    locking/pvqspinlock: Collect slowpath lock statistics
    sched/core, locking: Document Program-Order guarantees
    locking, sched: Introduce smp_cond_acquire() and use it
    locking/pvqspinlock, x86: Optimize the PV unlock code path
    locking/qspinlock: Avoid redundant read of next pointer
    locking/qspinlock: Prefetch the next node cacheline
    locking/qspinlock: Use _acquire/_release() versions of cmpxchg() & xchg()
    atomics: Add test for atomic operations with _relaxed variants

    Linus Torvalds
     

09 Jan, 2016

3 commits

  • This effectively promotes IORESOURCE_BUSY to IORESOURCE_EXCLUSIVE
    semantics by default. If userspace really believes it is safe to access
    the memory region it can also perform the extra step of disabling an
    active driver. This protects device address ranges with read side
    effects and otherwise directs userspace to use the driver.

    Persistent memory presents a large "mistake surface" to /dev/mem as now
    accidental writes can corrupt a filesystem.

    In general if a device driver is busily using a memory region it already
    informs other parts of the kernel to not touch it via
    request_mem_region(). /dev/mem should honor the same safety restriction
    by default. Debugging a device driver from userspace becomes more
    difficult with this enabled. Any application using /dev/mem or mmap of
    sysfs pci resources will now need to perform the extra step of either:

    1/ Disabling the driver, for example:

    echo > /dev/bus//drivers//unbind

    2/ Rebooting with "iomem=relaxed" on the command line

    3/ Recompiling with CONFIG_IO_STRICT_DEVMEM=n

    Traditional users of /dev/mem like dosemu are unaffected because the
    first 1MB of memory is not subject to the IO_STRICT_DEVMEM restriction.
    Legacy X configurations use /dev/mem to talk to graphics hardware, but
    that functionality has since moved to kernel graphics drivers.
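
    The rules in this message can be condensed into a small decision
    function. This is a sketch of the policy as described here, not the
    kernel implementation: the struct, the function name, and the way
    "busy" is passed in are all assumptions for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LOW_MEM_LIMIT 0x100000ULL   /* first 1MB, exempt per the text above */

/* Hypothetical inputs: none of these names are kernel APIs. */
struct devmem_policy {
    bool strict_enabled;   /* models CONFIG_IO_STRICT_DEVMEM=y */
    bool iomem_relaxed;    /* models "iomem=relaxed" on the command line */
};

/* Decide whether a /dev/mem access to addr would be permitted. */
static bool devmem_access_allowed(const struct devmem_policy *p,
                                  uint64_t addr, bool driver_busy)
{
    assert(p != NULL);

    if (addr < LOW_MEM_LIMIT)
        return true;                /* legacy users such as dosemu */
    if (!p->strict_enabled || p->iomem_relaxed)
        return true;                /* restriction compiled out or overridden */
    return !driver_busy;            /* busy ranges are off limits */
}
```

    Unbinding the driver corresponds to driver_busy becoming false, which
    is why option 1/ above restores access without a reboot.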

    Cc: Arnd Bergmann
    Cc: Russell King
    Cc: Andrew Morton
    Cc: Greg Kroah-Hartman
    Acked-by: Kees Cook
    Acked-by: Ingo Molnar
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Let all the archs that implement devmem_is_allowed() opt-in to a common
    definition of CONFIG_STRICT_DEVMEM in lib/Kconfig.debug.

    Cc: Kees Cook
    Cc: Russell King
    Cc: Will Deacon
    Cc: Benjamin Herrenschmidt
    Cc: Martin Schwidefsky
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Andrew Morton
    Cc: Greg Kroah-Hartman
    Cc: "David S. Miller"
    Acked-by: Catalin Marinas
    Acked-by: Heiko Carstens
    [heiko: drop 'default y' for s390]
    Acked-by: Ingo Molnar
    Suggested-by: Arnd Bergmann
    Signed-off-by: Dan Williams

    Dan Williams
     
  • Al Viro
     

07 Jan, 2016

1 commit

  • This allows printing a block_device name directly.
    Currently one has to use bdevname() with a temporary char buffer.
    This is very inefficient because it bloats stack usage for deep IO call traces.

    Example:
    %pg -> sda, sda1 or loop0p1
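
    The three names in the example follow a simple rule that can be
    sketched in user space: the whole disk prints as-is, and a partition
    appends its number, with a 'p' in between when the disk name already
    ends in a digit. The helper below is hypothetical and only mirrors
    the names shown in the example; it is not the vsprintf code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a block device name into out. partno 0 means the whole disk.
 * Returns what snprintf() returns.
 */
static int format_bdev_name(char *out, size_t outsz,
                            const char *disk, int partno)
{
    size_t len = strlen(disk);

    assert(len > 0);
    if (partno == 0)
        return snprintf(out, outsz, "%s", disk);
    if (isdigit((unsigned char)disk[len - 1]))
        return snprintf(out, outsz, "%sp%d", disk, partno);  /* loop0 -> loop0p1 */
    return snprintf(out, outsz, "%s%d", disk, partno);       /* sda -> sda1 */
}
```

    Note that the caller still supplies a buffer here, which is exactly
    the temporary the new format specifier removes from kernel callers.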

    [AV: fixed a minor braino - position updates should not be dependent
    upon having reached the end of buffer]

    Signed-off-by: Dmitry Monakhov
    Signed-off-by: Al Viro

    Dmitry Monakhov
     

06 Jan, 2016

1 commit

  • …k/linux-rcu into core/rcu

    Pull RCU changes from Paul E. McKenney:

    - Adding transitivity uniformly to rcu_node structure ->lock
    acquisitions. (This is implemented by the first two commits
    on top of v4.4-rc2 due to the pervasive nature of this change.)

    - Documentation updates, including RCU requirements.

    - Expedited grace-period changes.

    - Miscellaneous fixes.

    - Linked-list fixes, courtesy of KTSAN.

    - Torture-test updates.

    - Late-breaking fix to sysrq-generated crash.

    Signed-off-by: Ingo Molnar <mingo@kernel.org>

    Ingo Molnar
     

04 Jan, 2016

1 commit


01 Jan, 2016

1 commit


24 Dec, 2015

1 commit

  • commit 5ac48378414d ("tracing: Use trace_seq_used() and seq_buf_used()
    instead of len") changed the tracing code to use trace_seq_used() and
    seq_buf_used() instead of using the seq_buf len directly to avoid
    overflow issues, but missed a spot in seq_buf_to_user() that makes use
    of s->len.

    Cleaned up the code a bit as well per suggestion of Steve Rostedt.
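
    The reason for preferring seq_buf_used() over the raw length can be
    shown with a reduced model. The struct below keeps only the fields
    needed for the illustration, and the assumption that an overflowed
    buffer reports a len larger than its size is stated here rather than
    taken from this commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-in for the kernel's struct seq_buf (other fields omitted). */
struct seq_buf {
    char *buffer;
    size_t size;   /* capacity of buffer */
    size_t len;    /* assumed to exceed size after an overflow */
};

/* Number of bytes that are actually valid in the buffer. */
static size_t seq_buf_used(const struct seq_buf *s)
{
    assert(s != NULL);
    return s->len < s->size ? s->len : s->size;
}
```

    Copying seq_buf_used() bytes instead of len bytes keeps a reader from
    running past the end of the buffer when an overflow has occurred.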

    Link: http://lkml.kernel.org/r/1447703848-2951-1-git-send-email-jsnitsel@redhat.com

    Signed-off-by: Jerry Snitselaar
    Signed-off-by: Steven Rostedt

    Jerry Snitselaar
     

23 Dec, 2015

1 commit

  • Remove the WARN() from the beN_to_cpu macro, which is used as a param to a
    pr_debug() call. With a certain kernel config, this printk-in-printk
    results in the no_printk() macro trying to recursively call the
    no_printk() macro, and since macros can't recursively call themselves
    a build error results.

    Reported-by: Randy Dunlap
    Signed-off-by: Dan Streetman
    Signed-off-by: Herbert Xu

    Dan Streetman
     

22 Dec, 2015

1 commit


19 Dec, 2015

2 commits

  • The commit c6ff5268293ef98e48a99597e765ffc417e39fa5 ("rhashtable:
    Fix walker list corruption") causes a suspicious RCU usage warning
    because we no longer hold ht->mutex when we dereference ht->tbl.

    However, this is a false positive because we now hold ht->lock
    which also guarantees that ht->tbl won't disappear from under us.

    This patch kills the warning by using rcu_dereference_protected.

    Reported-by: kernel test robot
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Add a couple of test cases for the interpreter but also for JITs, e.g. to
    test that when imm32 moves are being done, the upper 32 bits of the regs
    are being zero extended.

    Without JIT:

    [...]
    [ 1114.129301] test_bpf: #43 MOV REG64 jited:0 128 PASS
    [ 1114.130626] test_bpf: #44 MOV REG32 jited:0 139 PASS
    [ 1114.132055] test_bpf: #45 LD IMM64 jited:0 124 PASS
    [...]

    With JIT (generated code can as usual be nicely verified with the help of
    bpf_jit_disasm tool):

    [...]
    [ 1062.726782] test_bpf: #43 MOV REG64 jited:1 6 PASS
    [ 1062.726890] test_bpf: #44 MOV REG32 jited:1 6 PASS
    [ 1062.726993] test_bpf: #45 LD IMM64 jited:1 6 PASS
    [...]

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
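
    The zero-extension property these tests check can be sketched in plain C.
    The helper names below are hypothetical and only model the register value
    after a move; they assume that a 32-bit immediate move clears the upper
    half of the 64-bit register, while a 64-bit move of the same immediate
    sign-extends it.

```c
#include <stdint.h>

/* Model of a 32-bit immediate move: the result is zero-extended to 64 bits. */
static uint64_t mov32_imm(int32_t imm)
{
	return (uint64_t)(uint32_t)imm;
}

/* Model of a 64-bit move of a 32-bit immediate: the value is sign-extended. */
static uint64_t mov64_imm(int32_t imm)
{
	return (uint64_t)(int64_t)imm;
}
```

    With an immediate of -1 the two helpers differ only in the upper 32 bits,
    which is exactly the half a JIT could get wrong.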
     

18 Dec, 2015

3 commits

  • Conflicts:
    drivers/net/geneve.c

    Here we had an overlapping change, where in 'net' the extraneous stats
    bump was being removed whilst in 'net-next' the final argument to
    udp_tunnel6_xmit_skb() was being changed.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Pull networking fixes from David Miller:

    1) Fix uninitialized variable warnings in nfnetlink_queue, a lot of
    people reported this... From Arnd Bergmann.

    2) Don't init mutex twice in i40e driver, from Jesse Brandeburg.

    3) Fix spurious EBUSY in rhashtable, from Herbert Xu.

    4) Missing DMA unmaps in mvpp2 driver, from Marcin Wojtas.

    5) Fix race with work structure access in pppoe driver causing
    corruptions, from Guillaume Nault.

    6) Fix OOPS due to sh_eth_rx() not checking whether netdev_alloc_skb()
    actually succeeded or not, from Sergei Shtylyov.

    7) Don't lose flags when setting IFA_F_OPTIMISTIC in ipv6 code, from
    Bjørn Mork.

    8) VXLAN_HD_RCO defined incorrectly, fix from Jiri Benc.

    9) Fix clock source used for cookies in SCTP, from Marcelo Ricardo
    Leitner.

    10) aurora driver needs HAS_DMA dependency, from Geert Uytterhoeven.

    11) ndo_fill_metadata_dst op of vxlan has to handle ipv6 tunneling
    properly as well, from Jiri Benc.

    12) Handle request sockets properly in xfrm layer, from Eric Dumazet.

    13) Double stats update in ipv6 geneve transmit path, fix from Pravin B
    Shelar.

    14) sk->sk_policy[] needs RCU protection, and as a result
    xfrm_policy_destroy() needs to free policies using an RCU grace
    period, from Eric Dumazet.

    15) SCTP needs to clone ipv6 tx options in order to avoid use after
    free, from Eric Dumazet.

    16) Missing kbuild export of ila.h, from Stephen Hemminger.

    17) Missing mdiobus_alloc() return value checking in mdio-mux.c, from
    Tobias Klauser.

    18) Validate protocol value range in ->create() methods, from Hannes
    Frederic Sowa.

    19) Fix early socket demux races that result in illegal dst reuse, from
    Eric Dumazet.

    20) Validate socket address length in pptp code, from WANG Cong.

    21) skb_reorder_vlan_header() uses incorrect offset and can corrupt
    packets, from Vlad Yasevich.

    22) Fix memory leaks in nl80211 registry code, from Ola Olsson.

    23) Timeout loop count handling fixes in mISDN, xgbe, qlge, sfc, and
    qlcnic. From Dan Carpenter.

    24) msg.msg_iocb needs to be cleared in recvfrom() otherwise, for
    example, AF_ALG will interpret it as an async call. From Tadeusz
    Struk.

    25) inetpeer_set_addr_v4 forgets to initialize the 'vif' field, from
    Eric Dumazet.

    26) rhashtable does not enforce the minimum table size early enough,
    breaking how we calculate the per-cpu lock allocations. From
    Herbert Xu.

    27) Fix FCC port lockup in 82xx driver, from Martin Roth.

    28) FOU sockets need to be freed using RCU, from Hannes Frederic Sowa.

    29) Fix out-of-bounds access in __skb_complete_tx_timestamp() and
    sock_setsockopt() wrt. timestamp handling. From WANG Cong.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (117 commits)
    net: check both type and procotol for tcp sockets
    drivers: net: xgene: fix Tx flow control
    tcp: restore fastopen with no data in SYN packet
    af_unix: Revert 'lock_interruptible' in stream receive code
    fou: clean up socket with kfree_rcu
    82xx: FCC: Fixing a bug causing to FCC port lock-up
    gianfar: Don't enable RX Filer if not supported
    net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration
    rhashtable: Fix walker list corruption
    rhashtable: Enforce minimum size on initial hash table
    inet: tcp: fix inetpeer_set_addr_v4()
    ipv6: automatically enable stable privacy mode if stable_secret set
    net: fix uninitialized variable issue
    bluetooth: Validate socket address length in sco_sock_bind().
    net_sched: make qdisc_tree_decrease_qlen() work for non mq
    ser_gigaset: remove unnecessary kfree() calls from release method
    ser_gigaset: fix deallocation of platform device structure
    ser_gigaset: turn nonsense checks into WARN_ON
    ser_gigaset: fix up NULL checks
    qlcnic: fix a timeout loop
    ...

    Linus Torvalds
     
  • Pull libnvdimm fixes from Dan Williams:

    - Two bug fixes for misuse of PAGE_MASK in scatterlist and dma-debug.
    These are tagged for -stable. The scatterlist impact is potentially
    corrupted dma addresses on HIGHMEM enabled platforms.

    - A minor locking fix for the NFIT hot-add implementation that is new
    in 4.4-rc. This would only trigger in the case a hot-add raced
    driver removal.

    * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
    dma-debug: Fix dma_debug_entry offset calculation
    Revert "scatterlist: use sg_phys()"
    nfit: acpi_nfit_notify(): Do not leave device locked

    Linus Torvalds
     

17 Dec, 2015

2 commits

  • dma-debug uses struct dma_debug_entry to keep track of dma coherent
    memory allocation requests. The virtual address is converted into a pfn
    and an offset. Previously, the offset was calculated using an incorrect
    bit mask. As a result, we saw incorrect error messages from dma-debug
    like the following:

    "DMA-API: exceeded 7 overlapping mappings of cacheline 0x03e00000"

    Cacheline 0x03e00000 does not exist on our platform.

    Cc:
    Fixes: 0abdd7a81b7e ("dma-debug: introduce debug_dma_assert_idle()")
    Signed-off-by: Daniel Mentz
    Signed-off-by: Dan Williams

    Daniel Mentz
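
    To see how a wrong bit mask can produce a bogus offset like the one in the
    message above, consider this small user-space sketch. The SKETCH_* macros
    and helper names are hypothetical; they assume a 4 KiB page and a page
    mask that keeps the page-number bits, so the in-page offset has to be
    taken with the inverted mask.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))	/* keeps page-number bits */

/* Page frame number of an address. */
static uintptr_t sketch_pfn(uintptr_t addr)
{
	return addr >> SKETCH_PAGE_SHIFT;
}

/* Wrong: masks away the offset and keeps the page-aligned address instead. */
static uintptr_t sketch_offset_wrong(uintptr_t addr)
{
	return addr & SKETCH_PAGE_MASK;
}

/* Right: keeps only the low bits, i.e. the offset within the page. */
static uintptr_t sketch_offset_right(uintptr_t addr)
{
	return addr & ~SKETCH_PAGE_MASK;
}
```

    For an address such as 0x03e00abc the wrong variant yields 0x03e00000
    rather than 0xabc, so anything derived from pfn plus offset lands far
    outside the real page.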
     
  • The commit ba7c95ea3870fe7b847466d39a049ab6f156aa2c ("rhashtable:
    Fix sleeping inside RCU critical section in walk_stop") introduced
    a new spinlock for the walker list. However, it did not convert
    all existing users of the list over to the new spin lock. Some
    continued to use the old mutex for this purpose. This obviously
    led to corruption of the list.

    The fix is to use the spin lock everywhere where we touch the list.

    This also allows us to do rcu_read_lock before we take the lock in
    rhashtable_walk_start. With the old mutex this would've deadlocked
    but it's safe with the new spin lock.

    Fixes: ba7c95ea3870 ("rhashtable: Fix sleeping inside RCU...")
    Reported-by: Colin Ian King
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

16 Dec, 2015

1 commit

  • William Hua wrote:
    >
    > I wasn't aware there was an enforced minimum size. I simply set the
    > nelem_hint in the rhashtable_params struct to 1, expecting it to grow as
    > needed. This caused a segfault afterwards when trying to insert an
    > element.

    OK we're doing the size computation before we enforce the limit
    on min_size.

    ---8<---
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
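
    The ordering problem described above can be sketched in user-space C. All
    names here are hypothetical, and both the floor of 4 buckets and the 4/3
    sizing rule are assumptions for illustration, not values taken from the
    commit.

```c
#include <stddef.h>

#define SKETCH_HASH_MIN_SIZE 4	/* assumed floor for the table size */

struct sketch_params {
	size_t nelem_hint;	/* expected number of elements */
	size_t min_size;	/* caller-supplied minimum, may be 0 */
};

/* Smallest power of two that is >= n (returns 1 for n == 0). */
static size_t sketch_roundup_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Table size derived from the hint, never below the current min_size. */
static size_t sketch_size_from_hint(const struct sketch_params *p)
{
	size_t want = sketch_roundup_pow2(p->nelem_hint * 4 / 3);

	return want > p->min_size ? want : p->min_size;
}

/* Buggy order: the size is computed before the floor is applied. */
static size_t sketch_initial_size_buggy(struct sketch_params p)
{
	size_t size = sketch_size_from_hint(&p);

	if (p.min_size < SKETCH_HASH_MIN_SIZE)
		p.min_size = SKETCH_HASH_MIN_SIZE;	/* too late to affect size */
	return size;
}

/* Fixed order: enforce the floor first, then compute the size. */
static size_t sketch_initial_size_fixed(struct sketch_params p)
{
	if (p.min_size < SKETCH_HASH_MIN_SIZE)
		p.min_size = SKETCH_HASH_MIN_SIZE;
	return sketch_size_from_hint(&p);
}
```

    With nelem_hint set to 1 and min_size left at 0, the buggy order yields a
    one-bucket table while the fixed order yields the floor; for large hints
    both orders agree.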
     

09 Dec, 2015

2 commits

  • The patch 9497df88ab5567daa001829051c5f87161a81ff0 ("rhashtable:
    Fix reader/rehash race") added a pair of barriers. In fact the
    wmb is superfluous because every subsequent write to the old or
    new hash table uses rcu_assign_pointer, which itself carries a
    full barrier prior to the assignment.

    Therefore we may remove the explicit wmb.

    Signed-off-by: Herbert Xu
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Workqueue stalls can happen from a variety of usage bugs such as
    missing WQ_MEM_RECLAIM flag or concurrency managed work item
    indefinitely staying RUNNING. These stalls can be extremely difficult
    to hunt down because the usual warning mechanisms can't detect
    workqueue stalls and the internal state is pretty opaque.

    To alleviate the situation, this patch implements a workqueue lockup
    detector. It periodically monitors all worker_pools and, if any pool
    fails to make forward progress for longer than the threshold duration,
    triggers a warning and dumps the workqueue state as follows.

    BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 31s!
    Showing busy workqueues and worker pools:
    workqueue events: flags=0x0
    pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=17/256
    pending: monkey_wrench_fn, e1000_watchdog, cache_reap, vmstat_shepherd, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, release_one_tty, cgroup_release_agent
    workqueue events_power_efficient: flags=0x80
    pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=2/256
    pending: check_lifetime, neigh_periodic_work
    workqueue cgroup_pidlist_destroy: flags=0x0
    pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1
    pending: cgroup_pidlist_destroy_work_fn
    ...

    The detection mechanism is controlled through the kernel parameter
    workqueue.watchdog_thresh and can be updated at runtime through the
    sysfs module parameter file.

    v2: Decoupled from softlockup control knobs.

    Signed-off-by: Tejun Heo
    Acked-by: Don Zickus
    Cc: Ulrich Obergfell
    Cc: Michal Hocko
    Cc: Chris Mason
    Cc: Andrew Morton

    Tejun Heo
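
    The detection idea, a periodic scan that compares each pool's last
    progress time against a threshold, can be sketched as follows. This is a
    simplified user-space model with hypothetical names, not the kernel
    implementation; in particular, skipping pools without pending work is an
    assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-pool state tracked by the watchdog sketch. */
struct sketch_pool {
	unsigned long last_progress;	/* time of last forward progress, in seconds */
	bool          has_pending;	/* true if work items are queued */
};

/*
 * Count pools that have pending work but made no forward progress for
 * longer than thresh seconds. Pools with nothing queued cannot be stalled.
 */
static size_t sketch_count_stalled(const struct sketch_pool *pools, size_t n,
				   unsigned long now, unsigned long thresh)
{
	size_t stalled = 0;

	for (size_t i = 0; i < n; i++) {
		if (!pools[i].has_pending)
			continue;
		if (now - pools[i].last_progress > thresh)
			stalled++;
	}
	return stalled;
}
```

    A real detector would run this scan from a timer and print the state of
    each stalled pool instead of merely counting them.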
     

07 Dec, 2015

2 commits


06 Dec, 2015

2 commits

  • asm/atomic.h doesn't really need asm/processor.h anymore. Everything
    it uses has moved to other header files. So remove that include.

    processor.h is a nasty header that includes lots of
    other headers and makes it prone to include loops. Removing the
    include here makes asm/atomic.h a "leaf" header that can
    be safely included in most other headers.

    The only fallout is in the lib/atomic tester which relied on
    this implicit include. Give it an explicit include.
    (the include is in ifdef because the user is also in ifdef)

    Signed-off-by: Andi Kleen
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Arnaldo Carvalho de Melo
    Cc: Jiri Olsa
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Cc: Thomas Gleixner
    Cc: Vince Weaver
    Cc: rostedt@goodmis.org
    Link: http://lkml.kernel.org/r/1449018060-1742-1-git-send-email-andi@firstfloor.org
    Signed-off-by: Ingo Molnar

    Andi Kleen
     
  • This reverts commit d3716f18a7d841565c930efde30737a3557eee69.

    vmalloc cannot be used in BH disabled contexts, even
    with GFP_ATOMIC. And we certainly want to support
    rhashtable users inserting entries with software
    interrupts disabled.

    Signed-off-by: David S. Miller

    David S. Miller
     

05 Dec, 2015

2 commits

  • When an rhashtable user pounds rhashtable hard with back-to-back
    insertions we may end up growing the table in GFP_ATOMIC context.
    Unfortunately when the table reaches a certain size this often
    fails because we don't have enough physically contiguous pages
    to hold the new table.

    Eric Dumazet suggested (and in fact wrote this patch) using
    __vmalloc instead which can be used in GFP_ATOMIC context.

    Reported-by: Phil Sutter
    Suggested-by: Eric Dumazet
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Thomas and Phil observed that under stress rhashtable insertion
    sometimes failed with EBUSY, even though this error should only
    ever be seen when we're under attack and our hash chain length
    has grown to an unacceptable level, even after a rehash.

    It turns out that the logic for detecting whether there is an
    existing rehash is faulty. In particular, when two threads both
    try to grow the same table at the same time, one of them may see
    the newly grown table and thus erroneously conclude that it had
    been rehashed. This is what leads to the EBUSY error.

    This patch fixes this by remembering the current last table we
    used during insertion so that rhashtable_insert_rehash can detect
    when another thread has also done a resize/rehash. When this is
    detected we will give up our resize/rehash and simply retry the
    insertion with the new table.

    Reported-by: Thomas Graf
    Reported-by: Phil Sutter
    Signed-off-by: Herbert Xu
    Tested-by: Phil Sutter
    Signed-off-by: David S. Miller

    Herbert Xu
     

04 Dec, 2015

1 commit


02 Dec, 2015

1 commit

  • This module allows inserting errors into some of netdevice's notifier
    events. All network drivers use these notifiers to signal various events
    and to check if they are allowed, e.g. PRECHANGEMTU and CHANGEMTU
    afterwards. Until recently I had to run failure tests by injecting
    a custom module, but now this infrastructure makes it trivial to test
    these failure paths. Some of the recent bugs I fixed were found using
    this module.
    Here's an example:
    $ cd /sys/kernel/debug/notifier-error-inject/netdev
    $ echo -22 > actions/NETDEV_CHANGEMTU/error
    $ ip link set eth0 mtu 1024
    RTNETLINK answers: Invalid argument

    CC: Akinobu Mita
    CC: "David S. Miller"
    CC: netdev
    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     

24 Nov, 2015

1 commit

  • Code that does lockless emptiness testing of non-RCU lists is relying
    on the list-addition code to write the list head's ->next pointer
    atomically. This commit therefore adds WRITE_ONCE() to list-addition
    pointer stores that could affect the head's ->next pointer.

    Reported-by: Dmitry Vyukov
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
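
    The pattern described above can be sketched in user-space C. The SKETCH_*
    macros are hypothetical stand-ins built on volatile accesses (using the
    GNU C __typeof__ extension), and the list type is a simplified doubly
    linked list; none of this is the kernel's own code.

```c
#include <stdbool.h>

struct sketch_list {
	struct sketch_list *next, *prev;
};

/* Assumed stand-ins: force a single, untorn store or load via volatile. */
#define SKETCH_WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define SKETCH_READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

/* An empty list head points at itself. */
static void sketch_list_init(struct sketch_list *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert node right after head; only the head's ->next store is marked. */
static void sketch_list_add(struct sketch_list *node, struct sketch_list *head)
{
	struct sketch_list *next = head->next;

	next->prev = node;
	node->next = next;
	node->prev = head;
	SKETCH_WRITE_ONCE(head->next, node);	/* the store lockless readers see */
}

/* Lockless emptiness test: a single marked load of head->next. */
static bool sketch_list_empty(const struct sketch_list *head)
{
	return SKETCH_READ_ONCE(head->next) == head;
}
```

    The node is fully linked before the marked store publishes it, so a
    concurrent sketch_list_empty() sees either the old or the new head->next
    and never a partially written pointer. This sketch does not address
    memory ordering for readers that go on to traverse the list.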