08 Oct, 2016

1 commit

  • When doing an nmi backtrace of many cores, most of which are idle, the
    output is a little overwhelming and very uninformative. Suppress
    messages for cpus that are idling when they are interrupted and just
    emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN".

    We do this by grouping all the cpuidle code together into a new
    .cpuidle.text section, and then checking the address of the interrupted
    PC to see if it lies within that section.

    This commit suitably tags x86 and tile idle routines, and only adds in
    the minimal framework for other architectures.

    Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com
    Signed-off-by: Chris Metcalf
    Acked-by: Peter Zijlstra (Intel)
    Tested-by: Peter Zijlstra (Intel)
    Tested-by: Daniel Thompson [arm]
    Tested-by: Petr Mladek
    Cc: Aaron Tomlin
    Cc: Peter Zijlstra (Intel)
    Cc: "Rafael J. Wysocki"
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Metcalf
     

04 Oct, 2016

1 commit

  • Pull char/misc driver updates from Greg KH:
    "Here's the "big" char and misc driver update for 4.9-rc1.

    Lots of little things here, all over the driver tree for subsystems
    that flow through me. Nothing major that I can discern, full details
    are in the shortlog.

    All have been in the linux-next tree with no reported issues"

    * tag 'char-misc-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (144 commits)
    drivers/misc/hpilo: Changes to support new security states in iLO5 FW
    at25: fix debug and error messaging
    misc/genwqe: ensure zero initialization
    vme: fake: remove unexpected unlock in fake_master_set()
    vme: fake: mark symbols static where possible
    spmi: pmic-arb: Return an error code if sanity check fails
    Drivers: hv: get rid of id in struct vmbus_channel
    Drivers: hv: make VMBus bus ids persistent
    mcb: Add a dma_device to mcb_device
    mcb: Enable PCI bus mastering by default
    mei: stop the stall timer worker if not needed
    clk: probe common clock drivers earlier
    vme: fake: fix build for 64-bit dma_addr_t
    ttyprintk: Neaten and simplify printing
    mei: me: add kaby point device ids
    coresight: tmc: mark symbols static where possible
    coresight: perf: deal with error condition properly
    Drivers: hv: hv_util: Avoid dynamic allocation in time synch
    fpga manager: Add hardware dependency to Zynq driver
    Drivers: hv: utils: Support TimeSync version 4.0 protocol samples.
    ...

    Linus Torvalds
     

08 Sep, 2016

2 commits

  • Update the syscall number after each PTRACE_SETREGS on ORIG_*AX.

    This is needed to get the potentially altered syscall number in the
    seccomp filters after RET_TRACE.

    This fix four seccomp_bpf tests:
    > [ RUN ] TRACE_syscall.skip_after_RET_TRACE
    > seccomp_bpf.c:1560:TRACE_syscall.skip_after_RET_TRACE:Expected -1 (18446744073709551615) == syscall(39) (26)
    > seccomp_bpf.c:1561:TRACE_syscall.skip_after_RET_TRACE:Expected 1 (1) == (*__errno_location ()) (22)
    > [ FAIL ] TRACE_syscall.skip_after_RET_TRACE
    > [ RUN ] TRACE_syscall.kill_after_RET_TRACE
    > TRACE_syscall.kill_after_RET_TRACE: Test exited normally instead of by signal (code: 1)
    > [ FAIL ] TRACE_syscall.kill_after_RET_TRACE
    > [ RUN ] TRACE_syscall.skip_after_ptrace
    > seccomp_bpf.c:1622:TRACE_syscall.skip_after_ptrace:Expected -1 (18446744073709551615) == syscall(39) (26)
    > seccomp_bpf.c:1623:TRACE_syscall.skip_after_ptrace:Expected 1 (1) == (*__errno_location ()) (22)
    > [ FAIL ] TRACE_syscall.skip_after_ptrace
    > [ RUN ] TRACE_syscall.kill_after_ptrace
    > TRACE_syscall.kill_after_ptrace: Test exited normally instead of by signal (code: 1)
    > [ FAIL ] TRACE_syscall.kill_after_ptrace

    Fixes: 26703c636c1f ("um/ptrace: run seccomp after ptrace")

    Signed-off-by: Mickaël Salaün
    Acked-by: Kees Cook
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: James Morris
    Cc: user-mode-linux-devel@lists.sourceforge.net
    Signed-off-by: James Morris
    Signed-off-by: Kees Cook

    Mickaël Salaün
     
  • Keep the same semantic as before the commit 26703c636c1f: deallocate
    audit context and fake a proper syscall exit.

    This fix a kernel panic triggered by the seccomp_bpf test:
    > [ RUN ] global.ERRNO_valid
    > BUG: failure at kernel/auditsc.c:1504/__audit_syscall_entry()!
    > Kernel panic - not syncing: BUG!

    Fixes: 26703c636c1f ("um/ptrace: run seccomp after ptrace")

    Signed-off-by: Mickaël Salaün
    Acked-by: Kees Cook
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: James Morris
    Cc: user-mode-linux-devel@lists.sourceforge.net
    Signed-off-by: James Morris
    Signed-off-by: Kees Cook

    Mickaël Salaün
     

05 Sep, 2016

1 commit


31 Aug, 2016

1 commit

  • This patch removes module_init()/module_exit() from driver code by using
    module_misc_device() macro. All modules in this patch has a print
    statement which is removed when module_misc_device() macro is used.
    If undesirable this patch can be dropped entirely, this is the only
    purpose of making this as a separate patch.

    Signed-off-by: PrasannaKumar Muralidharan
    Signed-off-by: Greg Kroah-Hartman

    PrasannaKumar Muralidharan
     

24 Aug, 2016

1 commit

  • Commit e41f501d3912 ("vmlinux.lds: account for destructor sections")
    added '.text.exit' to EXIT_TEXT which is discarded at link time by default.
    This breaks compilation of UML:
    `.text.exit' referenced in section `.fini_array' of
    /usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/libc.a(sdlerror.o):
    defined in discarded section `.text.exit' of
    /usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/libc.a(sdlerror.o)

    Apparently UML doesn't want to discard exit text, so let's place all EXIT_TEXT
    sections in .exit.text.

    Fixes: e41f501d3912 ("vmlinux.lds: account for destructor sections")
    Reported-by: Stefan Traby
    Signed-off-by: Andrey Ryabinin
    Cc:
    Acked-by: Dmitry Vyukov
    Signed-off-by: Richard Weinberger

    Andrey Ryabinin
     

05 Aug, 2016

1 commit

  • Pull UML updates from Richard Weinberger:
    "Beside of various fixes this also contains patches to enable features
    such was Kcov, kmemleak and TRACE_IRQFLAGS_SUPPORT on UML"

    * 'for-linus-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
    hostfs: Freeing an ERR_PTR in hostfs_fill_sb_common()
    um: Support kcov
    um: Enable TRACE_IRQFLAGS_SUPPORT
    um: Use asm-generic/irqflags.h
    um: Fix possible deadlock in sig_handler_common()
    um: Select HAVE_DEBUG_KMEMLEAK
    um: Setup physical memory in setup_arch()
    um: Eliminate null test after alloc_bootmem

    Linus Torvalds
     

04 Aug, 2016

7 commits

  • This adds support for kcov to UML.

    There is a small problem where UML will randomly segfault during boot;
    this is because current_thread_info() occasionally returns an invalid
    (non-NULL) pointer and we try to dereference it in
    __sanitizer_cov_trace_pc(). I consider this a bug in UML itself and this
    patch merely exposes it.

    [v2: disable instrumentation in UML-specific code]

    Cc: Quentin Casasnovas
    Cc: Richard Weinberger
    Cc: Thomas Meyer
    Cc: user-mode-linux-devel
    Cc: Dmitry Vyukov
    Signed-off-by: Vegard Nossum
    Signed-off-by: Richard Weinberger

    Vegard Nossum
     
  • Now we have everything we need, so enable
    TRACE_IRQFLAGS_SUPPORT.

    Signed-off-by: Richard Weinberger

    Richard Weinberger
     
  • Instead proving its own arch_local_irq_save() and arch_irqs_disabled()
    version use the generic version from asm-generic/irqflags.h.

    A nice side effect is that um gets a few additional arch_ functions
    as well.

    Signed-off-by: Daniel Wagner
    [rw: Massaged commit message]
    Signed-off-by: Richard Weinberger

    Daniel Wagner
     
  • We are in atomic context and must not sleep.
    Sleeping here is possible since malloc() maps
    to kmalloc() with GFP_KERNEL.

    Cc: stable@vger.kernel.org
    Fixes: b6024b21 ("um: extend fpstate to _xstate to support YMM registers")
    Signed-off-by: Richard Weinberger

    Richard Weinberger
     
  • Now we have the infrastructure to support kmemleak.
    Enable the HAVE flag.

    Signed-off-by: Richard Weinberger

    Richard Weinberger
     
  • Currently UML sets up physical memory very early,
    long before setup_arch() was called by the kernel main
    function.
    This can cause problems when code paths in UML's memory setup
    code assume that the kernel is already running.
    i.e. when kmemleak is enabled it will evaluate current()
    in free_bootmem(). That early current() is undefined and
    UML explodes.

    Solve the problem by setting up physical memory in setup_arch(),
    at this stage the kernel has materialized and basic infrastructure
    such as current() works.

    Signed-off-by: Richard Weinberger

    Richard Weinberger
     
  • alloc_bootmem function never returns NULL. Thus a NULL test after a
    call to this function is unnecessary.

    The Coccinelle semantic patch used to make this change is follows:
    @@
    expression E;
    statement S;
    @@

    E =
    alloc_bootmem(...)
    ... when != E
    - if (E == NULL) S

    Signed-off-by: Amitoj Kaur Chawla
    Signed-off-by: Richard Weinberger

    Amitoj Kaur Chawla
     

03 Aug, 2016

1 commit

  • Pull kbuild updates from Michal Marek:

    - GCC plugin support by Emese Revfy from grsecurity, with a fixup from
    Kees Cook. The plugins are meant to be used for static analysis of
    the kernel code. Two plugins are provided already.

    - reduction of the gcc commandline by Arnd Bergmann.

    - IS_ENABLED / IS_REACHABLE macro enhancements by Masahiro Yamada

    - bin2c fix by Michael Tautschnig

    - setlocalversion fix by Wolfram Sang

    * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
    gcc-plugins: disable under COMPILE_TEST
    kbuild: Abort build on bad stack protector flag
    scripts: Fix size mismatch of kexec_purgatory_size
    kbuild: make samples depend on headers_install
    Kbuild: don't add obj tree in additional includes
    Kbuild: arch: look for generated headers in obtree
    Kbuild: always prefix objtree in LINUXINCLUDE
    Kbuild: avoid duplicate include path
    Kbuild: don't add ../../ to include path
    vmlinux.lds.h: replace config_enabled() with IS_ENABLED()
    kconfig.h: allow to use IS_{ENABLE,REACHABLE} in macro expansion
    kconfig.h: use already defined macros for IS_REACHABLE() define
    export.h: use __is_defined() to check if __KSYM_* is defined
    kconfig.h: use __is_defined() to check if MODULE is defined
    kbuild: setlocalversion: print error to STDERR
    Add sancov plugin
    Add Cyclomatic complexity GCC plugin
    GCC plugin infrastructure
    Shared library support

    Linus Torvalds
     

30 Jul, 2016

1 commit

  • Pull security subsystem updates from James Morris:
    "Highlights:

    - TPM core and driver updates/fixes
    - IPv6 security labeling (CALIPSO)
    - Lots of Apparmor fixes
    - Seccomp: remove 2-phase API, close hole where ptrace can change
    syscall #"

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits)
    apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
    tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family)
    tpm: Factor out common startup code
    tpm: use devm_add_action_or_reset
    tpm2_i2c_nuvoton: add irq validity check
    tpm: read burstcount from TPM_STS in one 32-bit transaction
    tpm: fix byte-order for the value read by tpm2_get_tpm_pt
    tpm_tis_core: convert max timeouts from msec to jiffies
    apparmor: fix arg_size computation for when setprocattr is null terminated
    apparmor: fix oops, validate buffer size in apparmor_setprocattr()
    apparmor: do not expose kernel stack
    apparmor: fix module parameters can be changed after policy is locked
    apparmor: fix oops in profile_unpack() when policy_db is not present
    apparmor: don't check for vmalloc_addr if kvzalloc() failed
    apparmor: add missing id bounds check on dfa verification
    apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
    apparmor: use list_next_entry instead of list_entry_next
    apparmor: fix refcount race when finding a child profile
    apparmor: fix ref count leak when profile sha1 hash is read
    apparmor: check that xindex is in trans_table bounds
    ...

    Linus Torvalds
     

27 Jul, 2016

6 commits

  • Merge updates from Andrew Morton:

    - a few misc bits

    - ocfs2

    - most(?) of MM

    * emailed patches from Andrew Morton : (125 commits)
    thp: fix comments of __pmd_trans_huge_lock()
    cgroup: remove unnecessary 0 check from css_from_id()
    cgroup: fix idr leak for the first cgroup root
    mm: memcontrol: fix documentation for compound parameter
    mm: memcontrol: remove BUG_ON in uncharge_list
    mm: fix build warnings in
    mm, thp: convert from optimistic swapin collapsing to conservative
    mm, thp: fix comment inconsistency for swapin readahead functions
    thp: update Documentation/{vm/transhuge,filesystems/proc}.txt
    shmem: split huge pages beyond i_size under memory pressure
    thp: introduce CONFIG_TRANSPARENT_HUGE_PAGECACHE
    khugepaged: add support of collapse for tmpfs/shmem pages
    shmem: make shmem_inode_info::lock irq-safe
    khugepaged: move up_read(mmap_sem) out of khugepaged_alloc_page()
    thp: extract khugepaged from mm/huge_memory.c
    shmem, thp: respect MADV_{NO,}HUGEPAGE for file mappings
    shmem: add huge pages support
    shmem: get_unmapped_area align huge page
    shmem: prepare huge= mount option and sysfs knob
    mm, rmap: account shmem thp pages
    ...

    Linus Torvalds
     
  • We always have vma->vm_mm around.

    Link: http://lkml.kernel.org/r/1466021202-61880-8-git-send-email-kirill.shutemov@linux.intel.com
    Signed-off-by: Kirill A. Shutemov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • This allows an arch which needs to do special handing with respect to
    different page size when flushing tlb to implement the same in mmu
    gather.

    Link: http://lkml.kernel.org/r/1465049193-22197-3-git-send-email-aneesh.kumar@linux.vnet.ibm.com
    Signed-off-by: Aneesh Kumar K.V
    Cc: Benjamin Herrenschmidt
    Cc: Michael Ellerman
    Cc: Hugh Dickins
    Cc: "Kirill A. Shutemov"
    Cc: Andrea Arcangeli
    Cc: Joonsoo Kim
    Cc: Mel Gorman
    Cc: David Rientjes
    Cc: Vlastimil Babka
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aneesh Kumar K.V
     
  • This updates the generic and arch specific implementation to return true
    if we need to do a tlb flush. That means if a __tlb_remove_page
    indicate a flush is needed, the page we try to remove need to be tracked
    and added again after the flush. We need to track it because we have
    already update the pte to none and we can't just loop back.

    This change is done to enable us to do a tlb_flush when we try to flush
    a range that consists of different page sizes. For architectures like
    ppc64, we can do a range based tlb flush and we need to track page size
    for that. When we try to remove a huge page, we will force a tlb flush
    and starts a new mmu gather.

    [aneesh.kumar@linux.vnet.ibm.com: mm-change-the-interface-for-__tlb_remove_page-v3]
    Link: http://lkml.kernel.org/r/1465049193-22197-2-git-send-email-aneesh.kumar@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/r/1464860389-29019-2-git-send-email-aneesh.kumar@linux.vnet.ibm.com
    Signed-off-by: Aneesh Kumar K.V
    Cc: Benjamin Herrenschmidt
    Cc: Michael Ellerman
    Cc: Hugh Dickins
    Cc: "Kirill A. Shutemov"
    Cc: Andrea Arcangeli
    Cc: Joonsoo Kim
    Cc: Mel Gorman
    Cc: David Rientjes
    Cc: Vlastimil Babka
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aneesh Kumar K.V
     
  • Pull block driver updates from Jens Axboe:
    "This branch also contains core changes. I've come to the conclusion
    that from 4.9 and forward, I'll be doing just a single branch. We
    often have dependencies between core and drivers, and it's hard to
    always split them up appropriately without pulling core into drivers
    when that happens.

    That said, this contains:

    - separate secure erase type for the core block layer, from
    Christoph.

    - set of discard fixes, from Christoph.

    - bio shrinking fixes from Christoph, as a followup up to the
    op/flags change in the core branch.

    - map and append request fixes from Christoph.

    - NVMeF (NVMe over Fabrics) code from Christoph. This is pretty
    exciting!

    - nvme-loop fixes from Arnd.

    - removal of ->driverfs_dev from Dan, after providing a
    device_add_disk() helper.

    - bcache fixes from Bhaktipriya and Yijing.

    - cdrom subchannel read fix from Vchannaiah.

    - set of lightnvm updates from Wenwei, Matias, Johannes, and Javier.

    - set of drbd updates and fixes from Fabian, Lars, and Philipp.

    - mg_disk error path fix from Bart.

    - user notification for failed device add for loop, from Minfei.

    - NVMe in general:
    + NVMe delay quirk from Guilherme.
    + SR-IOV support and command retry limits from Keith.
    + fix for memory-less NUMA node from Masayoshi.
    + use UINT_MAX for discard sectors, from Minfei.
    + cancel IO fixes from Ming.
    + don't allocate unused major, from Neil.
    + error code fixup from Dan.
    + use constants for PSDT/FUSE from James.
    + variable init fix from Jay.
    + fabrics fixes from Ming, Sagi, and Wei.
    + various fixes"

    * 'for-4.8/drivers' of git://git.kernel.dk/linux-block: (115 commits)
    nvme/pci: Provide SR-IOV support
    nvme: initialize variable before logical OR'ing it
    block: unexport various bio mapping helpers
    scsi/osd: open code blk_make_request
    target: stop using blk_make_request
    block: simplify and export blk_rq_append_bio
    block: ensure bios return from blk_get_request are properly initialized
    virtio_blk: use blk_rq_map_kern
    memstick: don't allow REQ_TYPE_BLOCK_PC requests
    block: shrink bio size again
    block: simplify and cleanup bvec pool handling
    block: get rid of bio_rw and READA
    block: don't ignore -EOPNOTSUPP blkdev_issue_write_same
    block: introduce BLKDEV_DISCARD_ZERO to fix zeroout
    NVMe: don't allocate unused nvme_major
    nvme: avoid crashes when node 0 is memoryless node.
    nvme: Limit command retries
    loop: Make user notify for adding loop device failed
    nvme-loop: fix nvme-loop Kconfig dependencies
    nvmet: fix return value check in nvmet_subsys_alloc()
    ...

    Linus Torvalds
     
  • Pull core block updates from Jens Axboe:

    - the big change is the cleanup from Mike Christie, cleaning up our
    uses of command types and modified flags. This is what will throw
    some merge conflicts

    - regression fix for the above for btrfs, from Vincent

    - following up to the above, better packing of struct request from
    Christoph

    - a 2038 fix for blktrace from Arnd

    - a few trivial/spelling fixes from Bart Van Assche

    - a front merge check fix from Damien, which could cause issues on
    SMR drives

    - Atari partition fix from Gabriel

    - convert cfq to highres timers, since jiffies isn't granular enough
    for some devices these days. From Jan and Jeff

    - CFQ priority boost fix idle classes, from me

    - cleanup series from Ming, improving our bio/bvec iteration

    - a direct issue fix for blk-mq from Omar

    - fix for plug merging not involving the IO scheduler, like we do for
    other types of merges. From Tahsin

    - expose DAX type internally and through sysfs. From Toshi and Yigal

    * 'for-4.8/core' of git://git.kernel.dk/linux-block: (76 commits)
    block: Fix front merge check
    block: do not merge requests without consulting with io scheduler
    block: Fix spelling in a source code comment
    block: expose QUEUE_FLAG_DAX in sysfs
    block: add QUEUE_FLAG_DAX for devices to advertise their DAX support
    Btrfs: fix comparison in __btrfs_map_block()
    block: atari: Return early for unsupported sector size
    Doc: block: Fix a typo in queue-sysfs.txt
    cfq-iosched: Charge at least 1 jiffie instead of 1 ns
    cfq-iosched: Fix regression in bonnie++ rewrite performance
    cfq-iosched: Convert slice_resid from u64 to s64
    block: Convert fifo_time from ulong to u64
    blktrace: avoid using timespec
    block/blk-cgroup.c: Declare local symbols static
    block/bio-integrity.c: Add #include "blk.h"
    block/partition-generic.c: Remove a set-but-not-used variable
    block: bio: kill BIO_MAX_SIZE
    cfq-iosched: temporarily boost queue priority for idle classes
    block: drbd: avoid to use BIO_MAX_SIZE
    block: bio: remove BIO_MAX_SECTORS
    ...

    Linus Torvalds
     

19 Jul, 2016

1 commit

  • There are very few files that need add an -I$(obj) gcc for the preprocessor
    or the assembler. For C files, we add always these for both the objtree and
    srctree, but for the other ones we require the Makefile to add them, and
    Kbuild then adds it for both trees.

    As a preparation for changing the meaning of the -I$(obj) directive to
    only refer to the srctree, this changes the two instances in arch/x86 to use
    an explictit $(objtree) prefix where needed, otherwise we won't find the
    headers any more, as reported by the kbuild 0day builder.

    arch/x86/realmode/rm/realmode.lds.S:75:20: fatal error: pasyms.h: No such file or directory

    Signed-off-by: Arnd Bergmann
    Signed-off-by: Michal Marek

    Arnd Bergmann
     

25 Jun, 2016

1 commit

  • This is the third version of the patchset previously sent [1]. I have
    basically only rebased it on top of 4.7-rc1 tree and dropped "dm: get
    rid of superfluous gfp flags" which went through dm tree. I am sending
    it now because it is tree wide and chances for conflicts are reduced
    considerably when we want to target rc2. I plan to send the next step
    and rename the flag and move to a better semantic later during this
    release cycle so we will have a new semantic ready for 4.8 merge window
    hopefully.

    Motivation:

    While working on something unrelated I've checked the current usage of
    __GFP_REPEAT in the tree. It seems that a majority of the usage is and
    always has been bogus because __GFP_REPEAT has always been about costly
    high order allocations while we are using it for order-0 or very small
    orders very often. It seems that a big pile of them is just a
    copy&paste when a code has been adopted from one arch to another.

    I think it makes some sense to get rid of them because they are just
    making the semantic more unclear. Please note that GFP_REPEAT is
    documented as

    * __GFP_REPEAT: Try hard to allocate the memory, but the allocation attempt

    * _might_ fail. This depends upon the particular VM implementation.
    while !costly requests have basically nofail semantic. So one could
    reasonably expect that order-0 request with __GFP_REPEAT will not loop
    for ever. This is not implemented right now though.

    I would like to move on with __GFP_REPEAT and define a better semantic
    for it.

    $ git grep __GFP_REPEAT origin/master | wc -l
    111
    $ git grep __GFP_REPEAT | wc -l
    36

    So we are down to the third after this patch series. The remaining
    places really seem to be relying on __GFP_REPEAT due to large allocation
    requests. This still needs some double checking which I will do later
    after all the simple ones are sorted out.

    I am touching a lot of arch specific code here and I hope I got it right
    but as a matter of fact I even didn't compile test for some archs as I
    do not have cross compiler for them. Patches should be quite trivial to
    review for stupid compile mistakes though. The tricky parts are usually
    hidden by macro definitions and thats where I would appreciate help from
    arch maintainers.

    [1] http://lkml.kernel.org/r/1461849846-27209-1-git-send-email-mhocko@kernel.org

    This patch (of 19):

    __GFP_REPEAT has a rather weak semantic but since it has been introduced
    around 2.6.12 it has been ignored for low order allocations. Yet we
    have the full kernel tree with its usage for apparently order-0
    allocations. This is really confusing because __GFP_REPEAT is
    explicitly documented to allow allocation failures which is a weaker
    semantic than the current order-0 has (basically nofail).

    Let's simply drop __GFP_REPEAT from those places. This would allow to
    identify place which really need allocator to retry harder and formulate
    a more specific semantic for what the flag is supposed to do actually.

    Link: http://lkml.kernel.org/r/1464599699-30131-2-git-send-email-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Cc: "David S. Miller"
    Cc: "H. Peter Anvin"
    Cc: "James E.J. Bottomley"
    Cc: "Theodore Ts'o"
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Chen Liqin
    Cc: Chris Metcalf [for tile]
    Cc: Guan Xuetao
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Ingo Molnar
    Cc: Jan Kara
    Cc: John Crispin
    Cc: Lennox Wu
    Cc: Ley Foon Tan
    Cc: Martin Schwidefsky
    Cc: Matt Fleming
    Cc: Ralf Baechle
    Cc: Rich Felker
    Cc: Russell King
    Cc: Thomas Gleixner
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

23 Jun, 2016

1 commit


15 Jun, 2016

2 commits


08 Jun, 2016

2 commits

  • This patch allows to build the whole kernel with GCC plugins. It was ported from
    grsecurity/PaX. The infrastructure supports building out-of-tree modules and
    building in a separate directory. Cross-compilation is supported too.
    Currently the x86, arm, arm64 and uml architectures enable plugins.

    The directory of the gcc plugins is scripts/gcc-plugins. You can use a file or a directory
    there. The plugins compile with these options:
    * -fno-rtti: gcc is compiled with this option so the plugins must use it too
    * -fno-exceptions: this is inherited from gcc too
    * -fasynchronous-unwind-tables: this is inherited from gcc too
    * -ggdb: it is useful for debugging a plugin (better backtrace on internal
    errors)
    * -Wno-narrowing: to suppress warnings from gcc headers (ipa-utils.h)
    * -Wno-unused-variable: to suppress warnings from gcc headers (gcc_version
    variable, plugin-version.h)

    The infrastructure introduces a new Makefile target called gcc-plugins. It
    supports all gcc versions from 4.5 to 6.0. The scripts/gcc-plugin.sh script
    chooses the proper host compiler (gcc-4.7 can be built by either gcc or g++).
    This script also checks the availability of the included headers in
    scripts/gcc-plugins/gcc-common.h.

    The gcc-common.h header contains frequently included headers for GCC plugins
    and it has a compatibility layer for the supported gcc versions.

    The gcc-generate-*-pass.h headers automatically generate the registration
    structures for GIMPLE, SIMPLE_IPA, IPA and RTL passes.

    Note that 'make clean' keeps the *.so files (only the distclean or mrproper
    targets clean all) because they are needed for out-of-tree modules.

    Based on work created by the PaX Team.

    Signed-off-by: Emese Revfy
    Acked-by: Kees Cook
    Signed-off-by: Michal Marek

    Emese Revfy
     
  • This adds a REQ_OP_FLUSH operation that is sent to request_fn
    based drivers by the block layer's flush code, instead of
    sending requests with the request->cmd_flags REQ_FLUSH bit set.

    Signed-off-by: Mike Christie
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Mike Christie
     

28 May, 2016

1 commit


24 May, 2016

1 commit


22 May, 2016

2 commits

  • This patch extends save_fp_registers() and restore_fp_registers() to use
    PTRACE_GETREGSET and PTRACE_SETREGSET with the XSTATE note type, adding
    support for new processor state extensions between context switches.

    When the new ptrace requests are unavailable, it falls back to the old
    PTRACE_GETFPREGS and PTRACE_SETFPREGS methods, which have been renamed to
    save_i387_registers() and restore_i387_registers().

    Now these functions expect *fp_regs to have the space of an _xstate struct.
    Thus, this also makes ptrace in UML responde to PTRACE_GETFPREGS/_SETFPREG
    requests with a user_i387_struct (thus independent from HOST_FP_SIZE), and
    by calling save_i387_registers() and restore_i387_registers() instead of
    the extended save_fp_registers() and restore_fp_registers() functions.

    Signed-off-by: Eli Cooper

    Eli Cooper
     
  • Extends fpstate to _xstate, in order to hold AVX/YMM registers.

    To avoid oversized stack frame, the following functions have been
    refactored by using malloc.
    - sig_handler_common
    - timer_real_alarm_handler

    Signed-off-by: Eli Cooper

    Eli Cooper
     

21 May, 2016

1 commit

  • Define HAVE_EXIT_THREAD for archs which want to do something in
    exit_thread. For others, let's define exit_thread as an empty inline.

    This is a cleanup before we change the prototype of exit_thread to
    accept a task parameter.

    [akpm@linux-foundation.org: fix mips]
    Signed-off-by: Jiri Slaby
    Cc: "David S. Miller"
    Cc: "H. Peter Anvin"
    Cc: "James E.J. Bottomley"
    Cc: Aurelien Jacquiot
    Cc: Benjamin Herrenschmidt
    Cc: Catalin Marinas
    Cc: Chen Liqin
    Cc: Chris Metcalf
    Cc: Chris Zankel
    Cc: David Howells
    Cc: Fenghua Yu
    Cc: Geert Uytterhoeven
    Cc: Guan Xuetao
    Cc: Haavard Skinnemoen
    Cc: Hans-Christian Egtvedt
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: Ingo Molnar
    Cc: Ivan Kokshaysky
    Cc: James Hogan
    Cc: Jeff Dike
    Cc: Jesper Nilsson
    Cc: Jiri Slaby
    Cc: Jonas Bonn
    Cc: Koichi Yasutake
    Cc: Lennox Wu
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Martin Schwidefsky
    Cc: Matt Turner
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Mikael Starvik
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Ralf Baechle
    Cc: Rich Felker
    Cc: Richard Henderson
    Cc: Richard Kuo
    Cc: Richard Weinberger
    Cc: Russell King
    Cc: Steven Miao
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: Vineet Gupta
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiri Slaby
     

18 May, 2016

1 commit

  • Pull networking updates from David Miller:
    "Highlights:

    1) Support SPI based w5100 devices, from Akinobu Mita.

    2) Partial Segmentation Offload, from Alexander Duyck.

    3) Add GMAC4 support to stmmac driver, from Alexandre TORGUE.

    4) Allow cls_flower stats offload, from Amir Vadai.

    5) Implement bpf blinding, from Daniel Borkmann.

    6) Optimize _ASYNC_ bit twiddling on sockets, unless the socket is
    actually using FASYNC these atomics are superfluous. From Eric
    Dumazet.

    7) Run TCP more preemptibly, also from Eric Dumazet.

    8) Support LED blinking, EEPROM dumps, and rxvlan offloading in mlx5e
    driver, from Gal Pressman.

    9) Allow creating ppp devices via rtnetlink, from Guillaume Nault.

    10) Improve BPF usage documentation, from Jesper Dangaard Brouer.

    11) Support tunneling offloads in qed, from Manish Chopra.

    12) aRFS offloading in mlx5e, from Maor Gottlieb.

    13) Add RFS and RPS support to SCTP protocol, from Marcelo Ricardo
    Leitner.

    14) Add MSG_EOR support to TCP, this allows controlling packet
    coalescing on application record boundaries for more accurate
    socket timestamp sampling. From Martin KaFai Lau.

    15) Fix alignment of 64-bit netlink attributes across the board, from
    Nicolas Dichtel.

    16) Per-vlan stats in bridging, from Nikolay Aleksandrov.

    17) Several conversions of drivers to ethtool ksettings, from Philippe
    Reynes.

    18) Checksum neutral ILA in ipv6, from Tom Herbert.

    19) Factorize all of the various marvell dsa drivers into one, from
    Vivien Didelot

    20) Add VF support to qed driver, from Yuval Mintz"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1649 commits)
    Revert "phy dp83867: Fix compilation with CONFIG_OF_MDIO=m"
    Revert "phy dp83867: Make rgmii parameters optional"
    r8169: default to 64-bit DMA on recent PCIe chips
    phy dp83867: Make rgmii parameters optional
    phy dp83867: Fix compilation with CONFIG_OF_MDIO=m
    bpf: arm64: remove callee-save registers use for tmp registers
    asix: Fix offset calculation in asix_rx_fixup() causing slow transmissions
    switchdev: pass pointer to fib_info instead of copy
    net_sched: close another race condition in tcf_mirred_release()
    tipc: fix nametable publication field in nl compat
    drivers: net: Don't print unpopulated net_device name
    qed: add support for dcbx.
    ravb: Add missing free_irq() calls to ravb_close()
    qed: Remove a stray tab
    net: ethernet: fec-mpc52xx: use phy_ethtool_{get|set}_link_ksettings
    net: ethernet: fec-mpc52xx: use phydev from struct net_device
    bpf, doc: fix typo on bpf_asm descriptions
    stmmac: hardware TX COE doesn't work when force_thresh_dma_mode is set
    net: ethernet: fs-enet: use phy_ethtool_{get|set}_link_ksettings
    net: ethernet: fs-enet: use phydev from struct net_device
    ...

    Linus Torvalds
     

05 May, 2016

1 commit

  • Replace all trans_start updates with netif_trans_update helper.
    change was done via spatch:

    struct net_device *d;
    @@
    - d->trans_start = jiffies
    + netif_trans_update(d)

    Compile tested only.

    Cc: user-mode-linux-devel@lists.sourceforge.net
    Cc: linux-xtensa@linux-xtensa.org
    Cc: linux1394-devel@lists.sourceforge.net
    Cc: linux-rdma@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Cc: MPT-FusionLinux.pdl@broadcom.com
    Cc: linux-scsi@vger.kernel.org
    Cc: linux-can@vger.kernel.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linux-omap@vger.kernel.org
    Cc: linux-hams@vger.kernel.org
    Cc: linux-usb@vger.kernel.org
    Cc: linux-wireless@vger.kernel.org
    Cc: linux-s390@vger.kernel.org
    Cc: devel@driverdev.osuosl.org
    Cc: b.a.t.m.a.n@lists.open-mesh.org
    Cc: linux-bluetooth@vger.kernel.org
    Signed-off-by: Florian Westphal
    Acked-by: Felipe Balbi
    Acked-by: Mugunthan V N
    Acked-by: Antonio Quartulli
    Signed-off-by: David S. Miller

    Florian Westphal
     

13 Apr, 2016

1 commit


23 Mar, 2016

1 commit

  • This commit fixes the following security hole affecting systems where
    all of the following conditions are fulfilled:

    - The fs.suid_dumpable sysctl is set to 2.
    - The kernel.core_pattern sysctl's value starts with "/". (Systems
    where kernel.core_pattern starts with "|/" are not affected.)
    - Unprivileged user namespace creation is permitted. (This is
    true on Linux >=3.8, but some distributions disallow it by
    default using a distro patch.)

    Under these conditions, if a program executes under secure exec rules,
    causing it to run with the SUID_DUMP_ROOT flag, then unshares its user
    namespace, changes its root directory and crashes, the coredump will be
    written using fsuid=0 and a path derived from kernel.core_pattern - but
    this path is interpreted relative to the root directory of the process,
    allowing the attacker to control where a coredump will be written with
    root privileges.

    To fix the security issue, always interpret core_pattern for dumps that
    are written under SUID_DUMP_ROOT relative to the root directory of init.

    Signed-off-by: Jann Horn
    Acked-by: Kees Cook
    Cc: Al Viro
    Cc: "Eric W. Biederman"
    Cc: Andy Lutomirski
    Cc: Oleg Nesterov
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jann Horn
     

21 Mar, 2016

1 commit

  • Pull x86 protection key support from Ingo Molnar:
    "This tree adds support for a new memory protection hardware feature
    that is available in upcoming Intel CPUs: 'protection keys' (pkeys).

    There's a background article at LWN.net:

    https://lwn.net/Articles/643797/

    The gist is that protection keys allow the encoding of
    user-controllable permission masks in the pte. So instead of having a
    fixed protection mask in the pte (which needs a system call to change
    and works on a per page basis), the user can map a (handful of)
    protection mask variants and can change the masks runtime relatively
    cheaply, without having to change every single page in the affected
    virtual memory range.

    This allows the dynamic switching of the protection bits of large
    amounts of virtual memory, via user-space instructions. It also
    allows more precise control of MMU permission bits: for example the
    executable bit is separate from the read bit (see more about that
    below).

    This tree adds the MM infrastructure and low level x86 glue needed for
    that, plus it adds a high level API to make use of protection keys -
    if a user-space application calls:

    mmap(..., PROT_EXEC);

    or

    mprotect(ptr, sz, PROT_EXEC);

    (note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice
    this special case, and will set a special protection key on this
    memory range. It also sets the appropriate bits in the Protection
    Keys User Rights (PKRU) register so that the memory becomes unreadable
    and unwritable.

    So using protection keys the kernel is able to implement 'true'
    PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies
    PROT_READ as well. Unreadable executable mappings have security
    advantages: they cannot be read via information leaks to figure out
    ASLR details, nor can they be scanned for ROP gadgets - and they
    cannot be used by exploits for data purposes either.

    We know about no user-space code that relies on pure PROT_EXEC
    mappings today, but binary loaders could start making use of this new
    feature to map binaries and libraries in a more secure fashion.

    There is other pending pkeys work that offers more high level system
    call APIs to manage protection keys - but those are not part of this
    pull request.

    Right now there's a Kconfig that controls this feature
    (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled
    (like most x86 CPU feature enablement code that has no runtime
    overhead), but it's not user-configurable at the moment. If there's
    any serious problem with this then we can make it configurable and/or
    flip the default"

    * 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
    x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
    mm/pkeys: Fix siginfo ABI breakage caused by new u64 field
    x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA
    mm/core, x86/mm/pkeys: Add execute-only protection keys support
    x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags
    x86/mm/pkeys: Allow kernel to modify user pkey rights register
    x86/fpu: Allow setting of XSAVE state
    x86/mm: Factor out LDT init from context init
    mm/core, x86/mm/pkeys: Add arch_validate_pkey()
    mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits()
    x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU
    x86/mm/pkeys: Add Kconfig prompt to existing config option
    x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps
    x86/mm/pkeys: Dump PKRU with other kernel registers
    mm/core, x86/mm/pkeys: Differentiate instruction fetches
    x86/mm/pkeys: Optimize fault handling in access_error()
    mm/core: Do not enforce PKEY permissions on remote mm access
    um, pkeys: Add UML arch_*_access_permitted() methods
    mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys
    x86/mm/gup: Simplify get_user_pages() PTE bit handling
    ...

    Linus Torvalds