23 Mar, 2016

1 commit

  • This commit fixes the following security hole affecting systems where
    all of the following conditions are fulfilled:

    - The fs.suid_dumpable sysctl is set to 2.
    - The kernel.core_pattern sysctl's value starts with "/". (Systems
    where kernel.core_pattern starts with "|/" are not affected.)
    - Unprivileged user namespace creation is permitted. (This is
    true on Linux >=3.8, but some distributions disallow it by
    default using a distro patch.)

    Under these conditions, if a program executes under secure exec rules,
    causing it to run with the SUID_DUMP_ROOT flag, then unshares its user
    namespace, changes its root directory and crashes, the coredump will be
    written using fsuid=0 and a path derived from kernel.core_pattern - but
    this path is interpreted relative to the root directory of the process,
    allowing the attacker to control where a coredump will be written with
    root privileges.

    To fix the security issue, always interpret core_pattern for dumps that
    are written under SUID_DUMP_ROOT relative to the root directory of init.

    Signed-off-by: Jann Horn
    Acked-by: Kees Cook
    Cc: Al Viro
    Cc: "Eric W. Biederman"
    Cc: Andy Lutomirski
    Cc: Oleg Nesterov
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jann Horn
     

21 Mar, 2016

1 commit

  • Pull x86 protection key support from Ingo Molnar:
    "This tree adds support for a new memory protection hardware feature
    that is available in upcoming Intel CPUs: 'protection keys' (pkeys).

    There's a background article at LWN.net:

    https://lwn.net/Articles/643797/

    The gist is that protection keys allow the encoding of
    user-controllable permission masks in the pte. So instead of having a
    fixed protection mask in the pte (which needs a system call to change
    and works on a per page basis), the user can map a (handful of)
    protection mask variants and can change the masks at runtime relatively
    cheaply, without having to change every single page in the affected
    virtual memory range.

    This allows the dynamic switching of the protection bits of large
    amounts of virtual memory, via user-space instructions. It also
    allows more precise control of MMU permission bits: for example the
    executable bit is separate from the read bit (see more about that
    below).

    This tree adds the MM infrastructure and low level x86 glue needed for
    that, plus it adds a high level API to make use of protection keys -
    if a user-space application calls:

    mmap(..., PROT_EXEC);

    or

    mprotect(ptr, sz, PROT_EXEC);

    (note PROT_EXEC-only, without PROT_READ/WRITE), the kernel will notice
    this special case, and will set a special protection key on this
    memory range. It also sets the appropriate bits in the Protection
    Keys User Rights (PKRU) register so that the memory becomes unreadable
    and unwritable.

    So using protection keys the kernel is able to implement 'true'
    PROT_EXEC on x86 CPUs: without protection keys PROT_EXEC implies
    PROT_READ as well. Unreadable executable mappings have security
    advantages: they cannot be read via information leaks to figure out
    ASLR details, nor can they be scanned for ROP gadgets - and they
    cannot be used by exploits for data purposes either.

    We know about no user-space code that relies on pure PROT_EXEC
    mappings today, but binary loaders could start making use of this new
    feature to map binaries and libraries in a more secure fashion.

    There is other pending pkeys work that offers more high level system
    call APIs to manage protection keys - but those are not part of this
    pull request.

    Right now there's a Kconfig that controls this feature
    (CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) that is default enabled
    (like most x86 CPU feature enablement code that has no runtime
    overhead), but it's not user-configurable at the moment. If there's
    any serious problem with this then we can make it configurable and/or
    flip the default"

    * 'mm-pkeys-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
    x86/mm/pkeys: Fix mismerge of protection keys CPUID bits
    mm/pkeys: Fix siginfo ABI breakage caused by new u64 field
    x86/mm/pkeys: Fix access_error() denial of writes to write-only VMA
    mm/core, x86/mm/pkeys: Add execute-only protection keys support
    x86/mm/pkeys: Create an x86 arch_calc_vm_prot_bits() for VMA flags
    x86/mm/pkeys: Allow kernel to modify user pkey rights register
    x86/fpu: Allow setting of XSAVE state
    x86/mm: Factor out LDT init from context init
    mm/core, x86/mm/pkeys: Add arch_validate_pkey()
    mm/core, arch, powerpc: Pass a protection key in to calc_vm_flag_bits()
    x86/mm/pkeys: Actually enable Memory Protection Keys in the CPU
    x86/mm/pkeys: Add Kconfig prompt to existing config option
    x86/mm/pkeys: Dump pkey from VMA in /proc/pid/smaps
    x86/mm/pkeys: Dump PKRU with other kernel registers
    mm/core, x86/mm/pkeys: Differentiate instruction fetches
    x86/mm/pkeys: Optimize fault handling in access_error()
    mm/core: Do not enforce PKEY permissions on remote mm access
    um, pkeys: Add UML arch_*_access_permitted() methods
    mm/gup, x86/mm/pkeys: Check VMAs and PTEs for protection keys
    x86/mm/gup: Simplify get_user_pages() PTE bit handling
    ...

    Linus Torvalds
     

18 Mar, 2016

1 commit

  • There are a few things about the *pte_alloc*() helpers worth cleaning up:

    - the 'vma' argument is unused, let's drop it;

    - most __pte_alloc() callers do a speculative check for pmd_none()
    before taking ptl: let's introduce a pte_alloc() macro which does
    the check.

    The only direct user of __pte_alloc() left is userfaultfd, which has
    different expectations about atomicity wrt pmd.

    - pte_alloc_map() and pte_alloc_map_lock() are redefined using
    pte_alloc().

    [sudeep.holla@arm.com: fix build for arm64 hugetlbpage]
    [sfr@canb.auug.org.au: fix arch/arm/mm/mmu.c some more]
    Signed-off-by: Kirill A. Shutemov
    Cc: Dave Hansen
    Signed-off-by: Sudeep Holla
    Acked-by: Kirill A. Shutemov
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

06 Mar, 2016

2 commits


19 Feb, 2016

1 commit

  • UML has a special mmu_context.h and needs updates whenever the generic one
    is updated.

    Signed-off-by: Dave Hansen
    Cc: Dave Hansen
    Cc: Jeff Dike
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Richard Weinberger
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-mm@kvack.org
    Cc: user-mode-linux-devel@lists.sourceforge.net
    Cc: user-mode-linux-user@lists.sourceforge.net
    Link: http://lkml.kernel.org/r/20160218183557.AE1DB383@viggo.jf.intel.com
    Signed-off-by: Ingo Molnar

    Dave Hansen
     

06 Feb, 2016

1 commit

  • Commit 16da306849d0 ("um: kill pfn_t") introduced a compile warning for
    defconfig (SUBARCH=i386):

    arch/um/kernel/skas/mmu.c:38:206:
    warning: right shift count >= width of type [-Wshift-count-overflow]

    Aforementioned patch changes the definition of the phys_to_pfn() macro
    from

    ((pfn_t) ((p) >> PAGE_SHIFT))

    to

    ((p) >> PAGE_SHIFT)

    This effectively changes the phys_to_pfn() expansion's type from
    unsigned long long to unsigned long.

    Through the callchain init_stub_pte() => mk_pte(), the expansion of
    phys_to_pfn() is (indirectly) fed into the 'phys' argument of the
    pte_set_val(pte, phys, prot) macro, eventually leading to

    (pte).pte_high = (phys) >> 32;

    This results in the warning from above.

    Since UML only deals with 32 bit addresses, the upper 32 bits from
    'phys' used to be always zero anyway. Also, all page protection flags
    defined by UML don't use any bits beyond bit 9. Since the contents of a
    PTE are defined within architecture scope only, the ->pte_high member
    can be safely removed.

    Remove the ->pte_high member from struct pte_t.
    Rename ->pte_low to ->pte.
    Adapt the pte helper macros in arch/um/include/asm/page.h.

    Noteworthy is the pte_copy() macro where a smp_wmb() gets dropped. This
    write barrier doesn't seem to be paired with any read barrier though and
    thus, was useless anyway.

    Fixes: 16da306849d0 ("um: kill pfn_t")
    Signed-off-by: Nicolai Stange
    Cc: Dan Williams
    Cc: Richard Weinberger
    Cc: Nicolai Stange
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nicolai Stange
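
    The reasoning about ->pte_high can be illustrated with a small
    user-space sketch. The types and helpers below are hypothetical
    stand-ins, not the definitions from arch/um/include/asm/page.h; 'phys'
    is widened to 64 bits in the old variant so that the shift by 32 is
    well defined.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the UML types; not the kernel definitions. */
typedef struct { uint32_t pte_low, pte_high; } old_pte_t;  /* before */
typedef struct { uint32_t pte; } new_pte_t;                /* after  */

/* Old layout: the upper half is filled from bits 32..63 of 'phys'. */
static void old_pte_set_val(old_pte_t *pte, uint64_t phys, uint32_t prot)
{
    pte->pte_high = (uint32_t)(phys >> 32);
    pte->pte_low  = (uint32_t)phys | prot;
}

/* New layout: a single 32-bit word holds the address and the flags. */
static void new_pte_set_val(new_pte_t *pte, uint32_t phys, uint32_t prot)
{
    pte->pte = phys | prot;
}
```

    For any address that fits in 32 bits, the old pte_high is zero and the
    old pte_low equals the new pte, which is why dropping the upper member
    loses nothing.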
     

16 Jan, 2016

1 commit

  • The core has developed a need for a "pfn_t" type [1]. Convert the usage
    of pfn_t by usermode-linux to an unsigned long, and update pfn_to_phys()
    to drop its expectation of a typed pfn.

    [1]: https://lists.01.org/pipermail/linux-nvdimm/2015-September/002199.html

    Signed-off-by: Dan Williams
    Cc: Dave Hansen
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Williams
     

13 Jan, 2016

1 commit

  • Pull misc vfs updates from Al Viro:
    "All kinds of stuff. That probably should've been 5 or 6 separate
    branches, but by the time I'd realized how large and mixed that bag
    had become it had been too close to -final to play with rebasing.

    Some fs/namei.c cleanups there, memdup_user_nul() introduction and
    switching open-coded instances, burying long-dead code, whack-a-mole
    of various kinds, several new helpers for ->llseek(), assorted
    cleanups and fixes from various people, etc.

    One piece probably deserves special mention - Neil's
    lookup_one_len_unlocked(). Similar to lookup_one_len(), but gets
    called without ->i_mutex and tries to avoid ever taking it. That, of
    course, means that it's not useful for any directory modifications,
    but things like getting inode attributes in nfsd readdirplus are fine
    with that. I really should've asked for moratorium on lookup-related
    changes this cycle, but since I hadn't done that early enough... I
    *am* asking for that for the coming cycle, though - I'm going to try
    and get conversion of i_mutex to rwsem with ->lookup() done under lock
    taken shared.

    There will be a patch closer to the end of the window, along the lines
    of the one Linus had posted last May - mechanical conversion of
    ->i_mutex accesses to inode_lock()/inode_unlock()/inode_trylock()/
    inode_is_locked()/inode_lock_nested(). To quote Linus back then:

    -----
    | This is an automated patch using
    |
    | sed 's/mutex_lock(&\(.*\)->i_mutex)/inode_lock(\1)/'
    | sed 's/mutex_unlock(&\(.*\)->i_mutex)/inode_unlock(\1)/'
    | sed 's/mutex_lock_nested(&\(.*\)->i_mutex,[ ]*I_MUTEX_\([A-Z0-9_]*\))/inode_lock_nested(\1, I_MUTEX_\2)/'
    | sed 's/mutex_is_locked(&\(.*\)->i_mutex)/inode_is_locked(\1)/'
    | sed 's/mutex_trylock(&\(.*\)->i_mutex)/inode_trylock(\1)/'
    |
    | with a very few manual fixups
    -----

    I'm going to send that once the ->i_mutex-affecting stuff in -next
    gets mostly merged (or when Linus says he's about to stop taking
    merges)"

    * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    nfsd: don't hold i_mutex over userspace upcalls
    fs:affs:Replace time_t with time64_t
    fs/9p: use fscache mutex rather than spinlock
    proc: add a reschedule point in proc_readfd_common()
    logfs: constify logfs_block_ops structures
    fcntl: allow to set O_DIRECT flag on pipe
    fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE
    fs: xattr: Use kvfree()
    [s390] page_to_phys() always returns a multiple of PAGE_SIZE
    nbd: use ->compat_ioctl()
    fs: use block_device name vsprintf helper
    lib/vsprintf: add %*pg format specifier
    fs: use gendisk->disk_name where possible
    poll: plug an unused argument to do_poll
    amdkfd: don't open-code memdup_user()
    cdrom: don't open-code memdup_user()
    rsxx: don't open-code memdup_user()
    mtip32xx: don't open-code memdup_user()
    [um] mconsole: don't open-code memdup_user_nul()
    [um] hostaudio: don't open-code memdup_user()
    ...

    Linus Torvalds
     

11 Jan, 2016

9 commits

  • Open the memory mapped file with the O_TMPFILE flag when available.

    Signed-off-by: Mickaël Salaün
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Acked-by: Tristan Schmelcher
    Signed-off-by: Richard Weinberger

    Mickaël Salaün
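
    A minimal sketch of "use O_TMPFILE when available", assuming a fallback
    to mkstemp(3) plus unlink(2) when the flag is missing or unsupported.
    The function name open_unnamed_file() and the file name template are
    invented for this example and are not taken from the UML sources.

```c
#define _GNU_SOURCE            /* exposes O_TMPFILE on glibc */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Return a descriptor for an unnamed file in 'dir', or -1 on error. */
static int open_unnamed_file(const char *dir)
{
    char path[4096];
    int fd, n;

#ifdef O_TMPFILE
    /* Preferred path: the file never gets a name in the directory. */
    fd = open(dir, O_TMPFILE | O_RDWR, 0600);
    if (fd >= 0)
        return fd;
    /* Kernel or filesystem without support: fall through. */
#endif
    n = snprintf(path, sizeof(path), "%s/sketch-XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    fd = mkstemp(path);
    if (fd >= 0)
        unlink(path);          /* drop the name; the descriptor stays valid */
    return fd;
}
```

    With O_TMPFILE there is no window in which another process can open the
    file by name, which the mkstemp() fallback cannot fully guarantee.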
     
  • Remove the insecure 0777 mode for the temporary file to prevent other users
    from changing the mapped executable code.

    An attacker could gain access to the mapped file descriptor from the
    temporary file (before it is unlinked) in a read-only mode but it should
    not be accessible in write mode to avoid arbitrary code execution.

    To not change the hostfs behavior, the temporary file creation
    permission now depends on the current umask(2) and the implementation of
    mkstemp(3).

    Signed-off-by: Mickaël Salaün
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Acked-by: Tristan Schmelcher
    Signed-off-by: Richard Weinberger

    Mickaël Salaün
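
    The effect of relying on mkstemp(3) and the umask can be checked with a
    short sketch like the one below. The helper name mkstemp_mode() and the
    /tmp template are assumptions for illustration. POSIX.1-2008 specifies
    that mkstemp() creates the file with owner read/write permission only,
    so group and other write bits should never be set.

```c
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create a temporary file with mkstemp() and return its permission bits,
 * or -1 on error. The file is unlinked and closed before returning. */
static int mkstemp_mode(void)
{
    char path[] = "/tmp/sketch-XXXXXX";   /* hypothetical template */
    struct stat st;
    int mode = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) == 0)
        mode = (int)(st.st_mode & 0777);
    unlink(path);
    close(fd);
    return mode;
}
```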
     
  • This brings SECCOMP_MODE_STRICT and SECCOMP_MODE_FILTER support through
    prctl(2) and seccomp(2) to User-mode Linux for i386 and x86_64
    subarchitectures.

    secure_computing() is called first in handle_syscall() so that the
    syscall emulation will be aborted quickly if matching a seccomp rule.

    This is inspired by Meredydd Luff's patch
    (https://gerrit.chromium.org/gerrit/21425).

    Signed-off-by: Mickaël Salaün
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: Ingo Molnar
    Cc: Kees Cook
    Cc: Andy Lutomirski
    Cc: Will Drewry
    Cc: Chris Metcalf
    Cc: Michael Ellerman
    Cc: James Hogan
    Cc: Meredydd Luff
    Cc: David Drysdale
    Signed-off-by: Richard Weinberger
    Acked-by: Kees Cook

    Mickaël Salaün
     
  • Add subarchitecture-independent implementation of asm-generic/syscall.h
    allowing access to user system call parameters and results:
    * syscall_get_nr()
    * syscall_rollback()
    * syscall_get_error()
    * syscall_get_return_value()
    * syscall_set_return_value()
    * syscall_get_arguments()
    * syscall_set_arguments()
    * syscall_get_arch() provided by arch/x86/um/asm/syscall.h

    This provides the necessary syscall helpers needed by
    HAVE_ARCH_SECCOMP_FILTER plus syscall_get_error().

    This is inspired by Meredydd Luff's patch
    (https://gerrit.chromium.org/gerrit/21425).

    Signed-off-by: Mickaël Salaün
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: H. Peter Anvin
    Cc: Kees Cook
    Cc: Andy Lutomirski
    Cc: Will Drewry
    Cc: Meredydd Luff
    Cc: David Drysdale
    Signed-off-by: Richard Weinberger
    Acked-by: Kees Cook

    Mickaël Salaün
     
  • This fixes two related bugs:
    * PTRACE_GETREGS doesn't get the right orig_ax (syscall) value
    * PTRACE_SETREGS can't set the orig_ax value (erased by initial value)

    Get rid of the now useless and error-prone get_syscall().

    Fix inconsistent behavior in the i386 ptrace implementation, where
    updating orig_eax automatically updated the syscall number as well. The
    syscall number is now updated in handle_syscall().

    Signed-off-by: Mickaël Salaün
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: Thomas Gleixner
    Cc: Kees Cook
    Cc: Andy Lutomirski
    Cc: Will Drewry
    Cc: Thomas Meyer
    Cc: Nicolas Iooss
    Cc: Anton Ivanov
    Cc: Meredydd Luff
    Cc: David Drysdale
    Signed-off-by: Richard Weinberger
    Acked-by: Kees Cook

    Mickaël Salaün
     
  • This decreases the number of syscalls per read/write by half.

    Signed-off-by: Anton Ivanov
    Signed-off-by: Richard Weinberger

    Anton Ivanov
     
  • Software IRQ processing in generic architectures assumes that the
    exit out of hard IRQ may have re-enabled interrupts (some
    architectures may have an implicit EOI). It presumes them enabled
    and toggles the flags once more just in case unless this is turned
    off in the architecture specific hardirq.h by setting
    __ARCH_IRQ_EXIT_IRQS_DISABLED

    This patch adds this to UML where due to the way IRQs are handled
    it is an optimization (it works fine without it too).

    Signed-off-by: Anton Ivanov
    Signed-off-by: Richard Weinberger

    Anton Ivanov
     
  • The existing IRQ handler design in UML does not prevent reentrancy.

    This is mitigated by fd-enable/fd-disable semantics for the IO
    portion of the UML subsystem. The timer, however, can be and is
    re-entered, resulting in very deep stack usage and occasional
    stack exhaustion.

    This patch prevents this by checking if there is a timer
    interrupt in-flight before processing any pending timer interrupts.

    Signed-off-by: Anton Ivanov
    Signed-off-by: Richard Weinberger

    Anton Ivanov
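
    The in-flight check can be sketched generically as below. This is not
    the UML code: every name is hypothetical, and deferring a nested
    interrupt to the outer handler (rather than dropping it) is an
    assumption of this sketch. The simulate_nested counter only exists to
    imitate a timer interrupt arriving while the handler is still running.

```c
static int timer_in_flight;   /* set while a timer interrupt is handled */
static int timer_pending;     /* interrupts that arrived in the meantime */
static int handled;           /* number of timer ticks processed */
static int depth, max_depth;  /* instrumentation for this sketch only */
static int simulate_nested;   /* nested interrupts still to simulate */

static void timer_interrupt(void)
{
    if (timer_in_flight) {
        timer_pending++;      /* remember it, but do not nest */
        return;
    }
    timer_in_flight = 1;
    for (;;) {
        depth++;
        if (depth > max_depth)
            max_depth = depth;
        handled++;
        if (simulate_nested > 0) {
            simulate_nested--;
            timer_interrupt();    /* arrives mid-handler, returns early */
        }
        depth--;
        if (timer_pending == 0)
            break;
        timer_pending--;          /* process the deferred tick in a loop */
    }
    timer_in_flight = 0;
}
```

    Because deferred ticks are handled in a loop instead of by recursion,
    the stack depth stays constant no matter how many interrupts pile up.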
     
  • I was seeing some really weird behaviour where piping UML's output
    somewhere would cause output to get duplicated:

    $ ./vmlinux | head -n 40
    Checking that ptrace can change system call numbers...Core dump limits :
    soft - 0
    hard - NONE
    OK
    Checking syscall emulation patch for ptrace...Core dump limits :
    soft - 0
    hard - NONE
    OK
    Checking advanced syscall emulation patch for ptrace...Core dump limits :
    soft - 0
    hard - NONE
    OK
    Core dump limits :
    soft - 0
    hard - NONE

    This is because these tests do a fork() which duplicates the non-empty
    stdout buffer, then glibc flushes the duplicated buffer as each child
    exits.

    A simple workaround is to flush before forking.

    Cc: stable@vger.kernel.org
    Signed-off-by: Vegard Nossum
    Signed-off-by: Richard Weinberger

    Vegard Nossum
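
    The duplication and the workaround can be reproduced with a temporary
    file standing in for the piped stdout. This is a sketch under that
    assumption, and bytes_written() is a made-up helper: it buffers some
    text, optionally flushes, forks a child that exits immediately, and
    reports how many bytes ended up in the file.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the resulting file size in bytes, or -1 on error. */
static long bytes_written(int flush_before_fork)
{
    FILE *f = tmpfile();
    long size;
    pid_t pid;

    if (f == NULL)
        return -1;
    fputs("Checking...", f);      /* 11 bytes, still in the stdio buffer */
    if (flush_before_fork)
        fflush(f);                /* the workaround: empty the buffer */
    pid = fork();
    if (pid < 0) {
        fclose(f);
        return -1;
    }
    if (pid == 0)
        exit(0);                  /* exit() flushes the child's buffer copy */
    waitpid(pid, NULL, 0);
    fflush(f);                    /* parent writes whatever it still holds */
    fseek(f, 0, SEEK_END);
    size = ftell(f);
    fclose(f);
    return size;
}
```

    Without the flush, both the child and the parent write their copy of
    the buffered text, so the file ends up twice as long as with the flush.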
     

04 Jan, 2016

2 commits


09 Dec, 2015

3 commits

  • When using va_list ensure that va_start will be followed by va_end.

    Signed-off-by: Geyslan G. Bem
    Signed-off-by: Richard Weinberger

    Geyslan G. Bem
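
    As an illustration of the rule (not the code touched by this commit),
    the sketch below keeps a single exit path after va_start() so that
    va_end() is reached even when the loop stops early. The function
    sum_ints() is hypothetical.

```c
#include <stdarg.h>

/* Sum 'count' int arguments. Returns -1 if any argument is negative. */
static int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++) {
        int v = va_arg(ap, int);

        if (v < 0) {
            total = -1;
            break;              /* leave the loop, do not return here */
        }
        total += v;
    }
    va_end(ap);                 /* reached on every path after va_start */
    return total;
}
```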
     
  • On gcc Ubuntu 4.8.4-2ubuntu1~14.04, linking vmlinux fails with:

    arch/um/os-Linux/built-in.o: In function `os_timer_create':
    /android/kernel/android/arch/um/os-Linux/time.c:51: undefined reference to `timer_create'
    arch/um/os-Linux/built-in.o: In function `os_timer_set_interval':
    /android/kernel/android/arch/um/os-Linux/time.c:84: undefined reference to `timer_settime'
    arch/um/os-Linux/built-in.o: In function `os_timer_remain':
    /android/kernel/android/arch/um/os-Linux/time.c:109: undefined reference to `timer_gettime'
    arch/um/os-Linux/built-in.o: In function `os_timer_one_shot':
    /android/kernel/android/arch/um/os-Linux/time.c:132: undefined reference to `timer_settime'
    arch/um/os-Linux/built-in.o: In function `os_timer_disable':
    /android/kernel/android/arch/um/os-Linux/time.c:145: undefined reference to `timer_settime'

    This is because -lrt appears in the generated link commandline
    after arch/um/os-Linux/built-in.o. Fix this by removing -lrt from
    arch/um/Makefile and adding it to the UM-specific section of
    scripts/link-vmlinux.sh.

    Signed-off-by: Lorenzo Colitti
    Signed-off-by: Richard Weinberger

    Lorenzo Colitti
     
  • If get_signal() returns us a signal to post
    we must not call it again, otherwise the already
    posted signal will be overridden.
    Before commit a610d6e672d this was the case as we stopped
    the while loop after a successful handle_signal().

    Cc: # 3.10-
    Fixes: a610d6e672d ("pull clearing RESTORE_SIGMASK into block_sigmask()")
    Signed-off-by: Richard Weinberger

    Richard Weinberger
     

07 Nov, 2015

6 commits

  • UML is using an obsolete itimer call for
    all timers and "polls" for kernel space timer firing
    in its userspace portion resulting in a long list
    of bugs and incorrect behaviour(s). It also uses
    ITIMER_VIRTUAL for its timer which results in the
    timer being dependent on it running and the cpu
    load.

    This patch fixes this by moving to posix high resolution
    timers firing off CLOCK_MONOTONIC and relaying the timer
    correctly to the UML userspace.

    Fixes:
    - crashes when the host suspends/resumes
    - broken userspace timers - effective ~40Hz instead
    of what they should be. Note - this modifies skas behavior
    by no longer setting an itimer per clone(). Timer events
    are relayed instead.
    - kernel network packet scheduling disciplines
    - tcp behaviour especially under load
    - various timer related corner cases

    Finally, overall responsiveness of userspace is better.

    Signed-off-by: Thomas Meyer
    Signed-off-by: Anton Ivanov
    [rw: massaged commit message]
    Signed-off-by: Richard Weinberger

    Anton Ivanov
     
  • Replace GFP_KERNEL with GFP_ATOMIC while a spinlock is held, since code
    holding a spinlock must be atomic. GFP_KERNEL may sleep and can cause a
    deadlock, whereas GFP_ATOMIC may fail but certainly avoids the deadlock.

    Signed-off-by: Saurabh Sengar
    Signed-off-by: Richard Weinberger

    Saurabh Sengar
     
  • If UML runs out of memory on the host side, report this
    condition more nicely.

    Signed-off-by: Richard Weinberger

    Richard Weinberger
     
  • We can use __NR_syscall_max.

    Signed-off-by: Richard Weinberger

    Richard Weinberger
     
  • To support changing syscall numbers we have to store
    it after syscall_trace_enter().

    Signed-off-by: Richard Weinberger

    Richard Weinberger
     
  • ...such that processes within UML can do a ptrace(PTRACE_OLDSETOPTIONS, ...)

    Signed-off-by: Richard Weinberger

    Richard Weinberger
     

20 Oct, 2015

3 commits

  • We have to exclude memory locations

    Richard Weinberger
     
  • If UML is executing a helper program it is using
    waitpid() with the __WCLONE flag to wait for the program
    as the helper is executed from a clone()'ed thread.
    While using __WCLONE is perfectly fine for clone()'ed
    children, it won't detect terminated children if the helper
    has issued an execve().

    We have to use __WALL to wait for both clone()'ed and
    regular children to detect the termination before and
    after an execve().

    Reported-and-tested-by: Thomas Meyer
    Signed-off-by: Richard Weinberger

    Richard Weinberger
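
    A small user-space sketch of waiting with __WALL, assuming a plain
    fork()ed child rather than UML's clone()'ed helper thread. The function
    run_and_wait() is hypothetical. __WALL makes waitpid() consider a child
    regardless of whether it counts as a "clone" or a regular child.

```c
#define _GNU_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with 'code', wait for it with __WALL, and
 * return its exit status (or -1 on any failure). */
static int run_and_wait(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);            /* child: terminate immediately */
    if (waitpid(pid, &status, __WALL) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```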
     
  • Commit 30b11ee9a (um: Remove copy&paste code from init.h)
    uncovered an issue wrt. out-of-tree builds.
    For out-of-tree builds, we must not rely on relative paths.
    Before 30b11ee9a it worked by chance as no host code included
    generated header files.

    Acked-by: Randy Dunlap
    Signed-off-by: Richard Weinberger

    Richard Weinberger
     

04 Oct, 2015

1 commit

  • Pull strscpy string copy function implementation from Chris Metcalf.

    Chris sent this during the merge window, but I waffled back and forth on
    the pull request, which is why it's going in only now.

    The new "strscpy()" function is definitely easier to use and more secure
    than either strncpy() or strlcpy(), both of which are horrible nasty
    interfaces that have serious and irredeemable problems.

    strncpy() has a useless return value, and doesn't NUL-terminate an
    overlong result. To make matters worse, it pads a short result with
    zeroes, which is a performance disaster if you have big buffers.

    strlcpy(), by contrast, is a mis-designed "fix" for strncpy(), lacking
    the insane NUL padding, but having a differently broken return value
    which returns the original length of the source string. Which means
    that it will read characters past the count from the source buffer, and
    you have to trust the source to be properly terminated. It also makes
    error handling fragile, since the test for overflow is unnecessarily
    subtle.

    strscpy() avoids both these problems, guaranteeing the NUL termination
    (but not excessive padding) if the destination size wasn't zero, and
    making the overflow condition very obvious by returning -E2BIG. It also
    doesn't read past the size of the source, and can thus be used for
    untrusted source data too.

    So why did I waffle about this for so long?

    Every time we introduce a new-and-improved interface, people start doing
    these interminable series of trivial conversion patches.

    And every time that happens, somebody does some silly mistake, and the
    conversion patch to the improved interface actually makes things worse.
    Because the patch is mindnumbing and trivial, nobody has the attention
    span to look at it carefully, and it's usually done over large swaths
    of source code which means that not every conversion gets tested.

    So I'm pulling the strscpy() support because it *is* a better interface.
    But I will refuse to pull mindless conversion patches. Use this in
    places where it makes sense, but don't do trivial patches to fix things
    that aren't actually known to be broken.

    * 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
    tile: use global strscpy() rather than private copy
    string: provide strscpy()
    Make asm/word-at-a-time.h available on all architectures

    Linus Torvalds
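
    The contract described above can be illustrated with a simplified
    user-space sketch. This is not the kernel implementation, which copies a
    word at a time; strscpy_sketch() is a hypothetical byte-at-a-time
    version that only mirrors the documented behaviour: always
    NUL-terminate when the size is non-zero, never pad, never read more
    than 'count' bytes of the source, and return -E2BIG on truncation.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Copy 'src' into 'dest' (capacity 'count' bytes). Returns the number of
 * characters copied, not counting the NUL, or -E2BIG if 'src' did not fit
 * or 'count' was zero. */
static ssize_t strscpy_sketch(char *dest, const char *src, size_t count)
{
    size_t i;

    if (count == 0)
        return -E2BIG;
    for (i = 0; i < count - 1 && src[i] != '\0'; i++)
        dest[i] = src[i];
    dest[i] = '\0';             /* terminated even when truncated */
    /* src[i] is still within the first 'count' bytes of the source. */
    return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}
```

    A caller only needs one check, "result < 0", to detect truncation,
    instead of comparing a returned source length against the buffer size
    as strlcpy() requires.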
     

02 Sep, 2015

1 commit

  • Pull timer updates from Thomas Gleixner:
    "Rather large, but nothing exciting:

    - new range check for settimeofday() to prevent boot time
    from becoming negative.
    - fix for file time rounding
    - a few simplifications of the hrtimer code
    - fix for the proc/timerlist code so the output of clock realtime
    timers is accurate
    - more y2038 work
    - tree wide conversion of clockevent drivers to the new callbacks"

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (88 commits)
    hrtimer: Handle failure of tick_init_highres() gracefully
    hrtimer: Unconfuse switch_hrtimer_base() a bit
    hrtimer: Simplify get_target_base() by returning current base
    hrtimer: Drop return code of hrtimer_switch_to_hres()
    time: Introduce timespec64_to_jiffies()/jiffies_to_timespec64()
    time: Introduce current_kernel_time64()
    time: Introduce struct itimerspec64
    time: Add the common weak version of update_persistent_clock()
    time: Always make sure wall_to_monotonic isn't positive
    time: Fix nanosecond file time rounding in timespec_trunc()
    timer_list: Add the base offset so remaining nsecs are accurate for non monotonic timers
    cris/time: Migrate to new 'set-state' interface
    kernel: broadcast-hrtimer: Migrate to new 'set-state' interface
    xtensa/time: Migrate to new 'set-state' interface
    unicore/time: Migrate to new 'set-state' interface
    um/time: Migrate to new 'set-state' interface
    sparc/time: Migrate to new 'set-state' interface
    sh/localtimer: Migrate to new 'set-state' interface
    score/time: Migrate to new 'set-state' interface
    s390/time: Migrate to new 'set-state' interface
    ...

    Linus Torvalds
     

10 Aug, 2015

1 commit

  • Migrate um driver to the new 'set-state' interface provided by
    clockevents core, the earlier 'set-mode' interface is marked obsolete
    now.

    This also enables us to implement callbacks for new states of clockevent
    devices, for example: ONESHOT_STOPPED.

    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: user-mode-linux-devel@lists.sourceforge.net
    Cc: user-mode-linux-user@lists.sourceforge.net
    Signed-off-by: Viresh Kumar
    Signed-off-by: Daniel Lezcano

    Viresh Kumar
     

31 Jul, 2015

1 commit


18 Jul, 2015

1 commit

  • Commit 2ae416b142b6 ("mm: new mm hook framework") introduced an empty
    header file (mm-arch-hooks.h) for every architecture, even those that
    don't need to define mm hooks.

    As suggested by Geert Uytterhoeven, this could be cleaned through the use
    of a generic header file included via each per architecture
    asm/include/Kbuild file.

    The PowerPC architecture is not impacted here since this architecture has
    to define the arch_remap MM hook.

    Signed-off-by: Laurent Dufour
    Suggested-by: Geert Uytterhoeven
    Acked-by: Geert Uytterhoeven
    Acked-by: Vineet Gupta
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Laurent Dufour
     

09 Jul, 2015

1 commit

  • Added the x86 implementation of word-at-a-time to the
    generic version, which previously only supported big-endian.

    Omitted the x86-specific load_unaligned_zeropad(), which in
    any case is also not present for the existing BE-only
    implementation of a word-at-a-time, and is only used under
    CONFIG_DCACHE_WORD_ACCESS.

    Added as a "generic-y" to the Kbuilds of all architectures
    that didn't previously have it.

    Signed-off-by: Chris Metcalf

    Chris Metcalf
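
    The core trick behind word-at-a-time string handling is detecting a
    zero byte inside a machine word without looping over its bytes. The
    sketch below shows that bit trick on a 64-bit word. It is a standalone
    illustration with hypothetical names, not the contents of
    asm/word-at-a-time.h.

```c
#include <stdint.h>

#define ONE_BITS  UINT64_C(0x0101010101010101)
#define HIGH_BITS UINT64_C(0x8080808080808080)

/* Nonzero if any of the eight bytes in 'word' is zero. Only the lowest
 * set bit of the result reliably marks a zero byte, because a borrow can
 * spill into the bytes above it. */
static uint64_t has_zero(uint64_t word)
{
    return (word - ONE_BITS) & ~word & HIGH_BITS;
}

/* Index of the lowest zero byte, counted from the least significant
 * byte, or -1 if the word contains no zero byte. */
static int lowest_zero_byte(uint64_t word)
{
    uint64_t mask = has_zero(word);
    int i;

    for (i = 0; i < 8; i++)
        if (mask & (UINT64_C(0x80) << (8 * i)))
            return i;
    return -1;
}
```

    On a little-endian machine the least significant byte is the first byte
    in memory, so the lowest zero byte corresponds to the end of a string
    loaded into the word.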
     

07 Jul, 2015

1 commit

  • Once x86 exports its do_signal(), the prototypes will clash.

    Fix the clash and also improve the code a bit: remove the
    unnecessary kern_do_signal() indirection. This allows
    interrupt_end() to share the 'regs' parameter calculation.

    Also remove the unused return code to match x86.

    Minimally build and boot tested.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Andy Lutomirski
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: Denys Vlasenko
    Cc: Frederic Weisbecker
    Cc: H. Peter Anvin
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Richard Weinberger
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Cc: paulmck@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/r/67c57eac09a589bac3c6c5ff22f9623ec55a184a.1435952415.git.luto@kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

02 Jul, 2015

1 commit

  • Pull module updates from Rusty Russell:
    "Main excitement here is Peter Zijlstra's lockless rbtree optimization
    to speed module address lookup. He found some abusers of the module
    lock doing that too.

    A little bit of parameter work here too; including Dan Streetman's
    breaking up the big param mutex so writing a parameter can load
    another module (yeah, really). Unfortunately that broke the usual
    suspects, !CONFIG_MODULES and !CONFIG_SYSFS, so those fixes were
    appended too"

    * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (26 commits)
    modules: only use mod->param_lock if CONFIG_MODULES
    param: fix module param locks when !CONFIG_SYSFS.
    rcu: merge fix for Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE()
    module: add per-module param_lock
    module: make perm const
    params: suppress unused variable error, warn once just in case code changes.
    modules: clarify CONFIG_MODULE_COMPRESS help, suggest 'N'.
    kernel/module.c: avoid ifdefs for sig_enforce declaration
    kernel/workqueue.c: remove ifdefs over wq_power_efficient
    kernel/params.c: export param_ops_bool_enable_only
    kernel/params.c: generalize bool_enable_only
    kernel/module.c: use generic module param operaters for sig_enforce
    kernel/params: constify struct kernel_param_ops uses
    sysfs: tightened sysfs permission checks
    module: Rework module_addr_{min,max}
    module: Use __module_address() for module_address_lookup()
    module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING
    module: Optimize __module_address() using a latched RB-tree
    rbtree: Implement generic latch_tree
    seqlock: Introduce raw_read_seqcount_latch()
    ...

    Linus Torvalds