22 Feb, 2018

1 commit

  • commit 75f296d93bcebcfe375884ddac79e30263a31766 upstream.

    Convert all allocations that used a NOTRACK flag to stop using it.
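    Each conversion has the same shape; an illustrative (hypothetical)
    call site:

        /* before */
        stack = kmem_cache_alloc_node(cachep, THREADINFO_GFP | __GFP_NOTRACK, node);
        /* after */
        stack = kmem_cache_alloc_node(cachep, THREADINFO_GFP, node);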

    Link: http://lkml.kernel.org/r/20171007030159.22241-3-alexander.levin@verizon.com
    Signed-off-by: Sasha Levin
    Cc: Alexander Potapenko
    Cc: Eric W. Biederman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Steven Rostedt
    Cc: Tim Hansen
    Cc: Vegard Nossum
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Levin, Alexander (Sasha Levin)

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the
    'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally
    binding shorthand, which can be used instead of the full boilerplate
    text.
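    For a C source file, the added tag is a single comment on the first
    line, for example:

        // SPDX-License-Identifier: GPL-2.0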

    This patch is based on work done by Thomas Gleixner, Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier should be
    applied to a file was done in a spreadsheet of side-by-side results
    from the output of two independent scanners (ScanCode & Windriver)
    producing SPDX tag:value files, created by Philippe Ombredanne.
    Philippe prepared the base worksheet and did an initial spot review
    of a few thousand files.

    The 4.13 kernel was the starting point of the analysis, with 60,537
    files assessed. Kate Stewart did a file-by-file comparison of the
    scanner results in the spreadsheet to determine which SPDX license
    identifier(s) should be applied to each file. She confirmed any
    determination that was not immediately clear with lawyers working
    with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained
      >5 lines of source.
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman

14 Oct, 2017

1 commit

  • Kmemleak considers any pointers on task stacks as references. This
    patch clears newly allocated and reused vmap stacks.
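    A sketch of the idea (the THREADINFO_GFP flags follow that era's
    definition; the memset sits in the cached-stack reuse path in
    kernel/fork.c):

        #if IS_ENABLED(CONFIG_DEBUG_STACK_USAGE) || IS_ENABLED(CONFIG_DEBUG_KMEMLEAK)
        # define THREADINFO_GFP (GFP_KERNEL_ACCOUNT | __GFP_ZERO)
        #else
        # define THREADINFO_GFP (GFP_KERNEL_ACCOUNT)
        #endif

        /* when a cached vmap stack is handed out again: */
        memset(s->addr, 0, THREAD_SIZE);        /* clear stale pointers */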

    Link: http://lkml.kernel.org/r/150728990124.744199.8403409836394318684.stgit@buzz
    Signed-off-by: Konstantin Khlebnikov
    Acked-by: Catalin Marinas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov

16 Aug, 2017

1 commit

  • In some cases, an architecture might wish its stacks to be aligned to a
    boundary larger than THREAD_SIZE. For example, using an alignment of
    double THREAD_SIZE can allow for stack overflows smaller than
    THREAD_SIZE to be detected by checking a single bit of the stack
    pointer.

    This patch allows architectures to override the alignment of VMAP'd
    stacks, by defining THREAD_ALIGN. Where not defined, this defaults to
    THREAD_SIZE, as is the case today.
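    A sketch of the trick; stack_overflowed() is a hypothetical helper,
    and the stack is assumed to occupy the low half of its doubly-aligned
    region:

        #define THREAD_SHIFT 14                 /* illustrative */
        #define THREAD_SIZE  (1UL << THREAD_SHIFT)
        #define THREAD_ALIGN (2 * THREAD_SIZE)  /* arch override */

        /* Every address inside the stack has bit THREAD_SHIFT clear;
         * an overflow of less than THREAD_SIZE sets it. */
        static inline int stack_overflowed(unsigned long sp)
        {
                return sp & THREAD_SIZE;
        }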

    Signed-off-by: Mark Rutland
    Reviewed-by: Will Deacon
    Tested-by: Laura Abbott
    Cc: Andy Lutomirski
    Cc: Ard Biesheuvel
    Cc: Catalin Marinas
    Cc: James Morse
    Cc: linux-kernel@vger.kernel.org

    Mark Rutland

30 Jun, 2017

1 commit


03 May, 2017

1 commit


05 Apr, 2017

1 commit

  • This patch moves the arch_within_stack_frames() return value enum up in
    the header files so that per-architecture implementations can reuse the
    same return values.
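    The shared values, as they ended up in <linux/thread_info.h>:

        enum {
                BAD_STACK = -1,  /* object crosses a frame boundary */
                NOT_STACK = 0,   /* not on the stack, or arch cannot check */
                GOOD_FRAME,      /* fully contained by one frame */
                GOOD_STACK,      /* within stack bounds, frames unchecked */
        };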

    Signed-off-by: Sahara
    Signed-off-by: James Morse
    [kees: adjusted naming and commit log]
    Signed-off-by: Kees Cook

    Sahara

20 Mar, 2017

1 commit

  • Intel supports faulting on the CPUID instruction beginning with Ivy Bridge.
    When enabled, the processor will fault on attempts to execute the CPUID
    instruction with CPL>0. Exposing this feature to userspace will allow a
    ptracer to trap and emulate the CPUID instruction.

    When supported, this feature is controlled by toggling bit 0 of
    MSR_MISC_FEATURES_ENABLES. It is documented in detail in Section 2.3.2 of
    https://bugzilla.kernel.org/attachment.cgi?id=243991

    Implement a new pair of arch_prctls, available on both x86-32 and x86-64.

    ARCH_GET_CPUID: Returns the current CPUID state, either 0 if CPUID faulting
    is enabled (and thus the CPUID instruction is not available) or 1 if
    CPUID faulting is not enabled.

    ARCH_SET_CPUID: Set the CPUID state to the second argument. If
    cpuid_enabled is 0 CPUID faulting will be activated, otherwise it will
    be deactivated. Returns ENODEV if CPUID faulting is not supported on
    this system.

    The state of the CPUID faulting flag is propagated across forks, but reset
    upon exec.
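    A minimal usage sketch (x86-64, raw syscall; error handling trimmed):

        #define _GNU_SOURCE
        #include <stdio.h>
        #include <unistd.h>
        #include <sys/syscall.h>
        #include <asm/prctl.h>          /* ARCH_GET_CPUID, ARCH_SET_CPUID */

        int main(void)
        {
                long state = syscall(SYS_arch_prctl, ARCH_GET_CPUID, 0);

                printf("CPUID is %s\n", state == 1 ? "available" : "faulting");

                /* 0 activates CPUID faulting; fails with ENODEV if the
                 * CPU lacks the feature. */
                if (syscall(SYS_arch_prctl, ARCH_SET_CPUID, 0) == -1)
                        perror("ARCH_SET_CPUID");
                return 0;
        }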

    Signed-off-by: Kyle Huey
    Cc: Grzegorz Andrejczuk
    Cc: kvm@vger.kernel.org
    Cc: Radim Krčmář
    Cc: Peter Zijlstra
    Cc: Dave Hansen
    Cc: Andi Kleen
    Cc: linux-kselftest@vger.kernel.org
    Cc: Nadav Amit
    Cc: Robert O'Callahan
    Cc: Richard Weinberger
    Cc: "Rafael J. Wysocki"
    Cc: Borislav Petkov
    Cc: Andy Lutomirski
    Cc: Len Brown
    Cc: Shuah Khan
    Cc: user-mode-linux-devel@lists.sourceforge.net
    Cc: Jeff Dike
    Cc: Alexander Viro
    Cc: user-mode-linux-user@lists.sourceforge.net
    Cc: David Matlack
    Cc: Boris Ostrovsky
    Cc: Dmitry Safonov
    Cc: linux-fsdevel@vger.kernel.org
    Cc: Paolo Bonzini
    Link: http://lkml.kernel.org/r/20170320081628.18952-9-khuey@kylehuey.com
    Signed-off-by: Thomas Gleixner

    Kyle Huey

12 Nov, 2016

2 commits

  • When CONFIG_THREAD_INFO_IN_TASK is selected, the current_thread_info()
    macro relies on current having been defined prior to its use. However,
    not all users of current_thread_info() include <asm/current.h>, and
    thus current is not guaranteed to be defined.

    When CONFIG_THREAD_INFO_IN_TASK is not selected, it's possible that
    get_current() / current are based upon current_thread_info(), and
    <asm/current.h> includes <asm/thread_info.h>. Thus always including
    <asm/current.h> would result in circular dependencies on some
    platforms.

    To ensure both cases work, this patch includes <asm/current.h>, but
    only when CONFIG_THREAD_INFO_IN_TASK is selected.
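    The resulting conditional include in <linux/thread_info.h> (sketch):

        #ifdef CONFIG_THREAD_INFO_IN_TASK
        /* Safe here: with thread_info embedded in task_struct,
         * <asm/current.h> cannot loop back through current_thread_info(). */
        #include <asm/current.h>
        #define current_thread_info() ((struct thread_info *)current)
        #endif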

    Signed-off-by: Mark Rutland
    Acked-by: Heiko Carstens
    Reviewed-by: Andy Lutomirski
    Cc: Andrew Morton
    Cc: Kees Cook
    Signed-off-by: Catalin Marinas

    Mark Rutland
  • Since commit f56141e3e2d9aabf ("all arches, signal: move restart_block
    to struct task_struct"), thread_info and restart_block have been
    logically distinct, yet struct restart_block is still defined in
    <linux/thread_info.h>.

    At least one architecture (erroneously) uses restart_block as part of
    its thread_info, and thus the definition of restart_block must come
    before the include of <asm/thread_info.h>. Subsequent patches in this
    series need to shuffle the order of includes and definitions in
    <linux/thread_info.h>, and will make this ordering fragile.

    This patch moves the definition of restart_block out to its own
    header. This serves as generic cleanup, logically separating
    thread_info and restart_block, and also makes it easier to avoid
    fragility.
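    A trimmed sketch of what moves into the new <linux/restart_block.h>
    (only the nanosleep member is shown; the futex and poll members are
    elided):

        struct restart_block {
                long (*fn)(struct restart_block *);
                union {
                        /* For nanosleep */
                        struct {
                                clockid_t clockid;
                                struct timespec __user *rmtp;
                                u64 expires;
                        } nanosleep;
                        /* futex and poll members elided */
                };
        };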

    Signed-off-by: Mark Rutland
    Reviewed-by: Andy Lutomirski
    Cc: Andrew Morton
    Cc: Heiko Carstens
    Cc: Kees Cook
    Signed-off-by: Catalin Marinas

    Mark Rutland

20 Oct, 2016

1 commit

  • The following commit:

    c65eacbe290b ("sched/core: Allow putting thread_info into task_struct")

    ... made 'struct thread_info' a generic struct with only a
    single ::flags member, if CONFIG_THREAD_INFO_IN_TASK=y is
    selected.

    This change however seems to be quite x86 centric, since at least the
    generic preemption code (asm-generic/preempt.h) assumes that struct
    thread_info also has a preempt_count member, which apparently was not
    true for x86.

    We could add a bit more #ifdefs to solve this problem too, but it
    seems much simpler to make struct thread_info arch-specific again.
    This also makes the conversion to THREAD_INFO_IN_TASK a bit easier
    for architectures that have a couple of arch-specific fields in
    their thread_info definition.

    The arch specific stuff _could_ be moved to thread_struct. However
    keeping them in thread_info makes it easier: accessing thread_info
    members is simple, since it is at the beginning of the task_struct,
    while the thread_struct is at the end. At least on s390 the offsets
    needed to access members of the thread_struct (with task_struct as
    base) are too large for various asm instructions. This is not a
    problem when keeping these members within thread_info.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Mark Rutland
    Acked-by: Thomas Gleixner
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: keescook@chromium.org
    Cc: linux-arch@vger.kernel.org
    Link: http://lkml.kernel.org/r/1476901693-8492-2-git-send-email-mark.rutland@arm.com
    Signed-off-by: Ingo Molnar

    Heiko Carstens

24 Sep, 2016

1 commit

  • The generic THREAD_INFO_IN_TASK definition of thread_info::flags is a
    u32, matching x86 prior to the introduction of THREAD_INFO_IN_TASK.

    However, common helpers like test_ti_thread_flag() implicitly assume
    that thread_info::flags has at least the size and alignment of unsigned
    long, and relying on padding and alignment provided by other elements of
    task_struct is somewhat fragile. Additionally, some architectures use
    more than 32 bits for thread_info::flags, and others may need to in
    future.

    With THREAD_INFO_IN_TASK, task struct follows thread_info with a long
    field, and thus we no longer save any space as we did back in commit:

    affa219b60a11b32 ("x86: change thread_info's flag field back to 32 bits")

    Given all this, it makes more sense for the generic thread_info::flags
    to be an unsigned long.

    In fact, given that <linux/thread_info.h> contains/uses the helpers
    mentioned above, BE arches *must* use unsigned long (or something of
    the same size) today, or they wouldn't work.

    Make it so.
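    The generic definition after this change:

        #ifdef CONFIG_THREAD_INFO_IN_TASK
        struct thread_info {
                unsigned long flags;    /* low level flags */
        };
        #endif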

    Signed-off-by: Mark Rutland
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1474651447-30447-1-git-send-email-mark.rutland@arm.com
    Signed-off-by: Ingo Molnar

    Mark Rutland

15 Sep, 2016

1 commit

  • If an arch opts in by setting CONFIG_THREAD_INFO_IN_TASK,
    then thread_info is defined as a single 'u32 flags' and is the first
    entry of task_struct. thread_info::task is removed (it serves no
    purpose if thread_info is embedded in task_struct), and
    thread_info::cpu gets its own slot in task_struct.

    This is heavily based on a patch written by Linus.
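    A sketch of the opt-in layout (member placement simplified):

        struct task_struct {
        #ifdef CONFIG_THREAD_INFO_IN_TASK
                struct thread_info thread_info; /* must stay first */
                unsigned int cpu;               /* took over thread_info::cpu */
        #endif
                /* remaining members unchanged */
        };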

    Originally-from: Linus Torvalds
    Signed-off-by: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Jann Horn
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/a0898196f0476195ca02713691a5037a14f2aac5.1473801993.git.luto@kernel.org
    Signed-off-by: Ingo Molnar

    Andy Lutomirski

08 Sep, 2016

1 commit


07 Sep, 2016

1 commit

  • Instead of having each caller of check_object_size() need to remember
    to check for a const size parameter, move the check into
    check_object_size() itself. This actually matches the original
    implementation in PaX, though this commit cleans up the now-redundant
    __builtin_constant_p() calls in the various architectures.
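    The resulting wrapper, lightly trimmed from <linux/thread_info.h>:

        static inline void check_object_size(const void *ptr, unsigned long n,
                                             bool to_user)
        {
                if (!__builtin_constant_p(n))
                        __check_object_size(ptr, n, to_user);
        }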

    Signed-off-by: Kees Cook

    Kees Cook

09 Aug, 2016

1 commit

  • Pull usercopy protection from Kees Cook:
    "This implements HARDENED_USERCOPY verification of copy_to_user and
    copy_from_user bounds checking for most architectures on SLAB and
    SLUB"

    * tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
    mm: SLUB hardened usercopy support
    mm: SLAB hardened usercopy support
    s390/uaccess: Enable hardened usercopy
    sparc/uaccess: Enable hardened usercopy
    powerpc/uaccess: Enable hardened usercopy
    ia64/uaccess: Enable hardened usercopy
    arm64/uaccess: Enable hardened usercopy
    ARM: uaccess: Enable hardened usercopy
    x86/uaccess: Enable hardened usercopy
    mm: Hardened usercopy
    mm: Implement stack frame object validation
    mm: Add is_migrate_cma_page

    Linus Torvalds

03 Aug, 2016

1 commit

  • In general, there's no need for the "restore sigmask" flag to live in
    ti->flags. alpha, ia64, microblaze, powerpc, sh, sparc (64-bit only),
    tile, and x86 use essentially identical alternative implementations,
    placing the flag in ti->status.

    Replace those optimized implementations with an equally good common
    implementation that stores it in a bitfield in struct task_struct and
    drop the custom implementations.

    Additional architectures can opt in by removing their
    TIF_RESTORE_SIGMASK defines.
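    A sketch of the common implementation (bitfield plus helper):

        /* in struct task_struct, for arches without TIF_RESTORE_SIGMASK: */
        unsigned restore_sigmask:1;

        static inline void set_restore_sigmask(void)
        {
                current->restore_sigmask = true;
                WARN_ON(!test_thread_flag(TIF_SIGPENDING));
        }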

    Link: http://lkml.kernel.org/r/8a14321d64a28e40adfddc90e18a96c086a6d6f9.1468522723.git.luto@kernel.org
    Signed-off-by: Andy Lutomirski
    Tested-by: Michael Ellerman [powerpc]
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Michal Simek
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Yoshinori Sato
    Cc: Rich Felker
    Cc: "David S. Miller"
    Cc: Chris Metcalf
    Cc: Peter Zijlstra
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dmitry Safonov
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Lutomirski

27 Jul, 2016

2 commits

  • This is the start of porting PAX_USERCOPY into the mainline kernel. This
    is the first set of features, controlled by CONFIG_HARDENED_USERCOPY. The
    work is based on code by PaX Team and Brad Spengler, and an earlier port
    from Casey Schaufler. Additional non-slab page tests are from Rik van Riel.

    This patch contains the logic for validating several conditions when
    performing copy_to_user() and copy_from_user() on the kernel object
    being copied to/from:
    - address range doesn't wrap around
    - address range isn't NULL or zero-allocated (with a non-zero copy size)
    - if on the slab allocator:
      - copy size must be less than or equal to object size (when the
        check is implemented in the allocator, which appears in
        subsequent patches)
    - otherwise, object must not span page allocations (excepting Reserved
      and CMA ranges)
    - if on the stack:
      - object must not extend before/after the current process stack
      - object must be contained by a valid stack frame (when there is
        arch/build support for identifying stack frames)
    - object must not overlap with kernel text
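    A simplified sketch of the check order; the real helpers in
    mm/usercopy.c report a failure string rather than returning void:

        void __check_object_size(const void *ptr, unsigned long n, bool to_user)
        {
                check_bogus_address(ptr, n);          /* wrap-around, NULL */
                check_heap_object(ptr, n, to_user);   /* slab object bounds */
                check_stack_object(ptr, n);           /* stack/frame containment */
                check_kernel_text_object(ptr, n);     /* no overlap with kernel text */
        }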

    Signed-off-by: Kees Cook
    Tested-by: Valdis Kletnieks
    Tested-by: Michael Ellerman

    Kees Cook
  • This creates a per-architecture function, arch_within_stack_frames(),
    that should validate whether a given object is contained by a kernel
    stack frame. The initial implementation is for x86.

    This is based on code from PaX.
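    The generic fallback for architectures without frame walking (sketch):

        static inline int arch_within_stack_frames(const void * const stack,
                                                   const void * const stackend,
                                                   const void *obj,
                                                   unsigned long len)
        {
                return 0;       /* cannot check; caller falls back to bounds */
        }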

    Signed-off-by: Kees Cook

    Kees Cook

15 Jan, 2016

1 commit

  • Mark those kmem allocations that are known to be easily triggered from
    userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
    memcg. For the list, see below:

    - threadinfo
    - task_struct
    - task_delay_info
    - pid
    - cred
    - mm_struct
    - vm_area_struct and vm_region (nommu)
    - anon_vma and anon_vma_chain
    - signal_struct
    - sighand_struct
    - fs_struct
    - files_struct
    - fdtable and fdtable->full_fds_bits
    - dentry and external_name
    - inode for all filesystems. This is the most tedious part, because
    most filesystems override the alloc_inode method.

    The list is far from complete, so feel free to add more objects.
    Nevertheless, it should be close to "account everything" approach and
    keep most workloads within bounds. Malevolent users will be able to
    breach the limit, but this was possible even with the former "account
    everything" approach (simply because it did not account everything in
    fact).
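    The pattern the series applies, using the task_struct cache as an
    example (cache parameters are illustrative):

        task_struct_cachep =
                kmem_cache_create("task_struct", arch_task_struct_size,
                                  ARCH_MIN_TASKALIGN,
                                  SLAB_PANIC | SLAB_ACCOUNT, NULL);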

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Vladimir Davydov
    Acked-by: Johannes Weiner
    Acked-by: Michal Hocko
    Cc: Tejun Heo
    Cc: Greg Thelen
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov

05 Jun, 2014

1 commit

  • Currently to allocate a page that should be charged to kmemcg (e.g.
    threadinfo), we pass __GFP_KMEMCG flag to the page allocator. The page
    allocated is then to be freed by free_memcg_kmem_pages. Apart from
    looking asymmetrical, this also requires intrusion to the general
    allocation path. So let's introduce separate functions that will
    alloc/free pages charged to kmemcg.

    The new functions are called alloc_kmem_pages and free_kmem_pages. They
    should be used when the caller actually would like to use kmalloc, but
    has to fall back to the page allocator because the allocation is large.
    They only differ from alloc_pages and free_pages in that, besides
    allocating or freeing pages, they also charge them to the kmem resource
    counter of the current memory cgroup.
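    The new pair, in prototype form (sketch):

        struct page *alloc_kmem_pages(gfp_t gfp_mask, unsigned int order);
        void free_kmem_pages(unsigned long addr, unsigned int order);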

    [sfr@canb.auug.org.au: export kmalloc_order() to modules]
    Signed-off-by: Vladimir Davydov
    Acked-by: Greg Thelen
    Cc: Johannes Weiner
    Acked-by: Michal Hocko
    Cc: Glauber Costa
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov

18 Apr, 2014

1 commit


25 Sep, 2013

2 commits

  • Mike reported that commit 7d1a9417 ("x86: Use generic idle loop")
    regressed several workloads and caused excessive reschedule
    interrupts.

    The patch in question failed to notice that the x86 code had an
    inverted sense of the polling state versus the new generic code (x86:
    default polling, generic: default !polling).

    Fix the two prominent x86 mwait based idle drivers and introduce a few
    new generic polling helpers (fixing the wrong smp_mb__after_clear_bit
    usage).

    Also switch the idle routines to using tif_need_resched() which is an
    immediate TIF_NEED_RESCHED test as opposed to need_resched which will
    end up being slightly different.
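    tif_need_resched() is a direct test of the flag, as defined in
    <linux/thread_info.h>:

        #define tif_need_resched() test_thread_flag(TIF_NEED_RESCHED)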

    Reported-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    Cc: lenb@kernel.org
    Cc: tglx@linutronix.de
    Link: http://lkml.kernel.org/n/tip-nc03imb0etuefmzybzj7sprf@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
  • Preemption semantics are going to change, which mandates this change.

    All DRM usage sites are already broken and will not be affected (much)
    by this change. DRM people are aware and will remove the last few
    stragglers.

    For now, leave an empty stub that generates a warning; once all users
    are gone we can remove it.

    Signed-off-by: Peter Zijlstra
    Cc: airlied@linux.ie
    Cc: daniel.vetter@ffwll.ch
    Cc: paulmck@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/n/tip-qfc1el2zvhxiyut4ai99ij4n@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra

19 Dec, 2012

1 commit

  • Because these architectures draw their stacks directly from the page
    allocator, rather than the slab cache, we can pass the __GFP_KMEMCG
    flag directly and issue the corresponding free_pages.

    This code path is taken when the architecture doesn't define
    CONFIG_ARCH_THREAD_INFO_ALLOCATOR (only ia64 seems to), and has
    THREAD_SIZE >= PAGE_SIZE. Luckily, most - if not all - of the remaining
    architectures fall in this category.

    This will guarantee that every stack page is accounted to the memcg the
    process currently lives in, and that the allocations will fail if the
    process goes over its limit.

    For the time being, I am defining a new variant of THREADINFO_GFP, not to
    mess with the other path. Once the slab is also tracked by memcg, we can
    get rid of that flag.

    Tested to successfully protect against :(){ :|:& };:
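    A sketch of the new variant and an illustrative allocation site:

        #define THREADINFO_GFP_ACCOUNTED (THREADINFO_GFP | __GFP_KMEMCG)

        static struct thread_info *alloc_thread_info_node(struct task_struct *tsk,
                                                          int node)
        {
                struct page *page = alloc_pages_node(node, THREADINFO_GFP_ACCOUNTED,
                                                     THREAD_SIZE_ORDER);

                return page ? page_address(page) : NULL;
        }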

    Signed-off-by: Glauber Costa
    Acked-by: Frederic Weisbecker
    Acked-by: Kamezawa Hiroyuki
    Reviewed-by: Michal Hocko
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Greg Thelen
    Cc: Johannes Weiner
    Cc: JoonSoo Kim
    Cc: Mel Gorman
    Cc: Pekka Enberg
    Cc: Rik van Riel
    Cc: Suleiman Souhlal
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Glauber Costa

02 Jun, 2012

3 commits


08 May, 2012

1 commit

  • These flags can be useful for extra allocations outside of the core
    code.

    Add __GFP_NOTRACK to them, so the archs which have kmemcheck do
    not have to provide extra allocators just for that reason.
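    The define in question, as it stood in <linux/thread_info.h> after
    this change:

        #define THREADINFO_GFP (GFP_KERNEL | __GFP_NOTRACK)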

    Signed-off-by: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20120505150141.428211694@linutronix.de

    Thomas Gleixner

23 May, 2011

1 commit


02 Feb, 2011

1 commit


18 Sep, 2010

1 commit


06 Apr, 2009

1 commit

  • PI Futexes and their underlying rt_mutex cannot be left ownerless if
    there are pending waiters as this will break the PI boosting logic, so
    the standard requeue commands aren't sufficient. The new commands
    properly manage pi futex ownership by ensuring a futex with waiters
    has an owner at all times. This will allow glibc to properly handle
    pi mutexes with pthread_condvars.

    The approach taken here is to create two new futex op codes:

    FUTEX_WAIT_REQUEUE_PI:
    Tasks will use this op code to wait on a futex (such as a non-pi waitqueue)
    and wake after they have been requeued to a pi futex. Prior to returning to
    userspace, they will acquire this pi futex (and the underlying rt_mutex).

    futex_wait_requeue_pi() is the result of a high speed collision between
    futex_wait() and futex_lock_pi() (with the first part of futex_lock_pi() being
    done by futex_proxy_trylock_atomic() on behalf of the top_waiter).

    FUTEX_REQUEUE_PI (and FUTEX_CMP_REQUEUE_PI):
    This call must be used to wake tasks waiting with FUTEX_WAIT_REQUEUE_PI,
    regardless of how many tasks the caller intends to wake or requeue.
    pthread_cond_broadcast() should call this with nr_wake=1 and
    nr_requeue=INT_MAX; pthread_cond_signal() should call this with
    nr_wake=1 and nr_requeue=0. The reason is that both callers need the
    benefit of the futex_proxy_trylock_atomic() routine. futex_requeue()
    also enqueues the top_waiter on the rt_mutex via
    rt_mutex_start_proxy_lock().
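    A usage sketch in raw syscall form (glibc would wrap this; cond_val
    is the expected value of the condvar futex, and nr_requeue rides in
    the timeout argument slot):

        #include <linux/futex.h>
        #include <sys/syscall.h>
        #include <limits.h>
        #include <unistd.h>

        /* pthread_cond_broadcast(): wake one waiter, requeue the rest
         * onto the PI mutex; pthread_cond_signal() would pass 0 instead
         * of INT_MAX. */
        static long cond_broadcast_pi(unsigned int *cond, unsigned int *mutex,
                                      unsigned int cond_val)
        {
                return syscall(SYS_futex, cond, FUTEX_CMP_REQUEUE_PI,
                               1, (unsigned long)INT_MAX, mutex, cond_val);
        }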

    Signed-off-by: Darren Hart
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Thomas Gleixner

    Darren Hart

06 Sep, 2008

1 commit

  • With hrtimer-based poll/select, the signal restart data is no longer a
    single long representing a jiffies count; it becomes a
    second/nanosecond pair that also needs to encode whether there was a
    timeout at all.

    This patch adds a struct to the restart_block union for this purpose.
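    The member added to the restart_block union (sketch):

        struct {
                struct pollfd __user *ufds;
                int has_timeout;
                int nfds;
                unsigned long tv_sec;
                unsigned long tv_nsec;
        } poll;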

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Arjan van de Ven

    Thomas Gleixner

30 Apr, 2008

3 commits

  • Change all the #ifdef TIF_RESTORE_SIGMASK conditionals in non-arch code to
    #ifdef HAVE_SET_RESTORE_SIGMASK. If arch code defines it first, the generic
    set_restore_sigmask() using TIF_RESTORE_SIGMASK is not defined.

    Signed-off-by: Roland McGrath
    Cc: Oleg Nesterov
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland McGrath
  • Set TIF_SIGPENDING in set_restore_sigmask. This lets arch code take
    TIF_RESTORE_SIGMASK out of the set of bits that will be noticed on return to
    user mode. On some machines those bits are scarce, and we can free this
    unneeded one up for other uses.

    It is probably the case that TIF_SIGPENDING is always set anyway everywhere
    set_restore_sigmask() is used. But this is some cheap paranoia in case there
    is an arcane case where it might not be.
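    The generic helper then becomes (sketch):

        static inline void set_restore_sigmask(void)
        {
                set_thread_flag(TIF_RESTORE_SIGMASK);
                set_thread_flag(TIF_SIGPENDING);
        }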

    Signed-off-by: Roland McGrath
    Cc: Oleg Nesterov
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland McGrath
  • This adds the set_restore_sigmask() inline in <linux/thread_info.h> and
    replaces every set_thread_flag(TIF_RESTORE_SIGMASK) with a call to it.
    No functional change, but it abstracts the details of the flag protocol
    from all the calls.

    Signed-off-by: Roland McGrath
    Cc: Oleg Nesterov
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland McGrath

17 Apr, 2008

1 commit


02 Feb, 2008

1 commit

  • To allow the implementation of optimized rw-locks in user space, glibc
    needs a possibility to select waiters for wakeup depending on a bitset
    mask.

    This requires two new futex OPs: FUTEX_WAIT_BITSET and FUTEX_WAKE_BITSET.
    These OPs are basically the same as FUTEX_WAIT and FUTEX_WAKE, plus an
    additional argument - a bitset. Further, the FUTEX_WAIT_BITSET OP
    expects an absolute timeout value instead of the relative one used
    for the FUTEX_WAIT OP.

    FUTEX_WAIT_BITSET calls into the kernel with a bitset. The bitset is
    stored in the futex_q structure, which is used to enqueue the waiter
    into the hashed futex waitqueue.

    FUTEX_WAKE_BITSET also calls into the kernel with a bitset. The wakeup
    function logically ANDs that bitset with the bitset stored in each
    waiter's futex_q structure. If the result is zero (i.e. none of the
    set bits match), the waiter is not woken up. If the result is non-zero
    (i.e. one of the set bits matches), the waiter is woken.

    The bitset provided by the caller must be non-zero. In case the
    provided bitset is zero the kernel returns EINVAL.

    Internally the new OPs are only extensions of the existing FUTEX_WAIT
    and FUTEX_WAKE functions. The existing OPs hand a bitset with all bits
    set into the futex_wait() and futex_wake() functions.
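    A usage sketch in raw syscall form (the 0x1 bitset is arbitrary):

        #include <linux/futex.h>
        #include <sys/syscall.h>
        #include <limits.h>
        #include <time.h>
        #include <unistd.h>

        /* Wait while *uaddr == val; abs_ts is an absolute timeout,
         * unlike plain FUTEX_WAIT's relative one. */
        static long wait_bitset(unsigned int *uaddr, unsigned int val,
                                const struct timespec *abs_ts)
        {
                return syscall(SYS_futex, uaddr, FUTEX_WAIT_BITSET, val,
                               abs_ts, NULL, 0x1);
        }

        /* Wake up to INT_MAX waiters whose stored bitset intersects 0x1. */
        static long wake_bitset(unsigned int *uaddr)
        {
                return syscall(SYS_futex, uaddr, FUTEX_WAKE_BITSET, INT_MAX,
                               NULL, NULL, 0x1);
        }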

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Thomas Gleixner

30 Jan, 2008

1 commit