19 Feb, 2016

2 commits

  • As discussed earlier, we attempt to enforce protection keys in
    software.

    However, the code checks all faults to ensure that they do not
    violate protection key permissions. It assumes that every fault
    is either a write fault, where we check PKRU[key].WD (write
    disable), or a read fault, where we check the AD (access
    disable) bit.

    But there is a third category of faults for protection keys:
    instruction faults. Instruction faults never run afoul of
    protection keys because protection keys do not affect
    instruction fetches.

    So, plumb the PF_INSTR bit down into the
    arch_vma_access_permitted() function where we do the protection
    key checks.

    We also add a new FAULT_FLAG_INSTRUCTION. This is because
    handle_mm_fault() is not passed the architecture-specific
    error_code where we keep PF_INSTR, so we need to encode the
    instruction-fetch information into the arch-generic fault
    flags.
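
    As a rough, compilable sketch of that flow (the flag values and
    helper names below are illustrative placeholders, not the
    kernel's exact definitions): the x86 fault handler translates
    its PF_INSTR error-code bit into the generic
    FAULT_FLAG_INSTRUCTION, and the permission check returns
    "allowed" for instruction fetches before it ever consults PKRU.

        /* Simplified model of the described flow; not the kernel source. */
        #include <stdbool.h>
        #include <stdint.h>

        #define PF_INSTR               (1u << 4) /* x86 error-code bit (illustrative) */
        #define FAULT_FLAG_INSTRUCTION (1u << 8) /* new generic flag (illustrative)   */

        /* handle_mm_fault() never sees the x86 error_code, so the arch
         * fault handler encodes PF_INSTR into the generic flags first. */
        static unsigned int fault_flags_from_error_code(unsigned long error_code)
        {
                unsigned int flags = 0;

                if (error_code & PF_INSTR)
                        flags |= FAULT_FLAG_INSTRUCTION;
                return flags;
        }

        /* Protection keys never apply to instruction fetches, so the
         * PKRU check is skipped entirely for instruction faults. */
        static bool access_permitted(uint32_t pkru, int pkey,
                                     bool write, unsigned int fault_flags)
        {
                if (fault_flags & FAULT_FLAG_INSTRUCTION)
                        return true;                    /* ifetch: no pkey check */
                if (pkru & (1u << (2 * pkey)))          /* AD: access disable    */
                        return false;
                if (write && (pkru & (1u << (2 * pkey + 1)))) /* WD: write disable */
                        return false;
                return true;
        }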

    Signed-off-by: Dave Hansen
    Reviewed-by: Thomas Gleixner
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/20160212210224.96928009@viggo.jf.intel.com
    Signed-off-by: Ingo Molnar

    Dave Hansen
     
  • We try to enforce protection keys in software the same way that we
    do in hardware. (See long example below).

    But, we only want to do this when accessing our *own* process's
    memory. If GDB sets PKRU[6].AD=1 (disabling access to pkey 6) and
    then tries to PTRACE_POKE a target process which just happens to
    have some mprotect_pkey(pkey=6) memory, we do *not* want to deny
    the debugger access to that memory. PKRU is fundamentally a
    thread-local structure and we do not want to enforce it on access
    to _another_ thread's data.

    This gets especially tricky with workqueues or other delayed-work
    mechanisms that might run in a random process's context. We could
    check that we only enforce pkeys when operating on our *own* mm,
    but delayed work is performed while some arbitrary user context
    happens to be active. We might then end up with a delayed-work
    gup that fails when it happens to run under its "own" task but
    succeeds when running under another process.

    To avoid that, we use a new GUP flag, FOLL_REMOTE, and add a new
    fault flag, FAULT_FLAG_REMOTE. They indicate that we are walking
    an mm which is not guaranteed to be the same as current->mm and
    should not be subject to protection key enforcement.
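
    A minimal sketch of that intent (the flag values and helpers are
    illustrative placeholders, not the kernel's definitions): a
    remote GUP walk propagates FOLL_REMOTE into the generic fault
    flags, and the enforcement point bypasses the thread-local PKRU
    whenever the mm being walked may not be current->mm.

        /* Simplified, self-contained model; not the kernel source. */
        #include <stdbool.h>

        #define FOLL_REMOTE       (1u << 10)  /* illustrative bit values */
        #define FAULT_FLAG_REMOTE (1u << 9)

        /* Stand-in for the thread-local PKRU AD/WD check. */
        static bool pkru_allows(int pkey, bool write)
        {
                (void)pkey; (void)write;
                return true;
        }

        /* A GUP walk of a possibly-foreign mm carries FOLL_REMOTE and
         * turns it into the generic FAULT_FLAG_REMOTE for the fault path. */
        static unsigned int gup_to_fault_flags(unsigned int gup_flags)
        {
                unsigned int fault_flags = 0;

                if (gup_flags & FOLL_REMOTE)
                        fault_flags |= FAULT_FLAG_REMOTE;
                return fault_flags;
        }

        /* PKRU is per-thread state, so it is only enforced when the mm
         * being accessed is guaranteed to be our own. */
        static bool access_permitted(int vma_pkey, bool write,
                                     unsigned int fault_flags)
        {
                if (fault_flags & FAULT_FLAG_REMOTE)
                        return true;
                return pkru_allows(vma_pkey, write);
        }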

    Thanks to Jerome Glisse for pointing out this scenario.

    Signed-off-by: Dave Hansen
    Reviewed-by: Thomas Gleixner
    Cc: Alexey Kardashevskiy
    Cc: Andrea Arcangeli
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Arnd Bergmann
    Cc: Benjamin Herrenschmidt
    Cc: Boaz Harrosh
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dan Williams
    Cc: Dave Chinner
    Cc: Dave Hansen
    Cc: David Gibson
    Cc: Denys Vlasenko
    Cc: Dominik Dingel
    Cc: Dominik Vogt
    Cc: Eric B Munson
    Cc: Geliang Tang
    Cc: Guan Xuetao
    Cc: H. Peter Anvin
    Cc: Heiko Carstens
    Cc: Hugh Dickins
    Cc: Jan Kara
    Cc: Jason Low
    Cc: Jerome Marchand
    Cc: Joerg Roedel
    Cc: Kirill A. Shutemov
    Cc: Konstantin Khlebnikov
    Cc: Laurent Dufour
    Cc: Linus Torvalds
    Cc: Martin Schwidefsky
    Cc: Matthew Wilcox
    Cc: Mel Gorman
    Cc: Michael Ellerman
    Cc: Michal Hocko
    Cc: Mikulas Patocka
    Cc: Minchan Kim
    Cc: Oleg Nesterov
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Sasha Levin
    Cc: Shachar Raindel
    Cc: Vlastimil Babka
    Cc: Xie XiuQi
    Cc: iommu@lists.linux-foundation.org
    Cc: linux-arch@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-mm@kvack.org
    Cc: linux-s390@vger.kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Signed-off-by: Ingo Molnar

    Dave Hansen
     

18 Feb, 2016

1 commit

  • Today, for normal faults and page table walks, we check the VMA
    and/or PTE to ensure that it is compatible with the action. For
    instance, if we get a write fault on a non-writeable VMA, we
    SIGSEGV.

    We try to do the same thing for protection keys. Basically, we
    try to make sure that if a user does this:

    mprotect(ptr, size, PROT_NONE);
    *ptr = foo;

    they see the same effects with protection keys when they do this:

    mprotect(ptr, size, PROT_READ|PROT_WRITE);
    set_pkey(ptr, size, 4);
    wrpkru(0xffffff3f); // access disable pkey 4
    *ptr = foo;

    The state to do that checking is in the VMA, but we also
    sometimes have to do it on the page tables only, like when doing
    a get_user_pages_fast() where we have no VMA.

    We add two functions and expose them to generic code:

    arch_pte_access_permitted(pte_flags, write)
    arch_vma_access_permitted(vma, write)

    These are, of course, backed up in x86 arch code with checks
    against the PTE or VMA's protection key.

    But, there are also cases where we do not want to respect
    protection keys. When we ptrace(), for instance, we do not want
    to apply the tracer's PKRU permissions to the PTEs from the
    process being traced.
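
    A rough model of how the two helpers can be backed on x86 (the
    bit position, struct fields and the stubbed PKRU check below are
    simplified placeholders, not the exact kernel definitions): the
    pkey comes either from the PTE's protection-key bits, for the
    VMA-less get_user_pages_fast() path, or from the VMA itself.

        /* Simplified model of the two entry points; not the kernel source. */
        #include <stdbool.h>
        #include <stdint.h>

        #define PTE_PKEY_SHIFT 59       /* x86-64 keeps the pkey in high PTE bits */
        #define PTE_PKEY_MASK  0xfULL

        struct vma_model {
                int pkey;               /* protection key attached to the mapping */
        };

        /* Stand-in for the thread-local PKRU AD/WD check. */
        static bool pkru_allows(int pkey, bool write)
        {
                (void)pkey; (void)write;
                return true;
        }

        /* gup_fast has no VMA, only the PTE, so the pkey is read
         * straight out of the PTE bits. */
        static bool arch_pte_access_permitted(uint64_t pte_flags, bool write)
        {
                int pkey = (pte_flags >> PTE_PKEY_SHIFT) & PTE_PKEY_MASK;

                return pkru_allows(pkey, write);
        }

        /* Normal faults have the VMA, so the pkey is taken from there. */
        static bool arch_vma_access_permitted(const struct vma_model *vma,
                                              bool write)
        {
                return pkru_allows(vma->pkey, write);
        }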

    Signed-off-by: Dave Hansen
    Reviewed-by: Thomas Gleixner
    Cc: Alexey Kardashevskiy
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Aneesh Kumar K.V
    Cc: Arnd Bergmann
    Cc: Benjamin Herrenschmidt
    Cc: Boaz Harrosh
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: David Gibson
    Cc: David Hildenbrand
    Cc: David Vrabel
    Cc: Denys Vlasenko
    Cc: Dominik Dingel
    Cc: Dominik Vogt
    Cc: Guan Xuetao
    Cc: H. Peter Anvin
    Cc: Heiko Carstens
    Cc: Hugh Dickins
    Cc: Jason Low
    Cc: Jerome Marchand
    Cc: Juergen Gross
    Cc: Kirill A. Shutemov
    Cc: Laurent Dufour
    Cc: Linus Torvalds
    Cc: Martin Schwidefsky
    Cc: Matthew Wilcox
    Cc: Mel Gorman
    Cc: Michael Ellerman
    Cc: Michal Hocko
    Cc: Mikulas Patocka
    Cc: Minchan Kim
    Cc: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Sasha Levin
    Cc: Shachar Raindel
    Cc: Stephen Smalley
    Cc: Toshi Kani
    Cc: Vlastimil Babka
    Cc: linux-arch@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-mm@kvack.org
    Cc: linux-s390@vger.kernel.org
    Cc: linuxppc-dev@lists.ozlabs.org
    Link: http://lkml.kernel.org/r/20160212210219.14D5D715@viggo.jf.intel.com
    Signed-off-by: Ingo Molnar

    Dave Hansen
     

19 Nov, 2014

1 commit

  • The x86 MPX patch set calls arch_unmap() and arch_bprm_mm_init()
    from fs/exec.c, so we need at least a stub for them in all
    architectures. They are only called under an #ifdef for
    CONFIG_MMU=y, so we can at least restrict this to architectures
    with MMU support.

    blackfin/c6x have no MMU support, so do not call arch_unmap().
    They also do not include mm_hooks.h or mmu_context.h at all and
    do not need to be touched.

    s390, um and unicore32 do not use asm-generic/mm_hooks.h, so they
    get their own arch_unmap() versions. (I also moved um's
    arch_dup_mmap() to be closer to the other mm_hooks.h functions.)

    xtensa only includes mm_hooks when MMU=y, which should be fine
    since arch_unmap() is called only from MMU=y code.

    For the rest, we use the stub copies of these functions in
    asm-generic/mm_hooks.h.
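
    For reference, the stubs are roughly of this shape (a sketch
    only, with the parameter lists given from memory; see
    include/asm-generic/mm_hooks.h for the authoritative versions):
    empty static inlines, so architectures that do not care pay
    nothing.

        /* Sketch of the no-op default hooks; check the tree for the
         * exact signatures. */
        struct mm_struct;
        struct vm_area_struct;

        static inline void arch_unmap(struct mm_struct *mm,
                                      struct vm_area_struct *vma,
                                      unsigned long start, unsigned long end)
        {
                /* default: nothing to do; x86 MPX provides a real body */
        }

        static inline void arch_bprm_mm_init(struct mm_struct *mm,
                                             struct vm_area_struct *vma)
        {
                /* default: nothing to do */
        }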

    I cross compiled defconfigs for cris (to check NOMMU) and s390
    to make sure that this works. I also checked a 64-bit build
    of UML and all my normal x86 builds including PARAVIRT on and
    off.

    Signed-off-by: Dave Hansen
    Cc: Dave Hansen
    Cc: linux-arch@vger.kernel.org
    Cc: x86@kernel.org
    Link: http://lkml.kernel.org/r/20141118182350.8B4AA2C2@viggo.jf.intel.com
    Signed-off-by: Thomas Gleixner

    Dave Hansen
     

03 May, 2007

1 commit

  • Add hooks to allow a paravirt implementation to track the lifetime of
    an mm. Paravirtualization requires three hooks, but only two are
    needed in common code. They are:

    arch_dup_mmap, which is called when an mm is duplicated at fork.

    arch_exit_mmap, which is called when the last process reference to
    an mm is dropped, typically on exit and exec.

    The third hook is activate_mm, which is called from the arch-specific
    activate_mm() macro/function, and so doesn't need stub versions for
    other architectures. It's called when an mm is first used.
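
    The generic defaults for the two common-code hooks are empty
    inlines along these lines (shape only, not quoted from the
    tree); a paravirt implementation (e.g. Xen) supplies real bodies:

        /* Sketch of the default no-op mm lifetime hooks. */
        struct mm_struct;

        /* Called at fork when the parent's mm is duplicated, so a
         * hypervisor layer can start tracking the child's page tables. */
        static inline void arch_dup_mmap(struct mm_struct *oldmm,
                                         struct mm_struct *mm)
        {
        }

        /* Called when the last reference to an mm is dropped (typically
         * on exit or exec), so any shadow state can be torn down. */
        static inline void arch_exit_mmap(struct mm_struct *mm)
        {
        }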

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: linux-arch@vger.kernel.org
    Cc: James Bottomley
    Acked-by: Ingo Molnar

    Jeremy Fitzhardinge