30 Sep, 2015

2 commits

  • commit b1b4e435e4ef7de77f07bf2a42c8380b960c2d44 upstream.

    When detecting a serial port on newer PA-RISC machines (with iosapic) we have a
    long way to go to find the right IRQ line, registering it, then registering the
    serial port and the irq handler for the serial port. During this phase spurious
    interrupts for the serial port may happen which then crashes the kernel because
    the action handler might not have been set up yet.

    So, basically it's a race condition between the serial port hardware and the
    CPU which sets up the necessary fields in the irq sructs. The main reason for
    this race is, that we unmask the serial port irqs too early without having set
    up everything properly before (which isn't easily possible because we need the
    IRQ number to register the serial ports).

    This patch is a work-around for this problem. It adds checks to the CPU irq
    handler to verify if the IRQ action field has been initialized already. If not,
    we just skip this interrupt (which isn't critical for a serial port at bootup).
    The real fix would probably involve rewriting all PA-RISC specific IRQ code
    (for CPU, IOSAPIC, GSC and EISA) to use IRQ domains with proper parenting of
    the irq chips and proper irq enabling along this line.

    This bug has been in the PA-RISC port since the beginning, but the crashes
    happened very rarely with currently used hardware. But on the latest machine
    which I bought (a C8000 workstation), which uses the fastest CPUs (4 x PA8900,
    1GHz) and which has the largest possible L1 cache size (64MB each), the kernel
    crashed at every boot because of this race. So, without this patch the machine
    would currently be unuseable.

    For the record, here is the flow logic:
    1. serial_init_chip() in 8250_gsc.c calls iosapic_serial_irq().
    2. iosapic_serial_irq() calls txn_alloc_irq() to find the irq.
    3. iosapic_serial_irq() calls cpu_claim_irq() to register the CPU irq
    4. cpu_claim_irq() unmasks the CPU irq (which it shouldn't!)
    5. serial_init_chip() then registers the 8250 port.
    Problems:
    - In step 4 the CPU irq shouldn't have been registered yet, but after step 5
    - If serial irq happens between 4 and 5 have finished, the kernel will crash

    Signed-off-by: Helge Deller
    Signed-off-by: Greg Kroah-Hartman

    Helge Deller
     
  • commit 1b59ddfcf1678de38a1f8ca9fb8ea5eebeff1843 upstream.

    The attached change fixes the condition used in the "sub" instruction.
    A double word comparison is needed. This fixes the 64-bit LWS CAS
    operation on 64-bit kernels.

    I can now enable 64-bit atomic support in GCC.

    Signed-off-by: John David Anglin
    Signed-off-by: Helge Deller
    Signed-off-by: Greg Kroah-Hartman

    John David Anglin
     

11 Aug, 2015

2 commits

  • commit 4c4ac9a48ac512c6b5a6cca06cfad2ad96e8caaa upstream.

    Commit 0e0da48dee8d ("parisc: mm: don't count preallocated pmds")
    introduced a memory leak.

    After this commit, the 'return' statement in pmd_free is executed in all
    cases. Even for pmd that are not attached to the pgd. So 'free_pages'
    can never be called anymore, leading to a memory leak.

    Signed-off-by: Christophe JAILLET
    Acked-by: Kirill A. Shutemov
    Acked-by: Mikulas Patocka
    Acked-by: Helge Deller
    Signed-off-by: Helge Deller
    Signed-off-by: Greg Kroah-Hartman

    Christophe Jaillet
     
  • commit 01ab60570427caa24b9debc369e452e86cd9beb4 upstream.

    The increased use of pdtlb/pitlb instructions seemed to increase the
    frequency of random segmentation faults building packages. Further, we
    had a number of cases where TLB inserts would repeatedly fail and all
    forward progress would stop. The Haskell ghc package caused a lot of
    trouble in this area. The final indication of a race in pte handling was
    this syslog entry on sibaris (C8000):

    swap_free: Unused swap offset entry 00000004
    BUG: Bad page map in process mysqld pte:00000100 pmd:019bbec5
    addr:00000000ec464000 vm_flags:00100073 anon_vma:0000000221023828 mapping: (null) index:ec464
    CPU: 1 PID: 9176 Comm: mysqld Not tainted 4.0.0-2-parisc64-smp #1 Debian 4.0.5-1
    Backtrace:
    [] show_stack+0x20/0x38
    [] dump_stack+0x9c/0x110
    [] print_bad_pte+0x1a8/0x278
    [] unmap_single_vma+0x3d8/0x770
    [] zap_page_range+0xf0/0x198
    [] SyS_madvise+0x404/0x8c0

    Note that the pte value is 0 except for the accessed bit 0x100. This bit
    shouldn't be set without the present bit.

    It should be noted that the madvise system call is probably a trigger for many
    of the random segmentation faults.

    In looking at the kernel code, I found the following problems:

    1) The pte_clear define didn't take TLB lock when clearing a pte.
    2) We didn't test pte present bit inside lock in exception support.
    3) The pte and tlb locks needed to merged in order to ensure consistency
    between page table and TLB. This also has the effect of serializing TLB
    broadcasts on SMP systems.

    The attached change implements the above and a few other tweaks to try
    to improve performance. Based on the timing code, TLB purges are very
    slow (e.g., ~ 209 cycles per page on rp3440). Thus, I think it
    beneficial to test the split_tlb variable to avoid duplicate purges.
    Probably, all PA 2.0 machines have combined TLBs.

    I dropped using __flush_tlb_range in flush_tlb_mm as I realized all
    applications and most threads have a stack size that is too large to
    make this useful. I added some comments to this effect.

    Since implementing 1 through 3, I haven't had any random segmentation
    faults on mx3210 (rp3440) in about one week of building code and running
    as a Debian buildd.

    Signed-off-by: John David Anglin
    Signed-off-by: Helge Deller
    Signed-off-by: Greg Kroah-Hartman

    John David Anglin
     

13 May, 2015

1 commit

  • On architectures where the stack grows upwards (CONFIG_STACK_GROWSUP=y,
    currently parisc and metag only) stack randomization sometimes leads to crashes
    when the stack ulimit is set to lower values than STACK_RND_MASK (which is 8 MB
    by default if not defined in arch-specific headers).

    The problem is, that when the stack vm_area_struct is set up in fs/exec.c, the
    additional space needed for the stack randomization (as defined by the value of
    STACK_RND_MASK) was not taken into account yet and as such, when the stack
    randomization code added a random offset to the stack start, the stack
    effectively got smaller than what the user defined via rlimit_max(RLIMIT_STACK)
    which then sometimes leads to out-of-stack situations and crashes.

    This patch fixes it by adding the maximum possible amount of memory (based on
    STACK_RND_MASK) which theoretically could be added by the stack randomization
    code to the initial stack size. That way, the user-defined stack size is always
    guaranteed to be at minimum what is defined via rlimit_max(RLIMIT_STACK).

    This bug is currently not visible on the metag architecture, because on metag
    STACK_RND_MASK is defined to 0 which effectively disables stack randomization.

    The changes to fs/exec.c are inside an "#ifdef CONFIG_STACK_GROWSUP"
    section, so it does not affect other platformws beside those where the
    stack grows upwards (parisc and metag).

    Signed-off-by: Helge Deller
    Cc: linux-parisc@vger.kernel.org
    Cc: James Hogan
    Cc: linux-metag@vger.kernel.org
    Cc: stable@vger.kernel.org # v3.16+

    Helge Deller
     

24 Apr, 2015

1 commit


22 Apr, 2015

2 commits

  • The following warning is seen when compiling parisc images

    ./arch/parisc/include/asm/pgalloc.h: In function 'pgd_alloc':
    ./arch/parisc/include/asm/pgalloc.h:29:5: warning: "PT_NLEVELS" is not defined

    Some definitions of PT_NLEVELS were missed with the conversion to
    CONFIG_PGTABLE_LEVELS.

    Fixes: f24ffde43237 ("parisc: expose number of page table levels
    on Kconfig level")
    Cc: Kirill A. Shutemov
    Signed-off-by: Guenter Roeck
    Acked-by: Kirill A. Shutemov
    Signed-off-by: Helge Deller

    Guenter Roeck
     
  • The only reason to keep parisc's private asm/scatterlist.h was that it
    had the macro sg_virt_addr(). Convert all callers to use something else
    (sometimes just sg->offset was enough, others should use sg_virt()), and
    we can just use the asm-generic scatterlist.h instead.

    Signed-off-by: Matthew Wilcox
    Signed-off-by: Dave Anglin
    Signed-off-by: Helge Deller

    Matthew Wilcox
     

21 Apr, 2015

1 commit

  • Pull final removal of deprecated cpus_* cpumask functions from Rusty Russell:
    "This is the final removal (after several years!) of the obsolete
    cpus_* functions, prompted by their mis-use in staging.

    With these function removed, all cpu functions should only iterate to
    nr_cpu_ids, so we finally only allocate that many bits when cpumasks
    are allocated offstack"

    * tag 'cpumask-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (25 commits)
    cpumask: remove __first_cpu / __next_cpu
    cpumask: resurrect CPU_MASK_CPU0
    linux/cpumask.h: add typechecking to cpumask_test_cpu
    cpumask: only allocate nr_cpumask_bits.
    Fix weird uses of num_online_cpus().
    cpumask: remove deprecated functions.
    mips: fix obsolete cpumask_of_cpu usage.
    x86: fix more deprecated cpu function usage.
    ia64: remove deprecated cpus_ usage.
    powerpc: fix deprecated CPU_MASK_CPU0 usage.
    CPU_MASK_ALL/CPU_MASK_NONE: remove from deprecated region.
    staging/lustre/o2iblnd: Don't use cpus_weight
    staging/lustre/libcfs: replace deprecated cpus_ calls with cpumask_
    staging/lustre/ptlrpc: Do not use deprecated cpus_* functions
    blackfin: fix up obsolete cpu function usage.
    parisc: fix up obsolete cpu function usage.
    tile: fix up obsolete cpu function usage.
    arm64: fix up obsolete cpu function usage.
    mips: fix up obsolete cpu function usage.
    x86: fix up obsolete cpu function usage.
    ...

    Linus Torvalds
     

17 Apr, 2015

1 commit


16 Apr, 2015

1 commit

  • Pull exec domain removal from Richard Weinberger:
    "This series removes execution domain support from Linux.

    The idea behind exec domains was to support different ABIs. The
    feature was never complete nor stable. Let's rip it out and make the
    kernel signal handling code less complicated"

    * 'exec_domain_rip_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc: (27 commits)
    arm64: Removed unused variable
    sparc: Fix execution domain removal
    Remove rest of exec domains.
    arch: Remove exec_domain from remaining archs
    arc: Remove signal translation and exec_domain
    xtensa: Remove signal translation and exec_domain
    xtensa: Autogenerate offsets in struct thread_info
    x86: Remove signal translation and exec_domain
    unicore32: Remove signal translation and exec_domain
    um: Remove signal translation and exec_domain
    tile: Remove signal translation and exec_domain
    sparc: Remove signal translation and exec_domain
    sh: Remove signal translation and exec_domain
    s390: Remove signal translation and exec_domain
    mn10300: Remove signal translation and exec_domain
    microblaze: Remove signal translation and exec_domain
    m68k: Remove signal translation and exec_domain
    m32r: Remove signal translation and exec_domain
    m32r: Autogenerate offsets in struct thread_info
    frv: Remove signal translation and exec_domain
    ...

    Linus Torvalds
     

15 Apr, 2015

1 commit


13 Apr, 2015

1 commit


23 Mar, 2015

3 commits

  • Make the code which sets up the pmd depend on PT_NLEVELS == 3, not on
    CONFIG_64BIT. The reason is, that a 64bit kernel with a page size
    greater than 4k doesn't need the pmd and thus has PT_NLEVELS = 2.

    Signed-off-by: Helge Deller

    Helge Deller
     
  • The patch dc6c9a35b66b520cf67e05d8ca60ebecad3b0479 that counts pmds
    allocated for a process introduced a bug on 64-bit PA-RISC kernels.

    The PA-RISC architecture preallocates one pmd with each pgd. This
    preallocated pmd can never be freed - pmd_free does nothing when it is
    called with this pmd. When the kernel attempts to free this preallocated
    pmd, it decreases the count of allocated pmds. The result is that the
    counter underflows and this error is reported.

    This patch fixes the bug by artifically incrementing the counter in
    pmd_free when the kernel tries to free the preallocated pmd.

    Signed-off-by: Mikulas Patocka
    Acked-by: Kirill A. Shutemov
    Signed-off-by: Helge Deller

    Mikulas Patocka
     
  • Signed-off-by: Helge Deller

    Helge Deller
     

05 Mar, 2015

1 commit


01 Mar, 2015

1 commit

  • Core mm expects __PAGETABLE_{PUD,PMD}_FOLDED to be defined if these page
    table levels folded. Usually, these defines are provided by
    and .

    But some architectures fold page table levels in a custom way. They
    need to define these macros themself. This patch adds missing defines.

    The patch fixes mm->nr_pmds underflow and eliminates dead __pmd_alloc()
    and __pud_alloc() on architectures without these page table levels.

    Signed-off-by: Kirill A. Shutemov
    Cc: Aaro Koskinen
    Cc: David Howells
    Cc: Geert Uytterhoeven
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: "James E.J. Bottomley"
    Cc: Koichi Yasutake
    Cc: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

20 Feb, 2015

1 commit

  • Pull kbuild updates from Michal Marek:

    - several cleanups in kbuild

    - serialize multiple *config targets so that 'make defconfig kvmconfig'
    works

    - The cc-ifversion macro got support for an else-branch

    * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
    kbuild,gcov: simplify kernel/gcov/Makefile more
    kbuild: allow cc-ifversion to have the argument for false condition
    kbuild,gcov: simplify kernel/gcov/Makefile
    kbuild,gcov: remove unnecessary workaround
    kbuild: do not add $(call ...) to invoke cc-version or cc-fullversion
    kbuild: fix cc-ifversion macro
    kbuild: drop $(version_h) from MRPROPER_FILES
    kbuild: use mixed-targets when two or more config targets are given
    kbuild: remove redundant line from bounds.h/asm-offsets.h
    kbuild: merge bounds.h and asm-offsets.h rules
    kbuild: Drop support for clean-rule

    Linus Torvalds
     

17 Feb, 2015

10 commits


14 Feb, 2015

1 commit

  • For instrumenting global variables KASan will shadow memory backing memory
    for modules. So on module loading we will need to allocate memory for
    shadow and map it at address in shadow that corresponds to the address
    allocated in module_alloc().

    __vmalloc_node_range() could be used for this purpose, except it puts a
    guard hole after allocated area. Guard hole in shadow memory should be a
    problem because at some future point we might need to have a shadow memory
    at address occupied by guard hole. So we could fail to allocate shadow
    for module_alloc().

    Now we have VM_NO_GUARD flag disabling guard page, so we need to pass into
    __vmalloc_node_range(). Add new parameter 'vm_flags' to
    __vmalloc_node_range() function.

    Signed-off-by: Andrey Ryabinin
    Cc: Dmitry Vyukov
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrey Konovalov
    Cc: Yuri Gribov
    Cc: Konstantin Khlebnikov
    Cc: Sasha Levin
    Cc: Christoph Lameter
    Cc: Joonsoo Kim
    Cc: Dave Hansen
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Ryabinin
     

13 Feb, 2015

1 commit

  • If an attacker can cause a controlled kernel stack overflow, overwriting
    the restart block is a very juicy exploit target. This is because the
    restart_block is held in the same memory allocation as the kernel stack.

    Moving the restart block to struct task_struct prevents this exploit by
    making the restart_block harder to locate.

    Note that there are other fields in thread_info that are also easy
    targets, at least on some architectures.

    It's also a decent simplification, since the restart code is more or less
    identical on all architectures.

    [james.hogan@imgtec.com: metag: align thread_info::supervisor_stack]
    Signed-off-by: Andy Lutomirski
    Cc: Thomas Gleixner
    Cc: Al Viro
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Kees Cook
    Cc: David Miller
    Acked-by: Richard Weinberger
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Vineet Gupta
    Cc: Russell King
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Haavard Skinnemoen
    Cc: Hans-Christian Egtvedt
    Cc: Steven Miao
    Cc: Mark Salter
    Cc: Aurelien Jacquiot
    Cc: Mikael Starvik
    Cc: Jesper Nilsson
    Cc: David Howells
    Cc: Richard Kuo
    Cc: "Luck, Tony"
    Cc: Geert Uytterhoeven
    Cc: Michal Simek
    Cc: Ralf Baechle
    Cc: Jonas Bonn
    Cc: "James E.J. Bottomley"
    Cc: Helge Deller
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Acked-by: Michael Ellerman (powerpc)
    Tested-by: Michael Ellerman (powerpc)
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Chen Liqin
    Cc: Lennox Wu
    Cc: Chris Metcalf
    Cc: Guan Xuetao
    Cc: Chris Zankel
    Cc: Max Filippov
    Cc: Oleg Nesterov
    Cc: Guenter Roeck
    Signed-off-by: James Hogan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Lutomirski
     

12 Feb, 2015

1 commit

  • LKP has triggered a compiler warning after my recent patch "mm: account
    pmd page tables to the process":

    mm/mmap.c: In function 'exit_mmap':
    >> mm/mmap.c:2857:2: warning: right shift count >= width of type [enabled by default]

    The code:

    > 2857 WARN_ON(mm_nr_pmds(mm) >
    2858 round_up(FIRST_USER_ADDRESS, PUD_SIZE) >> PUD_SHIFT);

    In this, on tile, we have FIRST_USER_ADDRESS defined as 0. round_up() has
    the same type -- int. PUD_SHIFT.

    I think the best way to fix it is to define FIRST_USER_ADDRESS as unsigned
    long. On every arch for consistency.

    Signed-off-by: Kirill A. Shutemov
    Reported-by: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

11 Feb, 2015

1 commit


30 Jan, 2015

1 commit

  • The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
    "you should SIGSEGV" error, because the SIGSEGV case was generally
    handled by the caller - usually the architecture fault handler.

    That results in lots of duplication - all the architecture fault
    handlers end up doing very similar "look up vma, check permissions, do
    retries etc" - but it generally works. However, there are cases where
    the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.

    In particular, when accessing the stack guard page, libsigsegv expects a
    SIGSEGV. And it usually got one, because the stack growth is handled by
    that duplicated architecture fault handler.

    However, when the generic VM layer started propagating the error return
    from the stack expansion in commit fee7e49d4514 ("mm: propagate error
    from stack expansion even for guard page"), that now exposed the
    existing VM_FAULT_SIGBUS result to user space. And user space really
    expected SIGSEGV, not SIGBUS.

    To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
    duplicate architecture fault handlers about it. They all already have
    the code to handle SIGSEGV, so it's about just tying that new return
    value to the existing code, but it's all a bit annoying.

    This is the mindless minimal patch to do this. A more extensive patch
    would be to try to gather up the mostly shared fault handling logic into
    one generic helper routine, and long-term we really should do that
    cleanup.

    Just from this patch, you can generally see that most architectures just
    copied (directly or indirectly) the old x86 way of doing things, but in
    the meantime that original x86 model has been improved to hold the VM
    semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
    "newer" things, so it would be a good idea to bring all those
    improvements to the generic case and teach other architectures about
    them too.

    Reported-and-tested-by: Takashi Iwai
    Tested-by: Jan Engelhardt
    Acked-by: Heiko Carstens # "s390 still compiles and boots"
    Cc: linux-arch@vger.kernel.org
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

23 Jan, 2015

1 commit

  • Pull module and param fixes from Rusty Russell:
    "Surprising number of fixes this merge window :(

    The first two are minor fallout from the param rework which went in
    this merge window.

    The next three are a series which fixes a longstanding (but never
    previously reported and unlikely , so no CC stable) race between
    kallsyms and freeing the init section.

    Finally, a minor cleanup as our module refcount will now be -1 during
    unload"

    * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
    module: make module_refcount() a signed integer.
    module: fix race in kallsyms resolution during module load success.
    module: remove mod arg from module_free, rename module_memfree().
    module_arch_freeing_init(): new hook for archs before module->module_init freed.
    param: fix uninitialized read with CONFIG_DEBUG_LOCK_ALLOC
    param: initialize store function to NULL if not available.

    Linus Torvalds
     

20 Jan, 2015

1 commit

  • Archs have been abusing module_free() to clean up their arch-specific
    allocations. Since module_free() is also (ab)used by BPF and trace code,
    let's keep it to simple allocations, and provide a hook called before
    that.

    This means that avr32, ia64, parisc and s390 no longer need to implement
    their own module_free() at all. avr32 doesn't need module_finalize()
    either.

    Signed-off-by: Rusty Russell
    Cc: Chris Metcalf
    Cc: Haavard Skinnemoen
    Cc: Hans-Christian Egtvedt
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: "James E.J. Bottomley"
    Cc: Helge Deller
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-ia64@vger.kernel.org
    Cc: linux-parisc@vger.kernel.org
    Cc: linux-s390@vger.kernel.org

    Rusty Russell
     

10 Jan, 2015

1 commit


27 Dec, 2014

2 commits


14 Dec, 2014

1 commit