24 Jan, 2014

1 commit


09 Nov, 2013

11 commits


25 Oct, 2013

1 commit


01 Oct, 2013

1 commit

  • A high setting of max_map_count, and a process core-dumping with a large
    enough vm_map_count could result in an NT_FILE note not being written,
    and the kernel crashing immediately later because it has assumed
    otherwise.

    Reproduction of the oops-causing bug described here:

    https://lkml.org/lkml/2013/8/30/50

    Rge ussue originated in commit 2aa362c49c31 ("coredump: extend core dump
    note section to contain file names of mapped file") from Oct 4, 2012.

    This patch make that section optional in that case. fill_files_note()
    should signify the error, and also let the info struct in
    elf_core_dump() be zero-initialized so that we can check for the
    optionally written note.

    [akpm@linux-foundation.org: avoid abusing E2BIG, remove a couple of not-really-needed local variables]
    [akpm@linux-foundation.org: fix sparse warning]
    Signed-off-by: Dan Aloni
    Cc: Al Viro
    Cc: Denys Vlasenko
    Reported-by: Martin MOKREJS
    Tested-by: Martin MOKREJS
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Aloni
     

11 Jul, 2013

1 commit

  • Since all architectures have been converted to use vm_unmapped_area(),
    there is no remaining use for the free_area_cache.

    Signed-off-by: Michel Lespinasse
    Acked-by: Rik van Riel
    Cc: "James E.J. Bottomley"
    Cc: "Luck, Tony"
    Cc: Benjamin Herrenschmidt
    Cc: David Howells
    Cc: Helge Deller
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Paul Mackerras
    Cc: Richard Henderson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     

03 May, 2013

1 commit

  • Pull powerpc update from Benjamin Herrenschmidt:
    "The main highlights this time around are:

    - A pile of addition POWER8 bits and nits, such as updated
    performance counter support (Michael Ellerman), new branch history
    buffer support (Anshuman Khandual), base support for the new PCI
    host bridge when not using the hypervisor (Gavin Shan) and other
    random related bits and fixes from various contributors.

    - Some rework of our page table format by Aneesh Kumar which fixes a
    thing or two and paves the way for THP support. THP itself will
    not make it this time around however.

    - More Freescale updates, including Altivec support on the new e6500
    cores, new PCI controller support, and a pile of new boards support
    and updates.

    - The usual batch of trivial cleanups & fixes"

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (156 commits)
    powerpc: Fix build error for book3e
    powerpc: Context switch the new EBB SPRs
    powerpc: Turn on the EBB H/FSCR bits
    powerpc: Replace CPU_FTR_BCTAR with CPU_FTR_ARCH_207S
    powerpc: Setup BHRB instructions facility in HFSCR for POWER8
    powerpc: Fix interrupt range check on debug exception
    powerpc: Update tlbie/tlbiel as per ISA doc
    powerpc: Print page size info during boot
    powerpc: print both base and actual page size on hash failure
    powerpc: Fix hpte_decode to use the correct decoding for page sizes
    powerpc: Decode the pte-lp-encoding bits correctly.
    powerpc: Use encode avpn where we need only avpn values
    powerpc: Reduce PTE table memory wastage
    powerpc: Move the pte free routines from common header
    powerpc: Reduce the PTE_INDEX_SIZE
    powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format
    powerpc: New hugepage directory format
    powerpc: Don't truncate pgd_index wrongly
    powerpc: Don't hard code the size of pte page
    powerpc: Save DAR and DSISR in pt_regs on MCE
    ...

    Linus Torvalds
     

01 May, 2013

2 commits

  • Cleanup. Every linux_binfmt->core_dump() sets PF_DUMPCORE, move this into
    zap_threads() called by do_coredump().

    Signed-off-by: Oleg Nesterov
    Acked-by: Mandeep Singh Baines
    Cc: Neil Horman
    Cc: "Rafael J. Wysocki"
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • The comment I originally added in commit a3defbe5c337 ("binfmt_elf: fix
    PIE execution with randomization disabled") is not really 100% accurate
    -- sysctl is not the only way how PF_RANDOMIZE could be forcibly unset
    in runtime.

    Another option of course is direct modification of personality flags
    (i.e. running through setarch wrapper).

    Make the comment more explicit and accurate.

    Signed-off-by: Jiri Kosina
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiri Kosina
     

26 Apr, 2013

1 commit

  • We are currently out of free bits in AT_HWCAP. With POWER8, we have
    several hardware features that we need to advertise.

    Tested on POWER and x86.

    Signed-off-by: Michael Neuling
    Signed-off-by: Nishanth Aravamudan
    Signed-off-by: Benjamin Herrenschmidt

    Michael Neuling
     

18 Apr, 2013

1 commit

  • Documentation/filesystems/proc.txt says about coredump_filter bitmask,

    Note bit 0-4 doesn't effect any hugetlb memory. hugetlb memory are only
    effected by bit 5-6.

    However current code can go into the subsequent flag checks of bit 0-4
    for vma(VM_HUGETLB). So this patch inserts 'return' and makes it work
    as written in the document.

    Signed-off-by: Naoya Horiguchi
    Reviewed-by: Rik van Riel
    Acked-by: Michal Hocko
    Reviewed-by: HATAYAMA Daisuke
    Acked-by: KOSAKI Motohiro
    Acked-by: David Rientjes
    Cc: [3.7+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     

04 Mar, 2013

1 commit

  • Pull new ImgTec Meta architecture from James Hogan:
    "This adds core architecture support for Imagination's Meta processor
    cores, followed by some later miscellaneous arch/metag cleanups and
    fixes which I kept separate to ease review:

    - Support for basic Meta 1 (ATP) and Meta 2 (HTP) core architecture
    - A few fixes all over, particularly for symbol prefixes
    - A few privilege protection fixes
    - Several cleanups (setup.c includes, split out a lot of
    metag_ksyms.c)
    - Fix some missing exports
    - Convert hugetlb to use vm_unmapped_area()
    - Copy device tree to non-init memory
    - Provide dma_get_sgtable()"

    * tag 'metag-v3.9-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: (61 commits)
    metag: Provide dma_get_sgtable()
    metag: prom.h: remove declaration of metag_dt_memblock_reserve()
    metag: copy devicetree to non-init memory
    metag: cleanup metag_ksyms.c includes
    metag: move mm/init.c exports out of metag_ksyms.c
    metag: move usercopy.c exports out of metag_ksyms.c
    metag: move setup.c exports out of metag_ksyms.c
    metag: move kick.c exports out of metag_ksyms.c
    metag: move traps.c exports out of metag_ksyms.c
    metag: move irq enable out of irqflags.h on SMP
    genksyms: fix metag symbol prefix on crc symbols
    metag: hugetlb: convert to vm_unmapped_area()
    metag: export clear_page and copy_page
    metag: export metag_code_cache_flush_all
    metag: protect more non-MMU memory regions
    metag: make TXPRIVEXT bits explicit
    metag: kernel/setup.c: sort includes
    perf: Enable building perf tools for Meta
    metag: add boot time LNKGET/LNKSET check
    metag: add __init to metag_cache_probe()
    ...

    Linus Torvalds
     

03 Mar, 2013

1 commit

  • The commit "binfmt_elf: cleanups"
    (f670d0ecda73b7438eec9ed108680bc5f5362ad8) removed an ifndef elf_map but
    this breaks compilation for metag which does define elf_map.

    This adds the ifndef back in as it was before, but does not affect the
    other cleanups made by that patch.

    Signed-off-by: James Hogan
    Cc: Alexander Viro
    Cc: linux-fsdevel@vger.kernel.org
    Acked-by: Mikael Pettersson

    James Hogan
     

27 Feb, 2013

1 commit

  • Pull vfs pile (part one) from Al Viro:
    "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
    locking violations, etc.

    The most visible changes here are death of FS_REVAL_DOT (replaced with
    "has ->d_weak_revalidate()") and a new helper getting from struct file
    to inode. Some bits of preparation to xattr method interface changes.

    Misc patches by various people sent this cycle *and* ocfs2 fixes from
    several cycles ago that should've been upstream right then.

    PS: the next vfs pile will be xattr stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    saner proc_get_inode() calling conventions
    proc: avoid extra pde_put() in proc_fill_super()
    fs: change return values from -EACCES to -EPERM
    fs/exec.c: make bprm_mm_init() static
    ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
    ocfs2: fix possible use-after-free with AIO
    ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
    get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
    target: writev() on single-element vector is pointless
    export kernel_write(), convert open-coded instances
    fs: encode_fh: return FILEID_INVALID if invalid fid_type
    kill f_vfsmnt
    vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
    nfsd: handle vfs_getattr errors in acl protocol
    switch vfs_getattr() to struct path
    default SET_PERSONALITY() in linux/elf.h
    ceph: prepopulate inodes only when request is aborted
    d_hash_and_lookup(): export, switch open-coded instances
    9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
    9p: split dropping the acls from v9fs_set_create_acl()
    ...

    Linus Torvalds
     

23 Feb, 2013

1 commit


22 Feb, 2013

1 commit


28 Jan, 2013

1 commit

  • This is in preparation for the full dynticks feature. While
    remotely reading the cputime of a task running in a full
    dynticks CPU, we'll need to do some extra-computation. This
    way we can account the time it spent tickless in userspace
    since its last cputime snapshot.

    Signed-off-by: Frederic Weisbecker
    Cc: Andrew Morton
    Cc: Ingo Molnar
    Cc: Li Zhong
    Cc: Namhyung Kim
    Cc: Paul E. McKenney
    Cc: Paul Gortmaker
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Cc: Thomas Gleixner

    Frederic Weisbecker
     

18 Dec, 2012

1 commit

  • If elf_core_dump() is called and fill_note_info() fails in the kmalloc()
    then it returns 0 but has not yet initialised all the needed fields. As a
    result we do a kfree(randomness) after correctly skipping the thread data.

    [akpm@linux-foundation.org: checkpatch fixes]
    Signed-off-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     

29 Nov, 2012

1 commit


10 Oct, 2012

1 commit

  • Pull generic execve() changes from Al Viro:
    "This introduces the generic kernel_thread() and kernel_execve()
    functions, and switches x86, arm, alpha, um and s390 over to them."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: (26 commits)
    s390: convert to generic kernel_execve()
    s390: switch to generic kernel_thread()
    s390: fold kernel_thread_helper() into ret_from_fork()
    s390: fold execve_tail() into start_thread(), convert to generic sys_execve()
    um: switch to generic kernel_thread()
    x86, um/x86: switch to generic sys_execve and kernel_execve
    x86: split ret_from_fork
    alpha: introduce ret_from_kernel_execve(), switch to generic kernel_execve()
    alpha: switch to generic kernel_thread()
    alpha: switch to generic sys_execve()
    arm: get rid of execve wrapper, switch to generic execve() implementation
    arm: optimized current_pt_regs()
    arm: introduce ret_from_kernel_execve(), switch to generic kernel_execve()
    arm: split ret_from_fork, simplify kernel_thread() [based on patch by rmk]
    generic sys_execve()
    generic kernel_execve()
    new helper: current_pt_regs()
    preparation for generic kernel_thread()
    um: kill thread->forking
    um: let signal_delivered() do SIGTRAP on singlestepping into handler
    ...

    Linus Torvalds
     

09 Oct, 2012

2 commits

  • A long time ago, in v2.4, VM_RESERVED kept swapout process off VMA,
    currently it lost original meaning but still has some effects:

    | effect | alternative flags
    -+------------------------+---------------------------------------------
    1| account as reserved_vm | VM_IO
    2| skip in core dump | VM_IO, VM_DONTDUMP
    3| do not merge or expand | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP
    4| do not mlock | VM_IO, VM_DONTEXPAND, VM_HUGETLB, VM_PFNMAP

    This patch removes reserved_vm counter from mm_struct. Seems like nobody
    cares about it, it does not exported into userspace directly, it only
    reduces total_vm showed in proc.

    Thus VM_RESERVED can be replaced with VM_IO or pair VM_DONTEXPAND | VM_DONTDUMP.

    remap_pfn_range() and io_remap_pfn_range() set VM_IO|VM_DONTEXPAND|VM_DONTDUMP.
    remap_vmalloc_range() set VM_DONTEXPAND | VM_DONTDUMP.

    [akpm@linux-foundation.org: drivers/vfio/pci/vfio_pci.c fixup]
    Signed-off-by: Konstantin Khlebnikov
    Cc: Alexander Viro
    Cc: Carsten Otte
    Cc: Chris Metcalf
    Cc: Cyrill Gorcunov
    Cc: Eric Paris
    Cc: H. Peter Anvin
    Cc: Hugh Dickins
    Cc: Ingo Molnar
    Cc: James Morris
    Cc: Jason Baron
    Cc: Kentaro Takeda
    Cc: Matt Helsley
    Cc: Nick Piggin
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: Suresh Siddha
    Cc: Tetsuo Handa
    Cc: Venkatesh Pallipadi
    Acked-by: Linus Torvalds
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     
  • Rename VM_NODUMP into VM_DONTDUMP: this name matches other negative flags:
    VM_DONTEXPAND, VM_DONTCOPY. Currently this flag used only for
    sys_madvise. The next patch will use it for replacing the outdated flag
    VM_RESERVED.

    Also forbid madvise(MADV_DODUMP) for special kernel mappings VM_SPECIAL
    (VM_IO | VM_DONTEXPAND | VM_RESERVED | VM_PFNMAP)

    Signed-off-by: Konstantin Khlebnikov
    Cc: Alexander Viro
    Cc: Carsten Otte
    Cc: Chris Metcalf
    Cc: Cyrill Gorcunov
    Cc: Eric Paris
    Cc: H. Peter Anvin
    Cc: Hugh Dickins
    Cc: Ingo Molnar
    Cc: James Morris
    Cc: Jason Baron
    Cc: Kentaro Takeda
    Cc: Matt Helsley
    Cc: Nick Piggin
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: Suresh Siddha
    Cc: Tetsuo Handa
    Cc: Venkatesh Pallipadi
    Acked-by: Linus Torvalds
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     

06 Oct, 2012

4 commits

  • This note has the following format:

    long count -- how many files are mapped
    long page_size -- units for file_ofs
    array of [COUNT] elements of
    long start
    long end
    long file_ofs
    followed by COUNT filenames in ASCII: "FILE1" NUL "FILE2" NUL...

    Signed-off-by: Denys Vlasenko
    Cc: Oleg Nesterov
    Cc: Amerigo Wang
    Cc: "Jonathan M. Foote"
    Cc: Roland McGrath
    Cc: Pedro Alves
    Cc: Fengguang Wu
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Denys Vlasenko
     
  • Existing PRSTATUS note contains only si_signo, si_code, si_errno fields
    from the siginfo of the signal which caused core to be dumped.

    There are tools which try to analyze crashes for possible security
    implications, and they want to use, among other data, si_addr field from
    the SIGSEGV.

    This patch adds a new elf note, NT_SIGINFO, which contains the complete
    siginfo_t of the signal which killed the process.

    Signed-off-by: Denys Vlasenko
    Reviewed-by: Oleg Nesterov
    Cc: Amerigo Wang
    Cc: "Jonathan M. Foote"
    Cc: Roland McGrath
    Cc: Pedro Alves
    Cc: Fengguang Wu
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Denys Vlasenko
     
  • This is a preparatory patch for the introduction of NT_SIGINFO elf note.

    With this patch we pass "siginfo_t *siginfo" instead of "int signr" to
    do_coredump() and put it into coredump_params. It will be used by the
    next patch. Most changes are simple s/signr/siginfo->si_signo/.

    Signed-off-by: Denys Vlasenko
    Reviewed-by: Oleg Nesterov
    Cc: Amerigo Wang
    Cc: "Jonathan M. Foote"
    Cc: Roland McGrath
    Cc: Pedro Alves
    Cc: Fengguang Wu
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Denys Vlasenko
     
  • load_elf_interp() has interp_map_addr carefully described as
    "uninitialized_var" and marked so as to avoid a warning. However if you
    trace the code it is passed into load_elf_interp and then this value is
    checked against NULL.

    As this return value isn't used this is actually safe but it freaks
    various analysis tools that see un-initialized memory addresses being read
    before their value is ever defined.

    Set it to NULL as a matter of programming good taste if nothing else

    Signed-off-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     

27 Sep, 2012

1 commit

  • In !CORE_DUMP_USE_REGSET case, if elf_note_info_init fails to allocate
    memory for info->fields, it frees already allocated stuff and returns
    error to its caller, fill_note_info. Which in turn returns error to its
    caller, elf_core_dump. Which jumps to cleanup label and calls
    free_note_info, which will happily try to free all info->fields again.
    BOOM.

    This is the fix.

    Signed-off-by: Oleg Nesterov
    Signed-off-by: Denys Vlasenko
    Cc: Venu Byravarasu
    Cc:
    Signed-off-by: Andrew Morton

    Denys Vlasenko
     

20 Sep, 2012

1 commit


31 May, 2012

1 commit


24 May, 2012

1 commit

  • Pull user namespace enhancements from Eric Biederman:
    "This is a course correction for the user namespace, so that we can
    reach an inexpensive, maintainable, and reasonably complete
    implementation.

    Highlights:
    - Config guards make it impossible to enable the user namespace and
    code that has not been converted to be user namespace safe.

    - Use of the new kuid_t type ensures the if you somehow get past the
    config guards the kernel will encounter type errors if you enable
    user namespaces and attempt to compile in code whose permission
    checks have not been updated to be user namespace safe.

    - All uids from child user namespaces are mapped into the initial
    user namespace before they are processed. Removing the need to add
    an additional check to see if the user namespace of the compared
    uids remains the same.

    - With the user namespaces compiled out the performance is as good or
    better than it is today.

    - For most operations absolutely nothing changes performance or
    operationally with the user namespace enabled.

    - The worst case performance I could come up with was timing 1
    billion cache cold stat operations with the user namespace code
    enabled. This went from 156s to 164s on my laptop (or 156ns to
    164ns per stat operation).

    - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value.
    Most uid/gid setting system calls treat these value specially
    anyway so attempting to use -1 as a uid would likely cause
    entertaining failures in userspace.

    - If setuid is called with a uid that can not be mapped setuid fails.
    I have looked at sendmail, login, ssh and every other program I
    could think of that would call setuid and they all check for and
    handle the case where setuid fails.

    - If stat or a similar system call is called from a context in which
    we can not map a uid we lie and return overflowuid. The LFS
    experience suggests not lying and returning an error code might be
    better, but the historical precedent with uids is different and I
    can not think of anything that would break by lying about a uid we
    can't map.

    - Capabilities are localized to the current user namespace making it
    safe to give the initial user in a user namespace all capabilities.

    My git tree covers all of the modifications needed to convert the core
    kernel and enough changes to make a system bootable to runlevel 1."

    Fix up trivial conflicts due to nearby independent changes in fs/stat.c

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits)
    userns: Silence silly gcc warning.
    cred: use correct cred accessor with regards to rcu read lock
    userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq
    userns: Convert cgroup permission checks to use uid_eq
    userns: Convert tmpfs to use kuid and kgid where appropriate
    userns: Convert sysfs to use kgid/kuid where appropriate
    userns: Convert sysctl permission checks to use kuid and kgids.
    userns: Convert proc to use kuid/kgid where appropriate
    userns: Convert ext4 to user kuid/kgid where appropriate
    userns: Convert ext3 to use kuid/kgid where appropriate
    userns: Convert ext2 to use kuid/kgid where appropriate.
    userns: Convert devpts to use kuid/kgid where appropriate
    userns: Convert binary formats to use kuid/kgid where appropriate
    userns: Add negative depends on entries to avoid building code that is userns unsafe
    userns: signal remove unnecessary map_cred_ns
    userns: Teach inode_capable to understand inodes whose uids map to other namespaces.
    userns: Fail exec for suid and sgid binaries with ids outside our user namespace.
    userns: Convert stat to return values mapped from kuids and kgids
    userns: Convert user specfied uids and gids in chown into kuids and kgid
    userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs
    ...

    Linus Torvalds
     

16 May, 2012

1 commit