16 Apr, 2015

1 commit

  • The most-used page->mapping helper, page_mapping(), has already been
    uninlined. Let's also uninline page_rmapping() and page_anon_vma().
    Depending on configuration, this saves around 400 bytes of text:

       text    data     bss      dec     hex filename
     660318   99254  410000  1169572  11d8a4 mm/built-in.o-before
     659854   99254  410000  1169108  11d6d4 mm/built-in.o

    I also tried to make the code a bit cleaner.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Kirill A. Shutemov
    Cc: Christoph Lameter
    Cc: Konstantin Khlebnikov
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
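
As a rough illustration of what page_rmapping() and page_anon_vma() decode, here is a user-space sketch of a tagged mapping word. The struct layouts are simplified stand-ins, and the flag values are assumptions modelled on the kernel's PAGE_MAPPING_ANON and PAGE_MAPPING_KSM bits, not a copy of the kernel source.

```c
#include <stddef.h>
#include <stdint.h>

/* Low bits of the mapping word carry type flags (assumed values). */
#define PAGE_MAPPING_ANON  1UL
#define PAGE_MAPPING_KSM   2UL
#define PAGE_MAPPING_FLAGS (PAGE_MAPPING_ANON | PAGE_MAPPING_KSM)

struct anon_vma { int dummy; };       /* stand-in, not the kernel type */
struct page { uintptr_t mapping; };   /* simplified: one tagged word   */

/* Strip the flag bits and return the raw pointer, whatever it points to. */
static void *page_rmapping(const struct page *page)
{
    return (void *)(page->mapping & ~PAGE_MAPPING_FLAGS);
}

/* Return the anon_vma only when the word is tagged as plain anonymous. */
static struct anon_vma *page_anon_vma(const struct page *page)
{
    if ((page->mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
        return NULL;
    return page_rmapping(page);
}
```

Helpers this small are cheap to call out of line, which is why moving them out of a header can shrink the text size at every call site.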
     

14 Feb, 2015

1 commit

  • kstrdup() is often used to duplicate strings where neither the source nor
    the destination will ever be modified. In such cases we can just reuse the
    source instead of duplicating it. The problem is that we must be sure
    that the source is non-modifiable and that its lifetime is long enough.

    I suspect good candidates for such strings are those located in the
    kernel .rodata section: they cannot be modified because the section is
    read-only, and their lifetime is equal to the kernel's lifetime.

    This small patchset proposes an alternative version of kstrdup,
    kstrdup_const, which returns the source string if it is located in
    .rodata and otherwise falls back to kstrdup. To verify that the source is
    in .rodata, the function checks whether the address lies between the
    sentinels __start_rodata and __end_rodata. I guess it should work on all
    architectures.

    The main patch is accompanied by four patches constifying kstrdup for
    cases where the situation described above happens frequently.

    I have tested the patchset on a mobile platform (exynos4210-trats) and it
    saves 3272 string allocations. Since the minimal allocation is 32 or 64
    bytes depending on Kconfig options, the patchset saves about 100KB or
    200KB of memory, respectively.

    Stats from tested platform show that the main offender is sysfs:

    By caller:
    2260 __kernfs_new_node
    631 clk_register+0xc8/0x1b8
    318 clk_register+0x34/0x1b8
    51 kmem_cache_create
    12 alloc_vfsmnt

    By string (with count >= 5):
    883 power
    876 subsystem
    135 parameters
    132 device
    61 iommu_group
    ...

    This patch (of 5):

    Add an alternative version of kstrdup which returns a pointer to a
    constant char array. The function checks whether the input string is in a
    persistent, read-only memory section; if so, it returns the input string,
    otherwise it falls back to kstrdup.

    kstrdup_const is accompanied by kfree_const performing conditional memory
    deallocation of the string.

    Signed-off-by: Andrzej Hajda
    Cc: Marek Szyprowski
    Cc: Kyungmin Park
    Cc: Mike Turquette
    Cc: Alexander Viro
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Tejun Heo
    Cc: Greg KH
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrzej Hajda
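
The idea behind kstrdup_const()/kfree_const() can be sketched in user space as follows. User space has no portable equivalent of the __start_rodata/__end_rodata sentinels, so this sketch assumes a single static pool (`ro_pool`, a hypothetical name) stands in for .rodata. `strdup_const` and `free_const` are illustrative names, not the kernel functions.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for .rodata: one static pool that is never modified. */
static const char ro_pool[] = "power\0subsystem\0device";

/* Range check against the pool bounds, like the sentinel comparison. */
static int is_ro_string(const char *s)
{
    uintptr_t p = (uintptr_t)s;

    return p >= (uintptr_t)ro_pool &&
           p < (uintptr_t)ro_pool + sizeof(ro_pool);
}

/* Reuse read-only strings; copy everything else. */
static const char *strdup_const(const char *s)
{
    size_t len;
    char *copy;

    if (is_ro_string(s))
        return s;
    len = strlen(s) + 1;
    copy = malloc(len);
    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Free only what strdup_const() actually allocated. */
static void free_const(const char *s)
{
    if (!is_ro_string(s))
        free((void *)s);
}
```

The caller must pair strdup_const() with free_const(). Passing the result to plain free() would be a bug whenever the string was reused rather than copied.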
     

12 Feb, 2015

1 commit


10 Oct, 2014

1 commit

  • - Rename vm_is_stack() to task_of_stack() and change it to return
    "struct task_struct *" rather than the global (and thus wrong in
    general) pid_t.

    - Add the new pid_of_stack() helper which calls task_of_stack() and
    uses the right namespace to report the correct pid_t.

    Unfortunately we need to define this helper twice, in task_mmu.c
    and in task_nommu.c. Perhaps it makes sense to add fs/proc/util.c
    and move at least pid_of_stack/task_of_stack there to avoid the
    code duplication.

    - Change show_map_vma() and show_numa_map() to use the new helper.

    Signed-off-by: Oleg Nesterov
    Cc: Alexander Viro
    Cc: Cyrill Gorcunov
    Cc: "Eric W. Biederman"
    Cc: Greg Ungerer
    Cc: "Kirill A. Shutemov"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

09 Aug, 2014

1 commit

  • Aleksei hit a soft lockup while reading /proc/PID/smaps. David
    investigated the problem and suggested the right fix.

    while_each_thread() is racy and should die; this patch updates
    vm_is_stack().

    Signed-off-by: Oleg Nesterov
    Reported-by: Aleksei Besogonov
    Tested-by: Aleksei Besogonov
    Suggested-by: David Rientjes
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

07 Aug, 2014

1 commit

  • The functions krealloc(), __krealloc() and kzfree() belong to the slab
    API, so they should be placed in slab_common.c.

    Also move the slab allocator's tracepoint definitions to slab_common.c.
    No functional changes here.

    Signed-off-by: Andrey Ryabinin
    Acked-by: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Ryabinin
     

07 May, 2014

1 commit


13 Apr, 2014

1 commit

  • Pull audit updates from Eric Paris.

    * git://git.infradead.org/users/eparis/audit: (28 commits)
    AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERIC
    audit: renumber AUDIT_FEATURE_CHANGE into the 1300 range
    audit: do not cast audit_rule_data pointers pointlesly
    AUDIT: Allow login in non-init namespaces
    audit: define audit_is_compat in kernel internal header
    kernel: Use RCU_INIT_POINTER(x, NULL) in audit.c
    sched: declare pid_alive as inline
    audit: use uapi/linux/audit.h for AUDIT_ARCH declarations
    syscall_get_arch: remove useless function arguments
    audit: remove stray newline from audit_log_execve_info() audit_panic() call
    audit: remove stray newlines from audit_log_lost messages
    audit: include subject in login records
    audit: remove superfluous new- prefix in AUDIT_LOGIN messages
    audit: allow user processes to log from another PID namespace
    audit: anchor all pid references in the initial pid namespace
    audit: convert PPIDs to the inital PID namespace.
    pid: get pid_t ppid of task in init_pid_ns
    audit: rename the misleading audit_get_context() to audit_take_context()
    audit: Add generic compat syscall support
    audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL
    ...

    Linus Torvalds
     

08 Apr, 2014

1 commit

  • To increase compiler portability there is a header which provides
    convenience macros for various gcc constructs, e.g. __weak for
    __attribute__((weak)). I've replaced all instances of gcc attributes with
    the right macro in the memory management (/mm) subsystem.

    [akpm@linux-foundation.org: while-we're-there consistency tweaks]
    Signed-off-by: Gideon Israel Dsouza
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gideon Israel Dsouza
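
A minimal sketch of the macro-wrapping technique, assuming a GCC-compatible compiler. The `demo_*` names are hypothetical, chosen here to avoid reserved identifiers; the kernel's own spellings are `__weak`, `__packed` and so on.

```c
#include <stddef.h>

/* Hypothetical wrappers around compiler-specific attribute syntax. */
#define demo_weak   __attribute__((weak))
#define demo_packed __attribute__((packed))

/* Default layout: the compiler may insert padding after 'tag'. */
struct demo_padded { char tag; int value; };

/* Packed layout: no padding between members. */
struct demo_tight { char tag; int value; } demo_packed;

/* Weak definition: a strong definition elsewhere would override it. */
demo_weak int demo_default_hook(void)
{
    return 0;
}
```

Keeping the attribute syntax behind macros means that supporting another compiler only requires changing the macro definitions, not every use.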
     

08 Mar, 2014

1 commit


22 Jan, 2014

1 commit

  • Some applications that run on HPC clusters are designed around the
    availability of RAM, and the overcommit ratio is fine-tuned to get the
    maximum usage of memory without swapping. With growing memory, the
    1%-of-all-RAM grain provided by overcommit_ratio has become too coarse
    for these workloads (on a 2TB machine it represents no less than 20GB).

    This patch adds the new overcommit_kbytes sysctl variable, which allows
    a much finer grain.

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: fix nommu build]
    Signed-off-by: Jerome Marchand
    Cc: Dave Hansen
    Cc: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jerome Marchand
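
The granularity argument can be checked with a small sketch of a commit-limit calculation. This is a simplified model: it assumes 4 KiB pages, ignores huge pages, and assumes that a non-zero kbytes value takes precedence over the ratio. The function and parameter names are hypothetical.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/*
 * Commit limit in pages. A non-zero 'kbytes' takes precedence over
 * 'ratio_percent'; swap is added on top in both cases.
 */
static uint64_t commit_limit_pages(uint64_t ram_pages, uint64_t swap_pages,
                                   unsigned int ratio_percent,
                                   uint64_t kbytes)
{
    uint64_t allowed;

    if (kbytes)
        allowed = kbytes >> (DEMO_PAGE_SHIFT - 10);   /* KiB to pages */
    else
        allowed = ram_pages * ratio_percent / 100;

    return allowed + swap_pages;
}
```

With 2 TiB of RAM (536870912 pages of 4 KiB), one ratio step is 5368709 pages, roughly 20 GiB, whereas one kbytes step is a quarter of a page.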
     

15 Jan, 2014

1 commit

  • Commit 8456a648cf44 ("slab: use struct page for slab management") causes
    a crash in the LVM2 testsuite on PA-RISC (the crashing test is
    fsadm.sh). The testsuite doesn't crash on 3.12, crashes on 3.13-rc1 and
    later.

    Bad Address (null pointer deref?): Code=15 regs=000000413edd89a0 (Addr=000006202224647d)
    CPU: 3 PID: 24008 Comm: loop0 Not tainted 3.13.0-rc6 #5
    task: 00000001bf3c0048 ti: 000000413edd8000 task.ti: 000000413edd8000

    YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
    PSW: 00001000000001101111100100001110 Not tainted
    r00-03 000000ff0806f90e 00000000405c8de0 000000004013e6c0 000000413edd83f0
    r04-07 00000000405a95e0 0000000000000200 00000001414735f0 00000001bf349e40
    r08-11 0000000010fe3d10 0000000000000001 00000040829c7778 000000413efd9000
    r12-15 0000000000000000 000000004060d800 0000000010fe3000 0000000010fe3000
    r16-19 000000413edd82a0 00000041078ddbc0 0000000000000010 0000000000000001
    r20-23 0008f3d0d83a8000 0000000000000000 00000040829c7778 0000000000000080
    r24-27 00000001bf349e40 00000001bf349e40 202d66202224640d 00000000405a95e0
    r28-31 202d662022246465 000000413edd88f0 000000413edd89a0 0000000000000001
    sr00-03 000000000532c000 0000000000000000 0000000000000000 000000000532c000
    sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000

    IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000401fe42c 00000000401fe430
    IIR: 539c0030 ISR: 00000000202d6000 IOR: 000006202224647d
    CPU: 3 CR30: 000000413edd8000 CR31: 0000000000000000
    ORIG_R28: 00000000405a95e0
    IAOQ[0]: vma_interval_tree_iter_first+0x14/0x48
    IAOQ[1]: vma_interval_tree_iter_first+0x18/0x48
    RP(r2): flush_dcache_page+0x128/0x388
    Backtrace:
    flush_dcache_page+0x128/0x388
    lo_splice_actor+0x90/0x148 [loop]
    splice_from_pipe_feed+0xc0/0x1d0
    __splice_from_pipe+0xac/0xc0
    lo_direct_splice_actor+0x1c/0x70 [loop]
    splice_direct_to_actor+0xec/0x228
    lo_receive+0xe4/0x298 [loop]
    loop_thread+0x478/0x640 [loop]
    kthread+0x134/0x168
    end_fault_vector+0x20/0x28
    xfs_setsize_buftarg+0x0/0x90 [xfs]

    Kernel panic - not syncing: Bad Address (null pointer deref?)

    Commit 8456a648cf44 changes the page structure so that the slab
    subsystem reuses the page->mapping field.

    The crash happens in the following way:
    * XFS allocates some memory from slab and issues a bio to read data
    into it.
    * the bio is sent to the loopback device.
    * lo_receive creates an actor and calls splice_direct_to_actor.
    * lo_splice_actor copies data to the target page.
    * lo_splice_actor calls flush_dcache_page because the page may be
    mapped by userspace. In that case we need to flush the kernel cache.
    * flush_dcache_page asks for the list of userspace mappings, however
    that page->mapping field is reused by the slab subsystem for a
    different purpose. This causes the crash.

    Note that other architectures without coherent caches (sparc, arm, mips)
    also call page_mapping from flush_dcache_page, so they may crash in the
    same way.

    This patch fixes this bug by testing if the page is a slab page in
    page_mapping and returning NULL if it is.

    The patch also fixes VM_BUG_ON(PageSlab(page)) that could happen in
    earlier kernels in the same scenario on architectures without cache
    coherence when CONFIG_DEBUG_VM is enabled - so it should be backported
    to stable kernels.

    In the old kernels, the function page_mapping is placed in
    include/linux/mm.h, so you should modify the patch accordingly when
    backporting it.

    Signed-off-by: Mikulas Patocka
    Cc: John David Anglin
    Cc: Andi Kleen
    Cc: Christoph Lameter
    Acked-by: Pekka Enberg
    Reviewed-by: Joonsoo Kim
    Cc: Helge Deller
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mikulas Patocka
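
The shape of the fix can be sketched with a simplified user-space model. The `slab` field stands in for the PageSlab() test and the anonymous tag bit is an assumed value; the real function also handles swap-cache pages, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAPPING_ANON 1UL   /* assumed tag bit for anonymous pages */

struct address_space { int dummy; };   /* stand-in type */

struct page {
    bool slab;           /* models PageSlab(page)                 */
    uintptr_t mapping;   /* reused by slab for unrelated contents */
};

/* Return the file mapping, or NULL when the field means something else. */
static struct address_space *page_mapping(const struct page *page)
{
    if (page->slab)
        return NULL;   /* field belongs to the slab allocator */
    if (page->mapping & DEMO_MAPPING_ANON)
        return NULL;   /* anonymous page, not a file mapping */
    return (struct address_space *)page->mapping;
}
```

A caller such as flush_dcache_page() then sees NULL for a slab page and skips the walk over user mappings instead of dereferencing whatever the slab allocator stored there.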
     

13 Nov, 2013

1 commit

  • The same calculation is currently done in three different places.
    Factor out that code so future changes have to be made in only one place.

    [akpm@linux-foundation.org: uninline vm_commit_limit()]
    Signed-off-by: Jerome Marchand
    Cc: Dave Hansen
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jerome Marchand
     

12 Sep, 2013

1 commit

  • PageSwapCache() is always false when !CONFIG_SWAP, so the compiler
    properly discards the related code. Therefore, we don't need an explicit
    #ifdef.

    Signed-off-by: Joonsoo Kim
    Acked-by: Johannes Weiner
    Cc: Minchan Kim
    Cc: Mel Gorman
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
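
A sketch of the pattern, with hypothetical names: when the predicate is a constant-false inline function in the configured-out case, the caller needs no #ifdef of its own, and an optimizing compiler can drop the dead branch. `DEMO_CONFIG_SWAP` models CONFIG_SWAP and is assumed to be undefined here.

```c
#define DEMO_PG_SWAPCACHE 0x1UL   /* hypothetical flag bit */

struct page { unsigned long flags; };

#ifdef DEMO_CONFIG_SWAP
static inline int page_swap_cache(const struct page *p)
{
    return (p->flags & DEMO_PG_SWAPCACHE) != 0;
}
#else
/* Swap configured out: the predicate is a compile-time constant. */
static inline int page_swap_cache(const struct page *p)
{
    (void)p;
    return 0;
}
#endif

/* The caller is written once, with no #ifdef around the swap branch. */
static int classify(const struct page *p)
{
    if (page_swap_cache(p))
        return 1;   /* swap-only path, unreachable without swap */
    return 0;
}
```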
     

11 Jul, 2013

1 commit

  • Since all architectures have been converted to use vm_unmapped_area(),
    there is no remaining use for the free_area_cache.

    Signed-off-by: Michel Lespinasse
    Acked-by: Rik van Riel
    Cc: "James E.J. Bottomley"
    Cc: "Luck, Tony"
    Cc: Benjamin Herrenschmidt
    Cc: David Howells
    Cc: Helge Deller
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Paul Mackerras
    Cc: Richard Henderson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     

24 Feb, 2013

4 commits

  • When I use several fast SSDs for swap, swapper_space.tree_lock is
    heavily contended. This patch gives each swap partition its own
    address_space to reduce the lock contention. There is an array of
    address_space for swap. The swap entry type is the index into the array.

    In my test with 3 SSDs, this increases the swapout throughput by 20%.

    [akpm@linux-foundation.org: revert unneeded change to __add_to_swap_cache]
    Signed-off-by: Shaohua Li
    Cc: Hugh Dickins
    Acked-by: Rik van Riel
    Acked-by: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shaohua Li
     
  • According to akpm, this saves 1/2k text and makes things simple for the
    next patch.

    Numbers from Minchan:

    add/remove: 1/0 grow/shrink: 6/22 up/down: 92/-516 (-424)
    function                      old    new  delta
    page_mapping                    -     48    +48
    do_task_stat                 2292   2308    +16
    page_remove_rmap              240    248     +8
    load_elf_binary              4500   4508     +8
    update_queue                  532    536     +4
    scsi_probe_and_add_lun       2892   2896     +4
    lookup_fast                   644    648     +4
    vcs_read                     1040   1036     -4
    __ip_route_output_key        1904   1900     -4
    ip_route_input_noref         2508   2500     -8
    shmem_file_aio_read           784    772    -12
    __isolate_lru_page            272    256    -16
    shmem_replace_page            708    688    -20
    mark_buffer_dirty             228    208    -20
    __set_page_dirty_buffers      240    220    -20
    __remove_mapping              276    256    -20
    update_mmu_cache              500    476    -24
    set_page_dirty_balance         92     68    -24
    set_page_dirty                172    148    -24
    page_evictable                 88     64    -24
    page_cache_pipe_buf_steal     248    224    -24
    clear_page_dirty_for_io       340    316    -24
    test_set_page_writeback       400    372    -28
    test_clear_page_writeback     516    488    -28
    invalidate_inode_page         156    128    -28
    page_mkclean                  432    400    -32
    flush_dcache_page             360    328    -32
    __set_page_dirty_nobuffers    324    280    -44
    shrink_page_list             2412   2356    -56

    Signed-off-by: Shaohua Li
    Suggested-by: Andrew Morton
    Cc: Hugh Dickins
    Acked-by: Rik van Riel
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shaohua Li
     
  • do_mmap_pgoff() rounds up the desired size to the next PAGE_SIZE
    multiple, however there was no equivalent code in mm_populate(), which
    caused issues.

    This could be fixed by introducing the same rounding in mm_populate(),
    however I think it's preferable to make do_mmap_pgoff() return populate
    as a size rather than as a boolean, so we don't have to duplicate the
    size rounding logic in mm_populate().

    Signed-off-by: Michel Lespinasse
    Acked-by: Rik van Riel
    Tested-by: Andy Lutomirski
    Cc: Greg Ungerer
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     
  • When creating new mappings using the MAP_POPULATE / MAP_LOCKED flags (or
    with MCL_FUTURE in effect), we want to populate the pages within the
    newly created vmas. This may take a while as we may have to read pages
    from disk, so ideally we want to do this outside of the write-locked
    mmap_sem region.

    This change introduces mm_populate(), which is used to defer populating
    such mappings until after the mmap_sem write lock has been released.
    This is implemented as a generalization of the former do_mlock_pages(),
    which accomplished the same task but was used during mlock() /
    mlockall().

    Signed-off-by: Michel Lespinasse
    Reported-by: Andy Lutomirski
    Acked-by: Rik van Riel
    Tested-by: Andy Lutomirski
    Cc: Greg Ungerer
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
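
For the do_mmap_pgoff() change in this group, the benefit of reporting populate as a size can be sketched as below. This is a user-space model with hypothetical names and an assumed 4 KiB page size. The mapping call does the rounding once and hands the rounded length to whoever populates later.

```c
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096UL   /* assumption: 4 KiB pages */

/* Round a length up to the next page-size multiple. */
static size_t page_align(size_t len)
{
    return (len + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
}

/*
 * Model of the mapping call: returns the mapped length and reports in
 * *populate how many bytes the caller should fault in afterwards
 * (0 means nothing to populate).
 */
static size_t demo_mmap(size_t len, int want_populate, size_t *populate)
{
    size_t mapped = page_align(len);

    *populate = want_populate ? mapped : 0;
    return mapped;
}
```

A populate step that receives this size never has to repeat the rounding, and it can run after the lock protecting the mapping has been dropped.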
     

29 Oct, 2012

1 commit


16 Oct, 2012

1 commit


04 Sep, 2012

1 commit

  • Without this patch we can get (many) kmem trace events
    with call site at krealloc().

    This happens because krealloc is calling __krealloc,
    which performs the allocation through kmalloc_track_caller.

    Since neither krealloc nor __krealloc are marked inline explicitly,
    the caller can be traced as being krealloc, which clearly is not
    the intended behavior.

    This patch allows to get the real caller of krealloc, by creating
    an always inlined function __do_krealloc, thus tracing the
    call site accurately.

    Acked-by: Christoph Lameter
    Cc: Glauber Costa
    Signed-off-by: Ezequiel Garcia
    Signed-off-by: Pekka Enberg

    Ezequiel Garcia
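
The tracing trick can be sketched in user space, assuming a GCC-compatible compiler. Because the helper is always inlined, `__builtin_return_address(0)` inside it is evaluated in the frame of the public wrapper, so it yields the address the wrapper will return to, which lies in the wrapper's caller. All names here are hypothetical, and `last_call_site` stands in for what a tracepoint would record.

```c
#include <stdlib.h>

/* What a tracer would record as the allocation's call site. */
static void *last_call_site;

/* Always inlined, so the return address is that of the enclosing function. */
static inline __attribute__((always_inline))
void *do_realloc_traced(void *p, size_t n)
{
    last_call_site = __builtin_return_address(0);
    return realloc(p, n);
}

/* Public entry point, kept out of line so it has its own return address. */
__attribute__((noinline)) void *demo_realloc(void *p, size_t n)
{
    return do_realloc_traced(p, n);
}
```

If the helper were an ordinary out-of-line function, the recorded address would point into demo_realloc() itself, which is the kind of misattribution the commit describes.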
     

01 Jun, 2012

1 commit

  • take it to mm/util.c, convert vm_mmap() to use of that one and
    take it to mm/util.c as well, convert both sys_mmap_pgoff() to
    use of vm_mmap_pgoff()

    Signed-off-by: Al Viro

    Al Viro
     

22 Mar, 2012

1 commit

  • The stack for a new thread is mapped by userspace code and passed via
    sys_clone. This memory is currently seen as anonymous in
    /proc/PID/maps, which makes it difficult to ascertain which mappings
    are being used for thread stacks. This patch uses the individual task
    stack pointers to determine which vmas are actually thread stacks.

    For a multithreaded program like the following:

    #include <pthread.h>

    /* Spin forever so the thread (and its stack) stays alive. */
    void *thread_main(void *foo)
    {
            while (1)
                    ;
    }

    int main()
    {
            pthread_t t;

            pthread_create(&t, NULL, thread_main, NULL);
            pthread_join(t, NULL);
            return 0;
    }

    /proc/PID/maps looks like the following:

    00400000-00401000 r-xp 00000000 fd:0a 3671804 /home/siddhesh/a.out
    00600000-00601000 rw-p 00000000 fd:0a 3671804 /home/siddhesh/a.out
    019ef000-01a10000 rw-p 00000000 00:00 0 [heap]
    7f8a44491000-7f8a44492000 ---p 00000000 00:00 0
    7f8a44492000-7f8a44c92000 rw-p 00000000 00:00 0
    7f8a44c92000-7f8a44e3d000 r-xp 00000000 fd:00 2097482 /lib64/libc-2.14.90.so
    7f8a44e3d000-7f8a4503d000 ---p 001ab000 fd:00 2097482 /lib64/libc-2.14.90.so
    7f8a4503d000-7f8a45041000 r--p 001ab000 fd:00 2097482 /lib64/libc-2.14.90.so
    7f8a45041000-7f8a45043000 rw-p 001af000 fd:00 2097482 /lib64/libc-2.14.90.so
    7f8a45043000-7f8a45048000 rw-p 00000000 00:00 0
    7f8a45048000-7f8a4505f000 r-xp 00000000 fd:00 2099938 /lib64/libpthread-2.14.90.so
    7f8a4505f000-7f8a4525e000 ---p 00017000 fd:00 2099938 /lib64/libpthread-2.14.90.so
    7f8a4525e000-7f8a4525f000 r--p 00016000 fd:00 2099938 /lib64/libpthread-2.14.90.so
    7f8a4525f000-7f8a45260000 rw-p 00017000 fd:00 2099938 /lib64/libpthread-2.14.90.so
    7f8a45260000-7f8a45264000 rw-p 00000000 00:00 0
    7f8a45264000-7f8a45286000 r-xp 00000000 fd:00 2097348 /lib64/ld-2.14.90.so
    7f8a45457000-7f8a4545a000 rw-p 00000000 00:00 0
    7f8a45484000-7f8a45485000 rw-p 00000000 00:00 0
    7f8a45485000-7f8a45486000 r--p 00021000 fd:00 2097348 /lib64/ld-2.14.90.so
    7f8a45486000-7f8a45487000 rw-p 00022000 fd:00 2097348 /lib64/ld-2.14.90.so
    7f8a45487000-7f8a45488000 rw-p 00000000 00:00 0
    7fff6273b000-7fff6275c000 rw-p 00000000 00:00 0 [stack]
    7fff627ff000-7fff62800000 r-xp 00000000 00:00 0 [vdso]
    ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

    Here, one could guess that 7f8a44492000-7f8a44c92000 is a stack, since
    the vma just before it (7f8a44491000-7f8a44492000) has no permissions,
    but that is not always a reliable way to find out which vma is a thread
    stack. Also, /proc/PID/maps and /proc/PID/task/TID/maps have the same
    content.

    With this patch in place, /proc/PID/task/TID/maps is treated as 'maps
    as the task would see it' and hence, only the vma that that task uses as
    stack is marked as [stack]. All other 'stack' vmas are marked as
    anonymous memory. /proc/PID/maps acts as a thread group level view,
    where all thread stack vmas are marked as [stack:TID] where TID is the
    thread ID of the task that uses that vma as stack, while the process
    stack is marked as [stack].

    So /proc/PID/maps will look like this:

    00400000-00401000 r-xp 00000000 fd:0a 3671804 /home/siddhesh/a.out
    00600000-00601000 rw-p 00000000 fd:0a 3671804 /home/siddhesh/a.out
    019ef000-01a10000 rw-p 00000000 00:00 0 [heap]
    7f8a44491000-7f8a44492000 ---p 00000000 00:00 0
    7f8a44492000-7f8a44c92000 rw-p 00000000 00:00 0 [stack:1442]
    7f8a44c92000-7f8a44e3d000 r-xp 00000000 fd:00 2097482 /lib64/libc-2.14.90.so
    7f8a44e3d000-7f8a4503d000 ---p 001ab000 fd:00 2097482 /lib64/libc-2.14.90.so
    7f8a4503d000-7f8a45041000 r--p 001ab000 fd:00 2097482 /lib64/libc-2.14.90.so
    7f8a45041000-7f8a45043000 rw-p 001af000 fd:00 2097482 /lib64/libc-2.14.90.so
    7f8a45043000-7f8a45048000 rw-p 00000000 00:00 0
    7f8a45048000-7f8a4505f000 r-xp 00000000 fd:00 2099938 /lib64/libpthread-2.14.90.so
    7f8a4505f000-7f8a4525e000 ---p 00017000 fd:00 2099938 /lib64/libpthread-2.14.90.so
    7f8a4525e000-7f8a4525f000 r--p 00016000 fd:00 2099938 /lib64/libpthread-2.14.90.so
    7f8a4525f000-7f8a45260000 rw-p 00017000 fd:00 2099938 /lib64/libpthread-2.14.90.so
    7f8a45260000-7f8a45264000 rw-p 00000000 00:00 0
    7f8a45264000-7f8a45286000 r-xp 00000000 fd:00 2097348 /lib64/ld-2.14.90.so
    7f8a45457000-7f8a4545a000 rw-p 00000000 00:00 0
    7f8a45484000-7f8a45485000 rw-p 00000000 00:00 0
    7f8a45485000-7f8a45486000 r--p 00021000 fd:00 2097348 /lib64/ld-2.14.90.so
    7f8a45486000-7f8a45487000 rw-p 00022000 fd:00 2097348 /lib64/ld-2.14.90.so
    7f8a45487000-7f8a45488000 rw-p 00000000 00:00 0
    7fff6273b000-7fff6275c000 rw-p 00000000 00:00 0 [stack]
    7fff627ff000-7fff62800000 r-xp 00000000 00:00 0 [vdso]
    ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

    This marks all vmas that are used as stacks by the threads in the
    thread group, along with the process stack. The task level maps will,
    however, look like this:

    00400000-00401000 r-xp 00000000 fd:0a 3671804 /home/siddhesh/a.out
    00600000-00601000 rw-p 00000000 fd:0a 3671804 /home/siddhesh/a.out
    019ef000-01a10000 rw-p 00000000 00:00 0 [heap]
    7f8a44491000-7f8a44492000 ---p 00000000 00:00 0
    7f8a44492000-7f8a44c92000 rw-p 00000000 00:00 0 [stack]
    7f8a44c92000-7f8a44e3d000 r-xp 00000000 fd:00 2097482 /lib64/libc-2.14.90.so
    7f8a44e3d000-7f8a4503d000 ---p 001ab000 fd:00 2097482 /lib64/libc-2.14.90.so
    7f8a4503d000-7f8a45041000 r--p 001ab000 fd:00 2097482 /lib64/libc-2.14.90.so
    7f8a45041000-7f8a45043000 rw-p 001af000 fd:00 2097482 /lib64/libc-2.14.90.so
    7f8a45043000-7f8a45048000 rw-p 00000000 00:00 0
    7f8a45048000-7f8a4505f000 r-xp 00000000 fd:00 2099938 /lib64/libpthread-2.14.90.so
    7f8a4505f000-7f8a4525e000 ---p 00017000 fd:00 2099938 /lib64/libpthread-2.14.90.so
    7f8a4525e000-7f8a4525f000 r--p 00016000 fd:00 2099938 /lib64/libpthread-2.14.90.so
    7f8a4525f000-7f8a45260000 rw-p 00017000 fd:00 2099938 /lib64/libpthread-2.14.90.so
    7f8a45260000-7f8a45264000 rw-p 00000000 00:00 0
    7f8a45264000-7f8a45286000 r-xp 00000000 fd:00 2097348 /lib64/ld-2.14.90.so
    7f8a45457000-7f8a4545a000 rw-p 00000000 00:00 0
    7f8a45484000-7f8a45485000 rw-p 00000000 00:00 0
    7f8a45485000-7f8a45486000 r--p 00021000 fd:00 2097348 /lib64/ld-2.14.90.so
    7f8a45486000-7f8a45487000 rw-p 00022000 fd:00 2097348 /lib64/ld-2.14.90.so
    7f8a45487000-7f8a45488000 rw-p 00000000 00:00 0
    7fff6273b000-7fff6275c000 rw-p 00000000 00:00 0
    7fff627ff000-7fff62800000 r-xp 00000000 00:00 0 [vdso]
    ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

    where only the vma that is being used as a stack by *that* task is
    marked as [stack].

    Analogous changes have been made to /proc/PID/smaps,
    /proc/PID/numa_maps, /proc/PID/task/TID/smaps and
    /proc/PID/task/TID/numa_maps. Relevant snippets from smaps and
    numa_maps:

    [siddhesh@localhost ~ ]$ pgrep a.out
    1441
    [siddhesh@localhost ~ ]$ cat /proc/1441/smaps | grep "\[stack"
    7f8a44492000-7f8a44c92000 rw-p 00000000 00:00 0 [stack:1442]
    7fff6273b000-7fff6275c000 rw-p 00000000 00:00 0 [stack]
    [siddhesh@localhost ~ ]$ cat /proc/1441/task/1442/smaps | grep "\[stack"
    7f8a44492000-7f8a44c92000 rw-p 00000000 00:00 0 [stack]
    [siddhesh@localhost ~ ]$ cat /proc/1441/task/1441/smaps | grep "\[stack"
    7fff6273b000-7fff6275c000 rw-p 00000000 00:00 0 [stack]
    [siddhesh@localhost ~ ]$ cat /proc/1441/numa_maps | grep "stack"
    7f8a44492000 default stack:1442 anon=2 dirty=2 N0=2
    7fff6273a000 default stack anon=3 dirty=3 N0=3
    [siddhesh@localhost ~ ]$ cat /proc/1441/task/1442/numa_maps | grep "stack"
    7f8a44492000 default stack anon=2 dirty=2 N0=2
    [siddhesh@localhost ~ ]$ cat /proc/1441/task/1441/numa_maps | grep "stack"
    7fff6273a000 default stack anon=3 dirty=3 N0=3

    [akpm@linux-foundation.org: checkpatch fixes]
    [akpm@linux-foundation.org: fix build]
    Signed-off-by: Siddhesh Poyarekar
    Cc: KOSAKI Motohiro
    Cc: Alexander Viro
    Cc: Jamie Lokier
    Cc: Mike Frysinger
    Cc: Alexey Dobriyan
    Cc: Matt Mackall
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Siddhesh Poyarekar
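
The detection rule described above (a vma is a thread stack if some task's stack pointer lies inside it) can be sketched as a plain lookup. The struct layouts and the function name are hypothetical simplifications, with stack pointers supplied by the caller.

```c
#include <stddef.h>
#include <stdint.h>

struct vma  { uint64_t start, end; };    /* covers [start, end)       */
struct task { int tid; uint64_t sp; };   /* thread id + stack pointer */

/* Return the tid of the task whose stack pointer is in the vma, else 0. */
static int tid_of_stack(const struct vma *vma,
                        const struct task *tasks, size_t ntasks)
{
    size_t i;

    for (i = 0; i < ntasks; i++) {
        if (tasks[i].sp >= vma->start && tasks[i].sp < vma->end)
            return tasks[i].tid;
    }
    return 0;
}
```

For the thread-group view, a non-zero result different from the group leader's id would be printed as [stack:TID]. For the per-task view, only a match against that one task counts.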
     

31 Oct, 2011

1 commit


25 May, 2011

1 commit

  • When I was reading the nommu code, I found that it handles the vma
    list/tree in an unusual way. IIUC, because there can be more than one
    identical/overlapping vma in the list/tree, it sorts the tree more
    strictly and does a linear search on the tree. But this isn't applied to
    the list (i.e. the list could be constructed in a different order than
    the tree, so we can't use the list when finding the first vma in that
    order).

    Since inserting/sorting a vma in the tree and linking it into the list
    are done at the same time, we can easily construct both of them in the
    same order. And since linear searching on the tree could be more costly
    than doing it on the list, the search can be converted to use the list.

    Also, after commit 297c5eee3724 ("mm: make the vma list be doubly
    linked") made the list doubly linked, a couple of places in the code
    needed to be fixed to construct the list properly.

    Patch 1/6 is a preparation. It keeps the list sorted in the same order as
    the tree and constructs the doubly-linked list properly. Patch 2/6 is a
    simple optimization for vma deletion. Patches 3/6 and 4/6 convert tree
    traversal to list traversal, and the rest are simple fixes and cleanups.

    This patch:

    A @vma added to @mm should be sorted by start addr, end addr and VMA
    struct addr, in that order, because we may get identical VMAs in the @mm.
    However this was true only for the rbtree, not for the list.

    This patch fixes this by remembering 'rb_prev' during the tree traversal
    like find_vma_prepare() does and linking the @vma via __vma_link_list().
    After this patch, we can iterate over all the VMAs in the correct order
    simply by using the @mm->mmap list.

    [akpm@linux-foundation.org: avoid duplicating __vma_link_list()]
    Signed-off-by: Namhyung Kim
    Acked-by: Greg Ungerer
    Cc: David Howells
    Cc: Paul Mundt
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Namhyung Kim
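
The three-level ordering (start address, then end address, then the address of the VMA struct itself) can be written as a comparison function. This is a sketch with a simplified stand-in struct, not the nommu code.

```c
#include <stdint.h>

struct vma { unsigned long vm_start, vm_end; };   /* simplified stand-in */

/*
 * Total order over VMAs: by start, then end, then object address.
 * Returns <0 if a sorts first, >0 if b sorts first, and 0 only when
 * a and b are the same object.
 */
static int vma_cmp(const struct vma *a, const struct vma *b)
{
    if (a->vm_start != b->vm_start)
        return a->vm_start < b->vm_start ? -1 : 1;
    if (a->vm_end != b->vm_end)
        return a->vm_end < b->vm_end ? -1 : 1;
    if (a != b)
        return (uintptr_t)a < (uintptr_t)b ? -1 : 1;
    return 0;
}
```

The final tie-break on the struct address is what keeps the order strict even when two VMAs cover exactly the same range, so the tree and the list can agree on one sequence.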
     

31 Mar, 2011

1 commit


07 Jan, 2011

1 commit


24 Oct, 2010

1 commit


10 Aug, 2010

1 commit

  • Use memdup_user when user data is immediately copied into the
    allocated region.

    The semantic patch that makes this change is as follows:
    (http://coccinelle.lip6.fr/)

    //
    @@
    expression from,to,size,flag;
    position p;
    identifier l1,l2;
    @@

    - to = \(kmalloc@p\|kzalloc@p\)(size,flag);
    + to = memdup_user(from,size);
    if (
    - to==NULL
    + IS_ERR(to)
    || ...) {

    }
    - if (copy_from_user(to, from, size) != 0) {
    -
    - }
    //

    Signed-off-by: Julia Lawall
    Cc: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Julia Lawall
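
The reason the semantic patch changes the check from `to==NULL` to `IS_ERR(to)` is that the combined helper reports failure as an error code encoded in the pointer. Below is a user-space sketch of that convention. The `demo_*` names are hypothetical, memcpy() stands in for copy_from_user(), and the 4095 bound is an assumption modelled on the kernel's MAX_ERRNO.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_ERRNO 4095   /* assumed size of the reserved error range */

/* Encode a negative errno value as a pointer in the top of the range. */
static void *demo_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Recover the errno value from an error pointer. */
static long demo_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True when the pointer falls in the reserved error range. */
static int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Allocate and copy in one step; failures come back as error pointers. */
static void *demo_memdup(const void *src, size_t len)
{
    void *p = malloc(len);

    if (!p)
        return demo_err_ptr(-ENOMEM);
    memcpy(p, src, len);
    return p;
}
```

Since a failed call never returns NULL under this convention, a leftover NULL check after the conversion would silently miss every error, which is why the check and the call are rewritten together.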
     

10 Apr, 2010

1 commit

  • As suggested by Linus, introduce a kern_ptr_validate() helper that does some
    sanity checks to make sure a pointer is a valid kernel pointer. This is a
    preparatory step for fixing SLUB kmem_ptr_validate().

    Cc: Andrew Morton
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Ingo Molnar
    Cc: Matt Mackall
    Cc: Nick Piggin
    Signed-off-by: Pekka Enberg
    Signed-off-by: Linus Torvalds

    Pekka Enberg
     

17 Jan, 2010

1 commit


31 Dec, 2009

1 commit

  • Move sys_mmap_pgoff() from mm/util.c to mm/mmap.c and mm/nommu.c,
    where we'd expect to find such code: especially now that it contains
    the MAP_HUGETLB handling. Revert mm/util.c to how it was in 2.6.32.

    This patch just ignores MAP_HUGETLB in the nommu case, as in 2.6.32,
    whereas 2.6.33-rc2 reported -ENOSYS. Perhaps validate_mmap_request()
    should reject it with -EINVAL? Add that later if necessary.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

11 Dec, 2009

2 commits


17 Jun, 2009

2 commits


01 Jun, 2009

1 commit


07 May, 2009

1 commit


15 Apr, 2009

1 commit

  • Impact: clean up

    Create a sub directory in include/trace called events to keep the
    trace point headers in their own separate directory. Only headers that
    declare trace points should be defined in this directory.

    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Neil Horman
    Cc: Zhao Lei
    Cc: Eduard - Gabriel Munteanu
    Cc: Pekka Enberg
    Signed-off-by: Steven Rostedt

    Steven Rostedt