13 Jul, 2019

1 commit

  • Patch series "add init_on_alloc/init_on_free boot options", v10.

    Provide init_on_alloc and init_on_free boot options.

    These are aimed at preventing possible information leaks and making the
    control-flow bugs that depend on uninitialized values more deterministic.

    Enabling either of the options guarantees that the memory returned by the
    page allocator and SL[AU]B is initialized with zeroes. SLOB allocator
    isn't supported at the moment, as its emulation of kmem caches complicates
    handling of SLAB_TYPESAFE_BY_RCU caches correctly.

    Enabling init_on_free also guarantees that pages and heap objects are
    initialized right after they're freed, so it won't be possible to access
    stale data by using a dangling pointer.

    As suggested by Michal Hocko, right now we don't let heap users
    disable initialization for certain allocations. There's not enough
    evidence that doing so can speed up real-life cases, and introducing ways
    to opt-out may result in things going out of control.

    This patch (of 2):

    The new options are needed to prevent possible information leaks and make
    control-flow bugs that depend on uninitialized values more deterministic.

    This is expected to be on-by-default on Android and Chrome OS. And it
    gives the opportunity for anyone else to use it under distros too via the
    boot args. (The init_on_free feature is regularly requested by folks
    where memory forensics is included in their threat models.)

    init_on_alloc=1 makes the kernel initialize newly allocated pages and heap
    objects with zeroes. Initialization is done at allocation time at the
    places where checks for __GFP_ZERO are performed.

    init_on_free=1 makes the kernel initialize freed pages and heap objects
    with zeroes upon their deletion. This helps to ensure sensitive data
    doesn't leak via use-after-free accesses.

    Both init_on_alloc=1 and init_on_free=1 guarantee that the allocator
    returns zeroed memory. The two exceptions are slab caches with
    constructors and SLAB_TYPESAFE_BY_RCU flag. Those are never
    zero-initialized to preserve their semantics.
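
The behaviour described above can be sketched in userspace as a toy allocator. This is only an illustration under stated assumptions: `toy_alloc()`, `toy_free()` and the two global flags are hypothetical names that stand in for the allocator hooks and the boot options, and `malloc()` stands in for the page and slab allocators.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the init_on_alloc= and init_on_free= options. */
bool init_on_alloc = true;
bool init_on_free = true;

/*
 * Toy allocator: zero the object at allocation time, either because the
 * caller asked for it (zero_flag models __GFP_ZERO) or because
 * init_on_alloc is enabled. Both cases share the same check.
 */
void *toy_alloc(size_t size, bool zero_flag)
{
	void *p = malloc(size);

	if (p && (zero_flag || init_on_alloc))
		memset(p, 0, size);
	return p;
}

/*
 * Wipe the object before handing it back. The caller passes the size
 * because this toy model keeps no per-object metadata.
 */
void toy_free(void *p, size_t size)
{
	if (!p)
		return;
	if (init_on_free)
		memset(p, 0, size);
	free(p);
}
```

In a real userspace program an optimizing compiler may drop a `memset()` that directly precedes `free()`, so a wipe-on-free path would normally use something like `explicit_bzero()` where it is available.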

    Both init_on_alloc and init_on_free default to zero, but those defaults
    can be overridden with CONFIG_INIT_ON_ALLOC_DEFAULT_ON and
    CONFIG_INIT_ON_FREE_DEFAULT_ON.

    If either SLUB poisoning or page poisoning is enabled, those options take
    precedence over init_on_alloc and init_on_free: initialization is only
    applied to unpoisoned allocations.

    Slowdown for the new features compared to init_on_free=0, init_on_alloc=0:

    hackbench, init_on_free=1: +7.62% sys time (st.err 0.74%)
    hackbench, init_on_alloc=1: +7.75% sys time (st.err 2.14%)

    Linux build with -j12, init_on_free=1: +8.38% wall time (st.err 0.39%)
    Linux build with -j12, init_on_free=1: +24.42% sys time (st.err 0.52%)
    Linux build with -j12, init_on_alloc=1: -0.13% wall time (st.err 0.42%)
    Linux build with -j12, init_on_alloc=1: +0.57% sys time (st.err 0.40%)

    The slowdown for init_on_free=0, init_on_alloc=0 compared to the baseline
    is within the standard error.

    The new features are also going to pave the way for hardware memory
    tagging (e.g. arm64's MTE), which will require both on_alloc and on_free
    hooks to set the tags for heap objects. With MTE, tagging will have the
    same cost as memory initialization.

    Although init_on_free is rather costly, there are paranoid use-cases where
    in-memory data lifetime is desired to be minimized. There are various
    arguments for/against the realism of the associated threat models, but
    given that we'll need the infrastructure for MTE anyway, and there are
    people who want wipe-on-free behavior no matter what the performance cost,
    it seems reasonable to include it in this series.

    [glider@google.com: v8]
    Link: http://lkml.kernel.org/r/20190626121943.131390-2-glider@google.com
    [glider@google.com: v9]
    Link: http://lkml.kernel.org/r/20190627130316.254309-2-glider@google.com
    [glider@google.com: v10]
    Link: http://lkml.kernel.org/r/20190628093131.199499-2-glider@google.com
    Link: http://lkml.kernel.org/r/20190617151050.92663-2-glider@google.com
    Signed-off-by: Alexander Potapenko
    Acked-by: Kees Cook
    Acked-by: Michal Hocko [page and dmapool parts]
    Acked-by: James Morris
    Cc: Christoph Lameter
    Cc: Masahiro Yamada
    Cc: "Serge E. Hallyn"
    Cc: Nick Desaulniers
    Cc: Kostya Serebryany
    Cc: Dmitry Vyukov
    Cc: Sandeep Patil
    Cc: Laura Abbott
    Cc: Randy Dunlap
    Cc: Jann Horn
    Cc: Mark Rutland
    Cc: Marco Elver
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
     

05 Jun, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this software may be redistributed and or modified under the terms
    of the gnu general public license gpl version 2 as published by the
    free software foundation

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

    has been chosen to replace the boilerplate/reference in 1 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Reviewed-by: Armijn Hemel
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190531190112.039124428@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

06 Mar, 2019

1 commit

  • Many kernel-doc comments in mm/ have the return value descriptions
    either misformatted or omitted altogether, which makes the kernel-doc
    script unhappy:

    $ make V=1 htmldocs
    ...
    ./mm/util.c:36: info: Scanning doc for kstrdup
    ./mm/util.c:41: warning: No description found for return value of 'kstrdup'
    ./mm/util.c:57: info: Scanning doc for kstrdup_const
    ./mm/util.c:66: warning: No description found for return value of 'kstrdup_const'
    ./mm/util.c:75: info: Scanning doc for kstrndup
    ./mm/util.c:83: warning: No description found for return value of 'kstrndup'
    ...

    Fixing the formatting and adding the missing return value descriptions
    eliminates ~100 such warnings.

    Link: http://lkml.kernel.org/r/1549549644-4903-4-git-send-email-rppt@linux.ibm.com
    Signed-off-by: Mike Rapoport
    Reviewed-by: Andrew Morton
    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

15 Jun, 2018

1 commit

  • mm/*.c files use symbolic and octal styles for permissions.

    Using octal and not symbolic permissions is preferred by many as more
    readable.

    https://lkml.org/lkml/2016/8/2/1945

    Prefer the direct use of octal for permissions.

    Done using
    $ scripts/checkpatch.pl -f --types=SYMBOLIC_PERMS --fix-inplace mm/*.c
    and some typing.

    Before: $ git grep -P -w "0[0-7]{3,3}" mm | wc -l
    44
    After: $ git grep -P -w "0[0-7]{3,3}" mm | wc -l
    86
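
As a small illustration of why the two styles are interchangeable, the sketch below compares a symbolic permission mask with its octal spelling. It uses the POSIX constants from `<sys/stat.h>` rather than the kernel-only shorthands, and the function names are hypothetical.

```c
#include <sys/stat.h>

/* Symbolic spelling: owner read/write, group read, others read. */
unsigned int perms_symbolic(void)
{
	return S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH;
}

/* Octal spelling of the same permission bits. */
unsigned int perms_octal(void)
{
	return 0644;
}
```

Both functions return the same value; the octal form makes the three permission digits visible at a glance.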

    Miscellanea:

    o Whitespace neatening around these conversions.

    Link: http://lkml.kernel.org/r/2e032ef111eebcd4c5952bae86763b541d373469.1522102887.git.joe@perches.com
    Signed-off-by: Joe Perches
    Acked-by: David Rientjes
    Acked-by: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     

28 Feb, 2017

1 commit

  • Now that %z is standardised in C99 there is no reason to support %Z.
    Unlike %L it doesn't even make format strings smaller.
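
For reference, a minimal userspace sketch of the standard C99 `z` length modifier with `size_t`. The helper name `format_size()` is hypothetical and the example uses the C library's `snprintf()`, not the kernel's vsprintf implementation.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a size_t with the C99 "z" length modifier. Returns the number
 * of characters that snprintf() would have written, excluding the NUL.
 */
int format_size(char *buf, size_t buflen, size_t value)
{
	return snprintf(buf, buflen, "%zu", value);
}
```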

    Use BUILD_BUG_ON in a couple ATM drivers.

    In case anyone didn't notice, lib/vsprintf.o is about half of SLUB, which
    in my opinion is quite an achievement. Hopefully this patch inspires
    someone else to trim vsprintf.c more.

    Link: http://lkml.kernel.org/r/20170103230126.GA30170@avx2
    Signed-off-by: Alexey Dobriyan
    Cc: Andy Shevchenko
    Cc: Rasmus Villemoes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

25 Feb, 2017

1 commit

  • cleanup rest of dma_addr_t and phys_addr_t type casting in mm
    use %pad for dma_addr_t
    use %pa for phys_addr_t

    Link: http://lkml.kernel.org/r/1486618489-13912-1-git-send-email-miles.chen@mediatek.com
    Signed-off-by: Miles Chen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miles Chen
     

18 Mar, 2016

2 commits

  • Most of the mm subsystem uses pr_<level> so make it consistent.

    Miscellanea:

    - Realign arguments
    - Add missing newline to format
    - kmemleak-test.c has a "kmemleak: " prefix added to the
    "Kmemleak testing" logging message via pr_fmt

    Signed-off-by: Joe Perches
    Acked-by: Tejun Heo [percpu]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Kernel style prefers a single string over split strings when the string is
    'user-visible'.

    Miscellanea:

    - Add a missing newline
    - Realign arguments

    Signed-off-by: Joe Perches
    Acked-by: Tejun Heo [percpu]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     

07 Nov, 2015

1 commit

  • …d avoiding waking kswapd

    __GFP_WAIT has been used to identify atomic context in callers that hold
    spinlocks or are in interrupts. They are expected to be high priority and
    have access to one of two watermarks lower than "min" which can be referred
    to as the "atomic reserve". __GFP_HIGH users get access to the first
    lower watermark and can be called the "high priority reserve".

    Over time, callers had a requirement to not block when fallback options
    were available. Some have abused __GFP_WAIT leading to a situation where
    an optimistic allocation with a fallback option can access atomic
    reserves.

    This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
    cannot sleep and have no alternative. High priority users continue to use
    __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and
    are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM identifies
    callers that want to wake kswapd for background reclaim. __GFP_WAIT is
    redefined as a caller that is willing to enter direct reclaim and wake
    kswapd for background reclaim.

    This patch then converts a number of sites:

    o __GFP_ATOMIC is used by callers that are high priority and have memory
    pools for those requests. GFP_ATOMIC uses this flag.

    o Callers that have a limited mempool to guarantee forward progress clear
    __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
    into this category where kswapd will still be woken but atomic reserves
    are not used as there is a one-entry mempool to guarantee progress.

    o Callers that are checking if they are non-blocking should use the
    helper gfpflags_allow_blocking() where possible. This is because
    checking for __GFP_WAIT as was done historically now can trigger false
    positives. Some exceptions like dm-crypt.c exist where the code intent
    is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
    flag manipulations.

    o Callers that built their own GFP flags instead of starting with GFP_KERNEL
    and friends now also need to specify __GFP_KSWAPD_RECLAIM.

    The first key hazard to watch out for is callers that removed __GFP_WAIT
    and were depending on access to atomic reserves for inconspicuous reasons.
    In some cases it may be appropriate for them to use __GFP_HIGH.

    The second key hazard is callers that assembled their own combination of
    GFP flags instead of starting with something like GFP_KERNEL. They may
    now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless
    if it's missed in most cases as other activity will wake kswapd.
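
The flag split described above can be modelled in a few lines of userspace C. This is a sketch under stated assumptions: the `TOY_` names and bit values are hypothetical, the real GFP_KERNEL carries further bits that are left out here, and `toy_allow_blocking()` only mirrors the idea behind gfpflags_allow_blocking() of testing the direct-reclaim bit.

```c
#include <stdbool.h>

typedef unsigned int toy_gfp_t;

/* Illustrative bit values only; the kernel's real values differ. */
#define TOY_GFP_HIGH           0x01u	/* models __GFP_HIGH */
#define TOY_GFP_ATOMIC_BIT     0x02u	/* models __GFP_ATOMIC */
#define TOY_GFP_DIRECT_RECLAIM 0x04u	/* models __GFP_DIRECT_RECLAIM */
#define TOY_GFP_KSWAPD_RECLAIM 0x08u	/* models __GFP_KSWAPD_RECLAIM */

/* Atomic callers: high priority, atomic, may still wake kswapd. */
#define TOY_GFP_ATOMIC \
	(TOY_GFP_HIGH | TOY_GFP_ATOMIC_BIT | TOY_GFP_KSWAPD_RECLAIM)
/* Simplified GFP_KERNEL: may enter direct reclaim and wake kswapd. */
#define TOY_GFP_KERNEL (TOY_GFP_DIRECT_RECLAIM | TOY_GFP_KSWAPD_RECLAIM)

/* A caller may block only if it is willing to enter direct reclaim. */
bool toy_allow_blocking(toy_gfp_t flags)
{
	return (flags & TOY_GFP_DIRECT_RECLAIM) != 0;
}

/*
 * Mempool-backed callers clear direct reclaim but keep the kswapd
 * wakeup, so they do not block and do not touch atomic reserves.
 */
toy_gfp_t toy_mempool_flags(toy_gfp_t flags)
{
	return flags & ~TOY_GFP_DIRECT_RECLAIM;
}
```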

    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Vitaly Wool <vitalywool@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Mel Gorman
     

02 Oct, 2015

1 commit

  • If a DMA pool lies at the very top of the dma_addr_t range (as may
    happen with an IOMMU involved), the calculated end address of the pool
    wraps around to zero, and page lookup always fails.

    Tweak the relevant calculation to be overflow-proof.
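
The wrap-around can be reproduced with a 32-bit address type. The sketch below is a userspace illustration, not the actual dmapool code: `toy_dma_addr_t` and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t toy_dma_addr_t;	/* hypothetical 32-bit DMA address */

/*
 * Naive check: base + len wraps to a small value when the block ends
 * exactly at the top of the address range, so the lookup always fails.
 */
bool in_block_naive(toy_dma_addr_t base, uint32_t len, toy_dma_addr_t addr)
{
	return addr >= base && addr < (toy_dma_addr_t)(base + len);
}

/*
 * Overflow-proof check: compare the offset into the block against the
 * block length instead of computing an end address.
 */
bool in_block_safe(toy_dma_addr_t base, uint32_t len, toy_dma_addr_t addr)
{
	return addr >= base && (addr - base) < len;
}
```

For a block of 0x1000 bytes at 0xFFFFF000, the naive end address is 0, so no address ever compares below it; the offset-based form has no such edge case.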

    Signed-off-by: Robin Murphy
    Cc: Arnd Bergmann
    Cc: Marek Szyprowski
    Cc: Sumit Semwal
    Cc: Sakari Ailus
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robin Murphy
     

09 Sep, 2015

2 commits

  • Currently a call to dma_pool_alloc() with a ___GFP_ZERO flag returns a
    non-zeroed memory region.

    This patchset adds support for the __GFP_ZERO flag to dma_pool_alloc(),
    adds 2 wrapper functions for allocating zeroed memory from a pool, and
    provides a coccinelle script for finding & replacing instances of
    dma_pool_alloc() followed by memset(0) with a single dma_pool_zalloc()
    call.

    There was some concern that this always calls memset() to zero, instead
    of passing __GFP_ZERO into the page allocator.
    [https://lkml.org/lkml/2015/7/15/881]

    I ran a test on my system to get an idea of how often dma_pool_alloc()
    calls into pool_alloc_page().

    After Boot: [ 30.119863] alloc_calls:541, page_allocs:7
    After an hour: [ 3600.951031] alloc_calls:9566, page_allocs:12
    After copying 1GB file onto a USB drive:
    [ 4260.657148] alloc_calls:17225, page_allocs:12

    It doesn't look like dma_pool_alloc() calls down to the page allocator
    very often (at least on my system).

    This patch (of 4):

    Currently the __GFP_ZERO flag is ignored by dma_pool_alloc().
    Make dma_pool_alloc() zero the memory if this flag is set.
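
A rough userspace model of the approach: the allocation path zeroes the block itself when the zero flag is set, and the zalloc-style wrapper just ORs that flag in. All names here (`toy_pool`, `toy_pool_alloc()`, `toy_pool_zalloc()`, `TOY_GFP_ZERO`) are hypothetical, and `malloc()` stands in for carving a block out of a pool page.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_GFP_ZERO 0x1u	/* models __GFP_ZERO; the value is illustrative */

struct toy_pool {
	size_t size;		/* size of each block handed out */
};

/* Allocate one block; zero it with memset() if the caller asked for it. */
void *toy_pool_alloc(struct toy_pool *pool, unsigned int flags)
{
	void *block = malloc(pool->size);

	if (block && (flags & TOY_GFP_ZERO))
		memset(block, 0, pool->size);
	return block;
}

/* Convenience wrapper: same as toy_pool_alloc() with the zero flag set. */
void *toy_pool_zalloc(struct toy_pool *pool, unsigned int flags)
{
	return toy_pool_alloc(pool, flags | TOY_GFP_ZERO);
}
```

Zeroing in the pool layer rather than in the page allocator matters because, as the measurements above suggest, most pool allocations reuse an existing page and never reach the page allocator.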

    Signed-off-by: Sean O. Stalley
    Acked-by: David Rientjes
    Cc: Vinod Koul
    Cc: Bjorn Helgaas
    Cc: Gilles Muller
    Cc: Nicolas Palix
    Cc: Michal Marek
    Cc: Sebastian Andrzej Siewior
    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sean O. Stalley
     
  • dma_pool_destroy() does not tolerate a NULL dma_pool pointer argument and
    performs a NULL-pointer dereference. This requires additional attention
    and effort from developers/reviewers and forces all dma_pool_destroy()
    callers to do a NULL check:

        if (pool)
                dma_pool_destroy(pool);

    Otherwise, they are invalid dma_pool_destroy() users.

    Tweak dma_pool_destroy() and NULL-check the pointer there.
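
The pattern is the same one `free()` follows in userspace. A minimal sketch, with a hypothetical `toy_pool` type standing in for the real pool structure:

```c
#include <stdlib.h>

struct toy_pool {
	void *pages;		/* stand-in for the pool's backing memory */
};

/*
 * Destroy a pool. A NULL pointer is accepted and ignored, so callers
 * can call this unconditionally on their cleanup paths.
 */
void toy_pool_destroy(struct toy_pool *pool)
{
	if (!pool)
		return;
	free(pool->pages);
	free(pool);
}
```

With the check inside the destructor, error paths that may or may not have created the pool need no conditional of their own.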

    Proposed by Andrew Morton.

    Link: https://lkml.org/lkml/2015/6/8/583
    Signed-off-by: Sergey Senozhatsky
    Acked-by: David Rientjes
    Cc: Julia Lawall
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sergey Senozhatsky
     

05 Sep, 2015

1 commit


10 Oct, 2014

2 commits

  • Remove 3 brace coding style for any arm of this statement

    Signed-off-by: Paul McQuade
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul McQuade
     
  • cat /sys/.../pools followed by removal of the device leads to:

    |======================================================
    |[ INFO: possible circular locking dependency detected ]
    |3.17.0-rc4+ #1498 Not tainted
    |-------------------------------------------------------
    |rmmod/2505 is trying to acquire lock:
    | (s_active#28){++++.+}, at: [] kernfs_remove_by_name_ns+0x3c/0x88
    |
    |but task is already holding lock:
    | (pools_lock){+.+.+.}, at: [] dma_pool_destroy+0x18/0x17c
    |
    |which lock already depends on the new lock.
    |the existing dependency chain (in reverse order) is:
    |
    |-> #1 (pools_lock){+.+.+.}:
    | [] show_pools+0x30/0xf8
    | [] dev_attr_show+0x1c/0x48
    | [] sysfs_kf_seq_show+0x88/0x10c
    | [] kernfs_seq_show+0x24/0x28
    | [] seq_read+0x1b8/0x480
    | [] vfs_read+0x8c/0x148
    | [] SyS_read+0x40/0x8c
    | [] ret_fast_syscall+0x0/0x48
    |
    |-> #0 (s_active#28){++++.+}:
    | [] __kernfs_remove+0x258/0x2ec
    | [] kernfs_remove_by_name_ns+0x3c/0x88
    | [] dma_pool_destroy+0x148/0x17c
    | [] hcd_buffer_destroy+0x20/0x34
    | [] usb_remove_hcd+0x110/0x1a4

    The problem is the lock order of pools_lock and kernfs_mutex in
    dma_pool_destroy() vs show_pools() call path.

    This patch breaks out the creation of the sysfs file outside of the
    pools_lock mutex. The newly added pools_reg_lock ensures that there is no
    race between the create and destroy code paths in terms of whether or not
    the sysfs file has to be deleted (and whether it was deleted before we try
    to create a new one) and what to do if device_create_file() failed.

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sebastian Andrzej Siewior
     

19 Sep, 2014

1 commit

  • dma_pool_create() needs to unlock the mutex in the error case. The bug
    was introduced in 3.16 by commit cc6b664aa26d ("mm/dmapool.c: remove
    redundant NULL check for dev in dma_pool_create()").
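
The class of bug is an early return that skips the unlock. A common way to avoid it is a single exit label, sketched below in userspace C. Everything here is hypothetical: `toy_mutex` is a plain flag standing in for a real mutex, and `toy_register()` is not the actual dma_pool_create() code.

```c
#include <errno.h>

/* Toy mutex: a flag that records whether the lock is currently held. */
struct toy_mutex {
	int locked;
};

struct toy_mutex toy_pools_lock;

void toy_mutex_lock(struct toy_mutex *m)   { m->locked = 1; }
void toy_mutex_unlock(struct toy_mutex *m) { m->locked = 0; }

/*
 * Register something under the lock. Both the success path and the
 * failure path leave through the same label, so the lock is always
 * released before returning.
 */
int toy_register(int simulate_failure)
{
	int err = 0;

	toy_mutex_lock(&toy_pools_lock);
	if (simulate_failure) {
		err = -ENOMEM;
		goto out_unlock;
	}
	/* ... work that needs the lock would go here ... */
out_unlock:
	toy_mutex_unlock(&toy_pools_lock);
	return err;
}
```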

    Signed-off-by: Krzysztof Hałasa
    Cc: stable@vger.kernel.org # v3.16
    Signed-off-by: Linus Torvalds

    Krzysztof Hałasa
     

05 Jun, 2014

2 commits


05 May, 2014

1 commit


12 Dec, 2012

1 commit


11 Dec, 2012

1 commit

  • dmapool always calls dma_alloc_coherent() with the GFP_ATOMIC flag,
    regardless of the flags provided by the caller. This causes excessive
    pruning of emergency memory pools without any good reason. Additionally,
    on ARM architecture any driver which is using dmapools will sooner or
    later trigger the following error:
    "ERROR: 256 KiB atomic DMA coherent pool is too small!
    Please increase it with coherent_pool= kernel parameter!".
    Increasing the coherent pool size usually doesn't help much and only
    delays such error, because all GFP_ATOMIC DMA allocations are always
    served from the special, very limited memory pool.

    This patch changes the dmapool code to correctly use gfp flags provided
    by the dmapool caller.

    Reported-by: Soeren Moch
    Reported-by: Thomas Petazzoni
    Signed-off-by: Marek Szyprowski
    Tested-by: Andrew Lunn
    Tested-by: Soeren Moch
    Cc: stable@vger.kernel.org

    Marek Szyprowski
     

31 Oct, 2011

2 commits


26 Jul, 2011

1 commit

  • devres uses the pointer value as key after it's freed, which is safe but
    triggers spurious use-after-free warnings on some static analysis tools.
    Rearrange code to avoid such warnings.

    Signed-off-by: Maxin B. John
    Reviewed-by: Rolf Eike Beer
    Acked-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Maxin B John
     

14 Jan, 2011

2 commits

  • As it stands this code will degenerate into a busy-wait if the calling task
    has signal_pending().

    Cc: Rolf Eike Beer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • dma_pool_free() scans for the page to free in the pool list holding the
    pool lock. Then it releases the lock basically to acquire it immediately
    again. Modify the code to only take the lock once.

    This will do some additional loops and computations with the lock held
    if memory debugging is activated. If it is not activated, the only new
    operations with this lock held are one if and one subtraction.

    Signed-off-by: Rolf Eike Beer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rolf Eike Beer
     

27 Oct, 2010

1 commit

  • Buggy drivers (e.g. fsl_udc) could call dma_pool_alloc from atomic
    context with GFP_KERNEL. In most instances, the first pool_alloc_page
    call would succeed and the sleeping functions would never be called. This
    allowed the buggy drivers to slip through the cracks.

    Add a might_sleep_if() checking for __GFP_WAIT in flags.

    Signed-off-by: Dima Zavin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dima Zavin
     

01 Jul, 2009

1 commit


28 Apr, 2008

1 commit

  • Previously it was only enabled for CONFIG_DEBUG_SLAB.

    Not hooked into the slub runtime debug configuration, so you currently only
    get it with CONFIG_SLUB_DEBUG_ON, not plain CONFIG_SLUB_DEBUG

    Acked-by: Matthew Wilcox
    Signed-off-by: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andi Kleen
     

04 Dec, 2007

7 commits