26 Mar, 2017

1 commit

  • Since commit 383776fa7527 ("locking/lockdep: Handle statically initialized
    PER_CPU locks properly") we try to collapse per-cpu locks into a single
    class by giving them all the same key. For this key we choose the canonical
    address of the per-cpu object, which would be the offset into the per-cpu
    area.

    This has two problems:

    - there is a case where we run !0 lock->key through static_obj() and
    expect this to pass; it doesn't for canonical pointers.

    - 0 is a valid canonical address.

    Cure both issues by redefining the canonical address as the address of the
    per-cpu variable on the boot CPU.

    Since I didn't want to rely on CPU0 being the boot-cpu, or even existing at
    all, track the boot CPU in a variable.
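
    A minimal sketch of the idea (not the upstream hunk): the per-cpu offset
    that used to serve as the key is rebased onto the boot CPU's copy of the
    variable, so the key is a real kernel address, is never 0, and passes
    static_obj(). The helper name and the boot-CPU variable below are
    illustrative only.

        #include <linux/percpu.h>

        /* Illustrative sketch -- names are not the upstream ones. */
        static unsigned int lockdep_boot_cpu;   /* recorded once during early boot */

        static const void *canonical_percpu_key(unsigned long pcpu_offset)
        {
                /* address of the variable in the boot CPU's per-cpu area */
                return per_cpu_ptr((void *)pcpu_offset, lockdep_boot_cpu);
        }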

    Fixes: 383776fa7527 ("locking/lockdep: Handle statically initialized PER_CPU locks properly")
    Reported-by: kernel test robot
    Signed-off-by: Peter Zijlstra (Intel)
    Tested-by: Borislav Petkov
    Cc: Sebastian Andrzej Siewior
    Cc: linux-mm@kvack.org
    Cc: wfg@linux.intel.com
    Cc: kernel test robot
    Cc: LKP
    Link: http://lkml.kernel.org/r/20170320114108.kbvcsuepem45j5cr@hirez.programming.kicks-ass.net
    Signed-off-by: Thomas Gleixner

    Peter Zijlstra
     

16 Mar, 2017

1 commit

  • If a PER_CPU struct which contains a spin_lock is statically initialized
    via:

    DEFINE_PER_CPU(struct foo, bla) = {
            .lock = __SPIN_LOCK_UNLOCKED(bla.lock)
    };

    then lockdep assigns a separate key to each lock because the logic for
    assigning a key to statically initialized locks is to use the address as
    the key. With per CPU locks the address is obviously different on each CPU.

    That's wrong, because all locks should have the same key.

    To solve this the following modifications are required:

    1) Extend the is_kernel/module_percpu_addr() functions to hand back the
    canonical address of the per CPU address, i.e. the per CPU address
    minus the per CPU offset.

    2) Check the lock address with these functions and if the per CPU check
    matches use the returned canonical address as the lock key, so all per
    CPU locks have the same key.

    3) Move the static_obj(key) check into look_up_lock_class() so this check
    can be avoided for statically initialized per CPU locks. That's
    required because the canonical address fails the static_obj(key) check
    for obvious reasons.
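
    The sketch below models steps 1) and 2): find which CPU's static per-cpu
    area the lock address falls into and strip that CPU's offset, so every
    copy of a statically initialized per-cpu lock collapses to one canonical
    value. The helper is illustrative, not the extended kernel function.

        #include <linux/percpu.h>
        #include <asm/sections.h>       /* __per_cpu_start, __per_cpu_end */

        static bool percpu_lock_canonical(unsigned long addr, unsigned long *canonical)
        {
                unsigned int cpu;

                for_each_possible_cpu(cpu) {
                        unsigned long start = (unsigned long)__per_cpu_start + per_cpu_offset(cpu);
                        unsigned long end   = (unsigned long)__per_cpu_end   + per_cpu_offset(cpu);

                        if (addr >= start && addr < end) {
                                *canonical = addr - per_cpu_offset(cpu);  /* same on every CPU */
                                return true;
                        }
                }
                return false;   /* not a static per-cpu address: use the normal key */
        }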

    Reported-by: Mike Galbraith
    Signed-off-by: Thomas Gleixner
    [ Merged Dan's fixups for !MODULES and !SMP into this patch. ]
    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andrew Morton
    Cc: Dan Murphy
    Cc: Linus Torvalds
    Cc: Paul E. McKenney
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20170227143736.pectaimkjkan5kow@linutronix.de
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     

07 Mar, 2017

1 commit

  • The update of pcpu_nr_empty_pop_pages in pcpu_alloc() is currently done
    without holding pcpu_lock, which can lead to lost or inconsistent updates
    of the counter. Add the missing lock calls (see the sketch below).
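
    A minimal sketch of the fix pattern, assuming nothing beyond what the
    message states: writes to pcpu_nr_empty_pop_pages must happen under
    pcpu_lock, the spinlock that already guards the allocator's index state.
    The helper name is invented for illustration.

        /* Illustrative helper, not the actual hunk in pcpu_alloc(). */
        static void pcpu_adjust_empty_pop_pages(int nr)
        {
                unsigned long flags;

                spin_lock_irqsave(&pcpu_lock, flags);   /* was updated lockless before */
                pcpu_nr_empty_pop_pages += nr;
                spin_unlock_irqrestore(&pcpu_lock, flags);
        }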

    Fixes: b539b87fed37 ("percpu: implmeent pcpu_nr_empty_pop_pages and chunk->nr_populated")
    Signed-off-by: Tahsin Erdogan
    Signed-off-by: Tejun Heo
    Cc: stable@vger.kernel.org # v3.18+

    Tahsin Erdogan
     

28 Feb, 2017

1 commit

  • Fix typos and add the following to the scripts/spelling.txt:

    followings||following

    While we are here, add a missing colon in the boilerplate in DT binding
    documents. The "you SoC" in allwinner,sunxi-pinctrl.txt was fixed as
    well.

    I reworded "as the followings:" to "as follows:" for
    drivers/usb/gadget/udc/renesas_usb3.c.

    Link: http://lkml.kernel.org/r/1481573103-11329-32-git-send-email-yamada.masahiro@socionext.com
    Signed-off-by: Masahiro Yamada
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Masahiro Yamada
     


13 Dec, 2016

1 commit

  • As shown by pcpu_build_alloc_info(), the number of units within a percpu
    group is derived by rounding the number of CPUs within the group up to
    the @upa boundary. Therefore, the number of CPUs is normally not equal to
    the number of units unless it happens to be aligned to @upa. However,
    pcpu_page_first_chunk() uses BUG_ON() to assert that the two numbers are
    equal, so the BUG_ON() can trigger a spurious panic.

    Fix this by rounding the number of CPUs up before comparing it with the
    number of units, and by replacing the BUG_ON() with a warning plus an
    error return, to keep the system alive as much as possible (see the
    worked example below).
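
    A small userspace model of the arithmetic (the numbers are made up): with
    6 CPUs in a group and @upa == 4 the group carries 8 units, so an exact
    "units == CPUs" assertion fails on a perfectly healthy configuration;
    comparing against the rounded-up CPU count and warning instead keeps the
    machine up.

        #include <stdio.h>

        /* round n up to the next multiple of boundary, as pcpu_build_alloc_info() does */
        static unsigned int round_up_to(unsigned int n, unsigned int boundary)
        {
                return ((n + boundary - 1) / boundary) * boundary;
        }

        int main(void)
        {
                unsigned int nr_cpus = 6, upa = 4;
                unsigned int nr_units = round_up_to(nr_cpus, upa);      /* 8 */

                /* old check: BUG_ON(nr_units != nr_cpus) would panic on 8 != 6 */
                if (nr_units != round_up_to(nr_cpus, upa))
                        fprintf(stderr, "percpu: cpu/unit mismatch\n"); /* warn + error, no panic */
                else
                        printf("%u CPUs -> %u units with upa=%u\n", nr_cpus, upa, nr_units);
                return 0;
        }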

    Link: http://lkml.kernel.org/r/57FCF07C.2020103@zoho.com
    Signed-off-by: zijun_hu
    Cc: Tejun Heo
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    zijun_hu
     

20 Oct, 2016

1 commit

  • The percpu allocator assumes, as expected, that the requested alignment
    is a power of two, but it has not been verifying the input. If the
    specified alignment isn't a power of two, the allocator can malfunction.
    Add the sanity check.

    The following is a detailed analysis of the effects of alignments that
    aren't powers of two.

    The alignment must at least be even, since the LSB of a chunk->map
    element is used as the free/in-use flag of an area; furthermore, it must
    be a power of two, because pcpu_fit_in_area() relies on ALIGN(), which
    only behaves correctly for power-of-two alignments. In other words, the
    current allocator only works correctly for power-of-two aligned area
    allocations.

    See the counterexample below for why an odd alignment doesn't work.
    Assume area [16, 36) is free but the previous area is in use, and we want
    to allocate an area with @size == 8 and @align == 7. The larger area
    [16, 36) ends up split into three areas [16, 21), [21, 29) and [29, 36).
    However, because of how a chunk->map element is used, the actual offset
    of the target area [21, 29) is 21 but it is recorded in the relevant
    element as 20; moreover, the residual tail free area [29, 36) is mistaken
    as in-use and is silently lost.

    Unlike the roundup() macro, ALIGN(x, a) doesn't work if @a isn't a power
    of two; for example, roundup(10, 6) == 12 but ALIGN(10, 6) == 10, and the
    latter is obviously not the desired result (see the demo below).
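
    A short userspace demo of the last paragraph (macros written in the
    kernel's style): ALIGN() only matches roundup() when the alignment is a
    power of two.

        #include <stdio.h>

        #define ALIGN(x, a)     (((x) + (a) - 1) & ~((a) - 1))  /* power-of-two only */
        #define roundup(x, y)   ((((x) + (y) - 1) / (y)) * (y))

        int main(void)
        {
                printf("roundup(10, 6) = %d, ALIGN(10, 6) = %d\n",
                       roundup(10, 6), ALIGN(10, 6));   /* 12 vs 10 */
                printf("roundup(10, 8) = %d, ALIGN(10, 8) = %d\n",
                       roundup(10, 8), ALIGN(10, 8));   /* 16 vs 16 */
                return 0;
        }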

    tj: Code style and patch description updates.

    Signed-off-by: zijun_hu
    Suggested-by: Tejun Heo
    Signed-off-by: Tejun Heo

    zijun_hu
     

05 Oct, 2016

2 commits

  • In order to ensure that the percpu group areas within a chunk aren't
    distributed too sparsely, pcpu_embed_first_chunk() takes the error
    handling path when a chunk spans more than 3/4 of the VMALLOC area.
    However, during that error handling it forgets to free the memory
    allocated for the percpu groups, because it jumps to label @out_free
    instead of @out_free_areas.

    This causes a memory leak if that rare situation actually occurs. To fix
    it, check the spanned area immediately after the memory for all percpu
    groups has been allocated, and jump to label @out_free_areas to free that
    memory before returning if the check fails (sketched below).

    To verify the approach, all allocated memory was dumped, the jump was
    forced, and all freed memory was dumped; the result confirms that
    everything allocated in this function is freed again.

    BTW, the approach was chosen after considering the following alternatives:
    - we don't jump to label @out_free directly, since that could free
    several allocated memory blocks twice
    - the purpose of the jump after pcpu_setup_first_chunk() is to skip
    freeing memory that is still in use, not to handle errors; moreover, the
    function never returns an error code, it either panics due to BUG_ON()
    or returns 0.
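
    A compact userspace model of the control flow described above (the label
    names mirror the text, everything else is illustrative): once the group
    areas exist, every failure must leave through @out_free_areas; only the
    success path may skip to @out_free.

        #include <stdlib.h>

        int embed_first_chunk_model(size_t nr_groups, size_t spanned, size_t vmalloc_total)
        {
                void **areas = calloc(nr_groups, sizeof(*areas));
                size_t i;
                int rc = -1;

                if (!areas)
                        return rc;

                for (i = 0; i < nr_groups; i++) {
                        areas[i] = malloc(4096);        /* one area per percpu group */
                        if (!areas[i])
                                goto out_free_areas;
                }

                /* check the spanned range right after the allocations ... */
                if (spanned > vmalloc_total / 4 * 3)
                        goto out_free_areas;            /* ... so the areas are not leaked */

                rc = 0;                 /* in the real code the areas are handed off here */
                goto out_free;

        out_free_areas:
                for (i = 0; i < nr_groups; i++)
                        free(areas[i]);                 /* free(NULL) is harmless */
        out_free:
                free(areas);
                return rc;
        }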

    Signed-off-by: zijun_hu
    Tested-by: zijun_hu
    Signed-off-by: Tejun Heo

    zijun_hu
     
  • pcpu_embed_first_chunk() calculates the range a percpu chunk spans into
    @max_distance and uses it to ensure that a chunk is not too big compared
    to the total vmalloc area. However, during the calculation it used an
    incorrect top address, obtained by adding a single unit size to the
    highest group's base address.

    This can make the calculated max_distance slightly smaller than the
    actual distance, although given the scale of the values involved the
    error is very unlikely to have an actual impact.

    Fix this by adding the group's size instead of a unit size (see the
    numeric sketch below).

    The type of @max_distance is also changed from size_t to unsigned long,
    for the following reasons:
    - unsigned long usually has the same width as the CPU's registers and
    fits this use well
    - it makes the type of @max_distance consistent with the operands it is
    calculated against, such as @ai->groups[i].base_offset and the macro
    VMALLOC_TOTAL
    - unsigned long is more universal than size_t, which is typedef'd to
    unsigned int or unsigned long depending on the architecture
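
    A numeric sketch with made-up sizes (none of these values are from the
    patch): adding one unit's size to the highest group's base underestimates
    the span whenever that group holds more than one unit.

        #include <stdio.h>

        int main(void)
        {
                unsigned long base_offset = 1UL << 20;  /* highest group's base_offset */
                unsigned long unit_size   = 64UL << 10; /* size of one unit            */
                unsigned long nr_units    = 4;          /* units in the highest group  */

                unsigned long old_top = base_offset + unit_size;             /* old code */
                unsigned long new_top = base_offset + nr_units * unit_size;  /* fixed    */

                printf("old top: %lu, new top: %lu\n", old_top, new_top);
                return 0;
        }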

    Signed-off-by: zijun_hu
    Signed-off-by: Tejun Heo

    zijun_hu
     

25 May, 2016

2 commits

  • For non-atomic allocations, pcpu_alloc() can try to extend the area
    map synchronously after dropping pcpu_lock; however, the extension
    wasn't synchronized against chunk destruction and the chunk might get
    freed while extension is in progress.

    This patch fixes the bug by putting most of non-atomic allocations
    under pcpu_alloc_mutex to synchronize against pcpu_balance_work which
    is responsible for async chunk management including destruction.

    Signed-off-by: Tejun Heo
    Reported-and-tested-by: Alexei Starovoitov
    Reported-by: Vlastimil Babka
    Reported-by: Sasha Levin
    Cc: stable@vger.kernel.org # v3.18+
    Fixes: 1a4d76076cda ("percpu: implement asynchronous chunk population")

    Tejun Heo
     
  • Atomic allocations can trigger async map extensions which is serviced
    by chunk->map_extend_work. pcpu_balance_work which is responsible for
    destroying idle chunks wasn't synchronizing properly against
    chunk->map_extend_work and may end up freeing the chunk while the work
    item is still in flight.

    This patch fixes the bug by rolling async map extension operations
    into pcpu_balance_work.

    Signed-off-by: Tejun Heo
    Reported-and-tested-by: Alexei Starovoitov
    Reported-by: Vlastimil Babka
    Reported-by: Sasha Levin
    Cc: stable@vger.kernel.org # v3.18+
    Fixes: 9c824b6a172c ("percpu: make sure chunk->map array has available space")

    Tejun Heo
     

18 Mar, 2016

4 commits

  • Use the normal pr_fmt mechanism to make the logging output consistently
    "percpu:" instead of a mix of "PERCPU:" and "percpu:".

    Signed-off-by: Joe Perches
    Acked-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Most of the mm subsystem uses pr_<level> so make it consistent.

    Miscellanea:

    - Realign arguments
    - Add missing newline to format
    - kmemleak-test.c has a "kmemleak: " prefix added to the
    "Kmemleak testing" logging message via pr_fmt

    Signed-off-by: Joe Perches
    Acked-by: Tejun Heo [percpu]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Kernel style prefers a single string over split strings when the string is
    'user-visible'.

    Miscellanea:

    - Add a missing newline
    - Realign arguments
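
    A small before/after sketch of that rule (the message text is invented):
    keeping the user-visible string on one line keeps it greppable even when
    it exceeds 80 columns.

        #include <linux/printk.h>
        #include <linux/types.h>

        static void warn_alloc_failure(size_t size, size_t align)
        {
                /* old, split string -- hard to grep for:
                 *   pr_warn("percpu: allocation failed, "
                 *           "size=%zu align=%zu\n", size, align);
                 */

                /* preferred: a single user-visible string, arguments realigned */
                pr_warn("percpu: allocation failed, size=%zu align=%zu\n", size, align);
        }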

    Signed-off-by: Joe Perches
    Acked-by: Tejun Heo [percpu]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • There are a mixture of pr_warning and pr_warn uses in mm. Use pr_warn
    consistently.

    Miscellanea:

    - Coalesce formats
    - Realign arguments

    Signed-off-by: Joe Perches
    Acked-by: Tejun Heo [percpu]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     

23 Jan, 2016

1 commit

  • There are many locations that do

    if (memory_was_allocated_by_vmalloc)
            vfree(ptr);
    else
            kfree(ptr);

    but kvfree() can handle both kmalloc()ed memory and vmalloc()ed memory
    using is_vmalloc_addr(). Unless callers have special reasons, we can
    replace this branch with kvfree(). Please check and reply if you found
    problems.
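
    A sketch of what a converted call site looks like (the wrapper is
    illustrative): kvfree() performs the is_vmalloc_addr() dispatch itself.

        #include <linux/mm.h>   /* kvfree() */

        static void release_buffer(void *ptr)
        {
                /* replaces: if (is_vmalloc_addr(ptr)) vfree(ptr); else kfree(ptr); */
                kvfree(ptr);
        }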

    Signed-off-by: Tetsuo Handa
    Acked-by: Michal Hocko
    Acked-by: Jan Kara
    Acked-by: Russell King
    Reviewed-by: Andreas Dilger
    Acked-by: "Rafael J. Wysocki"
    Acked-by: David Rientjes
    Cc: "Luck, Tony"
    Cc: Oleg Drokin
    Cc: Boris Petkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     


25 Jun, 2015

1 commit

  • Beginning at commit d52d3997f843 ("ipv6: Create percpu rt6_info"), the
    following INFO splat is logged:

    ===============================
    [ INFO: suspicious RCU usage. ]
    4.1.0-rc7-next-20150612 #1 Not tainted
    -------------------------------
    kernel/sched/core.c:7318 Illegal context switch in RCU-bh read-side critical section!
    other info that might help us debug this:
    rcu_scheduler_active = 1, debug_locks = 0
    3 locks held by systemd/1:
    #0: (rtnl_mutex){+.+.+.}, at: [] rtnetlink_rcv+0x1f/0x40
    #1: (rcu_read_lock_bh){......}, at: [] ipv6_add_addr+0x62/0x540
    #2: (addrconf_hash_lock){+...+.}, at: [] ipv6_add_addr+0x184/0x540
    stack backtrace:
    CPU: 0 PID: 1 Comm: systemd Not tainted 4.1.0-rc7-next-20150612 #1
    Hardware name: TOSHIBA TECRA A50-A/TECRA A50-A, BIOS Version 4.20 04/17/2014
    Call Trace:
    dump_stack+0x4c/0x6e
    lockdep_rcu_suspicious+0xe7/0x120
    ___might_sleep+0x1d5/0x1f0
    __might_sleep+0x4d/0x90
    kmem_cache_alloc+0x47/0x250
    create_object+0x39/0x2e0
    kmemleak_alloc_percpu+0x61/0xe0
    pcpu_alloc+0x370/0x630

    Additional backtrace lines are truncated. In addition, the above splat
    is followed by several "BUG: sleeping function called from invalid
    context at mm/slub.c:1268" outputs. As suggested by Martin KaFai Lau,
    these are the clue to the fix. Routine kmemleak_alloc_percpu() always
    uses GFP_KERNEL for its allocations, whereas it should follow the gfp
    from its callers.
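
    A hedged sketch of the caller side of the fix: the gfp given to
    pcpu_alloc() is forwarded to the leak tracker instead of letting the
    tracking allocation fall back to a sleeping GFP_KERNEL allocation. The
    wrapper name is invented; kmemleak_alloc_percpu() taking a gfp argument
    reflects the interface after this change.

        #include <linux/kmemleak.h>
        #include <linux/gfp.h>

        /* Illustrative wrapper, not the literal hunk in pcpu_alloc(). */
        static void note_percpu_alloc(void __percpu *ptr, size_t size, gfp_t gfp)
        {
                /* before the fix the tracker always allocated with GFP_KERNEL */
                kmemleak_alloc_percpu(ptr, size, gfp);
        }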

    Reviewed-by: Catalin Marinas
    Reviewed-by: Kamalesh Babulal
    Acked-by: Martin KaFai Lau
    Signed-off-by: Larry Finger
    Cc: Martin KaFai Lau
    Cc: Catalin Marinas
    Cc: Tejun Heo
    Cc: Christoph Lameter
    Cc: [3.18+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Larry Finger
     


14 Feb, 2015

1 commit

  • printk and friends can now format bitmaps using '%*pb[l]'. cpumask
    and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
    respectively which can be used to generate the two printf arguments
    necessary to format the specified cpu/nodemask.

    Signed-off-by: Tejun Heo
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     


09 Oct, 2014

1 commit

  • When @gfp is specified, the percpu allocator is interested in whether
    it contains all of GFP_KERNEL or not. If it does, the normal
    allocation path is taken; otherwise, the atomic allocation path.
    Unfortunately, pcpu_alloc() was incorrectly testing for whether @gfp
    contains any part of GFP_KERNEL.

    Fix it by testing "(gfp & GFP_KERNEL) != GFP_KERNEL" instead of
    "!(gfp & GFP_KERNEL)" to decide whether the allocation should be
    atomic or not.
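
    A userspace model of the predicate (the flag values are invented, only
    the subset relationship matters): a mask such as GFP_NOFS shares bits
    with GFP_KERNEL, so the old "any bit set" test wrongly treated it as a
    full GFP_KERNEL context.

        #include <stdio.h>

        #define __GFP_RECLAIM   0x1u    /* invented values for the demo */
        #define __GFP_IO        0x2u
        #define __GFP_FS        0x4u
        #define GFP_KERNEL      (__GFP_RECLAIM | __GFP_IO | __GFP_FS)
        #define GFP_NOFS        (__GFP_RECLAIM | __GFP_IO)

        int main(void)
        {
                unsigned int gfp = GFP_NOFS;

                printf("old test says GFP_KERNEL context: %d\n",
                       !!(gfp & GFP_KERNEL));                   /* 1 -- wrong  */
                printf("new test says GFP_KERNEL context: %d\n",
                       (gfp & GFP_KERNEL) == GFP_KERNEL);       /* 0 -- right */
                return 0;
        }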

    Signed-off-by: Tejun Heo

    Tejun Heo
     

22 Sep, 2014

1 commit

  • This reverts commit 3189eddbcafc ("percpu: free percpu allocation info for
    uniprocessor system").

    The commit causes a hang with a crisv32 image. This may be an architecture
    problem, but at least for now the revert is necessary to be able to boot a
    crisv32 image.

    Cc: Tejun Heo
    Cc: Honggang Li
    Signed-off-by: Guenter Roeck
    Signed-off-by: Tejun Heo
    Fixes: 3189eddbcafc ("percpu: free percpu allocation info for uniprocessor system")
    Cc: stable@vger.kernel.org # Please don't apply 3189eddbcafc

    Guenter Roeck
     


03 Sep, 2014

10 commits

  • The percpu allocator now supports atomic allocations by only
    allocating from already populated areas but the mechanism to ensure
    that there's adequate amount of populated areas was missing.

    This patch expands pcpu_balance_work so that in addition to freeing
    excess free chunks it also populates chunks to maintain an adequate
    level of populated areas. pcpu_alloc() schedules pcpu_balance_work if
    the amount of free populated areas is too low or after an atomic
    allocation failure.

    * PERCPU_DYNAMIC_RESERVE is increased by two pages to account for
    PCPU_EMPTY_POP_PAGES_LOW.

    * pcpu_async_enabled is added to gate both async jobs -
    chunk->map_extend_work and pcpu_balance_work - so that we don't end
    up scheduling them while the needed subsystems aren't up yet.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • pcpu_reclaim_work will also be used to populate chunks asynchronously.
    Rename it to pcpu_balance_work in preparation. pcpu_reclaim() is
    renamed to pcpu_balance_workfn() and some of its local variables are
    renamed too.

    This is a pure rename.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • pcpu_nr_empty_pop_pages counts the number of empty populated pages
    across all chunks and chunk->nr_populated counts the number of
    populated pages in a chunk. Both will be used to implement pre/async
    population for atomic allocations.

    pcpu_chunk_[de]populated() are added to update chunk->populated,
    chunk->nr_populated and pcpu_nr_empty_pop_pages together. All
    successful chunk [de]populations should be followed by the
    corresponding pcpu_chunk_[de]populated() calls.
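
    A sketch of what the populated-side helper keeps consistent (the field
    and counter names come from the text above; the body is illustrative,
    not the exact mm/percpu.c function):

        static void pcpu_chunk_populated(struct pcpu_chunk *chunk,
                                         int page_start, int page_end)
        {
                int nr = page_end - page_start;

                lockdep_assert_held(&pcpu_lock);

                bitmap_set(chunk->populated, page_start, nr);   /* chunk->populated        */
                chunk->nr_populated += nr;                      /* chunk->nr_populated     */
                pcpu_nr_empty_pop_pages += nr;                  /* global empty-page count */
        }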

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • An allocation attempt may require extending chunk->map array which
    requires GFP_KERNEL context which isn't available for atomic
    allocations. This patch ensures that chunk->map array usually keeps
    some amount of available space by directly allocating buffer space
    during GFP_KERNEL allocations and scheduling async extension during
    atomic ones. This should make atomic allocation failures from map
    space exhaustion rare.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • Now that pcpu_alloc_area() can allocate only from populated areas,
    it's easy to add atomic allocation support to [__]alloc_percpu().
    Update pcpu_alloc() so that it accepts @gfp and skips all the blocking
    operations and allocates only from the populated areas if @gfp doesn't
    contain GFP_KERNEL. New interface functions [__]alloc_percpu_gfp()
    are added.

    While this means that atomic allocations are possible, this isn't
    complete yet as there's no mechanism to ensure that certain amount of
    populated areas is kept available and atomic allocations may keep
    failing under certain conditions.
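
    A usage sketch of the new interface (the struct and function are
    examples, not from the patch): passing a gfp that does not contain
    GFP_KERNEL selects the atomic path, which only carves out of
    already-populated areas and may fail instead of blocking.

        #include <linux/percpu.h>
        #include <linux/gfp.h>
        #include <linux/errno.h>
        #include <linux/types.h>

        struct foo_stats {
                u64 hits;
                u64 misses;
        };

        static struct foo_stats __percpu *foo_stats;

        static int foo_stats_init_atomic(void)
        {
                /* GFP_NOWAIT lacks GFP_KERNEL -> no blocking, allocation may fail */
                foo_stats = alloc_percpu_gfp(struct foo_stats, GFP_NOWAIT);
                return foo_stats ? 0 : -ENOMEM;
        }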

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • The next patch will conditionalize the population block in
    pcpu_alloc() which will end up making a rather large indentation
    change obfuscating the actual logic change. This patch puts the block
    under "if (true)" so that the next patch can avoid indentation
    changes. The definitions of the local variables which are used only in
    the block are moved into the block.

    This patch is purely cosmetic.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • Update pcpu_alloc_area() so that it can skip unpopulated areas if the
    new parameter @pop_only is true. This is implemented by a new
    function, pcpu_fit_in_area(), which determines the amount of head
    padding considering the alignment and populated state.

    @pop_only is currently always false but this will be used to implement
    atomic allocation.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • At first, the percpu allocator required a sleepable context for both
    alloc and free paths and used pcpu_alloc_mutex to protect everything.
    Later, pcpu_lock was introduced to protect the index data structure so
    that the free path can be invoked from atomic contexts. The
    conversion only updated what's necessary and left most of the
    allocation path under pcpu_alloc_mutex.

    The percpu allocator is planned to add support for atomic allocation
    and this patch restructures locking so that the coverage of
    pcpu_alloc_mutex is further reduced.

    * pcpu_alloc() now grabs pcpu_alloc_mutex only while creating a new
    chunk and populating the allocated area. Everything else is now
    protected solely by pcpu_lock.

    After this change, multiple instances of pcpu_extend_area_map() may
    race but the function already implements sufficient synchronization
    using pcpu_lock.

    This also allows multiple allocators to arrive at new chunk
    creation. To avoid creating multiple empty chunks back-to-back, a
    new chunk is created iff there is no other empty chunk after
    grabbing pcpu_alloc_mutex.

    * pcpu_lock is now held while modifying chunk->populated bitmap.
    After this, all data structures are protected by pcpu_lock.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • Previously, pcpu_[de]populate_chunk() were called with the range which
    may contain multiple target regions in it and
    pcpu_[de]populate_chunk() iterated over the regions. This has the
    benefit of batching up cache flushes for all the regions; however,
    we're planning to add more bookkeeping logic around [de]population to
    support atomic allocations and this delegation of iterations gets in
    the way.

    This patch moves the region iterations out of
    pcpu_[de]populate_chunk() into its callers - pcpu_alloc() and
    pcpu_reclaim() - so that we can later add logic to track more states
    around them. This change may make cache and tlb flushes more frequent
    but multi-region [de]populations are rare anyway and if this actually
    becomes a problem, it's not difficult to factor out cache flushes as
    separate callbacks which are directly invoked from percpu.c.

    Signed-off-by: Tejun Heo

    Tejun Heo
     
  • percpu-vm and percpu-km implement separate versions of
    pcpu_[de]populate_chunk() and some part which is or should be common
    are currently in the specific implementations. Make the following
    changes.

    * Allocate area clearing is moved from the pcpu_populate_chunk()
    implementations to pcpu_alloc(). This makes percpu-km's version
    noop.

    * Quick exit tests in pcpu_[de]populate_chunk() of percpu-vm are moved
    to their respective callers so that they are applied to percpu-km
    too. This doesn't make any meaningful difference as both functions
    are noop for percpu-km; however, this is more consistent and will
    help implementing atomic allocation support.

    Signed-off-by: Tejun Heo

    Tejun Heo
     


15 Apr, 2014

1 commit

  • pcpu_chunk_struct_size = sizeof(struct pcpu_chunk) +
    BITS_TO_LONGS(pcpu_unit_pages) * sizeof(unsigned long)

    It could hardly ever be bigger than PAGE_SIZE even on a large-scale
    machine, but for consistency with its counterpart pcpu_mem_zalloc(),
    use pcpu_mem_free() instead.

    Commit b4916cb17c26 ("percpu: make pcpu_free_chunk() use
    pcpu_mem_free() instead of kfree()") addressed this problem, but
    missed this one.

    tj: commit message updated

    Signed-off-by: Jianyu Zhan
    Signed-off-by: Tejun Heo
    Fixes: 099a19d91ca4 ("percpu: allow limited allocation before slab is online")
    Cc: stable@vger.kernel.org

    Jianyu Zhan
     
