24 Mar, 2011
40 commits
-
In struct page_cgroup, we have a full word for flags but only a few are
reserved. Use the remaining upper bits to encode, depending on
configuration, the node or the section, to enable page_cgroup-to-page
lookups without a direct pointer.This saves a full word for every page in a system with memory cgroups
enabled.Signed-off-by: Johannes Weiner
Acked-by: KAMEZAWA Hiroyuki
Cc: Daisuke Nishimura
Cc: Balbir Singh
Cc: Minchan Kim
Cc: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The per-cgroup LRU lists string up 'struct page_cgroup's. To get from
those structures to the page they represent, a lookup is required.
Currently, the lookup is done through a direct pointer in struct
page_cgroup, so a lot of functions down the callchain do this lookup by
themselves instead of receiving the page pointer from their callers.The next patch removes this pointer, however, and the lookup is no longer
that straight-forward. In preparation for that, this patch only leaves
the non-optional lookups when coming directly from the LRU list and passes
the page down the stack.Signed-off-by: Johannes Weiner
Acked-by: KAMEZAWA Hiroyuki
Cc: Daisuke Nishimura
Cc: Balbir Singh
Cc: Minchan Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
It is one logical function, no need to have it split up.
Also, get rid of some checks from the inner function that ensured the
sanity of the outer function.Signed-off-by: Johannes Weiner
Acked-by: KAMEZAWA Hiroyuki
Acked-by: Daisuke Nishimura
Cc: Balbir Singh
Cc: Minchan Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Instead of passing a whole struct page_cgroup to this function, let it
take only what it really needs from it: the struct mem_cgroup and the
page.This has the advantage that reading pc->mem_cgroup is now done at the same
place where the ordering rules for this pointer are enforced and
explained.It is also in preparation for removing the pc->page backpointer.
Signed-off-by: Johannes Weiner
Acked-by: KAMEZAWA Hiroyuki
Cc: Daisuke Nishimura
Cc: Balbir Singh
Cc: Minchan Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This patch series removes the direct page pointer from struct page_cgroup,
which saves 20% of per-page memcg memory overhead (Fedora and Ubuntu
enable memcg per default, openSUSE apparently too).The node id or section number is encoded in the remaining free bits of
pc->flags which allows calculating the corresponding page without the
extra pointer.I ran, what I think is, a worst-case microbenchmark that just cats a large
sparse file to /dev/null, because it means that walking the LRU list on
behalf of per-cgroup reclaim and looking up pages from page_cgroups is
happening constantly and at a high rate. But it made no measurable
difference. A profile reported a 0.11% share of the new
lookup_cgroup_page() function in this benchmark.This patch:
All callsites check PCG_USED before passing pc->mem_cgroup, so the latter
is never NULL.Signed-off-by: Johannes Weiner
Acked-by: KAMEZAWA Hiroyuki
Cc: Daisuke Nishimura
Acked-by: Balbir Singh
Cc: Minchan Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Add checks at allocating or freeing a page whether the page is used (iow,
charged) from the view point of memcg.This check may be useful in debugging a problem and we did similar checks
before the commit 52d4b9ac(memcg: allocate all page_cgroup at boot).This patch adds some overheads at allocating or freeing memory, so it's
enabled only when CONFIG_DEBUG_VM is enabled.Signed-off-by: Daisuke Nishimura
Signed-off-by: Johannes Weiner
Acked-by: KAMEZAWA Hiroyuki
Cc: Balbir Singh
Cc: Minchan Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The page_cgroup array is set up before even fork is initialized. I
seriously doubt that this code executes before the array is alloc'd.Signed-off-by: Johannes Weiner
Acked-by: KAMEZAWA Hiroyuki
Cc: Daisuke Nishimura
Cc: Balbir Singh
Cc: Minchan Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
No callsite ever passes a NULL pointer for a struct mem_cgroup * to the
committing function. There is no need to check for it.Signed-off-by: Johannes Weiner
Acked-by: KAMEZAWA Hiroyuki
Cc: Daisuke Nishimura
Cc: Balbir Singh
Cc: Minchan Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
These definitions have been unused since '4b3bde4 memcg: remove the
overhead associated with the root cgroup'.Signed-off-by: Johannes Weiner
Acked-by: KAMEZAWA Hiroyuki
Cc: Daisuke Nishimura
Cc: Balbir Singh
Cc: Minchan Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Since transparent huge pages, checking whether memory cgroups are below
their limits is no longer enough, but the actual amount of chargeable
space is important.To not have more than one limit-checking interface, replace
memory_cgroup_check_under_limit() and memory_cgroup_check_margin() with a
single memory_cgroup_margin() that returns the chargeable space and leaves
the comparison to the callsite.Soft limits are now checked the other way round, by using the already
existing function that returns the amount by which soft limits are
exceeded: res_counter_soft_limit_excess().Also remove all the corresponding functions on the res_counter side that
are now no longer used.Signed-off-by: Johannes Weiner
Acked-by: KAMEZAWA Hiroyuki
Cc: Daisuke Nishimura
Acked-by: Balbir Singh
Cc: Minchan Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Soft limit reclaim continues until the usage is below the current soft
limit, but the documented semantics are actually that soft limit reclaim
will push usage back until the soft limits are met again.Signed-off-by: Johannes Weiner
Acked-by: KAMEZAWA Hiroyuki
Cc: Daisuke Nishimura
Acked-by: Balbir Singh
Cc: Minchan Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove initialization of vaiable in caller of memory cgroup function.
Actually, it's return value of memcg function but it's initialized in
caller.Some memory cgroup uses following style to bring the result of start
function to the end function for avoiding races.mem_cgroup_start_A(&(*ptr))
/* Something very complicated can happen here. */
mem_cgroup_end_A(*ptr)In some calls, *ptr should be initialized to NULL be caller. But it's
ugly. This patch fixes that *ptr is initialized by _start function.Signed-off-by: KAMEZAWA Hiroyuki
Acked-by: Johannes Weiner
Acked-by: Daisuke Nishimura
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
res_counter_read_u64 reads u64 value without lock. It's dangerous in a
32bit environment. Add locking.Signed-off-by: KAMEZAWA Hiroyuki
Cc: Greg Thelen
Cc: Johannes Weiner
Cc: David Rientjes
Cc: KOSAKI Motohiro
Cc: Minchan Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
minix bit operations are only used by minix filesystem and useless by
other modules. Because byte order of inode and block bitmaps is different
on each architecture like below:m68k:
big-endian 16bit indexed bitmapsh8300, microblaze, s390, sparc, m68knommu:
big-endian 32 or 64bit indexed bitmapsm32r, mips, sh, xtensa:
big-endian 32 or 64bit indexed bitmaps for big-endian mode
little-endian bitmaps for little-endian modeOthers:
little-endian bitmapsIn order to move minix bit operations from asm/bitops.h to architecture
independent code in minix filesystem, this provides two config options.CONFIG_MINIX_FS_BIG_ENDIAN_16BIT_INDEXED is only selected by m68k.
CONFIG_MINIX_FS_NATIVE_ENDIAN is selected by the architectures which use
native byte order bitmaps (h8300, microblaze, s390, sparc, m68knommu,
m32r, mips, sh, xtensa). The architectures which always use little-endian
bitmaps do not select these options.Finally, we can remove minix bit operations from asm/bitops.h for all
architectures.Signed-off-by: Akinobu Mita
Acked-by: Arnd Bergmann
Acked-by: Greg Ungerer
Cc: Geert Uytterhoeven
Cc: Roman Zippel
Cc: Andreas Schwab
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Cc: Yoshinori Sato
Cc: Michal Simek
Cc: "David S. Miller"
Cc: Hirokazu Takata
Acked-by: Ralf Baechle
Acked-by: Paul Mundt
Cc: Chris Zankel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As a preparation for moving minix bit operations from asm/bitops.h to
architecture independent code in minix filesystem, this removes inline asm
from minix_find_first_zero_bit() for m68k.Signed-off-by: Akinobu Mita
Cc: Geert Uytterhoeven
Cc: Roman Zippel
Cc: Andreas Schwab
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As the result of conversions, there are no users of ext2 non-atomic bit
operations except for ext2 filesystem itself. Now we can put them into
architecture independent code in ext2 filesystem, and remove from
asm/bitops.h for all architectures.Signed-off-by: Akinobu Mita
Cc: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.Signed-off-by: Akinobu Mita
Cc: Alasdair Kergon
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.Signed-off-by: Akinobu Mita
Acked-by: NeilBrown
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.Signed-off-by: Akinobu Mita
Cc: Evgeniy Dushistov
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.Signed-off-by: Akinobu Mita
Acked-by: Jan Kara
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.Signed-off-by: Akinobu Mita
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.Signed-off-by: Akinobu Mita
Acked-by: Ryusuke Konishi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.Signed-off-by: Akinobu Mita
Acked-by: Joel Becker
Cc: Mark Fasheh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.Signed-off-by: Akinobu Mita
Acked-by: "Theodore Ts'o"
Cc: Andreas Dilger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.Signed-off-by: Akinobu Mita
Acked-by: Jan Kara
Cc: Andreas Dilger
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.Signed-off-by: Akinobu Mita
Cc: Andy Grover
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.Signed-off-by: Akinobu Mita
Cc: Avi Kivity
Cc: Marcelo Tosatti
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.Signed-off-by: Akinobu Mita
Acked-by: Arnd Bergmann
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Introduce little-endian bit operations to the big-endian architectures
which do not have native little-endian bit operations and the
little-endian architectures. (alpha, avr32, blackfin, cris, frv, h8300,
ia64, m32r, mips, mn10300, parisc, sh, sparc, tile, x86, xtensa)These architectures can just include generic implementation
(asm-generic/bitops/le.h).Signed-off-by: Akinobu Mita
Cc: Richard Henderson
Cc: Ivan Kokshaysky
Cc: Mikael Starvik
Cc: David Howells
Cc: Yoshinori Sato
Cc: "Luck, Tony"
Cc: Ralf Baechle
Cc: Kyle McMartin
Cc: Matthew Wilcox
Cc: Grant Grundler
Cc: Paul Mundt
Cc: Kazumoto Kojima
Cc: Hirokazu Takata
Cc: "David S. Miller"
Cc: Chris Zankel
Cc: Ingo Molnar
Cc: Thomas Gleixner
Acked-by: Hans-Christian Egtvedt
Acked-by: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Introduce little-endian bit operations by renaming native ext2 bit
operations. The ext2 bit operations are kept as wrapper macros using
little-endian bit operations to maintain bisectability until the
conversions are finished.Signed-off-by: Akinobu Mita
Acked-by: Greg Ungerer
Cc: Geert Uytterhoeven
Cc: Roman Zippel
Cc: Andreas Schwab
Cc: Arnd Bergmann
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This introduces CONFIG_GENERIC_FIND_BIT_LE to tell whether to use generic
implementation of find_*_bit_le() in lib/find_next_bit.c or not.For now we select CONFIG_GENERIC_FIND_BIT_LE for all architectures which
enable CONFIG_GENERIC_FIND_NEXT_BIT.But m68knommu wants to define own faster find_next_zero_bit_le() and
continues using generic find_next_{,zero_}bit().
(CONFIG_GENERIC_FIND_NEXT_BIT and !CONFIG_GENERIC_FIND_BIT_LE)Signed-off-by: Akinobu Mita
Cc: Greg Ungerer
Cc: Arnd Bergmann
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Introduce little-endian bit operations by renaming native ext2 bit
operations and changing find_*_bit_le() to take a "void *". The ext2 bit
operations are kept as wrapper macros using little-endian bit operations
to maintain bisectability until the conversions are finished.Signed-off-by: Akinobu Mita
Cc: Geert Uytterhoeven
Cc: Roman Zippel
Cc: Andreas Schwab
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Introduce little-endian bit operations by renaming native ext2 bit
operations. The ext2 and minix bit operations are kept as wrapper macros
using little-endian bit operations to maintain bisectability until the
conversions are finished.Signed-off-by: Akinobu Mita
Cc: Russell King
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Introduce little-endian bit operations by renaming native ext2 bit
operations. The ext2 bit operations are kept as wrapper macros using
little-endian bit operations to maintain bisectability until the
conversions are finished.Signed-off-by: Akinobu Mita
Cc: Arnd Bergmann
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Introduce little-endian bit operations by renaming existing powerpc native
little-endian bit operations and changing them to take any pointer types.Signed-off-by: Akinobu Mita
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This makes the little-endian bitops take any pointer types by changing the
prototypes and adding casts in the preprocessor macros.That would seem to at least make all the filesystem code happier, and they
can continue to do just something like#define ext2_set_bit __test_and_set_bit_le
(or whatever the exact sequence ends up being).
Signed-off-by: Akinobu Mita
Cc: Arnd Bergmann
Cc: Richard Henderson
Cc: Ivan Kokshaysky
Cc: Mikael Starvik
Cc: David Howells
Cc: Yoshinori Sato
Cc: "Luck, Tony"
Cc: Ralf Baechle
Cc: Kyle McMartin
Cc: Matthew Wilcox
Cc: Grant Grundler
Cc: Paul Mundt
Cc: Kazumoto Kojima
Cc: Hirokazu Takata
Cc: "David S. Miller"
Cc: Chris Zankel
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Hans-Christian Egtvedt
Cc: "H. Peter Anvin"
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
As a preparation for providing little-endian bitops for all architectures,
This renames generic implementation of little-endian bitops. (remove
"generic_" prefix and postfix "_le")s/generic_find_next_le_bit/find_next_bit_le/
s/generic_find_next_zero_le_bit/find_next_zero_bit_le/
s/generic_find_first_zero_le_bit/find_first_zero_bit_le/
s/generic___test_and_set_le_bit/__test_and_set_bit_le/
s/generic___test_and_clear_le_bit/__test_and_clear_bit_le/
s/generic_test_le_bit/test_bit_le/
s/generic___set_le_bit/__set_bit_le/
s/generic___clear_le_bit/__clear_bit_le/
s/generic_test_and_set_le_bit/test_and_set_bit_le/
s/generic_test_and_clear_le_bit/test_and_clear_bit_le/Signed-off-by: Akinobu Mita
Acked-by: Arnd Bergmann
Acked-by: Hans-Christian Egtvedt
Cc: Geert Uytterhoeven
Cc: Roman Zippel
Cc: Andreas Schwab
Cc: Greg Ungerer
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This patch series introduces little-endian bit operations in asm/bitops.h
for all architectures and converts all ext2 non-atomic and minix bit
operations to use little-endian bit operations. It enables us to remove
ext2 non-atomic and minix bit operations from asm/bitops.h. The reason
they should be removed from asm/bitops.h is as follows:For ext2 non-atomic bit operations, they are used for little-endian byte
order bitmap access by some filesystems and modules. But using ext2_*()
functions on a module other than ext2 filesystem makes some feel strange.For minix bit operations, they are only used by minix filesystem and are
useless by other modules. Because byte order of inode and block bitmap isThis patch:
In order to make the forthcoming changes smaller, this merges macro
definisions in asm-generic/bitops/le.h for big-endian and little-endian as
much as possible.This also removes unused BITOP_WORD macro.
Signed-off-by: Akinobu Mita
Acked-by: Arnd Bergmann
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
asm-generic/bitops/le.h is only intended to be included directly from
asm-generic/bitops/ext2-non-atomic.h or asm-generic/bitops/minix-le.h
which implements generic ext2 or minix bit operations.This stops including asm-generic/bitops/le.h directly and use ext2
non-atomic bit operations instead.It seems odd to use ext2_*_bit() on rds, but it will replaced with
__{set,clear,test}_bit_le() after introducing little endian bit operations
for all architectures. This indirect step is necessary to maintain
bisectability for some architectures which have their own little-endian
bit operations.Signed-off-by: Akinobu Mita
Cc: Andy Grover
Cc: "David S. Miller"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
asm-generic/bitops/le.h is only intended to be included directly from
asm-generic/bitops/ext2-non-atomic.h or asm-generic/bitops/minix-le.h
which implements generic ext2 or minix bit operations.This stops including asm-generic/bitops/le.h directly and use ext2
non-atomic bit operations instead.It seems odd to use ext2_set_bit() on kvm, but it will replaced with
__set_bit_le() after introducing little endian bit operations for all
architectures. This indirect step is necessary to maintain bisectability
for some architectures which have their own little-endian bit operations.Signed-off-by: Akinobu Mita
Cc: Avi Kivity
Cc: Marcelo Tosatti
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds