03 Jun, 2006

1 commit

  • mm/slab.c's offslab_limit logic is totally broken.

    Firstly, "offslab_limit" is a global variable while it should either be
    calculated in situ or should be passed in as a parameter.

    Secondly, the more serious problem with it is that the condition for
    calculating it:

    if (!(OFF_SLAB(sizes->cs_cachep))) {
            offslab_limit = sizes->cs_size - sizeof(struct slab);
            offslab_limit /= sizeof(kmem_bufctl_t);

    is in total disconnect with the condition that makes use of it:

    /* More than offslab_limit objects will cause problems */
    if ((flags & CFLGS_OFF_SLAB) && num > offslab_limit)
            break;

    but due to offslab_limit being a global variable this breakage was
    hidden.

    It stayed hidden until lockdep came along and perturbed the slab sizes
    sufficiently, so that the first off-slab cache saw a (never-calculated)
    zero value for offslab_limit and panicked with:

    kmem_cache_create: couldn't create cache size-512.

    Call Trace:
    [] show_trace+0x96/0x1c8
    [] dump_stack+0x13/0x15
    [] panic+0x39/0x21a
    [] kmem_cache_create+0x5a0/0x5d0
    [] kmem_cache_init+0x193/0x379
    [] start_kernel+0x17f/0x218
    [] _sinittext+0x263/0x26a

    Kernel panic - not syncing: kmem_cache_create(): failed to create slab `size-512'

    Paolo Ornati's config on x86_64 managed to trigger it.

    The fix is to move the calculation to the place that makes use of it.
    This also makes slab.o 54 bytes smaller.
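
    A sketch of what the in-place form looks like, recombining the two
    excerpts above (a sketch only, using the flags/num/size variables assumed
    to be in scope at that point; not necessarily the exact upstream change):

    if (flags & CFLGS_OFF_SLAB) {
            unsigned int offslab_limit;

            /* The slab descriptor is kept off-slab, so the number of
             * objects per slab is bounded by how many kmem_bufctl_t
             * entries fit alongside the struct slab in that allocation. */
            offslab_limit = size - sizeof(struct slab);
            offslab_limit /= sizeof(kmem_bufctl_t);

            if (num > offslab_limit)
                    break;
    }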

    Btw., the check itself is quite silly. Its intention is to test whether
    the number of objects per slab would exceed the number of slab control
    pointers that can fit. In theory it could still be triggered, e.g. if
    someone created a cache of 4-byte objects and explicitly requested
    CFLGS_OFF_SLAB, so I kept the check.

    Out of historic interest I checked how old this bug is, and it's
    ancient: 10 years old! It is the oldest hidden-and-then-truly-triggering
    bug I have ever seen fixed in the kernel.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

01 Jun, 2006

1 commit

  • From: Yasunori Goto

    If hot-added memory's address is lower than the existing zone's range,
    spanned_pages is not updated. This must be fixed.

    Example: the old zone_start_pfn is 0x60000 and spanned_pages is 0x10000.
    New memory is added with start_pfn = 0x50000 and end_pfn = 0x60000.

    With the old code the new spanned_pages remains 0x10000 (it should become
    0x20000), because old_zone_end_pfn is 0x70000 and the new end_pfn is
    smaller than that, so spanned_pages is never updated.

    In the current code, spanned_pages is updated only when end_pfn grows.
    Instead, it should be computed as the difference between the larger
    end_pfn and the new zone_start_pfn.
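
    A minimal, self-contained sketch of the corrected calculation (the struct
    and helper below are illustrative stand-ins, not the kernel's own
    definitions):

    #include <stdio.h>

    /* Hypothetical stand-in for the two struct zone fields involved. */
    struct zone_span {
            unsigned long zone_start_pfn;
            unsigned long spanned_pages;
    };

    /* Recompute spanned_pages from whichever end pfn is larger and the
     * (possibly lowered) zone start, so adding memory below the old start
     * also grows the span. */
    static void grow_zone_span(struct zone_span *z,
                               unsigned long start_pfn, unsigned long end_pfn)
    {
            unsigned long old_end = z->zone_start_pfn + z->spanned_pages;

            if (start_pfn < z->zone_start_pfn)
                    z->zone_start_pfn = start_pfn;

            z->spanned_pages = (end_pfn > old_end ? end_pfn : old_end)
                               - z->zone_start_pfn;
    }

    int main(void)
    {
            /* The example from the description above. */
            struct zone_span z = { 0x60000, 0x10000 };

            grow_zone_span(&z, 0x50000, 0x60000);
            printf("start=0x%lx spanned=0x%lx\n",
                   z.zone_start_pfn, z.spanned_pages); /* 0x50000, 0x20000 */
            return 0;
    }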

    Signed-off-by: Yasunori Goto
    Signed-off-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     

22 May, 2006

3 commits

  • Andy added code to the buddy allocator which does not require the zone's
    endpoints to be aligned to MAX_ORDER. An issue is that the buddy
    allocator requires the node_mem_map's endpoints to be MAX_ORDER aligned.
    Otherwise __page_find_buddy could compute a buddy not in node_mem_map for
    partial MAX_ORDER regions at the zone's endpoints. page_is_buddy will
    detect that these pages at the endpoints are not PG_buddy (they were
    zeroed out by the bootmem allocator and are not part of the zone). Of
    course the negative here is that we could waste a little memory, but the
    positive is eliminating all the old checks for zone boundary conditions.
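
    The buddy of a page is found by flipping a single bit of its index within
    mem_map, which is why a partial MAX_ORDER block at an endpoint can yield
    a buddy index past the end of the map. A minimal illustration (not the
    kernel's actual code):

    #include <stdio.h>

    /* Buddy of the page at index page_idx for a given order: flip bit
     * 'order' of the index. */
    static unsigned long buddy_idx(unsigned long page_idx, unsigned int order)
    {
            return page_idx ^ (1UL << order);
    }

    int main(void)
    {
            /* A map covering pages 0..9: the order-3 buddy of page 2 is
             * page 10, which lies beyond the end of the map. */
            printf("%lu\n", buddy_idx(2, 3));
            return 0;
    }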

    SPARSEMEM won't encounter this issue because of MAX_ORDER size constraint
    when SPARSEMEM is configured. ia64 VIRTUAL_MEM_MAP doesn't need the logic
    either because the holes and endpoints are handled differently. This
    leaves checking alloc_remap and other arches which privately allocate for
    node_mem_map.

    Signed-off-by: Bob Picco
    Acked-by: Mel Gorman
    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Picco
     
  • Fix a couple of infrequently encountered 'sleeping function called from
    invalid context' warnings in the cpuset hooks in __alloc_pages(), which
    could sleep while interrupts were disabled.

    The routine cpuset_zone_allowed() is called by code in mm/page_alloc.c
    __alloc_pages() to determine if a zone is allowed in the current task's
    cpuset. This routine can sleep, for certain GFP_KERNEL allocations, if
    the zone is on a memory node not allowed in the current cpuset but
    possibly allowed in a parent cpuset.

    But we can't sleep in __alloc_pages() if in interrupt, nor if called for a
    GFP_ATOMIC request (__GFP_WAIT not set in gfp_flags).

    The rule was intended to be:
    Don't call cpuset_zone_allowed() if you can't sleep, unless you
    pass in the __GFP_HARDWALL flag set in gfp_flag, which disables
    the code that might scan up ancestor cpusets and sleep.
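
    Schematically, the rule amounts to a guard like the following at each
    call site (illustrative only, using the flag and function names above;
    not the exact __alloc_pages() code):

    int allowed;

    if (in_interrupt() || !(gfp_mask & __GFP_WAIT))
            /* Atomic context: __GFP_HARDWALL keeps cpuset_zone_allowed()
             * from scanning up ancestor cpusets, so it cannot sleep. */
            allowed = cpuset_zone_allowed(zone, gfp_mask | __GFP_HARDWALL);
    else
            allowed = cpuset_zone_allowed(zone, gfp_mask);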

    This rule was being violated in a couple of places, due to a bogus change
    made (by myself, pj) to __alloc_pages() as part of the November 2005
    effort to clean up its logic, and also due to a later fix to constrain
    which swap daemons were awoken.

    The bogus change can be seen at:
    http://linux.derkeiler.com/Mailing-Lists/Kernel/2005-11/4691.html
    [PATCH 01/05] mm fix __alloc_pages cpuset ALLOC_* flags

    This was first noticed on a tight memory system, in code that was disabling
    interrupts and doing allocation requests with __GFP_WAIT not set, which
    resulted in __might_sleep() writing complaints to the log "Debug: sleeping
    function called ...", when the code in cpuset_zone_allowed() tried to take
    the callback_sem cpuset semaphore.

    We haven't seen a system hang on this 'might_sleep' yet, but we are at
    decent risk of seeing it fairly soon, especially since the additional
    cpuset_zone_allowed() check was added, conditioning wakeup_kswapd(), in
    March 2006.

    Special thanks to Dave Chinner, for figuring this out, and a tip of the hat
    to Nick Piggin who warned me of this back in Nov 2005, before I was ready
    to listen.

    Signed-off-by: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • A bad calculation/loop in __section_nr() could result in incorrect section
    information being put into sysfs memory entries. This primarily impacts
    memory add operations as the sysfs information is used while onlining new
    memory.

    Fix suggested by Dave Hansen.

    Note that the bug may not be obvious from the patch. It actually occurs in
    the function's return statement:

    return (root_nr * SECTIONS_PER_ROOT) + (ms - root);

    In the existing code, root_nr has already been multiplied by
    SECTIONS_PER_ROOT.
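
    A sketch of the corrected function, with root_nr kept as a plain root
    index so the scaling by SECTIONS_PER_ROOT happens exactly once, in the
    return statement (illustrative; details may differ from the actual
    patch):

    int __section_nr(struct mem_section *ms)
    {
            unsigned long root_nr;
            struct mem_section *root;

            for (root_nr = 0; root_nr < NR_SECTION_ROOTS; root_nr++) {
                    root = __nr_to_section(root_nr * SECTIONS_PER_ROOT);
                    if (!root)
                            continue;
                    if ((ms >= root) && (ms < (root + SECTIONS_PER_ROOT)))
                            break;
            }

            return (root_nr * SECTIONS_PER_ROOT) + (ms - root);
    }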

    Signed-off-by: Mike Kravetz
    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Kravetz
     

16 May, 2006

3 commits

  • With CONFIG_NUMA set, kmem_cache_destroy() may fail and say "Can't
    free all objects." The problem is caused by sequences such as the
    following (suppose we are on a NUMA machine with two nodes, 0 and 1):

    * Allocate an object from cache on node 0.
    * Free the object on node 1. The object is put into node 1's alien
    array_cache for node 0.
    * Call kmem_cache_destroy(), which ultimately ends up in __cache_shrink().
    * __cache_shrink() does drain_cpu_caches(), which loops through all nodes.
    For each node it drains the shared array_cache and then handles the
    alien array_cache for the other node.

    However this means that node 0's shared array_cache will be drained,
    and then node 1 will move the contents of its alien[0] array_cache
    into that same shared array_cache. node 0's shared array_cache is
    never looked at again, so the objects left there will appear to be in
    use when __cache_shrink() calls __node_shrink() for node 0. So
    __node_shrink() will return 1 and kmem_cache_destroy() will fail.

    This patch fixes this by having drain_cpu_caches() do
    drain_alien_cache() on every node before it does drain_array() on the
    nodes' shared array_caches.
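
    Schematically, the fixed drain order is two passes over the nodes (the
    helpers are named as in the description above, with simplified argument
    lists; a sketch, not the actual code):

    /* Pass 1: flush every node's alien caches first, so nothing is
     * pushed back into a shared array_cache after it has been drained. */
    for_each_online_node(node)
            drain_alien_cache(cachep, node);

    /* Pass 2: only now drain each node's shared array_cache. */
    for_each_online_node(node)
            drain_array(cachep, node);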

    The problem was originally reported by Or Gerlitz.

    Signed-off-by: Roland Dreier
    Acked-by: Christoph Lameter
    Acked-by: Pekka Enberg
    Signed-off-by: Linus Torvalds

    Roland Dreier
     
  • slab_is_available() indicates slab based allocators are available for use.
    SPARSEMEM code needs to know this as it can be called at various times
    during the boot process.
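
    A typical use in SPARSEMEM setup code looks roughly like this (an
    illustrative sketch, not the exact call sites):

    /* Pick the right allocator for the section's memmap depending on
     * how far boot has progressed. */
    if (slab_is_available())
            section = kmalloc_node(array_size, GFP_KERNEL, nid);
    else
            section = alloc_bootmem_node(NODE_DATA(nid), array_size);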

    Signed-off-by: Mike Kravetz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Kravetz
     
  • As pointed out in http://bugzilla.kernel.org/show_bug.cgi?id=6490, this
    function can experience overflows on 32-bit machines, causing our response to
    changed values of min_free_kbytes to go whacky.

    Fixing it efficiently is all too hard, so fix it with 64-bit math instead.
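
    The shape of the fix is to widen the intermediate product to 64 bits,
    roughly as follows (an illustrative fragment with assumed variable
    names):

    u64 tmp;

    /* pages_min * zone->present_pages can overflow unsigned long on
     * 32-bit machines, so do the proportional split in 64 bits. */
    tmp = (u64)pages_min * zone->present_pages;
    do_div(tmp, lowmem_pages);
    zone->pages_min = tmp;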

    Cc: Ake Sandgren
    Cc: Martin Bligh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

02 May, 2006

3 commits

  • Based on an older patch from Mike Kravetz

    We need to have a mem_map for high addresses in order to make fops->no_page
    work on spufs mem and register files. So far, we have used the
    memory_present() function during early bootup, but that did not work when
    CONFIG_NUMA was enabled.

    We now use the __add_pages() function to add the mem_map when loading the
    spufs module, which is a lot nicer.

    Signed-off-by: Arnd Bergmann
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joel H Schopp
     
  • This patch fixes two bugs with the way sparsemem interacts with memory add.
    They are:

    - memory leak if memmap for section already exists

    - calling alloc_bootmem_node() after boot

    These bugs were discovered, and a first cut at the fixes was provided, by
    Arnd Bergmann and Joel Schopp.

    Signed-off-by: Mike Kravetz
    Signed-off-by: Joel Schopp
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Kravetz
     
  • Currently we check PageDirty() in order to make the decision to swap out
    the page. However, the dirty information may only be contained in the
    ptes pointing to the page. We need to first unmap the ptes before
    checking for PageDirty(). If the unmap is successful then the page count
    of the page will also be decreased so that pageout() works properly.
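
    Schematically, the ordering in the migration path becomes (a fragment
    using the relevant helpers; not the exact patch):

    /* Unmap the ptes first: this transfers pte dirty bits into the
     * struct page and drops the pte references on the page. */
    if (page_mapped(page) && page->mapping)
            if (try_to_unmap(page, 1) != SWAP_SUCCESS)
                    goto unlock_both;       /* could not unmap, give up */

    /* Only now is PageDirty() a reliable basis for the pageout decision. */
    if (PageDirty(page)) {
            /* write the page out before migrating it */
    }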

    This is a fix necessary for 2.6.17. Without this fix we may migrate dirty
    pages for filesystems without migration functions. Filesystems may keep
    pointers to dirty pages. Migration of dirty pages can result in the
    filesystem keeping pointers to freed pages.

    Unmapping is currently not separated out from removing all the
    references to a page and moving the mapping. Therefore try_to_unmap will
    be called again in migrate_page() if the writeout is successful. However,
    it won't do anything since the ptes are already removed.

    The coming updates to the page migration code will restructure the code
    so that this is no longer necessary.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

23 Apr, 2006

1 commit

  • Basic problem: pages of a shared memory segment can only be migrated once.

    In 2.6.16 through 2.6.17-rc1, shared memory mappings do not have a
    migratepage address space op. Therefore, migrate_pages() falls back to
    default processing. In this path, it will try to pageout() dirty pages.
    Once a shared memory page has been migrated it becomes dirty, so
    migrate_pages() will try to page it out. However, because the page count
    is 3 [cache + current + pte], pageout() will return PAGE_KEEP because
    is_page_cache_freeable() returns false. This will abort all subsequent
    migrations.

    This patch adds a migratepage address space op to shared memory segments to
    avoid taking the default path. We use the "migrate_page()" function
    because it knows how to migrate dirty pages. This allows shared memory
    segment pages to migrate, subject to other conditions such as # pte's
    referencing the page [page_mapcount(page)], when requested.
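
    Concretely, that amounts to wiring the existing generic helper into
    shmem's address_space_operations, roughly like this (a sketch of the
    idea with other fields elided, not the literal diff):

    static struct address_space_operations shmem_aops = {
            .writepage      = shmem_writepage,
            /* ... */
            .migratepage    = migrate_page, /* knows how to move dirty pages */
    };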

    I think this is safe. If we're migrating a shared memory page, then we
    found the page via a page table, so it must be in memory.

    Can be verified with memtoy and the shmem-mbind-test script, both
    available at: http://free.linux.hp.com/~lts/Tools/

    Signed-off-by: Lee Schermerhorn
    Acked-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     

11 Apr, 2006

11 commits

  • Signed-off-by: Coywolf Qi Hunt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Coywolf Qi Hunt
     
  • EXPORT_SYMBOL'ing of a static function is not a good idea.

    Signed-off-by: Adrian Bunk
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • This patch is an enhancement of the OVERCOMMIT_GUESS algorithm in
    __vm_enough_memory() in mm/nommu.c.

    When the OVERCOMMIT_GUESS algorithm calculates the number of free pages,
    it subtracts the number of reserved pages from the result of
    nr_free_pages().

    Signed-off-by: Hideo Aoki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hideo AOKI
     
  • This patch is an enhancement of the OVERCOMMIT_GUESS algorithm in
    __vm_enough_memory() in mm/mmap.c.

    When the OVERCOMMIT_GUESS algorithm calculates the number of free pages,
    it subtracts the number of reserved pages from the result of
    nr_free_pages().
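
    In other words, the free-page estimate now looks roughly like this (an
    illustrative fragment):

    free += nr_free_pages();
    /* Reserved pages are kept free by the kernel and cannot actually be
     * handed out, so do not count them as available. */
    free -= totalreserve_pages;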

    Signed-off-by: Hideo Aoki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hideo AOKI
     
  • These patches are an enhancement of the OVERCOMMIT_GUESS algorithm in
    __vm_enough_memory().

    - why the kernel needed patching

    When the kernel can't allocate anonymous pages in practice, the current
    OVERCOMMIT_GUESS could still return success. This implementation might be
    the cause of OOM kills under memory pressure.

    If Linux runs with page reservation features like
    /proc/sys/vm/lowmem_reserve_ratio and without a swap region, I think
    the OOM kill occurs easily.

    - the overall design approach in the patch

    When the OVERCOMMIT_GUESS algorithm calculates the number of free pages,
    the reserved free pages are regarded as non-free pages.

    This change helps to avoid the pitfall that the number of free pages
    becomes less than the number which the kernel tries to keep free.

    - testing results

    I tested the patches using my test kernel module.

    If the patches aren't applied to the kernel, __vm_enough_memory()
    returns success in this situation but the actual page allocation
    fails.

    On the other hand, if the patches are applied to the kernel, the memory
    allocation failure is avoided since __vm_enough_memory() returns
    failure in this situation.

    I checked that on an i386 SMP 16GB memory machine. I haven't tested in a
    nommu environment yet.

    This patch adds totalreserve_pages for __vm_enough_memory().

    calculate_totalreserve_pages() checks the maximum lowmem_reserve pages
    and pages_high in each zone. Finally, the function stores the sum over
    all zones in totalreserve_pages.

    totalreserve_pages is calculated when the VM is initialized, and the
    variable is updated when /proc/sys/vm/lowmem_reserve_ratio or
    /proc/sys/vm/min_free_kbytes is changed.
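
    A sketch of the per-zone contribution described above (illustrative
    only; the real function iterates over all nodes and zones):

    unsigned long max = 0;
    int j;

    /* The largest lowmem_reserve[] entry for this zone ... */
    for (j = 0; j < MAX_NR_ZONES; j++)
            if (zone->lowmem_reserve[j] > max)
                    max = zone->lowmem_reserve[j];

    /* ... plus its pages_high watermark counts as reserved. */
    totalreserve_pages += max + zone->pages_high;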

    Signed-off-by: Hideo Aoki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hideo AOKI
     
  • - Remove sparse comment

    - Remove duplicated include

    - Return the correct error condition in migrate_page_remove_references().

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • The code compares newbrk with oldbrk, both of which are page aligned,
    before checking the memory limit set for the data segment. If the memory
    limit is not page aligned, this bypasses the limit test whenever the
    requested allocation stays within the same page.
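
    A sketch of the reordered checks in sys_brk(): verify the RLIMIT_DATA
    limit before taking the same-page shortcut (illustrative; details may
    differ from the actual patch):

    /* Check against the data segment rlimit first ... */
    rlim = current->signal->rlim[RLIMIT_DATA].rlim_cur;
    if (rlim < RLIM_INFINITY && brk - mm->start_data > rlim)
            goto out;

    /* ... and only then take the shortcut for requests that stay within
     * the currently mapped page. */
    newbrk = PAGE_ALIGN(brk);
    oldbrk = PAGE_ALIGN(mm->brk);
    if (oldbrk == newbrk)
            goto set_brk;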

    Signed-off-by: Ram Gupta
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ram Gupta
     
  • The earlier patch to consolidate mmu and nommu page allocation and
    refcounting by using compound pages for nommu allocations had a bug:
    kmalloc slabs whose pages were initially allocated by a non-__GFP_COMP
    allocator could be passed into mm/nommu.c kmalloc allocations which
    really wanted __GFP_COMP underlying pages. Fix that by having nommu pass
    __GFP_COMP to all higher order slab allocations.

    Signed-off-by: Luke Yang
    Acked-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Luke Yang
     
  • Add a statistics counter which is incremented every time the alien cache
    overflows. The alien_cache limit is hardcoded to 12 right now. We can use
    these statistics to tune the alien cache if needed in the future.

    Signed-off-by: Alok N Kataria
    Signed-off-by: Ravikiran Thirumalai
    Signed-off-by: Shai Fultheim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ravikiran G Thirumalai
     
  • Allocate off-slab slab descriptors from node local memory.

    Signed-off-by: Alok N Kataria
    Signed-off-by: Ravikiran Thirumalai
    Signed-off-by: Shai Fultheim
    Acked-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ravikiran G Thirumalai
     
  • Rohit found an obscure bug causing buddy list corruption.

    page_is_buddy is using a non-atomic test (PagePrivate && page_count == 0)
    to determine whether or not a free page's buddy is itself free and in the
    buddy lists.

    Each of the conjuncts may be true at different times due to unrelated
    conditions, so the non-atomic page_is_buddy test may find each conjunct
    to be true even if they were not both true at the same time (i.e. the
    page was not on the buddy lists).
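
    The racy test, schematically (a fragment of the check described above,
    not the exact source):

    /* Each half of the condition can be true transiently for unrelated
     * reasons, so their conjunction does not prove the page is really
     * sitting on the buddy lists. */
    if (PagePrivate(page) && page_count(page) == 0)
            return 1;   /* may wrongly conclude the page is a free buddy */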

    Signed-off-by: Martin Bligh
    Signed-off-by: Rohit Seth
    Signed-off-by: Nick Piggin
    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

10 Apr, 2006

1 commit

  • The node setup code would try to allocate the node metadata in the node
    itself, but that fails if there is no memory in there.

    This can happen with memory hotplug when the hotplug area defines a
    so-far empty node.

    Now use bootmem to try to allocate the mem_map in other nodes.

    And if it fails don't panic, but just ignore the node.

    To make this work I added a new __alloc_bootmem_nopanic function that
    does what its name implies.

    TBD should try to use nearby nodes here. Currently we just use any.
    It's hard to do it better because bootmem doesn't have proper fallback
    lists yet.
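
    A hypothetical call site, showing the intended fallback behaviour (map,
    size and nid are assumed local variables; not the exact x86-64 code):

    /* Unlike __alloc_bootmem(), this variant returns NULL instead of
     * panicking when nothing suitable can be found. */
    map = __alloc_bootmem_nopanic(size, PAGE_SIZE, 0);
    if (!map) {
            printk(KERN_ERR "Cannot allocate mem_map for node %d, ignoring it\n",
                   nid);
            return;
    }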

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     

01 Apr, 2006

4 commits

  • This changes if() BUG(); constructs to BUG_ON(), which is cleaner,
    contains unlikely() and can be better optimized away.
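
    For example (illustrative):

    if (error)
            BUG();

    /* becomes */

    BUG_ON(error);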

    Signed-off-by: Eric Sesterhenn
    Signed-off-by: Adrian Bunk

    Eric Sesterhenn
     
  • This changes if() BUG(); constructs to BUG_ON(), which is cleaner,
    contains unlikely() and can be better optimized away.

    Signed-off-by: Eric Sesterhenn
    Signed-off-by: Adrian Bunk

    Eric Sesterhenn
     
  • This changes if() BUG(); constructs to BUG_ON(), which is cleaner,
    contains unlikely() and can be better optimized away.

    Signed-off-by: Eric Sesterhenn
    Signed-off-by: Adrian Bunk

    Eric Sesterhenn
     
  • Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT
    fadvise() additions, do it in a new sys_sync_file_range() syscall instead.
    Reasons:

    - It's more flexible. Things which would require two or three syscalls with
    fadvise() can be done in a single syscall.

    - Using fadvise() in this manner is something not covered by POSIX.

    The patch wires up the syscall for x86.

    The syscall is implemented in the new fs/sync.c. The intention is that
    we can move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into
    there later.

    Documentation for the syscall is in fs/sync.c.

    A test app (sync_file_range.c) is in
    http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz.
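
    A minimal userspace usage sketch along the same lines (assuming the C
    library exposes a sync_file_range() wrapper; flag names as described for
    the syscall):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
            int fd;

            if (argc < 2) {
                    fprintf(stderr, "usage: %s <file>\n", argv[0]);
                    return 1;
            }
            fd = open(argv[1], O_RDONLY);   /* read-only fds work too */
            if (fd < 0) {
                    perror("open");
                    return 1;
            }
            /* Start writeback of the first 1MB and wait for it to finish. */
            if (sync_file_range(fd, 0, 1 << 20,
                                SYNC_FILE_RANGE_WAIT_BEFORE |
                                SYNC_FILE_RANGE_WRITE |
                                SYNC_FILE_RANGE_WAIT_AFTER) != 0)
                    perror("sync_file_range");
            return 0;
    }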

    The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can
    say NFS_DATA_SYNC or NFS_FILE_SYNC. I can skip the ->fsync call for
    NFS_DATA_SYNC which is hopefully the more common."

    Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if
    the queue is congested. This is trivial to fix: add a new flag bit, set
    wbc->nonblocking. But I'm not sure that we want to expose implementation
    details down to that level.

    Note: it's notable that we can sync an fd which wasn't opened for
    writing. Same with fsync() and fdatasync().

    Note: the code takes some care to handle attempts to sync file contents
    outside the 16TB offset on 32-bit machines. It makes such attempts appear to
    succeed, for best 32-bit/64-bit compatibility. Perhaps it should make such
    requests fail...

    Cc: Nick Piggin
    Cc: Michael Kerrisk
    Cc: Ulrich Drepper
    Cc: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton