27 Oct, 2010
40 commits
-
Use the new {max,min}3 macros to save some cycles and bytes on the stack.
This patch substitutes trivial nested macros with their counterpart.Signed-off-by: Hagen Paul Pfeifer
Cc: Joe Perches
Cc: Ingo Molnar
Cc: Hartley Sweeten
Cc: Russell King
Cc: Benjamin Herrenschmidt
Cc: Thomas Gleixner
Cc: Herbert Xu
Cc: Roland Dreier
Cc: Sean Hefty
Cc: Pekka Enberg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Introduce two additional min/max macros to compare three operands. This
will save some cycles as well as some bytes on the stack and last but not
least more pleasing as macro nesting.[akpm@linux-foundation.org: fix warnings]
Signed-off-by: Hagen Paul Pfeifer
Cc: Joe Perches
Cc: Ingo Molnar
Cc: Hartley Sweeten
Cc: Russell King
Cc: Benjamin Herrenschmidt
Cc: Thomas Gleixner
Cc: Herbert Xu
Cc: Roland Dreier
Cc: Sean Hefty
Cc: Pekka Enberg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Some code cleanups for hostfs.
Signed-off-by: Richard Weinberger
Cc: Jeff Dike
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This patch removes __do_IRQ() from user mode linux. __do_IRQ is deprecated.
Signed-off-by: Richard Weinberger
Cc: Jeff Dike
Cc: Thomas Gleixner
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
With glibc 2.11 or later that was built with --enable-multi-arch, the UML
link fails with undefined references to __rel_iplt_start and similar
symbols. In recent binutils, the default linker script defines these
symbols (see ld --verbose). Fix the UML linker scripts to match the new
defaults for these sections.Signed-off-by: Roland McGrath
Cc: Jeff Dike
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
I think that it's better to detect DMA misuse at build time rather than
calling BUG_ON. Architectures that can't do DMA need to define
CONFIG_NO_DMA.Thanks to Sam Ravnborg for explaining how CONFIG_NO_DMA and CONFIG_HAS_DMA
work:http://marc.info/?l=linux-kernel&m=128359913825550&w=2
HAS_DMA is defined like this:
config HAS_DMA
boolean
depends on !NO_DMA
default ySo to set HAS_DMA to true an arch should do:
1) Do not define NO_DMA
2) Define NO_DMA abd set it to 'n'Must archs - including um - used principle 1).
In the um case we want to say that we do NOT have any DMA.
This can be done in two ways.
a) define NO_DMA and set it to 'y'
b) redefine HAS_DMA and set it to 'n'.The patch you provided used principle b) where other archs use principle a).
So I suggest you should use principle a) for um too.Signed-off-by: FUJITA Tomonori
Cc: Miklos Szeredi
Cc: Jeff Dike
Cc: Sam Ravnborg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
T2 are the only alpha SMP systems that do HAE switching at runtime, which
is fundamentally racy on SMP. This patch limits MMIO space on T2 to HAE0
only, like we did on MCPCIA (rawhide) long ago. This leaves us with only
112 Mb of PCI MMIO (128 Mb HAE aperture minus 16 Mb reserved for EISA),
but since linux PCI allocations are reasonably tight, it should be enough
for sane hardware configurations.Also, fix a typo in MCPCIA_FROB_MMIO macro which shouldn't call set_hae()
if MCPCIA_ONE_HAE_WINDOW is defined. It's more for correctness, as
set_hae() is a no-op anyway in that case.Signed-off-by: Ivan Kokshaysky
Cc: Matt Turner
Cc: Richard Henderson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Signed-off-by: FUJITA Tomonori
Cc: Ivan Kokshaysky
Cc: Richard Henderson
Cc: Matt Turner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Structure info is copied to userland with some padding fields unitialized.
It leads to leaking of stack memory.[akpm@linux-foundation.org: remove now-unneeded zeroing of info->hi_ireqfreq]
Signed-off-by: Vasiliy Kulikov
Cc: Clemens Ladisch
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
$./hpet_example info /dev/hpet
-hpet: executing info
hpet_info: hi_irqfreq 0x0 hi_flags 0x0 hi_hpet 0 hi_timer 2Signed-off-by: Jaswinder Singh Rajput
Cc: Clemens Ladisch
Cc: "Venkatesh Pallipadi (Venki)"
Cc: john stultz
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fix the following style problems:
WARNING: Use #include instead of
WARNING: Use #include instead of
ERROR: code indent should use tabs where possible
ERROR: do not initialise statics to 0 or NULLSigned-off-by: Jaswinder Singh Rajput
Cc: Clemens Ladisch
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Jaswinder Singh Rajput wrote:
> By executing Documentation/timers/hpet_example.c
>
> for polling, I requested for 3 iterations but it seems iteration work
> for only 2 as first expired time is always very small.
>
> # ./hpet_example poll /dev/hpet 10 3
> -hpet: executing poll
> hpet_poll: info.hi_flags 0x0
> hpet_poll: expired time = 0x13
> hpet_poll: revents = 0x1
> hpet_poll: data 0x1
> hpet_poll: expired time = 0x1868c
> hpet_poll: revents = 0x1
> hpet_poll: data 0x1
> hpet_poll: expired time = 0x18645
> hpet_poll: revents = 0x1
> hpet_poll: data 0x1Clearing the HPET interrupt enable bit disables interrupt generation
but does not disable the timer, so the interrupt status bit will still
be set when the timer elapses. If another interrupt arrives before
the timer has been correctly programmed (due to some other device on
the same interrupt line, or CONFIG_DEBUG_SHIRQ), this results in an
extra unwanted interrupt event because the status bit is likely to be
set from comparator matches that happened before the device was opened.Therefore, we have to ensure that the interrupt status bit is and
stays cleared until we actually program the timer.Signed-off-by: Clemens Ladisch
Reported-by: Jaswinder Singh Rajput
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: john stultz
Cc: Bob Picco
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
When the initialization code in hpet finds a memory resource and does not
find an IRQ, it does not unmap the memory resource previously mapped.There are buggy BIOSes which report resources exactly like this and what
is worse the memory region bases point to normal RAM. This normally would
not matter since the space is not touched. But when PAT is turned on,
ioremap causes the page to be uncached and sets this bit in page->flags.Then when the page is about to be used by the allocator, it is reported
as:BUG: Bad page state in process md5sum pfn:3ed00
page:ffffea0000dbd800 count:0 mapcount:0 mapping:(null) index:0x0
page flags: 0x20000001000000(uncached)
Pid: 7956, comm: md5sum Not tainted 2.6.34-12-desktop #1
Call Trace:
[] bad_page+0xb1/0x100
[] prep_new_page+0x1a5/0x1c0
[] get_page_from_freelist+0x3a1/0x640
[] __alloc_pages_nodemask+0x10f/0x6b0
...In this particular case:
1) HPET returns 3ed00000 as memory region base, but it is not in
reserved ranges reported by the BIOS (excerpt):
BIOS-e820: 0000000000100000 - 00000000af6cf000 (usable)
BIOS-e820: 00000000af6cf000 - 00000000afdcf000 (reserved)2) there is no IRQ resource reported by HPET method. On the other
hand, the Intel HPET specs (1.0a) says (3.2.5.1):
_CRS (
// Report 1K of memory consumed by this Timer Block
memory range consumed
// Optional: only used if BIOS allocates Interrupts [1]
IRQs consumed
)[1] For case where Timer Block is configured to consume IRQ0/IRQ8 AND
Legacy 8254/Legacy RTC hardware still exists, the device objects
associated with 8254 & RTC devices should not report IRQ0/IRQ8 as
"consumed resources".So in theory we should check whether if it is the case and use those
interrupts instead.Anyway the address reported by the BIOS here is bogus, so non-presence
of IRQ doesn't mean the "optional" part in point 2).Since I got no reply previously, fix this by simply unmapping the space
when IRQ is not found and memory region was mapped previously. It would
be probably more safe to walk the resources again and unmap appropriately
depending on type. But as we now use only ioremap for both 2 memory
resource types, it is not necessarily needed right now.Addresses https://bugzilla.novell.com/show_bug.cgi?id=629908
Reported-by: Olaf Hering
Signed-off-by: Jiri Slaby
Acked-by: Clemens Ladisch
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Simple code for reducing list_empty(&source) check.
Signed-off-by: Bob Liu
Acked-by: KAMEZAWA Hiroyuki
Acked-by: Wu Fengguang
Cc: KOSAKI Motohiro
Cc: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
If not_managed is true all pages will be putback to lru, so break the loop
earlier to skip other pages isolate.Signed-off-by: Bob Liu
Acked-by: KAMEZAWA Hiroyuki
Acked-by: Wu Fengguang
Cc: KOSAKI Motohiro
Cc: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
__test_page_isolated_in_pageblock() returns 1 if all pages in the range
are isolated, so fix the comment. Variable `pfn' will be initialised in
the following loop so remove it.Signed-off-by: Bob Liu
Acked-by: KAMEZAWA Hiroyuki
Cc: Wu Fengguang
Cc: KOSAKI Motohiro
Cc: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
page_order() is called by memory hotplug's user interface to check the
section is removable or not. (is_mem_section_removable())It calls page_order() withoug holding zone->lock.
So, even if the caller doesif (PageBuddy(page))
ret = page_order(page) ...
The caller may hit BUG_ON().For fixing this, there are 2 choices.
1. add zone->lock.
2. remove BUG_ON().is_mem_section_removable() is used for some "advice" and doesn't need to
be 100% accurate. This is_removable() can be called via user program..
We don't want to take this important lock for long by user's request. So,
this patch removes BUG_ON().Signed-off-by: KAMEZAWA Hiroyuki
Acked-by: Wu Fengguang
Acked-by: Michal Hocko
Acked-by: Mel Gorman
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Add missing spin_lock() of the page_table_lock before an error return in
hugetlb_cow(). Callers of hugtelb_cow() expect it to be held upon return.Signed-off-by: Dean Nelson
Cc: Mel Gorman
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The vma returned by find_vma does not necessarily include the target
address. If this happens the code tries to follow a page outside of any
vma and returns ENOENT instead of EFAULT.Signed-off-by: Gleb Natapov
Acked-by: Christoph Lameter
Cc: Minchan Kim
Cc: KAMEZAWA Hiroyuki
Cc: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
System management wants to subscribe to changes in swap configuration.
Make /proc/swaps pollable like /proc/mounts.[akpm@linux-foundation.org: document proc_poll_event]
Signed-off-by: Kay Sievers
Acked-by: Greg KH
Cc: Jonathan Corbet
Cc: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Add vzalloc() and vzalloc_node() to encapsulate the
vmalloc-then-memset-zero operation.Use __GFP_ZERO to zero fill the allocated memory.
Signed-off-by: Dave Young
Cc: Christoph Lameter
Acked-by: Greg Ungerer
Cc: David Howells
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Reported-by: KOSAKI Motohiro
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
I had to go back to a 2.6.20 tree to work out why we're adding a
number-of-inodes into a number-of-pages count. Restore the lost comment.Cc: Jens Axboe
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Introduce ___GFP_* masks in order for gfp_t to not be mixed with plain
integers which causes a lot of warnings like the following:warning: restricted gfp_t degrades to integer
Signed-off-by: Namhyung Kim
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This removes following warning from sparse:
mm/vmstat.c:466:5: warning: symbol 'fragmentation_index' was not declared. Should it be static?
[akpm@linux-foundation.org: move the include to top-of-file]
Signed-off-by: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Declare 'bdi_pending_list' and 'tag_pages_for_writeback()' to remove
following sparse warnings:mm/backing-dev.c:46:1: warning: symbol 'bdi_pending_list' was not declared. Should it be static?
mm/page-writeback.c:825:6: warning: symbol 'tag_pages_for_writeback' was not declared. Should it be static?Signed-off-by: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
s_start() and s_stop() grab/release vmlist_lock but were missing proper
annotations. Add them.Signed-off-by: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Rename redundant 'tmp' to fix following sparse warnings:
mm/vmalloc.c:296:34: warning: symbol 'tmp' shadows an earlier one
mm/vmalloc.c:293:24: originally declared hereSigned-off-by: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make anon_vma_chain_free() static. It is called only in rmap.c and the
corresponding alloc function is already static.Signed-off-by: Namhyung Kim
Cc: Andi Kleen
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The page_check_address() conditionally grabs *@ptlp in case of returning
non-NULL. Rename and wrap it using __cond_lock() removes following
warnings from sparse:mm/rmap.c:472:9: warning: context imbalance in 'page_mapped_in_vma' - unexpected unlock
mm/rmap.c:524:9: warning: context imbalance in 'page_referenced_one' - unexpected unlock
mm/rmap.c:706:9: warning: context imbalance in 'page_mkclean_one' - unexpected unlock
mm/rmap.c:1066:9: warning: context imbalance in 'try_to_unmap_one' - unexpected unlockSigned-off-by: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The page_lock_anon_vma() conditionally grabs RCU and anon_vma lock but
page_unlock_anon_vma() releases them unconditionally. This leads sparse
to complain about context imbalance. Annotate them.Signed-off-by: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The follow_pte() conditionally grabs *@ptlp in case of returning 0.
Rename and wrap it using __cond_lock() removes following warnings:mm/memory.c:2337:9: warning: context imbalance in 'do_wp_page' - unexpected unlock
mm/memory.c:3142:19: warning: context imbalance in 'handle_mm_fault' - different lock contexts for basic blockSigned-off-by: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The do_wp_page() releases @ptl but was missing proper annotation. Add it.
This removes following warnings from sparse:mm/memory.c:2337:9: warning: context imbalance in 'do_wp_page' - unexpected unlock
mm/memory.c:3142:19: warning: context imbalance in 'handle_mm_fault' - different lock contexts for basic blockSigned-off-by: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The get_locked_pte() conditionally grabs 'ptl' in case of returning
non-NULL. This leads sparse to complain about context imbalance. Rename
and wrap it using __cond_lock() to make sparse happy.Signed-off-by: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This removes following warning from sparse:
mm/page_alloc.c:1934:9: warning: restricted gfp_t degrades to integer
Signed-off-by: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
'end' shadows earlier one and is not necessary at all. Remove it and use
'pos' instead. This removes following sparse warnings:mm/filemap.c:2180:24: warning: symbol 'end' shadows an earlier one
mm/filemap.c:2132:25: originally declared hereSigned-off-by: Namhyung Kim
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
access_error() already takes error_code as an argument, so there is
no need for an additional write flag.Signed-off-by: Michel Lespinasse
Acked-by: Rik van Riel
Cc: Nick Piggin
Acked-by: Wu Fengguang
Cc: Ying Han
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Thomas Gleixner
Acked-by: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This change reduces mmap_sem hold times that are caused by waiting for
disk transfers when accessing file mapped VMAs.It introduces the VM_FAULT_ALLOW_RETRY flag, which indicates that the call
site wants mmap_sem to be released if blocking on a pending disk transfer.
In that case, filemap_fault() returns the VM_FAULT_RETRY status bit and
do_page_fault() will then re-acquire mmap_sem and retry the page fault.It is expected that the retry will hit the same page which will now be
cached, and thus it will complete with a low mmap_sem hold time.Tests:
- microbenchmark: thread A mmaps a large file and does random read accesses
to the mmaped area - achieves about 55 iterations/s. Thread B does
mmap/munmap in a loop at a separate location - achieves 55 iterations/s
before, 15000 iterations/s after.- We are seeing related effects in some applications in house, which show
significant performance regressions when running without this change.[akpm@linux-foundation.org: fix warning & crash]
Signed-off-by: Michel Lespinasse
Acked-by: Rik van Riel
Acked-by: Linus Torvalds
Cc: Nick Piggin
Reviewed-by: Wu Fengguang
Cc: Ying Han
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Thomas Gleixner
Acked-by: "H. Peter Anvin"
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Introduce a single location where filemap_fault() locks the desired page.
There used to be two such places, depending if the initial find_get_page()
was successful or not.Signed-off-by: Michel Lespinasse
Acked-by: Rik van Riel
Acked-by: Linus Torvalds
Cc: Nick Piggin
Reviewed-by: Wu Fengguang
Cc: Ying Han
Cc: Peter Zijlstra
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Reorder structure anon_vma to remove alignment padding on 64 builds when
(CONFIG_KSM || CONFIG_MIGRATION).
This will shrink the size of the anon_vma structure from 40 to 32 bytes
& allow more objects per slab in its kmem_cache.Under slub the objects in the anon_vma kmem_cache will then be 40 bytes
with 102 objects per slab. (On v2.6.36 without this patch,the size is 48
bytes and 85 objects/slab.)Signed-off-by: Richard Kennedy
Reviewed-by: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds