15 Jul, 2013

1 commit

  • The __cpuinit type of throwaway sections might have made sense
    some time ago when RAM was more constrained, but now the savings
    do not offset the cost and complications. The fix in commit
    5e427ec2d0 ("x86: Fix bit corruption at CPU resume time") is a
    good example of the nasty bugs that improper use of the various
    __init prefixes can create.

    After a discussion on LKML[1] it was decided that cpuinit should go
    the way of devinit and be phased out. Once all the users are gone,
    we can then finally remove the macros themselves from linux/init.h.

    Note that some harmless section mismatch warnings may result, since
    notify_cpu_starting() and cpu_up() are arch-independent (kernel/cpu.c)
    and are flagged as __cpuinit -- so if we remove __cpuinit from the
    arch-specific callers, we will also get section mismatch warnings.
    As an intermediate step, we intend to turn the linux/init.h cpuinit
    content into no-ops as early as possible, since that will get rid
    of these warnings. In any case, they are temporary and harmless.

    This removes all the arch/x86 uses of the __cpuinit macros from
    all C files. x86 only had the one __CPUINIT used in assembly files,
    and it wasn't paired with a .previous or a __FINIT, so we can
    delete it directly without any corresponding change there.

    [1] https://lkml.org/lkml/2013/5/20/589
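
    As a reference for that intermediate step, a minimal sketch of what
    "turning the cpuinit content into no-ops" means (an assumed form of
    the linux/init.h change, not the actual diff): the markers stay
    defined but expand to nothing, so remaining users keep compiling
    and the section mismatch warnings disappear.

        /* sketch: cpuinit markers as no-ops (assumed form) */
        #define __cpuinit
        #define __cpuinitdata
        #define __cpuinitconst
        #define __CPUINIT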

    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: x86@kernel.org
    Acked-by: Ingo Molnar
    Acked-by: Thomas Gleixner
    Acked-by: H. Peter Anvin
    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

11 Jul, 2013

1 commit

  • Since all architectures have been converted to use vm_unmapped_area(),
    there is no remaining use for the free_area_cache.
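
    For illustration, this is roughly what an arch helper looks like on
    top of vm_unmapped_area() (a minimal sketch using the 3.10-era
    struct vm_unmapped_area_info; the function name is hypothetical):

        /* sketch: bottom-up search via vm_unmapped_area(), no
         * free_area_cache involved */
        static unsigned long
        example_get_unmapped_area(unsigned long len)
        {
                struct vm_unmapped_area_info info;

                info.flags = 0;                 /* 0 = bottom-up */
                info.length = len;
                info.low_limit = TASK_UNMAPPED_BASE;
                info.high_limit = TASK_SIZE;
                info.align_mask = 0;
                info.align_offset = 0;
                return vm_unmapped_area(&info);
        }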

    Signed-off-by: Michel Lespinasse
    Acked-by: Rik van Riel
    Cc: "James E.J. Bottomley"
    Cc: "Luck, Tony"
    Cc: Benjamin Herrenschmidt
    Cc: David Howells
    Cc: Helge Deller
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Paul Mackerras
    Cc: Richard Henderson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     

10 Jul, 2013

1 commit

  • The old code accumulated addr to get the right pmd; however, pmds
    are now preallocated and passed in as a parameter, so there is no
    need to accumulate the addr variable any more. This patch removes it.

    Signed-off-by: Wanpeng Li
    Reviewed-by: Michal Hocko
    Reviewed-by: Zhang Yanfei
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wanpeng Li
     

04 Jul, 2013

6 commits

  • Prepare for removing num_physpages and simplify mem_init().

    Signed-off-by: Jiang Liu
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Andreas Herrmann
    Cc: Tang Chen
    Cc: Wen Congyang
    Cc: Jianguo Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Concentrate the code that modifies totalram_pages into the mm core,
    so that arch memory initialization code doesn't need to take care of
    it. With these changes applied, only the following functions from the
    mm core modify the global variable totalram_pages: free_bootmem_late(),
    free_all_bootmem(), free_all_bootmem_node(), adjust_managed_page_count().

    With this patch applied, it will be much easier for us to keep
    totalram_pages and zone->managed_pages consistent.
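
    For reference, a minimal sketch of what adjust_managed_page_count()
    does under this scheme (lock and field names are assumptions based
    on this series):

        /* sketch: the one place that moves pages in/out of the
         * managed counters */
        void adjust_managed_page_count(struct page *page, long count)
        {
                spin_lock(&managed_page_count_lock);
                page_zone(page)->managed_pages += count;
                totalram_pages += count;
                spin_unlock(&managed_page_count_lock);
        }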

    Signed-off-by: Jiang Liu
    Acked-by: David Howells
    Cc: "H. Peter Anvin"
    Cc: "Michael S. Tsirkin"
    Cc: Arnd Bergmann
    Cc: Catalin Marinas
    Cc: Chris Metcalf
    Cc: Geert Uytterhoeven
    Cc: Ingo Molnar
    Cc: Jeremy Fitzhardinge
    Cc: Jianguo Wu
    Cc: Joonsoo Kim
    Cc: Kamezawa Hiroyuki
    Cc: Konrad Rzeszutek Wilk
    Cc: Marek Szyprowski
    Cc: Mel Gorman
    Cc: Michel Lespinasse
    Cc: Minchan Kim
    Cc: Rik van Riel
    Cc: Rusty Russell
    Cc: Tang Chen
    Cc: Tejun Heo
    Cc: Thomas Gleixner
    Cc: Wen Congyang
    Cc: Will Deacon
    Cc: Yasuaki Ishimatsu
    Cc: Yinghai Lu
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • In order to simplify management of totalram_pages and
    zone->managed_pages, make __free_pages_bootmem() only available at boot
    time. With this change applied, __free_pages_bootmem() will only be
    used by bootmem.c and nobootmem.c at boot time, so mark it as __init.
    Other callers of __free_pages_bootmem() have been converted to use
    free_reserved_page(), which handles totalram_pages and
    zone->managed_pages in a safer way.

    This patch also fixes a bug in free_pagetable() for x86_64, which should
    increase zone->managed_pages instead of zone->present_pages when freeing
    reserved pages.

    And now we have managed_pages_count_lock to protect totalram_pages and
    zone->managed_pages, so remove the redundant ppb_lock lock in
    put_page_bootmem(). This greatly simplifies the locking rules.
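
    A minimal sketch of the safer helper mentioned above (assumed shape,
    inferred from this description):

        /* sketch: free one reserved page and keep both counters in sync */
        static inline void free_reserved_page(struct page *page)
        {
                ClearPageReserved(page);
                init_page_count(page);
                __free_page(page);
                adjust_managed_page_count(page, 1);
        }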

    Signed-off-by: Jiang Liu
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Yinghai Lu
    Cc: Wen Congyang
    Cc: Tang Chen
    Cc: Yasuaki Ishimatsu
    Cc: Mel Gorman
    Cc: Minchan Kim
    Cc: "Michael S. Tsirkin"
    Cc: Arnd Bergmann
    Cc: Catalin Marinas
    Cc: Chris Metcalf
    Cc: David Howells
    Cc: Geert Uytterhoeven
    Cc: Jeremy Fitzhardinge
    Cc: Jianguo Wu
    Cc: Joonsoo Kim
    Cc: Kamezawa Hiroyuki
    Cc: Konrad Rzeszutek Wilk
    Cc: Marek Szyprowski
    Cc: Michel Lespinasse
    Cc: Rik van Riel
    Cc: Rusty Russell
    Cc: Tejun Heo
    Cc: Will Deacon
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Commit "mm: introduce new field 'managed_pages' to struct zone" assumes
    that all highmem pages will be freed into the buddy system by function
    mem_init(). But that's not always true: some architectures may reserve
    some highmem pages during boot. For example, PPC may allocate highmem
    pages for giant HugeTLB pages, and several architectures have code to
    check the PageReserved flag to exclude highmem pages allocated during
    boot when freeing highmem pages into the buddy system.

    So treat highmem pages in the same way as normal pages, that is to:
    1) reset zone->managed_pages to zero in mem_init().
    2) recalculate managed_pages when freeing pages into the buddy system.
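
    A sketch of what step 2) amounts to in helper form (assumed shape;
    the counters are the ones named in this series):

        /* sketch: free one boot-reserved highmem page into the buddy
         * system and recalculate the counters */
        void free_highmem_page(struct page *page)
        {
                __free_reserved_page(page);     /* clear reserved + free */
                totalram_pages++;
                page_zone(page)->managed_pages++;
                totalhigh_pages++;
        }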

    Signed-off-by: Jiang Liu
    Cc: "H. Peter Anvin"
    Cc: Tejun Heo
    Cc: Joonsoo Kim
    Cc: Yinghai Lu
    Cc: Mel Gorman
    Cc: Minchan Kim
    Cc: Kamezawa Hiroyuki
    Cc: Marek Szyprowski
    Cc: "Michael S. Tsirkin"
    Cc: Arnd Bergmann
    Cc: Catalin Marinas
    Cc: Chris Metcalf
    Cc: David Howells
    Cc: Geert Uytterhoeven
    Cc: Ingo Molnar
    Cc: Jeremy Fitzhardinge
    Cc: Jianguo Wu
    Cc: Konrad Rzeszutek Wilk
    Cc: Michel Lespinasse
    Cc: Rik van Riel
    Cc: Rusty Russell
    Cc: Tang Chen
    Cc: Thomas Gleixner
    Cc: Wen Congyang
    Cc: Will Deacon
    Cc: Yasuaki Ishimatsu
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Use the common helper function free_reserved_area() to simplify the code.
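
    For illustration, a converted call site roughly looks like this (a
    sketch against the helper signature of that era; treat the exact
    form as an assumption):

        /* sketch: an arch free_initmem() collapsing into the common helper */
        void free_initmem(void)
        {
                free_reserved_area(&__init_begin, &__init_end,
                                   POISON_FREE_INITMEM, "unused kernel");
        }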

    Signed-off-by: Jiang Liu
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Yinghai Lu
    Cc: Tang Chen
    Cc: Wen Congyang
    Cc: Jianguo Wu
    Cc: "Michael S. Tsirkin"
    Cc: Arnd Bergmann
    Cc: Catalin Marinas
    Cc: Chris Metcalf
    Cc: David Howells
    Cc: Geert Uytterhoeven
    Cc: Jeremy Fitzhardinge
    Cc: Joonsoo Kim
    Cc: Kamezawa Hiroyuki
    Cc: Konrad Rzeszutek Wilk
    Cc: Marek Szyprowski
    Cc: Mel Gorman
    Cc: Michel Lespinasse
    Cc: Minchan Kim
    Cc: Rik van Riel
    Cc: Rusty Russell
    Cc: Tejun Heo
    Cc: Will Deacon
    Cc: Yasuaki Ishimatsu
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Pull ARM64 updates from Catalin Marinas:
    "Main features:
    - KVM and Xen ports to AArch64
    - Hugetlbfs and transparent huge pages support for arm64
    - Applied Micro X-Gene Kconfig entry and dts file
    - Cache flushing improvements

    For arm64 huge pages support, there are x86 changes moving part of
    arch/x86/mm/hugetlbpage.c into mm/hugetlb.c to be re-used by arm64"

    * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: (66 commits)
    arm64: Add initial DTS for APM X-Gene Storm SOC and APM Mustang board
    arm64: Add defines for APM ARMv8 implementation
    arm64: Enable APM X-Gene SOC family in the defconfig
    arm64: Add Kconfig option for APM X-Gene SOC family
    arm64/Makefile: provide vdso_install target
    ARM64: mm: THP support.
    ARM64: mm: Raise MAX_ORDER for 64KB pages and THP.
    ARM64: mm: HugeTLB support.
    ARM64: mm: Move PTE_PROT_NONE bit.
    ARM64: mm: Make PAGE_NONE pages read only and no-execute.
    ARM64: mm: Restore memblock limit when map_mem finished.
    mm: thp: Correct the HPAGE_PMD_ORDER check.
    x86: mm: Remove general hugetlb code from x86.
    mm: hugetlb: Copy general hugetlb code from x86 to mm.
    x86: mm: Remove x86 version of huge_pmd_share.
    mm: hugetlb: Copy huge_pmd_share from x86 to mm.
    arm64: KVM: document kernel object mappings in HYP
    arm64: KVM: MAINTAINERS update
    arm64: KVM: userspace API documentation
    arm64: KVM: enable initialization of a 32bit vcpu
    ...

    Linus Torvalds
     

03 Jul, 2013

1 commit

  • Pull x86 mm changes from Ingo Molnar:
    "Misc improvements:

    - Fix /proc/mtrr reporting
    - Fix ioremap printout
    - Remove the unused pvclock fixmap entry on 32-bit
    - misc cleanups"

    * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/ioremap: Correct function name output
    x86: Fix /proc/mtrr with base/size more than 44bits
    ix86: Don't waste fixmap entries
    x86/mm: Drop unneeded include
    x86_64: Correct phys_addr in cleanup_highmap comment

    Linus Torvalds
     

14 Jun, 2013

2 commits

  • huge_pte_alloc, huge_pte_offset and follow_huge_p[mu]d have
    already been copied over to mm.

    This patch removes the x86 copies of these functions and activates
    the general ones by enabling:
    CONFIG_ARCH_WANT_GENERAL_HUGETLB
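
    A sketch of how the general version is gated (illustrative; the body
    follows the usual x86-style page-table walk and is an assumption,
    not a verbatim copy of mm/hugetlb.c):

        #ifdef CONFIG_ARCH_WANT_GENERAL_HUGETLB
        pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
        {
                pgd_t *pgd = pgd_offset(mm, addr);
                pud_t *pud;
                pmd_t *pmd = NULL;

                if (pgd_present(*pgd)) {
                        pud = pud_offset(pgd, addr);
                        if (pud_present(*pud)) {
                                if (pud_huge(*pud))     /* 1G page */
                                        return (pte_t *)pud;
                                pmd = pmd_offset(pud, addr);
                        }
                }
                return (pte_t *)pmd;
        }
        #endif /* CONFIG_ARCH_WANT_GENERAL_HUGETLB */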

    Signed-off-by: Steve Capper
    Acked-by: Catalin Marinas
    Acked-by: Andrew Morton

    Steve Capper
     
  • The huge_pmd_share code has been copied over to mm/hugetlb.c to
    make it accessible to other architectures.

    Remove the x86 copy of the huge_pmd_share code and enable the
    ARCH_WANT_HUGE_PMD_SHARE config flag. That way we reference the
    general one.

    Signed-off-by: Steve Capper
    Acked-by: Catalin Marinas
    Acked-by: Andrew Morton

    Steve Capper
     

01 Jun, 2013

1 commit

  • Commit

    8d57470d x86, mm: setup page table in top-down

    causes a kernel panic while setting mem=2G.

    [mem 0x00000000-0x000fffff] page 4k
    [mem 0x7fe00000-0x7fffffff] page 1G
    [mem 0x7c000000-0x7fdfffff] page 1G
    [mem 0x00100000-0x001fffff] page 4k
    [mem 0x00200000-0x7bffffff] page 2M

    The last entry is not what we want; we should have:
    [mem 0x00200000-0x3fffffff] page 2M
    [mem 0x40000000-0x7bffffff] page 1G

    Actually, we merge contiguous ranges with the same page size too early.
    In this case, before merging we have:
    [mem 0x00200000-0x3fffffff] page 2M
    [mem 0x40000000-0x7bffffff] page 2M
    and after merging them we get:
    [mem 0x00200000-0x7bffffff] page 2M
    even though we could use a 1G page to map
    [mem 0x40000000-0x7bffffff]

    That causes a problem, because we have already mapped
    [mem 0x7fe00000-0x7fffffff] page 1G
    [mem 0x7c000000-0x7fdfffff] page 1G
    with 1G pages, i.e. [0x40000000-0x7fffffff] is already mapped with 1G
    pages. During phys_pud_init() for [0x40000000-0x7bffffff], it will not
    reuse that existing pud page; it allocates a new one and then tries to
    map the range with 2M pages instead, as page_size_mask does not include
    PG_LEVEL_1G. In the end [0x7c000000-0x7fffffff] is left unmapped,
    because the loop in phys_pmd_init() stops mapping at 0x7bffffff.

    That is the right behavior: it maps the exact range with the exact page
    size that we ask for, and we would have to explicitly call it to map
    [0x7c000000-0x7fffffff] before or after mapping [0x40000000-0x7bffffff].
    In any case, we need to make sure each range's page_size_mask is correct
    and consistent after split_mem_range().

    Fix this by calling adjust_range_size_mask() before merging ranges
    with the same page size.

    -v2: update change log.
    -v3: add more explanation why [7c000000-0x7fffffff] is not mapped, and
    it causes panic.

    Bisected-by: "Xie, ChanglongX"
    Bisected-by: Yuanhan Liu
    Reported-and-tested-by: Yuanhan Liu
    Signed-off-by: Yinghai Lu
    Link: http://lkml.kernel.org/r/1370015587-20835-1-git-send-email-yinghai@kernel.org
    Cc: v3.9
    Signed-off-by: H. Peter Anvin

    Yinghai Lu
     

28 May, 2013

1 commit

  • For x86_64, we have phys_base, which is the delta between the
    address the kernel is actually running at and the address the
    kernel was compiled to run at. It is not phys_addr, so correct it.

    Signed-off-by: Zhang Yanfei
    Link: http://lkml.kernel.org/r/5192F9BF.2000802@cn.fujitsu.com
    Signed-off-by: Ingo Molnar

    Zhang Yanfei
     

10 May, 2013

1 commit

  • Two sets of comments were lost during patch-series shuffling:

    - comments for init_range_memory_mapping()

    - comments in init_mem_mapping that are helpful for reminding people
    that the pagetable is set up top-down

    The comments were written by Yinghai in his patch in:

    https://lkml.org/lkml/2012/11/28/620

    This patch reintroduces them.

    Originally-From: Yinghai Lu
    Signed-off-by: Zhang Yanfei
    Cc: Yasuaki Ishimatsu
    Cc: Konrad Rzeszutek Wilk
    Cc: Andrew Morton
    Link: http://lkml.kernel.org/r/518BC776.7010506@gmail.com
    [ Tidied it all up a bit. ]
    Signed-off-by: Ingo Molnar

    Zhang Yanfei
     

02 May, 2013

1 commit

  • Pull VFS updates from Al Viro:

    Misc cleanups all over the place, mainly wrt /proc interfaces (switch
    create_proc_entry to proc_create(), get rid of the deprecated
    create_proc_read_entry() in favor of using proc_create_data() and
    seq_file etc).

    7kloc removed.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
    don't bother with deferred freeing of fdtables
    proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
    proc: Make the PROC_I() and PDE() macros internal to procfs
    proc: Supply a function to remove a proc entry by PDE
    take cgroup_open() and cpuset_open() to fs/proc/base.c
    ppc: Clean up scanlog
    ppc: Clean up rtas_flash driver somewhat
    hostap: proc: Use remove_proc_subtree()
    drm: proc: Use remove_proc_subtree()
    drm: proc: Use minor->index to label things, not PDE->name
    drm: Constify drm_proc_list[]
    zoran: Don't print proc_dir_entry data in debug
    reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
    proc: Supply an accessor for getting the data from a PDE's parent
    airo: Use remove_proc_subtree()
    rtl8192u: Don't need to save device proc dir PDE
    rtl8187se: Use a dir under /proc/net/r8180/
    proc: Add proc_mkdir_data()
    proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
    proc: Move PDE_NET() to fs/proc/proc_net.c
    ...

    Linus Torvalds
     

30 Apr, 2013

14 commits

  • Pull x86 mm changes from Ingo Molnar:
    "Misc smaller changes all over the map"

    * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/iommu/dmar: Remove warning for HPET scope type
    x86/mm/gart: Drop unnecessary check
    x86/mm/hotplug: Put kernel_physical_mapping_remove() declaration in CONFIG_MEMORY_HOTREMOVE
    x86/mm/fixmap: Remove unused FIX_CYCLONE_TIMER
    x86/mm/numa: Simplify some bit mangling
    x86/mm: Re-enable DEBUG_TLBFLUSH for X86_32
    x86/mm/cpa: Cleanup split_large_page() and its callee
    x86: Drop always empty .text..page_aligned section

    Linus Torvalds
     
  • Pull x86 cpuid changes from Ingo Molnar:
    "The biggest change is x86 CPU bug handling refactoring and cleanups,
    by Borislav Petkov"

    * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86, CPU, AMD: Drop useless label
    x86, AMD: Correct {rd,wr}msr_amd_safe warnings
    x86: Fold-in trivial check_config function
    x86, cpu: Convert AMD Erratum 400
    x86, cpu: Convert AMD Erratum 383
    x86, cpu: Convert Cyrix coma bug detection
    x86, cpu: Convert FDIV bug detection
    x86, cpu: Convert F00F bug detection
    x86, cpu: Expand cpufeature facility to include cpu bugs

    Linus Torvalds
     
  • Pull scheduler changes from Ingo Molnar:
    "The main changes in this development cycle were:

    - full dynticks preparatory work by Frederic Weisbecker

    - factor out the cpu time accounting code better, by Li Zefan

    - multi-CPU load balancer cleanups and improvements by Joonsoo Kim

    - various smaller fixes and cleanups"

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
    sched: Fix init NOHZ_IDLE flag
    sched: Prevent to re-select dst-cpu in load_balance()
    sched: Rename load_balance_tmpmask to load_balance_mask
    sched: Move up affinity check to mitigate useless redoing overhead
    sched: Don't consider other cpus in our group in case of NEWLY_IDLE
    sched: Explicitly cpu_idle_type checking in rebalance_domains()
    sched: Change position of resched_cpu() in load_balance()
    sched: Fix wrong rq's runnable_avg update with rt tasks
    sched: Document task_struct::personality field
    sched/cpuacct/UML: Fix header file dependency bug on the UML build
    cgroup: Kill subsys.active flag
    sched/cpuacct: No need to check subsys active state
    sched/cpuacct: Initialize cpuacct subsystem earlier
    sched/cpuacct: Initialize root cpuacct earlier
    sched/cpuacct: Allocate per_cpu cpuusage for root cpuacct statically
    sched/cpuacct: Clean up cpuacct.h
    sched/cpuacct: Remove redundant NULL checks in cpuacct_acount_field()
    sched/cpuacct: Remove redundant NULL checks in cpuacct_charge()
    sched/cpuacct: Add cpuacct_acount_field()
    sched/cpuacct: Add cpuacct_init()
    ...

    Linus Torvalds
     
  • Use the preferable function name, which makes it explicit that a
    pseudo-random number generator is being used.
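
    The change is mechanical; a hypothetical call site (names assumed,
    since the log does not quote the diff):

        u32 r = prandom_u32();  /* was random32(); same generator,
                                 * clearer name */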

    Signed-off-by: Akinobu Mita
    Acked-by: H. Peter Anvin
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • pageattr-test calls srandom32() once every test iteration. But
    calling srandom32() after late initcalls is not meaningful, because
    the random state for random32() has already been mixed with good
    random numbers by the late_initcall prandom_reseed().

    So remove the call to srandom32().

    Signed-off-by: Akinobu Mita
    Acked-by: H. Peter Anvin
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Signed-off-by: Cody P Schafer
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Benjamin Herrenschmidt
    Acked-by: Yinghai Lu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cody P Schafer
     
  • Memory hotplug can happen on a machine under load, with memory
    shortage and fragmentation, so huge page allocations for the vmemmap
    are not
    guaranteed to succeed.

    Try to fall back to regular pages before failing the hotplug event
    completely.
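
    A minimal sketch of the fallback shape (simplified and assumed from
    this description, not the exact x86-64 code):

        /* sketch: prefer a 2M-backed vmemmap chunk, fall back to 4K */
        void *p = vmemmap_alloc_block_buf(PMD_SIZE, node);
        if (p)
                set_pmd(pmd, __pmd(__pa(p) | pgprot_val(PAGE_KERNEL_LARGE)));
        else if (vmemmap_populate_basepages(addr, next, node))
                return -ENOMEM;         /* even base pages failed */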

    Signed-off-by: Johannes Weiner
    Cc: Ben Hutchings
    Cc: Bernhard Schmidt
    Cc: Johannes Weiner
    Cc: Russell King
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Benjamin Herrenschmidt
    Cc: "Luck, Tony"
    Cc: Heiko Carstens
    Cc: David Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • We already have generic code to allocate the vmemmap with regular
    pages; use it.

    Signed-off-by: Johannes Weiner
    Cc: Ben Hutchings
    Cc: Bernhard Schmidt
    Cc: Johannes Weiner
    Cc: Russell King
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Benjamin Herrenschmidt
    Cc: "Luck, Tony"
    Cc: Heiko Carstens
    Cc: David Miller
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • No need to maintain addr_end and p_end when they are never actually read
    anywhere on !pse setups. Remove the dead code.

    Signed-off-by: Johannes Weiner
    Cc: Ben Hutchings
    Cc: Bernhard Schmidt
    Cc: Johannes Weiner
    Cc: Russell King
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Benjamin Herrenschmidt
    Cc: "Luck, Tony"
    Cc: Heiko Carstens
    Cc: David Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • The sparse code, when asking the architecture to populate the vmemmap,
    specifies the section range as a starting page and a number of pages.

    This is an awkward interface, because none of the arch-specific code
    actually thinks of the range in terms of 'struct page' units and always
    translates it to bytes first.

    In addition, later patches mix huge page and regular page backing for
    the vmemmap. For this, they need to call vmemmap_populate_basepages()
    on sub-section ranges with PAGE_SIZE and PMD_SIZE in mind. But these
    are not necessarily multiples of the 'struct page' size and so this unit
    is too coarse.

    Just translate the section range into bytes once in the generic sparse
    code, then pass byte ranges down the stack.
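
    Sketched as prototypes (parameter names are assumptions; the shapes
    follow the description above):

        /* before: section range expressed in struct-page units */
        int vmemmap_populate(struct page *start_page,
                             unsigned long nr_pages, int node);

        /* after: a plain byte range, computed once in the sparse code */
        int vmemmap_populate(unsigned long start, unsigned long end,
                             int node);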

    Signed-off-by: Johannes Weiner
    Cc: Ben Hutchings
    Cc: Bernhard Schmidt
    Cc: Johannes Weiner
    Cc: Russell King
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Benjamin Herrenschmidt
    Cc: "Luck, Tony"
    Cc: Heiko Carstens
    Acked-by: David S. Miller
    Tested-by: David S. Miller
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • This patchset removes vm_struct list management after initializing
    vmalloc. Adding and removing an entry to vmlist is linear in time,
    so it is inefficient. If we maintain this list, the overall time
    complexity of adding and removing an area to/from the vmalloc space
    is O(N), even though we use an rbtree to find a vacant place, which
    takes just O(log N).

    vmlist and vmlist_lock are also used in many places outside of
    vmalloc.c. It is preferable to hide these raw data structures and
    provide well-defined functions for accessing them, because that
    prevents mistakes when manipulating these structures and makes the
    vmalloc layer easier to maintain.

    For kexec and makedumpfile, I export vmap_area_list instead of
    vmlist. This comes from Atsushi's recommendation. For more
    information, please refer to the link below:
    https://lkml.org/lkml/2012/12/6/184

    This patch:

    The purpose of iterating over the vmlist is to find the vm area with
    a specific virtual address. find_vm_area() is provided for this
    purpose and is more efficient, because it uses an rbtree. So switch
    to it.
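
    The shape of the conversion (the before-side walk is illustrative):

        struct vm_struct *vm;

        /* before: O(N) walk over the vmlist */
        for (vm = vmlist; vm; vm = vm->next)
                if (vm->addr == addr)
                        break;

        /* after: O(log N) rbtree lookup */
        vm = find_vm_area(addr);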

    Signed-off-by: Joonsoo Kim
    Signed-off-by: Joonsoo Kim
    Acked-by: Guan Xuetao
    Acked-by: Ingo Molnar
    Acked-by: Chris Metcalf
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Atsushi Kumagai
    Cc: Dave Anderson
    Cc: Eric Biederman
    Cc: Vivek Goyal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joonsoo Kim
     
  • Use helper function free_highmem_page() to free highmem pages into
    the buddy system.

    Signed-off-by: Jiang Liu
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Cong Wang
    Cc: Yinghai Lu
    Cc: Attilio Rao
    Cc: Konrad Rzeszutek Wilk
    Reviewed-by: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Use common helper functions to free reserved pages.

    Signed-off-by: Jiang Liu
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Split kcore bits from linux/procfs.h into linux/kcore.h.

    Signed-off-by: David Howells
    Acked-by: KOSAKI Motohiro
    Acked-by: Ralf Baechle
    cc: linux-mips@linux-mips.org
    cc: sparclinux@vger.kernel.org
    cc: x86@kernel.org
    cc: linux-mm@kvack.org
    Signed-off-by: Al Viro

    David Howells
     

15 Apr, 2013

2 commits

  • kernel_physical_mapping_remove() is only called by
    arch_remove_memory() in init_64.c, which is enclosed in
    CONFIG_MEMORY_HOTREMOVE. So when we don't configure
    CONFIG_MEMORY_HOTREMOVE, the compiler will give a warning:

    warning: ‘kernel_physical_mapping_remove’ defined but not used

    So put kernel_physical_mapping_remove() in
    CONFIG_MEMORY_HOTREMOVE.
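
    The resulting shape (a sketch consistent with the description; the
    function body is assumed):

        #ifdef CONFIG_MEMORY_HOTREMOVE
        static void __meminit
        kernel_physical_mapping_remove(unsigned long start, unsigned long end)
        {
                start = (unsigned long)__va(start);
                end = (unsigned long)__va(end);
                remove_pagetable(start, end, true);
        }
        #endif /* CONFIG_MEMORY_HOTREMOVE */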

    Signed-off-by: Tang Chen
    Cc: linux-mm@kvack.org
    Cc: gregkh@linuxfoundation.org
    Cc: yinghai@kernel.org
    Cc: wency@cn.fujitsu.com
    Cc: mgorman@suse.de
    Cc: tj@kernel.org
    Cc: liwanp@linux.vnet.ibm.com
    Link: http://lkml.kernel.org/r/1366019207-27818-3-git-send-email-tangchen@cn.fujitsu.com
    Signed-off-by: Ingo Molnar

    Tang Chen
     
  • Pull x86 fixes from Ingo Molnar:
    "Misc fixes"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/mm: Flush lazy MMU when DEBUG_PAGEALLOC is set
    x86/mm/cpa/selftest: Fix false positive in CPA self test
    x86/mm/cpa: Convert noop to functional fix
    x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metal
    x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updates

    Linus Torvalds
     

13 Apr, 2013

1 commit

  • This patch attempts to fix:

    https://bugzilla.kernel.org/show_bug.cgi?id=56461

    The symptom is a crash and messages like this:

    chrome: Corrupted page table at address 34a03000
    *pdpt = 0000000000000000 *pde = 0000000000000000
    Bad pagetable: 000f [#1] PREEMPT SMP

    Ingo guesses this got introduced by commit 611ae8e3f520 ("x86/tlb:
    enable tlb flush range support for x86") since that code started to free
    unused pagetables.

    On x86-32 PAE kernels, that new code has the potential to free an
    entire PMD page and will clear one of the four
    page-directory-pointer-table entries (aka pgd_t entries).

    The hardware aggressively "caches" these top-level entries and invlpg
    does not actually affect the CPU's copy. If we clear one we *HAVE* to
    do a full TLB flush, otherwise we might continue using a freed pmd page.
    (note, we do this properly on the population side in pud_populate()).

    This patch tracks whenever we clear one of these entries in the 'struct
    mmu_gather', and ensures that we follow up with a full tlb flush.

    BTW, I disassembled and checked that:

    if (tlb->fullmm == 0)
    and
    if (!tlb->fullmm && !tlb->need_flush_all)

    generate essentially the same code, so there should be zero impact there
    to the !PAE case.
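
    In context, the flush decision ends up looking roughly like this (a
    sketch built around the quoted condition; the range-flush call and
    its arguments are assumptions):

        /* sketch: a cleared PAE top-level entry forces a full flush */
        if (!tlb->fullmm && !tlb->need_flush_all)
                flush_tlb_mm_range(tlb->mm, tlb->start, tlb->end, 0UL);
        else
                flush_tlb_mm_range(tlb->mm, 0UL, TLB_FLUSH_ALL, 0UL);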

    Signed-off-by: Dave Hansen
    Cc: Peter Anvin
    Cc: Ingo Molnar
    Cc: Artem S Tashkinov
    Signed-off-by: Linus Torvalds

    Dave Hansen
     

12 Apr, 2013

2 commits

  • When CONFIG_DEBUG_PAGEALLOC is set page table updates made by
    kernel_map_pages() are not made visible (via TLB flush)
    immediately if lazy MMU is on. In environments that support lazy
    MMU (e.g. Xen) this may lead to fatal page faults, for example,
    when zap_pte_range() needs to allocate pages in
    __tlb_remove_page() -> tlb_next_batch().
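
    The fix's shape, sketched (the placement inside the page-mapping
    path is assumed from the description):

        /* sketch: after kernel_map_pages() rewrites the PTEs */
        __flush_tlb_all();
        arch_flush_lazy_mmu_mode();     /* commit batched updates now */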

    Signed-off-by: Boris Ostrovsky
    Cc: konrad.wilk@oracle.com
    Link: http://lkml.kernel.org/r/1365703192-2089-1-git-send-email-boris.ostrovsky@oracle.com
    Signed-off-by: Ingo Molnar

    Boris Ostrovsky
     
  • If the pmd is not present, _PAGE_PSE will not be set anymore.
    Fix the false positive.

    Reported-by: Ingo Molnar
    Signed-off-by: Andrea Arcangeli
    Cc: Stefan Bader
    Cc: Andy Whitcroft
    Cc: Mel Gorman
    Cc: Borislav Petkov
    Link: http://lkml.kernel.org/r/1365687369-30802-1-git-send-email-aarcange@redhat.com
    Signed-off-by: Ingo Molnar

    Andrea Arcangeli
     

11 Apr, 2013

3 commits

  • Commit:

    a8aed3e0752b ("x86/mm/pageattr: Prevent PSE and GLOABL leftovers to confuse pmd/pte_present and pmd_huge")

    introduced a valid fix but one location that didn't trigger the bug that
    lead to finding those (small) problems, wasn't updated using the
    right variable.

    The wrong variable was also initialized for no good reason, that
    may have been the source of the confusion. Remove the noop
    initialization accordingly.

    Commit a8aed3e0752b also erroneously removed one canon_pgprot pass meant
    to clear pmd bitflags not supported in hardware by older CPUs, that
    automatically gets corrected by this patch too by applying it to the right
    variable in the new location.

    Reported-by: Stefan Bader
    Signed-off-by: Andrea Arcangeli
    Acked-by: Borislav Petkov
    Cc: Andy Whitcroft
    Cc: Mel Gorman
    Link: http://lkml.kernel.org/r/1365600505-19314-1-git-send-email-aarcange@redhat.com
    Signed-off-by: Ingo Molnar

    Andrea Arcangeli
     
  • In paravirtualized x86_64 kernels, vmalloc_fault may cause an oops
    when lazy MMU updates are enabled, because set_pgd effects are being
    deferred.

    One instance of this problem is during process mm cleanup with memory
    cgroups enabled. The chain of events is as follows:

    - zap_pte_range enables lazy MMU updates
    - zap_pte_range eventually calls mem_cgroup_charge_statistics,
    which accesses the vmalloc'd mem_cgroup per-cpu stat area
    - vmalloc_fault is triggered which tries to sync the corresponding
    PGD entry with set_pgd, but the update is deferred
    - vmalloc_fault oopses due to a mismatch in the PUD entries

    The oops usually looks like this:

    ------------[ cut here ]------------
    kernel BUG at arch/x86/mm/fault.c:396!
    invalid opcode: 0000 [#1] SMP
    .. snip ..
    CPU 1
    Pid: 10866, comm: httpd Not tainted 3.6.10-4.fc18.x86_64 #1
    RIP: e030:[] [] vmalloc_fault+0x11f/0x208
    .. snip ..
    Call Trace:
    [] do_page_fault+0x399/0x4b0
    [] ? xen_mc_extend_args+0xec/0x110
    [] page_fault+0x25/0x30
    [] ? mem_cgroup_charge_statistics.isra.13+0x13/0x50
    [] __mem_cgroup_uncharge_common+0xd8/0x350
    [] mem_cgroup_uncharge_page+0x57/0x60
    [] page_remove_rmap+0xe0/0x150
    [] ? vm_normal_page+0x1a/0x80
    [] unmap_single_vma+0x531/0x870
    [] unmap_vmas+0x52/0xa0
    [] ? pte_mfn_to_pfn+0x72/0x100
    [] exit_mmap+0x98/0x170
    [] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
    [] mmput+0x83/0xf0
    [] exit_mm+0x104/0x130
    [] do_exit+0x15a/0x8c0
    [] do_group_exit+0x3f/0xa0
    [] sys_exit_group+0x17/0x20
    [] system_call_fastpath+0x16/0x1b

    Calling arch_flush_lazy_mmu_mode immediately after set_pgd makes the
    changes visible to the consistency checks.
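
    Sketched at the sync site (surrounding context assumed; the pairing
    is exactly what the paragraph above describes):

        /* sketch: flush the lazy-MMU batch so the PGD sync is visible */
        set_pgd(pgd, *pgd_ref);
        arch_flush_lazy_mmu_mode();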

    RedHat-Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=914737
    Tested-by: Josh Boyer
    Reported-and-Tested-by: Krishna Raman
    Signed-off-by: Samu Kallio
    Link: http://lkml.kernel.org/r/1364045796-10720-1-git-send-email-konrad.wilk@oracle.com
    Tested-by: Konrad Rzeszutek Wilk
    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: H. Peter Anvin

    Samu Kallio
     
  • Minor. Reordered a few lines to lose a superfluous OR operation.

    Signed-off-by: Martin Bundgaard
    Link: http://lkml.kernel.org/r/1363286075-62615-1-git-send-email-martin@mindflux.org
    Signed-off-by: Ingo Molnar

    Martin Bundgaard
     

10 Apr, 2013

1 commit

  • So basically we're generating the pte_t * from a struct page and
    we're handing it down to the __split_large_page() internal version,
    which then goes and gets the struct page * back from it because it
    needs it.

    Change the caller to hand down the struct page * directly; the
    callee can compute the pte_t * itself.

    The net saving is one virt_to_page() call and simpler code. While
    at it, make __split_large_page() static.
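
    As prototypes, the change looks like this (exact names assumed):

        /* before: the caller pre-computes the pte_t * from the page */
        int __split_large_page(pte_t *kpte, unsigned long address,
                               pte_t *pbase);

        /* after: hand down the page; the callee derives the pte_t * */
        static int __split_large_page(pte_t *kpte, unsigned long address,
                                      struct page *base);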

    Signed-off-by: Borislav Petkov
    Acked-by: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1363886217-24703-1-git-send-email-bp@alien8.de
    Signed-off-by: Ingo Molnar

    Borislav Petkov