12 Nov, 2013

3 commits

  • Pull devicetree updates from Rob Herring:
    "DeviceTree updates for 3.13. This is a bit larger pull request than
    usual for this cycle with lots of clean-up.

    - Cross arch clean-up and consolidation of early DT scanning code.
    - Clean-up and removal of arch prom.h headers. Makes arch specific
    prom.h optional on all but Sparc.
    - Addition of interrupts-extended property for devices connected to
    multiple interrupt controllers.
    - Refactoring of DT interrupt parsing code in preparation for
    deferred probe of interrupts.
    - ARM cpu and cpu topology bindings documentation.
    - Various DT vendor binding documentation updates"

    * tag 'devicetree-for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (82 commits)
    powerpc: add missing explicit OF includes for ppc
    dt/irq: add empty of_irq_count for !OF_IRQ
    dt: disable self-tests for !OF_IRQ
    of: irq: Fix interrupt-map entry matching
    MIPS: Netlogic: replace early_init_devtree() call
    of: Add Panasonic Corporation vendor prefix
    of: Add Chunghwa Picture Tubes Ltd. vendor prefix
    of: Add AU Optronics Corporation vendor prefix
    of/irq: Fix potential buffer overflow
    of/irq: Fix bug in interrupt parsing refactor.
    of: set dma_mask to point to coherent_dma_mask
    of: add vendor prefix for PHYTEC Messtechnik GmbH
    DT: sort vendor-prefixes.txt
    of: Add vendor prefix for Cadence
    of: Add empty for_each_available_child_of_node() macro definition
    arm/versatile: Fix versatile irq specifications.
    of/irq: create interrupts-extended property
    microblaze/pci: Drop PowerPC-ism from irq parsing
    of/irq: Create of_irq_parse_and_map_pci() to consolidate arch code.
    of/irq: Use irq_of_parse_and_map()
    ...

    Linus Torvalds
     
  • Pull metag architecture changes from James Hogan:
    - A change to remove the last dependence on bootloader exception
    handlers so that QEMU can more easily boot an SMP Linux/Meta kernel
    image directly.
    - A fix for a minor off by one error in a BUG_ON condition found by Dan
    Carpenter.

    * tag 'metag-for-v3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
    metag: off by one in setup_bootmem_node()
    metag: handle low level kicks directly

    Linus Torvalds
     
  • Pull scheduler changes from Ingo Molnar:
    "The main changes in this cycle are:

    - (much) improved CONFIG_NUMA_BALANCING support from Mel Gorman, Rik
    van Riel, Peter Zijlstra et al. Yay!

    - optimize preemption counter handling: merge the NEED_RESCHED flag
    into the preempt_count variable, by Peter Zijlstra.

    - wait.h fixes and code reorganization from Peter Zijlstra

    - cfs_bandwidth fixes from Ben Segall

    - SMP load-balancer cleanups from Peter Zijlstra

    - idle balancer improvements from Jason Low

    - other fixes and cleanups"

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits)
    ftrace, sched: Add TRACE_FLAG_PREEMPT_RESCHED
    stop_machine: Fix race between stop_two_cpus() and stop_cpus()
    sched: Remove unnecessary iteration over sched domains to update nr_busy_cpus
    sched: Fix asymmetric scheduling for POWER7
    sched: Move completion code from core.c to completion.c
    sched: Move wait code from core.c to wait.c
    sched: Move wait.c into kernel/sched/
    sched/wait: Fix __wait_event_interruptible_lock_irq_timeout()
    sched: Avoid throttle_cfs_rq() racing with period_timer stopping
    sched: Guarantee new group-entities always have weight
    sched: Fix hrtimer_cancel()/rq->lock deadlock
    sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining
    sched: Fix race on toggling cfs_bandwidth_used
    sched: Remove extra put_online_cpus() inside sched_setaffinity()
    sched/rt: Fix task_tick_rt() comment
    sched/wait: Fix build breakage
    sched/wait: Introduce prepare_to_wait_event()
    sched/wait: Add ___wait_cond_timeout() to wait_event*_timeout() too
    sched: Remove get_online_cpus() usage
    sched: Fix race in migrate_swap_stop()
    ...

    Linus Torvalds
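
    The preemption counter change listed above can be pictured with a small
    userspace model. This is only a sketch, not the kernel's code: it assumes
    the flag is stored inverted in the top bit of the counter, so that "no one
    is holding preemption off and a reschedule is wanted" becomes a single
    comparison against zero. Every model_* name and the bit position are
    hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: top bit set means "no reschedule needed" (inverted flag). */
#define MODEL_NEED_RESCHED 0x80000000u

/* Starts with nesting depth 0 and no reschedule requested. */
static uint32_t model_count = MODEL_NEED_RESCHED;

/* Nesting depth only, with the flag bit masked out. */
uint32_t model_preempt_count(void)
{
	return model_count & ~MODEL_NEED_RESCHED;
}

void model_preempt_disable(void)
{
	model_count++;
}

/* Request a reschedule: clear the inverted bit. */
void model_set_need_resched(void)
{
	model_count &= ~MODEL_NEED_RESCHED;
}

/* Withdraw the request: set the inverted bit again. */
void model_clear_need_resched(void)
{
	model_count |= MODEL_NEED_RESCHED;
}

/*
 * Drop one nesting level. The whole word reaches zero only when the depth
 * is zero AND a reschedule was requested, so one test covers both.
 */
bool model_preempt_enable(void)
{
	return --model_count == 0;
}
```

    In this model a disable/enable pair with no pending request returns
    false, while the same pair with model_set_need_resched() in between
    returns true.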
     

08 Nov, 2013

1 commit


06 Nov, 2013

1 commit

  • Kick interrupts trigger the LWK (low level kick) signal, usually handled
    by the __TBIDoStdLWK() function which is the only handler inherited from
    the bootloader. The LWK signal is converted either to a SWK (plain
    software kick) or a SWS (software kick with an attached message).

    Linux has kick_handler() to handle SWK and call registered kick handlers
    (IPIs and inter-thread comms), but SWS is as far as I'm aware unused
    with Linux.

    Therefore remove that abstraction and have Linux handle LWK directly.
    This will reduce kick latency slightly, and reduce our dependence on the
    bootloader, which makes it easier to directly boot a kernel in QEMU
    (particularly for SMP).

    Signed-off-by: James Hogan

    James Hogan
     

10 Oct, 2013

7 commits

  • Now that prom.h is optional, all the empty prom.h headers can be removed.

    Signed-off-by: Rob Herring
    Acked-by: Vineet Gupta
    Acked-by: Catalin Marinas
    Acked-by: Grant Likely
    Cc: Will Deacon
    Cc: Mark Salter
    Cc: Aurelien Jacquiot
    Cc: James Hogan
    Cc: Jonas Bonn
    Cc: Chris Zankel
    Cc: Max Filippov

    Rob Herring
     
  • HAVE_ARCH_DEVTREE_FIXUPS appears to always be needed except for sparc,
    but it is only used for /proc/device-tree and sparc does not enable
    /proc/device-tree. So this option is redundant. Remove the option and
    always enable it. This has the side effect of fixing /proc/device-tree
    on arches such as arm64 which failed to define this option.

    Signed-off-by: Rob Herring
    Acked-by: Vineet Gupta
    Acked-by: Grant Likely
    Cc: Russell King
    Cc: James Hogan
    Cc: Michal Simek
    Cc: Jonas Bonn
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: x86@kernel.org
    Cc: Chris Zankel
    Cc: Max Filippov

    Rob Herring
     
  • Move setup_machine_fdt out of prom.h and into asm/setup.h in preparation
    to remove prom.h.

    Signed-off-by: Rob Herring
    Acked-by: Grant Likely
    Cc: James Hogan
    Cc: linux-metag@vger.kernel.org

    Rob Herring
     
  • Convert metag to use the common of_flat_dt_get_machine_name function.

    Signed-off-by: Rob Herring
    [james.hogan: fix missing arch_get_next_mach and const mismatch]
    Reported-by: Guenter Roeck
    Signed-off-by: James Hogan

    Rob Herring
     
  • All arches do essentially the same thing now for
    early_init_dt_setup_initrd_arch, so it can now be removed.

    Signed-off-by: Rob Herring
    Acked-by: Vineet Gupta
    Cc: Russell King
    Cc: Mark Salter
    Cc: Aurelien Jacquiot
    Cc: James Hogan
    Cc: Michal Simek
    Cc: Ralf Baechle
    Cc: Jonas Bonn
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: x86@kernel.org
    Cc: Chris Zankel
    Cc: Max Filippov
    Acked-by: Grant Likely

    Rob Herring
     
  • Convert metag to use new early_init_dt_scan function.

    Signed-off-by: Rob Herring
    Cc: James Hogan

    Rob Herring
     
  • Use the common unflatten_and_copy_device_tree to copy the built-in FDT
    out of init section.

    Signed-off-by: Rob Herring
    Cc: James Hogan

    Rob Herring
     

01 Oct, 2013

1 commit

  • All arch overridden implementations of do_softirq() share the following
    common code: disable irqs (to avoid races with the pending check),
    check if there are softirqs pending, then execute __do_softirq() on
    a specific stack.

    Consolidate the common parts such that archs only worry about the
    stack switch.

    Acked-by: Linus Torvalds
    Signed-off-by: Frederic Weisbecker
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Paul Mackerras
    Cc: James Hogan
    Cc: James E.J. Bottomley
    Cc: Helge Deller
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: David S. Miller
    Cc: Andrew Morton

    Frederic Weisbecker
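
    The split described in this commit can be sketched as a userspace model:
    the shared function masks IRQs and checks for pending work, and the only
    arch-supplied piece is a hook that would run the handler on a dedicated
    stack. All model_* names are hypothetical, the IRQ state is a plain
    boolean, and no real stack switch takes place.

```c
#include <stdbool.h>

static bool irqs_on = true;          /* stand-in for the CPU IRQ-enable state */
static unsigned int pending;         /* bitmask of raised softirqs */
static unsigned int handled;         /* bitmask of softirqs already run */
static unsigned int stack_switches;  /* how often the arch hook was entered */

/* Mark softirq number nr (assumed < 32) as pending. */
void model_raise_softirq(unsigned int nr) { pending |= 1u << nr; }
unsigned int model_handled_mask(void) { return handled; }
unsigned int model_stack_switches(void) { return stack_switches; }
bool model_irqs_on(void) { return irqs_on; }

/* Generic work loop, standing in for __do_softirq(). */
static void model_run_pending(void)
{
	handled |= pending;
	pending = 0;
}

/* The only part an architecture would still supply. */
static void model_arch_call_on_softirq_stack(void (*fn)(void))
{
	stack_switches++;	/* a real port would switch stacks here */
	fn();
}

/* Shared part: mask IRQs, check for pending work, defer to the arch hook. */
void model_do_softirq(void)
{
	bool saved = irqs_on;

	irqs_on = false;	/* avoid racing with the pending check */
	if (pending)
		model_arch_call_on_softirq_stack(model_run_pending);
	irqs_on = saved;
}
```

    Calling model_do_softirq() with nothing pending never enters the arch
    hook; after model_raise_softirq(3) it enters the hook exactly once.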
     

25 Sep, 2013

1 commit

  • In order to prepare for per-arch implementations of preempt_count, move
    the required bits into an asm-generic header and use this for all
    archs.

    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

20 Sep, 2013

2 commits

  • This patch builds on patch 2 and periodically decays the max cost of
    idle balancing per sched domain by approximately 1% per second. Also
    decay the rq's max_idle_balance_cost value.

    Signed-off-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1379096813-3032-4-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
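
    A decay of roughly 1% per step can be done in integer arithmetic. The
    sketch below assumes a factor of 253/256 (about 0.988) applied once per
    elapsed second; the factor, the nanosecond unit and both function names
    are illustrative choices, not taken from the commit message.

```c
#include <stdint.h>

/* One decay step: scale by 253/256, i.e. shrink by a little over 1%. */
uint64_t decay_max_cost(uint64_t max_cost_ns)
{
	return (max_cost_ns * 253) / 256;
}

/* Apply one decay step for every whole second that has elapsed. */
uint64_t decay_for_seconds(uint64_t max_cost_ns, unsigned int seconds)
{
	while (seconds--)
		max_cost_ns = decay_max_cost(max_cost_ns);
	return max_cost_ns;
}
```

    Without such a decay, a single unusually expensive balance pass would
    keep the recorded maximum high indefinitely.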
     
  • In this patch, we keep track of the max cost we spend doing idle load balancing
    for each sched domain. If the avg time the CPU remains idle is less than the
    time we have already spent on idle balancing + the max cost of idle balancing
    in the sched domain, then we don't continue to attempt the balance. We also
    keep a per rq variable, max_idle_balance_cost, which keeps track of the max
    time spent on newidle load balances throughout all its domains so that we can
    determine the avg_idle's max value.

    By using the max, we avoid overrunning the average. This further reduces the
    chance we attempt balancing when the CPU is not idle for longer than the cost
    to balance.

    Signed-off-by: Jason Low
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1379096813-3032-3-git-send-email-jason.low2@hp.com
    Signed-off-by: Ingo Molnar

    Jason Low
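
    The cutoff rule in this commit message can be written as a small
    userspace sketch. The struct, its field and both function names are
    hypothetical, and times are assumed to be in nanoseconds; only the
    comparison itself comes from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-domain record of the most expensive idle balance seen so far. */
struct domain_model {
	uint64_t max_balance_cost_ns;
};

/* Remember a measured balance cost if it is a new maximum. */
void record_balance_cost(struct domain_model *sd, uint64_t cost_ns)
{
	if (cost_ns > sd->max_balance_cost_ns)
		sd->max_balance_cost_ns = cost_ns;
}

/*
 * Keep balancing only if the expected idle time still covers what has
 * already been spent plus the worst-case cost of this domain.
 */
bool worth_balancing(uint64_t avg_idle_ns, uint64_t spent_ns,
		     const struct domain_model *sd)
{
	return avg_idle_ns >= spent_ns + sd->max_balance_cost_ns;
}
```

    With a recorded maximum of 500 ns and an average idle time of 1000 ns,
    the check passes after 400 ns of balancing work and fails after 600 ns.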
     

13 Sep, 2013

2 commits

  • After the last architecture switched to generic hard irqs the config
    options HAVE_GENERIC_HARDIRQS & GENERIC_HARDIRQS and the related code
    for !CONFIG_GENERIC_HARDIRQS can be removed.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     
  • Unlike global OOM handling, memory cgroup code will invoke the OOM killer
    in any OOM situation because it has no way of telling faults occurring in
    kernel context - which could be handled more gracefully - from
    user-triggered faults.

    Pass a flag that identifies faults originating in user space from the
    architecture-specific fault handlers to generic code so that memcg OOM
    handling can be improved.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Michal Hocko
    Cc: David Rientjes
    Cc: KAMEZAWA Hiroyuki
    Cc: azurIt
    Cc: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

12 Sep, 2013

1 commit

  • Currently hugepage migration works well only for pmd-based hugepages
    (mainly due to lack of testing,) so we had better not enable migration of
    other levels of hugepages until we are ready for it.

    Some users of hugepage migration (mbind, move_pages, and migrate_pages) do
    page table walk and check pud/pmd_huge() there, so they are safe. But the
    other users (softoffline and memory hotremove) don't do this, so without
    this patch they can try to migrate unexpected types of hugepages.

    To prevent this, we introduce hugepage_migration_support() as an
    architecture dependent check of whether hugepages are implemented on a pmd
    basis or not. And on some architectures multiple sizes of hugepages are
    available, so hugepage_migration_support() also checks hugepage size.

    Signed-off-by: Naoya Horiguchi
    Cc: Andi Kleen
    Cc: Hillf Danton
    Cc: Wanpeng Li
    Cc: Mel Gorman
    Cc: Hugh Dickins
    Cc: KOSAKI Motohiro
    Cc: Michal Hocko
    Cc: Rik van Riel
    Cc: "Aneesh Kumar K.V"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
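
    The two-part check described above (pmd-based hugepages, and the right
    hugepage size) can be modelled in userspace as follows. The function name,
    its parameters and the shift value of 21 (a 2 MiB pmd mapping) are
    assumptions for illustration; the real helper takes kernel types that are
    not reproduced here.

```c
#include <stdbool.h>

/* Assumed pmd mapping size: 1 << 21 bytes = 2 MiB. */
#define MODEL_PMD_SHIFT 21u

/*
 * Migration is allowed only when the architecture implements hugepages at
 * the pmd level and this particular hugepage size is the pmd size.
 */
bool model_hugepage_migration_support(bool arch_has_pmd_hugepages,
				      unsigned int huge_page_shift)
{
	return arch_has_pmd_hugepages && huge_page_shift == MODEL_PMD_SHIFT;
}
```

    Under these assumptions a 2 MiB page passes the check while a 1 GiB page
    (shift 30) does not, even on an architecture with pmd-based hugepages.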
     

11 Sep, 2013

1 commit

  • Pull device tree core updates from Grant Likely:
    "Generally minor changes. A bunch of bug fixes, particularly for
    initialization and some refactoring. Most notable change is feeding
    the entire flattened tree into the random pool at boot. May not be
    significant, but shouldn't hurt either"

    Tim Bird questions whether the boot time cost of the random feeding may
    be noticeable. And "add_device_randomness()" is definitely not some
    speed demon of a function.

    * tag 'devicetree-for-linus' of git://git.secretlab.ca/git/linux:
    of/platform: add error reporting to of_amba_device_create()
    irq/of: Fix comment typo for irq_of_parse_and_map
    of: Feed entire flattened device tree into the random pool
    of/fdt: Clean up casting in unflattening path
    of/fdt: Remove duplicate memory clearing on FDT unflattening
    gpio: implement gpio-ranges binding document fix
    of: call __of_parse_phandle_with_args from of_parse_phandle
    of: introduce of_parse_phandle_with_fixed_args
    of: move of_parse_phandle()
    of: move documentation of of_parse_phandle_with_args
    of: Fix missing memory initialization on FDT unflattening
    of: consolidate definition of early_init_dt_alloc_memory_arch()
    of: Make of_get_phy_mode() return int i.s.o. const int
    include: dt-binding: input: create a DT header defining key codes.
    of/platform: Staticize of_platform_device_create_pdata()
    of: Specify initrd location using 64-bit
    dt: Typo fix
    OF: make of_property_for_each_{u32|string}() use parameters if OF is not enabled

    Linus Torvalds
     

24 Jul, 2013

1 commit

  • On some PAE architectures, the entire range of physical memory could reside
    outside the 32-bit limit. These systems need the ability to specify the
    initrd location using 64-bit numbers.

    This patch globally modifies the early_init_dt_setup_initrd_arch() function to
    use 64-bit numbers instead of the current unsigned long.

    There has been quite a bit of debate about whether to use u64 or phys_addr_t.
    It was concluded to stick to u64 to be consistent with rest of the device
    tree code. As summarized by Geert, "The address to load the initrd is decided
    by the bootloader/user and set at that point later in time. The dtb should not
    be tied to the kernel you are booting"

    More details on the discussion can be found here:
    https://lkml.org/lkml/2013/6/20/690
    https://lkml.org/lkml/2012/9/13/544

    Signed-off-by: Santosh Shilimkar
    Acked-by: Rob Herring
    Acked-by: Vineet Gupta
    Acked-by: Jean-Christophe PLAGNIOL-VILLARD
    Signed-off-by: Grant Likely

    Santosh Shilimkar
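
    Why a 32-bit parameter is not enough here can be shown with plain integer
    conversion. In this sketch uint32_t stands in for unsigned long on a
    32-bit kernel; the function names and the example address are made up.

```c
#include <stdint.h>

/* Stand-in for a 32-bit unsigned long: the upper bits of the address are lost. */
uint32_t initrd_start_as_ulong32(uint64_t phys)
{
	return (uint32_t)phys;
}

/* With a 64-bit parameter the address is passed through unchanged. */
uint64_t initrd_start_as_u64(uint64_t phys)
{
	return phys;
}
```

    An initrd placed at 0x880000000, above the 4 GiB boundary, would be seen
    as 0x80000000 through the 32-bit path.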
     

22 Jul, 2013

3 commits


15 Jul, 2013

1 commit

  • The __cpuinit type of throwaway sections might have made sense
    some time ago when RAM was more constrained, but now the savings
    do not offset the cost and complications. For example, the fix in
    commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
    is a good example of the nasty type of bugs that can be created
    with improper use of the various __init prefixes.

    After a discussion on LKML[1] it was decided that cpuinit should go
    the way of devinit and be phased out. Once all the users are gone,
    we can then finally remove the macros themselves from linux/init.h.

    Note that some harmless section mismatch warnings may result, since
    notify_cpu_starting() and cpu_up(), which are arch independent
    (kernel/cpu.c), are flagged as __cpuinit -- so if we remove the __cpuinit
    from arch specific callers, we will also get section mismatch warnings.
    As an intermediate step, we intend to turn the linux/init.h cpuinit
    content into no-ops as early as possible, since that will get rid
    of these warnings. In any case, they are temporary and harmless.

    This removes all the arch/metag uses of the __cpuinit macros from
    all C files. Currently metag does not have any __CPUINIT used in
    assembly files.

    [1] https://lkml.org/lkml/2013/5/20/589

    Cc: James Hogan
    Acked-by: James Hogan
    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

11 Jul, 2013

1 commit


10 Jul, 2013

1 commit

  • A few remaining architectures directly kill the page faulting task in an
    out of memory situation. This is usually not a good idea since that
    task might not even use a significant amount of memory and so may not be
    the optimal victim to resolve the situation.

    Since 2.6.29's 1c0fe6e ("mm: invoke oom-killer from page fault") there
    is a hook that architecture page fault handlers are supposed to call to
    invoke the OOM killer and let it pick the right task to kill. Convert
    the remaining architectures over to this hook.

    To restore the previous behavior of simply taking out the faulting task,
    the vm.oom_kill_allocating_task sysctl can be set to 1.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Michal Hocko
    Cc: KAMEZAWA Hiroyuki
    Acked-by: David Rientjes
    Acked-by: Vineet Gupta [arch/arc bits]
    Cc: James Hogan
    Cc: David Howells
    Cc: Jonas Bonn
    Cc: Chen Liqin
    Cc: Lennox Wu
    Cc: Chris Metcalf
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

09 Jul, 2013

1 commit

  • In csum_tcpudp_nofold, add 1 if the carry bit is set after adding the
    destination IP address (32 bits) to the checksum (16 bits).

    The lack of carry handling for this particular addition meant that a
    destination address of *.*.255.255 (e.g. certain broadcasts) sometimes
    resulted in an incorrect checksum. This bug has been present in the Meta
    port since the code was written in the 2.4 days.

    Reported-by: Marcin Nowakowski
    Signed-off-by: James Hogan

    James Hogan
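
    The carry handling this fix adds can be illustrated in portable C. This
    is a sketch of end-around carry on a 32-bit partial sum, not the Meta
    assembly from the patch; the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Add a 32-bit word to a 32-bit ones'-complement partial sum, feeding the
 * carry out of bit 31 back into bit 0.
 */
uint32_t csum_add32(uint32_t sum, uint32_t word)
{
	uint32_t res = sum + word;

	/* Unsigned overflow happened exactly when the result wrapped below word. */
	return res + (res < word);
}

/* The same addition with the carry dropped, as in the bug described above. */
uint32_t csum_add32_no_carry(uint32_t sum, uint32_t word)
{
	return sum + word;
}

/* Fold a 32-bit partial sum into the final 16-bit complemented checksum. */
uint16_t csum_fold32(uint32_t sum)
{
	sum = (sum & 0xffffu) + (sum >> 16);
	sum = (sum & 0xffffu) + (sum >> 16);
	return (uint16_t)~sum;
}
```

    The two additions differ only when the sum overflows 32 bits, which is
    why the bug showed up only for some address and payload combinations.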
     

07 Jul, 2013

1 commit

  • Pull Metag architecture changes from James Hogan:
    - Infrastructure and DT files for TZ1090 SoC (pin control drivers
    already merged via pinctrl tree).
    - Panic on boot instead of just warning if cache aliasing possible.
    - Various SMP/hotplug fixes.
    - Various other randconfig/sparse fixes.

    * tag 'metag-for-v3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: (24 commits)
    metag: move EXPORT_SYMBOL(csum_partial) to metag_ksyms.c
    metag: cpu hotplug: route_irq: preserve irq mask
    metag: kick: add missing irq_enter/exit to kick_handler()
    metag: smp: don't spin waiting for CPU to start
    metag: smp: enable irqs after set_cpu_online
    metag: use clear_tasks_mm_cpumask()
    metag: tz1090: select and instantiate pinctrl-tz1090-pdc
    metag: tz1090: select and instantiate pinctrl-tz1090
    metag: don't check for cache aliasing on smp cpu boot
    metag: panic if cache aliasing possible
    metag: *.dts: include using preprocessor
    metag: add symlink
    metag/.gitignore: Extend the *.dtb pattern to match the dtb.S files
    metag/traps: include setup.h for the per_cpu_trap_init declaration
    metag/traps: Mark die() as __noreturn to match the declaration.
    metag/processor.h: Add missing cpuinfo_op declaration.
    metag/setup: Restrict scope for the capabilities variable
    metag/mm/cache: Restrict scope for metag_lnkget_probe
    metag/asm/irq.h: Declare init_IRQ
    metag/kernel/irq.c: Declare root_domain as static
    ...

    Linus Torvalds
     

05 Jul, 2013

2 commits

  • Merge Kconfig menu diet patches from Dave Hansen:
    "I think the "Kernel Hacking" menu has gotten a bit out of hand. It is
    over 120 lines long on my system with everything enabled and options
    are scattered around it haphazardly.

    http://sr71.net/~dave/linux/kconfig-horror.png

    Let's try to introduce some sanity. This set takes that 120 lines
    down to 55 and makes it vastly easier to find some things. It's a
    start.

    This set stands on its own, but there is plenty of room for follow-up
    patches. The arch-specific debug options still end up getting stuck
    in the top-level "kernel hacking" menu. OPTIMIZE_INLINING, for
    instance, could obviously go in to the "compiler options" menu, but
    the fact that it is defined in arch/ in a separate Kconfig file keeps
    it on its own for the moment.

    The Signed-off-by's in here look funky. I changed employers while
    working on this set, so I have signoffs from both email addresses"

    * emailed patches from Dave Hansen:
    hang and lockup detection menu
    kconfig: consolidate printk options
    group locking debugging options
    consolidate compilation option configs
    consolidate runtime testing configs
    order memory debugging Kconfig options
    consolidate per-arch stack overflow debugging options

    Linus Torvalds
     
  • Original posting:

    http://lkml.kernel.org/r/20121214184202.F54094D9@kernel.stglabs.ibm.com

    Several architectures have similar stack debugging config options.
    They all pretty much do the same thing, some with slightly
    differing help text.

    This patch changes the architectures to instead enable a Kconfig
    boolean, and then use that boolean in the generic Kconfig.debug
    to present the actual menu option. This removes a bunch of
    duplication and adds consistency across arches.

    Signed-off-by: Dave Hansen
    Reviewed-by: H. Peter Anvin
    Reviewed-by: James Hogan
    Acked-by: Chris Metcalf [for tile]
    Signed-off-by: Dave Hansen
    Signed-off-by: Linus Torvalds

    Dave Hansen
     

04 Jul, 2013

6 commits

  • Move EXPORT_SYMBOL(csum_partial) from lib/checksum.c into metag_ksyms.c
    so that it doesn't get omitted by the static linker if it's not used by
    any other statically linked code, which can result in undefined symbols
    when building modules.

    For example a randconfig caused the following error:
    ERROR: "csum_partial" [fs/reiserfs/reiserfs.ko] undefined!

    Signed-off-by: James Hogan

    James Hogan
     
  • Prepare for killing free_all_bootmem_node() by using free_all_bootmem().

    Signed-off-by: Jiang Liu
    Cc: James Hogan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Prepare for removing num_physpages and simplify mem_init().

    Signed-off-by: Jiang Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Concentrate code to modify totalram_pages into the mm core, so the arch
    memory initialization code doesn't need to take care of it. With these
    changes applied, only the following functions from mm core modify the
    global variable totalram_pages: free_bootmem_late(), free_all_bootmem(),
    free_all_bootmem_node(), adjust_managed_page_count().

    With this patch applied, it will be much easier for us to keep
    totalram_pages and zone->managed_pages consistent.

    Signed-off-by: Jiang Liu
    Acked-by: David Howells
    Cc: "H. Peter Anvin"
    Cc: "Michael S. Tsirkin"
    Cc: Arnd Bergmann
    Cc: Catalin Marinas
    Cc: Chris Metcalf
    Cc: Geert Uytterhoeven
    Cc: Ingo Molnar
    Cc: Jeremy Fitzhardinge
    Cc: Jianguo Wu
    Cc: Joonsoo Kim
    Cc: Kamezawa Hiroyuki
    Cc: Konrad Rzeszutek Wilk
    Cc: Marek Szyprowski
    Cc: Mel Gorman
    Cc: Michel Lespinasse
    Cc: Minchan Kim
    Cc: Rik van Riel
    Cc: Rusty Russell
    Cc: Tang Chen
    Cc: Tejun Heo
    Cc: Thomas Gleixner
    Cc: Wen Congyang
    Cc: Will Deacon
    Cc: Yasuaki Ishimatsu
    Cc: Yinghai Lu
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Commit "mm: introduce new field 'managed_pages' to struct zone" assumes
    that all highmem pages will be freed into the buddy system by function
    mem_init(). But that's not always true: some architectures may reserve
    some highmem pages during boot. For example PPC may allocate highmem
    pages for gigantic HugeTLB pages, and several architectures have code to
    check PageReserved flag to exclude highmem pages allocated during boot
    when freeing highmem pages into the buddy system.

    So treat highmem pages in the same way as normal pages, that is to:
    1) reset zone->managed_pages to zero in mem_init().
    2) recalculate managed_pages when freeing pages into the buddy system.

    Signed-off-by: Jiang Liu
    Cc: "H. Peter Anvin"
    Cc: Tejun Heo
    Cc: Joonsoo Kim
    Cc: Yinghai Lu
    Cc: Mel Gorman
    Cc: Minchan Kim
    Cc: Kamezawa Hiroyuki
    Cc: Marek Szyprowski
    Cc: "Michael S. Tsirkin"
    Cc: Arnd Bergmann
    Cc: Catalin Marinas
    Cc: Chris Metcalf
    Cc: David Howells
    Cc: Geert Uytterhoeven
    Cc: Ingo Molnar
    Cc: Jeremy Fitzhardinge
    Cc: Jianguo Wu
    Cc: Konrad Rzeszutek Wilk
    Cc: Michel Lespinasse
    Cc: Rik van Riel
    Cc: Rusty Russell
    Cc: Tang Chen
    Cc: Thomas Gleixner
    Cc: Wen Congyang
    Cc: Will Deacon
    Cc: Yasuaki Ishimatsu
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     
  • Change the signature of free_reserved_area() according to Russell King's
    suggestion to fix the following build warnings:

    arch/arm/mm/init.c: In function 'mem_init':
    arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default]
    free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
    ^
    In file included from include/linux/mman.h:4:0,
    from arch/arm/mm/init.c:15:
    include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *'
    extern unsigned long free_reserved_area(unsigned long start, unsigned long end,

    mm/page_alloc.c: In function 'free_reserved_area':
    >> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default]
    In file included from arch/mips/include/asm/page.h:49:0,
    from include/linux/mmzone.h:20,
    from include/linux/gfp.h:4,
    from include/linux/mm.h:8,
    from mm/page_alloc.c:18:
    arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int'
    mm/page_alloc.c: In function 'free_area_init_nodes':
    mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds]

    Also address some minor code review comments.

    Signed-off-by: Jiang Liu
    Reported-by: Arnd Bergmann
    Cc: "H. Peter Anvin"
    Cc: "Michael S. Tsirkin"
    Cc: Catalin Marinas
    Cc: Chris Metcalf
    Cc: David Howells
    Cc: Geert Uytterhoeven
    Cc: Ingo Molnar
    Cc: Jeremy Fitzhardinge
    Cc: Jianguo Wu
    Cc: Joonsoo Kim
    Cc: Kamezawa Hiroyuki
    Cc: Konrad Rzeszutek Wilk
    Cc: Marek Szyprowski
    Cc: Mel Gorman
    Cc: Michel Lespinasse
    Cc: Minchan Kim
    Cc: Rik van Riel
    Cc: Rusty Russell
    Cc: Tang Chen
    Cc: Tejun Heo
    Cc: Thomas Gleixner
    Cc: Wen Congyang
    Cc: Will Deacon
    Cc: Yasuaki Ishimatsu
    Cc: Yinghai Lu
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiang Liu
     

03 Jul, 2013

2 commits

  • Pull perf updates from Ingo Molnar:
    "Kernel improvements:

    - watchdog driver improvements by Li Zefan
    - Power7 CPI stack events related improvements by Sukadev Bhattiprolu
    - event multiplexing via hrtimers and other improvements by Stephane
    Eranian
    - kernel stack use optimization by Andrew Hunter
    - AMD IOMMU uncore PMU support by Suravee Suthikulpanit
    - NMI handling rate-limits by Dave Hansen
    - various hw_breakpoint fixes by Oleg Nesterov
    - hw_breakpoint overflow period sampling and related signal handling
    fixes by Jiri Olsa
    - Intel Haswell PMU support by Andi Kleen

    Tooling improvements:

    - Reset SIGTERM handler in workload child process, fix from David
    Ahern.
    - Makefile reorganization, prep work for Kconfig patches, from Jiri
    Olsa.
    - Add automated make test suite, from Jiri Olsa.
    - Add --percent-limit option to 'top' and 'report', from Namhyung
    Kim.
    - Sorting improvements, from Namhyung Kim.
    - Expand definition of sysfs format attribute, from Michael Ellerman.

    Tooling fixes:

    - 'perf tests' fixes from Jiri Olsa.
    - Make Power7 CPI stack events available in sysfs, from Sukadev
    Bhattiprolu.
    - Handle death by SIGTERM in 'perf record', fix from David Ahern.
    - Fix printing of perf_event_paranoid message, from David Ahern.
    - Handle realloc failures in 'perf kvm', from David Ahern.
    - Fix divide by 0 in variance, from David Ahern.
    - Save parent pid in thread struct, from David Ahern.
    - Handle JITed code in shared memory, from Andi Kleen.
    - Fixes for 'perf diff', from Jiri Olsa.
    - Remove some unused struct members, from Jiri Olsa.
    - Add missing liblk.a dependency for python/perf.so, fix from Jiri
    Olsa.
    - Respect CROSS_COMPILE in liblk.a, from Rabin Vincent.
    - No need to do locking when adding hists in perf report, only 'top'
    needs that, from Namhyung Kim.
    - Fix alignment of symbol column in the hists browser (top,
    report) when -v is given, from Namhyung Kim.
    - Fix 'perf top' -E option behavior, from Namhyung Kim.
    - Fix bug in isupper() and islower(), from Sukadev Bhattiprolu.
    - Fix compile errors in bp_signal 'perf test', from Sukadev
    Bhattiprolu.

    ... and more things"

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (102 commits)
    perf/x86: Disable PEBS-LL in intel_pmu_pebs_disable()
    perf/x86: Fix shared register mutual exclusion enforcement
    perf/x86/intel: Support full width counting
    x86: Add NMI duration tracepoints
    perf: Drop sample rate when sampling is too slow
    x86: Warn when NMI handlers take large amounts of time
    hw_breakpoint: Introduce "struct bp_cpuinfo"
    hw_breakpoint: Simplify *register_wide_hw_breakpoint()
    hw_breakpoint: Introduce cpumask_of_bp()
    hw_breakpoint: Simplify the "weight" usage in toggle_bp_slot() paths
    hw_breakpoint: Simplify list/idx mess in toggle_bp_slot() paths
    perf/x86/intel: Add mem-loads/stores support for Haswell
    perf/x86/intel: Support Haswell/v4 LBR format
    perf/x86/intel: Move NMI clearing to end of PMI handler
    perf/x86/intel: Add Haswell PEBS support
    perf/x86/intel: Add simple Haswell PMU support
    perf/x86/intel: Add Haswell PEBS record support
    perf/x86/intel: Fix sparse warning
    perf/x86/amd: AMD IOMMU Performance Counter PERF uncore PMU implementation
    perf/x86/amd: Add IOMMU Performance Counter resource management
    ...

    Linus Torvalds
     
  • Pull VFS patches (part 1) from Al Viro:
    "The major change in this pile is ->readdir() replacement with
    ->iterate(), dealing with ->f_pos races in ->readdir() instances for
    good.

    There's a lot more, but I'd prefer to split the pull request into
    several stages and this is the first obvious cutoff point."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (67 commits)
    [readdir] constify ->actor
    [readdir] ->readdir() is gone
    [readdir] convert ecryptfs
    [readdir] convert coda
    [readdir] convert ocfs2
    [readdir] convert fatfs
    [readdir] convert xfs
    [readdir] convert btrfs
    [readdir] convert hostfs
    [readdir] convert afs
    [readdir] convert ncpfs
    [readdir] convert hfsplus
    [readdir] convert hfs
    [readdir] convert befs
    [readdir] convert cifs
    [readdir] convert freevxfs
    [readdir] convert fuse
    [readdir] convert hpfs
    reiserfs: switch reiserfs_readdir_dentry to inode
    reiserfs: is_privroot_deh() needs only directory inode, actually
    ...

    Linus Torvalds
     

02 Jul, 2013

1 commit

  • The route_irq() function needs to preserve the irq mask by using the
    _irqsave/irqrestore variants of raw spin lock functions instead of the
    _irq variants. This is because it is called from __cpu_disable() (via
    migrate_irqs()), which is called with IRQs disabled, so using the _irq
    variants re-enables IRQs.

    This appears to have been causing occasional hits of the
    BUG_ON(!irqs_disabled()) in __irq_work_run() during CPU hotplug soak
    testing:
    BUG: failure at kernel/irq_work.c:122/__irq_work_run()!

    Signed-off-by: James Hogan
    Acked-by: Thomas Gleixner
    Reviewed-by: Srivatsa S. Bhat

    James Hogan
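
    The difference between the two lock variants can be shown with a
    userspace model in which the IRQ-enable state is a plain boolean. The
    model_* helpers are hypothetical stand-ins for the raw spin lock
    functions and do no actual locking.

```c
#include <stdbool.h>

/* Stand-in for the CPU's interrupt-enable state. */
static bool irqs_enabled = true;

/* _irq variants: disable on lock, unconditionally enable on unlock. */
void model_lock_irq(void) { irqs_enabled = false; }
void model_unlock_irq(void) { irqs_enabled = true; }

/* _irqsave/_irqrestore variants: remember and restore the previous state. */
void model_lock_irqsave(bool *flags)
{
	*flags = irqs_enabled;
	irqs_enabled = false;
}

void model_unlock_irqrestore(bool flags)
{
	irqs_enabled = flags;
}

/* Enter a critical section with IRQs already off, using the _irq pair. */
bool irqs_on_after_irq_variant(void)
{
	irqs_enabled = false;	/* caller runs with IRQs disabled */
	model_lock_irq();
	model_unlock_irq();
	return irqs_enabled;
}

/* The same section using the save/restore pair. */
bool irqs_on_after_irqsave_variant(void)
{
	bool flags;

	irqs_enabled = false;	/* caller runs with IRQs disabled */
	model_lock_irqsave(&flags);
	model_unlock_irqrestore(flags);
	return irqs_enabled;
}
```

    In the model the first function returns true, meaning IRQs were turned
    back on behind the caller's back, while the second returns false and
    leaves the caller's state intact.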