05 Mar, 2008

40 commits

  • My memcgroup patch to fix hang with shmem/tmpfs added NULL page handling to
    mem_cgroup_charge_common. It seemed convenient at the time, but hard to
    justify now: there's a perfectly appropriate swappage to charge and uncharge
    instead, this is not on any hot path through shmem_getpage, and no performance
    hit was observed from the slight extra overhead.

    So revert that NULL page handling from mem_cgroup_charge_common; and make it
    clearer by bringing page_cgroup_assign_new_page_cgroup into its body - that
    was a helper I found more of a hindrance to understanding.

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Replace free_hot_cold_page's VM_BUG_ON(page_get_page_cgroup(page)) by a "Bad
    page state" and clear: most users don't have CONFIG_DEBUG_VM on, and if it
    were set here, it'd likely cause corruption when the page is reused.

    Don't use page_assign_page_cgroup to clear it: that should be private to
    memcontrol.c, and always called with the lock taken; and memmap_init_zone
    doesn't need it either - like page->mapping and other pointers throughout the
    kernel, Linux assumes pointers in zeroed structures are NULL pointers.

    Instead use page_reset_bad_cgroup, added to memcontrol.h for this only.

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Page migration gave me free_hot_cold_page's VM_BUG_ON page->page_cgroup.
    remove_migration_pte was calling mem_cgroup_charge on the new page whenever it
    found a swap pte, before it had determined it to be a migration entry. That
    left a surplus reference count on the page_cgroup, so it was still attached
    when the page was later freed.

    Move that mem_cgroup_charge down to where we're sure it's a migration entry.
    We were already under i_mmap_lock or anon_vma->lock, so its GFP_KERNEL was
    already inappropriate: change that to GFP_ATOMIC.

    It's essential that remove_migration_pte removes all the migration entries,
    other crashes follow if not. So proceed even when the charge fails: normally
    it cannot, but after a mem_cgroup_force_empty it might - comment in the code.

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Cc: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Don't uncharge when do_swap_page's call to do_wp_page fails: the page which
    was charged for is there in the pagetable, and will be correctly uncharged
    when that area is unmapped - it was only its COWing which failed.

    And while we're here, remove earlier XXX comment: yes, OR in do_wp_page's
    return value (maybe VM_FAULT_WRITE) with do_swap_page's there; but if it
    fails, mask out success bits, which might confuse some arches e.g. sparc.
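
    The masking rule described above can be sketched in userspace C. The flag
    names and bit values below are hypothetical, chosen only for illustration;
    they are not the kernel's VM_FAULT_* definitions.

```c
#include <assert.h>

/* Hypothetical bit values, chosen only for this sketch. */
enum {
	FAULT_OOM    = 0x01,	/* failure bit */
	FAULT_SIGBUS = 0x02,	/* failure bit */
	FAULT_MAJOR  = 0x04,	/* success bit */
	FAULT_WRITE  = 0x08,	/* success bit */
	FAULT_ERROR  = FAULT_OOM | FAULT_SIGBUS
};

/* OR the inner result into the outer one, then drop success bits on failure. */
unsigned int combine_fault(unsigned int outer, unsigned int inner)
{
	unsigned int ret = outer | inner;

	if (ret & FAULT_ERROR)
		ret &= FAULT_ERROR;
	/* Postcondition: a failing result carries no success bits. */
	assert(!(ret & FAULT_ERROR) || !(ret & ~(unsigned int)FAULT_ERROR));
	return ret;
}
```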

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • There's nothing wrong with mem_cgroup_charge failure in do_wp_page and
    do_anonymous_page using __free_page, but it does look odd when nearby code
    uses page_cache_release: use that instead (while turning a blind eye to
    ancient inconsistencies of page_cache_release versus put_page).

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Each caller of mem_cgroup_move_lists is having to use page_get_page_cgroup:
    it's more convenient if it acts upon the page itself not the page_cgroup; and
    in a later patch this becomes important to handle within memcontrol.c.

    Signed-off-by: Hugh Dickins
    Cc: David Rientjes
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • vm_match_cgroup is a perverse name for a macro to match mm with cgroup: rename
    it mm_match_cgroup, matching mm_init_cgroup and mm_free_cgroup.

    Signed-off-by: Hugh Dickins
    Acked-by: David Rientjes
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Hirokazu Takahashi
    Cc: YAMAMOTO Takashi
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Wrap __mark_check_format() into an if(0) to make sure that parameters such as

    trace_mark(mm_page_alloc, "order %u pfn %lu", order, page?page_to_pfn(page):0);

    (where page_to_pfn() has side-effects) won't generate code because of the
    __mark_check_format().

    Thanks to Jan Kiszka for reporting this.
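
    A minimal userspace sketch of the if (0) technique, assuming GCC or Clang.
    The names check_format, CHECKED_MARK and counted_pfn are hypothetical
    stand-ins, not the kernel's marker API: the compiler still checks the
    format string against the arguments, but the arguments are never
    evaluated at run time.

```c
#include <assert.h>

static int evaluations;

/* Hypothetical argument expression with a visible side effect. */
static unsigned long counted_pfn(void)
{
	evaluations++;
	return 42;
}

/* Empty printf-style function: exists only so fmt is checked against args. */
static inline void __attribute__((format(printf, 1, 2)))
check_format(const char *fmt, ...)
{
	(void)fmt;
}

/* The if (0) keeps the type check, but the call and its arguments never run. */
#define CHECKED_MARK(fmt, ...) \
	do { if (0) check_format(fmt, __VA_ARGS__); } while (0)

int evaluations_after_mark(void)
{
	CHECKED_MARK("pfn %lu", counted_pfn());
	assert(evaluations == 0);	/* counted_pfn() was not called */
	return evaluations;
}
```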

    Signed-off-by: Mathieu Desnoyers
    Cc: Jan Kiszka
    Cc: "Frank Ch. Eigler"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
     
  • get_marker() may return NULL, so test for it.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Jesper Juhl
    Acked-by: Mathieu Desnoyers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jesper Juhl
     
  • Signed-off-by: Chris Dearman
    Signed-off-by: Ralf Baechle
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Dearman
     
  • This just removes unused DEBUG_FORCEDAC define in the IOMMU code.

    Signed-off-by: FUJITA Tomonori
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    FUJITA Tomonori
     
  • This patch makes the IOMMU code not allocate a memory area spanning LLD's
    segment boundary.

    is_span_boundary() judges whether a memory area spans LLD's segment boundary.
    If iommu_arena_find_pages() finds such an area, it tries to find the next
    available memory area.
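
    As an illustration of what such a check computes, here is a userspace
    sketch. The function name spans_boundary and its parameters are
    assumptions made for this example, not the helper added by the patch,
    and it assumes the boundary size is a power of two.

```c
#include <assert.h>

/*
 * Return nonzero if the nr-entry range starting at index start would cross
 * a multiple of boundary_size. boundary_size must be a power of two.
 */
int spans_boundary(unsigned long start, unsigned long nr,
		   unsigned long boundary_size)
{
	unsigned long offset;

	assert(boundary_size != 0 && (boundary_size & (boundary_size - 1)) == 0);
	offset = start & (boundary_size - 1);	/* position inside its segment */
	return offset + nr > boundary_size;
}
```

    An allocator using such a check would skip a candidate range for which it
    returns nonzero and continue searching from the next position.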

    Signed-off-by: FUJITA Tomonori
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    FUJITA Tomonori
     
  • iommu_arena_find_pages duplicates the code to access the bitmap for free
    space management. This patch converts the IOMMU code to have only one place to
    access the bitmap, in the popular way that other IOMMUs (e.g. POWER and
    SPARC) do.

    This patch is preparation for modifications to fix the IOMMU segment boundary
    problem.

    Signed-off-by: FUJITA Tomonori
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    FUJITA Tomonori
     
  • This patch is preparation for modifications to fix the IOMMU segment boundary
    problem.

    Signed-off-by: FUJITA Tomonori
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    FUJITA Tomonori
     
  • Include falloc.h in header-y; it defines a flag for the fallocate syscall.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sandeen
     
  • Adrian Bunk reported another compile error with a SVN head GCC:

    ...
    CC arch/cris/arch-v10/lib/string.o
    /home/bunk/linux/kernel-2.6/git/linux-2.6/arch/cris/arch-v10/lib/string.c:138:
    error: lvalue required as increment operand
    /home/bunk/linux/kernel-2.6/git/linux-2.6/arch/cris/arch-v10/lib/string.c:138:
    error: lvalue required as increment operand
    /home/bunk/linux/kernel-2.6/git/linux-2.6/arch/cris/arch-v10/lib/string.c:139:
    error: lvalue required as increment operand
    ...

    This is due to the use of the construct:

    *((long*)dst)++ = lc;

    Which isn't legal since casts don't return an lvalue.

    The solution is to import the implementation from newlib,
    which is continually autotested together with GCC mainline,
    and uses the construct:

    *(long *) dst = lc; dst += 4;

    Since this is an import of a file from newlib, I'm not touching
    the formatting or correcting any checkpatch errors.

    As for the earlier fix for memset.c, even if the two files for
    CRIS v10 and CRIS v32 are identical at the moment, it might
    be possible to tweak the CRIS v32 version.
    Thus, I'm not yet folding them into the same file, at least not
    until we've done some research on it.
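
    The difference between the two constructs can be shown in a small
    standalone sketch. fill_words is a hypothetical helper written for this
    example, not the newlib code: it increments a real pointer variable,
    which is an lvalue, instead of the result of a cast.

```c
#include <assert.h>
#include <stddef.h>

/* Store lc into nwords consecutive longs starting at dst (must be long-aligned). */
void *fill_words(void *dst, long lc, size_t nwords)
{
	long *word = dst;	/* a real pointer variable is an lvalue */

	assert(dst != NULL);
	while (nwords--)
		*word++ = lc;	/* legal: increments word, not a cast result */
	return dst;
}
```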

    Signed-off-by: Jesper Nilsson
    Cc: Mikael Starvik
    Cc: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jesper Nilsson
     
  • The Coverity checker spotted the following inconsistent NULL checking in
    drivers/char/pcmcia/ipwireless/network.c:ipwireless_network_packet_received()

    if (tty && channel_idx == IPW_CHANNEL_RAS
            && (network->ras_control_lines &
                IPW_CONTROL_LINE_DCD) != 0
            && ipwireless_tty_is_modem(tty)) {
        ...
    else
        ipwireless_tty_received(tty, data, length);

    Cc: Adrian Bunk
    Signed-off-by: David Sterba
    Signed-off-by: Jiri Kosina
    Cc: "John W. Linville"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Sterba
     
  • SM502 has a programmable PLL which can provide the panel pixel clock instead
    of the 288MHz and 336MHz PLLs.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Ville Syrjala
    Cc: Ben Dooks
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ville Syrjala
     
  • misc_div is a subset of px_div so eliminate the smaller table.

    Signed-off-by: Ville Syrjala
    Cc: Ben Dooks
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ville Syrjala
     
  • Vertical sync height register can only hold 6 bits. Fix the hsync start test
    to use > instead of >=. Also add a few clarifying comments.

    Signed-off-by: Ville Syrjala
    Acked-by: Ben Dooks
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ville Syrjala
     
  • Even though it may not be strictly necessary transp.offset should probably be
    0 when alpha channel is not available.

    Signed-off-by: Ville Syrjala
    Acked-by: Ben Dooks
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ville Syrjala
     
  • The RGB offsets were reversed in 16bpp modes. Simply trying to reverse the
    offsets when endianness differs is clearly the wrong thing to do but that is
    an issue for another patch.

    Signed-off-by: Ville Syrjala
    Acked-by: Ben Dooks
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ville Syrjala
     
  • The sm501fb palette code clearly does not handle direct color so change the
    driver to use true color visual for 16bpp.

    Signed-off-by: Ville Syrjala
    Acked-by: Magnus Damm
    Acked-by: Ben Dooks
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ville Syrjala
     
  • We should be able to do ndelay(some_u64), but that can cause a call to
    __divdi3() to be emitted because the ndelay() macro does a divide.

    Fix it by switching to static inline which will force the u64 arg to be
    treated as an unsigned long. udelay() takes an unsigned long arg.
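
    A userspace sketch of why the inline form helps. NS_TO_US_MACRO and
    ns_to_us are hypothetical names for this example, not the kernel's
    definitions. With the macro the division happens in the argument's own
    64-bit type; with the inline function the argument is first converted to
    unsigned long. On a 32-bit target that conversion truncates values that
    do not fit in unsigned long.

```c
#include <assert.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Macro form: the division is done in the argument's type, 64-bit for a u64. */
#define NS_TO_US_MACRO(x) DIV_ROUND_UP(x, 1000)

/* Inline form: the argument is converted to unsigned long before dividing. */
static inline unsigned long ns_to_us(unsigned long nsecs)
{
	return DIV_ROUND_UP(nsecs, 1000);
}

unsigned long ns_to_us_from_u64(uint64_t nsecs)
{
	/* The macro's result keeps the 64-bit type of its argument. */
	assert(sizeof(NS_TO_US_MACRO(nsecs)) == sizeof(uint64_t));
	return ns_to_us(nsecs);
}
```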

    [bunk@kernel.org: reported m68k build breakage]
    Cc: Adrian Bunk
    Cc: Evgeniy Polyakov
    Cc: Martin Michlmayr
    Cc: Herbert Xu
    Cc: Ralf Baechle
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • The patch replaces dev_dbg() by dev_err(), so the user could actually see the
    error, instead of wondering why w1 doesn't work. The root cause of the bus
    reset error isn't yet debugged though, but this sometimes happens on iPaq
    H5555.

    And while I'm at it, some cosmetic cleanups were also made (a few lines were
    using spaces instead of tabs).

    Signed-off-by: Anton Vorontsov
    Acked-by: Evgeniy Polyakov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anton Vorontsov
     
  • On the error condition clk_get() returns ERR_PTR(..), so checking for NULL
    doesn't work. ds1wm module causes a kernel oops when ds1wm clock isn't
    registered.

    This patch converts NULL check to IS_ERR(), plus uses PTR_ERR()
    for the return code.
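
    The error-pointer convention can be sketched in userspace as follows. The
    three helpers are simplified re-creations of the kernel ones, and
    fake_clk_get and probe are hypothetical stand-ins for this example; the
    point is that an encoded error is a non-NULL pointer, so a NULL test
    lets it through.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-creations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}

static int dummy_clock;

/* Hypothetical stand-in for clk_get(): fails with an encoded errno, never NULL. */
static void *fake_clk_get(int registered)
{
	return registered ? (void *)&dummy_clock : ERR_PTR(-ENOENT);
}

int probe(int registered)
{
	void *clk = fake_clk_get(registered);

	assert(clk != NULL);		/* so a "clk == NULL" test never fires */
	if (IS_ERR(clk))
		return (int)PTR_ERR(clk);	/* propagate the real error code */
	return 0;
}
```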

    Signed-off-by: Anton Vorontsov
    Acked-by: Evgeniy Polyakov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anton Vorontsov
     
  • Commit id 94f389485e27641348c1951ab8d65157122a8939 (Separate MPC52xx PSC FIFO
    registers from the rest of PSC) split the PSC fifo registers away from the
    core PSC regs. Doing so broke the mpc52xx_psc_spi driver.

    This patch teaches the mpc52xx_psc_spi driver about the new PSC fifo
    register definitions.

    Signed-off-by: Grant Likely
    Cc: David Brownell
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Kumar Gala
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Grant Likely
     
  • On my system, pkt_open() consumes 584 bytes of stack because the compiler
    decides to inline lots of functions that would not normally be part of long
    call chains. The following patch fixes that problem on my system.

    Signed-off-by: Peter Osterlund
    Cc: Nix
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Osterlund
     
  • People are adding `noinline' in various places to prevent excess stack
    consumption due to gcc inlining. But once this is done, it is quite unobvious
    why the `noinline' is present in the code. We can comment each and every
    site, or we can use noinline_for_stack.
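
    A sketch of how such an annotation might be used, assuming GCC or Clang.
    Here noinline_for_stack is defined as a plain alias for the noinline
    attribute, and sum_padded and open_path are hypothetical functions made
    up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same effect as plain noinline; the name records why it is there. */
#define noinline_for_stack __attribute__((noinline))

/* Hypothetical helper with a large local buffer. */
static noinline_for_stack unsigned int sum_padded(const char *s)
{
	unsigned char scratch[512];
	unsigned int sum = 0;
	size_t i, len = strlen(s);

	if (len > sizeof(scratch))
		len = sizeof(scratch);
	memset(scratch, 0, sizeof(scratch));
	memcpy(scratch, s, len);
	for (i = 0; i < sizeof(scratch); i++)
		sum += scratch[i];
	return sum;
}

/* The 512-byte buffer stays out of this caller's stack frame. */
unsigned int open_path(const char *name)
{
	assert(name != NULL);
	return sum_padded(name);
}
```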

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Correct error paths in probe function.

    The probe function enables mmio mode so it is important to disable the mmio
    mode before exiting the probe function. Otherwise, the console is left in
    unusable state (garbled fonts at least, lock up at worst).

    [akpm@linux-foundation.org: cleanups]
    Signed-off-by: Krzysztof Helt
    Cc: "Antonino A. Daplas"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Krzysztof Helt
     
  • Rename Memory Controller to Memory Resource Controller. Reflect the same
    changes in the CONFIG definition for the Memory Resource Controller. Group
    together the config options for Resource Counters and Memory Resource
    Controller.

    Signed-off-by: Balbir Singh
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Balbir Singh
     
  • Move kprobes examples from Documentation/kprobes.txt to under samples/.
    Patch originally by Randy Dunlap.

    o Updated the patch to apply on 2.6.25-rc3
    o Modified examples code to build on multiple architectures. Currently,
      the kprobe and jprobe example code works for x86 and powerpc
    o Cleaned up unneeded #includes
    o Cleaned up Kconfig per Sam Ravnborg's suggestions to fix build break
      on archs that don't have kretprobes
    o Implemented suggestions by Mathieu Desnoyers on CONFIG_KRETPROBES
    o Included Andrew Morton's cleanup based on x86-git
    o Modified kretprobe_example to act as an arch-agnostic module to
      determine routine execution times:
      Use 'modprobe kretprobe_example func=<func_name>' to determine
      execution time of func_name in nanoseconds.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Ananth N Mavinakayanahalli
    Acked-by: Mathieu Desnoyers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ananth N Mavinakayanahalli
     
  • Add CONFIG_HAVE_KRETPROBES to the arch/<arch>/Kconfig file for relevant
    architectures with kprobes support. This facilitates easy handling of
    in-kernel modules (like samples/kprobes/kretprobe_example.c) that depend on
    kretprobes being present in the kernel.

    Thanks to Sam Ravnborg for helping make the patch more lean.

    Per Mathieu's suggestion, added CONFIG_KRETPROBES and fixed up dependencies.

    Signed-off-by: Ananth N Mavinakayanahalli
    Acked-by: Mathieu Desnoyers
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ananth N Mavinakayanahalli
     
  • VT notifier callbacks need to be aware of console switches. This is already
    partially done from console_callback(), but at that time fg_console, cursor
    positions, etc. are not yet updated and hence screen readers fetch the old
    values.

    This adds an update notify after all of the values are updated in
    redraw_screen(vc, 1).

    Signed-off-by: Samuel Thibault
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Samuel Thibault
     
  • Some oprofile results obtained while using tbench on a 2x2 cpu machine were
    very surprising.

    For example, the loopback_xmit() function was using a high number of cpu
    cycles to perform the statistic updates, which are supposed to be really
    cheap since they use percpu data:

    pcpu_lstats = netdev_priv(dev);
    lb_stats = per_cpu_ptr(pcpu_lstats, smp_processor_id());
    lb_stats->packets++; /* HERE : serious contention */
    lb_stats->bytes += skb->len;

    struct pcpu_lstats is a small structure containing two longs. It appears
    that on my 32-bit platform, alloc_percpu(8) allocates a single cache line,
    instead of giving each cpu a separate cache line.

    Using the following patch gave me an impressive boost in various benchmarks
    (6% in tbench)
    (all percpu_counters hit this bug too)

    Long term fix (i.e. >= 2.6.26) would be to let each CPU allocate its own
    block of memory, so that we don't need to round up sizes to L1_CACHE_BYTES, or
    merging the SGI stuff of course...

    Note: SLUB vs SLAB is important here to *show* the improvement, since they
    don't have the same minimum allocation sizes (8 bytes vs 32 bytes). This
    could very well explain regressions some guys reported when they switched
    to SLUB.
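
    The rounding that keeps each CPU's data on its own cache line can be
    sketched as below. CACHE_LINE_BYTES is an assumed value standing in for
    L1_CACHE_BYTES, and roundup_to_cache_line is a hypothetical helper for
    this example, not the code in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64	/* assumed line size for this sketch */

/* Round size up to a whole number of cache lines. */
size_t roundup_to_cache_line(size_t size)
{
	assert(size <= SIZE_MAX - (CACHE_LINE_BYTES - 1));	/* no overflow */
	return (size + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES * CACHE_LINE_BYTES;
}
```

    With this rounding, an 8-byte per-cpu structure occupies a full line, so
    two CPUs updating their own counters no longer write to the same line.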

    Signed-off-by: Eric Dumazet
    Acked-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     
  • Fix NULL pointer dereference in fsync_buffers_list() introduced by recent fix
    of races in private_list handling. Since bh->b_assoc_map has been cleared in
    __remove_assoc_queue() we should really use original value stored in the
    'mapping' variable.

    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • jiffies subtraction may cause an overflow problem. It should be using
    time_after().
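
    A userspace sketch of the wraparound problem. tick_after is a
    hypothetical function modelled on the idea behind time_after(), not the
    kernel macro itself: it looks at the sign of the wrapped difference
    rather than comparing the counter values directly. It relies on the
    usual two's-complement conversion from unsigned long to long.

```c
#include <assert.h>
#include <limits.h>

/* True if a is after b, judged by the sign of the wrapped difference. */
static inline int tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

int deadline_passed_across_wrap(void)
{
	unsigned long deadline = ULONG_MAX - 5;
	unsigned long now = deadline + 10;	/* counter has wrapped to 4 */

	assert(!(now > deadline));	/* naive comparison gets it wrong */
	return tick_after(now, deadline);
}
```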

    [akpm@linux-foundation.org: include jiffies.h]
    Signed-off-by: KOSAKI Motohiro
    Cc: Lee Schermerhorn
    Cc: Paul Jackson
    Cc: Mel Gorman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • Keith Mannthey said:

    The parameter hotadd_percent is setup right but there is a "Malformed
    early option 'numa'" message.

    Rusty Russell said:

    This happens when the function registered with early_param() returns
    non-zero. __setup() functions return 1 if OK, module_param() and
    early_param() return 0 or a -ve error code.

    For instance:

    Linux version 2.6.25-rc3-t (raa@steel) (gcc version 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)) #22 SMP PREEMPT Tue Feb 26
    BIOS-provided physical RAM map:
    BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
    BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
    BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
    BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
    BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
    BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
    Malformed early option 'loglevel'
    127MB HIGHMEM available.
    896MB LOWMEM available.

    Command line:

    BOOT_IMAGE=2.6.25-t ro root=809 ro console=ttyS0,57600n8 console=tty0 loglevel=5
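
    The return convention Rusty describes can be illustrated with a
    standalone parser. parse_loglevel, current_loglevel, the default level
    and the accepted range 0..7 are assumptions for this sketch, not the
    kernel's handler: the function returns 0 on success and a negative errno
    on failure, never 1.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int console_level = 7;	/* hypothetical default */

int current_loglevel(void)
{
	return console_level;
}

/* Returns 0 on success or a negative errno, as an early_param() handler should. */
int parse_loglevel(const char *arg)
{
	char *end;
	long value;

	assert(arg != NULL);
	value = strtol(arg, &end, 10);
	if (end == arg || *end != '\0' || value < 0 || value > 7)
		return -EINVAL;
	console_level = (int)value;
	return 0;	/* a __setup()-style "return 1" here would be reported as malformed */
}
```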

    Acked-by: Yinghai Lu
    Cc: Rusty Russell
    Cc: Keith Mannthey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Riesen
     
  • This makes the user_regset-based core dump code call user_regset writeback
    hooks when available. This is necessary groundwork to allow IA64 to set
    CORE_DUMP_USE_REGSET.

    Cc: Shaohua Li
    Signed-off-by: Roland McGrath
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland McGrath
     
  • Signed-off-by: Balbir Singh
    Signed-off-by: Pavel Emelyanov
    Cc: Paul Menage
    Cc: KAMEZAWA Hiroyuki
    Cc: YAMAMOTO Takashi
    Cc: Hugh Dickins
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    akpm@linux-foundation.org