06 Feb, 2008

3 commits

  • Most pagecache (and some other) radix tree insertions have the great
    opportunity to preallocate a few nodes with relaxed gfp flags. But the
    preallocation is squandered: when it comes time to allocate a node, we
    default to first attempting a GFP_ATOMIC allocation. That doesn't
    normally fail, but it can eat into atomic memory reserves that we don't
    need to be using.

    Another upshot of this is that it removes the sometimes highly contended
    zone->lock from underneath tree_lock. Pagecache insertions are always
    performed with a radix tree preload, and after this change, such a
    situation will never fall back to kmem_cache_alloc within
    radix_tree_node_alloc.

    David Miller reports seeing this allocation fail on a highly threaded
    sparc64 system:

    [527319.459981] dd: page allocation failure. order:0, mode:0x20
    [527319.460403] Call Trace:
    [527319.460568] [00000000004b71e0] __slab_alloc+0x1b0/0x6a8
    [527319.460636] [00000000004b7bbc] kmem_cache_alloc+0x4c/0xa8
    [527319.460698] [000000000055309c] radix_tree_node_alloc+0x20/0x90
    [527319.460763] [0000000000553238] radix_tree_insert+0x12c/0x260
    [527319.460830] [0000000000495cd0] add_to_page_cache+0x38/0xb0
    [527319.460893] [00000000004e4794] mpage_readpages+0x6c/0x134
    [527319.460955] [000000000049c7fc] __do_page_cache_readahead+0x170/0x280
    [527319.461028] [000000000049cc88] ondemand_readahead+0x208/0x214
    [527319.461094] [0000000000496018] do_generic_mapping_read+0xe8/0x428
    [527319.461152] [0000000000497948] generic_file_aio_read+0x108/0x170
    [527319.461217] [00000000004badac] do_sync_read+0x88/0xd0
    [527319.461292] [00000000004bb5cc] vfs_read+0x78/0x10c
    [527319.461361] [00000000004bb920] sys_read+0x34/0x60
    [527319.461424] [0000000000406294] linux_sparc_syscall32+0x3c/0x40

    The calltrace is significant: __do_page_cache_readahead allocates a number
    of pages with GFP_KERNEL, and hence it should have reclaimed sufficient
    memory to satisfy GFP_ATOMIC allocations. However, after the list of pages
    goes to mpage_readpages, there can be significant intervals (including disk
    IO) before all the pages are inserted into the radix-tree. So the reserves
    can easily be depleted at that point. The patch is confirmed to fix the
    problem.

    Signed-off-by: Nick Piggin
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • This patch makes swiotlb not allocate a memory area spanning LLD's segment
    boundary.

    is_span_boundary() judges whether a memory area spans LLD's segment boundary.
    If map_single finds such an area, map_single tries to find the next available
    memory area.

    Signed-off-by: FUJITA Tomonori
    Cc: James Bottomley
    Cc: Jens Axboe
    Cc: Greg KH
    Cc: Jeff Garzik
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    FUJITA Tomonori
     
  • This adds IOMMU helper functions for the free area management. These
    functions take care of LLD's segment boundary limit for IOMMUs. They would be
    useful for IOMMUs that use a bitmap for the free area management.

    Signed-off-by: FUJITA Tomonori
    Cc: Jeff Garzik
    Cc: James Bottomley
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    FUJITA Tomonori
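
The two segment-boundary commits above describe the same idea: when searching for a free area, skip any candidate range that would cross the LLD's segment boundary. The following is a minimal userspace sketch of that idea, not the kernel code. The names spans_boundary() and area_alloc() are hypothetical, and the sketch assumes the boundary is a power-of-two number of slots and that the free map is a plain bitmap of unsigned long words.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical check: does [index, index + nslots) cross a boundary that
 * repeats every boundary_slots slots? Assumes a power-of-two boundary. */
static bool spans_boundary(unsigned long index, unsigned long nslots,
                           unsigned long boundary_slots)
{
    assert(boundary_slots != 0 &&
           (boundary_slots & (boundary_slots - 1)) == 0);
    unsigned long offset = index & (boundary_slots - 1);
    return offset + nslots > boundary_slots;
}

static bool slot_busy(const unsigned long *map, unsigned long i)
{
    return (map[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1UL;
}

static void slot_claim(unsigned long *map, unsigned long i)
{
    map[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
}

/* Find and claim nslots contiguous free slots in map[0..size) that do not
 * cross a boundary. Returns the first slot index, or -1 if none fits. */
static long area_alloc(unsigned long *map, unsigned long size,
                       unsigned long nslots, unsigned long boundary_slots)
{
    unsigned long start = 0;

    if (nslots == 0 || nslots > boundary_slots)
        return -1;

    while (start + nslots <= size) {
        unsigned long i;

        if (spans_boundary(start, nslots, boundary_slots)) {
            /* Jump to the first slot after the boundary. */
            start = (start | (boundary_slots - 1)) + 1;
            continue;
        }
        for (i = 0; i < nslots; i++)
            if (slot_busy(map, start + i))
                break;
        if (i == nslots) {
            for (i = 0; i < nslots; i++)
                slot_claim(map, start + i);
            return (long)start;
        }
        start += i + 1; /* resume just past the busy slot */
    }
    return -1;
}
```

With a boundary of 4 slots, a request for 3 slots that finds slots 0 to 2 busy is placed at slot 4 rather than slot 3, because the range 3 to 5 would straddle the boundary.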
     

04 Feb, 2008

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (79 commits)
    Jesper Juhl is the new trivial patches maintainer
    Documentation: mention email-clients.txt in SubmittingPatches
    fs/binfmt_elf.c: spello fix
    do_invalidatepage() comment typo fix
    Documentation/filesystems/porting fixes
    typo fixes in net/core/net_namespace.c
    typo fix in net/rfkill/rfkill.c
    typo fixes in net/sctp/sm_statefuns.c
    lib/: Spelling fixes
    kernel/: Spelling fixes
    include/scsi/: Spelling fixes
    include/linux/: Spelling fixes
    include/asm-m68knommu/: Spelling fixes
    include/asm-frv/: Spelling fixes
    fs/: Spelling fixes
    drivers/watchdog/: Spelling fixes
    drivers/video/: Spelling fixes
    drivers/ssb/: Spelling fixes
    drivers/serial/: Spelling fixes
    drivers/scsi/: Spelling fixes
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild:
    scsi: fix dependency bug in aic7 Makefile
    kbuild: add svn revision information to setlocalversion
    kbuild: do not warn about __*init/__*exit symbols being exported
    Move Kconfig.instrumentation to arch/Kconfig and init/Kconfig
    Add HAVE_KPROBES
    Add HAVE_OPROFILE
    Create arch/Kconfig
    Fix ARM to play nicely with generic Instrumentation menu
    kconfig: ignore select of unknown symbol
    kconfig: mark config as changed when loading an alternate config
    kbuild: Spelling/grammar fixes for config DEBUG_SECTION_MISMATCH
    Remove __INIT_REFOK and __INITDATA_REFOK
    kbuild: print only total number of section mismatches found

    Linus Torvalds
     

03 Feb, 2008

4 commits


02 Feb, 2008

1 commit

  • Change latencytop Kconfig entry so it doesn't list the architectures
    that support it. Instead introduce HAVE_LATENCY_SUPPORT which any
    architecture can set. Should reduce patch conflicts.

    Cc: Arjan van de Ven
    Cc: Martin Schwidefsky
    Cc: Holger Wolf
    Signed-off-by: Heiko Carstens
    Signed-off-by: Ingo Molnar

    Heiko Carstens
     

30 Jan, 2008

4 commits

  • This patch adds a new configuration option, which adds support for a new
    early_param which gets checked in arch/x86/kernel/setup_{32,64}.c:setup_arch()
    to decide whether OHCI-1394 FireWire controllers should be initialized and
    enabled for physical DMA access to allow remote debugging of early problems
    like issues in ACPI or other subsystems which are executed very early.

    If the config option is not enabled, no code is changed, and if the boot
    parameter is not given, no new code is executed. Independent of that,
    all new code is freed after boot, so the config option can even be enabled
    in standard, non-debug kernels.

    With specialized tools, it is then possible to get debugging information
    from machines which have no serial ports (notebooks), such as the printk
    buffer contents or any data which can be referenced from global pointers,
    if it is stored below the 4GB limit. Even memory dumps of the physical
    RAM region below the 4GB limit can be taken without any cooperation from the
    CPU of the host, so it does not matter if the machine has crashed early.

    In the extreme, even kernel debuggers can be accessed in this way. I wrote
    a small kgdb module and an accompanying gdb stub for FireWire which allows
    gdb to talk to kgdb using remote memory reads and writes over FireWire.

    A version of the gdb stub for FireWire is able to read all global data
    from a system which is running a normal kernel without any kernel debugger,
    without any interruption or support of the system's CPU. That way, e.g. the
    task struct and so on can be read and even manipulated when the physical DMA
    access is granted.

    A HOWTO is included in this patch, in Documentation/debugging-via-ohci1394.txt
    and I've put a copy online at
    ftp://ftp.suse.de/private/bk/firewire/docs/debugging-via-ohci1394.txt

    It also has links to all the tools which are available to make use of it.
    Another copy of it is online at:
    ftp://ftp.suse.de/private/bk/firewire/kernel/ohci1394_dma_early-v2.diff

    Signed-Off-By: Bernhard Kaindl
    Tested-By: Thomas Renninger
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Bernhard Kaindl
     
  • During the work on the x86 32 and 64 bit backtrace code I found it useful
    to have a simple test module to test a process and irq context backtrace.
    Since the existing backtrace code was buggy, I figure it might be useful
    to have such a test module in the kernel so that maybe we can even
    detect such bugs earlier.

    [ mingo@elte.hu: build fix ]

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Arjan van de Ven
     
  • Introduce the "asmregparm" calling convention: for functions
    implemented in assembly with a fixed regparm input-parameter
    calling convention.

    Mark the semaphore and rwsem slowpath functions with that.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Miklos Szeredi
    Signed-off-by: Thomas Gleixner

    Ingo Molnar
     
  • Here is a quick and naive smoke test for kprobes. This is intended to
    just verify if some unrelated change broke the *probes subsystem. It is
    self-contained, architecture-agnostic and isn't of any great use by itself.

    This needs to be built in the kernel and runs a basic set of tests to
    verify if kprobes, jprobes and kretprobes run fine on the kernel. In case
    of an error, it'll print out a message with a "BUG" prefix.

    This is a start; we intend to add more tests to this bucket over time.

    Thanks to Jim Keniston and Masami Hiramatsu for comments and suggestions.

    Tested on x86 (32/64) and powerpc.

    Signed-off-by: Ananth N Mavinakayanahalli
    Acked-by: Masami Hiramatsu
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Ananth N Mavinakayanahalli
     

29 Jan, 2008

7 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.25: (1470 commits)
    [IPV6] ADDRLABEL: Fix double free on label deletion.
    [PPP]: Sparse warning fixes.
    [IPV4] fib_trie: remove unneeded NULL check
    [IPV4] fib_trie: More whitespace cleanup.
    [NET_SCHED]: Use nla_policy for attribute validation in ematches
    [NET_SCHED]: Use nla_policy for attribute validation in actions
    [NET_SCHED]: Use nla_policy for attribute validation in classifiers
    [NET_SCHED]: Use nla_policy for attribute validation in packet schedulers
    [NET_SCHED]: sch_api: introduce constant for rate table size
    [NET_SCHED]: Use typeful attribute parsing helpers
    [NET_SCHED]: Use typeful attribute construction helpers
    [NET_SCHED]: Use NLA_PUT_STRING for string dumping
    [NET_SCHED]: Use nla_nest_start/nla_nest_end
    [NET_SCHED]: Propagate nla_parse return value
    [NET_SCHED]: act_api: use PTR_ERR in tcf_action_init/tcf_action_get
    [NET_SCHED]: act_api: use nlmsg_parse
    [NET_SCHED]: act_api: fix netlink API conversion bug
    [NET_SCHED]: sch_netem: use nla_parse_nested_compat
    [NET_SCHED]: sch_atm: fix format string warning
    [NETNS]: Add namespace for ICMP replying code.
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild: (79 commits)
    Remove references to "make dep"
    kconfig: document use of HAVE_*
    Introduce new section reference annotations tags: __ref, __refdata, __refconst
    kbuild: warn about ld added unique sections
    kbuild: add verbose option to Section mismatch reporting in modpost
    kconfig: tristate choices with mixed tristate and boolean values
    asm-generic/vmlinux.lds.h: simplify __mem{init,exit}* dependencies
    remove __attribute_used__
    kbuild: support ARCH=x86 in buildtar
    kconfig: remove "enable"
    kbuild: simplified warning report in modpost
    kbuild: introduce a few helpers in modpost
    kbuild: use simpler section mismatch warnings in modpost
    kbuild: link vmlinux.o before kallsyms passes
    kbuild: introduce new option to enhance section mismatch analysis
    Use separate sections for __dev/__cpu/__mem code/data
    compiler.h: introduce __section()
    all archs: consolidate init and exit sections in vmlinux.lds.h
    kbuild: check section names consistently in modpost
    kbuild: introduce blacklisting in modpost
    ...

    Linus Torvalds
     
  • This function is used by the ext4 multi block allocator patches.

    Also add generic_find_next_le_bit

    Signed-off-by: Aneesh Kumar K.V
    Cc:
    Signed-off-by: Andrew Morton

    Aneesh Kumar K.V
     
  • Before pushing pcounter to Linus' tree, I would like to make some adjustments.

    The goal is to reduce kernel text size by moving functions that are too big
    out of line.

    When a pcounter is bound to a statically defined per_cpu variable,
    we define two small helper functions. (No more folding function
    using the fat for_each_possible_cpu(cpu) ... )

    static DEFINE_PER_CPU(int, NAME##_pcounter_values);
    static void NAME##_pcounter_add(struct pcounter *self, int val)
    {
            __get_cpu_var(NAME##_pcounter_values) += val;
    }
    static int NAME##_pcounter_getval(const struct pcounter *self, int cpu)
    {
            return per_cpu(NAME##_pcounter_values, cpu);
    }

    Fast path is therefore unchanged, while folding/alloc/free is now out of line.

    This saves 228 bytes on i386

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • This just generalises what was introduced by Eric Dumazet for the struct proto
    inuse field in 286ab3d46058840d68e5d7d52e316c1f7e98c59f:

    [NET]: Define infrastructure to keep 'inuse' changes in an efficient SMP/NUMA way.

    Please look at the comment in there to see the rationale.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • If the config option CONFIG_SECTION_MISMATCH is not set and
    we see a section mismatch, present the following to the user:

    modpost: Found 1 section mismatch(es).
    To see additional details select "Enable full Section mismatch analysis"
    in the Kernel Hacking menu (CONFIG_SECTION_MISMATCH).

    If the option CONFIG_SECTION_MISMATCH is selected,
    then be verbose in the section mismatch reporting from modpost.
    Sample outputs:

    WARNING: o-x86_64/vmlinux.o(.text+0x7396): Section mismatch in reference from the function discover_ebda() to the variable .init.data:ebda_addr
    The function discover_ebda() references
    the variable __initdata ebda_addr.
    This is often because discover_ebda lacks a __initdata
    annotation or the annotation of ebda_addr is wrong.

    WARNING: o-x86_64/vmlinux.o(.data+0x74d58): Section mismatch in reference from the variable pci_serial_quirks to the function .devexit.text:pci_plx9050_exit()
    The variable pci_serial_quirks references
    the function __devexit pci_plx9050_exit()
    If the reference is valid then annotate the
    variable with __exit* (see linux/init.h) or name the variable:
    *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

    WARNING: o-x86_64/vmlinux.o(__ksymtab+0x630): Section mismatch in reference from the variable __ksymtab_arch_register_cpu to the function .cpuinit.text:arch_register_cpu()
    The symbol arch_register_cpu is exported and annotated __cpuinit
    Fix this by removing the __cpuinit annotation of arch_register_cpu or drop the export.

    Signed-off-by: Sam Ravnborg

    Sam Ravnborg
     
  • Setting the option DEBUG_SECTION_MISMATCH will
    report additional section mismatches, but this
    should in the end make it possible to get rid of
    all of them.

    See help text in lib/Kconfig.debug for details.

    Signed-off-by: Sam Ravnborg

    Sam Ravnborg
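
The pcounter commits in the list above rely on a simple split: the fast path adds to the calling CPU's own slot, and a slower fold sums all slots when a total is needed. A minimal userspace sketch of that split follows. It is an illustration only, not the kernel implementation: the struct, the function names and the fixed NR_CPUS_SKETCH value are all hypothetical, and a plain array stands in for real per-CPU storage.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 4 /* assumed CPU count, for illustration only */

/* Hypothetical stand-in for a per-CPU counter: one slot per CPU. */
struct pcounter_sketch {
    int values[NR_CPUS_SKETCH];
};

/* Fast path: touch only the slot that belongs to the given CPU. */
static void pcounter_sketch_add(struct pcounter_sketch *pc, int cpu, int val)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    pc->values[cpu] += val;
}

/* Slow path: fold every CPU's slot into a single total. */
static int pcounter_sketch_fold(const struct pcounter_sketch *pc)
{
    int sum = 0;

    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        sum += pc->values[cpu];
    return sum;
}
```

An individual slot may go negative (for example when an object is counted on one CPU and released on another); only the folded total is meaningful.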
     

28 Jan, 2008

2 commits


26 Jan, 2008

2 commits


25 Jan, 2008

15 commits