17 Oct, 2007

1 commit

  • SPARSEMEM is a pretty nice framework that unifies quite a bit of code over all
    the arches. It would be great if it could be the default so that we can get
    rid of various forms of DISCONTIG and other variations on memory maps. So far
    what has hindered this are the additional lookups that SPARSEMEM introduces
    for virt_to_page and page_address. This goes so far that the code to do this
    has to be kept in a separate function and cannot be used inline.

    This patch introduces a virtual memmap mode for SPARSEMEM, in which the memmap
    is mapped into a virtually contiguous area and only the active sections are
    physically backed. This allows virt_to_page, page_address and cohorts to become
    simple shift/add operations. No page flag fields, no table lookups, nothing
    involving memory is required.

    The two key operations pfn_to_page and page_to_pfn become:

    #define __pfn_to_page(pfn) (vmemmap + (pfn))
    #define __page_to_pfn(page) ((page) - vmemmap)

    By having a virtual mapping for the memmap we allow simple access without
    wasting physical memory. As kernel memory is typically already mapped 1:1
    this introduces no additional overhead. The virtual mapping must be big
    enough to allow a struct page to be allocated and mapped for all valid
    physical pages. This will make a virtual memmap difficult to use on 32 bit
    platforms that support 36 address bits.

    However, if there is enough virtual space available and the arch already maps
    its 1-1 kernel space using TLBs (f.e. true of IA64 and x86_64) then this
    technique makes SPARSEMEM lookups even more efficient than CONFIG_FLATMEM.
    FLATMEM needs to read the contents of the mem_map variable to get the start of
    the memmap and then add the offset to the required entry. vmemmap is a
    constant to which we can simply add the offset.
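
    The difference between the two lookups can be sketched in user space. This
    is only an illustration under stated assumptions: struct page is reduced to
    a single field, a static array (sketch_memmap) stands in for the virtually
    mapped memmap area, and flat_pfn_to_page is a simplified, hypothetical
    FLATMEM-style helper, not the kernel's definition.

```c
/* Hypothetical stand-in for the kernel's struct page. */
struct page {
	unsigned long flags;
};

/* Assumption: a static array plays the role of the virtually contiguous
 * memmap area. In the kernel, vmemmap is a fixed virtual address. */
#define SKETCH_NR_PFNS 1024
static struct page sketch_memmap[SKETCH_NR_PFNS];
#define vmemmap (sketch_memmap)

/* The two operations quoted in the commit message: pure pointer arithmetic
 * against a constant base. */
#define __pfn_to_page(pfn)  (vmemmap + (pfn))
#define __page_to_pfn(page) ((page) - vmemmap)

/* FLATMEM-like contrast (simplified): the base lives in a variable that
 * has to be read before the offset can be added. */
static struct page *mem_map = sketch_memmap;

static inline struct page *flat_pfn_to_page(unsigned long pfn)
{
	return mem_map + pfn;
}
```

    Both helpers return the same entry; the point is only that the vmemmap form
    needs no memory read to find the base.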

    This patch has the potential to allow us to make SPARSEMEM the default (and
    even the only) option for most systems. It should be optimal on UP, SMP and
    NUMA on most platforms. Then we may even be able to remove the other memory
    models: FLATMEM, DISCONTIG etc.

    [apw@shadowen.org: config cleanups, resplit code etc]
    [kamezawa.hiroyu@jp.fujitsu.com: Fix sparsemem_vmemmap init]
    [apw@shadowen.org: vmemmap: remove excess debugging]
    [apw@shadowen.org: simplify initialisation code and reduce duplication]
    [apw@shadowen.org: pull out the vmemmap code into its own file]
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andy Whitcroft
    Acked-by: Mel Gorman
    Cc: "Luck, Tony"
    Cc: Andi Kleen
    Cc: "David S. Miller"
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

14 Oct, 2007

1 commit

  • In commit 4665079cbb2a3e17de82f2ab2940b9f97f37d65e ("[NETNS]: Move some
    code into __init section when CONFIG_NET_NS=n") we got a new section -
    .exit.text.refok (more of 'let's tell modpost that some bogus calls are
    not bogus', a-la text.init.refok).

    Unfortunately, the commit in question forgot to add it to TEXT_TEXT,
    with rather amusing results.

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
     

13 Oct, 2007

1 commit


13 Sep, 2007

1 commit

  • Commit f629307c857c030d5a3dd777fee37c8bb395e171 introduced uses of
    kernel_termios_to_user_termios_1 and user_termios_to_kernel_termios_1
    on all architectures. However, powerpc, s390, avr32 and frv don't
    currently define those functions since their termios struct didn't
    need to be changed when the arbitrary baud rate stuff was added, and
    thus the kernel won't currently build on those architectures.

    This adds definitions of kernel_termios_to_user_termios_1 and
    user_termios_to_kernel_termios_1 to include/asm-generic/termios.h
    which are identical to kernel_termios_to_user_termios and
    user_termios_to_kernel_termios respectively. The definitions are the
    same because the "old" termios and "new" termios are in fact the same
    on these architectures (which are the same ones that use
    asm-generic/termios.h).
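
    The idea of defining the _1 variants as identical to the plain ones can be
    sketched as follows. This is a user-space illustration only: the struct
    layout, the sketch_ names and the memcpy stand-in for the user copy are all
    assumptions, not the contents of include/asm-generic/termios.h.

```c
#include <string.h>

/* Hypothetical, reduced termios layout shared by the old and new ABI. */
struct sketch_termios {
	unsigned int c_iflag;
	unsigned int c_oflag;
	unsigned int c_cflag;
	unsigned int c_lflag;
};

/* Stand-in for the kernel-to-user copy; returns 0 on success. */
static inline int sketch_kernel_to_user(struct sketch_termios *u,
					const struct sketch_termios *k)
{
	memcpy(u, k, sizeof(*u));
	return 0;
}

/* Because the old and new layouts are the same here, the _1 variant
 * simply forwards to the plain helper. */
#define sketch_kernel_to_user_1(u, k) sketch_kernel_to_user(u, k)
```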

    Signed-off-by: Paul Mackerras
    Cc: Andrew Morton
    Cc: Alan Cox
    Cc: David Miller
    Signed-off-by: Linus Torvalds

    Paul Mackerras
     

12 Aug, 2007

1 commit

  • There are some parts of include/asm-generic/pgtable.h that are relevant to
    the non-mmu architectures. To make it easier to include this from them I
    would like to ifdef the relevant parts.

    Without this there is a handful of functions that are referenced in here
    that are not defined on many non-mmu architectures. They could be defined
    out of course, as an alternative approach.

    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Greg Ungerer
     

01 Aug, 2007

2 commits

  • Alexey Dobriyan noticed that the new WARN_ON() semantics that were
    introduced by commit 684f978347deb42d180373ac4c427f82ef963171 (to also
    return the value to be warned on) didn't compile when given a bitfield,
    because the typeof doesn't work for bitfields.

    So instead of the typeof trick, use an "int" variable together with a
    "!!(x)" expression, as suggested by Al Viro.

    To make matters more interesting, Paul Mackerras points out that that is
    sub-optimal on Power, but the old asm-coded comparison seems to be buggy
    anyway on 32-bit Power if the conditional was 64-bit, so I think there
    are more problems there.

    Regardless, the new WARN_ON() semantics may have been a bad idea. But
    this at least avoids the more serious complications.
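
    The int plus "!!(x)" approach described above can be sketched like this. It
    is a minimal illustration, assuming a GNU C compiler (statement expressions
    are a gcc/clang extension); WARN_ON_SKETCH, warn_count and struct
    sketch_flags are hypothetical names, not the kernel's macro.

```c
#include <stdio.h>

static int warn_count;	/* hypothetical counter, for illustration only */

/* "!!" collapses any scalar, including a bitfield or a 64-bit value,
 * to 0 or 1, so no typeof on the argument is needed. */
#define WARN_ON_SKETCH(condition) ({				\
	int __ret_warn_on = !!(condition);			\
	if (__ret_warn_on) {					\
		warn_count++;					\
		fprintf(stderr, "WARNING at %s:%d\n",		\
			__FILE__, __LINE__);			\
	}							\
	__ret_warn_on;						\
})

/* typeof cannot be applied to a bitfield member like this one. */
struct sketch_flags {
	unsigned int dirty : 1;
};
```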

    Cc: Alexey Dobriyan
    Cc: Herbert Xu
    Cc: Paul Mackerras
    Cc: Al Viro
    Cc: Ingo Molnar
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Use "__val" rather than "val" in the __get_unaligned macro in
    asm-generic/unaligned.h. This way gcc won't warn if you happen to also name
    something in the same scope "val".
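
    A reduced sketch of why the macro-local name matters, assuming GNU C
    statement expressions. sketch_get_unaligned_u32 is a hypothetical helper
    built on memcpy, not the macro from asm-generic/unaligned.h.

```c
#include <string.h>

/* The temporary is called __val, so a caller that has its own variable
 * named "val" in scope is not shadowed by the macro's local. */
#define sketch_get_unaligned_u32(ptr) ({		\
	unsigned int __val;				\
	memcpy(&__val, (ptr), sizeof(__val));		\
	__val;						\
})
```

    Had the temporary been named "val", a call such as
    sketch_get_unaligned_u32(&val) would have referred to the macro's own
    uninitialised local rather than the caller's variable.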

    Signed-off-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Frysinger
     

20 Jul, 2007

2 commits

  • This changes the i386 linker script and the asm-generic macro it uses so that
    ELF note sections with SHF_ALLOC set are linked into the kernel image along
    with other read-only data. The PT_NOTE also points to their location.

    This paves the way for putting useful build-time information into ELF notes
    that can be found easily later in a kernel memory dump.

    Signed-off-by: Roland McGrath
    Cc: Andi Kleen
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland McGrath
     
  • The per cpu data section contains two types of data: one set which is
    exclusively accessed by the local cpu and another set which is per cpu,
    but also shared by remote cpus. In the current kernel, these two sets are
    not clearly separated out. This can potentially cause the same data
    cacheline to be shared between the two sets of data, which will result in
    unnecessary bouncing of the cacheline between cpus.

    One way to fix the problem is to cacheline align the remotely accessed per
    cpu data, both at the beginning and at the end. Because of the padding at
    both ends, this will likely cause some memory wastage and also the
    interface to achieve this is not clean.

    This patch:

    Moves the remotely accessed per cpu data (which is currently marked
    as ____cacheline_aligned_in_smp) into a different section, where all the data
    elements are cacheline aligned. And as such, this differentiates the local
    only data and remotely accessed data cleanly.

    Signed-off-by: Fenghua Yu
    Acked-by: Suresh Siddha
    Cc: Rusty Russell
    Cc: Christoph Lameter
    Cc: "Luck, Tony"
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fenghua Yu
     

18 Jul, 2007

4 commits

  • Verify that types would match for assignment (under sizeof, so we are safe from
    side effects or any code actually getting generated), then explicitly cast
    everywhere to the fixed-sized types. Kills a bunch of bogus warnings about
    constants being truncated (gcc, sparse), finds a pile of endianness problems
    hidden by old noise (sparse).
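
    The "check under sizeof, then cast" pattern can be sketched as below. This
    is an illustration only; sketch_store_u16 is a hypothetical macro, not the
    code touched by the commit.

```c
#include <stdint.h>

/* The assignment inside sizeof is type-checked by the compiler but never
 * evaluated, so it generates no code and has no side effects. The real
 * store then uses an explicit cast to the fixed-size type. */
#define sketch_store_u16(val, ptr) do {			\
	(void)sizeof(*(ptr) = (val));			\
	*(ptr) = (uint16_t)(val);			\
} while (0)
```

    Because the operand of sizeof is not evaluated, an argument with a side
    effect is still evaluated exactly once.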

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
     
  • sparse now warns if one compares pointers with integers. However, there are
    false positives, like:

    fs/filesystems.c:72:2: warning: Using plain integer as NULL pointer

    Every time BUG_ON(ptr) is used, ptr is checked against integer zero. Avoid
    that and save ~70 false positives from allyesconfig run.

    mentioned by Al.

    Signed-off-by: Alexey Dobriyan
    Acked-by: Al Viro
    Acked-by: Josh Triplett
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Nobody is using ptep_test_and_clear_dirty and ptep_clear_flush_dirty. Remove
    the functions from all architectures.

    Signed-off-by: Martin Schwidefsky
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky
     
  • The last user of ptep_establish in mm/ is long gone. Remove the architecture
    primitive as well.

    Signed-off-by: Martin Schwidefsky
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky
     

17 Jul, 2007

2 commits

  • The problem is as follows: in multi-threaded code (or more correctly: all
    code using clone() with CLONE_FILES) we have a race when exec'ing.

    thread #1                       thread #2

    fd=open()

                                    fork + exec

    fcntl(fd,F_SETFD,FD_CLOEXEC)

    In some applications this can happen frequently. Take a web browser. One
    thread opens a file and another thread starts, say, an external PDF viewer.
    The result can even be a security issue if that open file descriptor
    refers to a sensitive file and the external program can somehow be tricked
    into using that descriptor.

    Just adding O_CLOEXEC support to open() doesn't solve the whole set of
    problems. There are other ways to create file descriptors (socket,
    epoll_create, Unix domain socket transfer, etc). These can and should be
    addressed separately though. open() is such an easy case that it makes
    little sense to put the fix off.

    The test program:

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #ifndef O_CLOEXEC
    # define O_CLOEXEC 02000000
    #endif

    int
    main (int argc, char *argv[])
    {
      int fd;
      if (argc > 1)
        {
          fd = atol (argv[1]);
          printf ("child: fd = %d\n", fd);
          if (fcntl (fd, F_GETFD) == 0 || errno != EBADF)
            {
              puts ("file descriptor valid in child");
              return 1;
            }
          return 0;
        }

      fd = open ("/proc/self/exe", O_RDONLY | O_CLOEXEC);
      printf ("in parent: new fd = %d\n", fd);
      char buf[20];
      snprintf (buf, sizeof (buf), "%d", fd);
      execl ("/proc/self/exe", argv[0], buf, NULL);
      puts ("execl failed");
      return 1;
    }

    [kyle@parisc-linux.org: parisc fix]
    Signed-off-by: Ulrich Drepper
    Acked-by: Ingo Molnar
    Cc: Davide Libenzi
    Cc: Michael Kerrisk
    Cc: Chris Zankel
    Signed-off-by: Kyle McMartin
    Acked-by: David S. Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     
  • Continuing the work started in 411f0f3edc141a582190d3605cadd1d993abb6df ...

    This enables code with a dma path, that compiles away, to build without
    requiring additional code factoring. It also prevents code that calls
    dma_alloc_coherent and dma_free_coherent from linking whereas previously
    the code would hit a BUG() at run time. Finally, it allows archs that set
    !HAS_DMA to delete their asm/dma-mapping.h file.

    Cc: Cornelia Huck
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: John W. Linville
    Cc: Kyle McMartin
    Cc: James Bottomley
    Cc: Tejun Heo
    Cc: Jeff Garzik
    Signed-off-by: Dan Williams
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Williams
     

10 Jul, 2007

1 commit


17 Jun, 2007

1 commit

  • Some changes done a while ago to avoid pounding on ptep_set_access_flags and
    update_mmu_cache in some race situations break sun4c which requires
    update_mmu_cache() to always be called on minor faults.

    This patch reworks ptep_set_access_flags() semantics, implementations and
    callers so that it's now responsible for returning whether an update is
    necessary or not (basically whether the PTE actually changed). This allows
    fixing the sparc implementation to always return 1 on sun4c.
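
    The reworked contract can be sketched in user space as follows. All names
    here (sketch_pte_t, the sketch_ helpers, the mmu_cache_updates counter) are
    hypothetical; the sketch only shows the "return whether an update is
    needed" idea, not any architecture's real implementation.

```c
typedef unsigned long sketch_pte_t;

static int mmu_cache_updates;	/* counts stand-in update_mmu_cache() calls */

/* Generic flavour: report whether the PTE actually changed. */
static int sketch_set_access_flags(sketch_pte_t *ptep, sketch_pte_t entry)
{
	int changed = (*ptep != entry);

	if (changed)
		*ptep = entry;
	return changed;
}

/* sun4c-like flavour: always ask the caller to update the MMU cache. */
static int sketch_set_access_flags_sun4c(sketch_pte_t *ptep, sketch_pte_t entry)
{
	*ptep = entry;
	return 1;
}

/* Caller: the cache update happens only when the helper asks for it. */
static void sketch_fault_path(sketch_pte_t *ptep, sketch_pte_t entry,
			      int (*set_flags)(sketch_pte_t *, sketch_pte_t))
{
	if (set_flags(ptep, entry))
		mmu_cache_updates++;
}
```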

    [akpm@linux-foundation.org: fixes, cleanups]
    Signed-off-by: Benjamin Herrenschmidt
    Cc: Hugh Dickins
    Cc: David Miller
    Cc: Mark Fortescue
    Acked-by: William Lee Irwin III
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Benjamin Herrenschmidt
     

30 May, 2007

1 commit

  • The RO_DATA section was hardcoded to a specific alignment in
    include/asm-generic/vmlinux.lds.h, but for sparc64 this did not match
    PAGE_SIZE.

    Introduce a new section definition named RO_DATA that takes the actual
    alignment as a parameter. RODATA is provided for backward compatibility.

    On top of this, avoid hardcoding the alignment for sparc64 in the rest of
    the script. The fix is build-tested on sparc64 + x86_64.

    Signed-off-by: Sam Ravnborg

    Sam Ravnborg
     

25 May, 2007

1 commit


19 May, 2007

3 commits


12 May, 2007

1 commit


11 May, 2007

2 commits

  • These files are almost all the same.

    This patch could be made even simpler if we don't mind POLLREMOVE turning
    up in a few architectures that didn't have it previously (which should be
    OK as POLLREMOVE is not used anywhere in the current tree).

    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Rothwell
     
  • Add a syscall class for sending signals.

    Signed-off-by: Amy Griffis
    Signed-off-by: Al Viro

    Amy Griffis
     

09 May, 2007

5 commits

  • Fix the misspellings of "propogate", "writting" and (oh, the shame
    :-) "kenrel" in the source tree.

    Signed-off-by: Robert P. J. Day
    Signed-off-by: Adrian Bunk

    Robert P. J. Day
     
  • This series extends and standardises local_t operations on each architecture,
    allowing a rich set of atomic operations to be done on per-cpu data with
    minimal performance impact. On architectures where there seems to be no
    difference between the SMP and UP operation (same memory barriers, same
    LOCKing), local.h simply includes asm-generic/local.h, which removes
    duplicated code from the current kernel tree.

    This patch:

    local_t: architecture independent extension

    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
     
  • atomic_add_unless as inline. Remove system.h atomic.h circular dependency.
    I agree (with Andi Kleen) this typeof is not needed and more error
    prone. All the original atomic.h code that uses cmpxchg (which includes
    the atomic_add_unless) uses defines instead of inline functions,
    probably to circumvent a circular dependency between system.h and
    atomic.h on powerpc (which my patch addresses). Therefore, it makes
    sense to use inline functions that will provide type checking.

    atomic_add_unless as inline. Remove system.h atomic.h circular dependency.
    Digging into the FRV architecture shows me that it is also affected by
    such a circular dependency. Here is the diff applying this against the
    rest of my atomic.h patches.

    It applies over the atomic.h standardization patches.

    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
     
  • Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
     
  • This patch moves the die notifier handling to common code. Previously,
    various architectures had exactly the same code for it. Note that the new
    code is compiled unconditionally; this should be understood as an appeal to
    the other architecture maintainers to implement support for it as well (aka
    sprinkling a notify_die or two in the proper place).

    arm had a notify_die that did something totally different. I renamed it to
    arm_notify_die as part of the patch and made it static to the file it's
    declared and used in. avr32 used to pass slightly less information through
    this interface and I brought it into line with the other architectures.

    [akpm@linux-foundation.org: build fix]
    [akpm@linux-foundation.org: fix vmalloc_sync_all bustage]
    [bryan.wu@analog.com: fix vmalloc_sync_all in nommu]
    Signed-off-by: Christoph Hellwig
    Cc: Russell King
    Signed-off-by: Bryan Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

03 May, 2007

3 commits

  • Three cleanups:

    1: ELF notes are never mapped, so there's no need to have any access
    flags in their phdr.

    2: When generating them from asm, tell the assembler to use a SHT_NOTE
    section type. There doesn't seem to be a way to do this from C.

    3: Use ANSI rather than traditional cpp behaviour to stringify the
    macro argument.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: Eric W. Biederman

    Jeremy Fitzhardinge
     
  • Add hooks to allow a paravirt implementation to track the lifetime of
    an mm. Paravirtualization requires three hooks, but only two are
    needed in common code. They are:

    arch_dup_mmap, which is called when a new mmap is created at fork

    arch_exit_mmap, which is called when the last process reference to an
    mm is dropped, which typically happens on exit and exec.

    The third hook is activate_mm, which is called from the arch-specific
    activate_mm() macro/function, and so doesn't need stub versions for
    other architectures. It's called when an mm is first used.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: linux-arch@vger.kernel.org
    Cc: James Bottomley
    Acked-by: Ingo Molnar

    Jeremy Fitzhardinge
     
  • Allocating PDA and GDT at boot is a pain. Using simple per-cpu variables adds
    happiness (although we need the GDT page-aligned for Xen, which we do in a
    followup patch).

    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Rusty Russell
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton

    Rusty Russell
     

28 Apr, 2007

1 commit

  • * master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (448 commits)
    [IPV4] nl_fib_lookup: Initialise res.r before fib_res_put(&res)
    [IPV6]: Fix thinko in ipv6_rthdr_rcv() changes.
    [IPV4]: Add multipath cached to feature-removal-schedule.txt
    [WIRELESS] cfg80211: Clarify locking comment.
    [WIRELESS] cfg80211: Fix locking in wiphy_new.
    [WEXT] net_device: Don't include wext bits if not required.
    [WEXT]: Misc code cleanups.
    [WEXT]: Reduce inline abuse.
    [WEXT]: Move EXPORT_SYMBOL statements where they belong.
    [WEXT]: Cleanup early ioctl call path.
    [WEXT]: Remove options.
    [WEXT]: Remove dead debug code.
    [WEXT]: Clean up how wext is called.
    [WEXT]: Move to net/wireless
    [AFS]: Eliminate cmpxchg() usage in vlocation code.
    [RXRPC]: Fix pointers passed to bitops.
    [RXRPC]: Remove bogus atomic_* overrides.
    [AFS]: Fix u64 printing in debug logging.
    [AFS]: Add "directory write" support.
    [AFS]: Implement the CB.InitCallBackState3 operation.
    ...

    Linus Torvalds
     

27 Apr, 2007

1 commit

  • The page_test_and_clear_dirty primitive really consists of two
    operations, page_test_dirty and the page_clear_dirty. The combination
    of the two is not an atomic operation, so it makes more sense to have
    two separate operations instead of one.
    In addition to the improved readability of the s390 version of
    SetPageUptodate, it now avoids the page_test_dirty operation, which is
    an expensive insert-storage-key-extended (iske) instruction.

    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     

26 Apr, 2007

1 commit


09 Apr, 2007

1 commit

  • Since lazy MMU batching mode still allows interrupts to enter, it is
    possible for interrupt handlers to try to use kmap_atomic, which fails when
    lazy mode is active, since the PTE update to highmem will be delayed. The
    best workaround is to issue an explicit flush in kmap_atomic_functions
    case; this is the only way nested PTE updates can happen in the interrupt
    handler.

    Thanks to Jeremy Fitzhardinge for noting the bug and suggestions on a fix.

    This patch gets reverted again when we start 2.6.22 and the bug gets fixed
    differently.

    Signed-off-by: Zachary Amsden
    Cc: Andi Kleen
    Cc: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zachary Amsden
     

07 Mar, 2007

1 commit

  • This reverts commit 39d61db0edb34d60b83c5e0d62d0e906578cc707.

    The commit was buggy in multiple ways:
    - the conversion to ilog2() was incorrect to begin with
    - it tested the wrong #defines, so on all architectures but FRV you'd
    never see the bug except for constant arguments.
    - the new "get_order()" macro used its arguments multiple times, and
    didn't even parenthesize them properly
    - despite the comments, it was not true that you could use it for
    constant initializers, since not all architectures even use the
    generic page.h header file.

    All of the problems are individually fixable, but it all boils down to:
    better just revert it, and re-do it from scratch.
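
    For contrast with the macro problems listed above, a loop-based order
    calculation written as an inline function might look like the sketch
    below. SKETCH_PAGE_SHIFT (4 KiB pages) and the function name are
    assumptions made for illustration; this is not presented as the exact code
    restored by the revert.

```c
#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Smallest order such that (1 << order) pages cover "size" bytes.
 * As a function, the argument is evaluated exactly once and needs no
 * extra parentheses at the call site. */
static inline int sketch_get_order(unsigned long size)
{
	int order;

	size = (size - 1) >> (SKETCH_PAGE_SHIFT - 1);
	order = -1;
	do {
		size >>= 1;
		order++;
	} while (size);
	return order;
}
```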

    Cc: David Howells
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

13 Feb, 2007

2 commits

  • The VMI ROM has a mode where hypercalls can be queued and batched. This turns
    out to be a significant win during context switch, but must be done at a
    specific point before side effects to CPU state are visible to subsequent
    instructions. This is similar to the MMU batching hooks already provided.
    The same hooks could be used by the Xen backend to implement a context switch
    multicall.

    To explain a bit more about lazy modes in the paravirt patches, basically, the
    idea is that only one of lazy CPU or MMU mode can be active at any given time.
    Lazy MMU mode is similar to this lazy CPU mode, and allows for batching of
    multiple PTE updates (say, inside a remap loop), but to avoid keeping some
    kind of state machine about when to flush cpu or mmu updates, we just allow
    one or the other to be active. Although there is no real reason a more
    comprehensive scheme could not be implemented, there is also no demonstrated
    need for this extra complexity.

    Signed-off-by: Zachary Amsden
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Cc: Jeremy Fitzhardinge
    Cc: Rusty Russell
    Cc: Chris Wright
    Signed-off-by: Andrew Morton

    Zachary Amsden
     
  • This defines a simple and minimalist programming interface for GPIO APIs:

    - Documentation/gpio.txt ... describes things (read it)

    - include/asm-arm/gpio.h ... defines the ARM hook, which just punts
    to for any implementation

    - include/asm-generic/gpio.h ... implement "can sleep" variants as calling
    the normal ones, for systems that don't handle i2c expanders.
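
    The "can sleep" fallback described in the last item can be sketched as
    below. This is a user-space illustration under assumptions: the
    sketch_gpio_level array stands in for real hardware state, and the
    accessors are simplified versions of what an architecture might provide.

```c
#define SKETCH_NR_GPIOS 8

/* Assumption: an array stands in for the hardware pin state. */
static int sketch_gpio_level[SKETCH_NR_GPIOS];

/* "Normal" accessors, as an architecture without i2c expanders might
 * provide them. */
static inline int gpio_get_value(unsigned gpio)
{
	return sketch_gpio_level[gpio];
}

static inline void gpio_set_value(unsigned gpio, int value)
{
	sketch_gpio_level[gpio] = !!value;
}

/* "Can sleep" variants: with no sleeping GPIO controllers to handle,
 * they simply call the normal ones. */
static inline int gpio_get_value_cansleep(unsigned gpio)
{
	return gpio_get_value(gpio);
}

static inline void gpio_set_value_cansleep(unsigned gpio, int value)
{
	gpio_set_value(gpio, value);
}
```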

    The immediate need for such a cross-architecture API convention is to support
    drivers that work the same on AT91 ARM and AVR32 AP7000 chips, which embed many
    of the same controllers but have different CPUs. However, several other users
    have been reported, including a driver for a hardware watchdog chip and some
    handhelds.org multi-CPU button drivers.

    Signed-off-by: David Brownell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Brownell