27 Jul, 2008

2 commits

  • long overdue...

    Signed-off-by: Al Viro

    Al Viro
     
  • Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER
    architecture does:

    This enables us to cleanly fix the Calgary IOMMU issue that some devices
    are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423).

    I think that per-device dma_mapping_ops support would also be helpful for
    KVM people to support PCI passthrough, but Andi thinks that this makes it
    difficult to support the PCI passthrough (see the above thread). So I
    CC'ed this to the KVM camp. Comments are appreciated.

    A pointer to dma_mapping_ops is added to struct dev_archdata. If the
    pointer is non-NULL, DMA operations in asm/dma-mapping.h use it. If it's
    NULL, the system-wide dma_ops pointer is used as before.
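
    The lookup described above can be sketched in plain C. This is a
    simplified userspace model, not the kernel code: the structures carry only
    the fields needed here, and nommu_dma_ops is an assumed name for the
    system-wide default.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; only the fields
 * needed for the lookup are modelled. */
struct dma_mapping_ops {
    const char *name;
};

struct dev_archdata {
    struct dma_mapping_ops *dma_ops;    /* NULL: fall back to the global ops */
};

struct device {
    struct dev_archdata archdata;
};

/* System-wide default (assumed name), used when a device has no override. */
static struct dma_mapping_ops nommu_dma_ops = { "nommu" };
static struct dma_mapping_ops *dma_ops = &nommu_dma_ops;

/* Return the per-device ops if set, otherwise the system-wide pointer. */
static struct dma_mapping_ops *get_dma_ops(const struct device *dev)
{
    assert(dma_ops != NULL);            /* a global default must exist */
    if (dev == NULL || dev->archdata.dma_ops == NULL)
        return dma_ops;
    return dev->archdata.dma_ops;
}
```

    A device behind an IOMMU would have archdata.dma_ops pointing at that
    IOMMU's table, while a device that bypasses it keeps NULL and gets the
    default.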

    If it's useful for KVM people, I plan to implement a mechanism to register
    a hook called when a new pci (or dma capable) device is created (it works
    with hot plugging). It enables IOMMUs to set up an appropriate
    dma_mapping_ops per device.

    The major obstacle is that dma_mapping_error doesn't take a pointer to the
    device unlike other DMA operations. So x86 can't have dma_mapping_ops per
    device. Note all the POWER IOMMUs use the same dma_mapping_error function
    so this is not a problem for POWER but x86 IOMMUs use different
    dma_mapping_error functions.

    The first patch adds the device argument to dma_mapping_error. The patch
    is trivial but large since it touches lots of drivers and dma-mapping.h in
    all the architectures.

    This patch:

    dma_mapping_error() doesn't take a pointer to the device unlike other DMA
    operations. So we can't have dma_mapping_ops per device.

    Note that POWER already has dma_mapping_ops per device but all the POWER
    IOMMUs use the same dma_mapping_error function. x86 IOMMUs use different
    dma_mapping_error functions, so they need the device argument.
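
    Why the device argument matters can be illustrated with a small model.
    The two IOMMUs and their failure sentinels below are hypothetical, chosen
    only to show that the same address can be an error for one and valid for
    the other; the flattened struct device is a simplification as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;

struct device;

/* Simplified ops table: each IOMMU may supply its own error check. */
struct dma_mapping_ops {
    int (*mapping_error)(struct device *dev, dma_addr_t addr);
};

struct device {
    struct dma_mapping_ops *dma_ops;    /* stands in for archdata.dma_ops */
};

/* Hypothetical IOMMU A: a zero bus address signals failure. */
static int iommu_a_mapping_error(struct device *dev, dma_addr_t addr)
{
    (void)dev;
    return addr == 0;
}

/* Hypothetical IOMMU B: an all-ones bus address signals failure. */
static int iommu_b_mapping_error(struct device *dev, dma_addr_t addr)
{
    (void)dev;
    return addr == ~(dma_addr_t)0;
}

static struct dma_mapping_ops iommu_a_ops = { iommu_a_mapping_error };
static struct dma_mapping_ops iommu_b_ops = { iommu_b_mapping_error };

/* With the device argument the check is routed to the right IOMMU. */
static int dma_mapping_error(struct device *dev, dma_addr_t addr)
{
    assert(dev != NULL && dev->dma_ops != NULL);
    return dev->dma_ops->mapping_error(dev, addr);
}
```

    Without the device pointer there would be no way to tell which of the two
    checks applies to a given address.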

    [akpm@linux-foundation.org: fix sge]
    [akpm@linux-foundation.org: fix svc_rdma]
    [akpm@linux-foundation.org: build fix]
    [akpm@linux-foundation.org: fix bnx2x]
    [akpm@linux-foundation.org: fix s2io]
    [akpm@linux-foundation.org: fix pasemi_mac]
    [akpm@linux-foundation.org: fix sdhci]
    [akpm@linux-foundation.org: build fix]
    [akpm@linux-foundation.org: fix sparc]
    [akpm@linux-foundation.org: fix ibmvscsi]
    Signed-off-by: FUJITA Tomonori
    Cc: Muli Ben-Yehuda
    Cc: Andi Kleen
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Avi Kivity
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    FUJITA Tomonori
     

26 Jul, 2008

3 commits

  • * git://git.infradead.org/~dwmw2/random-2.6:
    remove dummy asm/kvm.h files
    firmware: create firmware binaries during 'make modules'.

    Linus Torvalds
     
  • This patch removes the dummy asm/kvm.h files on architectures not (yet)
    supporting KVM and uses the same conditional headers installation as
    already used for a.out.h .

    Also removed are superfluous install rules in the s390 and x86 Kbuild
    files (they are already in Kbuild.asm).

    Signed-off-by: Adrian Bunk
    Acked-by: Sam Ravnborg
    Signed-off-by: David Woodhouse

    Adrian Bunk
     
  • We duplicate alloc/free_thread_info defines on many platforms (the
    majority uses __get_free_pages/free_pages). This patch defines common
    defines and removes these duplicated defines.
    __HAVE_ARCH_THREAD_INFO_ALLOCATOR is introduced for platforms that do
    something different.
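
    The pattern can be sketched as follows. This is a userspace approximation
    under stated assumptions: malloc()/free() stand in for
    __get_free_pages()/free_pages(), and the thread_info layout is reduced to
    a single field.

```c
#include <stdlib.h>

struct thread_info {
    unsigned long flags;    /* reduced layout, for illustration only */
};

struct task_struct;         /* opaque here; the argument is unused */

/*
 * Common fallback: an architecture that needs something different defines
 * __HAVE_ARCH_THREAD_INFO_ALLOCATOR and supplies its own pair before this
 * point, which makes the preprocessor skip the generic definitions.
 */
#ifndef __HAVE_ARCH_THREAD_INFO_ALLOCATOR
static struct thread_info *alloc_thread_info(struct task_struct *tsk)
{
    (void)tsk;
    return malloc(sizeof(struct thread_info));
}

static void free_thread_info(struct thread_info *ti)
{
    free(ti);
}
#endif
```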

    Signed-off-by: FUJITA Tomonori
    Acked-by: Russell King
    Cc: Pekka Enberg
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    FUJITA Tomonori
     

25 Jul, 2008

6 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits)
    ide: use proper printk() KERN_* levels in ide-probe.c
    ide: fix for EATA SCSI HBA in ATA emulating mode
    ide: remove stale comments from drivers/ide/Makefile
    ide: enable local IRQs in all handlers for TASKFILE_NO_DATA data phase
    ide-scsi: remove kmalloced struct request
    ht6560b: remove old history
    ht6560b: update email address
    ide-cd: fix oops when using growisofs
    gayle: release resources on ide_host_add() failure
    palm_bk3710: add UltraDMA/100 support
    ide: trivial sparse annotations
    ide: ide-tape.c sparse annotations and unaligned access removal
    ide: drop 'name' parameter from ->init_chipset method
    ide: prefix messages from IDE PCI host drivers by driver name
    it821x: remove DECLARE_ITE_DEV() macro
    it8213: remove DECLARE_ITE_DEV() macro
    ide: include PCI device name in messages from IDE PCI host drivers
    ide: remove for some archs
    ide-generic: remove ide_default_{io_base,irq}() inlines (take 3)
    ide-generic: is no longer needed on ppc32
    ...

    Linus Torvalds
     
  • * Remove include from ( includes
    which is enough).

    * Remove for alpha/blackfin/h8300/ia64/m32r/sh/x86/xtensa
    (this leaves us with arm/frv/m68k/mips/mn10300/parisc/powerpc/sparc[64]).

    There should be no functional changes caused by this patch.

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • Replace ide_default_{io_base,irq}() inlines by legacy_{bases,irqs}[].

    v2:
    Add missing zero-ing of hws[] (caught during testing by Borislav Petkov).

    v3:
    Fix zeroing of hws[] for _real_ this time.

    There should be no functional changes caused by this patch.

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • * 'semaphore' of git://git.kernel.org/pub/scm/linux/kernel/git/willy/misc:
    Remove __DECLARE_SEMAPHORE_GENERIC
    Remove asm/semaphore.h
    Remove use of asm/semaphore.h
    Add missing semaphore.h includes
    Remove mention of semaphores from kernel-locking

    Linus Torvalds
     
  • This patch is by far the most complex in the series. It adds a new syscall
    paccept. This syscall differs from accept in that it adds (at the userlevel)
    two additional parameters:

    - a signal mask
    - a flags value

    The flags parameter can be used to set flags like SOCK_CLOEXEC. This is
    implemented here as well. Some people argued that this is a property which
    should be inherited from the file descriptor for the server but this is against
    POSIX. Additionally, we really want the signal mask parameter as well
    (similar to pselect, ppoll, etc). So an interface change is inevitable.

    The flag value is the same as for socket and socketpair. I think diverging
    here will only create confusion. Similar to the filesystem interfaces where
    the use of the O_* constants differs, it is acceptable here.

    The signal mask is handled as for pselect etc. The mask is temporarily
    installed for the thread and removed before the call returns. I modeled the
    code after pselect. If there is a problem it's likely also in pselect.
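
    The install-then-restore behaviour can be approximated in userspace. The
    helper below is a hypothetical sketch, not the kernel code, and it is
    deliberately weaker than a real p*-style syscall: a signal can arrive
    between the mask switch and the call, which is the window the in-kernel
    handling closes. sigprocmask() is used to keep the sketch free of link
    flags; a threaded program would use pthread_sigmask().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/*
 * Run fn(arg) with `mask` temporarily installed as the signal mask and
 * restore the previous mask before returning. NOT atomic, unlike a
 * syscall that takes the mask as a parameter.
 */
static int call_with_sigmask(const sigset_t *mask,
                             int (*fn)(void *), void *arg)
{
    sigset_t saved;
    int ret;

    if (mask == NULL)
        return fn(arg);                 /* no mask given: plain call */

    if (sigprocmask(SIG_SETMASK, mask, &saved) != 0)
        return -1;
    ret = fn(arg);                      /* e.g. a wrapper around accept() */
    sigprocmask(SIG_SETMASK, &saved, NULL);
    return ret;
}

/* Example callback: reports whether SIGUSR1 is currently blocked. */
static int usr1_is_blocked(void *unused)
{
    sigset_t cur;

    (void)unused;
    sigprocmask(SIG_SETMASK, NULL, &cur);   /* query only */
    return sigismember(&cur, SIGUSR1);
}
```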

    For architectures which use socketcall I maintained this interface instead of
    adding a system call. The symmetry shouldn't be broken.

    The following test must be adjusted for architectures other than x86 and
    x86-64 and in case the syscall numbers changed.

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
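    #include <errno.h>
    #include <fcntl.h>
    #include <pthread.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <sys/syscall.h>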

    #ifndef __NR_paccept
    # ifdef __x86_64__
    # define __NR_paccept 288
    # elif defined __i386__
    # define SYS_PACCEPT 18
    # define USE_SOCKETCALL 1
    # else
    # error "need __NR_paccept"
    # endif
    #endif

    #ifdef USE_SOCKETCALL
    # define paccept(fd, addr, addrlen, mask, flags) \
    ({ long args[6] = { \
    (long) fd, (long) addr, (long) addrlen, (long) mask, 8, (long) flags }; \
    syscall (__NR_socketcall, SYS_PACCEPT, args); })
    #else
    # define paccept(fd, addr, addrlen, mask, flags) \
    syscall (__NR_paccept, fd, addr, addrlen, mask, 8, flags)
    #endif

    #define PORT 57392

    #define SOCK_CLOEXEC O_CLOEXEC

    static pthread_barrier_t b;

    static void *
    tf (void *arg)
    {
    pthread_barrier_wait (&b);
    int s = socket (AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in sin;
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
    sin.sin_port = htons (PORT);
    connect (s, (const struct sockaddr *) &sin, sizeof (sin));
    close (s);

    pthread_barrier_wait (&b);
    s = socket (AF_INET, SOCK_STREAM, 0);
    sin.sin_port = htons (PORT);
    connect (s, (const struct sockaddr *) &sin, sizeof (sin));
    close (s);
    pthread_barrier_wait (&b);

    pthread_barrier_wait (&b);
    sleep (2);
    pthread_kill ((pthread_t) arg, SIGUSR1);

    return NULL;
    }

    static void
    handler (int s)
    {
    }

    int
    main (void)
    {
    pthread_barrier_init (&b, NULL, 2);

    struct sockaddr_in sin;
    pthread_t th;
    if (pthread_create (&th, NULL, tf, (void *) pthread_self ()) != 0)
    {
    puts ("pthread_create failed");
    return 1;
    }

    int s = socket (AF_INET, SOCK_STREAM, 0);
    int reuse = 1;
    setsockopt (s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof (reuse));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
    sin.sin_port = htons (PORT);
    bind (s, (struct sockaddr *) &sin, sizeof (sin));
    listen (s, SOMAXCONN);

    pthread_barrier_wait (&b);

    int s2 = paccept (s, NULL, 0, NULL, 0);
    if (s2 < 0)
    {
    puts ("paccept(0) failed");
    return 1;
    }

    int coe = fcntl (s2, F_GETFD);
    if (coe & FD_CLOEXEC)
    {
    puts ("paccept(0) set close-on-exec-flag");
    return 1;
    }
    close (s2);

    pthread_barrier_wait (&b);

    s2 = paccept (s, NULL, 0, NULL, SOCK_CLOEXEC);
    if (s2 < 0)
    {
    puts ("paccept(SOCK_CLOEXEC) failed");
    return 1;
    }

    coe = fcntl (s2, F_GETFD);
    if ((coe & FD_CLOEXEC) == 0)
    {
    puts ("paccept(SOCK_CLOEXEC) does not set close-on-exec flag");
    return 1;
    }
    close (s2);

    pthread_barrier_wait (&b);

    struct sigaction sa;
    sa.sa_handler = handler;
    sa.sa_flags = 0;
    sigemptyset (&sa.sa_mask);
    sigaction (SIGUSR1, &sa, NULL);

    sigset_t ss;
    pthread_sigmask (SIG_SETMASK, NULL, &ss);
    sigaddset (&ss, SIGUSR1);
    pthread_sigmask (SIG_SETMASK, &ss, NULL);

    sigdelset (&ss, SIGUSR1);
    alarm (4);
    pthread_barrier_wait (&b);

    errno = 0;
    s2 = paccept (s, NULL, 0, &ss, 0);
    if (s2 != -1 || errno != EINTR)
    {
    puts ("paccept did not fail with EINTR");
    return 1;
    }

    close (s);

    puts ("OK");

    return 0;
    }
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    [akpm@linux-foundation.org: make it compile]
    [akpm@linux-foundation.org: add sys_ni stub]
    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Cc:
    Cc: "David S. Miller"
    Cc: Roland McGrath
    Cc: Kyle McMartin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     
  • On 32-bit architectures PAGE_ALIGN() truncates 64-bit values to the 32-bit
    boundary. For example:

    u64 val = PAGE_ALIGN(size);

    always returns a value < 4GB even if size is greater than 4GB.

    The problem resides in PAGE_MASK definition (from include/asm-x86/page.h for
    example):

    #define PAGE_SHIFT 12
    #define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
    #define PAGE_MASK (~(PAGE_SIZE-1))
    ...
    #define PAGE_ALIGN(addr) (((addr)+PAGE_SIZE-1)&PAGE_MASK)

    The "~" is performed on a 32-bit value, so everything in "and" with
    PAGE_MASK greater than 4GB will be truncated to the 32-bit boundary.
    Using the ALIGN() macro seems to be the right way, because it uses
    typeof(addr) for the mask.
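
    The truncation can be reproduced on any host by modelling the 32-bit
    unsigned long with uint32_t. The _OLD/_NEW suffixes and the ALIGN_MASK
    helper name are chosen for this sketch; __typeof__ is a GCC/Clang
    extension.

```c
#include <stdint.h>

#define PAGE_SHIFT 12

/* uint32_t stands in for a 32-bit "unsigned long". */
#define PAGE_SIZE ((uint32_t)1 << PAGE_SHIFT)
#define PAGE_MASK (~(PAGE_SIZE - 1))        /* 32-bit value 0xfffff000 */

/* Old definition: the 32-bit mask is zero-extended, clearing bits 32..63. */
#define PAGE_ALIGN_OLD(addr) (((addr) + PAGE_SIZE - 1) & PAGE_MASK)

/* ALIGN()-style definition: the mask is built in the operand's own type. */
#define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
#define ALIGN(x, a) ALIGN_MASK(x, (__typeof__(x))(a) - 1)
#define PAGE_ALIGN_NEW(addr) ALIGN(addr, PAGE_SIZE)

static uint64_t page_align_old(uint64_t size)
{
    return PAGE_ALIGN_OLD(size);
}

static uint64_t page_align_new(uint64_t size)
{
    return PAGE_ALIGN_NEW(size);
}
```

    For a size of 4GB + 1 byte the old macro yields 0x1000, while the
    ALIGN()-based one yields 0x100001000. Below 4GB the two agree.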

    Also move the PAGE_ALIGN() definitions out of include/asm-*/page.h into
    include/linux/mm.h.

    See also lkml discussion: http://lkml.org/lkml/2008/6/11/237

    [akpm@linux-foundation.org: fix drivers/media/video/uvc/uvc_queue.c]
    [akpm@linux-foundation.org: fix v850]
    [akpm@linux-foundation.org: fix powerpc]
    [akpm@linux-foundation.org: fix arm]
    [akpm@linux-foundation.org: fix mips]
    [akpm@linux-foundation.org: fix drivers/media/video/pvrusb2/pvrusb2-dvb.c]
    [akpm@linux-foundation.org: fix drivers/mtd/maps/uclinux.c]
    [akpm@linux-foundation.org: fix powerpc]
    Signed-off-by: Andrea Righi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Righi
     

24 Jul, 2008

1 commit


26 Jun, 2008

2 commits


24 Jun, 2008

1 commit

  • Commit 9267b4b3880d00dc2dab90f1d817c856939114f7 ("alpha: fix module load
    failures on smp (bug #10926)") causes a regression for my ev4
    uniprocessor build:

    CC arch/alpha/mm/init.o
    /export/data/repositories/linux-2.6/arch/alpha/mm/init.c:34: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘typeof’
    make[2]: *** [arch/alpha/mm/init.o] Error 1
    make[1]: *** [arch/alpha/mm] Error 2
    make: *** [sub-make] Error 2

    This fixes it for me (compile and boot tested):

    Signed-off-by: Thorsten Kranzkowski
    Acked-by: Ivan Kokshaysky
    Signed-off-by: Linus Torvalds

    Thorsten Kranzkowski
     

21 Jun, 2008

2 commits

  • Vast majority of these build failures are gcc-4.3 warnings
    about static functions and objects being referenced from
    non-static (read: "extern inline") functions, in conjunction
    with our -Werror.

    We cannot just convert "extern inline" to "static inline",
    as people keep suggesting all the time, because "extern inline"
    logic is crucial for generic kernel build.
    So
    - just make sure that all callees of critical "extern inline"
    functions are also "extern inline";
    - use "static inline", wherever it's possible.

    traps.c: work around gcc-4.3 being too smart about array
    bounds-checking.

    TODO: add "gnu_inline" attribute to all our "extern inline"
    functions to ensure desired behaviour with future compilers.

    Signed-off-by: Ivan Kokshaysky
    Signed-off-by: Linus Torvalds

    Ivan Kokshaysky
     
  • To calculate addresses of locally defined variables, GCC uses a 32-bit
    displacement from the GP. This doesn't work for per-cpu variables in
    modules, as the offset to the kernel per-cpu area is way above 4G.

    The workaround is to force allocation of a GOT entry for per cpu variable
    using ldq instruction with a 'literal' relocation.
    I had to use custom asm/percpu.h, as a required argument magic doesn't
    work with asm-generic/percpu.h macros.

    Signed-off-by: Ivan Kokshaysky
    Signed-off-by: Linus Torvalds

    Ivan Kokshaysky
     

15 May, 2008

3 commits

  • I noticed this because alpha was broken due to the recent commit
    bdc807871d58285737d50dc6163d0feb72cb0dc2 ("avoid overflows in
    kernel/time.c"). Most arches do something like this in their
    asm/param.h:

    #ifdef __KERNEL__
    # define HZ CONFIG_HZ
    #else
    # define HZ 100
    #endif

    A few arches though (namely alpha/h8300/um/v850/xtensa) either do not set
    HZ at all for !__KERNEL__, or they set it wrongly. This should bring all
    arches in line by setting up HZ for userspace.

    Without this currently perl 5.10 doesn't build on alpha:

    perl.c: In function 'perl_construct':
    perl.c:388: error: 'CONFIG_HZ' undeclared (first use in this function)
    -> http://buildd.debian.org/fetch.cgi?pkg=perl;ver=5.10.0-10;arch=alpha;stamp=1210252894

    Signed-off-by: Mike Frysinger
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Yoshinori Sato
    Cc: Jeff Dike
    Cc: Chris Zankel
    Cc: maximilian attems
    Signed-off-by: Andrew Morton
    [ HZ on alpha is 1024 for historical reasons. - Linus ]
    Signed-off-by: Linus Torvalds

    Mike Frysinger
     
  • There is a possible data race in the page table walking code. After the split
    ptlock patches, it actually seems to have been introduced to the core code, but
    even before that I think it would have impacted some architectures (powerpc
    and sparc64, at least, walk the page tables without taking locks eg. see
    find_linux_pte()).

    The race is as follows:
    The pte page is allocated, zeroed, and its struct page gets its spinlock
    initialized. The mm-wide ptl is then taken, and then the pte page is inserted
    into the pagetables.

    At this point, the spinlock is not guaranteed to have ordered the previous
    stores to initialize the pte page with the subsequent store to put it in the
    page tables. So another Linux page table walker might be walking down (without
    any locks, because we have split-leaf-ptls), and find that new pte we've
    inserted. It might try to take the spinlock before the store from the other
    CPU initializes it. And subsequently it might read a pte_t out before stores
    from the other CPU have cleared the memory.

    There are also similar races in higher levels of the page tables. They
    obviously don't involve the spinlock, but could see uninitialized memory.

    Arch code and hardware pagetable walkers that walk the pagetables without
    locks could see similar uninitialized memory problems, regardless of whether
    split ptes are enabled or not.

    I prefer to put the barriers in core code, because that's where the higher
    level logic happens, but the page table accessors are per-arch, and open-coding
    them everywhere I don't think is an option. I'll put the read-side barriers
    in alpha arch code for now (other architectures perform data-dependent loads
    in order).
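
    The ordering requirement has a close userspace analogue in C11 atomics.
    The sketch below is only that analogue, not the kernel change: pmd_slot,
    publish_pte_page() and walk() are hypothetical names, and the kernel uses
    its own barrier primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define PTRS_PER_PTE 512    /* illustrative size */

struct pte_page {
    int lock;                           /* stands in for the split ptlock */
    unsigned long pte[PTRS_PER_PTE];
};

/* One upper-level slot that lockless walkers read. */
static _Atomic(struct pte_page *) pmd_slot;

/*
 * Writer: initialise the page fully, then publish it with release
 * ordering so the initialising stores cannot become visible after the
 * store that makes the page reachable.
 */
static void publish_pte_page(struct pte_page *new_page)
{
    memset(new_page, 0, sizeof(*new_page));
    atomic_store_explicit(&pmd_slot, new_page, memory_order_release);
}

/*
 * Reader: the acquire load pairs with the release store, so a walker
 * that sees the pointer also sees the zeroed entries and the lock.
 */
static struct pte_page *walk(void)
{
    return atomic_load_explicit(&pmd_slot, memory_order_acquire);
}
```

    A plain store in publish_pte_page() would allow exactly the reordering
    described above on weakly ordered machines.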

    Signed-off-by: Nick Piggin
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • read_barrier_depends has always been a noop (not a compiler barrier) on all
    architectures except SMP alpha. This brings UP alpha and frv into line with all
    other architectures, and fixes incorrect documentation.

    Signed-off-by: Nick Piggin
    Acked-by: Paul E. McKenney
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

05 May, 2008

1 commit

  • This patch fixes the following compile error on alpha caused by
    commit 3726c23df8e4d95b6f2b335dfa90e3f4850a8a00
    (alpha: types: use for the alpha architecture):

    ...
    CC arch/alpha/kernel/asm-offsets.s
    In file included from include2/asm/topology.h:6,
    from /home/bunk/linux/kernel-2.6/git/linux-2.6/include/linux/topology.h:34,
    from /home/bunk/linux/kernel-2.6/git/linux-2.6/include/linux/mmzone.h:683,
    from /home/bunk/linux/kernel-2.6/git/linux-2.6/include/linux/gfp.h:4,
    from /home/bunk/linux/kernel-2.6/git/linux-2.6/include/linux/slab.h:12,
    from /home/bunk/linux/kernel-2.6/git/linux-2.6/include/linux/percpu.h:5,
    from /home/bunk/linux/kernel-2.6/git/linux-2.6/include/linux/rcupdate.h:39,
    from /home/bunk/linux/kernel-2.6/git/linux-2.6/include/linux/pid.h:4,
    from /home/bunk/linux/kernel-2.6/git/linux-2.6/include/linux/sched.h:74,
    from /home/bunk/linux/kernel-2.6/git/linux-2.6/arch/alpha/kernel/asm-offsets.c:9:
    include2/asm/machvec.h:44: error: expected declaration specifiers or '...' before 'dma_addr_t'
    include2/asm/machvec.h:44: error: expected declaration specifiers or '...' before 'dma_addr_t'
    In file included from /home/bunk/linux/kernel-2.6/git/linux-2.6/arch/alpha/kernel/asm-offsets.c:12:
    include2/asm/io.h:94: warning: type defaults to 'int' in declaration of 'dma_addr_t'
    include2/asm/io.h:94: warning: variable 'dma_addr_t' declared 'inline'
    include2/asm/io.h:94: error: expected ',' or ';' before 'isa_page_to_bus'
    make[2]: *** [arch/alpha/kernel/asm-offsets.s] Error 1

    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Signed-off-by: H. Peter Anvin

    Adrian Bunk
     

03 May, 2008

1 commit


29 Apr, 2008

1 commit

  • Unaligned access is ok for the following arches:
    cris, m68k, mn10300, powerpc, s390, x86

    Arches that use the memmove implementation for native endian, and
    the byteshifting for the opposite endianness:
    h8300, m32r, xtensa

    Packed struct for native endian, byteshifting for other endian:
    alpha, blackfin, ia64, parisc, sparc, sparc64, mips, sh

    m68knommu is generic_be for Coldfire, otherwise unaligned access is ok.

    frv and arm choose endianness based on compiler settings and use the
    byteshifting versions. Remove the unaligned trap handler from frv as it is
    now unused.

    v850 is le, uses the byteshifting versions for both be and le.

    Remove the now unused asm-generic implementation.
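
    The three strategies named above can be sketched side by side. The
    function names are made up for this illustration, memcpy() stands in for
    the memmove-based variant, and __attribute__((packed)) is a GCC/Clang
    extension.

```c
#include <stdint.h>
#include <string.h>

/* Byteshifting: endian-explicit, safe on any host and any alignment. */
static uint32_t get_unaligned_le32_byteshift(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Copy-based: native endianness only; the copy hides the misalignment. */
static uint32_t get_unaligned_native_copy(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/* Packed struct: native endianness; the compiler emits a safe access. */
struct una_u32 {
    uint32_t x;
} __attribute__((packed));

static uint32_t get_unaligned_native_packed(const void *p)
{
    const struct una_u32 *ptr = p;

    return ptr->x;
}
```

    The two native-endian variants always agree with each other; they match
    the byteshifting result only on a little-endian host.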

    Signed-off-by: Harvey Harrison
    Acked-by: David S. Miller
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Harvey Harrison
     

28 Apr, 2008

3 commits

  • Fix things like this:

    security/selinux/netnode.c: In function 'sel_netnode_find':
    security/selinux/netnode.c:126: warning: 'idx' may be used uninitialized in this function
    security/selinux/netnode.c: In function 'sel_netnode_sid':
    security/selinux/netnode.c:225: warning: 'ret' may be used uninitialized in this function
    security/selinux/netnode.c:168: warning: 'idx' may be used uninitialized in this function

    due to code correctly not expecting BUG() to return.

    For some reason this reduces the object code size for that particular file.

    Cc: Ivan Kokshaysky
    Cc: Richard Henderson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Signed-off-by: Harvey Harrison
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Harvey Harrison
     
  • s390 for one, cannot implement VM_MIXEDMAP with pfn_valid, due to their memory
    model (which is more dynamic than most). Instead, they had proposed to
    implement it with an additional path through vm_normal_page(), using a bit in
    the pte to determine whether or not the page should be refcounted:

    vm_normal_page()
    {
    ...
    if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
    if (vma->vm_flags & VM_MIXEDMAP) {
    #ifdef s390
    if (!mixedmap_refcount_pte(pte))
    return NULL;
    #else
    if (!pfn_valid(pfn))
    return NULL;
    #endif
    goto out;
    }
    ...
    }

    This is fine, however if we are allowed to use a bit in the pte to determine
    refcountedness, we can use that to _completely_ replace all the vma based
    schemes. So instead of adding more cases to the already complex vma-based
    scheme, we can have a clearly separate and simple pte-based scheme (and get
    slightly better code generation in the process):

    vm_normal_page()
    {
    #ifdef s390
    if (!mixedmap_refcount_pte(pte))
    return NULL;
    return pte_page(pte);
    #else
    ...
    #endif
    }

    And finally, we may rather make this concept usable by any architecture rather
    than making it s390 only, so implement a new type of pte state for this.
    Unfortunately the old vma based code must stay, because some architectures may
    not be able to spare pte bits. This makes vm_normal_page a little bit more
    ugly than we would like, but the 2 cases are clearly separate.

    So introduce a pte_special pte state, and use it in mm/memory.c. It is
    currently a noop for all architectures, so this doesn't actually result in any
    compiled code changes to mm/memory.o.
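
    A toy model of the pte-based scheme is shown below. Everything in it is
    an assumption made for illustration: the pte layout, the position of the
    special bit, and the 16-entry mem_map. It only shows the shape of the
    check, not any architecture's real encoding.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pte_t;

struct page {
    int refcount;
};

#define PTE_SPECIAL_BIT ((pte_t)1 << 0)     /* hypothetical bit position */
#define PFN_SHIFT 12
#define NR_PAGES 16

static struct page mem_map[NR_PAGES];

static int pte_special(pte_t pte)
{
    return (pte & PTE_SPECIAL_BIT) != 0;
}

static pte_t pte_mkspecial(pte_t pte)
{
    return pte | PTE_SPECIAL_BIT;
}

/* A special pte has no refcounted struct page behind it. */
static struct page *vm_normal_page(pte_t pte)
{
    uint64_t pfn = pte >> PFN_SHIFT;

    if (pte_special(pte))
        return NULL;
    if (pfn >= NR_PAGES)
        return NULL;                        /* outside the toy mem_map */
    return &mem_map[pfn];
}
```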

    BTW:
    I haven't put vm_normal_page() into arch code as-per an earlier suggestion.
    The reason is that, regardless of where vm_normal_page is actually
    implemented, the *abstraction* is still exactly the same. Also, while it
    depends on whether the architecture has pte_special or not, those are the
    only two possible cases, and it really isn't an arch specific function --
    the role of the arch code should be to provide primitive functions and
    accessors with which to build the core code; pte_special does that. We do
    not want architectures to know or care about vm_normal_page itself, and
    we definitely don't want them being able to invent something new there
    out of sight of mm/ code. If we made vm_normal_page an arch function, then
    we have to make vm_insert_mixed (next patch) an arch function too. So I
    don't think moving it to arch code fundamentally improves any abstractions,
    while it does practically make the code more difficult to follow, for both
    mm and arch developers, and easier to misuse.

    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Nick Piggin
    Acked-by: Carsten Otte
    Cc: Jared Hulbert
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

27 Apr, 2008

1 commit

  • Implement __fls on all 64-bit archs:

    alpha has an implementation of fls64.
    Added __fls(x) = fls64(x) - 1.

    ia64 has fls, but not __fls.
    Added __fls based on code of fls.

    mips and powerpc have __ilog2, which is the same as __fls.
    Added __fls = __ilog2.

    parisc, s390, sh and sparc64:
    Include generic __fls.

    x86_64 already has __fls.
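
    The relation between the two helpers can be checked with a portable
    sketch. The loop-based fls64() is a generic stand-in rather than any
    architecture's optimized version, and word_fls() plays the role of __fls
    under a name that avoids reserved identifiers.

```c
#include <stdint.h>

/* fls64(): 1-based index of the most significant set bit, 0 for x == 0. */
static int fls64(uint64_t x)
{
    int r = 0;

    while (x) {
        r++;
        x >>= 1;
    }
    return r;
}

/* Plays the role of __fls(): 0-based index; undefined for x == 0. */
static unsigned long word_fls(uint64_t x)
{
    return (unsigned long)(fls64(x) - 1);
}
```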

    Signed-off-by: Alexander van Heukelum
    Signed-off-by: Ingo Molnar

    Alexander van Heukelum
     

20 Apr, 2008

1 commit

  • Create a simple macro to always return a pointer to the node_to_cpumask(node)
    value. This relies on compiler optimization to remove the extra indirection:

    #define node_to_cpumask_ptr(v, node) \
    cpumask_t _##v = node_to_cpumask(node), *v = &_##v

    For systems with a large cpumask size, a true pointer
    to the array element can be used:

    #define node_to_cpumask_ptr(v, node) \
    cpumask_t *v = &(node_to_cpumask_map[node])

    A node_to_cpumask_ptr_next() macro is provided to access another
    node_to_cpumask value.
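
    Both variants can be tried in userspace. In this sketch the cpumask_t
    layout and table size are assumptions, and the _copy/_direct suffixes are
    added only so that the two definitions can coexist in one file.

```c
#define MAX_NUMNODES 4

/* Assumed layout; the real cpumask_t size depends on NR_CPUS. */
typedef struct {
    unsigned long bits[2];
} cpumask_t;

static cpumask_t node_to_cpumask_map[MAX_NUMNODES];

/* By-value accessor, as on systems with a small cpumask. */
static cpumask_t node_to_cpumask(int node)
{
    return node_to_cpumask_map[node];
}

/* Variant 1: declare a local copy _v and a pointer v to that copy. */
#define node_to_cpumask_ptr_copy(v, node) \
    cpumask_t _##v = node_to_cpumask(node), *v = &_##v

/* Variant 2: point straight at the array element (large cpumasks). */
#define node_to_cpumask_ptr_direct(v, node) \
    cpumask_t *v = &(node_to_cpumask_map[node])
```

    Callers use v the same way in both cases; only the second avoids copying
    the mask onto the stack.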

    The other change is to always include asm-generic/topology.h moving the
    ifdef CONFIG_NUMA to this same file.

    Note: there are no references to either of these new macros in this patch,
    only the definition.

    Based on 2.6.25-rc5-mm1

    # alpha
    Cc: Richard Henderson

    # fujitsu
    Cc: David Howells

    # ia64
    Cc: Tony Luck

    # powerpc
    Cc: Paul Mackerras
    Cc: Anton Blanchard

    # sparc
    Cc: David S. Miller
    Cc: William L. Irwin

    # x86
    Cc: H. Peter Anvin

    Signed-off-by: Mike Travis
    Signed-off-by: Ingo Molnar

    Mike Travis
     

18 Apr, 2008

4 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (58 commits)
    ide: remove ide_init_default_irq() macro
    ide: move default IDE ports setup to ide_generic host driver
    ide: remove obsoleted "idex=noprobe" kernel parameter (take 2)
    ide: remove needless hwif->irq check from ide_hwif_configure()
    ide: init hwif->{io_ports,irq} explicitly in legacy VLB host drivers
    ide: limit legacy VLB host drivers to alpha, x86 and mips
    cmd640: init hwif->{io_ports,irq} explicitly
    cmd640: cleanup setup_device_ptrs()
    ide: add ide-4drives host driver (take 3)
    ide: remove ppc ifdef from init_ide_data()
    ide: remove ide_default_io_ctl() macro
    ide: remove CONFIG_IDE_ARCH_OBSOLETE_INIT
    ide: add CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS (take 2)
    ppc/pmac: remove no longer needed IDE quirk
    ppc: don't include
    ppc: remove ppc_ide_md
    ppc/pplus: remove ppc_ide_md.ide_init_hwif hook
    ppc/sandpoint: remove ppc_ide_md hooks
    ppc/lopec: remove ppc_ide_md hooks
    ppc/mpc8xx: remove ppc_ide_md hooks
    ...

    Linus Torvalds
     
  • * Use ide_default_irq() instead of ide_init_default_irq() in
    ide_generic host driver (so the correct IRQ is always set
    regardless of CONFIG_PCI / CONFIG_BLK_DEV_IDEPCI).

    * Remove no longer needed ide_init_default_irq() macro.

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • It is always == '((base) + 0x206)' if CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS=y
    and it is not needed otherwise (arm, blackfin, parisc, ppc64, sh, sparc[64]).

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • * Add CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS to drivers/ide/Kconfig and use
    it instead of defining IDE_ARCH_OBSOLETE_DEFAULTS in .

    v2:
    * Define ide_default_irq() in ide-probe.c/ns87415.c if not already defined
    and drop defining ide_default_irq() for CONFIG_IDE_ARCH_OBSOLETE_DEFAULTS=n.

    [ Thanks to Stephen Rothwell and David Miller for noticing the problem. ]

    Cc: Stephen Rothwell
    Cc: David Miller
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     

17 Apr, 2008

1 commit

  • Semaphores are no longer performance-critical, so a generic C
    implementation is better for maintainability, debuggability and
    extensibility. Thanks to Peter Zijlstra for fixing the lockdep
    warning. Thanks to Harvey Harrison for pointing out that the
    unlikely() was unnecessary.

    Signed-off-by: Matthew Wilcox
    Acked-by: Ingo Molnar

    Matthew Wilcox
     

03 Apr, 2008

3 commits

  • A nasty compile error:

    In file included from security/keys/internal.h:16,
    from security/keys/sysctl.c:14:
    include/linux/key-ui.h: In function 'key_permission':
    include/linux/key-ui.h:51: error: invalid use of undefined type 'struct task_struct'

    apparently the compiler has decided that it needs to know sizeof(task_struct)
    so that it can add zero to a task_struct* (which is rather dumb of it).

    Getting task_struct in scope in these deeply-nested headers is scary-looking,
    so let's just remove the "+ 0".

    Cc: David Howells
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Make dma_alloc_coherent respect gfp flags (__GFP_COMP is one that
    matters).

    Signed-off-by: Ivan Kokshaysky
    Tested-by: Michael Cree
    Cc: Richard Henderson
    Cc: Jaroslav Kysela
    Cc: Takashi Iwai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ivan Kokshaysky
     
  • Currently include/linux/kvm.h is not considered by make headers_install,
    because Kbuild cannot handle "unifdef-$(CONFIG_FOO) += foo.h". This problem
    was introduced by

    commit fb56dbb31c4738a3918db81fd24da732ce3b4ae6
    Author: Avi Kivity
    Date: Sun Dec 2 10:50:06 2007 +0200

    KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM

    Currently, make headers_check barfs due to , which
    includes, not existing. Rather than add a zillion s, export kvm.
    only if the arch actually supports it.

    Signed-off-by: Avi Kivity

    which makes this a 2.6.25 regression.

    One way of solving the issue is to enhance Kbuild, but Avi and David convinced
    me that changing headers_install is not the way to go. This patch changes
    the definition for linux/kvm.h to unifdef-y.

    If unifdef-y is used for linux/kvm.h "make headers_check" will fail on all
    architectures without asm/kvm.h. Therefore, this patch also provides
    asm/kvm.h on all architectures.

    Signed-off-by: Christian Borntraeger
    Acked-by: Avi Kivity
    Cc: Sam Ravnborg
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christian Borntraeger
     

09 Feb, 2008

4 commits

  • Background: I've implemented 1K/2K page tables for s390. These sub-page
    page tables are required to properly support the s390 virtualization
    instruction with KVM. The SIE instruction requires that the page tables
    have 256 page table entries (pte) followed by 256 page status table entries
    (pgste). The pgstes are only required if the process is using the SIE
    instruction. The pgstes are updated by the hardware and by the hypervisor
    for a number of reasons, one of them is dirty and reference bit tracking.
    To avoid wasting memory the standard pte table allocation should return
    1K/2K (31/64 bit) and 2K/4K if the process is using SIE.

    Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means
    the s390 version for pte_alloc_one cannot return a pointer to a struct
    page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one
    cannot return a pointer to a pte either, since that would require more than
    32 bit for the return value of pte_alloc_one (and the pte * would not be
    accessible since it's not kmapped).

    Solution: The only solution I found to this dilemma is a new typedef: a
    pgtable_t. For s390 pgtable_t will be a (pte *) - to be introduced with a
    later patch. For everybody else it will be a (struct page *). The
    additional problem with the initialization of the ptl lock and the
    NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and
    a destructor pgtable_page_dtor. The page table allocation and free
    functions need to call these two whenever a page table page is allocated or
    freed. pmd_populate will get a pgtable_t instead of a struct page pointer.
    To get the pgtable_t back from a pmd entry that has been installed with
    pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page
    call in free_pte_range and apply_to_pte_range.
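
    The relationship between the typedef, the constructor/destructor pair and
    pmd_pgtable can be sketched as a small user-space model. This is only an
    illustration of the generic (struct page *) case: struct page, pmd_t and
    the nr_pagetable counter are hypothetical stand-ins, and the function
    signatures are simplified compared to the kernel's (no mm or address
    arguments).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct page { int ptl_initialized; };

/* Generic case: pgtable_t is a struct page pointer (s390 would use a pte pointer). */
typedef struct page *pgtable_t;

/* Models the pmd entry as a plain holder for the page table handle. */
typedef struct { pgtable_t table; } pmd_t;

/* Models the NR_PAGETABLE accounting. */
static long nr_pagetable;

/* Constructor: set up the (modelled) ptl lock and account the page. */
static void pgtable_page_ctor(struct page *page)
{
    page->ptl_initialized = 1;
    nr_pagetable++;
}

/* Destructor: undo what the constructor did. */
static void pgtable_page_dtor(struct page *page)
{
    page->ptl_initialized = 0;
    nr_pagetable--;
}

/* Allocation calls the constructor whenever a page table page is created. */
static pgtable_t pte_alloc_one(void)
{
    struct page *page = calloc(1, sizeof(*page));

    if (!page)
        return NULL;
    pgtable_page_ctor(page);
    return page;
}

/* Freeing calls the destructor before releasing the page. */
static void pte_free(pgtable_t pte)
{
    assert(pte != NULL);
    pgtable_page_dtor(pte);
    free(pte);
}

/* Install a page table into a pmd entry. */
static void pmd_populate(pmd_t *pmd, pgtable_t pte)
{
    pmd->table = pte;
}

/* Get back the pgtable_t that pmd_populate installed. */
static pgtable_t pmd_pgtable(pmd_t pmd)
{
    return pmd.table;
}
```

    In this model a caller allocates with pte_alloc_one(), installs the
    result with pmd_populate(), and later frees whatever pmd_pgtable()
    returns, without ever needing to know whether pgtable_t is a page or a
    pte pointer.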

    Signed-off-by: Martin Schwidefsky
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky
     
  • When the conversion factor between jiffies and milli- or microseconds is
    not a single multiply or divide, as for the case of HZ == 300, we currently
    do a multiply followed by a divide. The intermediate result, however, is
    subject to overflows, especially since the fraction is not simplified (for
    HZ == 300, we multiply by 300 and divide by 1000).

    This is exposed to the user when passing a large timeout to poll(), for
    example.

    This patch replaces the multiply-divide with a reciprocal multiplication on
    32-bit platforms. When the input is an unsigned long, there is no portable
    way to do this on 64-bit platforms, since it requires a 128-bit
    intermediate result (which gcc does support on 64-bit platforms but may
    generate libgcc calls, e.g. on 64-bit s390). However, since the output is
    a 32-bit integer in the cases affected, just simplify the multiply-divide
    (*3/10 instead of *300/1000).

    The reciprocal multiply used can have off-by-one errors in the upper half
    of the valid output range. This could be avoided at the expense of having
    to deal with a potential 65-bit intermediate result. Since the intent is
    to avoid overflow problems and most of the other time conversions are only
    semiexact, the off-by-one errors were considered an acceptable tradeoff.
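
    The difference between the three approaches can be sketched in a few
    lines of C for the HZ == 300 milliseconds-to-ticks case. This is a
    simplified illustration, not the code from the patch: the function
    names, the truncating rounding, and the MUL32/SHR32 constants are
    assumptions chosen here so that the multiplier fits in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

#define HZ            300u   /* assumed tick rate, as in the HZ == 300 case */
#define MSEC_PER_SEC  1000u

/*
 * Hypothetical reciprocal constants: m * HZ / 1000 is approximated by
 * (m * MUL32) >> SHR32, with MUL32 = ceil(2^SHR32 * HZ / 1000).
 */
#define SHR32 33
#define MUL32 ((uint32_t)((((uint64_t)HZ << SHR32) + MSEC_PER_SEC - 1) / MSEC_PER_SEC))

/* Multiply then divide in 32 bits: m * 300 wraps once m > UINT32_MAX / 300. */
static uint32_t naive_msecs_to_ticks(uint32_t m)
{
    return m * HZ / MSEC_PER_SEC;
}

/* Simplified fraction (*3/10): the product wraps only for inputs 100 times larger. */
static uint32_t simplified_msecs_to_ticks(uint32_t m)
{
    return m * 3u / 10u;
}

/* Reciprocal multiply: one 32x32 -> 64 bit multiply and a shift, no division. */
static uint32_t reciprocal_msecs_to_ticks(uint32_t m)
{
    return (uint32_t)(((uint64_t)m * MUL32) >> SHR32);
}

/* Self-check: all three agree while the naive product still fits in 32 bits. */
static void check_small_input(void)
{
    assert(naive_msecs_to_ticks(1000u) == 300u);
    assert(simplified_msecs_to_ticks(1000u) == 300u);
    assert(reciprocal_msecs_to_ticks(1000u) == 300u);
}
```

    For an input of 20,000,000 ms the naive version wraps in the multiply
    and returns a wrong value, while the other two return 6,000,000.
    Because the multiplier is rounded up, the reciprocal version may be off
    by one for sufficiently large inputs, which is the tradeoff described
    above.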

    At Ralf Baechle's suggestion, this version uses a Perl script to compute
    the necessary constants. We already have dependencies on Perl for kernel
    compiles. This does, however, require the Perl module Math::BigInt, which
    is included in the standard Perl distribution starting with version 5.8.0.
    In order to support older versions of Perl, include a table of canned
    constants in the script itself, and structure the script so that
    Math::BigInt isn't required if pulling values from said table.

    Running the script requires that the HZ value is available from the
    Makefile. Thus, this patch also adds the Kconfig variable CONFIG_HZ to the
    architectures which didn't already have it (alpha, cris, frv, h8300, m32r,
    m68k, m68knommu, sparc, v850, and xtensa). It does *not* touch the sh or
    sh64 architectures, since Paul Mundt has dealt with those separately in the
    sh tree.

    Signed-off-by: H. Peter Anvin
    Cc: Ralf Baechle
    Cc: Sam Ravnborg
    Cc: Paul Mundt
    Cc: Richard Henderson
    Cc: Michael Starvik
    Cc: David Howells
    Cc: Yoshinori Sato
    Cc: Hirokazu Takata
    Cc: Geert Uytterhoeven
    Cc: Roman Zippel
    Cc: William L. Irwin
    Cc: Chris Zankel
    Cc: H. Peter Anvin
    Cc: Jan Engelhardt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    H. Peter Anvin
     
  • Suppress A.OUT library support if CONFIG_ARCH_SUPPORTS_AOUT is not set.

    Not all architectures support the A.OUT binfmt, so the ELF binfmt should not
    be permitted to go looking for A.OUT libraries to load in such a case. Not
    only that, but under such conditions A.OUT core dumps are not produced either.

    To make this work, this patch also does the following:

    (1) Makes the existence of the contents of linux/a.out.h contingent on
    CONFIG_ARCH_SUPPORTS_AOUT.

    (2) Renames dump_thread() to aout_dump_thread() as it's only called by A.OUT
    core dumping code.

    (3) Moves aout_dump_thread() into asm/a.out-core.h and makes it inline. This
    is then included only where needed. This means that this bit of arch
    code will be stored in the appropriate A.OUT binfmt module rather than
    the core kernel.

    (4) Drops A.OUT support for Blackfin (according to Mike Frysinger it's not
    needed) and FRV.

    This patch depends on the previous patch to move STACK_TOP[_MAX] out of
    asm/a.out.h and into asm/processor.h as they're required whether or not A.OUT
    format is available.

    [jdike@addtoit.com: uml: re-remove accidentally restored code]
    Signed-off-by: David Howells
    Cc:
    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • Move STACK_TOP[_MAX] out of asm/a.out.h and into asm/processor.h as they're
    required whether or not A.OUT format is available.

    Signed-off-by: David Howells
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells