15 May, 2008

1 commit

  • I noticed this because alpha was broken due to the recent commit commit
    bdc807871d58285737d50dc6163d0feb72cb0dc2 ("avoid overflows in
    kernel/time.c"). Most arches do something like this in their
    asm/param.h:

    #ifdef __KERNEL__
    # define HZ CONFIG_HZ
    #else
    # define HZ 100
    #endif

    A few arches though (namely alpha/h8300/um/v850/xtensa) either do no set
    HZ at all for !__KERNEL__, or they set it wrongly. This should bring all
    arches in line by setting up HZ for userspace.

    Without this currently perl 5.10 doesn't build on alpha:

    perl.c: In function 'perl_construct':
    perl.c:388: error: 'CONFIG_HZ' undeclared (first use in this function)
    -> http://buildd.debian.org/fetch.cgi?pkg=perl;ver=5.10.0-10;arch=alpha;stamp=1210252894

    Signed-off-by: Mike Frysinger
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Yoshinori Sato
    Cc: Jeff Dike
    Cc: Chris Zankel
    Cc: maximilian attems
    Signed-off-by: Andrew Morton
    [ HZ on alpha is 1024 for historical reasons. - Linus ]
    Signed-off-by: Linus Torvalds

    Mike Frysinger
     

03 May, 2008

1 commit


29 Apr, 2008

1 commit

  • Unaligned access is ok for the following arches:
    cris, m68k, mn10300, powerpc, s390, x86

    Arches that use the memmove implementation for native endian, and
    the byteshifting for the opposite endianness.
    h8300, m32r, xtensa

    Packed struct for native endian, byteshifting for other endian:
    alpha, blackfin, ia64, parisc, sparc, sparc64, mips, sh

    m86knommu is generic_be for Coldfire, otherwise unaligned access is ok.

    frv, arm chooses endianness based on compiler settings, uses the byteshifting
    versions. Remove the unaligned trap handler from frv as it is now unused.

    v850 is le, uses the byteshifting versions for both be and le.

    Remove the now unused asm-generic implementation.

    Signed-off-by: Harvey Harrison
    Acked-by: David S. Miller
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Harvey Harrison
     

28 Apr, 2008

1 commit

  • s390 for one, cannot implement VM_MIXEDMAP with pfn_valid, due to their memory
    model (which is more dynamic than most). Instead, they had proposed to
    implement it with an additional path through vm_normal_page(), using a bit in
    the pte to determine whether or not the page should be refcounted:

    vm_normal_page()
    {
    ...
    if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
    if (vma->vm_flags & VM_MIXEDMAP) {
    #ifdef s390
    if (!mixedmap_refcount_pte(pte))
    return NULL;
    #else
    if (!pfn_valid(pfn))
    return NULL;
    #endif
    goto out;
    }
    ...
    }

    This is fine, however if we are allowed to use a bit in the pte to determine
    refcountedness, we can use that to _completely_ replace all the vma based
    schemes. So instead of adding more cases to the already complex vma-based
    scheme, we can have a clearly seperate and simple pte-based scheme (and get
    slightly better code generation in the process):

    vm_normal_page()
    {
    #ifdef s390
    if (!mixedmap_refcount_pte(pte))
    return NULL;
    return pte_page(pte);
    #else
    ...
    #endif
    }

    And finally, we may rather make this concept usable by any architecture rather
    than making it s390 only, so implement a new type of pte state for this.
    Unfortunately the old vma based code must stay, because some architectures may
    not be able to spare pte bits. This makes vm_normal_page a little bit more
    ugly than we would like, but the 2 cases are clearly seperate.

    So introduce a pte_special pte state, and use it in mm/memory.c. It is
    currently a noop for all architectures, so this doesn't actually result in any
    compiled code changes to mm/memory.o.

    BTW:
    I haven't put vm_normal_page() into arch code as-per an earlier suggestion.
    The reason is that, regardless of where vm_normal_page is actually
    implemented, the *abstraction* is still exactly the same. Also, while it
    depends on whether the architecture has pte_special or not, that is the
    only two possible cases, and it really isn't an arch specific function --
    the role of the arch code should be to provide primitive functions and
    accessors with which to build the core code; pte_special does that. We do
    not want architectures to know or care about vm_normal_page itself, and
    we definitely don't want them being able to invent something new there
    out of sight of mm/ code. If we made vm_normal_page an arch function, then
    we have to make vm_insert_mixed (next patch) an arch function too. So I
    don't think moving it to arch code fundamentally improves any abstractions,
    while it does practically make the code more difficult to follow, for both
    mm and arch developers, and easier to misuse.

    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Nick Piggin
    Acked-by: Carsten Otte
    Cc: Jared Hulbert
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

17 Apr, 2008

1 commit

  • Semaphores are no longer performance-critical, so a generic C
    implementation is better for maintainability, debuggability and
    extensibility. Thanks to Peter Zijlstra for fixing the lockdep
    warning. Thanks to Harvey Harrison for pointing out that the
    unlikely() was unnecessary.

    Signed-off-by: Matthew Wilcox
    Acked-by: Ingo Molnar

    Matthew Wilcox
     

03 Apr, 2008

1 commit

  • Currently include/linux/kvm.h is not considered by make headers_install,
    because Kbuild cannot handle " unifdef-$(CONFIG_FOO) += foo.h. This problem
    was introduced by

    commit fb56dbb31c4738a3918db81fd24da732ce3b4ae6
    Author: Avi Kivity
    Date: Sun Dec 2 10:50:06 2007 +0200

    KVM: Export include/linux/kvm.h only if $ARCH actually supports KVM

    Currently, make headers_check barfs due to , which
    includes, not existing. Rather than add a zillion s, export kvm.
    only if the arch actually supports it.

    Signed-off-by: Avi Kivity

    which makes this an 2.6.25 regression.

    One way of solving the issue is to enhance Kbuild, but Avi and David conviced
    me, that changing headers_install is not the way to go. This patch changes
    the definition for linux/kvm.h to unifdef-y.

    If  unifdef-y is used for linux/kvm.h "make headers_check" will fail on all
    architectures without asm/kvm.h. Therefore, this patch also provides
    asm/kvm.h on all architectures.

    Signed-off-by: Christian Borntraeger
    Acked-by: Avi Kivity
    Cc: Sam Ravnborg
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christian Borntraeger
     

14 Feb, 2008

12 commits


09 Feb, 2008

5 commits

  • In file included from /home/bunk/linux/kernel-2.6/git/linux-2.6/arch/xtensa/kernel/syscall.c:39:
    include2/asm/unistd.h:681: error: 'sys_timerfd' undeclared here (not in a function)

    Signed-off-by: Adrian Bunk
    Cc: Christian Zankel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • Background: I've implemented 1K/2K page tables for s390. These sub-page
    page tables are required to properly support the s390 virtualization
    instruction with KVM. The SIE instruction requires that the page tables
    have 256 page table entries (pte) followed by 256 page status table entries
    (pgste). The pgstes are only required if the process is using the SIE
    instruction. The pgstes are updated by the hardware and by the hypervisor
    for a number of reasons, one of them is dirty and reference bit tracking.
    To avoid wasting memory the standard pte table allocation should return
    1K/2K (31/64 bit) and 2K/4K if the process is using SIE.

    Problem: Page size on s390 is 4K, page table size is 1K or 2K. That means
    the s390 version for pte_alloc_one cannot return a pointer to a struct
    page. Trouble is that with the CONFIG_HIGHPTE feature on x86 pte_alloc_one
    cannot return a pointer to a pte either, since that would require more than
    32 bit for the return value of pte_alloc_one (and the pte * would not be
    accessible since its not kmapped).

    Solution: The only solution I found to this dilemma is a new typedef: a
    pgtable_t. For s390 pgtable_t will be a (pte *) - to be introduced with a
    later patch. For everybody else it will be a (struct page *). The
    additional problem with the initialization of the ptl lock and the
    NR_PAGETABLE accounting is solved with a constructor pgtable_page_ctor and
    a destructor pgtable_page_dtor. The page table allocation and free
    functions need to call these two whenever a page table page is allocated or
    freed. pmd_populate will get a pgtable_t instead of a struct page pointer.
    To get the pgtable_t back from a pmd entry that has been installed with
    pmd_populate a new function pmd_pgtable is added. It replaces the pmd_page
    call in free_pte_range and apply_to_pte_range.

    Signed-off-by: Martin Schwidefsky
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky
     
  • When the conversion factor between jiffies and milli- or microseconds is
    not a single multiply or divide, as for the case of HZ == 300, we currently
    do a multiply followed by a divide. The intervening result, however, is
    subject to overflows, especially since the fraction is not simplified (for
    HZ == 300, we multiply by 300 and divide by 1000).

    This is exposed to the user when passing a large timeout to poll(), for
    example.

    This patch replaces the multiply-divide with a reciprocal multiplication on
    32-bit platforms. When the input is an unsigned long, there is no portable
    way to do this on 64-bit platforms there is no portable way to do this
    since it requires a 128-bit intermediate result (which gcc does support on
    64-bit platforms but may generate libgcc calls, e.g. on 64-bit s390), but
    since the output is a 32-bit integer in the cases affected, just simplify
    the multiply-divide (*3/10 instead of *300/1000).

    The reciprocal multiply used can have off-by-one errors in the upper half
    of the valid output range. This could be avoided at the expense of having
    to deal with a potential 65-bit intermediate result. Since the intent is
    to avoid overflow problems and most of the other time conversions are only
    semiexact, the off-by-one errors were considered an acceptable tradeoff.

    At Ralf Baechle's suggestion, this version uses a Perl script to compute
    the necessary constants. We already have dependencies on Perl for kernel
    compiles. This does, however, require the Perl module Math::BigInt, which
    is included in the standard Perl distribution starting with version 5.8.0.
    In order to support older versions of Perl, include a table of canned
    constants in the script itself, and structure the script so that
    Math::BigInt isn't required if pulling values from said table.

    Running the script requires that the HZ value is available from the
    Makefile. Thus, this patch also adds the Kconfig variable CONFIG_HZ to the
    architectures which didn't already have it (alpha, cris, frv, h8300, m32r,
    m68k, m68knommu, sparc, v850, and xtensa.) It does *not* touch the sh or
    sh64 architectures, since Paul Mundt has dealt with those separately in the
    sh tree.

    Signed-off-by: H. Peter Anvin
    Cc: Ralf Baechle ,
    Cc: Sam Ravnborg ,
    Cc: Paul Mundt ,
    Cc: Richard Henderson ,
    Cc: Michael Starvik ,
    Cc: David Howells ,
    Cc: Yoshinori Sato ,
    Cc: Hirokazu Takata ,
    Cc: Geert Uytterhoeven ,
    Cc: Roman Zippel ,
    Cc: William L. Irwin ,
    Cc: Chris Zankel ,
    Cc: H. Peter Anvin ,
    Cc: Jan Engelhardt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    H. Peter Anvin
     
  • Some arches (like alpha and ia64) already have a clean posix_types.h header.
    This brings all the others in line by removing all references to __GLIBC__
    (and some undocumented __USE_ALL).

    Signed-off-by: Mike Frysinger
    Acked-by: Ingo Molnar
    Cc: Ulrich Drepper
    Cc: Roland McGrath
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Frysinger
     
  • Move STACK_TOP[_MAX] out of asm/a.out.h and into asm/processor.h as they're
    required whether or not A.OUT format is available.

    Signed-off-by: David Howells
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

08 Feb, 2008

2 commits


06 Feb, 2008

1 commit

  • (with Martin Schwidefsky )

    The pgd/pud/pmd/pte page table allocation functions get a mm_struct pointer as
    first argument. The free functions do not get the mm_struct argument. This
    is 1) asymmetrical and 2) to do mm related page table allocations the mm
    argument is needed on the free function as well.

    [kamalesh@linux.vnet.ibm.com: i386 fix]
    [akpm@linux-foundation.org: coding-syle fixes]
    Signed-off-by: Benjamin Herrenschmidt
    Signed-off-by: Martin Schwidefsky
    Cc:
    Signed-off-by: Kamalesh Babulal
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Benjamin Herrenschmidt
     

01 Feb, 2008

1 commit

  • A userspace program may wish to set the mark for each packets its send
    without using the netfilter MARK target. Changing the mark can be used
    for mark based routing without netfilter or for packet filtering.

    It requires CAP_NET_ADMIN capability.

    Signed-off-by: Laszlo Attila Toth
    Acked-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Laszlo Attila Toth
     

24 Oct, 2007

2 commits


23 Oct, 2007

2 commits

  • Add a Kconfig entry which will toggle some sanity checks on the sg
    entry and tables.

    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Change the page member of the scatterlist structure to be an unsigned
    long, and encode more stuff in the lower bits:

    - Bits 0 and 1 zero: this is a normal sg entry. Next sg entry is located
    at sg + 1.
    - Bit 0 set: this is a chain entry, the next real entry is at ->page_link
    with the two low bits masked off.
    - Bit 1 set: this is the final entry in the sg entry. sg_next() will return
    NULL when passed such an entry.

    It's thus important that sg table users use the proper accessors to get
    and set the page member.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

20 Oct, 2007

2 commits

  • forbid asm/bitops.h direct inclusion

    Because of compile errors that may occur after bit changes if asm/bitops.h is
    included directly without e.g. linux/kernel.h which includes linux/bitops.h,
    forbid direct inclusion of asm/bitops.h. Thanks to Adrian Bunk.

    Signed-off-by: Jiri Slaby
    Cc: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiri Slaby
     
  • Nobody uses flush_tlb_pgtables anymore, this patch removes all remaining
    traces of it from all archs.

    Signed-off-by: Benjamin Herrenschmidt
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Benjamin Herrenschmidt
     

19 Oct, 2007

1 commit

  • Introduce test_and_set_bit_lock / clear_bit_unlock bitops with lock semantics.
    Convert all architectures to use the generic implementation.

    Signed-off-by: Nick Piggin
    Acked-By: David Howells
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Russell King
    Cc: Haavard Skinnemoen
    Cc: Bryan Wu
    Cc: Mikael Starvik
    Cc: David Howells
    Cc: Yoshinori Sato
    Cc: "Luck, Tony"
    Cc: Hirokazu Takata
    Cc: Geert Uytterhoeven
    Cc: Roman Zippel
    Cc: Greg Ungerer
    Cc: Ralf Baechle
    Cc: Kyle McMartin
    Cc: Matthew Wilcox
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: Paul Mundt
    Cc: Kazumoto Kojima
    Cc: Richard Curnow
    Cc: William Lee Irwin III
    Cc: "David S. Miller"
    Cc: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Cc: Miles Bader
    Cc: Andi Kleen
    Cc: Chris Zankel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

17 Oct, 2007

2 commits

  • Remove the __STRICT_ANSI__ check from the __u64/__s64 declaration on
    32bit targets.

    GCC can be made to warn about usage of long long types with ISO C90
    (-ansi), but only with -pedantic. You can write this in a way that even
    then it doesn't cause warnings, namely by:

    #ifdef __GNUC__
    __extension__ typedef __signed__ long long __s64;
    __extension__ typedef unsigned long long __u64;
    #endif

    The __extension__ keyword in front of this switches off any pedantic
    warnings for this expression.

    Signed-off-by: Olaf Hering
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Hering
     
  • DECLARE_MUTEX_LOCKED was used for semaphores used as completions and we've
    got rid of them. Well, except for one in libusual that the maintainer
    explicitly wants to keep as semaphore. So convert that useage to an
    explicit sema_init and kill of DECLARE_MUTEX_LOCKED so that new code is
    reminded to use a completion.

    Signed-off-by: Christoph Hellwig
    Acked-by: "Satyam Sharma"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

28 Aug, 2007

4 commits