09 Nov, 2005

2 commits

  • Make some changes to the NEED_RESCHED and POLLING_NRFLAG flags to reduce
    confusion, and make their semantics rigid. Improves efficiency of
    resched_task and some cpu_idle routines.

    * In resched_task:
    - TIF_NEED_RESCHED is only cleared with the task's runqueue lock held,
    and as we hold it during resched_task, then there is no need for an
    atomic test and set there. The only other time this should be set is
    when the task's quantum expires, in the timer interrupt - this is
    protected against because the rq lock is irq-safe.

    - If TIF_NEED_RESCHED is set, then we don't need to do anything. It
    won't get unset until the task gets schedule()d off.

    - If we are running on the same CPU as the task we resched, then set
    TIF_NEED_RESCHED and no further action is required.

    - If we are running on another CPU, and TIF_POLLING_NRFLAG is *not* set
    after TIF_NEED_RESCHED has been set, then we need to send an IPI.

    Using these rules, we are able to remove the test and set operation in
    resched_task, and make clear the previously vague semantics of
    POLLING_NRFLAG.
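
    The four rules above can be sketched as a small user-space model. This is
    only an illustration, not the kernel code: the flag bit values, the
    model_task structure and the returned action codes are all made up here,
    and the runqueue lock the caller is assumed to hold is not modelled.

```c
#include <stdbool.h>

/* Model bit values only; these are NOT the kernel's TIF_* definitions. */
#define MODEL_TIF_NEED_RESCHED   (1u << 0)
#define MODEL_TIF_POLLING_NRFLAG (1u << 1)

/* Hypothetical stand-in for the bits of task state the rules refer to. */
struct model_task {
    unsigned int flags;
    int cpu;            /* CPU the task is currently running on */
};

enum resched_action {
    RESCHED_NOTHING,    /* flag already set, nothing to do */
    RESCHED_FLAG_ONLY,  /* flag set, no IPI needed */
    RESCHED_SEND_IPI    /* flag set, remote CPU must be interrupted */
};

/* Assumes the caller holds the task's runqueue lock (not modelled). */
enum resched_action model_resched_task(struct model_task *t, int this_cpu)
{
    /* Rule: if the flag is already set, nothing needs to be done. */
    if (t->flags & MODEL_TIF_NEED_RESCHED)
        return RESCHED_NOTHING;

    /* Plain store: no atomic test-and-set is needed under the lock. */
    t->flags |= MODEL_TIF_NEED_RESCHED;

    /* Rule: same CPU, setting the flag is sufficient. */
    if (t->cpu == this_cpu)
        return RESCHED_FLAG_ONLY;

    /* Rule: remote CPU that is not polling the flag needs an IPI. */
    if (!(t->flags & MODEL_TIF_POLLING_NRFLAG))
        return RESCHED_SEND_IPI;

    return RESCHED_FLAG_ONLY;
}
```

    Note that the polling flag is examined only after the resched flag has
    been stored, which is the ordering the last rule asks for.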

    * In idle routines:
    - Enter cpu_idle with preempt disabled. When the need_resched() condition
    becomes true, explicitly call schedule(). This makes things a bit clearer
    (IMO), but I haven't updated all architectures yet.

    - Many do a test and clear of TIF_NEED_RESCHED for some reason. According
    to the resched_task rules, this isn't needed (and actually breaks the
    assumption that TIF_NEED_RESCHED is only cleared with the runqueue lock
    held). So remove that. Generally one less locked memory op when switching
    to the idle thread.

    - Many idle routines clear TIF_POLLING_NRFLAG, and only set it in the
    innermost polling idle loops. The above resched_task semantics allow it to
    stay set right up until the last need_resched() check before going into a
    halt that requires an interrupt wakeup.

    Many idle routines simply never enter such a halt, and so POLLING_NRFLAG
    can be always left set, completely eliminating resched IPIs when rescheduling
    the idle task.
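
    A rough user-space sketch of the ordering described for a halting idle
    routine follows. It is an assumption-laden model, not any architecture's
    cpu_idle: the flag values, model_idle_pass() and the action codes are
    invented for illustration, and C11 atomics stand in for the kernel's own
    flag operations.

```c
#include <stdatomic.h>

/* Model bit values only; not the kernel's definitions. */
#define MODEL_NEED_RESCHED (1u << 0)
#define MODEL_POLLING      (1u << 1)

enum idle_action {
    IDLE_SCHEDULE,  /* caller should explicitly call schedule() */
    IDLE_HALT       /* caller may halt and wait for an interrupt */
};

/*
 * One pass of a hypothetical halting idle loop. The polling flag stays
 * set until just before the final need-resched check, so a remote waker
 * that sets the resched flag earlier does not need to send an IPI.
 */
enum idle_action model_idle_pass(atomic_uint *flags)
{
    /* Stop advertising that we poll, then do the last check. */
    atomic_fetch_and(flags, ~MODEL_POLLING);

    if (atomic_load(flags) & MODEL_NEED_RESCHED) {
        /* Work arrived: resume polling and let the caller schedule.
         * The resched flag is deliberately left alone here. */
        atomic_fetch_or(flags, MODEL_POLLING);
        return IDLE_SCHEDULE;
    }

    /* Polling flag is clear, so a later waker will send an IPI. */
    return IDLE_HALT;
}
```

    An idle routine that never halts would simply skip the clearing step and
    leave the polling flag set all the time.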

    POLLING_NRFLAG width can be increased, to reduce the chance of resched IPIs.

    Signed-off-by: Nick Piggin
    Cc: Ingo Molnar
    Cc: Con Kolivas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Run idle threads with preempt disabled.

    Also corrected a bug in arm26's cpu_idle (make it actually call schedule()).
    How did it ever work before?

    Might fix the CPU hotplugging hang which Nigel Cunningham noted.

    We think the bug hits if the idle thread is preempted after checking
    need_resched() and before going to sleep, and the CPU is then offlined.

    After calling stop_machine_run, the CPU eventually returns from preemption and
    into the idle thread and goes to sleep. The CPU will continue executing
    previous idle and have no chance to call play_dead.

    By disabling preemption until we are ready to explicitly schedule, this bug is
    fixed and the idle threads generally become more robust.

    From: alexs

    PPC build fix

    From: Yoichi Yuasa

    MIPS build fix

    Signed-off-by: Nick Piggin
    Signed-off-by: Yoichi Yuasa
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

07 Nov, 2005

5 commits

  • The sys_ptrace boilerplate code (everything outside the big switch
    statement for the arch-specific requests) is shared by most architectures.
    This patch moves it to kernel/ptrace.c and leaves the arch-specific code as
    arch_ptrace.
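
    The shape of the split can be sketched in user space as below. Everything
    in it is hypothetical (the model_child structure, the request number and
    the checks performed); it only shows a shared wrapper doing the common
    work and handing the request switch to a per-architecture function.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_NREGS   4
#define MODEL_PEEKUSR 3   /* hypothetical request number */

/* Hypothetical stand-in for a traced child task. */
struct model_child {
    int pid;
    int traced;
    long regs[MODEL_NREGS];
};

/* "Arch" part: only the big switch over arch-specific requests. */
long model_arch_ptrace(struct model_child *child, long request,
                       long addr, long data)
{
    (void)data;
    switch (request) {
    case MODEL_PEEKUSR:
        if (addr < 0 || addr >= MODEL_NREGS)
            return -EIO;
        return child->regs[addr];
    default:
        return -EIO;
    }
}

/* Shared part: look up the child, do common checks, then dispatch. */
long model_sys_ptrace(struct model_child *children, size_t n,
                      long request, int pid, long addr, long data)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (children[i].pid != pid)
            continue;
        if (!children[i].traced)
            return -ESRCH;
        return model_arch_ptrace(&children[i], request, addr, data);
    }
    return -ESRCH;
}
```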

    Some architectures have a ptrace that is too different, so we have to exclude them.
    They continue to keep their implementations. For sh64 I had to add a
    sh64_ptrace wrapper because it does some initialization on the first call.
    For um I removed an ifdefed SUBARCH_PTRACE_SPECIAL block, but
    SUBARCH_PTRACE_SPECIAL isn't defined anywhere in the tree.

    Signed-off-by: Christoph Hellwig
    Acked-by: Paul Mackerras
    Acked-by: Ralf Baechle
    Acked-by: David Howells
    Acked-by: Russell King
    Acked-by: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • SH7705 in extended cache mode has some left-over VALID_PAGE() cruft that it
    checks when doing lazy dcache write-back. This has been gone for some time
    (the last bits were in the discontig code, which should now also be gone --
    this also fixes up a build error in the non-discontig case).

    pfn_valid() gives the desired behaviour, so we switch to that.

    Signed-off-by: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Mundt
     
  • There was only one board using this (hp690 specifically), and it just so
    happens that it's only physically discontiguous at the "normal" P1 offset. If
    we bump up the P1 offset, it's possible to hit a shadowed region of memory
    where we suddenly become magically contiguous.

    As people have been using this shadowed region workaround for quite some time
    (and without any adverse effects), it's time to drop the left over discontig
    bits that no longer have any practical use (it was always very much
    hp690-centric to begin with).

    Signed-off-by: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Mundt
     
  • This adds support for the relatively quirky (ie, not in line with any known
    documentation, and amazed it works at all) SuperHyway implementation on
    SH4-202. This depends on the earlier SuperHyway patch for multiple block
    support and VCR refactoring.

    Signed-off-by: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Mundt
     
  • sh had its own support for embedding ramdisk images in to the kernel binary,
    but people are using initramfs for this now, so we drop the ramdisk embedding.

    Signed-off-by: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Mundt
     

31 Oct, 2005

4 commits

  • Manual #include fixups for clashes - there may be some unnecessary

    Linus Torvalds
     
  • I recently picked up my older work to remove unnecessary #includes of
    sched.h, starting from a patch by Dave Jones to not include sched.h
    from module.h. This reduces the number of indirect includes of sched.h
    by ~300. Another ~400 pointless direct includes can be removed after
    this disentangling (patch to follow later).
    However, quite a few indirect includes need to be fixed up for this.

    In order to feed the patches through -mm with as little disturbance as
    possible, I've split out the fixes I accumulated up to now (complete for
    i386 and x86_64, more archs to follow later) and post them before the real
    patch. This way this large part of the patch is kept simple with only
    adding #includes, and all hunks are independent of each other. So if any
    hunk rejects or gets in the way of other patches, just drop it. My scripts
    will pick it up again in the next round.

    Signed-off-by: Tim Schmielau
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tim Schmielau
     
  • Define jiffies_64 in kernel/timer.c rather than having 24 duplicated
    defines in each architecture.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Make sure we always return, as all syscalls should. Also move the common
    prototype to

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

30 Oct, 2005

4 commits

  • Use pte_offset_map_lock, instead of pte_offset_map (or inappropriate
    pte_offset_kernel) and mm-wide page_table_lock, in sundry arch places.

    The i386 vm86 mark_screen_rdonly: yes, there was and is an assumption that the
    screen fits inside the one page table, as indeed it does.

    The sh __do_page_fault: which handles both kernel faults (without lock) and
    user mm faults (locked - though it set_pte without locking before).

    The sh64 flush_cache_range and helpers: which wrongly thought callers held
    page_table_lock before (only its tlb_start_vma did, and no longer does so);
    moved the flush loop down, and adjusted the large versus small range decision
    to consider a range which spans page tables as large.

    Signed-off-by: Hugh Dickins
    Acked-by: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • First step in pushing down the page_table_lock. init_mm.page_table_lock has
    been used throughout the architectures (usually for ioremap): not to serialize
    kernel address space allocation (that's usually vmlist_lock), but because
    pud_alloc,pmd_alloc,pte_alloc_kernel expect the caller to hold it.

    Reverse that: don't lock or unlock init_mm.page_table_lock in any of the
    architectures; instead rely on pud_alloc,pmd_alloc,pte_alloc_kernel to take
    and drop it when allocating a new one, to check lest a racing task already
    did. Similarly no page_table_lock in vmalloc's map_vm_area.
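
    The "take and drop it when allocating a new one, to check lest a racing
    task already did" pattern might look roughly like this in user space.
    This is a sketch under assumptions: model_slot, model_table_alloc() and
    the C11 atomic_flag used as a spinlock are illustrative stand-ins, not the
    kernel's page table types or locking primitives.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for an upper-level entry and its lock. */
struct model_slot {
    atomic_flag lock;   /* plays the role of page_table_lock */
    int *table;         /* NULL until a lower-level table is installed */
};

/*
 * Allocate outside the lock, then take the lock only long enough to
 * install the new table, or to discard it if a racing caller won.
 */
int *model_table_alloc(struct model_slot *slot, size_t entries)
{
    int *new_table = calloc(entries, sizeof(*new_table));
    int *result;

    if (!new_table)
        return NULL;

    while (atomic_flag_test_and_set(&slot->lock))
        ;   /* spin until the lock is ours */

    if (slot->table)
        free(new_table);            /* someone else populated it first */
    else
        slot->table = new_table;    /* we install our allocation */
    result = slot->table;

    atomic_flag_clear(&slot->lock);
    return result;
}
```

    Callers no longer take the lock themselves; it is held only across the
    check-and-install step.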

    Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle
    user mms, which are converted only by a later patch, for now they have to lock
    differently according to whether or not it's init_mm.

    If sources get muddled, there's a danger that an arch source taking
    init_mm.page_table_lock will be mixed with common source also taking it (or
    neither take it). So break the rules and make another change, which should
    break the build for such a mismatch: remove the redundant mm arg from
    pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13).

    Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64
    used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to
    pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64
    map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free
    took page_table_lock for no good reason.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • The sh64 hugetlbpage.c seems to be erroneous, left over from a bygone age,
    clashing with the common hugetlb.c. Replace it by a copy of the sh
    hugetlbpage.c. Except, delete the mk_pte_huge macro, which neither uses.

    Signed-off-by: Hugh Dickins
    Acked-by: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Convert everyone who uses platform_bus_type to include
    linux/platform_device.h.

    Signed-off-by: Russell King
    Acked-by: Greg Kroah-Hartman

    Russell King
     

28 Oct, 2005

1 commit


14 Oct, 2005

1 commit

  • Original patch by Harald Welte, with feedback from Herbert Xu
    and testing by Sébastien Bernard.

    EBTABLES, ARP tables, and IP/IP6 tables all assume that cpus
    are numbered linearly. That is not necessarily true.

    This patch fixes that up by calculating the largest possible
    cpu number, and allocating enough per-cpu structure space given
    that.
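
    To see why the count of CPUs is not enough, here is a small sketch. It
    assumes, for illustration only, that the possible CPUs are given as a
    64-bit mask; the helper names are made up and are not the ones used by
    the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Highest CPU number set in a (hypothetical) possible-CPU bitmask. */
unsigned int model_highest_possible_cpu(uint64_t possible_mask)
{
    unsigned int cpu, highest = 0;

    for (cpu = 0; cpu < 64; cpu++)
        if (possible_mask & (UINT64_C(1) << cpu))
            highest = cpu;
    return highest;
}

/* Number of per-cpu slots needed so that slot[cpu] is always valid. */
size_t model_percpu_slots(uint64_t possible_mask)
{
    return (size_t)model_highest_possible_cpu(possible_mask) + 1;
}
```

    With CPUs 0 and 5 possible, only two CPUs exist but six slots are needed
    if the table is indexed directly by CPU number.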

    Signed-off-by: David S. Miller

    David S. Miller
     

12 Sep, 2005

1 commit

  • When introducing the generic asm-offsets.h support the dependency
    chain for the prepare targets was changed. All build scripts expecting
    include/asm/asm-offsets.h to be made when using the prepare target would break.
    With the limited number of prepare targets left in arch Makefiles
    the trivial solution was to introduce a new arch specific target: archprepare

    The dependency chain looks like this now:

    prepare
      |
      +--> prepare0
            |
            +--> archprepare
                  |
                  +--> scripts_basic
                  +--> prepare1
                        |
                        +---> prepare2
                               |
                               +--> prepare3

    So prepare3 is processed before prepare2 etc.
    This guarantees that the asm symlink, version.h, scripts_basic
    are all updated before archprepare is processed.

    prepare0, which builds the asm-offsets.h file, will need the
    actions performed by archprepare.

    The head target is now named prepare, because user scripts will most
    likely use that target, but prepare-all has been kept for compatibility.
    Updated Documentation/kbuild/makefiles.txt.

    Signed-off-by: Sam Ravnborg

    Sam Ravnborg
     

11 Sep, 2005

3 commits


10 Sep, 2005

1 commit


08 Sep, 2005

3 commits

  • Sanitized and fixed floppy dependencies: split the messy dependencies for
    BLK_DEV_FD by introducing a new symbol (ARCH_MAY_HAVE_PC_FDC), making
    BLK_DEV_FD depend on that one and taking declarations of ARCH_MAY_HAVE_PC_FDC
    to arch/*/Kconfig. While we are at it, fixed several obvious cases when
    BLK_DEV_FD should have been excluded (architectures lacking asm/floppy.h
    are *not* going to have floppy.c compile, let alone work).

    If you can come up with better name for that ("this architecture might
    have working PC-compatible floppy disk controller"), you are more than
    welcome - just s/ARCH_MAY_HAVE_PC_FDC/your_prefered_name/g in the patch
    below...

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    viro@ZenIV.linux.org.uk
     
  • This patch cleans up a commonly repeated set of changes to the NTP state
    variables by adding two helper inline functions:

    ntp_clear(): Clears the ntp state variables

    ntp_synced(): Returns 1 if the system is synced with a time server.
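
    A minimal sketch of what two such helpers could look like, written as a
    user-space model. Which variables get reset, the structure holding them
    and the constant values are assumptions made for illustration; the patch
    itself is the reference for the real definitions.

```c
/* Model constants; assumed values, not taken from the patch. */
#define MODEL_STA_UNSYNC      0x0040
#define MODEL_NTP_PHASE_LIMIT 16000000L

/* Hypothetical container for the NTP state the helpers touch. */
struct model_ntp_state {
    long time_adjust;
    int  time_status;
    long time_maxerror;
    long time_esterror;
};

/* Reset the state to "not synchronised" in one place. */
static inline void model_ntp_clear(struct model_ntp_state *s)
{
    s->time_adjust = 0;                 /* stop any pending adjustment */
    s->time_status |= MODEL_STA_UNSYNC; /* mark clock as unsynchronised */
    s->time_maxerror = MODEL_NTP_PHASE_LIMIT;
    s->time_esterror = MODEL_NTP_PHASE_LIMIT;
}

/* Returns 1 if the clock is considered synced, 0 otherwise. */
static inline int model_ntp_synced(const struct model_ntp_state *s)
{
    return !(s->time_status & MODEL_STA_UNSYNC);
}
```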

    This was compile tested for alpha, arm, i386, x86-64, ppc64, s390, sparc,
    sparc64.

    Signed-off-by: John Stultz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    john stultz
     
  • The second arg of do_timer_interrupt() is not used in the functions, and
    all callers pass NULL.

    Signed-off-by: Adrian Bunk
    Cc: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     

30 Aug, 2005

1 commit

  • It has been reported that the way Linux handles NODEFER for signals is
    not consistent with the way other Unix boxes handle it. I've written a
    program to test the behavior of how this flag affects signals and had
    several reports from people who ran this on various Unix boxes,
    confirming that Linux seems to be unique on the way this is handled.

    The way NODEFER affects signals on other Unix boxes is as follows:

    1) If NODEFER is set, other signals in sa_mask are still blocked.

    2) If NODEFER is set and the signal is in sa_mask, then the signal is
    still blocked. (Note: this is the behavior of all tested but Linux _and_
    NetBSD 2.0 *).

    The way NODEFER affects signals on Linux:

    1) If NODEFER is set, other signals are _not_ blocked regardless of
    sa_mask (Even NetBSD doesn't do this).

    2) If NODEFER is set and the signal is in sa_mask, then the signal being
    handled is not blocked.

    The patch converts signal handling in all current Linux architectures to
    the way most Unix boxes work.
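
    The two behaviours can be compared with a small model that computes which
    signals are blocked while a handler runs. Signal sets are represented as
    plain 64-bit masks and the flag value is a made-up constant, so this is an
    illustration of the rules listed above rather than the kernel code.

```c
#include <stdint.h>

#define MODEL_SA_NODEFER 0x1u   /* model flag value, not the real one */

/* Bit for signal number sig (1-based) in a model signal mask. */
static uint64_t model_sigbit(int sig)
{
    return UINT64_C(1) << (sig - 1);
}

/* Behaviour after the patch: sa_mask always applies; NODEFER only
 * controls whether the handled signal itself is added. */
uint64_t model_blocked_new(uint64_t blocked, uint64_t sa_mask,
                           unsigned int flags, int sig)
{
    blocked |= sa_mask;
    if (!(flags & MODEL_SA_NODEFER))
        blocked |= model_sigbit(sig);
    return blocked;
}

/* Behaviour described for Linux before the patch: with NODEFER set,
 * neither sa_mask nor the handled signal is blocked. */
uint64_t model_blocked_old(uint64_t blocked, uint64_t sa_mask,
                           unsigned int flags, int sig)
{
    if (!(flags & MODEL_SA_NODEFER)) {
        blocked |= sa_mask;
        blocked |= model_sigbit(sig);
    }
    return blocked;
}
```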

    Unix boxes that were tested: DU4, AIX 5.2, Irix 6.5, NetBSD 2.0, SFU
    3.5 on WinXP, AIX 5.3, Mac OSX, and of course Linux 2.6.13-rcX.

    * NetBSD was the only other Unix to behave like Linux on point #2. The
    main concern was brought up by point #1 which even NetBSD isn't like
    Linux. So with this patch, we leave NetBSD as the lonely one that
    behaves differently here with #2.

    Signed-off-by: Linus Torvalds

    Steven Rostedt
     

19 Aug, 2005

1 commit


28 Jul, 2005

1 commit


27 Jul, 2005

1 commit

  • machine_restart, machine_halt and machine_power_off are machine
    specific hooks deep into the reboot logic, that modules
    have no business messing with. Usually code should be calling
    kernel_restart, kernel_halt, kernel_power_off, or
    emergency_restart. So don't export machine_restart,
    machine_halt, and machine_power_off so we can catch buggy users.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

12 Jul, 2005

1 commit

  • Create a new top-level menu named "Networking" thus moving
    net related options and protocol selection away from the drivers
    menu and up on the top-level where they belong.

    To implement this all architectures have to source "net/Kconfig" before
    drivers/*/Kconfig in their Kconfig file. This change has been
    implemented for all architectures.

    Device drivers for ordinary NICs are still to be found
    in the Device Drivers section, but Bluetooth, IrDA and ax25
    are located with their corresponding menu entries under the new
    networking menu item.

    Signed-off-by: Sam Ravnborg
    Signed-off-by: David S. Miller

    Sam Ravnborg
     

24 Jun, 2005

2 commits

  • This used to be used to disable FLATMEM selection, but I decided to change it
    to be done generically when DISCONTIG is enabled. The option is unused, so
    this kills it.

    Signed-off-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • For all architectures, this just means that you'll see a "Memory Model"
    choice in your architecture menu. For those that implement DISCONTIGMEM,
    you may eventually want to make your ARCH_DISCONTIGMEM_ENABLE a "def_bool
    y" and make your users select DISCONTIGMEM right out of the new choice
    menu. The only disadvantage might be if you have some specific things that
    you need in your help option to explain something about DISCONTIGMEM.

    Signed-off-by: Dave Hansen
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     

22 Jun, 2005

3 commits

  • Ingo recently introduced a great speedup for allocating new mmaps using the
    free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and
    causes huge performance increases in thread creation.

    The downside of this patch is that it does lead to fragmentation in the
    mmap-ed areas (visible via /proc/self/maps), such that some applications
    that work fine under 2.4 kernels quickly run out of memory on any 2.6
    kernel.

    The problem is twofold:

    1) the free_area_cache is used to continue a search for memory where
    the last search ended. Before the change new areas were always
    searched from the base address on.

    So now new small areas are cluttering holes of all sizes
    throughout the whole mmap-able region whereas before small areas
    tended to close holes near the base leaving holes far from the base
    large and available for larger requests.

    2) the free_area_cache also is set to the location of the last
    munmap-ed area so in scenarios where we allocate e.g. five regions of
    1K each, then free regions 4 2 3 in this order the next request for 1K
    will be placed in the position of the old region 3, whereas before we
    appended it to the still active region 1, placing it at the location
    of the old region 2. Before we had 1 free region of 2K, now we only
    get two free regions of 1K -> fragmentation.

    The patch addresses these issues by introducing yet another cache descriptor
    cached_hole_size that contains the largest known hole size below the
    current free_area_cache. If a new request comes in the size is compared
    against the cached_hole_size and if the request can be filled with a hole
    below free_area_cache the search is started from the base instead.
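
    The decision described here can be sketched as follows. The structure,
    the base address and the reset of cached_hole_size when restarting from
    the base are assumptions of this model, made so the example is
    self-contained.

```c
/* Hypothetical base of the mmap-able region for this model. */
#define MODEL_UNMAPPED_BASE 0x40000000UL

/* Only the two cache fields the description mentions. */
struct model_mm {
    unsigned long free_area_cache;   /* where the last search ended */
    unsigned long cached_hole_size;  /* largest known hole below it */
};

/*
 * Pick the address at which to start searching for len bytes.
 * If a known hole below free_area_cache is big enough, restart from
 * the base so that the hole gets reused; otherwise continue from the
 * cached position.
 */
unsigned long model_search_start(struct model_mm *mm, unsigned long len)
{
    if (len > mm->cached_hole_size)
        return mm->free_area_cache;

    /* Assumption: the hole size is recomputed during the new scan. */
    mm->cached_hole_size = 0;
    return MODEL_UNMAPPED_BASE;
}
```

    Small requests therefore get a chance to fill holes near the base, while
    large requests keep the fast path through free_area_cache.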

    The results look promising: Whereas 2.6.12-rc4 fragments quickly and my
    (earlier posted) leakme.c test program terminates after 50000+ iterations
    with 96 distinct and fragmented maps in /proc/self/maps it performs nicely
    (as expected) with thread creation, Ingo's test_str02 with 20000 threads
    requires 0.7s system time.

    Taking out Ingo's patch (un-patch available per request) by basically
    deleting all mentions of free_area_cache from the kernel and starting the
    search for new memory always at the respective bases we observe: leakme
    terminates successfully with 11 distinctive hardly fragmented areas in
    /proc/self/maps but thread creation is grindingly slow: 30+s(!) system
    time for Ingo's test_str02 with 20000 threads.

    Now - drumroll ;-) the appended patch works fine with leakme: it ends with
    only 7 distinct areas in /proc/self/maps and also thread creation seems
    sufficiently fast with 0.71s for 20000 threads.

    Signed-off-by: Wolfgang Wander
    Credit-to: "Richard Purdie"
    Signed-off-by: Ken Chen
    Acked-by: Ingo Molnar (partly)
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wolfgang Wander
     
  • A lot of the code in arch/*/mm/hugetlbpage.c is quite similar. This patch
    attempts to consolidate a lot of the code across the arch's, putting the
    combined version in mm/hugetlb.c. There are a couple of uglyish hacks in
    order to convert all the hugepage archs, but the result is a very large
    reduction in the total amount of code. It also means things like hugepage
    lazy allocation could be implemented in one place, instead of six.

    Tested, at least a little, on ppc64, i386 and x86_64.

    Notes:
    - this patch changes the meaning of set_huge_pte() to be more
    analogous to set_pte()
    - does SH4 need a special huge_ptep_get_and_clear()??

    Acked-by: William Lee Irwin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Gibson
     
  • This patch implements a number of smp_processor_id() cleanup ideas that
    Arjan van de Ven and I came up with.

    The previous __smp_processor_id/_smp_processor_id/smp_processor_id API
    spaghetti was hard to follow both on the implementational and on the
    usage side.

    Some of the complexity arose from picking wrong names, some of the
    complexity comes from the fact that not all architectures defined
    __smp_processor_id.

    In the new code, there are two externally visible symbols:

    - smp_processor_id(): debug variant.

    - raw_smp_processor_id(): nondebug variant. Replaces all existing
    uses of _smp_processor_id() and __smp_processor_id(). Defined
    by every SMP architecture in include/asm-*/smp.h.

    There is one new internal symbol, dependent on DEBUG_PREEMPT:

    - debug_smp_processor_id(): internal debug variant, mapped to
    smp_processor_id().
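
    The naming scheme can be illustrated with a toy user-space model. The
    model_ prefix, the global variables and the kind of check the debug
    variant performs (counting calls made while preemption is enabled) are
    all assumptions for illustration; only the three-symbol layout comes from
    the description above.

```c
/* Stand-in for CONFIG_DEBUG_PREEMPT in this model. */
#define MODEL_DEBUG_PREEMPT 1

/* Hypothetical global state standing in for per-CPU kernel state. */
int model_current_cpu;
int model_preempt_count;   /* > 0 means preemption is disabled */
int model_warnings;        /* incremented on a suspicious call */

/* Nondebug variant: just report the CPU, no checking. */
int model_raw_smp_processor_id(void)
{
    return model_current_cpu;
}

/* Internal debug variant: flag calls made while preemptible, since
 * the answer could be stale by the time the caller uses it. */
int model_debug_smp_processor_id(void)
{
    if (model_preempt_count == 0)
        model_warnings++;
    return model_raw_smp_processor_id();
}

/* Externally visible name maps to the debug or raw variant. */
#if MODEL_DEBUG_PREEMPT
#define model_smp_processor_id() model_debug_smp_processor_id()
#else
#define model_smp_processor_id() model_raw_smp_processor_id()
#endif
```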

    Also, I moved debug_smp_processor_id() from lib/kernel_lock.c into a new
    lib/smp_processor_id.c file. All related comments got updated and/or
    clarified.

    I have build/boot tested the following 8 .config combinations on x86:

    {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT}

    I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT. (Other
    architectures are untested, but should work just fine.)

    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

04 May, 2005

1 commit

  • A bunch of drivers use ISA DMA helpers or their equivalents for
    platforms that have ISA with different DMA controller (a lot of ARM
    boxen). Currently there is no way to put such dependency in Kconfig -
    CONFIG_ISA is not it (e.g. it is not set on platforms that have no ISA
    slots, but have on-board devices that pretend to be ISA ones).

    New symbol added - ISA_DMA_API. Set when we have functional
    enable_dma()/set_dma_mode()/etc. set of helpers. Next patches in the
    series will add missing dependencies for drivers that need them.

    I'm very carefully staying the hell out of the recurring flamefest on
    what exactly CONFIG_ISA would mean in ideal world - added symbol has a
    well-defined meaning and for now I really want to treat it as completely
    independent from the mess around CONFIG_ISA.

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
     

01 May, 2005

1 commit


17 Apr, 2005

2 commits

  • This fixes u32 vs. pm_message_t confusion in remaining places. Fortunately
    there are few of them.

    Signed-off-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Machek
     
  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds