19 Jan, 2006

40 commits

  • The current implementation of boot_timer_handler isn't usable for s390. So I
    changed its name to do_boot_timer_handler, taking (struct sigcontext *)sc as
    argument. do_boot_timer_handler is called from the new boot_timer_handler() in
    arch/um/os-Linux/signal.c, which uses the same mechanism as the other signal
    handlers to find the sigcontext pointer.

    Signed-off-by: Bodo Stroesser
    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bodo Stroesser
     
  • The serial UML OS-abstraction layer patch (um/kernel dir).

    This moves all system calls from the time.c file under the os-Linux dir and
    joins the time.c and time_kernel.c files.

    Signed-off-by: Gennady Sharapov
    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gennady Sharapov
     
  • The serial UML OS-abstraction layer patch (um/kernel dir).

    This moves all system calls from the user_util.c file under the os-Linux dir.

    Signed-off-by: Gennady Sharapov
    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gennady Sharapov
     
  • s390 doesn't have an LDT. So MM_COPY_SEGMENTS will not be supported on s390.

    The only user of MM_COPY_SEGMENTS is new_mm(), but that's no longer useful, as
    arch/sys-i386/ldt.c defines init_new_ldt(), which is called immediately after
    new_mm(). So we should copy host's LDT in init_new_ldt(), if /proc/mm is
    available, to have this subarch specific call in subarch code.

    Signed-off-by: Bodo Stroesser
    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bodo Stroesser
     
  • Add implementations of the write* and __raw_write* functions. __raw_writel is
    needed by lib/iocopy.c, which shouldn't be used in UML, but which is
    unconditionally linked in anyway.

    Signed-off-by: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Move the interrupt check from slab_node into ___cache_alloc and add an
    "unlikely()" to avoid pipeline stalls on some architectures.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
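
The branch hint mentioned in the commit above can be sketched as follows. This is a minimal illustration, not the slab code itself: pick_node() and its parameters are hypothetical, and the macro assumes a GCC or Clang compiler that provides __builtin_expect.

```c
#include <stdbool.h>

/* Branch hint in the style of the kernel's unlikely(); assumes GCC/Clang. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical helper: choose the node for an allocation.
 * Interrupt context is the rare case, so it is marked unlikely and
 * the compiler can lay out the common path as the fall-through.
 */
static int pick_node(bool in_interrupt, int local_node, int policy_node)
{
    if (unlikely(in_interrupt))
        return local_node;   /* no task policy applies in interrupt context */
    return policy_node;
}
```

The hint changes only code layout and prediction, never the result of the test.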
     
  • This patch fixes a regression in 2.6.14 against 2.6.13 that causes an
    imbalance in memory allocation during bootup.

    The slab allocator in 2.6.13 is not numa aware and simply calls
    alloc_pages(). This means that memory policies may control the behavior of
    alloc_pages(). During bootup the memory policy is set to MPOL_INTERLEAVE
    resulting in the spreading out of allocations during bootup over all
    available nodes. The slab allocator in 2.6.13 has only a single list of
    slab pages. As a result the per cpu slab cache and the spinlock controlled
    page lists may contain slab entries from off node memory. The slab
    allocator in 2.6.13 makes no effort to discern the locality of an entry on
    its lists.

    The NUMA aware slab allocator in 2.6.14 controls locality of the slab pages
    explicitly by calling alloc_pages_node(). The NUMA slab allocator manages
    slab entries by having lists of available slab pages for each node. The
    per cpu slab cache can only contain slab entries associated with the node
    local to the processor. This guarantees that the default allocation mode
    of the slab allocator always assigns local memory if available.

    Setting MPOL_INTERLEAVE as a default policy during bootup has no effect
    anymore. In 2.6.14 all node unspecific slab allocations are performed on
    the boot processor. This means that most key data structures are
    allocated on one node. Most processors will have to refer to these
    structures making the boot node a potential bottleneck. This may reduce
    performance and cause unnecessary memory pressure on the boot node.

    This patch implements NUMA policies in the slab layer. The slab allocator
    itself needs to apply NUMA memory policies explicitly, since the NUMA slab
    allocator no longer lets the page allocator control locality.

    The check for policies is made directly at the beginning of __cache_alloc
    using current->mempolicy. The memory policy is already frequently checked
    by the page allocator (alloc_page_vma() and alloc_page_current()). So it
    is highly likely that the cacheline is present. For MPOL_INTERLEAVE
    kmalloc() will spread out each request to one node after another so that an
    equal distribution of allocations can be obtained during bootup.

    It is not possible to push the policy check to lower layers of the NUMA
    slab allocator since the per cpu caches now contain only slab entries
    from the current node. If the policy says that the local node is not to
    be preferred, or is forbidden, then there is no point in checking the
    slab cache or local list of slab pages. The allocation had better be
    directed immediately to the lists containing slab entries for the allowed
    set of nodes.

    This way of applying policy also fixes another strange behavior in 2.6.13.
    alloc_pages() is controlled by the memory allocation policy of the current
    process. It could therefore be that one process is running with
    MPOL_INTERLEAVE and would, for example, obtain a new page following that
    policy since no slab entries are in the lists anymore. A page can typically
    be used for multiple slab entries, but let's say that the current process
    is only using one. The other entries are then added to the slab lists.
    These are now non-local entries in the slab lists, despite the possible
    availability of local pages that would provide faster access and increase
    the performance of the application.

    Another process without MPOL_INTERLEAVE may now run and expect a local slab
    entry from kmalloc(). However, there are still these free slab entries
    from the off node page obtained from the other process via MPOL_INTERLEAVE
    in the cache. The process will then get an off node slab entry although
    other slab entries may be available that are local to that process. This
    means that the policy of one process may contaminate the locality of the
    slab caches for other processes.

    This patch in effect ensures that a per process policy is followed for the
    allocation of slab entries and that there cannot be a memory policy
    influence from one process to another. A process with default policy will
    always get a local slab entry if one is available. And the process using
    memory policies will get its memory arranged as requested. Off-node slab
    allocation will require the use of spinlocks and will make the use of per
    cpu caches not possible. A process using memory policies to redirect
    allocations offnode will have to cope with additional lock overhead in
    addition to the latency added by the need to access a remote slab entry.

    Changes V1->V2
    - Remove #ifdef CONFIG_NUMA by moving forward declaration into
    prior #ifdef CONFIG_NUMA section.

    - Give the function determining the node number to use a saner
    name.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
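
The interleave behaviour described in the commit above (each request goes to one node after another, while a default-policy task stays local) can be sketched like this. All names here (struct task_policy, slab_pick_node, MAX_NODES) are hypothetical stand-ins for illustration and are not the kernel's own identifiers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4              /* hypothetical node count */

/* Hypothetical per-task policy state. */
struct task_policy {
    int interleave;              /* non-zero: behaves like MPOL_INTERLEAVE */
    int next_node;               /* next node to hand out */
};

/*
 * Decide which node a slab allocation should come from.
 * With no policy (or a default one) the local node is used, so the
 * per cpu cache can serve the request. With interleave, requests are
 * spread round-robin over all nodes.
 */
static int slab_pick_node(struct task_policy *pol, int local_node)
{
    int node;

    assert(local_node >= 0 && local_node < MAX_NODES);
    if (pol == NULL || !pol->interleave)
        return local_node;

    node = pol->next_node;
    pol->next_node = (node + 1) % MAX_NODES;
    return node;
}
```

Because the decision is made per task before any cache is consulted, one task's interleave policy does not affect where another task's allocations land.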
     
  • Convert mm/swapfile.c's swapon_sem to swapon_mutex.

    Signed-off-by: Ingo Molnar
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • proc support for zone reclaim

    This patch creates a proc entry /proc/sys/vm/zone_reclaim_mode that may be
    used to override the automatic determination of the zone reclaim made on
    bootup.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Some bits for zone reclaim exist in 2.6.15 but they are not usable. This
    patch fixes them up, removes unused code and makes zone reclaim usable.

    Zone reclaim allows the reclaiming of pages from a zone if the number of
    free pages falls below the watermarks even if other zones still have enough
    pages available. Zone reclaim is of particular importance for NUMA
    machines. It can be more beneficial to reclaim a page than to take the
    performance penalties that come with allocating a page on a remote zone.

    Zone reclaim is enabled if the maximum distance to another node is higher
    than RECLAIM_DISTANCE, which may be defined by an arch. By default
    RECLAIM_DISTANCE is 20. 20 is the distance to another node in the same
    component (enclosure or motherboard) on IA64. The meaning of the NUMA
    distance information seems to vary by arch.

    If zone reclaim is not successful then no further reclaim attempts will
    occur for a certain time period (ZONE_RECLAIM_INTERVAL).

    This patch was discussed before. See

    http://marc.theaimsgroup.com/?l=linux-kernel&m=113519961504207&w=2
    http://marc.theaimsgroup.com/?l=linux-kernel&m=113408418232531&w=2
    http://marc.theaimsgroup.com/?l=linux-kernel&m=113389027420032&w=2
    http://marc.theaimsgroup.com/?l=linux-kernel&m=113380938612205&w=2

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
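
The enabling rule in the commit above (zone reclaim is turned on when the maximum distance to another node exceeds RECLAIM_DISTANCE) can be sketched as below. The flat distance table and the function name zone_reclaim_wanted are assumptions made for this example; only the default value 20 comes from the commit message.

```c
#include <assert.h>

#define RECLAIM_DISTANCE 20      /* default stated in the commit message */

/*
 * distance is a hypothetical nr_nodes x nr_nodes table stored row by row;
 * distance[from * nr_nodes + to] is the NUMA distance between two nodes.
 * Returns 1 if any other node is farther away than RECLAIM_DISTANCE.
 */
static int zone_reclaim_wanted(const int *distance, int nr_nodes, int from)
{
    int to, max = 0;

    assert(nr_nodes > 0 && from >= 0 && from < nr_nodes);
    for (to = 0; to < nr_nodes; to++) {
        if (to != from && distance[from * nr_nodes + to] > max)
            max = distance[from * nr_nodes + to];
    }
    return max > RECLAIM_DISTANCE;
}
```

With every remote node at distance 20 the comparison is false, so such a machine would leave zone reclaim off under this rule.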
     
  • Zone reclaim has a huge impact on NUMA performance (for example, our maximum
    throughput with XFS is raised from 4GB/sec to 6GB/sec; page cache
    contamination of NUMA nodes destroys locality if one just does a large copy
    operation, which results in performance dropping for good until reboot).

    This patch:

    Resurrect may_swap in struct scan_control

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Simplify migrate_page_add after feedback from Hugh. This also allows us to
    drop one parameter from migrate_page_add.

    Signed-off-by: Christoph Lameter
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Migration code currently does not take a reference to the target page
    properly, so between unlocking the pte and trying to take a new
    reference to the page with isolate_lru_page, anything could happen to
    it.

    Fix this by holding the pte lock until we get a chance to elevate the
    refcount.

    Other small cleanups while we're here.

    Signed-off-by: Nick Piggin
    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Ravikiran reports that this variable is bouncing all around nodes on NUMA
    machines, causing measurable performance problems. Fix that up by only
    writing to it when it actually changed.

    And put it in a new cacheline to prevent it sharing with other things (this
    happened).

    Signed-off-by: Ravikiran Thirumalai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
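
The two techniques in the commit above (write only on change, and give the variable its own cacheline) can be sketched as follows. The struct, the 64-byte line size, and the writes counter are assumptions for illustration; the counter exists only to make the skipped stores visible.

```c
#define CACHELINE 64             /* assumed cacheline size */

/*
 * Hypothetical shared variable, aligned so that it does not share a
 * cacheline with unrelated data. Assumes GCC/Clang attribute syntax.
 */
struct hot_flag {
    int value;
    unsigned long writes;        /* illustration only: counts real stores */
} __attribute__((aligned(CACHELINE)));

/*
 * Store only when the value actually changes. A redundant store would
 * still pull the cacheline into exclusive state on the writing CPU and
 * bounce it away from every other node that reads it.
 */
static void set_flag(struct hot_flag *f, int v)
{
    if (f->value != v) {
        f->value = v;
        f->writes++;
    }
}
```

Reading before writing costs a compare, but a shared read does not invalidate the line in other caches, which is the point of the change.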
     
  • Some pcnet32 hardware erroneously has the Vendor ID for Trident. The
    pcnet32 driver looks for the PCI ethernet class before grabbing the
    hardware, but the current trident driver does not check against the PCI
    audio class. This allows the trident driver to claim the pcnet32 hardware.
    This patch prevents that.

    This revised version of the OSS Trident patch includes PCI_DEVICE Macro
    usage.

    Signed-off-by: Jon Mason
    Signed-off-by: Muli Ben-Yehuda
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jon Mason
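
The class check described in the commit above can be sketched as below. This is not the driver's code: struct dev_ids and audio_driver_matches are hypothetical, and the example only relies on the standard PCI class codes for multimedia audio (0x0401) and Ethernet (0x0200) in the upper 16 bits of the 24-bit class field.

```c
#include <stdint.h>

#define CLASS_MULTIMEDIA_AUDIO 0x0401   /* PCI base class 0x04, subclass 0x01 */
#define CLASS_NETWORK_ETHERNET 0x0200   /* PCI base class 0x02, subclass 0x00 */

/* Hypothetical view of the IDs a driver matches against. */
struct dev_ids {
    uint16_t vendor;
    uint32_t class_code;                /* 24 bits: class, subclass, prog-if */
};

/*
 * Claim a device only if both the vendor ID and the device class match.
 * Matching on the vendor ID alone would also claim a network card that
 * wrongly reports the audio vendor's ID.
 */
static int audio_driver_matches(const struct dev_ids *d, uint16_t vendor)
{
    return d->vendor == vendor &&
           (d->class_code >> 8) == CLASS_MULTIMEDIA_AUDIO;
}
```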
     
  • Fix incorrect variable size used to hold register value. This bug might
    wipe out a portion of the TCR value when setting the interface options.

    Signed-off-by: Paul Fulghum
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Fulghum
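
The kind of bug fixed in the commit above can be sketched as follows. The commit does not state the register or variable widths, so the 16-bit register and 8-bit temporary here are assumptions chosen only to show how a too-narrow variable drops the upper bits during a read-modify-write.

```c
#include <stdint.h>

/*
 * Buggy read-modify-write: the temporary is narrower than the register
 * (widths assumed for illustration), so the upper byte is lost.
 */
static uint16_t set_option_buggy(uint16_t tcr, uint16_t bit)
{
    uint8_t val = (uint8_t)tcr;         /* truncates bits 8..15 */

    val |= (uint8_t)bit;
    return val;                         /* upper byte comes back as zero */
}

/* Fixed version: the temporary has the same width as the register. */
static uint16_t set_option_fixed(uint16_t tcr, uint16_t bit)
{
    uint16_t val = tcr;

    val |= bit;
    return val;
}
```

Starting from 0xA500 and setting bit 0, the buggy version returns 0x0001 while the fixed version returns 0xA501.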
     
  • On alpha:

    In file included from drivers/scsi/sym53c8xx_2/sym_glue.h:59,
    from drivers/scsi/sym53c8xx_2/sym_fw.c:40:
    include/scsi/scsi_transport_spi.h:57: error: field `dv_mutex' has incomplete type

    Cc: James Bottomley
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Fix a typo/mis-merge in one of the previous patches.

    Signed-off-by: Jan Beulich
    Signed-off-by: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Beulich
     
  • We have to check that the second checkpoint list is also non-empty before
    dropping the transaction.

    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • While checkpointing we have to check that our transaction is still in the
    checkpoint list *and* (not or) that it's not just a different transaction
    with the same address.

    Signed-off-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     
  • Linus Torvalds
     
  • Linus Torvalds
     
  • Linus Torvalds
     
  • Linus Torvalds
     
  • Based upon a report and preliminary patch from Jim Gifford.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Russell King
     
  • From: Eddie C. Dost

    I have the following patch for serial console over the RSC
    (remote system controller) on my E250 machine. It basically adds
    support for input-device=rsc and output-device=rsc from OBP, and
    allows 115200,8,n,1,- serial mode setting.

    Signed-off-by: David S. Miller

    Eddie C. Dost
     
  • Add an entry to MAINTAINERS for wireless networking, just so people
    know whom to bless with patches.

    Signed-off-by: John W. Linville
    Signed-off-by: David S. Miller

    John W. Linville
     
  • Correct location info for net-2.6 git tree.

    Signed-off-by: John W. Linville
    Signed-off-by: David S. Miller

    John W. Linville
     
  • Patch from David Vrabel

    Export ixp4xx_exp_bus_size so modules can use the IXP4XX_EXP_BUS_BASE(n) macro.

    Also, fix a printk format warning.

    Signed-off-by: David Vrabel
    Signed-off-by: Russell King

    David Vrabel
     
  • Patch from Nicolas Pitre

    Commit f4619025a51747a3788fd1bb6bdc46e368a889a7 broke the kernel
    decompressor (at least on PXA). Here's the fix.

    Signed-off-by: Nicolas Pitre
    Signed-off-by: Russell King

    Nicolas Pitre
     
  • Patch from Nicolas Pitre

    This is kernel provided user space code.

    Since a syscall is used, it has to be updated to work with EABI.

    Signed-off-by: Nicolas Pitre
    Signed-off-by: Russell King

    Nicolas Pitre
     
  • Patch from Nicolas Pitre

    The signal return path consists of user code provided by the kernel.
    Since a syscall is used, it has to be updated to work with EABI.

    Noticed by Daniel Jacobowitz.

    Signed-off-by: Nicolas Pitre
    Signed-off-by: Russell King

    Nicolas Pitre
     
  • Patch from Andrew Victor

    This patch fixes two small issues with 2.6.15-git12.

    1) Corrected major/minor numbers for ttyAT devices in the Kconfig help.
    (Patch from Karl Olsen)

    2) tty->flip.count has been removed.

    Signed-off-by: Andrew Victor
    Signed-off-by: Russell King

    Andrew Victor
     
  • Patch from David Vrabel

    PXA27x SSP controller has a few different registers, including SCR (serial clock rate) in SSCR0.

    Signed-off-by: David Vrabel
    Signed-off-by: Russell King

    David Vrabel
     
  • David S. Miller
     
  • 1) fix "mld_marksources()" to
    a) send nothing when all queried sources are excluded
    b) send full exclude report when some queried sources are
    not excluded
    c) don't schedule a timer when there's nothing to report

    2) fix "add_grec()" to send empty-source records when it should
    The original check doesn't account for a non-empty source
    list with all sources inactive; the new code keeps that
    short-circuit case, and also generates the group header
    with an empty list if needed.

    3) fix mca_crcount decrement to be after add_grec(), which needs
    its original value

    4) add/remove delete records and prevent current advertisements
    when an exclude-mode filter moves from "active" to "inactive"
    or vice versa based on new filter additions.

    Items 1-3 are just IPv4 versions of the IPv6 bugs found
    by Yan Zheng and fixed earlier. Item #4 is a related bug that
    affects exclude-mode change records only (but not queries) and
    also occurs in IPv6 (IPv6 version coming soon).

    Signed-off-by: David L Stevens
    Signed-off-by: David S. Miller

    David L Stevens
     
  • Don't assume 16.

    Found by Ben Greear.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Jean says he really doesn't have time to do much IRDA any more.
    The following would help motivate someone who has more time.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • Remove page refcount manipulations from cassini driver by using
    another field in struct page. Needed for lockless pagecache.

    Signed-off-by: Nick Piggin
    Signed-off-by: David S. Miller

    Nick Piggin