13 Mar, 2008

1 commit

  • Compared with kernel 2.6.24, the tbench result regresses with
    2.6.25-rc1.

    1) On a stoakley machine with 2 quad-core processors: 4%.
    2) On a tigerton machine with 4 quad-core processors: more than 30%.

    Bisecting located the patch below.

    b4ce92775c2e7ff9cf79cca4e0a19c8c5fd6287b is first bad commit
    commit b4ce92775c2e7ff9cf79cca4e0a19c8c5fd6287b
    Author: Herbert Xu
    Date: Tue Nov 13 21:33:32 2007 -0800

    [IPV6]: Move nfheader_len into rt6_info

    The dst member nfheader_len is only used by IPv6. It's also currently
    creating a rather ugly alignment hole in struct dst. Therefore this patch
    moves it from there into struct rt6_info.

    The above patch changes the cache line alignment, especially of member
    __refcnt. As a test, I added 2 unsigned longs of padding before
    lastuse, so that the 3 members lastuse/__refcnt/__use moved to the next
    cache line. Performance recovered.
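
    The effect of such padding can be checked with offsetof(). The
    following is only a sketch: the structs are hypothetical stand-ins,
    not the real struct dst_entry, and the 64-byte cache line is an
    assumption.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CACHE_LINE 64 /* assumed cache line size in bytes */

/* Hypothetical layout: read-mostly pointers followed by hot counters. */
struct demo_dst {
    void *read_mostly[7];   /* 56 bytes on LP64 */
    unsigned long lastuse;  /* lands on the same line as read_mostly on LP64 */
    int refcnt;
    int use;
};

/* Same members, with two unsigned longs of padding before lastuse. */
struct demo_dst_padded {
    void *read_mostly[7];
    unsigned long pad[2];
    unsigned long lastuse;
    int refcnt;
    int use;
};

/* Index of the cache line that contains the given byte offset. */
static size_t demo_cache_line_of(size_t offset)
{
    return offset / DEMO_CACHE_LINE;
}

/* Returns 1 if the three hot members of the padded struct share one line. */
static int demo_hot_members_share_line(void)
{
    size_t a = demo_cache_line_of(offsetof(struct demo_dst_padded, lastuse));
    size_t b = demo_cache_line_of(offsetof(struct demo_dst_padded, refcnt));
    size_t c = demo_cache_line_of(offsetof(struct demo_dst_padded, use));

    return a == b && b == c;
}

/* Usage: the padded layout keeps the hot members together. */
static void demo_check_layout(void)
{
    assert(demo_hot_members_share_line());
}
```

    On an LP64 target the unpadded demo_dst puts lastuse on the first
    line and refcnt on the second, while demo_dst_padded moves all three
    members onto the second line.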

    I created a patch to rearrange the members in struct dst_entry.

    Following suggestions from Eric and Valdis Kletnieks, I made a finer
    arrangement.

    1) Move tclassid under ops in case CONFIG_NET_CLS_ROUTE=y, so that
    sizeof(dst_entry)=200 whether CONFIG_NET_CLS_ROUTE is y or n. I
    tested many patches on my 16-core tigerton, moving tclassid to
    different places. It looks like tclassid can also affect
    performance. If tclassid is moved before metrics, or not moved at
    all, the performance isn't good. So I moved it behind metrics.

    2) Add comments before __refcnt.

    On 16-core tigerton:

    If CONFIG_NET_CLS_ROUTE=y, the result with the patch below is about 18%
    better than the result without it;

    If CONFIG_NET_CLS_ROUTE=n, the result with the patch below is about 30%
    better than the result without it.

    With 32bit 2.6.25-rc1 on 8-core stoakley, the new patch doesn't
    introduce a regression.

    Thanks to Eric, Valdis, and David!

    Signed-off-by: Zhang Yanmin
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Zhang Yanmin
     

11 Mar, 2008

2 commits


08 Mar, 2008

2 commits

  • Signed-off-by: Kirill A. Shutemov
    Signed-off-by: David S. Miller

    Kirill A. Shutemov
     
  • The current /proc/net is implemented with so-called "shadows", but
    this implementation is broken and has little chance of being fixed.

    The problem is that the dentry subtree of the /proc/net directory has
    fancy revalidation rules to make processes living in different
    net namespaces see different entries in the /proc/net subtree, but
    currently tasks see in the /proc/net subdir the contents of any
    other namespace, depending on who opened the file first.

    The proposed fix is to turn /proc/net into a symlink, which points
    to /proc/self/net, which in turn shows what previously was in
    /proc/net - the network-related info, from the net namespace the
    appropriate task lives in.

    # ls -l /proc/net
    lrwxrwxrwx 1 root root 8 Mar 5 15:17 /proc/net -> self/net
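
    From user space the same check can be made with readlink(2). This is
    a minimal sketch; the helper names are hypothetical and not part of
    the patch.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Copy the target of the symlink at `path` into `buf` (NUL-terminated).
 * Returns 0 on success, -1 if readlink() fails or the target may not fit.
 */
static int demo_link_target(const char *path, char *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL && len > 1);
    n = readlink(path, buf, len - 1);   /* readlink() does not add a NUL */
    if (n < 0)
        return -1;                      /* e.g. not a symlink */
    if ((size_t)n == len - 1)
        return -1;                      /* target possibly truncated */
    buf[n] = '\0';
    return 0;
}

/* Returns 1 if `path` is a symlink whose target is "self/net". */
static int demo_points_to_self_net(const char *path)
{
    char target[64];

    if (demo_link_target(path, target, sizeof(target)) != 0)
        return 0;
    return strcmp(target, "self/net") == 0;
}
```

    Given the ls output above, demo_points_to_self_net("/proc/net") would
    be expected to return 1 on a kernel that carries this change.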

    In other words - this behaves like /proc/mounts, but unlike
    "mounts", "net" is not a file, but a directory.

    Changes from v2:
    * Fixed discrepancy of /proc/net nlink count and selinux labeling
    screwup pointed out by Stephen.

    To get the correct nlink count the ->getattr callback for /proc/net
    is overridden to read one from the net->proc_net entry.

    To make selinux still work the net->proc_net entry is initialized
    properly, i.e. with the "net" name and the proc_net parent.

    Selinux fixes are
    Acked-by: Stephen Smalley

    Changes from v1:
    * Fixed a task_struct leak in get_proc_task_net, pointed out by Paul.

    Signed-off-by: Pavel Emelyanov
    Acked-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

06 Mar, 2008

2 commits


05 Mar, 2008

2 commits

  • If all of the entropy is in the local and foreign addresses,
    but xor'ing together would cancel out that entropy, the
    current hash performs poorly.

    Suggested by Cosmin Ratiu:

    Basically, the situation is as follows: There is a client
    machine and a server machine. Both create 15000 virtual
    interfaces, open up a socket for each pair of interfaces and
    do SIP traffic. By profiling I noticed that there is a lot of
    time spent walking the established hash chains with this
    particular setup.

    The addresses were distributed like this: client interfaces
    were 198.18.0.1/16 with increments of 1 and server interfaces
    were 198.18.128.1/16 with increments of 1. As I said, there
    were 15000 interfaces. Source and destination ports were 5060
    for each connection. So in this case, ports don't matter for
    hashing purposes, and the bits from the address pairs used
    cancel each other, meaning there are no differences in the
    whole lot of pairs, so they all end up in the same hash chain.
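
    The cancellation described above is easy to reproduce. The sketch
    below XORs the k-th client address with the k-th server address for
    the quoted ranges; the function names are made up for illustration
    and this is not the kernel's hash function.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CLIENT_BASE 0xC6120001u /* 198.18.0.1 */
#define DEMO_SERVER_BASE 0xC6128001u /* 198.18.128.1 */

/* XOR of the k-th client/server address pair. */
static uint32_t demo_xor_pair(uint32_t k)
{
    return (DEMO_CLIENT_BASE + k) ^ (DEMO_SERVER_BASE + k);
}

/* Returns 1 if the first n pairs all produce the same XOR value. */
static int demo_all_pairs_collide(uint32_t n)
{
    uint32_t first, k;

    assert(n > 0);
    first = demo_xor_pair(0);
    for (k = 1; k < n; k++) {
        if (demo_xor_pair(k) != first)
            return 0;
    }
    return 1;
}
```

    For all 15000 pairs the XOR is the constant 0x00008000, because the
    two ranges differ only in bit 15 and the increments never carry into
    it. Any hash that starts from that XOR sees a single input value.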

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Based upon a report by Andrew Morton and code analysis done
    by Jarek Poplawski.

    This reverts 33f807ba0d9259e7c75c7a2ce8bd2787e5b540c7 ("[NETPOLL]:
    Kill NETPOLL_RX_DROP, set but never tested.") and
    c7b6ea24b43afb5749cb704e143df19d70e23dea ("[NETPOLL]: Don't need
    rx_flags.").

    The rx_flags did get tested for zero vs. non-zero and therefore we do
    need those tests and that code which sets NETPOLL_RX_DROP et al.

    Signed-off-by: David S. Miller

    David S. Miller
     

29 Feb, 2008

2 commits


28 Feb, 2008

1 commit

  • Properly add parens around the macro argument. This is not needed by
    the kernel but the macro is exported to userspace, so it shouldn't
    make any assumptions.

    Also use NF_VERDICT_BITS instead of NF_VERDICT_QBITS for the left-shift
    since that's what's logically correct.
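
    Why the parentheses matter to outside users of such a macro can be
    seen in a small sketch. The DEMO_* macros below are hypothetical and
    only mimic the shift-and-mask shape; they are not the exported
    header.

```c
#include <assert.h>

#define DEMO_BITS  16
#define DEMO_QMASK 0xffff0000u
#define DEMO_QUEUE 3u

/* Argument used bare: operators inside `x` can bind to the shift wrongly. */
#define DEMO_QUEUE_NR_UNSAFE(x) (((x << DEMO_BITS) & DEMO_QMASK) | DEMO_QUEUE)

/* Argument parenthesized: `x` is evaluated as a unit before the shift. */
#define DEMO_QUEUE_NR_SAFE(x) ((((x) << DEMO_BITS) & DEMO_QMASK) | DEMO_QUEUE)

/* Expands to (1u | 2u << 16): only the 2u is shifted. */
static unsigned int demo_unsafe(void)
{
    return DEMO_QUEUE_NR_UNSAFE(1u | 2u);
}

/* Expands to ((1u | 2u) << 16): the whole argument is shifted. */
static unsigned int demo_safe(void)
{
    return DEMO_QUEUE_NR_SAFE(1u | 2u);
}

/* Usage: the two expansions disagree for the same argument. */
static void demo_macro_check(void)
{
    assert(demo_safe() == 0x00030003u);
    assert(demo_unsafe() == 0x00020003u);
}
```

    A caller that always passes a plain constant or variable never sees
    the difference, which is why the unparenthesized form can go
    unnoticed inside one code base.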

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

27 Feb, 2008

1 commit


25 Feb, 2008

3 commits


24 Feb, 2008

19 commits

  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
    libata-core: fix kernel-doc warning
    sata_fsl: fix build with ATA_VERBOSE_DEBUG
    [libata] ahci: AMD SB700/SB800 SATA support 64bit DMA
    libata-pmp: clear hob for pmp register accesses
    libata: automatically use DMADIR if drive/bridge requires it
    power_state: get rid of write-only variable in SATA
    pata_atiixp: Use 255 sector limit

    Linus Torvalds
     
  • Back in 2.6.17-rc2, a libata module parameter was added for atapi_dmadir.

    That's nice, but most SATA devices which need it will tell us about it
    in their IDENTIFY PACKET response, as bit-15 of word-62 of the
    returned data (as per ATA7, ATA8 specifications).

    So for those which specify it, we should automatically use the DMADIR bit.
    Otherwise, disc writing will fail by default on many SATA-ATAPI drives.

    This patch adds ATA_DFLAG_DMADIR and makes ata_dev_configure() set it
    if atapi_dmadir is set or identify data indicates DMADIR is necessary.
    atapi_xlat() is converted to check ATA_DFLAG_DMADIR before setting
    DMADIR.

    Original patch is from Mark Lord.

    Signed-off-by: Tejun Heo
    Cc: Mark Lord
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (37 commits)
    [NETFILTER]: fix ebtable targets return
    [IP_TUNNEL]: Don't limit the number of tunnels with generic name explicitly.
    [NET]: Restore sanity wrt. print_mac().
    [NEIGH]: Fix race between neighbor lookup and table's hash_rnd update.
    [RTNL]: Validate hardware and broadcast address attribute for RTM_NEWLINK
    tg3: ethtool phys_id default
    [BNX2]: Update version to 1.7.4.
    [BNX2]: Disable parallel detect on an HP blade.
    [BNX2]: More 5706S link down workaround.
    ssb: Fix support for PCI devices behind a SSB->PCI bridge
    zd1211rw: fix sparse warnings
    rtl818x: fix sparse warnings
    ssb: Fix pcicore cardbus mode
    ssb: Make the GPIO API reentrancy safe
    ssb: Fix the GPIO API
    ssb: Fix watchdog access for devices without a chipcommon
    ssb: Fix serial console on new bcm47xx devices
    ath5k: Fix build warnings on some 64-bit platforms.
    WDEV, ath5k, don't return int from bool function
    WDEV: ath5k, fix lock imbalance
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
    [SPARC64]: make IOMMU code respect the segment boundary limits
    [SPARC64]: Fix cpu trampoline et al. mismatch warnings.
    [SPARC64]: More sparse warning fixes in process.c
    [SPARC64]: Fix sparse warning wrt. fault_in_user_windows.
    [SPARC64]: Kill show_regs32().
    [SPARC64]: Fix sparse warnings wrt. __show_regs().
    [SPARC64]: Kill show_stackframe{,32}().
    [SPARC64]: Fix sparse warnings wrt. machine_alt_power_off().

    Linus Torvalds
     
  • Use the added dev_alloc_name() call to create tunnel device name,
    rather than iterate in a hand-made loop with an artificial limit.

    Thanks Patrick for noticing this.

    [ The way this works is, when the device is actually registered,
    the generic code notices the '%' in the name and invokes
    dev_alloc_name() to fully resolve the name. -DaveM ]
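
    The idea of resolving a name template can be illustrated in user
    space. The sketch below is not how dev_alloc_name() is implemented;
    the helper names and the "list of used names" representation are
    assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if `name` appears in the `used` list of n entries. */
static int demo_name_in_use(const char *name, const char *const *used,
                            size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(name, used[i]) == 0)
            return 1;
    }
    return 0;
}

/*
 * Resolve a template containing a single "%d" (e.g. "tunl%d") to the first
 * name that is not in `used`. Returns the chosen index, or -1 on failure.
 * Among indices 0..n at least one is free, so no fixed limit is needed.
 */
static int demo_alloc_name(const char *tmpl, const char *const *used,
                           size_t n, char *out, size_t len)
{
    int i;

    assert(strstr(tmpl, "%d") != NULL);
    for (i = 0; (size_t)i <= n; i++) {
        int w = snprintf(out, len, tmpl, i);

        if (w < 0 || (size_t)w >= len)
            return -1;          /* formatting error or name too long */
        if (!demo_name_in_use(out, used, n))
            return i;
    }
    return -1;
}
```

    With "tunl0" and "tunl1" already taken, demo_alloc_name("tunl%d", ...)
    picks index 2 and writes "tunl2" into the output buffer.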

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • MAC_FMT had only one user and we tried to get rid of
    that, but this created more problems than it solved.

    As a result, this reverts three commits:

    235365f3aaaa10b7056293877c0ead50425f25c7 ("net/8021q/vlan_dev.c: Use
    print_mac."), fea5fa875eb235dc186b1f5184eb36abc63e26cc ("[NET]: Remove
    MAC_FMT"), and 8f789c48448aed74fe1c07af76de8f04adacec7d ("[NET]:
    Elminate spurious print_mac() calls.")

    Signed-off-by: David S. Miller

    David S. Miller
     
  • - replace old name 'cont' with 'cgrp' (Paul Menage did this cleanup for
    cgroup.c in commit bd89aabc6761de1c35b154fe6f914a445d301510)
    - remove a duplicate declaration of cgroup_path()

    Signed-off-by: Li Zefan
    Acked-by: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • fix:
    - comments about need_forkexit_callback
    - comments about release agent
    - typo and comment style, etc.

    Signed-off-by: Li Zefan
    Acked-by: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • - add missing file and declare.
    - remove unused file and macros.
    - some cleanup.

    Signed-off-by: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yoshinori Sato
     
  • get_user const *ptr access fix.

    Signed-off-by: Yoshinori Sato
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yoshinori Sato
     
  • Not all architectures implement futex_atomic_cmpxchg_inatomic(). The default
    implementation returns -ENOSYS, which is currently not handled inside of the
    futex guts.

    Futex PI calls and robust list exits with a held futex result in an endless
    loop in the futex code on architectures which have no support.

    Fixing up every place where futex_atomic_cmpxchg_inatomic() is called would
    add a fair amount of extra if/else constructs to the already complex code. It
    is also not possible to disable the robust feature before user space tries to
    register robust lists.

    Compile time disabling is not a good idea either, as there are already
    architectures with runtime detection of futex_atomic_cmpxchg_inatomic support.

    Detect the functionality at runtime instead by calling
    cmpxchg_futex_value_locked() with a NULL pointer from the futex initialization
    code. This is guaranteed to fail, but the call of
    futex_atomic_cmpxchg_inatomic() happens with pagefaults disabled.

    On architectures that use the asm-generic implementation or have runtime
    CPU feature detection, a -ENOSYS return value disables the PI/robust features.

    On architectures with a working implementation the call returns -EFAULT and
    the PI/robust features are enabled.

    When the detection fails, the relevant syscalls return -ENOSYS and the
    robust list exit code is blocked.
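
    The probing scheme can be sketched in user space with two stand-in
    implementations. Everything named demo_* is hypothetical, and the
    return convention is simplified (0 on success); only the use of
    -ENOSYS and -EFAULT follows the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Signature of a hypothetical compare-and-exchange on a user address. */
typedef int (*demo_cmpxchg_fn)(unsigned int *uaddr, unsigned int oldval,
                               unsigned int newval);

/* Stand-in for an architecture without support: always -ENOSYS. */
static int demo_cmpxchg_unsupported(unsigned int *uaddr, unsigned int oldval,
                                    unsigned int newval)
{
    (void)uaddr;
    (void)oldval;
    (void)newval;
    return -ENOSYS;
}

/* Stand-in for a working implementation: a NULL address faults. */
static int demo_cmpxchg_working(unsigned int *uaddr, unsigned int oldval,
                                unsigned int newval)
{
    if (uaddr == NULL)
        return -EFAULT;
    if (*uaddr == oldval)
        *uaddr = newval;
    return 0;
}

/* Probe with NULL: -EFAULT means supported, anything else means not. */
static int demo_detect_cmpxchg(demo_cmpxchg_fn fn)
{
    assert(fn != NULL);
    return fn(NULL, 0, 0) == -EFAULT;
}

/* Analog of a PI/robust entry point that is blocked when detection fails. */
static int demo_pi_syscall(int cmpxchg_enabled)
{
    return cmpxchg_enabled ? 0 : -ENOSYS;
}
```

    The probe runs once at initialization, so the hot paths only test a
    flag instead of handling -ENOSYS at every call site.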

    Fixes http://lkml.org/lkml/2008/2/11/149
    Originally reported by: Lennert Buytenhek

    Signed-off-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Cc: Lennert Buytenhek
    Cc: Riku Voipio
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Gleixner
     
  • Merge include/linux/efs_fs{_i,_dir}.h into fs/efs/efs.h. efs_vh.h remains
    there because this is the IRIX volume header and shouldn't really be
    handled by efs but by the partitioning code. efs_sb.h remains there for
    now because it's exported to userspace. Of course this is wrong and aboot
    should have a copy of its own, but I'll leave that to a separate patch to
    avoid any contention.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Fixes a sequencing bug in spi driver pxa2xx_spi.c in which the chip select
    for a transfer may be asserted before the clock polarity is set on the
    interface. As a result of this bug, the clock signal may have the wrong
    polarity at transfer start, so it may need to make an extra half transition
    before the intended clock/data signals begin. (This probably means all
    transfers are one bit out of sequence.)

    This only occurs on the first transfer following a change in clock polarity
    in systems using more than one such polarity. The fix ensures that the
    clock mode is properly set before asserting chip select.

    This bug was introduced in a patch merged on 2006/12/10, kernel 2.6.20.
    The patch defines an additional bit in: include/asm-arm/arch-pxa/regs-ssp.h
    for 2.6.25 and newer kernels but this addition must be made in:
    include/asm-arm/arch-pxa/pxa-regs.h for kernels between 2.6.20 and 2.6.24,
    inclusive.

    Signed-off-by: Ned Forrester
    Signed-off-by: David Brownell
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ned Forrester
     
  • Make is_vmalloc_addr() contingent on CONFIG_MMU=y, as it won't compile
    in !MMU mode.

    [ Bug introduced in commit 9e2779fa281cfda13ac060753d674bbcaa23367e:
    "is_vmalloc_addr(): Check if an address is within the vmalloc
    boundaries" ].
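
    One way to make such a helper depend on a configuration symbol is to
    guard its body, as in the sketch below. The DEMO_* symbols and the
    address range are made up; this is an illustration of the pattern,
    not the kernel's definition.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_CONFIG_MMU 1            /* remove to model a !MMU build */
#define DEMO_VMALLOC_START 0x1000UL  /* made-up range for the example */
#define DEMO_VMALLOC_END   0x2000UL

/* Returns 1 if x lies in [DEMO_VMALLOC_START, DEMO_VMALLOC_END). */
static int demo_is_vmalloc_addr(const void *x)
{
#ifdef DEMO_CONFIG_MMU
    uintptr_t addr = (uintptr_t)x;

    return addr >= DEMO_VMALLOC_START && addr < DEMO_VMALLOC_END;
#else
    /* Without an MMU there is no such range, so nothing matches. */
    (void)x;
    return 0;
#endif
}

/* Usage: the range is half-open. */
static void demo_vmalloc_check(void)
{
    assert(demo_is_vmalloc_addr((const void *)(uintptr_t)0x1800UL));
    assert(!demo_is_vmalloc_addr((const void *)(uintptr_t)0x2000UL));
}
```

    Guarding the body keeps callers compiling in both configurations,
    since the range macros are only referenced when the symbol is set.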

    Signed-off-by: David Howells
    Cc: Greg Ungerer
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • Fix build failure on sparc:

    In file included from include/linux/mm.h:39,
    from include/linux/memcontrol.h:24,
    from include/linux/swap.h:8,
    from include/linux/suspend.h:7,
    from init/do_mounts.c:6:
    include/asm/pgtable.h:344: warning: parameter names (without
    types) in function declaration
    include/asm/pgtable.h:345: warning: parameter names (without
    types) in function declaration
    include/asm/pgtable.h:346: error: expected '=', ',', ';', 'asm' or
    '__attribute__' before '___f___swp_entry'

    viro sayeth:

    I've run allmodconfig builds on a bunch of targets, FWIW (essentially the
    same patch). Note that these includes are a recent addition caused by an added
    inline function that had since then become a define. So while I agree with
    your comments in general, in _this_ case it's pretty safe.

    The commit that had done it is 3062fc67dad01b1d2a15d58c709eff946389eca4
    ("memcontrol: move mm_cgroup to header file") and the switch to #define
    is in commit 60c12b1202a60eabb1c61317e5d2678fcea9893f ("memcontrol: add
    vm_match_cgroup()") (BTW, that probably warranted mentioning in the
    changelog of the latter).

    Cc: Adrian Bunk
    Cc: Robert Reif
    Signed-off-by: David Rientjes
    Cc: "David S. Miller"
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • Define SO_MARK for MN10300.

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • Define HZ as a config option.

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • 2.6.25-rc1 percpu changes broke CONFIG_DEBUG_PREEMPT's per_cpu checking
    on several architectures. On s390, sparc64 and x86 it's been weakened to
    not checking at all; whereas on powerpc64 it's become too strict, issuing
    warnings from __raw_get_cpu_var in io_schedule and init_timer for example.

    Fix this by weakening powerpc's __my_cpu_offset to use the non-checking
    local_paca instead of get_paca (which itself contains such a check);
    and strengthening the generic my_cpu_offset to go the old slow way via
    smp_processor_id when CONFIG_DEBUG_PREEMPT (debug_smp_processor_id is
    where all the knowledge of what's correct when lives).

    Signed-off-by: Hugh Dickins
    Reviewed-by: Mike Travis
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • During the last step of hibernation in the "platform" mode (with the
    help of ACPI) we use the suspend code, including the devices'
    ->suspend() methods, to prepare the system for entering the ACPI S4
    system sleep state.

    But at least for some devices the operations performed by the
    ->suspend() callback in that case must be different from its operations
    during regular suspend.

    For this reason, introduce the new PM event type PM_EVENT_HIBERNATE and
    pass it to the device drivers' ->suspend() methods during the last phase
    of hibernation, so that they can distinguish this case and handle it as
    appropriate. Modify the drivers that handle PM_EVENT_SUSPEND in a
    special way and need to handle PM_EVENT_HIBERNATE in the same way.
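
    For a driver that special-cases suspend, treating the new event the
    same way is typically a matter of adding a case label. The sketch
    below uses hypothetical DEMO_* enums and a made-up notion of device
    state; it is not taken from any driver touched by this patch.

```c
#include <assert.h>

enum demo_pm_event {
    DEMO_EVENT_FREEZE,
    DEMO_EVENT_SUSPEND,
    DEMO_EVENT_HIBERNATE
};

enum demo_dev_state {
    DEMO_DEV_ON,
    DEMO_DEV_QUIESCED,
    DEMO_DEV_LOW_POWER
};

/* Hypothetical ->suspend()-style callback: returns the resulting state. */
static enum demo_dev_state demo_suspend(enum demo_pm_event event)
{
    switch (event) {
    case DEMO_EVENT_SUSPEND:
    case DEMO_EVENT_HIBERNATE:  /* last phase of hibernation: same path */
        return DEMO_DEV_LOW_POWER;
    case DEMO_EVENT_FREEZE:
        return DEMO_DEV_QUIESCED;
    }
    return DEMO_DEV_ON;         /* unknown event: leave the device alone */
}

/* Usage: suspend and hibernate end in the same state, freeze does not. */
static void demo_suspend_check(void)
{
    assert(demo_suspend(DEMO_EVENT_HIBERNATE) ==
           demo_suspend(DEMO_EVENT_SUSPEND));
    assert(demo_suspend(DEMO_EVENT_FREEZE) == DEMO_DEV_QUIESCED);
}
```

    A driver that needs different behavior in the two cases would split
    the shared label into two branches, which is what a distinct event
    value makes possible.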

    These changes are necessary to fix a hibernation regression related
    to the i915 driver (ref. http://lkml.org/lkml/2008/2/22/488).

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Tested-by: Jeff Chua
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

23 Feb, 2008

1 commit


22 Feb, 2008

4 commits

  • Fix macro argument substitution in PageHead() and PageTail() - 'page' should
    have brackets surrounding it (commit 6d7779538f765963ced45a3fa4bed7ba8d2c277d).

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
     
  • * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (26 commits)
    PM: Make suspend_device() static
    PCI ACPI: Fix comment describing acpi_pci_choose_state
    Hibernation: Handle DEBUG_PAGEALLOC on x86
    ACPI: fix build warning
    ACPI: TSC breaks atkbd suspend
    ACPI: remove is_processor_present prototype
    acer-wmi: Add DMI match for mail LED on Acer TravelMate 4200 series
    ACPI: sparse fix, replace macro with static function
    ACPI: thinkpad-acpi: add tablet-mode reporting
    ACPI: thinkpad-acpi: minor hotkey_radio_sw fixes
    ACPI: thinkpad-acpi: improve thinkpad-acpi input device documentation
    ACPI: thinkpad-acpi: issue input events for tablet swivel events
    ACPI: thinkpad-acpi: make the video output feature optional
    ACPI: thinkpad-acpi: synchronize input device switches
    ACPI: thinkpad-acpi: always track input device open/close
    ACPI: thinkpad-acpi: trivial fix to documentation
    ACPI: thinkpad-acpi: trivial fix to module_desc typo
    intel_menlo: extract return values using PTR_ERR
    ACPI video: check for error from thermal_cooling_device_register
    ACPI thermal: extract return values using PTR_ERR
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/pci-2.6:
    PCI: Fix wrong reference counter check for proc_dir_entry
    PCI: fix up setup-bus.c #ifdef
    PCI: don't load acpi_php when acpi is disabled
    PCI: quirks: set 'En' bit of MSI Mapping for devices onHT-based nvidia platform
    PCI: kernel-doc: fix pci-acpi warning
    PCI: irq: patch for Intel ICH10 DeviceID's
    PCI: pci_ids: patch for Intel ICH10 DeviceID's
    PCI: AMD SATA IDE mode quirk
    PCI: drivers/pcmcia/i82092.c: fix up after pci_bus_region changes
    PCI: hotplug: acpiphp_ibm: Remove get device information

    Linus Torvalds
     
  • * 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm:
    [ARM] 4835/1: Fix stale comment in struct machine_desc description
    [ARM] 4829/1: add .get method to pxa-cpufreq to silence a warning
    [ARM] 4828/1: fix 3 warnings in drivers/video/pxafb.c
    [ARM] 4827/1: fix two warnings in drivers/i2c/busses/i2c-pxa.c
    [ARM] 4826/1: Orion: Register the RTC interrupt on the TS-209
    [ARM] pxa: fix clock lookup to find specific device clocks

    Linus Torvalds