22 Mar, 2008

20 commits

  • Mark "hdx=remap" and "hdx=remap63" kernel parameters as obsoleted
    (they are a layering violation and should be dealt with in the same
    way as done by libata - device-mapper should be used instead).

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • Mark "hdx=[driver_name]" and "hdx=scsi" kernel parameters as obsoleted
    (nowadays device-driver binding can be changed at runtime through sysfs
    and it can also be dealt with using per device driver parameters).

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • * "hdx=cyls,heads,sects,wpcom,irq" should be "hdx=cyls,heads,sects".

    * "hdx=" is for "x" from 'a' to 'u', "idex=" is for "x" from '0' to '9'.

    * "idex=noautotune" is long gone.

    * Obsoleted "ide0=" parameters were already removed from the documentation.

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • Mark "ide0=ali14xx|cmd640_vlb|dtc2278|ht6560b|qd65xx|umc8672" kernel
    parameters as obsoleted (per host driver replacements have been available
    for a long time).

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • …linux-2.6-sched-devel

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mingo/linux-2.6-sched-devel:
    sched: add arch_update_cpu_topology hook.
    sched: add exported arch_reinit_sched_domains() to header file.
    sched: remove double unlikely from schedule()
    sched: cleanup old and rarely used 'debug' features.

    Linus Torvalds
     
  • so use nodedata_phys directly.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Yinghai Lu
     
  • Fix wrong function name and references to non-x86 architectures.

    Signed-off-by: Matti Linnanvuori mattilinnanvuori@yahoo.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Matti Linnanvuori
     
  • fix the bug reported here:

    http://bugzilla.kernel.org/show_bug.cgi?id=10232

    use update_memory_range() instead of add_memory_range() directly
    to avoid closing the gap.

    ( the new code only affects and runs on systems where the MTRR
    workaround triggers. )

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Yinghai Lu
     
  • We have seen a small problem when rebooting the Dell Optiplex 745 with the
    0KW626 board. Here is a small patch enabling reboot with this board,
    which forces the default reboot path into the BIOS reboot mode.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Heinz-Ado Arnolds
     
  • The fault_msg text is not explicitly NUL-terminated now in startup
    assembly. Do so by converting .ascii to .asciz.

    Signed-off-by: Jiri Slaby
    Cc: H. Peter Anvin
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Jiri Slaby
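
The difference between .ascii and .asciz has a close analogue in C array initializers. The following is a minimal sketch in C, not the startup assembly itself; the array names and the message text are placeholders chosen for illustration.

```c
#include <stddef.h>

/* Like .ascii: exactly 5 bytes are stored, with no terminating NUL. */
static const char msg_ascii[5] = "fault";

/* Like .asciz: the compiler appends a terminating NUL, giving 6 bytes. */
static const char msg_asciz[] = "fault";

/* Size helpers so the difference can be checked at run time. */
static size_t ascii_size(void) { return sizeof(msg_ascii); }
static size_t asciz_size(void) { return sizeof(msg_asciz); }
```

Code that walks msg_ascii until it finds a NUL would read past the end of the array, which is the kind of problem an explicit terminator avoids.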
     
  • aperture_64.c takes a piece of memory and makes it into iommu
    window... but such window may not be saved by swsusp -- that leads to
    oops during hibernation.

    Signed-off-by: Pavel Machek
    Acked-by: "Rafael J. Wysocki"
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Pavel Machek
     
  • this patch allows hpet=force on nVidia nForce 430 southbridge.
    This patch was tested by me on my old Asus A8N-VM CSM (where bios does not
    support hpet and does not advertise it via acpi entry). My nForce430 version:
    lspci -nn | grep LPC
    00:0a.0 ISA bridge [0601]: nVidia Corporation MCP51 LPC Bridge [10de:0260]
    (rev a2)

    Kernel 2.6.24.3 after patching and using hpet=force reports this:
    dmesg | grep -i hpet
    Kernel command line: root=/dev/sda8 ro vga=773 video=vesafb:mtrr:4,ywrap
    vt.default_utf8=0 hpet=force
    Force enabled HPET at base address 0xfed00000
    hpet clockevent registered
    Time: hpet clocksource has been installed.

    grep -i hpet /proc/timer_list
    Clock Event Device: hpet
    set_next_event: hpet_legacy_next_event
    set_mode: hpet_legacy_set_mode

    grep Clock /proc/timer_list (before patching)
    Clock Event Device: pit
    Clock Event Device: lapic

    grep Clock /proc/timer_list (after patching)
    Clock Event Device: hpet
    Clock Event Device: lapic

    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Zbigniew Luszpinski
     
  • A system with 256 GB of RAM crashes in the following way when NUMA is
    disabled:

    Your BIOS doesn't leave a aperture memory hole
    Please enable the IOMMU option in the BIOS setup
    This costs you 64 MB of RAM
    Cannot allocate aperture memory hole (ffff8101c0000000,65536K)
    Kernel panic - not syncing: Not enough memory for aperture
    Pid: 0, comm: swapper Not tainted 2.6.25-rc4-x86-latest.git #33

    Call Trace:
    [] panic+0xb2/0x190
    [] ? release_console_sem+0x7c/0x250
    [] ? __alloc_bootmem_nopanic+0x48/0x90
    [] ? free_bootmem+0x29/0x50
    [] gart_iommu_hole_init+0x5e7/0x680
    [] ? alloc_large_system_hash+0x16b/0x310
    [] ? _etext+0x0/0x1
    [] pci_iommu_alloc+0x1c/0x40
    [] mem_init+0x45/0x1a0
    [] start_kernel+0x295/0x380
    [] _sinittext+0x1c2/0x230

    The root cause is that the memmap PMD is too big,
    [ffffe200e0600000-ffffe200e07fffff] PMD ->ffff81383c000000 on node 0
    almost 4G..., and vmemmap_alloc_block will use up the RAM under 4G.

    Possible solutions:
    1. make memmap allocation get memory above 4G...
    2. reserve some dma32 range early, before we try to set up memmap for all,
    and release it before pci_iommu_alloc, so gart or swiotlb can reliably get
    some range under the 4G limit.

    The patch uses method 2, because method 1 may need more code to handle
    SPARSEMEM and SPARSEMEM_VMEMMAP.

    With the patch, the boot log shows:
    Your BIOS doesn't leave a aperture memory hole
    Please enable the IOMMU option in the BIOS setup
    This costs you 64 MB of RAM
    Mapping aperture over 65536 KB of RAM @ 4000000
    Memory: 264245736k/268959744k available (8484k kernel code, 4187464k reserved, 4004k data, 724k init)

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Yinghai Lu
     
  • We recently got some of the "Desktop Form Factor" Optiplex 745's in. I
    noticed that there's an entry for the SFF ones, but the BIOS model number
    of the DFF differs from that of the SFF. We have been reliably
    experiencing the same (as far as I can tell) reboot bug as the SFF boxes.

    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Coleman Kane
     
  • Fix visws printk format warnings:

    /local/linsrc/linux-2.6.24-git15/arch/x86/mach-visws/traps.c:50: warning: format '%#lx' expects type 'long unsigned int', but argument 2 has type 'u32'
    /local/linsrc/linux-2.6.24-git15/arch/x86/mach-visws/traps.c:50: warning: format '%#lx' expects type 'long unsigned int', but argument 3 has type 'u32'

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Randy Dunlap
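
The warnings above come from pairing a `long` conversion with a 32-bit argument. The sketch below shows two common ways to make format and argument agree in user-space C. It does not claim to reproduce the actual change to traps.c, and the function names are hypothetical.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Option 1: use the conversion that matches a 32-bit value.
 * PRIx32 expands to the right length modifier for uint32_t. */
static int format_u32_hex(char *buf, size_t len, uint32_t value)
{
    return snprintf(buf, len, "%#" PRIx32, value);
}

/* Option 2: keep "%#lx" and widen the argument explicitly. */
static int format_u32_as_long(char *buf, size_t len, uint32_t value)
{
    return snprintf(buf, len, "%#lx", (unsigned long)value);
}
```

Both variants produce the same text for any 32-bit value; which one is preferable depends on whether the format string or the call site is easier to change.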
     
  • Clean up: eliminate some compiler noise on x86 when building with strict
    warnings enabled, introduced by commit 345b904c.

    In file included from include2/asm/thread_info_64.h:12,
    from include2/asm/thread_info.h:4,
    from
    /home/cel/src/linux/nfs-2.6/include/linux/thread_info.h:35,
    from
    /home/cel/src/linux/nfs-2.6/include/linux/preempt.h:9,
    from
    /home/cel/src/linux/nfs-2.6/include/linux/spinlock.h:49,
    from /home/cel/src/linux/nfs-2.6/include/linux/mmzone.h:7,
    from /home/cel/src/linux/nfs-2.6/include/linux/gfp.h:4,
    from /home/cel/src/linux/nfs-2.6/include/linux/slab.h:14,
    from /home/cel/src/linux/nfs-2.6/fs/nfsd/nfs4acl.c:40:
    include2/asm/page.h:55: warning: `inline' is not at beginning of
    declaration
    include2/asm/page.h:61: warning: `inline' is not at beginning of
    declaration

    Signed-off-by: Chuck Lever
    Cc: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Chuck Lever
     
  • mm/slub.c: In function 'slab_alloc':
    mm/slub.c:1637: warning: assignment makes pointer from integer without a cast
    mm/slub.c:1637: warning: assignment makes pointer from integer without a cast
    mm/slub.c: In function 'slab_free':
    mm/slub.c:1796: warning: assignment makes pointer from integer without a cast
    mm/slub.c:1796: warning: assignment makes pointer from integer without a cast

    A cast is needed in the 386 and 486 code because the type is a pointer. In
    every other integer case the original cmpxchg code (and the cmpxchg_local
    which has been copied from it) worked fine, but since we touch a pointer,
    the type needs to be cast in the cmpxchg_local and cmpxchg macros.

    The more recent code (586+) does not have this problem (the cast is already
    there).

    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Christoph Lameter
    Cc: Vegard Nossum
    Cc: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Mathieu Desnoyers
     
  • When NUMA is disabled I get this compile warning:

    arch/x86/kernel/setup64.c: In function setup_per_cpu_areas:
    arch/x86/kernel/setup64.c:147: warning: the address of
    contig_page_data will always evaluate as true

    It seems we missed checking whether the node is online before we refer
    to NODE_DATA. Fix it.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Yinghai Lu
     
  • memory-less node support:

    this patch uses updated dev_to_node, because dev_to_node already makes sure
    it returns an online node.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Yinghai Lu
     

21 Mar, 2008

20 commits

  • Will be called each time the scheduling domains are rebuilt.
    Needed for architectures that don't have a static cpu topology.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Ingo Molnar

    Heiko Carstens
     
  • Needed so it can be called from outside of sched.c.

    Signed-off-by: Heiko Carstens
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Ingo Molnar

    Heiko Carstens
     
  • Combine two unlikely's

    Signed-off-by: Roel Kluin
    Signed-off-by: Ingo Molnar

    Roel Kluin
     
  • TREE_AVG and APPROX_AVG are initial task placement policies that have been
    disabled for a long while. Time to remove them.

    Signed-off-by: Peter Zijlstra
    CC: Srivatsa Vaddagiri
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
    [SPARC64]: Fix atomic backoff limit.

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (46 commits)
    [NET] ifb: set separate lockdep classes for queue locks
    [IPV6] KCONFIG: Fix description about IPV6_TUNNEL.
    [TCP]: Fix shrinking windows with window scaling
    netpoll: zap_completion_queue: adjust skb->users counter
    bridge: use time_before() in br_fdb_cleanup()
    [TG3]: Fix build warning on sparc32.
    MAINTAINERS: bluez-devel is subscribers-only
    audit: netlink socket can be auto-bound to pid other than current->pid (v2)
    [NET]: Fix permissions of /proc/net
    [SCTP]: Fix a race between module load and protosw access
    [NETFILTER]: ipt_recent: sanity check hit count
    [NETFILTER]: nf_conntrack_h323: logical-bitwise & confusion in process_setup()
    [RT2X00] drivers/net/wireless/rt2x00/rt2x00dev.c: remove dead code, fix warning
    [IPV4]: esp_output() misannotations
    [8021Q]: vlan_dev misannotations
    xfrm: ->eth_proto is __be16
    [IPV4]: ipv4_is_lbcast() misannotations
    [SUNRPC]: net/* NULL noise
    [SCTP]: fix misannotated __sctp_rcv_asconf_lookup()
    [PKT_SCHED]: annotate cls_u32
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6.25:
    sh: Use relative paths for mach/cpu symlinks.
    SH: Use newer, non-deprecated __SPIN_LOCK_UNLOCKED macro.
    sh: Fix more user header breakage from sh64 integration.
    sh: Fix uImage build error.
    sh: Fix up the timer IRQ definition for SH7203.
    sh: Fix up the address error exception handler for SH-2.
    serial: sh-sci: Fix fifo stall on SH7760/SH7780/SH7785 SCIF.

    Linus Torvalds
     
  • When building the kernel without passing the O= command line parameter
    there's no point to use absolute paths for them.

    Usually relative paths are preferred because they survive directory
    moves, work across networked file systems and chrooted environments.

    Absolute paths are still used if an output directory is given.

    Signed-off-by: Franck Bui-Huu
    Signed-off-by: Paul Mundt

    Franck Bui-Huu
     
  • Signed-off-by: Robert P. J. Day
    Signed-off-by: Paul Mundt

    Robert P. J. Day
     
  • [ 10.536424] =======================================================
    [ 10.536424] [ INFO: possible circular locking dependency detected ]
    [ 10.536424] 2.6.25-rc3-devel #3
    [ 10.536424] -------------------------------------------------------
    [ 10.536424] swapper/0 is trying to acquire lock:
    [ 10.536424] (&dev->queue_lock){-+..}, at: []
    dev_queue_xmit+0x175/0x2f3
    [ 10.536424]
    [ 10.536424] but task is already holding lock:
    [ 10.536424] (&p->tcfc_lock){-+..}, at: [] tcf_mirred+0x20/0x178
    [act_mirred]
    [ 10.536424]
    [ 10.536424] which lock already depends on the new lock.

    lockdep warns of locking order while using ifb with sch_ingress and
    act_mirred: ingress_lock, tcfc_lock, queue_lock (usually queue_lock
    is at the beginning). This patch is only to tell lockdep that ifb is
    a different device (e.g. from eth) and has its own pair of queue
    locks. (This warning is a false positive in the common scenario of using
    ifb; yet there are possible situations where this order could be
    dangerous; lockdep should warn in such a case.) (With suggestions by
    David S. Miller)

    Reported-and-tested-by: Denys Fedoryshchenko
    Signed-off-by: Jarek Poplawski
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Jarek Poplawski
     
  • Based on notice from "Colin".

    Signed-off-by: YOSHIFUJI Hideaki
    Signed-off-by: David S. Miller

    YOSHIFUJI Hideaki
     
  • When selecting a new window, tcp_select_window() tries not to shrink
    the offered window by using the maximum of the remaining offered window
    size and the newly calculated window size. The newly calculated window
    size is always a multiple of the window scaling factor, the remaining
    window size however might not be since it depends on rcv_wup/rcv_nxt.
    This means we're effectively shrinking the window when scaling it down.

    The dump below shows the problem (scaling factor 2^7):

    - Window size of 557 (71296) is advertised, up to 3111907257:

    IP 172.2.2.3.33000 > 172.2.2.2.33000: . ack 3111835961 win 557

    - New window size of 514 (65792) is advertised, up to 3111907217, 40 bytes
    below the last end:

    IP 172.2.2.3.33000 > 172.2.2.2.33000: . 3113575668:3113577116(1448) ack 3111841425 win 514

    The number 40 results from downscaling the remaining window:

    3111907257 - 3111841425 = 65832
    65832 / 2^7 = 514
    65832 % 2^7 = 40

    If the sender uses up the entire window before it is shrunk, this can have
    chaotic effects on the connection. When sending ACKs, tcp_acceptable_seq()
    will notice that the window has been shrunk since tcp_wnd_end() is before
    tp->snd_nxt, which makes it choose tcp_wnd_end() as sequence number.
    This will fail the receiver's checks in tcp_sequence(), however, since it
    is before its tp->rcv_wup, making it respond with a dupack.

    If both sides are in this condition, this leads to a constant flood of
    ACKs until the connection times out.

    Make sure the window is never shrunk by aligning the remaining window to
    the window scaling factor.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
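
The arithmetic in the dump can be replayed with a small user-space sketch. It assumes the fix amounts to rounding the remaining window up to a multiple of 2^wscale before it is scaled down; the helper names are hypothetical and this is not the kernel code.

```c
#include <stdint.h>

/* Window value as it goes on the wire: the byte count is shifted right
 * by the window scale, so any remainder is silently dropped. */
static uint32_t advertised_window(uint32_t bytes, unsigned int wscale)
{
    return bytes >> wscale;
}

/* Round a byte count up to the next multiple of 2^wscale. */
static uint32_t align_to_scale(uint32_t bytes, unsigned int wscale)
{
    uint32_t unit = (uint32_t)1 << wscale;

    return (bytes + unit - 1) & ~(unit - 1);
}
```

With the values from the dump, a remaining window of 65832 bytes and a scale of 7 is advertised as 514, which the peer reads as 65792 bytes, 40 short. Aligned first, it becomes 65920 and is advertised as 515, so the right edge of the window never moves backwards.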
     
  • zap_completion_queue() retrieves skbs from completion_queue where they have
    zero skb->users counter. Before dev_kfree_skb_any() it should be non-zero
    yet, so it's increased now.

    Reported-and-tested-by: Andrew Morton
    Signed-off-by: Jarek Poplawski
    Signed-off-by: Andrew Morton
    Signed-off-by: David S. Miller

    Jarek Poplawski
     
  • In br_fdb_cleanup() next_timer and this_timer are in jiffies, so they
    should be compared using the time_after() macro.

    Signed-off-by: Fabio Checconi
    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Fabio Checconi
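
Why jiffies values need a dedicated comparison can be shown with a user-space sketch of the same idea. The function names are hypothetical; the kernel's own macros live in include/linux/jiffies.h.

```c
/* Wraparound-safe tick comparison. The difference is computed in
 * unsigned arithmetic and then read as a signed value, so the result
 * stays correct as long as the two ticks are less than half the
 * counter range apart. */
static int tick_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;   /* true if a is later than b */
}

static int tick_before(unsigned long a, unsigned long b)
{
    return tick_after(b, a);    /* true if a is earlier than b */
}
```

A plain `a < b` gives the wrong answer once the counter wraps between the two samples, whereas the signed difference still orders them correctly.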
     
  • Sparc MAC address support should be protected consistently
    with CONFIG_SPARC, but there was a stray CONFIG_SPARC64
    case.

    Bump driver version and release date.

    Reported by Andrew Morton.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Signed-off-by: Pavel Machek
    Signed-off-by: David S. Miller

    Pavel Machek
     
  • From: Pavel Emelyanov

    This patch is based on the one from Thomas.

    The kauditd_thread() calls the netlink_unicast() and passes
    the audit_pid to it. The audit_pid, in turn, is received from
    the user space and the tool (I've checked the audit v1.6.9)
    uses getpid() to pass one in the kernel. Besides, this tool
    doesn't bind the netlink socket to this id, but simply creates
    it allowing the kernel to auto-bind one.

    That's the preamble.

    The problem is that netlink_autobind() _does_not_ guarantee
    that the socket will be auto-bound to the current pid. Instead
    it uses the current pid as a hint to start looking for a free
    id. So, in case of conflict, the audit messages can be sent
    to a wrong socket. This can happen (it's unlikely, but possible)
    if some task opens more than one netlink socket and then
    the audit one starts - in this case the audit's pid can be busy
    and its socket will be bound to another id.

    The proposal is to introduce an audit_nlk_pid in audit subsys,
    that will point to the netlink socket to send packets to. It
    will most often be equal to audit_pid. The socket id can be
    got from the skb's netlink CB right in the audit_receive_msg.
    The audit_nlk_pid reset to 0 is not required, since all the
    decisions are taken based on audit_pid value only.

    Later, if the audit tools will bind the socket themselves, the
    kernel will have to provide a way to setup the audit_nlk_pid
    as well.

    A good side effect of this patch is that audit_pid can later
    be converted to struct pid, as it is no longer safe to use
    pid_t-s in the presence of pid namespaces. But audit code still
    uses the tgid from task_struct in the audit_signal_info and in
    the audit_filter_syscall.

    Signed-off-by: Thomas Graf
    Signed-off-by: Pavel Emelyanov
    Acked-by: Eric Paris
    Signed-off-by: David S. Miller

    Pavel Emelyanov
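
The conflict described above can be modelled with a toy id table. This is a sketch under stated assumptions: the table, the function names and the fallback search order are invented for illustration and do not mirror how netlink_autobind() actually picks a free id.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_BOUND 16

/* Hypothetical table of ids that are already bound to sockets. */
struct id_table {
    uint32_t ids[MAX_BOUND];
    size_t count;
};

static int id_in_use(const struct id_table *t, uint32_t id)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->ids[i] == id)
            return 1;
    return 0;
}

/* Bind a new socket: try the hint (the caller's pid) first and fall
 * back to the next free id when the hint is taken. Returns the id
 * actually bound, or 0 if the table is full. */
static uint32_t autobind(struct id_table *t, uint32_t hint)
{
    uint32_t id = hint;

    if (t->count >= MAX_BOUND)
        return 0;
    while (id_in_use(t, id))
        id++;                   /* invented fallback order */
    t->ids[t->count++] = id;
    return id;
}
```

In this model a task with pid 1234 that opens two sockets ends up holding ids 1234 and 1235. If the audit daemon then starts with pid 1235, its socket is bound to 1236, and a message addressed to id 1235 reaches the other task's socket. Recording the id the socket was really bound to removes that ambiguity.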
     
  • commit e9720ac ([NET]: Make /proc/net a symlink on /proc/self/net (v3))
    broke ganglia and probably other applications that read /proc/net/dev.

    This is due to the change of permissions of /proc/net that was
    introduced in that commit.

    Before: dr-xr-xr-x 5 root root 0 Mar 19 11:30 /proc/net
    After: dr-xr--r-- 5 root root 0 Mar 19 11:29 /proc/self/net

    This patch restores the permissions to the old value which makes
    ganglia happy again.

    Pavel Emelyanov says:

    This also broke the postfix, as it was reported in bug #10286
    and described in detail by Benjamin.

    Signed-off-by: Andre Noll
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Andre Noll
     
  • There is a race in SCTP between the loading of the module
    and the access by the socket layer to the protocol functions.
    In particular, a list of addresses that SCTP maintains is
    not initialized prior to the registration with the protosw.
    Thus it is possible for a user application to gain access
    to SCTP functions before everything has been initialized.
    The problem shows up as odd crashes during connection
    initialization when we try to access the SCTP address list.

    The solution is to refactor how we do registration and
    initialize the lists prior to registering with the protosw.
    Care must be taken since the address list initialization
    depends on some other pieces of SCTP initialization. Also
    the clean-up in case of failure now also needs to be refactored.

    Signed-off-by: Vlad Yasevich
    Acked-by: Sridhar Samudrala
    Signed-off-by: David S. Miller

    Vlad Yasevich
     
  • If a rule using ipt_recent is created with a hit count greater than
    ip_pkt_list_tot, the rule will never match as it cannot keep track
    of enough timestamps. This patch makes ipt_recent refuse to create such
    rules.

    With ip_pkt_list_tot's default value of 20, the following can be used
    to reproduce the problem.

    nc -u -l 0.0.0.0 1234 &
    for i in `seq 1 100`; do echo $i | nc -w 1 -u 127.0.0.1 1234; done

    This limits it to 20 packets:
    iptables -A OUTPUT -p udp --dport 1234 -m recent --set --name test \
    --rsource
    iptables -A OUTPUT -p udp --dport 1234 -m recent --update --seconds \
    60 --hitcount 20 --name test --rsource -j DROP

    While this is unlimited:
    iptables -A OUTPUT -p udp --dport 1234 -m recent --set --name test \
    --rsource
    iptables -A OUTPUT -p udp --dport 1234 -m recent --update --seconds \
    60 --hitcount 21 --name test --rsource -j DROP

    With the patch the second rule-set will throw an EINVAL.

    Reported-by: Sean Kennedy
    Signed-off-by: Daniel Hokka Zakrisson
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Daniel Hokka Zakrisson
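
The sanity check described above can be sketched as a stand-alone function. This assumes the check is a simple comparison against the list size; the function name is hypothetical and pkt_list_tot stands in for the ip_pkt_list_tot module parameter.

```c
#include <errno.h>

/* Returns 0 if a rule with this hit count can ever match, or -EINVAL
 * if it asks for more hits than the per-entry timestamp list can hold. */
static int check_hit_count(unsigned int hit_count, unsigned int pkt_list_tot)
{
    if (hit_count > pkt_list_tot)
        return -EINVAL;
    return 0;
}
```

With the default list size of 20, a rule with --hitcount 20 passes the check and a rule with --hitcount 21 is rejected, matching the two rule sets shown in the commit message.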