27 Dec, 2010

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (42 commits)
    ipv4: dont create routes on down devices
    epic100: hamachi: yellowfin: Fix skb allocation size
    sundance: Fix oopses with corrupted skb_shared_info
    Revert "ipv4: Allow configuring subnets as local addresses"
    USB: mcs7830: return negative if auto negotiate fails
    irda: prevent integer underflow in IRLMP_ENUMDEVICES
    tcp: fix listening_get_next()
    atl1c: Do not use legacy PCI power management
    mac80211: fix mesh forwarding
    MAINTAINERS: email address change
    net: Fix range checks in tcf_valid_offset().
    net_sched: sch_sfq: fix allot handling
    hostap: remove netif_stop_queue from init
    mac80211/rt2x00: add ieee80211_tx_status_ni()
    typhoon: memory corruption in typhoon_get_drvinfo()
    net: Add USB PID for new MOSCHIP USB ethernet controller MCS7832 variant
    net_sched: always clone skbs
    ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed.
    netlink: fix gcc -Wconversion compilation warning
    asix: add USB ID for Logitec LAN-GTJ U2A
    ...

    Linus Torvalds
     

26 Dec, 2010

1 commit

  • In ip_route_output_slow(), instead of allowing a route to be created on
    a not UPed device, report -ENETUNREACH immediately.

    # ip tunnel add mode ipip remote 10.16.0.164 local 10.16.0.72 dev eth0
    # (Note : tunl1 is down)
    # ping -I tunl1 10.1.2.3
    PING 10.1.2.3 (10.1.2.3) from 192.168.18.5 tunl1: 56(84) bytes of data.
    (nothing)
    # ./a.out tunl1
    # ip tunnel del tunl1
    Message from syslogd@shelby at Dec 22 10:12:08 ...
    kernel: unregister_netdevice: waiting for tunl1 to become free.
    Usage count = 3

    After patch:
    # ping -I tunl1 10.1.2.3
    connect: Network is unreachable

    Reported-by: Nicolas Dichtel
    Signed-off-by: Eric Dumazet
    Reviewed-by: Octavian Purdila
    Signed-off-by: David S. Miller

    Eric Dumazet
     

24 Dec, 2010

3 commits

  • This reverts commit 4465b469008bc03b98a1b8df4e9ae501b6c69d4b.

    Conflicts:

    net/ipv4/fib_frontend.c

    As reported by Ben Greear, this causes regressions:

    > Change 4465b469008bc03b98a1b8df4e9ae501b6c69d4b caused rules
    > to stop matching the input device properly because the
    > FLOWI_FLAG_MATCH_ANY_IIF is always defined in ip_dev_find().
    >
    > This breaks rules such as:
    >
    > ip rule add pref 512 lookup local
    > ip rule del pref 0 lookup local
    > ip link set eth2 up
    > ip -4 addr add 172.16.0.102/24 broadcast 172.16.0.255 dev eth2
    > ip rule add to 172.16.0.102 iif eth2 lookup local pref 10
    > ip rule add iif eth2 lookup 10001 pref 20
    > ip route add 172.16.0.0/24 dev eth2 table 10001
    > ip route add unreachable 0/0 table 10001
    >
    > If you had a second interface 'eth0' that was on a different
    > subnet, pinging a system on that interface would fail:
    >
    > [root@ct503-60 ~]# ping 192.168.100.1
    > connect: Invalid argument

    Reported-by: Ben Greear
    Signed-off-by: David S. Miller

    David S. Miller
     
  • If the user-provided len is less than the expected offset, the
    IRLMP_ENUMDEVICES getsockopt will do a copy_to_user() with a very large
    size value. While this isn't a security issue on x86 because it will
    get caught by the access_ok() check, it may leak large amounts of kernel
    heap on other architectures. In any event, this patch fixes it.

    Signed-off-by: Dan Rosenberg
    Signed-off-by: David S. Miller

    Dan Rosenberg
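
The underflow described in this message can be illustrated with a minimal userspace sketch. HDR_OFFSET and both function names are hypothetical stand-ins, not the IrDA code; the point is only that subtracting an offset from a shorter unsigned length wraps around to a huge value unless the length is checked first.

```c
#include <stddef.h>

/* Hypothetical header size standing in for the "expected offset". */
#define HDR_OFFSET 8u

/* Unchecked pattern: unsigned subtraction wraps when len < HDR_OFFSET,
 * producing a very large size. */
size_t payload_len_unchecked(unsigned int len)
{
    return (size_t)(len - HDR_OFFSET);
}

/* Checked pattern: reject short buffers before subtracting.
 * Returns 0 on success and -1 if len is too small. */
int payload_len_checked(unsigned int len, size_t *out)
{
    if (len < HDR_OFFSET)
        return -1;
    *out = len - HDR_OFFSET;
    return 0;
}
```

With len = 4, the unchecked variant yields a value close to UINT_MAX, while the checked variant refuses the request.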
     
  • Alexey Vlasov found /proc/net/tcp could sometimes loop and display
    millions of sockets in LISTEN state.

    In 2.6.29, when we converted TCP hash tables to RCU, we left two
    sk_next() calls in listening_get_next().

    We must instead use sk_nulls_next() to properly detect an end of chain.

    Reported-by: Alexey Vlasov
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
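
As a rough illustration of why the end-of-chain test matters, here is a simplified userspace model of a "nulls" list, where a chain ends in an odd tagged value rather than NULL. The struct and helper names are hypothetical and much simpler than the kernel's; the sketch only shows that a walker has to test for the marker, since the marker is never a NULL pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified model: next is either a pointer to the next node or an
 * odd-valued end marker. All names here are hypothetical. */
struct node {
    uintptr_t next;
    int value;
};

/* Build an end marker: low bit set, remaining bits carry an identifier. */
uintptr_t make_nulls(unsigned long id)
{
    return ((uintptr_t)id << 1) | 1u;
}

/* Node pointers are aligned, so their low bit is clear; markers are odd. */
int is_a_nulls(uintptr_t p)
{
    return (int)(p & 1u);
}

/* Walk the chain, stopping at the end marker rather than at NULL. */
size_t count_nodes(uintptr_t head)
{
    size_t n = 0;

    while (!is_a_nulls(head)) {
        const struct node *cur = (const struct node *)head;
        n++;
        head = cur->next;
    }
    return n;
}
```

A loop written as `while (head != 0)` would not terminate correctly on such a chain, because the marker is non-zero.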
     

23 Dec, 2010

2 commits


21 Dec, 2010

2 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    ceph: handle partial result from get_user_pages
    ceph: mark user pages dirty on direct-io reads
    ceph: fix null pointer dereference in ceph_init_dentry for nfs reexport
    ceph: fix direct-io on non-page-aligned buffers
    ceph: fix msgr_init error path

    Linus Torvalds
     
  • When deploying SFQ/IFB here at work, I found the allot management was
    pretty wrong in sfq, even changing allot from short to int...

    We should init allot for each new flow, not using a previous value found
    in slot.

    Before patch, I saw bursts of several packets per flow, apparently
    denying the default "quantum 1514" limit I had on my SFQ class.

    class sfq 11:1 parent 11:
    (dropped 0, overlimits 0 requeues 0)
    backlog 0b 7p requeues 0
    allot 11546

    class sfq 11:46 parent 11:
    (dropped 0, overlimits 0 requeues 0)
    backlog 0b 1p requeues 0
    allot -23873

    class sfq 11:78 parent 11:
    (dropped 0, overlimits 0 requeues 0)
    backlog 0b 5p requeues 0
    allot 11393

    After patch, better fairness among each flow, allot limit being
    respected, allot is positive :

    class sfq 11:e parent 11:
    (dropped 0, overlimits 0 requeues 86)
    backlog 0b 3p requeues 86
    allot 596

    class sfq 11:94 parent 11:
    (dropped 0, overlimits 0 requeues 0)
    backlog 0b 3p requeues 0
    allot 1468

    class sfq 11:a4 parent 11:
    (dropped 0, overlimits 0 requeues 0)
    backlog 0b 4p requeues 0
    allot 650

    class sfq 11:bb parent 11:
    (dropped 0, overlimits 0 requeues 0)
    backlog 0b 3p requeues 0
    allot 596

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
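
The idea of initialising allot for each new flow can be sketched as follows. This is an illustrative model, not the sch_sfq code: struct flow_slot and both functions are hypothetical, and the credit accounting is reduced to the bare minimum.

```c
/* Hypothetical per-flow slot; field names are illustrative only. */
struct flow_slot {
    int qlen;    /* packets currently queued for this flow */
    int allot;   /* remaining byte credit for the current round */
};

/* Enqueue one packet. When the flow goes from idle to active, start it
 * with a fresh credit of one quantum instead of reusing stale state. */
void flow_enqueue(struct flow_slot *s, int quantum)
{
    if (s->qlen == 0)
        s->allot = quantum;
    s->qlen++;
}

/* Dequeue one packet of pkt_len bytes and charge it against the credit.
 * Returns 1 if the flow has used up its credit and should yield. */
int flow_dequeue(struct flow_slot *s, int pkt_len)
{
    s->qlen--;
    s->allot -= pkt_len;
    return s->allot <= 0;
}
```

Starting from a slot that still holds a stale negative allot, the first enqueue resets the credit to the quantum, so the flow can never begin a round with leftover debt or surplus.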
     

20 Dec, 2010

1 commit


18 Dec, 2010

2 commits


17 Dec, 2010

5 commits

  • When loopback device is being brought down, then keep the route table
    entries because they are special. The entries in the local table for
    linklocal routes and ::1 address should not be purged.

    This is a suboptimal solution to the problem and should be replaced
    by a better fix in future.

    Signed-off-by: Stephen Hemminger
    Acked-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    stephen hemminger
     
  • Getting the SCTP partial delivery point with the SCTP_PARTIAL_DELIVERY_POINT
    socket option should return 0 on success, not -ENOTSUPP.

    Signed-off-by: Wei Yongjun
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Wei Yongjun
     
  • This patch fixes a missing ntohs() for bridge IPv6 multicast snooping.

    Signed-off-by: David L Stevens
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    David Stevens
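
For readers unfamiliar with this class of bug, a generic userspace sketch follows. It does not reproduce the bridge code; read_be16_field is a hypothetical helper showing that a 16-bit field taken from a packet is in network byte order and has to pass through ntohs() before it is used as a number.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Read a 16-bit network-order field from a packet buffer and convert it
 * to host order before using it as a number. */
uint16_t read_be16_field(const unsigned char *buf)
{
    uint16_t raw;

    memcpy(&raw, buf, sizeof(raw));  /* raw still holds network byte order */
    return ntohs(raw);               /* convert before comparing or adding */
}
```

On a little-endian host, skipping the conversion would turn the bytes 0x01 0x02 into 0x0201 instead of 0x0102.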
     
  • Special care is taken inside sk_prot_alloc to avoid overwriting
    skc_node/skc_nulls_node. We should also avoid overwriting
    skc_bind_node/skc_portaddr_node.

    The patch fixes the following crash:

    BUG: unable to handle kernel paging request at fffffffffffffff0
    IP: [] udp4_lib_lookup2+0xad/0x370
    [] __udp4_lib_lookup+0x282/0x360
    [] __udp4_lib_rcv+0x31e/0x700
    [] ? ip_local_deliver_finish+0x65/0x190
    [] ? ip_local_deliver+0x88/0xa0
    [] udp_rcv+0x15/0x20
    [] ip_local_deliver_finish+0x65/0x190
    [] ip_local_deliver+0x88/0xa0
    [] ip_rcv_finish+0x32d/0x6f0
    [] ? netif_receive_skb+0x99c/0x11c0
    [] ip_rcv+0x2bb/0x350
    [] netif_receive_skb+0x99c/0x11c0

    Signed-off-by: Leonard Crestez
    Signed-off-by: Octavian Purdila
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Octavian Purdila
     
  • The first big packets sent to a "low-MTU" client correctly trigger
    the creation of a temporary route containing the reduced MTU.

    But after the temporary route has expired, a new ICMP6 "packet too big"
    will be sent, and rt6_pmtu_discovery will find the previous EXPIRED route,
    check that its MTU isn't bigger than the one in the ICMP packet, and do
    nothing until the temporary route is deleted by gc.

    I ran a simple experiment:
    while :; do
    time ( dd if=/dev/zero bs=10K count=1 | ssh hostname dd of=/dev/null ) || break;
    done

    The "time" command reports real 0m0.197s if the temporary route hasn't
    expired, but it reports real 0m52.837s (!!!!) immediately after a
    temporary route has expired.

    Signed-off-by: Andrey Vagin
    Signed-off-by: David S. Miller

    Andrey Vagin
     

16 Dec, 2010

1 commit


15 Dec, 2010

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (75 commits)
    pppoe.c: Fix kernel panic caused by __pppoe_xmit
    WAN: Fix a TX IRQ causing BUG() in PC300 and PCI200SYN drivers.
    bnx2x: Advance a version number to 1.60.01-0
    bnx2x: Fixed a compilation warning
    bnx2x: LSO code was broken on BE platforms
    qlge: Fix deadlock when cancelling worker.
    net: fix skb_defer_rx_timestamp()
    cxgb4vf: Ingress Queue Entry Size needs to be 64 bytes
    phy: add the IC+ IP1001 driver
    atm: correct sysfs 'device' link creation and parent relationships
    MAINTAINERS: remove me from tulip
    SCTP: Fix SCTP_SET_PEER_PRIMARY_ADDR to accpet v4mapped address
    enic: Bug Fix: Pass napi reference to the isr that services receive queue
    ipv6: fix nl group when advertising a new link
    connector: add module alias
    net: Document the kernel_recvmsg() function
    r8169: Fix runtime power management
    hso: IP checksuming doesn't work on GE0301 option cards
    xfrm: Fix xfrm_state_migrate leak
    net: Convert netpoll blocking api in bonding driver to be a counter
    ...

    Linus Torvalds
     

14 Dec, 2010

4 commits

  • create_workqueue() returns NULL on failure.

    Signed-off-by: Sage Weil

    Sage Weil
     
  • On suspend, there might be usb wireless drivers which wrongly trigger
    the warning in ieee80211_work_work. If a USB driver doesn't have a
    suspend hook, the usb stack will disconnect the device. On disconnect,
    a mac80211 driver calls ieee80211_unregister_hw, which calls dev_close,
    which calls ieee80211_stop, and in the end calls ieee80211_work_purge->
    ieee80211_work_work.

    The problem is that this call to ieee80211_work_purge comes after
    mac80211 is suspended, triggering the warning even when we don't have
    work queued in work_list (the expected case when already suspended),
    because it always calls ieee80211_work_work.

    So, just call ieee80211_work_work in ieee80211_work_purge if we really
    have to abort work. This addresses the warning reported at
    https://bugzilla.kernel.org/show_bug.cgi?id=24402

    Signed-off-by: Herton Ronaldo Krzesinski
    Signed-off-by: John W. Linville

    Herton Ronaldo Krzesinski
     
  • dev_open will eventually call ieee80211_ibss_join, which sets up the
    skb used for beacons/probe-responses. However, it is possible to
    receive beacons that attempt to merge before this occurs, causing
    a null pointer dereference. Check ssid_len, as that is the last
    thing set in ieee80211_ibss_join.

    This occurs quite easily in the presence of ad-hoc nodes with hidden SSIDs.

    Revised the previous patch to check further up, based on IRC feedback.

    Signed-off-by: Tim Harvey
    Reviewed-by: Johannes Berg
    Signed-off-by: John W. Linville

    Tim Harvey
     
  • John W. Linville
     

11 Dec, 2010

6 commits


10 Dec, 2010

1 commit

  • xfrm_state_migrate calls kfree instead of xfrm_state_put to free
    a failed state. According to git commit 553f9118 this can cause
    memory leaks.

    Signed-off-by: Thomas Egerer
    Signed-off-by: Steffen Klassert
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Thomas Egerer
     

09 Dec, 2010

10 commits

  • Unconditional use of skb->dev won't work here,
    try to fetch the econet device via skb_dst()->dev
    instead.

    Suggested by Eric Dumazet.

    Reported-by: Nelson Elhage
    Tested-by: Nelson Elhage
    Signed-off-by: David S. Miller

    David S. Miller
     
  • Make sure sysctl_tcp_cookie_size is read once in
    tcp_cookie_size_check(), or we might return an illegal value to caller
    if sysctl_tcp_cookie_size is changed by another cpu.

    Signed-off-by: Eric Dumazet
    Cc: Ben Hutchings
    Cc: William Allen Simpson
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • sysctl_tcp_tso_win_divisor might be set to zero while one cpu runs in
    tcp_tso_should_defer(). Make sure we don't allow a divide by zero by
    reading sysctl_tcp_tso_win_divisor exactly once.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
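
The "read exactly once" pattern from this message can be sketched in userspace as below. tunable_divisor and window_share are hypothetical names, and volatile is used only to model a value that may change underneath the reader; the kernel relies on its own helper for single reads rather than on this construct.

```c
/* Stand-in for a tunable that another CPU may change at any time.
 * The name is hypothetical; volatile only models "may change under us". */
volatile int tunable_divisor = 3;

/* Take one snapshot, test it, and divide by that same snapshot, so a
 * concurrent write of zero cannot slip in between check and use. */
int window_share(int window)
{
    int div = tunable_divisor;  /* single read */

    if (div == 0)
        return window;          /* treat zero as "no division" */
    return window / div;
}
```

Reading the tunable twice, once for the zero test and once for the division, would reopen the race that the commit closes.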
     
  • mac80211 doesn't handle shared skbs correctly at the moment. As a result
    a possible resize can trigger a BUG in pskb_expand_head.

    [ 676.030000] Kernel bug detected[#1]:
    [ 676.030000] Cpu 0
    [ 676.030000] $ 0 : 00000000 00000000 819662ff 00000002
    [ 676.030000] $ 4 : 81966200 00000020 00000000 00000020
    [ 676.030000] $ 8 : 819662e0 800043c0 00000002 00020000
    [ 676.030000] $12 : 3b9aca00 00000000 00000000 00470000
    [ 676.030000] $16 : 80ea2000 00000000 00000000 00000000
    [ 676.030000] $20 : 818aa200 80ea2018 80ea2000 00000008
    [ 676.030000] $24 : 00000002 800ace5c
    [ 676.030000] $28 : 8199a000 8199bd20 81938f88 80f180d4
    [ 676.030000] Hi : 0000026e
    [ 676.030000] Lo : 0000757e
    [ 676.030000] epc : 801245e4 pskb_expand_head+0x44/0x1d8
    [ 676.030000] Not tainted
    [ 676.030000] ra : 80f180d4 ieee80211_skb_resize+0xb0/0x114 [mac80211]
    [ 676.030000] Status: 1000a403 KERNEL EXL IE
    [ 676.030000] Cause : 10800024
    [ 676.030000] PrId : 0001964c (MIPS 24Kc)
    [ 676.030000] Modules linked in: mac80211_hwsim rt2800lib rt2x00soc rt2x00pci rt2x00lib mac80211 crc_itu_t crc_ccitt cfg80211 compat arc4 aes_generic deflate ecb cbc [last unloaded: rt2800pci]
    [ 676.030000] Process kpktgend_0 (pid: 97, threadinfo=8199a000, task=81879f48, tls=00000000)
    [ 676.030000] Stack : ffffffff 00000000 00000000 00000014 00000004 80ea2000 00000000 00000000
    [ 676.030000] 818aa200 80f180d4 ffffffff 0000000a 81879f78 81879f48 81879f48 00000018
    [ 676.030000] 81966246 80ea2000 818432e0 80f1a420 80203050 81814d98 00000001 81879f48
    [ 676.030000] 81879f48 00000018 81966246 818432e0 0000001a 8199bdd4 0000001c 80f1b72c
    [ 676.030000] 80203020 8001292c 80ef4aa2 7f10b55d 801ab5b8 81879f48 00000188 80005c90
    [ 676.030000] ...
    [ 676.030000] Call Trace:
    [ 676.030000] [] pskb_expand_head+0x44/0x1d8
    [ 676.030000] [] ieee80211_skb_resize+0xb0/0x114 [mac80211]
    [ 676.030000] [] ieee80211_xmit+0x150/0x22c [mac80211]
    [ 676.030000] [] ieee80211_subif_start_xmit+0x6f4/0x73c [mac80211]
    [ 676.030000] [] pktgen_thread_worker+0xfac/0x16f8
    [ 676.030000] [] kthread+0x7c/0x88
    [ 676.030000] [] kernel_thread_helper+0x10/0x18
    [ 676.030000]
    [ 676.030000]
    [ 676.030000] Code: 24020001 10620005 2502001f 0804917a 00000000 2502001f 00441023 00531021

    Fix this by making a local copy of shared skbs prior to mangling them.
    To avoid copying the skb unnecessarily move the skb_copy call below the
    checks that don't need write access to the skb.

    Also, move the assignment of nh_pos and h_pos below the skb_copy to point
    to the correct skb.

    It would be possible to avoid another resize of the copied skb by using
    skb_copy_expand instead of skb_copy but that would make the patch more
    complex. Also, shared skbs are a corner case right now, so the resize
    shouldn't matter much.

    Cc: Johannes Berg
    Signed-off-by: Helmut Schaa
    Cc: stable@kernel.org
    Signed-off-by: John W. Linville

    Helmut Schaa
     
  • Rather than printing the message to the log, use a mib counter to keep
    track of the count of occurrences of time wait bucket overflow. Reduces
    spam in logs.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     
  • x25 does not decrement the network device reference counts on module unload.
    Thus unregistering any pre-existing interface after unloading the x25 module
    hangs and results in

    unregister_netdevice: waiting for tap0 to become free. Usage count = 1

    This patch decrements the reference counts of all interfaces in x25_link_free,
    the way it is already done in x25_link_device_down for NETDEV_DOWN events.

    Signed-off-by: Apollon Oikonomopoulos
    Signed-off-by: David S. Miller

    Apollon Oikonomopoulos
     
  • Using the SOCK_DGRAM enum results in
    "net-pf-2-proto-SOCK_DGRAM-type-115", so use the numeric value like it
    is done in net/dccp.

    Signed-off-by: Michal Marek
    Signed-off-by: David S. Miller

    Michal Marek
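
Why an enum name ends up verbatim in the alias string can be shown with a small preprocessor sketch. The STRINGIFY macros, the enum, and the alias variables below are hypothetical and are not the kernel's macros; they only demonstrate that the preprocessor expands macros but knows nothing about enum constants.

```c
#define STRINGIFY_1(x) #x
#define STRINGIFY(x) STRINGIFY_1(x)

enum sock_kind { KIND_DGRAM = 2 };   /* hypothetical enum constant */
#define KIND_DGRAM_NUM 2             /* hypothetical macro with the same value */

/* Enum constants are invisible to the preprocessor, so the name is kept. */
const char *alias_from_enum = "type-" STRINGIFY(KIND_DGRAM);

/* Macros are expanded before stringification, so the number appears. */
const char *alias_from_macro = "type-" STRINGIFY(KIND_DGRAM_NUM);
```

Here alias_from_enum is "type-KIND_DGRAM" while alias_from_macro is "type-2", which is why a numeric value is needed wherever the alias must contain a number.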
     
  • We need to drop the mutex and do a dev_put, so set an error code and break like
    the other paths, instead of returning directly.

    Signed-off-by: Nelson Elhage
    Signed-off-by: David S. Miller

    Nelson Elhage
     
  • On Sunday, 05 December 2010 at 12:23 +0100, Eric Dumazet wrote:
    > On Sunday, 05 December 2010 at 09:19 +0100, Eric Dumazet wrote:
    >
    > > Hmm..
    > >
    > > If somebody can explain why RTNL is held in arp_ioctl() (and therefore
    > > in arp_req_delete()), we might first remove RTNL use in arp_ioctl() so
    > > that your patch can be applied.
    > >
    > > Right now it is not good, because RTNL won't necessarily be held when you
    > > are going to call arp_invalidate() ?
    >
    > While doing this analysis, I found a refcount bug in llc, I'll send a
    > patch for net-2.6

    Oh well, of course I must first fix the bug in net-2.6, and wait for David
    to pull the fix into net-next-2.6 before sending this RCU conversion.

    Note: this patch should be sent to stable teams (2.6.34 and up)

    [PATCH net-2.6] llc: fix a device refcount imbalance

    commit abf9d537fea225 (llc: add support for SO_BINDTODEVICE) added one
    refcount imbalance in llc_ui_bind(), because dev_getbyhwaddr() doesn't
    take a reference on device, while dev_get_by_index() does.

    Fix this using RCU locking. And since an RCU conversion will be done for
    2.6.38 for dev_getbyhwaddr(), put the rcu_read_lock/unlock exactly at
    their final place.

    Signed-off-by: Eric Dumazet
    Cc: stable@kernel.org
    Cc: Octavian Purdila
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • The bug has to do with boundary checks on the initial receive window.
    If the initial receive window falls between init_cwnd and the
    receive window specified by the user, the initial window is incorrectly
    brought down to init_cwnd. The correct behavior is to allow it to
    remain unchanged.

    Signed-off-by: Nandita Dukkipati
    Signed-off-by: David S. Miller

    Nandita Dukkipati
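
The boundary check described above can be modelled with a short sketch. It is an assumption-laden simplification, not the TCP code: windows are counted in whole segments, user_wnd == 0 means no user-specified window, and both function names are hypothetical.

```c
/* Behaviour described as buggy: when the window already fits under the
 * user's limit, the fallback test still clamps it to init_cwnd. */
unsigned int clamp_buggy(unsigned int wnd, unsigned int init_cwnd,
                         unsigned int user_wnd)
{
    if (user_wnd && wnd > user_wnd)
        wnd = user_wnd;
    else if (wnd > init_cwnd)
        wnd = init_cwnd;
    return wnd;
}

/* Behaviour described as correct: choose one limit, then apply only it. */
unsigned int clamp_fixed(unsigned int wnd, unsigned int init_cwnd,
                         unsigned int user_wnd)
{
    unsigned int limit = user_wnd ? user_wnd : init_cwnd;

    return wnd < limit ? wnd : limit;
}
```

For a window of 20 segments with init_cwnd = 10 and a user limit of 40, the first variant returns 10 while the second leaves the window at 20.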