07 Jul, 2009

1 commit

  • Commit 'net: Move rx skb_orphan call to where needed' broke the sctp
    protocol, triggering a warning at inet_sock_destruct(). Actually, sctp can
    do this right with the sctp_sock_rfree_frag() and
    sctp_skb_set_owner_r_frag() pair.

    sctp_sock_rfree_frag(skb);
    sctp_skb_set_owner_r_frag(skb, newsk);

    This patch does not revert commit d55d87fdff8252d0e2f7c28c2d443aee17e9d70f;
    instead it removes the sctp_sock_rfree_frag() function.

    ------------[ cut here ]------------
    WARNING: at net/ipv4/af_inet.c:151 inet_sock_destruct+0xe0/0x142()
    Modules linked in: sctp ipv6 dm_mirror dm_region_hash dm_log dm_multipath
    scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: scsi_wait_scan]
    Pid: 1808, comm: sctp_test Not tainted 2.6.31-rc2 #40
    Call Trace:
    [] warn_slowpath_common+0x6a/0x81
    [] ? inet_sock_destruct+0xe0/0x142
    [] warn_slowpath_null+0x12/0x15
    [] inet_sock_destruct+0xe0/0x142
    [] __sk_free+0x19/0xcc
    [] sk_free+0x18/0x1a
    [] sctp_close+0x192/0x1a1 [sctp]
    [] inet_release+0x47/0x4d
    [] sock_release+0x19/0x5e
    [] sock_close+0x21/0x25
    [] __fput+0xde/0x189
    [] fput+0x18/0x1a
    [] filp_close+0x56/0x60
    [] put_files_struct+0x5d/0xa1
    [] exit_files+0x39/0x3d
    [] do_exit+0x1a5/0x5dd
    [] ? d_kill+0x35/0x3b
    [] ? dequeue_signal+0xa6/0x115
    [] do_group_exit+0x63/0x8a
    [] get_signal_to_deliver+0x2e1/0x2f9
    [] do_notify_resume+0x7c/0x6b5
    [] ? autoremove_wake_function+0x0/0x34
    [] ? __d_free+0x3d/0x40
    [] ? d_free+0x2a/0x3c
    [] ? vfs_write+0x103/0x117
    [] ? sys_socketcall+0x178/0x182
    [] work_notifysig+0x13/0x19
    ---[ end trace 9db92c463e789fba ]---

    Signed-off-by: Wei Yongjun
    Acked-by: Herbert Xu
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Wei Yongjun
     

06 Jul, 2009

1 commit

  • The bit that tells us whether a statistics counter snapshot operation
    has completed is located in the GLOBAL register block, not in the
    GLOBAL2 register block, so fix up mv88e6xxx_stats_wait() to poll the
    right register address.

    Signed-off-by: Stephane Contri
    Signed-off-by: Lennert Buytenhek
    Cc: stable@kernel.org
    Signed-off-by: David S. Miller

    Stephane Contri
     

04 Jul, 2009

3 commits

  • There's a bug in addrconf_prefix_rcv() where it won't update the
    preferred lifetime of an IPv6 address if the current valid lifetime
    of the address is less than 2 hours (the minimum value in the RA).

    For example, if I send a router advertisement with a prefix that
    has valid lifetime = preferred lifetime = 2 hours we'll build
    this address:

    3: eth0: mtu 1500 qlen 1000
    inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic
    valid_lft 7175sec preferred_lft 7175sec

    If I then send the same prefix with valid lifetime = preferred
    lifetime = 0 it will be ignored since the minimum valid lifetime
    is 2 hours:

    3: eth0: mtu 1500 qlen 1000
    inet6 2001:1890:1109:a20:217:8ff:fe7d:4718/64 scope global dynamic
    valid_lft 7161sec preferred_lft 7161sec

    But according to RFC 4862 we should always reset the preferred lifetime
    even if the valid lifetime is invalid, which would cause the address
    to immediately get deprecated. So with this patch we'd see this:

    5: eth0: mtu 1500 qlen 1000
    inet6 2001:1890:1109:a20:21f:29ff:fe5a:ef04/64 scope global deprecated dynamic
    valid_lft 7163sec preferred_lft 0sec

    The comment winds up being 5x the size of the code to fix the problem.

    Update the preferred lifetime of IPv6 addresses derived from a prefix
    info option in a router advertisement even if the valid lifetime in
    the option is invalid, as specified in RFC 4862 Section 5.5.3e. Fixes
    an issue where an address will not immediately become deprecated.
    Reported by Jens Rosenboom.

    Signed-off-by: Brian Haley
    Signed-off-by: David S. Miller

    Brian Haley
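
The lifetime rule described in this entry can be sketched in user-space C. This is a minimal illustration of RFC 4862 Section 5.5.3e as summarized above, not the kernel's addrconf_prefix_rcv() code; the struct, the function name and the constant are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two hours in seconds: the threshold used by RFC 4862 Section 5.5.3e. */
#define TOY_MIN_VALID_LIFETIME (2u * 60u * 60u)

/* Hypothetical per-address state, in seconds remaining. */
struct addr_lifetimes {
    uint32_t valid;
    uint32_t preferred;
};

/* Apply the lifetimes carried by a prefix information option. */
void apply_prefix_lifetimes(struct addr_lifetimes *a,
                            uint32_t ra_valid, uint32_t ra_preferred)
{
    assert(a != NULL);

    if (ra_valid > TOY_MIN_VALID_LIFETIME || ra_valid > a->valid) {
        /* Advertised valid lifetime is acceptable: take it. */
        a->valid = ra_valid;
    } else if (a->valid <= TOY_MIN_VALID_LIFETIME) {
        /* Remaining lifetime is already short: leave it untouched. */
    } else {
        /* Otherwise clamp the valid lifetime to two hours. */
        a->valid = TOY_MIN_VALID_LIFETIME;
    }

    /* The preferred lifetime is reset regardless of the branch taken. */
    a->preferred = ra_preferred;
}
```

With the numbers from the entry, an address holding 7175 seconds of valid lifetime that receives an advertisement with both lifetimes set to 0 keeps its valid lifetime but ends up with a preferred lifetime of 0, which is what marks it deprecated.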
     
  • SCTP pushes the skb data above the sctp chunk header, so the length
    passed to pskb_may_pull(skb, nh + offset + 1 - skb->data) in
    _decode_session6() is negative (nh + offset + 1 - skb->data < 0). The
    check can therefore never succeed, and the ports decode of sctp will
    always fail.

    Signed-off-by: Wei Yongjun
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Wei Yongjun
     
  • SCTP pushes the skb data above the sctp chunk header, so the length
    passed to pskb_may_pull(skb, xprth + 4 - skb->data) in _decode_session4()
    is negative (xprth + 4 - skb->data < 0). The check can therefore never
    succeed, and the ports decode of sctp will always fail.

    Signed-off-by: Wei Yongjun
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Wei Yongjun
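
The failure mode shared by the two SCTP entries above can be reproduced outside the kernel. The sketch below assumes the length parameter is an unsigned int, as it is for pskb_may_pull(); toy_may_pull() is a hypothetical stand-in that only compares lengths, and the guarded variant shows one possible way to handle a header that already lies before the data pointer, not necessarily the exact change that was merged.

```c
#include <stddef.h>

/* Toy stand-in for a pull check: succeeds only if len bytes are available. */
int toy_may_pull(unsigned int len, unsigned int available)
{
    return len <= available;
}

/* Unguarded check: when hdr_end lies before data, the pointer difference is
 * negative and converts to a huge unsigned value, so the check fails. */
int ports_check_old(const unsigned char *hdr_end, const unsigned char *data,
                    unsigned int available)
{
    return toy_may_pull((unsigned int)(hdr_end - data), available);
}

/* Guarded check: a header that is already behind the data pointer has been
 * consumed, so nothing needs to be pulled for it. */
int ports_check_fixed(const unsigned char *hdr_end, const unsigned char *data,
                      unsigned int available)
{
    return hdr_end < data
        || toy_may_pull((unsigned int)(hdr_end - data), available);
}
```

Both pointers must point into the same buffer for the comparison and subtraction to be well defined, which mirrors the situation in the entries: the header and the data pointer both refer to the same packet.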
     

03 Jul, 2009

1 commit


01 Jul, 2009

2 commits

  • This reverts commit 73ce7b01b4496a5fbf9caf63033c874be692333f.

    After discovering that we don't listen to gratuitous arps in 2.6.30
    I tracked the failure down to this commit.

    The patch makes absolutely no sense. RFC2131, RFC3927 and RFC5227
    are all in agreement that an arp request with sip == 0 should be used
    for the probe (to prevent learning) and an arp request with sip == tip
    should be used for the gratuitous announcement that people can learn
    from.

    It appears the author of the broken patch got those two cases confused
    and modified the code to drop all gratuitous arp traffic. Ouch!

    Cc: stable@kernel.org
    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
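
The distinction the revert message draws between the two kinds of ARP request can be written down as a small classifier. This is an illustrative sketch only: the enum and function names are hypothetical, and addresses are treated as plain 32-bit values.

```c
#include <stdint.h>

/* Hypothetical labels for the cases named in the commit message. */
enum arp_request_kind {
    ARP_REQ_PROBE,        /* sip == 0: address probe, must not be learned */
    ARP_REQ_ANNOUNCEMENT, /* sip == tip: gratuitous announcement, may be learned */
    ARP_REQ_ORDINARY      /* anything else: a normal resolution request */
};

/* Classify an ARP request by sender IP (sip) and target IP (tip). */
enum arp_request_kind classify_arp_request(uint32_t sip, uint32_t tip)
{
    if (sip == 0)
        return ARP_REQ_PROBE;
    if (sip == tip)
        return ARP_REQ_ANNOUNCEMENT;
    return ARP_REQ_ORDINARY;
}
```

Checking sip == 0 first keeps a probe from ever being mistaken for an announcement, which is the confusion the message attributes to the reverted patch.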
     
  • Alas, the current delaying of freeing old tnodes by RCU in
    trie_rebalance is still not enough, because we can free a top tnode
    before updating the t->trie pointer.

    Reported-by: Pawel Staszewski
    Tested-by: Pawel Staszewski
    Signed-off-by: Jarek Poplawski
    Signed-off-by: David S. Miller

    Jarek Poplawski
     

30 Jun, 2009

6 commits


29 Jun, 2009

4 commits


27 Jun, 2009

7 commits

  • When NAPI is disabled while we're in net_rx_action, we end up
    calling __napi_complete without flushing GRO packets. This is
    a bug as it would cause the GRO packets to linger. Of course, it
    also literally BUGs to catch errors like this :)

    This patch changes it to napi_complete, with the obligatory IRQ
    reenabling. This should be safe because we've only just disabled
    IRQs and it does not materially affect the test conditions in
    between.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • As transparent proxying looks up the socket early and assigns
    it to the skb for later processing, we must drop any existing
    socket ownership prior to that in order to distinguish between
    the case where tproxy is active and where it is not.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • The mac80211 module uses call_rcu(), thus it should use rcu_barrier()
    on module unload.

    The rcu_barrier() is placed in mesh.c ieee80211_stop_mesh(), which is
    invoked from ieee80211_stop() in case vif.type == NL80211_IFTYPE_MESH_POINT.

    Acked-by: Paul E. McKenney
    Acked-by: Johannes Berg
    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Jesper Dangaard Brouer
     
  • The sunrpc module uses call_rcu(), thus it should use rcu_barrier() on
    module unload.

    I have not verified that the possibility for new call_rcu() callbacks
    has been disabled. As a hint for checking, the functions calling
    call_rcu() (unx_destroy_cred and generic_destroy_cred) are
    registered as the crdestroy function pointer in struct rpc_credops.

    Acked-by: Paul E. McKenney
    Acked-by: Trond Myklebust
    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Jesper Dangaard Brouer
     
  • When unloading modules that use call_rcu() callbacks, we must use
    rcu_barrier(). This module uses synchronize_net(), which is not
    enough to be sure that all callbacks have completed.

    Acked-by: Paul E. McKenney
    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Jesper Dangaard Brouer
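
Why an unload path has to wait for queued callbacks can be shown with a single-threaded toy model. Nothing here is the kernel RCU API: toy_call_deferred() and toy_barrier() are hypothetical stand-ins, and draining the queue merely plays the role of waiting for outstanding call_rcu() callbacks to finish.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PENDING 16

typedef void (*toy_cb)(void *arg);

struct toy_deferred {
    toy_cb fn;
    void *arg;
};

static struct toy_deferred toy_queue[TOY_MAX_PENDING];
static size_t toy_queued;

/* Queue a callback to run later (stand-in for deferring work via RCU). */
void toy_call_deferred(toy_cb fn, void *arg)
{
    assert(toy_queued < TOY_MAX_PENDING);
    toy_queue[toy_queued].fn = fn;
    toy_queue[toy_queued].arg = arg;
    toy_queued++;
}

/* Run every queued callback (stand-in for waiting on a barrier). */
void toy_barrier(void)
{
    for (size_t i = 0; i < toy_queued; i++)
        toy_queue[i].fn(toy_queue[i].arg);
    toy_queued = 0;
}

/* Number of callbacks that have been queued but not yet run. */
size_t toy_pending(void)
{
    return toy_queued;
}

/* Example callback: counts how many objects it has released. */
void toy_release(void *arg)
{
    ++*(int *)arg;
}

/* Unload path: no callback may still be pending once this returns,
 * because the callback code itself is about to go away. */
void toy_module_exit(void)
{
    toy_barrier();
}
```

In the model, skipping toy_barrier() in toy_module_exit() would leave entries in the queue that point at code belonging to the unloaded module, which is the hazard these entries guard against.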
     
  • The ipv6 module uses call_rcu(), thus it should use rcu_barrier() on
    module unload.

    Acked-by: Paul E. McKenney
    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Jesper Dangaard Brouer
     
  • Unloading of the decnet module has been disabled with a '#if 0'
    statement, because it has had issues.

    We add a rcu_barrier() anyhow for correctness.

    The maintainer (Chrissie Caulfield) will look into the unload issue
    when time permits.

    Acked-by: Paul E. McKenney
    Acked-by: Chrissie Caulfield
    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Jesper Dangaard Brouer
     

26 Jun, 2009

2 commits


25 Jun, 2009

5 commits

  • RCU barriers, rcu_barrier(), are inserted in two places.

    In nf_conntrack_expect.c nf_conntrack_expect_fini() before the
    kmem_cache_destroy(). Firstly to make sure the callback to the
    nf_ct_expect_free_rcu() code is still around. Secondly because I'm
    unsure about the consequence of having in flight
    nf_ct_expect_free_rcu/kmem_cache_free() calls while doing a
    kmem_cache_destroy() slab destroy.

    And in nf_conntrack_extend.c nf_ct_extend_unregister(), in order to
    wait for completion of callbacks to __nf_ct_ext_free_rcu(), which is
    invoked by __nf_ct_ext_add(). It might be more efficient to call
    rcu_barrier() in nf_conntrack_core.c nf_conntrack_cleanup_net(), but
    that would make the code more difficult to read (as the callback code
    is located in nf_conntrack_extend.c).

    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: Patrick McHardy

    Jesper Dangaard Brouer
     
  • Netlink address deletion events were not sent when a network device
    vanished, nor when Phonet was unloaded.

    Signed-off-by: Rémi Denis-Courmont
    Signed-off-by: David S. Miller

    Rémi Denis-Courmont
     
  • Signed-off-by: Rémi Denis-Courmont
    Signed-off-by: David S. Miller

    Rémi Denis-Courmont
     
  • Our CAST algorithm is called cast5, not cast128. Clearly nobody
    has ever used it :)

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6:
    bnx2: Fix the behavior of ethtool when ONBOOT=no
    qla3xxx: Don't sleep while holding lock.
    qla3xxx: Give the PHY time to come out of reset.
    ipv4 routing: Ensure that route cache entries are usable and reclaimable with caching is off
    net: Move rx skb_orphan call to where needed
    ipv6: Use correct data types for ICMPv6 type and code
    net: let KS8842 driver depend on HAS_IOMEM
    can: let SJA1000 driver depend on HAS_IOMEM
    netxen: fix firmware init handshake
    netxen: fix build with without CONFIG_PM
    netfilter: xt_rateest: fix comparison with self
    netfilter: xt_quota: fix incomplete initialization
    netfilter: nf_log: fix direct userspace memory access in proc handler
    netfilter: fix some sparse endianess warnings
    netfilter: nf_conntrack: fix conntrack lookup race
    netfilter: nf_conntrack: fix confirmation race condition
    netfilter: nf_conntrack: death_by_timeout() fix

    Linus Torvalds
     

24 Jun, 2009

2 commits

  • When route caching is disabled (rt_caching returns false), we still use route
    cache entries that are created and passed into rt_intern_hash once. These
    routes need to be made usable for the one call path that holds a reference to
    them, and they need to be reclaimed when they're finished with their use. To be
    made usable, they need to be associated with a neighbor table entry (which they
    currently are not), otherwise iproute_finish2 just discards the packet, since we
    don't know which L2 peer to send the packet to. To do this binding, we need to
    follow the path a bit higher up in rt_intern_hash, which calls
    arp_bind_neighbour, but not assign the route entry to the hash table.
    Currently, if caching is off, we simply assign the route to the rp pointer and
    return success. This patch associates us with a neighbor entry first.

    Secondly, we need to make sure that any single use routes like this are known to
    the garbage collector when caching is off. If caching is off, and we try to
    hash in a route, it will leak when its refcount reaches zero. To avoid this,
    this patch calls rt_free on the route cache entry passed into rt_intern_hash.
    This places us on the gc list for the route cache garbage collector, so that
    when its refcount reaches zero, it will be reclaimed (Thanks to Alexey for this
    suggestion).

    I've tested this on a local system here, and with these patches in place, I'm
    able to maintain routed connectivity to remote systems, even if I set
    /proc/sys/net/ipv4/rt_cache_rebuild_count to -1, which forces rt_caching to
    return false.

    Signed-off-by: Neil Horman
    Reported-by: Jarek Poplawski
    Reported-by: Maxime Bizon
    Signed-off-by: David S. Miller

    Neil Horman
     
  • In order to get the tun driver to account packets, we need to be
    able to receive packets with destructors set. To be on the safe
    side, I added an skb_orphan call for all protocols by default since
    some of them (IP in particular) cannot handle receiving packets
    with destructors properly.

    Now it seems that at least one protocol (CAN) expects to be able
    to pass skb->sk through the rx path without getting clobbered.

    So this patch attempts to fix this properly by moving the skb_orphan
    call to where it's actually needed. In particular, I've added it
    to skb_set_owner_[rw] which is what most users of skb->destructor
    call.

    This is actually an improvement for tun too since it means that
    we only give back the amount charged to the socket when the skb
    is passed to another socket that will also be charged accordingly.

    Signed-off-by: Herbert Xu
    Tested-by: Oliver Hartkopp
    Signed-off-by: David S. Miller

    Herbert Xu
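
The accounting idea behind moving the orphan step into the owner-setting helper can be modelled in a few lines of user-space C. All names below are hypothetical; the model only tracks one counter per socket and is not the kernel's sk_buff or sock code.

```c
#include <stddef.h>

struct toy_sock {
    unsigned int rmem;      /* bytes currently charged to this socket */
};

struct toy_buf {
    unsigned int truesize;  /* bytes this buffer accounts for */
    struct toy_sock *owner;
    void (*destructor)(struct toy_buf *);
};

/* Destructor: give the charged bytes back to the owning socket. */
void toy_rfree(struct toy_buf *b)
{
    b->owner->rmem -= b->truesize;
}

/* Drop any existing ownership, running the destructor if one is set. */
void toy_orphan(struct toy_buf *b)
{
    if (b->destructor)
        b->destructor(b);
    b->destructor = NULL;
    b->owner = NULL;
}

/* Hand the buffer to a receiving socket. Orphaning first ensures the
 * previous owner is uncharged exactly when a new owner is charged. */
void toy_set_owner_r(struct toy_buf *b, struct toy_sock *sk)
{
    toy_orphan(b);
    b->owner = sk;
    b->destructor = toy_rfree;
    sk->rmem += b->truesize;
}
```

If toy_set_owner_r() overwrote the owner without orphaning first, the previous socket would keep a non-zero charge forever, which in this model corresponds to the kind of leftover accounting that a socket destructor would complain about.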
     

23 Jun, 2009

4 commits

  • Change all the code that deals directly with ICMPv6 type and code
    values to use u8 instead of a signed int as that's the actual data
    type.

    Signed-off-by: Brian Haley
    Signed-off-by: David S. Miller

    Brian Haley
     
  • * 'for-2.6.31' of git://fieldses.org/git/linux-nfsd: (60 commits)
    SUNRPC: Fix the TCP server's send buffer accounting
    nfsd41: Backchannel: minorversion support for the back channel
    nfsd41: Backchannel: cleanup nfs4.0 callback encode routines
    nfsd41: Remove ip address collision detection case
    nfsd: optimise the starting of zero threads when none are running.
    nfsd: don't take nfsd_mutex twice when setting number of threads.
    nfsd41: sanity check client drc maxreqs
    nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct
    NFS: kill off complicated macro 'PROC'
    sunrpc: potential memory leak in function rdma_read_xdr
    nfsd: minor nfsd_vfs_write cleanup
    nfsd: Pull write-gathering code out of nfsd_vfs_write
    nfsd: track last inode only in use_wgather case
    sunrpc: align cache_clean work's timer
    nfsd: Use write gathering only with NFSv2
    NFSv4: kill off complicated macro 'PROC'
    NFSv4: do exact check about attribute specified
    knfsd: remove unreported filehandle stats counters
    knfsd: fix reply cache memory corruption
    knfsd: reply cache cleanups
    ...

    Linus Torvalds
     
  • * 'for-2.6.31' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (128 commits)
    nfs41: sunrpc: xprt_alloc_bc_request() should not use spin_lock_bh()
    nfs41: Move initialization of nfs4_opendata seq_res to nfs4_init_opendata_res
    nfs: remove unnecessary NFS_INO_INVALID_ACL checks
    NFS: More "sloppy" parsing problems
    NFS: Invalid mount option values should always fail, even with "sloppy"
    NFS: Remove unused XDR decoder functions
    NFS: Update MNT and MNT3 reply decoding functions
    NFS: add XDR decoder for mountd version 3 auth-flavor lists
    NFS: add new file handle decoders to in-kernel mountd client
    NFS: Add separate mountd status code decoders for each mountd version
    NFS: remove unused function in fs/nfs/mount_clnt.c
    NFS: Use xdr_stream-based XDR encoder for MNT's dirpath argument
    NFS: Clean up MNT program definitions
    lockd: Don't bother with RPC ping for NSM upcalls
    lockd: Update NSM state from SM_MON replies
    NFS: Fix false error return from nfs_callback_up() if ipv6.ko is not available
    NFS: Return error code from nfs_callback_up() to user space
    NFS: Do not display the setting of the "intr" mount option
    NFS: add support for splice writes
    nfs41: Backchannel: CB_SEQUENCE validation
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (43 commits)
    via-velocity: Fix velocity driver unmapping incorrect size.
    mlx4_en: Remove redundant refill code on RX
    mlx4_en: Removed redundant check on lso header size
    mlx4_en: Cancel port_up check in transmit function
    mlx4_en: using stop/start_all_queues
    mlx4_en: Removed redundant skb->len check
    mlx4_en: Counting all the dropped packets on the TX side
    usbnet cdc_subset: fix issues talking to PXA gadgets
    Net: qla3xxx, remove sleeping in atomic
    ipv4: fix NULL pointer + success return in route lookup path
    isdn: clean up documentation index
    cfg80211: validate station settings
    cfg80211: allow setting station parameters in mesh
    cfg80211: allow adding/deleting stations on mesh
    ath5k: fix beacon_int handling
    MAINTAINERS: Fix Atheros pattern paths
    ath9k: restore PS mode, before we put the chip into FULL SLEEP state.
    ath9k: wait for beacon frame along with CAB
    acer-wmi: fix rfkill conversion
    ath5k: avoid PCI FATAL interrupts by restoring RETRY_TIMEOUT disabling
    ...

    Linus Torvalds
     

22 Jun, 2009

2 commits

  • As noticed by Török Edwin :

    Compiling the kernel with clang has shown this warning:

    net/netfilter/xt_rateest.c:69:16: warning: self-comparison always results in a
    constant value
            ret &= pps2 == pps2;
                        ^

    Looking at the code:

            if (info->flags & XT_RATEEST_MATCH_BPS)
                    ret &= bps1 == bps2;
            if (info->flags & XT_RATEEST_MATCH_PPS)
                    ret &= pps2 == pps2;

    Judging from the MATCH_BPS case it seems to be a typo, with the intention of
    comparing pps1 with pps2.

    http://bugzilla.kernel.org/show_bug.cgi?id=13535

    Signed-off-by: Patrick McHardy

    Patrick McHardy
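
The intended comparison, with pps1 checked against pps2, can be sketched as a standalone function. The flag constants and the function name are hypothetical stand-ins for the XT_RATEEST_MATCH_* flags and the match routine; only the shape of the two comparisons is taken from the entry.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits standing in for XT_RATEEST_MATCH_BPS / _PPS. */
#define TOY_MATCH_BPS 0x1u
#define TOY_MATCH_PPS 0x2u

/* Return true when every rate selected by flags is equal on both sides. */
bool toy_rates_equal(unsigned int flags,
                     uint32_t bps1, uint32_t pps1,
                     uint32_t bps2, uint32_t pps2)
{
    bool ret = true;

    if (flags & TOY_MATCH_BPS)
        ret &= (bps1 == bps2);
    if (flags & TOY_MATCH_PPS)
        ret &= (pps1 == pps2);  /* compares the two estimators, not pps2 with itself */

    return ret;
}
```

With the self-comparison from the warning, the packets-per-second test could never turn the result false, so a rule matching on differing packet rates would behave as if the rates were always equal.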
     
  • Commit v2.6.29-rc5-872-gacc738f ("xtables: avoid pointer to self")
    forgot to copy the initial quota value supplied by iptables into the
    private structure, thus counting from whatever was in the memory
    kmalloc returned.

    Signed-off-by: Jan Engelhardt
    Signed-off-by: Patrick McHardy

    Jan Engelhardt
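
The missing step described in this entry, copying the user-supplied quota into freshly allocated private state, looks roughly like the following in user-space C. The struct layouts and function names are hypothetical, and malloc() stands in for kmalloc(); like kmalloc(), it returns memory whose contents are indeterminate.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical private state kept separately from the user-visible info. */
struct toy_quota_priv {
    uint64_t quota;
};

/* Hypothetical user-visible info: the initial quota plus a pointer
 * to the private state that does the actual counting. */
struct toy_quota_info {
    uint64_t quota;
    struct toy_quota_priv *master;
};

/* Allocate the private state and seed it from the supplied quota.
 * Returns 0 on success, -1 if allocation fails. */
int toy_quota_init(struct toy_quota_info *info)
{
    info->master = malloc(sizeof(*info->master));
    if (info->master == NULL)
        return -1;

    /* Without this copy, counting would start from whatever
     * happened to be in the allocated memory. */
    info->master->quota = info->quota;
    return 0;
}

/* Release the private state. */
void toy_quota_destroy(struct toy_quota_info *info)
{
    free(info->master);
    info->master = NULL;
}
```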