26 May, 2011

3 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-nsfd:
    net: fix get_net_ns_by_fd for !CONFIG_NET_NS
    ns proc: Return -ENOENT for a nonexistent /proc/self/ns/ entry.
    ns: Declare sys_setns in syscalls.h
    net: Allow setting the network namespace by fd
    ns proc: Add support for the ipc namespace
    ns proc: Add support for the uts namespace
    ns proc: Add support for the network namespace.
    ns: Introduce the setns syscall
    ns: proc files for namespace naming policy.

    Linus Torvalds
     
  • Commit e67f88dd12f6 (dont hold rtnl mutex during netlink dump callbacks)
    missed the fact that rtnl_fill_ifinfo() must be called with rtnl held.

    Because of possible deadlocks between two mutexes (cb_mutex and rtnl),
    it's not easy to solve this problem, so revert this part of the patch.

    It also forgot one rcu_read_unlock() in FIB dump_rules().

    Add one ASSERT_RTNL() in rtnl_fill_ifinfo() to remind us of the rule.

    Signed-off-by: Eric Dumazet
    CC: Patrick McHardy
    CC: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • If the device passed into dev_disable_lro is a vlan, then repoint the dev
    pointer so that we actually modify the underlying physical device.

    Signed-off-by: Neil Horman
    CC: davem@davemloft.net
    CC: bhutchings@solarflare.com

    Signed-off-by: David S. Miller

    Neil Horman
     

25 May, 2011

3 commits

  • After merging the final tree, today's linux-next build (powerpc
    ppc44x_defconfig) failed like this:

    net/built-in.o: In function `get_net_ns_by_fd':
    (.text+0x11976): undefined reference to `netns_operations'
    net/built-in.o: In function `get_net_ns_by_fd':
    (.text+0x1197a): undefined reference to `netns_operations'

    netns_operations is only available if CONFIG_NET_NS is set ...

    Caused by commit f063052947f7 ("net: Allow setting the network namespace
    by fd").

    Signed-off-by: Stephen Rothwell
    Signed-off-by: Eric W. Biederman

    Stephen Rothwell
     
  • dst_default_metrics is read-only, so we don't want to kfree() it later.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • synchronize_rcu() is very slow in various situations (HZ=100,
    CONFIG_NO_HZ=y, CONFIG_PREEMPT=n)

    Extract from my (mostly idle) 8-core machine:

    synchronize_rcu() in 99985 us
    synchronize_rcu() in 79982 us
    synchronize_rcu() in 87612 us
    synchronize_rcu() in 79827 us
    synchronize_rcu() in 109860 us
    synchronize_rcu() in 98039 us
    synchronize_rcu() in 89841 us
    synchronize_rcu() in 79842 us
    synchronize_rcu() in 80151 us
    synchronize_rcu() in 119833 us
    synchronize_rcu() in 99858 us
    synchronize_rcu() in 73999 us
    synchronize_rcu() in 79855 us
    synchronize_rcu() in 79853 us

    When we hold the RTNL mutex, we would rather spend some CPU cycles than
    block other processes waiting for this mutex for too long.

    We also want to setup/dismantle network features as fast as possible at
    boot/shutdown time.

    This patch makes synchronize_net() call the expedited version if RTNL is
    locked.

    synchronize_rcu_expedited() typical delay is about 20 us on my machine.

    synchronize_rcu_expedited() in 18 us
    synchronize_rcu_expedited() in 18 us
    synchronize_rcu_expedited() in 18 us
    synchronize_rcu_expedited() in 18 us
    synchronize_rcu_expedited() in 20 us
    synchronize_rcu_expedited() in 16 us
    synchronize_rcu_expedited() in 20 us
    synchronize_rcu_expedited() in 18 us
    synchronize_rcu_expedited() in 18 us

    Signed-off-by: Eric Dumazet
    CC: Paul E. McKenney
    CC: Ben Greear
    Reviewed-by: Paul E. McKenney
    Signed-off-by: David S. Miller

    Eric Dumazet
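
The choice described in the commit above (use the expedited grace period only while RTNL is held) can be sketched in user space as follows. This is a minimal mock-up, not the kernel source: the three primitives are replaced by hypothetical stubs that merely count calls, and the `rtnl_locked` flag stands in for the real mutex state.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel primitives, for illustration only. */
static bool rtnl_locked;        /* models whether the RTNL mutex is held */
static int normal_calls;        /* number of slow grace-period waits */
static int expedited_calls;     /* number of expedited grace-period waits */

static bool rtnl_is_locked(void)
{
    return rtnl_locked;
}

static void synchronize_rcu(void)
{
    normal_calls++;             /* real version: tens of ms in the log above */
}

static void synchronize_rcu_expedited(void)
{
    expedited_calls++;          /* real version: about 20 us in the log above */
}

/*
 * Sketch of the policy: while RTNL is held, other processes are queued
 * behind the mutex, so pay extra CPU for the fast grace period.
 * Otherwise use the cheaper, slower one.
 */
static void synchronize_net(void)
{
    if (rtnl_is_locked())
        synchronize_rcu_expedited();
    else
        synchronize_rcu();
}
```

Calling `synchronize_net()` with `rtnl_locked` set to false increments only `normal_calls`; with it set to true, only `expedited_calls`.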
     

24 May, 2011

1 commit

  • A mis-configured filter can spam the logs with lots of stack traces.

    Rate-limit the warnings and add printout of the bogus filter information.

    Original-patch-by: Ben Greear
    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

23 May, 2011

4 commits


21 May, 2011

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1446 commits)
    macvlan: fix panic if lowerdev in a bond
    tg3: Add braces around 5906 workaround.
    tg3: Fix NETIF_F_LOOPBACK error
    macvlan: remove one synchronize_rcu() call
    networking: NET_CLS_ROUTE4 depends on INET
    irda: Fix error propagation in ircomm_lmp_connect_response()
    irda: Kill set but unused variable 'bytes' in irlan_check_command_param()
    irda: Kill set but unused variable 'clen' in ircomm_connect_indication()
    rxrpc: Fix set but unused variable 'usage' in rxrpc_get_transport()
    be2net: Kill set but unused variable 'req' in lancer_fw_download()
    irda: Kill set but unused vars 'saddr' and 'daddr' in irlan_provider_connect_indication()
    atl1c: atl1c_resume() is only used when CONFIG_PM_SLEEP is defined.
    rxrpc: Fix set but unused variable 'usage' in rxrpc_get_peer().
    rxrpc: Kill set but unused variable 'local' in rxrpc_UDP_error_handler()
    rxrpc: Kill set but unused variable 'sp' in rxrpc_process_connection()
    rxrpc: Kill set but unused variable 'sp' in rxrpc_rotate_tx_window()
    pkt_sched: Kill set but unused variable 'protocol' in tc_classify()
    isdn: capi: Use pr_debug() instead of ifdefs.
    tg3: Update version to 3.119
    tg3: Apply rx_discards fix to 5719/5720
    ...

    Fix up trivial conflicts in arch/x86/Kconfig and net/mac80211/agg-tx.c
    as per Davem.

    Linus Torvalds
     
  • Commit e66eed651fd1 ("list: remove prefetching from regular list
    iterators") removed the include of prefetch.h from list.h, which
    uncovered several cases that had apparently relied on that rather
    obscure header file dependency.

    So this fixes things up a bit, using

    grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]')
    grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]')

    to guide us in finding files that either need <linux/prefetch.h>
    inclusion, or have it despite not needing it.

    There are more of them around (mostly network drivers), but this gets
    many core ones.

    Reported-by: Stephen Rothwell
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

20 May, 2011

3 commits

  • When one macvlan device is dismantled, we can avoid one
    synchronize_rcu() call done after deletion from the hash list, since the
    caller will perform a synchronize_net() call after its ndo_stop() call.

    Add a new netdev->dismantle field to signal this dismantle intent.

    Reduces RTNL hold time.

    Signed-off-by: Eric Dumazet
    CC: Patrick McHardy
    CC: Ben Greear
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits)
    Revert "rcu: Decrease memory-barrier usage based on semi-formal proof"
    net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree
    batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu
    batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree()
    batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu
    net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu()
    net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu()
    net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu()
    net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu()
    net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu()
    perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu()
    perf,rcu: convert call_rcu(free_ctx) to kfree_rcu()
    net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu()
    net,rcu: convert call_rcu(net_generic_release) to kfree_rcu()
    net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu()
    net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu()
    security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu()
    net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu()
    net,rcu: convert call_rcu(xps_map_release) to kfree_rcu()
    net,rcu: convert call_rcu(rps_map_release) to kfree_rcu()
    ...

    Linus Torvalds
     
  • David S. Miller
     

19 May, 2011

1 commit

  • It's way past its usefulness. And this gets rid of a bunch
    of stray ->rt_{dst,src} references.

    Even the comment documenting the macro was inaccurate (stated
    default was 1 when it's 0).

    If reintroduced, it should be done properly, with dynamic debug
    facilities.

    Signed-off-by: David S. Miller

    David S. Miller
     

18 May, 2011

5 commits

  • Commit 7fee226ad239 (add a noref bit on skb dst) forgot to use
    skb_dst_force() on packets queued in sk_error_queue.

    This triggers the following warning for applications using
    IP_CMSG_PKTINFO that receive an error status:

    ------------[ cut here ]------------
    WARNING: at include/linux/skbuff.h:457 ip_cmsg_recv_pktinfo+0xa6/0xb0()
    Hardware name: 2669UYD
    Modules linked in: isofs vboxnetadp vboxnetflt nfsd ebtable_nat ebtables
    lib80211_crypt_ccmp uinput xcbc hdaps tp_smapi thinkpad_ec radeonfb fb_ddc
    radeon ttm drm_kms_helper drm ipw2200 intel_agp intel_gtt libipw i2c_algo_bit
    i2c_i801 agpgart rng_core cfbfillrect cfbcopyarea cfbimgblt video raid10 raid1
    raid0 linear md_mod vboxdrv
    Pid: 4697, comm: miredo Not tainted 2.6.39-rc6-00569-g5895198-dirty #22
    Call Trace:
    [] ? printk+0x1d/0x1f
    [] warn_slowpath_common+0x72/0xa0
    [] ? ip_cmsg_recv_pktinfo+0xa6/0xb0
    [] ? ip_cmsg_recv_pktinfo+0xa6/0xb0
    [] warn_slowpath_null+0x20/0x30
    [] ip_cmsg_recv_pktinfo+0xa6/0xb0
    [] ip_cmsg_recv+0x127/0x260
    [] ? skb_dequeue+0x4d/0x70
    [] ? skb_copy_datagram_iovec+0x53/0x300
    [] ? sub_preempt_count+0x24/0x50
    [] ip_recv_error+0x23d/0x270
    [] udp_recvmsg+0x264/0x2b0
    [] inet_recvmsg+0xd9/0x130
    [] sock_recvmsg+0xf2/0x120
    [] ? might_fault+0x4b/0xa0
    [] ? verify_iovec+0x4c/0xc0
    [] ? sock_recvmsg_nosec+0x100/0x100
    [] __sys_recvmsg+0x114/0x1e0
    [] ? __lock_acquire+0x365/0x780
    [] ? fget_light+0xa6/0x3e0
    [] ? fget_light+0xbf/0x3e0
    [] ? fget_light+0x2e/0x3e0
    [] sys_recvmsg+0x39/0x60

    Close bug https://bugzilla.kernel.org/show_bug.cgi?id=34622

    Reported-by: Witold Baryluk
    Signed-off-by: Eric Dumazet
    CC: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Conflicts:
    drivers/net/vmxnet3/vmxnet3_ethtool.c
    net/core/dev.c

    David S. Miller
     
  • Signed-off-by: Michał Mirosław
    Signed-off-by: David S. Miller

    Michał Mirosław
     
  • Cool, how about we make 'Features changed' debug as well?
    This way userspace can't fill up the log just by tweaking tun features
    with an ioctl.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
     
  • Using plain hlist_del() in dev_change_name() is wrong since a
    concurrent reader can crash trying to dereference LIST_POISON1.

    Bug introduced in commit 72c9528bab94 (net: Introduce
    dev_get_by_name_rcu())

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
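
Why plain hlist_del() is unsafe for concurrent readers can be seen in a small user-space sketch of the two deletion variants. It follows the general shape of the kernel's hlist helpers, but the poison constants are illustrative values and all memory-ordering details are left out, so treat it as an assumption-laden model rather than the kernel code.

```c
#include <stddef.h>

/* Illustrative poison values in the spirit of LIST_POISON1/LIST_POISON2. */
#define LIST_POISON1 ((void *)0x100)
#define LIST_POISON2 ((void *)0x200)

struct hlist_node {
    struct hlist_node *next;
    struct hlist_node **pprev;  /* address of the pointer that points at us */
};

struct hlist_head {
    struct hlist_node *first;
};

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unlink n from its list without touching n's own pointers. */
static void __hlist_del(struct hlist_node *n)
{
    struct hlist_node *next = n->next;
    struct hlist_node **pprev = n->pprev;

    *pprev = next;
    if (next)
        next->pprev = pprev;
}

/* Plain deletion: poisons ->next, so a reader still standing on n
 * follows LIST_POISON1 on its next step. */
static void hlist_del(struct hlist_node *n)
{
    __hlist_del(n);
    n->next = LIST_POISON1;
    n->pprev = LIST_POISON2;
}

/* RCU-style deletion: ->next stays valid so a reader standing on n can
 * still walk forward; only ->pprev is poisoned. */
static void hlist_del_rcu(struct hlist_node *n)
{
    __hlist_del(n);
    n->pprev = LIST_POISON2;
}
```

After `hlist_del_rcu(&b)` on a list a, b, c, the removed node's `next` still points at c, whereas after `hlist_del()` it holds the poison value that a lockless reader would dereference.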
     

17 May, 2011

2 commits

  • The messages reduced to DEBUG can be triggered by unprivileged processes
    and are nothing exceptional. Illegal checksum combinations can only be
    caused by a driver bug, so promote those messages to WARN.

    Since GSO without SG will now only cause a DEBUG message from
    netdev_fix_features(), remove the workaround from register_netdevice().

    Signed-off-by: Michał Mirosław
    Signed-off-by: David S. Miller

    Michał Mirosław
     
  • We plan to remove the old cpu_xx() API later, so this patch
    converts its users.

    This patch has no functional change.

    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: David S. Miller

    KOSAKI Motohiro
     

14 May, 2011

2 commits

  • Commit 1c5cae815d19 (net: call dev_alloc_name from register_netdevice)
    introduced a bug in bonding; see examples 1 and 2.

    In register_netdevice(), the name of the net_device is not valid until
    dev_get_valid_name() is called. But dev->netdev_ops->ndo_init() (that is,
    bond_init()) is called before dev_get_valid_name(),
    and it uses the invalid name of the net_device.

    I think register_netdevice() should make sure that the name of net_device is
    valid before calling ndo_init().

    example 1:
    modprobe bonding
    ls /proc/net/bonding/bond%d

    ps -eLf
    root 3398 2 3398 0 1 21:34 ? 00:00:00 [bond%d]

    example 2:
    modprobe bonding max_bonds=3

    [ 170.100292] bonding: Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
    [ 170.101090] bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.
    [ 170.102469] ------------[ cut here ]------------
    [ 170.103150] WARNING: at /home/pwp/net-next-2.6/fs/proc/generic.c:586 proc_register+0x126/0x157()
    [ 170.104075] Hardware name: VirtualBox
    [ 170.105065] proc_dir_entry 'bonding/bond%d' already registered
    [ 170.105613] Modules linked in: bonding(+) sunrpc ipv6 uinput microcode ppdev parport_pc parport joydev e1000 pcspkr i2c_piix4 i2c_core [last unloaded: bonding]
    [ 170.108397] Pid: 3457, comm: modprobe Not tainted 2.6.39-rc2+ #14
    [ 170.108935] Call Trace:
    [ 170.109382] [] warn_slowpath_common+0x6a/0x7f
    [ 170.109911] [] ? proc_register+0x126/0x157
    [ 170.110329] [] warn_slowpath_fmt+0x2b/0x2f
    [ 170.110846] [] proc_register+0x126/0x157
    [ 170.111870] [] proc_create_data+0x82/0x98
    [ 170.112335] [] bond_create_proc_entry+0x3f/0x73 [bonding]
    [ 170.112905] [] bond_init+0x77/0xa5 [bonding]
    [ 170.113319] [] register_netdevice+0x8c/0x1d3
    [ 170.113848] [] bond_create+0x6c/0x90 [bonding]
    [ 170.114322] [] bonding_init+0x763/0x7b1 [bonding]
    [ 170.114879] [] do_one_initcall+0x76/0x122
    [ 170.115317] [] ? 0xf94f3fff
    [ 170.115799] [] sys_init_module+0x1286/0x140d
    [ 170.116879] [] sysenter_do_call+0x12/0x28
    [ 170.117404] ---[ end trace 64e4fac3ae5fff1a ]---
    [ 170.117924] bond%d: Warning: failed to register to debugfs
    [ 170.128728] ------------[ cut here ]------------
    [ 170.129360] WARNING: at /home/pwp/net-next-2.6/fs/proc/generic.c:586 proc_register+0x126/0x157()
    [ 170.130323] Hardware name: VirtualBox
    [ 170.130797] proc_dir_entry 'bonding/bond%d' already registered
    [ 170.131315] Modules linked in: bonding(+) sunrpc ipv6 uinput microcode ppdev parport_pc parport joydev e1000 pcspkr i2c_piix4 i2c_core [last unloaded: bonding]
    [ 170.133731] Pid: 3457, comm: modprobe Tainted: G W 2.6.39-rc2+ #14
    [ 170.134308] Call Trace:
    [ 170.134743] [] warn_slowpath_common+0x6a/0x7f
    [ 170.135305] [] ? proc_register+0x126/0x157
    [ 170.135820] [] warn_slowpath_fmt+0x2b/0x2f
    [ 170.137168] [] proc_register+0x126/0x157
    [ 170.137700] [] proc_create_data+0x82/0x98
    [ 170.138174] [] bond_create_proc_entry+0x3f/0x73 [bonding]
    [ 170.138745] [] bond_init+0x77/0xa5 [bonding]
    [ 170.139278] [] register_netdevice+0x8c/0x1d3
    [ 170.139828] [] bond_create+0x6c/0x90 [bonding]
    [ 170.140361] [] bonding_init+0x763/0x7b1 [bonding]
    [ 170.140927] [] do_one_initcall+0x76/0x122
    [ 170.141494] [] ? 0xf94f3fff
    [ 170.141975] [] sys_init_module+0x1286/0x140d
    [ 170.142463] [] sysenter_do_call+0x12/0x28
    [ 170.142974] ---[ end trace 64e4fac3ae5fff1b ]---
    [ 170.144949] bond%d: Warning: failed to register to debugfs

    Signed-off-by: Weiping Pan
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Peter Pan(潘卫平)
     
  • Added code to take a FW dump via ethtool. The dump level can be controlled by
    setting the dump flag. A get function is provided to query the current setting
    of the dump flag. Dump data is obtained from the driver via a separate get
    function.

    Changes from v3:
    Fixed a buffer length issue in the ethtool_get_dump_data function.
    Updated kernel doc for the ethtool_dump struct and get_dump_flag function.

    Changes from v2:
    Provided separate commands for get flag and data.
    Take the minimum of the two buffer lengths obtained via ethtool and the
    driver, and use that for the dump buffer.
    Pass the driver's error codes up to the caller.
    Added kernel doc comments.

    Signed-off-by: Anirban Chakraborty
    Reviewed-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Anirban Chakraborty
     

13 May, 2011

2 commits


12 May, 2011

1 commit


11 May, 2011

3 commits

  • Commit 443457242beb (factorize sync-rcu call in
    unregister_netdevice_many) mistakenly removed one test from dev_close().

    The following actions trigger a BUG:

    modprobe bonding
    modprobe dummy
    ifconfig bond0 up
    ifenslave bond0 dummy0
    rmmod dummy

    dev_close() must not close a non IFF_UP device.

    With help from Frank Blaschka and Einar EL Lueck

    Reported-by: Frank Blaschka
    Reported-by: Einar EL Lueck
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Take advantage of the new abstraction and allow network devices
    to be placed in any network namespace that we have a fd to talk
    about.

    Acked-by: David S. Miller
    Acked-by: Daniel Lezcano
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     
  • Implementing file descriptors for the network namespace
    is simple and straightforward.

    Acked-by: David S. Miller
    Acked-by: Daniel Lezcano
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
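
From user space, the namespace file descriptors introduced by this series are typically consumed by opening a /proc/<pid>/ns/net file and handing the descriptor to setns(). The sketch below assumes a glibc that provides the setns() wrapper; `enter_netns` is a hypothetical helper name, not something defined by these commits.

```c
#define _GNU_SOURCE             /* for setns() and CLONE_NEWNET in glibc */
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <unistd.h>

/*
 * Join the network namespace named by ns_path (for example a
 * /proc/<pid>/ns/net file). Returns 0 on success, or -1 with errno set
 * if the file cannot be opened or the kernel rejects the switch.
 */
static int enter_netns(const char *ns_path)
{
    int fd, ret, saved_errno;

    fd = open(ns_path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    /* CLONE_NEWNET asks the kernel to verify fd is a network namespace. */
    ret = setns(fd, CLONE_NEWNET);

    /* The descriptor is no longer needed once the switch has been tried. */
    saved_errno = errno;
    close(fd);
    errno = saved_errno;

    return ret;
}
```

A caller with sufficient privilege would invoke it as `enter_netns("/proc/1234/ns/net")`, where the PID is only a placeholder. A path that does not exist yields -1 with errno set to ENOENT.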
     

10 May, 2011

2 commits

  • mac_pton() parses a MAC address in the form XX:XX:XX:XX:XX:XX and only in
    that form.

    mac_pton() doesn't dirty the result until it's sure the string
    representation is valid.

    mac_pton() doesn't care about characters _after_ the last octet;
    it's up to the caller to deal with them.

    mac_pton() diverges from the 0/-E return value convention.
    Target usage:

    if (!mac_pton(str, whatever->mac))
            return -EINVAL;
    /* ->mac being u8 [ETH_ALEN] is filled at this point. */
    /* optionally check str[3 * ETH_ALEN - 1] for termination */

    Use mac_pton() in pktgen and netconsole for a start.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
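
The contract described in the commit above (strict XX:XX:XX:XX:XX:XX form, output untouched on failure, trailing characters ignored, 1 on success and 0 on failure) can be illustrated with a user-space sketch. The function name `mac_pton_sketch` and the helper `hex_val` are hypothetical, and the body is one possible way to meet that contract, not a copy of the kernel function.

```c
#include <ctype.h>
#include <string.h>

#define ETH_ALEN 6              /* octets in an Ethernet address */

/* Convert one hex digit; the caller guarantees c is a valid hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower((unsigned char)c) - 'a' + 10;
}

/* Returns 1 and fills mac[] on success; returns 0 and leaves mac[] alone
 * if s does not start with a well-formed XX:XX:XX:XX:XX:XX address. */
static int mac_pton_sketch(const char *s, unsigned char *mac)
{
    size_t i;

    /* Need at least 6 digit pairs plus 5 separators. */
    if (strlen(s) < 3 * ETH_ALEN - 1)
        return 0;

    /* First pass: validate only, so the result is never half-written. */
    for (i = 0; i < ETH_ALEN; i++) {
        if (!isxdigit((unsigned char)s[i * 3]) ||
            !isxdigit((unsigned char)s[i * 3 + 1]))
            return 0;
        if (i != ETH_ALEN - 1 && s[i * 3 + 2] != ':')
            return 0;
    }

    /* Second pass: the string is known to be valid, so convert it. */
    for (i = 0; i < ETH_ALEN; i++)
        mac[i] = (unsigned char)((hex_val(s[i * 3]) << 4) |
                                 hex_val(s[i * 3 + 1]));
    return 1;
}
```

Because whatever follows the last octet is not inspected, a caller that needs a terminated string still has to check `s[3 * ETH_ALEN - 1]` itself, as the commit message notes.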
     
  • veth devices don't use the batched device unregisters yet.

    Since veth devices come in pairs, it makes sense to use a batch of two
    unregisters; this roughly divides dismantle time by two.

    Fix this by changing dellink() callers to always provide a non NULL
    head. (Idea from Michał Mirosław)

    This patch also handles the macvlan case: we now dismantle all macvlans
    on top of a lower dev at once.

    Reported-by: Alex Bligh
    Signed-off-by: Eric Dumazet
    Cc: Michał Mirosław
    Cc: Jesse Gross
    Cc: Paul E. McKenney
    Cc: Ben Greear
    Signed-off-by: David S. Miller

    Eric Dumazet
     

09 May, 2011

3 commits

  • This patch enables ethtool to set the loopback mode on a given interface.
    By configuring the interface in loopback mode in conjunction with a policy
    route / rule, a userland application can stress the egress / ingress path
    exposing the flaws of the change in progress and potentially help developer(s)
    understand the impact of those changes without even sending a packet out
    on the network.

    The following set of commands illustrates one such example:
    a) ip -4 addr add 192.168.1.1/24 dev eth1
    b) ip -4 rule add from all iif eth1 lookup 250
    c) ip -4 route add local 0/0 dev lo proto kernel scope host table 250
    d) arp -Ds 192.168.1.100 eth1
    e) arp -Ds 192.168.1.200 eth1
    f) sysctl -w net.ipv4.ip_nonlocal_bind=1
    g) sysctl -w net.ipv4.conf.all.accept_local=1
    # Assuming that the machine has 8 cores
    h) taskset 000f netserver -L 192.168.1.200
    i) taskset 00f0 netperf -t TCP_CRR -L 192.168.1.100 -H 192.168.1.200 -l 30

    Signed-off-by: Mahesh Bandewar
    Acked-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Mahesh Bandewar
     
  • I don't know why %pI6 doesn't compress, but the format specifier is
    kernel-standard, so use it.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     
  • All the upstream kernel drivers now use set_phys_id,
    so the old ethtool_ops interface (phys_id) can be removed.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

08 May, 2011

3 commits