01 Nov, 2011

1 commit


02 Jul, 2011

1 commit

  • This patch adds a change sequence counter to each net namespace
    which is bumped whenever a netdevice is added or removed from
    the list. If such a change occurred while a link dump took place,
    the dump will have the NLM_F_DUMP_INTR flag set in the first
    message which has been interrupted and in all subsequent messages
    of the same dump.

    Note that links may still be modified or renamed while a dump is
    taking place but we can guarantee for userspace to receive a
    complete list of links and not miss any.
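
The counter scheme described above can be sketched as a small userspace model. This is not the kernel patch: the struct and function names are hypothetical, and the `interrupted` field only stands in for the NLM_F_DUMP_INTR flag. The sketch assumes the dump records the counter when it starts and compares it before emitting each message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one namespace's device list. */
struct dev_list_model {
    unsigned int seq;  /* bumped on every add or remove */
    size_t ndev;       /* number of devices currently listed */
};

/* Cursor kept between the successive messages of one dump. */
struct dump_model {
    unsigned int start_seq; /* counter value when the dump began */
    size_t next;            /* index of the next device to emit */
    bool interrupted;       /* sticky; models NLM_F_DUMP_INTR */
};

static void dev_list_add(struct dev_list_model *l)
{
    l->ndev++;
    l->seq++;
}

static void dev_list_del(struct dev_list_model *l)
{
    assert(l->ndev > 0);
    l->ndev--;
    l->seq++;
}

static void dump_start(struct dump_model *d, const struct dev_list_model *l)
{
    d->start_seq = l->seq;
    d->next = 0;
    d->interrupted = false;
}

/* Emit up to `batch` entries (one "message"); returns how many were emitted. */
static size_t dump_chunk(struct dump_model *d, const struct dev_list_model *l,
                         size_t batch)
{
    size_t emitted = 0;

    /* Once the list has changed, this and all later messages are marked. */
    if (l->seq != d->start_seq)
        d->interrupted = true;

    while (emitted < batch && d->next < l->ndev) {
        d->next++;
        emitted++;
    }
    return emitted;
}
```

A reader of such a dump would typically restart it when any message carries the flag, since the result may no longer be a consistent snapshot.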

    Testing:
    I have added 500 VLAN netdevices to make sure the dump is split
    over multiple messages. Then while continuously dumping links in
    one process I also continuously deleted and re-added a dummy
    netdevice in another process. Multiple dumps per second had
    the NLM_F_DUMP_INTR flag set.

    I guess we can wait for Johannes' patch to hit net-next via the
    wireless tree. I just wanted to give this some testing right away.

    Signed-off-by: Thomas Graf
    Signed-off-by: David S. Miller

    Thomas Graf
     

17 Jun, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
    AFS: Use i_generation not i_version for the vnode uniquifier
    AFS: Set s_id in the superblock to the volume name
    vfs: Fix data corruption after failed write in __block_write_begin()
    afs: afs_fill_page reads too much, or wrong data
    VFS: Fix vfsmount overput on simultaneous automount
    fix wrong iput on d_inode introduced by e6bc45d65d
    Delay struct net freeing while there's a sysfs instance refering to it
    afs: fix sget() races, close leak on umount
    ubifs: fix sget races
    ubifs: split allocation of ubifs_info into a separate function
    fix leak in proc_set_super()

    Linus Torvalds
     

13 Jun, 2011

1 commit

  • * new refcount in struct net, controlling actual freeing of the memory
    * new method in kobj_ns_type_operations (->drop_ns())
    * ->current_ns() semantics change - it's supposed to be followed by
    corresponding ->drop_ns(). For struct net in case of CONFIG_NET_NS it bumps
    the new refcount; net_drop_ns() decrements it and calls net_free() if the
    last reference has been dropped. Method renamed to ->grab_current_ns().
    * old net_free() callers call net_drop_ns() instead.
    * sysfs_exit_ns() is gone, along with a large part of callchain
    leading to it; now that the references stored in ->ns[...] stay valid we
    do not need to hunt them down and replace them with NULL. That fixes
    problems in sysfs_lookup() and sysfs_readdir(), along with getting rid
    of sb->s_instances abuse.

    Note that struct net *shutdown* logic has not changed - net_cleanup()
    is called exactly when it used to be called. The only thing postponed by
    having a sysfs instance referring to that struct net is the actual freeing
    of the memory occupied by struct net.
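
To illustrate the split between shutdown and freeing, here is a minimal single-threaded userspace model. All names (`net_model`, `passive`, `grab_ns_model`, `drop_ns_model`) are hypothetical, and the plain `int` counter is an assumption made for brevity; real kernel code would need atomic operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct net_model {
    int passive;      /* hypothetical: references that only delay freeing */
    bool cleaned_up;  /* set once shutdown has run */
};

static int freed_count;  /* counts how many models were actually freed */

static struct net_model *net_model_alloc(void)
{
    struct net_model *n = calloc(1, sizeof(*n));

    if (n)
        n->passive = 1;  /* the creator's own reference */
    return n;
}

/* Models ->grab_current_ns(): take a reference that keeps the memory alive. */
static struct net_model *grab_ns_model(struct net_model *n)
{
    assert(n != NULL);
    n->passive++;
    return n;
}

/* Models ->drop_ns(): free the memory when the last reference goes away. */
static void drop_ns_model(struct net_model *n)
{
    assert(n != NULL && n->passive > 0);
    if (--n->passive == 0) {
        free(n);
        freed_count++;
    }
}

/* Shutdown happens at the usual time; only the free may be postponed. */
static void net_model_cleanup(struct net_model *n)
{
    n->cleaned_up = true;
    drop_ns_model(n);
}
```

In this model a holder such as a sysfs instance can still safely read the struct after cleanup, which is why the stored references no longer need to be hunted down and replaced with NULL.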

    Signed-off-by: Al Viro

    Al Viro
     

06 Jun, 2011

1 commit

  • BTW, looking through the code related to struct net lifetime rules has
    caught something else:

    struct net *get_net_ns_by_fd(int fd)
    {
            ...
            file = proc_ns_fget(fd);
            if (!file)
                    goto out;

            ei = PROC_I(file->f_dentry->d_inode);

    while in proc_ns_fget() we have two return ERR_PTR(...) and not a single
    path that would return NULL. The other caller of proc_ns_fget() treats
    ERR_PTR() correctly...
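
The mismatch can be reproduced in userspace. The sketch below re-creates the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers in simplified form; `fget_model()` and the two lookup functions are hypothetical stand-ins, and the specific errno values are chosen only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace re-creations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct file_model { int fd; };

static struct file_model the_file = { 3 };

/* Hypothetical lookup: like proc_ns_fget(), it never returns NULL. */
static struct file_model *fget_model(int fd)
{
    if (fd < 0)
        return ERR_PTR(-EBADF);
    if (fd != the_file.fd)
        return ERR_PTR(-EINVAL);
    return &the_file;
}

/* Buggy pattern: a NULL test can never catch an ERR_PTR() value. */
static long lookup_buggy(int fd)
{
    struct file_model *f = fget_model(fd);

    if (!f)
        return -1;
    return 0;  /* real code would go on to dereference f here */
}

/* Fixed pattern: test with IS_ERR() and propagate the encoded errno. */
static long lookup_fixed(int fd)
{
    struct file_model *f = fget_model(fd);

    if (IS_ERR(f))
        return PTR_ERR(f);
    return 0;
}
```

With a bad descriptor, `lookup_buggy()` reports success and would dereference an invalid pointer, while `lookup_fixed()` returns the error.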

    Signed-off-by: Al Viro
    Signed-off-by: David S. Miller

    Al Viro
     

26 May, 2011

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/linux-2.6-nsfd:
    net: fix get_net_ns_by_fd for !CONFIG_NET_NS
    ns proc: Return -ENOENT for a nonexistent /proc/self/ns/ entry.
    ns: Declare sys_setns in syscalls.h
    net: Allow setting the network namespace by fd
    ns proc: Add support for the ipc namespace
    ns proc: Add support for the uts namespace
    ns proc: Add support for the network namespace.
    ns: Introduce the setns syscall
    ns: proc files for namespace naming policy.

    Linus Torvalds
     

25 May, 2011

1 commit

  • After merging the final tree, today's linux-next build (powerpc
    ppc44x_defconfig) failed like this:

    net/built-in.o: In function `get_net_ns_by_fd':
    (.text+0x11976): undefined reference to `netns_operations'
    net/built-in.o: In function `get_net_ns_by_fd':
    (.text+0x1197a): undefined reference to `netns_operations'

    netns_operations is only available if CONFIG_NET_NS is set ...

    Caused by commit f063052947f7 ("net: Allow setting the network namespace
    by fd").

    Signed-off-by: Stephen Rothwell
    Signed-off-by: Eric W. Biederman

    Stephen Rothwell
     

21 May, 2011

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1446 commits)
    macvlan: fix panic if lowerdev in a bond
    tg3: Add braces around 5906 workaround.
    tg3: Fix NETIF_F_LOOPBACK error
    macvlan: remove one synchronize_rcu() call
    networking: NET_CLS_ROUTE4 depends on INET
    irda: Fix error propagation in ircomm_lmp_connect_response()
    irda: Kill set but unused variable 'bytes' in irlan_check_command_param()
    irda: Kill set but unused variable 'clen' in ircomm_connect_indication()
    rxrpc: Fix set but unused variable 'usage' in rxrpc_get_transport()
    be2net: Kill set but unused variable 'req' in lancer_fw_download()
    irda: Kill set but unused vars 'saddr' and 'daddr' in irlan_provider_connect_indication()
    atl1c: atl1c_resume() is only used when CONFIG_PM_SLEEP is defined.
    rxrpc: Fix set but unused variable 'usage' in rxrpc_get_peer().
    rxrpc: Kill set but unused variable 'local' in rxrpc_UDP_error_handler()
    rxrpc: Kill set but unused variable 'sp' in rxrpc_process_connection()
    rxrpc: Kill set but unused variable 'sp' in rxrpc_rotate_tx_window()
    pkt_sched: Kill set but unused variable 'protocol' in tc_classify()
    isdn: capi: Use pr_debug() instead of ifdefs.
    tg3: Update version to 3.119
    tg3: Apply rx_discards fix to 5719/5720
    ...

    Fix up trivial conflicts in arch/x86/Kconfig and net/mac80211/agg-tx.c
    as per Davem.

    Linus Torvalds
     

11 May, 2011

2 commits


08 May, 2011

1 commit


16 Apr, 2011

1 commit


26 Oct, 2010

1 commit


28 Apr, 2010

1 commit


25 Apr, 2010

1 commit


04 Dec, 2009

3 commits


02 Dec, 2009

2 commits

  • To get the full benefit of batched network namespace cleanup, network
    device deletion needs to be performed by the generic code. When
    using register_pernet_gen_device and freeing the data in exit_net,
    it is impossible to delay deallocation until after exit_net has been
    called, as the device uninit methods are no longer safe.

    To correct this, and to simplify working with per network namespace data
    I have moved allocation and deletion of per network namespace data into
    the network namespace core. The core now frees the data only after
    all of the network namespace exit routines have run.

    Now it is only required to set the new fields .id and .size
    in the pernet_operations structure if you want network namespace
    data to be managed for you automatically.

    This makes the current register_pernet_gen_device and
    register_pernet_gen_subsys routines unnecessary. For the moment
    I have left them as compatibility wrappers in net_namespace.h.
    They will be removed once all of the users have been updated.
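
A rough userspace model of core-managed per-namespace data follows. It is a sketch under assumptions: the `_model` names, the fixed-size `gen[]` array, and the `foo` subsystem are all hypothetical, and error unwinding is simplified. The point it tries to show is the ordering: every exit routine runs first, and only then is the data freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_GEN 8  /* arbitrary cap for this model */

struct ns_model {
    void *gen[MAX_GEN];  /* per-subsystem data, indexed by id */
};

struct pernet_ops_model {
    int (*init)(struct ns_model *ns);
    void (*exit)(struct ns_model *ns);
    int *id;      /* where the core stores the assigned slot */
    size_t size;  /* bytes the core allocates per namespace */
};

static int next_gen_id;

static int register_ops_model(struct pernet_ops_model *ops)
{
    if (ops->id) {
        if (next_gen_id >= MAX_GEN)
            return -1;
        *ops->id = next_gen_id++;
    }
    return 0;
}

/* Free the core-managed data of the first n entries of ops[]. */
static void free_gen_data(struct ns_model *ns, struct pernet_ops_model **ops,
                          size_t n)
{
    while (n-- > 0) {
        if (ops[n]->id && ops[n]->size) {
            free(ns->gen[*ops[n]->id]);
            ns->gen[*ops[n]->id] = NULL;
        }
    }
}

static int ns_setup_model(struct ns_model *ns, struct pernet_ops_model **ops,
                          size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (ops[i]->id && ops[i]->size) {
            void *data = calloc(1, ops[i]->size);

            if (!data) {
                free_gen_data(ns, ops, i);
                return -1;
            }
            ns->gen[*ops[i]->id] = data;
        }
        /* Simplified: a fuller model would also unwind earlier inits. */
        if (ops[i]->init && ops[i]->init(ns) != 0) {
            free_gen_data(ns, ops, i + 1);
            return -1;
        }
    }
    return 0;
}

static void ns_teardown_model(struct ns_model *ns,
                              struct pernet_ops_model **ops, size_t n)
{
    /* First run every exit routine, in reverse registration order... */
    for (size_t i = n; i-- > 0;)
        if (ops[i]->exit)
            ops[i]->exit(ns);
    /* ...and only then free the data those routines may still have used. */
    free_gen_data(ns, ops, n);
}

/* Hypothetical subsystem that only sets .id and .size plus an exit hook. */
struct foo_data { int counter; };
static int foo_id;
static int foo_seen_at_exit = -1;

static void foo_exit(struct ns_model *ns)
{
    struct foo_data *d = ns->gen[foo_id];

    foo_seen_at_exit = d->counter;  /* data is still valid here */
}

static struct pernet_ops_model foo_ops = {
    .exit = foo_exit,
    .id = &foo_id,
    .size = sizeof(struct foo_data),
};
```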

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • It is fairly common to kill several network namespaces at once, either
    because they are nested one inside the other or because they are cooperating
    in multiple machine networking experiments. As the network stack control logic
    does not parallelize easily, batch up multiple network namespaces exiting
    together.

    To get the full benefit of batching, the virtual network devices to be
    removed must all be removed in one batch. For that purpose I have added
    a loop after the last network device operations have run that batches
    up all remaining network devices and deletes them.

    An extra benefit is that the reorganization slightly shrinks the size
    of the per network namespace data structures, replacing a work_struct
    with a list_head.

    In a trivial test with 4K namespaces this change reduced the cost of
    destroying 4K namespaces from 7+ minutes (at 12% cpu) to 44 seconds
    (at 60% cpu). The bulk of that 44s was spent in inet_twsk_purge.
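
The batching idea can be sketched in userspace as follows. The names are hypothetical, the singly linked list stands in for the list_head mentioned above, and `expensive_sync_model()` is an assumed placeholder for whatever serialized step dominates the per-namespace cost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct ns_model {
    struct ns_model *next;  /* link on the pending-cleanup list */
    bool dead;
};

static struct ns_model *cleanup_list;  /* namespaces waiting for cleanup */
static int sync_calls;                 /* how often the costly step ran */

/* Called when a namespace exits: just queue it, do no heavy work yet. */
static void queue_cleanup(struct ns_model *ns)
{
    assert(ns != NULL);
    ns->next = cleanup_list;
    cleanup_list = ns;
}

/* Placeholder for a costly, serialized step (hypothetical). */
static void expensive_sync_model(void)
{
    sync_calls++;
}

/* Worker: take everything queued so far and pay the costly step once. */
static int cleanup_batch(void)
{
    struct ns_model *batch = cleanup_list;
    int n = 0;

    cleanup_list = NULL;
    if (!batch)
        return 0;

    expensive_sync_model();
    while (batch) {
        struct ns_model *next = batch->next;

        batch->dead = true;
        batch->next = NULL;
        batch = next;
        n++;
    }
    return n;
}
```

Three namespaces queued before the worker runs cost one synchronization step instead of three, which is the effect the timing figures above reflect at a much larger scale.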

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

26 Nov, 2009

1 commit

  • Generated with the following semantic patch

    @@
    struct net *n1;
    struct net *n2;
    @@
    - n1 == n2
    + net_eq(n1, n2)

    @@
    struct net *n1;
    struct net *n2;
    @@
    - n1 != n2
    + !net_eq(n1, n2)

    applied over {include,net,drivers/net}.
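
For readers unfamiliar with the helper, the following userspace sketch shows what a converted call site looks like and why a wrapper is useful. The `_model` names and the `MODEL_NET_NS` switch are hypothetical stand-ins; the assumption illustrated is that the comparison can collapse to a constant when namespaces are compiled out.

```c
#include <stdbool.h>

struct net_model { int id; };

#define MODEL_NET_NS 1  /* hypothetical switch standing in for CONFIG_NET_NS */

#if MODEL_NET_NS
/* With namespaces enabled, equality is pointer identity. */
static inline bool net_eq_model(const struct net_model *a,
                                const struct net_model *b)
{
    return a == b;
}
#else
/* With a single namespace, every comparison is trivially true. */
static inline bool net_eq_model(const struct net_model *a,
                                const struct net_model *b)
{
    (void)a;
    (void)b;
    return true;
}
#endif

/* A call site after the conversion: was `dev_net == sk_net`. */
static bool same_namespace(const struct net_model *dev_net,
                           const struct net_model *sk_net)
{
    return net_eq_model(dev_net, sk_net);
}
```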

    Signed-off-by: Octavian Purdila
    Signed-off-by: David S. Miller

    Octavian Purdila
     

13 Aug, 2009

1 commit


03 Aug, 2009

1 commit


13 Jul, 2009

2 commits

  • The function get_net_ns_by_pid(), to get a network
    namespace from a pid_t, will be required in cfg80211
    as well. Therefore, let's move it to net_namespace.c
    and export it. We can't make it a static inline in
    the !NETNS case because it needs to verify that the
    given pid even exists (and return -ESRCH).

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • All we need to take care of is using proper RCU list
    add/del primitives and inserting a synchronize_rcu()
    at one place to make sure the exit notifiers are run
    after everybody has stopped iterating the list.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

22 May, 2009

2 commits


05 May, 2009

2 commits


03 Mar, 2009

1 commit

  • It turns out that net_alive is unnecessary, and the original problem
    that led to it being added was simply that the icmp code thought
    it was a network device and wound up being unable to handle packets
    while there were still packets in the network namespace.

    Now that icmp and tcp have been fixed to properly register themselves
    this problem is no longer present and we have a stronger guarantee
    that packets will not arrive in a network namespace than that provided
    by net_alive in netif_receive_skb. So remove net_alive, allowing
    packet reception to run a little faster.

    Additionally document the strong reason why network namespace cleanup
    is safe so that if something happens again someone else will have
    a chance of figuring it out.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

24 Feb, 2009

1 commit


22 Feb, 2009

1 commit

  • This patch fixes a double free when network namespace setup fails.
    The previous code does a kfree of the net_generic structure when
    one of the subsystem initializations fails.
    The 'setup_net' function does kfree(ng) and returns an error.
    The caller, 'copy_net_ns', calls net_free on error, and this one
    calls kfree(net->gen), so the pointer is freed twice.

    This patch makes the code symmetric: net_alloc does the net_generic
    allocation and net_free frees the net_generic.
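
The symmetric ownership rule can be modeled in userspace like this. The `_model` names and the `fail` parameter are hypothetical; the sketch only shows that when the allocator owns `gen` and the matching free routine releases it, the setup step has nothing to free on error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct net_model {
    void *gen;  /* stands in for the net_generic structure */
};

static int gen_frees;  /* counts how often a gen pointer was freed */

/* The allocator owns both the struct and its gen data. */
static struct net_model *net_model_alloc(void)
{
    struct net_model *n = calloc(1, sizeof(*n));

    if (!n)
        return NULL;
    n->gen = calloc(1, 64);
    if (!n->gen) {
        free(n);
        return NULL;
    }
    return n;
}

/* The matching free routine is the only place gen is released. */
static void net_model_free(struct net_model *n)
{
    if (!n)
        return;
    free(n->gen);
    gen_frees++;
    free(n);
}

/* Setup may fail, but it deliberately does not free n->gen itself. */
static int setup_model(struct net_model *n, int fail)
{
    assert(n != NULL);
    return fail ? -ENOMEM : 0;
}

/* Caller: on setup failure, a single net_model_free() cleans up everything. */
static struct net_model *copy_model(int fail)
{
    struct net_model *n = net_model_alloc();

    if (!n)
        return NULL;
    if (setup_model(n, fail) != 0) {
        net_model_free(n);
        return NULL;
    }
    return n;
}
```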

    Signed-off-by: Daniel Lezcano
    Signed-off-by: David S. Miller

    Daniel Lezcano
     

21 Jan, 2009

1 commit


31 Oct, 2008

2 commits


29 Oct, 2008

1 commit

  • call_rcu() will unconditionally rewrite RCU head anyway.
    Applies to
    struct neigh_parms
    struct neigh_table
    struct net
    struct cipso_v4_doi
    struct in_ifaddr
    struct in_device
    rt->u.dst

    Signed-off-by: Alexey Dobriyan
    Acked-by: Paul E. McKenney
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

15 Oct, 2008

1 commit


08 Oct, 2008

1 commit

  • Conntrack code will use it for
    a) removing expectations and helpers when corresponding module is removed, and
    b) removing conntracks when L3 protocol conntrack module is removed.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Patrick McHardy

    Alexey Dobriyan
     

21 Jun, 2008

1 commit

  • Alexey Dobriyan writes:
    > Subject: ICMP sockets destruction vs ICMP packets oops

    > After icmp_sk_exit() nuked ICMP sockets, we get an interrupt.
    > icmp_reply() wants ICMP socket.
    >
    > Steps to reproduce:
    >
    > launch shell in new netns
    > move real NIC to netns
    > setup routing
    > ping -i 0
    > exit from shell
    >
    > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
    > IP: [] icmp_sk+0x17/0x30
    > PGD 17f3cd067 PUD 17f3ce067 PMD 0
    > Oops: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC
    > CPU 0
    > Modules linked in: usblp usbcore
    > Pid: 0, comm: swapper Not tainted 2.6.26-rc6-netns-ct #4
    > RIP: 0010:[] [] icmp_sk+0x17/0x30
    > RSP: 0018:ffffffff8057fc30 EFLAGS: 00010286
    > RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff81017c7db900
    > RDX: 0000000000000034 RSI: ffff81017c7db900 RDI: ffff81017dc41800
    > RBP: ffffffff8057fc40 R08: 0000000000000001 R09: 000000000000a815
    > R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8057fd28
    > R13: ffffffff8057fd00 R14: ffff81017c7db938 R15: ffff81017dc41800
    > FS: 0000000000000000(0000) GS:ffffffff80525000(0000) knlGS:0000000000000000
    > CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
    > CR2: 0000000000000000 CR3: 000000017fcda000 CR4: 00000000000006e0
    > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    > DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    > Process swapper (pid: 0, threadinfo ffffffff8053a000, task ffffffff804fa4a0)
    > Stack: 0000000000000000 ffff81017c7db900 ffffffff8057fcf0 ffffffff803fcfe4
    > ffffffff804faa38 0000000000000246 0000000000005a40 0000000000000246
    > 000000000001ffff ffff81017dd68dc0 0000000000005a40 0000000055342436
    > Call Trace:
    > [] icmp_reply+0x44/0x1e0
    > [] ? ip_route_input+0x23a/0x1360
    > [] icmp_echo+0x65/0x70
    > [] icmp_rcv+0x180/0x1b0
    > [] ip_local_deliver+0xf4/0x1f0
    > [] ip_rcv+0x33b/0x650
    > [] netif_receive_skb+0x27a/0x340
    > [] process_backlog+0x9d/0x100
    > [] net_rx_action+0x18d/0x250
    > [] __do_softirq+0x75/0x100
    > [] call_softirq+0x1c/0x30
    > [] do_softirq+0x65/0xa0
    > [] irq_exit+0x97/0xa0
    > [] do_IRQ+0xa8/0x130
    > [] ? mwait_idle+0x0/0x60
    > [] ret_from_intr+0x0/0xf
    > [] ? mwait_idle+0x4c/0x60
    > [] ? mwait_idle+0x43/0x60
    > [] ? cpu_idle+0x57/0xa0
    > [] ? rest_init+0x70/0x80
    > Code: 10 5b 41 5c 41 5d 41 5e c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53
    > 48 83 ec 08 48 8b 9f 78 01 00 00 e8 2b c7 f1 ff 89 c0 8b 04 c3 48 83 c4 08
    > 5b c9 c3 66 66 66 66 66 2e 0f 1f 84 00
    > RIP [] icmp_sk+0x17/0x30
    > RSP
    > CR2: 0000000000000000
    > ---[ end trace ea161157b76b33e8 ]---
    > Kernel panic - not syncing: Aiee, killing interrupt handler!

    Receiving packets while we are cleaning up a network namespace is a
    racy proposition. It is possible when the packet arrives that we have
    removed some but not all of the state we need to fully process it. We
    have the choice of either playing whack-a-mole with the cleanup routines
    or simply dropping packets when we don't have a network namespace to
    handle them.

    Since the check looks inexpensive in netif_receive_skb let's just
    drop the incoming packets.
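
As an illustration only, the check could look like the userspace model below. The struct layout, the `alive` field, the return codes and `receive_model()` are all hypothetical and do not reproduce netif_receive_skb.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rx_result_model { RX_MODEL_SUCCESS = 0, RX_MODEL_DROP = 1 };

struct net_model {
    bool alive;  /* cleared when namespace cleanup begins (assumed) */
};

struct pkt_model {
    const struct net_model *net;  /* namespace the packet arrived in */
    int delivered;                /* set when upper layers saw it */
};

/* One cheap test up front replaces per-protocol guards further up. */
static enum rx_result_model receive_model(struct pkt_model *pkt)
{
    assert(pkt != NULL && pkt->net != NULL);

    if (!pkt->net->alive)
        return RX_MODEL_DROP;  /* namespace is going away: drop early */

    pkt->delivered = 1;        /* stand-in for normal protocol handling */
    return RX_MODEL_SUCCESS;
}
```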

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

16 Apr, 2008

1 commit

  • Make release_net/hold_net a no-op for performance-hungry people. This is
    debug stuff and should be used in debug mode only.

    Add check for net != NULL in hold/release calls. This will be required
    later on.
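
A minimal sketch of the pattern, assuming a compile-time switch: the `MODEL_NETNS_REFCNT_DEBUG` macro, the `use_count` field and the `_model` function names are hypothetical. Both variants accept NULL, matching the check described above.

```c
#include <stddef.h>

struct net_model {
    int use_count;  /* only maintained in the debug build of this model */
};

#ifdef MODEL_NETNS_REFCNT_DEBUG
/* Debug build: track holders so leaks can be detected. */
static inline struct net_model *hold_net_model(struct net_model *net)
{
    if (net)
        net->use_count++;
    return net;
}

static inline void release_net_model(struct net_model *net)
{
    if (net)
        net->use_count--;
}
#else
/* Normal build: both calls compile down to nothing. */
static inline struct net_model *hold_net_model(struct net_model *net)
{
    return net;
}

static inline void release_net_model(struct net_model *net)
{
    (void)net;
}
#endif
```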

    [ Added minor simplifications suggested by Brian Haley. -DaveM ]

    Signed-off-by: Denis V. Lunev
    Signed-off-by: David S. Miller

    Denis V. Lunev