20 Oct, 2011

1 commit


22 Sep, 2011

1 commit

  • Adding a new fib rule can trigger a BUG_ON.
    The shell commands to reproduce it are:
    ip rule add pref 38
    ip rule add pref 38
    ip rule add to 192.168.3.0/24 goto 38
    ip rule del pref 38
    ip rule add to 192.168.3.0/24 goto 38
    ip rule add pref 38

    After these commands the BUG_ON is triggered.
    Remove the BUG_ON and use (ctarget == NULL) to identify whether the rule is unresolved.

    Signed-off-by: Gao feng
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Gao feng
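
The idea in the commit above, treating a goto rule with no cached target as an ordinary "unresolved" state rather than asserting on it, can be sketched in userspace C. This is only an illustration: `struct rule_sketch`, `rule_is_unresolved()` and `resolve_gotos()` are hypothetical names, and the matching logic is simplified compared with the kernel's fib rules code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's rule structure. */
struct rule_sketch {
    uint32_t pref;               /* priority of this rule */
    int is_goto;                 /* non-zero for "goto" rules */
    uint32_t target;             /* goto rules: pref to jump to */
    struct rule_sketch *ctarget; /* resolved target, NULL while unresolved */
};

/* A goto rule is unresolved exactly when it has no cached target. */
int rule_is_unresolved(const struct rule_sketch *r)
{
    return r->is_goto && r->ctarget == NULL;
}

/*
 * Re-resolve every goto rule against the current table and return how
 * many are left unresolved. A missing target is a normal state here,
 * not a reason to assert.
 */
size_t resolve_gotos(struct rule_sketch *rules, size_t n)
{
    size_t unresolved = 0;

    assert(rules != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!rules[i].is_goto)
            continue;
        rules[i].ctarget = NULL;
        for (size_t j = 0; j < n; j++) {
            if (j != i && rules[j].pref == rules[i].target) {
                rules[i].ctarget = &rules[j];
                break;
            }
        }
        if (rule_is_unresolved(&rules[i]))
            unresolved++;
    }
    return unresolved;
}
```

Deleting the target rule and resolving again simply leaves `ctarget` as NULL, which callers can test for directly.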
     

17 Sep, 2011

1 commit


16 Sep, 2011

2 commits

  • dev_forward_skb loops an skb back into the host networking
    stack, which might hang on to the memory indefinitely.
    In particular, this can happen in macvtap in bridged mode.
    Copy the userspace fragments to avoid blocking the
    sender in that case.

    As this patch makes skb_copy_ubufs extern now,
    I also added some documentation and made the function
    clear the SKBTX_DEV_ZEROCOPY flag automatically instead
    of doing it in all callers. This can be made into a separate
    patch if people feel it's worth it.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
     
  • flow_cache_lookup will return a cached object (or null pointer) that the
    resolver (i.e. xfrm_policy_lookup) previously found for another namespace
    using the same key/family/dir. Instead, make the namespace part of what
    identifies entries in the cache.

    As before, flow_entry_valid will return 0 for entries where the namespace
    has been deleted, and they will be removed from the cache the next time
    flow_cache_gc_task is run.

    Reported-by: Andrew Dickinson
    Signed-off-by: David Ward
    Signed-off-by: David S. Miller

    dpward
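
To illustrate the flow cache change described above, here is a minimal sketch of a lookup in which the namespace is part of the match. The types and the linear scan are assumptions made for brevity (`struct net_sketch`, `struct flow_entry_sketch` and `flow_find()` are hypothetical); the kernel uses a hashed per-CPU cache.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a network namespace. */
struct net_sketch { int id; };

/* Hypothetical, simplified cache entry. */
struct flow_entry_sketch {
    const struct net_sketch *net; /* namespace that owns this entry */
    uint16_t family;
    uint8_t dir;
    uint32_t key;                 /* simplified flow key */
    const void *object;           /* cached resolver result, may be NULL */
};

/*
 * Return the entry matching namespace, family, direction and key, or
 * NULL on a miss. Without the namespace comparison, a result cached for
 * one namespace could be returned to another one using the same key.
 */
const struct flow_entry_sketch *
flow_find(const struct flow_entry_sketch *tbl, size_t n,
          const struct net_sketch *net, uint16_t family,
          uint8_t dir, uint32_t key)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        const struct flow_entry_sketch *e = &tbl[i];

        if (e->net == net && e->family == family &&
            e->dir == dir && e->key == key)
            return e;
    }
    return NULL;
}
```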
     

27 Aug, 2011

1 commit


25 Aug, 2011

1 commit

  • Dave Jones reported a lockdep splat triggered by an arp_process() call
    from parp_redo().

    Commit faa9dcf793be (arp: RCU changes) is the origin of the bug, since
    it assumed arp_process() was called under rcu_read_lock(), which is not
    true in this particular path.

    Instead of adding rcu_read_lock() in parp_redo(), I chose to add it in
    neigh_proxy_process() to take care of IPv6 side too.

    ===================================================
    [ INFO: suspicious rcu_dereference_check() usage. ]
    ---------------------------------------------------
    include/linux/inetdevice.h:209 invoked rcu_dereference_check() without
    protection!

    other info that might help us debug this:

    rcu_scheduler_active = 1, debug_locks = 0
    4 locks held by setfiles/2123:
    #0: (&sb->s_type->i_mutex_key#13){+.+.+.}, at: [] walk_component+0x1ef/0x3e8
    #1: (&isec->lock){+.+.+.}, at: [] inode_doinit_with_dentry+0x3f/0x41f
    #2: (&tbl->proxy_timer){+.-...}, at: [] run_timer_softirq+0x157/0x372
    #3: (class){+.-...}, at: [] neigh_proxy_process+0x36/0x103

    stack backtrace:
    Pid: 2123, comm: setfiles Tainted: G W
    3.1.0-0.rc2.git7.2.fc16.x86_64 #1
    Call Trace:
    [] lockdep_rcu_dereference+0xa7/0xaf
    [] __in_dev_get_rcu+0x55/0x5d
    [] arp_process+0x25/0x4d7
    [] parp_redo+0xe/0x10
    [] neigh_proxy_process+0x9a/0x103
    [] run_timer_softirq+0x218/0x372
    [] ? run_timer_softirq+0x157/0x372
    [] ? neigh_stat_seq_open+0x41/0x41
    [] ? mark_held_locks+0x6d/0x95
    [] __do_softirq+0x112/0x25a
    [] call_softirq+0x1c/0x30
    [] do_softirq+0x4b/0xa2
    [] irq_exit+0x5d/0xcf
    [] smp_apic_timer_interrupt+0x7c/0x8a
    [] apic_timer_interrupt+0x73/0x80
    [] ? trace_hardirqs_on_caller+0x121/0x158
    [] ? __slab_free+0x30/0x24c
    [] ? __slab_free+0x2e/0x24c
    [] ? inode_doinit_with_dentry+0x2e9/0x41f
    [] ? inode_doinit_with_dentry+0x2e9/0x41f
    [] ? inode_doinit_with_dentry+0x2e9/0x41f
    [] kfree+0x108/0x131
    [] inode_doinit_with_dentry+0x2e9/0x41f
    [] selinux_d_instantiate+0x1c/0x1e
    [] security_d_instantiate+0x21/0x23
    [] d_instantiate+0x5c/0x61
    [] d_splice_alias+0xbc/0xd2
    [] ext4_lookup+0xba/0xeb
    [] d_alloc_and_lookup+0x45/0x6b
    [] walk_component+0x215/0x3e8
    [] lookup_last+0x3b/0x3d
    [] path_lookupat+0x82/0x2af
    [] ? might_fault+0xa5/0xac
    [] ? might_fault+0x5c/0xac
    [] ? getname_flags+0x31/0x1ca
    [] do_path_lookup+0x28/0x97
    [] user_path_at+0x59/0x96
    [] ? cp_new_stat+0xf7/0x10d
    [] vfs_fstatat+0x44/0x6e
    [] vfs_lstat+0x1e/0x20
    [] sys_newlstat+0x1a/0x33
    [] ? trace_hardirqs_on_caller+0x121/0x158
    [] ? trace_hardirqs_on_thunk+0x3a/0x3f
    [] system_call_fastpath+0x16/0x1b

    Reported-by: Dave Jones
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

11 Aug, 2011

1 commit


07 Aug, 2011

1 commit

  • Computers have become a lot faster since we compromised on the
    partial MD4 hash which we use currently for performance reasons.

    MD5 is a much safer choice, and is in line with both RFC1948 and
    other ISS generators (OpenBSD, Solaris, etc.)

    Furthermore, only having 24-bits of the sequence number be truly
    unpredictable is a very serious limitation. So the periodic
    regeneration and 8-bit counter have been removed. We compute and
    use a full 32-bit sequence number.

    For ipv6, DCCP was found to use a 32-bit truncated initial sequence
    number (it needs 48 bits) and that is fixed here as well.

    Reported-by: Dan Kaminsky
    Tested-by: Willy Tarreau
    Signed-off-by: David S. Miller

    David S. Miller
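
The shape of an RFC 1948 style generator, a keyed hash of the connection tuple added to a clock, can be sketched as follows. Note the assumptions: the mixer below is 64-bit FNV-1a, used only as a short stand-in so the example stays self-contained. It is not MD5 and not cryptographically secure, and `mix_bytes()` and `isn_sketch()` are hypothetical names rather than kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in mixer (64-bit FNV-1a). NOT MD5, NOT cryptographically secure. */
uint64_t mix_bytes(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;

    assert(p != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL; /* FNV-1a 64-bit prime */
    }
    return h;
}

/*
 * ISN = clock + F(secret, saddr, daddr, sport, dport).
 * All 32 bits of the offset come from the keyed hash, so no part of the
 * result is a small counter that an observer could predict.
 */
uint32_t isn_sketch(uint32_t saddr, uint32_t daddr,
                    uint16_t sport, uint16_t dport,
                    uint64_t secret, uint32_t clock_ticks)
{
    uint64_t h = 14695981039346656037ULL; /* FNV-1a 64-bit offset basis */

    h = mix_bytes(h, &secret, sizeof secret);
    h = mix_bytes(h, &saddr, sizeof saddr);
    h = mix_bytes(h, &daddr, sizeof daddr);
    h = mix_bytes(h, &sport, sizeof sport);
    h = mix_bytes(h, &dport, sizeof dport);

    return (uint32_t)h + clock_ticks; /* wraps modulo 2^32 by design */
}
```

For a fixed tuple and secret the hash offset is constant, so successive sequence numbers on the same tuple advance only with the clock.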
     

02 Aug, 2011

1 commit


28 Jul, 2011

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (32 commits)
    tg3: Remove 5719 jumbo frames and TSO blocks
    tg3: Break larger frags into 4k chunks for 5719
    tg3: Add tx BD budgeting code
    tg3: Consolidate code that calls tg3_tx_set_bd()
    tg3: Add partial fragment unmapping code
    tg3: Generalize tg3_skb_error_unmap()
    tg3: Remove short DMA check for 1st fragment
    tg3: Simplify tx bd assignments
    tg3: Reintroduce tg3_tx_ring_info
    ASIX: Use only 11 bits of header for data size
    ASIX: Simplify condition in rx_fixup()
    Fix cdc-phonet build
    bonding: reduce noise during init
    bonding: fix string comparison errors
    net: Audit drivers to identify those needing IFF_TX_SKB_SHARING cleared
    net: add IFF_SKB_TX_SHARED flag to priv_flags
    net: sock_sendmsg_nosec() is static
    forcedeth: fix vlans
    gianfar: fix bug caused by 87c288c6e9aa31720b72e2bc2d665e24e1653c3e
    gro: Only reset frag0 when skb can be pulled
    ...

    Linus Torvalds
     
  • Pktgen attempts to transmit shared skbs to net devices, which can't be used by
    some drivers as they keep state information in skbs. This patch adds a flag
    marking drivers as being able to handle shared skbs in their tx path. Drivers
    are defaulted to being unable to do so, but calling ether_setup enables this
    flag, as 90% of the drivers calling ether_setup touch real hardware and can
    handle shared skbs. A subsequent patch will audit drivers to ensure that the
    flag is set properly.

    Signed-off-by: Neil Horman
    Reported-by: Jiri Pirko
    CC: Robert Olsson
    CC: Eric Dumazet
    CC: Alexey Dobriyan
    CC: David S. Miller
    Signed-off-by: David S. Miller

    Neil Horman
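
The opt-in scheme from the pktgen commit above can be sketched like this. Everything here is a simplified assumption: `TX_SKB_SHARING_SKETCH` uses an arbitrary bit value, and `generic_setup()`, `ether_setup_sketch()` and `effective_clone_count()` are hypothetical helpers standing in for the real setup and pktgen paths.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit value for the "can transmit shared skbs" flag. */
#define TX_SKB_SHARING_SKETCH 0x1u

struct netdev_sketch {
    unsigned int priv_flags;
};

/* Default: a device is assumed NOT to cope with shared buffers. */
void generic_setup(struct netdev_sketch *dev)
{
    assert(dev != NULL);
    dev->priv_flags = 0;
}

/* Ethernet-style setup opts in, since most such drivers can cope. */
void ether_setup_sketch(struct netdev_sketch *dev)
{
    generic_setup(dev);
    dev->priv_flags |= TX_SKB_SHARING_SKETCH;
}

/* A driver that keeps per-skb state would clear the bit again. */
void driver_keeps_skb_state(struct netdev_sketch *dev)
{
    dev->priv_flags &= ~TX_SKB_SHARING_SKETCH;
}

/*
 * A packet generator may only resend the same buffer when the device
 * advertises the flag; otherwise each packet needs a fresh buffer.
 */
unsigned int effective_clone_count(const struct netdev_sketch *dev,
                                   unsigned int requested)
{
    return (dev->priv_flags & TX_SKB_SHARING_SKETCH) ? requested : 0;
}
```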
     

27 Jul, 2011

1 commit

  • This allows us to move duplicated code in <asm/atomic.h>
    (atomic_inc_not_zero() for now) to <linux/atomic.h>

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

26 Jul, 2011

1 commit


23 Jul, 2011

1 commit

  • As reported by Ben Greear and Francois Romieu, the code path in
    the netif_carrier code leads it to try to disable
    a delayed work item only to re-enable it immediately:
    netif_carrier_on
    -> linkwatch_fire_event
    -> linkwatch_schedule_work
    -> cancel_delayed_work
    -> del_timer_sync

    If __cancel_delayed_work is used instead, there is no
    need to wait for a running linkwatch_event.

    There is a race between a running linkwatch_event and the re-scheduling,
    but it is harmless to schedule an extra scan of the linkwatch queue.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     

22 Jul, 2011

2 commits

  • Some drivers (ab)use the ethtool_ops::get_regs operation to expose
    only a hardware revision ID. Commit
    a77f5db361ed9953b5b749353ea2c7fed2bf8d93 ('ethtool: Allocate register
    dump buffer with vmalloc()') had the side-effect of breaking these, as
    vmalloc() returns a null pointer for size=0 whereas kmalloc() did not.

    For backward-compatibility, allow zero-length dumps again.

    Reported-by: Kalle Valo
    Signed-off-by: Ben Hutchings
    Cc: stable@kernel.org [2.6.37+]
    Signed-off-by: David S. Miller

    Ben Hutchings
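
The zero-length case from the ethtool commit above comes down to distinguishing "nothing to allocate" from "allocation failed". A userspace sketch, using `malloc()` in place of `vmalloc()` and a hypothetical helper name `alloc_dump_buffer()`:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Allocate a buffer for a register dump of `len` bytes.
 * Returns 0 on success and -1 on allocation failure. A zero-length dump
 * succeeds with *out == NULL, so a NULL buffer is only treated as an
 * error when some memory was actually requested.
 */
int alloc_dump_buffer(size_t len, void **out)
{
    assert(out != NULL);
    *out = NULL;
    if (len == 0)
        return 0; /* nothing to allocate; only metadata is reported */

    *out = malloc(len);
    return *out ? 0 : -1;
}
```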
     
  • There are two problems:
    1) "n" was allocated with alloc_skb() so we should free it with
    kfree_skb() instead of regular kfree().
    2) We return the freed pointer instead of NULL.

    Signed-off-by: Dan Carpenter
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Dan Carpenter
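
Both problems listed in the commit above follow a common pattern, shown here with a hypothetical userspace buffer type. `struct buf_sketch`, `buf_alloc()`, `buf_free()` and `buf_clone()` are illustrative names only; the `fail_midway` parameter simulates a later step failing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer with separately allocated payload. */
struct buf_sketch {
    unsigned char *data;
    size_t len;
};

struct buf_sketch *buf_alloc(size_t len)
{
    struct buf_sketch *b = malloc(sizeof *b);

    if (!b)
        return NULL;
    b->data = malloc(len ? len : 1);
    if (!b->data) {
        free(b);
        return NULL;
    }
    b->len = len;
    return b;
}

/* Matching destructor: releases the payload as well as the header. */
void buf_free(struct buf_sketch *b)
{
    if (!b)
        return;
    free(b->data);
    free(b);
}

/* Copy `src`; `fail_midway` simulates an error after allocation. */
struct buf_sketch *buf_clone(const struct buf_sketch *src, int fail_midway)
{
    struct buf_sketch *n;

    assert(src != NULL);
    n = buf_alloc(src->len);
    if (!n)
        return NULL;
    if (fail_midway) {
        buf_free(n);  /* 1) use the destructor that matches the allocator */
        return NULL;  /* 2) never return the pointer that was just freed */
    }
    memcpy(n->data, src->data, src->len);
    return n;
}
```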
     

18 Jul, 2011

2 commits


17 Jul, 2011

4 commits


15 Jul, 2011

3 commits


14 Jul, 2011

1 commit

  • Now that there is a one-to-one correspondence between neighbour
    and hh_cache entries, we no longer need:

    1) dynamic allocation
    2) attachment to dst->hh
    3) refcounting

    Initialization of the hh_cache entry is indicated by hh_len
    being non-zero, and such initialization is always done with
    the neighbour's lock held as a writer.

    Signed-off-by: David S. Miller

    David S. Miller
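
A rough sketch of the layout described in the commit above: the header cache is embedded in the neighbour, and a non-zero length marks it as initialized. The struct layouts, `HH_MAX_SKETCH` and the helper names are simplified assumptions, and the neighbour's write lock is only mentioned in a comment rather than modelled.

```c
#include <assert.h>
#include <string.h>

#define HH_MAX_SKETCH 16 /* hypothetical maximum cached header size */

struct hh_cache_sketch {
    unsigned int hh_len;                  /* 0 means "not initialized" */
    unsigned char hh_data[HH_MAX_SKETCH];
};

struct neighbour_sketch {
    /* Embedded: no separate allocation, no dst->hh pointer, no refcount. */
    struct hh_cache_sketch hh;
};

int hh_initialized(const struct neighbour_sketch *n)
{
    return n->hh.hh_len != 0;
}

/*
 * Fill in the cached header once. The caller is assumed to hold the
 * neighbour's lock as a writer; locking is omitted in this sketch.
 */
void hh_init(struct neighbour_sketch *n, const unsigned char *hdr,
             unsigned int len)
{
    assert(hdr != NULL && len > 0 && len <= HH_MAX_SKETCH);
    if (hh_initialized(n))
        return;
    memcpy(n->hh.hh_data, hdr, len);
    n->hh.hh_len = len; /* set last: non-zero length marks it ready */
}
```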
     

13 Jul, 2011

2 commits


11 Jul, 2011

2 commits


09 Jul, 2011

1 commit


08 Jul, 2011

1 commit


07 Jul, 2011

1 commit

  • This patch adds userspace buffer support to the skb shared info. A new
    struct skb_ubuf_info is needed to maintain the userspace buffers'
    argument and index; a callback is used to notify userspace to release
    the buffers once the lower device has done DMA (the last reference to
    that skb has gone).

    If any userspace application references these userspace buffers,
    the buffers are copied into the kernel. This way we
    can prevent userspace apps from holding these userspace buffers too long.

    Use destructor_arg to point to the userspace buffer info; a new tx flag,
    SKBTX_DEV_ZEROCOPY, is added for the zero-copy buffer check.

    Signed-off-by: Shirley Ma
    Signed-off-by: David S. Miller

    Shirley Ma
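
The release notification described in the commit above can be sketched as a callback fired when the last reference goes away. The types below are hypothetical simplifications (`struct ubuf_info_sketch`, `struct pkt_sketch`), and a plain integer stands in for the kernel's atomic reference count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a userspace buffer attached to a packet. */
struct ubuf_info_sketch {
    void (*callback)(void *arg); /* tells the owner the buffer is free */
    void *arg;
    unsigned long index;         /* which userspace buffer this is */
};

struct pkt_sketch {
    int refcnt;                    /* plain int; the kernel uses atomics */
    int zerocopy;                  /* stands in for SKBTX_DEV_ZEROCOPY */
    struct ubuf_info_sketch *uarg; /* reached via destructor_arg in the patch */
};

/* Example callback: counts how many buffers have been released. */
void count_release(void *arg)
{
    ++*(int *)arg;
}

/* Drop one reference; notify the buffer owner when the last one goes. */
void pkt_put(struct pkt_sketch *p)
{
    assert(p != NULL && p->refcnt > 0);
    if (--p->refcnt == 0 && p->zerocopy && p->uarg)
        p->uarg->callback(p->uarg->arg);
}
```

The callback runs exactly once, and only after every holder has dropped its reference, which is why a long-lived holder would pin the userspace pages unless the data is copied first.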
     

06 Jul, 2011

3 commits


04 Jul, 2011

2 commits


02 Jul, 2011

1 commit

  • IPV6, unlike IPV4, doesn't have a routing cache.

    Routing table entries, as well as clones made in response
    to route lookup requests, all live in the same table. And
    all of them are collected together in the destination
    cache table for ipv6.

    This means that routing table entries count against the garbage
    collection limits, even though such entries cannot ever be reclaimed
    and are added explicitly by the administrator (rather than being
    created in response to lookups).

    Therefore it makes no sense to count ipv6 routing table entries
    against the GC limits.

    Add a DST_NOCOUNT destination cache entry flag, and skip the counting
    if it is set. Use this flag bit in ipv6 when adding routing table
    entries.

    Signed-off-by: David S. Miller

    David S. Miller
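
The accounting change in the commit above can be illustrated with a small sketch. The structures are simplified stand-ins, `DST_NOCOUNT_SKETCH` uses an arbitrary bit value, and `dst_init_sketch()`, `dst_destroy_sketch()` and `gc_needed()` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit value for the "do not count this entry" flag. */
#define DST_NOCOUNT_SKETCH 0x1u

struct dst_ops_sketch {
    int entries;   /* entries counted against the GC limit */
    int gc_thresh; /* garbage collection starts above this */
};

struct dst_entry_sketch {
    struct dst_ops_sketch *ops;
    unsigned int flags;
};

/* Count the new entry unless it is flagged as exempt. */
void dst_init_sketch(struct dst_entry_sketch *dst,
                     struct dst_ops_sketch *ops, unsigned int flags)
{
    assert(dst != NULL && ops != NULL);
    dst->ops = ops;
    dst->flags = flags;
    if (!(flags & DST_NOCOUNT_SKETCH))
        ops->entries++;
}

/* Mirror of the above, so counted and uncounted entries stay balanced. */
void dst_destroy_sketch(struct dst_entry_sketch *dst)
{
    if (!(dst->flags & DST_NOCOUNT_SKETCH))
        dst->ops->entries--;
}

int gc_needed(const struct dst_ops_sketch *ops)
{
    return ops->entries > ops->gc_thresh;
}
```

Administrator-added routes would be created with the flag set, so only entries created in response to lookups push the table towards the GC threshold.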