23 Dec, 2011

2 commits

  • skb->truesize might be big even for a small packet.

    It's even bigger after commit 87fb4b7b533 (net: more accurate skb
    truesize) and big MTU.

    We should allow queueing at least one packet per receiver, even with a
    low RCVBUF setting.

    Reported-by: Michal Simek
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
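
The relaxed receive-buffer check described in the commit above can be sketched in user space roughly as follows. This is only an illustration under assumptions: `struct rx_queue`, `rcvq_full_strict()` and `rcvq_full_relaxed()` are hypothetical names, not the kernel's code.

```c
#include <stdbool.h>

/* Hypothetical model of a receiver's accounting; not a kernel structure. */
struct rx_queue {
    unsigned int rmem_alloc; /* bytes currently charged to the receiver */
    unsigned int rcvbuf;     /* configured receive buffer limit */
};

/* Strict check: the new packet's truesize must also fit under the limit.
 * With a tiny rcvbuf and a large truesize, even an empty queue is "full". */
static bool rcvq_full_strict(const struct rx_queue *q, unsigned int truesize)
{
    return q->rmem_alloc + truesize > q->rcvbuf;
}

/* Relaxed check: only the memory already queued counts, so an empty queue
 * always accepts at least one packet, whatever its truesize. */
static bool rcvq_full_relaxed(const struct rx_queue *q)
{
    return q->rmem_alloc > q->rcvbuf;
}
```

With an empty queue, a 256-byte limit and a 4096-byte truesize, the strict check rejects the packet while the relaxed one admits it.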
     
  • Setting a large rps_flow_cnt like (1 << 30) on 32-bit platform will
    cause a kernel oops due to insufficient bounds checking.

    if (count > 1 << 30) [...] (1 << 30) * 8 will overflow
    32 bits.

    This patch replaces the magic number (1 << 30) with a symbolic bound.

    Suggested-by: Eric Dumazet
    Signed-off-by: Xi Wang
    Signed-off-by: David S. Miller

    Xi Wang
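
The 32-bit overflow mentioned in the commit above can be reproduced with fixed-width arithmetic. The sketch below is an assumption-laden illustration: the factor 8 is taken from the commit message and treated here as a per-entry size, while `HEADER_SIZE`, `table_size32()`, `MAX_COUNT` and `count_is_safe()` are hypothetical names chosen for the example.

```c
#include <stdint.h>
#include <stdbool.h>

/* The factor 8 comes from the commit message; the header size is a
 * hypothetical placeholder. */
#define ENTRY_SIZE  8u
#define HEADER_SIZE 16u

/* Table size computed in 32-bit arithmetic, as on a 32-bit platform. */
static uint32_t table_size32(uint32_t count)
{
    return HEADER_SIZE + count * ENTRY_SIZE; /* wraps modulo 2^32 */
}

/* Largest count whose table size still fits in 32 bits. */
#define MAX_COUNT ((UINT32_MAX - HEADER_SIZE) / ENTRY_SIZE)

/* A symbolic bound derived from the sizes, instead of a magic number. */
static bool count_is_safe(uint32_t count)
{
    return count <= MAX_COUNT;
}
```

For a count of `1u << 30` the multiplication wraps to zero, so the computed size collapses to the header size alone, and `count_is_safe()` rejects that count.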
     

22 Dec, 2011

1 commit

  • flow_cache_flush() might sleep but can be called from
    atomic context via the xfrm garbage collector. So add
    a flow_cache_flush_deferred() function and use this if
    the xfrm garbage collector is invoked from within the
    packet path.

    Signed-off-by: Steffen Klassert
    Acked-by: Timo Teräs
    Signed-off-by: David S. Miller

    Steffen Klassert
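
The deferral pattern described in the commit above (do not sleep in atomic context, hand the work to a later context instead) might look like this in a single-threaded user-space model. All names here are hypothetical, and the "worker" is simply a function the caller invokes later; the kernel's actual mechanism is not shown.

```c
#include <stdbool.h>

/* Hypothetical state: the counter exists only to make behaviour observable. */
static int flushes_done;
static bool flush_pending;

/* Stand-in for a flush that may sleep; must not run in atomic context. */
static void cache_flush(void)
{
    flushes_done++;
}

/* Safe from atomic context: only records that a flush is needed. */
static void cache_flush_deferred(void)
{
    flush_pending = true;
}

/* Runs later from a context that is allowed to sleep. */
static void flush_worker(void)
{
    if (flush_pending) {
        flush_pending = false;
        cache_flush();
    }
}

/* Garbage collector entry point: picks the variant by calling context. */
static void garbage_collect(bool from_packet_path)
{
    if (from_packet_path)
        cache_flush_deferred();
    else
        cache_flush();
}
```

Calling `garbage_collect(true)` performs no flush by itself; the flush happens once `flush_worker()` runs.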
     

07 Dec, 2011

2 commits

  • On a CONFIG_NET=y build

    net/core/secure_seq.c:22: warning: 'seq_scale' defined but not
    used

    Signed-off-by: Stephen Boyd
    Signed-off-by: David S. Miller

    Stephen Boyd
     
  • Since commit c5ed63d66f24 (tcp: fix three tcp sysctls tuning),
    sysctl_max_syn_backlog is determined by tcp_hashinfo->ehash_mask.
    Its minimum value is 128, and it increases in proportion to the
    memory of the machine.
    The original descriptions of tcp_max_syn_backlog and sysctl_max_syn_backlog
    are out of date.

    Changelog:
    V2: update description for sysctl_max_syn_backlog

    Signed-off-by: Weiping Pan
    Reviewed-by: Shan Wei
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Peter Pan(潘卫平)
     

01 Dec, 2011

1 commit


29 Nov, 2011

1 commit

  • I just hit this during my testing. Isn't there another bug lurking?

    BUG kmalloc-8: Redzone overwritten

    INFO: 0xc0000000de9dec48-0xc0000000de9dec4b. First byte 0x0 instead of 0xcc
    INFO: Allocated in .__seq_open_private+0x30/0xa0 age=0 cpu=5 pid=3896
    .__kmalloc+0x1e0/0x2d0
    .__seq_open_private+0x30/0xa0
    .seq_open_net+0x60/0xe0
    .dev_mc_seq_open+0x4c/0x70
    .proc_reg_open+0xd8/0x260
    .__dentry_open.clone.11+0x2b8/0x400
    .do_last+0xf4/0x950
    .path_openat+0xf8/0x480
    .do_filp_open+0x48/0xc0
    .do_sys_open+0x140/0x250
    syscall_exit+0x0/0x40

    dev_mc_seq_ops uses dev_seq_start/next/stop but only allocates
    sizeof(struct seq_net_private) of private data, whereas it expects
    sizeof(struct dev_iter_state):

    struct dev_iter_state {
    struct seq_net_private p;
    unsigned int pos; /* bucket << BUCKET_SPACE + offset */
    };

    Create dev_seq_open_ops and use it so we don't have to expose
    struct dev_iter_state.

    [ Problem added by commit f04565ddf52e4 (dev: use name hash for
    dev_seq_ops) -Eric ]

    Signed-off-by: Anton Blanchard
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Anton Blanchard
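
The size mismatch described in the commit above can be illustrated with simplified stand-ins for the two structures. This is a sketch under assumptions: `seq_open_private_model()`, `mc_open_buggy()` and `mc_open_fixed()` are hypothetical helpers, and the struct bodies are reduced to what the example needs.

```c
#include <stdlib.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures named in the commit. */
struct seq_net_private { void *net; };

struct dev_iter_state {
    struct seq_net_private p; /* base part, kept as the first member */
    unsigned int pos;         /* iterator position */
};

/* Hypothetical open helper: the caller states how much private data the
 * iterator callbacks will actually use. */
static void *seq_open_private_model(size_t psize)
{
    return calloc(1, psize);
}

/* Buggy open: room for the base struct only, so a later write to ->pos
 * would land past the end of the allocation. */
static void *mc_open_buggy(void)
{
    return seq_open_private_model(sizeof(struct seq_net_private));
}

/* Fixed open: allocate the full iterator state the callbacks expect. */
static void *mc_open_fixed(void)
{
    return seq_open_private_model(sizeof(struct dev_iter_state));
}
```

Only the buffer returned by `mc_open_fixed()` is large enough to be used as a `struct dev_iter_state`.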
     

26 Nov, 2011

1 commit


23 Nov, 2011

1 commit


07 Nov, 2011

1 commit

  • * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
    Revert "tracing: Include module.h in define_trace.h"
    irq: don't put module.h into irq.h for tracking irqgen modules.
    bluetooth: macroize two small inlines to avoid module.h
    ip_vs.h: fix implicit use of module_get/module_put from module.h
    nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
    include: replace linux/module.h with "struct module" wherever possible
    include: convert various register fcns to macros to avoid include chaining
    crypto.h: remove unused crypto_tfm_alg_modname() inline
    uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
    pm_runtime.h: explicitly requires notifier.h
    linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
    miscdevice.h: fix up implicit use of lists and types
    stop_machine.h: fix implicit use of smp.h for smp_processor_id
    of: fix implicit use of errno.h in include/linux/of.h
    of_platform.h: delete needless include
    acpi: remove module.h include from platform/aclinux.h
    miscdevice.h: delete unnecessary inclusion of module.h
    device_cgroup.h: delete needless include
    net: sch_generic remove redundant use of
    net: inet_timewait_sock doesnt need
    ...

    Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
    - drivers/media/dvb/frontends/dibx000_common.c
    - drivers/media/video/{mt9m111.c,ov6650.c}
    - drivers/mfd/ab3550-core.c
    - include/linux/dmaengine.h

    Linus Torvalds
     

04 Nov, 2011

1 commit

  • Commit 87fb4b7b533073eeeaed0b6bf7c2328995f6c075 (net: more
    accurate skb truesize) changed the alignment of size. This
    can cause problems at least on some machines with NFS root:

    Unhandled fault: alignment exception (0x801) at 0xc183a43a
    Internal error: : 801 [#1] PREEMPT
    Modules linked in:
    CPU: 0 Not tainted (3.1.0-08784-g5eeee4a #733)
    pc : [] lr : [] psr: 60000013
    sp : c180fef8 ip : 00000000 fp : c181f580
    r10: 00000000 r9 : c044b28c r8 : 00000001
    r7 : c183a3a0 r6 : c1835be0 r5 : c183a412 r4 : 000001f2
    r3 : 00000000 r2 : 00000000 r1 : ffffffe6 r0 : c183a43a
    Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
    Control: 0005317f Table: 10004000 DAC: 00000017
    Process swapper (pid: 1, stack limit = 0xc180e270)
    Stack: (0xc180fef8 to 0xc1810000)
    fee0: 00000024 00000000
    ff00: 00000000 c183b9c0 c183b8e0 c044b28c c0507ccc c019dfc4 c180ff2c c0503cf8
    ff20: c180ff4c c180ff4c 00000000 c1835420 c182c740 c18349c0 c05233c0 00000000
    ff40: 00000000 c00e6bb8 c180e000 00000000 c04dd82c c0507e7c c050cc18 c183b9c0
    ff60: c05233c0 00000000 00000000 c01f34f4 c0430d70 c019d364 c04dd898 c04dd898
    ff80: c04dd82c c0507e7c c180e000 00000000 c04c584c c01f4918 c04dd898 c04dd82c
    ffa0: c04ddd28 c180e000 00000000 c0008758 c181fa60 3231d82c 00000037 00000000
    ffc0: 00000000 c04dd898 c04dd82c c04ddd28 00000013 00000000 00000000 00000000
    ffe0: 00000000 c04b2224 00000000 c04b21a0 c001056c c001056c 00000000 00000000
    Function entered at [] from []
    Function entered at [] from []
    Function entered at [] from []
    Function entered at [] from []
    Function entered at [] from []
    Function entered at [] from []
    Code: e1a00005 e3a01028 ebfa7cb0 e35a0000 (e5858028)

    Here PC is at __alloc_skb and &shinfo->dataref is unaligned because
    skb->end can be unaligned without this patch.

    As explained by Eric Dumazet, this happens
    only with SLOB, and not with SLAB or SLUB:

    * Eric Dumazet [111102 15:56]:
    >
    > Your patch is absolutely needed, I completely forgot about SLOB :(
    >
    > since, kmalloc(386) on SLOB gives exactly ksize=386 bytes, not nearest
    > power of two.
    >
    > [ 60.305763] malloc(size=385)->ffff880112c11e38 ksize=386 -> nsize=2
    > [ 60.305921] malloc(size=385)->ffff88007c92ce28 ksize=386 -> nsize=2
    > [ 60.306898] malloc(size=656)->ffff88007c44ad28 ksize=656 -> nsize=272
    > [ 60.325385] malloc(size=656)->ffff88007c575868 ksize=656 -> nsize=272
    > [ 60.325531] malloc(size=656)->ffff88011c777230 ksize=656 -> nsize=272
    > [ 60.325701] malloc(size=656)->ffff880114011008 ksize=656 -> nsize=272
    > [ 60.346716] malloc(size=385)->ffff880114142008 ksize=386 -> nsize=2
    > [ 60.346900] malloc(size=385)->ffff88011c777690 ksize=386 -> nsize=2

    Signed-off-by: Tony Lindgren
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Tony Lindgren
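
The alignment problem in the commit above can be checked with plain arithmetic. In the sketch below, the 384-byte trailer size is inferred from the quoted log lines (ksize minus nsize); `DATA_ALIGNMENT`, `DATA_ALIGN()` and `tail_offset()` are hypothetical names, and the allocator is assumed to return exactly the requested size, as the quote says SLOB does.

```c
#include <stddef.h>

/* Assumed alignment for illustration only. */
#define DATA_ALIGNMENT 32u

/* Round x up to the next multiple of a power-of-two alignment. */
#define DATA_ALIGN(x) \
    (((size_t)(x) + (DATA_ALIGNMENT - 1)) & ~(size_t)(DATA_ALIGNMENT - 1))

/* Offset at which a trailing metadata block starts when it is placed at
 * the very end of an allocation of alloc_size bytes. */
static size_t tail_offset(size_t alloc_size, size_t meta_size)
{
    return alloc_size - meta_size;
}
```

With an allocation of 386 bytes and a 384-byte trailer, the trailer starts at offset 2, which is not even 4-byte aligned. If the data size is rounded up with `DATA_ALIGN()` before the trailer size is added, the trailer offset stays a multiple of the alignment.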
     

02 Nov, 2011

1 commit

  • Whatever situations make this state legitimate when SMP
    also would be legitimate when !SMP and f.e. preemption is
    enabled.

    This is dubious enough that we should just delete it entirely. If we
    want to add debugging for neigh timer races, better more thorough
    mechanisms are needed.

    Signed-off-by: David S. Miller

    David S. Miller
     

01 Nov, 2011

2 commits


30 Oct, 2011

1 commit

  • commit 2425717b27eb (net: allow vlan traffic to be received under bond)
    broke ARP processing on vlan on top of bonding.

    +-------+
    eth0 --| bond0 |---bond0.103
    eth1 --| |
    +-------+

    52870.115435: skb_gro_reset_offset
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Eric Dumazet
     

26 Oct, 2011

1 commit


25 Oct, 2011

3 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1745 commits)
    dp83640: free packet queues on remove
    dp83640: use proper function to free transmit time stamping packets
    ipv6: Do not use routes from locally generated RAs
    |PATCH net-next] tg3: add tx_dropped counter
    be2net: don't create multiple RX/TX rings in multi channel mode
    be2net: don't create multiple TXQs in BE2
    be2net: refactor VF setup/teardown code into be_vf_setup/clear()
    be2net: add vlan/rx-mode/flow-control config to be_setup()
    net_sched: cls_flow: use skb_header_pointer()
    ipv4: avoid useless call of the function check_peer_pmtu
    TCP: remove TCP_DEBUG
    net: Fix driver name for mdio-gpio.c
    ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT
    rtnetlink: Add missing manual netlink notification in dev_change_net_namespaces
    ipv4: fix ipsec forward performance regression
    jme: fix irq storm after suspend/resume
    route: fix ICMP redirect validation
    net: hold sock reference while processing tx timestamps
    tcp: md5: add more const attributes
    Add ethtool -g support to virtio_net
    ...

    Fix up conflicts in:
    - drivers/net/Kconfig:
    The split-up generated a trivial conflict with removal of a
    stale reference to Documentation/networking/net-modules.txt.
    Remove it from the new location instead.
    - fs/sysfs/dir.c:
    Fairly nasty conflicts with the sysfs rb-tree usage, conflicting
    with Eric Biederman's changes for tagged directories.

    Linus Torvalds
     
  • * 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (38 commits)
    mm: memory hotplug: Check if pages are correctly reserved on a per-section basis
    Revert "memory hotplug: Correct page reservation checking"
    Update email address for stable patch submission
    dynamic_debug: fix undefined reference to `__netdev_printk'
    dynamic_debug: use a single printk() to emit messages
    dynamic_debug: remove num_enabled accounting
    dynamic_debug: consolidate repetitive struct _ddebug descriptor definitions
    uio: Support physical addresses >32 bits on 32-bit systems
    sysfs: add unsigned long cast to prevent compile warning
    drivers: base: print rejected matches with DEBUG_DRIVER
    memory hotplug: Correct page reservation checking
    memory hotplug: Refuse to add unaligned memory regions
    remove the messy code file Documentation/zh_CN/SubmitChecklist
    ARM: mxc: convert device creation to use platform_device_register_full
    new helper to create platform devices with dma mask
    docs/driver-model: Update device class docs
    docs/driver-model: Document device.groups
    kobj_uevent: Ignore if some listeners cannot handle message
    dynamic_debug: make netif_dbg() call __netdev_printk()
    dynamic_debug: make netdev_dbg() call __netdev_printk()
    ...

    Linus Torvalds
     
  • David S. Miller
     

24 Oct, 2011

2 commits

  • Renato Westphal noticed that since commit a2835763e130c343ace5320c20d33c281e7097b7
    "rtnetlink: handle rtnl_link netlink notifications manually" was merged
    we no longer send a netlink message when a networking device is moved
    from one network namespace to another.

    Fix this by adding the missing manual notification in dev_change_net_namespaces.

    Since all network devices that are processed by dev_change_net_namespaces are
    in the initialized state the complicated tests that guard the manual
    rtmsg_ifinfo calls in rollback_registered and register_netdevice are
    unnecessary and we can just perform a plain notification.

    Cc: stable@kernel.org
    Tested-by: Renato Westphal
    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • The pair of functions,

    * skb_clone_tx_timestamp()
    * skb_complete_tx_timestamp()

    were designed to allow timestamping in PHY devices. The first
    function, called during the MAC driver's hard_xmit method, identifies
    PTP protocol packets, clones them, and gives them to the PHY device
    driver. The PHY driver may hold onto the packet and deliver it at a
    later time using the second function, which adds the packet to the
    socket's error queue.

    As pointed out by Johannes, nothing prevents the socket from
    disappearing while the cloned packet is sitting in the PHY driver
    awaiting a timestamp. This patch fixes the issue by taking a reference
    on the socket for each such packet. In addition, the comments
    regarding the usage of these functions are expanded to highlight the
    rule that PHY drivers must use skb_complete_tx_timestamp() to release
    the packet, in order to release the socket reference, too.

    These functions first appeared in v2.6.36.

    Reported-by: Johannes Berg
    Signed-off-by: Richard Cochran
    Cc:
    Signed-off-by: Eric Dumazet
    Reviewed-by: Johannes Berg
    Signed-off-by: David S. Miller

    Richard Cochran
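
The reference-counting rule described in the commit above (each cloned packet pins its socket until the packet is completed) can be modelled in a few lines. This is a single-threaded sketch with hypothetical types and function names; it does not reproduce the kernel functions named in the commit.

```c
#include <stdlib.h>

/* Hypothetical, single-threaded models of a socket and a packet. */
struct sock_model { int refcnt; };
struct pkt_model  { struct sock_model *sk; };

static void sock_hold_model(struct sock_model *sk) { sk->refcnt++; }
static void sock_put_model(struct sock_model *sk)  { sk->refcnt--; }

/* Clone step: the clone pins the socket while the PHY driver holds it. */
static struct pkt_model *clone_for_timestamp(struct sock_model *sk)
{
    struct pkt_model *clone = malloc(sizeof(*clone));

    if (!clone)
        return NULL;
    sock_hold_model(sk);
    clone->sk = sk;
    return clone;
}

/* Completion step: delivery is omitted here; the packet and the socket
 * reference are released together. */
static void complete_timestamp(struct pkt_model *clone)
{
    sock_put_model(clone->sk);
    free(clone);
}
```

Because the release of the reference lives in the completion function, a driver that frees the clone by any other route would leak the socket reference in this model.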
     

21 Oct, 2011

4 commits

  • Adding const qualifiers to pointers can ease code review, and spot some
    bugs. It might allow the compiler to optimize code further.

    For example, is it legal to temporarily write a null cksum into tcphdr
    in tcp_md5_hash_header() ? I am afraid a sniffer could catch the
    temporary null value...

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Instead of using the dev->next chain and trying to resync at each call to
    dev_seq_start, use the name hash, keeping the bucket and the offset in
    seq->private field.

    Tests revealed the following results for ifconfig > /dev/null
    * 1000 interfaces:
    * 0.114s without patch
    * 0.089s with patch
    * 3000 interfaces:
    * 0.489s without patch
    * 0.110s with patch
    * 5000 interfaces:
    * 1.363s without patch
    * 0.250s with patch
    * 128000 interfaces (other setup):
    * ~100s without patch
    * ~30s with patch

    Signed-off-by: Mihai Maruseac
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Mihai Maruseac
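
Keeping a hash bucket and an offset within that bucket in a single position value, as the commit above describes, can be sketched as a pair of pack and unpack helpers. The bit split used here (`BUCKET_SPACE` of 24) and the helper names are assumptions for illustration; the real constants are not given in this log.

```c
/* Assumed split of a 32-bit position value. */
#define BUCKET_SPACE 24u
#define OFFSET_MASK  ((1u << BUCKET_SPACE) - 1u)

/* Pack a bucket index and an offset inside that bucket into one value. */
static unsigned int make_pos(unsigned int bucket, unsigned int offset)
{
    return (bucket << BUCKET_SPACE) | (offset & OFFSET_MASK);
}

/* Recover the bucket index from a packed position. */
static unsigned int pos_bucket(unsigned int pos)
{
    return pos >> BUCKET_SPACE;
}

/* Recover the offset within the bucket from a packed position. */
static unsigned int pos_offset(unsigned int pos)
{
    return pos & OFFSET_MASK;
}
```

An iterator that stores such a value can resume directly at the right bucket instead of walking a linear list from the start on every call.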
     
  • I've split this bit out of the skb frag destructor patch since it helps enforce
    the use of the fragment API.

    Signed-off-by: Ian Campbell
    Signed-off-by: David S. Miller

    Ian Campbell
     
  • Daniel Turull reported inaccuracies in pktgen when using low packet
    rates, because we call ndelay(val) with values bigger than 20000.

    Instead of calling ndelay() for delays < 100us, we can loop
    calling ktime_now() only.

    Reported-by: Daniel Turull
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
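
The approach in the commit above, polling the clock until a deadline rather than relying on a calibrated delay, has a simple user-space analogue. The sketch below uses POSIX `clock_gettime()` as a stand-in for the kernel clock; `now_ns()` and `spin_until()` are hypothetical helper names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

/* Current monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Busy-wait by re-reading the clock until the deadline has passed, so the
 * length of the wait is decided by the clock itself. */
static uint64_t spin_until(uint64_t deadline_ns)
{
    uint64_t t;

    do {
        t = now_ns();
    } while (t < deadline_ns);
    return t; /* time at which the wait actually ended */
}
```

A caller would compute the deadline once, for example `now_ns() + 50000` for a 50 us gap, and then call `spin_until()` with it.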
     

20 Oct, 2011

7 commits

  • I audited all of the callers in the tree and only one of them (pktgen) expects
    it to do so. Taking this reference is pretty obviously confusing and error
    prone.

    In particular I looked at the following commits which switched callers of
    (__)skb_frag_set_page to the skb paged fragment api:

    6a930b9f163d7e6d9ef692e05616c4ede65038ec cxgb3: convert to SKB paged frag API.
    5dc3e196ea21e833128d51eb5b788a070fea1f28 myri10ge: convert to SKB paged frag API.
    0e0634d20dd670a89af19af2a686a6cce943ac14 vmxnet3: convert to SKB paged frag API.
    86ee8130a46769f73f8f423f99dbf782a09f9233 virtionet: convert to SKB paged frag API.
    4a22c4c919c201c2a7f4ee09e672435a3072d875 sfc: convert to SKB paged frag API.
    18324d690d6a5028e3c174fc1921447aedead2b8 cassini: convert to SKB paged frag API.
    b061b39e3ae18ad75466258cf2116e18fa5bbd80 benet: convert to SKB paged frag API.
    b7b6a688d217936459ff5cf1087b2361db952509 bnx2: convert to SKB paged frag API.
    804cf14ea5ceca46554d5801e2817bba8116b7e5 net: xfrm: convert to SKB frag APIs
    ea2ab69379a941c6f8884e290fdd28c93936a778 net: convert core to skb paged frag APIs

    Signed-off-by: Ian Campbell
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Ian Campbell
     
  • When using dst_get_neighbour() to get a neighbour, we need
    rcu_read_lock() protection, since dst_get_neighbour() uses
    rcu_dereference().

    The bug was reported by Ari Savolainen

    [ 105.612095]
    [ 105.612096] ===================================================
    [ 105.612100] [ INFO: suspicious rcu_dereference_check() usage. ]
    [ 105.612101] ---------------------------------------------------
    [ 105.612103] include/net/dst.h:91 invoked rcu_dereference_check()
    without protection!
    [ 105.612105]
    [ 105.612106] other info that might help us debug this:
    [ 105.612106]
    [ 105.612108]
    [ 105.612108] rcu_scheduler_active = 1, debug_locks = 0
    [ 105.612110] 1 lock held by dnsmasq/2618:
    [ 105.612111] #0: (rtnl_mutex){+.+.+.}, at: []
    rtnl_lock+0x17/0x20
    [ 105.612120]
    [ 105.612121] stack backtrace:
    [ 105.612123] Pid: 2618, comm: dnsmasq Not tainted 3.1.0-rc1 #41
    [ 105.612125] Call Trace:
    [ 105.612129] [] lockdep_rcu_dereference+0xbb/0xc0
    [ 105.612132] [] neigh_update+0x4f9/0x5f0
    [ 105.612135] [] ? neigh_lookup+0xe1/0x220
    [ 105.612139] [] arp_req_set+0xb8/0x230
    [ 105.612142] [] arp_ioctl+0x1bf/0x310
    [ 105.612146] [] ? lock_hrtimer_base.isra.26+0x30/0x60
    [ 105.612150] [] inet_ioctl+0x85/0x90
    [ 105.612154] [] sock_do_ioctl+0x30/0x70
    [ 105.612157] [] sock_ioctl+0x73/0x280
    [ 105.612162] [] do_vfs_ioctl+0x98/0x570
    [ 105.612165] [] ? fget_light+0x340/0x3a0
    [ 105.612168] [] sys_ioctl+0x4f/0x80
    [ 105.612172] [] system_call_fastpath+0x16/0x1b

    Reported-by: Ari Savolainen
    Signed-off-by: RongQing
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    roy.qing.li@gmail.com
     
  • This is just a cleanup.

    My testing version of Smatch warns about this:
    net/core/filter.c +380 check_load_and_stores(6)
    warn: check 'flen' for negative values

    flen comes from the user. We try to clamp the values here between 1
    and BPF_MAXINSNS but the clamp doesn't work because it could be
    negative. This is a bug, but it's not exploitable.

    Signed-off-by: Dan Carpenter
    Signed-off-by: David S. Miller

    Dan Carpenter
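
Why the clamp described in the commit above fails for a signed length can be shown with two otherwise identical checks. This is a sketch: `MAX_INSNS` stands in for BPF_MAXINSNS with an assumed value, and both function names are hypothetical.

```c
#include <stdbool.h>

#define MAX_INSNS 4096 /* stands in for BPF_MAXINSNS; value assumed here */

/* Signed length: a negative value is neither 0 nor greater than the
 * maximum, so it slips through the range check. */
static bool len_ok_signed(int flen)
{
    return !(flen == 0 || flen > MAX_INSNS);
}

/* Unsigned length: the same bit pattern is a huge positive value and is
 * rejected by the upper bound. */
static bool len_ok_unsigned(unsigned int flen)
{
    return !(flen == 0 || flen > MAX_INSNS);
}
```

Passing -1 is accepted by the signed variant and rejected by the unsigned one, which is one way to make the existing clamp effective without adding a separate negative check.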
     
  • We should decrement ops->unresolved_rules when deleting an unresolved rule.

    Signed-off-by: Zheng Yan
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Yan, Zheng
     
  • This patch adds a sanity check on the values provided by user space for
    the hardware time stamping configuration. If the values lie outside of
    the absolute limits, then the ioctl request will be denied.

    Signed-off-by: Richard Cochran
    Signed-off-by: David S. Miller

    Richard Cochran
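
A range check of the kind the commit above describes could take the following shape. Everything in this sketch is hypothetical: the enumerations, the `struct tstamp_config` layout and `tstamp_config_valid()` are illustrative and do not mirror the real configuration structure.

```c
#include <stdbool.h>

/* Hypothetical enumerations; the real configuration values are not shown
 * in this log. The *_COUNT members mark the end of each valid range. */
enum tx_mode   { TX_OFF, TX_ON, TX_MODE_COUNT };
enum rx_filter { RX_NONE, RX_ALL, RX_PTP, RX_FILTER_COUNT };

struct tstamp_config {
    int flags;     /* reserved, expected to be zero in this sketch */
    int tx_mode;
    int rx_filter;
};

/* Reject any request whose fields fall outside the known ranges. */
static bool tstamp_config_valid(const struct tstamp_config *cfg)
{
    if (cfg->flags != 0)
        return false;
    if (cfg->tx_mode < 0 || cfg->tx_mode >= TX_MODE_COUNT)
        return false;
    if (cfg->rx_filter < 0 || cfg->rx_filter >= RX_FILTER_COUNT)
        return false;
    return true;
}
```

An ioctl handler built this way would validate the user-supplied structure first and return an error before any driver sees out-of-range values.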
     
  • This patch moves the rcu_barrier from rollback_registered_many
    (inside the rtnl_lock) into netdev_run_todo (just outside the rtnl_lock).
    This allows us to gain the full benefit of synchronize_net calling
    synchronize_rcu_expedited when the rtnl_lock is held.

    The rcu_barrier in rollback_registered_many was originally a synchronize_net
    but was promoted to be an rcu_barrier() when it was found that people were
    unnecessarily hitting the 250ms wait in netdev_wait_allrefs(). Changing
    the rcu_barrier back to a synchronize_net is therefore safe.

    Since we only care about waiting for the rcu callbacks before we get
    to netdev_wait_allrefs() it is also safe to move the wait into
    netdev_run_todo.

    This was tested by creating and destroying 1000 tap devices and observing
    /proc/lock_stat. /proc/lock_stat reports this change reduces the hold
    times of the rtnl_lock by a factor of 10. There was no observable
    difference in the amount of time it takes to destroy a network device.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • skb_recycle_check resets the skb if it's eligible for recycling.
    However, there are times when a driver might want to optionally
    manipulate the skb data before resetting the skb,
    but after it has determined eligibility. We do this by splitting the
    eligibility check from the skb reset, creating two inline functions to
    accomplish that task.

    Signed-off-by: Andy Fleming
    Acked-by: David Daney
    Signed-off-by: David S. Miller

    Andy Fleming
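
The split described in the commit above, one function that only tests eligibility and one that only resets, can be sketched on a toy buffer type. `struct buf_model` and the three function names are hypothetical, and the eligibility rule (not shared, large enough) is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer; not the kernel's sk_buff. */
struct buf_model {
    size_t capacity;
    bool shared;
    size_t len;
    unsigned char data[64];
};

/* Step 1: eligibility only, no side effects on the buffer. */
static bool buf_is_recyclable(const struct buf_model *b, size_t needed)
{
    return !b->shared && b->capacity >= needed;
}

/* Step 2: reset only, for callers that already checked eligibility. */
static void buf_reset(struct buf_model *b)
{
    b->len = 0;
    memset(b->data, 0, sizeof(b->data));
}

/* Combined helper keeping the old check-and-reset behaviour. */
static bool buf_recycle_check(struct buf_model *b, size_t needed)
{
    if (!buf_is_recyclable(b, needed))
        return false;
    buf_reset(b);
    return true;
}
```

A driver that wants to touch the data between the two steps would call `buf_is_recyclable()`, do its work, and then call `buf_reset()` itself.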
     

19 Oct, 2011

2 commits

  • To ease skb->truesize sanitization, it's better to be able to localize
    all references to skb frags size.

    Define accessors : skb_frag_size() to fetch frag size, and
    skb_frag_size_{set|add|sub}() to manipulate it.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
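
The accessor pattern named in the commit above can be modelled on a toy fragment type. The functions below mirror the naming scheme from the commit message (get, set, add, sub) but operate on a hypothetical `struct frag_model`, so they are an illustration rather than the kernel definitions.

```c
/* Hypothetical fragment descriptor holding only a size. */
struct frag_model {
    unsigned int size;
};

/* Read the fragment size. */
static unsigned int frag_size(const struct frag_model *frag)
{
    return frag->size;
}

/* Overwrite the fragment size. */
static void frag_size_set(struct frag_model *frag, unsigned int size)
{
    frag->size = size;
}

/* Grow the fragment size by delta bytes. */
static void frag_size_add(struct frag_model *frag, int delta)
{
    frag->size += delta;
}

/* Shrink the fragment size by delta bytes. */
static void frag_size_sub(struct frag_model *frag, int delta)
{
    frag->size -= delta;
}
```

Once every caller goes through such helpers, a later change to how the size is stored or accounted only has to touch these few functions.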
     
  • The following configuration used to work as I expected. At least
    we could use the fcoe interfaces to do MPIO and the bond0 iface
    to do load balancing or failover.

    ---eth2.228-fcoe
    |
    eth2 -----|
    |
    |---- bond0
    |
    eth3 -----|
    |
    ---eth3.228-fcoe

    This worked because of a change we added to allow inactive slaves
    to rx 'exact' matches. This functionality was kept intact with the
    rx_handler mechanism. However now the vlan interface attached to the
    active slave never receives traffic because the bonding rx_handler
    updates the skb->dev and goto's another_round. Previously, the
    vlan_do_receive() logic was called before the bonding rx_handler.

    Now by the time vlan_do_receive calls vlan_find_dev() the
    skb->dev is set to bond0 and it is clear no vlan is attached
    to this iface. The vlan lookup fails.

    This patch moves the VLAN check above the rx_handler. A VLAN
    tagged frame is now routed to the eth2.228-fcoe iface in the
    above schematic. Untagged frames continue to the bond0 as
    normal. This case also remains intact,

    eth2 --> bond0 --> vlan.228

    Here the skb is VLAN tagged but the vlan lookup fails on eth2
    causing the bonding rx_handler to be called. On the second
    pass the vlan lookup is on the bond0 iface and completes as
    expected.

    Putting a VLAN.228 on both the bond0 and eth2 device will
    result in eth2.228 receiving the skb. I don't think this is
    completely unexpected and was the result prior to the rx_handler
    result.

    Note, the same setup is also used for other storage traffic that
    MPIO is used with, e.g. iSCSI, and similar setups can be contrived
    without storage protocols.

    Signed-off-by: John Fastabend
    Acked-by: Jesse Gross
    Reviewed-by: Jiri Pirko
    Tested-by: Hans Schillstrom
    Signed-off-by: David S. Miller

    John Fastabend
     

18 Oct, 2011

1 commit


17 Oct, 2011

1 commit

  • Add configuration setting for drivers to turn spoof checking on or off
    for discrete VFs.

    v2 - Fix indentation problem, wrap the ifla_vf_info structure in
    #ifdef __KERNEL__ to prevent user space from accessing it, and
    change the function parameter for the spoof check setting netdev
    op from u8 to bool.
    v3 - Preset spoof check setting to -1 so that user space tools such
    as ip can detect that the driver didn't report a spoofcheck
    setting. Prevents incorrect display of spoof check settings
    for drivers that don't report it.

    Signed-off-by: Greg Rose
    Signed-off-by: Jeff Kirsher

    Greg Rose
     

14 Oct, 2011

1 commit

  • skb truesize currently accounts for sk_buff struct and part of skb head.
    kmalloc() roundings are also ignored.

    Considering that skb_shared_info is larger than sk_buff, it's time to
    take it into account for better memory accounting.

    This patch introduces SKB_TRUESIZE(X) macro to centralize various
    assumptions into a single place.

    At skb alloc phase, we put skb_shared_info struct at the exact end of
    skb head, to allow a better use of memory (lowering number of
    reallocations), since kmalloc() gives us power-of-two memory blocks.

    Unless SLAB/SLUB debug is active, both skb->head and skb_shared_info are
    aligned to cache lines, as before.

    Note: This patch might trigger performance regressions because of
    misconfigured protocol stacks, hitting per socket or global memory
    limits that were previously not reached. But it's a necessary step for a
    more accurate memory accounting.

    Signed-off-by: Eric Dumazet
    CC: Andi Kleen
    CC: Ben Hutchings
    Signed-off-by: David S. Miller

    Eric Dumazet
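
The idea of one macro that charges a buffer for its payload plus both bookkeeping structures, as the commit above describes for SKB_TRUESIZE(X), can be sketched with placeholder sizes. The cache-line size, the two struct sizes and the macro names below are assumptions for illustration only.

```c
#include <stddef.h>

#define CACHE_LINE 64u /* assumed cache-line size */

/* Round x up to a multiple of the cache-line size. */
#define ALIGN_UP(x) \
    (((size_t)(x) + CACHE_LINE - 1) & ~(size_t)(CACHE_LINE - 1))

/* Placeholder structures; the sizes are invented, chosen only so that the
 * shared-info part is the larger of the two. */
struct buf_head_model    { unsigned char pad[200]; };
struct shared_info_model { unsigned char pad[300]; };

/* Payload plus the aligned size of both bookkeeping structures. */
#define TRUESIZE(x) \
    ((size_t)(x) + ALIGN_UP(sizeof(struct buf_head_model)) + \
     ALIGN_UP(sizeof(struct shared_info_model)))
```

With these placeholder sizes a 100-byte payload is charged 100 + 256 + 320 = 676 bytes, which shows how small packets can still carry a large accounted size.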
     

08 Oct, 2011

1 commit


04 Oct, 2011

1 commit

  • Amir Vadai wrote:
    > When a stream is paused, and its rule is expired while it is paused,
    > no new rule will be configured to the HW when traffic resume.
    [...]
    > - When stream was resumed, traffic was steered again by RSS, and
    > because current-cpu was equal to desired-cpu, ndo_rx_flow_steer
    > wasn't called and no rule was configured to the HW.

    Fix this by setting the flow's current CPU only in the table for the
    newly selected RX queue.

    Reported-and-tested-by: Amir Vadai
    Signed-off-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Ben Hutchings
     

29 Sep, 2011

1 commit