26 Jun, 2015

1 commit

  • printk logbuf keeps various metadata and optional key=value dictionary for
    structured messages, both of which are stripped when messages are handed
    to regular console drivers.

    It can be useful to have this metadata and dictionary available to
    netconsole consumers. This obviously makes logging via netconsole more
    complete and the sequence number in particular is useful in environments
    where messages may be lost or reordered in transit - e.g. when netconsole
    is used to collect messages in a large cluster where packets may have to
    travel congested hops to reach the aggregator. The lost and reordered
    messages can easily be identified and handled accordingly using the
    sequence numbers.
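
    As a rough illustration of that last point, a collector could flag lost
    and reordered messages from the sequence numbers alone. This is only a
    sketch of receiver-side logic, not part of the patch: check_sequence is a
    hypothetical helper, and it assumes sequence numbers increase by one per
    message.

```python
def check_sequence(seqs):
    """Return (missing, reordered) for sequence numbers in arrival order."""
    if not seqs:
        return [], []
    reordered = []
    highest = seqs[0]
    for seq in seqs[1:]:
        if seq < highest:
            reordered.append(seq)  # arrived after a later message
        highest = max(highest, seq)
    seen = set(seqs)
    # Anything between the lowest and highest number that never showed up.
    missing = [s for s in range(min(seen), max(seen) + 1) if s not in seen]
    return missing, reordered
```

    A repeated sequence number is not treated as reordering here, since
    fragments of one message share the same number.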

    printk recently added extended console support which can be selected by
    setting CON_EXTENDED flag. From console driver side, not much changes.
    The only difference is that the text passed to the write callback is
    formatted the same way as /dev/kmsg.

    This patch implements extended console support for netconsole which can be
    enabled by either prepending "+" to a netconsole boot param entry or
    echoing 1 to "extended" file in configfs. When enabled, netconsole
    transmits extended log messages with headers identical to /dev/kmsg
    output.

    There's one complication due to message fragments. netconsole limits the
    maximum message size to 1k and messages longer than that are split into
    multiple fragments. As all extended console messages should carry
    matching headers and be uniquely identifiable, each extended message
    fragment carries full copy of the metadata and an extra header field to
    identify the specific fragment. The optional header is of the form
    "ncfrag=OFF/LEN" where OFF is the byte offset into the message body and
    LEN is the total length.
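
    A receiver might reassemble fragments along the following lines. This is
    a minimal sketch, not the kernel's or any existing tool's code. It assumes
    the header is a comma-separated list (level, sequence number, timestamp,
    flags, then optional fields) terminated by ";", as in /dev/kmsg output.
    The names parse_extended and reassemble, and the sample payloads in the
    usage note, are illustrative.

```python
def parse_extended(payload):
    """Split one extended netconsole payload into header fields and body."""
    header, _, body = payload.partition(";")
    fields = header.split(",")
    msg = {
        "level": int(fields[0]),
        "seq": int(fields[1]),
        "ts_usec": int(fields[2]),
        "frag": None,  # (OFF, LEN) when an ncfrag field is present
        "body": body,
    }
    for field in fields[3:]:
        if field.startswith("ncfrag="):
            off, total = field[len("ncfrag="):].split("/")
            msg["frag"] = (int(off), int(total))
    return msg


def reassemble(payloads):
    """Rebuild one message body from its fragments, or None if bytes are missing."""
    frags = [parse_extended(p) for p in payloads]
    total = frags[0]["frag"][1]
    buf = bytearray(total)
    filled = [False] * total
    for frag in frags:
        off = frag["frag"][0]
        data = frag["body"].encode()
        buf[off:off + len(data)] = data  # OFF is a byte offset into the body
        filled[off:off + len(data)] = [True] * len(data)
    return buf.decode() if all(filled) else None
```

    For example, the payloads "6,416,1758426,-,ncfrag=0/31;the first chunk,"
    and "6,416,1758426,-,ncfrag=16/31; the 2nd chunk." can arrive in either
    order and still be joined, because each one carries its own offset.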

    To avoid unnecessarily making printk format extended messages, the
    extended netconsole is registered with printk only when the first
    extended netconsole is configured.

    Signed-off-by: Tejun Heo
    Cc: David Miller
    Cc: Kay Sievers
    Cc: Petr Mladek
    Cc: Tetsuo Handa
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

25 Jun, 2015

1 commit

  • Pull documentation updates from Jonathan Corbet:
    "The main thing here is Ingo's big subdirectory documenting feature
    support for each architecture. Beyond that, it's the usual pile of
    fixes, tweaks, and small additions"

    * tag 'docs-for-linus' of git://git.lwn.net/linux-2.6: (79 commits)
    doc:md: fix typo in md.txt.
    Documentation/mic/mpssd: don't build x86 userspace when cross compiling
    Documentation/prctl: don't build tsc tests when cross compiling
    Documentation/vDSO: don't build tests when cross compiling
    Doc:ABI/testing: Fix typo in sysfs-bus-fcoe
    Doc: Docbook: Change wikipedia's URL from http to https in scsi.tmpl
    Doc: Change wikipedia's URL from http to https
    Documentation/kernel-parameters: add missing pciserial to the earlyprintk
    Doc:pps: Fix typo in pps.txt
    kbuild : Fix documentation of INSTALL_HDR_PATH
    Documentation: filesystems: updated struct file_operations documentation in vfs.txt
    kbuild: edit explanation of clean-files variable
    Doc: ja_JP: Fix typo in HOWTO
    Move freefall program from Documentation/ to tools/
    Documentation: ARM: EXYNOS: Describe boot loaders interface
    Doc:nfc: Fix typo in nfc-hci.txt
    vfs: Minor documentation fix
    Doc: networking: txtimestamp: fix printf format warning
    Documentation, intel_pstate: Improve legacy mode internal governors description
    Documentation: extend use case for EXPORT_SYMBOL_GPL()
    ...

    Linus Torvalds
     

23 Jun, 2015

1 commit


16 Jun, 2015

1 commit

  • We need to delete from offload the device externally learned fdbs when any
    one of these events happens:

    1) Bridge ages out fdb. (When bridge is doing ageing vs. device doing
    ageing. If device is doing ageing, it would send SWITCHDEV_FDB_DEL
    directly).

    2) STP state change flushes fdbs on port.

    3) User uses sysfs interface to flush fdbs from bridge or bridge port:

    echo 1 >/sys/class/net/BR_DEV/bridge/flush
    echo 1 >/sys/class/net/BR_PORT/brport/flush

    4) Offload driver sends event SWITCHDEV_FDB_DEL to delete fdb entry.

    For rocker, we can now get called to delete fdb entry in wait and nowait
    contexts, so set NOWAIT flag when deleting fdb entry.

    Signed-off-by: Scott Feldman
    Signed-off-by: David S. Miller

    Scott Feldman
     

14 Jun, 2015

1 commit


13 Jun, 2015

1 commit


05 Jun, 2015

1 commit

  • Documentation/networking/timestamping/txtimestamp.c: In function ‘__print_timestamp’:
    Documentation/networking/timestamping/txtimestamp.c:99:3: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 3 has type ‘int64_t’ [-Wformat=]
    fprintf(stderr, " (%+ld us)", cur_ms - prev_ms);

    int64_t differs per platform, so a type specifier that differs along
    with it is required.

    Signed-off-by: Frans Klaver
    Signed-off-by: Jonathan Corbet

    Frans Klaver
     

04 Jun, 2015

4 commits


31 May, 2015

1 commit

  • …etooth/bluetooth-next

    Johan Hedberg says:

    ====================
    pull request: bluetooth-next 2015-05-28

    Here's a set of patches intended for 4.2. The majority of the changes
    are on the 802.15.4 side of things rather than Bluetooth related:

    - All sorts of cleanups & fixes to ieee802154 and related drivers
    - Rework of tx power support in ieee802154 and its drivers
    - Support for setting ieee802154 tx power through nl802154
    - New IDs for the btusb driver
    - Various cleanups & smaller fixes to btusb
    - New btrtl driver for Realtek devices
    - Fix suspend/resume for Realtek devices

    Please let me know if there are any issues pulling. Thanks.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     

28 May, 2015

1 commit

  • A long standing problem on busy servers is the tiny available TCP port
    range (/proc/sys/net/ipv4/ip_local_port_range) and the default
    sequential allocation of source ports in connect() system call.

    If a host is having a lot of active TCP sessions, chances are
    very high that all ports are in use by at least one flow,
    and subsequent bind(0) attempts fail, or have to scan a big portion of
    space to find a slot.

    In this patch, I changed the starting point in __inet_hash_connect()
    so that we try to favor even [1] ports, leaving odd ports for bind()
    users.

    We still perform a sequential search, so there is no guarantee, but
    if connect() targets are very different, end result is we leave
    more ports available to bind(), and we spread them all over the range,
    lowering time for both connect() and bind() to find a slot.

    This strategy only works well if /proc/sys/net/ipv4/ip_local_port_range
    spans an even number of ports, i.e. if start/end values have different
    parity.

    Therefore, default /proc/sys/net/ipv4/ip_local_port_range was changed to
    32768 - 60999 (instead of 32768 - 61000)

    There is no change on security aspects here, only some poor hashing
    schemes could be eventually impacted by this change.

    [1] : The odd/even property depends on ip_local_port_range values parity
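
    The parity argument can be illustrated with a small sketch. This is not
    the kernel's __inet_hash_connect() code, only a model of a sequential
    search that steps by two. connect_candidates is a hypothetical name, and
    the real search also has to check whether each port is actually free.

```python
def connect_candidates(low, high, offset):
    """Yield source ports in the order a connect()-style search would try them."""
    remaining = high - low + 1
    offset &= ~1  # start on a port with the same parity as `low`
    for i in range(0, remaining, 2):  # step by two to skip the other parity
        yield low + (i + offset) % remaining
```

    With 32768 - 60999 the range size is even, so every candidate keeps the
    parity of the start value. With 32768 - 61000 the size is odd, and the
    modulo wrap-around flips parity partway through the search.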

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

27 May, 2015

1 commit

  • * Update the linux-zigbee git:// repository URL.

    * Remove the MLME section as the current kernel does not provide a
    full 802.15.4 MLME implementation.

    * The hardmac example driver 'fakehard' was removed some time ago.

    * The IEEE 802.15.4 device drivers live in drivers/net/ieee802154/,
    not in drivers/ieee802154/.

    * The IEEE 802.15.4 MTU is 127 bytes, not 128 bytes.

    * Some of the 6LoWPAN code lives in net/6lowpan/.

    Signed-off-by: Lennert Buytenhek
    Reviewed-by: Stefan Schmidt
    Acked-by: Alexander Aring
    Signed-off-by: Marcel Holtmann

    Lennert Buytenhek
     

23 May, 2015

5 commits

  • Add the pktgen samples script pktgen_sample02_multiqueue.sh that
    demonstrates generating packets on multiqueue NICs.

    Specifically notice the option "-t", which specifies how many
    kernel threads to activate. Also notice the flag QUEUE_MAP_CPU,
    which causes the SKB TX queue to be mapped to the CPU running the
    kernel thread. For best scalability people are also encouraged to
    map NIC IRQ /proc/irq/*/smp_affinity to CPU number.

    Usage example with "-t" 4 threads and help:
    ./pktgen_sample02_multiqueue.sh -i eth4 -m 00:1B:21:3C:9D:F8 -t 4

    Usage: ./pktgen_sample02_multiqueue.sh [-vx] -i ethX
    -i : ($DEV) output interface/device (required)
    -s : ($PKT_SIZE) packet size
    -d : ($DEST_IP) destination IP
    -m : ($DST_MAC) destination MAC-addr
    -t : ($THREADS) threads to start
    -c : ($SKB_CLONE) SKB clones send before alloc new SKB
    -b : ($BURST) HW level bursting of SKBs
    -v : ($VERBOSE) verbose
    -x : ($DEBUG) debug

    Removing pktgen.conf-2-1 and pktgen.conf-2-2 as these examples
    should be covered now.

    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Jesper Dangaard Brouer
     
  • Add the first basic pktgen samples script pktgen_sample01_simple.sh,
    which demonstrates a simple use of the helper functions.
    Removing pktgen.conf-1-1 as that example should be covered now.

    The naming scheme pktgen_sampleNN, where NN is a number, should encourage
    reading the samples in a specific order.

    The script makes pktgen send with a single thread and a single interface,
    and introduces flow variation via a random UDP source port.

    Usage example and help:
    ./pktgen_sample01_simple.sh -i eth4 -m 00:1B:21:3C:9D:F8 -d 192.168.8.2

    Usage: ./pktgen_sample01_simple.sh [-vx] -i ethX
    -i : ($DEV) output interface/device (required)
    -s : ($PKT_SIZE) packet size
    -d : ($DEST_IP) destination IP
    -m : ($DST_MAC) destination MAC-addr
    -c : ($SKB_CLONE) SKB clones send before alloc new SKB
    -v : ($VERBOSE) verbose
    -x : ($DEBUG) debug

    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Jesper Dangaard Brouer
     
  • The pktgen.txt documentation still claimed that adding the same device to
    multiple threads was not supported, but it has been since 2008 via
    commit e6fce5b916cd7 ("pktgen: multiqueue etc.").

    Document this and describe the naming scheme dev@X, as the procfile name
    still needs to be unique.

    Fixes: e6fce5b916cd7 ("pktgen: multiqueue etc.")
    Signed-off-by: Jesper Dangaard Brouer
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Jesper Dangaard Brouer
     
  • The pktgen.txt documentation of the available config options was not
    complete. Make the list complete by adding the following.

    Pgcontrol commands:
    reset

    Device commands:
    burst
    queue_map_min
    queue_map_max
    skb_priority
    tos
    traffic_class
    node
    spi
    dst6_max
    dst6_min
    vlan_cfi
    vlan_id
    vlan_p
    svlan_cfi
    svlan_id
    svlan_p

    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Jesper Dangaard Brouer
     
  • And clean up some whitespace in pktgen.txt.

    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Jesper Dangaard Brouer
     

20 May, 2015

1 commit

  • This work is a follow-up of commit f7b3bec6f516 ("net: allow setting ecn
    via routing table") and adds the RFC3168 section 6.1.1.1 fallback for
    outgoing ECN connections. In other words, this work adds a retry with a
    non-ECN setup SYN packet on the first timeout, as suggested by the RFC:

    [...] A host that receives no reply to an ECN-setup SYN within the
    normal SYN retransmission timeout interval MAY resend the SYN and
    any subsequent SYN retransmissions with CWR and ECE cleared. [...]

    Schematic client-side view when assuming the server is in tcp_ecn=2 mode,
    that is, Linux default since 2009 via commit 255cac91c3c9 ("tcp: extend
    ECN sysctl to allow server-side only ECN"):

    1) Normal ECN-capable path:

    SYN ECE CWR ----->

    2) Path with broken middlebox, when client has fallback:

    SYN ECE CWR ----X crappy middlebox drops packet
    (timeout, rtx)
    SYN ----->

    In case we would not have the fallback implemented, the middlebox drop
    point would basically end up as:

    SYN ECE CWR ----X crappy middlebox drops packet
    (timeout, rtx)
    SYN ECE CWR ----X crappy middlebox drops packet
    (timeout, rtx)
    SYN ECE CWR ----X crappy middlebox drops packet
    (timeout, rtx)

    In any case, only a small percentage of sites would incur such
    additional setup latency: it was found at the end of 2014 that ~56%
    of IPv4 and 65% of IPv6 servers of the Alexa 1 million list would negotiate
    ECN (aka tcp_ecn=2 default), while 0.42% of these webservers would fail to
    connect when trying to negotiate with ECN (tcp_ecn=1) due to timeouts,
    which the fallback would mitigate with a slight latency trade-off. Recent
    related paper on this topic:

    Brian Trammell, Mirja Kühlewind, Damiano Boppart, Iain Learmonth,
    Gorry Fairhurst, and Richard Scheffenegger:
    "Enabling Internet-Wide Deployment of Explicit Congestion Notification."
    Proc. PAM 2015, New York.
    http://ecn.ethz.ch/ecn-pam15.pdf

    Thus, when net.ipv4.tcp_ecn=1 is being set, the patch will perform the
    RFC3168 section 6.1.1.1 fallback on timeout. For users who explicitly do
    not want this, which can be the case in DC deployments, we add a
    net.ipv4.tcp_ecn_fallback knob that allows for disabling the fallback.
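
    The client-side behaviour in the schematic can be modelled in a few
    lines. This is a sketch of the decision only, not the kernel
    implementation: syn_flags is a hypothetical helper and the retransmit
    counter is a simplification. The constants are the standard TCP header
    flag bits.

```python
SYN, ECE, CWR = 0x02, 0x40, 0x80  # TCP header flag bits


def syn_flags(retransmits, ecn_fallback=True):
    """Flags for the SYN sent after `retransmits` timeouts with tcp_ecn=1."""
    flags = SYN | ECE | CWR  # ECN-setup SYN
    if retransmits > 0 and ecn_fallback:
        flags &= ~(ECE | CWR)  # fall back to a plain SYN after a timeout
    return flags
```

    With the fallback disabled, every retransmission stays an ECN-setup SYN,
    which corresponds to the second schematic where each attempt is dropped.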

    tp->ecn_flags are not being cleared in tcp_ecn_clear_syn() on output, but
    rather we let tcp_ecn_rcv_synack() take that over on input path in case a
    SYN ACK ECE was delayed. Thus a spurious SYN retransmission will not prevent
    ECN being negotiated eventually in that case.

    Reference: https://www.ietf.org/proceedings/92/slides/slides-92-iccrg-1.pdf
    Reference: https://www.ietf.org/proceedings/89/slides/slides-89-tsvarea-1.pdf
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Florian Westphal
    Signed-off-by: Mirja Kühlewind
    Signed-off-by: Brian Trammell
    Cc: Eric Dumazet
    Cc: Dave That
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

14 May, 2015

2 commits

  • Seems all we want here is to avoid endless 'goto reclassify' loop.
    tc_classify_compat even resets this counter when something other
    than TC_ACT_RECLASSIFY is returned, so this skb-counter doesn't
    break hypothetical loops induced by something other than perpetual
    TC_ACT_RECLASSIFY return values.

    skb_act_clone is now identical to skb_clone, so just use that.

    Tested with following (bogus) filter:
    tc filter add dev eth0 parent ffff: \
    protocol ip u32 match u32 0 0 police rate 10Kbit burst \
    64000 mtu 1500 action reclassify

    Acked-by: Daniel Borkmann
    Signed-off-by: Florian Westphal
    Acked-by: Alexei Starovoitov
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Florian Westphal
     
  • There were a few review comments on the switchdev.txt documentation that
    didn't get included with the Spring Cleanup series, so include them now.

    Signed-off-by: Scott Feldman
    Signed-off-by: David S. Miller

    Scott Feldman
     

13 May, 2015

1 commit

  • Much-needed update of the switchdev documentation to cover what's been
    implemented to date. There are some XXX comments in the text for
    unimplemented or broken items. I'd like to keep these in there (poor-man's
    TODO list) and update the document once each issue is resolved.

    Signed-off-by: Scott Feldman
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Scott Feldman
     

11 May, 2015

3 commits

  • The port key has three components - user-key, speed-part, and duplex-part.
    The LSBit is for the duplex-part, next 5 bits are for the speed while the
    remaining 10 bits are the user defined key bits. Get these 10 bits
    from the user-space (through the SysFs interface) and use it to form the
    admin port-key. Allowed range for the user-key is 0 - 1023 (10 bits). If
    it is not provided then use zero for the user-key-bits (default).

    It can be set using the following example code -

    # modprobe bonding mode=4
    # usr_port_key=$(( RANDOM & 0x3FF ))
    # echo $usr_port_key > /sys/class/net/bond0/bonding/ad_user_port_key
    # echo +eth1 > /sys/class/net/bond0/bonding/slaves
    ...
    # ip link set bond0 up
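
    The bit layout described above (1 duplex bit, 5 speed bits, 10 user-key
    bits) can be sketched as follows. make_port_key and split_port_key are
    hypothetical helpers written for illustration; how the bonding driver
    encodes the speed value itself is not covered here.

```python
def make_port_key(user_key, speed, duplex):
    """Pack the three components into a 16-bit port key."""
    if not 0 <= user_key <= 1023:
        raise ValueError("user key must fit in 10 bits (0 - 1023)")
    if not 0 <= speed <= 31:
        raise ValueError("speed part must fit in 5 bits")
    if duplex not in (0, 1):
        raise ValueError("duplex part is a single bit")
    return (user_key << 6) | (speed << 1) | duplex


def split_port_key(key):
    """Return (user_key, speed, duplex) from a 16-bit port key."""
    return key >> 6, (key >> 1) & 0x1F, key & 0x01
```

    A user key of 1023 with zero speed and duplex parts gives 0xFFC0, that
    is, all ten upper bits set.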

    Signed-off-by: Mahesh Bandewar
    Reviewed-by: Nikolay Aleksandrov
    [jt: * fixed up style issues reported by checkpatch
    * fixed up context from change in ad_actor_sys_prio patch]
    Signed-off-by: Jonathan Toppins
    Signed-off-by: David S. Miller

    Mahesh Bandewar
     
  • In an AD system, the communication between actor and partner is the
    business between these two entities. In the current setup anyone on the
    same L2 can "guess" the LACPDU contents and then possibly send
    spoofed LACPDUs and trick the partner, causing connectivity issues for
    the AD system. This patch allows the use of a random mac-address, obscuring
    its identity and making it harder for someone on the same L2 to do that.

    This patch allows user-space to choose the mac-address for the AD-system.
    This mac-address can not be NULL or a Multicast. If the mac-address is set
    from user-space, the kernel will honor it and will not overwrite it. In
    the absence of a value from user space, the logic will default to using
    the master's mac as the mac-address for the AD-system.

    It can be set using example code below -

    # modprobe bonding mode=4
    # sys_mac_addr=$(printf '%02x:%02x:%02x:%02x:%02x:%02x' \
    $(( (RANDOM & 0xFE) | 0x02 )) \
    $(( RANDOM & 0xFF )) \
    $(( RANDOM & 0xFF )) \
    $(( RANDOM & 0xFF )) \
    $(( RANDOM & 0xFF )) \
    $(( RANDOM & 0xFF )))
    # echo $sys_mac_addr > /sys/class/net/bond0/bonding/ad_actor_system
    # echo +eth1 > /sys/class/net/bond0/bonding/slaves
    ...
    # ip link set bond0 up
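
    The constraint on the address (not NULL, not multicast) and the bit
    masking used in the shell example can be expressed as a short sketch.
    Both function names are hypothetical. The check mirrors the rule stated
    above rather than the driver's actual validation code.

```python
import random


def random_ad_actor_system(rng=random):
    """Random MAC with the multicast bit clear and the local-admin bit set."""
    first = (rng.randrange(256) & 0xFE) | 0x02
    octets = [first] + [rng.randrange(256) for _ in range(5)]
    return ":".join(f"{o:02x}" for o in octets)


def is_acceptable(mac):
    """True if `mac` is neither all-zero (NULL) nor a multicast address."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        return False
    is_null = all(o == 0 for o in octets)
    is_multicast = bool(octets[0] & 0x01)  # lowest bit of the first octet
    return not is_null and not is_multicast
```

    Masking the first octet with 0xFE clears the multicast bit, and OR-ing
    in 0x02 marks the address as locally administered, so a generated
    address always passes the check.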

    Signed-off-by: Mahesh Bandewar
    Reviewed-by: Nikolay Aleksandrov
    [jt: fixed up style issues reported by checkpatch]
    Signed-off-by: Jonathan Toppins
    Signed-off-by: David S. Miller

    Mahesh Bandewar
     
  • This patch allows user to randomize the system-priority in an ad-system.
    The allowed range is 1 - 0xFFFF while default value is 0xFFFF. If user
    does not specify this value, the system defaults to 0xFFFF, which is
    what it was before this patch.

    Following example code could set the value -
    # modprobe bonding mode=4
    # sys_prio=$(( 1 + RANDOM + RANDOM ))
    # echo $sys_prio > /sys/class/net/bond0/bonding/ad_actor_sys_prio
    # echo +eth1 > /sys/class/net/bond0/bonding/slaves
    ...
    # ip link set bond0 up

    Signed-off-by: Mahesh Bandewar
    Reviewed-by: Nikolay Aleksandrov
    [jt: * fixed up style issues reported by checkpatch
    * changed how the default value is set in bond_check_params(), this
    makes the default consistent between what gets set for a new bond
    and what the default is claimed to be in the bonding options.]
    Signed-off-by: Jonathan Toppins
    Signed-off-by: David S. Miller

    Mahesh Bandewar
     

10 May, 2015

2 commits

  • Introduce xmit_mode 'netif_receive' for pktgen which generates the
    packets using familiar pktgen commands, but feeds them into
    netif_receive_skb() instead of ndo_start_xmit().

    Default mode is called 'start_xmit'.

    It is designed to test netif_receive_skb and ingress qdisc
    performance only. Make sure to understand how it works before
    using it for other rx benchmarking.

    Sample script 'pktgen.sh':
    #!/bin/bash
    function pgset() {
        local result

        echo $1 > $PGDEV

        result=`cat $PGDEV | fgrep "Result: OK:"`
        if [ "$result" = "" ]; then
            cat $PGDEV | fgrep Result:
        fi
    }

    [ -z "$1" ] && echo "Usage: $0 DEV" && exit 1
    ETH=$1

    PGDEV=/proc/net/pktgen/kpktgend_0
    pgset "rem_device_all"
    pgset "add_device $ETH"

    PGDEV=/proc/net/pktgen/$ETH
    pgset "xmit_mode netif_receive"
    pgset "pkt_size 60"
    pgset "dst 198.18.0.1"
    pgset "dst_mac 90:e2:ba:ff:ff:ff"
    pgset "count 10000000"
    pgset "burst 32"

    PGDEV=/proc/net/pktgen/pgctrl
    echo "Running... ctrl^C to stop"
    pgset "start"
    echo "Done"
    cat /proc/net/pktgen/$ETH

    Usage:
    $ sudo ./pktgen.sh eth2
    ...
    Result: OK: 232376(c232372+d3) usec, 10000000 (60byte,0frags)
    43033682pps 20656Mb/sec (20656167360bps) errors: 10000000

    Raw netif_receive_skb speed should be ~43 million packets
    per second on a 3.7GHz x86, and 'perf report' should look like:
    37.69% kpktgend_0 [kernel.vmlinux] [k] __netif_receive_skb_core
    25.81% kpktgend_0 [kernel.vmlinux] [k] kfree_skb
    7.22% kpktgend_0 [kernel.vmlinux] [k] ip_rcv
    5.68% kpktgend_0 [pktgen] [k] pktgen_thread_worker

    If fib_table_lookup is seen on top, it means the skb was processed
    by the stack. To benchmark netif_receive_skb only, make sure
    that the 'dst_mac' of your pktgen script is different from the
    receiving device mac, so that the packet is dropped by ip_rcv.
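
    The figures in the sample Result line are consistent with each other,
    which a short sketch can check. The relation bps = pps * pkt_size * 8 is
    inferred from the sample output above rather than taken from the pktgen
    source. parse_result and bits_per_second are hypothetical helpers.

```python
import re


def parse_result(line):
    """Extract (pps, Mb/sec, bps) from a pktgen result line."""
    match = re.search(r"(\d+)pps (\d+)Mb/sec \((\d+)bps\)", line)
    if match is None:
        raise ValueError("not a pktgen rate line")
    return tuple(int(group) for group in match.groups())


def bits_per_second(pps, pkt_size):
    """Throughput implied by a packet rate and a packet size in bytes."""
    return pps * pkt_size * 8
```

    For the run shown above, 43033682 packets per second at 60 bytes each
    gives 20656167360 bits per second, matching the reported value.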

    Signed-off-by: Alexei Starovoitov
    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Alexei Starovoitov
     
  • Allow flag NO_TIMESTAMP to turn timestamping on again, like other flags,
    with a negation of the flag like !NO_TIMESTAMP.

    Also document the option flag NO_TIMESTAMP.

    Fixes: afb84b626184 ("pktgen: add flag NO_TIMESTAMP to disable timestamping")
    Signed-off-by: Jesper Dangaard Brouer
    Signed-off-by: David S. Miller

    Jesper Dangaard Brouer
     

06 May, 2015

1 commit

  • The current definition of struct can_frame has a 16-byte size, with 8-byte
    alignment, but the 3 bytes of padding are not explicit like the similar 2 bytes
    of padding of struct canfd_frame. Make it explicit so it is easier to read.
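
    The layout can be checked from user space with a small ctypes sketch.
    The field names here are illustrative rather than copied from the
    header, and the three padding bytes are modelled as a single array. With
    the padding spelled out, the data field lands at offset 8 without
    relying on an alignment attribute.

```python
import ctypes


class CanFrame(ctypes.Structure):
    """Model of a 16-byte CAN frame with its padding made explicit."""
    _fields_ = [
        ("can_id", ctypes.c_uint32),    # 4 bytes: CAN ID plus flag bits
        ("can_dlc", ctypes.c_uint8),    # 1 byte: payload length
        ("pad", ctypes.c_uint8 * 3),    # 3 bytes: explicit padding
        ("data", ctypes.c_uint8 * 8),   # 8 bytes: payload, at offset 8
    ]
```

    Here ctypes.sizeof(CanFrame) is 16 and CanFrame.data.offset is 8, which
    matches the size and data alignment described above.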

    Signed-off-by: Shawn Landden
    Acked-by: Oliver Hartkopp
    Signed-off-by: Marc Kleine-Budde

    Shawn Landden
     

04 May, 2015

1 commit

  • This patch divides the IPv6 flow label space into two ranges:
    0-7ffff is reserved for flow label manager, 80000-fffff will be
    used for creating auto flow labels (per RFC6438). This only affects how
    labels are set on transmit, it does not affect receive. This range split
    can be disabled by sysctl.
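
    The split amounts to reserving the top bit of the 20-bit label for
    automatically generated labels. A minimal sketch, assuming the auto
    label is derived from some flow hash. The function names are
    hypothetical, and this is not the kernel's code.

```python
FLOWLABEL_MASK = 0xFFFFF   # flow labels are 20 bits wide
STATELESS_FLAG = 0x80000   # top bit selects the auto (stateless) range


def auto_flow_label(flow_hash):
    """Map a flow hash into the auto flow label range 80000-fffff."""
    return (flow_hash & FLOWLABEL_MASK) | STATELESS_FLAG


def in_manager_range(label):
    """True if the label lies in the range reserved for the flow label manager."""
    return 0 <= label <= 0x7FFFF
```

    Any hash value therefore produces a label that cannot collide with one
    handed out by the flow label manager.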

    Background:

    IPv6 flow labels have been an unmitigated disappointment thus far
    in the lifetime of IPv6. Support in HW devices to use them for ECMP
    is lacking, and OSes don't turn them on by default. If we had these
    we could get much better hashing in IPv6 networks without resorting
    to DPI, possibly eliminating some of the motivations to define new
    encaps in UDP just for getting ECMP.

    Unfortunately, the initial specifications of IPv6 did not clarify
    how they are to be used. There has always been a vague concept that
    these can be used for ECMP, flow hashing, etc. and we do now have a
    good standard for how to do this in RFC6438. The problem is that flow labels
    can be either stateful or stateless (as in RFC6438), and we are
    presented with the possibility that a stateless label may collide
    with a stateful one. Attempts to split the flow label space were
    rejected in IETF. When we added support in Linux for RFC6438, we
    could not turn on flow labels by default due to this conflict.

    This patch splits the flow label space and should give us
    a path to enabling auto flow labels by default for all IPv6 packets.
    This is an API change so we need to consider compatibility with
    existing deployment. The stateful range is chosen to be the lower
    values in hopes that most uses would have chosen small numbers.

    Once we resolve the stateless/stateful issue, we can proceed to
    look at enabling RFC6438 flow labels by default (starting with
    scaled testing).

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     

03 May, 2015

1 commit

  • Not used.

    pedit sets TC_MUNGED when packet content was altered, but all the core
    does is unset MUNGED again and then set OK2MUNGE.

    And the latter isn't tested anywhere. So let's remove both
    TC_MUNGED and TC_OK2MUNGE.

    Signed-off-by: Florian Westphal
    Acked-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Florian Westphal
     

27 Apr, 2015

1 commit

  • Commit 567e4b79731c ("net: rfs: add hash collision detection") had one
    mistake :

    RPS_NO_CPU is no longer the marker for invalid cpu in set_rps_cpu()
    and get_rps_cpu(), as @next_cpu was the result of an AND with
    rps_cpu_mask

    This bug showed up on a host with 72 cpus :
    next_cpu was 0x7f, and the code was trying to access percpu data of a
    non-existent cpu.
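
    The arithmetic behind the bug can be reproduced in a few lines. This
    sketch assumes that the mask is the next power of two minus one (which
    gives 0x7f for 72 cpus) and that RPS_NO_CPU is 0xffff. Both values are
    assumptions made for illustration, as are the helper names.

```python
RPS_NO_CPU = 0xFFFF  # assumed value of the "no cpu" marker


def rps_cpu_mask(nr_cpu_ids):
    """Smallest (power of two) - 1 that covers every cpu index."""
    return (1 << (nr_cpu_ids - 1).bit_length()) - 1


def is_valid_cpu(next_cpu, nr_cpu_ids):
    """The safe check: compare against the number of cpus, not the marker."""
    return next_cpu < nr_cpu_ids
```

    On a 72-cpu host the marker masked with 0x7f becomes 0x7f, which no
    longer equals RPS_NO_CPU but is still not a valid cpu index, so only the
    comparison against the cpu count rejects it.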

    In a follow up patch, we might get rid of compares against nr_cpu_ids,
    if we init the tables with 0. This is silly to test for a very unlikely
    condition that exists only shortly after table initialization, as
    we got rid of rps_reset_sock_flow() and similar functions that were
    writing this RPS_NO_CPU magic value at flow dismantle : When table is
    old enough, it never contains this value anymore.

    Fixes: 567e4b79731c ("net: rfs: add hash collision detection")
    Signed-off-by: Eric Dumazet
    Cc: Tom Herbert
    Cc: Ben Hutchings
    Signed-off-by: David S. Miller

    Eric Dumazet
     

23 Apr, 2015

1 commit

  • An MPLS network is a single trust domain where the edges must be in
    control of what labels make their way into the core. The simplest way
    of ensuring this is for the edge device to always impose the labels,
    and not allow forwarding of labeled traffic from untrusted neighbours. This
    is achieved by allowing a per-device configuration of whether MPLS
    traffic input from that interface should be processed or not.

    To be secure by default, the default state is changed to MPLS being
    disabled on all interfaces unless explicitly enabled and no global
    option is provided to change the default. Whilst this differs from
    other protocols (e.g. IPv6), network operators are used to explicitly
    enabling MPLS forwarding on interfaces, and with the number of links
    to the MPLS core typically fairly low this doesn't present too much of
    a burden on operators.

    Cc: "Eric W. Biederman"
    Signed-off-by: Robert Shearman
    Reviewed-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Robert Shearman
     

15 Apr, 2015

1 commit


10 Apr, 2015

3 commits


09 Apr, 2015

1 commit


01 Apr, 2015

1 commit

  • The CAN_RAW socket can set multiple CAN identifier specific filters that lead
    to multiple filters in the af_can.c filter processing. These filters are
    independent from each other, which leads to logically OR'ed filters when applied.

    This socket option joins the given CAN filters in the way that only CAN frames
    are passed to user space that matched *all* given CAN filters. The semantic for
    the applied filters is therefore changed to a logical AND.

    This is useful especially when the filterset is a combination of filters where
    the CAN_INV_FILTER flag is set in order to notch single CAN IDs or CAN ID
    ranges from the incoming traffic.
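
    The difference between the two semantics shows up clearly with inverted
    filters. This is a user-space model of the matching rule, not the
    af_can.c code: the helper names are hypothetical, and the match rule
    (received ID and filter ID compared under the mask, inverted when
    CAN_INV_FILTER is set) follows the usual CAN_RAW filter description.

```python
CAN_INV_FILTER = 0x20000000  # flag bit in the filter ID that inverts the match


def matches(can_id, flt_id, flt_mask):
    """Apply one CAN_RAW-style filter to a received CAN ID."""
    inverted = bool(flt_id & CAN_INV_FILTER)
    hit = (can_id & flt_mask) == (flt_id & ~CAN_INV_FILTER & flt_mask)
    return hit != inverted


def passes_or(can_id, filters):
    """Default semantics: the frame passes if any filter matches."""
    return any(matches(can_id, fid, mask) for fid, mask in filters)


def passes_and(can_id, filters):
    """Joined semantics: the frame passes only if all filters match."""
    return all(matches(can_id, fid, mask) for fid, mask in filters)
```

    With two inverted filters meant to notch out IDs 0x123 and 0x456, the
    OR'ed form still lets 0x123 through (it matches the inverted 0x456
    filter), whereas the AND'ed form drops both IDs and passes everything
    else.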

    As the raw_rcv() function is executed from NET_RX softirq the introduced
    variables are implemented as per-CPU variables to avoid extensive locking at
    CAN frame reception time.

    Signed-off-by: Oliver Hartkopp
    Signed-off-by: Marc Kleine-Budde

    Oliver Hartkopp
     

25 Mar, 2015

1 commit

  • If vlan offloading takes place then vlan header is removed from frame
    and its contents, both vlan_tci and vlan_proto, are available to user
    space via TPACKET interface. However, only vlan_tci can be used in BPF
    filters.

    This commit introduces a new BPF extension. It makes it possible to load
    the value of vlan_proto (vlan TPID) to register A. Support for classic
    BPF and eBPF is being added, analogous to skb->protocol.

    Cc: Daniel Borkmann
    Cc: Alexei Starovoitov
    Cc: Jiri Pirko

    Signed-off-by: Michal Sekletar
    Acked-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Michal Sekletar