20 Aug, 2016

7 commits

  • We've already set sk to sock->sk and dereferenced it, so if it's NULL
    we would have crashed already. Moreover, if it was NULL we would have
    crashed anyway when jumping to 'out' and trying to unlock the sock.
    Furthermore, if we had assigned a different value to 'sk' we would
    have been calling lock_sock() and release_sock() on different sockets.

    My conclusion is that these two lines are complete nonsense and only
    serve to confuse the reader.

    Signed-off-by: Vegard Nossum
    Signed-off-by: David S. Miller

    Vegard Nossum
     
  • Now that the DSA code has been updated to allow it, the Broadcom
    Starfighter 2 switch driver should be a proper platform driver:
    register a switch device, feed it with the proper configuration data
    coming from Device Tree, and register our switch device with DSA.

    The bulk of the changes consist in moving what bcm_sf2_sw_setup() did
    into the platform driver probe function.

    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • In preparation for allowing switch drivers to implement system-wide
    suspend/resume functions, export dsa_switch_suspend() and
    dsa_switch_resume() such that these are callable from the appropriate
    driver specific suspend/resume functions.

    Reviewed-by: Andrew Lunn
    Tested-by: Vivien Didelot
    Signed-off-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Florian Fainelli
     
  • Fixes the following sparse warning:

    net/ipv4/fib_semantics.c:1579:61: warning: incorrect type in argument 2
    (different base types)
    net/ipv4/fib_semantics.c:1579:61: expected unsigned int [unsigned]
    [usertype] key
    net/ipv4/fib_semantics.c:1579:61: got restricted __be32 const
    [usertype] nh_gw

    Fixes: a6db4494d218c ("net: ipv4: Consider failed nexthops in multipath routes")
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
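
    The sparse warning quoted in the entry above comes from passing a
    restricted big-endian value where a plain unsigned int key is expected.
    A minimal userspace sketch of that kind of annotation follows. SK_BITWISE,
    SK_FORCE, sk_be32 and both functions are hypothetical stand-ins for the
    kernel's __bitwise, __force and __be32, and the log does not say whether
    the commit itself uses a cast of this form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Under sparse (__CHECKER__) these mark a type as restricted; for a normal
 * compiler they expand to nothing. Hypothetical names, not kernel macros. */
#ifdef __CHECKER__
#define SK_BITWISE __attribute__((bitwise))
#define SK_FORCE   __attribute__((force))
#else
#define SK_BITWISE
#define SK_FORCE
#endif

typedef uint32_t SK_BITWISE sk_be32;   /* value kept in network byte order */

/* A lookup that takes its key as a plain 32-bit integer. */
uint32_t sk_lookup_key(uint32_t key)
{
    return key;
}

uint32_t sk_key_from_gateway(const sk_be32 *gw)
{
    assert(gw != NULL);
    /* The cast leaves the bits untouched; it only tells the checker that
     * mixing the restricted type with a plain integer is intentional. */
    return sk_lookup_key((SK_FORCE uint32_t)*gw);
}
```

    The cast performs no byte swap, so the key is the raw stored value.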
     
  • Include ipv4_rcv_saddr_equal() definition to avoid this sparse error :

    net/ipv4/udp.c:362:5: warning: symbol 'ipv4_rcv_saddr_equal' was not
    declared. Should it be static?

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • After commit 19689e38eca5 ("tcp: md5: use kmalloc() backed scratch
    areas") this function is no longer used.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • This patch converts the diag dumping code to use the rhashtable
    walk code instead of going through rhashtable by hand. The lock
    nl_table_lock is now only taken while we process the multicast
    list as it's not needed for the rhashtable walk.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

19 Aug, 2016

19 commits

  • As recently discussed during the task_under_cgroup_hierarchy() addition,
    we should get rid of the ifdefs surrounding the bpf_skb_under_cgroup()
    helper. If related functionality is not built-in, the helper cannot be
    used anyway, which is also in line with what we do for all other helpers.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • Follow-up to 555c8a8623a3 ("bpf: avoid stack copy and use skb ctx for
    event output") for also adding the event output helper for XDP typed
    programs. The event output helper has been very useful in particular for
    debugging or event notification purposes, since it's much faster and more
    flexible than regular trace printk due to programmatically being able to
    attach meta data. Same flags structure applies as with tc BPF programs.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • This work adds a bpf_skb_change_tail() helper for tc BPF programs. The
    basic idea is to expand or shrink the skb in a controlled manner. The
    eBPF program can then rewrite the rest via helpers like bpf_skb_store_bytes(),
    bpf_lX_csum_replace() and others rather than passing a raw buffer for
    writing here.

    bpf_skb_change_tail() is really a slow path helper, intended for
    replies such as ICMP control messages. The concept is similar to other
    helpers like bpf_skb_change_proto(): keep the helper free of protocol
    specifics and let the BPF program mangle the remaining parts. A flags
    field has been added and is reserved for now, should we extend the
    helper in the future.

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • Since we have a skb_pkt_type_ok() helper for checking the type before
    mangling, make use of it instead of open coding. Follow-up to commit
    8b10cab64c13 ("net: simplify and make pkt_type_ok() available for other
    users") that came in after d2485c4242a8 ("bpf: add bpf_skb_change_type
    helper").

    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • Add TIPC_NL_PEER_REMOVE netlink command. This command can remove
    an offline peer node from the internal data structures.

    This will be supported by the tipc user space tool in iproute2.

    Signed-off-by: Richard Alpe
    Reviewed-by: Jon Maloy
    Acked-by: Ying Xue
    Signed-off-by: David S. Miller

    Richard Alpe
     
  • Over the years, TCP BDP has increased a lot, and is typically
    in the order of ~10 Mbytes with help of clever Congestion Control
    modules.

    In the presence of packet losses, TCP stores incoming packets in an
    out-of-order queue, and the number of skbs sitting there waiting for the
    missing packets to be received can match the BDP (~10 Mbytes).

    In some cases, TCP needs to make room for incoming skbs, and current
    strategy can simply remove all skbs in the out of order queue as a last
    resort, incurring a huge penalty, both for receiver and sender.

    Unfortunately these 'last resort events' are quite frequent, forcing
    sender to send all packets again, stalling the flow and wasting a lot of
    resources.

    This patch cleans only a part of the out of order queue in order
    to meet the memory constraints.

    Signed-off-by: Eric Dumazet
    Cc: Neal Cardwell
    Cc: Yuchung Cheng
    Cc: Soheil Hassas Yeganeh
    Cc: C. Stephen Gun
    Cc: Van Jacobson
    Acked-by: Soheil Hassas Yeganeh
    Acked-by: Yuchung Cheng
    Acked-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Eric Dumazet
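
    The partial pruning described in the entry above can be sketched in
    userspace as follows. This is only an illustration under assumptions:
    the queue is modelled as an array sorted by sequence number, segments
    are dropped from the highest sequence numbers first, and all names
    (ofo_seg, ofo_queue, ofo_prune_partial) are hypothetical. The log itself
    does not say which end of the queue the kernel prunes.

```c
#include <assert.h>
#include <stddef.h>

struct ofo_seg {
    unsigned int seq;     /* starting sequence number of the segment */
    size_t truesize;      /* memory charged for this segment */
};

struct ofo_queue {
    struct ofo_seg *segs; /* sorted by ascending seq */
    size_t count;         /* number of queued segments */
    size_t mem_used;      /* sum of truesize over queued segments */
};

/* Drop segments from the tail only until mem_used fits under limit,
 * instead of emptying the whole queue. Returns the number dropped. */
size_t ofo_prune_partial(struct ofo_queue *q, size_t limit)
{
    size_t dropped = 0;

    assert(q != NULL);
    while (q->count > 0 && q->mem_used > limit) {
        q->count--;
        q->mem_used -= q->segs[q->count].truesize;
        dropped++;
    }
    return dropped;
}
```

    Segments that survive the prune do not have to be retransmitted by the
    sender, which is the saving the commit message is after.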
     
  • While chasing the tcp_xmit_retransmit_queue() kasan issue, I found
    that we could avoid reading the sacked field of an skb that we won't
    send, possibly removing one cache line miss.

    Very minor change in slow path, but why not ? ;)

    Signed-off-by: Eric Dumazet
    Acked-by: Neal Cardwell
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Use one of the vlan xstats padding fields to export the vlan flags. This is
    needed in order to be able to distinguish between master (bridge) and port
    vlan entries in user-space when dumping the bridge vlan stats.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • In the bridge driver we usually have the same function working for both
    port and bridge. In order to follow that logic and also avoid code
    duplication, consolidate the bridge_ and brport_ linkxstats calls into
    one since they share most of their code. As a side effect this allows us
    to dump the vlan stats also via the slave call which is in preparation for
    the upcoming per-port vlan stats and vlan flag dumping.

    Signed-off-by: Nikolay Aleksandrov
    Signed-off-by: David S. Miller

    Nikolay Aleksandrov
     
  • The current vlan push action supports only vid and protocol options.
    Add priority option.

    Example script that adds vlan push action with vid and
    priority:

    tc filter add dev veth0 protocol ip parent ffff: \
    flower \
    indev veth0 \
    action vlan push id 100 priority 5

    Signed-off-by: Hadar Hen Zion
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Hadar Hen Zion
     
  • Enhance flower to support 802.1Q vlan protocol classification.
    Currently, the supported fields are vlan_id and vlan_priority.

    Example:

    # add a flower filter with vlan id and priority classification
    tc filter add dev ens4f0 protocol 802.1Q parent ffff: \
    flower \
    indev ens4f0 \
    vlan_ethtype ipv4 \
    vlan_id 100 \
    vlan_prio 3 \
    action vlan pop

    Signed-off-by: Hadar Hen Zion
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Hadar Hen Zion
     
  • The current flower implementation checks the mask range and set all the
    keys included in that range as "used_keys", even if a specific key in
    the range has a zero mask.

    This behavior can cause a false positive return value of
    dissector_uses_key function and unnecessary dissection in
    __skb_flow_dissect.

    This patch checks explicitly the mask of each key and "used_keys" will
    be set accordingly.

    Fixes: 77b9900ef53a ('tc: introduce Flower classifier')
    Signed-off-by: Hadar Hen Zion
    Signed-off-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Hadar Hen Zion
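
    The per-key mask check described in the entry above might look roughly
    like this in userspace. It is a sketch, not the flower code: key_span,
    mask_span_is_set and collect_used_keys are hypothetical names, and the
    mask is modelled as a flat byte array in which each key occupies a
    fixed span.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct key_span {
    size_t offset;  /* where the key starts inside the mask */
    size_t len;     /* size of the key in bytes */
};

/* True if any byte of the given span of the mask is non-zero. */
bool mask_span_is_set(const unsigned char *mask, size_t offset, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (mask[offset + i] != 0)
            return true;
    return false;
}

/* Build a bitmap of used keys: bit i is set only when key i has a
 * non-zero mask, rather than whenever it lies inside the mask range. */
unsigned int collect_used_keys(const unsigned char *mask,
                               const struct key_span *keys, size_t nkeys)
{
    unsigned int used = 0;

    assert(mask != NULL && keys != NULL);
    assert(nkeys <= sizeof(used) * 8);
    for (size_t i = 0; i < nkeys; i++)
        if (mask_span_is_set(mask, keys[i].offset, keys[i].len))
            used |= 1u << i;
    return used;
}
```

    A key whose mask is all zeroes is left out of the bitmap, so a later
    "does the dissector use this key" test no longer returns a false
    positive for it.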
     
  • Add vlan priority check to the flow dissector by adding new flow
    dissector struct, flow_dissector_key_vlan which includes vlan tag
    fields.

    vlan_id and flow_label fields were under the same struct
    (flow_dissector_key_tags). It was a convenient setting since struct
    flow_dissector_key_tags is used by struct flow_keys and by setting
    vlan_id and flow_label under the same struct, we get precisely 24 or 48
    bytes in flow_keys from flow_dissector_key_basic.

    Now, when adding vlan priority support, the code will be cleaner if
    flow_label and vlan tag won't be under the same struct anymore.

    Signed-off-by: Hadar Hen Zion
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Hadar Hen Zion
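
    For readers unfamiliar with the vlan tag fields mentioned in the entry
    above: in an 802.1Q tag control information (TCI) word, the low 12 bits
    carry the vlan id and the top 3 bits carry the priority. The following
    sketch shows the split. The struct and function names are hypothetical
    and are not the kernel's flow_dissector_key_vlan, and the TCI is assumed
    to be in host byte order already.

```c
#include <assert.h>
#include <stdint.h>

#define SK_VLAN_VID_MASK   0x0fffu  /* low 12 bits: vlan id */
#define SK_VLAN_PRIO_SHIFT 13       /* top 3 bits: priority */

struct sk_vlan_key {
    uint16_t id;
    uint8_t priority;
};

/* Split a host-order TCI into vlan id and priority. */
struct sk_vlan_key sk_vlan_key_from_tci(uint16_t tci)
{
    struct sk_vlan_key key;

    key.id = (uint16_t)(tci & SK_VLAN_VID_MASK);
    key.priority = (uint8_t)(tci >> SK_VLAN_PRIO_SHIFT);
    return key;
}

/* Rebuild a TCI from id and priority; bit 12 is left as zero. */
uint16_t sk_vlan_tci_from_key(struct sk_vlan_key key)
{
    assert(key.id <= SK_VLAN_VID_MASK && key.priority <= 7);
    return (uint16_t)((key.priority << SK_VLAN_PRIO_SHIFT) | key.id);
}
```

    With vlan id 100 and priority 3, the values used in the flower example
    earlier in this log, the TCI is 0x6064.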
     
  • Early in the datapath, the skb_vlan_untag() function is called; it strips
    the vlan from the skb and sets the skb->vlan_tci and skb->vlan_proto fields.

    The current dissection doesn't handle stripped vlan packets correctly.
    In some flows, vlan doesn't exist in skb->data anymore when applying
    flow dissection on the skb, fix that.

    In case vlan info wasn't stripped before applying flow_dissector (RPS
    flow for example), or in case of skb with multiple vlans (e.g. 802.1ad),
    get the vlan info from skb->data. The flow_dissector correctly skips
    any number of vlans and stores only the first level vlan.

    Fixes: 0744dd00c1b1 ('net: introduce skb_flow_dissect()')
    Signed-off-by: Hadar Hen Zion
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Hadar Hen Zion
     
  • tc_dump_qdisc() performs dumping of the per-device qdiscs in two phases;
    first, the "standard" dev->qdisc is being dumped. Second, if there is/are
    ingress queue(s), they are being dumped as well.

    After conversion of netdevice's qdisc linked-list into hashtable, these
    two sets are no longer two disjoint sets/lists, but are both
    "reachable" directly from netdevice's hashtable. As a consequence, the
    "full-depth" dump of the ingress qdiscs results in immediately hitting the
    netdevice hashtable again, and duplicating the dump that has already been
    performed for dev->qdisc.
    What in fact needs to be dumped in case of ingress queue is "just" the
    top-level ingress qdisc, as everything else has been dumped already.

    Fix this by extending tc_dump_qdisc_root() in a way that it can be instructed
    whether it should (while performing the "full" per-netdev qdisc dump) perform
    the whole recursion, or just dump "additional" top-level (ingress) qdiscs
    without performing any kind of recursion.

    This fixes duplicate dumps such as

    qdisc mq 0: root
    qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
    qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
    qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
    qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
    qdisc clsact ffff: parent ffff:fff1
    qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
    qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
    qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
    qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1

    Fixes: 59cc1f61f ("net: sched: convert qdisc linked list to hashtable")
    Reported-by: Daniel Borkmann
    Tested-by: Daniel Borkmann
    Signed-off-by: Jiri Kosina
    Signed-off-by: David S. Miller

    Jiri Kosina
     
  • qdisc_match_from_root() is now iterating over per-netdevice qdisc
    hashtable instead of going through a linked-list of qdiscs (independently
    on the actual underlying netdev), which was the case before the switch to
    hashtable for qdiscs.

    For singleton qdiscs, there is no underlying netdev associated though, and
    therefore dumping a singleton qdisc will panic, as qdisc_dev(root) will
    always be NULL.

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000410
    IP: [] qdisc_match_from_root+0x2c/0x70
    PGD 1aceba067 PUD 1aceb7067 PMD 0
    Oops: 0000 [#1] PREEMPT SMP
    [ ... ]
    task: ffff8801ec996e00 task.stack: ffff8801ec934000
    RIP: 0010:[] [] qdisc_match_from_root+0x2c/0x70
    RSP: 0018:ffff8801ec937ab0 EFLAGS: 00010203
    RAX: 0000000000000408 RBX: ffff88025e612000 RCX: ffffffffffffffd8
    RDX: 0000000000000000 RSI: 00000000ffff0000 RDI: ffffffff81cf8100
    RBP: ffff8801ec937ab0 R08: 000000000001c160 R09: ffff8802668032c0
    R10: ffffffff81cf8100 R11: 0000000000000030 R12: 00000000ffff0000
    R13: ffff88025e612000 R14: ffffffff81cf3140 R15: 0000000000000000
    FS: 00007f24b9af6740(0000) GS:ffff88026f280000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000410 CR3: 00000001aceec000 CR4: 00000000001406e0
    Stack:
    ffff8801ec937ad0 ffffffff81681210 ffff88025dd51a00 00000000fffffff1
    ffff8801ec937b88 ffffffff81681e4e ffffffff81c42bc0 ffff880262431500
    ffffffff81cf3140 ffff88025dd51a10 ffff88025dd51a24 00000000ec937b38
    Call Trace:
    [] qdisc_lookup+0x40/0x50
    [] tc_modify_qdisc+0x21e/0x550
    [] rtnetlink_rcv_msg+0x95/0x220
    [] ? __kmalloc_track_caller+0x172/0x230
    [] ? rtnl_newlink+0x870/0x870
    [] netlink_rcv_skb+0xa7/0xc0
    [] rtnetlink_rcv+0x28/0x30
    [] netlink_unicast+0x15b/0x210
    [] netlink_sendmsg+0x319/0x390
    [] sock_sendmsg+0x38/0x50
    [] ___sys_sendmsg+0x256/0x260
    [] ? __pagevec_lru_add_fn+0x135/0x280
    [] ? pagevec_lru_move_fn+0xd0/0xf0
    [] ? trace_event_raw_event_mm_lru_insertion+0x180/0x180
    [] ? __lru_cache_add+0x75/0xb0
    [] ? _raw_spin_unlock+0x16/0x40
    [] ? handle_mm_fault+0x39f/0x1160
    [] __sys_sendmsg+0x45/0x80
    [] SyS_sendmsg+0x12/0x20
    [] do_syscall_64+0x57/0xb0

    Fix this by special-casing singleton qdiscs (those that don't have
    underlying netdevice) and introduce immediate handling of those rather
    than trying to go over an underlying netdevice. We're in the same
    situation in tc_dump_qdisc_root() and tc_dump_tclass_root().

    Ultimately, this will have to be slightly reworked so that we are actually
    able to show singleton qdiscs (noop) in the dump properly; but we're not
    currently doing that anyway, so no regression there, and better do this in
    a gradual manner.

    Fixes: 59cc1f61f ("net: sched: convert qdisc linked list to hashtable")
    Reported-by: Daniel Borkmann
    Tested-by: Daniel Borkmann
    Reported-by: David Ahern
    Tested-by: David Ahern
    Signed-off-by: Jiri Kosina
    Signed-off-by: David S. Miller

    Jiri Kosina
     
  • When an attempt is made to wake up a link after congestion, it uses a
    different, more generous criterion than when it was originally declared
    congested. This has the effect that the link, and the sending process,
    sometimes will be woken up unnecessarily, just to immediately return to
    congestion when it turns out there is not enough space in its send queue
    to host the pending message. This is a waste of CPU cycles.

    We now change the function link_prepare_wakeup() to use exactly the same
    criteria as tipc_link_xmit(). However, since we are now excluding the
    window limit from the wakeup calculation, and the current backlog limit
    for the lowest level is too small to house even a single maximum-size
    message, we have to expand this limit. We do this by evaluating an
    alternative, minimum value during the setting of the importance limits.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     
  • In commit 5b7066c3dd24 ("tipc: stricter filtering of packets in bearer
    layer") we introduced a method of filtering out messages while a bearer
    is being reset, to avoid that links may be re-created and come back in
    working state while we are still in the process of shutting them down.

    This solution works well, but is limited to only work with L2 media, which
    is insufficient with the increasing use of UDP as carrier media.

    We now replace this solution with a more generic one, by introducing a
    new flag "up" in the generic struct tipc_bearer. This field will be set
    and reset at the same locations as with the previous solution, while
    the packet filtering is moved to the generic code for the sending side.
    On the receiving side, the filtering is still done in media specific
    code, but now including the UDP bearer.

    Acked-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     
  • dev->name is a char array of IFNAMSIZ elements, hence can never be
    null, so the null pointer check is redundant. Remove it.

    Signed-off-by: Colin Ian King
    Signed-off-by: David S. Miller

    Colin Ian King
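
    The reasoning in the entry above can be shown with a small userspace
    sketch: an array member lives inside its struct, so its address is never
    NULL and only its contents can be empty. The struct and function names
    are hypothetical. Testing the first character is an assumption about
    what a caller might want; the commit itself only removes the redundant
    check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_IFNAMSIZ 16  /* same size as the kernel's IFNAMSIZ */

struct sk_netdev {
    char name[SK_IFNAMSIZ];
};

bool sk_dev_has_name(const struct sk_netdev *dev)
{
    assert(dev != NULL);
    /* dev->name is an array embedded in the struct: it can be empty,
     * but a NULL test on it could never be true. */
    return dev->name[0] != '\0';
}
```

    Only the pointer to the device itself can be NULL, which is why the
    sketch asserts on dev and not on dev->name.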
     

18 Aug, 2016

11 commits

  • Minor overlapping changes for both merge conflicts.

    Resolution work done by Stephen Rothwell was used
    as a reference.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Pull networking fixes from David Miller:

    1) Buffers powersave frame test is reversed in cfg80211, fix from Felix
    Fietkau.

    2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme.

    3) Fix some tg3 ethtool logic bugs, and one that would cause no
    interrupts to be generated when rx-coalescing is set to 0. From
    Satish Baddipadige and Siva Reddy Kallam.

    4) QLCNIC mailbox corruption and napi budget handling fix from Manish
    Chopra.

    5) Fix fib_trie logic when walking the trie during /proc/net/route
    output that can access a stale node pointer. From David Forster.

    6) Several sctp_diag fixes from Phil Sutter.

    7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel.

    8) Checksum fixup fixes in bpf from Daniel Borkmann.

    9) Memory leaks in nfnetlink, from Liping Zhang.

    10) Use after free in rxrpc, from David Howells.

    11) Use after free in new skb_array code of macvtap driver, from Jason
    Wang.

    12) Calipso resource leak, from Colin Ian King.

    13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang.

    14) Fix bpf non-linear packet write helpers, from Daniel Borkmann.

    15) Fix lockdep splats in macsec, from Sabrina Dubroca.

    16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF
    handling.

    17) Various tc-action bug fixes, from CONG Wang.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
    net_sched: allow flushing tc police actions
    net_sched: unify the init logic for act_police
    net_sched: convert tcf_exts from list to pointer array
    net_sched: move tc offload macros to pkt_cls.h
    net_sched: fix a typo in tc_for_each_action()
    net_sched: remove an unnecessary list_del()
    net_sched: remove the leftover cleanup_a()
    mlxsw: spectrum: Allow packets to be trapped from any PG
    mlxsw: spectrum: Unmap 802.1Q FID before destroying it
    mlxsw: spectrum: Add missing rollbacks in error path
    mlxsw: reg: Fix missing op field fill-up
    mlxsw: spectrum: Trap loop-backed packets
    mlxsw: spectrum: Add missing packet traps
    mlxsw: spectrum: Mark port as active before registering it
    mlxsw: spectrum: Create PVID vPort before registering netdevice
    mlxsw: spectrum: Remove redundant errors from the code
    mlxsw: spectrum: Don't return upon error in removal path
    i40e: check for and deal with non-contiguous TCs
    ixgbe: Re-enable ability to toggle VLAN filtering
    ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths
    ...

    Linus Torvalds
     
  • Adapt KCM to use the stream parser. This mostly involves removing
    the RX handling and setting up the strparser using the interface.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
     
  • This patch introduces a utility for parsing application layer protocol
    messages in a TCP stream. This is a generalization of the mechanism
    implemented in the Kernel Connection Multiplexor.

    The API includes a context structure, a set of callbacks, utility
    functions, and a data ready function.

    A stream parser instance is defined by a strparser structure that
    is bound to a TCP socket. The function to initialize the structure
    is:

    int strp_init(struct strparser *strp, struct sock *csk,
    struct strp_callbacks *cb);

    csk is the TCP socket being bound to and cb are the parser callbacks.

    The upper layer calls strp_tcp_data_ready when data is ready on the lower
    socket for strparser to process. This should be called from a data_ready
    callback that is set on the socket:

    void strp_tcp_data_ready(struct strparser *strp);

    A parser is bound to a TCP socket by setting data_ready function to
    strp_tcp_data_ready so that all receive indications on the socket
    go through the parser. This assumes that sk_user_data is set to
    the strparser structure.

    There are four callbacks.
    - parse_msg is called to parse the message (returns length or error).
    - rcv_msg is called when a complete message has been received
    - read_sock_done is called when data_ready function exits
    - abort_parser is called to abort the parser

    The input to parse_msg is an skbuff which contains the next message under
    construction. The backend processing of parse_msg will parse the
    application layer protocol headers to determine the length of
    the message in the stream. The possible return values are:

    >0 : indicates length of successfully parsed message
    0 : indicates more data must be received to parse the message
    -ESTRPIPE : current message should not be processed by the
        kernel, return control of the socket to userspace which
        can proceed to read the messages itself
    other < 0 : error in parsing, give control back to userspace
        assuming that synchronization is lost and the stream
        is unrecoverable (application expected to close TCP socket)

    In the case of an error return (< 0), strparser will stop the parser
    and report an error to userspace. The application must deal
    with the error. To handle the error, the strparser is unbound
    from the TCP socket. If the error indicates that the stream on the
    TCP socket is at a recoverable point (ESTRPIPE), then the application
    can read the TCP socket to process the stream. Once the application
    has dealt with the exceptions in the stream, it may again bind the
    socket to a strparser to continue data operations.

    Note that ENODATA may be returned to the application. In this case
    parse_msg returned -ESTRPIPE, however strparser was unable to maintain
    synchronization of the stream (i.e. some of the message in question
    was already read by the parser).

    strp_pause and strp_unpause are used to provide flow control. For
    instance, if rcv_msg is called but the upper layer can't immediately
    consume the message it can hold the message and pause strparser.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
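
    The parse_msg return convention described in the entry above can be
    illustrated with a userspace sketch for a made-up protocol whose
    messages start with a 2-byte big-endian payload length. Everything
    here is an assumption for illustration: the real callback receives an
    skbuff rather than a byte buffer, and sk_parse_msg, the header layout,
    the size cap and the choice of EMSGSIZE are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SK_HDR_LEN     2     /* 2-byte big-endian payload length */
#define SK_MAX_MSG_LEN 4096  /* arbitrary cap for this sketch */

/* Returns >0: total message length (header plus payload),
 *          0: more data is needed before the length is known,
 *         <0: the stream cannot be parsed. */
int sk_parse_msg(const unsigned char *buf, size_t avail)
{
    size_t total;

    assert(buf != NULL);
    if (avail < SK_HDR_LEN)
        return 0;
    total = SK_HDR_LEN + (((size_t)buf[0] << 8) | buf[1]);
    if (total > SK_MAX_MSG_LEN)
        return -EMSGSIZE;
    return (int)total;
}
```

    The positive return value is the length of the whole message, which may
    be larger than the number of bytes received so far.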
     
  • While commit 9c706a49d660 ("net: ipconfig: fix use after free") avoids
    the use after free, the resulting code still ends up calling both the
    ic_setup_if() and ic_setup_routes() after calling ic_close_devs(), and
    access to the device is still required.

    Move the call to ic_close_devs() to the very end of the function.

    Signed-off-by: Thierry Reding
    Signed-off-by: David S. Miller

    Thierry Reding
     
  • act_police uses its own code to walk the action
    hashtable, which means we could not flush
    standalone tc police actions, so just switch
    to tcf_generic_walker() like other actions.

    (Joint work from Roman and Cong.)

    Signed-off-by: Roman Mashak
    Signed-off-by: Cong Wang
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    Roman Mashak
     
  • Jamal reported a crash when we create a police action
    with a specific index. This is because the init logic
    is not correct: we should always create one for this
    case. Just unify the logic with other tc actions.

    Fixes: a03e6fe56971 ("act_police: fix a crash during removal")
    Reported-by: Jamal Hadi Salim
    Signed-off-by: Cong Wang
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    WANG Cong
     
  • As pointed out by Jamal, an action could be shared by
    multiple filters, so we can't use list to chain them
    any more after we get rid of the original tc_action.
    Instead, we could just save pointers to these actions
    in tcf_exts, since they are refcount'ed, so convert
    the list to an array of pointers.

    The "ugly" part is that the action API still accepts a list
    as a parameter, so I just introduce a helper function to
    convert the array of pointers to a list, instead of
    relying on the C99 feature to iterate the array.

    Fixes: a85a970af265 ("net_sched: move tc_action into tcf_common")
    Reported-by: Jamal Hadi Salim
    Cc: Jamal Hadi Salim
    Signed-off-by: Cong Wang
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    WANG Cong
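
    The helper mentioned in the entry above, which turns an array of action
    pointers into a list for an API that still expects a list, could be
    sketched in userspace like this. It uses a plain singly linked list and
    hypothetical names (sk_action, sk_actions_to_list), not the kernel's
    list_head or its actual helper.

```c
#include <assert.h>
#include <stddef.h>

struct sk_action {
    int index;
    struct sk_action *next;  /* link used only while the list is needed */
};

/* Chain the actions referenced by the array into a list, preserving
 * array order, and return the head (NULL for an empty array). */
struct sk_action *sk_actions_to_list(struct sk_action **actions, size_t n)
{
    struct sk_action *head = NULL;

    assert(actions != NULL || n == 0);
    /* Walk backwards so that the list order matches the array order. */
    for (size_t i = n; i > 0; i--) {
        actions[i - 1]->next = head;
        head = actions[i - 1];
    }
    return head;
}
```

    The array stays the long-lived owner of the pointers; the links are
    rewritten on every call, so they are only meaningful for the duration
    of one list-based operation.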
     
  • This list_del() for tc action is actually not needed,
    because we only use this list to chain bulk operations;
    it therefore should not be carried over to later operations.

    Fixes: ec0595cc4495 ("net_sched: get rid of struct tcf_common")
    Cc: Jamal Hadi Salim
    Signed-off-by: Cong Wang
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    WANG Cong
     
  • After refactoring tc_action into tcf_common, we no
    longer need to cleanup temporary "actions" in list,
    they are permanently stored in the hashtable.

    Fixes: a85a970af265 ("net_sched: move tc_action into tcf_common")
    Reported-by: Jamal Hadi Salim
    Cc: Jamal Hadi Salim
    Signed-off-by: Cong Wang
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    WANG Cong
     
  • Simon Wunderlich says:

    ====================
    pull request for net-next: batman-adv 2016-08-16

    This feature patchset is all about adding netlink support, which should
    supersede our debugfs configuration interface in the long run. It is
    especially necessary when batman-adv should be used in different
    namespaces, since debugfs can not differentiate between those.

    More specifically, the following changes are included:

    - Two fixes for namespace handling by Andrew Lunn, checking also the
    namespaces for parent interfaces, and suppress debugfs entries
    for non-default netns

    - Implement various netlink commands for the new interface, by
    Matthias Schiffer, Andrew Lunn, Sven Eckelmann and Simon Wunderlich
    (13 patches):
    * routing algorithm list
    * hardif list
    * translation tables (local and global)
    * TTVN for the translation tables
    * originator and neighbor tables for B.A.T.M.A.N. IV
    and B.A.T.M.A.N. V
    * gateway dump functionality for B.A.T.M.A.N. IV
    and B.A.T.M.A.N. V
    * Bridge Loop Avoidance claims, and corresponding BLA group
    * Bridge Loop Avoidance backbone tables

    - Finally, mark batman-adv as netns compatible, by Andrew Lunn (1 patch)
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

17 Aug, 2016

1 commit


16 Aug, 2016

2 commits

  • tipc_msg_create() can return a NULL skb and if so, we shouldn't try to
    call tipc_node_xmit_skb() on it.

    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    CPU: 3 PID: 30298 Comm: trinity-c0 Not tainted 4.7.0-rc7+ #19
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
    task: ffff8800baf09980 ti: ffff8800595b8000 task.ti: ffff8800595b8000
    RIP: 0010:[] [] tipc_node_xmit_skb+0x6b/0x140
    RSP: 0018:ffff8800595bfce8 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000003023b0e0
    RDX: 0000000000000000 RSI: dffffc0000000000 RDI: ffffffff83d12580
    RBP: ffff8800595bfd78 R08: ffffed000b2b7f32 R09: 0000000000000000
    R10: fffffbfff0759725 R11: 0000000000000000 R12: 1ffff1000b2b7f9f
    R13: ffff8800595bfd58 R14: ffffffff83d12580 R15: dffffc0000000000
    FS: 00007fcdde242700(0000) GS:ffff88011af80000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fcddde1db10 CR3: 000000006874b000 CR4: 00000000000006e0
    DR0: 00007fcdde248000 DR1: 00007fcddd73d000 DR2: 00007fcdde248000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000090602
    Stack:
    0000000000000018 0000000000000018 0000000041b58ab3 ffffffff83954208
    ffffffff830bb400 ffff8800595bfd30 ffffffff8309d767 0000000000000018
    0000000000000018 ffff8800595bfd78 ffffffff8309da1a 00000000810ee611
    Call Trace:
    [] tipc_shutdown+0x553/0x880
    [] SyS_shutdown+0x14b/0x170
    [] do_syscall_64+0x19c/0x410
    [] entry_SYSCALL64_slow_path+0x25/0x25
    Code: 90 00 b4 0b 83 c7 00 f1 f1 f1 f1 4c 8d 6d e0 c7 40 04 00 00 00 f4 c7 40 08 f3 f3 f3 f3 48 89 d8 48 c1 e8 03 c7 45 b4 00 00 00 00 3c 30 00 75 78 48 8d 7b 08 49 8d 75 c0 48 b8 00 00 00 00 00
    RIP [] tipc_node_xmit_skb+0x6b/0x140
    RSP
    ---[ end trace 57b0484e351e71f1 ]---

    I feel like we should maybe return -ENOMEM or -ENOBUFS, but I'm not sure
    userspace is equipped to handle that. Anyway, this is better than a GPF
    and looks somewhat consistent with other tipc_msg_create() callers.

    Signed-off-by: Vegard Nossum
    Acked-by: Ying Xue
    Acked-by: Jon Maloy
    Signed-off-by: David S. Miller

    Vegard Nossum
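
    The pattern behind the fix in the entry above, checking an allocating
    constructor's result before handing it on, can be sketched in userspace
    as follows. All names are hypothetical stand-ins for tipc_msg_create()
    and tipc_node_xmit_skb(), and allocation failure is simulated with a
    flag. Like the commit, the sketch skips the send silently instead of
    returning an error code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct sk_msg {
    int type;
};

/* Stand-in for a constructor that may fail and return NULL. */
struct sk_msg *sk_msg_create(int type, bool simulate_alloc_failure)
{
    struct sk_msg *m;

    if (simulate_alloc_failure)
        return NULL;
    m = malloc(sizeof(*m));
    if (m != NULL)
        m->type = type;
    return m;
}

/* Stand-in for the transmit path: consumes the message. */
void sk_xmit(struct sk_msg *m, int *sent)
{
    assert(m != NULL && sent != NULL);
    (*sent)++;
    free(m);
}

/* Returns true if a message was actually transmitted. */
bool sk_send_fin(bool simulate_alloc_failure, int *sent)
{
    struct sk_msg *m = sk_msg_create(1, simulate_alloc_failure);

    if (m == NULL)
        return false;  /* nothing to send; skip instead of crashing */
    sk_xmit(m, sent);
    return true;
}
```

    Returning a boolean keeps the failure visible to the caller in this
    sketch; whether userspace should see an error is the open question the
    commit message raises.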
     
  • Move the export of switchdev_port_same_parent_id() to be right
    below the function itself and not elsewhere.

    Signed-off-by: Or Gerlitz
    Reported-by: Ido Schimmel
    Signed-off-by: David S. Miller

    Or Gerlitz