20 Mar, 2016

1 commit

  • Pull networking updates from David Miller:
    "Highlights:

    1) Support more Realtek wireless chips, from Jes Sorenson.

    2) New BPF types for per-cpu hash and arrap maps, from Alexei
    Starovoitov.

    3) Make several TCP sysctls per-namespace, from Nikolay Borisov.

    4) Allow the use of SO_REUSEPORT in order to do per-thread processing
    of incoming TCP/UDP connections. The muxing can be done using a
    BPF program which hashes the incoming packet. From Craig Gallek.

    5) Add a multiplexer for TCP streams, to provide a messaged based
    interface. BPF programs can be used to determine the message
    boundaries. From Tom Herbert.

    6) Add 802.1AE MACSEC support, from Sabrina Dubroca.

    7) Avoid factorial complexity when taking down an inetdev interface
    with lots of configured addresses. We were doing things like
    traversing the entire address less for each address removed, and
    flushing the entire netfilter conntrack table for every address as
    well.

    8) Add and use SKB bulk free infrastructure, from Jesper Brouer.

    9) Allow offloading u32 classifiers to hardware, and implement for
    ixgbe, from John Fastabend.

    10) Allow configuring IRQ coalescing parameters on a per-queue basis,
    from Kan Liang.

    11) Extend ethtool so that larger link mode masks can be supported.
    From David Decotigny.

    12) Introduce devlink, which can be used to configure port link types
    (ethernet vs Infiniband, etc.), port splitting, and switch device
    level attributes as a whole. From Jiri Pirko.

    13) Hardware offload support for flower classifiers, from Amir Vadai.

    14) Add "Local Checksum Offload". Basically, for a tunneled packet
    the checksum of the outer header is 'constant' (because with the
    checksum field filled into the inner protocol header, the payload
    of the outer frame checksums to 'zero'), and we can take advantage
    of that in various ways. From Edward Cree"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1548 commits)
    bonding: fix bond_get_stats()
    net: bcmgenet: fix dma api length mismatch
    net/mlx4_core: Fix backward compatibility on VFs
    phy: mdio-thunder: Fix some Kconfig typos
    lan78xx: add ndo_get_stats64
    lan78xx: handle statistics counter rollover
    RDS: TCP: Remove unused constant
    RDS: TCP: Add sysctl tunables for sndbuf/rcvbuf on rds-tcp socket
    net: smc911x: convert pxa dma to dmaengine
    team: remove duplicate set of flag IFF_MULTICAST
    bonding: remove duplicate set of flag IFF_MULTICAST
    net: fix a comment typo
    ethernet: micrel: fix some error codes
    ip_tunnels, bpf: define IP_TUNNEL_OPTS_MAX and use it
    bpf, dst: add and use dst_tclassid helper
    bpf: make skb->tc_classid also readable
    net: mvneta: bm: clarify dependencies
    cls_bpf: reset class and reuse major in da
    ldmvsw: Checkpatch sunvnet.c and sunvnet_common.c
    ldmvsw: Add ldmvsw.c driver code
    ...

    Linus Torvalds
     

14 Mar, 2016

1 commit


05 Mar, 2016

1 commit


04 Mar, 2016

12 commits


27 Jan, 2016

1 commit


13 Jan, 2016

1 commit

  • Pull networking updates from Davic Miller:

    1) Support busy polling generically, for all NAPI drivers. From Eric
    Dumazet.

    2) Add byte/packet counter support to nft_ct, from Floriani Westphal.

    3) Add RSS/XPS support to mvneta driver, from Gregory Clement.

    4) Implement IPV6_HDRINCL socket option for raw sockets, from Hannes
    Frederic Sowa.

    5) Add support for T6 adapter to cxgb4 driver, from Hariprasad Shenai.

    6) Add support for VLAN device bridging to mlxsw switch driver, from
    Ido Schimmel.

    7) Add driver for Netronome NFP4000/NFP6000, from Jakub Kicinski.

    8) Provide hwmon interface to mlxsw switch driver, from Jiri Pirko.

    9) Reorganize wireless drivers into per-vendor directories just like we
    do for ethernet drivers. From Kalle Valo.

    10) Provide a way for administrators "destroy" connected sockets via the
    SOCK_DESTROY socket netlink diag operation. From Lorenzo Colitti.

    11) Add support to add/remove multicast routes via netlink, from Nikolay
    Aleksandrov.

    12) Make TCP keepalive settings per-namespace, from Nikolay Borisov.

    13) Add forwarding and packet duplication facilities to nf_tables, from
    Pablo Neira Ayuso.

    14) Dead route support in MPLS, from Roopa Prabhu.

    15) TSO support for thunderx chips, from Sunil Goutham.

    16) Add driver for IBM's System i/p VNIC protocol, from Thomas Falcon.

    17) Rationalize, consolidate, and more completely document the checksum
    offloading facilities in the networking stack. From Tom Herbert.

    18) Support aborting an ongoing scan in mac80211/cfg80211, from
    Vidyullatha Kanchanapally.

    19) Use per-bucket spinlock for bpf hash facility, from Tom Leiming.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1375 commits)
    net: bnxt: always return values from _bnxt_get_max_rings
    net: bpf: reject invalid shifts
    phonet: properly unshare skbs in phonet_rcv()
    dwc_eth_qos: Fix dma address for multi-fragment skbs
    phy: remove an unneeded condition
    mdio: remove an unneed condition
    mdio_bus: NULL dereference on allocation error
    net: Fix typo in netdev_intersect_features
    net: freescale: mac-fec: Fix build error from phy_device API change
    net: freescale: ucc_geth: Fix build error from phy_device API change
    bonding: Prevent IPv6 link local address on enslaved devices
    IB/mlx5: Add flow steering support
    net/mlx5_core: Export flow steering API
    net/mlx5_core: Make ipv4/ipv6 location more clear
    net/mlx5_core: Enable flow steering support for the IB driver
    net/mlx5_core: Initialize namespaces only when supported by device
    net/mlx5_core: Set priority attributes
    net/mlx5_core: Connect flow tables
    net/mlx5_core: Introduce modify flow table command
    net/mlx5_core: Managing root flow table
    ...

    Linus Torvalds
     

04 Jan, 2016

1 commit


04 Dec, 2015

1 commit


02 Dec, 2015

1 commit

  • This patch is a cleanup to make following patch easier to
    review.

    Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
    from (struct socket)->flags to a (struct socket_wq)->flags
    to benefit from RCU protection in sock_wake_async()

    To ease backports, we rename both constants.

    Two new helpers, sk_set_bit(int nr, struct sock *sk)
    and sk_clear_bit(int net, struct sock *sk) are added so that
    following patch can change their implementation.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

01 Dec, 2015

1 commit

  • The memory barrier in the helper wq_has_sleeper is needed by just
    about every user of waitqueue_active. This patch generalises it
    by making it take a wait_queue_head_t directly. The existing
    helper is renamed to skwq_has_sleeper.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

25 Nov, 2015

1 commit

  • Normally, the transmit phase of a client call is implicitly ack'd by the
    reception of the first data packet of the response being received.
    However, if a security negotiation happens, the transmit phase, if it is
    entirely contained in a single packet, may get an ack packet in response
    and then may get aborted due to security negotiation failure.

    Because the client has shifted state to RXRPC_CALL_CLIENT_AWAIT_REPLY due
    to having transmitted all the data, the code that handles processing of the
    received ack packet doesn't note the hard ack the data packet.

    The following abort packet in the case of security negotiation failure then
    incurs an assertion failure when it tries to drain the Tx queue because the
    hard ack state is out of sync (hard ack means the packets have been
    processed and can be discarded by the sender; a soft ack means that the
    packets are received but could still be discarded and rerequested by the
    receiver).

    To fix this, we should record the hard ack we received for the ack packet.

    The assertion failure looks like:

    RxRPC: Assertion failed
    1 ] [] rxrpc_rotate_tx_window+0xbc/0x131 [af_rxrpc]
    ...

    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

07 Nov, 2015

1 commit

  • …d avoiding waking kswapd

    __GFP_WAIT has been used to identify atomic context in callers that hold
    spinlocks or are in interrupts. They are expected to be high priority and
    have access one of two watermarks lower than "min" which can be referred
    to as the "atomic reserve". __GFP_HIGH users get access to the first
    lower watermark and can be called the "high priority reserve".

    Over time, callers had a requirement to not block when fallback options
    were available. Some have abused __GFP_WAIT leading to a situation where
    an optimisitic allocation with a fallback option can access atomic
    reserves.

    This patch uses __GFP_ATOMIC to identify callers that are truely atomic,
    cannot sleep and have no alternative. High priority users continue to use
    __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep and
    are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM to identify
    callers that want to wake kswapd for background reclaim. __GFP_WAIT is
    redefined as a caller that is willing to enter direct reclaim and wake
    kswapd for background reclaim.

    This patch then converts a number of sites

    o __GFP_ATOMIC is used by callers that are high priority and have memory
    pools for those requests. GFP_ATOMIC uses this flag.

    o Callers that have a limited mempool to guarantee forward progress clear
    __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
    into this category where kswapd will still be woken but atomic reserves
    are not used as there is a one-entry mempool to guarantee progress.

    o Callers that are checking if they are non-blocking should use the
    helper gfpflags_allow_blocking() where possible. This is because
    checking for __GFP_WAIT as was done historically now can trigger false
    positives. Some exceptions like dm-crypt.c exist where the code intent
    is clearer if __GFP_DIRECT_RECLAIM is used instead of the helper due to
    flag manipulations.

    o Callers that built their own GFP flags instead of starting with GFP_KERNEL
    and friends now also need to specify __GFP_KSWAPD_RECLAIM.

    The first key hazard to watch out for is callers that removed __GFP_WAIT
    and was depending on access to atomic reserves for inconspicuous reasons.
    In some cases it may be appropriate for them to use __GFP_HIGH.

    The second key hazard is callers that assembled their own combination of
    GFP flags instead of starting with something like GFP_KERNEL. They may
    now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless
    if it's missed in most cases as other activity will wake kswapd.

    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Vitaly Wool <vitalywool@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Mel Gorman
     

06 Nov, 2015

1 commit

  • Pull security subsystem update from James Morris:
    "This is mostly maintenance updates across the subsystem, with a
    notable update for TPM 2.0, and addition of Jarkko Sakkinen as a
    maintainer of that"

    * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (40 commits)
    apparmor: clarify CRYPTO dependency
    selinux: Use a kmem_cache for allocation struct file_security_struct
    selinux: ioctl_has_perm should be static
    selinux: use sprintf return value
    selinux: use kstrdup() in security_get_bools()
    selinux: use kmemdup in security_sid_to_context_core()
    selinux: remove pointless cast in selinux_inode_setsecurity()
    selinux: introduce security_context_str_to_sid
    selinux: do not check open perm on ftruncate call
    selinux: change CONFIG_SECURITY_SELINUX_CHECKREQPROT_VALUE default
    KEYS: Merge the type-specific data with the payload data
    KEYS: Provide a script to extract a module signature
    KEYS: Provide a script to extract the sys cert list from a vmlinux file
    keys: Be more consistent in selection of union members used
    certs: add .gitignore to stop git nagging about x509_certificate_list
    KEYS: use kvfree() in add_key
    Smack: limited capability for changing process label
    TPM: remove unnecessary little endian conversion
    vTPM: support little endian guests
    char: Drop owner assignment from i2c_driver
    ...

    Linus Torvalds
     

21 Oct, 2015

1 commit

  • Merge the type-specific data with the payload data into one four-word chunk
    as it seems pointless to keep them separate.

    Use user_key_payload() for accessing the payloads of overloaded
    user-defined keys.

    Signed-off-by: David Howells
    cc: linux-cifs@vger.kernel.org
    cc: ecryptfs@vger.kernel.org
    cc: linux-ext4@vger.kernel.org
    cc: linux-f2fs-devel@lists.sourceforge.net
    cc: linux-nfs@vger.kernel.org
    cc: ceph-devel@vger.kernel.org
    cc: linux-ima-devel@lists.sourceforge.net

    David Howells
     

21 Sep, 2015

1 commit


11 May, 2015

2 commits


12 Apr, 2015

2 commits


01 Apr, 2015

4 commits

  • Handle VERSION Rx protocol packets. We should respond to a VERSION packet
    with a string indicating the Rx version. This is a maximum of 64 characters
    and is padded out to 65 chars with NUL bytes.

    Note that other AFS clients use the version request as a NAT keepalive so we
    need to handle it rather than returning an abort.

    The standard formulation seems to be:

    built --

    for example:

    " OpenAFS 1.6.2 built 2013-05-07 "

    (note the three extra spaces) as obtained with:

    rxdebug grand.mit.edu -version

    from the openafs package.

    Signed-off-by: David Howells

    David Howells
     
  • Use iov_iter_count() in rxrpc_send_data() to get the remaining data length
    instead of using the len argument as the len argument is now redundant.

    Signed-off-by: David Howells

    David Howells
     
  • Don't call skb_add_data() in rxrpc_send_data() if there's no data to copy and
    also skip the calculations associated with it in such a case.

    Signed-off-by: David Howells

    David Howells
     
  • This commit:

    commit af2b040e470b470bfc881981db3c796072853eae
    Author: Al Viro
    Date: Thu Nov 27 21:44:24 2014 -0500
    Subject: rxrpc: switch rxrpc_send_data() to iov_iter primitives

    incorrectly changes a do-while loop into a while loop in rxrpc_send_data().

    Unfortunately, at least one pass through the loop is required - even if
    there is no data - so that the packet the closes the send phase can be
    sent if MSG_MORE is not set.

    Signed-off-by: David Howells

    David Howells
     

21 Mar, 2015

1 commit

  • Conflicts:
    drivers/net/ethernet/emulex/benet/be_main.c
    net/core/sysctl_net_core.c
    net/ipv4/inet_diag.c

    The be_main.c conflict resolution was really tricky. The conflict
    hunks generated by GIT were very unhelpful, to say the least. It
    split functions in half and moved them around, when the real actual
    conflict only existed solely inside of one function, that being
    be_map_pci_bars().

    So instead, to resolve this, I checked out be_main.c from the top
    of net-next, then I applied the be_main.c changes from 'net' since
    the last time I merged. And this worked beautifully.

    The inet_diag.c and sysctl_net_core.c conflicts were simple
    overlapping changes, and were easily to resolve.

    Signed-off-by: David S. Miller

    David S. Miller
     

16 Mar, 2015

1 commit

  • [I would really like an ACK on that one from dhowells; it appears to be
    quite straightforward, but...]

    MSG_PEEK isn't passed to ->recvmsg() via msg->msg_flags; as the matter of
    fact, neither the kernel users of rxrpc, nor the syscalls ever set that bit
    in there. It gets passed via flags; in fact, another such check in the same
    function is done correctly - as flags & MSG_PEEK.

    It had been that way (effectively disabled) for 8 years, though, so the patch
    needs beating up - that case had never been tested. If it is correct, it's
    -stable fodder.

    Signed-off-by: Al Viro
    Signed-off-by: David S. Miller

    Al Viro
     

10 Mar, 2015

1 commit


09 Mar, 2015

1 commit

  • When reading from the error queue, msg_name and msg_control are only
    populated for some errors. A new exception for empty timestamp skbs
    added a false positive on icmp errors without payload.

    `traceroute -M udpconn` only displayed gateways that return payload
    with the icmp error: the embedded network headers are pulled before
    sock_queue_err_skb, leaving an skb with skb->len == 0 otherwise.

    Fix this regression by refining when msg_name and msg_control
    branches are taken. The solutions for the two fields are independent.

    msg_name only makes sense for errors that configure serr->port and
    serr->addr_offset. Test the first instead of skb->len. This also fixes
    another issue. saddr could hold the wrong data, as serr->addr_offset
    is not initialized in some code paths, pointing to the start of the
    network header. It is only valid when serr->port is set (non-zero).

    msg_control support differs between IPv4 and IPv6. IPv4 only honors
    requests for ICMP and timestamps with SOF_TIMESTAMPING_OPT_CMSG. The
    skb->len test can simply be removed, because skb->dev is also tested
    and never true for empty skbs. IPv6 honors requests for all errors
    aside from local errors and timestamps on empty skbs.

    In both cases, make the policy more explicit by moving this logic to
    a new function that decides whether to process msg_control and that
    optionally prepares the necessary fields in skb->cb[]. After this
    change, the IPv4 and IPv6 paths are more similar.

    The last case is rxrpc. Here, simply refine to only match timestamps.

    Fixes: 49ca0d8bfaf3 ("net-timestamp: no-payload option")

    Reported-by: Jan Niehusmann
    Signed-off-by: Willem de Bruijn

    ----

    Changes
    v1->v2
    - fix local origin test inversion in ip6_datagram_support_cmsg
    - make v4 and v6 code paths more similar by introducing analogous
    ipv4_datagram_support_cmsg
    - fix compile bug in rxrpc
    Signed-off-by: David S. Miller

    Willem de Bruijn
     

04 Mar, 2015

1 commit


03 Mar, 2015

1 commit

  • After TIPC doesn't depend on iocb argument in its internal
    implementations of sendmsg() and recvmsg() hooks defined in proto
    structure, no any user is using iocb argument in them at all now.
    Then we can drop the redundant iocb argument completely from kinds of
    implementations of both sendmsg() and recvmsg() in the entire
    networking stack.

    Cc: Christoph Hellwig
    Suggested-by: Al Viro
    Signed-off-by: Ying Xue
    Signed-off-by: David S. Miller

    Ying Xue