31 Dec, 2014

2 commits

  • This reverts commit 24a0aa212ee2dbe44360288684478d76a8e20a0a.

    It's causing severe userspace breakage. Namely, all the utilities from
    wireless-utils which are relying on CONFIG_WEXT (which means tools like
    'iwconfig', 'iwlist', etc) are not working anymore. There is a 'iw'
    utility in newer wireless-tools, which is supposed to be a replacement
    for all the "deprecated" binaries, but it's far away from being
    massively adopted.

    Please see [1] for example of the userspace breakage this is causing.

    In addition, Larry Finger reports [2] that this patch also makes
    the ipw2200 driver impossible to build.

    To me this clearly shows that CONFIG_WEXT is far, far away from being
    "deprecated enough" to be removed.

    [1] http://thread.gmane.org/gmane.linux.kernel/1857010
    [2] http://thread.gmane.org/gmane.linux.network/343688

    Signed-off-by: Jiri Kosina
    Signed-off-by: Linus Torvalds

    Jiri Kosina
     
  • Pull networking fixes from David Miller:

    1) Fix double SKB free in bluetooth 6lowpan layer, from Jukka Rissanen.

    2) Fix receive checksum handling in enic driver, from Govindarajulu
    Varadarajan.

    3) Fix NAPI poll list corruption in virtio_net and caif_virtio, from
    Herbert Xu. Also, add code to detect drivers that have this mistake
    in the future.

    4) Fix doorbell endianness handling in mlx4 driver, from Amir Vadai.

    5) Don't clobber IP6CB() before xfrm6_policy_check() is called in TCP
    input path, from Nicolas Dichtel.

    6) Fix MPLS action validation in openvswitch, from Pravin B Shelar.

    7) Fix double SKB free in vxlan driver, also from Pravin.

    8) When we scrub a packet, which happens when we are switching the
    context of the packet (namespace, etc.), we should reset the
    secmark. From Thomas Graf.

    9) ->ndo_gso_check() needs to do more than return true/false, it also
    has to allow the driver to clear netdev feature bits in order for
    the caller to be able to proceed properly. From Jesse Gross.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits)
    genetlink: A genl_bind() to an out-of-range multicast group should not WARN().
    netlink/genetlink: pass network namespace to bind/unbind
    ne2k-pci: Add pci_disable_device in error handling
    bonding: change error message to debug message in __bond_release_one()
    genetlink: pass multicast bind/unbind to families
    netlink: call unbind when releasing socket
    netlink: update listeners directly when removing socket
    genetlink: pass only network namespace to genl_has_listeners()
    netlink: rename netlink_unbind() to netlink_undo_bind()
    net: Generalize ndo_gso_check to ndo_features_check
    net: incorrect use of init_completion fixup
    neigh: remove next ptr from struct neigh_table
    net: xilinx: Remove unnecessary temac_property in the driver
    net: phy: micrel: use generic config_init for KSZ8021/KSZ8031
    net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding
    openvswitch: fix odd_ptr_err.cocci warnings
    Bluetooth: Fix accepting connections when not using mgmt
    Bluetooth: Fix controller configuration with HCI_QUIRK_INVALID_BDADDR
    brcmfmac: Do not crash if platform data is not populated
    ipw2200: select CFG80211_WEXT
    ...

    Linus Torvalds
     

30 Dec, 2014

1 commit


27 Dec, 2014

9 commits

  • Netlink families can exist in multiple namespaces, and for the most
    part multicast subscriptions are per network namespace. Thus it only
    makes sense to have bind/unbind notifications per network namespace.

    To achieve this, pass the network namespace of a given client socket
    to the bind/unbind functions.

    Also do this in generic netlink, and there also make sure that any
    bind for multicast groups that only exist in init_net is rejected.
    This isn't really a problem if it is accepted since a client in a
    different namespace will never receive any notifications from such
    a group, but it can confuse the family if not rejected. (It's also
    possible to accept it silently, without telling the family, but it
    would then also have to be ignored on unbind so that families that
    take any kind of action on bind/unbind won't do unnecessary work for
    invalid clients like that.)

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • To make the newly fixed multicast bind/unbind functionality
    usable in generic netlink, pass the bind/unbind calls down to
    the appropriate family.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Currently, netlink_unbind() is only called when the socket
    explicitly unbinds, which limits its usefulness (luckily
    there are no users of it yet anyway.)

    Call netlink_unbind() also when a socket is released, so it
    becomes possible to track listeners with this callback and
    without also implementing a netlink notifier (and checking
    netlink_has_listeners() in there.)

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • The code is now confusing to read - first in one function down
    (netlink_remove) any group subscriptions are implicitly removed
    by calling __sk_del_bind_node(), but the subscriber database is
    only updated far later by calling netlink_update_listeners().

    Move the latter call to just after removal from the list so it
    is easier to follow the code.

    This also enables moving the locking inside the kernel-socket
    conditional, which improves the normal socket destruction path.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • There's no point to force the caller to know about the internal
    genl_sock to use inside struct net, just have them pass the network
    namespace. This doesn't really change code generation since it's
    an inline, but makes the caller less magic - there's never any
    reason to pass another socket.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • The new name is more expressive - this isn't a generic unbind
    function but rather only a little undo helper for use only in
    netlink_bind().

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Johan Hedberg says:

    ====================
    Here's one more bluetooth pull request for 3.19. We've got two fixes:

    - Fix for accepting connections with old user space versions of BlueZ
    - Fix for Bluetooth controllers that don't have a public address

    Both of these are regressions that were introduced in 3.17, so the
    appropriate Cc: stable annotations are provided.

    Please let me know if there are any issues pulling. Thanks.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • GSO isn't the only offload feature with restrictions that
    potentially can't be expressed with the current features mechanism.
    Checksum is another although it's a general issue that could in
    theory apply to anything. Even if it may be possible to
    implement these restrictions in other ways, it can result in
    duplicate code or inefficient per-packet behavior.

    This generalizes ndo_gso_check so that drivers can remove any
    features that don't make sense for a given packet, similar to
    netif_skb_features(). It also converts existing driver
    restrictions to the new format, completing the work that was
    done to support tunnel protocols since the issues apply to
    checksums as well.

    By actually removing features from the set that are used to do
    offloading, it solves another problem with the existing
    interface. In these cases, GSO would run with the original set
    of features and not do anything because it appears that
    segmentation is not required.

    CC: Tom Herbert
    CC: Joe Stringer
    CC: Eric Dumazet
    CC: Hayes Wang
    Signed-off-by: Jesse Gross
    Acked-by: Tom Herbert
    Fixes: 04ffcb255f22 ("net: Add ndo_gso_check")
    Tested-by: Hayes Wang
    Signed-off-by: David S. Miller

    Jesse Gross
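
A minimal userspace sketch of the interface change described in this commit, under a simplified model. `features_t`, the `FEAT_*` bits, `struct pkt`, and `MAX_INNER_HDR` are hypothetical stand-ins made up for illustration, not the kernel's definitions. It contrasts a yes/no check with a callback that returns the feature set with unsupported bits cleared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t features_t;        /* stand-in for a feature bitmask */

#define FEAT_CSUM (1ULL << 0)       /* hypothetical: checksum offload */
#define FEAT_GSO  (1ULL << 1)       /* hypothetical: segmentation offload */
#define MAX_INNER_HDR 64u           /* hypothetical hardware limit */

struct pkt {                        /* only the fields this sketch inspects */
    bool encapsulated;
    unsigned int inner_hdr_len;
};

/* Boolean style: can only say whether GSO offload is acceptable. */
bool gso_check(const struct pkt *p)
{
    assert(p != NULL);
    return !(p->encapsulated && p->inner_hdr_len > MAX_INNER_HDR);
}

/* Feature style: clear every offload bit the device cannot honor
 * for this packet and hand the reduced set back to the caller. */
features_t features_check(const struct pkt *p, features_t features)
{
    assert(p != NULL);
    if (p->encapsulated && p->inner_hdr_len > MAX_INNER_HDR)
        features &= ~(FEAT_CSUM | FEAT_GSO);
    return features;
}
```

Because the second form returns the reduced mask, the caller works from the same feature set that the later offload steps will see, which is the property the commit message relies on.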
     
  • When using VXLAN tunnels and a sky2 device, I have experienced
    checksum failures of the following type:

    [ 4297.761899] eth0: hw csum failure
    [...]
    [ 4297.765223] Call Trace:
    [ 4297.765224] [] dump_stack+0x46/0x58
    [ 4297.765235] [] netdev_rx_csum_fault+0x42/0x50
    [ 4297.765238] [] ? skb_push+0x40/0x40
    [ 4297.765240] [] __skb_checksum_complete+0xbc/0xd0
    [ 4297.765243] [] tcp_v4_rcv+0x2e2/0x950
    [ 4297.765246] [] ? ip_rcv_finish+0x360/0x360

    These are reliably reproduced in a network topology of:

    container:eth0 == host(OVS VXLAN on VLAN) == bond0 == eth0 (sky2) -> switch

    When VXLAN encapsulated traffic is received from a similarly
    configured peer, the above warning is generated in the receive
    processing of the encapsulated packet. Note that the warning is
    associated with the container eth0.

    The skbs from sky2 have ip_summed set to CHECKSUM_COMPLETE, and
    because the packet is an encapsulated Ethernet frame, the checksum
    generated by the hardware includes the inner protocol and Ethernet
    headers.

    The receive code is careful to update the skb->csum, except in
    __dev_forward_skb, as called by dev_forward_skb. __dev_forward_skb
    calls eth_type_trans, which in turn calls skb_pull_inline(skb, ETH_HLEN)
    to skip over the Ethernet header, but does not update skb->csum when
    doing so.

    This patch resolves the problem by adding a call to
    skb_postpull_rcsum to update the skb->csum after the call to
    eth_type_trans.

    Signed-off-by: Jay Vosburgh
    Signed-off-by: David S. Miller

    Jay Vosburgh
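
The effect of the missing update can be illustrated with a small userspace model of ones'-complement arithmetic. This is only a sketch: `fold16`, `ocsum`, and `ocsum_sub` are hypothetical helpers written for this example, not the kernel's checksum routines. It shows that after the 14-byte Ethernet header is pulled, the sum over the whole frame matches the remaining data only once the header's contribution is subtracted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones'-complement sum. */
uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Ones'-complement sum over big-endian 16-bit words; len must be even. */
uint16_t ocsum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    assert(len % 2 == 0);
    for (size_t i = 0; i < len; i += 2)
        sum += (uint32_t)((buf[i] << 8) | buf[i + 1]);
    return fold16(sum);
}

/* a - b in ones'-complement arithmetic: add the complement of b. */
uint16_t ocsum_sub(uint16_t a, uint16_t b)
{
    return fold16((uint32_t)a + (uint16_t)~b);
}
```

With a frame made of a 14-byte header followed by payload, `ocsum_sub(ocsum(frame, total), ocsum(frame, 14))` gives the sum over the payload alone, which is the kind of adjustment the commit adds after the header is skipped.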
     

25 Dec, 2014

3 commits

  • net/openvswitch/vport-gre.c:188:5-11: inconsistent IS_ERR and PTR_ERR, PTR_ERR on line 189

    PTR_ERR should access the value just tested by IS_ERR

    Semantic patch information:
    There can be false positives in the patch case, where it is the call
    IS_ERR that is wrong.

    Generated by: scripts/coccinelle/tests/odd_ptr_err.cocci

    CC: Pravin B Shelar
    Signed-off-by: Fengguang Wu
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Wu Fengguang
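
The pattern flagged here can be shown with a userspace imitation of error pointers. The helpers `err_ptr`, `is_err`, and `ptr_err` below are modeled loosely on the kernel idiom but are assumptions of this sketch, as are `lookup` and the two `report_*` functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
void *err_ptr(long err)
{
    assert(err < 0 && err >= -MAX_ERRNO);
    return (void *)(intptr_t)err;
}

/* True if the pointer value lies in the reserved error range. */
int is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
long ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Hypothetical lookup that either succeeds or fails with -ENOENT. */
void *lookup(int ok)
{
    static int obj;
    return ok ? (void *)&obj : err_ptr(-ENOENT);
}

/* Inconsistent: tests `a` but extracts the error from `b`. */
long report_mismatched(const void *a, const void *b)
{
    return is_err(a) ? ptr_err(b) : 0;
}

/* Consistent: the tested pointer is the one decoded. */
long report_consistent(const void *a)
{
    return is_err(a) ? ptr_err(a) : 0;
}
```

In the mismatched version the returned value is whatever `b` happens to hold, so a valid pointer can be reported as if it were an error code.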
     
  • When connectable mode is enabled (page scan on) through some non-mgmt
    method the HCI_CONNECTABLE flag will not be set. For backwards
    compatibility with user space versions not using mgmt we should not
    require HCI_CONNECTABLE to be set if HCI_MGMT is not set.

    Reported-by: Pali Rohár
    Tested-by: Pali Rohár
    Signed-off-by: Johan Hedberg
    Signed-off-by: Marcel Holtmann
    Cc: stable@vger.kernel.org # 3.17+

    Johan Hedberg
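
The rule in this commit reduces to a small piece of boolean logic. The sketch below is a simplification with hypothetical flag names and functions. It only models the two flags mentioned in the message and ignores any other condition the real code may consult.

```c
#include <assert.h>
#include <stdbool.h>

#define FLAG_MGMT        (1u << 0)  /* hypothetical: mgmt interface in use */
#define FLAG_CONNECTABLE (1u << 1)  /* hypothetical: connectable set via mgmt */

/* Strict rule: the connectable flag is always required. */
bool accept_strict(unsigned int flags)
{
    return (flags & FLAG_CONNECTABLE) != 0;
}

/* Compatible rule: only enforce the flag when mgmt is in use. */
bool accept_compat(unsigned int flags)
{
    /* This sketch knows only the two flags above. */
    assert((flags & ~(FLAG_MGMT | FLAG_CONNECTABLE)) == 0);
    if (!(flags & FLAG_MGMT))
        return true;
    return (flags & FLAG_CONNECTABLE) != 0;
}
```

A device driven by non-mgmt user space has neither flag set, so the strict rule rejects it while the compatible rule accepts it.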
     
  • When controllers set the HCI_QUIRK_INVALID_BDADDR flag, it is required
    by userspace to program a valid public Bluetooth device address into
    the controller before it can be used.

    After successful address configuration, the internal state changes and
    the controller runs the complete initialization procedure. However one
    small difference is that this is no longer the HCI_SETUP stage. The
    HCI_SETUP stage is only valid during initial controller setup. In this
    case the stack runs the initialization as part of the HCI_CONFIG stage.

    The controller version information, default name and supported commands
    are only stored during HCI_SETUP. While this information is static,
    it is not read initially when HCI_QUIRK_INVALID_BDADDR is set. So
    when running in HCI_CONFIG state, this information needs to be updated
    as well.

    This especially impacts Bluetooth 4.1 and later controllers using
    extended feature pages and second event mask page.

    Signed-off-by: Marcel Holtmann
    Signed-off-by: Johan Hedberg
    Cc: stable@vger.kernel.org # 3.17+

    Marcel Holtmann
     

24 Dec, 2014

15 commits

  • skb_scrub_packet() is called when a packet switches between a context
    such as between underlay and overlay, between namespaces, or between
    L3 subnets.

    While we already scrub the packet mark, connection tracking entry,
    and cached destination, the security mark/context is left intact.

    It seems wrong to inherit the security context of a packet when going
    from overlay to underlay or across forwarding paths.

    Signed-off-by: Thomas Graf
    Acked-by: Flavio Leitner
    Signed-off-by: David S. Miller

    Thomas Graf
     
  • When vlan tags are stacked, it is very likely that the outer tag is stored
    in skb->vlan_tci and skb->protocol shows the inner tag's vlan_proto.
    Currently netif_skb_features() first looks at skb->protocol even if there
    is the outer tag in vlan_tci, thus it incorrectly retrieves the protocol
    encapsulated by the inner vlan instead of the inner vlan protocol.
    This allows GSO packets to be passed to HW and they end up being
    corrupted.

    Fixes: 58e998c6d239 ("offloading: Force software GSO for multiple vlan tags.")
    Signed-off-by: Toshiaki Makita
    Signed-off-by: David S. Miller

    Toshiaki Makita
     
  • Today vport-send has complex error handling because it involves
    freeing skb and updating stats depending on return value from
    vport send implementation.
    This can be simplified by delegating responsibility for freeing
    the skb to the vport implementation in all cases, so that
    vport-send only needs to update stats.

    Fixes: 91b7514cdf ("openvswitch: Unify vport error stats
    handling")
    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • MPLS GSO needs to know the innermost protocol to process GSO packets.

    Fixes: 25cd9ba0abc ("openvswitch: Add basic MPLS support to
    kernel").

    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • The Linux stack does not implement GSO for packets with multiple
    encapsulations. Therefore there was a check in MPLS action
    validation to detect such a case, but this check introduced a
    bug which deleted one or more actions from the actions list.
    This patch removes the check to fix the validation.

    Fixes: 25cd9ba0abc ("openvswitch: Add basic MPLS support to
    kernel").

    Signed-off-by: Pravin B Shelar
    Reported-by: Srinivas Neginhal
    Acked-by: Jarno Rajahalme
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • MPLS and tunnel GSO do not work together. Reject packets which
    request such GSO.

    Fixes: 0d89d2035f ("MPLS: Add limited GSO support").
    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • Fix MPLS GSO for the case when mpls is compiled as a kernel module.

    Fixes: 0d89d2035f ("MPLS: Add limited GSO support").
    Signed-off-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Pravin B Shelar
     
  • This patch rearranges the loop in net_rx_action to reduce the
    amount of jumping back and forth when reading the code.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • We should only perform the softnet_break check after we have polled
    at least one device in net_rx_action. Otherwise a zero or negative
    setting of netdev_budget can lock up the whole system.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
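
The ordering issue can be modeled in a few lines of userspace C. Everything here is a hypothetical stand-in (`struct dev`, `poll_dev`, `WEIGHT`, and the two loop functions), meant only to show why checking the budget before polling makes no progress when the budget is zero or negative.

```c
#include <assert.h>

#define WEIGHT 64                   /* hypothetical per-device poll weight */

struct dev {
    int pending;                    /* packets waiting on this device */
};

/* Process up to `weight` packets from one device; return work done. */
int poll_dev(struct dev *d, int weight)
{
    assert(weight > 0);
    int n = d->pending < weight ? d->pending : weight;
    d->pending -= n;
    return n;
}

/* Check first: with budget <= 0 no device is ever polled. */
int rx_check_first(struct dev *devs, int ndev, int budget)
{
    int polled = 0;

    for (int i = 0; i < ndev; i++) {
        if (budget <= 0)
            break;
        budget -= poll_dev(&devs[i], WEIGHT);
        polled++;
    }
    return polled;
}

/* Poll first: at least one device makes progress on every pass. */
int rx_poll_first(struct dev *devs, int ndev, int budget)
{
    int polled = 0;

    for (int i = 0; i < ndev; i++) {
        budget -= poll_dev(&devs[i], WEIGHT);
        polled++;
        if (budget <= 0)
            break;
    }
    return polled;
}
```

With a budget of 0, `rx_check_first` returns without touching any device, so a caller that simply retries would spin forever, whereas `rx_poll_first` drains the first device before giving up.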
     
  • The commit d75b1ade567ffab085e8adbbdacf0092d10cd09c (net: less
    interrupt masking in NAPI) required drivers to leave poll_list
    empty if the entire budget is consumed.

    We have already had two broken drivers so let's add a check for
    this.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch creates a new function napi_poll and moves the napi
    polling code from net_rx_action into it.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Gateways with bandwidth_down equal to zero are not accepted
    at all and so are never added to the gateway list.
    For this reason, checking the bandwidth_down member in
    batadv_gw_out_of_range() is useless.

    This is probably a copy/paste error and this check was supposed
    to be "!gw_node" only. Moreover, the way the check is written
    now may also lead to a NULL dereference.

    Fix this by rewriting the if-condition properly.

    Introduced by 414254e342a0d58144de40c3da777521ebaeeb07
    ("batman-adv: tvlv - gateway download/upload bandwidth container")

    Signed-off-by: Antonio Quartulli
    Reported-by: David Binderman
    Signed-off-by: Marek Lindner
    Signed-off-by: David S. Miller

    Antonio Quartulli
     
  • The fragmentation code was replaced in 610bfc6bc99bc83680d190ebc69359a05fc7f605
    ("batman-adv: Receive fragmented packets and merge") by an implementation which
    can handle up to 16 fragments of a packet. The packet is prepared for the split
    in fragments by the function batadv_frag_send_packet and the actual split is
    done by batadv_frag_create.

    Both functions calculate the size of a fragment themselves. But their calculation
    differs because batadv_frag_send_packet also subtracts ETH_HLEN. Therefore,
    the check in batadv_frag_send_packet "can a full fragment be created?" may
    return true even when batadv_frag_create cannot create a full fragment.

    The function batadv_frag_create doesn't check the size of the skb before
    splitting it and therefore might try to create a larger fragment than the
    remaining buffer. This creates an integer underflow and an invalid len is given
    to skb_split.

    Signed-off-by: Sven Eckelmann
    Signed-off-by: David S. Miller

    Sven Eckelmann
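
The underflow described in this commit is ordinary unsigned wraparound. The sketch below assumes, for illustration only, that the split position is obtained by subtracting the fragment size from the remaining length. The function names are hypothetical and do not appear in batman-adv.

```c
#include <assert.h>
#include <stddef.h>

/* Unchecked: wraps to a huge value when frag_size > remaining. */
size_t split_offset_unchecked(size_t remaining, size_t frag_size)
{
    return remaining - frag_size;
}

/* Checked: refuse to split when the fragment would not fit.
 * Returns 0 and stores the offset on success, -1 otherwise. */
int split_offset_checked(size_t remaining, size_t frag_size, size_t *off)
{
    assert(off != NULL);
    if (frag_size > remaining)
        return -1;
    *off = remaining - frag_size;
    return 0;
}
```

For a remaining length of 100 and a fragment size of 120, the unchecked version yields a value far beyond the buffer, which is the kind of invalid length the message says ends up in skb_split.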
     
  • The fragmentation code was replaced in 610bfc6bc99bc83680d190ebc69359a05fc7f605
    ("batman-adv: Receive fragmented packets and merge"). The new code provided a
    mostly unused parameter skb for the merging function. It is used inside the
    function to calculate the additionally needed skb tailroom. But instead of
    increasing its own tailroom, it is only increasing the tailroom of the first
    queued skb. This is not correct in some situations because the first queued
    entry can be a different one than the parameter.

    An observed problem was:

    1. packet with size 104, total_size 1464, fragno 1 was received
    - packet is queued
    2. packet with size 1400, total_size 1464, fragno 0 was received
    - packet is queued at the end of the list
    3. enough data was received and can be given to the merge function
    (1464 == (1400 - 20) + (104 - 20))
    - merge function gets 1400 byte large packet as skb argument
    4. merge function gets first entry in queue (104 byte)
    - stored as skb_out
    5. merge function calculates the required extra tail as total_size - skb->len
    - pskb_expand_head tail of skb_out with 64 bytes
    6. merge function tries to squeeze the extra 1380 bytes from the second queued
    skb (1400 byte aka skb parameter) in the 64 extra tail bytes of skb_out

    Instead calculate the extra required tail bytes for skb_out also using skb_out
    instead of using the parameter skb. The skb parameter is only used to get the
    total_size from the last received packet. This is also the total_size used to
    decide that all fragments were received.

    Reported-by: Philipp Psurek
    Signed-off-by: Sven Eckelmann
    Acked-by: Martin Hundebøll
    Signed-off-by: David S. Miller

    Sven Eckelmann
     
  • Commit cecda693a969816bac5e470e1d9c9c0ef5567bca ("net: keep original skb
    which only needs header checking during software GSO") keeps the original
    skb for packets that only need a header check, but it doesn't drop the
    packet if software segmentation or the header check fails.

    Fixes cecda693a9 ("net: keep original skb which only needs header checking during software GSO")
    Cc: Eric Dumazet
    Signed-off-by: Jason Wang
    Signed-off-by: David S. Miller

    Jason Wang
     

23 Dec, 2014

2 commits

  • When xfrm6_policy_check() is used, _decode_session6() is called after some
    intermediate functions. This function uses IP6CB(), thus TCP_SKB_CB() must be
    prepared after the call of xfrm6_policy_check().

    Before this patch, scenarios with IPv6 + TCP + IPsec Transport are broken.

    Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
    Reported-by: Huaibin Wang
    Suggested-by: Eric Dumazet
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     
  • Make TPACKET_V3 signal poll when a block is closed rather than for every
    packet. A side effect is that poll will be signaled when the block retire
    timer expires, which didn't previously happen. The issue was visible when
    sending packets at a very low frequency such that all blocks are retired
    before packets are received by TPACKET_V3. This caused avoidable packet
    loss. The fix ensures that the signal is sent when blocks are closed,
    which covers the normal path where the block is filled as well as the
    path where the timer expires. The case where a block is filled without
    moving to the next block (i.e. all blocks are full) will still cause poll
    to be signaled.

    Signed-off-by: Dan Collins
    Signed-off-by: David S. Miller

    Dan Collins
     

20 Dec, 2014

1 commit

  • Pull vfs pile #3 from Al Viro:
    "Assorted fixes and patches from the last cycle"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    [regression] chunk lost from bd9b51
    vfs: make mounts and mountstats honor root dir like mountinfo does
    vfs: cleanup show_mountinfo
    init: fix read-write root mount
    unfuck binfmt_misc.c (broken by commit e6084d4)
    vm_area_operations: kill ->migrate()
    new helper: iter_is_iovec()
    move_extent_per_page(): get rid of unused w_flags
    lustre: get rid of playing with ->fs
    btrfs: filp_open() returns ERR_PTR() on failure, not NULL...

    Linus Torvalds
     

19 Dec, 2014

7 commits

  • same story as cmtp

    Signed-off-by: Al Viro
    Signed-off-by: Marcel Holtmann

    Al Viro
     
  • ... rather than relying on ciptool(8) never passing it anything else. Give
    it e.g. an AF_UNIX connected socket (from socketpair(2)) and it'll oops,
    trying to evaluate &l2cap_pi(sock->sk)->chan->dst...

    Signed-off-by: Al Viro
    Signed-off-by: Marcel Holtmann

    Al Viro
     
  • it's OK after we'd verified the sockets, but not before that.

    Signed-off-by: Al Viro
    Signed-off-by: Marcel Holtmann

    Al Viro
     
  • If we need to drop the message because of some error in the
    compression etc, then do not free the skb as that is done
    automatically in other part of networking stack.

    Signed-off-by: Jukka Rissanen
    Signed-off-by: Marcel Holtmann

    Jukka Rissanen
     
  • Reported-by: Pavel Emelyanov
    Acked-by: Pavel Emelyanov
    Signed-off-by: Al Viro

    Al Viro
     
  • Pull networking fixes from David Miller:

    1) Fix NBMA tunnel mac header handling in GRE, from Timo Teräs.

    2) Fix a NAPI race in the fec driver, from Nimrod Andy.

    3) The new IFF_VNET_LE bit is outside the size of the flags member it
    is stored in (which is 16-bits), store the state locally in the
    drivers. From Michael S Tsirkin.

    4) We are kicking the tires with the new wireless maintainership
    situation. Bluetooth fixes via Johan Hedberg, and mac80211 fixes
    from Johannes Berg.

    5) Fix locking and leaks in geneve driver, from Jesse Gross.

    6) Make netlink TX mmap code always copy, so we don't have to be
    potentially exposed to the user changing the underlying contents
    from underneath us.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (63 commits)
    be2net: Fix incorrect setting of tunnel offload flag in netdev features
    bnx2x: fix typos in "configure"
    xen-netback: support frontends without feature-rx-notify again
    MAINTAINERS: changes for wireless
    cxgb4: Fix decoding QSA module for ethtool get settings
    geneve: Fix races between socket add and release.
    geneve: Remove socket and offload handlers at destruction.
    netlink: Don't reorder loads/stores before marking mmap netlink frame as available
    netlink: Always copy on mmap TX.
    Bluetooth: Fix bug with filter in service discovery optimization
    mac80211: free management frame keys when removing station
    net: Disallow providing non zero VLAN ID for NIC drivers FDB add flow
    net/mlx4: Cache line CQE/EQE stride fixes
    net: fec: Fix NAPI race
    xen-netfront: use napi_complete() correctly to prevent Rx stalling
    ip_tunnel: Add missing validation of encap type to ip_tunnel_encap_setup()
    ip_tunnel: Add sanity checks to ip_tunnel_encap_add_ops()
    net: Allow FIXED_PHY to be modular.
    if_tun: drop broken IFF_VNET_LE
    macvtap: drop broken IFF_VNET_LE
    ...

    Linus Torvalds
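
Item 3 of the pull message above concerns a flag bit that does not fit in a 16-bit member. A minimal sketch of that failure mode follows. The flag values, struct, and helpers are hypothetical and chosen only to make the truncation visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_LOW  0x0001u           /* hypothetical flag inside 16 bits */
#define FLAG_WIDE 0x10000u          /* hypothetical flag at bit 16 */

struct dev16 {
    uint16_t flags;                 /* 16-bit flags member */
};

/* Store a flag; bits above bit 15 are lost in the narrowing cast. */
void set_flag16(struct dev16 *d, uint32_t flag)
{
    assert(d != NULL);
    d->flags |= (uint16_t)flag;
}

/* Test a flag against the stored 16-bit value. */
int test_flag16(const struct dev16 *d, uint32_t flag)
{
    assert(d != NULL);
    return (d->flags & flag) != 0;
}
```

Setting `FLAG_WIDE` silently has no effect, so a later test for it always fails. Keeping such state in a separate field, as the message describes the drivers now doing locally, avoids the truncation.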
     
  • …kernel/git/jberg/mac80211

    Johannes Berg says:

    ====================
    pull-request: mac80211 2014-12-18

    Also from me a first pull request - we have a number of really old
    issues that happened to crop up now with new work (or just more testing)
    in the right areas as well as some small bugs newly introduced in 3.19.

    Let me know if there are any problems.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller