15 Jan, 2015

3 commits

  • Pull networking fixes from David Miller:

    1) Don't use uninitialized data in IPVS, from Dan Carpenter.

    2) conntrack race fixes from Pablo Neira Ayuso.

    3) Fix TX hangs with i40e, from Jesse Brandeburg.

    4) Fix budget return from poll calls in dnet and alx, from Eric
    Dumazet.

    5) Fix bogus "if (unlikely(x) < 0)" test in AF_PACKET, from Christoph
    Jaeger.

    6) Fix bug introduced by conversion to list_head in TIPC retransmit
    code, from Jon Paul Maloy.

    7) Don't use GFP_NOIO under spinlock in USB kaweth driver, from Alexey
    Khoroshilov.

    8) Fix bridge build with INET disabled, from Arnd Bergmann.

    9) Fix netlink array overrun for PROBE attributes in openvswitch, from
    Thomas Graf.

    10) Don't hold spinlock across synchronize_irq() in tg3 driver, from
    Prashant Sreedharan.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
    tg3: Release tp->lock before invoking synchronize_irq()
    tg3: tg3_reset_task() needs to use rtnl_lock to synchronize
    tg3: tg3_timer() should grab tp->lock before checking for tp->irq_sync
    team: avoid possible underflow of count_pending value for notify_peers and mcast_rejoin
    openvswitch: packet messages need their own probe attribtue
    i40e: adds FCoE configure option
    cxgb4vf: Fix queue allocation for 40G adapter
    netdevice: Add missing parentheses in macro
    bridge: only provide proxy ARP when CONFIG_INET is enabled
    neighbour: fix base_reachable_time(_ms) not effective immediatly when changed
    net: fec: fix MDIO bus assignement for dual fec SoC's
    xen-netfront: use different locks for Rx and Tx stats
    drivers: net: cpsw: fix multicast flush in dual emac mode
    cxgb4vf: Initialize mdio_addr before using it
    net: Corrected the comment describing the ndo operations to reflect the actual prototype for couple of operations
    usb/kaweth: use GFP_ATOMIC under spin_lock in usb_start_wait_urb()
    MAINTAINERS: add me as ibmveth maintainer
    tipc: fix bug in broadcast retransmit code
    update ip-sysctl.txt documentation (v2)
    net/at91_ether: prepare and unprepare clock
    ...

    Linus Torvalds
     
  • User space is currently sending an OVS_FLOW_ATTR_PROBE for both flow
    and packet messages. This leads to an out-of-bounds access in
    ovs_packet_cmd_execute() because OVS_FLOW_ATTR_PROBE >
    OVS_PACKET_ATTR_MAX.

    Introduce a new OVS_PACKET_ATTR_PROBE with the same numeric value
    as OVS_FLOW_ATTR_PROBE to grow the range of accepted packet attributes
    while remaining binary compatible with existing OVS binaries.
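
    The hazard described above can be sketched in user space. This is a
    minimal illustration, not the openvswitch code: demo_parse(),
    demo_has_attr(), struct demo_attr and the value chosen for
    DEMO_FLOW_ATTR_PROBE are all hypothetical. It shows a table indexed by
    attribute type, where an attribute only becomes reachable once the
    table's maximum type covers it.

```c
#include <stddef.h>

/* Hypothetical numbering, chosen only for this illustration. */
#define DEMO_FLOW_ATTR_PROBE 8

struct demo_attr {
    int type;
    const void *data;
};

/*
 * Store each attribute in tb[], indexed by its type. tb must provide
 * maxtype + 1 slots. Attributes whose type is outside 0..maxtype are
 * skipped rather than written past the end of the table.
 * Returns the number of attributes stored.
 */
static int demo_parse(const struct demo_attr **tb, int maxtype,
                      const struct demo_attr *attrs, size_t n)
{
    int stored = 0;

    for (int i = 0; i <= maxtype; i++)
        tb[i] = NULL;

    for (size_t i = 0; i < n; i++) {
        int type = attrs[i].type;

        if (type < 0 || type > maxtype)
            continue;
        tb[type] = &attrs[i];
        stored++;
    }
    return stored;
}

/* Reading a slot is only safe when the index lies inside the table. */
static int demo_has_attr(const struct demo_attr **tb, int maxtype, int type)
{
    return type >= 0 && type <= maxtype && tb[type] != NULL;
}
```

    With a table sized for a maximum type of 7, an attribute of type 8 is
    skipped; raising the maximum to 8 makes it visible, which is the effect
    of adding a new highest-numbered attribute.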

    Fixes: 05da589 ("openvswitch: Add support for OVS_FLOW_ATTR_PROBE.")
    Reported-by: Sander Eikelenboom
    Tracked-down-by: Florian Westphal
    Signed-off-by: Thomas Graf
    Reviewed-by: Jesse Gross
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Thomas Graf
     
  • When IPv4 support is disabled, we cannot call arp_send from
    the bridge code, which would result in a kernel link error:

    net/built-in.o: In function `br_handle_frame_finish':
    :(.text+0x59914): undefined reference to `arp_send'
    :(.text+0x59a50): undefined reference to `arp_tbl'

    This makes the newly added proxy ARP support in the bridge
    code depend on the CONFIG_INET symbol and lets the compiler
    optimize the code out to avoid the link error.

    Signed-off-by: Arnd Bergmann
    Fixes: 958501163ddd ("bridge: Add support for IEEE 802.11 Proxy ARP")
    Cc: Kyeyoon Park
    Signed-off-by: David S. Miller

    Arnd Bergmann
     

14 Jan, 2015

1 commit

  • When setting base_reachable_time or base_reachable_time_ms on a
    specific interface through sysctl or netlink, the reachable_time
    value is not updated.

    This means that neighbour entries will continue to be updated using the
    old value until it is recomputed in neigh_periodic_work (which
    recomputes the value every 300*HZ jiffies).
    Since 300*HZ jiffies is 300 seconds whatever HZ is set to, it can take
    up to 5 minutes before the change is effective.

    This patch changes this behavior by recomputing reachable_time after
    each set on base_reachable_time or base_reachable_time_ms.
    The new value will become effective the next time the neighbour's timer
    is triggered.

    Changes are made in two places: the netlink code for set and the sysctl
    handling code. For sysctl, I use a proc_handler. The ipv6 network
    code does provide its own handler but it already refreshes
    reachable_time correctly so it's not an issue.
    Any other user of neighbour which provides its own handlers must
    refresh reachable_time.
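
    A minimal user-space sketch of the idea, under the assumption that
    reachable_time is drawn at random from the interval
    [base/2, 3*base/2). All demo_ names are hypothetical and rand() stands
    in for the kernel's random source; this is not the patch itself.

```c
#include <stdlib.h>

/* Assumed randomization: a value in [base/2, 3*base/2); 0 stays 0. */
static unsigned long demo_rand_reach_time(unsigned long base)
{
    return base ? ((unsigned long)rand() % base) + (base >> 1) : 0;
}

struct demo_parms {
    unsigned long base_reachable_time;
    unsigned long reachable_time;   /* derived from the base value */
};

/*
 * Setter that refreshes the derived value right away instead of
 * leaving it stale until the next periodic recomputation.
 */
static void demo_set_base_reachable_time(struct demo_parms *p,
                                         unsigned long base)
{
    p->base_reachable_time = base;
    p->reachable_time = demo_rand_reach_time(base);
}
```

    Any code path that writes the base value would go through such a
    setter, so the derived value never lags behind the configured one.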

    Signed-off-by: Jean-Francois Remy
    Signed-off-by: David S. Miller

    Jean-Francois Remy
     

13 Jan, 2015

1 commit

  • In commit 58dc55f25631178ee74cd27185956a8f7dcb3e32 ("tipc: use generic
    SKB list APIs to manage link transmission queue") we replaced all list
    traversal loops with the macros skb_queue_walk() or
    skb_queue_walk_safe(). While the previous loops were based on the
    assumption that the list was NULL-terminated, the standard macros
    stop when the iterator reaches the list head, which is non-NULL.

    In the function bclink_retransmit_pkt() this macro replacement has
    led to a bug. When we receive a BCAST STATE_MSG we unconditionally
    call the function bclink_retransmit_pkt(), whether there really is
    anything to retransmit or not, assuming that the sequence number
    comparisons will lead to the correct behavior. However, if the
    transmission queue is empty, or if there are no eligible buffers in
    the transmission queue, we will by mistake pass the list head pointer
    to the function tipc_link_retransmit(). Since the list head is not a
    valid sk_buff, this leads to a crash.

    In this commit we fix this by only calling tipc_link_retransmit()
    if we actually found eligible buffers in the transmission queue.
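
    The difference between the two loop styles can be sketched with a
    small circular list in user space. The demo_ names are hypothetical,
    sequence number wrap-around is ignored, and this is not the TIPC code;
    it only shows that a walk over a circular list ends on the head
    sentinel, so "nothing found" has to be reported explicitly.

```c
#include <stddef.h>

struct demo_node {
    struct demo_node *next, *prev;
    unsigned int seqno;   /* meaningless in the head sentinel */
};

static void demo_list_init(struct demo_node *head)
{
    head->next = head->prev = head;
}

static void demo_list_add_tail(struct demo_node *head, struct demo_node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/*
 * Return the first node whose seqno is >= after, or NULL if there is
 * none. When the loop runs to completion the iterator equals the head
 * sentinel, which is non-NULL but not a real element, so it must never
 * be handed to the caller.
 */
static struct demo_node *demo_first_eligible(struct demo_node *head,
                                             unsigned int after)
{
    struct demo_node *n;

    for (n = head->next; n != head; n = n->next)
        if (n->seqno >= after)
            return n;
    return NULL;
}
```

    A caller then retransmits only when the result is non-NULL, which
    covers both the empty queue and the queue without eligible buffers.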

    Reviewed-by: Ying Xue
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Paul Maloy
     

12 Jan, 2015

2 commits

  • Pablo Neira Ayuso says:

    ====================
    netfilter/ipvs fixes for net

    The following patchset contains netfilter/ipvs fixes, they are:

    1) Small fix for the FTP helper in IPVS, a diff variable may be left
    unset when CONFIG_IP_VS_IPV6 is set. Patch from Dan Carpenter.

    2) Fix nf_tables port NAT in little endian archs, patch from leroy
    christophe.

    3) Fix race condition between conntrack confirmation and flush from
    userspace. This is the second reincarnation to resolve this problem.

    4) Make sure inner messages in the batch come with the nfnetlink header.

    5) Relax strict check from nfnetlink_bind() that may break old userspace
    applications using all 1s group mask.

    6) Schedule removal of chains once no sets and rules refer to them in
    the new nf_tables ruleset flush command. Reported by Asbjoern Sloth
    Toennesen.

    Note that this batch comes later than usual because of the short
    winter holidays.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Due to a misplaced parenthesis, the expression

    (unlikely(offset) < 0),

    which expands to

    (__builtin_expect(!!(offset), 0) < 0),

    never evaluates to true. Therefore, when sending packets with
    PF_PACKET/SOCK_DGRAM, packet_snd() does not abort as intended
    if the creation of the layer 2 header fails.
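
    The effect of the misplaced parenthesis can be reproduced in user
    space. The sketch assumes GCC or Clang (for __builtin_expect); the
    demo_ names are hypothetical and the macro mirrors the expansion
    quoted above.

```c
/* Same shape as the expansion quoted in the commit message. */
#define demo_unlikely(x) __builtin_expect(!!(x), 0)

/* Buggy form: the macro yields 0 or 1, which is never below zero. */
static int demo_buggy_check(int offset)
{
    return demo_unlikely(offset) < 0;
}

/* Fixed form: the comparison is the expression being annotated. */
static int demo_fixed_check(int offset)
{
    return demo_unlikely(offset < 0);
}
```

    For a negative offset the buggy form returns 0 and the fixed form
    returns 1, so only the fixed form lets the caller notice the error.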

    Spotted by Coverity - CID 1259975 ("Operands don't affect result").

    Fixes: 9c7077622dd9 ("packet: make packet_snd fail on len smaller than l2 header")
    Signed-off-by: Christoph Jaeger
    Acked-by: Eric Dumazet
    Acked-by: Willem de Bruijn
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Christoph Jaeger
     

10 Jan, 2015

2 commits


09 Jan, 2015

1 commit


08 Jan, 2015

1 commit

  • A struct xdr_stream at a page boundary might point to the end of one
    page or the beginning of the next, but xdr_truncate_encode isn't
    prepared to handle the former.

    This can cause corruption of NFSv4 READDIR replies in the case that a
    readdir entry that would have exceeded the client's dircount/maxcount
    limit would have ended exactly on a 4k page boundary. You're more
    likely to hit this case on large directories.

    Other xdr_truncate_encode callers are probably also affected.

    Reported-by: Holger Hoffstätte
    Tested-by: Holger Hoffstätte
    Fixes: 3e19ce762b53 "rpc: xdr_truncate_encode"
    Cc: stable@vger.kernel.org
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     

07 Jan, 2015

7 commits

  • Pull networking fixes from David Miller:
    "Just a pile of random fixes, including:

    1) Do not apply TSO limits to non-TSO packets, fix from Herbert Xu.

    2) MDI{,X} eeprom check in e100 driver is reversed, from John W.
    Linville.

    3) Missing error return assignments in several ethernet drivers, from
    Julia Lawall.

    4) Altera TSE device doesn't come back up after ifconfig down/up
    sequence, fix from Kostya Belezko.

    5) Add more cases to the check for whether the qmi_wwan device has a
    bogus MAC address and needs to be assigned a random one. From
    Kristian Evensen.

    6) Fix interrupt hangs in CPSW, from Felipe Balbi.

    7) Implement ndo_features_check in r8152 so that the stack doesn't
    feed GSO packets which are outside of the chip's capabilities.
    From Hayes Wang"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
    qla3xxx: don't allow never end busy loop
    xen-netback: fixing the propagation of the transmit shaper timeout
    r8152: support ndo_features_check
    batman-adv: fix potential TT client + orig-node memory leak
    batman-adv: fix multicast counter when purging originators
    batman-adv: fix counter for multicast supporting nodes
    batman-adv: fix lock class for decoding hash in network-coding.c
    batman-adv: fix delayed foreign originator recognition
    batman-adv: fix and simplify condition when bonding should be used
    Revert "mac80211: Fix accounting of the tailroom-needed counter"
    net: ethernet: cpsw: fix hangs with interrupts
    enic: free all rq buffs when allocation fails
    qmi_wwan: Set random MAC on devices with buggy fw
    openvswitch: Consistently include VLAN header in flow and port stats.
    tcp: Do not apply TSO segment limit to non-TSO packets
    Altera TSE: Add missing phydev
    net/mlx4_core: Fix error flow in mlx4_init_hca()
    net/mlx4_core: Correcly update the mtt's offset in the MR re-reg flow
    qlcnic: Fix return value in qlcnic_probe()
    net: axienet: fix error return code
    ...

    Linus Torvalds
     
  • Jumping between chains doesn't mix well with flush ruleset. Rules
    from a different chain and set elements may still refer to us.

    [ 353.373791] ------------[ cut here ]------------
    [ 353.373845] kernel BUG at net/netfilter/nf_tables_api.c:1159!
    [ 353.373896] invalid opcode: 0000 [#1] SMP
    [ 353.373942] Modules linked in: intel_powerclamp uas iwldvm iwlwifi
    [ 353.374017] CPU: 0 PID: 6445 Comm: 31c3.nft Not tainted 3.18.0 #98
    [ 353.374069] Hardware name: LENOVO 5129CTO/5129CTO, BIOS 6QET47WW (1.17 ) 07/14/2010
    [...]
    [ 353.375018] Call Trace:
    [ 353.375046] [] ? nf_tables_commit+0x381/0x540
    [ 353.375101] [] nfnetlink_rcv+0x3d8/0x4b0
    [ 353.375150] [] netlink_unicast+0x105/0x1a0
    [ 353.375200] [] netlink_sendmsg+0x32e/0x790
    [ 353.375253] [] sock_sendmsg+0x8e/0xc0
    [ 353.375300] [] ? move_addr_to_kernel.part.20+0x19/0x70
    [ 353.375357] [] ? move_addr_to_kernel+0x19/0x30
    [ 353.375410] [] ? verify_iovec+0x42/0xd0
    [ 353.375459] [] ___sys_sendmsg+0x3f0/0x400
    [ 353.375510] [] ? native_sched_clock+0x2a/0x90
    [ 353.375563] [] ? acct_account_cputime+0x17/0x20
    [ 353.375616] [] ? account_user_time+0x88/0xa0
    [ 353.375667] [] __sys_sendmsg+0x3d/0x80
    [ 353.375719] [] ? int_check_syscall_exit_work+0x34/0x3d
    [ 353.375776] [] SyS_sendmsg+0xd/0x20
    [ 353.375823] [] system_call_fastpath+0x16/0x1b

    Release objects in this order: rules -> sets -> chains -> tables, to
    make sure no references to chains are held anymore.

    Reported-by: Asbjoern Sloth Toennesen
    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • Relax the checking that was introduced in 97840cb ("netfilter:
    nfnetlink: fix insufficient validation in nfnetlink_bind") when the
    subscription bitmask is used. Existing userspace code may request
    to listen to all of the existing netlink groups by setting an all-ones
    subscription group bitmask. Netlink already validates subscription via
    setsockopt() for us.

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • Make sure there is enough room for the nfnetlink header in the
    netlink messages that are part of the batch. There is a similar
    check in netlink_rcv_skb().

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • Commit 5195c14c8b27c ("netfilter: conntrack: fix race in
    __nf_conntrack_confirm against get_next_corpse") aimed to resolve the
    race condition between the confirmation (packet path) and the flush
    command (from control plane). However, it introduced a crash when
    several packets race to add a new conntrack, which seems easier to
    reproduce when nf_queue is in place.

    Fix this race, in __nf_conntrack_confirm(), by removing the CT
    from the unconfirmed list before checking the DYING bit. If the
    race occurred, re-add the CT to the dying list.

    This patch also changes the verdict from NF_ACCEPT to NF_DROP when
    we lose the race. Basically, the confirmation happens for the first
    packet that we see in a flow. If you just invoked conntrack -F once
    (which should be the common case), then this is likely to be the first
    packet of the flow (unless you already called flush shortly before).
    This should be hard to trigger, but it is better to drop this packet,
    otherwise we leave things in an inconsistent state since the
    destination will likely reply to this packet, but it will find no
    conntrack, unless the origin retransmits.

    The change of the verdict has been discussed in:
    https://www.marc.info/?l=linux-netdev&m=141588039530056&w=2

    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
     
  • Included changes:
    - ensure bonding is used (if enabled) for packets coming in the soft
    interface
    - fix race condition to avoid orig_nodes being deleted right after
    being added
    - avoid false positive lockdep splats by assigning lockclass to
    the proper hashtable lock objects
    - avoid miscounting of multicast 'disabled' nodes in the network
    - fix memory leak in the Global Translation Table in case of
    originator interval change

    Signed-off-by: David S. Miller

    David S. Miller
     
  • …kernel/git/jberg/mac80211

    Here's just a single fix - a revert of a patch that broke the
    p54 and cw1200 drivers (arguably due to bad assumptions there.)
    Since this affects kernels since 3.17, I decided to revert for
    now and we'll revisit this optimisation properly for -next.

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     

06 Jan, 2015

6 commits

  • This patch fixes a potential memory leak which can occur once an
    originator times out. On timeout the corresponding global translation
    table entry might not get purged correctly. Furthermore, the non-purged
    TT entry will cause its orig-node to leak, too. This in turn leaves a
    bogus counter behind, which can prevent the new multicast optimization
    feature from kicking in.

    In detail: The batadv_tt_global_entry->orig_list holds the reference to
    the orig-node. Usually this reference is released after
    BATADV_PURGE_TIMEOUT through: _batadv_purge_orig()->
    batadv_purge_orig_node()->batadv_update_route()->_batadv_update_route()->
    batadv_tt_global_del_orig() which purges this global tt entry and
    releases the reference to the orig-node.

    However, if between two batadv_purge_orig_node() calls the orig-node
    timeout grew to 2*BATADV_PURGE_TIMEOUT then this call path isn't
    reached. Instead the corresponding orig-node is removed from the
    originator hash in _batadv_purge_orig(), the batadv_update_route()
    part is skipped and won't be reached anymore.

    Fix the issue by moving batadv_tt_global_del_orig() out of the rcu
    callback.

    Signed-off-by: Linus Lüssing
    Acked-by: Antonio Quartulli
    Signed-off-by: Marek Lindner
    Signed-off-by: Antonio Quartulli

    Linus Lüssing
     
  • When purging an orig_node we should only decrease the counter tracking
    the number of nodes without multicast optimizations support if it was
    increased through this orig_node before.

    A not yet fully initialized orig_node (meaning it did not have its turn
    in the mcast-tvlv handler so far) which gets purged would not adhere to
    this and would lead to a counter imbalance.

    Fix this by adding a check whether the orig_node is mcast-initialized
    before decreasing the counter in the mcast-orig_node-purging routine.

    Introduced by 60432d756cf06e597ef9da511402dd059b112447
    ("batman-adv: Announce new capability via multicast TVLV")

    Reported-by: Tobias Hachmer
    Signed-off-by: Linus Lüssing
    Signed-off-by: Marek Lindner
    Signed-off-by: Antonio Quartulli

    Linus Lüssing
     
  • A miscounting of nodes having multicast optimizations enabled can lead
    to multicast packet loss in the following scenario:

    If the first OGM a node receives from another one has no multicast
    optimizations support (no multicast tvlv) then we fail to
    increase the counter. This potentially leads to the wrong assumption
    that we could safely use multicast optimizations.

    Fix this by increasing the counter if the initial OGM has the
    multicast TVLV unset, too.

    Introduced by 60432d756cf06e597ef9da511402dd059b112447
    ("batman-adv: Announce new capability via multicast TVLV")

    Reported-by: Tobias Hachmer
    Signed-off-by: Linus Lüssing
    Signed-off-by: Marek Lindner
    Signed-off-by: Antonio Quartulli

    Linus Lüssing
     
  • batadv_hash_set_lock_class() is called with the wrong hash table as first
    argument (probably due to a copy-paste error), which leads to false
    positives when running with lockdep.

    Introduced-by: 612d2b4fe0a1ff2f8389462a6f8be34e54124c05
    ("batman-adv: network coding - save overheard and tx packets for decoding")

    Signed-off-by: Martin Hundebøll
    Signed-off-by: Marek Lindner
    Signed-off-by: Antonio Quartulli

    Martin Hundebøll
     
  • Currently it can happen that the reception of an OGM from a new
    originator is not being accepted. More precisely it can happen that
    an originator struct gets allocated and initialized
    (batadv_orig_node_new()), even the TQ gets calculated and set correctly
    (batadv_iv_ogm_calc_tq()) but still the periodic orig_node purging
    thread will decide to delete it if it has a chance to jump between
    these two function calls.

    This is because batadv_orig_node_new() initializes the last_seen value
    to zero and its caller (batadv_iv_ogm_orig_get()) makes it visible to
    other threads by adding it to the hash table already.
    batadv_iv_ogm_calc_tq() will set the last_seen variable to the correct,
    current time a few lines later but if the purging thread jumps in between
    that it will think that the orig_node timed out and will wrongly
    schedule it for deletion already.

    If the purging interval is the same as the originator interval (which is
    the default: 1 second), then this game can continue for several rounds
    until the random OGM jitter added enough difference between these
    two (in tests, two to about four rounds seemed common).

    Fix this by initializing the last_seen variable of an orig_node
    to the current time before adding it to the hash table.
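
    The timing problem can be sketched with a plain timestamp comparison.
    The demo_ names, the time unit and the timeout value are hypothetical;
    this is an illustration of the ordering issue, not the batman-adv
    code.

```c
#include <stdbool.h>

struct demo_orig {
    unsigned long last_seen;   /* timestamp in arbitrary ticks */
};

/* Purge check: has more than timeout ticks passed since last_seen? */
static bool demo_timed_out(const struct demo_orig *o, unsigned long now,
                           unsigned long timeout)
{
    return now - o->last_seen > timeout;
}

/* Old behaviour: the entry is published with last_seen == 0. */
static void demo_orig_init_zero(struct demo_orig *o)
{
    o->last_seen = 0;
}

/* Fixed behaviour: the entry is published with a current timestamp. */
static void demo_orig_init_now(struct demo_orig *o, unsigned long now)
{
    o->last_seen = now;
}
```

    If the purge check runs between publication and the first real
    update, the zero-initialized entry already looks expired, while the
    entry stamped with the current time does not.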

    Signed-off-by: Linus Lüssing
    Signed-off-by: Marek Lindner
    Signed-off-by: Antonio Quartulli

    Linus Lüssing
     
  • The current condition actually does NOT consider bonding when the
    interface the packet came in from is the soft interface, which is the
    opposite of what it should do (and of what the comment describes). Fix
    that and slightly simplify the condition.

    Reported-by: Ray Gibson
    Signed-off-by: Simon Wunderlich
    Signed-off-by: Marek Lindner
    Signed-off-by: Antonio Quartulli

    Simon Wunderlich
     

05 Jan, 2015

1 commit

  • This reverts commit ca34e3b5c808385b175650605faa29e71e91991b.

    It turns out that the p54 and cw1200 drivers assume that there's
    tailroom even when they don't say they really need it. However,
    there's currently no way for them to explicitly say they do need
    it, so for now revert this.

    This fixes https://bugzilla.kernel.org/show_bug.cgi?id=90331.

    Cc: stable@vger.kernel.org
    Fixes: ca34e3b5c808 ("mac80211: Fix accounting of the tailroom-needed counter")
    Reported-by: Christopher Chavez
    Bisected-by: Larry Finger
    Debugged-by: Christian Lamparter
    Signed-off-by: Johannes Berg

    Johannes Berg
     

03 Jan, 2015

2 commits

  • Until now, when VLAN acceleration was in use, the bytes of the VLAN header
    were not included in port or flow byte counters. They were however
    included when VLAN acceleration was not used. This commit corrects the
    inconsistency, by always including the VLAN header in byte counters.
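
    A minimal sketch of the accounting rule, assuming that with
    acceleration the tag is carried next to the packet rather than inside
    the buffer. struct demo_pkt and demo_counted_bytes() are hypothetical
    names; only the 4-byte size of an 802.1Q tag is a fixed fact here.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_VLAN_HLEN 4   /* size of an 802.1Q tag in bytes */

struct demo_pkt {
    size_t len;       /* bytes currently held in the packet buffer */
    bool accel_tag;   /* true if the tag is kept outside the buffer */
};

/* Count the tag exactly once, wherever it is stored. */
static size_t demo_counted_bytes(const struct demo_pkt *p)
{
    return p->len + (p->accel_tag ? DEMO_VLAN_HLEN : 0);
}
```

    The same tagged frame then contributes the same number of bytes to the
    counters whether or not acceleration is in use.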

    Previous discussion at
    http://openvswitch.org/pipermail/dev/2014-December/049521.html

    Reported-by: Motonori Shindo
    Signed-off-by: Ben Pfaff
    Reviewed-by: Flavio Leitner
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Ben Pfaff
     
  • Thomas Jarosch reported IPsec TCP stalls when a PMTU event occurs.

    In fact the problem was completely unrelated to IPsec. The bug is
    also reproducible if you just disable TSO/GSO.

    The problem is that when the MSS goes down, existing queued packets
    on the TX queue that have not been transmitted yet all look like
    TSO packets and get treated as such.

    This then triggers a bug where tcp_mss_split_point tells us to
    generate a zero-sized packet on the TX queue. Once that happens
    we're screwed because the zero-sized packet can never be removed
    by ACKs.

    Fixes: 1485348d242 ("tcp: Apply device TSO segment limit earlier")
    Reported-by: Thomas Jarosch
    Signed-off-by: Herbert Xu

    Signed-off-by: David S. Miller

    Herbert Xu
     

31 Dec, 2014

2 commits

  • This reverts commit 24a0aa212ee2dbe44360288684478d76a8e20a0a.

    It's causing severe userspace breakage. Namely, all the utilities from
    wireless-tools which rely on CONFIG_WEXT (which means tools like
    'iwconfig', 'iwlist', etc) are not working anymore. There is an 'iw'
    utility in newer wireless-tools, which is supposed to be a replacement
    for all the "deprecated" binaries, but it's far away from being
    massively adopted.

    Please see [1] for example of the userspace breakage this is causing.

    In addition to that, Larry Finger reports [2] that this patch also
    makes the ipw2200 driver impossible to build.

    To me this clearly shows that CONFIG_WEXT is far, far away from being
    "deprecated enough" to be removed.

    [1] http://thread.gmane.org/gmane.linux.kernel/1857010
    [2] http://thread.gmane.org/gmane.linux.network/343688

    Signed-off-by: Jiri Kosina
    Signed-off-by: Linus Torvalds

    Jiri Kosina
     
  • Pull networking fixes from David Miller:

    1) Fix double SKB free in bluetooth 6lowpan layer, from Jukka Rissanen.

    2) Fix receive checksum handling in enic driver, from Govindarajulu
    Varadarajan.

    3) Fix NAPI poll list corruption in virtio_net and caif_virtio, from
    Herbert Xu. Also, add code to detect drivers that have this mistake
    in the future.

    4) Fix doorbell endianness handling in mlx4 driver, from Amir Vadai.

    5) Don't clobber IP6CB() before xfrm6_policy_check() is called in TCP
    input path, from Nicolas Dichtel.

    6) Fix MPLS action validation in openvswitch, from Pravin B Shelar.

    7) Fix double SKB free in vxlan driver, also from Pravin.

    8) When we scrub a packet, which happens when we are switching the
    context of the packet (namespace, etc.), we should reset the
    secmark. From Thomas Graf.

    9) ->ndo_gso_check() needs to do more than return true/false, it also
    has to allow the driver to clear netdev feature bits in order for
    the caller to be able to proceed properly. From Jesse Gross.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits)
    genetlink: A genl_bind() to an out-of-range multicast group should not WARN().
    netlink/genetlink: pass network namespace to bind/unbind
    ne2k-pci: Add pci_disable_device in error handling
    bonding: change error message to debug message in __bond_release_one()
    genetlink: pass multicast bind/unbind to families
    netlink: call unbind when releasing socket
    netlink: update listeners directly when removing socket
    genetlink: pass only network namespace to genl_has_listeners()
    netlink: rename netlink_unbind() to netlink_undo_bind()
    net: Generalize ndo_gso_check to ndo_features_check
    net: incorrect use of init_completion fixup
    neigh: remove next ptr from struct neigh_table
    net: xilinx: Remove unnecessary temac_property in the driver
    net: phy: micrel: use generic config_init for KSZ8021/KSZ8031
    net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding
    openvswitch: fix odd_ptr_err.cocci warnings
    Bluetooth: Fix accepting connections when not using mgmt
    Bluetooth: Fix controller configuration with HCI_QUIRK_INVALID_BDADDR
    brcmfmac: Do not crash if platform data is not populated
    ipw2200: select CFG80211_WEXT
    ...

    Linus Torvalds
     

30 Dec, 2014

1 commit


27 Dec, 2014

9 commits

  • Netlink families can exist in multiple namespaces, and for the most
    part multicast subscriptions are per network namespace. Thus it only
    makes sense to have bind/unbind notifications per network namespace.

    To achieve this, pass the network namespace of a given client socket
    to the bind/unbind functions.

    Also do this in generic netlink, and there also make sure that any
    bind for multicast groups that only exist in init_net is rejected.
    This isn't really a problem if it is accepted since a client in a
    different namespace will never receive any notifications from such
    a group, but it can confuse the family if not rejected. (It's also
    possible to accept it silently, without telling the family, but it
    would then also have to be ignored on unbind so that families that
    take any kind of action on bind/unbind won't do unnecessary work for
    invalid clients like that.)

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • In order to make the newly fixed multicast bind/unbind
    functionality usable in generic netlink, pass the bind/unbind
    calls down to the appropriate family.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Currently, netlink_unbind() is only called when the socket
    explicitly unbinds, which limits its usefulness (luckily
    there are no users of it yet anyway.)

    Call netlink_unbind() also when a socket is released, so it
    becomes possible to track listeners with this callback and
    without also implementing a netlink notifier (and checking
    netlink_has_listeners() in there.)

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • The code is now confusing to read: first, one function down
    (in netlink_remove()), any group subscriptions are implicitly removed
    by calling __sk_del_bind_node(), but the subscriber database is
    only updated far later by calling netlink_update_listeners().

    Move the latter call to just after removal from the list so it
    is easier to follow the code.

    This also enables moving the locking inside the kernel-socket
    conditional, which improves the normal socket destruction path.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • There's no point in forcing the caller to know about the internal
    genl_sock to use inside struct net; just have them pass the network
    namespace. This doesn't really change code generation since it's
    an inline, but makes the caller less magic - there's never any
    reason to pass another socket.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • The new name is more expressive - this isn't a generic unbind
    function but rather only a little undo helper for use only in
    netlink_bind().

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Johan Hedberg says:

    ====================
    Here's one more bluetooth pull request for 3.19. We've got two fixes:

    - Fix for accepting connections with old user space versions of BlueZ
    - Fix for Bluetooth controllers that don't have a public address

    Both of these are regressions that were introduced in 3.17, so the
    appropriate Cc: stable annotations are provided.

    Please let me know if there are any issues pulling. Thanks.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • GSO isn't the only offload feature with restrictions that
    potentially can't be expressed with the current features mechanism.
    Checksum is another although it's a general issue that could in
    theory apply to anything. Even if it may be possible to
    implement these restrictions in other ways, it can result in
    duplicate code or inefficient per-packet behavior.

    This generalizes ndo_gso_check so that drivers can remove any
    features that don't make sense for a given packet, similar to
    netif_skb_features(). It also converts existing driver
    restrictions to the new format, completing the work that was
    done to support tunnel protocols since the issues apply to
    checksums as well.

    By actually removing features from the set that are used to do
    offloading, it solves another problem with the existing
    interface. In these cases, GSO would run with the original set
    of features and not do anything because it appears that
    segmentation is not required.
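
    The shape of such a callback can be sketched as a function that takes
    the candidate feature set and returns it with unsupported bits
    cleared. The feature bits, the length limit and all demo_ names are
    hypothetical; this is not the kernel's ndo_features_check signature.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t demo_features_t;

/* Hypothetical feature bits. */
#define DEMO_F_SG    (1ULL << 0)
#define DEMO_F_CSUM  (1ULL << 1)
#define DEMO_F_GSO   (1ULL << 2)

/* Hypothetical hardware limit for offloaded packets. */
#define DEMO_MAX_OFFLOAD_LEN 9000

/*
 * Instead of answering yes or no, return the feature set with the
 * bits removed that the device cannot honour for this packet. The
 * caller then falls back to software for exactly those features.
 */
static demo_features_t demo_features_check(size_t pkt_len,
                                           demo_features_t features)
{
    if (pkt_len > DEMO_MAX_OFFLOAD_LEN)
        features &= ~(DEMO_F_CSUM | DEMO_F_GSO);
    return features;
}
```

    Because the returned set no longer advertises the offload, later
    stages see that software segmentation or checksumming is required
    instead of assuming the hardware will take care of it.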

    CC: Tom Herbert
    CC: Joe Stringer
    CC: Eric Dumazet
    CC: Hayes Wang
    Signed-off-by: Jesse Gross
    Acked-by: Tom Herbert
    Fixes: 04ffcb255f22 ("net: Add ndo_gso_check")
    Tested-by: Hayes Wang
    Signed-off-by: David S. Miller

    Jesse Gross
     
  • When using VXLAN tunnels and a sky2 device, I have experienced
    checksum failures of the following type:

    [ 4297.761899] eth0: hw csum failure
    [...]
    [ 4297.765223] Call Trace:
    [ 4297.765224] [] dump_stack+0x46/0x58
    [ 4297.765235] [] netdev_rx_csum_fault+0x42/0x50
    [ 4297.765238] [] ? skb_push+0x40/0x40
    [ 4297.765240] [] __skb_checksum_complete+0xbc/0xd0
    [ 4297.765243] [] tcp_v4_rcv+0x2e2/0x950
    [ 4297.765246] [] ? ip_rcv_finish+0x360/0x360

    These are reliably reproduced in a network topology of:

    container:eth0 == host(OVS VXLAN on VLAN) == bond0 == eth0 (sky2) -> switch

    When VXLAN encapsulated traffic is received from a similarly
    configured peer, the above warning is generated in the receive
    processing of the encapsulated packet. Note that the warning is
    associated with the container eth0.

    The skbs from sky2 have ip_summed set to CHECKSUM_COMPLETE, and
    because the packet is an encapsulated Ethernet frame, the checksum
    generated by the hardware includes the inner protocol and Ethernet
    headers.

    The receive code is careful to update the skb->csum, except in
    __dev_forward_skb, as called by dev_forward_skb. __dev_forward_skb
    calls eth_type_trans, which in turn calls skb_pull_inline(skb, ETH_HLEN)
    to skip over the Ethernet header, but does not update skb->csum when
    doing so.

    This patch resolves the problem by adding a call to
    skb_postpull_rcsum to update the skb->csum after the call to
    eth_type_trans.
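
    The arithmetic behind such an update can be sketched with a 16-bit
    one's-complement sum in user space. demo_csum() and demo_csum_sub()
    are hypothetical helpers, not the kernel's checksum routines, and the
    sketch assumes buffers with an even number of bytes. It shows that
    removing a header from the covered data means subtracting the
    header's sum from the stored value.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit one's-complement sum over an even number of bytes. */
static uint16_t demo_csum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* One's-complement subtraction: a - b is computed as a + ~b. */
static uint16_t demo_csum_sub(uint16_t a, uint16_t b)
{
    uint32_t sum = (uint32_t)a + (uint16_t)~b;

    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}
```

    If the stored sum covers header plus payload and the header is
    skipped, subtracting the header's sum yields the sum of the payload
    alone; leaving the stored value untouched makes later verification
    fail.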

    Signed-off-by: Jay Vosburgh
    Signed-off-by: David S. Miller

    Jay Vosburgh
     

25 Dec, 2014

1 commit

  • net/openvswitch/vport-gre.c:188:5-11: inconsistent IS_ERR and PTR_ERR, PTR_ERR on line 189

    PTR_ERR should access the value just tested by IS_ERR
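
    The convention being checked can be sketched in user space. The
    demo_ helpers are hypothetical stand-ins for the kernel's ERR_PTR,
    PTR_ERR and IS_ERR, written under the assumption that error codes are
    small negative numbers stored in the pointer value itself.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative error code in a pointer value. */
static inline void *demo_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Decode the error code carried by an error pointer. */
static inline long demo_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* True if the pointer value lies in the range reserved for errors. */
static inline int demo_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Consistent use: the pointer that is tested is the one decoded. */
static long demo_status(const void *result)
{
    if (demo_is_err(result))
        return demo_ptr_err(result);
    return 0;
}
```

    The warning fires when the test and the decode name different
    pointers, because the decoded value is then unrelated to the failure
    that was just detected.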

    Semantic patch information:
    There can be false positives in the patch case, where it is the call
    IS_ERR that is wrong.

    Generated by: scripts/coccinelle/tests/odd_ptr_err.cocci

    CC: Pravin B Shelar
    Signed-off-by: Fengguang Wu
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Wu Fengguang