15 Apr, 2016

1 commit


10 Apr, 2016

1 commit

  • Pull networking fixes from David Miller:

    1) Stale SKB data pointer access across pskb_may_pull() calls in L2TP,
    from Haishuang Yan.

    2) Fix multicast frame handling in mac80211 AP code, from Felix
    Fietkau.

    3) mac80211 station hashtable insert errors not handled properly, fix
    from Johannes Berg.

    4) Fix TX descriptor count limit handling in e1000, from Alexander
    Duyck.

    5) Revert a buggy netdev refcount fix in netpoll, from Bjorn Helgaas.

    6) Must assign rtnl_link_ops of the device before registering it, fix
    in ip6_tunnel from Thadeu Lima de Souza Cascardo.

    7) Memory leak fix in tc action net exit, from WANG Cong.

    8) Add missing AF_KCM entries to name tables, from Dexuan Cui.

    9) Fix regression in GRE handling of csums wrt. FOU, from Alexander
    Duyck.

    10) Fix memory allocation alignment and congestion map corruption in
    RDS, from Shamir Rabinovitch.

    11) Fix default qdisc regression in tuntap driver, from Jason Wang.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
    bridge, netem: mark mailing lists as moderated
    tuntap: restore default qdisc
    mpls: find_outdev: check for err ptr in addition to NULL check
    ipv6: Count in extension headers in skb->network_header
    RDS: fix congestion map corruption for PAGE_SIZE > 4k
    RDS: memory allocated must be align to 8
    GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOU
    net: add the AF_KCM entries to family name tables
    MAINTAINERS: intel-wired-lan list is moderated
    lib/test_bpf: Add additional BPF_ADD tests
    lib/test_bpf: Add test to check for result of 32-bit add that overflows
    lib/test_bpf: Add tests for unsigned BPF_JGT
    lib/test_bpf: Fix JMP_JSET tests
    VSOCK: Detach QP check should filter out non matching QPs.
    stmmac: fix adjust link call in case of a switch is attached
    af_packet: tone down the Tx-ring unsupported spew.
    net_sched: fix a memory leak in tc action
    samples/bpf: Enable powerpc support
    samples/bpf: Use llc in PATH, rather than a hardcoded value
    samples/bpf: Fix build breakage with map_perf_test_user.c
    ...

    Linus Torvalds
     

09 Apr, 2016

2 commits

  • …kernel/git/jberg/mac80211

    Johannes Berg says:

    ====================
    For the current RC series, we have the following fixes:
    * TDLS fixes from Arik and Ilan
    * rhashtable fixes from Ben and myself
    * documentation fixes from Luis
    * U-APSD fixes from Emmanuel
    * a TXQ fix from Felix
    * and a compiler warning suppression from Jeff
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • find_outdev calls inet{,6}_fib_lookup_dev() or dev_get_by_index() to
    find the output device. In case of an error, inet{,6}_fib_lookup_dev()
    returns error pointer and dev_get_by_index() returns NULL. But the function
    only checks for NULL and thus can end up calling dev_put on an ERR_PTR.
    This patch adds an additional check for err ptr after the NULL check.

    Before: Trying to add an mpls route with no oif from user, no available
    path to 10.1.1.8 and no default route:
    $ip -f mpls route add 100 as 200 via inet 10.1.1.8
    [ 822.337195] BUG: unable to handle kernel NULL pointer dereference at
    00000000000003a3
    [ 822.340033] IP: [] mpls_nh_assign_dev+0x10b/0x182
    [ 822.340033] PGD 1db38067 PUD 1de9e067 PMD 0
    [ 822.340033] Oops: 0000 [#1] SMP
    [ 822.340033] Modules linked in:
    [ 822.340033] CPU: 0 PID: 11148 Comm: ip Not tainted 4.5.0-rc7+ #54
    [ 822.340033] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
    BIOS rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org
    04/01/2014
    [ 822.340033] task: ffff88001db82580 ti: ffff88001dad4000 task.ti:
    ffff88001dad4000
    [ 822.340033] RIP: 0010:[] []
    mpls_nh_assign_dev+0x10b/0x182
    [ 822.340033] RSP: 0018:ffff88001dad7a88 EFLAGS: 00010282
    [ 822.340033] RAX: ffffffffffffff9b RBX: ffffffffffffff9b RCX:
    0000000000000002
    [ 822.340033] RDX: 00000000ffffff9b RSI: 0000000000000008 RDI:
    0000000000000000
    [ 822.340033] RBP: ffff88001ddc9ea0 R08: ffff88001e9f1768 R09:
    0000000000000000
    [ 822.340033] R10: ffff88001d9c1100 R11: ffff88001e3c89f0 R12:
    ffffffff8187e0c0
    [ 822.340033] R13: ffffffff8187e0c0 R14: ffff88001ddc9e80 R15:
    0000000000000004
    [ 822.340033] FS: 00007ff9ed798700(0000) GS:ffff88001fc00000(0000)
    knlGS:0000000000000000
    [ 822.340033] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 822.340033] CR2: 00000000000003a3 CR3: 000000001de89000 CR4:
    00000000000006f0
    [ 822.340033] Stack:
    [ 822.340033] 0000000000000000 0000000100000000 0000000000000000
    0000000000000000
    [ 822.340033] 0000000000000000 0801010a00000000 0000000000000000
    0000000000000000
    [ 822.340033] 0000000000000004 ffffffff8148749b ffffffff8187e0c0
    000000000000001c
    [ 822.340033] Call Trace:
    [ 822.340033] [] ? mpls_rt_alloc+0x2b/0x3e
    [ 822.340033] [] ? mpls_rtm_newroute+0x358/0x3e2
    [ 822.340033] [] ? get_page+0x5/0xa
    [ 822.340033] [] ? rtnetlink_rcv_msg+0x17e/0x191
    [ 822.340033] [] ? __kmalloc_track_caller+0x8c/0x9e
    [ 822.340033] [] ?
    rht_key_hashfn.isra.20.constprop.57+0x14/0x1f
    [ 822.340033] [] ? __rtnl_unlock+0xc/0xc
    [ 822.340033] [] ? netlink_rcv_skb+0x36/0x82
    [ 822.340033] [] ? rtnetlink_rcv+0x1f/0x28
    [ 822.340033] [] ? netlink_unicast+0x106/0x189
    [ 822.340033] [] ? netlink_sendmsg+0x27f/0x2c8
    [ 822.340033] [] ? sock_sendmsg_nosec+0x10/0x1b
    [ 822.340033] [] ? ___sys_sendmsg+0x182/0x1e3
    [ 822.340033] [] ?
    __alloc_pages_nodemask+0x11c/0x1e4
    [ 822.340033] [] ? PageAnon+0x5/0xd
    [ 822.340033] [] ? __page_set_anon_rmap+0x45/0x52
    [ 822.340033] [] ? get_page+0x5/0xa
    [ 822.340033] [] ? __lru_cache_add+0x1a/0x3a
    [ 822.340033] [] ? current_kernel_time64+0x9/0x30
    [ 822.340033] [] ? __sys_sendmsg+0x3c/0x5a
    [ 822.340033] [] ?
    entry_SYSCALL_64_fastpath+0x12/0x6a
    [ 822.340033] Code: 83 08 04 00 00 65 ff 00 48 8b 3c 24 e8 40 7c f2 ff
    eb 13 48 c7 c3 9f ff ff ff eb 0f 89 ce e8 f1 ae f1 ff 48 89 c3 48 85 db
    74 15 8b 83 08 04 00 00 65 ff 08 48 81 fb 00 f0 ff ff 76 0d eb 07
    [ 822.340033] RIP [] mpls_nh_assign_dev+0x10b/0x182
    [ 822.340033] RSP
    [ 822.340033] CR2: 00000000000003a3
    [ 822.435363] ---[ end trace 98cc65e6f6b8bf11 ]---

    After patch:
    $ip -f mpls route add 100 as 200 via inet 10.1.1.8
    RTNETLINK answers: Network is unreachable

    Signed-off-by: Roopa Prabhu
    Reported-by: David Miller
    Signed-off-by: David S. Miller

    Roopa Prabhu
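
    The two failure conventions described in this commit can be illustrated
    in userspace. This is a minimal sketch, not the kernel patch: ERR_PTR,
    PTR_ERR and IS_ERR are simplified stand-ins for the kernel helpers of
    the same names, and fake_dev, fake_lookup() and assign_dev() are
    hypothetical. For reference, RAX and RBX in the trace above hold
    0xffffffffffffff9b, which is -101 (-ENETUNREACH on Linux) and matches
    the "Network is unreachable" reply seen after the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct fake_dev { int ifindex; };	/* hypothetical device type */
struct fake_dev fake_eth0 = { 2 };

enum lookup_mode { FOUND, RETURNS_NULL, RETURNS_ERR_PTR };

/* Hypothetical lookup that can fail in either style. */
struct fake_dev *fake_lookup(enum lookup_mode mode)
{
	switch (mode) {
	case FOUND:
		return &fake_eth0;
	case RETURNS_NULL:
		return NULL;
	default:
		return ERR_PTR(-ENETUNREACH);
	}
}

/* Check for NULL first, then for an error pointer, before using dev. */
int assign_dev(enum lookup_mode mode, struct fake_dev **out)
{
	struct fake_dev *dev = fake_lookup(mode);

	assert(out != NULL);
	if (!dev)
		return -ENODEV;
	if (IS_ERR(dev))
		return (int)PTR_ERR(dev);
	*out = dev;
	return 0;
}
```

    Checking only for NULL would let the ERR_PTR value through as if it
    were a valid device, which is the situation the commit describes.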
     

08 Apr, 2016

4 commits

  • When sending a UDPv6 message longer than MTU, account for the length
    of fragmentable IPv6 extension headers in skb->network_header offset.
    Same as we do in alloc_new_skb path in __ip6_append_data().

    This ensures that later on __ip6_make_skb() will make space in
    headroom for fragmentable extension headers:

    /* move skb->data to ip header from ext header */
    if (skb->data < skb_network_header(skb))
            __skb_pull(skb, skb_network_offset(skb));

    Prevents a splat due to skb_under_panic:

    skbuff: skb_under_panic: text:ffffffff8143397b len:2126 put:14 \
    head:ffff880005bacf50 data:ffff880005bacf4a tail:0x48 end:0xc0 dev:lo
    ------------[ cut here ]------------
    kernel BUG at net/core/skbuff.c:104!
    invalid opcode: 0000 [#1] KASAN
    CPU: 0 PID: 160 Comm: reproducer Not tainted 4.6.0-rc2 #65
    [...]
    Call Trace:
    [] skb_push+0x79/0x80
    [] eth_header+0x2b/0x100
    [] neigh_resolve_output+0x210/0x310
    [] ip6_finish_output2+0x4a7/0x7c0
    [] ip6_output+0x16a/0x280
    [] ip6_local_out+0xb1/0xf0
    [] ip6_send_skb+0x45/0xd0
    [] udp_v6_send_skb+0x246/0x5d0
    [] udpv6_sendmsg+0xa6e/0x1090
    [...]

    Reported-by: Ji Jianwen
    Signed-off-by: Jakub Sitnicki
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Jakub Sitnicki
     
    When PAGE_SIZE > 4k, a single page can contain 2 RDS fragments. If
    'rds_ib_cong_recv' ignores the RDS fragment offset into the page, it
    reads the wrong data fragment as the far congestion map update, which
    corrupts the far congestion map of the RDS connection.

    Signed-off-by: Shamir Rabinovitch
    Signed-off-by: David S. Miller

    shamir rabinovitch
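
    The effect of ignoring the fragment offset can be sketched in userspace.
    Everything below is hypothetical and is not the RDS code: FRAG_SIZE is
    an assumed 4k fragment size, BIG_PAGE models a page twice that size,
    and struct frag and both accessors are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAG_SIZE 4096u			/* assumed fragment size */
#define BIG_PAGE  (2u * FRAG_SIZE)	/* hypothetical page size > 4k */

/* One receive fragment: a page plus the offset of its data in that page. */
struct frag {
	const unsigned char *page;
	size_t offset;
};

/* Wrong: ignores the offset, so it always reads the page's first fragment. */
const unsigned char *frag_data_buggy(const struct frag *f)
{
	return f->page;
}

/* Right: honours the fragment's offset into the page. */
const unsigned char *frag_data_fixed(const struct frag *f)
{
	assert(f->offset + FRAG_SIZE <= BIG_PAGE);
	return f->page + f->offset;
}
```

    With 4k pages the offset is always zero, so both accessors agree; the
    difference only shows up once a page holds more than one fragment.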
     
  • Fix issue in 'rds_ib_cong_recv' when accessing unaligned memory
    allocated by 'rds_page_remainder_alloc' using uint64_t pointer.

    Signed-off-by: Shamir Rabinovitch
    Signed-off-by: David S. Miller

    shamir rabinovitch
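
    One common way to keep such allocations safe for 64-bit accesses is to
    round every piece up to a multiple of 8 bytes. The sketch below shows
    that idea with a hypothetical bump allocator; struct remainder,
    remainder_alloc() and ALIGN_UP are illustrative names, not the RDS
    implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round x up to a multiple of a, where a is a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/* Hypothetical bump allocator carving pieces out of one buffer. */
struct remainder {
	unsigned char *base;	/* must itself be 8-byte aligned */
	size_t offset;
	size_t size;
};

/* Returns an 8-byte aligned pointer, or NULL when the buffer is exhausted. */
void *remainder_alloc(struct remainder *rem, size_t bytes)
{
	void *p;

	assert(((uintptr_t)rem->base & 7) == 0);
	if (rem->offset + bytes > rem->size)
		return NULL;
	p = rem->base + rem->offset;
	rem->offset += ALIGN_UP(bytes, 8);	/* keep the next piece aligned */
	return p;
}
```

    Because the offset only ever advances by multiples of 8, every pointer
    handed out can be dereferenced through a uint64_t pointer.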
     
  • This patch fixes an issue I found in which we were dropping frames if we
    had enabled checksums on GRE headers that were encapsulated by either FOU
    or GUE. Without this patch I was barely able to get 1 Gb/s of throughput.
    With this patch applied I am now at least getting around 6 Gb/s.

    The issue is due to the fact that with FOU or GUE applied we do not provide
    a transport offset pointing to the GRE header, nor do we offload it in
    software as the GRE header is completely skipped by GSO and treated like a
    VXLAN or GENEVE type header. As such we need to prevent the stack from
    generating it and also prevent GRE from generating it via any interface we
    create.

    Fixes: c3483384ee511 ("gro: Allow tunnel stacking in the case of FOU/GUE")
    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
     

07 Apr, 2016

4 commits


06 Apr, 2016

4 commits


05 Apr, 2016

16 commits

  • Since we enqueued the frame that was supposed to be sent
    during the SP, and that frame may very well carry the
    IEEE80211_TX_STATUS_EOSP bit, we may never close the SP
    (WLAN_STA_SP will never be cleared). If that happens, we
    will not open any new SP and will never respond to any poll
    frame from the client.
    Clear WLAN_STA_SP manually if a frame that was polled during
    the SP is queued because of a starting A-MPDU session. The
    client may not see the EOSP bit, but it will at least be
    able to poll new frames in another SP.

    Reported-by: Alesya Shapira
    Signed-off-by: Emmanuel Grumbach
    [remove erroneous comment]
    Signed-off-by: Johannes Berg

    Emmanuel Grumbach
     
  • It is possible that the station is connected to an AP
    with bandwidth of 80+80MHz or 160MHz. In such cases
    there is no need to perform an upgrade as the maximal
    supported bandwidth is 80MHz.

    In addition, when upgrading and setting center_freq1
    and bandwidth to 80MHz also set center_freq2 to 0.

    Fixes: 0fabfaafec3a ("mac80211: upgrade BW of TDLS peers when possible")
    Signed-off-by: Ilan Peer
    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Johannes Berg

    Ilan Peer
     
  • Frames that are sent between
    ampdu_action(IEEE80211_AMPDU_TX_START) and the move to the
    HT_AGG_STATE_OPERATIONAL state are buffered.
    If we try to start an A-MPDU session while the peer is
    sleeping and polling frames with U-APSD, we may have frames
    that will be buffered by ieee80211_tx_prep_agg. These frames
    have IEEE80211_TX_CTL_NO_PS_BUFFER set since they are sent to
    a sleeping client and possibly IEEE80211_TX_STATUS_EOSP.
    If the frame is buffered, we need to clear these two flags
    since they will be re-sent after the move to
    HT_AGG_STATE_OPERATIONAL state which is very likely to
    happen after the SP ends.

    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Johannes Berg

    Emmanuel Grumbach
     
  • Commit 976bd9efdae6 ("mac80211: move beacon_loss_count into ifmgd")
    removed the member from the sta_info struct but the description stayed
    lingering. Remove it.

    Signed-off-by: Luis de Bethencourt
    Signed-off-by: Johannes Berg

    Luis de Bethencourt
     
  • By default, the rhashtable logic will fail to insert
    objects if the key-chains are too long and un-balanced.

    In the degenerate case where mac80211 is creating many
    virtual interfaces connected to the same peer(s), this
    case can happen.

    Set insecure_elasticity to true to allow chains to grow
    as long as needed.

    Signed-off-by: Ben Greear
    [remove message, change commit message slightly]
    Signed-off-by: Johannes Berg

    Ben Greear
     
  • The original hand-implemented hash-table in mac80211 couldn't result
    in insertion errors, and while converting to rhashtable I evidently
    forgot to check the errors.

    This surfaced now only because Ben is adding many identical keys and
    that resulted in hidden insertion errors.

    Cc: stable@vger.kernel.org
    Fixes: 7bedd0cfad4e1 ("mac80211: use rhashtable for station table")
    Reported-by: Ben Greear
    Signed-off-by: Johannes Berg

    Johannes Berg
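
    The interaction between a chain-length limit and unchecked insertion
    errors can be sketched with a toy hash table. This is not rhashtable:
    NBUCKETS, MAX_CHAIN, the -EBUSY return value and the unlimited_chains
    flag (which only plays the role of insecure_elasticity) are assumptions
    made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NBUCKETS  4	/* hypothetical, tiny on purpose */
#define MAX_CHAIN 2	/* hypothetical chain-length limit */

struct node { unsigned key; struct node *next; };

struct table {
	struct node *bucket[NBUCKETS];
	bool unlimited_chains;	/* plays the role of insecure_elasticity */
};

/* Returns 0 on success or -EBUSY when the target chain is already full. */
int table_insert(struct table *t, struct node *n)
{
	struct node **head;
	size_t len = 0;

	assert(t != NULL && n != NULL);
	head = &t->bucket[n->key % NBUCKETS];
	for (struct node *p = *head; p; p = p->next)
		len++;
	if (!t->unlimited_chains && len >= MAX_CHAIN)
		return -EBUSY;
	n->next = *head;
	*head = n;
	return 0;
}
```

    A caller that ignores the return value would believe the third
    identical key was stored when it was not, which is the kind of hidden
    insertion error the commit message describes.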
     
  • The min_def chanctx is affected not only by the current chandef, but
    sometimes also by other stations on the vif. There's a valid scenario
    where a TDLS peer can widen its BW, thereby causing the min_def
    to increase.

    Signed-off-by: Arik Nemtsov
    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Johannes Berg

    Arik Nemtsov
     
  • The previous approach simply ignored chandef restrictions when calculating
    the appropriate peer BW for a WIDER_BW peer. This could result in a
    regulatory violation if both peers indicated 80MHz support, but the
    regdomain forbade it.

    Change the approach to setting a WIDER_BW peer's BW. Don't exempt it from
    the chandef width at first. If during TDLS negotiation the chandef width
    is upgraded, update the peer's BW to match.

    Fixes: 0fabfaafec3a ("mac80211: upgrade BW of TDLS peers when possible")
    Signed-off-by: Arik Nemtsov
    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Johannes Berg

    Arik Nemtsov
     
  • Even if the current chandef width is equal to the station's max-BW, it
    doesn't mean it's a valid width for TDLS. Make sure to always check
    regulatory constraints in these cases.

    Fixes: 0fabfaafec3a ("mac80211: upgrade BW of TDLS peers when possible")
    Signed-off-by: Arik Nemtsov
    Signed-off-by: Emmanuel Grumbach
    Signed-off-by: Johannes Berg

    Arik Nemtsov
     
  • Buffered multicast frames must be passed to the driver directly via
    drv_tx instead of going through the txq, otherwise they cannot easily be
    scheduled to be sent after DTIM.

    Signed-off-by: Felix Fietkau
    Signed-off-by: Johannes Berg

    Felix Fietkau
     
    This fixes the error path in br_sysfs_addbr for the case where the
    call to kobject_create_and_add fails. Assign -EINVAL to err on that
    path rather than returning zero, which makes callers think the
    function has succeeded. err is zero at that point because the
    previous assignment stored the successful return value of the call
    to sysfs_create_group, which is zero.

    Signed-off-by: Bastien Philbert
    Signed-off-by: David S. Miller

    Bastien Philbert
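
    The shape of this bug is easy to reproduce outside the kernel. In the
    sketch below, create_group() and create_object() are hypothetical
    stand-ins for the two setup steps, and setup_buggy() and setup_fixed()
    are illustrative names; none of this is the bridge code itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical first step: returns 0 on success. */
int create_group(void)
{
	return 0;
}

/* Hypothetical second step: returns NULL on failure, sets no error code. */
void *create_object(int should_fail)
{
	static int obj;

	assert(should_fail == 0 || should_fail == 1);
	return should_fail ? NULL : &obj;
}

/* Buggy shape: err still holds 0 from the earlier successful call. */
int setup_buggy(int should_fail)
{
	int err = create_group();

	if (err)
		return err;
	if (!create_object(should_fail))
		goto out;	/* err is still 0 here */
out:
	return err;
}

/* Fixed shape: set an error code before taking the failure path. */
int setup_fixed(int should_fail)
{
	int err = create_group();

	if (err)
		return err;
	if (!create_object(should_fail)) {
		err = -EINVAL;
		goto out;
	}
out:
	return err;
}
```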
     
  • pskb_may_pull() can change skb->data, so we have to load ptr/optr at the
    right place.

    Signed-off-by: Haishuang Yan
    Signed-off-by: David S. Miller

    Haishuang Yan
     
  • pskb_may_pull() can change skb->data, so we have to load ptr/optr at the
    right place.

    Signed-off-by: Haishuang Yan
    Signed-off-by: David S. Miller

    Haishuang Yan
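
    The hazard in the two commits above is a pointer cached before a call
    that may move the underlying data. The userspace sketch below models
    it with a hypothetical struct buf and buf_may_pull(), which is only a
    loose analogue of pskb_may_pull(); read_byte() shows the safe ordering.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer whose data area may move, loosely like skb->data. */
struct buf {
	unsigned char *data;
	size_t len;
};

/*
 * Hypothetical "may pull" helper: guarantees at least `need` bytes at
 * b->data and may move the data to new memory to do so.
 * Returns 1 on success and 0 on allocation failure.
 */
int buf_may_pull(struct buf *b, size_t need)
{
	unsigned char *moved;

	if (need <= b->len)
		return 1;
	moved = calloc(1, need);
	if (!moved)
		return 0;
	memcpy(moved, b->data, b->len);
	free(b->data);		/* old pointers into the buffer are now stale */
	b->data = moved;
	b->len = need;
	return 1;
}

/* Load the pointer only after the call that may move the data. */
int read_byte(struct buf *b, size_t off, unsigned char *out)
{
	const unsigned char *ptr;

	assert(b != NULL && out != NULL);
	if (!buf_may_pull(b, off + 1))
		return 0;
	ptr = b->data;		/* load here, not before buf_may_pull() */
	*out = ptr[off];
	return 1;
}
```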
     
  • Merge PAGE_CACHE_SIZE removal patches from Kirill Shutemov:
    "PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
    ago with promise that one day it will be possible to implement page
    cache with bigger chunks than PAGE_SIZE.

    This promise never materialized. And unlikely will.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The first patch with most changes has been done with coccinelle. The
    second is manual fixups on top.

    The third patch removes macros definition"

    [ I was planning to apply this just before rc2, but then I spaced out,
    so here it is right _after_ rc2 instead.

    As Kirill suggested as a possibility, I could have decided to only
    merge the first two patches, and leave the old interfaces for
    compatibility, but I'd rather get it all done and any out-of-tree
    modules and patches can trivially do the conversion while still also
    working with older kernels, so there is little reason to try to
    maintain the redundant legacy model. - Linus ]

    * PAGE_CACHE_SIZE-removal:
    mm: drop PAGE_CACHE_* and page_cache_{get,release} definition
    mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage
    mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros

    Linus Torvalds
     
  • Mostly direct substitution with occasional adjustment or removing
    outdated comments.

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
    ago with promise that one day it will be possible to implement page
    cache with bigger chunks than PAGE_SIZE.

    This promise never materialized. And unlikely will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE. And it's a constant source of confusion on whether
    PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
    especially on the border between fs and mm.

    Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straight-forward:

    - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();

    This patch contains automated changes generated with coccinelle using
    script below. For some reason, coccinelle doesn't patch header files.
    I've called spatch for them manually.

    The only adjustment after coccinelle is revert of changes to
    PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code where coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation also
    will be addressed with the separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
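
    Why the shift rules in the script are safe can be seen in a few lines
    of C. This is a sketch under the assumption stated in the commit
    message, namely that PAGE_CACHE_SIZE always equalled PAGE_SIZE; the
    value 12 for PAGE_SHIFT and the helpers old_index() and new_index()
    are illustrative only.

```c
#include <assert.h>

/* Values assumed for illustration; the point is that the shifts are equal. */
#define PAGE_SHIFT        12
#define PAGE_SIZE         (1UL << PAGE_SHIFT)
#define PAGE_CACHE_SHIFT  PAGE_SHIFT
#define PAGE_CACHE_SIZE   (1UL << PAGE_CACHE_SHIFT)

/* Old style: convert a page-cache index to a page index. */
unsigned long old_index(unsigned long pgoff)
{
	return pgoff << (PAGE_CACHE_SHIFT - PAGE_SHIFT);	/* shift by 0 */
}

/* New style after the conversion: the shift is simply dropped. */
unsigned long new_index(unsigned long pgoff)
{
	return pgoff;
}

/* Compile-time check that the two sizes are the same here. */
static_assert(PAGE_CACHE_SIZE == PAGE_SIZE, "page cache unit equals a page");
```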
     

04 Apr, 2016

1 commit

    The skcipher/shash conversion introduced a number of bugs in the
    sunrpc code:

    1) Missing calls to skcipher_request_set_tfm lead to crashes.
    2) The allocation size of shash_desc is too small which leads to
    memory corruption.

    Fixes: 3b5cf20cf439 ("sunrpc: Use skcipher and ahash/shash")
    Reported-by: J. Bruce Fields
    Tested-by: J. Bruce Fields
    Signed-off-by: Herbert Xu

    Herbert Xu
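
    Point 2 is a general pitfall with descriptors that carry
    algorithm-specific state after a fixed header. The sketch below models
    it with a flexible array member; fake_tfm, fake_desc and both helpers
    are hypothetical and do not reproduce the kernel crypto API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical transform: only records how much per-request state it needs. */
struct fake_tfm { size_t descsize; };

/* Hypothetical descriptor: fixed header followed by algorithm state. */
struct fake_desc {
	struct fake_tfm *tfm;
	unsigned char ctx[];	/* flexible array member: descsize bytes */
};

/* Bytes needed for a descriptor usable with the given transform. */
size_t fake_desc_size(const struct fake_tfm *tfm)
{
	return sizeof(struct fake_desc) + tfm->descsize;
}

/* Allocate header and state together; the header size alone is too small. */
struct fake_desc *fake_desc_alloc(struct fake_tfm *tfm)
{
	struct fake_desc *desc;

	assert(tfm != NULL);
	desc = calloc(1, fake_desc_size(tfm));
	if (desc)
		desc->tfm = tfm;
	return desc;
}
```

    Allocating only sizeof(struct fake_desc) would leave no room for ctx,
    so any write to the state would land outside the allocation.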
     

02 Apr, 2016

2 commits

  • Pull networking fixes from David Miller:

    1) Missing device reference in IPSEC input path results in crashes
    during device unregistration. From Subash Abhinov Kasiviswanathan.

    2) Per-queue ISR register writes not being done properly in macb
    driver, from Cyrille Pitchen.

    3) Stats accounting bugs in bcmgenet, from Petri Gynther.

    4) Lightweight tunnel's TTL and TOS were swapped in netlink dumps, from
    Quentin Armitage.

    5) SXGBE driver has off-by-one in probe error paths, from Rasmus
    Villemoes.

    6) Fix race in save/swap/delete options in netfilter ipset, from
    Vishwanath Pai.

    7) Ageing time of bridge not set properly when not operating over a
    switchdev device. Fix from Haishuang Yan.

    8) Fix GRO regression wrt nested FOU/GUE based tunnels, from Alexander
    Duyck.

    9) IPV6 UDP code bumps wrong stats, from Eric Dumazet.

    10) FEC driver should only access registers that actually exist on the
    given chipset, fix from Fabio Estevam.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
    net: mvneta: fix changing MTU when using per-cpu processing
    stmmac: fix MDIO settings
    Revert "stmmac: Fix 'eth0: No PHY found' regression"
    stmmac: fix TX normal DESC
    net: mvneta: use cache_line_size() to get cacheline size
    net: mvpp2: use cache_line_size() to get cacheline size
    net: mvpp2: fix maybe-uninitialized warning
    tun, bpf: fix suspicious RCU usage in tun_{attach, detach}_filter
    net: usb: cdc_ncm: adding Telit LE910 V2 mobile broadband card
    rtnl: fix msg size calculation in if_nlmsg_size()
    fec: Do not access unexisting register in Coldfire
    net: mvneta: replace MVNETA_CPU_D_CACHE_LINE_SIZE with L1_CACHE_BYTES
    net: mvpp2: replace MVPP2_CPU_D_CACHE_LINE_SIZE with L1_CACHE_BYTES
    net: dsa: mv88e6xxx: Clear the PDOWN bit on setup
    net: dsa: mv88e6xxx: Introduce _mv88e6xxx_phy_page_{read, write}
    bpf: make padding in bpf_tunnel_key explicit
    ipv6: udp: fix UDP_MIB_IGNOREDMULTI updates
    bnxt_en: Fix ethtool -a reporting.
    bnxt_en: Fix typo in bnxt_hwrm_set_pause_common().
    bnxt_en: Implement proper firmware message padding.
    ...

    Linus Torvalds
     
  • Sasha Levin reported a suspicious rcu_dereference_protected() warning
    found while fuzzing with trinity that is similar to this one:

    [ 52.765684] net/core/filter.c:2262 suspicious rcu_dereference_protected() usage!
    [ 52.765688] other info that might help us debug this:
    [ 52.765695] rcu_scheduler_active = 1, debug_locks = 1
    [ 52.765701] 1 lock held by a.out/1525:
    [ 52.765704] #0: (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x17/0x20
    [ 52.765721] stack backtrace:
    [ 52.765728] CPU: 1 PID: 1525 Comm: a.out Not tainted 4.5.0+ #264
    [...]
    [ 52.765768] Call Trace:
    [ 52.765775] [] dump_stack+0x85/0xc8
    [ 52.765784] [] lockdep_rcu_suspicious+0xd5/0x110
    [ 52.765792] [] sk_detach_filter+0x82/0x90
    [ 52.765801] [] tun_detach_filter+0x35/0x90 [tun]
    [ 52.765810] [] __tun_chr_ioctl+0x354/0x1130 [tun]
    [ 52.765818] [] ? selinux_file_ioctl+0x130/0x210
    [ 52.765827] [] tun_chr_ioctl+0x13/0x20 [tun]
    [ 52.765834] [] do_vfs_ioctl+0x96/0x690
    [ 52.765843] [] ? security_file_ioctl+0x43/0x60
    [ 52.765850] [] SyS_ioctl+0x79/0x90
    [ 52.765858] [] do_syscall_64+0x62/0x140
    [ 52.765866] [] entry_SYSCALL64_slow_path+0x25/0x25

    Same can be triggered with PROVE_RCU (+ PROVE_RCU_REPEATEDLY) enabled
    from tun_attach_filter() when user space calls ioctl(tun_fd, TUN{ATTACH,
    DETACH}FILTER, ...) for adding/removing a BPF filter on tap devices.

    Since the fix in f91ff5b9ff52 ("net: sk_{detach|attach}_filter() rcu
    fixes") sk_attach_filter()/sk_detach_filter() now dereferences the
    filter with rcu_dereference_protected(), checking whether socket lock
    is held in control path.

    Since its introduction in 994051625981 ("tun: socket filter support"),
    tap filters are managed under RTNL lock from __tun_chr_ioctl(). Thus the
    sock_owned_by_user(sk) doesn't apply in this specific case and therefore
    triggers the false positive.

    Extend the BPF API with __sk_attach_filter()/__sk_detach_filter() pair
    that is used by tap filters and pass in lockdep_rtnl_is_held() for the
    rcu_dereference_protected() checks instead.

    Reported-by: Sasha Levin
    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
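
    The idea of letting the caller state which lock protects the pointer
    can be sketched as follows. This toy model does not use RCU or lockdep:
    struct locks, the warnings counter, deref_protected() and the three
    detach helpers are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock state; the real code asks lockdep instead. */
struct locks { bool sock_owned; bool rtnl_held; };

int warnings;	/* counts simulated "suspicious usage" reports */

/* Toy checked dereference: report when the claimed protection is absent. */
void *deref_protected(void *p, bool protected_by_lock)
{
	if (!protected_by_lock)
		warnings++;
	return p;
}

/* Inner helper takes the lock condition from its caller. */
void *detach_filter_locked(void **slot, bool locked)
{
	void *old;

	assert(slot != NULL);
	old = deref_protected(*slot, locked);
	*slot = NULL;
	return old;
}

/* Socket path: protection is the socket lock. */
void *detach_filter_sock(void **slot, const struct locks *l)
{
	return detach_filter_locked(slot, l->sock_owned);
}

/* tap path: protection is RTNL, so pass that condition instead. */
void *detach_filter_tap(void **slot, const struct locks *l)
{
	return detach_filter_locked(slot, l->rtnl_held);
}
```

    Called with only RTNL held, the socket variant reports a false
    positive while the tap variant stays quiet.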
     

01 Apr, 2016

1 commit


31 Mar, 2016

4 commits

  • Make the 2 byte padding in struct bpf_tunnel_key between tunnel_ttl
    and tunnel_label members explicit. No issue has been observed, and
    gcc/llvm does padding for the old struct already, where tunnel_label
    was not yet present, so the current code works, but since it's part
    of uapi, make sure we don't introduce holes in structs.

    Therefore, add tunnel_ext that we can use generically in future
    (f.e. to flag OAM messages for backends, etc). Also add the offset
    to the compat tests to be sure should some compilers not pad the
    tail of the old version of bpf_tunnel_key.

    Fixes: 4018ab1875e0 ("bpf: support flow label for bpf_skb_{set, get}_tunnel_key")
    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Daniel Borkmann
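
    The layout argument can be checked at compile time. The two structs
    below are simplified, hypothetical versions of the key (a plain array
    replaces the address fields of the real struct), so treat this as a
    sketch of the technique rather than the uapi definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified layout with the hole left to the compiler. */
struct key_implicit {
	uint32_t tunnel_id;
	uint32_t remote_ipv6[4];
	uint8_t  tunnel_tos;
	uint8_t  tunnel_ttl;
	/* 2 bytes of compiler-inserted padding on typical ABIs */
	uint32_t tunnel_label;
};

/* Same layout with the hole given a name. */
struct key_explicit {
	uint32_t tunnel_id;
	uint32_t remote_ipv6[4];
	uint8_t  tunnel_tos;
	uint8_t  tunnel_ttl;
	uint16_t tunnel_ext;	/* names the former hole */
	uint32_t tunnel_label;
};

/* Naming the hole must not move any member or change the size. */
static_assert(offsetof(struct key_implicit, tunnel_label) ==
	      offsetof(struct key_explicit, tunnel_label), "label offset");
static_assert(sizeof(struct key_implicit) == sizeof(struct key_explicit),
	      "struct size");
static_assert(offsetof(struct key_explicit, tunnel_ext) == 22, "ext offset");
```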
     
  • IPv6 counters updates use a different macro than IPv4.

    Fixes: 36cbb2452cbaf ("udp: Increment UDP_MIB_IGNOREDMULTI for arriving unmatched multicasts")
    Signed-off-by: Eric Dumazet
    Cc: Rick Jones
    Cc: Willem de Bruijn
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • This patch should fix the issues seen with a recent fix to prevent
    tunnel-in-tunnel frames from being generated with GRO. The fix itself is
    correct for now as long as we do not add any devices that support
    NETIF_F_GSO_GRE_CSUM. When such a device is added it could have the
    potential to mess things up due to the fact that the outer transport header
    points to the outer UDP header and not the GRE header as would be expected.

    Fixes: fac8e0f579695 ("tunnels: Don't apply GRO to multiple layers of encapsulation.")
    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
     
  • Somehow my patch for commit cea8768f333e ("sctp: allow
    sctp_transmit_packet and others to use gfp") missed two important
    chunks, which are now added.

    Fixes: cea8768f333e ("sctp: allow sctp_transmit_packet and others to use gfp")
    Signed-off-by: Marcelo Ricardo Leitner
    Acked-By: Neil Horman
    Signed-off-by: David S. Miller

    Marcelo Ricardo Leitner