15 Dec, 2015

24 commits

  • [ Upstream commit 6adc5fd6a142c6e2c80574c1db0c7c17dedaa42e ]

    Proxy entries could have null pointer to net-device.

    Signed-off-by: Konstantin Khlebnikov
    Fixes: 84920c1420e2 ("net: Allow ipv6 proxies and arp proxies be shown with iproute2")
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Konstantin Khlebnikov
     
  • [ Upstream commit 45f6fad84cc305103b28d73482b344d7f5b76f39 ]

    This patch addresses multiple problems :

    UDP/RAW sendmsg() need to get a stable struct ipv6_txoptions
    while socket is not locked : Other threads can change np->opt
    concurrently. Dmitry posted a syzkaller
    (http://github.com/google/syzkaller) program demonstrating a
    use-after-free.

    Starting with TCP/DCCP lockless listeners, tcp_v6_syn_recv_sock()
    and dccp_v6_request_recv_sock() also need to use RCU protection
    to dereference np->opt once (before calling ipv6_dup_options())

    This patch adds full RCU protection to np->opt

    Reported-by: Dmitry Vyukov
    Signed-off-by: Eric Dumazet
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit fbca9d2d35c6ef1b323fae75cc9545005ba25097 ]

    During own review but also reported by Dmitry's syzkaller [1] it has been
    noticed that we trigger a heap out-of-bounds access on eBPF array maps
    when updating elements. This happens with each map whose map->value_size
    (specified during map creation time) is not multiple of 8 bytes.

    In array_map_alloc(), elem_size is round_up(attr->value_size, 8) and
    used to align array map slots for faster access. However, in function
    array_map_update_elem(), we update the element as ...

    memcpy(array->value + array->elem_size * index, value, array->elem_size);

    ... where we access 'value' out-of-bounds, since it was allocated from
    map_update_elem() from syscall side as kmalloc(map->value_size, GFP_USER)
    and later on copied through copy_from_user(value, uvalue, map->value_size).
    Thus, up to 7 bytes, we can access out-of-bounds.

    Same could happen from within an eBPF program, where in worst case we
    access beyond an eBPF program's designated stack.

    Since 1be7f75d1668 ("bpf: enable non-root eBPF programs") didn't hit an
    official release yet, it only affects privileged users.

    In case of array_map_lookup_elem(), the verifier prevents eBPF programs
    from accessing beyond map->value_size through check_map_access(). Also
    from syscall side map_lookup_elem() only copies map->value_size back to
    user, so nothing could leak.

    [1] http://github.com/google/syzkaller

    Fixes: 28fbcfa08d8e ("bpf: add array type of eBPF maps")
    Reported-by: Dmitry Vyukov
    Signed-off-by: Daniel Borkmann
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Daniel Borkmann
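
The size mismatch described in the array-map fix above comes down to plain arithmetic. The following is a minimal userspace sketch, not kernel code: `round_up8` and `overread_bytes` are hypothetical helper names that only mirror the `round_up(attr->value_size, 8)` computation and the over-long `memcpy()` length quoted in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of 8, mirroring round_up(value_size, 8). */
size_t round_up8(size_t n)
{
    return (n + 7u) & ~(size_t)7u;
}

/*
 * Number of bytes read past the end of a value_size-byte source buffer
 * when elem_size (the rounded-up size) bytes are copied out of it.
 */
size_t overread_bytes(size_t value_size)
{
    assert(value_size > 0);
    return round_up8(value_size) - value_size;
}
```

For a value_size of 12 the slot size is 16, so a copy of elem_size bytes reads 4 bytes beyond the source buffer; the worst case is 7 bytes, as the message states. Copying only value_size bytes avoids the over-read.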
     
  • [ Upstream commit 8c7188b23474cca017b3ef354c4a58456f68303a ]

    Sasha's found a NULL pointer dereference in the RDS connection code when
    sending a message to an apparently unbound socket. The problem is caused
    by the code checking if the socket is bound in rds_sendmsg(), which checks
    the rs_bound_addr field without taking a lock on the socket. This opens a
    race where rs_bound_addr is temporarily set in rds_bind() but the
    transport is not yet set, leading to a NULL pointer dereference when
    trying to dereference 'trans' in __rds_conn_create().

    Vegard wrote a reproducer for this issue, so kindly ask him to share if
    you're interested.

    I cannot reproduce the NULL pointer dereference using Vegard's reproducer
    with this patch, whereas I could without.

    Complete earlier incomplete fix to CVE-2015-6937:

    74e98eb08588 ("RDS: verify the underlying transport exists before creating a connection")

    Cc: David S. Miller
    Cc: stable@vger.kernel.org

    Reviewed-by: Vegard Nossum
    Reviewed-by: Sasha Levin
    Acked-by: Santosh Shilimkar
    Signed-off-by: Quentin Casasnovas
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Quentin Casasnovas
     
  • [ Upstream commit 264640fc2c5f4f913db5c73fa3eb1ead2c45e9d7 ]

    If a fragmented multicast packet is received on an ethernet device which
    has an active macvlan on top of it, each fragment is duplicated and
    received both on the underlying device and the macvlan. If some
    fragments for macvlan are processed before the whole packet for the
    underlying device is reassembled, the "overlapping fragments" test in
    ip6_frag_queue() discards the whole fragment queue.

    To resolve this, add device ifindex to the search key and require it to
    match reassembling multicast packets and packets to link-local
    addresses.

    Note: similar patch has been already submitted by Yoshifuji Hideaki in

    http://patchwork.ozlabs.org/patch/220979/

    but got lost and forgotten for some reason.

    Signed-off-by: Michal Kubecek
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Michal Kubeček
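
The lookup rule described in the macvlan fragment fix above can be sketched as follows. This is a simplified illustration under stated assumptions: `struct frag_key`, `is_multicast`, `is_link_local` and `frag_key_match` are hypothetical names, not the kernel's data structures or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified reassembly key; not the kernel's struct. */
struct frag_key {
    uint32_t id;        /* fragment identification */
    uint8_t saddr[16];  /* IPv6 source address */
    uint8_t daddr[16];  /* IPv6 destination address */
    int ifindex;        /* receiving device */
};

/* ff00::/8 */
bool is_multicast(const uint8_t a[16])
{
    return a[0] == 0xff;
}

/* fe80::/10 */
bool is_link_local(const uint8_t a[16])
{
    return a[0] == 0xfe && (a[1] & 0xc0) == 0x80;
}

/* Does the queue key q match the key k of an incoming fragment? */
bool frag_key_match(const struct frag_key *q, const struct frag_key *k)
{
    assert(q != NULL && k != NULL);
    if (q->id != k->id ||
        memcmp(q->saddr, k->saddr, 16) != 0 ||
        memcmp(q->daddr, k->daddr, 16) != 0)
        return false;
    /* Require the same device only for multicast and link-local targets. */
    if (is_multicast(k->daddr) || is_link_local(k->daddr))
        return q->ifindex == k->ifindex;
    return true;
}
```

With this rule, the copies of a multicast fragment seen on the underlying device and on the macvlan land in separate queues, so neither copy trips the overlapping-fragments check of the other.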
     
  • [ Upstream commit 3c25a860d17b7378822f35d8c9141db9507e3beb ]

    Commit fcb26ec5b18d ("broadcom: move all PHY_ID's to header")
    updated broadcom_tbl to use PHY_IDs, but incorrectly replaced 0x0143bca0
    with PHY_ID_BCM5482 (making a duplicate entry, and completely omitting
    the original). Fix that.

    Fixes: fcb26ec5b18d ("broadcom: move all PHY_ID's to header")
    Signed-off-by: Aaro Koskinen
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Aaro Koskinen
     
  • [ Upstream commit 4c6980462f32b4f282c5d8e5f7ea8070e2937725 ]

    Similar to ipv4, when destroying an mrt table the static mfc entries and
    the static devices are kept, which leads to devices that can never be
    destroyed (because of refcnt taken) and leaked memory. Make sure that
    everything is cleaned up on netns destruction.

    Fixes: 8229efdaef1e ("netns: ip6mr: enable namespace support in ipv6 multicast forwarding code")
    CC: Benjamin Thery
    Signed-off-by: Nikolay Aleksandrov
    Reviewed-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Nikolay Aleksandrov
     
  • [ Upstream commit 0e615e9601a15efeeb8942cf7cd4dadba0c8c5a7 ]

    When destroying an mrt table the static mfc entries and the static
    devices are kept, which leads to devices that can never be destroyed
    (because of refcnt taken) and leaked memory, for example:
    unreferenced object 0xffff880034c144c0 (size 192):
    comm "mfc-broken", pid 4777, jiffies 4320349055 (age 46001.964s)
    hex dump (first 32 bytes):
    98 53 f0 34 00 88 ff ff 98 53 f0 34 00 88 ff ff .S.4.....S.4....
    ef 0a 0a 14 01 02 03 04 00 00 00 00 01 00 00 00 ................
    backtrace:
    [] kmemleak_alloc+0x4e/0xb0
    [] kmem_cache_alloc+0x190/0x300
    [] ip_mroute_setsockopt+0x5cb/0x910
    [] do_ip_setsockopt.isra.11+0x105/0xff0
    [] ip_setsockopt+0x30/0xa0
    [] raw_setsockopt+0x33/0x90
    [] sock_common_setsockopt+0x14/0x20
    [] SyS_setsockopt+0x71/0xc0
    [] entry_SYSCALL_64_fastpath+0x16/0x7a
    [] 0xffffffffffffffff

    Make sure that everything is cleaned on netns destruction.

    Signed-off-by: Nikolay Aleksandrov
    Reviewed-by: Cong Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Nikolay Aleksandrov
     
  • [ Upstream commit 6900317f5eff0a7070c5936e5383f589e0de7a09 ]

    David and HacKurx reported the following (or a similar) size overflow,
    triggered in a grsecurity kernel thanks to PaX's gcc size overflow plugin:

    (Already fixed in later grsecurity versions by Brad and PaX Team.)

    [ 1002.296137] PAX: size overflow detected in function scm_detach_fds net/core/scm.c:314
    cicus.202_127 min, count: 4, decl: msg_controllen; num: 0; context: msghdr;
    [ 1002.296145] CPU: 0 PID: 3685 Comm: scm_rights_recv Not tainted 4.2.3-grsec+ #7
    [ 1002.296149] Hardware name: Apple Inc. MacBookAir5,1/Mac-66F35F19FE2A0D05, [...]
    [ 1002.296153] ffffffff81c27366 0000000000000000 ffffffff81c27375 ffffc90007843aa8
    [ 1002.296162] ffffffff818129ba 0000000000000000 ffffffff81c27366 ffffc90007843ad8
    [ 1002.296169] ffffffff8121f838 fffffffffffffffc fffffffffffffffc ffffc90007843e60
    [ 1002.296176] Call Trace:
    [ 1002.296190] [] dump_stack+0x45/0x57
    [ 1002.296200] [] report_size_overflow+0x38/0x60
    [ 1002.296209] [] scm_detach_fds+0x2ce/0x300
    [ 1002.296220] [] unix_stream_read_generic+0x609/0x930
    [ 1002.296228] [] unix_stream_recvmsg+0x4f/0x60
    [ 1002.296236] [] ? unix_set_peek_off+0x50/0x50
    [ 1002.296243] [] sock_recvmsg+0x47/0x60
    [ 1002.296248] [] ___sys_recvmsg+0xe2/0x1e0
    [ 1002.296257] [] __sys_recvmsg+0x46/0x80
    [ 1002.296263] [] SyS_recvmsg+0x2c/0x40
    [ 1002.296271] [] entry_SYSCALL_64_fastpath+0x12/0x85

    Further investigation showed that this can happen when an *odd* number of
    fds are being passed over AF_UNIX sockets.

    In these cases CMSG_LEN(i * sizeof(int)) and CMSG_SPACE(i * sizeof(int)),
    where i is the number of successfully passed fds, differ by 4 bytes due
    to the extra CMSG_ALIGN() padding in CMSG_SPACE() to an 8 byte boundary
    on 64 bit. The padding is used to align subsequent cmsg headers in the
    control buffer.

    When the control buffer passed in from the receiver side *lacks* these 4
    bytes (e.g. due to buggy/wrong API usage), then msg->msg_controllen will
    overflow in scm_detach_fds():

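      int cmlen = CMSG_LEN(i * sizeof(int));     /* cmlen w/o tail-padding */
      err = put_user(SOL_SOCKET, &cm->cmsg_level);
      if (!err)
        err = put_user(SCM_RIGHTS, &cm->cmsg_type);
      if (!err)
        err = put_user(cmlen, &cm->cmsg_len);
      if (!err) {
        cmlen = CMSG_SPACE(i * sizeof(int));     /* cmlen w/ 4 byte extra tail-padding */
        msg->msg_control += cmlen;
        msg->msg_controllen -= cmlen;            /* wraps around if no tail-padding space */
      }

    F.e. it will wrap to a length of 18446744073709551612 bytes in case the
    receiver passed in msg->msg_controllen of 20 bytes, and the sender
    properly transferred 1 fd to the receiver, so that its CMSG_LEN results
    in 20 bytes and CMSG_SPACE in 24 bytes.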

    In case of MSG_CMSG_COMPAT (scm_detach_fds_compat()), I haven't seen an
    issue in my tests as alignment seems always on 4 byte boundary. Same
    should be in case of native 32 bit, where we end up with 4 byte boundaries
    as well.

    In practice, passing msg->msg_controllen of 20 to recvmsg() while receiving
    a single fd would mean that on successful return, msg->msg_controllen is
    being set by the kernel to 24 bytes instead, thus more than the input
    buffer advertised. It could f.e. become an issue if such application later
    on zeroes or copies the control buffer based on the returned msg->msg_controllen
    elsewhere.

    Maximum number of fds we can send is a hard upper limit SCM_MAX_FD (253).

    Going over the code, it seems like msg->msg_controllen is not being read
    after scm_detach_fds() in scm_recv() anymore by the kernel, good!

    Relevant recvmsg() handlers are unix_dgram_recvmsg() (unix_seqpacket_recvmsg())
    and unix_stream_recvmsg(). Both return back to their recvmsg() caller,
    and ___sys_recvmsg() places the updated length, that is, new msg_control -
    old msg_control pointer into msg->msg_controllen (hence the 24 bytes seen
    in the example).

    Long time ago, Wei Yongjun fixed something related in commit 1ac70e7ad24a
    ("[NET]: Fix function put_cmsg() which may cause usr application memory
    overflow").

    RFC3542, section 20.2. says:

    The fields shown as "XX" are possible padding, between the cmsghdr
    structure and the data, and between the data and the next cmsghdr
    structure, if required by the implementation. While sending an
    application may or may not include padding at the end of last
    ancillary data in msg_controllen and implementations must accept both
    as valid. On receiving a portable application must provide space for
    padding at the end of the last ancillary data as implementations may
    copy out the padding at the end of the control message buffer and
    include it in the received msg_controllen. When recvmsg() is called
    if msg_controllen is too small for all the ancillary data items
    including any trailing padding after the last item an implementation
    may set MSG_CTRUNC.

    Since we didn't place MSG_CTRUNC for already quite a long time, just do
    the same as in 1ac70e7ad24a to avoid an overflow.

    Btw, even man-page author got this wrong :/ See db939c9b26e9 ("cmsg.3: Fix
    error in SCM_RIGHTS code sample"). Some people must have copied this (?),
    thus it got triggered in the wild (reported several times during boot by
    David and HacKurx).

    No Fixes tag this time as pre 2002 (that is, pre history tree).

    Reported-by: David Sterba
    Reported-by: HacKurx
    Cc: PaX Team
    Cc: Emese Revfy
    Cc: Brad Spengler
    Cc: Wei Yongjun
    Cc: Eric Dumazet
    Reviewed-by: Hannes Frederic Sowa
    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Daniel Borkmann
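
The wrap-around described in the SCM_RIGHTS fix above can be reproduced in userspace with the standard CMSG_LEN() and CMSG_SPACE() macros. This is a sketch of the arithmetic only: `controllen_left_unchecked` and `controllen_left_clamped` are hypothetical names, and the clamped variant follows the approach the message describes (do not consume more than the buffer advertises), not the literal kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Control-buffer space left after nfds descriptors, without a bounds check. */
size_t controllen_left_unchecked(size_t controllen, size_t nfds)
{
    assert(nfds > 0);
    size_t cmlen = CMSG_SPACE(nfds * sizeof(int));
    return controllen - cmlen;  /* wraps around if controllen < cmlen */
}

/* Same computation, with the consumed length clamped to what is available. */
size_t controllen_left_clamped(size_t controllen, size_t nfds)
{
    assert(nfds > 0);
    size_t cmlen = CMSG_SPACE(nfds * sizeof(int));
    if (controllen < cmlen)
        cmlen = controllen;
    return controllen - cmlen;
}
```

On a platform where CMSG_LEN(sizeof(int)) is 20 and CMSG_SPACE(sizeof(int)) is 24, as in the example from the message, the unchecked variant called with a 20-byte buffer and one fd returns a value 4 below zero in unsigned arithmetic, while the clamped variant returns 0. A receiver avoids the problem entirely by sizing its control buffer with CMSG_SPACE() rather than CMSG_LEN().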
     
  • [ Upstream commit 142a2e7ece8d8ac0e818eb2c91f99ca894730e2a ]

    Dmitry provided a syzkaller (http://github.com/google/syzkaller)
    generated program that triggers the WARNING at
    net/ipv4/tcp.c:1729 in tcp_recvmsg() :

    WARN_ON(tp->copied_seq != tp->rcv_nxt &&
    !(flags & (MSG_PEEK | MSG_TRUNC)));

    His program is specifically attempting a Cross SYN TCP exchange,
    that we support (for the pleasure of hackers ?), but it looks like we
    lack proper tp->copied_seq initialization.

    Thanks again Dmitry for your report and testings.

    Signed-off-by: Eric Dumazet
    Reported-by: Dmitry Vyukov
    Tested-by: Dmitry Vyukov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit 5d4c9bfbabdb1d497f21afd81501e5c54b0c85d9 ]

    tcp_send_rcvq() is used for re-injecting data into tcp receive queue.

    Problems :

    - No check against size is performed, allowing a user to fool the kernel
    into attempting very large memory allocations, eventually triggering
    OOM when memory is fragmented.

    - In case of fault during the copy we do not return correct errno.

    Let's use alloc_skb_with_frags() to cook optimal skbs.

    Fixes: 292e8d8c8538 ("tcp: Move rcvq sending to tcp_input.c")
    Fixes: c0e88ff0f256 ("tcp: Repair socket queues")
    Signed-off-by: Eric Dumazet
    Cc: Pavel Emelyanov
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit 0e45f4da5981895e885dd72fe912a3f8e32bae73 ]

    Some middle-boxes black-hole the data after the Fast Open handshake
    (https://www.ietf.org/proceedings/94/slides/slides-94-tcpm-13.pdf).
    The exact reason is unknown. The work-around is to disable Fast Open
    temporarily after multiple recurring timeouts with few or no data
    delivered in the established state.

    Signed-off-by: Yuchung Cheng
    Signed-off-by: Eric Dumazet
    Reported-by: Christoph Paasch
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Yuchung Cheng
     
  • [ Upstream commit 1b8e6a01e19f001e9f93b39c32387961c91ed3cc ]

    When a passive TCP is created, we eventually call tcp_md5_do_add()
    with sk pointing to the child. It is not owned by the user yet (we
    will add this socket into listener accept queue a bit later anyway)

    But we do own the spinlock, so amend the lockdep annotation to avoid
    following splat :

    [ 8451.090932] net/ipv4/tcp_ipv4.c:923 suspicious rcu_dereference_protected() usage!
    [ 8451.090932]
    [ 8451.090932] other info that might help us debug this:
    [ 8451.090932]
    [ 8451.090934]
    [ 8451.090934] rcu_scheduler_active = 1, debug_locks = 1
    [ 8451.090936] 3 locks held by socket_sockopt_/214795:
    [ 8451.090936] #0: (rcu_read_lock){.+.+..}, at: [] __netif_receive_skb_core+0x151/0xe90
    [ 8451.090947] #1: (rcu_read_lock){.+.+..}, at: [] ip_local_deliver_finish+0x43/0x2b0
    [ 8451.090952] #2: (slock-AF_INET){+.-...}, at: [] sk_clone_lock+0x1c5/0x500
    [ 8451.090958]
    [ 8451.090958] stack backtrace:
    [ 8451.090960] CPU: 7 PID: 214795 Comm: socket_sockopt_

    [ 8451.091215] Call Trace:
    [ 8451.091216] [] dump_stack+0x55/0x76
    [ 8451.091229] [] lockdep_rcu_suspicious+0xeb/0x110
    [ 8451.091235] [] tcp_md5_do_add+0x1bf/0x1e0
    [ 8451.091239] [] tcp_v4_syn_recv_sock+0x1f1/0x4c0
    [ 8451.091242] [] ? tcp_v4_md5_hash_skb+0x167/0x190
    [ 8451.091246] [] tcp_check_req+0x3c8/0x500
    [ 8451.091249] [] ? tcp_v4_inbound_md5_hash+0x11e/0x190
    [ 8451.091253] [] tcp_v4_rcv+0x3c0/0x9f0
    [ 8451.091256] [] ? ip_local_deliver_finish+0x43/0x2b0
    [ 8451.091260] [] ip_local_deliver_finish+0xb6/0x2b0
    [ 8451.091263] [] ? ip_local_deliver_finish+0x43/0x2b0
    [ 8451.091267] [] ip_local_deliver+0x48/0x80
    [ 8451.091270] [] ip_rcv_finish+0x160/0x700
    [ 8451.091273] [] ip_rcv+0x29e/0x3d0
    [ 8451.091277] [] __netif_receive_skb_core+0xb47/0xe90

    Fixes: a8afca0329988 ("tcp: md5: protects md5sig_info with RCU")
    Signed-off-by: Eric Dumazet
    Reported-by: Willem de Bruijn
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     
  • [ Upstream commit 68242a5a1e2edce39b069385cbafb82304eac0f1 ]

    Thomas reports
    "
    4gsystems sells two total different LTE-surfsticks under the same name.
    ..
    The newer version of XS Stick W100 is from "omega"
    ..
    Under windows the driver switches to the same ID, and uses MI03\6 for
    network and MI01\6 for modem.
    ..
    echo "1c9e 9b01" > /sys/bus/usb/drivers/qmi_wwan/new_id
    echo "1c9e 9b01" > /sys/bus/usb-serial/drivers/option1/new_id

    T: Bus=01 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=480 MxCh= 0
    D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
    P: Vendor=1c9e ProdID=9b01 Rev=02.32
    S: Manufacturer=USB Modem
    S: Product=USB Modem
    S: SerialNumber=
    C: #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA
    I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
    I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
    I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
    I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
    I: If#= 4 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage

    Now all important things are there:

    wwp0s29f7u2i3 (net), ttyUSB2 (at), cdc-wdm0 (qmi), ttyUSB1 (at)

    There is also ttyUSB0, but it is not usable, at least not for at.

    The device works well with qmi and ModemManager-NetworkManager.
    "

    Reported-by: Thomas Schäfer
    Signed-off-by: Bjørn Mork
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Bjørn Mork
     
  • [ Upstream commit 41033f029e393a64e81966cbe34d66c6cf8a2e7e ]

    The OUTMCAST stat is double incremented, getting bumped once in the mcast
    code itself, and again in the common ip output path. Remove the mcast bump,
    as it's not needed.

    Validated by the reporter, with good results

    Signed-off-by: Neil Horman
    Reported-by: Claus Jensen
    CC: Claus Jensen
    CC: David Miller
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Neil Horman
     
  • [ Upstream commit b4fe85f9c9146f60457e9512fb6055e69e6a7a65 ]

    Drivers like vxlan use the recently introduced
    udp_tunnel_xmit_skb/udp_tunnel6_xmit_skb APIs. udp_tunnel6_xmit_skb
    makes use of ip6tunnel_xmit, and ip6tunnel_xmit, after sending the
    packet, updates the struct stats using the usual
    u64_stats_update_begin/end calls on this_cpu_ptr(dev->tstats).
    udp_tunnel_xmit_skb makes use of iptunnel_xmit, which doesn't touch
    tstats, so drivers like vxlan, immediately after, call
    iptunnel_xmit_stats, which does the same thing - calls
    u64_stats_update_begin/end on this_cpu_ptr(dev->tstats).

    While vxlan is probably fine (I don't know?), calling a similar function
    from, say, an unbound workqueue, on a fully preemptable kernel causes
    real issues:

    [ 188.434537] BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u8:0/6
    [ 188.435579] caller is debug_smp_processor_id+0x17/0x20
    [ 188.435583] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.2.6 #2
    [ 188.435607] Call Trace:
    [ 188.435611] [] dump_stack+0x4f/0x7b
    [ 188.435615] [] check_preemption_disabled+0x19d/0x1c0
    [ 188.435619] [] debug_smp_processor_id+0x17/0x20

    The solution would be to protect the whole
    this_cpu_ptr(dev->tstats)/u64_stats_update_begin/end blocks with
    disabling preemption and then reenabling it.

    Signed-off-by: Jason A. Donenfeld
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jason A. Donenfeld
     
  • [ Upstream commit ed5a377d87dc4c87fb3e1f7f698cba38cd893103 ]

    SCTP auth currently does not work when a hmacid is set manually, because
    the hmacid is not stored in network order. Fix it by adding the
    conversion in sctp_auth_ep_set_hmacs().

    Even if we set the hmacid in network order from userspace, it still
    can't work, because of this condition in sctp_auth_ep_set_hmacs():

    if (id > SCTP_AUTH_HMAC_ID_MAX)
    return -EOPNOTSUPP;

    so this wasn't working before and thus it won't break compatibility.

    Fixes: 65b07e5d0d09 ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
    Signed-off-by: Xin Long
    Signed-off-by: Marcelo Ricardo Leitner
    Acked-by: Neil Horman
    Acked-by: Vlad Yasevich
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    lucien
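
The byte-order problem in the SCTP fix above can be illustrated in userspace. This is a minimal sketch, assuming a stand-in limit: `EXAMPLE_HMAC_ID_MAX` and `store_hmac_id` are hypothetical names, and the value 3 is an assumption for illustration rather than a quote from kernel headers.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for SCTP_AUTH_HMAC_ID_MAX. */
#define EXAMPLE_HMAC_ID_MAX 3

/*
 * Validate a host-order HMAC id, then convert it to network order for
 * storage. Returns 0 on success or -EOPNOTSUPP for an unknown id.
 */
int store_hmac_id(uint16_t host_id, uint16_t *stored)
{
    assert(stored != NULL);
    if (host_id > EXAMPLE_HMAC_ID_MAX)
        return -EOPNOTSUPP;
    *stored = htons(host_id);
    return 0;
}
```

The range check has to run on the host-order value. On a little-endian machine htons(1) is 0x0100, so an id that userspace converted in advance would fail the `id > MAX` test, which is why the message notes that pre-converting could never have worked.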
     
  • [ Upstream commit 5cfb4c8d05b4409c4044cb9c05b19705c1d9818b ]

    Since its introduction in commit 69e3c75f4d54 ("net: TX_RING and
    packet mmap"), TX_RING could be used from SOCK_DGRAM and SOCK_RAW
    side. When used with SOCK_DGRAM only, the size_max > dev->mtu +
    reserve check should have reserve as 0, but currently, this is
    unconditionally set (in its original form as dev->hard_header_len).

    I think this is not correct since tpacket_fill_skb() would then
    take dev->mtu and dev->hard_header_len into account for SOCK_DGRAM,
    the extra VLAN_HLEN could be possible in both cases. Presumably, the
    reserve code was copied from packet_snd(), but later on missed the
    check. Make it similar as we have it in packet_snd().

    Fixes: 69e3c75f4d54 ("net: TX_RING and packet mmap")
    Signed-off-by: Daniel Borkmann
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Daniel Borkmann
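
The size limit discussed in the TX_RING fix above can be sketched as a small function. This is an illustration under stated assumptions, not the kernel code: `tx_ring_size_max` and its parameters are hypothetical, with `frame_room` standing for the payload space available in a ring slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Largest payload accepted from a ring slot. For SOCK_RAW the caller
 * supplies the link-layer header, so its length is reserved on top of
 * the MTU; for SOCK_DGRAM the reserve stays 0.
 */
size_t tx_ring_size_max(size_t frame_room, size_t mtu,
                        size_t hard_header_len, bool is_raw)
{
    assert(mtu > 0);
    size_t reserve = is_raw ? hard_header_len : 0;
    size_t limit = mtu + reserve;
    return frame_room > limit ? limit : frame_room;
}
```

With an MTU of 1500 and a 14-byte link-layer header, a SOCK_RAW sender may submit up to 1514 bytes, while a SOCK_DGRAM sender is limited to 1500 because it does not supply the header itself.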
     
  • [ Upstream commit c72219b75fde768efccf7666342282fab7f9e4e7 ]

    In case no struct sockaddr_ll has been passed to packet
    socket's sendmsg() when doing a TX_RING flush run, then
    skb->protocol is set to po->num instead, which is the protocol
    passed via socket(2)/bind(2).

    Applications only xmitting can go the path of allocating the
    socket as socket(PF_PACKET, <mode>, 0) and do a bind(2) on the
    TX_RING with sll_protocol of 0. That way, register_prot_hook()
    is neither called on creation nor on bind time, which saves
    cycles when there's no interest in capturing anyway.

    That leaves us however with po->num 0 instead and therefore
    the TX_RING flush run sets skb->protocol to 0 as well. Eric
    reported that this leads to problems when using tools like
    trafgen over bonding device. I.e. the bonding's hash function
    could invoke the kernel's flow dissector, which depends on
    skb->protocol being properly set. In the current situation, all
    the traffic is then directed to a single slave.

    Fix it up by inferring skb->protocol from the Ethernet header
    when not set and we have ARPHRD_ETHER device type. This is only
    done in case of SOCK_RAW and where we have a dev->hard_header_len
    length. In case of ARPHRD_ETHER devices, this is guaranteed to
    cover ETH_HLEN, and therefore being accessed on the skb after
    the skb_store_bits().

    Reported-by: Eric Dumazet
    Signed-off-by: Daniel Borkmann
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Daniel Borkmann
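
Inferring the protocol from the Ethernet header, as the fix above describes, amounts to reading the two EtherType bytes that follow the destination and source addresses. A minimal userspace sketch follows; `ethertype_from_frame` and `EXAMPLE_ETH_HLEN` are hypothetical names, and unlike skb->protocol, which the kernel keeps in network byte order, this helper returns the value in host order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_ETH_HLEN 14  /* 6 bytes dst + 6 bytes src + 2 bytes EtherType */

/* Returns the EtherType in host order, or 0 if the frame is too short. */
uint16_t ethertype_from_frame(const uint8_t *frame, size_t len)
{
    assert(frame != NULL);
    if (len < EXAMPLE_ETH_HLEN)
        return 0;
    return (uint16_t)((frame[12] << 8) | frame[13]);
}
```

The length check plays the role of the guarantee mentioned in the message: the header is only inspected when at least a full Ethernet header's worth of data is present.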
     
  • [ Upstream commit 3c70c132488794e2489ab045559b0ce0afcf17de ]

    Packet sockets can be used by various net devices and are not
    really restricted to ARPHRD_ETHER device types. However, when
    currently checking for the extra 4 bytes that can be transmitted
    in VLAN case, our assumption is that we generally probe on
    ARPHRD_ETHER devices. Therefore, before looking into Ethernet
    header, check the device type first.

    This also fixes the issue where non-ARPHRD_ETHER devices could
    have no dev->hard_header_len in TX_RING SOCK_RAW case, and thus
    the check would test unfilled linear part of the skb (instead
    of non-linear).

    Fixes: 57f89bfa2140 ("network: Allow af_packet to transmit +4 bytes for VLAN packets.")
    Fixes: 52f1454f629f ("packet: allow to transmit +4 byte in TX_RING slot for VLAN case")
    Signed-off-by: Daniel Borkmann
    Acked-by: Willem de Bruijn
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Daniel Borkmann
     
  • [ Upstream commit 8fd6c80d9dd938ca338c70698533a7e304752846 ]

    We concluded that the skb_probe_transport_header() should better be
    called unconditionally. Avoiding the call into the flow dissector has
    also not really much to do with the direct xmit mode.

    While it seems that only virtio_net code makes use of GSO from non
    RX/TX ring packet socket paths, we should probe for a transport header
    nevertheless before they hit devices.

    Reference: http://thread.gmane.org/gmane.linux.network/386173/
    Signed-off-by: Daniel Borkmann
    Acked-by: Jason Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Daniel Borkmann
     
  • [ Upstream commit efdfa2f7848f64517008136fb41f53c4a1faf93a ]

    In tpacket_fill_skb() commit c1aad275b029 ("packet: set transport
    header before doing xmit") and later on 40893fd0fd4e ("net: switch
    to use skb_probe_transport_header()") was probing for a transport
    header on the skb from a ring buffer slot, but at a time, where
    the skb has _not even_ been filled with data yet. So that call into
    the flow dissector is pretty useless. Let's do it after we've set
    up the skb frags.

    Fixes: c1aad275b029 ("packet: set transport header before doing xmit")
    Reported-by: Eric Dumazet
    Signed-off-by: Daniel Borkmann
    Acked-by: Jason Wang
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Daniel Borkmann
     
  • [ Upstream commit d7475de58575c904818efa369c82e88c6648ce2e ]

    Use the local uapi headers to keep in sync with "recently" added #define's
    (e.g. SKF_AD_VLAN_TPID). Refactored CFLAGS, and bpf_asm doesn't need -I.

    Fixes: 3f356385e8a4 ("filter: bpf_asm: add minimal bpf asm tool")
    Signed-off-by: Kamal Mostafa
    Acked-by: Daniel Borkmann
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Kamal Mostafa
     
  • [ Upstream commit 7d267278a9ece963d77eefec61630223fce08c6c ]

    Rainer Weikusat writes:
    An AF_UNIX datagram socket being the client in an n:1 association with
    some server socket is only allowed to send messages to the server if the
    receive queue of this socket contains at most sk_max_ack_backlog
    datagrams. This implies that prospective writers might be forced to go
    to sleep even though none of the messages presently enqueued on the server
    receive queue were sent by them. In order to ensure that these will be
    woken up once space becomes again available, the present unix_dgram_poll
    routine does a second sock_poll_wait call with the peer_wait wait queue
    of the server socket as queue argument (unix_dgram_recvmsg does a wake
    up on this queue after a datagram was received). This is inherently
    problematic because the server socket is only guaranteed to remain alive
    for as long as the client still holds a reference to it. In case the
    connection is dissolved via connect or by the dead peer detection logic
    in unix_dgram_sendmsg, the server socket may be freed even though "the
    polling mechanism" (in particular, epoll) still has a pointer to the
    corresponding peer_wait queue. There's no way to forcibly deregister a
    wait queue with epoll.

    Based on an idea by Jason Baron, the patch below changes the code such
    that a wait_queue_t belonging to the client socket is enqueued on the
    peer_wait queue of the server whenever the peer receive queue full
    condition is detected by either a sendmsg or a poll. A wake up on the
    peer queue is then relayed to the ordinary wait queue of the client
    socket via wake function. The connection to the peer wait queue is again
    dissolved if either a wake up is about to be relayed or the client
    socket reconnects or a dead peer is detected or the client socket is
    itself closed. This enables removing the second sock_poll_wait from
    unix_dgram_poll, thus avoiding the use-after-free, while still ensuring
    that no blocked writer sleeps forever.

    Signed-off-by: Rainer Weikusat
    Fixes: ec0d215f9420 ("af_unix: fix 'poll for write'/connected DGRAM sockets")
    Reviewed-by: Jason Baron
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Rainer Weikusat
     

10 Dec, 2015

16 commits

  • Greg Kroah-Hartman
     
  • The backport of 1f770c0a09da855a2b51af6d19de97fb955eca85 ("netlink:
    Fix autobind race condition that leads to zero port ID") missed a
    goto statement, which causes netlink to break subtly.

    This was discovered by Stefan Priebe.

    Fixes: 4e2776241766 ("netlink: Fix autobind race condition that...")
    Reported-by: Stefan Priebe
    Reported-by: Philipp Hahn
    Signed-off-by: Herbert Xu
    Acked-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Herbert Xu
     
  • commit 5967c17b118a2bd1dd1d554cc4eee16233e52bec upstream.

    We should never allow to enable/disable any facilities for the guest
    when other VCPUs were already created.

    kvm_arch_vcpu_(load|put) relies on SIMD not changing during runtime.
    If somebody created and ran VCPUs and then decided to enable
    SIMD, undefined behaviour could be possible (e.g. vector save area
    not being set up).

    Acked-by: Christian Borntraeger
    Acked-by: Cornelia Huck
    Signed-off-by: David Hildenbrand
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Greg Kroah-Hartman

    David Hildenbrand
     
  • commit 9f088dba3cc267ea11ec0da318cd0175575b5f9b upstream.

    The recently introduced lnet_peer_set_alive() function uses
    get_seconds() to read the current time into a shared variable,
    but all other uses of that variable compare it to jiffies values.

    This changes the current use to jiffies as well for consistency.

    Signed-off-by: Arnd Bergmann
    Fixes: af3fa7c71bf ("staging/lustre/lnet: peer aliveness status and NI status")
    Cc: Liang Zhen
    Cc: James Simmons
    Cc: Isaac Huang
    Signed-off-by: Oleg Drokin
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     
  • commit a5964396190d0c40dd549c23848c282fffa5d1f2 upstream.

    Existing Intel xHCI controllers require a delay of 1 ms
    after setting the CMD_RESET bit in the command register before
    accessing any HC registers. This allows the HC to complete
    the reset operation and be ready for HC register access.
    Without this delay, a subsequent HC register access
    may, very rarely, result in a system hang.

    Verified CherryView / Braswell platforms go through over
    5000 warm reboot cycles (which was not possible without
    this patch), without any xHCI reset hang.

    Signed-off-by: Rajmohan Mani
    Tested-by: Joe Lawrence
    Signed-off-by: Mathias Nyman
    Signed-off-by: Greg Kroah-Hartman

    Rajmohan Mani
     
  • commit ee0c1a65cf95230d5eb3d9de94fd2ead9a428c67 upstream.

    The correct lock order is atomic_write_lock => termios_rwsem, as
    established by tty_write() => n_tty_write().

    Fixes: c274f6ef1c666 ("tty: Hold termios_rwsem for tcflow(TCIxxx)")
    Reported-and-Tested-by: Dmitry Vyukov
    Signed-off-by: Peter Hurley
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     
  • commit 6b2a3d628aa752f0ab825fc6d4d07b09e274d1c1 upstream.

    The data to audit/record is in the 'from' buffer (i.e., the input
    read buffer).

    Fixes: 72586c6061ab ("n_tty: Fix auditing support for cannonical mode")
    Cc: Miloslav Trmač
    Signed-off-by: Peter Hurley
    Acked-by: Laura Abbott
    Signed-off-by: Greg Kroah-Hartman

    Peter Hurley
     
  • commit a91e627e3f0ed820b11d86cdc04df38f65f33a70 upstream.

    One of the many faults of the QinHeng CH345 USB MIDI interface chip is
    that it does not handle received SysEx messages correctly -- every second
    event packet has a wrong code index number, which is the one from the last
    seen message, instead of 4. For example, the two messages "FE F0 01 02 03
    04 05 06 07 08 09 0A 0B 0C 0D 0E F7" result in the following event
    packets:

    correct:       CH345:
    0F FE 00 00    0F FE 00 00
    04 F0 01 02    04 F0 01 02
    04 03 04 05    0F 03 04 05
    04 06 07 08    04 06 07 08
    04 09 0A 0B    0F 09 0A 0B
    04 0C 0D 0E    04 0C 0D 0E
    05 F7 00 00    05 F7 00 00

    A class-compliant driver must interpret an event packet with CIN 15 as
    having a single data byte, so the other two bytes would be ignored. The
    message received by the host would then be missing two bytes out of six;
    in this example, "F0 01 02 03 06 07 08 09 0C 0D 0E F7".

    These corrupted SysEx event packets contain only data bytes, while the
    CH345 uses event packets with a correct CIN value only for messages with
    a status byte, so it is possible to distinguish between these two cases by
    checking for the presence of this status byte.

    (Other bugs in the CH345's input handling, such as the corruption resulting
    from running status, cannot be worked around.)
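    The status-byte check can be sketched in plain C as follows. This is a
    simplified userspace illustration, not the code from the patch: the
    function name ch345_fix_sysex_cin() and the in_sysex tracking are
    assumptions made for the example. While a SysEx message is open, a
    packet whose first data byte has no status bit but whose CIN is not a
    SysEx CIN is rewritten to CIN 4.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: repair bogus code index numbers in place.
 * buf holds 4-byte USB MIDI event packets: [cable|CIN, b1, b2, b3].
 */
static void ch345_fix_sysex_cin(uint8_t *buf, size_t len)
{
    int in_sysex = 0;

    for (size_t i = 0; i + 3 < len; i += 4) {
        uint8_t cin = buf[i] & 0x0f;

        /* Data-only packet inside a SysEx message with a non-SysEx CIN. */
        if (in_sysex && cin != 0x4 && cin != 0x6 && cin != 0x7 &&
            (buf[i + 1] & 0x80) == 0) {
            cin = 0x4;
            buf[i] = (uint8_t)((buf[i] & 0xf0) | cin);
        }

        if (cin == 0x4 && buf[i + 1] == 0xf0)
            in_sysex = 1;   /* SysEx start */
        else if (cin >= 0x5 && cin <= 0x7)
            in_sysex = 0;   /* SysEx end (CIN 5, 6 or 7) */
    }
}
```

    Applied to the CH345 column of the example above, this sketch turns the
    two "0F" packets back into "04" packets, so the buffer matches the
    "correct" column.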

    Signed-off-by: Clemens Ladisch
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Clemens Ladisch
     
  • commit 1ca8b201309d842642f221db7f02f71c0af5be2d upstream.

    The CH345 USB MIDI chip has two output ports. However, they are
    multiplexed through one pin, and the number of ports cannot be reduced
    even for hardware that implements only one connector, so for those
    devices, data sent to either port ends up on the same hardware output.
    This becomes a problem when both ports are used at the same time, as
    longer MIDI commands (such as SysEx messages) are likely to be
    interrupted by messages from the other port, and thus to get lost.

    It would not be possible for the driver to detect how many ports the
    device actually has, except that in practice, _all_ devices built with
    the CH345 have only one port. So we can just ignore the device's
    descriptors, and hardcode one output port.

    Signed-off-by: Clemens Ladisch
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Clemens Ladisch
     
  • commit 98d362becb6621bebdda7ed0eac7ad7ec6c37898 upstream.

    Signed-off-by: Clemens Ladisch
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Clemens Ladisch
     
  • commit 638148e20c7f8f6e95017fdc13bce8549a6925e0 upstream.

    Thomas reports
    "
    4gsystems sells two total different LTE-surfsticks under the same name.
    ..
    The newer version of XS Stick W100 is from "omega"
    ..
    Under windows the driver switches to the same ID, and uses MI03\6 for
    network and MI01\6 for modem.
    ..
    echo "1c9e 9b01" > /sys/bus/usb/drivers/qmi_wwan/new_id
    echo "1c9e 9b01" > /sys/bus/usb-serial/drivers/option1/new_id

    T: Bus=01 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=480 MxCh= 0
    D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
    P: Vendor=1c9e ProdID=9b01 Rev=02.32
    S: Manufacturer=USB Modem
    S: Product=USB Modem
    S: SerialNumber=
    C: #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA
    I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
    I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
    I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
    I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
    I: If#= 4 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage

    Now all important things are there:

    wwp0s29f7u2i3 (net), ttyUSB2 (at), cdc-wdm0 (qmi), ttyUSB1 (at)

    There is also ttyUSB0, but it is not usable, at least not for at.

    The device works well with qmi and ModemManager-NetworkManager.
    "

    Reported-by: Thomas Schäfer
    Signed-off-by: Bjørn Mork
    Signed-off-by: Greg Kroah-Hartman

    Bjørn Mork
     
  • commit e07af133c3e2716db25e3e1e1d9f10c2088e9c1a upstream.

    Also known as Verizon U620L.

    The device is modeswitched from 1410:9020 to 1410:9022 by selecting the
    4th USB configuration:

    $ sudo usb_modeswitch -v 0x1410 -p 0x9020 -u 4

    This configuration provides an ECM interface as well as TTYs ('Enterprise
    Mode' according to the U620 Linux integration guide).

    Signed-off-by: Aleksander Morgado
    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

    Aleksander Morgado
     
  • commit 1bcb49e663f88bccee35b8688e6a3da2bea31fd4 upstream.

    The Honeywell HGI80 is a wireless interface to the evohome connected
    thermostat. It uses a TI 3410 USB-serial port.

    Signed-off-by: David Woodhouse
    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

    David Woodhouse
     
  • commit 705e63d2b29c8bbf091119084544d353bda70393 upstream.

    There is a bit of a mess in the order of arguments to the ulpi write
    callback. There is

    int ulpi_write(struct ulpi *ulpi, u8 addr, u8 val)

    in drivers/usb/common/ulpi.c;

    struct usb_phy_io_ops {
            ...
            int (*write)(struct usb_phy *x, u32 val, u32 reg);
    };

    in include/linux/usb/phy.h.

    The callback registered by the musb driver has to comply with the latter,
    but up to now had "offset" first, which effectively broke the function
    for callers using the correct order. So flip the order and, while at it,
    also switch to the parameter names of struct usb_phy_io_ops's write.
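    The effect of the flipped argument order can be shown with a small
    userspace model. This is a sketch only: struct fake_phy, struct
    fake_phy_io_ops and both callbacks are hypothetical stand-ins, not the
    kernel types. Only the (val, reg) order of the write hook follows the
    declaration quoted above.

```c
#include <stdint.h>

/* Pretend register file standing in for a ULPI PHY. */
struct fake_phy {
    uint8_t regs[256];
};

/* Same argument order as usb_phy_io_ops.write: value first, register second. */
struct fake_phy_io_ops {
    int (*write)(struct fake_phy *x, uint32_t val, uint32_t reg);
};

/* Callback that matches the ops signature. */
static int fake_ulpi_write(struct fake_phy *x, uint32_t val, uint32_t reg)
{
    if (reg >= sizeof(x->regs))
        return -1;
    x->regs[reg] = (uint8_t)val;
    return 0;
}

/* Callback with the register first, modelling the bug described above. */
static int fake_ulpi_write_flipped(struct fake_phy *x, uint32_t reg, uint32_t val)
{
    if (reg >= sizeof(x->regs))
        return -1;
    x->regs[reg] = (uint8_t)val;
    return 0;
}

static const struct fake_phy_io_ops good_ops = { .write = fake_ulpi_write };
static const struct fake_phy_io_ops flipped_ops = { .write = fake_ulpi_write_flipped };
```

    Both callbacks have the same C type, so the compiler cannot catch the
    mismatch. A caller that passes (val, reg) through flipped_ops ends up
    writing the register number into the register selected by the value.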

    Fixes: ffb865b1e460 ("usb: musb: add ulpi access operations")
    Signed-off-by: Uwe Kleine-König
    Signed-off-by: Felipe Balbi
    Signed-off-by: Greg Kroah-Hartman

    Uwe Kleine-König
     
  • commit 59536da34513c594af2a6fd35ba65ea45b6960a1 upstream.

    The DEVICE_HWI type was added under the faulty assumption that Huawei
    devices based on Qualcomm chipsets and firmware use the static USB
    interface numbering known from Gobi devices. But this model does
    not apply to Huawei devices like the HP branded lt4112 (Huawei me906e).
    Huawei firmwares will dynamically assign interface numbers. Functions
    are renumbered when the firmware is reconfigured.

    Fix by changing the DEVICE_HWI type to use a simplified version
    of Huawei's subclass + protocol scheme: Blacklisting known network
    interface combinations and assuming the rest are serial.

    Reported-and-tested-by: Muri Nicanor
    Tested-by: Martin Hauke
    Fixes: e7181d005e84 ("USB: qcserial: Add support for HP lt4112 LTE/HSPA+ Gobi 4G Modem")
    Signed-off-by: Bjørn Mork
    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

    Bjørn Mork
     
  • commit 9d5b5ed796d7afd7e8d2ac4b4fb77c6a49463f4b upstream.

    It seems like this device has the same vendor and product IDs as G2K
    devices, but it has a different number of interfaces (4 vs 5) and also
    a different interface layout, which makes it currently unusable:

    usbcore: registered new interface driver qcserial
    usbserial: USB Serial support registered for Qualcomm USB modem
    usb 2-1.2: unknown number of interfaces: 5

    lsusb output:

    Bus 002 Device 003: ID 05c6:9215 Qualcomm, Inc. Acer Gobi 2000 Wireless
    Device Descriptor:
    bLength 18
    bDescriptorType 1
    bcdUSB 2.00
    bDeviceClass 0 (Defined at Interface level)
    bDeviceSubClass 0
    bDeviceProtocol 0
    bMaxPacketSize0 64
    idVendor 0x05c6 Qualcomm, Inc.
    idProduct 0x9215 Acer Gobi 2000 Wireless Modem
    bcdDevice 2.32
    iManufacturer 1 Quectel
    iProduct 2 Quectel LTE Module
    iSerial 0
    bNumConfigurations 1
    Configuration Descriptor:
    bLength 9
    bDescriptorType 2
    wTotalLength 209
    bNumInterfaces 5
    bConfigurationValue 1
    iConfiguration 0
    bmAttributes 0xa0
    (Bus Powered)
    Remote Wakeup
    MaxPower 500mA

    Signed-off-by: Petr Štetiar
    [johan: rename define and add comment ]
    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

    Petr Štetiar