29 Feb, 2020

1 commit

  • commit 963485d436ccc2810177a7b08af22336ec2af67b upstream.

    rxrpc_rcu_destroy_call(), which is called as an RCU callback to clean up a
    put call, calls rxrpc_put_connection() which, deep in its bowels, takes a
    number of spinlocks in a non-BH-safe way, including rxrpc_conn_id_lock and
    local->client_conns_lock. RCU callbacks, however, are normally called from
    softirq context, which can cause lockdep to notice the locking
    inconsistency.

    To get lockdep to detect this, it's necessary to have the connection
    cleaned up on the put at the end of the last of its calls, though normally
    the clean up is deferred. This can be induced, however, by starting a call
    on an AF_RXRPC socket and then closing the socket without reading the
    reply.

    Fix this by having rxrpc_rcu_destroy_call() punt the destruction to a
    workqueue if in softirq-mode and defer the destruction to process context.

    Note that another way to fix this could be to add a bunch of bh-disable
    annotations to the spinlocks concerned - and there might be more than just
    those two - but that means spending more time with BHs disabled.

    Note also that some of these places were covered by bh-disable spinlocks
    belonging to the rxrpc_transport object, but these got removed without the
    _bh annotation being retained on the next lock in.

    Fixes: 999b69f89241 ("rxrpc: Kill the client connection bundle concept")
    Reported-by: syzbot+d82f3ac8d87e7ccbb2c9@syzkaller.appspotmail.com
    Reported-by: syzbot+3f1fd6b8cbf8702d134e@syzkaller.appspotmail.com
    Signed-off-by: David Howells
    cc: Hillf Danton
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David Howells
     

11 Feb, 2020

5 commits

  • [ Upstream commit b39a934ec72fa2b5a74123891f25273a38378b90 ]

    The recent patch that substituted a flag on an rxrpc_call for the
    connection pointer being NULL as an indication that a call was disconnected
    puts the set_bit in the wrong place for service calls. This is only a
    problem if a call is implicitly terminated by a new call coming in on the
    same connection channel instead of a terminating ACK packet.

    In such a case, rxrpc_input_implicit_end_call() calls
    __rxrpc_disconnect_call(), which is now (incorrectly) setting the
    disconnection bit, meaning that when rxrpc_release_call() is later called,
    it doesn't call rxrpc_disconnect_call() and so the call isn't removed from
    the peer's error distribution list and the list gets corrupted.

    KASAN finds the issue as an access after release on a call, but the
    position at which it occurs is confusing as it appears to be related to a
    different call (the call site is where the latter call is being removed
    from the error distribution list and either the next or pprev pointer
    points to a previously released call).

    Fix this by moving the setting of the flag from __rxrpc_disconnect_call()
    to rxrpc_disconnect_call() in the same place that the connection pointer
    was being cleared.

    Fixes: 5273a191dca6 ("rxrpc: Fix NULL pointer deref due to call->conn being cleared on disconnect")
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    David Howells
     
  • [ Upstream commit 5273a191dca65a675dc0bcf3909e59c6933e2831 ]

    When a call is disconnected, the connection pointer from the call is
    cleared to make sure it isn't used again and to prevent further attempted
    transmission for the call. Unfortunately, there might be a daemon trying
    to use it at the same time to transmit a packet.

    Fix this by keeping call->conn set, but setting a flag on the call to
    indicate disconnection instead.

    Remove also the bits in the transmission functions where the conn pointer is
    checked and a ref taken under spinlock as this is now redundant.

    Fixes: 8d94aa381dab ("rxrpc: Calls shouldn't hold socket refs")
    Signed-off-by: David Howells
    Signed-off-by: Greg Kroah-Hartman

    David Howells
     
  • [ Upstream commit 04d36d748fac349b068ef621611f454010054c58 ]

    The introduction of a split between the reference count on rxrpc_local
    objects and the usage count didn't quite go far enough. A number of kernel
    work items need to make use of the socket to perform transmission. These
    also need to get an active count on the local object to prevent the socket
    from being closed.

    Fix this by getting the active count in those places.

    Also split out the raw active count get/put functions as these places tend
    to hold refs on the rxrpc_local object already, so getting and putting an
    extra object ref is just a waste of time.

    The problem can lead to symptoms like:

    BUG: kernel NULL pointer dereference, address: 0000000000000018
    ..
    CPU: 2 PID: 818 Comm: kworker/u9:0 Not tainted 5.5.0-fscache+ #51
    ...
    RIP: 0010:selinux_socket_sendmsg+0x5/0x13
    ...
    Call Trace:
    security_socket_sendmsg+0x2c/0x3e
    sock_sendmsg+0x1a/0x46
    rxrpc_send_keepalive+0x131/0x1ae
    rxrpc_peer_keepalive_worker+0x219/0x34b
    process_one_work+0x18e/0x271
    worker_thread+0x1a3/0x247
    kthread+0xe6/0xeb
    ret_from_fork+0x1f/0x30

    Fixes: 730c5fd42c1e ("rxrpc: Fix local endpoint refcounting")
    Signed-off-by: David Howells
    Signed-off-by: Greg Kroah-Hartman

    David Howells
     
  • [ Upstream commit f71dbf2fb28489a79bde0dca1c8adfb9cdb20a6b ]

    In rxrpc_input_data(), rxrpc_notify_socket() is called if the base sequence
    number of the packet is immediately following the hard-ack point at the end
    of the function. However, this isn't sufficient, since the recvmsg side
    may have been advancing the window and then overrun the position in which
    we're adding - at which point rx_hard_ack >= seq0 and no notification is
    generated.

    Fix this by always generating a notification at the end of the input
    function.

    Without this, a long call may stall, possibly indefinitely.

    Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
    Signed-off-by: David Howells
    Signed-off-by: Greg Kroah-Hartman

    David Howells
     
  • [ Upstream commit fac20b9e738523fc884ee3ea5be360a321cd8bad ]

    Fix rxrpc_put_local() to not access local->debug_id after calling
    atomic_dec_return() as, unless that returned n==0, we no longer have the
    right to access the object.

    Fixes: 06d9532fa6b3 ("rxrpc: Fix read-after-free in rxrpc_queue_local()")
    Signed-off-by: David Howells
    Signed-off-by: Greg Kroah-Hartman

    David Howells
     

01 Feb, 2020

1 commit

  • [ Upstream commit 122d74fac84204b9a98263636f6f9a3b2e665639 ]

    The subpacket scanning loop in rxrpc_receive_data() references the
    subpacket count in the private data part of the sk_buff in the loop
    termination condition. However, when the final subpacket is pasted into
    the ring buffer, the function is no longer has a ref on the sk_buff and
    should not be looking at sp->* any more. This point is actually marked in
    the code when skb is cleared (but sp is not - which is an error).

    Fix this by caching sp->nr_subpackets in a local variable and using that
    instead.

    Also clear 'sp' to catch accesses after that point.

    This can show up as an oops in rxrpc_get_skb() if sp->nr_subpackets gets
    trashed by the sk_buff getting freed and reused in the meantime.

    Fixes: e2de6c404898 ("rxrpc: Use info in skbuff instead of reparsing a jumbo packet")
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David Howells
     

18 Jan, 2020

3 commits

  • [ Upstream commit 063c60d39180cec7c9317f5acfc3071f8fecd705 ]

    Fix rxrpc_new_incoming_call() to check that we have a suitable service key
    available for the combination of service ID and security class of a new
    incoming call - and to reject calls for which we don't.

    This causes an assertion like the following to appear:

    rxrpc: Assertion failed - 6(0x6) == 12(0xc) is false
    kernel BUG at net/rxrpc/call_object.c:456!

    Where call->state is RXRPC_CALL_SERVER_SECURING (6) rather than
    RXRPC_CALL_COMPLETE (12).

    Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
    Reported-by: Marc Dionne
    Signed-off-by: David Howells
    Signed-off-by: Sasha Levin

    David Howells
     
  • [ Upstream commit 13b7955a0252e15265386b229b814152f109b234 ]

    Standard kernel mutexes cannot be used in any way from interrupt or softirq
    context, so the user_mutex which manages access to a call cannot be a mutex
    since on a new call the mutex must start off locked and be unlocked within
    the softirq handler to prevent userspace interfering with a call we're
    setting up.

    Commit a0855d24fc22d49cdc25664fb224caee16998683 ("locking/mutex: Complain
    upon mutex API misuse in IRQ contexts") causes big warnings to be splashed
    in dmesg for each a new call that comes in from the server. Whilst it
    *seems* like it should be okay, since the accept path uses trylock, there
    are issues with PI boosting and marking the wrong task as the owner.

    Fix this by not taking the mutex in the softirq path at all. It's not
    obvious that there should be any need for it as the state is set before the
    first notification is generated for the new call.

    There's also no particular reason why the link-assessing ping should be
    triggered inside the mutex. It's not actually transmitted there anyway,
    but rather it has to be deferred to a workqueue.

    Further, I don't think that there's any particular reason that the socket
    notification needs to be done from within rx->incoming_lock, so the amount
    of time that lock is held can be shortened too and the ping prepared before
    the new call notification is sent.

    Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg")
    Signed-off-by: David Howells
    cc: Peter Zijlstra (Intel)
    cc: Ingo Molnar
    cc: Will Deacon
    cc: Davidlohr Bueso
    Signed-off-by: Sasha Levin

    David Howells
     
  • [ Upstream commit f33121cbe91973a08e68e4bde8c3f7e6e4e351c1 ]

    Move the unlock and the ping transmission for a new incoming call into
    rxrpc_new_incoming_call() rather than doing it in the caller. This makes
    it clearer to see what's going on.

    Suggested-by: Peter Zijlstra
    Signed-off-by: David Howells
    Acked-by: Peter Zijlstra (Intel)
    cc: Ingo Molnar
    cc: Will Deacon
    cc: Davidlohr Bueso
    Signed-off-by: Sasha Levin

    David Howells
     

01 Nov, 2019

1 commit

  • When rxrpc_recvmsg_data() sets the return value to 1 because it's drained
    all the data for the last packet, it checks the last-packet flag on the
    whole packet - but this is wrong, since the last-packet flag is only set on
    the final subpacket of the last jumbo packet. This means that a call that
    receives its last packet in a jumbo packet won't complete properly.

    Fix this by having rxrpc_locate_data() determine the last-packet state of
    the subpacket it's looking at and passing that back to the caller rather
    than having the caller look in the packet header. The caller then needs to
    cache this in the rxrpc_call struct as rxrpc_locate_data() isn't then
    called again for this packet.

    Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
    Fixes: e2de6c404898 ("rxrpc: Use info in skbuff instead of reparsing a jumbo packet")
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

17 Oct, 2019

1 commit

  • We need to extend the rcu_read_lock() section in rxrpc_error_report()
    and use rcu_dereference_sk_user_data() instead of plain access
    to sk->sk_user_data to make sure all rules are respected.

    The compiler wont reload sk->sk_user_data at will, and RCU rules
    prevent memory beeing freed too soon.

    Fixes: f0308fb07080 ("rxrpc: Fix possible NULL pointer access in ICMP handling")
    Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
    Signed-off-by: Eric Dumazet
    Cc: David Howells
    Signed-off-by: David S. Miller

    Eric Dumazet
     

12 Oct, 2019

1 commit

  • If an ICMP packet comes in on the UDP socket backing an AF_RXRPC socket as
    the UDP socket is being shut down, rxrpc_error_report() may get called to
    deal with it after sk_user_data on the UDP socket has been cleared, leading
    to a NULL pointer access when this local endpoint record gets accessed.

    Fix this by just returning immediately if sk_user_data was NULL.

    The oops looks like the following:

    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    ...
    RIP: 0010:rxrpc_error_report+0x1bd/0x6a9
    ...
    Call Trace:
    ? sock_queue_err_skb+0xbd/0xde
    ? __udp4_lib_err+0x313/0x34d
    __udp4_lib_err+0x313/0x34d
    icmp_unreach+0x1ee/0x207
    icmp_rcv+0x25b/0x28f
    ip_protocol_deliver_rcu+0x95/0x10e
    ip_local_deliver+0xe9/0x148
    __netif_receive_skb_one_core+0x52/0x6e
    process_backlog+0xdc/0x177
    net_rx_action+0xf9/0x270
    __do_softirq+0x1b6/0x39a
    ? smpboot_register_percpu_thread+0xce/0xce
    run_ksoftirqd+0x1d/0x42
    smpboot_thread_fn+0x19e/0x1b3
    kthread+0xf1/0xf6
    ? kthread_delayed_work_timer_fn+0x83/0x83
    ret_from_fork+0x24/0x30

    Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
    Reported-by: syzbot+611164843bd48cc2190c@syzkaller.appspotmail.com
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

07 Oct, 2019

6 commits

  • Fix the cleanup of the crypto state on a call after the call has been
    disconnected. As the call has been disconnected, its connection ref has
    been discarded and so we can't go through that to get to the security ops
    table.

    Fix this by caching the security ops pointer in the rxrpc_call struct and
    using that when freeing the call security state. Also use this in other
    places we're dealing with call-specific security.

    The symptoms look like:

    BUG: KASAN: use-after-free in rxrpc_release_call+0xb2d/0xb60
    net/rxrpc/call_object.c:481
    Read of size 8 at addr ffff888062ffeb50 by task syz-executor.5/4764

    Fixes: 1db88c534371 ("rxrpc: Fix -Wframe-larger-than= warnings from on-stack crypto")
    Reported-by: syzbot+eed305768ece6682bb7f@syzkaller.appspotmail.com
    Signed-off-by: David Howells

    David Howells
     
  • The rxrpc_peer record needs to hold a reference on the rxrpc_local record
    it points as the peer is used as a base to access information in the
    rxrpc_local record.

    This can cause problems in __rxrpc_put_peer(), where we need the network
    namespace pointer, and in rxrpc_send_keepalive(), where we need to access
    the UDP socket, leading to symptoms like:

    BUG: KASAN: use-after-free in __rxrpc_put_peer net/rxrpc/peer_object.c:411
    [inline]
    BUG: KASAN: use-after-free in rxrpc_put_peer+0x685/0x6a0
    net/rxrpc/peer_object.c:435
    Read of size 8 at addr ffff888097ec0058 by task syz-executor823/24216

    Fix this by taking a ref on the local record for the peer record.

    Fixes: ace45bec6d77 ("rxrpc: Fix firewall route keepalive")
    Fixes: 2baec2c3f854 ("rxrpc: Support network namespacing")
    Reported-by: syzbot+b9be979c55f2bea8ed30@syzkaller.appspotmail.com
    Signed-off-by: David Howells

    David Howells
     
  • rxrpc_put_call() calls trace_rxrpc_call() after it has done the decrement
    of the refcount - which looks at the debug_id in the call record. But
    unless the refcount was reduced to zero, we no longer have the right to
    look in the record and, indeed, it may be deleted by some other thread.

    Fix this by getting the debug_id out before decrementing the refcount and
    then passing that into the tracepoint.

    Fixes: e34d4234b0b7 ("rxrpc: Trace rxrpc_call usage")
    Signed-off-by: David Howells

    David Howells
     
  • rxrpc_put_*conn() calls trace_rxrpc_conn() after they have done the
    decrement of the refcount - which looks at the debug_id in the connection
    record. But unless the refcount was reduced to zero, we no longer have the
    right to look in the record and, indeed, it may be deleted by some other
    thread.

    Fix this by getting the debug_id out before decrementing the refcount and
    then passing that into the tracepoint.

    Fixes: 363deeab6d0f ("rxrpc: Add connection tracepoint and client conn state tracepoint")
    Signed-off-by: David Howells

    David Howells
     
  • rxrpc_put_peer() calls trace_rxrpc_peer() after it has done the decrement
    of the refcount - which looks at the debug_id in the peer record. But
    unless the refcount was reduced to zero, we no longer have the right to
    look in the record and, indeed, it may be deleted by some other thread.

    Fix this by getting the debug_id out before decrementing the refcount and
    then passing that into the tracepoint.

    This can cause the following symptoms:

    BUG: KASAN: use-after-free in __rxrpc_put_peer net/rxrpc/peer_object.c:411
    [inline]
    BUG: KASAN: use-after-free in rxrpc_put_peer+0x685/0x6a0
    net/rxrpc/peer_object.c:435
    Read of size 8 at addr ffff888097ec0058 by task syz-executor823/24216

    Fixes: 1159d4b496f5 ("rxrpc: Add a tracepoint to track rxrpc_peer refcounting")
    Reported-by: syzbot+b9be979c55f2bea8ed30@syzkaller.appspotmail.com
    Signed-off-by: David Howells

    David Howells
     
  • When sendmsg() finds a call to continue on with, if the call is in an
    inappropriate state, it doesn't release the ref it just got on that call
    before returning an error.

    This causes the following symptom to show up with kasan:

    BUG: KASAN: use-after-free in rxrpc_send_keepalive+0x8a2/0x940
    net/rxrpc/output.c:635
    Read of size 8 at addr ffff888064219698 by task kworker/0:3/11077

    where line 635 is:

    whdr.epoch = htonl(peer->local->rxnet->epoch);

    The local endpoint (which cannot be pinned by the call) has been released,
    but not the peer (which is pinned by the call).

    Fix this by releasing the call in the error path.

    Fixes: 37411cad633f ("rxrpc: Fix potential NULL-pointer exception")
    Reported-by: syzbot+d850c266e3df14da1d31@syzkaller.appspotmail.com
    Signed-off-by: David Howells

    David Howells
     

15 Sep, 2019

1 commit


05 Sep, 2019

1 commit

  • There's a misplaced traceline in rxrpc_input_packet() which is looking at a
    packet that just got released rather than the replacement packet.

    Fix this by moving the traceline after the assignment that moves the new
    packet pointer to the actual packet pointer.

    Fixes: d0d5c0cd1e71 ("rxrpc: Use skb_unshare() rather than skb_cow_data()")
    Reported-by: Hillf Danton
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

03 Sep, 2019

1 commit


31 Aug, 2019

1 commit

  • When a local endpoint is ceases to be in use, such as when the kafs module
    is unloaded, the kernel will emit an assertion failure if there are any
    outstanding client connections:

    rxrpc: Assertion failed
    ------------[ cut here ]------------
    kernel BUG at net/rxrpc/local_object.c:433!

    and even beyond that, will evince other oopses if there are service
    connections still present.

    Fix this by:

    (1) Removing the triggering of connection reaping when an rxrpc socket is
    released. These don't actually clean up the connections anyway - and
    further, the local endpoint may still be in use through another
    socket.

    (2) Mark the local endpoint as dead when we start the process of tearing
    it down.

    (3) When destroying a local endpoint, strip all of its client connections
    from the idle list and discard the ref on each that the list was
    holding.

    (4) When destroying a local endpoint, call the service connection reaper
    directly (rather than through a workqueue) to immediately kill off all
    outstanding service connections.

    (5) Make the service connection reaper reap connections for which the
    local endpoint is marked dead.

    Only after destroying the connections can we close the socket lest we get
    an oops in a workqueue that's looking at a connection or a peer.

    Fixes: 3d18cbb7fd0c ("rxrpc: Fix conn expiry timers")
    Signed-off-by: David Howells
    Tested-by: Marc Dionne
    Signed-off-by: David S. Miller

    David Howells
     

27 Aug, 2019

7 commits

  • The in-place decryption routines in AF_RXRPC's rxkad security module
    currently call skb_cow_data() to make sure the data isn't shared and that
    the skb can be written over. This has a problem, however, as the softirq
    handler may be still holding a ref or the Rx ring may be holding multiple
    refs when skb_cow_data() is called in rxkad_verify_packet() - and so
    skb_shared() returns true and __pskb_pull_tail() dislikes that. If this
    occurs, something like the following report will be generated.

    kernel BUG at net/core/skbuff.c:1463!
    ...
    RIP: 0010:pskb_expand_head+0x253/0x2b0
    ...
    Call Trace:
    __pskb_pull_tail+0x49/0x460
    skb_cow_data+0x6f/0x300
    rxkad_verify_packet+0x18b/0xb10 [rxrpc]
    rxrpc_recvmsg_data.isra.11+0x4a8/0xa10 [rxrpc]
    rxrpc_kernel_recv_data+0x126/0x240 [rxrpc]
    afs_extract_data+0x51/0x2d0 [kafs]
    afs_deliver_fs_fetch_data+0x188/0x400 [kafs]
    afs_deliver_to_call+0xac/0x430 [kafs]
    afs_wait_for_call_to_complete+0x22f/0x3d0 [kafs]
    afs_make_call+0x282/0x3f0 [kafs]
    afs_fs_fetch_data+0x164/0x300 [kafs]
    afs_fetch_data+0x54/0x130 [kafs]
    afs_readpages+0x20d/0x340 [kafs]
    read_pages+0x66/0x180
    __do_page_cache_readahead+0x188/0x1a0
    ondemand_readahead+0x17d/0x2e0
    generic_file_read_iter+0x740/0xc10
    __vfs_read+0x145/0x1a0
    vfs_read+0x8c/0x140
    ksys_read+0x4a/0xb0
    do_syscall_64+0x43/0xf0
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

    Fix this by using skb_unshare() instead in the input path for DATA packets
    that have a security index != 0. Non-DATA packets don't need in-place
    encryption and neither do unencrypted DATA packets.

    Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
    Reported-by: Julian Wollrath
    Signed-off-by: David Howells

    David Howells
     
  • Use the previously-added transmit-phase skbuff private flag to simplify the
    socket buffer tracing a bit. Which phase the skbuff comes from can now be
    divined from the skb rather than having to be guessed from the call state.

    We can also reduce the number of rxrpc_skb_trace values by eliminating the
    difference between Tx and Rx in the symbols.

    Signed-off-by: David Howells

    David Howells
     
  • Add a flag in the private data on an skbuff to indicate that this is a
    transmission-phase buffer rather than a receive-phase buffer.

    Signed-off-by: David Howells

    David Howells
     
  • Abstract out rxtx ring cleanup into its own function from its two callers.
    This makes it easier to apply the same changes to both.

    Signed-off-by: David Howells

    David Howells
     
  • Pass the reference held on a DATA skb in the rxrpc input handler into the
    Rx ring rather than getting an additional ref for this and then dropping
    the original ref at the end.

    Signed-off-by: David Howells

    David Howells
     
  • Use the information now cached in the skbuff private data to avoid the need
    to reparse a jumbo packet. We can find all the subpackets by dead
    reckoning, so it's only necessary to note how many there are, whether the
    last one is flagged as LAST_PACKET and whether any have the REQUEST_ACK
    flag set.

    This is necessary as once recvmsg() can see the packet, it can start
    modifying it, such as doing in-place decryption.

    Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
    Signed-off-by: David Howells

    David Howells
     
  • Improve the information stored about jumbo packets so that we don't need to
    reparse them so much later.

    Signed-off-by: David Howells
    Reviewed-by: Jeffrey Altman

    David Howells
     

20 Aug, 2019

1 commit


14 Aug, 2019

2 commits

  • rxrpc_queue_local() attempts to queue the local endpoint it is given and
    then, if successful, prints a trace line. The trace line includes the
    current usage count - but we're not allowed to look at the local endpoint
    at this point as we passed our ref on it to the workqueue.

    Fix this by reading the usage count before queuing the work item.

    Also fix the reading of local->debug_id for trace lines, which must be done
    with the same consideration as reading the usage count.

    Fixes: 09d2bf595db4 ("rxrpc: Add a tracepoint to track rxrpc_local refcounting")
    Reported-by: syzbot+78e71c5bab4f76a6a719@syzkaller.appspotmail.com
    Signed-off-by: David Howells

    David Howells
     
  • When a local endpoint (struct rxrpc_local) ceases to be in use by any
    AF_RXRPC sockets, it starts the process of being destroyed, but this
    doesn't cause it to be removed from the namespace endpoint list immediately
    as tearing it down isn't trivial and can't be done in softirq context, so
    it gets deferred.

    If a new socket comes along that wants to bind to the same endpoint, a new
    rxrpc_local object will be allocated and rxrpc_lookup_local() will use
    list_replace() to substitute the new one for the old.

    Then, when the dying object gets to rxrpc_local_destroyer(), it is removed
    unconditionally from whatever list it is on by calling list_del_init().

    However, list_replace() doesn't reset the pointers in the replaced
    list_head and so the list_del_init() will likely corrupt the local
    endpoints list.

    Fix this by using list_replace_init() instead.

    Fixes: 730c5fd42c1e ("rxrpc: Fix local endpoint refcounting")
    Reported-by: syzbot+193e29e9387ea5837f1d@syzkaller.appspotmail.com
    Signed-off-by: David Howells

    David Howells
     

12 Aug, 2019

1 commit

  • Fix rxrpc_unuse_local() to handle a NULL local pointer as it can be called
    on an unbound socket on which rx->local is not yet set.

    The following reproduced (includes omitted):

    int main(void)
    {
    socket(AF_RXRPC, SOCK_DGRAM, AF_INET);
    return 0;
    }

    causes the following oops to occur:

    BUG: kernel NULL pointer dereference, address: 0000000000000010
    ...
    RIP: 0010:rxrpc_unuse_local+0x8/0x1b
    ...
    Call Trace:
    rxrpc_release+0x2b5/0x338
    __sock_release+0x37/0xa1
    sock_close+0x14/0x17
    __fput+0x115/0x1e9
    task_work_run+0x72/0x98
    do_exit+0x51b/0xa7a
    ? __context_tracking_exit+0x4e/0x10e
    do_group_exit+0xab/0xab
    __x64_sys_exit_group+0x14/0x17
    do_syscall_64+0x89/0x1d4
    entry_SYSCALL_64_after_hwframe+0x49/0xbe

    Reported-by: syzbot+20dee719a2e090427b5f@syzkaller.appspotmail.com
    Fixes: 730c5fd42c1e ("rxrpc: Fix local endpoint refcounting")
    Signed-off-by: David Howells
    cc: Jeffrey Altman
    Signed-off-by: David S. Miller

    David Howells
     

09 Aug, 2019

2 commits

  • Don't bother generating maxSkew in the ACK packet as it has been obsolete
    since AFS 3.1.

    Signed-off-by: David Howells
    Reviewed-by: Jeffrey Altman

    David Howells
     
  • The object lifetime management on the rxrpc_local struct is broken in that
    the rxrpc_local_processor() function is expected to clean up and remove an
    object - but it may get requeued by packets coming in on the backing UDP
    socket once it starts running.

    This may result in the assertion in rxrpc_local_rcu() firing because the
    memory has been scheduled for RCU destruction whilst still queued:

    rxrpc: Assertion failed
    ------------[ cut here ]------------
    kernel BUG at net/rxrpc/local_object.c:468!

    Note that if the processor comes around before the RCU free function, it
    will just do nothing because ->dead is true.

    Fix this by adding a separate refcount to count active users of the
    endpoint that causes the endpoint to be destroyed when it reaches 0.

    The original refcount can then be used to refcount objects through the work
    processor and cause the memory to be rcu freed when that reaches 0.

    Fixes: 4f95dd78a77e ("rxrpc: Rework local endpoint management")
    Reported-by: syzbot+1e0edc4b8b7494c28450@syzkaller.appspotmail.com
    Signed-off-by: David Howells

    David Howells
     

07 Aug, 2019

1 commit


31 Jul, 2019

1 commit

  • rxkad sometimes triggers a warning about oversized stack frames when
    building with clang for a 32-bit architecture:

    net/rxrpc/rxkad.c:243:12: error: stack frame size of 1088 bytes in function 'rxkad_secure_packet' [-Werror,-Wframe-larger-than=]
    net/rxrpc/rxkad.c:501:12: error: stack frame size of 1088 bytes in function 'rxkad_verify_packet' [-Werror,-Wframe-larger-than=]

    The problem is the combination of SYNC_SKCIPHER_REQUEST_ON_STACK() in
    rxkad_verify_packet()/rxkad_secure_packet() with the relatively large
    scatterlist in rxkad_verify_packet_1()/rxkad_secure_packet_encrypt().

    The warning does not show up when using gcc, which does not inline the
    functions as aggressively, but the problem is still the same.

    Allocate the cipher buffers from the slab instead, caching the allocated
    packet crypto request memory used for DATA packet crypto in the rxrpc_call
    struct.

    Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
    Reported-by: Arnd Bergmann
    Signed-off-by: David Howells
    Acked-by: Arnd Bergmann
    cc: Herbert Xu
    Signed-off-by: David S. Miller

    David Howells
     

30 Jul, 2019

2 commits

  • Fix the fact that a notification isn't sent to the recvmsg side to indicate
    a call failed when sendmsg() fails to transmit a DATA packet with the error
    ENETUNREACH, EHOSTUNREACH or ECONNREFUSED.

    Without this notification, the afs client just sits there waiting for the
    call to complete in some manner (which it's not now going to do), which
    also pins the rxrpc call in place.

    This can be seen if the client has a scope-level IPv6 address, but not a
    global-level IPv6 address, and we try and transmit an operation to a
    server's IPv6 address.

    Looking in /proc/net/rxrpc/calls shows completed calls just sat there with
    an abort code of RX_USER_ABORT and an error code of -ENETUNREACH.

    Fixes: c54e43d752c7 ("rxrpc: Fix missing start of call timeout")
    Signed-off-by: David Howells
    Reviewed-by: Marc Dionne
    Reviewed-by: Jeffrey Altman

    David Howells
     
  • There is a potential deadlock in rxrpc_peer_keepalive_dispatch() whereby
    rxrpc_put_peer() is called with the peer_hash_lock held, but if it reduces
    the peer's refcount to 0, rxrpc_put_peer() calls __rxrpc_put_peer() - which
    the tries to take the already held lock.

    Fix this by providing a version of rxrpc_put_peer() that can be called in
    situations where the lock is already held.

    The bug may produce the following lockdep report:

    ============================================
    WARNING: possible recursive locking detected
    5.2.0-next-20190718 #41 Not tainted
    --------------------------------------------
    kworker/0:3/21678 is trying to acquire lock:
    00000000aa5eecdf (&(&rxnet->peer_hash_lock)->rlock){+.-.}, at: spin_lock_bh
    /./include/linux/spinlock.h:343 [inline]
    00000000aa5eecdf (&(&rxnet->peer_hash_lock)->rlock){+.-.}, at:
    __rxrpc_put_peer /net/rxrpc/peer_object.c:415 [inline]
    00000000aa5eecdf (&(&rxnet->peer_hash_lock)->rlock){+.-.}, at:
    rxrpc_put_peer+0x2d3/0x6a0 /net/rxrpc/peer_object.c:435

    but task is already holding lock:
    00000000aa5eecdf (&(&rxnet->peer_hash_lock)->rlock){+.-.}, at: spin_lock_bh
    /./include/linux/spinlock.h:343 [inline]
    00000000aa5eecdf (&(&rxnet->peer_hash_lock)->rlock){+.-.}, at:
    rxrpc_peer_keepalive_dispatch /net/rxrpc/peer_event.c:378 [inline]
    00000000aa5eecdf (&(&rxnet->peer_hash_lock)->rlock){+.-.}, at:
    rxrpc_peer_keepalive_worker+0x6b3/0xd02 /net/rxrpc/peer_event.c:430

    Fixes: 330bdcfadcee ("rxrpc: Fix the keepalive generator [ver #2]")
    Reported-by: syzbot+72af434e4b3417318f84@syzkaller.appspotmail.com
    Signed-off-by: David Howells
    Reviewed-by: Marc Dionne
    Reviewed-by: Jeffrey Altman

    David Howells