30 Dec, 2018

1 commit

  • Pull documentation update from Jonathan Corbet:
    "A fairly normal cycle for documentation stuff. We have a new document
    on perf security, more Italian translations, more improvements to the
    memory-management docs, improvements to the pathname lookup
    documentation, and the usual array of smaller fixes.

    As is often the case, there are a few reaches outside of
    Documentation/ to adjust kerneldoc comments"

    * tag 'docs-5.0' of git://git.lwn.net/linux: (38 commits)
    docs: improve pathname-lookup document structure
    configfs: fix wrong name of struct in documentation
    docs/mm-api: link slab_common.c to "The Slab Cache" section
    slab: make kmem_cache_create{_usercopy} description proper kernel-doc
    doc:process: add links where missing
    docs/core-api: make mm-api.rst more structured
    x86, boot: documentation whitespace fixup
    Documentation: devres: note checking needs when converting
    doc:it: add some process/* translations
    doc:it: fixes in process/1.Intro
    Documentation: convert path-lookup from markdown to resturctured text
    Documentation/admin-guide: update admin-guide index.rst
    Documentation/admin-guide: introduce perf-security.rst file
    scripts/kernel-doc: Fix struct and struct field attribute processing
    Documentation: dev-tools: Fix typos in index.rst
    Correct gen_init_cpio tool's documentation
    Document /proc/pid PID reuse behavior
    Documentation: update path-lookup.md for parallel lookups
    Documentation: Use "while" instead of "whilst"
    dmaengine: Add mailing list address to the documentation
    ...

    Linus Torvalds
     

21 Nov, 2018

1 commit

  • Whilst making an unrelated change to some Documentation, Linus sayeth:

    | Afaik, even in Britain, "whilst" is unusual and considered more
    | formal, and "while" is the common word.
    |
    | [...]
    |
    | Can we just admit that we work with computers, and we don't need to
    | use þe eald Englisc spelling of words that most of the world never
    | uses?

    dictionary.com refers to the word as "Chiefly British", which is
    probably an undesirable attribute for technical documentation.

    Replace all occurrences under Documentation/ with "while".

    Cc: David Howells
    Cc: Liam Girdwood
    Cc: Chris Wilson
    Cc: Michael Halcrow
    Cc: Jonathan Corbet
    Reported-by: Linus Torvalds
    Signed-off-by: Will Deacon
    Signed-off-by: Jonathan Corbet

    Will Deacon
     

16 Nov, 2018

1 commit

  • The life-checking function, which is used by kAFS to make sure that a call
    is still live in the event of a pending signal, only samples the received
    packet serial number counter; it doesn't actually provoke a change in the
    counter, rather relying on the server to happen to give us a packet in the
    time window.

    Fix this by adding a function to force a ping to be transmitted.

    kAFS then keeps track of whether there's been a stall, and if so, uses the
    new function to ping the server, resetting the timeout to allow the reply
    to come back.

    If there's a stall, a ping and the call is *still* stalled in the same
    place after another period, then the call will be aborted.
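
    The stall handling described above can be sketched as a small state
    machine. This is only an illustrative userspace model, not the kAFS
    code: struct life_state, enum life_action and life_check() are
    hypothetical names, and the counter value is assumed to be sampled by
    the caller once per timeout period.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-call bookkeeping kept by the kernel service. */
struct life_state {
	uint32_t last_life;	/* counter value seen at the previous check */
	bool pinged;		/* a ping was already forced for this stall */
};

enum life_action { LIFE_OK, LIFE_SEND_PING, LIFE_ABORT };

/*
 * Called once per timeout period with a fresh sample of the life counter.
 * A changed counter means the call is alive.  An unchanged counter first
 * triggers one forced ping; if the counter is still unchanged a period
 * later, the call is given up on.
 */
enum life_action life_check(struct life_state *st, uint32_t life_now)
{
	assert(st != NULL);

	if (life_now != st->last_life) {
		st->last_life = life_now;
		st->pinged = false;
		return LIFE_OK;
	}
	if (!st->pinged) {
		st->pinged = true;
		return LIFE_SEND_PING;
	}
	return LIFE_ABORT;
}
```

    On LIFE_SEND_PING the caller would force a ping and restart its
    timeout; on LIFE_ABORT it would abort the call.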

    Fixes: bc5e3a546d55 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals")
    Fixes: f4d15fb6f99a ("rxrpc: Provide functions for allowing cleaner handling of signals")
    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

04 Oct, 2018

2 commits

  • Allow the epoch value to be queried on a server connection. This is in the
    rxrpc header of every packet for use in routing and is derived from the
    client's state. It's also not supposed to change unless the client gets
    restarted.

    AFS can make use of this information to deduce whether a fileserver has
    been restarted because the fileserver makes client calls to the filesystem
    driver's cache manager to send notifications (i.e. callback breaks) about
    conflicting changes from other clients. These convey the fileserver's own
    epoch value back to the filesystem.

    Signed-off-by: David Howells

    David Howells
     
  • Allow the timestamp on the sk_buff holding the first DATA packet of a reply
    to be queried. This can then be used as a base for the expiry time
    calculation on the callback promise duration indicated by an operation
    result.
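
    As a rough illustration of the expiry calculation, assuming the
    timestamp is available in nanoseconds and the promise duration in
    seconds (both units, and all names below, are assumptions of this
    sketch rather than anything stated in the commit):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NSEC_PER_SEC 1000000000LL

/*
 * Expiry time of a callback promise, taking the arrival time of the first
 * DATA packet of the reply as the base rather than the time at which the
 * reply was finally processed.
 */
int64_t callback_expiry_ns(int64_t reply_time_ns, uint32_t promise_secs)
{
	assert(reply_time_ns >= 0);
	return reply_time_ns + (int64_t)promise_secs * MODEL_NSEC_PER_SEC;
}

/* True once the promise can no longer be relied upon. */
bool callback_expired(int64_t now_ns, int64_t expiry_ns)
{
	return now_ns >= expiry_ns;
}
```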

    Signed-off-by: David Howells

    David Howells
     

18 Oct, 2017

3 commits

  • Make AF_RXRPC accept MSG_WAITALL as a flag to sendmsg() to tell it to
    ignore signals whilst loading up the message queue, provided progress is
    being made in emptying the queue at the other side.

    Progress is defined as the base of the transmit window having been
    advanced within 2 RTT periods. If the period is exceeded with no progress,
    sendmsg() will return anyway, indicating how much data has been copied, if
    any.

    Once the supplied buffer is entirely decanted, the sendmsg() will return.
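
    The progress rule can be modelled as follows. This is a userspace
    sketch under stated assumptions, not the AF_RXRPC implementation:
    struct tx_progress and tx_still_progressing() are hypothetical, and
    the window base, current time and RTT are assumed to be supplied by
    the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of when the transmit window base last moved. */
struct tx_progress {
	uint32_t window_base;		/* base of the Tx window last observed */
	uint64_t last_advance_ns;	/* time at which the base last advanced */
};

/*
 * Returns true while the sender may keep ignoring signals: either the
 * window base has just advanced, or it last advanced no more than two RTT
 * periods ago.  Returns false once that period is exceeded.
 */
bool tx_still_progressing(struct tx_progress *p, uint32_t base_now,
			  uint64_t now_ns, uint64_t rtt_ns)
{
	assert(p != NULL);

	if (base_now != p->window_base) {
		p->window_base = base_now;
		p->last_advance_ns = now_ns;
		return true;
	}
	return now_ns - p->last_advance_ns <= 2 * rtt_ns;
}
```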

    Signed-off-by: David Howells

    David Howells
     
  • Provide a couple of functions to allow cleaner handling of signals in a
    kernel service. They are:

    (1) rxrpc_kernel_get_rtt()

    This allows the kernel service to find out the RTT time for a call, so
    as to better judge how large a timeout to employ.

    Note, though, that whilst this returns a value in nanoseconds, the
    timeouts can only actually be in jiffies.

    (2) rxrpc_kernel_check_life()

    This returns a number that is updated when ACKs are received from the
    peer (notably including PING RESPONSE ACKs which we can elicit by
    sending PING ACKs to see if the call still exists on the server).

    The caller should compare the numbers returned by two successive calls
    to this function to see if the RPC call is still alive.

    These can be used to provide an extending timeout rather than returning
    immediately in the case that a signal occurs that would otherwise abort an
    RPC operation. The timeout would be extended if the server is still
    responsive and the call is still apparently alive on the server.

    For most operations this isn't that necessary - but for FS.StoreData it is:
    OpenAFS writes the data to storage as it comes in without making a backup,
    so if we immediately abort it when partially complete on a CTRL+C, say, we
    have no idea of the state of the file after the abort.
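
    Because the RTT is reported in nanoseconds but timeouts are counted in
    jiffies, a conversion is needed before the value can be used. A
    minimal sketch, assuming the tick rate is passed in as hz; rounding up
    and the floor of one jiffy are choices made for this example, not
    behaviour documented by the commit:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert an RTT in nanoseconds to a timeout in jiffies for a given tick
 * rate.  Rounds up so that the timeout never undershoots the RTT and never
 * returns zero.
 */
unsigned long rtt_ns_to_jiffies(uint64_t rtt_ns, unsigned int hz)
{
	const uint64_t ns_per_sec = 1000000000ULL;
	uint64_t jiffies;

	assert(hz > 0);
	jiffies = (rtt_ns * hz + ns_per_sec - 1) / ns_per_sec;
	return jiffies ? (unsigned long)jiffies : 1;
}
```

    A service might then wait for some multiple of this value between
    liveness checks before deciding whether to extend the timeout again.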

    Signed-off-by: David Howells

    David Howells
     
  • Provide support for a kernel service to make use of the service upgrade
    facility. This involves:

    (1) Pass an upgrade request flag to rxrpc_kernel_begin_call().

    (2) Make rxrpc_kernel_recv_data() return the call's current service ID so
    that the caller can detect service upgrade and see what the service
    was upgraded to.

    Signed-off-by: David Howells

    David Howells
     

29 Aug, 2017

2 commits

  • Allow a client call that failed on network error to be retried, provided
    that the Tx queue still holds DATA packet 1. This allows an operation to
    be submitted to another server or another address for the same server
    without having to repackage and re-encrypt the data so far processed.

    Two new functions are provided:

    (1) rxrpc_kernel_check_call() - This is used to find out the completion
    state of a call to guess whether it can be retried and whether it
    should be retried.

    (2) rxrpc_kernel_retry_call() - Disconnect the call from its current
    connection, reset the state and submit it as a new client call to a
    new address. The new address need not match the previous address.

    A call may be retried even if all the data hasn't been loaded into it yet;
    a partially constructed call will be retained at the same point it was at when
    an error condition was detected. msg_data_left() can be used to find out
    how much data was packaged before the error occurred.

    Signed-off-by: David Howells

    David Howells
     
  • Add a callback to rxrpc_kernel_send_data() so that a kernel service can get
    a notification that the AF_RXRPC call has transitioned out of the Tx phase and
    is now waiting for a reply or a final ACK.

    This is called from AF_RXRPC with the call state lock held so the
    notification is guaranteed to come before any reply is passed back.

    Further, modify the AFS filesystem to make use of this so that we don't have
    to change the afs_call state before sending the last bit of data.

    Signed-off-by: David Howells

    David Howells
     

08 Jun, 2017

2 commits

  • Provide a control message that can be specified on the first sendmsg() of a
    client call or the first sendmsg() of a service response to indicate the
    total length of the data to be transmitted for that call.

    Currently, because the length of the payload of an encrypted DATA packet is
    encrypted in front of the data, the packet cannot be encrypted until we
    know how much data it will hold.

    By specifying the length at the beginning of the transmit phase, each DATA
    packet length can be set before we start loading data from userspace (where
    several sendmsg() calls may contribute to a particular packet).

    An error will be returned if too little or too much data is presented in
    the Tx phase.
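
    The length check can be pictured with the following model. It is a
    sketch only: struct tx_budget and tx_account() are hypothetical, the
    more flag stands in for MSG_MORE, and the use of -EMSGSIZE as the
    error is an assumption of this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-call record of the declared total Tx length. */
struct tx_budget {
	int64_t remaining;	/* bytes still expected; -1 if none declared */
};

/*
 * Account for one send of 'len' bytes.  'more' is true if further sends
 * will follow.  Fails if more data is presented than was declared, or if
 * the final send leaves the declared length unfilled.
 */
int tx_account(struct tx_budget *b, size_t len, bool more)
{
	assert(b != NULL);

	if (b->remaining < 0)
		return 0;		/* no length declared: nothing to check */
	if ((int64_t)len > b->remaining)
		return -EMSGSIZE;	/* too much data */
	b->remaining -= (int64_t)len;
	if (!more && b->remaining != 0)
		return -EMSGSIZE;	/* too little data */
	return 0;
}
```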

    Signed-off-by: David Howells

    David Howells
     
  • Provide a getsockopt() call that can query what cmsg types are supported by
    AF_RXRPC.

    David Howells
     

05 Jun, 2017

3 commits

  • Make it possible for a client to use AuriStor's service upgrade facility.

    The client does this by adding an RXRPC_UPGRADE_SERVICE control message to
    the first sendmsg() of a call. This takes no parameters.

    When recvmsg() starts returning data from the call, the service ID field in
    the returned msg_name will reflect the result of the upgrade attempt. If
    the upgrade was ignored, srx_service will match what was set in the
    sendmsg(); if the upgrade happened the srx_service will be altered to
    indicate the service the server upgraded to.

    Note that:

    (1) The choice of upgrade service is up to the server

    (2) Further client calls to the same server that would share a connection
    are blocked if an upgrade probe is in progress.

    (3) This should only be used to probe the service. Clients should then
    use the returned service ID in all subsequent communications with that
    server (and not set the upgrade). Note that the kernel will not
    retain this information should the connection expire from its cache.

    (4) If a server that supports upgrading is replaced by one that doesn't,
    whilst a connection is live, and if the replacement is running, say,
    OpenAFS 1.6.4 or older or an older IBM AFS, then the replacement
    server will not respond to packets sent to the upgraded connection.

    At this point, calls will time out and the server must be reprobed.
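
    On the client side, the probe result might be recorded along these
    lines. This is an illustrative model with hypothetical names
    (struct server_record, record_probe(), want_upgrade_flag()); it does
    not touch a real socket and only captures the rule of comparing the
    returned service ID with the requested one and not requesting the
    upgrade again afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum probe_result { PROBE_NOT_UPGRADED, PROBE_UPGRADED };

/* Hypothetical per-server record kept by the client. */
struct server_record {
	uint16_t service_id;	/* service ID to use for subsequent calls */
	bool probed;		/* an upgrade probe has completed */
};

/*
 * Record the outcome of a probe: 'requested' is the service ID set in
 * sendmsg(), 'returned' is the srx_service reported by recvmsg().
 */
enum probe_result record_probe(struct server_record *srv,
			       uint16_t requested, uint16_t returned)
{
	assert(srv != NULL);

	srv->service_id = returned;
	srv->probed = true;
	return returned == requested ? PROBE_NOT_UPGRADED : PROBE_UPGRADED;
}

/* The upgrade request should only accompany the initial probe. */
bool want_upgrade_flag(const struct server_record *srv)
{
	return !srv->probed;
}
```

    If calls to the server later time out, the client would clear the
    probed flag and probe again, as note (4) describes.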

    Signed-off-by: David Howells

    David Howells
     
  • Implement AuriStor's service upgrade facility. There are three problems
    that this is meant to deal with:

    (1) Various of the standard AFS RPC calls have IPv4 addresses in their
    requests and/or replies - but there's no room for including IPv6
    addresses.

    (2) Definition of IPv6-specific RPC operations in the standard operation
    sets has not yet been achieved.

    (3) One could envision the creation of a new service on the same port as
    the original service. The new service could implement improved
    operations - and the client could try this first, falling back to the
    original service if it's not there.

    Unfortunately, certain servers ignore packets addressed to a service
    they don't implement and don't respond in any way - not even with an
    ABORT. This means that the client must then wait for the call timeout
    to occur.

    What service upgrade does is to see if the connection is marked as being
    'upgradeable' and if so, change the service ID in the server and thus the
    request and reply formats. Note that the upgrade isn't mandatory - a
    server that supports only the original call set will ignore the upgrade
    request.

    In the protocol, the procedure is then as follows:

    (1) To request an upgrade, the first DATA packet in a new connection must
    have the userStatus set to 1 (this is normally 0). The userStatus
    value is normally ignored by the server.

    (2) If the server doesn't support upgrading, the reply packets will
    contain the same service ID as for the first request packet.

    (3) If the server does support upgrading, all future reply packets on that
    connection will contain the new service ID and the new service ID will
    be applied to *all* further calls on that connection as well.

    (4) The RPC op used to probe the upgrade must take the same request data
    as the shadow call in the upgrade set (but may return a different
    reply). GetCapability RPC ops were added to all standard sets for
    just this purpose. Ops where the request formats differ cannot be
    used for probing.

    (5) The client must wait for completion of the probe before sending any
    further RPC ops to the same destination. It should then use the
    service ID that recvmsg() reported back in all future calls.

    (6) The shadow service must have call definitions for all the operation
    IDs defined by the original service.

    To support service upgrading, a server should:

    (1) Call bind() twice on its AF_RXRPC socket before calling listen().
    Each bind() should supply a different service ID, but the transport
    addresses must be the same. This allows the server to receive
    requests with either service ID.

    (2) Enable automatic upgrading by calling setsockopt(), specifying
    RXRPC_UPGRADEABLE_SERVICE and passing in a two-member array of
    unsigned shorts as the argument:

    unsigned short optval[2];

    This specifies a pair of service IDs. They must be different and must
    match the service IDs bound to the socket. Member 0 is the service ID
    to upgrade from and member 1 is the service ID to upgrade to.
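
    The constraints on the optval pair can be expressed as a small check.
    This is a userspace sketch with hypothetical helper names, modelling
    only the rules stated above (the two IDs differ and both are bound);
    it is not the kernel's validation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if 'id' is among the bound service IDs. */
bool service_is_bound(const unsigned short *bound, size_t nbound,
		      unsigned short id)
{
	assert(bound != NULL || nbound == 0);

	for (size_t i = 0; i < nbound; i++)
		if (bound[i] == id)
			return true;
	return false;
}

/*
 * optval[0] is the service ID to upgrade from, optval[1] the service ID to
 * upgrade to.  The pair is acceptable only if the IDs differ and both have
 * been bound to the socket.
 */
bool upgrade_pair_valid(const unsigned short optval[2],
			const unsigned short *bound, size_t nbound)
{
	if (optval[0] == optval[1])
		return false;
	return service_is_bound(bound, nbound, optval[0]) &&
	       service_is_bound(bound, nbound, optval[1]);
}
```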

    Signed-off-by: David Howells

    David Howells
     
  • Permit bind() to be called on an AF_RXRPC socket more than once (currently
    maximum twice) to bind multiple listening services to it. There are some
    restrictions:

    (1) All bind() calls involved must have a non-zero service ID.

    (2) The service IDs must all be different.

    (3) The rest of the address (notably the transport part) must be the same
    in all (a single UDP socket is shared).

    (4) This must be done before listen() or sendmsg() is called.

    This allows someone to connect to the service socket with different service
    IDs and lays the foundation for service upgrading.

    The service ID used by an incoming call can be extracted from the msg_name
    returned by recvmsg().
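
    The four restrictions can be modelled as below. The structures and
    model_bind() are hypothetical, the transport address is reduced to an
    IPv4 address and port for brevity, and the specific error codes are
    assumptions of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_SERVICES 2

/* Hypothetical, simplified address: service ID plus transport part. */
struct model_addr {
	uint16_t service;
	uint16_t port;
	uint32_t ipv4;
};

struct model_sock {
	struct model_addr bound[MODEL_MAX_SERVICES];
	int nbound;
	bool started;		/* listen() or sendmsg() has been called */
};

int model_bind(struct model_sock *s, const struct model_addr *a)
{
	assert(s != NULL && a != NULL);

	if (s->started || s->nbound >= MODEL_MAX_SERVICES)
		return -EINVAL;			/* rule (4), or too many binds */
	if (s->nbound > 0) {
		const struct model_addr *first = &s->bound[0];

		if (a->service == 0 || first->service == 0)
			return -EINVAL;		/* rule (1) */
		if (a->service == first->service)
			return -EADDRINUSE;	/* rule (2) */
		if (a->port != first->port || a->ipv4 != first->ipv4)
			return -EINVAL;		/* rule (3) */
	}
	s->bound[s->nbound++] = *a;
	return 0;
}
```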

    Signed-off-by: David Howells

    David Howells
     

02 Sep, 2016

1 commit

  • Don't expose skbs to in-kernel users, such as the AFS filesystem, but
    instead provide a notification hook that indicates that a call needs
    attention and another that indicates that there's a new call to be
    collected.

    This makes the following possibilities more achievable:

    (1) Call refcounting can be made simpler if skbs don't hold refs to calls.

    (2) skbs referring to non-data events will be able to be freed much sooner
    rather than being queued for AFS to pick up as rxrpc_kernel_recv_data
    will be able to consult the call state.

    (3) We can shortcut the receive phase when a call is remotely aborted
    because we don't have to go through all the packets to get to the one
    cancelling the operation.

    (4) It makes it easier to do encryption/decryption directly between AFS's
    buffers and sk_buffs.

    (5) Encryption/decryption can more easily be done in the AFS's thread
    contexts - usually that of the userspace process that issued a syscall
    - rather than in one of rxrpc's background threads on a workqueue.

    (6) AFS will be able to wait synchronously on a call inside AF_RXRPC.

    To make this work, the following interface function has been added:

    int rxrpc_kernel_recv_data(
            struct socket *sock, struct rxrpc_call *call,
            void *buffer, size_t bufsize, size_t *_offset,
            bool want_more, u32 *_abort_code);

    This is the recvmsg equivalent. It allows the caller to find out about the
    state of a specific call and to transfer received data into a buffer
    piecemeal.

    afs_extract_data() and rxrpc_kernel_recv_data() now do all the extraction
    logic between them. They don't wait synchronously yet because the socket
    lock needs to be dealt with.

    Five interface functions have been removed:

    rxrpc_kernel_is_data_last()
    rxrpc_kernel_get_abort_code()
    rxrpc_kernel_get_error_number()
    rxrpc_kernel_free_skb()
    rxrpc_kernel_data_consumed()

    As a temporary hack, sk_buffs going to an in-kernel call are queued on the
    rxrpc_call struct (->knlrecv_queue) rather than being handed over to the
    in-kernel user. To process the queue internally, a temporary function,
    temp_deliver_data() has been added. This will be replaced with common code
    between the rxrpc_recvmsg() path and the rxrpc_kernel_recv_data() path in a
    future patch.

    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

30 Aug, 2016

2 commits

  • Pass struct socket * to more rxrpc kernel interface functions. They should
    be starting from this rather than the socket pointer in the rxrpc_call
    struct if they need to access the socket.

    I have left:

    rxrpc_kernel_is_data_last()
    rxrpc_kernel_get_abort_code()
    rxrpc_kernel_get_error_number()
    rxrpc_kernel_free_skb()
    rxrpc_kernel_data_consumed()

    unmodified as they're all about to be removed (and, in any case, don't
    touch the socket).

    Signed-off-by: David Howells

    David Howells
     
  • Provide a function so that kernel users, such as AFS, can ask for the peer
    address of a call:

    void rxrpc_kernel_get_peer(struct rxrpc_call *call,
                               struct sockaddr_rxrpc *_srx);

    In the future the kernel service won't get sk_buffs to look inside.
    Further, this allows us to hide any canonicalisation inside AF_RXRPC for
    when IPv6 support is added.

    Also propagate this through to afs_find_server() and issue a warning if we
    can't handle the address family yet.

    Signed-off-by: David Howells

    David Howells
     

06 Aug, 2016

1 commit

  • Inside the kafs filesystem it is possible to occasionally have a call
    processed and terminated before we've had a chance to check whether we need
    to clean up the rx queue for that call because afs_send_simple_reply() ends
    the call when it is done, but this is done in a workqueue item that might
    happen to run to completion before afs_deliver_to_call() completes.

    Further, it is possible for rxrpc_kernel_send_data() to be called to send a
    reply before the last request-phase data skb is released. The rxrpc skb
    destructor is where the ACK processing is done and the call state is
    advanced upon release of the last skb. ACK generation is also deferred to
    a work item because it's possible that the skb destructor is not called in
    a context where kernel_sendmsg() can be invoked.

    To this end, the following changes are made:

    (1) rxrpc_kernel_data_consumed() is added. This should be called whenever
    an skb is emptied so as to crank the ACK and call states. This does
    not release the skb, however. rxrpc_kernel_free_skb() must now be
    called to achieve that. These together replace
    rxrpc_kernel_data_delivered().

    (2) rxrpc_kernel_data_consumed() is wrapped by afs_data_consumed().

    This makes afs_deliver_to_call() easier to work with, as the skb can simply
    be discarded unconditionally here without trying to work out what the
    return value of the ->deliver() function means.

    The ->deliver() functions can, via afs_data_complete(),
    afs_transfer_reply() and afs_extract_data() mark that an skb has been
    consumed (thereby cranking the state) without the need to
    conditionally free the skb to make sure the state is correct on an
    incoming call for when the call processor tries to send the reply.

    (3) rxrpc_recvmsg() now has to call rxrpc_kernel_data_consumed() when it
    has finished with a packet and MSG_PEEK isn't set.

    (4) rxrpc_packet_destructor() no longer calls rxrpc_hard_ACK_data().

    Because of this, we no longer need to clear the destructor and put the
    call before we free the skb in cases where we don't want the ACK/call
    state to be cranked.

    (5) The ->deliver() call-type callbacks are made to return -EAGAIN rather
    than 0 if they expect more data (afs_extract_data() returns -EAGAIN to
    the delivery function already), and the caller is now responsible for
    producing an abort if that was the last packet.

    (6) There are many bits of unmarshalling code where:

    ret = afs_extract_data(call, skb, last, ...);
    switch (ret) {
    case 0: break;
    case -EAGAIN: return 0;
    default: return ret;
    }

    is to be found. As -EAGAIN can now be passed back to the caller, we
    now just return if ret < 0:

    ret = afs_extract_data(call, skb, last, ...);
    if (ret < 0)
            return ret;

    (7) Checks for trailing data and empty final data packets have been
    consolidated as afs_data_complete(). So:

    if (skb->len > 0)
            return -EBADMSG;
    if (!last)
            return 0;

    becomes:

    ret = afs_data_complete(call, skb, last);
    if (ret < 0)
            return ret;

    (8) afs_transfer_reply() now checks the amount of data it has against the
    amount of data desired and the amount of data in the skb and returns
    an error to induce an abort if we don't get exactly what we want.
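
    The caller's side of the new -EAGAIN convention from (5) can be
    sketched as follows. This is a simplified userspace model with
    hypothetical names, not the afs_deliver_to_call() code: it takes the
    value returned by a ->deliver() callback and says what the caller
    should do next.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum delivery_outcome { DELIVERY_DONE, DELIVERY_WAIT, DELIVERY_ABORT };

/*
 * Interpret the return value of a ->deliver() callback:
 *   0        the callback has everything it needs
 *   -EAGAIN  the callback expects more data; that is only acceptable if
 *            this was not the last packet, otherwise the caller aborts
 *   other    a hard error, which also leads to an abort
 */
enum delivery_outcome classify_deliver_result(int ret, bool last)
{
	assert(ret <= 0);

	if (ret == 0)
		return DELIVERY_DONE;
	if (ret == -EAGAIN && !last)
		return DELIVERY_WAIT;
	return DELIVERY_ABORT;
}
```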

    Without these changes, the following oops can occasionally be observed,
    particularly if some printks are inserted into the delivery path:

    general protection fault: 0000 [#1] SMP
    Modules linked in: kafs(E) af_rxrpc(E) [last unloaded: af_rxrpc]
    CPU: 0 PID: 1305 Comm: kworker/u8:3 Tainted: G E 4.7.0-fsdevel+ #1303
    Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
    Workqueue: kafsd afs_async_workfn [kafs]
    task: ffff88040be041c0 ti: ffff88040c070000 task.ti: ffff88040c070000
    RIP: 0010:[] [] __lock_acquire+0xcf/0x15a1
    RSP: 0018:ffff88040c073bc0 EFLAGS: 00010002
    RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: ffff88040d29a710
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88040d29a710
    RBP: ffff88040c073c70 R08: 0000000000000001 R09: 0000000000000001
    R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
    R13: 0000000000000000 R14: ffff88040be041c0 R15: ffffffff814c928f
    FS: 0000000000000000(0000) GS:ffff88041fa00000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fa4595f4750 CR3: 0000000001c14000 CR4: 00000000001406f0
    Stack:
    0000000000000006 000000000be04930 0000000000000000 ffff880400000000
    ffff880400000000 ffffffff8108f847 ffff88040be041c0 ffffffff81050446
    ffff8803fc08a920 ffff8803fc08a958 ffff88040be041c0 ffff88040c073c38
    Call Trace:
    [] ? mark_held_locks+0x5e/0x74
    [] ? __local_bh_enable_ip+0x9b/0xa1
    [] ? trace_hardirqs_on_caller+0x16d/0x189
    [] lock_acquire+0x122/0x1b6
    [] ? lock_acquire+0x122/0x1b6
    [] ? skb_dequeue+0x18/0x61
    [] _raw_spin_lock_irqsave+0x35/0x49
    [] ? skb_dequeue+0x18/0x61
    [] skb_dequeue+0x18/0x61
    [] afs_deliver_to_call+0x344/0x39d [kafs]
    [] afs_process_async_call+0x4c/0xd5 [kafs]
    [] afs_async_workfn+0xe/0x10 [kafs]
    [] process_one_work+0x29d/0x57c
    [] worker_thread+0x24a/0x385
    [] ? rescuer_thread+0x2d0/0x2d0
    [] kthread+0xf3/0xfb
    [] ret_from_fork+0x1f/0x40
    [] ? kthread_create_on_node+0x1cf/0x1cf

    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

27 Feb, 2014

2 commits

  • Expose RxRPC parameters via sysctls to control the Rx window size, the Rx MTU
    maximum size and the number of packets that can be glued into a jumbo packet.

    More info added to Documentation/networking/rxrpc.txt.

    Signed-off-by: David Howells

    David Howells
     
  • Add sysctls for configuring RxRPC protocol handling, specifically controls on
    delays before ack generation, the delay before resending a packet, the maximum
    lifetime of a call and the expiration times of calls, connections and
    transports that haven't been recently used.

    More info added in Documentation/networking/rxrpc.txt.

    Signed-off-by: David Howells

    David Howells
     

31 Oct, 2013

1 commit


06 Jan, 2009

1 commit


20 Oct, 2007

1 commit

  • Most of these fixes were already submitted for old kernel versions, and were
    approved, but for some reason they never made it into the releases.

    Because this is a consolidation of a couple of old missed patches, it touches both
    Kconfigs and documentation texts.

    Signed-off-by: Matt LaPlante
    Acked-by: Randy Dunlap
    Signed-off-by: Adrian Bunk

    Matt LaPlante
     

17 Oct, 2007

1 commit

  • Make request_key() and co fundamentally asynchronous to make it easier for
    NFS to make use of them. There are now accessor functions that do
    asynchronous constructions, a wait function to wait for construction to
    complete, and a completion function for the key type to indicate completion
    of construction.

    Note that the construction queue is now gone. Instead, keys under
    construction are linked in to the appropriate keyring in advance, and
    anyone encountering one must wait for it to be complete before they can use
    it. This is done automatically for userspace.

    The following auxiliary changes are also made:

    (1) Key type implementation stuff is split from linux/key.h into
    linux/key-type.h.

    (2) AF_RXRPC provides a way to allocate null rxrpc-type keys so that AFS does
    not need to call key_instantiate_and_link() directly.

    (3) Adjust the debugging macros so that they're -Wformat checked even if
    they are disabled, and make it so they can be enabled simply by defining
    __KDEBUG to be consistent with other code of mine.

    (4) Documentation.

    [alan@lxorguk.ukuu.org.uk: keys: missing word in documentation]
    Signed-off-by: David Howells
    Signed-off-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

27 Apr, 2007

2 commits

  • Add an interface to the AF_RXRPC module so that the AFS filesystem module can
    more easily make use of the services available. AFS still opens a socket but
    then uses the action functions in lieu of sendmsg() and registers an intercept
    function to grab messages before they're queued on the socket Rx queue.

    This permits AFS (or whatever) to:

    (1) Avoid the overhead of using the recvmsg() call.

    (2) Use different keys directly on individual client calls on one socket
    rather than having to open a whole slew of sockets, one for each key it
    might want to use.

    (3) Avoid calling request_key() at the point of issue of a call or opening of
    a socket. This is done instead by AFS at the point of open(), unlink() or
    other VFS operation and the key handed through.

    (4) Request the use of something other than GFP_KERNEL to allocate memory.

    Furthermore:

    (*) The socket buffer markings used by RxRPC are made available for AFS so
    that it can interpret the cooked RxRPC messages itself.

    (*) rxgen (un)marshalling abort codes are made available.

    The following documentation for the kernel interface is added to
    Documentation/networking/rxrpc.txt:

    =========================
    AF_RXRPC KERNEL INTERFACE
    =========================

    The AF_RXRPC module also provides an interface for use by in-kernel utilities
    such as the AFS filesystem. This permits such a utility to:

    (1) Use different keys directly on individual client calls on one socket
    rather than having to open a whole slew of sockets, one for each key it
    might want to use.

    (2) Avoid having RxRPC call request_key() at the point of issue of a call or
    opening of a socket. Instead the utility is responsible for requesting a
    key at the appropriate point. AFS, for instance, would do this during VFS
    operations such as open() or unlink(). The key is then handed through
    when the call is initiated.

    (3) Request the use of something other than GFP_KERNEL to allocate memory.

    (4) Avoid the overhead of using the recvmsg() call. RxRPC messages can be
    intercepted before they get put into the socket Rx queue and the socket
    buffers manipulated directly.

    To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket,
    bind an address as appropriate and listen if it's to be a server socket, but
    then it passes this to the kernel interface functions.

    The kernel interface functions are as follows:

    (*) Begin a new client call.

    struct rxrpc_call *
    rxrpc_kernel_begin_call(struct socket *sock,
                            struct sockaddr_rxrpc *srx,
                            struct key *key,
                            unsigned long user_call_ID,
                            gfp_t gfp);

    This allocates the infrastructure to make a new RxRPC call and assigns
    call and connection numbers. The call will be made on the UDP port that
    the socket is bound to. The call will go to the destination address of a
    connected client socket unless an alternative is supplied (srx is
    non-NULL).

    If a key is supplied then this will be used to secure the call instead of
    the key bound to the socket with the RXRPC_SECURITY_KEY sockopt. Calls
    secured in this way will still share connections if at all possible.

    The user_call_ID is equivalent to that supplied to sendmsg() in the
    control data buffer. It is entirely feasible to use this to point to a
    kernel data structure.

    If this function is successful, an opaque reference to the RxRPC call is
    returned. The caller now holds a reference on this and it must be
    properly ended.

    (*) End a client call.

    void rxrpc_kernel_end_call(struct rxrpc_call *call);

    This is used to end a previously begun call. The user_call_ID is expunged
    from AF_RXRPC's knowledge and will not be seen again in association with
    the specified call.

    (*) Send data through a call.

    int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg,
                               size_t len);

    This is used to supply either the request part of a client call or the
    reply part of a server call. msg.msg_iovlen and msg.msg_iov specify the
    data buffers to be used. msg_iov may not be NULL and must point
    exclusively to in-kernel virtual addresses. msg.msg_flags may be given
    MSG_MORE if there will be subsequent data sends for this call.

    The msg must not specify a destination address, control data or any flags
    other than MSG_MORE. len is the total amount of data to transmit.

    (*) Abort a call.

    void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code);

    This is used to abort a call if it's still in an abortable state. The
    abort code specified will be placed in the ABORT message sent.

    (*) Intercept received RxRPC messages.

    typedef void (*rxrpc_interceptor_t)(struct sock *sk,
                                        unsigned long user_call_ID,
                                        struct sk_buff *skb);

    void
    rxrpc_kernel_intercept_rx_messages(struct socket *sock,
                                       rxrpc_interceptor_t interceptor);

    This installs an interceptor function on the specified AF_RXRPC socket.
    All messages that would otherwise wind up in the socket's Rx queue are
    then diverted to this function. Note that care must be taken to process
    the messages in the right order to maintain DATA message sequentiality.

    The interceptor function itself is provided with the address of the socket
    handling the incoming message, the ID assigned by the kernel utility to
    the call, and the socket buffer containing the message.

    The skb->mark field indicates the type of message:

    MARK                            MEANING
    =============================== =======================================
    RXRPC_SKB_MARK_DATA             Data message
    RXRPC_SKB_MARK_FINAL_ACK        Final ACK received for an incoming call
    RXRPC_SKB_MARK_BUSY             Client call rejected as server busy
    RXRPC_SKB_MARK_REMOTE_ABORT     Call aborted by peer
    RXRPC_SKB_MARK_NET_ERROR        Network error detected
    RXRPC_SKB_MARK_LOCAL_ERROR      Local error encountered
    RXRPC_SKB_MARK_NEW_CALL         New incoming call awaiting acceptance

    The remote abort message can be probed with rxrpc_kernel_get_abort_code().
    The two error messages can be probed with rxrpc_kernel_get_error_number().
    A new call can be accepted with rxrpc_kernel_accept_call().

    Data messages can have their contents extracted with the usual bunch of
    socket buffer manipulation functions. A data message can be determined to
    be the last one in a sequence with rxrpc_kernel_is_data_last(). When a
    data message has been used up, rxrpc_kernel_data_delivered() should be
    called on it.

    Non-data messages should be handed to rxrpc_kernel_free_skb() for
    disposal. It is possible to get extra refs on all types of message for later
    freeing, but this may pin the state of a call until the message is finally
    freed.
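
The dispatch on skb->mark described above might look roughly like the following user-space mock-up. It is a sketch only: the numeric values of the RXRPC_SKB_MARK_* constants, the cut-down struct sk_buff, the stub bodies of the rxrpc_kernel_*() helpers, and the demo_state bookkeeping are all assumptions made so the example is self-contained. Only the function names, prototypes and the data / non-data disposal rule come from the text.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder values; the real constants come from the kernel headers. */
enum {
    RXRPC_SKB_MARK_DATA,
    RXRPC_SKB_MARK_FINAL_ACK,
    RXRPC_SKB_MARK_BUSY,
    RXRPC_SKB_MARK_REMOTE_ABORT,
    RXRPC_SKB_MARK_NET_ERROR,
    RXRPC_SKB_MARK_LOCAL_ERROR,
    RXRPC_SKB_MARK_NEW_CALL,
};

struct sock;   /* opaque in this sketch */

/* Hypothetical cut-down socket buffer; the flags only exist for the mock. */
struct sk_buff {
    unsigned int mark;
    bool last;        /* mock: this is the final DATA message */
    bool delivered;   /* mock: rxrpc_kernel_data_delivered() was called */
    bool freed;       /* mock: the buffer has been released */
};

/* Stub helpers mirroring the documented prototypes. */
static bool rxrpc_kernel_is_data_last(struct sk_buff *skb) { return skb->last; }
static void rxrpc_kernel_data_delivered(struct sk_buff *skb)
{
    skb->delivered = true;   /* records delivery ... */
    skb->freed = true;       /* ... and frees the buffer */
}
static void rxrpc_kernel_free_skb(struct sk_buff *skb) { skb->freed = true; }

/* Hypothetical per-call bookkeeping kept by the kernel utility. */
struct demo_state {
    unsigned int data_msgs;
    bool data_complete;
    bool failed;
    bool new_call_pending;
};
static struct demo_state demo;

/* Interceptor matching rxrpc_interceptor_t. */
static void demo_interceptor(struct sock *sk, unsigned long user_call_ID,
                             struct sk_buff *skb)
{
    (void)sk;
    (void)user_call_ID;

    switch (skb->mark) {
    case RXRPC_SKB_MARK_DATA:
        demo.data_msgs++;
        if (rxrpc_kernel_is_data_last(skb))
            demo.data_complete = true;
        rxrpc_kernel_data_delivered(skb);   /* DATA: record and free */
        break;
    case RXRPC_SKB_MARK_NEW_CALL:
        demo.new_call_pending = true;       /* acceptance not shown here */
        rxrpc_kernel_free_skb(skb);
        break;
    case RXRPC_SKB_MARK_BUSY:
    case RXRPC_SKB_MARK_REMOTE_ABORT:
    case RXRPC_SKB_MARK_NET_ERROR:
    case RXRPC_SKB_MARK_LOCAL_ERROR:
        demo.failed = true;
        rxrpc_kernel_free_skb(skb);         /* non-data: just free */
        break;
    default:                                /* e.g. FINAL_ACK */
        rxrpc_kernel_free_skb(skb);
        break;
    }
}
```

A real interceptor would also have to keep DATA messages in sequence and would probe abort codes or error numbers before freeing the corresponding messages; both are left out here for brevity.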

    (*) Accept an incoming call.

    struct rxrpc_call *
    rxrpc_kernel_accept_call(struct socket *sock,
                             unsigned long user_call_ID);

    This is used to accept an incoming call and to assign it a call ID. This
    function is similar to rxrpc_kernel_begin_call() and calls accepted must
    be ended in the same way.

    If this function is successful, an opaque reference to the RxRPC call is
    returned. The caller now holds a reference on this and it must be
    properly ended.

    (*) Reject an incoming call.

    int rxrpc_kernel_reject_call(struct socket *sock);

    This is used to reject the first incoming call on the socket's queue with
    a BUSY message. -ENODATA is returned if there were no incoming calls.
    Other errors may be returned if the call had been aborted (-ECONNABORTED)
    or had timed out (-ETIME).
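
One way a busy service could use these return codes is to drain its queue of pending calls, as in the user-space mock-up below. This is a sketch under assumptions: the struct socket shown here is a fake that replays a scripted list of outcomes, the 0-on-success return value is assumed (the text only names the error codes), and reject_all_pending() is a hypothetical helper.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical fake socket: replays a scripted list of reject outcomes. */
struct socket {
    const int *results;   /* outcome for each queued incoming call */
    size_t     nr;        /* number of queued calls */
    size_t     next;      /* index of the first call still queued */
};

/* Mock of rxrpc_kernel_reject_call(): -ENODATA once the queue is empty. */
static int rxrpc_kernel_reject_call(struct socket *sock)
{
    if (sock->next >= sock->nr)
        return -ENODATA;
    return sock->results[sock->next++];
}

/* Reject every queued call; returns how many BUSY rejections succeeded. */
static unsigned int reject_all_pending(struct socket *sock)
{
    unsigned int rejected = 0;

    for (;;) {
        int ret = rxrpc_kernel_reject_call(sock);

        if (ret == -ENODATA)
            break;                  /* nothing left on the queue */
        if (ret == 0)
            rejected++;             /* assumed success value */
        else if (ret != -ECONNABORTED && ret != -ETIME)
            break;                  /* unexpected error: stop looping */
        /* aborted or timed-out calls are already gone; keep going */
    }
    return rejected;
}
```

Treating -ECONNABORTED and -ETIME as "skip and continue" is a design choice of the sketch; the text only says these errors may be returned.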

    (*) Record the delivery of a data message and free it.

    void rxrpc_kernel_data_delivered(struct sk_buff *skb);

    This is used to record a data message as having been delivered and to
    update the ACK state for the call. The socket buffer will be freed.

    (*) Free a message.

    void rxrpc_kernel_free_skb(struct sk_buff *skb);

    This is used to free a non-DATA socket buffer intercepted from an AF_RXRPC
    socket.

    (*) Determine if a data message is the last one on a call.

    bool rxrpc_kernel_is_data_last(struct sk_buff *skb);

    This is used to determine if a socket buffer holds the last data message
    to be received for a call (true will be returned if it does, false
    if not).

    The data message will be part of the reply on a client call and the
    request on an incoming call. In the latter case there will be more
    messages, but in the former case there will not.

    (*) Get the abort code from an abort message.

    u32 rxrpc_kernel_get_abort_code(struct sk_buff *skb);

    This is used to extract the abort code from a remote abort message.

    (*) Get the error number from a local or network error message.

    int rxrpc_kernel_get_error_number(struct sk_buff *skb);

    This is used to extract the error number from a message indicating that
    either a local error or a network error occurred.

    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     
  • Provide AF_RXRPC sockets that can be used to talk to AFS servers, or serve
    answers to AFS clients. KerberosIV security is fully supported. The patches
    and some example test programs can be found in:

    http://people.redhat.com/~dhowells/rxrpc/

    This will eventually replace the old implementation of kernel-only RxRPC
    currently resident in net/rxrpc/.

    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells