17 Feb, 2020

1 commit

  • This if guards whether user-space wants a copy of the offload-jited
    bytecode and whether this bytecode exists. Erroneously doing a bitwise
    AND instead of a logical AND on the user- and kernel-space buffer sizes
    can lead to no data being copied to user space, especially when the
    user-space size is a power of two and bigger than the kernel-space
    buffer.
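
    A small stand-alone C sketch (not the kernel code itself) of why a
    bitwise AND of the two buffer sizes misbehaves for power-of-two sizes:

      #include <stdio.h>

      int main(void)
      {
              unsigned int ulen = 128; /* user-space buffer size, power of two */
              unsigned int klen = 64;  /* length of the jited image in the kernel */

              /* Bitwise AND: 128 & 64 == 0, so the guard wrongly skips the copy. */
              printf("bitwise AND: %u\n", ulen & klen);

              /* Logical AND: both sizes are non-zero, so the copy should happen. */
              printf("logical AND: %d\n", ulen && klen);
              return 0;
      }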

    Fixes: fcfb126defda ("bpf: add new jited info fields in bpf_dev_offload and bpf_prog_info")
    Signed-off-by: Johannes Krude
    Signed-off-by: Daniel Borkmann
    Acked-by: Jakub Kicinski
    Link: https://lore.kernel.org/bpf/20200212193227.GA3769@phlox.h.transitiv.net

    Johannes Krude
     

09 Dec, 2019

1 commit

  • ns_get_path() and ns_get_path_cb() only ever return either NULL or an
    ERR_PTR. It is far more idiomatic to simply return an integer, and it
    makes all of the callers of ns_get_path() more straightforward to read.
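
    A rough before/after of the calling convention (argument names
    abridged; treat this as a sketch rather than the exact diff):

      /* Before: the return value is a pointer that is only ever NULL
       * (success) or an ERR_PTR, never a real object.
       */
      void *err = ns_get_path(&path, task, ns_ops);
      if (IS_ERR(err))
              return PTR_ERR(err);

      /* After: a plain integer, 0 on success or a negative errno. */
      int ret = ns_get_path(&path, task, ns_ops);
      if (ret)
              return ret;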

    Fixes: e149ed2b805f ("take the targets of /proc/*/ns/* symlinks to separate fs")
    Signed-off-by: Aleksa Sarai
    Signed-off-by: Al Viro

    Aleksa Sarai
     

13 Feb, 2019

1 commit

  • Currently bpf_offload_dev does not have any priv pointer, forcing
    the drivers to work backwards from the netdev in program metadata.
    This is not great given that programs are conceptually associated with
    the offload device, and it means one or two unnecessary dereferences.
    Add a priv pointer to bpf_offload_dev.
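
    Roughly how a driver uses the new pointer (the nfp-flavoured names
    below are illustrative, not a quote of the driver code):

      /* at registration time the driver hands in its private state ... */
      offdev = bpf_offload_dev_create(&nfp_bpf_dev_ops, app_priv);

      /* ... and later callbacks can go straight from the offload device
       * back to that state instead of working backwards from the netdev
       */
      struct nfp_app_bpf *bpf = bpf_offload_dev_priv(offdev);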

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     

24 Jan, 2019

1 commit

  • Let offload JITs know when instructions are replaced and optimized
    out, so they can update their state appropriately. The optimizations
    are best effort; if the JIT returns an error from any callback, the
    verifier will stop notifying it, as state may now be out of sync, but
    the verifier continues making progress.
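
    The approximate shape of the new notification callbacks in struct
    bpf_prog_offload_ops (offsets in instruction units; recalled from the
    series, so treat as a sketch):

      /* insn at offset 'off' was rewritten by the verifier */
      int (*replace_insn)(struct bpf_verifier_env *env, u32 off,
                          struct bpf_insn *insn);
      /* 'cnt' insns starting at offset 'off' were optimized out */
      int (*remove_insns)(struct bpf_verifier_env *env, u32 off, u32 cnt);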

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Signed-off-by: Alexei Starovoitov

    Jakub Kicinski
     

17 Nov, 2018

1 commit

  • Pointer offload is being null checked; however, the following statement
    dereferences the potentially null pointer offload when assigning
    offload->dev_state. Fix this by only assigning it if offload is not
    null.

    Detected by CoverityScan, CID#1475437 ("Dereference after null check")
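
    A schematic before/after of the pattern being fixed (prepare_offload()
    is a stand-in name, not the real function):

      /* Before: the assignment runs even when offload is NULL. */
      if (offload)
              ret = prepare_offload(offload);
      offload->dev_state = !ret;          /* NULL dereference if !offload */

      /* After: the write sits under the same check. */
      if (offload) {
              ret = prepare_offload(offload);
              offload->dev_state = !ret;
      }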

    Fixes: 00db12c3d141 ("bpf: call verifier_prep from its callback in struct bpf_offload_dev")
    Signed-off-by: Colin Ian King
    Acked-by: Jakub Kicinski
    Signed-off-by: Alexei Starovoitov

    Colin Ian King
     

11 Nov, 2018

8 commits

  • The kernel functions to prepare verifier and translate for offloaded
    program retrieve "offload" from "prog", and "netdev" from "offload".
    Then both "prog" and "netdev" are passed to the callbacks.

    Simplify this by letting the drivers retrieve the net device themselves
    from the offload object attached to prog - if they need it at all. There
    is currently no need to pass the netdev as an argument to those
    functions.
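
    A driver that still needs the net device can reach it through the
    offload object hanging off the program, along these lines:

      struct net_device *netdev = prog->aux->offload->netdev;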

    Signed-off-by: Quentin Monnet
    Reviewed-by: Jakub Kicinski
    Signed-off-by: Alexei Starovoitov

    Quentin Monnet
     
  • Function bpf_prog_offload_verifier_prep(), called from the kernel BPF
    verifier to run a driver-specific callback for preparing for the
    verification step for offloaded programs, takes a pointer to a struct
    bpf_verifier_env object. However, no driver callback needs the whole
    structure at this time: the two drivers supporting this, nfp and
    netdevsim, only need a pointer to the struct bpf_prog instance held by
    env.

    Update the callback accordingly, on kernel side and in these two
    drivers.
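
    A sketch of the callback's signature change (member name and argument
    list recalled from the series, so treat as approximate):

      /* Before: the whole verifier state was handed to the driver. */
      int (*prepare)(struct net_device *netdev, struct bpf_verifier_env *env);

      /* After: just the program, which is all nfp and netdevsim need. */
      int (*prepare)(struct net_device *netdev, struct bpf_prog *prog);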

    Signed-off-by: Quentin Monnet
    Reviewed-by: Jakub Kicinski
    Signed-off-by: Alexei Starovoitov

    Quentin Monnet
     
  • As part of the transition from ndo_bpf() to callbacks attached to struct
    bpf_offload_dev for some of the eBPF offload operations, move the
    functions related to program destruction to the struct and remove the
    subcommand that was used to call them through the NDO.

    Remove function __bpf_offload_ndo(), which is no longer used.

    Signed-off-by: Quentin Monnet
    Reviewed-by: Jakub Kicinski
    Signed-off-by: Alexei Starovoitov

    Quentin Monnet
     
  • As part of the transition from ndo_bpf() to callbacks attached to struct
    bpf_offload_dev for some of the eBPF offload operations, move the
    functions related to code translation to the struct and remove the
    subcommand that was used to call them through the NDO.

    Signed-off-by: Quentin Monnet
    Reviewed-by: Jakub Kicinski
    Signed-off-by: Alexei Starovoitov

    Quentin Monnet
     
  • In a way similar to the change previously brought to the verify_insn
    hook and to the finalize callback, switch to the newly added ops in
    struct bpf_prog_offload for calling the functions used to prepare driver
    verifiers.

    Since the dev_ops pointer in struct bpf_prog_offload is no longer used
    by any callback, we can now remove it from struct bpf_prog_offload.

    Signed-off-by: Quentin Monnet
    Reviewed-by: Jakub Kicinski
    Signed-off-by: Alexei Starovoitov

    Quentin Monnet
     
  • In a way similar to the change previously brought to the verify_insn
    hook, switch to the newly added ops in struct bpf_prog_offload for
    calling the functions used to perform final verification steps for
    offloaded programs.

    Signed-off-by: Quentin Monnet
    Reviewed-by: Jakub Kicinski
    Signed-off-by: Alexei Starovoitov

    Quentin Monnet
     
  • We intend to remove the dev_ops in struct bpf_prog_offload, and to only
    keep the ops in struct bpf_offload_dev instead, which is accessible from
    more locations for passing function pointers.

    But dev_ops is used for calling the verify_insn hook. Switch to the
    newly added ops in struct bpf_prog_offload instead.

    To avoid table lookups for each eBPF instruction to verify, we remember
    the offdev attached to a netdev and modify bpf_offload_find_netdev() to
    avoid performing a lookup more than once for a given offload object.

    Signed-off-by: Quentin Monnet
    Reviewed-by: Jakub Kicinski
    Signed-off-by: Alexei Starovoitov

    Quentin Monnet
     
  • For passing device functions for offloaded eBPF programs, there used to
    be no place to store the pointer without making the non-offloaded
    programs pay a memory price.

    As a consequence, three functions were called with ndo_bpf() through
    specific commands. Now that we have struct bpf_offload_dev, and since
    none of those operations rely on RTNL, we can turn these three commands
    into hooks inside the struct bpf_prog_offload_ops, and pass them as part
    of bpf_offload_dev_create().

    This commit effectively passes a pointer to the struct to
    bpf_offload_dev_create(). We temporarily have two struct
    bpf_prog_offload_ops instances, one under offdev->ops and one under
    offload->dev_ops. The next patches will make the transition towards the
    former, so that offload->dev_ops can be removed, and callbacks relying
    on ndo_bpf() added to offdev->ops as well.

    While at it, rename "nfp_bpf_analyzer_ops" as "nfp_bpf_dev_ops" (and
    similarly for netdevsim).

    Suggested-by: Jakub Kicinski
    Signed-off-by: Quentin Monnet
    Reviewed-by: Jakub Kicinski
    Signed-off-by: Alexei Starovoitov

    Quentin Monnet
     

08 Oct, 2018

1 commit

  • In preparation for BPF-to-BPF calls in offloaded programs, add a new
    function attribute to the struct bpf_prog_offload_ops so that drivers
    supporting eBPF offload can hook at the end of program verification, and
    potentially extract information collected by the verifier.

    Implement a minimal callback (returning 0) in the drivers providing the
    structs, namely netdevsim and nfp.

    This will be useful in the nfp driver, in later commits, to extract the
    number of subprograms as well as the stack depth for those subprograms.
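
    The minimal driver callback described above amounts to something like
    this (netdevsim-flavoured name, shown purely as an illustration):

      static int nsim_bpf_finalize(struct bpf_verifier_env *env)
      {
              /* nothing to extract yet; accept the program */
              return 0;
      }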

    Signed-off-by: Quentin Monnet
    Reviewed-by: Jiong Wang
    Reviewed-by: Jakub Kicinski
    Signed-off-by: Daniel Borkmann

    Quentin Monnet
     

18 Jul, 2018

4 commits

  • Allow programs and maps to be re-used across different netdevs,
    as long as they belong to the same struct bpf_offload_dev.
    Update the bpf_offload_prog_map_match() helper for the verifier
    and export a new helper for the drivers to use when checking
    programs at attachment time.
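
    Driver-side usage at attachment time would look roughly like:

      /* refuse programs loaded for a different offload device */
      if (!bpf_offload_dev_match(prog, netdev))
              return -EINVAL;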

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     
  • Create a higher-level entity to represent a device/ASIC to allow
    programs and maps to be shared between device ports. The extra
    work is required to make sure we don't destroy BPF objects as
    soon as the netdev for which they were loaded gets destroyed,
    as other ports may still be using them. When netdev goes away
    all of its BPF objects will be moved to other netdevs of the
    device, and only destroyed when last netdev is unregistered.

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Acked-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     
  • Currently we have two lists of offloaded objects - programs and maps.
    Netdevice unregister notifier scans those lists to orphan objects
    associated with device being unregistered. This puts unnecessary
    (even if negligible) burden on all netdev unregister calls in a
    BPF-enabled kernel. The lists of objects may potentially get long,
    making the linear scan even more problematic. There haven't been
    complaints about this mechanism so far, but it is suboptimal.

    Instead of relying on notifiers, make the few BPF-capable drivers
    register explicitly for BPF offloads. The programs and maps will
    now be collected per-device not on a global list, and only scanned
    for removal when driver unregisters from BPF offloads.

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     
  • A set of new API functions exported for the drivers will soon use
    'bpf_offload_dev_' as a prefix. Rename the bpf_offload_dev_match()
    which is internal to the core (used by the verifier) to avoid any
    confusion.

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     

05 May, 2018

1 commit

  • BPF_MAP_TYPE_PERF_EVENT_ARRAY is special as far as offload goes.
    The map only holds glue to perf ring, not actual data. Allow
    non-offloaded perf event arrays to be used in offloaded programs.
    Offload driver can extract the events from HW and put them in
    the map for user space to retrieve.
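
    Conceptually, perf event arrays become "neutral" with respect to
    offload; a sketch of that check (helper name recalled, possibly
    approximate):

      /* such maps hold only glue to the perf ring, never offloaded data */
      static bool bpf_map_offload_neutral(const struct bpf_map *map)
      {
              return map->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY;
      }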

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Reviewed-by: Jiong Wang
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     

18 Jan, 2018

1 commit

  • For the host JIT, the "jited_len"/"bpf_func" fields in struct bpf_prog
    are used by all host JIT targets to get the jited image and its length.
    For offload, however, targets are likely to have different offload
    mechanisms, so this info is kept in device private data fields.

    Therefore, the BPF_OBJ_GET_INFO_BY_FD syscall needs a unified way to get
    JIT length and contents info for offload targets.

    One way is to introduce a new callback to parse device private data and
    then fill those fields in bpf_prog_info. This might be a little heavy;
    the other way is to add generic fields which will be initialized by all
    offload targets.

    This patch follows the second approach, introducing two new fields in
    struct bpf_dev_offload and teaching bpf_prog_get_info_by_fd about them
    so it fills in correct jited_prog_len and jited_prog_insns in
    bpf_prog_info.
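
    The two generic fields end up in the offload structure, roughly:

      struct bpf_dev_offload {
              /* existing members omitted */
              void    *jited_image;   /* filled in by the offload target */
              u32     jited_len;      /* consumed by bpf_prog_get_info_by_fd() */
      };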

    Reviewed-by: Jakub Kicinski
    Signed-off-by: Jiong Wang
    Acked-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann

    Jiong Wang
     

17 Jan, 2018

1 commit

  • Daniel suggests it would be more logical for bpf_offload_dev_match()
    to return false if either the program or the map is not offloaded,
    rather than treating the case where both are not offloaded as a
    "matching CPU/host device".

    This makes no functional difference today, since verifier only calls
    bpf_offload_dev_match() when one of the objects is offloaded.
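
    The new semantics boil down to an early bail-out of roughly this form:

      if (!bpf_prog_is_dev_bound(prog->aux) || !bpf_map_is_dev_bound(map))
              return false;   /* host-only objects never "match" a device */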

    Signed-off-by: Jakub Kicinski
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     

15 Jan, 2018

3 commits

  • BPF map offload follows a similar path to program offload. At creation
    time users may specify ifindex of the device on which they want to
    create the map. Map will be validated by the kernel's
    .map_alloc_check callback and device driver will be called for the
    actual allocation. Map will have an empty set of operations
    associated with it (save for alloc and free callbacks). The real
    device callbacks are kept in map->offload->dev_ops because they
    have slightly different signatures. Map operations are called in
    process context so the driver may communicate with HW freely,
    msleep(), wait() etc.

    Map alloc and free callbacks are muxed via existing .ndo_bpf, and
    are always called with rtnl lock held. Maps and programs are
    guaranteed to be destroyed before .ndo_uninit (i.e. before
    unregister_netdev() returns). Map callbacks are invoked with
    bpf_devs_lock *read* locked, drivers must take care of exclusive
    locking if necessary.

    All offload-specific branches are marked with unlikely() (through
    bpf_map_is_dev_bound()), given that branch penalty will be
    negligible compared to IO anyway, and we don't want to penalize
    SW path unnecessarily.
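
    The helper behind those branches is essentially a one-liner (shown as
    recalled; treat as a sketch):

      static inline bool bpf_map_is_dev_bound(struct bpf_map *map)
      {
              return unlikely(map->ops == &bpf_map_offload_ops);
      }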

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     
  • Add a helper to check whether the netdev could be found and whether it
    has a .ndo_bpf callback. There is no need to check the callback
    every time it's invoked; ndos can't reasonably be swapped for
    a set without .ndo_bpf while a program is loaded.

    bpf_dev_offload_check() will also be used by map offload.
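
    The helper amounts to roughly the following (error codes as recalled):

      static int bpf_dev_offload_check(struct net_device *netdev)
      {
              if (!netdev)
                      return -EINVAL;
              if (!netdev->netdev_ops->ndo_bpf)
                      return -EOPNOTSUPP;
              return 0;
      }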

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     
  • With map offload coming, we need to call program offload structure
    something less ambiguous. Pure rename, no functional changes.

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     

31 Dec, 2017

5 commits

  • Report to the user ifindex and namespace information of offloaded
    programs. If device has disappeared return -ENODEV. Specify the
    namespace using dev/inode combination.

    CC: Eric W. Biederman
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     
  • Bound programs are quite useless after their device disappears.
    They are simply waiting for their reference count to go to zero;
    free their ID early so they are not listed in
    BPF_PROG_GET_NEXT_ID.

    Note that orphaned offload programs will return -ENODEV on
    BPF_OBJ_GET_INFO_BY_FD so user will never see ID 0.

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Acked-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     
  • All bpf offload operations should now be under bpf_devs_lock, so
    it's safe to free and clear the entire offload structure,
    not only the netdev pointer.

    __bpf_prog_offload_destroy() will no longer be called multiple
    times.

    Suggested-by: Alexei Starovoitov
    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Acked-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     
  • To allow verifier instruction callbacks without any extra locking,
    NETDEV_UNREGISTER notification would wait on a waitqueue for verifier
    to finish. This design decision was made when rtnl lock was providing
    all the locking. Use the read/write lock instead and remove the
    workqueue.

    Verifier will now call into the offload code, so dev_ops are moved
    to offload structure. Since verifier calls are all under
    bpf_prog_is_dev_bound() we no longer need static inline implementations
    to please builds with CONFIG_NET=n.

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Acked-by: Alexei Starovoitov
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     
  • We don't need the RTNL lock for all operations on offload state.
    We only need to hold it around ndo calls. The device offload
    initialization doesn't require it. The soon-to-come querying
    of the offload info will only need it partially. We will also
    be able to remove the waitqueue in following patches.

    Use struct rw_semaphore because map offload will require sleeping
    with the semaphore held for read.

    Suggested-by: Kirill Tkhai
    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     

21 Nov, 2017

5 commits

  • This reverts commit bd601b6ada11 ("bpf: report offload info to user
    space"). The ifindex by itself is not sufficient, we should provide
    information on which network namespace this ifindex belongs to.
    After considering some options we concluded that it's best to just
    remove this API for now, and rework it in -next.

    Signed-off-by: Jakub Kicinski
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     
  • We are currently destroying the device offload state when device
    moves to another net namespace. This doesn't break with current
    NFP code, because offload state is not used on program removal,
    but it's not correct behaviour.

    Ignore the device unregister notifications on namespace move.

    Signed-off-by: Jakub Kicinski
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     
  • bpf_target_prog seems long and clunky; rename it to prog_ifindex.
    We don't want to call this field just ifindex, because maps
    may need a similar field in the future and bpf_attr members for
    programs and maps are unnamed.

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Acked-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     
  • We are currently only allowing attachment of device-bound
    cls_bpf and XDP programs. Make this restriction explicit in
    the BPF offload code. This way we can potentially reuse the
    ifindex field in the future.

    Since XDP and cls_bpf programs can only be loaded by admin,
    we can drop the explicit capability check from offload code.
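
    The explicit restriction is a simple type check at offload-init time,
    along the lines of:

      if (attr->prog_type != BPF_PROG_TYPE_SCHED_CLS &&
          attr->prog_type != BPF_PROG_TYPE_XDP)
              return -EINVAL;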

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Acked-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     
  • Offload state may get destroyed either because the device for which
    it was constructed is going away, or because the refcount of bpf
    program itself has reached 0. In both of those cases we will call
    __bpf_prog_offload_destroy() to unlink the offload from the device.
    We may in fact call it twice, which works just fine, but we should
    make clear this is intended and caution others trying to extend the
    function.

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Quentin Monnet
    Acked-by: Alexei Starovoitov
    Acked-by: Daniel Borkmann
    Signed-off-by: Daniel Borkmann

    Jakub Kicinski
     

05 Nov, 2017

2 commits

  • Extend struct bpf_prog_info to contain information about program
    being bound to a device. Since the netdev may get destroyed while
    program still exists we need a flag to indicate the program is
    loaded for a device, even if the device is gone.

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Simon Horman
    Reviewed-by: Quentin Monnet
    Signed-off-by: David S. Miller

    Jakub Kicinski
     
  • The fact that we don't know which device the program is going
    to be used on is quite limiting in current eBPF infrastructure.
    We have to reverse or limit the changes which kernel makes to
    the loaded bytecode if we want it to be offloaded to a networking
    device. We also have to invent new APIs for debugging and
    troubleshooting support.

    Make it possible to load programs for a specific netdev. This
    helps us to bring the debug information closer to the core
    eBPF infrastructure (e.g. we will be able to reuse the verifier
    log in device JIT). It allows device JITs to perform translation
    on the original bytecode.

    __bpf_prog_get(), when called to get a reference for an attachment
    point, will now refuse to give it if the program has a device assigned.
    Following patches will add a version of that function which passes
    the expected netdev in. @type argument in __bpf_prog_get() is
    renamed to attach_type to make it clearer that it's only set on
    attachment.

    All calls to ndo_bpf are protected by rtnl, only verifier callbacks
    are not. We need a wait queue to make sure netdev doesn't get
    destroyed while verifier is still running and calling its driver.

    Signed-off-by: Jakub Kicinski
    Reviewed-by: Simon Horman
    Reviewed-by: Quentin Monnet
    Signed-off-by: David S. Miller

    Jakub Kicinski