31 Aug, 2022

1 commit

  • [ Upstream commit ba953a9d89a00c078b85f4b190bc1dde66fe16b5 ]

    When namespace support was added to xfrm/afkey, it caused the
    previously single-threaded call to xfrm_probe_algs to become
    multi-threaded. This is buggy and needs to be fixed with a mutex.

    Reported-by: Abhishek Shah
    Fixes: 283bc9f35bbb ("xfrm: Namespacify xfrm state/policy locks")
    Signed-off-by: Herbert Xu
    Signed-off-by: Steffen Klassert
    Signed-off-by: Sasha Levin

    Herbert Xu
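
The serialization this commit describes can be pictured with a small user-space sketch. It is only an illustration: `algo_table`, `algo_present()` and `probe_algs()` are hypothetical names, a pthread mutex stands in for the kernel mutex, and the probe always reports success.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of an algorithm table. */
struct algo_desc {
    const char *name;
    bool available;
};

static struct algo_desc algo_table[] = {
    { "aes", false },
    { "sha256", false },
};

/* Serializes all probers so the table is never updated concurrently. */
static pthread_mutex_t probe_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical probe: pretends every named algorithm is present. */
static bool algo_present(const char *name)
{
    assert(name != NULL);
    return true;
}

/* Re-probe the whole table; returns the number of available entries. */
int probe_algs(void)
{
    int found = 0;

    pthread_mutex_lock(&probe_mutex);
    for (size_t i = 0; i < sizeof(algo_table) / sizeof(algo_table[0]); i++) {
        algo_table[i].available = algo_present(algo_table[i].name);
        if (algo_table[i].available)
            found++;
    }
    pthread_mutex_unlock(&probe_mutex);
    return found;
}
```

Any number of threads may call `probe_algs()`; only one at a time walks and updates the table.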
     

15 Jun, 2022

1 commit

  • [ Upstream commit 9c90c9b3e50e16d03c7f87d63e9db373974781e0 ]

    This reverts commit 4dc2a5a8f6754492180741facf2a8787f2c415d7.

    A non-zero return value from pfkey_broadcast() does not necessarily mean
    an error occurred as this function returns -ESRCH when no registered
    listener received the message. In particular, a call with
    BROADCAST_PROMISC_ONLY flag and null one_sk argument can never return
    zero so that this commit in fact prevents processing any PF_KEY message.
    One visible effect is that racoon daemon fails to find encryption
    algorithms like aes and refuses to start.

    Excluding -ESRCH return value would fix this but it's not obvious that
    we really want to bail out here and most other callers of
    pfkey_broadcast() also ignore the return value. Also, as pointed out by
    Steffen Klassert, PF_KEY is kind of deprecated and newer userspace code
    should use netlink instead so that we should only disturb the code for
    really important fixes.

    v2: add a comment explaining why the return value is ignored

    Signed-off-by: Michal Kubecek
    Signed-off-by: Steffen Klassert
    Signed-off-by: Sasha Levin

    Michal Kubecek
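
A minimal sketch of the return-value semantics described above, assuming a simplified listener table. `struct listener`, `broadcast_promisc()` and `process_message()` are hypothetical names, not the kernel functions; the point is only that -ESRCH means "nobody received it", not "something failed".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical listener record; not the kernel's socket structure. */
struct listener {
    int promisc;    /* non-zero if registered as promiscuous */
    int delivered;  /* number of messages received so far */
};

/*
 * Deliver one message to every promiscuous listener.
 * Returns 0 if at least one listener got it, -ESRCH if none did.
 */
int broadcast_promisc(struct listener *ls, size_t n)
{
    int err = -ESRCH;

    assert(ls != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!ls[i].promisc)
            continue;
        ls[i].delivered++;
        err = 0;
    }
    return err;
}

/*
 * Process a message. The broadcast result is deliberately ignored:
 * having no listeners must not abort normal processing.
 */
int process_message(struct listener *ls, size_t n)
{
    (void)broadcast_promisc(ls, n);
    return 0; /* continue with normal handling */
}
```

With no promiscuous listener registered, `broadcast_promisc()` reports -ESRCH while `process_message()` still succeeds, which is the behaviour the revert restores.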
     

06 Jun, 2022

1 commit

  • [ Upstream commit 015c44d7bff3f44d569716117becd570c179ca32 ]

    Since the recent introduction supporting the SM3 and SM4 hash algos for IPsec, the kernel
    produces invalid pfkey acquire messages, when these encryption modules are disabled. This
    happens because the availability of the algos wasn't checked in all necessary functions.
    This patch adds these checks.

    Signed-off-by: Thomas Bartschies
    Signed-off-by: Steffen Klassert
    Signed-off-by: Sasha Levin

    Thomas Bartschies
     

25 May, 2022

1 commit

  • [ Upstream commit 4dc2a5a8f6754492180741facf2a8787f2c415d7 ]

    If skb_clone() returns a null pointer, pfkey_broadcast() will
    return an error.
    Therefore, it is better to check the return value of
    pfkey_broadcast() and return an error if it fails.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Jiasheng Jiang
    Signed-off-by: Steffen Klassert
    Signed-off-by: Sasha Levin

    Jiasheng Jiang
     

08 Apr, 2022

1 commit

  • [ Upstream commit 9a564bccb78a76740ea9d75a259942df8143d02c ]

    Add __GFP_ZERO flag for compose_sadb_supported in function pfkey_register
    to initialize the buffer of supp_skb to fix a kernel-info-leak issue.
    1) Function pfkey_register calls compose_sadb_supported to request
    a sk_buff. 2) compose_sadb_supported calls alloc_skb to allocate
    a sk_buff, but it doesn't zero it. 3) If auth_len is greater than 0, then
    compose_sadb_supported treats the memory as a struct sadb_supported and
    begins to initialize it. But it only initializes the fields
    sadb_supported_len and sadb_supported_exttype, leaving
    sadb_supported_reserved uninitialized.

    Reported-by: TCS Robot
    Signed-off-by: Haimin Zhang
    Signed-off-by: Steffen Klassert
    Signed-off-by: Sasha Levin

    Haimin Zhang
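
The leak and its fix can be illustrated in user space. This is a sketch under assumptions: `struct supported_hdr` only mimics the three-field shape described in the message, and `calloc()` stands in for an allocation with __GFP_ZERO.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified header with the three-field shape the commit describes. */
struct supported_hdr {
    uint16_t len;
    uint16_t exttype;
    uint32_t reserved; /* the field that is never written explicitly */
};

/*
 * Build a header in a freshly allocated buffer. calloc() plays the role
 * of a zeroing allocation: bytes that are never assigned are still zero,
 * so nothing stale can be copied out to the reader.
 */
struct supported_hdr *compose_supported(uint16_t len, uint16_t exttype)
{
    struct supported_hdr *hdr = calloc(1, sizeof(*hdr));

    if (hdr == NULL)
        return NULL;
    hdr->len = len;
    hdr->exttype = exttype;
    /* hdr->reserved is intentionally untouched; calloc() zeroed it. */
    assert(hdr->reserved == 0);
    return hdr;
}
```

Had the buffer come from `malloc()`, `reserved` would hold whatever bytes were previously in that memory.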
     

19 Mar, 2022

1 commit

  • [ Upstream commit c1aca3080e382886e2e58e809787441984a2f89b ]

    This patch enables distinguishing SAs and SPs based on if_id during
    the xfrm_migrate flow. This ensures support for xfrm interfaces
    throughout the SA/SP lifecycle.

    When there are multiple existing SPs with the same direction,
    the same xfrm_selector and different endpoint addresses,
    xfrm_migrate might fail with ENODATA.

    Specifically, the code path for performing xfrm_migrate is:
    Stage 1: find policy to migrate with
    xfrm_migrate_policy_find(sel, dir, type, net)
    Stage 2: find and update state(s) with
    xfrm_migrate_state_find(mp, net)
    Stage 3: update endpoint address(es) of template(s) with
    xfrm_policy_migrate(pol, m, num_migrate)

    Currently "Stage 1" always returns the first xfrm_policy that
    matches, and "Stage 3" looks for the xfrm_tmpl that matches the
    old endpoint address. Thus if there are multiple xfrm_policy
    with same selector, direction, type and net, "Stage 1" might
    return a wrong xfrm_policy and "Stage 3" will fail with ENODATA
    because it cannot find a xfrm_tmpl with the matching endpoint
    address.

    The fix is to allow userspace to pass an if_id and add if_id
    to the matching rule in Stage 1 and Stage 2 since if_id is a
    unique ID for xfrm_policy and xfrm_state. For compatibility,
    if_id will only be checked if the attribute is set.

    Tested with additions to Android's kernel unit test suite:
    https://android-review.googlesource.com/c/kernel/tests/+/1668886

    Signed-off-by: Yan Yan
    Signed-off-by: Steffen Klassert
    Signed-off-by: Sasha Levin

    Yan Yan
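
A reduced sketch of the "Stage 1" lookup with the optional if_id check. `struct policy` and `find_policy()` are hypothetical, and treating an if_id of zero as "attribute not set" is an assumption made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced policy record. */
struct policy {
    uint32_t selector; /* stands in for the xfrm_selector */
    int dir;
    uint32_t if_id;
};

/*
 * Return the first policy matching selector and direction. When if_id is
 * non-zero it must match too; zero is assumed to mean "not set", so
 * callers that do not pass an if_id keep the old first-match behaviour.
 */
const struct policy *find_policy(const struct policy *tbl, size_t n,
                                 uint32_t selector, int dir, uint32_t if_id)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].selector != selector || tbl[i].dir != dir)
            continue;
        if (if_id != 0 && tbl[i].if_id != if_id)
            continue;
        return &tbl[i];
    }
    return NULL;
}
```

With two policies that differ only in if_id, passing the if_id selects the intended one instead of whichever happens to come first.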
     

26 May, 2021

1 commit


04 Jan, 2021

1 commit

  • xfrm_probe_algs() probes kernel crypto modules and changes the
    availability of struct xfrm_algo_desc. But there is a small window
    where ealg->available and aalg->available get changed between
    count_ah_combs()/count_esp_combs() and dump_ah_combs()/dump_esp_combs(),
    in this case we may allocate a smaller skb but later put a larger
    amount of data and trigger the panic in skb_put().

    Fix this by relaxing the checks when counting the size, that is,
    skipping the test of ->available. We may waste some memory for a few
    of sizeof(struct sadb_comb), but it is still much better than a panic.

    Reported-by: syzbot+b2bf2652983d23734c5c@syzkaller.appspotmail.com
    Cc: Steffen Klassert
    Cc: Herbert Xu
    Signed-off-by: Cong Wang
    Signed-off-by: Steffen Klassert

    Cong Wang
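
The count-then-dump pattern and the relaxed sizing can be sketched as follows. `struct algo`, `count_combs()` and `dump_combs()` are hypothetical simplifications, not the kernel helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical algorithm entry; "available" may flip between calls. */
struct algo {
    int id;
    bool available;
};

/*
 * Size the output for the worst case: every entry counts, whether or not
 * it is currently available. This may over-allocate a few slots.
 */
size_t count_combs(const struct algo *algs, size_t n)
{
    (void)algs;
    return n;
}

/* Emit only the entries available right now; never more than cap. */
size_t dump_combs(const struct algo *algs, size_t n, int *out, size_t cap)
{
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        if (!algs[i].available)
            continue;
        assert(used < cap); /* holds when cap came from count_combs() */
        out[used++] = algs[i].id;
    }
    return used;
}
```

Because the count no longer depends on `available`, an entry that becomes available between the two calls still fits in the buffer.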
     

02 Aug, 2020

1 commit


22 Jul, 2020

1 commit

  • In pfkey_dump() dplen and splen can both be specified to access the
    xfrm_address_t structure out of bounds in __xfrm_state_filter_match()
    when it calls addr_match() with the indexes. Return EINVAL if either
    are out of range.

    Signed-off-by: Mark Salyzyn
    Cc: netdev@vger.kernel.org
    Cc: linux-kernel@vger.kernel.org
    Cc: kernel-team@android.com
    Cc: Steffen Klassert
    Cc: Herbert Xu
    Cc: "David S. Miller"
    Cc: Jakub Kicinski
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Steffen Klassert

    Mark Salyzyn
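
The range check amounts to comparing each prefix length against the address width of the family. A minimal sketch, with hypothetical helper names `max_prefixlen()` and `check_filter_prefixlens()`:

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/* Maximum prefix length in bits for a family, or -1 if unsupported. */
static int max_prefixlen(int family)
{
    switch (family) {
    case AF_INET:
        return 32;
    case AF_INET6:
        return 128;
    default:
        return -1;
    }
}

/* Reject source/destination prefix lengths wider than the address. */
int check_filter_prefixlens(int family, unsigned int splen,
                            unsigned int dplen)
{
    int max = max_prefixlen(family);

    if (max < 0)
        return -EINVAL;
    assert(max == 32 || max == 128);
    if (splen > (unsigned int)max || dplen > (unsigned int)max)
        return -EINVAL;
    return 0;
}
```

Validating once at the entry point means later address comparisons can index the address bytes without re-checking.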
     

20 Jul, 2020

1 commit


24 Jun, 2020

1 commit

  • In commit ed17b8d377ea ("xfrm: fix a warning in xfrm_policy_insert_list"),
    it would take 'priority' to make a policy unique, and allow duplicated
    policies with different 'priority' to be added, which is not expected
    by userland, as Tobias reported in strongswan.

    To fix this duplicated policies issue, and also fix the issue in
    commit ed17b8d377ea ("xfrm: fix a warning in xfrm_policy_insert_list"),
    when doing add/del/get/update on user interfaces, this patch is to change
    to look up a policy with both mark and mask by doing:

    mark.v == pol->mark.v && mark.m == pol->mark.m

    and leave the check:

    (mark & pol->mark.m) == pol->mark.v

    for tx/rx path only.

    As the userland expects an exact mark and mask match to manage policies.

    v1->v2:
    - make xfrm_policy_mark_match inline and fix the changelog as
    Tobias suggested.

    Fixes: 295fae568885 ("xfrm: Allow user space manipulation of SPD mark")
    Fixes: ed17b8d377ea ("xfrm: fix a warning in xfrm_policy_insert_list")
    Reported-by: Tobias Brunner
    Tested-by: Tobias Brunner
    Signed-off-by: Xin Long
    Signed-off-by: Steffen Klassert

    Xin Long
     

09 Jul, 2019

1 commit


06 Jul, 2019

1 commit

  • Steffen Klassert says:

    ====================
    pull request (net): ipsec 2019-07-05

    1) Fix xfrm selector prefix length validation for
    inter address family tunneling.
    From Anirudh Gupta.

    2) Fix a memleak in pfkey.
    From Jeremy Sowden.

    3) Fix SA selector validation to allow empty selectors again.
    From Nicolas Dichtel.

    4) Select crypto ciphers for xfrm_algo, this fixes some
    randconfig builds. From Arnd Bergmann.

    5) Remove a duplicated assignment in xfrm_bydst_resize.
    From Cong Wang.

    6) Fix a hlist corruption on hash rebuild.
    From Florian Westphal.

    7) Fix a memory leak when creating xfrm interfaces.
    From Nicolas Dichtel.

    Please pull or let me know if there are problems.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

10 Jun, 2019

1 commit

  • fix below warnings reported by coccicheck

    net/key/af_key.c:932:2-5: WARNING: Use BUG_ON instead of if condition
    followed by BUG.
    net/key/af_key.c:948:2-5: WARNING: Use BUG_ON instead of if condition
    followed by BUG.

    Signed-off-by: Hariprasad Kelam
    Signed-off-by: David S. Miller

    Hariprasad Kelam
     

31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

28 May, 2019

1 commit

  • In both functions, if pfkey_xfrm_policy2msg failed we leaked the newly
    allocated sk_buff. Free it on error.

    Fixes: 55569ce256ce ("Fix conversion between IPSEC_MODE_xxx and XFRM_MODE_xxx.")
    Reported-by: syzbot+4f0529365f7f2208d9f0@syzkaller.appspotmail.com
    Signed-off-by: Jeremy Sowden
    Signed-off-by: Steffen Klassert

    Jeremy Sowden
     

21 May, 2019

1 commit


26 Mar, 2019

1 commit

  • In commit 6a53b7593233 ("xfrm: check id proto in validate_tmpl()")
    I introduced a check for xfrm protocol, but according to Herbert
    IPSEC_PROTO_ANY should only be used as a wildcard for lookup, so
    it should be removed from validate_tmpl().

    And, IPSEC_PROTO_ANY is expected to only match 3 IPSec-specific
    protocols, this is why xfrm_state_flush() could still miss
    IPPROTO_ROUTING, which leaves those entries in
    net->xfrm.state_all before the net namespace exits. Fix this by
    replacing IPSEC_PROTO_ANY with zero.

    This patch also extracts the check from validate_tmpl() to
    xfrm_id_proto_valid() and uses it in parse_ipsecrequest().
    With this, no other protocols should be added into xfrm.

    Fixes: 6a53b7593233 ("xfrm: check id proto in validate_tmpl()")
    Reported-by: syzbot+0bf0519d6e0de15914fe@syzkaller.appspotmail.com
    Cc: Steffen Klassert
    Cc: Herbert Xu
    Signed-off-by: Cong Wang
    Acked-by: Herbert Xu
    Signed-off-by: Steffen Klassert

    Cong Wang
     

12 Feb, 2019

1 commit

  • Attempting to avoid cloning the skb when broadcasting by inflating
    the refcount with sock_hold/sock_put while under RCU lock is dangerous
    and violates RCU principles. It leads to subtle race conditions when
    attempting to free the SKB, as we may reference sockets that have
    already been freed by the stack.

    Unable to handle kernel paging request at virtual address 6b6b6b6b6b6c4b
    [006b6b6b6b6b6c4b] address between user and kernel address ranges
    Internal error: Oops: 96000004 [#1] PREEMPT SMP
    task: fffffff78f65b380 task.stack: ffffff8049a88000
    pc : sock_rfree+0x38/0x6c
    lr : skb_release_head_state+0x6c/0xcc
    Process repro (pid: 7117, stack limit = 0xffffff8049a88000)
    Call trace:
    sock_rfree+0x38/0x6c
    skb_release_head_state+0x6c/0xcc
    skb_release_all+0x1c/0x38
    __kfree_skb+0x1c/0x30
    kfree_skb+0xd0/0xf4
    pfkey_broadcast+0x14c/0x18c
    pfkey_sendmsg+0x1d8/0x408
    sock_sendmsg+0x44/0x60
    ___sys_sendmsg+0x1d0/0x2a8
    __sys_sendmsg+0x64/0xb4
    SyS_sendmsg+0x34/0x4c
    el0_svc_naked+0x34/0x38
    Kernel panic - not syncing: Fatal exception

    Suggested-by: Eric Dumazet
    Signed-off-by: Sean Tranchetti
    Signed-off-by: Steffen Klassert

    Sean Tranchetti
     

05 Feb, 2019

1 commit

  • xfrm_state_put() moves struct xfrm_state to the GC list
    and schedules the GC work to clean it up. On net exit call
    path, xfrm_state_flush() is called to clean up and
    xfrm_flush_gc() is called to wait for the GC work to complete
    before exit.

    However, this doesn't work because one of the ->destructor(),
    ipcomp_destroy(), schedules the same GC work again inside
    the GC work. It is hard to wait for such a nested async
    callback. This is also why syzbot still reports the following
    warning:

    WARNING: CPU: 1 PID: 33 at net/ipv6/xfrm6_tunnel.c:351 xfrm6_tunnel_net_exit+0x2cb/0x500 net/ipv6/xfrm6_tunnel.c:351
    ...
    ops_exit_list.isra.0+0xb0/0x160 net/core/net_namespace.c:153
    cleanup_net+0x51d/0xb10 net/core/net_namespace.c:551
    process_one_work+0xd0c/0x1ce0 kernel/workqueue.c:2153
    worker_thread+0x143/0x14a0 kernel/workqueue.c:2296
    kthread+0x357/0x430 kernel/kthread.c:246
    ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352

    In fact, it is perfectly fine to bypass GC and destroy xfrm_state
    synchronously on net exit call path, because it is in process context
    and doesn't need a work struct to do any blocking work.

    This patch introduces xfrm_state_put_sync() which simply bypasses
    GC, and lets its callers decide whether to use this synchronous
    version. On net exit path, xfrm_state_fini() and
    xfrm6_tunnel_net_exit() use it. And, as ipcomp_destroy() itself is
    blocking, it can use xfrm_state_put_sync() directly too.

    Also rename xfrm_state_gc_destroy() to ___xfrm_state_destroy() to
    reflect this change.

    Fixes: b48c05ab5d32 ("xfrm: Fix warning in xfrm6_tunnel_net_exit.")
    Reported-and-tested-by: syzbot+e9aebef558e3ed673934@syzkaller.appspotmail.com
    Cc: Steffen Klassert
    Signed-off-by: Cong Wang
    Signed-off-by: Steffen Klassert

    Cong Wang
     

16 Nov, 2018

1 commit


28 Jul, 2018

1 commit

  • Steffen Klassert says:

    ====================
    pull request (net-next): ipsec-next 2018-07-27

    1) Extend the output_mark to also support the input direction
    and masking the mark values before applying to the skb.

    2) Add a new lookup key for the upcoming xfrm interfaces.

    3) Extend the xfrm lookups to match xfrm interface IDs.

    4) Add virtual xfrm interfaces. The purpose of these interfaces
    is to overcome the design limitations that the existing
    VTI devices have.

    The main limitations that we see with the current VTI are the
    following:

    VTI interfaces are L3 tunnels with configurable endpoints.
    For xfrm, the tunnel endpoints are already determined by the SA.
    So the VTI tunnel endpoints must be either the same as on the
    SA or wildcards. In case VTI tunnel endpoints are same as on
    the SA, we get a one to one correlation between the SA and
    the tunnel. So each SA needs its own tunnel interface.

    On the other hand, we can have only one VTI tunnel with
    wildcard src/dst tunnel endpoints in the system because the
    lookup is based on the tunnel endpoints. The existing tunnel
    lookup won't work with multiple tunnels with wildcard
    tunnel endpoints. Some usecases require more than one
    VTI tunnel of this type, for example if somebody has multiple
    namespaces and every namespace requires such a VTI.

    VTI needs separate interfaces for IPv4 and IPv6 tunnels.
    So when routing to a VTI, we have to know to which address
    family this traffic class is going to be encapsulated.
    This is a limitation because it makes routing more complex
    and it is not always possible to know what happens behind the
    VTI, e.g. when the VTI is moved to some namespace.

    VTI works just with tunnel mode SAs. We need generic interfaces
    that ensure transformation, regardless of the xfrm mode and
    the encapsulated address family.

    VTI is configured with a combination of GRE keys and xfrm marks.
    With this we have to deal with some extra cases in the generic
    tunnel lookup because the GRE keys on the VTI are actually
    not GRE keys, the GRE keys were just reused for something else.
    All extensions to the VTI interfaces would require to add
    even more complexity to the generic tunnel lookup.

    So to overcome this, we developed xfrm interfaces with the
    following design goal:

    It should be possible to tunnel IPv4 and IPv6 through the same
    interface.

    No limitation on xfrm mode (tunnel, transport and beet).

    Should be a generic virtual interface that ensures IPsec
    transformation, no need to know what happens behind the
    interface.

    Interfaces should be configured with a new key that must match a
    new policy/SA lookup key.

    The lookup logic should stay in the xfrm codebase, no need to
    change or extend generic routing and tunnel lookups.

    Should be possible to use IPsec hardware offloads of the underlying
    interface.

    5) Remove xfrm pcpu policy cache. This was added after the flowcache
    removal, but it turned out to make things even worse.
    From Florian Westphal.

    6) Allow to update the set mark on SA updates.
    From Nathan Harold.

    7) Convert some timestamps to time64_t.
    From Arnd Bergmann.

    8) Don't check the offload_handle in xfrm code,
    it is an opaque data cookie for the driver.
    From Shannon Nelson.

    9) Remove xfrmi interface ID from flowi. After this patch
    no generic code is touched anymore to do xfrm interface
    lookups. From Benedict Wong.

    10) Allow to update the xfrm interface ID on SA updates.
    From Nathan Harold.

    11) Don't pass zero to ERR_PTR() in xfrm_resolve_and_create_bundle.
    From YueHaibing.

    12) Return more detailed errors on xfrm interface creation.
    From Benedict Wong.

    13) Use PTR_ERR_OR_ZERO instead of IS_ERR + PTR_ERR.
    From the kbuild test robot.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

29 Jun, 2018

1 commit

  • The poll() changes were not well thought out, and completely
    unexplained. They also caused a huge performance regression, because
    "->poll()" was no longer a trivial file operation that just called down
    to the underlying file operations, but instead did at least two indirect
    calls.

    Indirect calls are sadly slow now with the Spectre mitigation, but the
    performance problem could at least be largely mitigated by changing the
    "->get_poll_head()" operation to just have a per-file-descriptor pointer
    to the poll head instead. That gets rid of one of the new indirections.

    But that doesn't fix the new complexity that is completely unwarranted
    for the regular case. The (undocumented) reason for the poll() changes
    was some alleged AIO poll race fixing, but we don't make the common case
    slower and more complex for some uncommon special case, so this all
    really needs way more explanations and most likely a fundamental
    redesign.

    [ This revert is a revert of about 30 different commits, not reverted
    individually because that would just be unnecessarily messy - Linus ]

    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

23 Jun, 2018

1 commit

  • This patch adds the xfrm interface id as a lookup key
    for xfrm states and policies. With this we can assign
    states and policies to virtual xfrm interfaces.

    Signed-off-by: Steffen Klassert
    Acked-by: Shannon Nelson
    Acked-by: Benedict Wong
    Tested-by: Benedict Wong
    Tested-by: Antony Antony
    Reviewed-by: Eyal Birger

    Steffen Klassert
     

05 Jun, 2018

1 commit

  • Pull aio updates from Al Viro:
    "Majority of AIO stuff this cycle. aio-fsync and aio-poll, mostly.

    The only thing I'm holding back for a day or so is Adam's aio ioprio -
    his last-minute fixup is trivial (missing stub in !CONFIG_BLOCK case),
    but let it sit in -next for decency sake..."

    * 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    aio: sanitize the limit checking in io_submit(2)
    aio: fold do_io_submit() into callers
    aio: shift copyin of iocb into io_submit_one()
    aio_read_events_ring(): make a bit more readable
    aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the same way
    aio: take list removal to (some) callers of aio_complete()
    aio: add missing break for the IOCB_CMD_FDSYNC case
    random: convert to ->poll_mask
    timerfd: convert to ->poll_mask
    eventfd: switch to ->poll_mask
    pipe: convert to ->poll_mask
    crypto: af_alg: convert to ->poll_mask
    net/rxrpc: convert to ->poll_mask
    net/iucv: convert to ->poll_mask
    net/phonet: convert to ->poll_mask
    net/nfc: convert to ->poll_mask
    net/caif: convert to ->poll_mask
    net/bluetooth: convert to ->poll_mask
    net/sctp: convert to ->poll_mask
    net/tipc: convert to ->poll_mask
    ...

    Linus Torvalds
     

26 May, 2018

1 commit


16 May, 2018

1 commit

  • Variants of proc_create{,_data} that directly take a struct seq_operations
    and deal with network namespaces in ->open and ->release. All callers of
    proc_create + seq_open_net converted over, and seq_{open,release}_net are
    removed entirely.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     

09 Apr, 2018

1 commit

  • Key extensions (struct sadb_key) include a user-specified number of key
    bits. The kernel uses that number to determine how much key data to copy
    out of the message in pfkey_msg2xfrm_state().

    The length of the sadb_key message must be verified to be long enough,
    even in the case of SADB_X_AALG_NULL. Furthermore, the sadb_key_len value
    must be long enough to include both the key data and the struct sadb_key
    itself.

    Introduce a helper function verify_key_len(), and call it from
    parse_exthdrs() where other exthdr types are similarly checked for
    correctness.

    Signed-off-by: Kevin Easton
    Reported-by: syzbot+5022a34ca5a3d49b84223653fab632dfb7b4cf37@syzkaller.appspotmail.com
    Signed-off-by: Steffen Klassert

    Kevin Easton
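
The length rule described above can be written out as a short check. This is a sketch, not the kernel helper: `struct key_ext` mimics an 8-byte key extension header whose `len` is counted in 8-byte units, and `verify_key_ext_len()` is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified key extension header; len is in 8-byte units. */
struct key_ext {
    uint16_t len;
    uint16_t exttype;
    uint16_t bits;     /* user-specified number of key bits */
    uint16_t reserved;
};

/*
 * The declared extension length must cover the header itself plus the
 * key bytes implied by "bits", rounded up to whole 8-byte units.
 */
int verify_key_ext_len(const struct key_ext *key)
{
    size_t key_bytes;
    size_t needed_units;

    assert(key != NULL);
    key_bytes = ((size_t)key->bits + 7u) / 8u;
    needed_units = (sizeof(*key) + key_bytes + 7u) / 8u;
    if (needed_units > key->len)
        return -EINVAL;
    return 0;
}
```

A 128-bit key needs 16 key bytes plus the 8-byte header, so three units; a message that declares fewer is rejected before any key data is copied.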
     

10 Jan, 2018

1 commit


30 Dec, 2017

2 commits

  • If a message sent to a PF_KEY socket ended with an incomplete extension
    header (fewer than 4 bytes remaining), then parse_exthdrs() read past
    the end of the message, into uninitialized memory. Fix it by returning
    -EINVAL in this case.

    Reproducer:

    #include <linux/pfkeyv2.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main()
    {
    int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
    char buf[17] = { 0 };
    struct sadb_msg *msg = (void *)buf;

    msg->sadb_msg_version = PF_KEY_V2;
    msg->sadb_msg_type = SADB_DELETE;
    msg->sadb_msg_len = 2;

    write(sock, buf, 17);
    }

    Cc: stable@vger.kernel.org
    Signed-off-by: Eric Biggers
    Signed-off-by: Steffen Klassert

    Eric Biggers
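
The bounds check can be sketched as a user-space walk over extension headers. `struct ext_hdr` and `walk_ext_hdrs()` are hypothetical; the header is assumed to be 4 bytes with its length counted in 8-byte units.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified 4-byte extension header; len is in 8-byte units. */
struct ext_hdr {
    uint16_t len;
    uint16_t type;
};

/*
 * Walk the extensions in p[0..remaining). Returns the number of
 * extensions, or -EINVAL if a header is truncated or its declared
 * length is zero or runs past the end of the buffer.
 */
int walk_ext_hdrs(const uint8_t *p, size_t remaining)
{
    int count = 0;

    assert(p != NULL || remaining == 0);
    while (remaining > 0) {
        struct ext_hdr hdr;
        size_t ext_bytes;

        /* The check the fix adds: a full header must be present. */
        if (remaining < sizeof(hdr))
            return -EINVAL;
        memcpy(&hdr, p, sizeof(hdr));
        ext_bytes = (size_t)hdr.len * 8u;
        if (ext_bytes == 0 || ext_bytes > remaining)
            return -EINVAL;
        p += ext_bytes;
        remaining -= ext_bytes;
        count++;
    }
    return count;
}
```

The reproducer in the message leaves a single byte after the base header; in this sketch that corresponds to calling `walk_ext_hdrs()` with `remaining == 1`, which is rejected.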
     
  • If a message sent to a PF_KEY socket ended with one of the extensions
    that takes a 'struct sadb_address' but there were not enough bytes
    remaining in the message for the ->sa_family member of the 'struct
    sockaddr' which is supposed to follow, then verify_address_len() read
    past the end of the message, into uninitialized memory. Fix it by
    returning -EINVAL in this case.

    This bug was found using syzkaller with KMSAN.

    Reproducer:

    #include <linux/pfkeyv2.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main()
    {
    int sock = socket(PF_KEY, SOCK_RAW, PF_KEY_V2);
    char buf[24] = { 0 };
    struct sadb_msg *msg = (void *)buf;
    struct sadb_address *addr = (void *)(msg + 1);

    msg->sadb_msg_version = PF_KEY_V2;
    msg->sadb_msg_type = SADB_DELETE;
    msg->sadb_msg_len = 3;
    addr->sadb_address_len = 1;
    addr->sadb_address_exttype = SADB_EXT_ADDRESS_SRC;

    write(sock, buf, 24);
    }

    Reported-by: Alexander Potapenko
    Cc: stable@vger.kernel.org
    Signed-off-by: Eric Biggers
    Signed-off-by: Steffen Klassert

    Eric Biggers
     

14 Nov, 2017

1 commit


16 Aug, 2017

1 commit


15 Aug, 2017

1 commit

  • pfkey_broadcast() might be called from non process contexts,
    we can not use GFP_KERNEL in these cases [1].

    This patch partially reverts commit ba51b6be38c1 ("net: Fix RCU splat in
    af_key"), only keeping the GFP_ATOMIC forcing under rcu_read_lock()
    section.

    [1] : syzkaller reported :

    in_atomic(): 1, irqs_disabled(): 0, pid: 2932, name: syzkaller183439
    3 locks held by syzkaller183439/2932:
    #0: (&net->xfrm.xfrm_cfg_mutex){+.+.+.}, at: [] pfkey_sendmsg+0x4c8/0x9f0 net/key/af_key.c:3649
    #1: (&pfk->dump_lock){+.+.+.}, at: [] pfkey_do_dump+0x76/0x3f0 net/key/af_key.c:293
    #2: (&(&net->xfrm.xfrm_policy_lock)->rlock){+...+.}, at: [] spin_lock_bh include/linux/spinlock.h:304 [inline]
    #2: (&(&net->xfrm.xfrm_policy_lock)->rlock){+...+.}, at: [] xfrm_policy_walk+0x192/0xa30 net/xfrm/xfrm_policy.c:1028
    CPU: 0 PID: 2932 Comm: syzkaller183439 Not tainted 4.13.0-rc4+ #24
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
    Call Trace:
    __dump_stack lib/dump_stack.c:16 [inline]
    dump_stack+0x194/0x257 lib/dump_stack.c:52
    ___might_sleep+0x2b2/0x470 kernel/sched/core.c:5994
    __might_sleep+0x95/0x190 kernel/sched/core.c:5947
    slab_pre_alloc_hook mm/slab.h:416 [inline]
    slab_alloc mm/slab.c:3383 [inline]
    kmem_cache_alloc+0x24b/0x6e0 mm/slab.c:3559
    skb_clone+0x1a0/0x400 net/core/skbuff.c:1037
    pfkey_broadcast_one+0x4b2/0x6f0 net/key/af_key.c:207
    pfkey_broadcast+0x4ba/0x770 net/key/af_key.c:281
    dump_sp+0x3d6/0x500 net/key/af_key.c:2685
    xfrm_policy_walk+0x2f1/0xa30 net/xfrm/xfrm_policy.c:1042
    pfkey_dump_sp+0x42/0x50 net/key/af_key.c:2695
    pfkey_do_dump+0xaa/0x3f0 net/key/af_key.c:299
    pfkey_spddump+0x1a0/0x210 net/key/af_key.c:2722
    pfkey_process+0x606/0x710 net/key/af_key.c:2814
    pfkey_sendmsg+0x4d6/0x9f0 net/key/af_key.c:3650
    sock_sendmsg_nosec net/socket.c:633 [inline]
    sock_sendmsg+0xca/0x110 net/socket.c:643
    ___sys_sendmsg+0x755/0x890 net/socket.c:2035
    __sys_sendmsg+0xe5/0x210 net/socket.c:2069
    SYSC_sendmsg net/socket.c:2080 [inline]
    SyS_sendmsg+0x2d/0x50 net/socket.c:2076
    entry_SYSCALL_64_fastpath+0x1f/0xbe
    RIP: 0033:0x445d79
    RSP: 002b:00007f32447c1dc8 EFLAGS: 00000202 ORIG_RAX: 000000000000002e
    RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000445d79
    RDX: 0000000000000000 RSI: 000000002023dfc8 RDI: 0000000000000008
    RBP: 0000000000000086 R08: 00007f32447c2700 R09: 00007f32447c2700
    R10: 00007f32447c2700 R11: 0000000000000202 R12: 0000000000000000
    R13: 00007ffe33edec4f R14: 00007f32447c29c0 R15: 0000000000000000

    Fixes: ba51b6be38c1 ("net: Fix RCU splat in af_key")
    Signed-off-by: Eric Dumazet
    Reported-by: Dmitry Vyukov
    Cc: David Ahern
    Acked-by: David Ahern
    Signed-off-by: David S. Miller

    Eric Dumazet
     

19 Jul, 2017

1 commit

  • After rcu conversions performance degradation in forward tests isn't that
    noticeable anymore.

    See next patch for some numbers.

    A followup patch could then also remove genid from the policies
    as we do not cache bundles anymore.

    Signed-off-by: Florian Westphal
    Signed-off-by: David S. Miller

    Florian Westphal
     

05 Jul, 2017

1 commit

  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This allows to avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     

01 Jul, 2017

3 commits

  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This allows to avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    This patch uses refcount_inc_not_zero() instead of
    atomic_inc_not_zero_hint() due to absence of a _hint()
    version of refcount API. If the hint() version must
    be used, we might need to revisit API.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
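
The idea behind an "increment unless zero" reference counter that cannot overflow can be sketched with C11 atomics. This is not the kernel's refcount_t implementation: `ref_inc_not_zero()` is a hypothetical name, and refusing the increment at UINT_MAX is a simplification chosen for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Take a reference only if the object is still live (count != 0) and
 * the counter would not wrap. Returns true if a reference was taken.
 */
bool ref_inc_not_zero(atomic_uint *ref)
{
    unsigned int old;

    assert(ref != NULL);
    old = atomic_load(ref);
    while (old != 0 && old != UINT_MAX) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(ref, &old, old + 1))
            return true;
    }
    return false;
}
```

A plain atomic increment would wrap from UINT_MAX to zero, after which a later decrement could free an object that is still in use; refusing the increment avoids that wrap.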
     
  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This allows to avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     
  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This allows to avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena