09 Aug, 2018

1 commit

  • commit 91874ecf32e41b5d86a4cb9d60e0bee50d828058 upstream.

    It's legal to have 64 groups for netlink_sock.

    As user-supplied nladdr->nl_groups is __u32, it's possible to subscribe
    only to first 32 groups.

    The check for correctness of .bind() userspace supplied parameter
    is done by applying mask made from ngroups shift. Which broke Android
    as they have 64 groups and the shift for mask resulted in an overflow.

    Fixes: 61f4b23769f0 ("netlink: Don't shift with UB on nlk->ngroups")
    Cc: "David S. Miller"
    Cc: Herbert Xu
    Cc: Steffen Klassert
    Cc: netdev@vger.kernel.org
    Cc: stable@vger.kernel.org
    Reported-and-Tested-by: Nathan Chancellor
    Signed-off-by: Dmitry Safonov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Safonov
     

06 Aug, 2018

1 commit

  • [ Upstream commit bc5b6c0b62b932626a135f516a41838c510c6eba ]

    'protocol' is a user-controlled value, so sanitize it after the bounds
    check to avoid using it for speculative out-of-bounds access to arrays
    indexed by it.

    This addresses the following accesses detected with the help of smatch:

    * net/netlink/af_netlink.c:654 __netlink_create() warn: potential
    spectre issue 'nlk_cb_mutex_keys' [w]

    * net/netlink/af_netlink.c:654 __netlink_create() warn: potential
    spectre issue 'nlk_cb_mutex_key_strings' [w]

    * net/netlink/af_netlink.c:685 netlink_create() warn: potential spectre
    issue 'nl_table' [w] (local cap)

    Cc: Josh Poimboeuf
    Signed-off-by: Jeremy Cline
    Reviewed-by: Josh Poimboeuf
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jeremy Cline
     

03 Aug, 2018

2 commits

  • [ Upstream commit 61f4b23769f0cc72ae62c9a81cf08f0397d40da8 ]

    On i386 nlk->ngroups might be 32 or 0. Which leads to UB, resulting in
    hang during boot.
    Check for 0 ngroups and use (unsigned long long) as a type to shift.

    Fixes: 7acf9d4237c4 ("netlink: Do not subscribe to non-existent groups").
    Reported-by: kernel test robot
    Signed-off-by: Dmitry Safonov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Safonov
     
  • [ Upstream commit 7acf9d4237c46894e0fa0492dd96314a41742e84 ]

    Make ABI more strict about subscribing to group > ngroups.
    Code doesn't check for that and it looks bogus.
    (one can subscribe to non-existing group)
    Still, it's possible to bind() to all possible groups with (-1)

    Cc: "David S. Miller"
    Cc: Herbert Xu
    Cc: Steffen Klassert
    Cc: netdev@vger.kernel.org
    Signed-off-by: Dmitry Safonov
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Safonov
     

16 May, 2018

1 commit

  • commit 6091f09c2f79730d895149bcfe3d66140288cd0e upstream.

    syzbot reported :

    BUG: KMSAN: uninit-value in ffs arch/x86/include/asm/bitops.h:432 [inline]
    BUG: KMSAN: uninit-value in netlink_sendmsg+0xb26/0x1310 net/netlink/af_netlink.c:1851

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Eric Dumazet
     

12 Apr, 2018

1 commit

  • [ Upstream commit 7880287981b60a6808f39f297bb66936e8bdf57a ]

    KMSAN reports use of uninitialized memory in the case when |alen| is
    smaller than sizeof(struct sockaddr_nl), and therefore |nladdr| isn't
    fully copied from the userspace.

    Signed-off-by: Alexander Potapenko
    Fixes: 1da177e4c3f41524 ("Linux-2.6.12-rc2")
    Reviewed-by: Eric Dumazet
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexander Potapenko
     

01 Apr, 2018

1 commit

  • [ Upstream commit 02a2385f37a7c6594c9d89b64c4a1451276f08eb ]

    nlmsg_multicast() consumes always the skb, thus the original skb must be
    freed only when this function is called with a clone.

    Fixes: cb9f7a9a5c96 ("netlink: ensure to loop over all netns in genlmsg_multicast_allns()")
    Reported-by: Ben Hutchings
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Nicolas Dichtel
     

09 Mar, 2018

2 commits

  • [ Upstream commit b87b6194be631c94785fe93398651e804ed43e28 ]

    Before, if cb->start() failed, the module reference would never be put,
    because cb->cb_running is intentionally false at this point. Users are
    generally annoyed by this because they can no longer unload modules that
    leak references. Also, it may be possible to tediously wrap a reference
    counter back to zero, especially since module.c still uses atomic_inc
    instead of refcount_inc.

    This patch expands the error path to simply call module_put if
    cb->start() fails.

    Fixes: 41c87425a1ac ("netlink: do not set cb_running if dump's start() errs")
    Signed-off-by: Jason A. Donenfeld
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jason A. Donenfeld
     
  • [ Upstream commit cb9f7a9a5c96a773bbc9c70660dc600cfff82f82 ]

    Nowadays, nlmsg_multicast() returns only 0 or -ESRCH but this was not the
    case when commit 134e63756d5f was pushed.
    However, there was no reason to stop the loop if a netns does not have
    listeners.
    Returns -ESRCH only if there was no listeners in all netns.

    To avoid having the same problem in the future, I didn't take the
    assumption that nlmsg_multicast() returns only 0 or -ESRCH.

    Fixes: 134e63756d5f ("genetlink: make netns aware")
    CC: Johannes Berg
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Nicolas Dichtel
     

31 Jan, 2018

2 commits

  • [ Upstream commit cd443f1e91ca600a092e780e8250cd6a2954b763 ]

    Move up the extack reset/initialization in netlink_rcv_skb, so that
    those 'goto ack' will not skip it. Otherwise, later on netlink_ack
    may use the uninitialized extack and cause kernel crash.

    Fixes: cbbdf8433a5f ("netlink: extack needs to be reset each time through loop")
    Reported-by: syzbot+03bee3680a37466775e7@syzkaller.appspotmail.com
    Signed-off-by: Xin Long
    Acked-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Xin Long
     
  • [ Upstream commit cbbdf8433a5f117b1a2119ea30fc651b61ef7570 ]

    syzbot triggered the WARN_ON in netlink_ack testing the bad_attr value.
    The problem is that netlink_rcv_skb loops over the skb repeatedly invoking
    the callback and without resetting the extack leaving potentially stale
    data. Initializing each time through avoids the WARN_ON.

    Fixes: 2d4bc93368f5a ("netlink: extended ACK reporting")
    Reported-by: syzbot+315fa6766d0f7c359327@syzkaller.appspotmail.com
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    David Ahern
     

03 Jan, 2018

1 commit

  • [ Upstream commit 93c647643b48f0131f02e45da3bd367d80443291 ]

    Currently, a nlmon link inside a child namespace can observe systemwide
    netlink activity. Filter the traffic so that nlmon can only sniff
    netlink messages from its own netns.

    Test case:

    vpnns -- bash -c "ip link add nlmon0 type nlmon; \
    ip link set nlmon0 up; \
    tcpdump -i nlmon0 -q -w /tmp/nlmon.pcap -U" &
    sudo ip xfrm state add src 10.1.1.1 dst 10.1.1.2 proto esp \
    spi 0x1 mode transport \
    auth sha1 0x6162633132330000000000000000000000000000 \
    enc aes 0x00000000000000000000000000000000
    grep --binary abc123 /tmp/nlmon.pcap

    Signed-off-by: Kevin Cernekee
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Kevin Cernekee
     

24 Nov, 2017

1 commit

  • [ Upstream commit 0642840b8bb008528dbdf929cec9f65ac4231ad0 ]

    The way people generally use netlink_dump is that they fill in the skb
    as much as possible, breaking when nla_put returns an error. Then, they
    get called again and start filling out the next skb, and again, and so
    forth. The mechanism at work here is the ability for the iterative
    dumping function to detect when the skb is filled up and not fill it
    past the brim, waiting for a fresh skb for the rest of the data.

    However, if the attributes are small and nicely packed, it is possible
    that a dump callback function successfully fills in attributes until the
    skb is of size 4080 (libmnl's default page-sized receive buffer size).
    The dump function completes, satisfied, and then, if it happens to be
    that this is actually the last skb, and no further ones are to be sent,
    then netlink_dump will add on the NLMSG_DONE part:

    nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE, sizeof(len), NLM_F_MULTI);

    It is very important that netlink_dump does this, of course. However, in
    this example, that call to nlmsg_put_answer will fail, because the
    previous filling by the dump function did not leave it enough room. And
    how could it possibly have done so? All of the nla_put variety of
    functions simply check to see if the skb has enough tailroom,
    independent of the context it is in.

    In order to keep the important assumptions of all netlink dump users, it
    is therefore important to give them an skb that has this end part of the
    tail already reserved, so that the call to nlmsg_put_answer does not
    fail. Otherwise, library authors are forced to find some bizarre sized
    receive buffer that has a large modulo relative to the common sizes of
    messages received, which is ugly and buggy.

    This patch thus saves the NLMSG_DONE for an additional message, for the
    case that things are dangerously close to the brim. This requires
    keeping track of the errno from ->dump() across calls.

    Signed-off-by: Jason A. Donenfeld
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Jason A. Donenfeld
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information it it.
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from of the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

18 Oct, 2017

1 commit

  • It seems that it's possible to toggle NETLINK_F_EXT_ACK
    through setsockopt() while another thread/CPU is building
    a message inside netlink_ack(), which could then trigger
    the WARN_ON()s I added since if it goes from being turned
    off to being turned on between allocating and filling the
    message, the skb could end up being too small.

    Avoid this whole situation by storing the value of this
    flag in a separate variable and using that throughout the
    function instead.

    Fixes: 2d4bc93368f5 ("netlink: extended ACK reporting")
    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

10 Oct, 2017

1 commit

  • It turns out that multiple places can call netlink_dump(), which means
    it's still possible to dereference partially initialized values in
    dump() that were the result of a faulty returned start().

    This fixes the issue by calling start() _before_ setting cb_running to
    true, so that there's no chance at all of hitting the dump() function
    through any indirect paths.

    It also moves the call to start() to be when the mutex is held. This has
    the nice side effect of serializing invocations to start(), which is
    likely desirable anyway. It also prevents any possible other races that
    might come out of this logic.

    In testing this with several different pieces of tricky code to trigger
    these issues, this commit fixes all avenues that I'm aware of.

    Signed-off-by: Jason A. Donenfeld
    Cc: Johannes Berg
    Reviewed-by: Johannes Berg
    Signed-off-by: David S. Miller

    Jason A. Donenfeld
     

30 Sep, 2017

1 commit

  • Drivers that use the start method for netlink dumping rely on dumpit not
    being called if start fails. For example, ila_xlat.c allocates memory
    and assigns it to cb->args[0] in its start() function. It might fail to
    do that and return -ENOMEM instead. However, even when returning an
    error, dumpit will be called, which, in the example above, quickly
    dereferences the memory in cb->args[0], which will OOPS the kernel. This
    is but one example of how this goes wrong.

    Since start() has always been a function with an int return type, it
    therefore makes sense to use it properly, rather than ignoring it. This
    patch thus returns early and does not call dumpit() when start() fails.

    Signed-off-by: Jason A. Donenfeld
    Cc: Johannes Berg
    Reviewed-by: Johannes Berg
    Signed-off-by: David S. Miller

    Jason A. Donenfeld
     

07 Sep, 2017

2 commits

  • Now there is no lock protecting nlk ngroups/groups' accessing in
    netlink bind and getname. It's safe from nlk groups' setting in
    netlink_release, but not from netlink_realloc_groups called by
    netlink_setsockopt.

    netlink_lock_table is needed in both netlink bind and getname when
    accessing nlk groups.

    Acked-by: Florian Westphal
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     
  • ChunYu found a netlink use-after-free issue by syzkaller:

    [28448.842981] BUG: KASAN: use-after-free in __nla_put+0x37/0x40 at addr ffff8807185e2378
    [28448.969918] Call Trace:
    [...]
    [28449.117207] __nla_put+0x37/0x40
    [28449.132027] nla_put+0xf5/0x130
    [28449.146261] sk_diag_fill.isra.4.constprop.5+0x5a0/0x750 [netlink_diag]
    [28449.176608] __netlink_diag_dump+0x25a/0x700 [netlink_diag]
    [28449.202215] netlink_diag_dump+0x176/0x240 [netlink_diag]
    [28449.226834] netlink_dump+0x488/0xbb0
    [28449.298014] __netlink_dump_start+0x4e8/0x760
    [28449.317924] netlink_diag_handler_dump+0x261/0x340 [netlink_diag]
    [28449.413414] sock_diag_rcv_msg+0x207/0x390
    [28449.432409] netlink_rcv_skb+0x149/0x380
    [28449.467647] sock_diag_rcv+0x2d/0x40
    [28449.484362] netlink_unicast+0x562/0x7b0
    [28449.564790] netlink_sendmsg+0xaa8/0xe60
    [28449.661510] sock_sendmsg+0xcf/0x110
    [28449.865631] __sys_sendmsg+0xf3/0x240
    [28450.000964] SyS_sendmsg+0x32/0x50
    [28450.016969] do_syscall_64+0x25c/0x6c0
    [28450.154439] entry_SYSCALL64_slow_path+0x25/0x25

    It was caused by no protection between nlk groups' free in netlink_release
    and nlk groups' accessing in sk_diag_dump_groups. The similar issue also
    exists in netlink_seq_show().

    This patch is to defer nlk groups' free in deferred_put_nlk_sk.

    Reported-by: ChunYu Wang
    Acked-by: Florian Westphal
    Signed-off-by: Xin Long
    Signed-off-by: David S. Miller

    Xin Long
     

01 Jul, 2017

3 commits

  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This allows to avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    This patch uses refcount_inc_not_zero() instead of
    atomic_inc_not_zero_hint() due to absense of a _hint()
    version of refcount API. If the hint() version must
    be used, we might need to revisit API.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     
  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This allows to avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     
  • refcount_t type and corresponding API should be
    used instead of atomic_t when the variable is used as
    a reference counter. This allows to avoid accidental
    refcounter overflows that might lead to use-after-free
    situations.

    Signed-off-by: Elena Reshetova
    Signed-off-by: Hans Liljestrand
    Signed-off-by: Kees Cook
    Signed-off-by: David Windsor
    Signed-off-by: David S. Miller

    Reshetova, Elena
     

16 Jun, 2017

2 commits

  • It seems like a historic accident that these return unsigned char *,
    and in many places that means casts are required, more often than not.

    Make these functions (skb_put, __skb_put and pskb_put) return void *
    and remove all the casts across the tree, adding a (u8 *) cast only
    where the unsigned char pointer was used directly, all done with the
    following spatch:

    @@
    expression SKB, LEN;
    typedef u8;
    identifier fn = { skb_put, __skb_put };
    @@
    - *(fn(SKB, LEN))
    + *(u8 *)fn(SKB, LEN)

    @@
    expression E, SKB, LEN;
    identifier fn = { skb_put, __skb_put };
    type T;
    @@
    - E = ((T *)(fn(SKB, LEN)))
    + E = fn(SKB, LEN)

    which actually doesn't cover pskb_put since there are only three
    users overall.

    A handful of stragglers were converted manually, notably a macro in
    drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many
    instances in net/bluetooth/hci_sock.c. In the former file, I also
    had to fix one whitespace problem spatch introduced.

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • A common pattern with skb_put() is to just want to memcpy()
    some data into the new space, introduce skb_put_data() for
    this.

    An spatch similar to the one for skb_put_zero() converts many
    of the places using it:

    @@
    identifier p, p2;
    expression len, skb, data;
    type t, t2;
    @@
    (
    -p = skb_put(skb, len);
    +p = skb_put_data(skb, data, len);
    |
    -p = (t)skb_put(skb, len);
    +p = skb_put_data(skb, data, len);
    )
    (
    p2 = (t2)p;
    -memcpy(p2, data, len);
    |
    -memcpy(p, data, len);
    )

    @@
    type t, t2;
    identifier p, p2;
    expression skb, data;
    @@
    t *p;
    ...
    (
    -p = skb_put(skb, sizeof(t));
    +p = skb_put_data(skb, data, sizeof(t));
    |
    -p = (t *)skb_put(skb, sizeof(t));
    +p = skb_put_data(skb, data, sizeof(t));
    )
    (
    p2 = (t2)p;
    -memcpy(p2, data, sizeof(*p));
    |
    -memcpy(p, data, sizeof(*p));
    )

    @@
    expression skb, len, data;
    @@
    -memcpy(skb_put(skb, len), data, len);
    +skb_put_data(skb, data, len);

    (again, manually post-processed to retain some comments)

    Reviewed-by: Stephen Hemminger
    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     

01 Jun, 2017

1 commit

  • The NETLINK_F_LISTEN_ALL_NSID otion enables to listen all netns that have a
    nsid assigned into the netns where the netlink socket is opened.
    The nsid is sent as metadata to userland, but the existence of this nsid is
    checked only for netns that are different from the socket netns. Thus, if
    no nsid is assigned to the socket netns, NETNSA_NSID_NOT_ASSIGNED is
    reported to the userland. This value is confusing and useless.
    After this patch, only valid nsid are sent to userland.

    Reported-by: Flavio Leitner
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     

14 Apr, 2017

5 commits

  • This is an add-on to the previous patch that passes the extended ACK
    structure where it's already available by existing genl_info or extack
    function arguments.

    This was done with this spatch (with some manual adjustment of
    indentation):

    @@
    expression A, B, C, D, E;
    identifier fn, info;
    @@
    fn(..., struct genl_info *info, ...) {
    ...
    -nlmsg_parse(A, B, C, D, E, NULL)
    +nlmsg_parse(A, B, C, D, E, info->extack)
    ...
    }

    @@
    expression A, B, C, D, E;
    identifier fn, info;
    @@
    fn(..., struct genl_info *info, ...) {
    extack)
    ...>
    }

    @@
    expression A, B, C, D, E;
    identifier fn, extack;
    @@
    fn(..., struct netlink_ext_ack *extack, ...) {

    }

    @@
    expression A, B, C, D, E;
    identifier fn, extack;
    @@
    fn(..., struct netlink_ext_ack *extack, ...) {

    }

    @@
    expression A, B, C, D, E;
    identifier fn, extack;
    @@
    fn(..., struct netlink_ext_ack *extack, ...) {
    ...
    -nlmsg_parse(A, B, C, D, E, NULL)
    +nlmsg_parse(A, B, C, D, E, extack)
    ...
    }

    @@
    expression A, B, C, D;
    identifier fn, extack;
    @@
    fn(..., struct netlink_ext_ack *extack, ...) {

    }

    @@
    expression A, B, C, D;
    identifier fn, extack;
    @@
    fn(..., struct netlink_ext_ack *extack, ...) {

    }

    @@
    expression A, B, C, D;
    identifier fn, extack;
    @@
    fn(..., struct netlink_ext_ack *extack, ...) {

    }

    @@
    expression A, B, C;
    identifier fn, extack;
    @@
    fn(..., struct netlink_ext_ack *extack, ...) {

    }

    Signed-off-by: Johannes Berg
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Pass the new extended ACK reporting struct to all of the generic
    netlink parsing functions. For now, pass NULL in almost all callers
    (except for some in the core.)

    Signed-off-by: Johannes Berg
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Now that we have extended error reporting and a new message format for
    netlink ACK messages, also extend this to be able to return arbitrary
    cookie data on success.

    This will allow, for example, nl80211 to not send an extra message for
    cookies identifying newly created objects, but return those directly
    in the ACK message.

    The cookie data size is currently limited to 20 bytes (since Jamal
    talked about using SHA1 for identifiers.)

    Thanks to Jamal Hadi Salim for bringing up this idea during the
    discussions.

    Signed-off-by: Johannes Berg
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Pass the extended ACK reporting struct down from generic netlink to
    the families, using the existing struct genl_info for simplicity.

    Also add support to set the extended ACK information from generic
    netlink users.

    Signed-off-by: Johannes Berg
    Reviewed-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Johannes Berg
     
  • Add the base infrastructure and UAPI for netlink extended ACK
    reporting. All "manual" calls to netlink_ack() pass NULL for now and
    thus don't get extended ACK reporting.

    Big thanks goes to Pablo Neira Ayuso for not only bringing up the
    whole topic at netconf (again) but also coming up with the nlattr
    passing trick and various other ideas.

    Signed-off-by: Johannes Berg
    Reviewed-by: David Ahern
    Signed-off-by: David S. Miller

    Johannes Berg
     

05 Apr, 2017

1 commit

  • cb_running is reported in /proc/self/net/netlink and it is reported by
    the ss tool, when it gets information from the proc files.

    sock_diag is a new interface which is used instead of proc files, so it
    looks reasonable that this interface has to report no less information
    about sockets than proc files.

    We use these flags to dump and restore netlink sockets.

    Signed-off-by: Andrei Vagin
    Signed-off-by: David S. Miller

    Andrey Vagin
     

23 Mar, 2017

1 commit

  • Commit 2ae0f17df1cd ("genetlink: use idr to track families") replaced

    if (++n < fams_to_skip)
    continue;
    into:

    if (n++ < fams_to_skip)
    continue;

    This subtle change cause that on retry ctrl_dumpfamily() call we omit
    one family that failed to do ctrl_fill_info() on previous call, because
    cb->args[0] = n number counts also family that failed to do
    ctrl_fill_info().

    Patch fixes the problem and avoid confusion in the future just decrease
    n counter when ctrl_fill_info() fail.

    User visible problem caused by this bug is failure to get access to
    some genetlink family i.e. nl80211. However problem is reproducible
    only if number of registered genetlink families is big enough to
    cause second call of ctrl_dumpfamily().

    Cc: Xose Vazquez Perez
    Cc: Larry Finger
    Cc: Johannes Berg
    Fixes: 2ae0f17df1cd ("genetlink: use idr to track families")
    Signed-off-by: Stanislaw Gruszka
    Acked-by: Johannes Berg
    Signed-off-by: David S. Miller

    Stanislaw Gruszka
     

22 Mar, 2017

1 commit

  • On Tue, Mar 14, 2017 at 10:44:10AM +0100, Dmitry Vyukov wrote:
    >
    > Yes, please.
    > Disregarding some reports is not a good way long term.

    Please try this patch.

    ---8cb_mutex are annotated by lockdep
    as a single class. This causes a false lcokdep cycle involving
    genl and crypto_user.

    This patch fixes it by dividing cb_mutex into individual classes
    based on the netlink protocol. As genl and crypto_user do not
    use the same netlink protocol this breaks the false dependency
    loop.

    Reported-by: Dmitry Vyukov
    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

28 Jan, 2017

1 commit

  • Slava Shwartsman reported a warning in skb_try_coalesce(), when we
    detect skb->truesize is completely wrong.

    In his case, issue came from IPv6 reassembly coping with malicious
    datagrams, that forced various pskb_may_pull() to reallocate a bigger
    skb->head than the one allocated by NIC driver before entering GRO
    layer.

    Current code does not change skb->truesize, leaving this burden to
    callers if they care enough.

    Blindly changing skb->truesize in pskb_expand_head() is not
    easy, as some producers might track skb->truesize, for example
    in xmit path for back pressure feedback (sk->sk_wmem_alloc)

    We can detect the cases where it should be safe to change
    skb->truesize :

    1) skb is not attached to a socket.
    2) If it is attached to a socket, destructor is sock_edemux()

    My audit gave only two callers doing their own skb->truesize
    manipulation.

    I had to remove skb parameter in sock_edemux macro when
    CONFIG_INET is not set to avoid a compile error.

    Signed-off-by: Eric Dumazet
    Reported-by: Slava Shwartsman
    Signed-off-by: David S. Miller

    Eric Dumazet
     

17 Jan, 2017

1 commit

  • In commit d35c99ff77ecb ("netlink: do not enter direct reclaim from
    netlink_dump()") we made sure to not trigger expensive memory reclaim.

    Problem is that a bit later, netlink_trim() might be called and
    trigger memory reclaim.

    netlink_trim() should be best effort, and really as fast as possible.
    Under memory pressure, it is fine to not trim this skb.

    Signed-off-by: Eric Dumazet
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller

    Eric Dumazet
     

25 Dec, 2016

1 commit


11 Dec, 2016

1 commit


07 Dec, 2016

1 commit


06 Dec, 2016

1 commit

  • It is wrong to schedule a work from sk_destruct using the socket
    as the memory reserve because the socket will be freed immediately
    after the return from sk_destruct.

    Instead we should do the deferral prior to sk_free.

    This patch does just that.

    Fixes: 707693c8a498 ("netlink: Call cb->done from a worker thread")
    Signed-off-by: Herbert Xu
    Tested-by: Andrey Konovalov
    Signed-off-by: David S. Miller

    Herbert Xu
     

04 Dec, 2016

1 commit

  • Couple conflicts resolved here:

    1) In the MACB driver, a bug fix to properly initialize the
    RX tail pointer properly overlapped with some changes
    to support variable sized rings.

    2) In XGBE we had a "CONFIG_PM" --> "CONFIG_PM_SLEEP" fix
    overlapping with a reorganization of the driver to support
    ACPI, OF, as well as PCI variants of the chip.

    3) In 'net' we had several probe error path bug fixes to the
    stmmac driver, meanwhile a lot of this code was cleaned up
    and reorganized in 'net-next'.

    4) The cls_flower classifier obtained a helper function in
    'net-next' called __fl_delete() and this overlapped with
    Daniel Borkamann's bug fix to use RCU for object destruction
    in 'net'. It also overlapped with Jiri's change to guard
    the rhashtable_remove_fast() call with a check against
    tc_skip_sw().

    5) In mlx4, a revert bug fix in 'net' overlapped with some
    unrelated changes in 'net-next'.

    6) In geneve, a stale header pointer after pskb_expand_head()
    bug fix in 'net' overlapped with a large reorganization of
    the same code in 'net-next'. Since the 'net-next' code no
    longer had the bug in question, there was nothing to do
    other than to simply take the 'net-next' hunks.

    Signed-off-by: David S. Miller

    David S. Miller