03 Apr, 2019

1 commit

  • [ Upstream commit ceabee6c59943bdd5e1da1a6a20dc7ee5f8113a2 ]

    In genl_register_family(), when idr_alloc() fails,
    we forget to free the memory we possibly allocate for
    family->attrbuf.

    Reported-by: Hulk Robot
    Fixes: 2ae0f17df1cd ("genetlink: use idr to track families")
    Signed-off-by: YueHaibing
    Reviewed-by: Kirill Tkhai
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    YueHaibing
     

06 Aug, 2018

1 commit


05 Aug, 2018

1 commit

  • It's legal to have 64 groups for netlink_sock.

    As user-supplied nladdr->nl_groups is __u32, it's possible to subscribe
    only to first 32 groups.

    The check for correctness of .bind() userspace supplied parameter
    is done by applying mask made from ngroups shift. Which broke Android
    as they have 64 groups and the shift for mask resulted in an overflow.

    Fixes: 61f4b23769f0 ("netlink: Don't shift with UB on nlk->ngroups")
    Cc: "David S. Miller"
    Cc: Herbert Xu
    Cc: Steffen Klassert
    Cc: netdev@vger.kernel.org
    Cc: stable@vger.kernel.org
    Reported-and-Tested-by: Nathan Chancellor
    Signed-off-by: Dmitry Safonov
    Signed-off-by: David S. Miller

    Dmitry Safonov
     

03 Aug, 2018

1 commit


02 Aug, 2018

1 commit

  • 'protocol' is a user-controlled value, so sanitize it after the bounds
    check to avoid using it for speculative out-of-bounds access to arrays
    indexed by it.

    This addresses the following accesses detected with the help of smatch:

    * net/netlink/af_netlink.c:654 __netlink_create() warn: potential
    spectre issue 'nlk_cb_mutex_keys' [w]

    * net/netlink/af_netlink.c:654 __netlink_create() warn: potential
    spectre issue 'nlk_cb_mutex_key_strings' [w]

    * net/netlink/af_netlink.c:685 netlink_create() warn: potential spectre
    issue 'nl_table' [w] (local cap)

    Cc: Josh Poimboeuf
    Signed-off-by: Jeremy Cline
    Reviewed-by: Josh Poimboeuf
    Signed-off-by: David S. Miller

    Jeremy Cline
     

31 Jul, 2018

1 commit

  • On i386 nlk->ngroups might be 32 or 0. Which leads to UB, resulting in
    hang during boot.
    Check for 0 ngroups and use (unsigned long long) as a type to shift.

    Fixes: 7acf9d4237c4 ("netlink: Do not subscribe to non-existent groups").
    Reported-by: kernel test robot
    Signed-off-by: Dmitry Safonov
    Signed-off-by: David S. Miller

    Dmitry Safonov
     

30 Jul, 2018

1 commit

  • Make ABI more strict about subscribing to group > ngroups.
    Code doesn't check for that and it looks bogus.
    (one can subscribe to non-existing group)
    Still, it's possible to bind() to all possible groups with (-1)

    Cc: "David S. Miller"
    Cc: Herbert Xu
    Cc: Steffen Klassert
    Cc: netdev@vger.kernel.org
    Signed-off-by: Dmitry Safonov
    Signed-off-by: David S. Miller

    Dmitry Safonov
     

25 Jul, 2018

1 commit


29 Jun, 2018

1 commit

  • The poll() changes were not well thought out, and completely
    unexplained. They also caused a huge performance regression, because
    "->poll()" was no longer a trivial file operation that just called down
    to the underlying file operations, but instead did at least two indirect
    calls.

    Indirect calls are sadly slow now with the Spectre mitigation, but the
    performance problem could at least be largely mitigated by changing the
    "->get_poll_head()" operation to just have a per-file-descriptor pointer
    to the poll head instead. That gets rid of one of the new indirections.

    But that doesn't fix the new complexity that is completely unwarranted
    for the regular case. The (undocumented) reason for the poll() changes
    was some alleged AIO poll race fixing, but we don't make the common case
    slower and more complex for some uncommon special case, so this all
    really needs way more explanations and most likely a fundamental
    redesign.

    [ This revert is a revert of about 30 different commits, not reverted
    individually because that would just be unnecessarily messy - Linus ]

    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

13 Jun, 2018

1 commit

  • The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
    patch replaces cases of:

    kmalloc(a * b, gfp)

    with:
    kmalloc_array(a * b, gfp)

    as well as handling cases of:

    kmalloc(a * b * c, gfp)

    with:

    kmalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kmalloc_array(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kmalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.

    The tools/ directory was manually excluded, since it has its own
    implementation of kmalloc().

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kmalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kmalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kmalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kmalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kmalloc
    + kmalloc_array
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kmalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kmalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kmalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kmalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kmalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kmalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kmalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kmalloc(sizeof(THING) * C2, ...)
    |
    kmalloc(sizeof(TYPE) * C2, ...)
    |
    kmalloc(C1 * C2 * C3, ...)
    |
    kmalloc(C1 * C2, ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kmalloc
    + kmalloc_array
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     

05 Jun, 2018

1 commit

  • Pull aio updates from Al Viro:
    "Majority of AIO stuff this cycle. aio-fsync and aio-poll, mostly.

    The only thing I'm holding back for a day or so is Adam's aio ioprio -
    his last-minute fixup is trivial (missing stub in !CONFIG_BLOCK case),
    but let it sit in -next for decency sake..."

    * 'work.aio-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    aio: sanitize the limit checking in io_submit(2)
    aio: fold do_io_submit() into callers
    aio: shift copyin of iocb into io_submit_one()
    aio_read_events_ring(): make a bit more readable
    aio: all callers of aio_{read,write,fsync,poll} treat 0 and -EIOCBQUEUED the same way
    aio: take list removal to (some) callers of aio_complete()
    aio: add missing break for the IOCB_CMD_FDSYNC case
    random: convert to ->poll_mask
    timerfd: convert to ->poll_mask
    eventfd: switch to ->poll_mask
    pipe: convert to ->poll_mask
    crypto: af_alg: convert to ->poll_mask
    net/rxrpc: convert to ->poll_mask
    net/iucv: convert to ->poll_mask
    net/phonet: convert to ->poll_mask
    net/nfc: convert to ->poll_mask
    net/caif: convert to ->poll_mask
    net/bluetooth: convert to ->poll_mask
    net/sctp: convert to ->poll_mask
    net/tipc: convert to ->poll_mask
    ...

    Linus Torvalds
     

26 May, 2018

1 commit


16 May, 2018

1 commit

  • Variants of proc_create{,_data} that directly take a struct seq_operations
    and deal with network namespaces in ->open and ->release. All callers of
    proc_create + seq_open_net converted over, and seq_{open,release}_net are
    removed entirely.

    Signed-off-by: Christoph Hellwig

    Christoph Hellwig
     

05 May, 2018

1 commit


08 Apr, 2018

1 commit

  • syzbot reported :

    BUG: KMSAN: uninit-value in ffs arch/x86/include/asm/bitops.h:432 [inline]
    BUG: KMSAN: uninit-value in netlink_sendmsg+0xb26/0x1310 net/netlink/af_netlink.c:1851

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Eric Dumazet
    Reported-by: syzbot
    Signed-off-by: David S. Miller

    Eric Dumazet
     

02 Apr, 2018

1 commit


28 Mar, 2018

1 commit


26 Mar, 2018

1 commit


23 Mar, 2018

1 commit

  • Fun set of conflict resolutions here...

    For the mac80211 stuff, these were fortunately just parallel
    adds. Trivially resolved.

    In drivers/net/phy/phy.c we had a bug fix in 'net' that moved the
    function phy_disable_interrupts() earlier in the file, whilst in
    'net-next' the phy_error() call from this function was removed.

    In net/ipv4/xfrm4_policy.c, David Ahern's changes to remove the
    'rt_table_id' member of rtable collided with a bug fix in 'net' that
    added a new struct member "rt_mtu_locked" which needs to be copied
    over here.

    The mlxsw driver conflict consisted of net-next separating
    the span code and definitions into separate files, whilst
    a 'net' bug fix made some changes to that moved code.

    The mlx5 infiniband conflict resolution was quite non-trivial,
    the RDMA tree's merge commit was used as a guide here, and
    here are their notes:

    ====================

    Due to bug fixes found by the syzkaller bot and taken into the for-rc
    branch after development for the 4.17 merge window had already started
    being taken into the for-next branch, there were fairly non-trivial
    merge issues that would need to be resolved between the for-rc branch
    and the for-next branch. This merge resolves those conflicts and
    provides a unified base upon which ongoing development for 4.17 can
    be based.

    Conflicts:
    drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f9524
    (IB/mlx5: Fix cleanup order on unload) added to for-rc and
    commit b5ca15ad7e61 (IB/mlx5: Add proper representors support)
    add as part of the devel cycle both needed to modify the
    init/de-init functions used by mlx5. To support the new
    representors, the new functions added by the cleanup patch
    needed to be made non-static, and the init/de-init list
    added by the representors patch needed to be modified to
    match the init/de-init list changes made by the cleanup
    patch.
    Updates:
    drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function
    prototypes added by representors patch to reflect new function
    names as changed by cleanup patch
    drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init
    stage list to match new order from cleanup patch
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     

17 Mar, 2018

1 commit

  • nlmsg_multicast() consumes always the skb, thus the original skb must be
    freed only when this function is called with a clone.

    Fixes: cb9f7a9a5c96 ("netlink: ensure to loop over all netns in genlmsg_multicast_allns()")
    Reported-by: Ben Hutchings
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     

24 Feb, 2018

1 commit


23 Feb, 2018

1 commit

  • Before, if cb->start() failed, the module reference would never be put,
    because cb->cb_running is intentionally false at this point. Users are
    generally annoyed by this because they can no longer unload modules that
    leak references. Also, it may be possible to tediously wrap a reference
    counter back to zero, especially since module.c still uses atomic_inc
    instead of refcount_inc.

    This patch expands the error path to simply call module_put if
    cb->start() fails.

    Fixes: 41c87425a1ac ("netlink: do not set cb_running if dump's start() errs")
    Signed-off-by: Jason A. Donenfeld
    Signed-off-by: David S. Miller

    Jason A. Donenfeld
     

13 Feb, 2018

4 commits

  • These pernet_operations init just allocated net memory,
    and they obviously can be executed in parallel in any
    others.

    v3: New

    Signed-off-by: Kirill Tkhai
    Acked-by: Andrei Vagin
    Signed-off-by: David S. Miller

    Kirill Tkhai
     
  • This pernet_operations create and destroy net::genl_sock.
    Foreign pernet_operations don't touch it.

    Signed-off-by: Kirill Tkhai
    Acked-by: Andrei Vagin
    Signed-off-by: David S. Miller

    Kirill Tkhai
     
  • The methods of netlink_net_ops create and destroy "netlink"
    file, which are not interesting for foreigh pernet_operations.
    So, netlink_net_ops may safely be made async.

    Signed-off-by: Kirill Tkhai
    Acked-by: Andrei Vagin
    Signed-off-by: David S. Miller

    Kirill Tkhai
     
  • Changes since v1:
    Added changes in these files:
    drivers/infiniband/hw/usnic/usnic_transport.c
    drivers/staging/lustre/lnet/lnet/lib-socket.c
    drivers/target/iscsi/iscsi_target_login.c
    drivers/vhost/net.c
    fs/dlm/lowcomms.c
    fs/ocfs2/cluster/tcp.c
    security/tomoyo/network.c

    Before:
    All these functions either return a negative error indicator,
    or store length of sockaddr into "int *socklen" parameter
    and return zero on success.

    "int *socklen" parameter is awkward. For example, if caller does not
    care, it still needs to provide on-stack storage for the value
    it does not need.

    None of the many FOO_getname() functions of various protocols
    ever used old value of *socklen. They always just overwrite it.

    This change drops this parameter, and makes all these functions, on success,
    return length of sockaddr. It's always >= 0 and can be differentiated
    from an error.

    Tests in callers are changed from "if (err)" to "if (err < 0)", where needed.

    rpc_sockname() lost "int buflen" parameter, since its only use was
    to be passed to kernel_getsockname() as &buflen and subsequently
    not used in any way.

    Userspace API is not changed.

    text data bss dec hex filename
    30108430 2633624 873672 33615726 200ef6e vmlinux.before.o
    30108109 2633612 873672 33615393 200ee21 vmlinux.o

    Signed-off-by: Denys Vlasenko
    CC: David S. Miller
    CC: linux-kernel@vger.kernel.org
    CC: netdev@vger.kernel.org
    CC: linux-bluetooth@vger.kernel.org
    CC: linux-decnet-user@lists.sourceforge.net
    CC: linux-wireless@vger.kernel.org
    CC: linux-rdma@vger.kernel.org
    CC: linux-sctp@vger.kernel.org
    CC: linux-nfs@vger.kernel.org
    CC: linux-x25@vger.kernel.org
    Signed-off-by: David S. Miller

    Denys Vlasenko
     

09 Feb, 2018

1 commit

  • Nowadays, nlmsg_multicast() returns only 0 or -ESRCH but this was not the
    case when commit 134e63756d5f was pushed.
    However, there was no reason to stop the loop if a netns does not have
    listeners.
    Returns -ESRCH only if there was no listeners in all netns.

    To avoid having the same problem in the future, I didn't take the
    assumption that nlmsg_multicast() returns only 0 or -ESRCH.

    Fixes: 134e63756d5f ("genetlink: make netns aware")
    CC: Johannes Berg
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     

20 Jan, 2018

1 commit


19 Jan, 2018

1 commit

  • Move up the extack reset/initialization in netlink_rcv_skb, so that
    those 'goto ack' will not skip it. Otherwise, later on netlink_ack
    may use the uninitialized extack and cause kernel crash.

    Fixes: cbbdf8433a5f ("netlink: extack needs to be reset each time through loop")
    Reported-by: syzbot+03bee3680a37466775e7@syzkaller.appspotmail.com
    Signed-off-by: Xin Long
    Acked-by: David Ahern
    Signed-off-by: David S. Miller

    Xin Long
     

17 Jan, 2018

2 commits

  • Overlapping changes all over.

    The mini-qdisc bits were a little bit tricky, however.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • /proc has been ignoring struct file_operations::owner field for 10 years.
    Specifically, it started with commit 786d7e1612f0b0adb6046f19b906609e4fe8b1ba
    ("Fix rmmod/read/write races in /proc entries"). Notice the chunk where
    inode->i_fop is initialized with proxy struct file_operations for
    regular files:

    - if (de->proc_fops)
    - inode->i_fop = de->proc_fops;
    + if (de->proc_fops) {
    + if (S_ISREG(inode->i_mode))
    + inode->i_fop = &proc_reg_file_ops;
    + else
    + inode->i_fop = de->proc_fops;
    + }

    VFS stopped pinning module at this point.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Alexey Dobriyan
     

16 Jan, 2018

1 commit

  • syzbot triggered the WARN_ON in netlink_ack testing the bad_attr value.
    The problem is that netlink_rcv_skb loops over the skb repeatedly invoking
    the callback and without resetting the extack leaving potentially stale
    data. Initializing each time through avoids the WARN_ON.

    Fixes: 2d4bc93368f5a ("netlink: extended ACK reporting")
    Reported-by: syzbot+315fa6766d0f7c359327@syzkaller.appspotmail.com
    Signed-off-by: David Ahern
    Signed-off-by: David S. Miller

    David Ahern
     

17 Dec, 2017

1 commit


12 Dec, 2017

1 commit

  • Currently, a nlmon link inside a child namespace can observe systemwide
    netlink activity. Filter the traffic so that nlmon can only sniff
    netlink messages from its own netns.

    Test case:

    vpnns -- bash -c "ip link add nlmon0 type nlmon; \
    ip link set nlmon0 up; \
    tcpdump -i nlmon0 -q -w /tmp/nlmon.pcap -U" &
    sudo ip xfrm state add src 10.1.1.1 dst 10.1.1.2 proto esp \
    spi 0x1 mode transport \
    auth sha1 0x6162633132330000000000000000000000000000 \
    enc aes 0x00000000000000000000000000000000
    grep --binary abc123 /tmp/nlmon.pcap

    Signed-off-by: Kevin Cernekee
    Signed-off-by: David S. Miller

    Kevin Cernekee
     

11 Dec, 2017

3 commits

  • Both netlink_add_tap() and netlink_remove_tap() are
    called in process context, no need to bother spinlock.

    Note, in fact, currently we always hold RTNL when calling
    these two functions, so we don't need any other lock at
    all, but keeping this lock doesn't harm anything.

    Cc: Daniel Borkmann
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • nlmon device is not supposed to capture netlink events from
    other netns, so instead of filtering events, we can simply
    make netlink tap itself per netns.

    Cc: Daniel Borkmann
    Cc: Kevin Cernekee
    Signed-off-by: Cong Wang
    Signed-off-by: David S. Miller

    Cong Wang
     
  • Most callers of rhashtable_walk_start don't care about a resize event
    which is indicated by a return value of -EAGAIN. So calls to
    rhashtable_walk_start are wrapped wih code to ignore -EAGAIN. Something
    like this is common:

    ret = rhashtable_walk_start(rhiter);
    if (ret && ret != -EAGAIN)
    goto out;

    Since zero and -EAGAIN are the only possible return values from the
    function this check is pointless. The condition never evaluates to true.

    This patch changes rhashtable_walk_start to return void. This simplifies
    code for the callers that ignore -EAGAIN. For the few cases where the
    caller cares about the resize event, particularly where the table can be
    walked in mulitple parts for netlink or seq file dump, the function
    rhashtable_walk_start_check has been added that returns -EAGAIN on a
    resize event.

    Signed-off-by: Tom Herbert
    Acked-by: Herbert Xu
    Signed-off-by: David S. Miller

    Tom Herbert
     

14 Nov, 2017

1 commit


13 Nov, 2017

1 commit

  • The way people generally use netlink_dump is that they fill in the skb
    as much as possible, breaking when nla_put returns an error. Then, they
    get called again and start filling out the next skb, and again, and so
    forth. The mechanism at work here is the ability for the iterative
    dumping function to detect when the skb is filled up and not fill it
    past the brim, waiting for a fresh skb for the rest of the data.

    However, if the attributes are small and nicely packed, it is possible
    that a dump callback function successfully fills in attributes until the
    skb is of size 4080 (libmnl's default page-sized receive buffer size).
    The dump function completes, satisfied, and then, if it happens to be
    that this is actually the last skb, and no further ones are to be sent,
    then netlink_dump will add on the NLMSG_DONE part:

    nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE, sizeof(len), NLM_F_MULTI);

    It is very important that netlink_dump does this, of course. However, in
    this example, that call to nlmsg_put_answer will fail, because the
    previous filling by the dump function did not leave it enough room. And
    how could it possibly have done so? All of the nla_put variety of
    functions simply check to see if the skb has enough tailroom,
    independent of the context it is in.

    In order to keep the important assumptions of all netlink dump users, it
    is therefore important to give them an skb that has this end part of the
    tail already reserved, so that the call to nlmsg_put_answer does not
    fail. Otherwise, library authors are forced to find some bizarre sized
    receive buffer that has a large modulo relative to the common sizes of
    messages received, which is ugly and buggy.

    This patch thus saves the NLMSG_DONE for an additional message, for the
    case that things are dangerously close to the brim. This requires
    keeping track of the errno from ->dump() across calls.

    Signed-off-by: Jason A. Donenfeld
    Signed-off-by: David S. Miller

    Jason A. Donenfeld
     

04 Nov, 2017

1 commit