23 May, 2019

1 commit

  • This adds the ability for Netlink to report a socket's UID along with the
    other UNIX diagnostic information that is already available. This will
    allow diagnostic tools greater insight into which users control which
    socket.

    To test this, do the following as a non-root user:

    unshare -U -r bash
    nc -l -U user.socket.$$ &

    .. and verify from within that same session that Netlink UNIX socket
    diagnostics report the socket's UID as 0. Also verify that Netlink UNIX
    socket diagnostics report the socket's UID as the user's UID from an
    unprivileged process in a different session. Verify the same from
    a root process.

    Signed-off-by: Felipe Gasper
    Signed-off-by: David S. Miller

    Felipe Gasper
     

21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have MODULE_LICENSE("GPL*") inside which was used in the initial
      scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

21 Feb, 2019

1 commit

  • Several u->addr and u->path users are not holding any locks in
    common with unix_bind(). unix_state_lock() is useless for those
    purposes.

    u->addr is assign-once and *(u->addr) is fully set up by the time
    we set u->addr (all under unix_table_lock). u->path is also
    set in the same critical area, also before setting u->addr, and
    any unix_sock with ->path filled will have non-NULL ->addr.

    So setting ->addr with smp_store_release() is all we need for those
    "lockless" users - just have them fetch ->addr with smp_load_acquire()
    and don't even bother looking at ->path if they see NULL ->addr.

    Users of ->addr and ->path fall into several classes now:
    1) ones that do smp_load_acquire(u->addr) and access *(u->addr)
    and u->path only if smp_load_acquire() has returned non-NULL.
    2) places holding unix_table_lock. These are guaranteed that
    *(u->addr) is seen fully initialized. If unix_sock is in one of the
    "bound" chains, so's ->path.
    3) unix_sock_destructor() using ->addr is safe. All places
    that set u->addr are guaranteed to have seen all stores *(u->addr)
    while holding a reference to u and unix_sock_destructor() is called
    when (atomic) refcount hits zero.
    4) unix_release_sock() using ->path is safe. unix_bind()
    is serialized wrt unix_release() (normally - by struct file
    refcount), and for the instances that had ->path set by unix_bind()
    unix_release_sock() comes from unix_release(), so they are fine.
    Instances that had it set in unix_stream_connect() either end up
    attached to a socket (in unix_accept()), in which case the call
    chain to unix_release_sock() and serialization are the same as in
    the previous case, or they never get accept'ed and unix_release_sock()
    is called when the listener is shut down and its queue gets purged.
    In that case the listener's queue lock provides the barriers needed -
    unix_stream_connect() shoves our unix_sock into listener's queue
    under that lock right after having set ->path and eventual
    unix_release_sock() caller picks them from that queue under the
    same lock right before calling unix_release_sock().
    5) unix_find_other() use of ->path is pointless, but safe -
    it happens with successful lookup by (abstract) name, so ->path.dentry
    is guaranteed to be NULL there.
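
    The publish/consume rule for class 1 users can be sketched in userspace
    with C11 atomics. This is only an illustration under assumptions: the
    demo_* types and functions are hypothetical stand-ins, and
    memory_order_release / memory_order_acquire play the role that
    smp_store_release() / smp_load_acquire() play in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the address object that u->addr points to. */
struct demo_addr {
    int  len;
    char name[108];
};

/* Hypothetical stand-in for struct unix_sock. */
struct demo_sock {
    const char *path;                  /* plain field, written before publishing */
    _Atomic(struct demo_addr *) addr;  /* assign-once, NULL until bound */
};

/*
 * Writer side: fully initialise *addr and ->path first, then publish the
 * pointer with a release store so that the earlier writes are visible to
 * any reader that observes the non-NULL pointer.
 */
void demo_bind(struct demo_sock *u, struct demo_addr *addr,
               const char *name, const char *path)
{
    assert(strlen(name) < sizeof(addr->name));
    addr->len = (int)strlen(name);
    strcpy(addr->name, name);
    u->path = path;
    atomic_store_explicit(&u->addr, addr, memory_order_release);
}

/*
 * Lockless reader: fetch ->addr with an acquire load and look at *addr
 * and ->path only if the pointer is non-NULL.
 * Returns the name length, or -1 if the socket is not bound yet.
 */
int demo_read(struct demo_sock *u, const char **path_out)
{
    struct demo_addr *addr =
        atomic_load_explicit(&u->addr, memory_order_acquire);

    if (!addr)
        return -1;          /* unbound: do not even look at ->path */
    *path_out = u->path;
    return addr->len;
}
```

    A reader that gets -1 from demo_read() leaves *path_out untouched, which
    mirrors "don't even bother looking at ->path if they see NULL ->addr".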

    earlier-variant-reviewed-by: "Paul E. McKenney"
    Signed-off-by: Al Viro
    Signed-off-by: David S. Miller

    Al Viro
     

26 Oct, 2017

1 commit

  • socket_diag shows information only about sockets from a namespace where
    a diag socket lives.

    But if we request information about a single unix socket, the kernel
    doesn't check that its netns matches the diag socket's namespace, so any
    user can get information about any unix socket in the system. This looks
    like a bug.
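
    The missing check can be illustrated with a small userspace mock. All
    names here (demo_net, demo_usock, demo_get_exact) are hypothetical, and
    the pointer comparison only stands in for whatever namespace comparison
    the kernel code performs; it is a sketch of the idea, not the patch.

```c
#include <assert.h>
#include <stddef.h>

struct demo_net { int id; };            /* stand-in for a network namespace */

struct demo_usock {                     /* stand-in for a unix socket */
    unsigned int           ino;
    const struct demo_net *net;
};

/* Linear scan by inode number, with no namespace check at all. */
const struct demo_usock *demo_lookup_by_ino(const struct demo_usock *tbl,
                                            size_t n, unsigned int ino)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (tbl[i].ino == ino)
            return &tbl[i];
    return NULL;
}

/*
 * Exact lookup on behalf of a diag socket that lives in req_net.
 * A socket found in a different namespace is reported as "not found".
 */
const struct demo_usock *demo_get_exact(const struct demo_usock *tbl,
                                        size_t n, unsigned int ino,
                                        const struct demo_net *req_net)
{
    const struct demo_usock *sk = demo_lookup_by_ino(tbl, n, ino);

    if (sk && sk->net != req_net)
        return NULL;    /* wrong namespace: do not reveal the socket */
    return sk;
}
```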

    v2: add a Fixes tag

    Fixes: 51d7cccf0723 ("net: make sock diag per-namespace")
    Signed-off-by: Andrei Vagin
    Signed-off-by: David S. Miller

    Andrei Vagin
     

20 Feb, 2016

1 commit

  • The value passed by unix_diag_get_exact to unix_lookup_by_ino has type
    __u32, but unix_lookup_by_ino's argument ino has type int, which is not
    a problem yet.
    However, when ino is compared with the sock_i_ino return value of type
    unsigned long, ino is sign extended to signed long, and this results
    in an incorrect comparison on 64-bit architectures for inode numbers
    greater than INT_MAX.
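
    The effect can be reproduced outside the kernel. The sketch below uses
    hypothetical helper names and assumes a 32-bit int, a compiler that
    wraps on out-of-range conversion to int (as GCC and Clang do), and, for
    the buggy case to misbehave, a 64-bit unsigned long.

```c
#include <assert.h>

static_assert(sizeof(int) == 4, "sketch assumes a 32-bit int");

/*
 * Buggy variant: the inode number arrives as int. The cast spells out
 * what the usual arithmetic conversions do implicitly: a negative int is
 * sign extended when converted to a 64-bit unsigned long.
 */
int demo_ino_matches_buggy(int ino, unsigned long sock_ino)
{
    return (unsigned long)ino == sock_ino;
}

/*
 * Fixed variant: an unsigned parameter is zero extended, so the
 * comparison stays correct for values above INT_MAX.
 */
int demo_ino_matches_fixed(unsigned int ino, unsigned long sock_ino)
{
    return ino == sock_ino;
}
```

    With an inode number of 3000000000, the fixed variant matches, while the
    buggy one compares 0xffffffffb2d05e00 against 0xb2d05e00 and fails on
    64-bit architectures.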

    This bug was found by strace test suite.

    Fixes: 5d3cae8bc39d ("unix_diag: Dumping exact socket core")
    Signed-off-by: Dmitry V. Levin
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller

    Dmitry V. Levin
     

16 Apr, 2015

1 commit


18 Jan, 2015

1 commit

  • Contrary to common expectations for an "int" return, these functions
    return only a positive value -- if used correctly they cannot even
    return 0 because the message header will necessarily be in the skb.

    This makes the very common pattern of

    if (genlmsg_end(...) < 0) { ... }

    be a whole bunch of dead code. Many places also simply do

    return nlmsg_end(...);

    and the caller is expected to deal with it.

    This also commonly (at least for me) causes errors, because it is very
    common to write

    if (my_function(...))
    /* error condition */

    and if my_function() does "return nlmsg_end()" this is of course wrong.

    Additionally, there's not a single place in the kernel that actually
    needs the message length returned, and if anyone needs it later then
    it'll be very easy to just use skb->len there.

    Remove this, and make the functions void. This removes a bunch of dead
    code as described above. The patch adds lines because I did

    - return nlmsg_end(...);
    + nlmsg_end(...);
    + return 0;

    I could have preserved all the function's return values by returning
    skb->len, but instead I've audited all the places calling the affected
    functions and found that none cared. A few places actually compared
    the return value with < 0 with no change in behaviour, so I opted for the more
    efficient version.
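
    The pitfall and the resulting convention can be mocked in userspace.
    Everything below is a hypothetical stand-in (demo_skb, the demo_*
    functions, the 16-byte header and 8-byte payload); it only models the
    return-value behaviour described above, not the real netlink helpers.

```c
#include <assert.h>
#include <stddef.h>

struct demo_skb { size_t len; };        /* stand-in for struct sk_buff */

/* Old style: finalise the message and return its (always positive) length. */
int demo_nlmsg_end_old(struct demo_skb *skb)
{
    assert(skb != NULL && skb->len > 0);
    return (int)skb->len;
}

/* New style: finalise the message, return nothing. */
void demo_nlmsg_end(struct demo_skb *skb)
{
    assert(skb != NULL && skb->len > 0);
}

/* Fill function written against the old API: leaks the length to callers. */
int demo_fill_old(struct demo_skb *skb)
{
    skb->len = 16 + 8;                  /* header plus payload */
    return demo_nlmsg_end_old(skb);
}

/* Fill function after the conversion: 0 means success. */
int demo_fill_new(struct demo_skb *skb)
{
    skb->len = 16 + 8;
    demo_nlmsg_end(skb);
    return 0;
}

/* A caller that follows the common "non-zero means error" convention. */
int demo_caller_sees_error(int rc)
{
    return rc != 0;
}
```

    A caller using the "non-zero means error" convention treats the result
    of demo_fill_old() as a failure even though the message was built
    correctly; with demo_fill_new() it does not, and the length is still
    available from the skb.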

    One instance of the error I've made numerous times now is also present
    in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
    check for
    Signed-off-by: David S. Miller

    Johannes Berg
     

03 Oct, 2013

1 commit

  • When filling the netlink message we fail to clear the pad field and
    therefore leak one byte of heap memory to userland. Fix this by
    setting pad to 0.
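
    A userspace sketch of the leak, under assumptions: demo_diag_msg is a
    hypothetical struct modelled on a response header with an explicit pad
    byte, and the 0xAA fill stands in for stale heap contents.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical response header with an explicit one-byte pad field. */
struct demo_diag_msg {
    uint8_t  family;
    uint8_t  type;
    uint8_t  state;
    uint8_t  pad;
    uint32_t ino;
    uint32_t cookie[2];
};

static_assert(sizeof(struct demo_diag_msg) == 16,
              "sketch assumes no implicit padding in the struct");

/* Leaky fill: every field except pad is written, so pad keeps stale data. */
void demo_fill_leaky(struct demo_diag_msg *m, uint32_t ino)
{
    m->family = 1;
    m->type = 1;
    m->state = 1;
    m->ino = ino;
    m->cookie[0] = 0;
    m->cookie[1] = 0;
}

/* Fixed fill: identical, but pad is explicitly set to 0. */
void demo_fill_fixed(struct demo_diag_msg *m, uint32_t ino)
{
    demo_fill_leaky(m, ino);
    m->pad = 0;
}
```

    If the buffer is pre-filled with 0xAA, the leaky variant leaves 0xAA in
    pad, while the fixed variant leaves 0.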

    Signed-off-by: Mathias Krause
    Signed-off-by: David S. Miller

    Mathias Krause
     

28 Feb, 2013

1 commit

  • I'm not sure why, but the hlist for-each-entry iterators were conceived
    differently from the list ones, which look like this:

    list_for_each_entry(pos, head, member)

    The hlist ones were greedy and wanted an extra parameter:

    hlist_for_each_entry(tpos, pos, head, member)

    Why did they need an extra pos parameter? I'm not quite sure. Not only
    do they not really need it, it also prevents the iterator from looking
    exactly like the list iterator, which is unfortunate.

    Besides the semantic patch, there was some manual work required:

    - Fix up the actual hlist iterators in linux/list.h
    - Fix up the declaration of other iterators based on the hlist ones.
    - A very small number of places were using the 'node' parameter; these
    were modified to use 'obj->member' instead.
    - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
    properly, so those had to be fixed up manually.
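
    The three-argument form can be sketched in userspace as follows. This is
    a simplified mock, not the kernel's linux/list.h: the demo_* names are
    hypothetical, the node has no back pointer, the macro evaluates its
    pointer argument twice, and __typeof__ is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>

struct demo_hlist_node { struct demo_hlist_node *next; };
struct demo_hlist_head { struct demo_hlist_node *first; };

/* Map a node pointer back to its containing object, or NULL for NULL. */
#define demo_entry_safe(ptr, type, member) \
    ((ptr) ? (type *)((char *)(ptr) - offsetof(type, member)) : NULL)

/* Iterator with the same shape as list_for_each_entry(pos, head, member). */
#define demo_hlist_for_each_entry(pos, head, member)                       \
    for (pos = demo_entry_safe((head)->first, __typeof__(*(pos)), member); \
         pos;                                                              \
         pos = demo_entry_safe((pos)->member.next, __typeof__(*(pos)), member))

/* Example object that embeds a list node. */
struct demo_item {
    int ino;
    struct demo_hlist_node node;
};

void demo_add_head(struct demo_hlist_head *h, struct demo_hlist_node *n)
{
    assert(h != NULL && n != NULL);
    n->next = h->first;
    h->first = n;
}

/* Walk the list with a single cursor; no separate node variable is needed. */
int demo_sum_inos(struct demo_hlist_head *head)
{
    struct demo_item *it;
    int sum = 0;

    demo_hlist_for_each_entry(it, head, node)
        sum += it->ino;
    return sum;
}
```

    Code that previously used the separate node cursor can reach the node
    through the entry instead, here it->node.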

    The semantic patch which is mostly the work of Peter Senna Tschudin is here:

    @@
    iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

    type T;
    expression a,c,d,e;
    identifier b;
    statement S;
    @@

    -T b;

    [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
    [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
    [akpm@linux-foundation.org: checkpatch fixes]
    [akpm@linux-foundation.org: fix warnings]
    [akpm@linux-foundation.org: redo intrusive kvm changes]
    Tested-by: Peter Senna Tschudin
    Acked-by: Paul E. McKenney
    Signed-off-by: Sasha Levin
    Cc: Wu Fengguang
    Cc: Marcelo Tosatti
    Cc: Gleb Natapov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
     

24 Oct, 2012

1 commit


11 Sep, 2012

1 commit

  • It is a frequent mistake to confuse the netlink port identifier with a
    process identifier. Try to reduce this confusion by renaming fields
    that hold port identifiers portid instead of pid.

    I have carefully avoided changing the structures exported to
    userspace to avoid changing the userspace API.

    I have successfully built an allyesconfig kernel with this change.

    Signed-off-by: "Eric W. Biederman"
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

17 Jul, 2012

1 commit

  • Before this patch sock_diag works for init_net only and dumps
    information about sockets from all namespaces.

    This patch extends sock_diag to all namespaces.
    It creates a netlink kernel socket for each netns and filters
    data during dumping.

    v2: filter according to netns in all places,
    remove an unused variable.

    Cc: "David S. Miller"
    Cc: Alexey Kuznetsov
    Cc: James Morris
    Cc: Hideaki YOSHIFUJI
    Cc: Patrick McHardy
    Cc: Pavel Emelyanov
    CC: Eric Dumazet
    Cc: linux-kernel@vger.kernel.org
    Cc: netdev@vger.kernel.org
    Signed-off-by: Andrew Vagin
    Acked-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Andrey Vagin
     

28 Jun, 2012

1 commit


27 Jun, 2012

1 commit


09 Jun, 2012

1 commit

  • /proc/net/unix has quadratic behavior, and can hold unix_table_lock for
    a while if a high number of unix sockets are alive. (90 ms for 200k
    sockets...)

    We already have a hash table, so it's quite easy to use it.

    The problem is that unbound sockets are still hashed in a single hash slot
    (unix_socket_table[UNIX_HASH_TABLE])

    This patch also spreads unbound sockets to 256 hash slots, to speed up
    both /proc/net/unix and unix_diag.
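
    One way to spread unbound sockets is to hash the socket's address into
    one of 256 extra slots. The sketch below is an assumption-laden
    illustration: the demo_* names, the fold-and-modulo hash, and the layout
    (256 slots for bound sockets followed by 256 for unbound ones) are
    hypothetical and not taken from the patch text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_HASH_SIZE 256

/*
 * Fold the higher bits of an address into the low byte and map it to a
 * slot in the second half of a table with 2 * DEMO_HASH_SIZE entries.
 */
unsigned int demo_slot_for_addr(uintptr_t addr)
{
    uintptr_t h = addr;

    h ^= h >> 16;
    h ^= h >> 8;
    return DEMO_HASH_SIZE + (unsigned int)(h % DEMO_HASH_SIZE);
}

/* Convenience wrapper: pick the slot for an unbound socket object. */
unsigned int demo_unbound_slot(const void *sk)
{
    assert(sk != NULL);
    return demo_slot_for_addr((uintptr_t)sk);
}
```

    With this hash, addresses 0x1000 and 0x1100 land in slots 272 and 273,
    so a walk over one slot no longer has to visit every unbound socket.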

    Time to read /proc/net/unix with 200k unix sockets :
    (time dd if=/proc/net/unix of=/dev/null bs=4k)

    before : 520 secs
    after : 2 secs

    Signed-off-by: Eric Dumazet
    Cc: Steven Whitehouse
    Cc: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Eric Dumazet
     

26 Apr, 2012

1 commit


22 Mar, 2012

1 commit

  • Pull vfs pile 1 from Al Viro:
    "This is _not_ all; in particular, Miklos' and Jan's stuff is not there
    yet."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (64 commits)
    ext4: initialization of ext4_li_mtx needs to be done earlier
    debugfs-related mode_t whack-a-mole
    hfsplus: add an ioctl to bless files
    hfsplus: change finder_info to u32
    hfsplus: initialise userflags
    qnx4: new helper - try_extent()
    qnx4: get rid of qnx4_bread/qnx4_getblk
    take removal of PF_FORKNOEXEC to flush_old_exec()
    trim includes in inode.c
    um: uml_dup_mmap() relies on ->mmap_sem being held, but activate_mm() doesn't hold it
    um: embed ->stub_pages[] into mmu_context
    gadgetfs: list_for_each_safe() misuse
    ocfs2: fix leaks on failure exits in module_init
    ecryptfs: make register_filesystem() the last potential failure exit
    ntfs: forgets to unregister sysctls on register_filesystem() failure
    logfs: missing cleanup on register_filesystem() failure
    jfs: mising cleanup on register_filesystem() failure
    make configfs_pin_fs() return root dentry on success
    configfs: configfs_create_dir() has parent dentry in dentry->d_parent
    configfs: sanitize configfs_create()
    ...

    Linus Torvalds
     

21 Mar, 2012

1 commit


27 Feb, 2012

1 commit


31 Dec, 2011

2 commits


27 Dec, 2011

2 commits


21 Dec, 2011

1 commit

  • Otherwise getting

    | net/unix/diag.c:312:16: error: expected declaration specifiers or ‘...’ before string constant
    | net/unix/diag.c:313:1: error: expected declaration specifiers or ‘...’ before string constant

    Signed-off-by: Cyrill Gorcunov
    Signed-off-by: David S. Miller

    Cyrill Gorcunov
     

17 Dec, 2011

8 commits

  • Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • When establishing a unix connection on stream sockets the
    server end receives an skb with socket in its receive queue.

    Report who is waiting for these ends to be accepted for
    listening sockets via NLA.

    There's a locking issue with this -- the unix sk state lock is
    required to access the peer, and it is taken under the listening
    sk's queue lock. Strictly speaking the queue lock should be taken
    inside the state lock, but since in this case these two sockets
    are different it shouldn't lead to deadlock.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Report the peer socket inode ID as NLA. With this it's finally
    possible to find out the other end of an interesting unix connection.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Actually, the socket path, if it's not anonymous, doesn't give
    a clue as to which file the socket is bound to. Even if the path
    is absolute, it can be unlinked and then a new socket can be
    bound to it.

    With this NLA it's possible to check which file a particular
    socket is really bound to.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Report the sun_path when requested as NLA. With leading '\0' if
    present but without the leading AF_UNIX bits.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • The socket inode is used as a key for lookup. This is effectively
    the only really unique ID of a unix socket, but using this for
    search currently has one problem -- it is O(number of sockets) :(

    Is it worth fixing this lookup or inventing some other ID for
    unix sockets?

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Walk the unix sockets table and fill the core response structure,
    which includes type, state and inode.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Includes basic module_init/_exit functionality, dump/get_exact stubs
    and declares the basic API structures for request and response.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov