08 Apr, 2013

1 commit

  • For stream sockets the code fails to set the msg_namelen member
    to 0 and therefore makes net/socket.c leak the local, uninitialized
    sockaddr_storage variable to userland -- 128 bytes of kernel stack
    memory. The msg_namelen update is also missing for datagram sockets
    in case the socket is shutting down during receive.

    Fix both issues by setting msg_namelen to 0 early. It will be
    updated later if we're going to fill the msg_name member.

    Cc: Arnaldo Carvalho de Melo
    Signed-off-by: Mathias Krause
    Signed-off-by: David S. Miller

    Mathias Krause
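
The "set msg_namelen to 0 early" pattern from the commit above can be sketched in plain C as follows. This is a minimal userland model, not the kernel code: the `demo_*` types and the `have_addr` flag are hypothetical stand-ins for the real message header and for the path that decides whether an address is filled in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct demo_addr {
    unsigned short family;
    unsigned char mac[6];
};

struct demo_msghdr {
    void *msg_name;     /* caller-supplied address buffer */
    int msg_namelen;    /* bytes of msg_name that are valid on return */
};

/*
 * Model of a receive path. have_addr is nonzero only on the path that
 * actually fills msg_name (for example a datagram with a known source).
 */
static int demo_recvmsg(struct demo_msghdr *msg, const struct demo_addr *src,
                        int have_addr)
{
    /* Zero the length first, so every early return reports "no address". */
    msg->msg_namelen = 0;

    if (!have_addr)
        return 0;   /* stream socket or shutdown path: nothing is copied */

    if (msg->msg_name != NULL) {
        memcpy(msg->msg_name, src, sizeof(*src));
        msg->msg_namelen = (int)sizeof(*src);   /* set only after filling */
    }
    return 0;
}
```

Because the length is cleared before any branch, a caller that copies `msg_namelen` bytes back to the user never copies bytes the receive path did not write.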
     

19 Nov, 2012

1 commit

  • Allow an unprivileged user who has created a user namespace, and then
    created a network namespace to effectively use the new network
    namespace, by reducing capable(CAP_NET_ADMIN) and
    capable(CAP_NET_RAW) calls to be ns_capable(net->user_ns,
    CAP_NET_ADMIN), or ns_capable(net->user_ns, CAP_NET_RAW) calls.

    Allow creation of af_key sockets.
    Allow creation of llc sockets.
    Allow creation of af_packet sockets.

    Allow sending xfrm netlink control messages.

    Allow binding to netlink multicast groups.
    Allow sending to netlink multicast groups.
    Allow adding and dropping netlink multicast groups.
    Allow sending to all netlink multicast groups and port ids.

    Allow reading the netfilter SO_IP_SET socket option.
    Allow sending netfilter netlink messages.
    Allow setting and getting ip_vs netfilter socket options.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

16 Aug, 2012

1 commit

  • The LLC code wrongly returns 0, i.e. "success", when the socket is
    zapped. Together with the uninitialized uaddrlen pointer argument from
    sys_getsockname this leads to an arbitrary memory leak of up to 128
    bytes of kernel stack via the getsockname() syscall.

    Return an error instead when the socket is zapped to prevent the info
    leak. Also remove the unnecessary memset(0). We don't directly write to
    the memory pointed to by uaddr but memcpy() a local structure at the end
    of the function that is properly initialized.

    Signed-off-by: Mathias Krause
    Cc: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Mathias Krause
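
The getsockname() fix above can be illustrated with a small userland model. It is a sketch under stated assumptions: the `demo_*` types are hypothetical, and the specific error code (-EBADF) is chosen for illustration only. The point is that the zapped path returns an error before the caller's length is touched, and that the address is built in a fully initialized local before being copied out.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-ins; not the kernel's struct sock or sockaddr_llc. */
struct demo_sock {
    int zapped;             /* nonzero: socket is not bound or usable */
    unsigned char sap;
};

struct demo_sockaddr {
    unsigned short family;
    unsigned char sap;
    unsigned char pad[5];
};

static int demo_getname(const struct demo_sock *sk,
                        struct demo_sockaddr *uaddr, int *uaddrlen)
{
    struct demo_sockaddr addr;

    /* The local copy is fully initialized, so no stale bytes can escape. */
    memset(&addr, 0, sizeof(addr));

    if (sk->zapped)
        return -EBADF;      /* error out; *uaddrlen is left untouched */

    addr.family = 26;       /* arbitrary value for the model */
    addr.sap = sk->sap;

    memcpy(uaddr, &addr, sizeof(addr));
    *uaddrlen = (int)sizeof(addr);
    return 0;
}
```

A caller that sees a negative return value knows that neither the buffer nor the length is valid and must not copy anything back to userland.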
     

15 Aug, 2012

1 commit


11 Jul, 2012

1 commit


17 May, 2012

1 commit


16 May, 2012

2 commits

  • We are going to delete the Token ring support. This removes any
    special processing in the core networking for token ring, (aside
    from net/tr.c itself), leaving the drivers and remaining tokenring
    support present but inert.

    The mass removal of the drivers and net/tr.c will be in a separate
    commit, so that the history of these files that we still care
    about won't have the giant deletion tied into their history.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     
  • Standardize the net core ratelimited logging functions.

    Coalesce formats, align arguments.
    Change a printk then vprintk sequence to use printf extension %pV.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     

16 Apr, 2012

1 commit


25 Jan, 2012

1 commit


20 Dec, 2011

1 commit


09 Dec, 2010

1 commit

  • On Sunday, 05 December 2010 at 09:19 +0100, Eric Dumazet wrote:

    > Hmm..
    >
    > If somebody can explain why RTNL is held in arp_ioctl() (and therefore
    > in arp_req_delete()), we might first remove RTNL use in arp_ioctl() so
    > that your patch can be applied.
    >
    > Right now it is not good, because RTNL won't necessarily be held when
    > you are going to call arp_invalidate() ?

    While doing this analysis, I found a refcount bug in llc, I'll send a
    patch for net-2.6

    Meanwhile, here is the patch for net-next-2.6

    Your patch then can be applied after mine.

    Thanks

    [PATCH] net: RCU conversion of dev_getbyhwaddr() and arp_ioctl()

    dev_getbyhwaddr() was called under RTNL.

    Rename it to dev_getbyhwaddr_rcu() and change all its callers to now use
    RCU locking instead of RTNL.

    Change arp_ioctl() to use RCU instead of RTNL locking.

    Note: this fixes a dev refcount bug in llc

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

14 Sep, 2010

1 commit


21 Apr, 2010

1 commit

  • Define a new function to return the waitqueue of a "struct sock".

    static inline wait_queue_head_t *sk_sleep(struct sock *sk)
    {
            return sk->sk_sleep;
    }

    Change all read occurrences of sk_sleep by a call to this function.

    Needed for a future RCU conversion. sk_sleep won't be a field directly
    available.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
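
The value of the accessor described above is that callers stop depending on how the wait queue is stored. The following is a minimal userland sketch of that idea; all `demo_*` names are hypothetical and the wait queue type is reduced to a counter for the sake of the model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for wait_queue_head_t. */
typedef struct {
    int nr_waiters;
} demo_wait_queue_head_t;

struct demo_sock {
    /* Storage detail hidden behind the accessor; free to change later. */
    demo_wait_queue_head_t *wq;
};

/* Single place that knows where the wait queue lives. */
static inline demo_wait_queue_head_t *demo_sk_sleep(struct demo_sock *sk)
{
    return sk->wq;
}

/* A caller written against the accessor instead of the field. */
static int demo_waiters(struct demo_sock *sk)
{
    return demo_sk_sleep(sk)->nr_waiters;
}
```

If the field is later moved or wrapped, only `demo_sk_sleep()` needs to change; `demo_waiters()` and every other caller stay as they are.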
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

27 Dec, 2009

4 commits

  • The SAP ref counter gets decremented twice when deleting a socket,
    although for all but the first socket of a SAP the SAP ref counter was
    incremented only once.

    Signed-off-by: Octavian Purdila
    Signed-off-by: David S. Miller

    Octavian Purdila
     
  • For the reclamation phase we use the SLAB_DESTROY_BY_RCU mechanism,
    which requires some extra checks in the lookup code:

    a) If the current socket was released, reallocated & inserted in
    another list it will short circuit the iteration for the current list,
    thus we need to restart the lookup.

    b) If the current socket was released, reallocated & inserted in the
    same list we just need to recheck it matches the look-up criteria and
    if not we can skip to the next element.

    In this case there is no need to restart the lookup, since sockets are
    inserted at the start of the list and the worst that will happen is
    that we will iterate through some of the list elements more than
    once.

    Note that the /proc and multicast delivery were not yet converted to
    RCU; they still use spinlocks for protection.

    Signed-off-by: Octavian Purdila
    Signed-off-by: David S. Miller

    Octavian Purdila
     
  • Using bind(MAC address) with LLC sockets has O(n) complexity, where n
    is the number of interfaces. To overcome this, we add support for
    SO_BINDTODEVICE which drops the complexity to O(1).

    Signed-off-by: Octavian Purdila
    Signed-off-by: David S. Miller

    Octavian Purdila
     
  • Signed-off-by: Octavian Purdila
    Signed-off-by: David S. Miller

    Octavian Purdila
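
The two lookup checks (a) and (b) described for the SLAB_DESTROY_BY_RCU conversion in this group of commits can be sketched as below. This is a single-threaded userland model under stated assumptions: there is no real RCU or slab reuse here, the `demo_*` names are hypothetical, and the plain `refcnt` increment stands in for an atomic "increment unless zero" operation.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BUCKETS 4u

struct demo_sock {
    struct demo_sock *next;
    unsigned int key;       /* look-up criteria */
    unsigned int bucket;    /* list the object currently lives on */
    int refcnt;             /* 0 means the object has been released */
};

static struct demo_sock *demo_table[DEMO_BUCKETS];

static void demo_insert(struct demo_sock *sk)
{
    sk->bucket = sk->key % DEMO_BUCKETS;
    sk->refcnt = 1;
    sk->next = demo_table[sk->bucket];  /* sockets go at the list head */
    demo_table[sk->bucket] = sk;
}

static struct demo_sock *demo_lookup(unsigned int key)
{
    unsigned int b = key % DEMO_BUCKETS;
    struct demo_sock *sk;

again:
    for (sk = demo_table[b]; sk != NULL; sk = sk->next) {
        if (sk->bucket != b)
            goto again;     /* case (a): object moved to another list */
        if (sk->key != key)
            continue;
        if (sk->refcnt == 0)
            continue;       /* already released; cannot be pinned */
        sk->refcnt++;       /* stand-in for an atomic inc-not-zero */
        if (sk->key != key) {
            sk->refcnt--;   /* case (b): reused in place; skip it */
            continue;
        }
        return sk;
    }
    return NULL;
}
```

In this model the recheck after taking the reference can never fail, since nothing runs concurrently; it is kept only to show where the check sits relative to pinning the object.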
     

26 Nov, 2009

1 commit

  • Generated with the following semantic patch

    @@
    struct net *n1;
    struct net *n2;
    @@
    - n1 == n2
    + net_eq(n1, n2)

    @@
    struct net *n1;
    struct net *n2;
    @@
    - n1 != n2
    + !net_eq(n1, n2)

    applied over {include,net,drivers/net}.

    Signed-off-by: Octavian Purdila
    Signed-off-by: David S. Miller

    Octavian Purdila
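
The effect of the semantic patch above can be seen in a small model. The sketch below assumes a helper that compares namespace pointers when namespace support is enabled and always reports equality when it is not; `demo_net`, `demo_net_eq()` and `DEMO_NET_NS` are hypothetical names standing in for the kernel's type, helper and config option.

```c
#include <assert.h>

/* Hypothetical stand-in for struct net. */
struct demo_net {
    int id;
};

/* Set to 0 to model a build without network namespace support. */
#define DEMO_NET_NS 1

#if DEMO_NET_NS
static inline int demo_net_eq(const struct demo_net *n1,
                              const struct demo_net *n2)
{
    return n1 == n2;    /* distinct namespaces are distinct objects */
}
#else
static inline int demo_net_eq(const struct demo_net *n1,
                              const struct demo_net *n2)
{
    (void)n1;
    (void)n2;
    return 1;           /* only one namespace exists; always equal */
}
#endif

/* Before: dev_net != sk_net.  After: !demo_net_eq(dev_net, sk_net). */
static int demo_should_skip(const struct demo_net *dev_net,
                            const struct demo_net *sk_net)
{
    return !demo_net_eq(dev_net, sk_net);
}
```

Routing every comparison through one helper means the comparison can collapse to a constant in builds without namespaces, which a raw pointer comparison cannot do.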
     

06 Nov, 2009

1 commit

  • The generic __sock_create function has a kern argument which allows the
    security system to make decisions based on if a socket is being created by
    the kernel or by userspace. This patch passes that flag to the
    net_proto_family specific create function, so it can do the same thing.

    Signed-off-by: Eric Paris
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Eric Paris
     

07 Oct, 2009

1 commit


01 Oct, 2009

1 commit

  • This provides safety against negative optlen at the type
    level instead of depending upon (sometimes non-trivial)
    checks against this sprinkled all over the place, in
    each and every implementation.

    Based upon work done by Arjan van de Ven and feedback
    from Linus Torvalds.

    Signed-off-by: David S. Miller

    David S. Miller
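
Why an unsigned optlen helps can be shown with two versions of the same upper-bound check. This is a sketch, not code from the patch: `DEMO_MAX_OPTLEN` and both function names are hypothetical.

```c
#include <assert.h>

#define DEMO_MAX_OPTLEN 16  /* hypothetical maximum option size */

/* Signed length: a negative value slips past an upper-bound-only check. */
static int demo_len_ok_signed(int optlen)
{
    return optlen <= DEMO_MAX_OPTLEN;
}

/*
 * Unsigned length: the same bit pattern is a very large number, so the
 * same upper-bound check rejects it without a separate "< 0" test.
 */
static int demo_len_ok_unsigned(unsigned int optlen)
{
    return optlen <= DEMO_MAX_OPTLEN;
}
```

With the signed variant every implementation has to remember an extra negative-length test; with the unsigned variant the type itself rules that case out.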
     

24 Aug, 2009

1 commit


18 May, 2009

1 commit


23 Feb, 2009

1 commit


22 Nov, 2008

1 commit


17 Jun, 2008

1 commit


03 Apr, 2008

1 commit


28 Mar, 2008

1 commit

  • LLC currently allows users to inject raw frames, including IP packets
    encapsulated in SNAP. While Linux doesn't handle IP over SNAP, other
    systems do. Restrict LLC sockets to root similar to packet sockets.

    [ Modified Patrick's patch to use CAP_NET_RAW --DaveM ]

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

06 Mar, 2008

1 commit


20 Oct, 2007

1 commit

  • The task_struct->pid member is going to be deprecated, so start
    using the helpers (task_pid_nr/task_pid_vnr/task_pid_nr_ns) in
    the kernel.

    The first thing to start with is the pid, printed to dmesg - in
    this case we may safely use task_pid_nr(). Besides, printks account for
    more (much more) than half of all the explicit pid usage.

    [akpm@linux-foundation.org: git-drm went and changed lots of stuff]
    Signed-off-by: Pavel Emelyanov
    Cc: Dave Airlie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     

11 Oct, 2007

2 commits

  • This patch makes most of the generic device layer network
    namespace safe. This patch makes dev_base_head a
    network namespace variable, and then it picks up
    a few associated variables. The functions:
    dev_getbyhwaddr
    dev_getfirsthwbytype
    dev_get_by_flags
    dev_get_by_name
    __dev_get_by_name
    dev_get_by_index
    __dev_get_by_index
    dev_ioctl
    dev_ethtool
    dev_load
    wireless_process_ioctl

    were modified to take a network namespace argument, and
    deal with it.

    vlan_ioctl_set and brioctl_set were modified so their
    hooks will receive a network namespace argument.

    So basically anything in the core of the network stack that was
    affected by the change of dev_base was modified to handle
    multiple network namespaces. The rest of the network stack was
    simply modified to explicitly use &init_net, the initial network
    namespace. This can be fixed when those components of the network
    stack are modified to handle multiple network namespaces.

    For now the ifindex generator is left global.

    Fundamentally ifindex numbers are per namespace, or else
    we will have corner case problems with migration when
    we get that far.

    At the same time there are assumptions in the network stack
    that the ifindex of a network device won't change. Making
    the ifindex number global seems a good compromise until
    the network stack can cope with ifindex changes when
    you change namespaces, and the like.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This patch passes in the namespace a new socket should be created in
    and has the socket code do the appropriate reference counting. By
    virtue of this all socket create methods are touched. In addition
    the socket create methods are modified so that they will fail if
    you attempt to create a socket in a non-default network namespace.

    Failing if we attempt to create a socket outside of the default
    network namespace ensures that as we incrementally make the network stack
    network namespace aware we will not export functionality that someone
    has not audited and made certain is network namespace safe.
    This allows us to partially enable network namespaces before all of
    the exotic protocols are supported.

    Any protocol layers I have missed will fail to compile because I now
    pass an extra parameter into the socket creation code.

    [ Integrated AF_IUCV build fixes from Andrew Morton... -DaveM ]

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

09 May, 2007

1 commit


11 Feb, 2007

1 commit


03 Dec, 2006

1 commit


05 Aug, 2006

1 commit

  • The datagram interface of LLC is broken in a couple of ways.
    These were discovered when trying to use it to build an out-of-kernel
    version of STP.

    First it didn't pass the source address of the received packet
    in recvfrom(). It needs to copy the source address of received LLC packets
    into the socket control block. At the same time fix a security issue
    because there was uninitialized data leakage. Every recvfrom call
    was just copying out old data.

    Second, LLC should not merge multiple packets in one receive call
    on datagram sockets. LLC should preserve packet boundaries on
    SOCK_DGRAM.

    This fix goes against the old historical comments about UNIX98 semantics
    but without this fix SOCK_DGRAM is broken and useless. So either ANK's
    interpretation was incorrect or the UNIX98 standard was wrong.

    Signed-off-by: Stephen Hemminger
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

01 Jul, 2006

1 commit


18 Jun, 2006

1 commit

  • LLC receive is broken for SOCK_DGRAM.
    If an application does recv() on a datagram socket and there
    is no data present, don't return "not connected". Instead, just
    do normal datagram semantics.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger