04 Jun, 2008

1 commit

  • Make nlmsg_trim(), nlmsg_cancel(), genlmsg_cancel(), and
    nla_nest_cancel() void functions.

    Return -EMSGSIZE instead of -1 if the provided message buffer is not
    big enough.

    Signed-off-by: Thomas Graf
    Signed-off-by: David S. Miller

    Thomas Graf
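
The error-code convention described in this commit can be illustrated with a small userspace sketch. The `msg_buf` type and both helpers are hypothetical stand-ins, not the kernel's netlink API: an append that may run out of room reports `-EMSGSIZE`, while a trim that cannot fail is a `void` function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size message buffer; not a kernel type. */
struct msg_buf {
    unsigned char data[16];
    size_t len;
};

/*
 * Append a payload. Returns 0 on success, or -EMSGSIZE (rather than
 * a bare -1) when the payload does not fit in the remaining space.
 */
int msg_append(struct msg_buf *m, const void *payload, size_t n)
{
    assert(m != NULL && payload != NULL);
    if (n > sizeof(m->data) - m->len)
        return -EMSGSIZE;
    memcpy(m->data + m->len, payload, n);
    m->len += n;
    return 0;
}

/*
 * Trim the buffer back to a previously saved length. This cannot
 * fail, so it returns nothing, mirroring the void helpers above.
 */
void msg_trim(struct msg_buf *m, size_t mark)
{
    assert(m != NULL);
    if (mark < m->len)
        m->len = mark;
}
```

A caller would save `m.len` before starting a nested attribute, and call `msg_trim()` with that mark if a later `msg_append()` returns `-EMSGSIZE`.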
     

28 Apr, 2008

1 commit

  • Previously I added sessionid output to all audit messages where it was
    available but we still didn't know the sessionid of the sender of
    netlink messages. This patch adds that information to netlink messages
    so we can audit who sent netlink messages.

    Signed-off-by: Eric Paris
    Signed-off-by: Al Viro

    Eric Paris
     

19 Apr, 2008

2 commits

  • …s/security-testing-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
    security: fix up documentation for security_module_enable
    Security: Introduce security= boot parameter
    Audit: Final renamings and cleanup
    SELinux: use new audit hooks, remove redundant exports
    Audit: internally use the new LSM audit hooks
    LSM/Audit: Introduce generic Audit LSM hooks
    SELinux: remove redundant exports
    Netlink: Use generic LSM hook
    Audit: use new LSM hooks instead of SELinux exports
    SELinux: setup new inode/ipc getsecid hooks
    LSM: Introduce inode_getsecid and ipc_getsecid hooks

    Linus Torvalds
     
  • Don't use SELinux exported selinux_get_task_sid symbol.
    Use the generic LSM equivalent instead.

    Signed-off-by: Casey Schaufler
    Signed-off-by: Ahmed S. Darwish
    Acked-by: James Morris
    Acked-by: David S. Miller
    Reviewed-by: Paul Moore

    Ahmed S. Darwish
     

26 Mar, 2008

3 commits


22 Mar, 2008

1 commit

  • Make socket filters work for netlink unicast and notifications.
    This is useful for applications like Zebra that get overrun with
    messages that are then ignored.

    Note: netlink messages are in host byte order, but packet filter
    state machine operations are done in network byte order.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
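
The byte-order note in this commit has a practical consequence for filter authors, sketched below in userspace C. The two helpers are hypothetical and only imitate a filter's 16-bit load, which is assumed here to interpret the bytes as big-endian; they are not the kernel's filter code. A field stored in host order therefore has to be compared against `htons()` of the wanted value.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Imitate a packet-filter "load halfword": the two bytes at `off`
 * are interpreted as a big-endian (network byte order) value.
 */
uint16_t filter_load_half(const unsigned char *buf, size_t off)
{
    assert(buf != NULL);
    return (uint16_t)((buf[off] << 8) | buf[off + 1]);
}

/*
 * Return 1 if the 16-bit field stored at `off` in HOST byte order
 * equals `want`. Because the load above reads network order, the
 * comparison constant has to be htons(want).
 */
int match_host_u16(const unsigned char *buf, size_t off, uint16_t want)
{
    return filter_load_half(buf, off) == htons(want);
}
```

On a big-endian host `htons()` is a no-op and the comparison is direct; on a little-endian host both sides end up byte-swapped, so the match still succeeds exactly when the stored field equals `want`.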
     

01 Mar, 2008

2 commits


13 Feb, 2008

1 commit

  • genl_unregister_family() calls genl_unregister_mc_groups(),
    which takes and releases the genl_lock, and then takes and releases
    this lock again itself.

    Relax this behavior, all the more so because genl_unregister_mc_groups()
    is called from genl_unregister_family() only.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

02 Feb, 2008

1 commit


01 Feb, 2008

1 commit

  • The comment about "race free view of the set of network
    namespaces" was a bit hasty. Look (this can happen even with only
    one CPU, as discovered by Alexey Dobriyan and Denis Lunev):

    put_net()
      if (atomic_dec_and_test(&net->refcnt))
        /* true */
        __put_net(net);
          queue_work(...);

    /*
     * note: the net now has refcnt 0, but still in
     * the global list of net namespaces
     */

    == re-schedule ==

    register_pernet_subsys(&some_ops);
      register_pernet_operations(&some_ops);
        (*some_ops)->init(net);
          /*
           * we call netlink_kernel_create() here
           * in some places
           */
          netlink_kernel_create();
            sk_alloc();
              get_net(net); /* refcnt = 1 */
            /*
             * now we drop the net refcount not to
             * block the net namespace exit in the
             * future (or this can be done on the
             * error path)
             */
            put_net(sk->sk_net);
              if (atomic_dec_and_test(&...))
                /*
                 * true. BOOOM! The net is
                 * scheduled for release twice
                 */

    While thinking about this problem, I decided that getting and
    putting the net in the init callback is wrong. If some init
    callback needs to have a refcount-less reference on the struct
    net, _it_ has to be careful itself, rather than relying on
    the infrastructure to handle this correctly.

    In the case of netlink_kernel_create(), the problem is that
    sk_alloc() gets the given namespace, but passing the info
    that we don't want to get it inside this call is too heavy.

    Instead, I propose to create the socket inside the init_net
    namespace and then re-attach it to the desired one right
    after the socket is created.

    After doing this, we also have to be careful on error paths
    not to drop a reference on the namespace that we never took.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Denis Lunev
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

29 Jan, 2008

9 commits

  • Used to append data to a message without a header or padding.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • During the network namespace stop process, kernel-side netlink sockets
    belonging to a namespace should be closed. They should not prevent
    the namespace from stopping, so they do not increment the namespace
    usage counter. However, this counter will be put during the last sock_put.

    Replacing the correct netns with init_net solves the problem only
    partially, because until it is properly stopped the socket is still a
    valid netlink kernel socket and can be looked up by user processes. This
    is not a problem while it resides in its initial (own) namespace (no
    processes inside this net), but this is not true for init_net.

    So, hold a reference on the socket, remove it from the lookup tables,
    and only after that change the namespace and perform the last put.

    Signed-off-by: Denis V. Lunev
    Tested-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • Create a specific helper for netlink kernel socket disposal. This just
    makes the code look better and provides the groundwork for proper
    disposal inside a namespace.

    Signed-off-by: Denis V. Lunev
    Tested-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • The netlink protocol table is global for all namespaces. Some netlink
    protocols have been virtualized, i.e. they have a per-namespace netlink
    socket. This difference can easily lead to a double free if more than 1
    namespace is started. Count the number of kernel netlink sockets to
    track when this table is no longer used.

    Signed-off-by: Denis V. Lunev
    Tested-by: Alexey Dobriyan
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • net/netlink/af_netlink.c:
    netlink_realloc_groups | -46
    netlink_insert | -49
    netlink_autobind | -94
    netlink_clear_multicast_users | -48
    netlink_bind | -55
    netlink_setsockopt | -54
    netlink_release | -86
    netlink_kernel_create | -47
    netlink_change_ngroups | -56
    9 functions changed, 535 bytes removed, diff: -535

    net/netlink/af_netlink.c:
    netlink_table_ungrab | +53
    1 function changed, 53 bytes added, diff: +53

    net/netlink/af_netlink.o:
    10 functions changed, 53 bytes added, 535 bytes removed, diff: -482

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     
  • Add __acquires() and __releases() annotations to suppress some sparse
    warnings.

    Example warnings:

    net/ipv4/udp.c:1555:14: warning: context imbalance in 'udp_seq_start' - wrong count at exit
    net/ipv4/udp.c:1571:13: warning: context imbalance in 'udp_seq_stop' - unexpected unlock

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • nl_pid_hash_alloc() is renamed to nl_pid_hash_zalloc().
    It is now returning zeroed memory to its callers.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
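
The rename in this commit follows a common naming convention: a "zalloc" helper hands back memory that is already zeroed, so callers need no separate memset. A minimal userspace sketch of the idea, using a hypothetical `bucket_zalloc()` helper rather than the kernel function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical "zalloc"-style helper: allocate an array of bucket
 * pointers that is zero-filled on return, or NULL on failure.
 */
void **bucket_zalloc(size_t nbuckets)
{
    assert(nbuckets > 0);
    return calloc(nbuckets, sizeof(void *));
}
```

The caller frees the array with `free()` and can treat every bucket as empty immediately after allocation.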
     
  • Fix large number of checkpatch errors.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • Signed-off-by: Denis V. Lunev
    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Denis V. Lunev
     

13 Nov, 2007

1 commit


07 Nov, 2007

1 commit

  • Commit ed6dcf4a in the history.git tree broke netlink_unicast timeouts
    by moving the schedule_timeout() call to a new function that doesn't
    propagate the remaining timeout back to the caller. This means on each
    retry we start with the full timeout again.

    ipc/mqueue.c seems to actually want to wait indefinitely so this
    behaviour is retained.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
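
The timeout bug described in this commit can be sketched in userspace C. All names here are hypothetical, and `stub_wait()` only pretends to sleep; the point is the line `remaining = wait(remaining)`, which carries the unused part of the budget into the next retry instead of starting again from the full timeout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wait primitive: sleeps up to `timeout` ticks, returns ticks left. */
typedef long (*wait_fn)(long timeout);

/*
 * Retry `try_send` until it succeeds or the time budget is used up.
 * Returns 0 on success, -1 once the budget is exhausted.
 */
int send_with_timeout(int (*try_send)(void), wait_fn wait, long timeout)
{
    long remaining = timeout;

    assert(try_send != NULL && wait != NULL);
    for (;;) {
        if (try_send() == 0)
            return 0;
        if (remaining <= 0)
            return -1;
        /* Propagate what is left; do not reset to `timeout`. */
        remaining = wait(remaining);
    }
}

static long total_waited;   /* ticks consumed by the stub below */

/* Stub wait: pretends to sleep at most 3 ticks per call. */
long stub_wait(long timeout)
{
    long slept = timeout < 3 ? timeout : 3;

    total_waited += slept;
    return timeout - slept;
}

/* Stub sender that never succeeds. */
int always_busy(void)
{
    return -1;
}
```

With a sender that never succeeds and a budget of 10 ticks, the stubs wait 10 ticks in total and then give up. Had the loop passed `timeout` to `wait()` on every retry, it would never terminate.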
     

01 Nov, 2007

1 commit

  • Finally, the zero_it argument can be completely removed from
    the callers and from the function prototype.

    Besides, fix the checkpatch.pl warnings about using the
    assignments inside if-s.

    This patch is rather big, and it is a part of the previous one.
    I split it hoping to make the patches more readable. Hope
    this particular split helped.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

27 Oct, 2007

1 commit

  • It is not safe to place struct pernet_operations in a special section.
    We need struct pernet_operations to last until we call unregister_pernet_subsys,
    which doesn't happen until module unload.

    So marking struct pernet_operations is a disaster for modules in two ways.
    - We discard it before we call the exit method it points to.
    - Because I keep struct pernet_operations on a linked list, discarding
    it for compiled-in code removes elements in the middle of a linked
    list and does horrible things to list insertion.

    So this looks safe assuming __exit_refok is not discarded
    for modules.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

24 Oct, 2007

1 commit

  • Revert to original netlink behavior. Do not reply with ACK if the
    netlink dump has been successfully started.

    libnl has been broken by commit cd40b7d3983c708aabe3d3008ec64ffce56d33b0.
    The following command reproduces the problem:
    /nl-route-get 192.168.1.1

    Signed-off-by: Denis V. Lunev
    Acked-by: Thomas Graf
    Signed-off-by: David S. Miller

    Denis V. Lunev
     

16 Oct, 2007

1 commit


11 Oct, 2007

12 commits

  • This patch makes processing of netlink user -> kernel messages synchronous.
    This change was inspired by a talk with Alexey Kuznetsov about current
    netlink message processing. He says that he was badly wrong when he
    introduced asynchronous user -> kernel communication.

    The call netlink_unicast is the only path to send message to the kernel
    netlink socket. But, unfortunately, it is also used to send data to the
    user.

    Before this change, the user message was attached to the socket queue
    and sk->sk_data_ready was called. The process was blocked until all
    pending messages were processed. The bad thing is that this processing
    may occur in an arbitrary process context.

    This patch changes nlk->data_ready callback to get 1 skb and force packet
    processing right in the netlink_unicast.

    Kernel -> user path in netlink_unicast remains untouched.

    EINTR processing in netlink_run_queue was changed. It forces an rtnl_lock
    drop, but the process remains in the loop until the message is fully
    processed. So there is no need to use this kludge now.

    Signed-off-by: Denis V. Lunev
    Acked-by: Alexey Kuznetsov
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • There are currently two ways to determine whether the netlink socket is a
    kernel one or a user one. This patch creates a single inline call for
    this purpose and unifies all the calls in the af_netlink.c

    No similar calls are found outside af_netlink.c.

    Signed-off-by: Denis V. Lunev
    Acked-by: Alexey Kuznetsov
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • netlink_sendskb does not use third argument. Clean it and save a couple of
    bytes.

    Signed-off-by: Denis V. Lunev
    Acked-by: Alexey Kuznetsov
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • The code in netfilter/nfnetlink.c and in ./net/netlink/genetlink.c looks
    like outdated copy/paste from rtnetlink.c. Push them into sync with the
    original.

    Changes from v1:
    - deleted comment in nfnetlink_rcv_msg by request of Patrick McHardy

    Signed-off-by: Denis V. Lunev
    Acked-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Denis V. Lunev
     
  • This concerns the ipv4 and ipv6 code mostly, but also the netlink
    and unix sockets.

    The netlink code is an example of how to use the __seq_open_private()
    call - it saves the net namespace on this private.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • With the net namespaces, much code has left the __init section,
    thus making the kernel occupy more memory than it did before.
    Since we have a config option that prohibits namespace
    creation, the functions that initialize/finalize some netns
    stuff are simply not needed and can be freed after boot.

    Currently, this is almost not noticeable, since few calls
    are no longer in __init, but when the namespaces are
    merged it will be possible to free more code. I propose to
    use the __net_init, __net_exit and __net_initdata "attributes"
    for functions/variables that are not used if CONFIG_NET_NS
    is not set, to save more space in memory.

    The exiting functions cannot just reside in the __exit section,
    as noticed by David, since the init section will have
    references on it and the compilation will fail due to modpost
    checks. These references can exist, since the init namespace
    never dies and the exit callbacks are never called. So I
    introduce the __exit_refok attribute just like it is already
    done with the __init_refok.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • with the macro max provided by , so changed its name
    to a more proper one: limit

    Signed-off-by: Denis Cheng
    Signed-off-by: David S. Miller

    Denis Cheng
     
  • Signed-off-by: Denis Cheng
    Signed-off-by: David S. Miller

    Denis Cheng
     
  • I was looking at Patrick's fix to inet_diag and it occurred
    to me that we're using a pointer argument to return values
    unnecessarily in netlink_run_queue. Changing it to return
    the value will allow the compiler to generate better code
    since the value won't have to be memory-backed.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
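
The refactoring described in this commit, returning a result directly instead of through a pointer argument, can be shown with a generic before/after pair. The function names and the queue representation are hypothetical; this is not the signature of netlink_run_queue.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the result is handed back through a pointer, so it must live in memory. */
void count_pending_ptr(const int *queue, int n, int *out)
{
    assert(queue != NULL && out != NULL && n >= 0);
    *out = 0;
    for (int i = 0; i < n; i++)
        if (queue[i] != 0)
            (*out)++;
}

/* After: the result is returned by value, so the compiler may keep it in a register. */
int count_pending(const int *queue, int n)
{
    int count = 0;

    assert(queue != NULL && n >= 0);
    for (int i = 0; i < n; i++)
        if (queue[i] != 0)
            count++;
    return count;
}
```

Both versions compute the same count; the second simply gives the compiler more freedom, which is the benefit the commit message describes.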
     
  • The problem: proc_net files remember which network namespace they are
    against but do not hold a reference count (as that would pin
    the network namespace). So we currently have a small window where
    the reference count on a network namespace may be incremented when opening
    a /proc file when it has already gone to zero.

    To fix this introduce maybe_get_net and get_proc_net.

    maybe_get_net increments the network namespace reference count only if it is
    greater than zero, ensuring we don't increment a reference count after it
    has gone to zero.

    get_proc_net handles all of the magic to go from a proc inode to the network
    namespace instance and call maybe_get_net on it.

    PROC_NET the old accessor is removed so that we don't get confused and use
    the wrong helper function.

    Then I fix up the callers to use get_proc_net and handle the case
    where get_proc_net returns NULL. In that case I return -ENXIO because
    effectively the network namespace has already gone away so the files
    we are trying to access don't exist anymore.

    Signed-off-by: Eric W. Biederman
    Acked-by: Paul E. McKenney
    Signed-off-by: David S. Miller

    Eric W. Biederman
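
The "increment only if greater than zero" rule described in this commit can be sketched with C11 atomics. The `struct ns` type and `maybe_get()` function are hypothetical userspace stand-ins and do not reproduce the kernel's implementation of maybe_get_net.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for a network namespace. */
struct ns {
    atomic_int refcnt;
};

/*
 * Take a reference only if the count is still above zero.
 * Returns the object on success, NULL if it is already going away.
 */
struct ns *maybe_get(struct ns *ns)
{
    int old;

    assert(ns != NULL);
    old = atomic_load(&ns->refcnt);
    while (old > 0) {
        /* On failure `old` is refreshed with the current value; retry. */
        if (atomic_compare_exchange_weak(&ns->refcnt, &old, old + 1))
            return ns;
    }
    return NULL;
}
```

A caller that receives NULL treats the object as already gone, which matches the commit's choice of returning an error when the namespace can no longer be pinned.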
     
  • This change allows the generic attribute interface to be used within
    the netfilter subsystem where this flag was initially introduced.

    The byte-order flag is as yet unused; its intended use is to
    allow automatic byte order conversions for all atomic types.

    Signed-off-by: Thomas Graf
    Signed-off-by: David S. Miller

    Thomas Graf
     
  • Each netlink socket will live in exactly one network namespace,
    this includes the controlling kernel sockets.

    This patch updates all of the existing netlink protocols
    to only support the initial network namespace. Requests
    by clients in other namespaces will get -ECONNREFUSED,
    as they would if the kernel did not have the support for
    that netlink protocol compiled in.

    As each netlink protocol is updated to be multiple network
    namespace safe it can register multiple kernel sockets
    to acquire a presence in the rest of the network namespaces.

    The implementation in af_netlink is a simple filter implementation
    at hash table insertion and hash table look up time.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
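
The "simple filter at hash table insertion and lookup time" described in this commit can be sketched as follows. The table layout, the `entry` type, and both functions are hypothetical and much simpler than af_netlink's real tables; they only show the namespace being part of the match on both paths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8

/* Hypothetical socket entry: identified by pid, owned by one namespace. */
struct entry {
    const void *net;      /* opaque namespace identity */
    uint32_t pid;
    struct entry *next;
};

static struct entry *table[TABLE_SIZE];

/* Insert fails (-1) only if the same pid already exists in the SAME namespace. */
int table_insert(struct entry *e)
{
    struct entry **head;

    assert(e != NULL);
    head = &table[e->pid % TABLE_SIZE];
    for (struct entry *it = *head; it; it = it->next)
        if (it->pid == e->pid && it->net == e->net)
            return -1;
    e->next = *head;
    *head = e;
    return 0;
}

/* Lookup skips entries that belong to a different namespace. */
struct entry *table_lookup(const void *net, uint32_t pid)
{
    for (struct entry *it = table[pid % TABLE_SIZE]; it; it = it->next)
        if (it->pid == pid && it->net == net)
            return it;
    return NULL;
}
```

Two sockets with the same pid can coexist as long as they live in different namespaces, and a lookup from one namespace never returns the other's socket.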