26 Mar, 2008

1 commit


01 Mar, 2008

1 commit


29 Feb, 2008

1 commit

  • As Davem mentioned in his recently patch
    (d9595a7b9c777d45a74774f1428c263a0a47f4c0)
    that the procfs visibility should occur after
    the ->proc_fops are setup.

    And also, Alexey provide proc_create() to make
    sure that ->proc_fops is setup before gluing PDE
    to main tree.

    We use proc_create().

    Signed-off-by: Wang Chen
    Signed-off-by: David S. Miller

    Wang Chen
     

29 Jan, 2008

3 commits

  • CHECK net/appletalk/aarp.c
    net/appletalk/aarp.c:951:14: warning: context imbalance in 'aarp_seq_start' - wrong count at exit
    net/appletalk/aarp.c:977:13: warning: context imbalance in 'aarp_seq_stop' - unexpected unlock
    CHECK net/appletalk/atalk_proc.c
    net/appletalk/atalk_proc.c:34:11: warning: context imbalance in 'atalk_seq_interface_start' - wrong count at exit
    net/appletalk/atalk_proc.c:54:13: warning: context imbalance in 'atalk_seq_interface_stop' - unexpected unlock
    net/appletalk/atalk_proc.c:93:11: warning: context imbalance in 'atalk_seq_route_start' - wrong count at exit
    net/appletalk/atalk_proc.c:113:13: warning: context imbalance in 'atalk_seq_route_stop' - unexpected unlock
    net/appletalk/atalk_proc.c:161:11: warning: context imbalance in 'atalk_seq_socket_start' - wrong count at exit
    net/appletalk/atalk_proc.c:178:13: warning: context imbalance in 'atalk_seq_socket_stop' - unexpected unlock

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • This patch includes many places, that only required
    replacing the ctl_table-s with appropriate ctl_paths
    and call register_sysctl_paths().

    Nothing special was done with them.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     
  • Many-many code in the kernel initialized the timer->function
    and timer->data together with calling init_timer(timer). There
    is already a helper for this. Use it for networking code.

    The patch is HUGE, but makes the code 130 lines shorter
    (98 insertions(+), 228 deletions(-)).

    Signed-off-by: Pavel Emelyanov
    Acked-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

01 Nov, 2007

1 commit

  • Finally, the zero_it argument can be completely removed from
    the callers and from the function prototype.

    Besides, fix the checkpatch.pl warnings about using the
    assignments inside if-s.

    This patch is rather big, and it is a part of the previous one.
    I splitted it wishing to make the patches more readable. Hope
    this particular split helped.

    Signed-off-by: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Pavel Emelyanov
     

11 Oct, 2007

9 commits

  • Fix a bunch of sparse warnings. Mostly about 0 used as
    NULL pointer, and shadowed variable declarations.
    One notable case was that hash size should have been unsigned.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • Since hardware header operations are part of the protocol class
    not the device instance, make them into a separate object and
    save memory.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • This is nicer than the MAC_FMT stuff.

    Signed-off-by: Joe Perches
    Signed-off-by: David S. Miller

    Joe Perches
     
  • This patch makes most of the generic device layer network
    namespace safe. This patch makes dev_base_head a
    network namespace variable, and then it picks up
    a few associated variables. The functions:
    dev_getbyhwaddr
    dev_getfirsthwbytype
    dev_get_by_flags
    dev_get_by_name
    __dev_get_by_name
    dev_get_by_index
    __dev_get_by_index
    dev_ioctl
    dev_ethtool
    dev_load
    wireless_process_ioctl

    were modified to take a network namespace argument, and
    deal with it.

    vlan_ioctl_set and brioctl_set were modified so their
    hooks will receive a network namespace argument.

    So basically anthing in the core of the network stack that was
    affected to by the change of dev_base was modified to handle
    multiple network namespaces. The rest of the network stack was
    simply modified to explicitly use &init_net the initial network
    namespace. This can be fixed when those components of the network
    stack are modified to handle multiple network namespaces.

    For now the ifindex generator is left global.

    Fundametally ifindex numbers are per namespace, or else
    we will have corner case problems with migration when
    we get that far.

    At the same time there are assumptions in the network stack
    that the ifindex of a network device won't change. Making
    the ifindex number global seems a good compromise until
    the network stack can cope with ifindex changes when
    you change namespaces, and the like.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Every user of the network device notifiers is either a protocol
    stack or a pseudo device. If a protocol stack that does not have
    support for multiple network namespaces receives an event for a
    device that is not in the initial network namespace it quite possibly
    can get confused and do the wrong thing.

    To avoid problems until all of the protocol stacks are converted
    this patch modifies all netdev event handlers to ignore events on
    devices that are not in the initial network namespace.

    As the rest of the code is made network namespace aware these
    checks can be removed.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This patch modifies every packet receive function
    registered with dev_add_pack() to drop packets if they
    are not from the initial network namespace.

    This should ensure that the various network stacks do
    not receive packets in a anything but the initial network
    namespace until the code has been converted and is ready
    for them.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This patch passes in the namespace a new socket should be created in
    and has the socket code do the appropriate reference counting. By
    virtue of this all socket create methods are touched. In addition
    the socket create methods are modified so that they will fail if
    you attempt to create a socket in a non-default network namespace.

    Failing if we attempt to create a socket outside of the default
    network namespace ensures that as we incrementally make the network stack
    network namespace aware we will not export functionality that someone
    has not audited and made certain is network namespace safe.
    Allowing us to partially enable network namespaces before all of the
    exotic protocols are supported.

    Any protocol layers I have missed will fail to compile because I now
    pass an extra parameter into the socket creation code.

    [ Integrated AF_IUCV build fixes from Andrew Morton... -DaveM ]

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This patch makes /proc/net per network namespace. It modifies the global
    variables proc_net and proc_net_stat to be per network namespace.
    The proc_net file helpers are modified to take a network namespace argument,
    and all of their callers are fixed to pass &init_net for that argument.
    This ensures that all of the /proc/net files are only visible and
    usable in the initial network namespace until the code behind them
    has been updated to be handle multiple network namespaces.

    Making /proc/net per namespace is necessary as at least some files
    in /proc/net depend upon the set of network devices which is per
    network namespace, and even more files in /proc/net have contents
    that are relevant to a single network namespace.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This slightly improves code safety and clarity.

    Later network namespace patches touch this code so this is a
    preliminary cleanup.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     

11 Jul, 2007

1 commit


09 May, 2007

1 commit


28 Apr, 2007

1 commit

  • This reverts eefa3906283a2b60a6d02a2cda593a7d7d7946c5

    The simplification made in that change works with the assumption that
    the 'offset' parameter to these functions is always positive or zero,
    which is not true. It can be and often is negative in order to access
    SKB header values in front of skb->data.

    Signed-off-by: David S. Miller

    David S. Miller
     

26 Apr, 2007

7 commits

  • I noticed recently that, in skb_checksum(), "offset" and "start" are
    essentially the same thing and have the same value throughout the
    function, despite being computed differently. Using a single variable
    allows some cleanups and makes the skb_checksum() function smaller,
    more readable, and presumably marginally faster.

    We appear to have many other "sk_buff walker" functions built on the
    exact same model, so the cleanup applies to them, too. Here is a list
    of the functions I found to be affected:

    net/appletalk/ddp.c:atalk_sum_skb()
    net/core/datagram.c:skb_copy_datagram_iovec()
    net/core/datagram.c:skb_copy_and_csum_datagram()
    net/core/skbuff.c:skb_copy_bits()
    net/core/skbuff.c:skb_store_bits()
    net/core/skbuff.c:skb_checksum()
    net/core/skbuff.c:skb_copy_and_csum_bit()
    net/core/user_dma.c:dma_skb_copy_datagram_iovec()
    net/xfrm/xfrm_algo.c:skb_icv_walk()
    net/xfrm/xfrm_algo.c:skb_to_sgvec()

    OTOH, I admit I'm a bit surprised, the cleanup is rather obvious so I'm
    really wondering if I am missing something. Can anyone please comment
    on this?

    Signed-off-by: Jean Delvare
    Signed-off-by: David S. Miller

    Jean Delvare
     
  • Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and
    skb->mac to skb->mac_header, to match the names of the associated helpers
    (skb[_[re]set]_{transport,network,mac}_header).

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For the common, open coded 'skb->h.raw = skb->data' operation, so that we can
    later turn skb->h.raw into a offset, reducing the size of struct sk_buff in
    64bit land while possibly keeping it as a pointer on 32bit.

    This one touches just the most simple cases:

    skb->h.raw = skb->data;
    skb->h.raw = {skb_push|[__]skb_pull}()

    The next ones will handle the slightly more "complex" cases.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • But only in the cases where its a newly allocated skb, i.e. one where skb->tail
    is equal to skb->data, or just after skb_reserve, where this requirement is
    maintained.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For the common, open coded 'skb->nh.raw = skb->data' operation, so that we can
    later turn skb->nh.raw into a offset, reducing the size of struct sk_buff in
    64bit land while possibly keeping it as a pointer on 32bit.

    This one touches just the most simple case, next will handle the slightly more
    "complex" cases.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • For the places where we need a pointer to the mac header, it is still legal to
    touch skb->mac.raw directly if just adding to, subtracting from or setting it
    to another layer header.

    This one also converts some more cases to skb_reset_mac_header() that my
    regex missed as it had no spaces before nor after '=', ugh.

    Signed-off-by: Arnaldo Carvalho de Melo
    Signed-off-by: David S. Miller

    Arnaldo Carvalho de Melo
     
  • Now network timestamps use ktime_t infrastructure, we can add a new
    ioctl() SIOCGSTAMPNS command to get timestamps in 'struct timespec'.
    User programs can thus access to nanosecond resolution.

    Signed-off-by: Eric Dumazet
    CC: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric Dumazet
     

05 Apr, 2007

1 commit

  • When we receive an AppleTalk frame shorter than what its header says,
    we still attempt to verify its checksum, and trip on the BUG_ON() at
    the end of function atalk_sum_skb() because of the length mismatch.

    This has security implications because this can be triggered by simply
    sending a specially crafted ethernet frame to a target victim,
    effectively crashing that host. Thus this qualifies, I think, as a
    remote DoS. Here is the frame I used to trigger the crash, in npg
    format:

    {
    # Ethernet header -----

    XX XX XX XX XX XX # Destination MAC
    00 00 00 00 00 00 # Source MAC
    00 1D # Length

    # LLC header -----

    AA AA 03
    08 00 07 80 9B # Appletalk

    # Appletalk header -----

    00 1B # Packet length (invalid)
    00 01 # Fake checksum
    00 00 00 00 # Destination and source networks
    00 00 00 00 # Destination and source nodes and ports

    # Payload -----

    0C 0D 0E 0F 10 11 12 13
    14
    }

    The destination MAC address must be set to those of the victim.

    The severity is mitigated by two requirements:
    * The target host must have the appletalk kernel module loaded. I
    suspect this isn't so frequent.
    * AppleTalk frames are non-IP, thus I guess they can only travel on
    local networks. I am no network expert though, maybe it is possible
    to somehow encapsulate AppleTalk packets over IP.

    The bug has been reported back in June 2004:
    http://bugzilla.kernel.org/show_bug.cgi?id=2979
    But it wasn't investigated, and was closed in July 2006 as both
    reporters had vanished meanwhile.

    This code was new in kernel 2.6.0-test5:
    http://git.kernel.org/?p=linux/kernel/git/tglx/history.git;a=commitdiff;h=7ab442d7e0a76402c12553ee256f756097cae2d2
    And not modified since then, so we can assume that vanilla kernels
    2.6.0-test5 and later, and distribution kernels based thereon, are
    affected.

    Note that I still do not know for sure what triggered the bug in the
    real-world cases. The frame could have been corrupted by the kernel if
    we have a bug hiding somewhere. But more likely, we are receiving the
    faulty frame from the network.

    Signed-off-by: Jean Delvare
    Signed-off-by: David S. Miller

    Jean Delvare
     

15 Feb, 2007

2 commits

  • The semantic effect of insert_at_head is that it would allow new registered
    sysctl entries to override existing sysctl entries of the same name. Which is
    pain for caching and the proc interface never implemented.

    I have done an audit and discovered that none of the current users of
    register_sysctl care as (excpet for directories) they do not register
    duplicate sysctl entries.

    So this patch simply removes the support for overriding existing entries in
    the sys_sysctl interface since no one uses it or cares and it makes future
    enhancments harder.

    Signed-off-by: Eric W. Biederman
    Acked-by: Ralf Baechle
    Acked-by: Martin Schwidefsky
    Cc: Russell King
    Cc: David Howells
    Cc: "Luck, Tony"
    Cc: Ralf Baechle
    Cc: Paul Mackerras
    Cc: Martin Schwidefsky
    Cc: Andi Kleen
    Cc: Jens Axboe
    Cc: Corey Minyard
    Cc: Neil Brown
    Cc: "John W. Linville"
    Cc: James Bottomley
    Cc: Jan Kara
    Cc: Trond Myklebust
    Cc: Mark Fasheh
    Cc: David Chinner
    Cc: "David S. Miller"
    Cc: Patrick McHardy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     
  • Signed-off-by: Eric W. Biederman
    Cc: Arnaldo Carvalho de Melo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

13 Feb, 2007

1 commit

  • Many struct file_operations in the kernel can be "const". Marking them const
    moves these to the .rodata section, which avoids false sharing with potential
    dirty data. In addition it'll catch accidental writes at compile time to
    these shared resources.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     

11 Feb, 2007

1 commit


04 Dec, 2006

1 commit


31 Oct, 2006

1 commit


29 Sep, 2006

1 commit


22 Jul, 2006

1 commit


01 Jul, 2006

1 commit


29 Mar, 2006

1 commit

  • Fix kernel oopses whenever somebody issues compatible ioctl on AppleTalk,
    Econet, IPX or IRDA socket. For AppleTalk/Econet/IRDA it restores state
    in which these sockets were before compat_ioctl was introduced to the socket
    ops, for IPX it implements support for 4 ioctls which were not implemented
    before - as these ioctls use structures which match between 32bit and 64bit
    userspace, no special code is needed, just call 64bit ioctl handler.

    Signed-off-by: Petr Vandrovec
    Signed-off-by: David S. Miller

    Petr Vandrovec
     

12 Jan, 2006

1 commit


04 Jan, 2006

2 commits

  • Currently all network protocols need to call dev_ioctl as the default
    fallback in their ioctl implementations. This patch adds a fallback
    to dev_ioctl to sock_ioctl if the protocol returned -ENOIOCTLCMD.
    This way all the procotol ioctl handlers can be simplified and we don't
    need to export dev_ioctl.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: David S. Miller

    Christoph Hellwig
     
  • I noticed that some of 'struct proto_ops' used in the kernel may share
    a cache line used by locks or other heavily modified data. (default
    linker alignement is 32 bytes, and L1_CACHE_LINE is 64 or 128 at
    least)

    This patch makes sure a 'struct proto_ops' can be declared as const,
    so that all cpus can share all parts of it without false sharing.

    This is not mandatory : a driver can still use a read/write structure
    if it needs to (and eventually a __read_mostly)

    I made a global stubstitute to change all existing occurences to make
    them const.

    This should reduce the possibility of false sharing on SMP, and
    speedup some socket system calls.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet