16 Oct, 2007

4 commits


13 Oct, 2007

1 commit


11 Oct, 2007

8 commits

  • The netfilter sysctls in the bridging code don't set strategy routines:

    sysctl table check failed: /net/bridge/bridge-nf-call-arptables .3.10.1 Missing strategy
    sysctl table check failed: /net/bridge/bridge-nf-call-iptables .3.10.2 Missing strategy
    sysctl table check failed: /net/bridge/bridge-nf-call-ip6tables .3.10.3 Missing strategy
    sysctl table check failed: /net/bridge/bridge-nf-filter-vlan-tagged .3.10.4 Missing strategy
    sysctl table check failed: /net/bridge/bridge-nf-filter-pppoe-tagged .3.10.5 Missing strategy

    These binary sysctls can't work. The binary sysctl numbers of
    other netfilter sysctls with this problem are being removed. These
    need to go as well.

    Signed-off-by: Joseph Fannin
    Acked-by: "Eric W. Biederman"
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Joseph Fannin
     
  • For the operations
    get-tx-csum
    get-sg
    get-tso
    get-ufo
    the default ethtool_op_xxx behavior is fine for all drivers, so we
    permit op==NULL to imply the default behavior.

    This provides a more uniform behavior across all drivers, eliminating
    ethtool(8) "ioctl not supported" errors on older drivers that had
    not been updated for the latest sub-ioctls.

    The ethtool_op_xxx() functions are left exported, in case anyone
    wishes to call them directly from a driver-private implementation --
    a not-uncommon case. Should an ethtool_op_xxx() helper remain unused
    for a while, except by net/core/ethtool.c, we can un-export it at a
    later date.
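
A minimal userspace sketch of the "op==NULL implies the default" dispatch described above. All `mock_` names and the flag value are hypothetical stand-ins for illustration; this is not the code in net/core/ethtool.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_F_SG 0x1u /* stand-in for a scatter-gather feature bit */

/* Mock device: only the feature word this sketch needs. */
struct mock_dev {
    uint32_t features;
};

/* Mock ops table; a NULL entry means "use the default helper". */
struct mock_ethtool_ops {
    uint32_t (*get_sg)(struct mock_dev *dev);
};

/* Default helper, modelled loosely on the ethtool_op_xxx() family. */
uint32_t mock_op_get_sg(struct mock_dev *dev)
{
    return (dev->features & MOCK_F_SG) != 0;
}

/* Dispatch: fall back to the default when the driver left the op NULL. */
uint32_t mock_get_sg(struct mock_dev *dev, const struct mock_ethtool_ops *ops)
{
    assert(dev != NULL && ops != NULL);
    uint32_t (*op)(struct mock_dev *) =
        ops->get_sg ? ops->get_sg : mock_op_get_sg;
    return op(dev);
}
```

With this shape, a driver that never filled in `get_sg` still answers the query instead of failing with "not supported".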

    [ Resolved conflicts with set/get value ethtool patch... -DaveM ]

    Signed-off-by: Jeff Garzik
    Signed-off-by: David S. Miller

    Jeff Garzik
     
  • It's been a useless no-op for long enough in 2.6 so I figured it's time to
    remove it. The number of people that could object because they're
    maintaining unified 2.4 and 2.6 drivers is probably rather small.

    [ Handled drivers added by netdev tree and some missed IRDA cases... -DaveM ]

    Signed-off-by: Ralf Baechle
    Signed-off-by: Jeff Garzik
    Signed-off-by: David S. Miller

    Ralf Baechle
     
  • This patch makes most of the generic device layer network
    namespace safe. This patch makes dev_base_head a
    network namespace variable, and then it picks up
    a few associated variables. The functions:
    dev_getbyhwaddr
    dev_getfirsthwbytype
    dev_get_by_flags
    dev_get_by_name
    __dev_get_by_name
    dev_get_by_index
    __dev_get_by_index
    dev_ioctl
    dev_ethtool
    dev_load
    wireless_process_ioctl

    were modified to take a network namespace argument, and
    deal with it.
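
As a rough illustration of what "take a network namespace argument" means for a lookup such as dev_get_by_name, here is a userspace sketch. The `mock_` types are assumptions made for this example and omit locking and reference counting.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mock_net_device {
    const char *name;
    struct mock_net_device *next;
};

/* Mock namespace: each one owns its own device list. */
struct mock_net {
    struct mock_net_device *dev_base_head;
};

/* Search only the device list of the namespace that was passed in. */
struct mock_net_device *mock_dev_get_by_name(struct mock_net *net,
                                             const char *name)
{
    assert(net != NULL && name != NULL);
    for (struct mock_net_device *d = net->dev_base_head; d; d = d->next)
        if (strcmp(d->name, name) == 0)
            return d;
    return NULL;
}
```

The same name can therefore resolve to different devices, or to nothing, depending on which namespace the caller supplies.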

    vlan_ioctl_set and brioctl_set were modified so their
    hooks will receive a network namespace argument.

    So basically anything in the core of the network stack that was
    affected by the change of dev_base was modified to handle
    multiple network namespaces. The rest of the network stack was
    simply modified to explicitly use &init_net, the initial network
    namespace. This can be fixed when those components of the network
    stack are modified to handle multiple network namespaces.

    For now the ifindex generator is left global.

    Fundamentally ifindex numbers are per namespace, or else
    we will have corner case problems with migration when
    we get that far.

    At the same time there are assumptions in the network stack
    that the ifindex of a network device won't change. Making
    the ifindex number global seems a good compromise until
    the network stack can cope with ifindex changes when
    you change namespaces, and the like.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Each netlink socket will live in exactly one network namespace;
    this includes the controlling kernel sockets.

    This patch updates all of the existing netlink protocols
    to only support the initial network namespace. Requests
    from clients in other namespaces will get -ECONNREFUSED,
    as they would if the kernel did not have support for
    that netlink protocol compiled in.

    As each netlink protocol is updated to be multiple network
    namespace safe it can register multiple kernel sockets
    to acquire a presence in the rest of the network namespaces.

    The implementation in af_netlink is a simple filter implementation
    at hash table insertion and hash table look up time.
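
One way to picture a namespace filter applied at hash table insertion and lookup time is the sketch below. It is a simplified userspace model with hypothetical `mock_` names, not the af_netlink implementation: a socket only matches when both the port id and the namespace agree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_BUCKETS 16

struct mock_net { int id; };

struct mock_sock {
    struct mock_net *net;   /* namespace the socket lives in */
    uint32_t pid;           /* port id used as the hash key */
    struct mock_sock *next;
};

struct mock_table {
    struct mock_sock *bucket[MOCK_BUCKETS];
};

/* Lookup: a hit requires the same pid AND the same namespace. */
struct mock_sock *mock_lookup(struct mock_table *t, struct mock_net *net,
                              uint32_t pid)
{
    assert(t != NULL);
    for (struct mock_sock *s = t->bucket[pid % MOCK_BUCKETS]; s; s = s->next)
        if (s->pid == pid && s->net == net)
            return s;
    return NULL;
}

/* Insert: returns 0 on success, -1 if (net, pid) is already bound. */
int mock_insert(struct mock_table *t, struct mock_sock *s)
{
    assert(t != NULL && s != NULL);
    if (mock_lookup(t, s->net, s->pid))
        return -1;
    struct mock_sock **head = &t->bucket[s->pid % MOCK_BUCKETS];
    s->next = *head;
    *head = s;
    return 0;
}
```

Two sockets with the same port id can coexist as long as they belong to different namespaces, and neither can see the other through a lookup.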

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Every user of the network device notifiers is either a protocol
    stack or a pseudo device. If a protocol stack that does not have
    support for multiple network namespaces receives an event for a
    device that is not in the initial network namespace it quite possibly
    can get confused and do the wrong thing.

    To avoid problems until all of the protocol stacks are converted
    this patch modifies all netdev event handlers to ignore events on
    devices that are not in the initial network namespace.

    As the rest of the code is made network namespace aware these
    checks can be removed.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • This patch modifies every packet receive function
    registered with dev_add_pack() to drop packets if they
    are not from the initial network namespace.

    This should ensure that the various network stacks do
    not receive packets in anything but the initial network
    namespace until the code has been converted and is ready
    for them.
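
The guard described above can be sketched in userspace as follows. The types, the `nd_net` field name, and the return codes are mock assumptions for illustration only; the real receive functions operate on struct sk_buff and struct net_device.

```c
#include <assert.h>
#include <stddef.h>

struct mock_net { int unused; };
struct mock_net mock_init_net;          /* stands in for the initial namespace */

struct mock_net_device { struct mock_net *nd_net; };
struct mock_skb { int freed; };

enum { MOCK_RX_SUCCESS = 0, MOCK_RX_DROP = 1 };

/* Receive handler: drop anything that arrived outside the initial namespace. */
int mock_proto_rcv(struct mock_skb *skb, struct mock_net_device *dev)
{
    assert(skb != NULL && dev != NULL);
    if (dev->nd_net != &mock_init_net) {
        skb->freed = 1;                 /* stands in for freeing the packet */
        return MOCK_RX_DROP;
    }
    /* ... normal protocol processing would go here ... */
    return MOCK_RX_SUCCESS;
}
```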

    Signed-off-by: Eric W. Biederman
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • Hopefully captured all single statement cases under net/. I'm
    not too sure if there is some policy about #includes that are
    "guaranteed" (ie., in the current tree) to be available through
    some other #included header, so I just added linux/kernel.h to
    each changed file that didn't #include it previously.

    Signed-off-by: Ilpo Järvinen
    Signed-off-by: David S. Miller

    Ilpo Järvinen
     

17 Sep, 2007

2 commits

  • This patch adds an optimised version of skb_cow that avoids the copy if
    the header can be modified even if the rest of the payload is cloned.

    This can be used in encapsulating paths where we only need to modify the
    header. As it is, this can be used in PPPOE and bridging.
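
To make the distinction concrete, the sketch below contrasts a full copy-on-write check with a header-only one. The `mock_skb` fields are assumptions invented for this example; the kernel tracks header and payload sharing differently.

```c
#include <assert.h>
#include <stddef.h>

/* Mock packet buffer; the fields are illustrative only. */
struct mock_skb {
    int cloned;          /* data buffer is shared with another skb */
    int header_users;    /* holders that may still touch the header, us included */
    unsigned headroom;   /* bytes available in front of the data */
};

/* Full check: any sharing at all forces a private copy. */
int mock_needs_copy_full(const struct mock_skb *skb, unsigned headroom)
{
    assert(skb != NULL);
    return skb->headroom < headroom || skb->cloned;
}

/* Header-only check: copy only if someone else may still use the header. */
int mock_needs_copy_head(const struct mock_skb *skb, unsigned headroom)
{
    assert(skb != NULL);
    return skb->headroom < headroom ||
           (skb->cloned && skb->header_users != 1);
}
```

An encapsulating path that only prepends or rewrites a header could use the second check and skip the copy when the payload is shared but the header is not.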

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • The clone argument is only used by one caller and that caller can clone
    the packet itself. This patch moves the clone call into the caller and
    kills the clone argument.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

11 Sep, 2007

1 commit

  • So I've had a deadlock reported to me. I've found that the sequence of
    events goes like this:

    1) process A (modprobe) runs to remove ip_tables.ko

    2) process B (iptables-restore) runs and calls setsockopt on a netfilter socket,
    increasing the ip_tables socket_ops use count

    3) process A acquires a file lock on the file ip_tables.ko, calls remove_module
    in the kernel, which in turn executes the ip_tables module cleanup routine,
    which calls nf_unregister_sockopt

    4) nf_unregister_sockopt, seeing that the use count is non-zero, puts the
    calling process into uninterruptible sleep, expecting the process using the
    socket option code to wake it up when it exits the kernel

    5) the user of the socket option code (process B), in do_ipt_get_ctl, calls
    ipt_find_table_lock, which in this case calls request_module to load
    ip_tables_nat.ko

    6) request_module forks a copy of modprobe (process C) to load the module and
    blocks until modprobe exits.

    7) Process C, forked by request_module, processes the dependencies of
    ip_tables_nat.ko, of which ip_tables.ko is one.

    8) Process C attempts to lock the requested module and all its dependencies; it
    blocks when it attempts to lock ip_tables.ko (which was previously locked in
    step 3)

    There's not really any great permanent solution to this that I can see, but I've
    developed a two-part solution that corrects the problem.

    Part 1) Modifies the nf_sockopt registration code so that, instead of using a
    use counter internal to the nf_sockopt_ops structure, we use a pointer
    to the registering module's owner to do module reference counting when nf_sockopt
    calls a module's set/get routine. This prevents the deadlock by preventing step 4
    from happening.

    Part 2) Enhances the modprobe utility so that by default it performs non-blocking
    remove operations (the same way rmmod does), and adds an option to explicitly
    request blocking operation. So if you select blocking operation in modprobe you
    can still cause the above deadlock, but only if you explicitly try (and since
    root can do any old stupid thing it would like.... :) ).
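
The idea in Part 1 can be sketched in userspace roughly as below: the caller takes a reference on the owning module before invoking its handler, and the attempt fails outright if the module is already on its way out, so nobody has to sleep waiting for a use count to drain. Every `mock_` name, and the -1 error value, is a hypothetical stand-in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_module {
    int refcnt;
    bool going;   /* set once removal of the module has begun */
};

/* Take a reference unless the module is being removed. */
bool mock_try_module_get(struct mock_module *m)
{
    if (m->going)
        return false;
    m->refcnt++;
    return true;
}

void mock_module_put(struct mock_module *m)
{
    m->refcnt--;
}

struct mock_sockopt_ops {
    struct mock_module *owner;
    int (*set)(int optval);
};

/* Example handler used only to exercise the sketch. */
int mock_echo_set(int optval)
{
    return optval;
}

/* Pin the owner around the call instead of bumping a private use counter. */
int mock_nf_setsockopt(struct mock_sockopt_ops *ops, int optval)
{
    assert(ops != NULL && ops->owner != NULL && ops->set != NULL);
    if (!mock_try_module_get(ops->owner))
        return -1;   /* placeholder error: handler no longer available */
    int ret = ops->set(optval);
    mock_module_put(ops->owner);
    return ret;
}
```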

    Signed-off-by: Neil Horman
    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Neil Horman
     

31 Aug, 2007

2 commits

  • Bridge code calls ethtool to get speed. The conversion to using
    only ethtool_ops broke the case of devices without ethtool_ops.
    This is a new regression in 2.6.23.

    Rearranged the switch to a logical order, and use gcc initializer.

    Ps: speed should have been part of the network device structure from
    the start rather than burying it in ethtool.

    Signed-off-by: Stephen Hemminger
    Acked-by: Matthew Wilcox
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • This patch fixes some packet leakage in bridge. The bridging code was
    allowing forward table entries to be generated even if a device was
    being blocked. The fix is to not add forwarding database entries
    unless the port is active.
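
A minimal sketch of such a gate, assuming "active" means the port is in the learning or forwarding state. The enum, the structs, and the entry counter are mock constructs for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum mock_port_state {
    MOCK_DISABLED,
    MOCK_LISTENING,
    MOCK_LEARNING,
    MOCK_FORWARDING,
    MOCK_BLOCKING
};

struct mock_port { enum mock_port_state state; };
struct mock_fdb  { int entries; };

/* Learn a source address only when the receiving port is active. */
void mock_fdb_update(struct mock_fdb *fdb, const struct mock_port *source)
{
    assert(fdb != NULL && source != NULL);
    if (!(source->state == MOCK_LEARNING ||
          source->state == MOCK_FORWARDING))
        return;          /* blocked or disabled port: do not learn */
    fdb->entries++;      /* stands in for inserting or refreshing an entry */
}
```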

    The bug arose as part of the conversion to processing STP frames
    through normal receive path (in 2.6.17).

    Signed-off-by: Stephen Hemminger
    Acked-by: John W. Linville
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

27 Aug, 2007

1 commit

  • I tried to preserve bridging code as it was before, but logic is quite
    strange - I think we should free skb on error, since it is already
    unshared and thus will just leak.

    Herbert Xu states:

    > + if ((skb = skb_share_check(skb, GFP_ATOMIC)) == NULL)
    > + goto out;

    If this happens it'll be a double-free on skb since we'll
    return NF_DROP which makes the caller free it too.

    We could return NF_STOLEN to prevent that but I'm not sure
    whether that's correct netfilter semantics. Patrick, could
    you please make a call on this?

    Patrick McHardy states:

    NF_STOLEN should work fine here.
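
The double-free concern in this exchange can be modelled in userspace as below. The sketch assumes the share check releases the packet itself when it fails, and that the caller frees the packet on a DROP verdict but leaves it alone on STOLEN; all names are mock stand-ins.

```c
#include <assert.h>
#include <stddef.h>

enum { MOCK_NF_DROP, MOCK_NF_ACCEPT, MOCK_NF_STOLEN };

struct mock_skb { int free_count; };

/* Stand-in for a failing share check: releases the skb and returns NULL. */
struct mock_skb *mock_share_check_fail(struct mock_skb *skb)
{
    skb->free_count++;
    return NULL;
}

/* Hook: the skb is already gone on failure, so report it as stolen. */
int mock_hook(struct mock_skb *skb)
{
    assert(skb != NULL);
    if (mock_share_check_fail(skb) == NULL)
        return MOCK_NF_STOLEN;
    return MOCK_NF_ACCEPT;
}

/* Caller: frees the skb only when the verdict is DROP. */
void mock_caller(struct mock_skb *skb)
{
    if (mock_hook(skb) == MOCK_NF_DROP)
        skb->free_count++;
}
```

Had the hook returned the DROP verdict instead, `free_count` would reach 2, which is the double free being discussed.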

    Signed-off-by: Evgeniy Polyakov
    Signed-off-by: David S. Miller

    Evgeniy Polyakov
     

20 Aug, 2007

1 commit


15 Aug, 2007

2 commits


14 Aug, 2007

1 commit

  • http://bugzilla.kernel.org/show_bug.cgi?id=8797 shows that the
    bonding driver may produce bogus combinations of the checksum
    flags and SG/TSO.

    For example, if you bond devices with NETIF_F_HW_CSUM and
    NETIF_F_IP_CSUM you'll end up with a bonding device that
    has neither flag set. If both have TSO then this produces
    an illegal combination.

    The bridge device on the other hand has the correct code to
    deal with this.

    In fact, the same code can be used for both. So this patch
    moves that logic into net/core/dev.c and uses it for both
    bonding and bridging.

    In the process I've made small adjustments such as only
    setting GSO_ROBUST if at least one constituent device
    supports it.
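
A deliberately simplified model of the problem and of one way to merge the flags of two devices. The flag values and the merge rule are assumptions for this sketch and do not reproduce the function added to net/core/dev.c.

```c
#include <assert.h>
#include <stdint.h>

/* Mock feature bits; the values are illustrative. */
#define MOCK_F_SG      0x001u
#define MOCK_F_IP_CSUM 0x002u  /* IPv4-only checksum offload */
#define MOCK_F_HW_CSUM 0x008u  /* protocol-agnostic checksum offload */
#define MOCK_F_TSO     0x800u

/* Merge the features of two slave devices into one master feature set. */
uint32_t mock_combine(uint32_t a, uint32_t b)
{
    const uint32_t csum = MOCK_F_HW_CSUM | MOCK_F_IP_CSUM;
    uint32_t out = (a & b) & ~csum;        /* intersect everything else */

    if ((a & MOCK_F_HW_CSUM) && (b & MOCK_F_HW_CSUM))
        out |= MOCK_F_HW_CSUM;             /* both are protocol-agnostic */
    else if ((a & csum) && (b & csum))
        out |= MOCK_F_IP_CSUM;             /* fall back to the weaker flag */
    else
        out &= ~(MOCK_F_SG | MOCK_F_TSO);  /* no checksum offload: no SG/TSO */

    assert((out & csum) != csum);          /* never advertise both flags */
    return out;
}
```

A plain bitwise AND of one HW_CSUM device and one IP_CSUM device would keep SG and TSO while dropping both checksum flags, which is the bogus combination the report describes.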

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

01 Aug, 2007

1 commit

  • During the transition to the ethtool_ops way of doing things, we supported
    calling the device's ->do_ioctl method to allow unconverted drivers to
    continue working. Those days are long behind us, all in-tree drivers
    use the ethtool_ops way, and so we no longer need to support this.

    The bonding driver is the biggest beneficiary of this; it no longer
    needs to call ioctl() as a fallback if ethtool_ops aren't supported.

    Also put a proper copyright statement on ethtool.c.

    Signed-off-by: Matthew Wilcox
    Signed-off-by: David S. Miller

    Matthew Wilcox
     

27 Jul, 2007

1 commit


25 Jul, 2007

2 commits


20 Jul, 2007

1 commit

  • Slab destructors were no longer supported after Christoph's
    c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
    BUGs for both slab and slub, and slob never supported them
    either.

    This rips out support for the dtor pointer from kmem_cache_create()
    completely and fixes up every single callsite in the kernel (there were
    about 224, not including the slab allocator definitions themselves,
    or the documentation references).

    Signed-off-by: Paul Mundt

    Paul Mundt
     

18 Jul, 2007

1 commit

  • Rather than using a tri-state integer for the wait flag in
    call_usermodehelper_exec, define a proper enum, and use that. I've
    preserved the integer values so that any callers I've missed should
    still work OK.
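
A sketch of what replacing a tri-state integer with an enum while keeping the old values can look like. The names are mock stand-ins, and the values -1, 0 and 1 are an assumption about the legacy integers rather than something stated in the message.

```c
#include <assert.h>

/* Mock enum; values chosen to match the assumed legacy integers. */
enum mock_umh_wait {
    MOCK_UMH_NO_WAIT   = -1,  /* do not wait at all */
    MOCK_UMH_WAIT_EXEC = 0,   /* wait until the helper has been exec'd */
    MOCK_UMH_WAIT_PROC = 1    /* wait until the helper process exits */
};

/* True when the caller asked to wait for the helper to finish. */
int mock_waits_for_exit(enum mock_umh_wait wait)
{
    assert(wait >= MOCK_UMH_NO_WAIT && wait <= MOCK_UMH_WAIT_PROC);
    return wait == MOCK_UMH_WAIT_PROC;
}
```

Because the numeric values are unchanged, an unconverted caller that still passes a bare 1 or 0 behaves as before.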

    Signed-off-by: Jeremy Fitzhardinge
    Cc: James Bottomley
    Cc: Randy Dunlap
    Cc: Christoph Hellwig
    Cc: Andi Kleen
    Cc: Paul Mackerras
    Cc: Johannes Berg
    Cc: Ralf Baechle
    Cc: Bjorn Helgaas
    Cc: Joel Becker
    Cc: Tony Luck
    Cc: Kay Sievers
    Cc: Srivatsa Vaddagiri
    Cc: Oleg Nesterov
    Cc: David Howells

    Jeremy Fitzhardinge
     

15 Jul, 2007

1 commit


13 Jul, 2007

1 commit

  • * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (61 commits)
    sysfs: add parameter "struct bin_attribute *" in .read/.write methods for sysfs binary attributes
    sysfs: make directory dentries and inodes reclaimable
    sysfs: implement sysfs_get_dentry()
    sysfs: move sysfs_drop_dentry() to dir.c and make it static
    sysfs: restructure add/remove paths and fix inode update
    sysfs: use sysfs_mutex to protect the sysfs_dirent tree
    sysfs: consolidate sysfs spinlocks
    sysfs: make kobj point to sysfs_dirent instead of dentry
    sysfs: implement sysfs_find_dirent() and sysfs_get_dirent()
    sysfs: implement SYSFS_FLAG_REMOVED flag
    sysfs: rename sysfs_dirent->s_type to s_flags and make room for flags
    sysfs: make sysfs_drop_dentry() access inodes using ilookup()
    sysfs: Fix oops in sysfs_drop_dentry on x86_64
    sysfs: use singly-linked list for sysfs_dirent tree
    sysfs: slim down sysfs_dirent->s_active
    sysfs: move s_active functions to fs/sysfs/dir.c
    sysfs: fix root sysfs_dirent -> root dentry association
    sysfs: use iget_locked() instead of new_inode()
    sysfs: reorganize sysfs_new_inode() and sysfs_create()
    sysfs: fix parent refcounting during rename and move
    ...

    Linus Torvalds
     

12 Jul, 2007

2 commits

  • Well, first of all, I don't want to change so many files either.

    What I do:
    Adding a new parameter "struct bin_attribute *" in the
    .read/.write methods for the sysfs binary attributes.

    In fact, only the four changed lines in fs/sysfs/bin.c and
    include/linux/sysfs.h do the real work.
    But I have to update all the files that use binary attributes
    to make them compatible with the new .read and .write methods.
    I'm not sure if I missed any. :(

    Why I do this:
    For a sysfs attribute, we can get a pointer pointing to the
    struct attribute in the .show/.store method,
    while we can't do this for the binary attributes.
    I don't know why this is different, but it makes binary
    attributes less convenient to use than regular ones.
    So I think this patch is reasonable. :)

    Who benefits from it:
    The patch that exposes ACPI tables in sysfs
    requires such an improvement.
    All the table binary attributes share the same .read method.
    Parameter "struct bin_attribute *" is used to get
    the table signature and instance number which are used to
    distinguish different ACPI table binary attributes.

    Without this parameter, we need to offer different .read methods
    for different ACPI table binary attributes.
    This is impossible as there are various ACPI tables on different
    platforms, and we don't know what they are until they are loaded.
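
The benefit can be illustrated with a userspace sketch in which one shared read method recovers per-table data from the attribute pointer it receives. The structs, the `mock_container_of` macro, and the read signature are simplified assumptions, not the sysfs API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mock_bin_attribute {
    const char *name;
    size_t size;
};

/* Hypothetical per-table wrapper embedding the generic attribute. */
struct mock_acpi_table_attr {
    struct mock_bin_attribute attr;
    char signature[5];
    int instance;
};

#define mock_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* One read method for every table: the attribute pointer identifies which. */
size_t mock_table_read(struct mock_bin_attribute *bin_attr, char *buf,
                       size_t count)
{
    assert(bin_attr != NULL && buf != NULL);
    struct mock_acpi_table_attr *t =
        mock_container_of(bin_attr, struct mock_acpi_table_attr, attr);
    size_t n = strlen(t->signature);
    if (n > count)
        n = count;
    memcpy(buf, t->signature, n);   /* real code would copy table contents */
    return n;
}
```

Without the attribute pointer, each table would need its own read function, which is not workable when the set of tables is only known at run time.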

    Signed-off-by: Zhang Rui
    Signed-off-by: Greg Kroah-Hartman

    Zhang Rui
     
  • sysfs is now completely out of driver/module lifetime game. After
    deletion, a sysfs node doesn't access anything outside sysfs proper,
    so there's no reason to hold onto the attribute owners. Note that
    often the wrong modules were accounted for as owners leading to
    accessing removed modules.

    This patch kills now unnecessary attribute->owner. Note that with
    this change, userland holding a sysfs node does not prevent the
    backing module from being unloaded.

    For more info regarding lifetime rule cleanup, please read the
    following message.

    http://article.gmane.org/gmane.linux.kernel/510293

    (tweaked by Greg to not delete the field just yet, to make it easier to
    merge things properly.)

    Signed-off-by: Tejun Heo
    Cc: Cornelia Huck
    Cc: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     

11 Jul, 2007

1 commit

  • The existing model for checksum offload does not correctly handle
    devices that can offload IPv4 and IPv6 only. The NETIF_F_HW_CSUM flag
    implies the device can checksum any arbitrary protocol.

    This patch:
    * adds NETIF_F_IPV6_CSUM for those devices
    * fixes bnx2 and tg3 devices that need it
    * adds NETIF_F_IPV6_CSUM to ipv6 output (incl GSO)
    * fixes assumptions about NETIF_F_ALL_CSUM in nat
    * adjusts bridge union of checksumming computation
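
A small sketch of how a per-protocol capability check might look once an IPv6-specific flag exists. The flag values, the enum, and the function are mock assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Mock feature bits; the values are illustrative. */
#define MOCK_F_IP_CSUM   0x02u  /* IPv4 only */
#define MOCK_F_HW_CSUM   0x08u  /* any protocol */
#define MOCK_F_IPV6_CSUM 0x10u  /* IPv6 only */

enum mock_proto { MOCK_PROTO_IPV4, MOCK_PROTO_IPV6, MOCK_PROTO_OTHER };

/* Can a device with these features checksum packets of this protocol? */
int mock_can_checksum(uint32_t features, enum mock_proto proto)
{
    assert(proto <= MOCK_PROTO_OTHER);
    if (features & MOCK_F_HW_CSUM)
        return 1;                    /* protocol-agnostic offload */
    switch (proto) {
    case MOCK_PROTO_IPV4:
        return (features & MOCK_F_IP_CSUM) != 0;
    case MOCK_PROTO_IPV6:
        return (features & MOCK_F_IPV6_CSUM) != 0;
    default:
        return 0;                    /* no offload for other protocols */
    }
}
```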

    Signed-off-by: David S. Miller

    Stephen Hemminger
     

31 May, 2007

2 commits


09 May, 2007

1 commit


04 May, 2007

1 commit

  • Cleanup of dev_base list use, with the aim of simplifying the move to a
    per-namespace device list. In almost every case, use of the dev_base
    variable and the dev->next pointer could be easily replaced by a
    for_each_netdev loop. A few of the most complicated places were
    converted to using first_netdev()/next_netdev().
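
The conversion can be pictured with a userspace sketch of an iteration macro that hides the list head and the next pointer from callers. The `mock_` names and the singly linked list are assumptions for this example only.

```c
#include <assert.h>
#include <stddef.h>

struct mock_net_device {
    const char *name;
    struct mock_net_device *next;
};

/* Mock global device list head. */
struct mock_net_device *mock_dev_base;

/* Iteration macro: callers no longer touch the head or ->next directly. */
#define mock_for_each_netdev(d) \
    for ((d) = mock_dev_base; (d) != NULL; (d) = (d)->next)

/* Example caller written against the macro only. */
int mock_count_devices(void)
{
    struct mock_net_device *d;
    int n = 0;

    mock_for_each_netdev(d) {
        assert(d->name != NULL);   /* every listed device has a name */
        n++;
    }
    return n;
}
```

Once callers go through the macro, changing how the list is stored only requires changing the macro.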

    Signed-off-by: Pavel Emelianov
    Acked-by: Kirill Korotaev
    Signed-off-by: David S. Miller

    Pavel Emelianov
     

03 May, 2007

1 commit


26 Apr, 2007

1 commit