17 Sep, 2007

1 commit


15 Sep, 2007

1 commit

  • 1) Comments suggest that setting optlen to zero will unbind
    the socket from whatever device it might be attached to. This
    hasn't been the case since at least 2.2.x because the first thing
    this function does is return -EINVAL if 'optlen' is less than
    sizeof(int).

    This check also means that passing in a two byte string doesn't
    work so well. It's almost as if this code was testing with "eth?"
    patterned strings and nothing else :-)

    Fix this by breaking the logic of this facility out into a
    separate function which validates optlen more appropriately.

    The optlen==0 and small string cases now work properly.

    2) We should reset the cached route of the socket after we have made
    the device binding changes, not before.

    Reported by Ben Greear.

    Signed-off-by: David S. Miller

    David S. Miller
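
    The optlen handling described in this commit can be sketched as a
    small user-space model. This is not the kernel function itself:
    parse_bind_name and MODEL_IFNAMSIZ are hypothetical names, and the
    sketch only assumes that a negative length is rejected, that a long
    name is clamped, and that an empty name means "unbind".

```c
#include <errno.h>
#include <string.h>

/* Assumed to mirror the kernel's interface-name limit (16 bytes). */
#define MODEL_IFNAMSIZ 16

/*
 * User-space model of the optlen validation: copy at most
 * MODEL_IFNAMSIZ - 1 bytes of optval into name and NUL-terminate it.
 * An empty name on return stands for "unbind from any device".
 * Returns 0 on success or -EINVAL for a negative length.
 */
static int parse_bind_name(const char *optval, int optlen,
			   char name[MODEL_IFNAMSIZ])
{
	if (optlen < 0)
		return -EINVAL;

	/* Clamp instead of rejecting, so short and long strings both work. */
	if (optlen > MODEL_IFNAMSIZ - 1)
		optlen = MODEL_IFNAMSIZ - 1;

	memset(name, 0, MODEL_IFNAMSIZ);
	if (optlen > 0)
		memcpy(name, optval, (size_t)optlen);

	return 0;
}
```

    With this shape, optlen == 0 and a two-byte name such as "lo" are
    both accepted, which are the two cases the commit says used to fail.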
     

11 Sep, 2007

1 commit

  • When msg_iovlen is zero we shouldn't try to dereference
    msg_iov. Right now the only thing that tries to do so
    is skb_copy_and_csum_datagram_iovec. Since the total
    length should also be zero if msg_iovlen is zero, it's
    sufficient to check the total length there and simply
    return if it's zero.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
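
    The guard described in this commit can be illustrated with a
    user-space sketch. struct model_msg and model_copy_to_iovec are
    hypothetical stand-ins, not the kernel's msghdr or
    skb_copy_and_csum_datagram_iovec; the sketch only assumes a fast
    path that reads the first iovec element before any loop.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical stand-in for the message header fields used here. */
struct model_msg {
	struct iovec *iov;	/* may be NULL when iovlen is 0 */
	size_t iovlen;
};

/*
 * Copy len bytes from src into the iovec array and return the number
 * of bytes copied. When len is 0 the function returns before looking
 * at msg->iov at all, so a NULL array is never dereferenced.
 */
static size_t model_copy_to_iovec(const struct model_msg *msg,
				  const char *src, size_t len)
{
	size_t done = 0;

	if (len == 0)
		return 0;

	/* Fast path: the first element is large enough for everything. */
	if (msg->iov[0].iov_len >= len) {
		memcpy(msg->iov[0].iov_base, src, len);
		return len;
	}

	/* Slow path: spread the data over several elements. */
	for (size_t i = 0; i < msg->iovlen && done < len; i++) {
		size_t n = msg->iov[i].iov_len;

		if (n > len - done)
			n = len - done;
		memcpy(msg->iov[i].iov_base, src + done, n);
		done += n;
	}
	return done;
}
```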
     

31 Aug, 2007

1 commit


29 Aug, 2007

1 commit

  • Initially pkt_dev can be NULL, which causes netif_subqueue_stopped to
    oops. The patch below should cure it. But maybe the pktgen TX logic
    should be reworked to fit the new multiqueue support better.

    Signed-off-by: Robert Olsson
    Signed-off-by: David S. Miller

    Robert Olsson
     

27 Aug, 2007

2 commits


15 Aug, 2007

1 commit


14 Aug, 2007

1 commit

  • http://bugzilla.kernel.org/show_bug.cgi?id=8797 shows that the
    bonding driver may produce bogus combinations of the checksum
    flags and SG/TSO.

    For example, if you bond devices with NETIF_F_HW_CSUM and
    NETIF_F_IP_CSUM you'll end up with a bonding device that
    has neither flag set. If both have TSO then this produces
    an illegal combination.

    The bridge device on the other hand has the correct code to
    deal with this.

    In fact, the same code can be used for both. So this patch
    moves that logic into net/core/dev.c and uses it for both
    bonding and bridging.

    In the process I've made small adjustments such as only
    setting GSO_ROBUST if at least one constituent device
    supports it.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
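
    A simplified model of the flag combination this commit describes is
    sketched below. The flag values are made up for illustration and do
    not match the kernel's NETIF_F_* definitions, and combine_features
    is a hypothetical name; the shared helper in net/core/dev.c may
    differ in detail. The sketch assumes that HW_CSUM can do everything
    IP_CSUM can, that SG needs some checksum offload, and that TSO needs
    SG.

```c
#include <stdint.h>

/* Illustrative bits only; not the kernel's NETIF_F_* values. */
enum {
	F_SG         = 1u << 0,
	F_IP_CSUM    = 1u << 1,
	F_HW_CSUM    = 1u << 2,
	F_TSO        = 1u << 3,
	F_GSO_ROBUST = 1u << 4,
};

/*
 * Fold the features of one more constituent device ("one") into the
 * running result ("all", which starts as the first device's features).
 */
static uint32_t combine_features(uint32_t all, uint32_t one)
{
	/* GSO_ROBUST is kept if at least one device supports it. */
	uint32_t robust = (all | one) & F_GSO_ROBUST;

	/* A device with HW_CSUM can also handle the IP_CSUM case. */
	if (one & F_HW_CSUM)
		one |= F_IP_CSUM;

	/* Downgrade HW_CSUM to IP_CSUM if the new device lacks it. */
	if ((all & F_HW_CSUM) && !(one & F_HW_CSUM))
		all = (all & ~(uint32_t)F_HW_CSUM) | F_IP_CSUM;

	all = (all & one & ~(uint32_t)F_GSO_ROBUST) | robust;

	/* No checksum offload left: SG is not usable. */
	if (!(all & (F_IP_CSUM | F_HW_CSUM)))
		all &= ~(uint32_t)F_SG;

	/* No SG: segmentation offload is not usable either. */
	if (!(all & F_SG))
		all &= ~(uint32_t)(F_TSO | F_GSO_ROBUST);

	return all;
}
```

    In this model, combining a HW_CSUM device with an IP_CSUM device
    yields IP_CSUM rather than no checksum flag, so TSO is not left
    enabled without a checksum flag.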
     

08 Aug, 2007

1 commit


01 Aug, 2007

3 commits

  • replay label is unused otherwise.

    Signed-off-by: Thomas Graf
    Signed-off-by: David S. Miller

    Thomas Graf
     
  • All drivers implement ethtool get_perm_addr the same way -- by calling
    the generic function. So we can inline the generic function into the
    caller and avoid going through the drivers.

    Signed-off-by: Matthew Wilcox
    Signed-off-by: David S. Miller

    Matthew Wilcox
     
  • During the transition to the ethtool_ops way of doing things, we supported
    calling the device's ->do_ioctl method to allow unconverted drivers to
    continue working. Those days are long behind us, all in-tree drivers
    use the ethtool_ops way, and so we no longer need to support this.

    The bonding driver is the biggest beneficiary of this; it no longer
    needs to call ioctl() as a fallback if ethtool_ops aren't supported.

    Also put a proper copyright statement on ethtool.c.

    Signed-off-by: Matthew Wilcox
    Signed-off-by: David S. Miller

    Matthew Wilcox
     

31 Jul, 2007

6 commits

  • Non-static inline code usually doesn't make sense.

    In this case making it static and non-inline is the correct solution.

    Signed-off-by: Adrian Bunk
    Signed-off-by: David S. Miller

    Adrian Bunk
     
  • This patch adds code to allow errors to be passed up from event
    handlers of NETDEV_REGISTER and NETDEV_CHANGENAME. It also adds
    the notifier_from_errno/notifier_to_errno helpers to pass the
    errno value up to the notifier caller.

    If an error is detected when a device is registered, it causes
    that operation to fail. A NETDEV_UNREGISTER will be sent to
    all event handlers.

    Similarly if NETDEV_CHANGENAME fails the original name is restored
    and a new NETDEV_CHANGENAME event is sent.

    As such all event handlers must be idempotent with respect to
    these events.

    When an event handler is registered NETDEV_REGISTER events are
    sent for all devices currently registered. Should any of them
    fail, we will send NETDEV_GOING_DOWN/NETDEV_DOWN/NETDEV_UNREGISTER
    events to that handler for the devices which have already been
    registered with it. The handler registration itself will fail.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
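
    One way such errno-carrying helpers can work is sketched below as a
    user-space model. The model_ names and MODEL_ constants are
    assumptions made for this sketch (an "OK" code of 1 and a stop bit
    of 0x8000); the kernel's own definitions live in its notifier
    header and should be consulted for the real encoding.

```c
#include <errno.h>

/* Assumed values for this model only. */
#define MODEL_NOTIFY_OK        0x0001
#define MODEL_NOTIFY_STOP_MASK 0x8000

/*
 * Encode a negative errno into a notifier return value. The stop bit
 * tells the caller to stop walking the chain; the low bits carry the
 * error as an offset above MODEL_NOTIFY_OK.
 */
static int model_notifier_from_errno(int err)
{
	if (err)
		return MODEL_NOTIFY_STOP_MASK | (MODEL_NOTIFY_OK - err);
	return MODEL_NOTIFY_OK;
}

/* Decode a notifier return value back into 0 or a negative errno. */
static int model_notifier_to_errno(int ret)
{
	ret &= ~MODEL_NOTIFY_STOP_MASK;
	return ret > MODEL_NOTIFY_OK ? MODEL_NOTIFY_OK - ret : 0;
}
```

    A handler that rejects a device would return
    model_notifier_from_errno(-EINVAL), and the caller would recover
    -EINVAL with model_notifier_to_errno() and fail the registration.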
     
  • When we added name-based hashing the dev_base_lock was designated as the
    lock to take when changing the name hash list. Unfortunately, because
    it was a preexisting lock that just happened to be taken in the right
    spots we neglected to take it in dev_change_name.

    The race can affect callers of __dev_get_by_name that do so without taking
    the RTNL. They may end up walking down the wrong hash chain and end up
    missing the device that they're looking for.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • This patch makes register_netdevice call dev->uninit if the registration
    fails after dev->init has completed successfully. Very few drivers use
    the init/uninit calls but at least one (drivers/net/wan/sealevel.c) may
    leak without this change.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
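
    The unwind pattern described in this commit can be sketched in user
    space as follows. struct model_dev, model_register and the
    fail_late knob are hypothetical; the only point illustrated is that
    a failure after a successful init hook must run the matching uninit
    hook.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical device with optional init/uninit hooks. */
struct model_dev {
	int (*init)(struct model_dev *dev);
	void (*uninit)(struct model_dev *dev);
	int fail_late;		/* test knob: fail a step after init */
	int init_calls;
	int uninit_calls;
};

/* Example hooks that only count how often they run. */
static int counting_init(struct model_dev *dev)
{
	dev->init_calls++;
	return 0;
}

static void counting_uninit(struct model_dev *dev)
{
	dev->uninit_calls++;
}

static int model_register(struct model_dev *dev)
{
	int err;

	if (dev->init) {
		err = dev->init(dev);
		if (err)
			return err;	/* init failed: nothing to undo */
	}

	/* Stands in for any later step that can fail. */
	err = dev->fail_late ? -EEXIST : 0;
	if (err)
		goto err_uninit;

	return 0;

err_uninit:
	/* Undo a successful init so its resources are not leaked. */
	if (dev->uninit)
		dev->uninit(dev);
	return err;
}
```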
     
  • Signed-off-by: David S. Miller

    David S. Miller
     
  • Fix kernel-doc omissions in net/:

    Warning(linux-2.6.23-rc1//net/core/dev.c:2728): No description found for parameter 'addr'
    Warning(linux-2.6.23-rc1//net/core/dev.c:2752): No description found for parameter 'addr'
    Warning(linux-2.6.23-rc1//net/core/dev.c:3839): No description found for parameter 'net_dma'
    Warning(linux-2.6.23-rc1//net/core/dev.c:3877): No description found for parameter 'state'

    Signed-off-by: Randy Dunlap
    Signed-off-by: David S. Miller

    Randy Dunlap
     

22 Jul, 2007

1 commit


21 Jul, 2007

1 commit


20 Jul, 2007

3 commits

  • Slab destructors were no longer supported after Christoph's
    c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
    BUGs for both slab and slub, and slob never supported them
    either.

    This rips out support for the dtor pointer from kmem_cache_create()
    completely and fixes up every single callsite in the kernel (there were
    about 224, not including the slab allocator definitions themselves,
    or the documentation references).

    Signed-off-by: Paul Mundt

    Paul Mundt
     
  • * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6: (25 commits)
    [TG3]: Fix msi issue with kexec/kdump.
    [NET] XFRM: Fix whitespace errors.
    [NET] TIPC: Fix whitespace errors.
    [NET] SUNRPC: Fix whitespace errors.
    [NET] SCTP: Fix whitespace errors.
    [NET] RXRPC: Fix whitespace errors.
    [NET] ROSE: Fix whitespace errors.
    [NET] RFKILL: Fix whitespace errors.
    [NET] PACKET: Fix whitespace errors.
    [NET] NETROM: Fix whitespace errors.
    [NET] NETFILTER: Fix whitespace errors.
    [NET] IPV4: Fix whitespace errors.
    [NET] DCCP: Fix whitespace errors.
    [NET] CORE: Fix whitespace errors.
    [NET] BLUETOOTH: Fix whitespace errors.
    [NET] AX25: Fix whitespace errors.
    [PATCH] mac80211: remove rtnl locking in ieee80211_sta.c
    [PATCH] mac80211: fix GCC warning on 64bit platforms
    [GENETLINK]: Dynamic multicast groups.
    [NETLIKN]: Allow removing multicast groups.
    ...

    Linus Torvalds
     
  • The two init sites resulted in inconsistent names for the lock class.

    Signed-off-by: Peter Zijlstra
    Acked-by: Ingo Molnar
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

19 Jul, 2007

1 commit


18 Jul, 2007

5 commits

  • Signed-off-by: Denis Cheng
    Signed-off-by: David S. Miller

    Denis Cheng
     
  • These two functions could share the dev->_xmit_lock acquired context.

    Signed-off-by: Denis Cheng
    Signed-off-by: David S. Miller

    Denis Cheng
     
  • Because this function is only called by unregister_netdevice,
    moving it makes it possible to declare this non-global function
    static and to remove its declaration from netdevice.h.

    Further, __dev_addr_discard is only called by dev_mc_discard and
    dev_unicast_discard; keeping these two functions in one C file
    allows __dev_addr_discard to be made static as well, and its
    declaration to be removed from netdevice.h.

    Furthermore, the sequential calls to dev_unicast_discard and then
    dev_mc_discard in unregister_netdevice follow the same pattern
    (netif_tx_lock_bh / __dev_addr_discard / netif_tx_unlock_bh).
    They should be merged into one to avoid acquiring and releasing
    dev->_xmit_lock twice; this will be done in my following patch.

    Signed-off-by: Denis Cheng
    Signed-off-by: David S. Miller

    Denis Cheng
     
  • -Fixes ABBA deadlock noted by Patrick McHardy :

    > There is at least one ABBA deadlock, est_timer() does:
    > read_lock(&est_lock)
    > spin_lock(e->stats_lock) (which is dev->queue_lock)
    >
    > and qdisc_destroy calls htb_destroy under dev->queue_lock, which
    > calls htb_destroy_class, then gen_kill_estimator and this
    > write_locks est_lock.

    To fix the ABBA deadlock the rate estimators are now kept on an rcu list.

    -The est_lock changes the use from protecting the list to protecting
    the update to the 'bstat' pointer in order to avoid NULL dereferencing.

    -The 'interval' member of the gen_estimator structure is removed as it
    is not needed.

    Signed-off-by: Ranko Zivojnovic
    Signed-off-by: David S. Miller

    Ranko Zivojnovic
     
  • Currently, the freezer treats all tasks as freezable, except for the kernel
    threads that explicitly set the PF_NOFREEZE flag for themselves. This
    approach is problematic, since it requires every kernel thread to either
    set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
    care for the freezing of tasks at all.

    It seems better to only require the kernel threads that want to or need to
    be frozen to use some freezer-related code and to remove any
    freezer-related code from the other (nonfreezable) kernel threads, which is
    done in this patch.

    The patch causes all kernel threads to be nonfreezable by default (ie. to
    have PF_NOFREEZE set by default) and introduces the set_freezable()
    function that should be called by the freezable kernel threads in order to
    unset PF_NOFREEZE. It also makes all of the currently freezable kernel
    threads call set_freezable(), so it shouldn't cause any (intentional)
    change of behaviour to appear. Additionally, it updates documentation to
    describe the freezing of tasks more accurately.

    [akpm@linux-foundation.org: build fixes]
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Nigel Cunningham
    Cc: Pavel Machek
    Cc: Oleg Nesterov
    Cc: Gautham R Shenoy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

17 Jul, 2007

2 commits

  • This reverts commit 29578624e354f56143d92510fff33a8b2aaa2c03.

    Ingo Molnar reports complete breakage with his e1000 card (no
    networking, card reports transmit timeouts), and bisected it down to
    this commit. Let's figure out what went wrong, but not keep breaking
    machines until we do.

    Cc: Ingo Molnar
    Cc: Olaf Kirch
    Cc: David Miller
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Part two in the O_CLOEXEC saga: adding support for file descriptors received
    through Unix domain sockets.

    The patch is once again pretty minimal, it introduces a new flag for recvmsg
    and passes it just like the existing MSG_CMSG_COMPAT flag. I think this bit
    is not used otherwise but the networking people will know better.

    This new flag is not recognized by recvfrom and recv. These functions cannot
    be used for that purpose and the asymmetry this introduces is not worse than
    the already existing MSG_CMSG_COMPAT situations.

    The patch must be applied on top of the patch which introduced O_CLOEXEC. It
    has to remove static from the new get_unused_fd_flags function, but since
    scm.c cannot live in a module the function still doesn't have to be exported.

    Here's a test program to make sure the code works. It's so much longer than
    the actual patch...

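    #include <errno.h>
    #include <error.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <sys/un.h>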

    #ifndef O_CLOEXEC
    # define O_CLOEXEC 02000000
    #endif
    #ifndef MSG_CMSG_CLOEXEC
    # define MSG_CMSG_CLOEXEC 0x40000000
    #endif

    int
    main (int argc, char *argv[])
    {
      if (argc > 1)
        {
          int fd = atol (argv[1]);
          printf ("child: fd = %d\n", fd);
          if (fcntl (fd, F_GETFD) == 0 || errno != EBADF)
            {
              puts ("file descriptor valid in child");
              return 1;
            }
          return 0;
        }

      struct sockaddr_un sun;
      strcpy (sun.sun_path, "./testsocket");
      sun.sun_family = AF_UNIX;

      char databuf[] = "hello";
      struct iovec iov[1];
      iov[0].iov_base = databuf;
      iov[0].iov_len = sizeof (databuf);

      union
      {
        struct cmsghdr hdr;
        char bytes[CMSG_SPACE (sizeof (int))];
      } buf;
      struct msghdr msg = { .msg_iov = iov, .msg_iovlen = 1,
                            .msg_control = buf.bytes,
                            .msg_controllen = sizeof (buf) };
      struct cmsghdr *cmsg = CMSG_FIRSTHDR (&msg);

      cmsg->cmsg_level = SOL_SOCKET;
      cmsg->cmsg_type = SCM_RIGHTS;
      cmsg->cmsg_len = CMSG_LEN (sizeof (int));

      msg.msg_controllen = cmsg->cmsg_len;

      pid_t child = fork ();
      if (child == -1)
        error (1, errno, "fork");
      if (child == 0)
        {
          int sock = socket (PF_UNIX, SOCK_STREAM, 0);
          if (sock < 0)
            error (1, errno, "socket");

          if (bind (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
            error (1, errno, "bind");
          if (listen (sock, SOMAXCONN) < 0)
            error (1, errno, "listen");

          int conn = accept (sock, NULL, NULL);
          if (conn == -1)
            error (1, errno, "accept");

          *(int *) CMSG_DATA (cmsg) = sock;
          if (sendmsg (conn, &msg, MSG_NOSIGNAL) < 0)
            error (1, errno, "sendmsg");

          return 0;
        }

      /* For a test suite this should be more robust like a
         barrier in shared memory. */
      sleep (1);

      int sock = socket (PF_UNIX, SOCK_STREAM, 0);
      if (sock < 0)
        error (1, errno, "socket");

      if (connect (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0)
        error (1, errno, "connect");
      unlink (sun.sun_path);

      *(int *) CMSG_DATA (cmsg) = -1;

      if (recvmsg (sock, &msg, MSG_CMSG_CLOEXEC) < 0)
        error (1, errno, "recvmsg");

      int fd = *(int *) CMSG_DATA (cmsg);
      if (fd == -1)
        error (1, 0, "no descriptor received");

      char fdname[20];
      snprintf (fdname, sizeof (fdname), "%d", fd);
      execl ("/proc/self/exe", argv[0], fdname, NULL);
      puts ("execl failed");
      return 1;
    }

    [akpm@linux-foundation.org: Fix fastcall inconsistency noted by Michael Buesch]
    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Ulrich Drepper
    Cc: Ingo Molnar
    Cc: Michael Buesch
    Cc: Michael Kerrisk
    Acked-by: David S. Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     

15 Jul, 2007

4 commits

  • Add ethtool utility function to set or clear IPV6_CSUM feature flag.
    Modify tg3.c and bnx2.c to use this function when doing ethtool -K
    to change tx checksum.

    Signed-off-by: Michael Chan
    Signed-off-by: David S. Miller

    Michael Chan
     
  • Add the macvlan driver, which allows creating virtual ethernet devices
    based on MAC address.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • The method drivers currently use to synchronize multicast lists is not
    very pretty:

    - walk the multicast list
    - search each entry on a copy of the previous list
    - if new add to lower device
    - walk the copy of the previous list
    - search each entry on the current list
    - if removed delete from lower device
    - copy entire list

    This patch adds a new field to struct dev_addr_list to store the
    synchronization state and adds two helper functions for synchronization
    and cleanup.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
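
    The idea of keeping a per-entry synchronization state, instead of
    diffing against a copy of the previous list, can be sketched in
    user space as below. All names here (struct model_addr, model_sync
    and so on) are hypothetical, and fixed-size arrays stand in for the
    kernel's linked lists; this shows the technique only, not the
    helper functions the patch adds.

```c
#include <stddef.h>
#include <string.h>

#define MODEL_MAX  8
#define MODEL_ALEN 6

struct model_addr {
	unsigned char mac[MODEL_ALEN];
	int users;	/* 0 means the slot is free */
	int synced;	/* set once pushed to the lower list */
};

struct model_list {
	struct model_addr e[MODEL_MAX];
};

static struct model_addr *model_find(struct model_list *l,
				     const unsigned char *mac)
{
	for (int i = 0; i < MODEL_MAX; i++)
		if (l->e[i].users > 0 &&
		    memcmp(l->e[i].mac, mac, MODEL_ALEN) == 0)
			return &l->e[i];
	return NULL;
}

static int model_add(struct model_list *l, const unsigned char *mac)
{
	struct model_addr *a = model_find(l, mac);

	if (a) {
		a->users++;
		return 0;
	}
	for (int i = 0; i < MODEL_MAX; i++) {
		if (l->e[i].users == 0) {
			memcpy(l->e[i].mac, mac, MODEL_ALEN);
			l->e[i].users = 1;
			l->e[i].synced = 0;
			return 0;
		}
	}
	return -1;	/* list full */
}

static void model_del(struct model_list *l, const unsigned char *mac)
{
	struct model_addr *a = model_find(l, mac);

	if (a && --a->users == 0)
		a->synced = 0;
}

/* One pass over the upper list; no copy of the previous list needed. */
static int model_sync(struct model_list *lower, struct model_list *upper)
{
	for (int i = 0; i < MODEL_MAX; i++) {
		struct model_addr *a = &upper->e[i];

		if (a->users == 0)
			continue;
		if (!a->synced) {
			if (model_add(lower, a->mac) < 0)
				return -1;
			a->synced = 1;
			a->users++;	/* the sync holds a reference */
		} else if (a->users == 1) {
			/* Only the sync reference is left: propagate removal. */
			model_del(lower, a->mac);
			model_del(upper, a->mac);
		}
	}
	return 0;
}
```

    Each address is visited once per pass, and a removal on the upper
    device reaches the lower device on the next model_sync() call.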
     
  • Currently the set_multicast_list (and set_rx_mode) callbacks are
    responsible for configuring the device according to the IFF_PROMISC,
    IFF_MULTICAST and IFF_ALLMULTI flags and the mc_list (and uc_list in
    case of set_rx_mode).

    These callbacks can be invoked from BH context without the rtnl_mutex
    by dev_mc_add/dev_mc_delete, which makes reading the device flags and
    promiscuous/allmulti count racy. For real hardware drivers that just
    commit all changes to the hardware this is not a real problem since
    the stack guarantees to call them for every change, so at least the
    final call will not race and commit the correct configuration to the
    hardware.

    For software devices that want to synchronize promiscuous and multicast
    state to an underlying device however this can cause corruption of the
    underlying device's flags or promisc/allmulti counts.

    When the software device is concurrently put in promiscuous or allmulti
    mode while set_multicast_list is invoked from bottom half context, the
    device might synchronize the change to the underlying device without
    holding the rtnl_mutex, which races with concurrent changes to the
    underlying device.

    Add a dev->change_rx_flags hook that is invoked when any of the flags
    that affect rx filtering change (under the rtnl_mutex), which allows
    drivers to perform synchronization immediately and only synchronize
    the address lists in set_multicast_list/set_rx_mode.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     

14 Jul, 2007

1 commit

  • * 'ioat-md-accel-for-linus' of git://lost.foo-projects.org/~dwillia2/git/iop: (28 commits)
    ioatdma: add the unisys "i/oat" pci vendor/device id
    ARM: Add drivers/dma to arch/arm/Kconfig
    iop3xx: surface the iop3xx DMA and AAU units to the iop-adma driver
    iop13xx: surface the iop13xx adma units to the iop-adma driver
    dmaengine: driver for the iop32x, iop33x, and iop13xx raid engines
    md: remove raid5 compute_block and compute_parity5
    md: handle_stripe5 - request io processing in raid5_run_ops
    md: handle_stripe5 - add request/completion logic for async expand ops
    md: handle_stripe5 - add request/completion logic for async read ops
    md: handle_stripe5 - add request/completion logic for async check ops
    md: handle_stripe5 - add request/completion logic for async compute ops
    md: handle_stripe5 - add request/completion logic for async write ops
    md: common infrastructure for running operations with raid5_run_ops
    md: raid5_run_ops - run stripe operations outside sh->lock
    raid5: replace custom debug PRINTKs with standard pr_debug
    raid5: refactor handle_stripe5 and handle_stripe6 (v3)
    async_tx: add the async_tx api
    xor: make 'xor_blocks' a library routine for use with async_tx
    dmaengine: make clients responsible for managing channels
    dmaengine: refactor dmaengine around dma_async_tx_descriptor
    ...

    Linus Torvalds
     

13 Jul, 2007

1 commit

  • The current implementation assumes that a channel will only be used by one
    client at a time. In order to enable channel sharing the dmaengine core is
    changed to a model where clients subscribe to channel-available-events.
    Instead of tracking how many channels a client wants and how many it has
    received the core just broadcasts the available channels and lets the
    clients optionally take a reference. The core learns about the clients'
    needs at dma_event_callback time.

    In support of multiple operation types, clients can specify a capability
    mask to only be notified of channels that satisfy a certain set of
    capabilities.

    Changelog:
    * removed DMA_TX_ARRAY_INIT, no longer needed
    * dma_client_chan_free -> dma_chan_release: switch to global reference
    counting only at device unregistration time, before it was also happening
    at client unregistration time
    * clients now return dma_state_client to dmaengine (ack, dup, nak)
    * checkpatch.pl fixes
    * fixup merge with git-ioat

    Cc: Chris Leech
    Signed-off-by: Shannon Nelson
    Signed-off-by: Dan Williams
    Acked-by: David S. Miller

    Dan Williams
     

12 Jul, 2007

2 commits

  • Drivers need to validate the initial addresses in their netlink attribute
    validation function or manually reject them if they can't support this.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy
     
  • All drivers need to unregister their devices in the module unload function.
    While doing so they must hold the rtnl and atomically unregister the
    rtnl_link ops as well. This makes the rtnl_link_unregister function that
    takes the rtnl itself completely useless.

    Provide default newlink/dellink functions, make __rtnl_link_unregister and
    rtnl_link_unregister unregister all devices with matching rtnl_link_ops and
    change the existing users to take advantage of that.

    Signed-off-by: Patrick McHardy
    Signed-off-by: David S. Miller

    Patrick McHardy