22 Jun, 2011

14 commits

  • Now that we allow multiple IEEE App entries we need a way
    to remove specific entries. To do this add the ieee_dcb_delapp()
    routine.

    Additionally drivers may need to remove the APP entry from
    their firmware tables. Add dcb ops routine to handle this.

    Signed-off-by: John Fastabend
    Signed-off-by: David S. Miller

    John Fastabend
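
To illustrate removing one specific entry when several share a selector and protocol, here is a minimal user-space sketch. It is not the kernel's ieee_dcb_delapp(); struct app_entry, app_add() and app_del() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one APP table entry. */
struct app_entry {
    uint8_t selector;
    uint16_t protocol;
    uint8_t priority;
    struct app_entry *next;
};

/* Add an entry; only an exact (selector, protocol, priority) duplicate is refused. */
static int app_add(struct app_entry **head, uint8_t sel, uint16_t proto, uint8_t prio)
{
    struct app_entry *e;

    assert(head != NULL);
    for (e = *head; e; e = e->next)
        if (e->selector == sel && e->protocol == proto && e->priority == prio)
            return -1;

    e = malloc(sizeof(*e));
    if (!e)
        return -1;
    e->selector = sel;
    e->protocol = proto;
    e->priority = prio;
    e->next = *head;
    *head = e;
    return 0;
}

/* Remove only the entry matching all three fields; other entries stay. */
static int app_del(struct app_entry **head, uint8_t sel, uint16_t proto, uint8_t prio)
{
    struct app_entry **pp;

    assert(head != NULL);
    for (pp = head; *pp; pp = &(*pp)->next) {
        struct app_entry *e = *pp;

        if (e->selector == sel && e->protocol == proto && e->priority == prio) {
            *pp = e->next;
            free(e);
            return 0;
        }
    }
    return -1; /* no such entry */
}
```

Matching on priority as well as selector and protocol is what lets one of several otherwise identical entries be removed without touching the rest.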
     
  • This adds a setapp routine for IEEE802.1Qaz encoded APP data types.
    The IEEE 802.1Qaz spec encodes the priority bits differently and
    allows for multiple APP data entries of the same selector and
    protocol. Trying to force these to use the same set routines was
    becoming tedious. Furthermore, userspace could probably enforce
    the correct semantics, but expecting drivers to do this seems
    error prone in the firmware case.

    For these reasons add ieee_dcb_setapp() that understands the
    IEEE 802.1Qaz encoded form.

    Signed-off-by: John Fastabend
    Signed-off-by: David S. Miller

    John Fastabend
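
As a rough sketch of the encoding difference, assume an IEEE entry carries a single priority value (0-7) while the older CEE form carries a bitmap of priorities. Several IEEE entries with the same selector and protocol can then be folded into one bitmap. The struct and function names below are illustrative, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative IEEE-style APP entry: one priority value per entry. */
struct ieee_app {
    uint8_t selector;
    uint8_t priority;   /* 0..7, a value rather than a bitmap */
    uint16_t protocol;
};

/* OR together the priorities of every entry matching (selector, protocol). */
static uint8_t ieee_app_prio_mask(const struct ieee_app *tbl, size_t n,
                                  uint8_t sel, uint16_t proto)
{
    uint8_t mask = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (tbl[i].selector == sel && tbl[i].protocol == proto) {
            assert(tbl[i].priority < 8);
            mask |= (uint8_t)(1u << tbl[i].priority);
        }
    }
    return mask;
}
```

With two entries for the same selector and protocol at priorities 3 and 4, the resulting mask is 0x18.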
     
  • Now that dcbnl is being used in many cases by more
    than a single agent it is beneficial to be notified
    when some entity either driver or user space has
    changed the DCB attributes.

    Today applications either end up polling the interface
    or relying on a user space database to maintain the DCB
    state and post events. Polling is a poor solution for
    obvious reasons. And relying on a user space database
    has its own downside. Namely it has created strange
    boot dependencies requiring the database be populated
    before any applications dependent on DCB attributes
    starts or the application goes into a polling loop.
    Populating the database requires negotiating link
    setting with the peer and can take anywhere from less
    than a second up to a few seconds depending on the switch
    implementation.

    Perhaps more importantly if another application or an
    embedded agent sets a DCB link attribute the database
    has no way of knowing other than polling the kernel.
    This prevents applications from responding quickly to
    changes in link events which at least in the FCoE case
    and probably any other protocols expecting a lossless
    link may result in IO errors.

    By adding a multicast group for DCB we have a clean way
    to disseminate kernel DCB link attributes up to user
    space. This avoids the need for user space to maintain
    a coherent database and disperse events that potentially
    do not reflect the current link state.

    Signed-off-by: John Fastabend
    Signed-off-by: David S. Miller

    John Fastabend
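
A user-space listener for such a group might look roughly like the sketch below. It assumes the group is exposed as RTNLGRP_DCB in <linux/rtnetlink.h> on the NETLINK_ROUTE family; open_dcb_listener() is a hypothetical helper name, and parsing of the received messages is left out.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270 /* Linux value; some older libc headers omit it */
#endif

/* Return a netlink socket subscribed to the DCB group, or -1 on failure. */
static int open_dcb_listener(void)
{
    struct sockaddr_nl addr;
    int group = RTNLGRP_DCB;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.nl_family = AF_NETLINK;

    /* Bind first, then join the multicast group by number. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        setsockopt(fd, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
                   &group, sizeof(group)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

An application would then block in recv() on the returned descriptor instead of polling the interface.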
     
  • Adding the capabilities bitmask to the get_ieee response allows
    user space to determine the current DCBX mode, either CEE or IEEE.
    This is useful with devices that support switching between modes,
    where knowing the current state is relevant.

    Derived from work by Mark Rustad

    Signed-off-by: John Fastabend
    Signed-off-by: David S. Miller

    John Fastabend
     
  • And change iSCSI RQ doorbell size from 16B to 64B to match new firmware.

    Signed-off-by: Michael Chan
    Signed-off-by: Eddie Wai
    Signed-off-by: David S. Miller

    Michael Chan
     
  • This patch adds 2 tracepoints to get a status of a socket receive queue
    and related parameter.

    One tracepoint is added to sock_queue_rcv_skb. It records rcvbuf size
    and its usage. The other tracepoint is added to __sk_mem_schedule and
    it records limitations of memory for sockets and current usage.

    By using these tracepoints we can learn the detailed reason why the
    kernel drops a packet.

    Signed-off-by: Satoru Moriya
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Satoru Moriya
     
  • This patch adds a tracepoint to __udp_queue_rcv_skb to get the
    return value of ip_queue_rcv_skb. It indicates why kernel drops
    a packet at this point.

    ip_queue_rcv_skb returns the following values in the packet drop case:

    rcvbuf is full : -ENOMEM
    sk_filter returns error : -EINVAL, -EACCES, -ENOMEM, etc.
    __sk_mem_schedule returns error: -ENOBUFS

    Signed-off-by: Satoru Moriya
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Satoru Moriya
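
A consumer of the traced return value could translate it along the lines of the sketch below. udp_drop_reason() is a hypothetical helper, not part of the patch. Note that -ENOMEM is ambiguous, since both a full rcvbuf and a filter failure can produce it.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Map the traced return code to a human-readable drop reason. */
static const char *udp_drop_reason(int rc)
{
    switch (rc) {
    case 0:
        return "queued";
    case -ENOMEM:
        return "rcvbuf full (or sk_filter out of memory)";
    case -ENOBUFS:
        return "__sk_mem_schedule failed";
    case -EINVAL:
    case -EACCES:
        return "sk_filter error";
    default:
        return "other error";
    }
}
```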
     
  • It was suggested by "make versioncheck" that the following includes of
    linux/version.h are redundant:

    /home/jj/src/linux-2.6/net/caif/caif_dev.c: 14 linux/version.h not needed.
    /home/jj/src/linux-2.6/net/caif/chnl_net.c: 10 linux/version.h not needed.
    /home/jj/src/linux-2.6/net/ipv4/gre.c: 19 linux/version.h not needed.
    /home/jj/src/linux-2.6/net/netfilter/ipset/ip_set_core.c: 20 linux/version.h not needed.
    /home/jj/src/linux-2.6/net/netfilter/xt_set.c: 16 linux/version.h not needed.

    and it seems that it is right.

    Beyond manually inspecting the source files I also did a few build
    tests with various configs to confirm that including the header in
    those files is indeed not needed.

    Here's a patch to remove the pointless includes.

    Signed-off-by: Jesper Juhl
    Acked-by: Jozsef Kadlecsik
    Signed-off-by: David S. Miller

    Jesper Juhl
     
  • This patch enables software (and phy device) transmit time stamping.
    Compile tested only.

    Signed-off-by: Richard Cochran
    Signed-off-by: David S. Miller

    Richard Cochran
     
  • Because the socket buffer is freed in the completion interrupt, it is not
    safe to access it after submitting it to the hardware.

    Signed-off-by: Richard Cochran
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Richard Cochran
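
The usual shape of this kind of fix is to copy whatever is needed out of the buffer before handing it to the hardware. The sketch below models that in user space; struct fake_skb, hw_submit() and xmit() are stand-ins, and hw_submit() frees the buffer immediately to mimic a completion interrupt firing right away.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct fake_skb {
    size_t len;
};

static size_t tx_bytes;

/* Stand-in for the hardware path: the buffer may be gone on return. */
static void hw_submit(struct fake_skb *skb)
{
    free(skb);
}

static void xmit(struct fake_skb *skb)
{
    size_t len = skb->len;  /* read before submission */

    hw_submit(skb);
    tx_bytes += len;        /* skb must not be touched from here on */
}
```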
     
  • Convert xen driver to 64 bit statistics interface.
    Use stats_sync to ensure that 64 bit update is read atomically on 32 bit platform.
    Put hot statistics into per-cpu table.

    Signed-off-by: Stephen Hemminger
    Acked-by: Ian Campbell
    Signed-off-by: David S. Miller

    stephen hemminger
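
The idea behind reading a 64 bit counter atomically on a 32 bit platform can be sketched with a sequence counter: the writer makes the sequence odd while updating, and the reader retries if it saw an odd or changed sequence. This is a user-space approximation with C11 atomics and a single writer, not the kernel's stats_sync code; all names are illustrative.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct stats64 {
    atomic_uint seq;            /* odd while the writer is mid-update */
    _Atomic uint32_t lo, hi;    /* two 32-bit halves of one counter */
};

/* Single writer only: seq goes odd, both halves change, seq goes even. */
static void stats_add(struct stats64 *s, uint64_t n)
{
    unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
    uint64_t v = ((uint64_t)atomic_load_explicit(&s->hi, memory_order_relaxed) << 32) |
                 atomic_load_explicit(&s->lo, memory_order_relaxed);

    v += n;
    atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->lo, (uint32_t)v, memory_order_relaxed);
    atomic_store_explicit(&s->hi, (uint32_t)(v >> 32), memory_order_relaxed);
    atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader: retry until both halves were read under one even sequence. */
static uint64_t stats_read(struct stats64 *s)
{
    unsigned int seq;
    uint32_t lo, hi;

    do {
        seq = atomic_load_explicit(&s->seq, memory_order_acquire);
        lo = atomic_load_explicit(&s->lo, memory_order_relaxed);
        hi = atomic_load_explicit(&s->hi, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
    } while ((seq & 1) ||
             seq != atomic_load_explicit(&s->seq, memory_order_relaxed));

    return ((uint64_t)hi << 32) | lo;
}
```

Without the retry loop, a reader could combine the new low half with the old high half when an update carries across the 32 bit boundary.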
     
  • Need to add stat_sync wrapper around 64 bit statistic values.
    Fix wraparound bug in lockup detector where it is unsafely comparing
    a 64 bit value that is not atomic. Since we only care about detecting
    activity, just looking at the current low order bits will work.

    Remove unused entries in old vxge_sw_stats structure.
    Change the error counters to unsigned long since they won't grow so large
    as to have to be 64 bits.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    stephen hemminger
     
  • Convert input functional block device to use 64 bit stats.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: Eric Dumazet
    Acked-by: Jamal Hadi Salim
    Signed-off-by: David S. Miller

    stephen hemminger
     
  • Unnecessary casts of void * clutter the code.

    These are the remainder casts after several specific
    patches to remove netdev_priv and dev_priv.

    Done via coccinelle script (and a little editing):

    $ cat cast_void_pointer.cocci
    @@
    type T;
    T *pt;
    void *pv;
    @@

    - pt = (T *)pv;
    + pt = pv;

    Signed-off-by: Joe Perches
    Acked-by: Sjur Brændeland
    Acked-By: Chris Snook
    Acked-by: Jon Mason
    Acked-by: Geert Uytterhoeven
    Acked-by: David Dillow
    Signed-off-by: David S. Miller

    Joe Perches
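
The semantic patch above relies on a C language rule: a void * converts implicitly to any object pointer type, so the cast adds nothing. A minimal illustration (struct priv and get_id() are made-up names; this holds for C, not C++):

```c
#include <assert.h>

struct priv {
    int id;
};

/* No cast needed: void * converts implicitly to struct priv * in C. */
static int get_id(void *pv)
{
    struct priv *pt = pv;

    return pt->id;
}
```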
     

21 Jun, 2011

26 commits

  • Currently a single PCI pool is used across all CPUs, which does not
    scale as the number of CPUs increases, so this patch adds a per-CPU
    PCI pool to set up udl. That aligns well with the FCoE stack, which
    already has per-CPU exch locking.

    Adds per CPU PCI alloc setup and free in
    ixgbe_fcoe_ddp_pools_alloc and ixgbe_fcoe_ddp_pools_free,
    use CPU specific pool during DDP setup.

    Re-arranged ixgbe_fcoe struct to have fewer holes
    along with adding pools ptr using pahole.

    Signed-off-by: Vasu Dev
    Tested-by: Ross Brattain
    Signed-off-by: Jeff Kirsher

    Vasu Dev
     
  • This patch adds support for Dell CEM (Comprehensive Embedded Management).
    This consists of informing the management firmware of the driver version
    during probe on 82599 and X540 HW.

    Signed-off-by: Emil Tantilov
    Tested-by: Evan Swanson
    Signed-off-by: Jeff Kirsher

    Emil Tantilov
     
  • The ixgbe_dcb_txq_to_tc() routine was used to map TX rings to
    a DCB traffic class. Now that a tx_ring has a DCB traffic class
    associated with it this routine is no longer needed.

    Signed-off-by: John Fastabend
    Tested-by: Ross Brattain
    Signed-off-by: Jeff Kirsher

    John Fastabend
     
  • Now Flow Director's perfect filters feature can coexist with DCB.

    Signed-off-by: John Fastabend
    Tested-by: Ross Brattain
    Signed-off-by: Jeff Kirsher

    John Fastabend
     
  • This bit mask is wrong: DCBX_HOST is always set. It was missed up
    until now because lldpad reprograms the device on a link
    event. However this is still wrong and it is best not to be
    mis-configured for some time immediately following ixgbe_up().

    Signed-off-by: John Fastabend
    Tested-by: Ross Brattain
    Signed-off-by: Jeff Kirsher

    John Fastabend
     
  • Setup RSS redirection table to be compatible with multiple packet
    buffers. Currently, this works on 82599 devices because the RSS
    redirection index is masked by the number of queues per packet
    buffer.

    This sets the cap on the RSS table to maxq.

    Signed-off-by: John Fastabend
    Tested-by: Ross Brattain
    Signed-off-by: Jeff Kirsher

    John Fastabend
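
Capping the redirection table at maxq can be pictured as filling the table with queue indices that wrap at maxq. The sketch below assumes a 128-entry table and works on a plain array rather than device registers; fill_reta() and RETA_ENTRIES are illustrative names.

```c
#include <assert.h>
#include <stdint.h>

#define RETA_ENTRIES 128 /* assumed table size for this sketch */

/* Fill the table with 0, 1, ..., maxq-1, 0, 1, ... so no entry exceeds maxq-1. */
static void fill_reta(uint8_t reta[RETA_ENTRIES], unsigned int maxq)
{
    unsigned int i, j;

    assert(maxq > 0 && maxq <= 256);
    for (i = 0, j = 0; i < RETA_ENTRIES; i++, j++) {
        if (j == maxq)
            j = 0;
        reta[i] = (uint8_t)j;
    }
}
```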
     
  • The tx_idx and rx_idx values are swapped on 82598 devices
    with DCB enabled.

    Signed-off-by: John Fastabend
    Tested-by: Ross Brattain
    Signed-off-by: Jeff Kirsher

    John Fastabend
     
  • The number of TX and RX queues allocated depends on the device
    type, the current features set, online CPUs, and various
    compile flags.

    To enable DCB with multiple queues and allow it to coexist with
    all the features currently implemented it has to setup a valid
    queue count. This is done at init time using the FDIR and RSS
    max queue counts and allowing each TC to allocate a queue per
    CPU.

    DCB will now use available queues up to (8 x TCs). This is a somewhat
    arbitrary cap but allows DCB to use up to 64 queues. It's easy to
    increase this later if that is needed.

    This is prep work to enable Flow Director with DCB. After this
    DCB can easily coexist with existing features and no longer
    needs its own DCB feature ring.

    Signed-off-by: John Fastabend
    Tested-by: Ross Brattain
    Signed-off-by: Jeff Kirsher

    John Fastabend
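
The arithmetic behind the cap can be sketched as below: each TC gets one queue per online CPU, limited to 8 per TC, so 8 TCs give at most 64 queues. dcb_queue_count() is a hypothetical helper, and the driver's real calculation also takes the FDIR and RSS limits into account.

```c
#include <assert.h>

#define MAX_QUEUES_PER_TC 8 /* the "somewhat arbitrary" cap from the text */

/* Queues used by DCB: min(online CPUs, 8) per traffic class. */
static unsigned int dcb_queue_count(unsigned int online_cpus, unsigned int num_tcs)
{
    unsigned int per_tc = online_cpus < MAX_QUEUES_PER_TC ?
                          online_cpus : MAX_QUEUES_PER_TC;

    return per_tc * num_tcs;
}
```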
     
  • ixgbe devices support different numbers of packet buffers, either
    8 or 4. Here we only allocate the minimal number of packet
    buffers required to implement the net_device's number of traffic
    classes.

    Fewer traffic classes allows for larger packet buffers in
    hardware. Also more Tx/Rx queues can be given to each
    traffic class.

    This patch is mostly about propagating the number of traffic
    classes through the init path. Specifically this adds the 4TC
    cases to the MRQC and MTQC setup routines. Also ixgbe_setup_tc()
    was sanitized to handle other traffic class values.

    Finally changing the number of packet buffers in the hardware
    requires the device to reinit. So this moves the reinit work
    from DCB into the main ixgbe_setup_tc() routine to consolidate
    the reset code. Now dcbnl_xxx ops call ixgbe_setup_tc() to
    configure packet buffers if needed.

    Signed-off-by: John Fastabend
    Tested-by: Ross Brattain
    Signed-off-by: Jeff Kirsher

    John Fastabend
     
  • The MRQC and MTQC registers are configured in the main
    setup path but are also reconfigured in the DCB setup
    path. The DCB path fixes the DCB configuration by configuring
    the SECTXMINIFG gap which is required for DCB pause
    to operate correctly.

    This patch reduces the duplicate code and does all setup
    in ixgbe_setup_mtqc() and ixgbe_setup_mrqc().

    Additionally, this removes the IXGBE_QDE. This write never
    set the WRITE bit in the register so the write was not
    actually doing anything. Also this was to clear the register,
    but it is never set and defaults to zero. If this is
    needed for SRIOV it should be added correctly in a follow
    up patch. But it's never been working so removing it here
    should be OK.

    Signed-off-by: John Fastabend
    Tested-by: Ross Brattain
    Signed-off-by: Jeff Kirsher

    John Fastabend
     
  • Consolidate packet buffer allocation currently being
    done in the DCB path and main path. This allows the
    feature set and packet buffer requirements to be done
    once.

    This is prep work to allow DCB to coexist with other
    features, namely flow director.

    CC: Alexander Duyck
    Signed-off-by: John Fastabend
    Tested-by: Ross Brattain
    Signed-off-by: Jeff Kirsher

    John Fastabend
     
  • Replace duplicated code in if/else branches with single
    check and ixgbe_init_interrupt_scheme().

    Signed-off-by: John Fastabend
    Tested-by: Ross Brattain
    Signed-off-by: Jeff Kirsher

    John Fastabend
     
  • Use standard format for net_device_ops (without &)

    Signed-off-by: Stephen Hemminger
    Acked-by: Greg Rose
    Signed-off-by: Jeff Kirsher

    Stephen Hemminger
     
  • In two places storage for mbx_ops is misidentified as type
    ixgbe_mac_operations.

    Reported-by: Andi Kleen
    Signed-off-by: Greg Rose
    Signed-off-by: Jeff Kirsher

    Greg Rose
     
  • Private rx_csum flags are now duplicate of netdev->features & NETIF_F_RXCSUM.
    Removing this needs deeper surgery.

    Things noticed:
    - HW VLAN acceleration probably can be toggled, but it's left as is
    - the resets on RX csum offload change can probably be avoided
    - there is A LOT of copy-and-pasted code here

    Signed-off-by: Michał Mirosław
    Tested-by: Aaron Brown
    Signed-off-by: Jeff Kirsher

    Michał Mirosław
     
  • Private rx_csum flags are now duplicate of netdev->features & NETIF_F_RXCSUM.
    Removing this needs deeper surgery.

    Things noticed:
    - RX csum disabled by default
    - HW VLAN acceleration probably can be toggled, but it's left as is
    - the resets on RX csum offload change can probably be avoided
    - there is A LOT of copy-and-pasted code here

    Signed-off-by: Michał Mirosław
    Tested-by: Aaron Brown
    Signed-off-by: Jeff Kirsher

    Michał Mirosław
     
  • Conflicts:
    drivers/net/wireless/iwlwifi/iwl-agn-rxon.c
    drivers/net/wireless/rtlwifi/pci.c
    net/netfilter/ipvs/ip_vs_core.c

    David S. Miller
     
  • Linus Torvalds
     
  • Commit 13e12d14e2dc ("vfs: reorganize 'struct inode' layout a bit")
    moved things around a bit and changed i_state to be unsigned int instead of
    unsigned long. That was to help structure layout for the 64-bit case,
    and shrink 'struct inode' a bit (admittedly that only happened when
    spinlock debugging was on and i_flags didn't pack with i_lock).

    However, Meelis Roos reports that this results in unaligned exceptions
    on sparc, and it turns out that the bit-locking primitives that we use
    for the I_NEW bit want to use the bitops. Which want 'unsigned long',
    not 'unsigned int'.

    We really should fix the bit locking code to not have that kind of
    requirement, but that's a much bigger change. So for now, revert that
    field back to 'unsigned long' (but keep the other re-ordering changes
    from the commit that caused this).

    Andi points out that we have played games with this in 'struct page', so
    it's solvable with other hacks too, but since right now the struct inode
    size advantage only happens with some rare config options, it's not
    worth fighting.

    It _would_ be worth fixing the bitlocking code, though. Especially
    since there is no type safety in the bitlocking code (this never caused
    any warnings, and worked fine on x86-64, because the bitlocks take a
    'void *' and x86-64 doesn't care that deeply about alignment). So it's
    currently a very easy problem to trigger by mistake and never notice.

    Reported-by: Meelis Roos
    Cc: Andi Kleen
    Cc: David Miller
    Signed-off-by: Linus Torvalds

    Linus Torvalds
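
The type requirement can be seen in a simplified, non-atomic model of a bit operation: it indexes memory in units of unsigned long, so the word it touches must really be an aligned unsigned long. The helper and struct below are illustrative only; the bit number 3 used in the usage note is arbitrary.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Simplified, non-atomic model: set bit nr and return its previous value. */
static int test_and_set_bit_ul(unsigned int nr, unsigned long *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_LONG);
    unsigned long *word = addr + nr / BITS_PER_LONG;
    int old = (*word & mask) != 0;

    *word |= mask;
    return old;
}

/* The state word is a full unsigned long, matching what the helper expects. */
struct inode_like {
    unsigned long i_state;
};
```

Calling test_and_set_bit_ul(3, &in.i_state) is well defined here. Passing the address of an unsigned int through a 'void *' would compile without a warning but could read or write outside the field, or fault on architectures that enforce alignment.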
     
  • * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
    drm/radeon/kms/r6xx+: voltage fixes
    drm/nouveau: drop leftover debugging
    drm/radeon: avoid warnings from r600/eg irq handlers on powered off card.
    drm/radeon/kms: add missing param for dce3.2 DP transmitter setup
    drm/radeon/kms/atom: fix duallink on some early DCE3.2 cards
    drm/nouveau: fix assumption that semaphore dmaobj is valid in x-chan sync
    drm/nv50/disp: fix gamma with page flipping overlay turned on
    drm/nouveau/pm: Prevent overflow in nouveau_perf_init()
    drm/nouveau: fix big-endian switch

    Linus Torvalds
     
  • * 'msm-fix' of git://codeaurora.org/quic/kernel/davidb/linux-msm:
    msm: timer: Fix DGT rate on 8960 and 8660
    msm: timer: compensate for timer shift in msm_read_timer_count
    msm: timer: Fix SMP build error

    Linus Torvalds
     
  • * 'for-2.6.40' of git://linux-nfs.org/~bfields/linux:
    nfsd4: fix break_lease flags on nfsd open
    nfsd: link returns nfserr_delay when breaking lease
    nfsd: v4 support requires CRYPTO
    nfsd: fix dependency of nfsd on auth_rpcgss

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (40 commits)
    pxa168_eth: fix race in transmit path.
    ipv4, ping: Remove duplicate icmp.h include
    netxen: fix race in skb->len access
    sgi-xp: fix a use after free
    hp100: fix an skb->len race
    netpoll: copy dev name of slaves to struct netpoll
    ipv4: fix multicast losses
    r8169: fix static initializers.
    inet_diag: fix inet_diag_bc_audit()
    gigaset: call module_put before restart of if_open()
    farsync: add module_put to error path in fst_open()
    net: rfs: enable RFS before first data packet is received
    fs_enet: fix freescale FCC ethernet dp buffer alignment
    netdev: bfin_mac: fix memory leak when freeing dma descriptors
    vlan: don't call ndo_vlan_rx_register on hardware that doesn't have vlan support
    caif: Bugfix - XOFF removed channel from caif-mux
    tun: teach the tun/tap driver to support netpoll
    dp83640: drop PHY status frames in the driver.
    dp83640: fix phy status frame event parsing
    phylib: Allow BCM63XX PHY to be selected only on BCM63XX.
    ...

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
    devcgroup_inode_permission: take "is it a device node" checks to inlined wrapper
    fix comment in generic_permission()
    kill obsolete comment for follow_down()
    proc_sys_permission() is OK in RCU mode
    reiserfs_permission() doesn't need to bail out in RCU mode
    proc_fd_permission() doesn't need to bail out in RCU mode
    nilfs2_permission() doesn't need to bail out in RCU mode
    logfs doesn't need ->permission() at all
    coda_ioctl_permission() is safe in RCU mode
    cifs_permission() doesn't need to bail out in RCU mode
    bad_inode_permission() is safe from RCU mode
    ubifs: dereferencing an ERR_PTR in ubifs_mount()

    Linus Torvalds
     
  • 0xff01 is not an actual voltage value, but a flag
    for the driver. If the power state has that value,
    skip setting the voltage.

    Signed-off-by: Alex Deucher
    Signed-off-by: Dave Airlie

    Alex Deucher
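
In code, a check of this kind amounts to comparing against the flag before programming the voltage. The macro and function names below are invented for illustration and are not the driver's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VOLTAGE_FLAG_SKIP 0xff01 /* flag value from the text, not a real voltage */

/* Return true only when the value is an actual voltage to program. */
static bool should_set_voltage(uint16_t voltage)
{
    return voltage != VOLTAGE_FLAG_SKIP;
}
```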
     
  • The DGT runs at 27 MHz divided by 4 on 8660 and 8960.

    Signed-off-by: Stephen Boyd
    Signed-off-by: David Brown

    Stephen Boyd
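
As a worked figure, 27 MHz divided by 4 is 6.75 MHz. The constants below are named for illustration only.

```c
#include <assert.h>

#define DGT_SOURCE_HZ 27000000UL /* 27 MHz source clock */
#define DGT_DIVIDER   4UL

/* Effective DGT tick rate in Hz: 27 MHz / 4 = 6.75 MHz. */
static unsigned long dgt_rate_hz(void)
{
    return DGT_SOURCE_HZ / DGT_DIVIDER;
}
```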