15 Jan, 2011

3 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
    GRETH: resolve SMP issues and other problems
    GRETH: handle frame error interrupts
    GRETH: avoid writing bad speed/duplex when setting transfer mode
    GRETH: fixed skb buffer memory leak on frame errors
    GRETH: GBit transmit descriptor handling optimization
    GRETH: fix opening/closing
    GRETH: added raw AMBA vendor/device number to match against.
    cassini: Fix build bustage on x86.
    e1000e: consistent use of Rx/Tx vs. RX/TX/rx/tx in comments/logs
    e1000e: update Copyright for 2011
    e1000: Avoid unhandled IRQ
    r8169: keep firmware in memory.
    netdev: tilepro: Use is_unicast_ether_addr helper
    etherdevice.h: Add is_unicast_ether_addr function
    ks8695net: Use default implementation of ethtool_ops::get_link
    ks8695net: Disable non-working ethtool operations
    USB CDC NCM: Don't deref NULL in cdc_ncm_rx_fixup() and don't use uninitialized variable.
    vxge: Remember to release firmware after upgrading firmware
    netdev: bfin_mac: Remove is_multicast_ether_addr use in netdev_for_each_mc_addr
    ipsec: update MAX_AH_AUTH_LEN to support sha512
    ...

    Linus Torvalds
     
  • * 'for-2.6.38' of git://linux-nfs.org/~bfields/linux: (62 commits)
    nfsd4: fix callback restarting
    nfsd: break lease on unlink, link, and rename
    nfsd4: break lease on nfsd setattr
    nfsd: don't support msnfs export option
    nfsd4: initialize cb_per_client
    nfsd4: allow restarting callbacks
    nfsd4: simplify nfsd4_cb_prepare
    nfsd4: give out delegations more quickly in 4.1 case
    nfsd4: add helper function to run callbacks
    nfsd4: make sure sequence flags are set after destroy_session
    nfsd4: re-probe callback on connection loss
    nfsd4: set sequence flag when backchannel is down
    nfsd4: keep finer-grained callback status
    rpc: allow xprt_class->setup to return a preexisting xprt
    rpc: keep backchannel xprt as long as server connection
    rpc: move sk_bc_xprt to svc_xprt
    nfsd4: allow backchannel recovery
    nfsd4: support BIND_CONN_TO_SESSION
    nfsd4: modify session list under cl_lock
    Documentation: fl_mylease no longer exists
    ...

    Fix up conflicts in fs/nfsd/vfs.c with the vfs-scale work. The
    vfs-scale work touched some msnfs cases, and this merge removes support
    for that entirely, so the conflict was trivial to resolve.

    Linus Torvalds
     
  • rxrpc_workqueue isn't depended upon while reclaiming memory. Convert
    to alloc_workqueue() without WQ_MEM_RECLAIM.

    Signed-off-by: Tejun Heo
    Signed-off-by: David Howells
    Cc: linux-afs@lists.infradead.org
    Signed-off-by: Linus Torvalds

    Tejun Heo
     

14 Jan, 2011

7 commits

  • After recent changes (percpu stats on vlan/tunnels...), we no longer
    need the per struct netdev_queue tx_bytes/tx_packets/tx_dropped counters.

    The only remaining users are ixgbe, sch_teql, gianfar & macvlan:

    1) ixgbe can be converted to use existing tx_ring counters.

    2) macvlan incremented txq->tx_dropped, it can use the
    dev->stats.tx_dropped counter.

    3) sch_teql : almost revert ab35cd4b8f42 (Use net_device internal stats)
    Now we have ndo_get_stats64(), use it, even for "unsigned long"
    fields (No need to bring back a struct net_device_stats)

    4) gianfar adds a stats structure per tx queue to hold
    tx_bytes/tx_packets

    This removes a lockdep warning (and possible lockup) in rndis gadget,
    calling dev_get_stats() from hard IRQ context.

    Ref: http://www.spinics.net/lists/netdev/msg149202.html

    Reported-by: Neil Jones
    Signed-off-by: Eric Dumazet
    CC: Jarek Poplawski
    CC: Alexander Duyck
    CC: Jeff Kirsher
    CC: Sandeep Gopalpet
    CC: Michal Nazarewicz
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • David S. Miller
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (41 commits)
    fs: add documentation on fallocate hole punching
    Gfs2: fail if we try to use hole punch
    Btrfs: fail if we try to use hole punch
    Ext4: fail if we try to use hole punch
    Ocfs2: handle hole punching via fallocate properly
    XFS: handle hole punching via fallocate properly
    fs: add hole punching to fallocate
    vfs: pass struct file to do_truncate on O_TRUNC opens (try #2)
    fix signedness mess in rw_verify_area() on 64bit architectures
    fs: fix kernel-doc for dcache::prepend_path
    fs: fix kernel-doc for dcache::d_validate
    sanitize ecryptfs ->mount()
    switch afs
    move internal-only parts of ncpfs headers to fs/ncpfs
    switch ncpfs
    switch 9p
    pass default dentry_operations to mount_pseudo()
    switch hostfs
    switch affs
    switch configfs
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (46 commits)
    hwrng: via_rng - Fix memory scribbling on some CPUs
    crypto: padlock - Move padlock.h into include/crypto
    hwrng: via_rng - Fix asm constraints
    crypto: n2 - use __devexit not __exit in n2_unregister_algs
    crypto: mark crypto workqueues CPU_INTENSIVE
    crypto: mv_cesa - dont return PTR_ERR() of wrong pointer
    crypto: ripemd - Set module author and update email address
    crypto: omap-sham - backlog handling fix
    crypto: gf128mul - Remove experimental tag
    crypto: af_alg - fix af_alg memory_allocated data type
    crypto: aesni-intel - Fixed build with binutils 2.16
    crypto: af_alg - Make sure sk_security is initialized on accept()ed sockets
    net: Add missing lockdep class names for af_alg
    include: Install linux/if_alg.h for user-space crypto API
    crypto: omap-aes - checkpatch --file warning fixes
    crypto: omap-aes - initialize aes module once per request
    crypto: omap-aes - unnecessary code removed
    crypto: omap-aes - error handling implementation improved
    crypto: omap-aes - redundant locking is removed
    crypto: omap-aes - DMA initialization fixes for OMAP off mode
    ...

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    rbd: fix cleanup when trying to mount inexistent image
    net/ceph: make ceph_msgr_wq non-reentrant
    ceph: fsc->*_wq's aren't used in memory reclaim path
    ceph: Always free allocated memory in osdmap_decode()
    ceph: Makefile: Remove unnessary code
    ceph: associate requests with opening sessions
    ceph: drop redundant r_mds field
    ceph: implement DIRLAYOUTHASH feature to get dir layout from MDS
    ceph: add dir_layout to inode

    Linus Torvalds
     
  • * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
    Documentation/trace/events.txt: Remove obsolete sched_signal_send.
    writeback: fix global_dirty_limits comment runtime -> real-time
    ppc: fix comment typo singal -> signal
    drivers: fix comment typo diable -> disable.
    m68k: fix comment typo diable -> disable.
    wireless: comment typo fix diable -> disable.
    media: comment typo fix diable -> disable.
    remove doc for obsolete dynamic-printk kernel-parameter
    remove extraneous 'is' from Documentation/iostats.txt
    Fix spelling milisec -> ms in snd_ps3 module parameter description
    Fix spelling mistakes in comments
    Revert conflicting V4L changes
    i7core_edac: fix typos in comments
    mm/rmap.c: fix comment
    sound, ca0106: Fix assignment to 'channel'.
    hrtimer: fix a typo in comment
    init/Kconfig: fix typo
    anon_inodes: fix wrong function name in comment
    fix comment typos concerning "consistent"
    poll: fix a typo in comment
    ...

    Fix up trivial conflicts in:
    - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)
    - fs/ext4/ext4.h

    Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.

    Linus Torvalds
     
  • This patch fixes a loop in ctnetlink_get_conntrack() that can be
    triggered if you use the same socket to receive events and to
    perform a GET operation. Under heavy load, netlink_unicast()
    may return -EAGAIN, but this error code is reserved in nfnetlink for
    module load-on-demand. Instead, we return -ENOBUFS, which is
    the appropriate error code to propagate to
    user-space.

    Reported-by: Holger Eitzenberger
    Signed-off-by: Pablo Neira Ayuso

    Pablo Neira Ayuso
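
    The loop described above can be modelled outside the kernel. This is a
    minimal Python sketch, not the kernel code: dispatch() is a hypothetical
    stand-in for a dispatcher that replays a request whenever the handler
    returns -EAGAIN, and get_conntrack_status() is an assumed name for the
    error translation the commit message describes.

```python
import errno


def get_conntrack_status(unicast_result):
    """Translate the result of the unicast send into the handler's return value.

    -EAGAIN is reserved by the dispatcher to mean "replay the request",
    so it is replaced with -ENOBUFS before being returned.
    """
    if unicast_result == -errno.EAGAIN:
        return -errno.ENOBUFS
    return unicast_result


def dispatch(handler, max_replays=5):
    """Hypothetical dispatcher: call the handler again while it returns -EAGAIN.

    max_replays only exists so that this model terminates; it is not a
    kernel parameter. Returns (final status, number of handler calls).
    """
    calls = 0
    while True:
        calls += 1
        status = handler()
        if status != -errno.EAGAIN or calls >= max_replays:
            return status, calls
```

    A handler that leaks -EAGAIN is replayed until the artificial limit is
    hit, while the translated status ends the request after a single call.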
     

13 Jan, 2011

8 commits

  • Fix new kernel-doc warning (copy-paste typo):

    Warning(net/ethernet/eth.c:366): No description found for parameter 'rxqs'

    Signed-off-by: Randy Dunlap
    Signed-off-by: David S. Miller

    Randy Dunlap
     
  • David S. Miller
     
  • Linux IPv6 forwards unicast packets that are link-layer multicasts...
    The hole has been present since day one. I was 100% sure this check was there, but it is not.

    The problem shows itself, for example, when Microsoft Network Load Balancer runs on a network.
    This software resolves IPv6 unicast addresses to multicast MAC addresses.

    Signed-off-by: Alexey Kuznetsov
    Signed-off-by: David S. Miller

    Alexey Kuznetsov
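
    To illustrate the kind of check that was missing, here is a small Python
    sketch. It assumes the decision can be modelled by looking at the
    group bit of the destination MAC address (the least significant bit of
    the first octet); the actual kernel check may rely on how the frame was
    classified on receive instead. The function names and the sample
    addresses are hypothetical.

```python
def parse_mac(text):
    """Convert 'aa:bb:cc:dd:ee:ff' into six raw bytes."""
    return bytes(int(part, 16) for part in text.split(":"))


def is_link_layer_multicast(mac):
    """A MAC address is a group (multicast/broadcast) address when the
    least significant bit of its first octet is set."""
    return bool(mac[0] & 0x01)


def may_forward(dst_mac):
    """Model of the forwarding decision: only frames that were addressed
    to this host at the link layer (unicast MAC) are candidates for
    forwarding; link-layer multicasts are dropped."""
    return not is_link_layer_multicast(dst_mac)
```

    With this model, a frame sent to a unicast MAC is forwarded, while a
    frame carrying a unicast IPv6 destination inside a multicast MAC frame
    is not.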
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • The ceph messenger code does a rather complex dance around the
    multithreaded workqueue to make sure the same work item isn't executed
    concurrently on different CPUs. This restriction can be provided by a
    workqueue with WQ_NON_REENTRANT.

    Make ceph_msgr_wq a non-reentrant workqueue with the default concurrency
    level and remove the QUEUED/BUSY logic.

    * This removes backoff handling in con_work() but it couldn't reliably
    block execution of con_work() to begin with - queue_con() can be
    called after the work started but before BUSY is set. It seems that
    it was an optimization for a rather cold path and can be safely
    removed.

    * The number of concurrent work items is bounded by the number of
    connections, and connections are independent of each other. With
    the default concurrency level, different connections will be
    executed independently.

    Signed-off-by: Tejun Heo
    Cc: Sage Weil
    Cc: ceph-devel@vger.kernel.org
    Signed-off-by: Sage Weil

    Tejun Heo
     
  • Always free memory allocated to 'pi' in
    net/ceph/osdmap.c::osdmap_decode().

    Signed-off-by: Jesper Juhl
    Signed-off-by: Sage Weil

    Jesper Juhl
     
  • Add a ceph_dir_layout to the inode, and calculate dentry hash values based
    on the parent directory's specified dir_hash function. This is needed
    because the old default Linux dcache hash function is extremely weak and
    leads to a poor distribution of files among dir fragments.

    Signed-off-by: Sage Weil

    Sage Weil
     
  • The IPv6 tproxy patches split IPv6 defragmentation off of conntrack, but
    failed to update the #ifdef stanzas guarding the defragmentation related
    fields and code in skbuff and conntrack related code in nf_defrag_ipv6.c.

    This patch adds the required #ifdefs so that IPv6 tproxy can truly be used
    without connection tracking.

    Original report:
    http://marc.info/?l=linux-netdev&m=129010118516341&w=2

    Reported-by: Randy Dunlap
    Acked-by: Randy Dunlap
    Signed-off-by: KOVACS Krisztian
    Signed-off-by: Pablo Neira Ayuso

    KOVACS Krisztian
     

12 Jan, 2011

12 commits

  • Commit fe10ae53384e48c51996941b7720ee16995cbcb7 adds a memset() to clear
    the structure being sent back to userspace, but accidentally used the
    wrong size.

    Reported-by: Brad Spengler
    Signed-off-by: Kees Cook
    Cc: stable@kernel.org
    Signed-off-by: David S. Miller

    Kees Cook
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (67 commits)
    cxgb4vf: recover from failure in cxgb4vf_open()
    netfilter: ebtables: make broute table work again
    netfilter: fix race in conntrack between dump_table and destroy
    ah: reload pointers to skb data after calling skb_cow_data()
    ah: update maximum truncated ICV length
    xfrm: check trunc_len in XFRMA_ALG_AUTH_TRUNC
    ehea: Increase the skb array usage
    net/fec: remove config FEC2 as it's used nowhere
    pcnet_cs: add new_id
    tcp: disallow bind() to reuse addr/port
    net/r8169: Update the function of parsing firmware
    net: ppp: use {get,put}_unaligned_be{16,32}
    CAIF: Fix IPv6 support in receive path for GPRS/3G
    arp: allow to invalidate specific ARP entries
    net_sched: factorize qdisc stats handling
    mlx4: Call alloc_etherdev to allocate RX and TX queues
    net: Add alloc_netdev_mqs function
    caif: don't set connection request param size before copying data
    cxgb4vf: fix mailbox data/control coherency domain race
    qlcnic: change module parameter permissions
    ...

    Linus Torvalds
     
  • David S. Miller
     
  • * 'nfs-for-2.6.38' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (89 commits)
    NFS fix the setting of exchange id flag
    NFS: Don't use vm_map_ram() in readdir
    NFSv4: Ensure continued open and lockowner name uniqueness
    NFS: Move cl_delegations to the nfs_server struct
    NFS: Introduce nfs_detach_delegations()
    NFS: Move cl_state_owners and related fields to the nfs_server struct
    NFS: Allow walking nfs_client.cl_superblocks list outside client.c
    pnfs: layout roc code
    pnfs: update nfs4_callback_recallany to handle layouts
    pnfs: add CB_LAYOUTRECALL handling
    pnfs: CB_LAYOUTRECALL xdr code
    pnfs: change lo refcounting to atomic_t
    pnfs: check that partial LAYOUTGET return is ignored
    pnfs: add layout to client list before sending rpc
    pnfs: serialize LAYOUTGET(openstateid)
    pnfs: layoutget rpc code cleanup
    pnfs: change how lsegs are removed from layout list
    pnfs: change layout state seqlock to a spinlock
    pnfs: add prefix to struct pnfs_layout_hdr fields
    pnfs: add prefix to struct pnfs_layout_segment fields
    ...

    Linus Torvalds
     
  • The netlink interface to dump the connection tracking table has a race
    when entries are deleted at the same time. A customer reported a crash
    and the backtrace showed that ctnetlink_dump_table was running while a
    conntrack entry was being destroyed.
    (see https://bugzilla.vyatta.com/show_bug.cgi?id=6402).

    According to RCU documentation, when using hlist_nulls the reader
    must handle the case of seeing a deleted entry and not proceed
    further down the linked list. The old code would continue
    which caused the scan to walk into the free list.

    This patch uses locking (rather than RCU) for this operation, which
    is guaranteed safe and no longer requires taking a reference during
    the dump operation.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: Pablo Neira Ayuso

    Stephen Hemminger
     
  • skb_cow_data() may allocate a new data buffer, so pointers into the
    skb data should be set after calling this function.

    Bug was introduced by commit dff3bb06 ("ah4: convert to ahash")
    and 8631e9bd ("ah6: convert to ahash").

    Signed-off-by: Wang Xuefu
    Acked-by: Krzysztof Witek
    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Dang Hongwu
     
  • Maximum trunc length is defined by MAX_AH_AUTH_LEN (in bytes)
    and needs to be checked when this value is set (in bits) by
    the user. In ah4.c and ah6.c a BUG_ON() checks this condition.

    Signed-off-by: Nicolas Dichtel
    Signed-off-by: David S. Miller

    Nicolas Dichtel
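
    The unit mismatch is the important detail here: the user supplies the
    truncation length in bits, while the limit is expressed in bytes. A
    minimal Python sketch of such a validation follows. The value 64 for
    MAX_AH_AUTH_LEN is an assumption made for this example, and
    trunc_len_is_valid() is a hypothetical helper, not a kernel function.

```python
MAX_AH_AUTH_LEN = 64  # bytes; assumed value for illustration only


def trunc_len_is_valid(trunc_len_bits):
    """Return True when a user-supplied truncation length (in bits)
    fits within the byte limit.

    The length is converted from bits to bytes with integer division
    before it is compared against the limit.
    """
    return trunc_len_bits // 8 <= MAX_AH_AUTH_LEN
```

    Rejecting an oversized value at configuration time means the later
    BUG_ON() in the data path can no longer be triggered by user input.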
     
  • inet_csk_bind_conflict() logic currently disallows a bind() if
    it finds a friend socket (a socket bound on same address/port)
    satisfying a set of conditions:

    1) Current (to be bound) socket doesn't have sk_reuse set
    OR
    2) other socket doesn't have sk_reuse set
    OR
    3) other socket is in LISTEN state

    We should add the CLOSE state to condition 3), in order to avoid two
    REUSEADDR sockets in CLOSE state with the same local address/port, since
    this can deny further operations.

    Note : a prior patch tried to address the problem in a different (and
    buggy) way. (commit fda48a0d7a8412ced tcp: bind() fix when many ports
    are bound).

    Reported-by: Gaspar Chilingarov
    Reported-by: Daniel Baluta
    Tested-by: Daniel Baluta
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
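
    The three conditions above, plus the proposed CLOSE addition, can be
    written as a single predicate. This Python sketch is a simplified model
    only: it assumes the friend socket has already been found on the same
    address/port and ignores everything else the real function considers.
    The function name, the string state names and the with_close_fix flag
    are hypothetical.

```python
def bind_conflicts(new_reuse, other_reuse, other_state, with_close_fix=True):
    """Return True when bind() must be refused because of the friend socket.

    new_reuse / other_reuse model sk_reuse on the socket being bound and on
    the socket already bound. other_state is the TCP state of the latter.
    with_close_fix toggles the behaviour proposed in the commit message.
    """
    blocking_states = {"LISTEN"}
    if with_close_fix:
        blocking_states.add("CLOSE")
    return (
        not new_reuse                       # condition 1
        or not other_reuse                  # condition 2
        or other_state in blocking_states   # condition 3 (plus CLOSE)
    )
```

    Two REUSEADDR sockets where the existing one is still in CLOSE state
    conflict only once CLOSE is part of condition 3; without it the second
    bind() is allowed.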
     
  • This allows us to reuse the xprt associated with a server connection if
    one has already been set up.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • Multiple backchannels can share the same tcp connection; from rfc 5661 section
    2.10.3.1:

    A connection's association with a session is not exclusive. A
    connection associated with the channel(s) of one session may be
    simultaneously associated with the channel(s) of other sessions
    including sessions associated with other client IDs.

    However, if multiple backchannels share a connection, they must all share
    the same xid stream (hence the same rpc_xprt); the only way we have to
    match replies with calls at the rpc layer is using the xid.

    So, keep the rpc_xprt around as long as the connection lasts, in case
    we're asked to use the connection as a backchannel again.

    Requests to create new backchannel clients over a given server
    connection should result in creating new clients that reuse the
    existing rpc_xprt.

    But to start, just reject attempts to associate multiple rpc_xprt's with
    the same underlying bc_xprt.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • This seems obviously transport-level information even if it's currently
    used only by the server socket code.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • I made a slight mess of Documentation/filesystems/Locking; resolve
    conflicts with upstream before fixing it up.

    J. Bruce Fields
     

11 Jan, 2011

10 commits

  • Use proper data types for storing the count of the binary blob and
    length of a string. Without this patch, the length calculation of a
    string will always result in -1 because of a comparison between a
    signed and an unsigned integer.

    Signed-off-by: M. Mohan Kumar
    Signed-off-by: Venkateswararao Jujjuri
    Signed-off-by: Eric Van Hensbergen

    M. Mohan Kumar
     
  • Checks version field of IP in the receive path for GPRS/3G data
    and appropriately sets the value of skb->protocol.

    Signed-off-by: Sjur Braendeland
    Signed-off-by: David S. Miller

    Kumar Sanghvi
     
  • IPv4 over firewire needs to be able to remove ARP entries
    from the ARP cache that belong to nodes that are removed, because
    IPv4 over firewire uses ARP packets for private information
    about nodes.

    This information becomes invalid as soon as a node drops
    off the bus, and when it reconnects, it is only possible
    to start talking to it after it has responded to an ARP packet.
    But the ARP cache prevents such packets from being sent.

    Signed-off-by: Maxim Levitsky
    Signed-off-by: David S. Miller

    Maxim Levitsky
     
  • HTB takes into account that an skb may be segmented when updating stats.
    Generalize this to all schedulers.

    They should use qdisc_bstats_update() helper instead of manipulating
    bstats.bytes and bstats.packets

    Add bstats_update() helper too for classes that use
    gnet_stats_basic_packed fields.

    Note: Right now, the TCQ_F_CAN_BYPASS shortcut can be taken only if no
    stab is set up on the qdisc.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
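
    As a rough illustration of what such a helper accounts for, here is a
    Python sketch. It assumes a segmented packet carries a segment count and
    that a non-segmented packet reports zero segments; the BasicStats class
    and the argument names are hypothetical and only mirror the
    bytes/packets fields mentioned above.

```python
from dataclasses import dataclass


@dataclass
class BasicStats:
    """Model of a basic byte/packet counter pair."""
    bytes: int = 0
    packets: int = 0


def bstats_update(stats, pkt_len, gso_segs=0):
    """Account one dequeued packet.

    pkt_len is the total length of the packet. gso_segs is the number of
    segments it will be split into; 0 means the packet is not segmented
    and counts as a single packet.
    """
    stats.bytes += pkt_len
    stats.packets += gso_segs if gso_segs > 0 else 1
```

    Keeping the update in one helper means every scheduler counts a
    segmented packet the same way instead of always adding one.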
     
  • Added alloc_netdev_mqs function which allows the number of transmit and
    receive queues to be specified independently. alloc_netdev_mq was
    changed to a macro to call the new function. Also added
    alloc_etherdev_mqs with same purpose.

    Signed-off-by: Tom Herbert
    Signed-off-by: David S. Miller

    Tom Herbert
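
    The relationship between the two entry points can be sketched as
    follows. This is a Python model under stated assumptions, not the kernel
    API: the real functions take more arguments, and the NetDev class, the
    reduced signatures and the argument validation are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NetDev:
    """Minimal model of a device with separately sized queue sets."""
    name: str
    num_tx_queues: int
    num_rx_queues: int


def alloc_netdev_mqs(name, txqs, rxqs):
    """Allocate a device with independent transmit and receive queue counts."""
    if txqs < 1 or rxqs < 1:
        raise ValueError("a device needs at least one queue of each kind")
    return NetDev(name, txqs, rxqs)


def alloc_netdev_mq(name, count):
    """Older entry point, kept as a thin wrapper: same count for both sides."""
    return alloc_netdev_mqs(name, count, count)
```

    Existing callers of the single-count form keep working unchanged, while
    drivers with asymmetric hardware can pass different counts.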
     
  • The size field should not be set until after the data is successfully
    copied in.

    Signed-off-by: Dan Rosenberg
    Signed-off-by: David S. Miller

    Dan Rosenberg
     
  • Dan Rosenberg pointed out that there were some signed comparison bugs
    in the phonet protocol.

    http://marc.info/?l=full-disclosure&m=129424528425330&w=2

    The problem is that we check for array overflows but "protocol" is
    signed and we don't check for array underflows. If you already
    have CAP_SYS_ADMIN then you could use the bugs to get root, or someone
    could cause an oops by mistake.

    Signed-off-by: Dan Carpenter
    Acked-by: Rémi Denis-Courmont
    Signed-off-by: David S. Miller

    Dan Carpenter
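
    The pattern behind this class of bug is an index that is only compared
    against the upper bound. The Python sketch below is an analogy rather
    than a reproduction: in C a negative index reads memory before the
    array, whereas in Python it silently wraps around to the end of the
    list. Either way the lookup returns something the caller never asked
    for. The table contents and function names are hypothetical.

```python
def lookup_upper_bound_only(table, protocol):
    """Buggy pattern: a negative protocol number passes the check."""
    if protocol >= len(table):
        return None
    return table[protocol]


def lookup_checked(table, protocol):
    """Fixed pattern: reject values below zero as well as values that
    are too large."""
    if not 0 <= protocol < len(table):
        return None
    return table[protocol]
```

    With a three-entry table, an index of -1 slips through the first
    function and returns the last entry, while the second function rejects
    it.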
     
  • Conflicts:
    fs/nfs/nfs2xdr.c
    fs/nfs/nfs3xdr.c
    fs/nfs/nfs4xdr.c

    Trond Myklebust
     
  • vm_map_ram() is not available on NOMMU platforms, and causes trouble
    on incoherent architectures such as ARM when we access the page data
    through both the direct and the virtual mapping.

    The alternative is to use the direct mapping to access page data
    for the case when we are not crossing a page boundary, but to copy
    the data into a linear scratch buffer when we are accessing data
    that spans page boundaries.

    Signed-off-by: Trond Myklebust
    Tested-by: Marc Kleine-Budde
    Cc: stable@kernel.org [2.6.37]

    Trond Myklebust
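
    The access strategy described above (direct access within a page, a
    linear scratch copy when a read spans pages) can be sketched in Python.
    This is a model only: pages are represented as a list of equally sized
    bytes objects, and read_span() and its parameters are hypothetical
    names, not NFS code.

```python
def read_span(pages, offset, length, page_size=4096):
    """Return (view, used_scratch) for `length` bytes starting at `offset`.

    If the range lies inside one page, a view of that page is returned
    without copying. If it crosses a page boundary, the bytes are gathered
    into a contiguous scratch buffer first.
    """
    first_page, start = divmod(offset, page_size)
    if start + length <= page_size:
        # Fast path: the data is contiguous within a single page.
        return memoryview(pages[first_page])[start:start + length], False

    # Slow path: copy piecewise into a linear scratch buffer.
    scratch = bytearray(length)
    copied = 0
    while copied < length:
        page, off = divmod(offset + copied, page_size)
        chunk = min(page_size - off, length - copied)
        scratch[copied:copied + chunk] = pages[page][off:off + chunk]
        copied += chunk
    return memoryview(scratch), True
```

    Most reads stay on the fast path, so the copy cost is only paid for the
    comparatively rare items that straddle two pages.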
     
  • Using "iptables -L" with a lot of rules causes excessive BH latency.
    Jesper mentioned ~6 ms and worried about frame drops.

    Switch to a per_cpu seqlock scheme, so that taking a snapshot of
    counters doesn't need to block BH (for this cpu, but also other cpus).

    This adds two increments on seqlock sequence per ipt_do_table() call,
    which is a reasonable cost for allowing "iptables -L" to not block BH
    processing.

    Reported-by: Jesper Dangaard Brouer
    Signed-off-by: Eric Dumazet
    CC: Patrick McHardy
    Acked-by: Stephen Hemminger
    Acked-by: Jesper Dangaard Brouer
    Signed-off-by: Pablo Neira Ayuso

    Eric Dumazet
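
    The idea of a sequence-protected counter can be shown with a small
    single-threaded Python model. It is a sketch of the general technique
    only, with no memory barriers or real concurrency: the writer bumps the
    sequence before and after each update (so an odd value means "update in
    progress"), and the reader retries until it sees the same even value
    before and after copying. SeqCounter and snapshot_all() are hypothetical
    names.

```python
class SeqCounter:
    """One per-CPU counter pair guarded by a sequence number."""

    def __init__(self):
        self.seq = 0
        self.bytes = 0
        self.packets = 0

    def add(self, nbytes, npackets):
        """Writer side: two sequence increments per update."""
        self.seq += 1          # odd: update in progress
        self.bytes += nbytes
        self.packets += npackets
        self.seq += 1          # even: update complete

    def snapshot(self):
        """Reader side: retry instead of blocking the writer."""
        while True:
            start = self.seq
            if start & 1:
                continue       # writer is mid-update, try again
            result = (self.bytes, self.packets)
            if self.seq == start:
                return result


def snapshot_all(per_cpu):
    """Sum consistent snapshots taken from every per-CPU counter."""
    total_bytes = total_packets = 0
    for counter in per_cpu:
        b, p = counter.snapshot()
        total_bytes += b
        total_packets += p
    return total_bytes, total_packets
```

    The reader never takes a lock that the packet path has to wait for; the
    price is the pair of sequence increments on every update.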