29 Oct, 2010

1 commit


27 Oct, 2010

10 commits


26 Oct, 2010

6 commits

  • Instead of always assigning an increasing inode number in new_inode
    move the call to assign it into those callers that actually need it.
    For now the set of callers that need it is estimated conservatively, that is
    the call is added to all filesystems that do not assign an i_ino
    by themselves. For a few more filesystems we can avoid assigning
    any inode number given that they aren't user visible, and for others
    it could be done lazily when an inode number is actually needed,
    but that's left for later patches.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Dave Chinner
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Since an IB transport port may use either IB or Ethernet as its link layer,
    add the file /sys/class/infiniband//ports//link_layer to
    show the link layer for the port.

    Signed-off-by: Eli Cohen
    Signed-off-by: Roland Dreier

    Eli Cohen
     
  • This patch allows IBoE traffic to be encapsulated in 802.1Q tagged
    VLAN frames. The VLAN tag is encoded in the GID and derived from it
    by a simple computation.

    The netdev notifier callback is modified to catch VLAN device
    addition/removal and the port's GID table is updated to reflect the
    change, so that for each netdevice there is an entry in the GID table.
    When the port's GID table is exhausted, GID entries will not be added.
    Only children of the main interfaces can add to the GID table; if a
    VLAN interface is added on another VLAN interface (e.g. "vconfig add
    eth2.6 8"), then that interface will not add an entry to the GID
    table.

    Signed-off-by: Eli Cohen
    Signed-off-by: Roland Dreier

    Eli Cohen
     
  • Add 802.1q VLAN support to IBoE. The VLAN tag is encoded within the
    GID derived from a link local address in the following way:

    GID[11] GID[12] contain the VLAN ID when the GID contains a VLAN.

    The 3 bits user priority field of the packets are identical to the 3
    bits of the SL.

    In case of rdma_cm apps, the TOS field is used to generate the SL
    field by doing a shift right of 5 bits, effectively taking the 3 MS bits
    of the TOS field.

    Signed-off-by: Eli Cohen
    Signed-off-by: Roland Dreier

    Eli Cohen
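
The encoding described in the 802.1q commit message above can be sketched in a few lines of userspace C. This is only an illustration: the function names and the NO_VLAN sentinel are hypothetical, and the byte order (GID[11] as the high byte, GID[12] as the low byte) is an assumption not stated in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "this GID carries no VLAN". */
#define NO_VLAN 0xffffu

/* SL (and 802.1Q user priority): the 3 most significant bits of TOS. */
static inline uint8_t tos_to_sl(uint8_t tos)
{
    return tos >> 5;
}

/*
 * VLAN ID taken from GID[11] and GID[12].
 * Assumption: GID[11] is the high byte, and any value that does not fit
 * in the 12-bit VLAN ID range means "no VLAN".
 */
static inline uint16_t gid_to_vlan_id(const uint8_t gid[16])
{
    uint16_t vid;

    assert(gid);
    vid = (uint16_t)((gid[11] << 8) | gid[12]);
    return vid < 0x1000 ? vid : NO_VLAN;
}
```

Under these assumptions, a link-local GID built from a plain MAC address (which has ff:fe in bytes 11 and 12) is reported as carrying no VLAN, while a GID with 00:06 in those bytes maps to VLAN 6.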
     
  • Add support for IBoE to mlx4_ib. The bulk of the code is handling the
    new address vector fields; mlx4 needs the MAC address of a remote node
    to include it in a WQE (for datagrams) or in the QP context (for
    connected QPs). Address resolution is done by assuming all unicast
    GIDs are link-local IPv6 addresses.

    Multicast group attach/detach needs to update the NIC's multicast
    filters; but since attaching a QP to a multicast group can be done
    before the QP is bound to a port, for IBoE we need to keep track of
    all multicast groups that a QP is attached to before it transitions
    from INIT to RTR (since it does not have a port in the INIT state).

    Signed-off-by: Eli Cohen

    [ Many things cleaned up and otherwise monkeyed with; hope I didn't
    introduce too many bugs. - Roland ]

    Signed-off-by: Roland Dreier

    Eli Cohen
     
  • Signed-off-by: Eli Cohen
    Signed-off-by: Roland Dreier

    Eli Cohen
     

25 Oct, 2010

5 commits

  • srp_send_tsk_mgmt() was missing the proper DMA sync calls before posting
    the buffer to the device.

    Signed-off-by: David Dillow
    Signed-off-by: Roland Dreier

    David Dillow
     
  • Use the list_first_entry() macro in ib_srp instead of open-coding the equivalent,
    which makes the source code slightly more descriptive. The list_first_entry()
    macro itself was introduced in kernel 2.6.22.

    Signed-off-by: Bart Van Assche
    Signed-off-by: David Dillow
    Signed-off-by: Roland Dreier

    Bart Van Assche
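
For readers unfamiliar with the macro, the following userspace sketch contrasts the open-coded form with list_first_entry(). The list helpers are simplified re-implementations for illustration, and struct demo_iu is a hypothetical stand-in, not the actual ib_srp structure.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace copies of the kernel list primitives. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)
#define list_first_entry(ptr, type, member) \
    list_entry((ptr)->next, type, member)

static inline void list_init(struct list_head *h)
{
    h->next = h->prev = h;
}

static inline void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Hypothetical element type standing in for an ib_srp list member. */
struct demo_iu {
    int index;
    struct list_head list;
};

/* Open-coded variant: the reader has to know that head->next is the first entry. */
static inline struct demo_iu *first_open_coded(struct list_head *head)
{
    assert(head->next != head); /* list must not be empty */
    return list_entry(head->next, struct demo_iu, list);
}

/* Same result, but the intent is spelled out by the macro name. */
static inline struct demo_iu *first_with_macro(struct list_head *head)
{
    assert(head->next != head); /* list must not be empty */
    return list_first_entry(head, struct demo_iu, list);
}
```

Both helpers return the same element; as in the kernel, the macro is only valid on a non-empty list.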
     
  • As proposed by the SRP (draft) standard, ib_srp reserves one ring
    element for SRP_TSK_MGMT requests. This patch makes sure that the SCSI
    mid-layer never tries to queue more than (SRP request limit) - 1 SCSI
    commands to ib_srp. This improves performance for targets whose request
    limit is less than or equal to SRP_NORMAL_REQ_SQ_SIZE by reducing the
    number of BUSY responses reported by ib_srp to the SCSI mid-layer.

    Signed-off-by: Bart Van Assche
    Signed-off-by: David Dillow
    Signed-off-by: Roland Dreier

    Bart Van Assche
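
The limit described above amounts to a small calculation. The sketch below is an assumption about its shape rather than the driver code: demo_can_queue() and DEMO_TSK_MGMT_RESERVED are hypothetical names, and the host limit is passed in as a plain parameter.

```c
#include <assert.h>

/* One ring element is kept back for SRP_TSK_MGMT requests. */
#define DEMO_TSK_MGMT_RESERVED 1

/*
 * Upper bound on the number of SCSI commands the mid-layer may queue:
 * the target's request limit minus the reserved element, capped by the
 * limit the host already advertises.
 */
static inline int demo_can_queue(int req_lim, int host_can_queue)
{
    int limit;

    assert(req_lim >= DEMO_TSK_MGMT_RESERVED);
    limit = req_lim - DEMO_TSK_MGMT_RESERVED;
    return limit < host_can_queue ? limit : host_can_queue;
}
```

With a request limit of 32 and a host limit of 63, the mid-layer would be allowed 31 outstanding commands, so the 32nd credit stays available for task management and fewer commands bounce with BUSY.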
     
  • Signed-off-by: David Dillow
    Signed-off-by: Roland Dreier

    David Dillow
     
  • * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
    Update broken web addresses in arch directory.
    Update broken web addresses in the kernel.
    Revert "drivers/usb: Remove unnecessary return's from void functions" for musb gadget
    Revert "Fix typo: configuation => configuration" partially
    ida: document IDA_BITMAP_LONGS calculation
    ext2: fix a typo on comment in ext2/inode.c
    drivers/scsi: Remove unnecessary casts of private_data
    drivers/s390: Remove unnecessary casts of private_data
    net/sunrpc/rpc_pipe.c: Remove unnecessary casts of private_data
    drivers/infiniband: Remove unnecessary casts of private_data
    drivers/gpu/drm: Remove unnecessary casts of private_data
    kernel/pm_qos_params.c: Remove unnecessary casts of private_data
    fs/ecryptfs: Remove unnecessary casts of private_data
    fs/seq_file.c: Remove unnecessary casts of private_data
    arm: uengine.c: remove C99 comments
    arm: scoop.c: remove C99 comments
    Fix typo configue => configure in comments
    Fix typo: configuation => configuration
    Fix typo interrest[ing|ed] => interest[ing|ed]
    Fix various typos of valid in comments
    ...

    Fix up trivial conflicts in:
    drivers/char/ipmi/ipmi_si_intf.c
    drivers/usb/gadget/rndis.c
    net/irda/irnet/irnet_ppp.c

    Linus Torvalds
     

24 Oct, 2010

5 commits

  • The Node Description cannot be changed via MADs (it is read-only).
    Until now, it was changed in the driver via sysfs, and the new Node
    Description was simply inserted by the driver into MAD responses
    (replacing the description returned by FW).

    System startup scripts use the sysfs interface to change the node
    description at driver startup to show the hostname, etc. However, this
    has a race condition: the SM could discover the original FW node
    description rather than the system-specific description if it queried the
    port before the startup scripts finish running.

    For mlx4, we fix this with a new FW command (SET_NODE) that allows
    passing the new node description to FW. When this command is invoked,
    FW sends a trap 144 to the SM. When it gets this trap, the SM can
    query the node to obtain the new node description -- thus eliminating
    the effects of the race.

    This patch simply calls SET_NODE command when a new node description
    is entered via sysfs (thus causing trap 144 to be issued by the FW).
    We ignore all failures of the SET_NODE command (including those caused
    by using a device FW that predates the SET_NODE command), since in
    that case things work just as before.

    Signed-off-by: Jack Morgenstein
    Signed-off-by: Roland Dreier

    Jack Morgenstein
     
  • Signed-off-by: matt mooney
    Signed-off-by: Roland Dreier

    matt mooney
     
  • For iWARP connections, the connect request is carried in a TCP payload
    on an already established TCP connection. So if the ucma's backlog is
    full, the connection request is transmitted and acked at the TCP level
    by the time the connect request gets dropped in the ucma. The end
    result is the connection gets rejected by the iWARP provider.
    Further, a 32 node 256NP OpenMPI job will generate > 128 connect
    requests on some ranks.

    This patch increases the default max backlog to 1024, and adds a
    sysctl variable so the backlog can be adjusted at run time.

    Signed-off-by: Steve Wise
    Signed-off-by: Roland Dreier

    Steve Wise
     
  • Use the net device's dev_id field to encode the port number of the pci
    device. This can be used to associate a net device with the pci
    device's port. The encoding is: dev_id = port - 1.

    Signed-off-by: Eli Cohen
    Signed-off-by: Roland Dreier

    Eli Cohen
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1699 commits)
    bnx2/bnx2x: Unsupported Ethtool operations should return -EINVAL.
    vlan: Calling vlan_hwaccel_do_receive() is always valid.
    tproxy: use the interface primary IP address as a default value for --on-ip
    tproxy: added IPv6 support to the socket match
    cxgb3: function namespace cleanup
    tproxy: added IPv6 support to the TPROXY target
    tproxy: added IPv6 socket lookup function to nf_tproxy_core
    be2net: Changes to use only priority codes allowed by f/w
    tproxy: allow non-local binds of IPv6 sockets if IP_TRANSPARENT is enabled
    tproxy: added tproxy sockopt interface in the IPV6 layer
    tproxy: added udp6_lib_lookup function
    tproxy: added const specifiers to udp lookup functions
    tproxy: split off ipv6 defragmentation to a separate module
    l2tp: small cleanup
    nf_nat: restrict ICMP translation for embedded header
    can: mcp251x: fix generation of error frames
    can: mcp251x: fix endless loop in interrupt handler if CANINTF_MERRF is set
    can-raw: add msg_flags to distinguish local traffic
    9p: client code cleanup
    rds: make local functions/variables static
    ...

    Fix up conflicts in net/core/dev.c, drivers/net/pcmcia/smc91c92_cs.c and
    drivers/net/wireless/ath/ath9k/debug.c as per David

    Linus Torvalds
     

23 Oct, 2010

7 commits

  • This patch adds support for SRP_CRED_REQ to avoid a lockup by targets
    that use that mechanism to return credits to the initiator. This
    prevents a lockup observed in the field where we would never add the
    credits from the SRP_CRED_REQ to our current count, and would therefore
    never send another command to the target.

    Minimal support for SRP_AER_REQ is also added, as these messages can
    also be used to convey additional credits to the initiator.

    Based upon extensive debugging and code by Bart Van Assche and a bug
    report by Chris Worley.

    Signed-off-by: David Dillow
    Signed-off-by: Roland Dreier

    David Dillow
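
The lockup described in the SRP_CRED_REQ commit above comes down to credit accounting. A minimal model, with hypothetical names (struct demo_target, demo_try_send, demo_process_cred_req) and no claim to match the driver's data structures, could look like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical initiator state: the number of credits currently held. */
struct demo_target {
    int32_t req_lim;
};

/* Sending a command consumes one credit; with none left nothing is sent. */
static inline bool demo_try_send(struct demo_target *t)
{
    assert(t);
    if (t->req_lim <= 0)
        return false;
    t->req_lim--;
    return true;
}

/*
 * Credits returned by the target in an SRP_CRED_REQ (or SRP_AER_REQ)
 * are added to the current count. Ignoring these messages leaves
 * req_lim at zero forever, which is the lockup the commit describes.
 */
static inline void demo_process_cred_req(struct demo_target *t,
                                         int32_t req_lim_delta)
{
    assert(t);
    t->req_lim += req_lim_delta;
}
```

In this model an initiator that has run out of credits can only resume sending once demo_process_cred_req() has been called with a positive delta.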
     
  • The transmit ring in ib_srp (srp_target.tx_ring) is currently only used
    for allocating requests sent by the initiator to the target. This patch
    prepares using that ring for allocation of both requests and responses.
    Also, this patch differentiates the uses of SRP_SQ_SIZE, increases the
    size of the IB send completion queue by one element and reserves one
    transmit ring slot for SRP_TSK_MGMT requests.

    Signed-off-by: Bart Van Assche
    Signed-off-by: David Dillow
    Signed-off-by: Roland Dreier

    Bart Van Assche
     
  • See table 35 in IBA - the header order for RDMA_WRITE_ONLY_WITH_IMMEDIATE
    and SEND_LAST_WITH_IMMEDIATE is different: the RDMA_WRITE_ONLY has
    a RETH header before the immediate data, so we need a different code path
    to extract the immediate data.

    I tested this with a userspace app that does RDMA_WRITE with immediate
    on a QLE7140.

    Signed-off-by: Jason Gunthorpe
    Signed-off-by: Ralph Campbell
    Signed-off-by: Roland Dreier

    Jason Gunthorpe
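
The difference in header order can be illustrated by computing where the immediate data starts relative to the BTH. The sketch assumes the usual IBA header sizes (12-byte BTH, 16-byte RETH); the enum and function names are hypothetical and unrelated to the driver's own definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Header sizes in bytes, assumed from the IBA header definitions. */
enum {
    DEMO_BTH_LEN  = 12, /* Base Transport Header */
    DEMO_RETH_LEN = 16, /* RDMA Extended Transport Header */
};

/* Hypothetical opcode labels for the two cases named in the commit. */
enum demo_opcode {
    DEMO_SEND_LAST_WITH_IMMEDIATE,
    DEMO_RDMA_WRITE_ONLY_WITH_IMMEDIATE,
};

/* Byte offset of the immediate data, measured from the start of the BTH. */
static inline size_t demo_imm_offset(enum demo_opcode op)
{
    switch (op) {
    case DEMO_SEND_LAST_WITH_IMMEDIATE:
        /* Immediate data follows the BTH directly. */
        return DEMO_BTH_LEN;
    case DEMO_RDMA_WRITE_ONLY_WITH_IMMEDIATE:
        /* A RETH sits between the BTH and the immediate data. */
        return DEMO_BTH_LEN + DEMO_RETH_LEN;
    }
    assert(0 && "unknown opcode");
    return 0;
}
```

Reading the immediate value at the SEND offset for an RDMA write would return the first bytes of the RETH instead, which is why a separate code path is needed.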
     
  • The flushing of work requests for user QPs is implemented entirely in
    the user mode library. The only kernel interaction is to mark the
    user QP object indicating it is in error when the QP exits RTS. When
    the user QP operations are called by the application (e.g. post_send,
    post_recv), the QP in error bit is checked and if set, the library
    flushes the QP. If, however, the application is not doing IO, but
    rather just polling the CQ, it will never get flushed work requests.
    This breaks some classes of applications.

    This patch adds logic to mark user CQs in error when a QP that is bound
    to the CQ is marked in error. The library poll code can then notice
    the CQ is in error and flush all the in error QPs bound to that CQ.

    Design:

    - add 1 extra CQE entry to the CQ memory that will be used to indicate
    in error status.
    - return the desired CQ memory size that should be mapped by the library
    - bump the ABI since the create_cq uverbs response changes.
    - detect older libraries and reduce the mmap size accordingly.
    (The ABI bump doesn't break old libraries, since they didn't check
    the ABI field anyway)

    Signed-off-by: Steve Wise
    Signed-off-by: Roland Dreier

    Steve Wise
     
  • Remove the local service t4_pktgl_to_skb() and use cxgb4_pktgl_to_skb()
    exported by cxgb4.

    Signed-off-by: Steve Wise
    Signed-off-by: Roland Dreier

    Steve Wise
     
  • Signed-off-by: Steve Wise
    Signed-off-by: Roland Dreier

    Steve Wise
     
  • * 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
    vfs: make no_llseek the default
    vfs: don't use BKL in default_llseek
    llseek: automatically add .llseek fop
    libfs: use generic_file_llseek for simple_attr
    mac80211: disallow seeks in minstrel debug code
    lirc: make chardev nonseekable
    viotape: use noop_llseek
    raw: use explicit llseek file operations
    ibmasmfs: use generic_file_llseek
    spufs: use llseek in all file operations
    arm/omap: use generic_file_llseek in iommu_debug
    lkdtm: use generic_file_llseek in debugfs
    net/wireless: use generic_file_llseek in debugfs
    drm: use noop_llseek

    Linus Torvalds
     

18 Oct, 2010

1 commit

    The patch below updates broken web addresses in the kernel.

    Signed-off-by: Justin P. Mattock
    Cc: Maciej W. Rozycki
    Cc: Geert Uytterhoeven
    Cc: Finn Thain
    Cc: Randy Dunlap
    Cc: Matt Turner
    Cc: Dimitry Torokhov
    Cc: Mike Frysinger
    Acked-by: Ben Pfaff
    Acked-by: Hans J. Koch
    Reviewed-by: Finn Thain
    Signed-off-by: Jiri Kosina

    Justin P. Mattock
     

17 Oct, 2010

1 commit

  • Fix kconfig dependency warning to satisfy dependencies:

    warning: (MLX4_EN && NETDEVICES && NETDEV_10000 && PCI && INET || MLX4_INFINIBAND && INFINIBAND) selects MLX4_CORE which has unmet direct dependencies (NETDEVICES && NETDEV_10000 && PCI)

    Signed-off-by: Randy Dunlap
    Signed-off-by: David S. Miller

    Randy Dunlap
     

15 Oct, 2010

2 commits

  • All file_operations should get a .llseek operation so we can make
    nonseekable_open the default for future file operations without a
    .llseek pointer.

    The three cases that we can automatically detect are no_llseek, seq_lseek
    and default_llseek. For cases where we can automatically prove that
    the file offset is always ignored, we use noop_llseek, which maintains
    the current behavior of not returning an error from a seek.

    New drivers should normally not use noop_llseek but instead use no_llseek
    and call nonseekable_open at open time. Existing drivers can be converted
    to do the same when the maintainer knows for certain that no user code
    relies on calling seek on the device file.

    The generated code is often incorrectly indented and right now contains
    comments that clarify for each added line why a specific variant was
    chosen. In the version that gets submitted upstream, the comments will
    be gone and I will manually fix the indentation, because there does not
    seem to be a way to do that using coccinelle.

    Some amount of new code is currently sitting in linux-next that should get
    the same modifications, which I will do at the end of the merge window.

    Many thanks to Julia Lawall for helping me learn to write a semantic
    patch that does all this.

    ===== begin semantic patch =====
    // This adds an llseek= method to all file operations,
    // as a preparation for making no_llseek the default.
    //
    // The rules are
    // - use no_llseek explicitly if we do nonseekable_open
    // - use seq_lseek for sequential files
    // - use default_llseek if we know we access f_pos
    // - use noop_llseek if we know we don't access f_pos,
    // but we still want to allow users to call lseek
    //
    @ open1 exists @
    identifier nested_open;
    @@
    nested_open(...)
    {

    }

    @ open exists@
    identifier open_f;
    identifier i, f;
    identifier open1.nested_open;
    @@
    int open_f(struct inode *i, struct file *f)
    {

    }

    @ read disable optional_qualifier exists @
    identifier read_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    expression E;
    identifier func;
    @@
    ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
    {

    }

    @ read_no_fpos disable optional_qualifier exists @
    identifier read_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    @@
    ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
    {
    ... when != off
    }

    @ write @
    identifier write_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    expression E;
    identifier func;
    @@
    ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
    {

    }

    @ write_no_fpos @
    identifier write_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    @@
    ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
    {
    ... when != off
    }

    @ fops0 @
    identifier fops;
    @@
    struct file_operations fops = {
    ...
    };

    @ has_llseek depends on fops0 @
    identifier fops0.fops;
    identifier llseek_f;
    @@
    struct file_operations fops = {
    ...
    .llseek = llseek_f,
    ...
    };

    @ has_read depends on fops0 @
    identifier fops0.fops;
    identifier read_f;
    @@
    struct file_operations fops = {
    ...
    .read = read_f,
    ...
    };

    @ has_write depends on fops0 @
    identifier fops0.fops;
    identifier write_f;
    @@
    struct file_operations fops = {
    ...
    .write = write_f,
    ...
    };

    @ has_open depends on fops0 @
    identifier fops0.fops;
    identifier open_f;
    @@
    struct file_operations fops = {
    ...
    .open = open_f,
    ...
    };

    // use no_llseek if we call nonseekable_open
    ////////////////////////////////////////////
    @ nonseekable1 depends on !has_llseek && has_open @
    identifier fops0.fops;
    identifier nso ~= "nonseekable_open";
    @@
    struct file_operations fops = {
    ... .open = nso, ...
    +.llseek = no_llseek, /* nonseekable */
    };

    @ nonseekable2 depends on !has_llseek @
    identifier fops0.fops;
    identifier open.open_f;
    @@
    struct file_operations fops = {
    ... .open = open_f, ...
    +.llseek = no_llseek, /* open uses nonseekable */
    };

    // use seq_lseek for sequential files
    /////////////////////////////////////
    @ seq depends on !has_llseek @
    identifier fops0.fops;
    identifier sr ~= "seq_read";
    @@
    struct file_operations fops = {
    ... .read = sr, ...
    +.llseek = seq_lseek, /* we have seq_read */
    };

    // use default_llseek if there is a readdir
    ///////////////////////////////////////////
    @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier readdir_e;
    @@
    // any other fop is used that changes pos
    struct file_operations fops = {
    ... .readdir = readdir_e, ...
    +.llseek = default_llseek, /* readdir is present */
    };

    // use default_llseek if at least one of read/write touches f_pos
    /////////////////////////////////////////////////////////////////
    @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read.read_f;
    @@
    // read fops use offset
    struct file_operations fops = {
    ... .read = read_f, ...
    +.llseek = default_llseek, /* read accesses f_pos */
    };

    @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier write.write_f;
    @@
    // write fops use offset
    struct file_operations fops = {
    ... .write = write_f, ...
    + .llseek = default_llseek, /* write accesses f_pos */
    };

    // Use noop_llseek if neither read nor write accesses f_pos
    ///////////////////////////////////////////////////////////

    @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read_no_fpos.read_f;
    identifier write_no_fpos.write_f;
    @@
    // neither read nor write fops use offset
    struct file_operations fops = {
    ...
    .write = write_f,
    .read = read_f,
    ...
    +.llseek = noop_llseek, /* read and write both use no f_pos */
    };

    @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier write_no_fpos.write_f;
    @@
    struct file_operations fops = {
    ... .write = write_f, ...
    +.llseek = noop_llseek, /* write uses no f_pos */
    };

    @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read_no_fpos.read_f;
    @@
    struct file_operations fops = {
    ... .read = read_f, ...
    +.llseek = noop_llseek, /* read uses no f_pos */
    };

    @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    @@
    struct file_operations fops = {
    ...
    +.llseek = noop_llseek, /* no read or write fn */
    };
    ===== End semantic patch =====

    Signed-off-by: Arnd Bergmann
    Cc: Julia Lawall
    Cc: Christoph Hellwig

    Arnd Bergmann
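
The precedence of the rules in the semantic patch above can be summarised as a plain decision function. This is a reading of the rule order as quoted, not part of the patch; struct demo_fops_facts and the enum values are hypothetical names for the facts the rules match on.

```c
#include <assert.h>
#include <stdbool.h>

/* What the semantic patch can determine about one file_operations struct. */
struct demo_fops_facts {
    bool has_llseek;             /* .llseek is already set */
    bool calls_nonseekable_open; /* .open is or calls nonseekable_open */
    bool uses_seq_read;          /* .read is seq_read */
    bool has_readdir;            /* .readdir is present */
    bool read_or_write_uses_pos; /* read or write touches the offset */
};

enum demo_llseek {
    DEMO_KEEP_EXISTING,
    DEMO_NO_LLSEEK,
    DEMO_SEQ_LSEEK,
    DEMO_DEFAULT_LLSEEK,
    DEMO_NOOP_LLSEEK,
};

/* Pick the .llseek variant, checking the rules in the order they appear. */
static inline enum demo_llseek
demo_pick_llseek(const struct demo_fops_facts *f)
{
    assert(f);
    if (f->has_llseek)
        return DEMO_KEEP_EXISTING;
    if (f->calls_nonseekable_open)
        return DEMO_NO_LLSEEK;
    if (f->uses_seq_read)
        return DEMO_SEQ_LSEEK;
    if (f->has_readdir || f->read_or_write_uses_pos)
        return DEMO_DEFAULT_LLSEEK;
    /* Offset provably ignored: keep seeks succeeding as before. */
    return DEMO_NOOP_LLSEEK;
}
```

The final fallback matches the commit text: noop_llseek preserves the current behavior of not returning an error from a seek when the file offset is never used.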
     
  • Add support for packing IBoE packet headers.

    Signed-off-by: Eli Cohen

    [ Clean up and fix ib_ud_header_init() a bit. - Roland ]

    Signed-off-by: Roland Dreier

    Eli Cohen
     

14 Oct, 2010

2 commits

  • Add support for IBoE device binding and IP --> GID resolution. Path
    resolving and multicast joining are implemented within cma.c by
    filling in the responses and running callbacks in the CMA work queue.

    IP --> GID resolution always yields IPv6 link local addresses; remote
    GIDs are derived from the destination MAC address of the remote port.
    Multicast GIDs are always mapped to multicast MACs as is done in IPv6.
    (IPv4 multicast is enabled by translating IPv4 multicast addresses to
    IPv6 multicast as described in
    .)

    Some helper functions are added to ib_addr.h.

    Signed-off-by: Eli Cohen
    Signed-off-by: Roland Dreier

    Eli Cohen
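
The two mappings mentioned in the commit message above can be sketched as follows. The code assumes the standard IPv6 constructions (a modified EUI-64 interface identifier for the link-local address, and the 33:33 prefix for multicast MACs); the helper names are hypothetical and are not the functions added to ib_addr.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Build a link-local GID (fe80::/64 plus a modified EUI-64 interface
 * identifier) from a 6-byte MAC address.
 */
static inline void demo_mac_to_ll_gid(const uint8_t mac[6], uint8_t gid[16])
{
    assert(mac && gid);
    memset(gid, 0, 16);
    gid[0] = 0xfe;
    gid[1] = 0x80;
    memcpy(gid + 8, mac, 3);      /* first half of the MAC */
    gid[8] ^= 0x02;               /* flip the universal/local bit */
    gid[11] = 0xff;
    gid[12] = 0xfe;
    memcpy(gid + 13, mac + 3, 3); /* second half of the MAC */
}

/*
 * Map a multicast GID to a multicast MAC the way IPv6 does:
 * 33:33 followed by the last four bytes of the address.
 */
static inline void demo_mgid_to_mac(const uint8_t mgid[16], uint8_t mac[6])
{
    assert(mgid && mac);
    mac[0] = 0x33;
    mac[1] = 0x33;
    memcpy(mac + 2, mgid + 12, 4);
}
```

Because the interface identifier embeds the MAC address, the construction can also be reversed to recover a remote port's MAC from its GID.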
     
  • Since IBoE is using Ethernet as its link layer, there is no central
    management entity so there is no need for QP0. QP1 is still needed since
    it handles communications between CM agents. This patch will skip QP0
    and create only QP1 for IBoE ports.

    Signed-off-by: Eli Cohen
    Signed-off-by: Roland Dreier

    Eli Cohen