03 Jan, 2013

1 commit

  • Pull Ceph fixes from Sage Weil:
    "Two of Alex's patches deal with a race when resetting server
    connections for open RBD images, one demotes some non-fatal BUGs to
    WARNs, and my patch fixes a protocol feature bit failure path."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
    libceph: fix protocol feature mismatch failure path
    libceph: WARN, don't BUG on unexpected connection states
    libceph: always reset osds when kicking
    libceph: move linger requests sooner in kick_requests()

    Linus Torvalds
     

28 Dec, 2012

4 commits

  • We should not set con->state to CLOSED here; that happens in
    ceph_fault() in the caller, where it first asserts that the state
    is not yet CLOSED. Avoids a BUG when the features don't match.

    Since fail_protocol() has become a trivial wrapper, replace
    calls to it with direct calls to reset_connection().

    Signed-off-by: Sage Weil
    Reviewed-by: Alex Elder

    Sage Weil
     
  • A number of assertions in the ceph messenger are implemented with
    BUG_ON(), killing the system if a connection's state doesn't match
    what's expected. At this point our state model is (evidently) not
    well understood enough for these assertions to trigger a BUG().
    Convert all BUG_ON(con->state...) calls to be WARN_ON(con->state...)
    so we learn about these issues without killing the machine.

    We now recognize that a connection fault can occur due to a socket
    closure at any time, regardless of the state of the connection. So
    there is really nothing we can assert about the state of the
    connection at that point; eliminate that assertion.
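
    The difference between the two assertion styles can be sketched in
    userspace. This is only a model: MODEL_WARN_ON(), MODEL_BUG_ON(),
    the state names and handle_open_event() are hypothetical stand-ins,
    not the kernel's definitions or the messenger's code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical connection states, for illustration only. */
enum con_state { CON_CLOSED, CON_OPEN, CON_STANDBY };

static int warn_count; /* number of warnings reported so far */

/* Report an unexpected condition and keep running. Returns cond so
 * that callers can branch on it. */
static int model_warn(int cond, const char *expr)
{
    assert(expr != NULL);
    if (cond) {
        warn_count++;
        fprintf(stderr, "WARNING: %s\n", expr);
    }
    return cond;
}
#define MODEL_WARN_ON(cond) model_warn(!!(cond), #cond)

/* Report an unexpected condition and stop the whole program. */
#define MODEL_BUG_ON(cond)                          \
    do {                                            \
        if (cond) {                                 \
            fprintf(stderr, "BUG: %s\n", #cond);    \
            abort();                                \
        }                                           \
    } while (0)

/* Returns 0 on success, -1 if the state was not the expected one. */
static int handle_open_event(enum con_state state)
{
    if (MODEL_WARN_ON(state != CON_OPEN))
        return -1; /* we learn about the problem but survive it */
    return 0;
}
```

    With the warning form, a call such as handle_open_event(CON_STANDBY)
    logs the unexpected state and returns an error, where the BUG form
    would have terminated the program.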

    Reported-by: Ugis
    Tested-by: Ugis
    Signed-off-by: Alex Elder
    Reviewed-by: Sage Weil

    Alex Elder
     
  • When ceph_osdc_handle_map() is called to process a new osd map,
    kick_requests() is called to ensure all affected requests are
    updated if necessary to reflect changes in the osd map. This
    happens in two cases: whenever an incremental map update is
    processed; and when a full map update (or the last one if there is
    more than one) gets processed.

    In the former case, the kick_requests() call is followed immediately
    by a call to reset_changed_osds() to ensure any connections to osds
    affected by the map change are reset. But for full map updates
    this isn't done.

    Both cases should be doing this osd reset.

    Rather than duplicating the reset_changed_osds() call, move it into
    the end of kick_requests().

    Signed-off-by: Alex Elder
    Reviewed-by: Sage Weil

    Alex Elder
     
  • The kick_requests() function is called by ceph_osdc_handle_map()
    when an osd map change has been indicated. Its purpose is to
    re-queue any request whose target osd is different from what it
    was when it was originally sent.

    It is structured as two loops, one for incomplete but registered
    requests, and a second for handling completed linger requests.
    As a special case, in the first loop if a request marked to linger
    has not yet completed, it is moved from the request list to the
    linger list. This is a quick and dirty way to have the second
    loop handle sending the request along with all the other linger
    requests.

    Because of the way it's done now, however, this quick and dirty
    solution can result in these incomplete linger requests never
    getting re-sent as desired. The problem lies in the fact that
    the second loop only arranges for a linger request to be sent
    if it appears its target osd has changed. This is the proper
    handling for *completed* linger requests (it avoids issuing
    the same linger request twice to the same osd).

    But although the linger requests added to the list in the first loop
    may have been sent, they have not yet completed, so they need to be
    re-sent regardless of whether their target osd has changed.

    The first required fix is to avoid calling __map_request()
    on any incomplete linger request. Otherwise the subsequent
    __map_request() call in the second loop will find the target osd
    has not changed and will therefore not re-send the request.

    Second, we need to be sure that a sent but incomplete linger request
    gets re-sent. If the target osd is the same with the new osd map as
    it was when the request was originally sent, this won't happen.
    This can be fixed through careful handling when we move these
    requests from the request list to the linger list, by unregistering
    the request *before* it is registered as a linger request. This
    works because a side-effect of unregistering the request is to make
    the request's r_osd pointer be NULL, and *that* will ensure the
    second loop actually re-sends the linger request.

    Processing of such a request is done at that point, so continue with
    the next one once it's been moved.

    Signed-off-by: Alex Elder
    Reviewed-by: Sage Weil

    Alex Elder
     

27 Dec, 2012

6 commits

  • ip6gre_xmit2() incorrectly sets transport header to inner payload
    instead of GRE header. It seems copy-and-pasted from ipip.c.
    Set transport header to gre header.
    (In ipip case the transport header is the inner ip header, so that's
    correct.)

    Found by inspection. In practice the incorrect transport header
    doesn't matter because the skb usually is sent to another net_device
    or socket, so the transport header isn't referenced.

    Signed-off-by: Isaku Yamahata
    Signed-off-by: David S. Miller

    Isaku Yamahata
     
  • ipgre_tunnel_xmit() incorrectly sets transport header to inner payload
    instead of GRE header. It seems copy-and-pasted from ipip.c.
    So set transport header to gre header.
    (In ipip case the transport header is the inner ip header, so that's
    correct.)

    Found by inspection. In practice the incorrect transport header
    doesn't matter because the skb usually is sent to another net_device
    or socket, so the transport header isn't referenced.

    Signed-off-by: Isaku Yamahata
    Signed-off-by: David S. Miller

    Isaku Yamahata
     
  • Add an else to only print the incompatible protocol message
    when version hasn't been established.

    Signed-off-by: Mike Marciniszyn
    Signed-off-by: David S. Miller

    Marciniszyn, Mike
     
  • 0b088e00 ("RDS: Use page_remainder_alloc() for recv bufs")
    added uses of sg_dma_len() and sg_dma_address(). This makes
    RDS DOA with the qib driver.

    IB ULPs should use ib_sg_dma_len() and ib_sg_dma_address()
    respectively since some HCAs overload ib_sg_dma* operations.

    Signed-off-by: Mike Marciniszyn
    Signed-off-by: David S. Miller

    Marciniszyn, Mike
     
  • In commit 96e0bf4b5193d (tcp: Discard segments that ack data not yet
    sent) John Dykstra enforced a check against ack sequences.

    In commit 354e4aa391ed5 (tcp: RFC 5961 5.2 Blind Data Injection Attack
    Mitigation) I added more safety tests.

    But we missed the fact that these tests are not performed if the ACK
    bit is not set.

    RFC 793 3.9 mandates TCP should drop a frame without ACK flag set.

    " fifth check the ACK field,
    if the ACK bit is off drop the segment and return"

    Not doing so permits an attacker to only guess an acceptable sequence
    number, evading stronger checks.

    Many thanks to Zhiyun Qian for bringing this issue to our attention.

    See :
    http://web.eecs.umich.edu/~zhiyunq/pub/ccs12_TCP_sequence_number_inference.pdf

    Reported-by: Zhiyun Qian
    Signed-off-by: Eric Dumazet
    Cc: Nandita Dukkipati
    Cc: Neal Cardwell
    Cc: John Dykstra
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • batadv_iv_ogm_emit_send_time() attempts to calculate a random integer
    in the range of 'orig_interval +- BATADV_JITTER' by the below lines.

    msecs = atomic_read(&bat_priv->orig_interval) - BATADV_JITTER;
    msecs += (random32() % 2 * BATADV_JITTER);

    But it actually gets 'orig_interval' or 'orig_interval - BATADV_JITTER'
    because '%' and '*' have the same precedence and associativity is
    left-to-right.

    This adds the parentheses at the appropriate position so that it matches
    the original intention.
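
    The effect of the missing parentheses can be reproduced in a small
    userspace sketch. The function names and the JITTER constant are
    stand-ins chosen for illustration, and the random value is passed
    in as a parameter so the result is deterministic.

```c
#include <assert.h>

#define JITTER 20 /* stand-in for BATADV_JITTER, value for illustration */

/* Buggy form: parsed as (rnd % 2) * JITTER, so the added offset is
 * either 0 or JITTER and nothing in between. */
static unsigned int send_time_buggy(unsigned int interval, unsigned int rnd)
{
    unsigned int msecs;

    assert(interval >= JITTER);
    msecs = interval - JITTER;
    msecs += (rnd % 2 * JITTER);
    return msecs;
}

/* Parenthesized form: the added offset covers 0 .. 2 * JITTER - 1. */
static unsigned int send_time_fixed(unsigned int interval, unsigned int rnd)
{
    unsigned int msecs;

    assert(interval >= JITTER);
    msecs = interval - JITTER;
    msecs += rnd % (2 * JITTER);
    return msecs;
}
```

    For an interval of 1000, the buggy form can only return 980 or
    1000, while the parenthesized form returns values from 980 up to
    1019.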

    Signed-off-by: Akinobu Mita
    Acked-by: Antonio Quartulli
    Cc: Marek Lindner
    Cc: Simon Wunderlich
    Cc: Antonio Quartulli
    Cc: b.a.t.m.a.n@lists.open-mesh.org
    Cc: "David S. Miller"
    Cc: netdev@vger.kernel.org
    Signed-off-by: David S. Miller

    Akinobu Mita
     

25 Dec, 2012

1 commit

  • Sedat reported the following commit caused a regression:

    commit 9650388b5c56578fdccc79c57a8c82fb92b8e7f1
    Author: Eric Dumazet
    Date: Fri Dec 21 07:32:10 2012 +0000

    ipv4: arp: fix a lockdep splat in arp_solicit

    This is because the 6th parameter of arp_send() needs to be NULL
    for the broadcast case; the above commit changed it to an all-zero
    array by mistake.
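
    Why NULL and an all-zero array are not interchangeable can be
    shown with a userspace model. The helper fill_dest_hw() and the
    constants are hypothetical; the sketch assumes only what is stated
    above, namely that a NULL pointer selects the broadcast case.

```c
#include <assert.h>
#include <string.h>

#define HW_ALEN 6 /* Ethernet-sized hardware address, for illustration */

static const unsigned char bcast_addr[HW_ALEN] = {
    0xff, 0xff, 0xff, 0xff, 0xff, 0xff
};

/* Copy the link-layer destination into out. NULL selects broadcast;
 * any non-NULL pointer is used verbatim, even if it holds only zeros. */
static void fill_dest_hw(unsigned char *out, const unsigned char *dest_hw)
{
    assert(out != NULL);
    if (dest_hw == NULL)
        dest_hw = bcast_addr;
    memcpy(out, dest_hw, HW_ALEN);
}
```

    Passing a zeroed array therefore produces a frame addressed to
    00:00:00:00:00:00 in this model, not a broadcast.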

    Reported-by: Sedat Dilek
    Tested-by: Sedat Dilek
    Cc: Sedat Dilek
    Cc: Eric Dumazet
    Cc: David S. Miller
    Cc: Julian Anastasov
    Signed-off-by: Cong Wang
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Cong Wang
     

22 Dec, 2012

7 commits

  • Fixed integer overflow in function htb_dequeue

    Signed-off-by: Stefan Hasko
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Stefan Hasko
     
  • CONFIG_HOTPLUG is always enabled now, so remove the unused code that was
    trying to be compiled out when this option was disabled, in the
    networking core.

    Cc: Bill Pemberton
    Signed-off-by: Greg Kroah-Hartman
    Signed-off-by: David S. Miller

    Greg KH
     
  • When netdev_set_master() fails in br_add_if(), we should
    call br_netpoll_disable() to do some cleanup, such
    as freeing the memory of the struct netpoll that was allocated
    in br_netpoll_enable().
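
    The cleanup pattern can be sketched in userspace as a goto-based
    unwind. All names here (struct port, poll_enable(), poll_disable(),
    add_port()) are hypothetical stand-ins, and the failing step is
    simulated by a flag.

```c
#include <assert.h>
#include <stdlib.h>

static int live_allocs; /* outstanding allocations, for the demo */

struct port {
    void *poll_ctx; /* stand-in for the per-port netpoll state */
};

/* Allocate the poll context. Returns 0 on success, -1 on failure. */
static int poll_enable(struct port *p)
{
    p->poll_ctx = malloc(32);
    if (p->poll_ctx == NULL)
        return -1;
    live_allocs++;
    return 0;
}

/* Release the poll context if one is attached. */
static void poll_disable(struct port *p)
{
    if (p->poll_ctx != NULL) {
        free(p->poll_ctx);
        p->poll_ctx = NULL;
        live_allocs--;
    }
}

/* set_master_fails simulates a failure of the step after poll_enable(). */
static int add_port(struct port *p, int set_master_fails)
{
    int err;

    assert(p != NULL);
    err = poll_enable(p);
    if (err)
        goto out;

    if (set_master_fails) {
        err = -1;
        goto err_disable; /* undo what poll_enable() did */
    }
    return 0;

err_disable:
    poll_disable(p);
out:
    return err;
}
```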

    Signed-off-by: Gao feng
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller

    Gao feng
     
  • Yan Burman reported following lockdep warning :

    =============================================
    [ INFO: possible recursive locking detected ]
    3.7.0+ #24 Not tainted
    ---------------------------------------------
    swapper/1/0 is trying to acquire lock:
    (&n->lock){++--..}, at: [] __neigh_event_send+0x2e/0x2f0

    but task is already holding lock:
    (&n->lock){++--..}, at: [] arp_solicit+0x1d4/0x280

    other info that might help us debug this:
    Possible unsafe locking scenario:

    CPU0
    ----
    lock(&n->lock);
    lock(&n->lock);

    *** DEADLOCK ***

    May be due to missing lock nesting notation

    4 locks held by swapper/1/0:
    #0: (((&n->timer))){+.-...}, at: [] call_timer_fn+0x0/0x1c0
    #1: (&n->lock){++--..}, at: [] arp_solicit+0x1d4/0x280
    #2: (rcu_read_lock_bh){.+....}, at: [] dev_queue_xmit+0x0/0x5d0
    #3: (rcu_read_lock_bh){.+....}, at: [] ip_finish_output+0x13e/0x640

    stack backtrace:
    Pid: 0, comm: swapper/1 Not tainted 3.7.0+ #24
    Call Trace:
    [] validate_chain+0xdcc/0x11f0
    [] ? __lock_acquire+0x440/0xc30
    [] ? kmem_cache_free+0xe5/0x1c0
    [] __lock_acquire+0x440/0xc30
    [] ? inet_getpeer+0x40/0x600
    [] ? __lock_acquire+0x440/0xc30
    [] ? __neigh_event_send+0x2e/0x2f0
    [] lock_acquire+0x95/0x140
    [] ? __neigh_event_send+0x2e/0x2f0
    [] ? __lock_acquire+0x440/0xc30
    [] _raw_write_lock_bh+0x3b/0x50
    [] ? __neigh_event_send+0x2e/0x2f0
    [] __neigh_event_send+0x2e/0x2f0
    [] neigh_resolve_output+0x16b/0x270
    [] ip_finish_output+0x34d/0x640
    [] ? ip_finish_output+0x13e/0x640
    [] ? vxlan_xmit+0x556/0xbec [vxlan]
    [] ip_output+0x80/0xf0
    [] ip_local_out+0x28/0x80
    [] vxlan_xmit+0x66a/0xbec [vxlan]
    [] ? vxlan_xmit+0x556/0xbec [vxlan]
    [] ? skb_gso_segment+0x2b0/0x2b0
    [] ? _raw_spin_unlock_irqrestore+0x65/0x80
    [] ? dev_queue_xmit_nit+0x207/0x270
    [] dev_hard_start_xmit+0x298/0x5d0
    [] dev_queue_xmit+0x2f3/0x5d0
    [] ? dev_hard_start_xmit+0x5d0/0x5d0
    [] arp_xmit+0x58/0x60
    [] arp_send+0x3b/0x40
    [] arp_solicit+0x204/0x280
    [] ? neigh_add+0x310/0x310
    [] neigh_probe+0x45/0x70
    [] neigh_timer_handler+0x1a0/0x2a0
    [] call_timer_fn+0x7f/0x1c0
    [] ? detach_if_pending+0x120/0x120
    [] run_timer_softirq+0x238/0x2b0
    [] ? neigh_add+0x310/0x310
    [] __do_softirq+0x101/0x280
    [] call_softirq+0x1c/0x30
    [] do_softirq+0x85/0xc0
    [] irq_exit+0x9e/0xc0
    [] smp_apic_timer_interrupt+0x68/0xa0
    [] apic_timer_interrupt+0x6f/0x80
    [] ? mwait_idle+0xa4/0x1c0
    [] ? mwait_idle+0x9b/0x1c0
    [] cpu_idle+0x89/0xe0
    [] start_secondary+0x1b2/0x1b6

    The bug is from arp_solicit() releasing the neigh lock after arp_send().
    In the vxlan case, we eventually need to write-lock a neigh lock later.

    It's a false positive, but we can get rid of it without lockdep
    annotations.

    We can instead use neigh_ha_snapshot() helper.

    Reported-by: Yan Burman
    Signed-off-by: Eric Dumazet
    Acked-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Using a seqlock for devnet_rename_seq is not a good idea,
    as device_rename() can sleep.

    As we hold RTNL, we don't need protection for writers,
    and only need a seqcount so that readers can catch a change done
    by a writer.
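
    The reader-retry idea behind a sequence counter can be sketched in
    userspace with C11 atomics. This is a single-threaded illustration
    of the protocol only, not the kernel's seqcount implementation: the
    names are hypothetical, writers are assumed to be serialized by an
    external lock (RTNL in this commit's case), and a truly concurrent
    version would also need the data accesses themselves to be
    race-free.

```c
#include <assert.h>
#include <stdatomic.h>
#include <string.h>

#define NAME_LEN 16

static atomic_uint rename_seq; /* even: stable, odd: write in progress */
static char model_dev_name[NAME_LEN] = "eth0";

/* Writer side. The caller must already hold the lock that serializes
 * writers, so no writer-side locking happens here and the body is
 * free to take as long as it needs. */
static void rename_dev(const char *newname)
{
    assert(strlen(newname) < NAME_LEN);
    atomic_fetch_add(&rename_seq, 1); /* count becomes odd */
    strcpy(model_dev_name, newname);
    atomic_fetch_add(&rename_seq, 1); /* count becomes even again */
}

/* Reader side: copy the name, then retry if a writer was active or
 * the count changed while copying. */
static void read_dev_name(char *out)
{
    unsigned int start;

    do {
        start = atomic_load(&rename_seq);
        memcpy(out, model_dev_name, NAME_LEN);
    } while ((start & 1) || atomic_load(&rename_seq) != start);
}
```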

    Bug added in commit c91f6df2db4972d3 (sockopt: Change getsockopt() of
    SO_BINDTODEVICE to return an interface name)

    Reported-by: Dave Jones
    Signed-off-by: Eric Dumazet
    Cc: Brian Haley
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Once skb_realloc_headroom() is called, tiph might point to freed memory.

    Cache tiph->ttl value before the reallocation, to avoid unexpected
    behavior.
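
    The hazard and the fix can be modeled in userspace with realloc().
    The names below (struct pkt, pkt_grow(), outer_ttl_after_grow())
    are hypothetical; the sketch only shows the general rule that a
    value must be copied out before the buffer it lives in may move.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TTL_OFFSET 8 /* byte offset of the TTL field in an IPv4 header */

/* Hypothetical packet buffer with the IP header at its start. */
struct pkt {
    unsigned char *data;
    size_t len;
};

/* Grow the buffer. The data may move, which invalidates every pointer
 * into the old buffer. Returns 0 on success, -1 on failure. */
static int pkt_grow(struct pkt *p, size_t extra)
{
    unsigned char *n = realloc(p->data, p->len + extra);

    if (n == NULL)
        return -1;
    p->data = n;
    p->len += extra;
    return 0;
}

/* Returns the TTL of the original header, or -1 on failure. */
static int outer_ttl_after_grow(struct pkt *p, size_t extra)
{
    const unsigned char *iph;
    unsigned char ttl;

    assert(p != NULL && p->data != NULL && p->len > TTL_OFFSET);
    iph = p->data;
    ttl = iph[TTL_OFFSET]; /* cache the value before reallocating */

    if (pkt_grow(p, extra) < 0)
        return -1;

    /* iph may now point to freed memory; only the cached copy is used. */
    return ttl;
}
```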

    Signed-off-by: Eric Dumazet
    Cc: Isaku Yamahata
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • ipgre_tunnel_xmit() parses the network header as IP unconditionally.
    But transmitted packets are not always IP packets. For example, such a packet
    can be sent by a packet socket with sockaddr_ll.sll_protocol set.
    So make the function check whether skb->protocol is IP.
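
    The shape of such a check can be sketched in userspace. The
    structure and helper below (struct frame, inner_tos()) are
    hypothetical and not the driver's code; they only show header
    fields being read after the protocol has been verified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROTO_IPV4 0x0800 /* EtherType for IPv4 */
#define IPV4_MIN_HDR 20   /* minimum IPv4 header length in bytes */

/* Hypothetical frame descriptor. */
struct frame {
    uint16_t protocol;       /* EtherType, host byte order in this sketch */
    const unsigned char *nh; /* start of the network header */
    size_t nh_len;           /* bytes available at nh */
};

/* Returns the TOS byte of the inner IPv4 header, or 0 if the payload
 * is not IPv4 or is too short to hold an IPv4 header. */
static uint8_t inner_tos(const struct frame *f)
{
    assert(f != NULL);

    /* Only interpret the bytes as IPv4 when the protocol says so. */
    if (f->protocol != PROTO_IPV4)
        return 0;
    if (f->nh == NULL || f->nh_len < IPV4_MIN_HDR)
        return 0;

    return f->nh[1]; /* TOS is the second byte of the IPv4 header */
}
```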

    Signed-off-by: Isaku Yamahata
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Isaku Yamahata
     

21 Dec, 2012

8 commits

  • Pull nfsd update from Bruce Fields:
    "Included this time:

    - more nfsd containerization work from Stanislav Kinsbursky: we're
    not quite there yet, but should be by 3.9.

    - NFSv4.1 progress: implementation of basic backchannel security
    negotiation and the mandatory BACKCHANNEL_CTL operation. See

    http://wiki.linux-nfs.org/wiki/index.php/Server_4.0_and_4.1_issues

    for remaining TODO's

    - Fixes for some bugs that could be triggered by unusual compounds.
    Our xdr code wasn't designed with v4 compounds in mind, and it
    shows. A more thorough rewrite is still a todo.

    - If you've ever seen "RPC: multiple fragments per record not
    supported" logged while using some sort of odd userland NFS client,
    that should now be fixed.

    - Further work from Jeff Layton on our mechanism for storing
    information about NFSv4 clients across reboots.

    - Further work from Bryan Schumaker on his fault-injection mechanism
    (which allows us to discard selective NFSv4 state, to exercise
    rarely-taken recovery code paths in the client.)

    - The usual mix of miscellaneous bugs and cleanup.

    Thanks to everyone who tested or contributed this cycle."

    * 'for-3.8' of git://linux-nfs.org/~bfields/linux: (111 commits)
    nfsd4: don't leave freed stateid hashed
    nfsd4: free_stateid can use the current stateid
    nfsd4: cleanup: replace rq_resused count by rq_next_page pointer
    nfsd: warn on odd reply state in nfsd_vfs_read
    nfsd4: fix oops on unusual readlike compound
    nfsd4: disable zero-copy on non-final read ops
    svcrpc: fix some printks
    NFSD: Correct the size calculation in fault_inject_write
    NFSD: Pass correct buffer size to rpc_ntop
    nfsd: pass proper net to nfsd_destroy() from NFSd kthreads
    nfsd: simplify service shutdown
    nfsd: replace boolean nfsd_up flag by users counter
    nfsd: simplify NFSv4 state init and shutdown
    nfsd: introduce helpers for generic resources init and shutdown
    nfsd: make NFSd service structure allocated per net
    nfsd: make NFSd service boot time per-net
    nfsd: per-net NFSd up flag introduced
    nfsd: move per-net startup code to separated function
    nfsd: pass net to __write_ports() and down
    nfsd: pass net to nfsd_set_nrthreads()
    ...

    Linus Torvalds
     
  • Pull Ceph update from Sage Weil:
    "There are a few different groups of commits here. The largest is
    Alex's ongoing work to enable the coming RBD features (cloning,
    striping). There is some cleanup in libceph that goes along with it.

    Cyril and David have fixed some problems with NFS reexport (leaking
    dentries and page locks), and there is a batch of patches from Yan
    fixing problems with the fs client when running against a clustered
    MDS. There are a few bug fixes mixed in for good measure, many of
    which will be going to the stable trees once they're upstream.

    My apologies for the late pull. There is still a gremlin in the rbd
    map/unmap code and I was hoping to include the fix for that as well,
    but we haven't been able to confirm the fix is correct yet; I'll send
    that in a separate pull once it's nailed down."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (68 commits)
    rbd: get rid of rbd_{get,put}_dev()
    libceph: register request before unregister linger
    libceph: don't use rb_init_node() in ceph_osdc_alloc_request()
    libceph: init event->node in ceph_osdc_create_event()
    libceph: init osd->o_node in create_osd()
    libceph: report connection fault with warning
    libceph: socket can close in any connection state
    rbd: don't use ENOTSUPP
    rbd: remove linger unconditionally
    rbd: get rid of RBD_MAX_SEG_NAME_LEN
    libceph: avoid using freed osd in __kick_osd_requests()
    ceph: don't reference req after put
    rbd: do not allow remove of mounted-on image
    libceph: Unlock unprocessed pages in start_read() error path
    ceph: call handle_cap_grant() for cap import message
    ceph: Fix __ceph_do_pending_vmtruncate
    ceph: Don't add dirty inode to dirty list if caps is in migration
    ceph: Fix infinite loop in __wake_requests
    ceph: Don't update i_max_size when handling non-auth cap
    bdi_register: add __printf verification, fix arg mismatch
    ...

    Linus Torvalds
     
  • In kick_requests(), we need to register the request before we
    unregister the linger request. Otherwise the unregister will
    reset the request's osd pointer to NULL.

    Signed-off-by: Alex Elder
    Reviewed-by: Sage Weil

    Alex Elder
     
  • The red-black node in the ceph osd request structure is initialized
    in ceph_osdc_alloc_request() using rb_init_node(). We do need to
    initialize this, because in __unregister_request() we call
    RB_EMPTY_NODE(), which expects the node it's checking to have
    been initialized. But rb_init_node() is apparently overkill, and
    may in fact be on its way out. So use RB_CLEAR_NODE() instead.

    For a little more background, see this commit:
    4c199a93 rbtree: empty nodes have no color

    Signed-off-by: Alex Elder
    Reviewed-by: Sage Weil

    Alex Elder
     
  • The red-black node in the ceph osd event structure is not
    initialized in ceph_osdc_create_event(). Because this node can
    be the subject of a RB_EMPTY_NODE() call later on, we should ensure
    the node is initialized properly for that.

    Signed-off-by: Alex Elder
    Reviewed-by: Sage Weil

    Alex Elder
     
  • The red-black node in the ceph osd structure is not initialized
    in create_osd(). Because this node can be the subject of a
    RB_EMPTY_NODE() call later on, we should ensure the node is
    initialized properly for that. Add a call to RB_CLEAR_NODE()
    to initialize it.

    Signed-off-by: Alex Elder
    Reviewed-by: Sage Weil

    Alex Elder
     
  • When a connection's socket disconnects, or if there's a protocol
    error of some kind on the connection, a fault is signaled and
    the connection is reset (closed and reopened, basically). We
    currently get an error message on the log whenever this occurs.

    A ceph connection will attempt to reestablish a socket connection
    repeatedly if a fault occurs. This means that these error messages
    will get repeatedly added to the log, which is undesirable.

    Change the error message to be a warning, so they don't get
    logged by default.

    Signed-off-by: Alex Elder
    Reviewed-by: Sage Weil

    Alex Elder
     
  • Pull virtio update from Rusty Russell:
    "Some nice cleanups, and even a patch my wife did as a "live" demo for
    Latinoware 2012.

    There's a slightly non-trivial merge in virtio-net, as we cleaned up
    the virtio add_buf interface while DaveM accepted the mq virtio-net
    patches."

    * tag 'virtio-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (27 commits)
    virtio_console: Add support for remoteproc serial
    virtio_console: Merge struct buffer_token into struct port_buffer
    virtio: add drv_to_virtio to make code clearly
    virtio: use dev_to_virtio wrapper in virtio
    virtio-mmio: Fix irq parsing in command line parameter
    virtio_console: Free buffers from out-queue upon close
    virtio: Convert dev_printk(KERN_ to dev_(
    virtio_console: Use kmalloc instead of kzalloc
    virtio_console: Free buffer if splice fails
    virtio: tools: make it clear that virtqueue_add_buf() no longer returns > 0
    virtio: scsi: make it clear that virtqueue_add_buf() no longer returns > 0
    virtio: rpmsg: make it clear that virtqueue_add_buf() no longer returns > 0
    virtio: net: make it clear that virtqueue_add_buf() no longer returns > 0
    virtio: console: make it clear that virtqueue_add_buf() no longer returns > 0
    virtio: make virtqueue_add_buf() returning 0 on success, not capacity.
    virtio: console: don't rely on virtqueue_add_buf() returning capacity.
    virtio_net: don't rely on virtqueue_add_buf() returning capacity.
    virtio-net: remove unused skb_vnet_hdr->num_sg field
    virtio-net: correct capacity math on ring full
    virtio: move queue_index and num_free fields into core struct virtqueue.
    ...

    Linus Torvalds
     

20 Dec, 2012

4 commits

  • Pull networking fixes from David Miller:

    1) Really fix tuntap SKB use after free bug, from Eric Dumazet.

    2) Adjust SKB data pointer to point past the transport header before
    calling icmpv6_notify() so that the headers are in the state which
    that function expects. From Duan Jiong.

    3) Fix ambiguities in the new tuntap multi-queue APIs. From Jason
    Wang.

    4) mISDN needs to use del_timer_sync(), from Konstantin Khlebnikov.

    5) Don't destroy mutex after freeing up device private in mac802154,
    fix also from Konstantin Khlebnikov.

    6) Fix INET request socket leak in TCP and DCCP, from Christoph Paasch.

    7) SCTP HMAC kconfig rework, from Neil Horman.

    8) Fix SCTP jprobes function signature, otherwise things explode, from
    Daniel Borkmann.

    9) Fix typo in ipv6-offload Makefile variable reference, from Simon
    Arlott.

    10) Don't fail USBNET open just because remote wakeup isn't supported,
    from Oliver Neukum.

    11) be2net driver bug fixes from Sathya Perla.

    12) SOLOS PCI ATM driver bug fixes from Nathan Williams and David
    Woodhouse.

    13) Fix MTU changing regression in 8139cp driver, from John Greene.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits)
    solos-pci: ensure all TX packets are aligned to 4 bytes
    solos-pci: add firmware upgrade support for new models
    solos-pci: remove superfluous debug output
    solos-pci: add GPIO support for newer versions on Geos board
    8139cp: Prevent dev_close/cp_interrupt race on MTU change
    net: qmi_wwan: add ZTE MF880
    drivers/net: Use of_match_ptr() macro in smsc911x.c
    drivers/net: Use of_match_ptr() macro in smc91x.c
    ipv6: addrconf.c: remove unnecessary "if"
    bridge: Correctly encode addresses when dumping mdb entries
    bridge: Do not unregister all PF_BRIDGE rtnl operations
    use generic usbnet_manage_power()
    usbnet: generic manage_power()
    usbnet: handle PM failure gracefully
    ksz884x: fix receive polling race condition
    qlcnic: update driver version
    qlcnic: fix unused variable warnings
    net: fec: forbid FEC_PTP on SoCs that do not support
    be2net: fix wrong frag_idx reported by RX CQ
    be2net: fix be_close() to ensure all events are ack'ed
    ...

    Linus Torvalds
     
  • The value of err is always negative if it goes to errout, so we don't need to
    check the value of err.

    Signed-off-by: Cong Ding
    Signed-off-by: David S. Miller

    Cong Ding
     
  • When dumping mdb table, set the addresses the kernel returns
    based on the address protocol type.

    Signed-off-by: Vlad Yasevich
    Acked-by: Cong Wang
    Signed-off-by: David S. Miller

    Vlad Yasevich
     
  • Bridge fdb and link rtnl operations are registered in
    core/rtnetlink. Bridge mdb operations are registered
    in bridge/mdb. When removing bridge module, do not
    unregister ALL PF_BRIDGE ops since that would remove
    the ops from rtnetlink as well. Do remove mdb ops when
    bridge is destroyed.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: David S. Miller

    Vlad Yasevich
     

19 Dec, 2012

2 commits

  • Pull (again) user namespace infrastructure changes from Eric Biederman:
    "Those bugs, those darn embarrassing bugs just don't want to get
    fixed.

    Linus I just updated my mirror of your kernel.org tree and it appears
    you successfully pulled everything except the last 4 commits that fix
    those embarrassing bugs.

    When you get a chance can you please repull my branch"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    userns: Fix typo in description of the limitation of userns_install
    userns: Add a more complete capability subset test to commit_creds
    userns: Require CAP_SYS_ADMIN for most uses of setns.
    Fix cap_capable to only allow owners in the parent user namespace to have caps.

    Linus Torvalds
     
  • Pull NFS client updates from Trond Myklebust:
    "Features include:

    - Full audit of BUG_ON asserts in the NFS, SUNRPC and lockd client
    code. Remove altogether where possible, and replace with
    WARN_ON_ONCE and appropriate error returns where not.
    - NFSv4.1 client adds session dynamic slot table management. There
    is matching server side code that has been submitted to Bruce for
    consideration.

    Together, this code allows the server to dynamically manage the
    amount of memory it allocates to the duplicate request cache for
    each client. It will constantly resize those caches to reserve
    more memory for clients that are hot while shrinking caches for
    those that are quiescent.

    In addition, there are assorted bugfixes for the generic NFS write
    code, fixes to deal with the drop_nlink() warnings, and yet another
    fix for NFSv4 getacl."

    * tag 'nfs-for-3.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (106 commits)
    SUNRPC: continue run over clients list on PipeFS event instead of break
    NFS: Don't use SetPageError in the NFS writeback code
    SUNRPC: variable 'svsk' is unused in function bc_send_request
    SUNRPC: Handle ECONNREFUSED in xs_local_setup_socket
    NFSv4.1: Deal effectively with interrupted RPC calls.
    NFSv4.1: Move the RPC timestamp out of the slot.
    NFSv4.1: Try to deal with NFS4ERR_SEQ_MISORDERED.
    NFS: nfs_lookup_revalidate should not trust an inode with i_nlink == 0
    NFS: Fix calls to drop_nlink()
    NFS: Ensure that we always drop inodes that have been marked as stale
    nfs: Remove unused list nfs4_clientid_list
    nfs: Remove duplicate function declaration in internal.h
    NFS: avoid NULL dereference in nfs_destroy_server
    SUNRPC handle EKEYEXPIRED in call_refreshresult
    SUNRPC set gss gc_expiry to full lifetime
    nfs: fix page dirtying in NFS DIO read codepath
    nfs: don't zero out the rest of the page if we hit the EOF on a DIO READ
    NFSv4.1: Be conservative about the client highest slotid
    NFSv4.1: Handle NFS4ERR_BADSLOT errors correctly
    nfs: don't extend writes to cover entire page if pagecache is invalid
    ...

    Linus Torvalds
     

18 Dec, 2012

7 commits

  • As reported by Chen Gang, we should ensure there
    is enough space when formatting the sysfs buffers.

    Signed-off-by: Chas Williams
    Signed-off-by: David S. Miller

    chas williams - CONTRACTOR
     
  • Otherwise an out of bounds read could happen.

    Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • Signed-off-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Hannes Frederic Sowa
     
  • It may be a matter of personal taste, but I find this makes the code
    clearer.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • Pull user namespace changes from Eric Biederman:
    "While small this set of changes is very significant with respect to
    containers in general and user namespaces in particular. The user
    space interface is now complete.

    This set of changes adds support for unprivileged users to create user
    namespaces and as a user namespace root to create other namespaces.
    The tyranny of supporting suid root preventing unprivileged users from
    using cool new kernel features is broken.

    This set of changes completes the work on setns, adding support for
    the pid, user, mount namespaces.

    This set of changes includes a bunch of basic pid namespace
    cleanups/simplifications. Of particular significance is the rework of
    the pid namespace cleanup so it no longer requires sending out
    tendrils into all kinds of unexpected cleanup paths for operation. At
    least one case of broken error handling is fixed by this cleanup.

    The files under /proc//ns/ have been converted from regular files
    to magic symlinks which prevents incorrect caching by the VFS,
    ensuring the files always refer to the namespace the process is
    currently using and ensuring that the ptrace_mayaccess permission
    checks are always applied.

    The files under /proc//ns/ have been given stable inode numbers
    so it is now possible to see if different processes share the same
    namespaces.

    Through David Miller's net tree are changes to relax many of the
    permission checks in the networking stack to allow the user
    namespace root to usefully use the networking stack. Similar changes
    for the mount namespace and the pid namespace are coming through my
    tree.

    Two small changes to add user namespace support were committed here and
    in David Miller's -net tree so that I could complete the work on the
    /proc//ns/ files in this tree.

    Work remains to make it safe to build user namespaces and 9p, afs,
    ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the
    Kconfig guard remains in place preventing user namespaces from
    being built when any of those filesystems are enabled.

    Future design work remains to allow root users outside of the initial
    user namespace to mount more than just /proc and /sys."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits)
    proc: Usable inode numbers for the namespace file descriptors.
    proc: Fix the namespace inode permission checks.
    proc: Generalize proc inode allocation
    userns: Allow unprivilged mounts of proc and sysfs
    userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file
    procfs: Print task uids and gids in the userns that opened the proc file
    userns: Implement unshare of the user namespace
    userns: Implent proc namespace operations
    userns: Kill task_user_ns
    userns: Make create_new_namespaces take a user_ns parameter
    userns: Allow unprivileged use of setns.
    userns: Allow unprivileged users to create new namespaces
    userns: Allow setting a userns mapping to your current uid.
    userns: Allow chown and setgid preservation
    userns: Allow unprivileged users to create user namespaces.
    userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped
    userns: fix return value on mntns_install() failure
    vfs: Allow unprivileged manipulation of the mount namespace.
    vfs: Only support slave subtrees across different user namespaces
    vfs: Add a user namespace reference from struct mnt_namespace
    ...

    Linus Torvalds
     
  • Reported-by: kbuild test robot
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • A connection's socket can close for any reason, independent of the
    state of the connection (and irrespective of the connection
    mutex). As a result, the connection can be in pretty much any state
    at the time its socket is closed.

    Handle those other cases at the top of con_work(). Pull this whole
    block of code into a separate function to reduce the clutter.

    Signed-off-by: Alex Elder
    Reviewed-by: Sage Weil

    Alex Elder