08 Aug, 2017

1 commit

  • With commit 0ffdaf5b41cf ("net/sock: add WARN_ON(parent->sk)
    in sock_graft()"), the following call trace appeared:

    [ 457.018340] WARNING: CPU: 0 PID: 15623 at ./include/net/sock.h:1703 inet_accept+0x135/0x140
    ...
    [ 457.018381] RIP: 0010:inet_accept+0x135/0x140
    [ 457.018381] RSP: 0018:ffffc90001727d18 EFLAGS: 00010286
    [ 457.018383] RAX: 0000000000000001 RBX: ffff880012413000 RCX: 0000000000000001
    [ 457.018384] RDX: 000000000000018a RSI: 00000000fffffe01 RDI: ffffffff8156fae8
    [ 457.018384] RBP: ffffc90001727d38 R08: 0000000000000000 R09: 0000000000004305
    [ 457.018385] R10: 0000000000000001 R11: 0000000000004304 R12: ffff880035ae7a00
    [ 457.018386] R13: ffff88001282af10 R14: ffff880034e4e200 R15: 0000000000000000
    [ 457.018387] FS: 0000000000000000(0000) GS:ffff88003fc00000(0000) knlGS:0000000000000000
    [ 457.018388] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 457.018389] CR2: 00007fdec22f9000 CR3: 0000000002b5a000 CR4: 00000000000006f0
    [ 457.018395] Call Trace:
    [ 457.018402] tcp_accept_from_sock.part.8+0x12d/0x449 [dlm]
    [ 457.018405] ? vprintk_emit+0x248/0x2d0
    [ 457.018409] tcp_accept_from_sock+0x3f/0x50 [dlm]
    [ 457.018413] process_recv_sockets+0x3b/0x50 [dlm]
    [ 457.018415] process_one_work+0x138/0x370
    [ 457.018417] worker_thread+0x4d/0x3b0
    [ 457.018419] kthread+0x109/0x140
    [ 457.018421] ? rescuer_thread+0x320/0x320
    [ 457.018422] ? kthread_park+0x60/0x60
    [ 457.018424] ret_from_fork+0x25/0x30

    Since the new socket created by sock_create_kern has its sock->sk
    set up by the path:

    sock_create_kern -> __sock_create
    -> pf->create => inet_create
    -> sock_init_data

    the WARN_ON is triggered by "con->sock->ops->accept =>
    inet_accept -> sock_graft". It also means newsock->sk is
    leaked, since sock_graft replaces it with a new sk.

    To resolve the issue, we need to use sock_create_lite
    instead of sock_create_kern, like commit 0933a578cd55
    ("rds: tcp: use sock_create_lite() to create the accept
    socket") did.

    Reported-by: Zhilong Liu
    Signed-off-by: Guoqing Jiang
    Signed-off-by: David Teigland

    Guoqing Jiang
     

10 Mar, 2017

1 commit

  • Lockdep issues a circular dependency warning when AFS issues an operation
    through AF_RXRPC from a context in which the VFS/VM holds the mmap_sem.

    The theory lockdep comes up with is as follows:

    (1) If the pagefault handler decides it needs to read pages from AFS, it
    calls AFS with mmap_sem held and AFS begins an AF_RXRPC call, but
    creating a call requires the socket lock:

    mmap_sem must be taken before sk_lock-AF_RXRPC

    (2) afs_open_socket() opens an AF_RXRPC socket and binds it. rxrpc_bind()
    binds the underlying UDP socket whilst holding its socket lock.
    inet_bind() takes its own socket lock:

    sk_lock-AF_RXRPC must be taken before sk_lock-AF_INET

    (3) Reading from a TCP socket into a userspace buffer might cause a fault
    and thus cause the kernel to take the mmap_sem, but the TCP socket is
    locked whilst doing this:

    sk_lock-AF_INET must be taken before mmap_sem

    However, lockdep's theory is wrong in this instance because it deals only
    with lock classes and not individual locks. The AF_INET lock in (2) isn't
    really equivalent to the AF_INET lock in (3) as the former deals with a
    socket entirely internal to the kernel that never sees userspace. This is
    a limitation in the design of lockdep.

    Fix the general case by:

    (1) Double up all the locking keys used in sockets so that one set are
    used if the socket is created by userspace and the other set is used
    if the socket is created by the kernel.

    (2) Store the kern parameter passed to sk_alloc() in a variable in the
    sock struct (sk_kern_sock). This informs sock_lock_init(),
    sock_init_data() and sk_clone_lock() as to the lock keys to be used.

    Note that the child created by sk_clone_lock() inherits the parent's
    kern setting.

    (3) Add a 'kern' parameter to ->accept() that is analogous to the one
    passed in to ->create() that distinguishes whether kernel_accept() or
    sys_accept4() was the caller and can be passed to sk_alloc().

    Note that a lot of accept functions merely dequeue an already
    allocated socket. I haven't touched these as the new socket already
    exists before we get the parameter.

    Note also that there are a couple of places where I've made the accepted
    socket unconditionally kernel-based:

    irda_accept()
    rds_tcp_accept_one()
    tcp_accept_from_sock()

    because they follow a sock_create_kern() and accept off of that.

    Whilst creating this, I noticed that lustre and ocfs don't create sockets
    through sock_create_kern() and thus they aren't marked as for-kernel,
    though they appear to be internal. I wonder if these should do that so
    that they use the new set of lock keys.

    Signed-off-by: David Howells
    Signed-off-by: David S. Miller

    David Howells
     

20 Oct, 2016

2 commits

  • Before this patch, functions save_callbacks and restore_callbacks
    called lock_sock and release_sock to prevent other processes from
    messing with the struct sock while the callbacks were saved and
    restored. However, function add_sock calls write_lock_bh prior to
    calling save_callbacks, which disables preemption. So the call to
    lock_sock would try to schedule in a context where scheduling is
    not allowed.

    Signed-off-by: Bob Peterson
    Signed-off-by: David Teigland

    Bob Peterson
     
  • When DLM calls accept() on a socket, the comm code copies the sk
    after we've saved its callbacks. Afterward, it calls add_sock, which
    saves the callbacks a second time. Since the error reporting function
    lowcomms_error_report also calls the previously saved callback, this
    results in it recursively calling itself. This patch adds a new
    parameter to function add_sock that tells it whether to save the
    callbacks. Function tcp_accept_from_sock (and its sctp counterpart)
    then passes false to avoid the recursion.

    Signed-off-by: Bob Peterson
    Signed-off-by: David Teigland

    Bob Peterson
     

10 Oct, 2016

1 commit

  • After backporting the commit ee44b4bc054a ("dlm: use sctp 1-to-1 API")
    series to a kernel with an older workqueue implementation which didn't
    use RCU yet, it was noticed that we are freeing the workqueues in
    dlm_lowcomms_stop() too early, as free_conn() will still access that
    memory to cancel any queued works.

    This issue was introduced by commit 0d737a8cfd83 as before it such
    attempt to cancel the queued works wasn't performed, so the issue was
    not present.

    This patch fixes it by simply inverting the free order.

    Cc: stable@vger.kernel.org
    Fixes: 0d737a8cfd83 ("dlm: fix race while closing connections")
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David Teigland

    Marcelo Ricardo Leitner
     

24 Jun, 2016

1 commit

  • Replace calls to kmalloc followed by a memcpy with a direct call to
    kmemdup.

    The Coccinelle semantic patch used to make this change is as follows:
    @@
    expression from,to,size,flag;
    statement S;
    @@

    - to = \(kmalloc\|kzalloc\)(size,flag);
    + to = kmemdup(from,size,flag);
    if (to==NULL || ...) S
    - memcpy(to, from, size);

    Signed-off-by: Amitoj Kaur Chawla
    Signed-off-by: David Teigland

    Amitoj Kaur Chawla
     

05 Apr, 2016

1 commit

  • The PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a
    *long* time ago with the promise that one day it would be possible
    to implement the page cache with bigger chunks than PAGE_SIZE.

    This promise never materialized, and it is unlikely it ever will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE, and it's a constant source of confusion whether the
    PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
    especially on the border between fs and mm.

    Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too
    much breakage to be doable.

    Let's stop pretending that pages in the page cache are special. They
    are not.

    The changes are pretty straight-forward:

    - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();

    This patch contains automated changes generated with coccinelle
    using the script below. For some reason, coccinelle doesn't patch
    header files; I've called spatch on them manually.

    The only adjustment after coccinelle is a revert of the changes to
    the PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code that coccinelle didn't reach.
    I'll fix them manually in a separate patch. Comments and
    documentation will also be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

23 Feb, 2016

2 commits

  • This patch fixes two problems with commit b3a5bbfd7:

    1. It removes a return statement from lowcomms_error_report,
    because the function needs to call the original error report in
    all paths through it.
    2. All socket callbacks are saved and restored, not just
    sk_error_report, and this is done with proper locking, as sunrpc
    does.

    Signed-off-by: Bob Peterson
    Signed-off-by: David Teigland

    Bob Peterson
     
  • This patch replaces the call to nodeid_to_addr with a call to
    kernel_getpeername. This avoids taking a spinlock, because this
    function may potentially be called from a softirq context.

    Signed-off-by: Bob Peterson
    Signed-off-by: David Teigland

    Bob Peterson
     

02 Dec, 2015

1 commit

  • This patch is a cleanup to make the following patch easier to
    review.

    The goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
    from (struct socket)->flags to (struct socket_wq)->flags to
    benefit from RCU protection in sock_wake_async().

    To ease backports, we rename both constants.

    Two new helpers, sk_set_bit(int nr, struct sock *sk) and
    sk_clear_bit(int nr, struct sock *sk), are added so that the
    following patch can change their implementation.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

18 Aug, 2015

7 commits

  • Signed-off-by: Fengguang Wu
    Signed-off-by: David Teigland

    kbuild test robot
     
  • There are cases in which lowcomms_connect_sock() is called directly,
    which caused the CF_WRITE_PENDING flag to not be set upon reconnect,
    especially in send_to_sock() error handling. In that path, the flag
    was already cleared and no further attempt at transmitting would be
    made.

    As dlm tends to connect when it needs to transmit something, it makes
    sense to always set this flag right after the connect.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David Teigland

    Marcelo Ricardo Leitner
     
  • BUG_ON() is too severe an action for this case, especially now that
    DLM with SCTP will use one socket per association. Instead, we can
    just close the socket on this error condition and return from the
    function.

    Also move the check to an earlier stage, as its result won't change
    and thus we can abort as soon as possible.

    Although this issue was reported while still using SCTP with the
    1-to-many API, this cleanup wouldn't have been that simple back then,
    because we couldn't close the socket, and making sure such an event
    would cease would be hard. And actually, the previous code was closing
    the association, yet the SCTP layer still raises the new data event.
    Probably a bug to be fixed in SCTP.

    Reported-by:
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David Teigland

    Marcelo Ricardo Leitner
     
  • DLM is using the 1-to-many API, but in a 1-to-1 fashion. That is,
    the 1-to-many API is not needed, and using it forces DLM to use
    sctp_do_peeloff() to mimic a kernel_accept(), which causes a symbol
    dependency on the sctp module.

    By switching to the 1-to-1 API we can avoid this dependency and also
    remove quite a lot of SCTP-specific code from lowcomms.c.

    The caveat is that now DLM won't always use the same src port. It will
    choose a random one, just like TCP code. This allows the peers to
    attempt simultaneous connections, which now are handled just like for
    TCP.

    Even more sharing between TCP and SCTP code on DLM is possible, but it
    is intentionally left for a later commit.

    Note that to run nodes with this commit, you have to have at least
    the earlier fixes in this patchset, otherwise it will trigger issues
    on old nodes.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David Teigland

    Marcelo Ricardo Leitner
     
  • If we don't clear that bit, lowcomms_connect_sock() will not schedule
    another attempt, and no further attempt will be done.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David Teigland

    Marcelo Ricardo Leitner
     
  • When a connection has issues, DLM may need to close it. Therefore we
    should also cancel pending work for such a connection at that time,
    and not just when dlm is no longer willing to use the connection.

    Also, if we don't clear the CF_CONNECT_PENDING flag, the error
    handling routines won't be able to reconnect, as
    lowcomms_connect_sock() checks for it.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David Teigland

    Marcelo Ricardo Leitner
     
  • When using SCTP and accepting a new connection, DLM currently validates
    if the peer trying to connect to it is one of the cluster nodes, but it
    doesn't check if it already has a connection to it or not.

    If it already had a connection, that connection will be overwritten,
    and the new one will be used for writes, possibly causing the node to
    leave the cluster due to communication breakage.

    Moreover, one could DoS the node by attempting N connections and
    keeping them open.

    To be explicit: both situations are only triggerable from other
    cluster nodes, but are doable with only user-level permissions.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David Teigland

    Marcelo Ricardo Leitner
     

12 Jun, 2014

1 commit

  • The connection struct with nodeid 0 is the listening socket,
    not a connection to another node. The sctp resend function
    was not checking that the nodeid was valid (non-zero), so it
    would mistakenly get and resend on the listening connection
    when nodeid was zero.

    Signed-off-by: Lidong Zhong
    Signed-off-by: David Teigland

    Lidong Zhong
     

12 Apr, 2014

1 commit

  • Several spots in the kernel perform a sequence like:

    skb_queue_tail(&sk->s_receive_queue, skb);
    sk->sk_data_ready(sk, skb->len);

    But at the moment we place the SKB onto the socket receive queue it
    can be consumed and freed up. So this skb->len access is potentially
    to freed up memory.

    Furthermore, the skb->len can be modified by the consumer so it is
    possible that the value isn't accurate.

    And finally, no actual implementation of this callback uses the
    length argument. And since nobody actually cared about its value,
    lots of call sites pass arbitrary values such as '0' and even '1'.

    So just remove the length argument from the callback, that way there
    is no confusion whatsoever and all of these use-after-free cases get
    fixed as a side effect.

    Based upon a patch by Eric Dumazet and his suggestion to audit this
    issue tree-wide.

    Signed-off-by: David S. Miller

    David S. Miller
     

26 Jan, 2014

1 commit

  • Pull networking updates from David Miller:

    1) BPF debugger and asm tool by Daniel Borkmann.

    2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann.

    3) Correct reciprocal_divide and update users, from Hannes Frederic
    Sowa and Daniel Borkmann.

    4) Currently we only have a "set" operation for the hw timestamp socket
    ioctl, add a "get" operation to match. From Ben Hutchings.

    5) Add better trace events for debugging driver datapath problems, also
    from Ben Hutchings.

    6) Implement auto corking in TCP, from Eric Dumazet. Basically, if we
    have a small send and a previous packet is already in the qdisc or
    device queue, defer until TX completion or we get more data.

    7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko.

    8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel
    Borkmann.

    9) Share IP header compression code between Bluetooth and IEEE802154
    layers, from Jukka Rissanen.

    10) Fix ipv6 router reachability probing, from Jiri Benc.

    11) Allow packets to be captured on macvtap devices, from Vlad Yasevich.

    12) Support tunneling in GRO layer, from Jerry Chu.

    13) Allow bonding to be configured fully using netlink, from Scott
    Feldman.

    14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can
    already get the TCI. From Atzm Watanabe.

    15) New "Heavy Hitter" qdisc, from Terry Lam.

    16) Significantly improve the IPSEC support in pktgen, from Fan Du.

    17) Allow ipv4 tunnels to cache routes, just like sockets. From Tom
    Herbert.

    18) Add Proportional Integral Enhanced packet scheduler, from Vijay
    Subramanian.

    19) Allow openvswitch to use mmap'd netlink, from Thomas Graf.

    20) Key TCP metrics blobs also by source address, not just destination
    address. From Christoph Paasch.

    21) Support 10G in generic phylib. From Andy Fleming.

    22) Try to short-circuit GRO flow compares using device provided RX
    hash, if provided. From Tom Herbert.

    The wireless and netfilter folks have been busy little bees too.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits)
    net/cxgb4: Fix referencing freed adapter
    ipv6: reallocate addrconf router for ipv6 address when lo device up
    fib_frontend: fix possible NULL pointer dereference
    rtnetlink: remove IFLA_BOND_SLAVE definition
    rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info
    qlcnic: update version to 5.3.55
    qlcnic: Enhance logic to calculate msix vectors.
    qlcnic: Refactor interrupt coalescing code for all adapters.
    qlcnic: Update poll controller code path
    qlcnic: Interrupt code cleanup
    qlcnic: Enhance Tx timeout debugging.
    qlcnic: Use bool for rx_mac_learn.
    bonding: fix u64 division
    rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC
    sfc: Use the correct maximum TX DMA ring size for SFC9100
    Add Shradha Shah as the sfc driver maintainer.
    net/vxlan: Share RX skb de-marking and checksum checks with ovs
    tulip: cleanup by using ARRAY_SIZE()
    ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called
    net/cxgb4: Don't retrieve stats during recovery
    ...

    Linus Torvalds
     

16 Dec, 2013

1 commit

  • The recovery time for a failed node was taking a long
    time because the failed node could not perform the full
    shutdown process. Removing the linger time speeds this
    up. The dlm does not care what happens to messages to
    or from the failed node.

    Signed-off-by: Dongmao Zhang
    Signed-off-by: David Teigland

    Dongmao Zhang
     

15 Jun, 2013

6 commits

  • For TCP we disable Nagle, and I cannot think of why it would be
    needed for SCTP. Disabling it seems to improve dlm_lock operations
    just as it does for TCP.

    Signed-off-by: Mike Christie
    Signed-off-by: David Teigland

    Mike Christie
     
  • Currently if a SCTP send fails, we lose the data we were trying
    to send because the writequeue_entry is released when we do the send.
    When this happens other nodes will then hang waiting for a reply.

    This adds support for SCTP to retry the send operation.

    I also removed the retry limit for SCTP use, because we want
    to make sure we try every path during init time and for longer
    failures we want to continually retry in case paths come back up
    while trying other paths. We will do this until userspace tells us
    to stop.

    Signed-off-by: Mike Christie
    Signed-off-by: David Teigland

    Mike Christie
     
  • Currently, if we cannot create an association to the first IP address
    that is added to DLM, the SCTP init assoc code will just retry the
    same IP. This patch adds a simple failover scheme where we will try
    one of the other addresses that were passed into DLM.

    Signed-off-by: Mike Christie
    Signed-off-by: David Teigland

    Mike Christie
     
  • We should be testing and clearing the init pending bit, because when
    sctp_init_assoc is later called again, it will check that the bit is
    not set and then set it.

    We do not want to touch CF_CONNECT_PENDING here because we will queue
    swork and process_send_sockets will then call the connect_action function.

    Signed-off-by: Mike Christie
    Signed-off-by: David Teigland

    Mike Christie
     
  • sctp_assoc was not getting set so later lookups failed.

    Signed-off-by: Mike Christie
    Signed-off-by: David Teigland

    Mike Christie
     
  • We were clearing the base con's init pending flags, but the
    con for the node was the one with the pending bit set.

    Signed-off-by: Mike Christie
    Signed-off-by: David Teigland

    Mike Christie
     

10 Apr, 2013

1 commit

  • This patch introduces a UAPI header for the SCTP protocol, so that
    we can facilitate the maintenance and development of user land
    applications and libraries, in particular in terms of header
    synchronization.

    To not break compatibility, some fragments from lksctp-tools'
    netinet/sctp.h have been carefully included, while taking care that
    neither kernel nor user land breaks, so both compile fine with this
    change (for lksctp-tools I tested with the old netinet/sctp.h header
    and with a newly adapted one that includes the uapi sctp header). The
    lksctp-tools smoke tests ran through successfully as well in both
    cases.

    Suggested-by: Neil Horman
    Cc: Neil Horman
    Cc: Vlad Yasevich
    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

28 Feb, 2013

1 commit

  • I'm not sure why, but unlike the list for-each-entry iterator:

    list_for_each_entry(pos, head, member)

    the hlist ones were greedy and wanted an extra parameter:

    hlist_for_each_entry(tpos, pos, head, member)

    Why did they need an extra pos parameter? I'm not quite sure. Not
    only do they not really need it, it also prevents the iterator from
    looking exactly like the list iterator, which is unfortunate.

    Besides the semantic patch, there was some manual work required:

    - Fix up the actual hlist iterators in linux/list.h
    - Fix up the declaration of other iterators based on the hlist ones.
    - A very small amount of places were using the 'node' parameter, this
    was modified to use 'obj->member' instead.
    - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
    properly, so those had to be fixed up manually.

    The semantic patch which is mostly the work of Peter Senna Tschudin is here:

    @@
    iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

    type T;
    expression a,c,d,e;
    identifier b;
    statement S;
    @@

    -T b;

    [akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
    [akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
    [akpm@linux-foundation.org: checkpatch fixes]
    [akpm@linux-foundation.org: fix warnings]
    [akpm@linux-foundation.org: redo intrusive kvm changes]
    Tested-by: Peter Senna Tschudin
    Acked-by: Paul E. McKenney
    Signed-off-by: Sasha Levin
    Cc: Wu Fengguang
    Cc: Marcelo Tosatti
    Cc: Gleb Natapov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sasha Levin
     

09 Aug, 2012

1 commit

  • A deadlock sometimes occurs between dlm_controld closing
    a lowcomms connection through configfs and dlm_send looking
    up the address for a new connection in configfs.

    dlm_controld does a configfs rmdir which calls
    dlm_lowcomms_close which waits for dlm_send to
    cancel work on the workqueues.

    The dlm_send workqueue thread has called
    tcp_connect_to_sock which calls dlm_nodeid_to_addr
    which does a configfs lookup and blocks on a lock
    held by dlm_controld in the rmdir path.

    The solution here is to save the node addresses within
    the lowcomms code so that the lowcomms workqueue does
    not need to step through configfs to get a node address.

    dlm_controld:
    wait_for_completion+0x1d/0x20
    __cancel_work_timer+0x1b3/0x1e0
    cancel_work_sync+0x10/0x20
    dlm_lowcomms_close+0x4c/0xb0 [dlm]
    drop_comm+0x22/0x60 [dlm]
    client_drop_item+0x26/0x50 [configfs]
    configfs_rmdir+0x180/0x230 [configfs]
    vfs_rmdir+0xbd/0xf0
    do_rmdir+0x103/0x120
    sys_rmdir+0x16/0x20

    dlm_send:
    mutex_lock+0x2b/0x50
    get_comm+0x34/0x140 [dlm]
    dlm_nodeid_to_addr+0x18/0xd0 [dlm]
    tcp_connect_to_sock+0xf4/0x2d0 [dlm]
    process_send_sockets+0x1d2/0x260 [dlm]
    worker_thread+0x170/0x2a0

    Signed-off-by: David Teigland

    David Teigland
     

27 Apr, 2012

1 commit

  • During lowcomms shutdown, a new connection could possibly
    be created, and attempt to use a workqueue that's been
    destroyed. Similarly, during startup, a new connection
    could attempt to use a workqueue that's not been set up
    yet. Add a global variable to indicate when new connections
    are allowed.

    Based on patch by: Christine Caulfield

    Reported-by: dann frazier
    Reviewed-by: dann frazier
    Signed-off-by: David Teigland

    David Teigland