10 Oct, 2016

1 commit

  • After backporting commit ee44b4bc054a ("dlm: use sctp 1-to-1 API")
    series to a kernel with an older workqueue which didn't use RCU yet, it
    was noticed that we are freeing the workqueues in dlm_lowcomms_stop()
    too early as free_conn() will try to access that memory for canceling
    the queued works if any.

    This issue was introduced by commit 0d737a8cfd83: before it, no
    attempt was made to cancel the queued works, so the issue was
    not present.

    This patch fixes it by simply inverting the free order.

    Cc: stable@vger.kernel.org
    Fixes: 0d737a8cfd83 ("dlm: fix race while closing connections")
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David Teigland

    Marcelo Ricardo Leitner
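
A minimal userspace sketch of the ordering fix described in the 10 Oct, 2016 commit above. The struct and function names (`workqueue_sketch`, `conn_sketch`, `free_conn_sketch`, `lowcomms_stop_sketch`) are hypothetical stand-ins rather than the kernel's types. The sketch only illustrates that the connections are released while the queue they reference is still alive.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a workqueue: only counts pending work items. */
struct workqueue_sketch {
    int pending;
};

/* Hypothetical stand-in for a connection that may have work queued. */
struct conn_sketch {
    struct workqueue_sketch *wq;
    int has_queued_work;
};

/* Cancelling queued work touches the workqueue, so wq must still be valid. */
void free_conn_sketch(struct conn_sketch *con)
{
    if (con->has_queued_work) {
        con->wq->pending--;
        con->has_queued_work = 0;
    }
    free(con);
}

/* Free every connection first, then the workqueue they point at.
 * Returns the number of work items still pending at queue teardown. */
int lowcomms_stop_sketch(struct conn_sketch **conns, size_t n,
                         struct workqueue_sketch *wq)
{
    size_t i;
    int remaining;

    for (i = 0; i < n; i++)
        free_conn_sketch(conns[i]);

    remaining = wq->pending;
    free(wq);
    return remaining;
}
```

Swapping the two steps in `lowcomms_stop_sketch()` would make `free_conn_sketch()` dereference freed memory, which is the problem the commit message describes.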
     

27 Aug, 2016

1 commit

  • With the current kernel, `dlm_tool lockdebug` fails as below:

    "dlm_tool lockdebug ED0BD86DCE724393918A1AE8FDBF1EE3
    can't open /sys/kernel/debug/dlm/ED0BD86DCE724393918A1AE8FDBF1EE3:
    Operation not permitted"

    This is because table_open() depends on file->f_op to tell which
    seq_file ops should be passed down. But the original file ops in
    file->f_op are replaced by "debugfs_full_proxy_file_operations" with
    commit 49d200deaa68 ("debugfs: prevent access to removed files'
    private data").

    Currently, I can think of 2 solutions: first, replace
    debugfs_create_file() with debugfs_create_file_unsafe();
    second, provide a different table_open#() for each case. The
    first one is neat, but I don't thoroughly understand its risk.
    Maybe someone has a better one.

    Signed-off-by: Eric Ren
    Signed-off-by: David Teigland

    Eric Ren
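
The failure mode in the 27 Aug, 2016 commit above can be modelled in userspace. This is a sketch under stated assumptions: `file_sketch`, `file_ops_sketch`, the three `*_fops` objects, and both open functions are hypothetical names, and returning `-EPERM` is chosen only to match the "Operation not permitted" message quoted in the commit.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, minimal stand-ins for struct file_operations and struct file. */
struct file_ops_sketch {
    const char *name;
};

struct file_sketch {
    const struct file_ops_sketch *f_op;
};

const struct file_ops_sketch format1_fops = { "format1" };
const struct file_ops_sketch format2_fops = { "format2" };
/* Plays the role of a proxy that replaces the original ops in f_op. */
const struct file_ops_sketch proxy_fops = { "proxy" };

/* Single open routine that picks the table format by comparing f_op.
 * Returns the format number, or -EPERM if f_op is not recognized. */
int table_open_sketch(const struct file_sketch *file)
{
    if (file->f_op == &format1_fops)
        return 1;
    if (file->f_op == &format2_fops)
        return 2;
    return -EPERM;
}

/* The second option from the commit message: one open routine per format,
 * so nothing depends on the identity of f_op. */
int table_open1_sketch(const struct file_sketch *file)
{
    (void)file;
    return 1;
}
```

With a proxied file, `table_open_sketch()` no longer recognizes `f_op` and fails, while the per-format routine keeps working.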
     

24 Jun, 2016

1 commit

  • Replace calls to kmalloc followed by a memcpy with a direct call to
    kmemdup.

    The Coccinelle semantic patch used to make this change is as follows:
    @@
    expression from,to,size,flag;
    statement S;
    @@

    - to = \(kmalloc\|kzalloc\)(size,flag);
    + to = kmemdup(from,size,flag);
    if (to==NULL || ...) S
    - memcpy(to, from, size);

    Signed-off-by: Amitoj Kaur Chawla
    Signed-off-by: David Teigland

    Amitoj Kaur Chawla
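
For readers outside the kernel, the pattern replaced in the 24 Jun, 2016 commit above looks roughly like this in userspace C. `memdup_sketch` is a hypothetical helper that uses `malloc()` in place of `kmalloc()`. It is not the kernel's `kmemdup()`, which also takes a GFP flags argument.

```c
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for the allocate-then-copy pattern folded into one call.
 * Returns a newly allocated copy of src, or NULL if the allocation fails. */
void *memdup_sketch(const void *src, size_t len)
{
    void *dst = malloc(len);

    if (dst != NULL)
        memcpy(dst, src, len);
    return dst;
}
```

Callers then need a single NULL check instead of an allocation, a check, and a separate `memcpy()`.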
     

21 Jun, 2016

1 commit


05 Apr, 2016

1 commit

  • PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
    ago with the promise that one day it would be possible to implement the
    page cache with bigger chunks than PAGE_SIZE.

    This promise never materialized, and it is unlikely that it ever will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE. And it's a constant source of confusion on whether
    PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
    especially on the border between fs and mm.

    Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straight-forward:

    - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();

    This patch contains automated changes generated with coccinelle using
    script below. For some reason, coccinelle doesn't patch header files.
    I've called spatch for them manually.

    The only adjustment after coccinelle is a revert of the changes to the
    PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code where coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation will
    also be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
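
A small sketch of why the shift rules in the 05 Apr, 2016 commit above are safe to apply. The macro names carry a `_SKETCH` suffix to mark them as hypothetical, and the value 12 (4 KiB pages) is an assumption for illustration only.

```c
/* Assumed values for illustration: 4 KiB pages, and a page-cache shift
 * defined to be the same as the page shift. */
#define PAGE_SHIFT_SKETCH        12
#define PAGE_CACHE_SHIFT_SKETCH  PAGE_SHIFT_SKETCH

/* Old style: convert a page-cache index to a page index with a shift
 * that is always zero when the two shifts are equal. */
unsigned long old_index_to_page(unsigned long index)
{
    return index << (PAGE_CACHE_SHIFT_SKETCH - PAGE_SHIFT_SKETCH);
}

/* New style after the semantic patch: the shift is simply dropped. */
unsigned long new_index_to_page(unsigned long index)
{
    return index;
}
```

Because the two shifts are equal, both functions return the same value for every input, which is what the first two semantic patch rules rely on.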
     

29 Mar, 2016

1 commit

  • Commit 1ae1602de0 "configfs: switch ->default groups to a linked list"
    left the NULL gps pointer behind after removing the kcalloc() call which
    made it non-NULL. It also left the !gps check in place so make_cluster()
    now fails with ENOMEM. Remove the remaining uses of the gps variable to
    fix that.

    Reviewed-by: Bob Peterson
    Reviewed-by: Andreas Gruenbacher
    Signed-off-by: Andrew Price
    Signed-off-by: David Teigland

    Andrew Price
     

18 Mar, 2016

1 commit


06 Mar, 2016

1 commit

  • Replace the current NULL-terminated array of default groups with a linked
    list. This gets rid of lots of nasty code to size and/or dynamically
    allocate the array.

    While we're at it also provide a convenient helper to remove the default
    groups.

    Signed-off-by: Christoph Hellwig
    Acked-by: Felipe Balbi [drivers/usb/gadget]
    Acked-by: Joel Becker
    Acked-by: Nicholas Bellinger
    Reviewed-by: Sagi Grimberg

    Christoph Hellwig
     

23 Feb, 2016

2 commits

  • This patch fixes the problems with patch b3a5bbfd7.

    1. It removes a return statement from lowcomms_error_report
    because it needs to call the original error report in all paths
    through the function.
    2. All socket callbacks are saved and restored, not just the
    sk_error_report, and that's done so with proper locking like
    sunrpc does.

    Signed-off-by: Bob Peterson
    Signed-off-by: David Teigland

    Bob Peterson
     
  • This patch replaces the call to nodeid_to_addr with a call to
    kernel_getpeername. This avoids taking a spinlock because it may
    potentially be called from a softirq context.

    Signed-off-by: Bob Peterson
    Signed-off-by: David Teigland

    Bob Peterson
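
The two points of the first 23 Feb, 2016 commit above (always chain to the original error report, and save and restore every callback) can be sketched in userspace as follows. All names here are hypothetical, and the locking the commit mentions is deliberately left out of this single-threaded model.

```c
#include <stddef.h>

struct sock_sketch;
typedef void (*sock_cb)(struct sock_sketch *sk);

/* Snapshot of every callback that gets overridden. */
struct saved_callbacks {
    sock_cb data_ready;
    sock_cb write_space;
    sock_cb error_report;
};

/* Hypothetical, simplified socket. */
struct sock_sketch {
    sock_cb data_ready;
    sock_cb write_space;
    sock_cb error_report;
    struct saved_callbacks *saved; /* set while overrides are installed */
    int default_reports;           /* counts calls to the original handler */
};

void default_error_report(struct sock_sketch *sk)
{
    sk->default_reports++;
}

/* Override: custom handling would go here, with no early return, so the
 * original handler is reached on every path. */
void override_error_report(struct sock_sketch *sk)
{
    if (sk->saved != NULL && sk->saved->error_report != NULL)
        sk->saved->error_report(sk);
}

/* Save all callbacks, not just error_report, before overriding. */
void install_callbacks(struct sock_sketch *sk, struct saved_callbacks *saved)
{
    saved->data_ready = sk->data_ready;
    saved->write_space = sk->write_space;
    saved->error_report = sk->error_report;
    sk->saved = saved;
    sk->error_report = override_error_report;
}

/* Put every saved callback back in place. */
void restore_callbacks(struct sock_sketch *sk)
{
    if (sk->saved == NULL)
        return;
    sk->data_ready = sk->saved->data_ready;
    sk->write_space = sk->saved->write_space;
    sk->error_report = sk->saved->error_report;
    sk->saved = NULL;
}
```

In real kernel code the save and restore steps would run under the socket's callback lock, as the commit message notes.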
     

22 Jan, 2016

1 commit


04 Jan, 2016

1 commit


02 Dec, 2015

1 commit

  • This patch is a cleanup to make following patch easier to
    review.

    Goal is to move SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA
    from (struct socket)->flags to a (struct socket_wq)->flags
    to benefit from RCU protection in sock_wake_async()

    To ease backports, we rename both constants.

    Two new helpers, sk_set_bit(int nr, struct sock *sk)
    and sk_clear_bit(int nr, struct sock *sk) are added so that
    following patch can change their implementation.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     

14 Nov, 2015

1 commit

  • Pull SCSI target updates from Nicholas Bellinger:
    "This series contains HCH's changes to absorb configfs attribute
    ->show() + ->store() function pointer usage from its original
    tree-wide consumers, into common configfs code.

    It includes usb-gadget, target w/ drivers, netconsole and ocfs2
    changes to realize the improved simplicity, that now renders the
    original include/target/configfs_macros.h CPP magic for fabric drivers
    and others, unnecessary and obsolete.

    And with common code in place, new configfs attributes can be added
    easier than ever before.

    Note, there are further improvements in-flight from other folks for
    v4.5 code in configfs land, plus a number of target fixes for post -rc1
    code"

    In the meantime, a new user of the now-removed old configfs API came in
    through the char/misc tree in commit 7bd1d4093c2f ("stm class: Introduce
    an abstraction for System Trace Module devices").

    This merge resolution comes from Alexander Shishkin, who updated his stm
    class tracing abstraction to account for the removal of the old
    show_attribute and store_attribute methods in commit 517982229f78
    ("configfs: remove old API") from this pull. As Alexander says about
    that patch:

    "There's no need to keep an extra wrapper structure per item and the
    awkward show_attribute/store_attribute item ops are no longer needed.

    This patch converts policy code to the new api, all the while making
    the code quite a bit smaller and easier on the eyes.

    Signed-off-by: Alexander Shishkin"

    That patch was folded into the merge so that the tree should be fully
    bisectable.

    * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (23 commits)
    configfs: remove old API
    ocfs2/cluster: use per-attribute show and store methods
    ocfs2/cluster: move locking into attribute store methods
    netconsole: use per-attribute show and store methods
    target: use per-attribute show and store methods
    spear13xx_pcie_gadget: use per-attribute show and store methods
    dlm: use per-attribute show and store methods
    usb-gadget/f_serial: use per-attribute show and store methods
    usb-gadget/f_phonet: use per-attribute show and store methods
    usb-gadget/f_obex: use per-attribute show and store methods
    usb-gadget/f_uac2: use per-attribute show and store methods
    usb-gadget/f_uac1: use per-attribute show and store methods
    usb-gadget/f_mass_storage: use per-attribute show and store methods
    usb-gadget/f_sourcesink: use per-attribute show and store methods
    usb-gadget/f_printer: use per-attribute show and store methods
    usb-gadget/f_midi: use per-attribute show and store methods
    usb-gadget/f_loopback: use per-attribute show and store methods
    usb-gadget/ether: use per-attribute show and store methods
    usb-gadget/f_acm: use per-attribute show and store methods
    usb-gadget/f_hid: use per-attribute show and store methods
    ...

    Linus Torvalds
     

06 Nov, 2015

1 commit


04 Nov, 2015

1 commit

  • Replace wait_event_killable with wait_event_interruptible
    so that a program waiting for a posix lock can be
    interrupted by a signal. With the killable version,
    a program was not interruptible by a signal if it
    had a signal handler set for it, overriding the default
    action of terminating the process.

    Signed-off-by: Eric Ren
    Signed-off-by: David Teigland

    Eric Ren
     

23 Oct, 2015

1 commit


14 Oct, 2015

1 commit


04 Sep, 2015

1 commit

  • Pull dlm updates from David Teigland:
    "This set mainly includes a change to the way the dlm uses the SCTP API
    in the kernel, removing the direct dependency on the sctp module.
    Other odd SCTP-related fixes are also included.

    The other notable fix is for a long standing regression in the
    behavior of lock value blocks for user space locks"

    * tag 'dlm-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
    dlm: print error from kernel_sendpage
    dlm: fix lvb copy for user locks
    dlm: sctp_accept_from_sock() can be static
    dlm: fix reconnecting but not sending data
    dlm: replace BUG_ON with a less severe handling
    dlm: use sctp 1-to-1 API
    dlm: fix not reconnecting on connecting error handling
    dlm: fix race while closing connections
    dlm: fix connection stealing if using SCTP

    Linus Torvalds
     

27 Aug, 2015

1 commit


26 Aug, 2015

1 commit

  • For a userland lock request, the previous and current
    lock modes are used to decide when the lvb should be
    copied back to the user. The wrong previous value was
    used, so that it always matched the current value.
    This caused the lvb to be copied back to the user in
    the wrong cases.

    Signed-off-by: David Teigland

    David Teigland
     

18 Aug, 2015

7 commits

  • Signed-off-by: Fengguang Wu
    Signed-off-by: David Teigland

    kbuild test robot
     
  • There are cases in which lowcomms_connect_sock() is called directly,
    which caused the CF_WRITE_PENDING flag to not be set upon reconnect,
    especially in send_to_sock() error handling. In that case, the flag was
    already cleared and no further attempt at transmitting would be made.

    As dlm tends to connect when it needs to transmit something, it makes
    sense to always mark this flag right after the connect.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David Teigland

    Marcelo Ricardo Leitner
     
  • BUG_ON() is a severe action for this case, especially now that DLM with
    SCTP will use 1 socket per association. Instead, we can just close the
    socket on this error condition and return from the function.

    Also move the check to an earlier stage as it won't change and thus we
    can abort as soon as possible.

    Although this issue was reported when still using SCTP with the
    1-to-many API, this cleanup wouldn't have been that simple back then,
    because we couldn't close the socket, and making sure such events would
    cease would have been hard. In fact, the previous code was closing the
    association, yet the SCTP layer is still raising the new data event.
    Probably a bug to be fixed in SCTP.

    Reported-by:
    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David Teigland

    Marcelo Ricardo Leitner
     
  • DLM is using the 1-to-many API but in a 1-to-1 fashion. That is not
    needed, and it forces DLM to use sctp_do_peeloff() to mimic a
    kernel_accept(), which causes a symbol dependency on the sctp module.

    By switching it to 1-to-1 API we can avoid this dependency and also
    reduce quite a lot of SCTP-specific code in lowcomms.c.

    The caveat is that now DLM won't always use the same src port. It will
    choose a random one, just like TCP code. This allows the peers to
    attempt simultaneous connections, which now are handled just like for
    TCP.

    Even more sharing between TCP and SCTP code on DLM is possible, but it
    is intentionally left for a later commit.

    Note that to use nodes with this commit, you need at least the
    earlier fixes in this patchset; otherwise it will trigger some issues
    on old nodes.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David Teigland

    Marcelo Ricardo Leitner
     
  • If we don't clear that bit, lowcomms_connect_sock() will not schedule
    another attempt, and no further attempt will be done.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David Teigland

    Marcelo Ricardo Leitner
     
  • When a connection has issues, DLM may need to close it. Therefore we
    should also cancel pending workqueues for such a connection at that time,
    and not just when dlm is not willing to use this connection anymore.

    Also, if we don't clear CF_CONNECT_PENDING flag, the error handling
    routines won't be able to re-connect as lowcomms_connect_sock() will
    check for it.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David Teigland

    Marcelo Ricardo Leitner
     
  • When using SCTP and accepting a new connection, DLM currently validates
    if the peer trying to connect to it is one of the cluster nodes, but it
    doesn't check if it already has a connection to it or not.

    If it already had a connection, it will be overwritten, and the new one
    will be used for writes, possibly causing the node to leave the cluster
    due to communication breakage.

    Still, one could DoS the node by attempting N connections and keeping
    them open.

    As said, but being explicit, both situations are only triggerable from
    other cluster nodes, but are doable with only user-level perms.

    Signed-off-by: Marcelo Ricardo Leitner
    Signed-off-by: David Teigland

    Marcelo Ricardo Leitner
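
The connection-stealing fix in the last 18 Aug, 2015 commit above amounts to one extra check when accepting. The following is a simplified userspace sketch: the table layout, the `MAX_NODES_SKETCH` limit, the function name, and the specific error codes are all assumptions made for illustration, since the commit message does not state them.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_NODES_SKETCH 8

/* Hypothetical connection table indexed by node id; 1 means connected. */
struct conn_table_sketch {
    int connected[MAX_NODES_SKETCH];
};

/* Accept a connection from nodeid only if it is a known node and no
 * connection to it exists yet. Error codes are illustrative assumptions. */
int accept_from_node_sketch(struct conn_table_sketch *table, int nodeid)
{
    /* Validation that was already present: the peer must be a cluster node. */
    if (nodeid <= 0 || nodeid >= MAX_NODES_SKETCH)
        return -EINVAL;

    /* The added check: never overwrite an existing connection. */
    if (table->connected[nodeid])
        return -EEXIST;

    table->connected[nodeid] = 1;
    return 0;
}
```

A second connection attempt from the same node is refused instead of replacing the connection already used for writes.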
     

06 Aug, 2015

1 commit

  • With well over 200 users of this api, there are a mere 12 users that
    actually checked the return value of this function. And all of them
    really didn't do anything with that information as the system or module
    was shutting down no matter what.

    So stop pretending like it matters, and just return void from
    misc_deregister(). If something goes wrong in the call, you will get a
    WARNING splat in the syslog so you know how to fix up your driver.
    Other than that, there's nothing that can go wrong.

    Cc: Alasdair Kergon
    Cc: Neil Brown
    Cc: Oleg Drokin
    Cc: Andreas Dilger
    Cc: "Michael S. Tsirkin"
    Cc: Wim Van Sebroeck
    Cc: Christine Caulfield
    Cc: David Teigland
    Cc: Mark Fasheh
    Acked-by: Joel Becker
    Acked-by: Alexandre Belloni
    Acked-by: Alessandro Zummo
    Acked-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

11 May, 2015

1 commit


18 Jan, 2015

1 commit

  • Contrary to common expectations for an "int" return, these functions
    return only a positive value -- if used correctly they cannot even
    return 0 because the message header will necessarily be in the skb.

    This makes the very common pattern of

    if (genlmsg_end(...) < 0) { ... }

    be a whole bunch of dead code. Many places also simply do

    return nlmsg_end(...);

    and the caller is expected to deal with it.

    This also commonly (at least for me) causes errors, because it is very
    common to write

    if (my_function(...))
    /* error condition */

    and if my_function() does "return nlmsg_end()" this is of course wrong.

    Additionally, there's not a single place in the kernel that actually
    needs the message length returned, and if anyone needs it later then
    it'll be very easy to just use skb->len there.

    Remove this, and make the functions void. This removes a bunch of dead
    code as described above. The patch adds lines because I did

    - return nlmsg_end(...);
    + nlmsg_end(...);
    + return 0;

    I could have preserved all the function's return values by returning
    skb->len, but instead I've audited all the places calling the affected
    functions and found that none cared. A few places actually compared
    the return value with < 0 with no change in behaviour, so I opted for the more
    efficient version.

    One instance of the error I've made numerous times now is also present
    in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't
    check for
    Signed-off-by: David S. Miller

    Johannes Berg
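
The pitfall described in the 18 Jan, 2015 commit above can be reproduced with a few lines of userspace C. Everything here is a hypothetical model: `msg_sketch` stands in for an skb, and the `end_msg_*` and `fill_*` functions stand in for `nlmsg_end()` and its callers.

```c
#include <stddef.h>

/* Hypothetical message buffer; len includes the header once it is filled. */
struct msg_sketch {
    size_t len;
};

/* Old style: returns the message length, which is always positive. */
int end_msg_old(struct msg_sketch *msg)
{
    return (int)msg->len;
}

/* New style: returns nothing; callers read msg->len if they need it. */
void end_msg_new(struct msg_sketch *msg)
{
    (void)msg;
}

/* Old pattern: "return nlmsg_end(...)" leaks a positive length to the caller. */
int fill_old(struct msg_sketch *msg)
{
    msg->len = 16; /* header plus payload, illustrative value */
    return end_msg_old(msg);
}

/* New pattern from the commit message: finish the message, then return 0. */
int fill_new(struct msg_sketch *msg)
{
    msg->len = 16;
    end_msg_new(msg);
    return 0;
}

/* A caller using the common "non-zero means error" convention. */
int caller_sees_error(int (*fill)(struct msg_sketch *))
{
    struct msg_sketch msg = { 0 };

    return fill(&msg) != 0;
}
```

With `fill_old`, the caller misreads a successful fill as an error. With `fill_new`, it does not.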
     

11 Dec, 2014

1 commit

  • Pull VFS changes from Al Viro:
    "First pile out of several (there _definitely_ will be more). Stuff in
    this one:

    - unification of d_splice_alias()/d_materialize_unique()

    - iov_iter rewrite

    - killing a bunch of ->f_path.dentry users (and f_dentry macro).

    Getting that completed will make life much simpler for
    unionmount/overlayfs, since then we'll be able to limit the places
    sensitive to file _dentry_ to reasonably few. Which allows to have
    file_inode(file) pointing to inode in a covered layer, with dentry
    pointing to (negative) dentry in union one.

    Still not complete, but much closer now.

    - crapectomy in lustre (dead code removal, mostly)

    - "let's make seq_printf return nothing" preparations

    - assorted cleanups and fixes

    There _definitely_ will be more piles"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    copy_from_iter_nocache()
    new helper: iov_iter_kvec()
    csum_and_copy_..._iter()
    iov_iter.c: handle ITER_KVEC directly
    iov_iter.c: convert copy_to_iter() to iterate_and_advance
    iov_iter.c: convert copy_from_iter() to iterate_and_advance
    iov_iter.c: get rid of bvec_copy_page_{to,from}_iter()
    iov_iter.c: convert iov_iter_zero() to iterate_and_advance
    iov_iter.c: convert iov_iter_get_pages_alloc() to iterate_all_kinds
    iov_iter.c: convert iov_iter_get_pages() to iterate_all_kinds
    iov_iter.c: convert iov_iter_npages() to iterate_all_kinds
    iov_iter.c: iterate_and_advance
    iov_iter.c: macros for iterating over iov_iter
    kill f_dentry macro
    dcache: fix kmemcheck warning in switch_names
    new helper: audit_file()
    nfsd_vfs_write(): use file_inode()
    ncpfs: use file_inode()
    kill f_dentry uses
    lockd: get rid of ->f_path.dentry->d_sb
    ...

    Linus Torvalds
     

20 Nov, 2014

1 commit

  • A process may exit, leaving an orphan lock in the lockspace.
    This adds the capability for another process to acquire the
    orphan lock. Acquiring the orphan just moves the lock from
    the orphan list onto the acquiring process's list of locks.

    An adopting process must specify the resource name and mode
    of the lock it wants to adopt. If a matching lock is found,
    the lock is moved to the caller's list of locks, and the
    lkid of the lock is returned like the lkid of a new lock.

    If an orphan with a different mode is found, then -EAGAIN is
    returned. If no orphan lock is found on the resource, then
    -ENOENT is returned. No async completion is used because
    the result is immediately available.

    Also, when orphans are purged, allow a zero nodeid to refer
    to the local nodeid so the caller does not need to look up
    the local nodeid.

    Signed-off-by: David Teigland

    David Teigland
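
The return-value rules for adopting an orphan in the 20 Nov, 2014 commit above can be sketched as a lookup over an orphan list. This is a userspace model with hypothetical names (`orphan_sketch`, `adopt_orphan_sketch`) and a flat array in place of the kernel's lists. Only the three outcomes (success, `-EAGAIN`, `-ENOENT`) come from the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical orphan lock record. */
struct orphan_sketch {
    const char *name;  /* resource name */
    int mode;          /* lock mode */
    unsigned int lkid; /* lock id handed back on adoption */
    int adopted;       /* non-zero once moved to a process's lock list */
};

/* Try to adopt an orphan on resource `name` with lock mode `mode`.
 * Returns 0 and stores the lock id in *lkid on a match, -EAGAIN if only
 * orphans with a different mode exist on the resource, and -ENOENT if the
 * resource has no orphan at all. The result is available immediately. */
int adopt_orphan_sketch(struct orphan_sketch *orphans, size_t n,
                        const char *name, int mode, unsigned int *lkid)
{
    int found_other_mode = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        struct orphan_sketch *o = &orphans[i];

        if (o->adopted || strcmp(o->name, name) != 0)
            continue;
        if (o->mode != mode) {
            found_other_mode = 1;
            continue;
        }
        o->adopted = 1; /* models moving the lock to the caller's list */
        *lkid = o->lkid;
        return 0;
    }
    return found_other_mode ? -EAGAIN : -ENOENT;
}
```

Once an orphan has been adopted it no longer counts as an orphan, so a second adoption attempt on the same resource returns `-ENOENT` in this model.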
     

06 Nov, 2014

2 commits

  • Convert the seq_printf output with constant strings to seq_puts.

    Link: http://lkml.kernel.org/p/b416b016f4a6e49115ba736cad6ea2709a8bc1c4.1412031505.git.joe@perches.com

    Cc: Christine Caulfield
    Cc: David Teigland
    Cc: cluster-devel@redhat.com
    Signed-off-by: Joe Perches
    Signed-off-by: Steven Rostedt

    Joe Perches
     
  • The seq_printf() return is going away soon and users of it should
    check seq_has_overflowed() to see if the buffer is full and will
    not accept any more data.

    Convert functions returning int to void where seq_printf() is used.

    Link: http://lkml.kernel.org/p/43590057bcb83846acbbcc1fe641f792b2fb7773.1412031505.git.joe@perches.com
    Link: http://lkml.kernel.org/r/20141029220107.939492048@goodmis.org

    Acked-by: David Teigland
    Cc: Christine Caulfield
    Cc: cluster-devel@redhat.com
    Signed-off-by: Joe Perches
    Signed-off-by: Steven Rostedt

    Joe Perches
     

15 Oct, 2014

1 commit


10 Sep, 2014

1 commit


09 Aug, 2014

1 commit


12 Jun, 2014

1 commit

  • The connection struct with nodeid 0 is the listening socket,
    not a connection to another node. The sctp resend function
    was not checking that the nodeid was valid (non-zero), so it
    would mistakenly get and resend on the listening connection
    when nodeid was zero.

    Signed-off-by: Lidong Zhong
    Signed-off-by: David Teigland

    Lidong Zhong
     

07 Jun, 2014

1 commit