27 Sep, 2011

1 commit

  • The consensus seems to be that system calls such as stat() etc. should
    not trigger an automount. Neither should the l* versions.

    This patch therefore adds a LOOKUP_AUTOMOUNT flag to tag those lookups
    that _should_ trigger an automount on the last path element.

    Signed-off-by: Trond Myklebust
    [ Edited to leave out the cases that are already covered by LOOKUP_OPEN,
    LOOKUP_DIRECTORY and LOOKUP_CREATE - all of which also fundamentally
    force automounting for their own reasons - Linus ]
    Signed-off-by: Linus Torvalds

    Trond Myklebust
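
    The flag logic described above can be sketched in user-space C. This is
    a minimal illustration, not the kernel code: the SK_* flag values and the
    sketch_should_automount() helper are hypothetical names invented for this
    example, and the bit positions are not the kernel's.

```c
#include <stdbool.h>

/* Illustrative flag bits only; these are NOT the kernel's actual values. */
enum sketch_lookup_flags {
    SK_LOOKUP_DIRECTORY = 1 << 0,
    SK_LOOKUP_OPEN      = 1 << 1,
    SK_LOOKUP_CREATE    = 1 << 2,
    SK_LOOKUP_AUTOMOUNT = 1 << 3, /* explicit "do automount" tag */
};

/*
 * Return true if a lookup carrying these flags should trigger an automount
 * on the last path element. A plain stat()-style lookup passes none of the
 * flags and therefore does not automount.
 */
bool sketch_should_automount(unsigned int flags)
{
    const unsigned int forcing = SK_LOOKUP_DIRECTORY | SK_LOOKUP_OPEN |
                                 SK_LOOKUP_CREATE | SK_LOOKUP_AUTOMOUNT;

    return (flags & forcing) != 0;
}
```

    In this sketch a caller that wants automounting either passes the
    explicit tag or one of the flags that already implies it.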
     

14 Sep, 2011

2 commits

  • Do not allow multiple mounts on same mountpoint when using -o noac

    When you normally attempt to mount a share twice on the same mountpoint,
    a check in do_add_mount causes it to return an error:

    # mount localhost:/nfsv3 /mnt
    # mount localhost:/nfsv3 /mnt
    mount.nfs: /mnt is already mounted or busy

    However, when using the option 'noac', the user is able to mount the same
    share on the same mountpoint multiple times. This happens because a
    share mounted with the noac option is automatically assigned the 'sync'
    flag MS_SYNCHRONOUS in nfs_initialise_sb(). This flag is set after the
    check for already existing superblocks is done in sget(). The check for
    the mount flags in nfs_compare_mount_options() does not take into
    account the 'sync' flag applied later on in the code path. This means
    that when using 'noac', a new superblock structure is assigned for every
    new mount of the same share, so multiple mounts of that share on the same
    mountpoint are allowed.

    i.e.
    # mount -onoac localhost:/nfsv3 /mnt
    can be run multiple times.

    The patch checks for noac and assigns the sync flag before sget() is
    called to obtain an already existing superblock structure.

    Signed-off-by: Sachin Prabhu
    Reviewed-by: Jeff Layton
    Signed-off-by: Trond Myklebust

    Sachin Prabhu
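
    The ordering problem above can be modelled with a small user-space
    sketch. Everything here (the sk_* types, the table of superblocks, and
    the flag bit) is a hypothetical stand-in for illustration; it only shows
    why folding 'noac' into the flags before the superblock lookup lets a
    second identical mount find the existing superblock.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SK_MS_SYNCHRONOUS 0x10u /* illustrative 'sync' bit for this sketch */
#define SK_MAX_SUPERS     8

struct sk_mount_req { const char *share; unsigned int flags; bool noac; };
struct sk_super     { const char *share; unsigned int flags; };
struct sk_table     { struct sk_super supers[SK_MAX_SUPERS]; size_t count; };

/* Apply the implied 'sync' flag BEFORE any superblock comparison. */
unsigned int sk_effective_flags(const struct sk_mount_req *req)
{
    unsigned int flags = req->flags;

    if (req->noac)
        flags |= SK_MS_SYNCHRONOUS;
    return flags;
}

/*
 * Look for a superblock with the same share and flags; allocate a new one
 * only if none matches. Returns the index, or -1 if the table is full.
 */
int sk_get_super(struct sk_table *t, const struct sk_mount_req *req,
                 bool *reused)
{
    unsigned int flags = sk_effective_flags(req);

    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->supers[i].share, req->share) == 0 &&
            t->supers[i].flags == flags) {
            *reused = true;
            return (int)i;
        }
    }
    if (t->count == SK_MAX_SUPERS)
        return -1;
    t->supers[t->count].share = req->share;
    t->supers[t->count].flags = flags;
    *reused = false;
    return (int)t->count++;
}
```

    If the flag were instead added to the stored superblock after the
    comparison, a later request with the raw flags would never match it, and
    each mount would get a fresh superblock.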
     
  • Fix a typo which causes an Oops in the RPC layer, when using wsize < 4k.

    Signed-off-by: Trond Myklebust
    Tested-by: Sricharan R

    Trond Myklebust
     

25 Aug, 2011

4 commits


19 Aug, 2011

1 commit


12 Aug, 2011

1 commit

  • Just like files-layout, blocks & objects layouts are part of the
    NFS 4.1 protocol and should be automatically selected if NFS_4_1
    is selected. The small problem is that these depend on other
    Kernel support being present, while files only depends on NFS
    itself.

    This patch removes the objects and blocks layouts from the user's
    choices, but makes sure they are selected only if the subsystems they
    depend on are present in the Kernel.

    Signed-off-by: Boaz Harrosh
    Acked-by: Peng Tao
    Signed-off-by: Linus Torvalds

    Boaz Harrosh
     

11 Aug, 2011

1 commit

  • PNFS_BLOCK needs BLK_DEV_DM/MD, which is not a dependency for other
    pnfs layout drivers. Separate it out so others can still build when
    BLK_DEV_DM/MD is not enabled.

    Also change select to depends on to avoid build failures.

    Reported-and-tested-by: Randy Dunlap
    Signed-off-by: Peng Tao
    Acked-by: Benny Halevy
    Signed-off-by: Linus Torvalds

    Peng Tao
     

04 Aug, 2011

5 commits

  • If the client is in the process of resetting the session when it receives
    a callback, then returning NFS4ERR_DELAY may cause a deadlock with the
    DESTROY_SESSION call.

    Basically, if the client returns NFS4ERR_DELAY in response to the
    CB_SEQUENCE call, then the server is entitled to believe that the
    client is busy because it is already processing that call. In that
    case, the server is perfectly entitled to respond with a
    NFS4ERR_BACK_CHAN_BUSY to any DESTROY_SESSION call.

    Fix this by having the client reply with a NFS4ERR_BADSESSION in
    response to the callback if it is resetting the session.

    Cc: stable@kernel.org [2.6.38+]
    Signed-off-by: Trond Myklebust

    Trond Myklebust
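
    A minimal sketch of the reply selection described above, assuming a
    simplified session with two booleans. The sk_* names are hypothetical;
    the numeric status values are the ones listed for these errors in
    RFC 5661.

```c
#include <stdbool.h>

/* NFSv4.1 status codes (RFC 5661), prefixed to mark this as a sketch. */
#define SK_NFS4_OK            0
#define SK_NFS4ERR_DELAY      10008
#define SK_NFS4ERR_BADSESSION 10052

/* Hypothetical, simplified view of the client's session state. */
struct sk_session {
    bool resetting; /* client is tearing down / re-creating the session */
    bool slot_busy; /* callback slot is still processing a request */
};

/*
 * Choose the status for a CB_SEQUENCE reply. While the session is being
 * reset, answer BADSESSION rather than DELAY so that the server does not
 * conclude the back channel is busy and refuse DESTROY_SESSION.
 */
int sk_cb_sequence_status(const struct sk_session *s)
{
    if (s->resetting)
        return SK_NFS4ERR_BADSESSION;
    if (s->slot_busy)
        return SK_NFS4ERR_DELAY;
    return SK_NFS4_OK;
}
```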
     
  • Currently, there is no guarantee that we will call nfs4_cb_take_slot() even
    though nfs4_callback_compound() will consistently call
    nfs4_cb_free_slot() provided the cb_process_state has set the 'clp' field.
    The result is that we can trigger the BUG_ON() upon the next call to
    nfs4_cb_take_slot().

    This patch fixes the above problem by using the slot id that was taken in
    the CB_SEQUENCE operation as a flag for whether or not we need to call
    nfs4_cb_free_slot().
    It also fixes an atomicity problem: we need to set tbl->highest_used_slotid
    atomically with the check for NFS4_SESSION_DRAINING, otherwise we end up
    racing with the various tests in nfs4_begin_drain_session().

    Cc: stable@kernel.org [2.6.38+]
    Signed-off-by: Trond Myklebust

    Trond Myklebust
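
    The "slot id as a flag" idea above can be sketched as follows. This is a
    single-threaded illustration with hypothetical sk_* names and a
    hypothetical SK_NO_SLOT sentinel; it does not model the atomicity fix,
    only the pairing of take and free.

```c
#include <stdbool.h>

#define SK_NO_SLOT (-1) /* sentinel: no slot was taken for this compound */

struct sk_slot_table { int taken; };  /* number of slots currently held */
struct sk_cb_state   { int slotid; }; /* per-compound processing state */

void sk_cb_state_init(struct sk_cb_state *cps)
{
    cps->slotid = SK_NO_SLOT;
}

/* CB_SEQUENCE step: take a slot unless the session is draining. */
bool sk_cb_sequence(struct sk_slot_table *tbl, struct sk_cb_state *cps,
                    bool draining)
{
    if (draining)
        return false;
    tbl->taken++;
    cps->slotid = 0; /* remember that a slot is now held */
    return true;
}

/* End of compound: free the slot only if one was actually taken. */
void sk_cb_compound_done(struct sk_slot_table *tbl, struct sk_cb_state *cps)
{
    if (cps->slotid != SK_NO_SLOT) {
        tbl->taken--;
        cps->slotid = SK_NO_SLOT;
    }
}
```

    Because the free is keyed on the recorded slot id rather than on an
    unrelated field, a compound that never took a slot cannot unbalance the
    table.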
     
  • There were bugs in the case of partial layout where olo_comp_index
    is not zero. This used to work and was tested, but one of the later
    cleanup SQUASHMEs broke it and it has not been tested since.

    Also add a dprint that prints the received layout parameters.
    Everything else was already printed.

    [Needed in v3.0]
    CC: Stable Tree
    Signed-off-by: Boaz Harrosh
    Signed-off-by: Trond Myklebust

    Boaz Harrosh
     
  • When the number of pages we want to encode is bigger than the
    size of the bio (which can currently happen only when all IO is
    going to a single device, e.g. group_width==1), the IO is
    submitted short and we report back only the amount of bytes we
    actually wrote/read, and all is fine. BUT ...

    There was a bug where the current length counter was advanced
    before the failed attempt to add the extra page, leading to a situation
    where the CDB length was one page longer than the actual bio size,
    which is of course rejected by the osd-target.

    While here, also fix the bio size calculation in the case
    that we received more than one group of devices.

    CC: Stable Tree
    Signed-off-by: Boaz Harrosh
    Signed-off-by: Trond Myklebust

    Boaz Harrosh
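
    The accounting rule behind this fix can be shown with a small sketch:
    advance the length only after a page has actually been added. The sk_*
    types and the fixed page size are assumptions made for this example and
    are not the objects-layout code.

```c
#include <stdbool.h>
#include <stddef.h>

#define SK_PAGE_SIZE 4096u /* assumed page size for the sketch */

/* Hypothetical stand-in for a bio with a fixed page capacity. */
struct sk_bio { size_t max_pages; size_t nr_pages; };

/* Try to add one page; fails once the bio is full. */
bool sk_bio_add_page(struct sk_bio *bio)
{
    if (bio->nr_pages == bio->max_pages)
        return false;
    bio->nr_pages++;
    return true;
}

/*
 * Queue up to want_pages pages and return the number of bytes queued.
 * The length is bumped only AFTER a successful add, so a short IO
 * reports exactly what the bio holds.
 */
size_t sk_queue_pages(struct sk_bio *bio, size_t want_pages)
{
    size_t length = 0;

    for (size_t i = 0; i < want_pages; i++) {
        if (!sk_bio_add_page(bio))
            break; /* short IO: stop without counting this page */
        length += SK_PAGE_SIZE;
    }
    return length;
}
```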
     
  • Fix this compile error on s390:

    CC [M] fs/nfs/blocklayout/blocklayout.o
    fs/nfs/blocklayout/blocklayout.c: In function 'bl_end_io_read':
    fs/nfs/blocklayout/blocklayout.c:201:4: error: implicit declaration of function 'prefetchw'

    Introduced with 9549ec01 "pnfsblock: bl_read_pagelist".

    Cc: Fred Isaman
    Signed-off-by: Heiko Carstens
    Signed-off-by: Trond Myklebust

    Heiko Carstens
     

02 Aug, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
    xfs: Fix build breakage in xfs_iops.c when CONFIG_FS_POSIX_ACL is not set
    VFS: Reorganise shrink_dcache_for_umount_subtree() after demise of dcache_lock
    VFS: Remove dentry->d_lock locking from shrink_dcache_for_umount_subtree()
    VFS: Remove detached-dentry counter from shrink_dcache_for_umount_subtree()
    switch posix_acl_chmod() to umode_t
    switch posix_acl_from_mode() to umode_t
    switch posix_acl_equiv_mode() to umode_t *
    switch posix_acl_create() to umode_t *
    block: initialise bd_super in bdget()
    vfs: avoid call to inode_lru_list_del() if possible
    vfs: avoid taking inode_hash_lock on pipes and sockets
    vfs: conditionally call inode_wb_list_del()
    VFS: Fix automount for negative autofs dentries
    Btrfs: load the key from the dir item in readdir into a fake dentry
    devtmpfs: missing initialialization in never-hit case
    hppfs: missing include

    Linus Torvalds
     

01 Aug, 2011

24 commits

  • so we can pass &inode->i_mode to it

    Signed-off-by: Al Viro

    Al Viro
     
  • Fix two recently introduced compile problems:

    Fix a typo in fs/nfs/pnfs.h

    Move the pnfs_blksize declaration outside the CONFIG_NFS_V4 section in
    struct nfs_server.

    Reported-by: Jens Axboe
    Signed-off-by: Trond Myklebust
    Signed-off-by: Linus Torvalds

    Trond Myklebust
     
  • * 'nfs-for-3.1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (28 commits)
    pnfsblock: write_pagelist handle zero invalid extents
    pnfsblock: note written INVAL areas for layoutcommit
    pnfsblock: bl_write_pagelist
    pnfsblock: bl_read_pagelist
    pnfsblock: cleanup_layoutcommit
    pnfsblock: encode_layoutcommit
    pnfsblock: merge rw extents
    pnfsblock: add extent manipulation functions
    pnfsblock: bl_find_get_extent
    pnfsblock: xdr decode pnfs_block_layout4
    pnfsblock: call and parse getdevicelist
    pnfsblock: merge extents
    pnfsblock: lseg alloc and free
    pnfsblock: remove device operations
    pnfsblock: add device operations
    pnfsblock: basic extent code
    pnfsblock: use pageio_ops api
    pnfsblock: add blocklayout Kconfig option, Makefile, and stubs
    pnfs: cleanup_layoutcommit
    pnfs: ask for layout_blksize and save it in nfs_server
    ...

    Linus Torvalds
     
  • For invalid extents, find other pages in the same fsblock and write them out.

    [pnfsblock: write_begin]
    Signed-off-by: Fred Isaman
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    Signed-off-by: Peng Tao
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Peng Tao
     
  • Signed-off-by: Peng Tao
    Signed-off-by: Fred Isaman
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • Note: When the upper layer's read/write request cannot be fulfilled, the
    block layout driver shouldn't silently mark the page as an error. It
    should do what can be done and leave the rest to the upper layer. To do
    so, we should set rdata/wdata->res.count properly.

    When the upper layer re-sends the read/write request to finish the rest
    of the request, pgbase is the position where we should start.

    [pnfsblock: bl_write_pagelist support functions]
    [pnfsblock: bl_write_pagelist adjust for missing PG_USE_PNFS]
    Signed-off-by: Fred Isaman
    [pnfsblock: handle errors when read or write pagelist.]
    Signed-off-by: Zhang Jingwang
    [pnfs-block: use new write_pagelist api]
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees

    [SQUASHME: pnfsblock: mds_offset is set in the generic layer]
    Signed-off-by: Boaz Harrosh
    Signed-off-by: Benny Halevy

    [pnfsblock: mark IO error with NFS_LAYOUT_{RW|RO}_FAILED]
    Signed-off-by: Peng Tao
    [pnfsblock: SQUASHME: adjust to API change]
    Signed-off-by: Fred Isaman
    [pnfsblock: fixup blksize alignment in bl_setup_layoutcommit]
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    [pnfsblock: bl_write_pagelist adjust for missing PG_USE_PNFS]
    Signed-off-by: Fred Isaman
    [pnfsblock: handle errors when read or write pagelist.]
    Signed-off-by: Zhang Jingwang
    [pnfs-block: use new write_pagelist api]
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • Note: When the upper layer's read/write request cannot be fulfilled, the
    block layout driver shouldn't silently mark the page as an error. It
    should do what can be done and leave the rest to the upper layer. To do
    so, we should set rdata/wdata->res.count properly.

    When the upper layer re-sends the read/write request to finish the rest
    of the request, pgbase is the position where we should start.

    [pnfsblock: mark IO error with NFS_LAYOUT_{RW|RO}_FAILED]
    Signed-off-by: Peng Tao
    [pnfsblock: read path error handling]
    Signed-off-by: Fred Isaman
    [pnfsblock: handle errors when read or write pagelist.]
    Signed-off-by: Zhang Jingwang
    [pnfs-block: use new read_pagelist api]
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • In the blocklayout driver, two things happen during
    layoutcommit/cleanup:
    1. The modified extents are encoded.
    2. On cleanup the extents are put back on the layout rw
    extents list, for reads.

    In the new system where actual xdr encoding is done in
    encode_layoutcommit() directly into xdr buffer, these are
    the new commit stages:

    1. On setup_layoutcommit, the range is adjusted as before
    and a structure is allocated for communication with
    bl_encode_layoutcommit && bl_cleanup_layoutcommit
    (Generic layer provides a void-star to hang it on)

    2. bl_encode_layoutcommit is called to do the actual
    encoding directly into xdr. The commit-extent-list is not
    freed and is stored on the above structure.
    FIXME: The code is not yet converted to the new XDR cleanup

    3. On cleanup the commit-extent-list is put back by a call
    to set_to_rw() as before, but with no need for XDR decoding
    of the list as before, and the commit-extent-list is freed.
    Finally, the allocated structure is freed.

    [rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()]
    Signed-off-by: Jim Rees
    [pnfsblock: introduce bl_committing list]
    Signed-off-by: Peng Tao
    [pnfsblock: SQUASHME: adjust to API change]
    Signed-off-by: Fred Isaman
    [blocklayout: encode_layoutcommit implementation]
    Signed-off-by: Boaz Harrosh
    [pnfsblock: fix bug setting up layoutcommit.]
    Signed-off-by: Tao Guo
    [pnfsblock: cleanup_layoutcommit wants a status parameter]
    Signed-off-by: Boaz Harrosh
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
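
    The three commit stages listed above can be sketched as a small
    user-space lifecycle. All names here (sk_layout, sk_commit_ctx, the
    extent counters) are hypothetical simplifications; extents are reduced
    to counts, and no real XDR encoding is performed.

```c
#include <stddef.h>
#include <stdlib.h>

/* Context shared between the encode and cleanup stages. */
struct sk_commit_ctx { size_t nr_extents; int encoded; };

/* Simplified layout: extent lists reduced to counters. */
struct sk_layout {
    size_t rw_extents;    /* extents available for reads */
    size_t dirty_extents; /* modified extents awaiting commit */
    void *priv;           /* opaque pointer provided by the generic layer */
};

/* Stage 1: allocate the context and hang it on the opaque pointer. */
int sk_setup_layoutcommit(struct sk_layout *lo)
{
    struct sk_commit_ctx *ctx = calloc(1, sizeof(*ctx));

    if (!ctx)
        return -1;
    lo->priv = ctx;
    return 0;
}

/* Stage 2: "encode" the modified extents, keeping the list in the context. */
void sk_encode_layoutcommit(struct sk_layout *lo)
{
    struct sk_commit_ctx *ctx = lo->priv;

    ctx->nr_extents = lo->dirty_extents;
    lo->dirty_extents = 0;
    ctx->encoded = 1;
}

/* Stage 3: put the extents back on the rw list and free the context. */
void sk_cleanup_layoutcommit(struct sk_layout *lo)
{
    struct sk_commit_ctx *ctx = lo->priv;

    lo->rw_extents += ctx->nr_extents;
    free(ctx);
    lo->priv = NULL;
}
```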
     
  • In the blocklayout driver, two things happen during
    layoutcommit/cleanup:
    1. The modified extents are encoded.
    2. On cleanup the extents are put back on the layout rw
    extents list, for reads.

    In the new system where actual xdr encoding is done in
    encode_layoutcommit() directly into xdr buffer, these are
    the new commit stages:

    1. On setup_layoutcommit, the range is adjusted as before
    and a structure is allocated for communication with
    bl_encode_layoutcommit && bl_cleanup_layoutcommit
    (Generic layer provides a void-star to hang it on)

    2. bl_encode_layoutcommit is called to do the actual
    encoding directly into xdr. The commit-extent-list is not
    freed and is stored on the above structure.
    FIXME: The code is not yet converted to the new XDR cleanup

    3. On cleanup the commit-extent-list is put back by a call
    to set_to_rw() as before, but with no need for XDR decoding
    of the list as before, and the commit-extent-list is freed.
    Finally, the allocated structure is freed.

    [rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()]
    [pnfsblock: get rid of deprecated xdr macros]
    Signed-off-by: Jim Rees
    Signed-off-by: Peng Tao
    Signed-off-by: Fred Isaman
    [blocklayout: encode_layoutcommit implementation]
    Signed-off-by: Boaz Harrosh
    [pnfsblock: fix bug setting up layoutcommit.]
    Signed-off-by: Tao Guo
    [pnfsblock: prevent commit list corruption]
    [pnfsblock: fix layoutcommit with an empty opaque]
    Signed-off-by: Fred Isaman
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • Signed-off-by: Fred Isaman
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • Adds working implementations of various support functions
    to handle INVAL extents, needed by writes, such as
    bl_mark_sectors_init and bl_is_sector_init.

    [pnfsblock: fix 64-bit compiler warnings for extent manipulation]
    Signed-off-by: Fred Isaman
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    [Implement release_inval_marks]
    Signed-off-by: Zhang Jingwang
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • Implement bl_find_get_extent(), one of the core extent manipulation
    routines.

    [pnfsblock: Lookup list entry of layouts and tags in reverse order]
    Signed-off-by: Zhang Jingwang
    Signed-off-by: Fred Isaman
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees

    pnfsblock: fix print format warnings for sector_t and size_t

    gcc spews warnings about these on x86_64, e.g.:
    fs/nfs/blocklayout/blocklayout.c:74: warning: format ‘%Lu’ expects type ‘long long unsigned int’, but argument 2 has type ‘sector_t’
    fs/nfs/blocklayout/blocklayout.c:388: warning: format ‘%d’ expects type ‘int’, but argument 5 has type ‘size_t’

    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • XDR decodes the block layout payload sent in LAYOUTGET result, storing
    the result in an extent list.

    [pnfsblock: get rid of deprecated xdr macros]
    Signed-off-by: Jim Rees
    Signed-off-by: Fred Isaman
    [pnfsblock: fix bug getting pnfs_layout_type in translate_devid().]
    Signed-off-by: Tao Guo
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • Call GETDEVICELIST during mount, then call and parse GETDEVICEINFO
    for each device returned.

    [pnfsblock: get rid of deprecated xdr macros]
    Signed-off-by: Jim Rees
    [pnfsblock: fix pnfs_deviceid references]
    Signed-off-by: Fred Isaman
    [pnfsblock: fix print format warnings for sector_t and size_t]
    [pnfs-block: #include ]
    [pnfsblock: no PNFS_NFS_SERVER]
    Signed-off-by: Benny Halevy
    [pnfsblock: fix bug determining size of striped volume]
    [pnfsblock: fix oops when using multiple devices]
    Signed-off-by: Fred Isaman
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    [pnfsblock: get rid of vmap and deviceid->area structure]
    Signed-off-by: Peng Tao
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • Replace a stub, so that extents underlying the layouts are properly
    added, merged, or ignored as necessary.

    Signed-off-by: Fred Isaman
    [pnfsblock: delete the new node before put it]
    Signed-off-by: Mingyang Guo
    Signed-off-by: Benny Halevy
    Signed-off-by: Peng Tao
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • Signed-off-by: Fred Isaman
    [pnfsblock: fix bug getting pnfs_layout_type in translate_devid().]
    Signed-off-by: Tao Guo
    Signed-off-by: Benny Halevy
    Signed-off-by: Zhang Jingwang
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • Signed-off-by: Jim Rees
    Signed-off-by: Fred Isaman
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    [upcall bugfixes]
    Signed-off-by: Peng Tao
    Signed-off-by: Trond Myklebust

    Jim Rees
     
  • Signed-off-by: Jim Rees
    Signed-off-by: Fred Isaman
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    [upcall bugfixes]
    Signed-off-by: Peng Tao
    Signed-off-by: Trond Myklebust

    Jim Rees
     
  • Adds structures and basic create/delete code for extents.

    Signed-off-by: Fred Isaman
    Signed-off-by: Benny Halevy
    Signed-off-by: Zhang Jingwang
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • [pnfsblock: use pnfs_generic_pg_init_read/write]
    Signed-off-by: Peng Tao
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Benny Halevy
     
  • Define a configuration variable to enable/disable compilation of the
    block driver code.

    Add the minimal structure for a pnfs block layout driver, and empty
    list-heads that will hold the extent data.

    [pnfsblock: make NFS_V4_1 select PNFS_BLOCK]
    Signed-off-by: Peng Tao
    Signed-off-by: Fred Isaman
    Signed-off-by: Benny Halevy
    [pnfs-block: fix CONFIG_PNFS_BLOCK dependencies]
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    [pnfsblock: SQUASHME: adjust to API change]
    Signed-off-by: Fred Isaman
    [pnfs: move pnfs_layout_type inline in nfs_inode]
    Signed-off-by: Benny Halevy
    [blocklayout: encode_layoutcommit implementation]
    Signed-off-by: Boaz Harrosh
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    [pnfsblock: layout alloc and free]
    Signed-off-by: Fred Isaman
    [pnfs: move pnfs_layout_type inline in nfs_inode]
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    [pnfsblock: define module alias]
    Signed-off-by: Peng Tao
    [rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()]
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • This gives layout drivers a chance to clean up structures they put in
    place at encode_layoutcommit.

    Signed-off-by: Andy Adamson
    [fixup layout header pointer for layoutcommit]
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    [rm inode and pnfs_layout_hdr args from cleanup_layoutcommit()]
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Andy Adamson
     
  • Block layout needs it to determine IO size.

    Signed-off-by: Fred Isaman
    Signed-off-by: Tao Guo
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Fred Isaman
     
  • To allow the layout driver to issue getdevicelist at mount time, and
    clean up at umount time.

    [fixup non NFS_V4_1 set_pnfs_layoutdriver definition]
    [pnfs: pass mntfh down the init_pnfs path]
    Signed-off-by: Benny Halevy
    Signed-off-by: Benny Halevy
    Signed-off-by: Jim Rees
    Signed-off-by: Trond Myklebust

    Benny Halevy