07 Nov, 2011

1 commit

  • * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
    Revert "tracing: Include module.h in define_trace.h"
    irq: don't put module.h into irq.h for tracking irqgen modules.
    bluetooth: macroize two small inlines to avoid module.h
    ip_vs.h: fix implicit use of module_get/module_put from module.h
    nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
    include: replace linux/module.h with "struct module" wherever possible
    include: convert various register fcns to macros to avoid include chaining
    crypto.h: remove unused crypto_tfm_alg_modname() inline
    uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
    pm_runtime.h: explicitly requires notifier.h
    linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
    miscdevice.h: fix up implicit use of lists and types
    stop_machine.h: fix implicit use of smp.h for smp_processor_id
    of: fix implicit use of errno.h in include/linux/of.h
    of_platform.h: delete needless include
    acpi: remove module.h include from platform/aclinux.h
    miscdevice.h: delete unnecessary inclusion of module.h
    device_cgroup.h: delete needless include
    net: sch_generic remove redundant use of
    net: inet_timewait_sock doesnt need
    ...

    Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
    - drivers/media/dvb/frontends/dibx000_common.c
    - drivers/media/video/{mt9m111.c,ov6650.c}
    - drivers/mfd/ab3550-core.c
    - include/linux/dmaengine.h

    Linus Torvalds
     

05 Nov, 2011

1 commit

  • * 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (25 commits)
    nfs: set vs_hidden on nfs4_callback_version4 (try #2)
    pnfs-obj: Support for RAID5 read-4-write interface.
    pnfs-obj: move to ore 03: Remove old raid engine
    pnfs-obj: move to ore 02: move to ORE
    pnfs-obj: move to ore 01: ore_layout & ore_components
    pnfs-obj: Rename objlayout_io_state => objlayout_io_res
    pnfs-obj: Get rid of objlayout_{alloc,free}_io_state
    pnfs-obj: Return PNFS_NOT_ATTEMPTED in case of read/write_pagelist
    pnfs-obj: Remove redundant EOF from objlayout_io_state
    nfs: Remove unused variable from write.c
    nfs: Fix unused variable warning from file.c
    NFS: Remove no-op less-than-zero checks on unsigned variables.
    NFS: Clean up nfs4_xdr_dec_secinfo()
    NFS: Fix documenting comment for nfs_create_request()
    NFS4: fix cb_recallany decode error
    nfs4: serialize layoutcommit
    SUNRPC: remove rpcbind clients destruction on module cleanup
    SUNRPC: remove rpcbind clients creation during service registering
    NFSd: call svc rpcbind cleanup explicitly
    SUNRPC: cleanup service destruction
    ...

    Linus Torvalds
     

04 Nov, 2011

1 commit


03 Nov, 2011

11 commits

  • Trond Myklebust
     
  • The ORE needs to be supplied with an r4w_get_page/r4w_put_page API
    by the filesystem so it can get cache pages to read into when
    writing partial stripes.

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Trond Myklebust

    Boaz Harrosh
     
  • Finally remove all the old raid engine, which is by now
    dead code.

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Trond Myklebust

    Boaz Harrosh
     
  • In this patch we are actually moving to the ORE.
    (Object Raid Engine).

    objio_state holds a pointer to an ore_io_state. Once
    we have an ore_io_state at hand we can call the ore
    for reading/writing. We register on the done path
    to kick off the nfs io_done mechanism.

    Again for Ease of reviewing the old code is "#if 0"
    but is not removed so the diff command works better.
    The old code will be removed in the next patch.

    fs/exofs/Kconfig::ORE is modified to also be auto-included
    if PNFS_OBJLAYOUT is set. Since we now depend on ORE.
    (See comments in fs/exofs/Kconfig)

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Trond Myklebust

    Boaz Harrosh
     
  • For Ease of reviewing I split the move to ore into 3 parts
    move to ore 01: ore_layout & ore_components
    move to ore 02: move to ORE
    move to ore 03: Remove old raid engine

    This patch modifies the objio_lseg, layout-segment level
    and devices and components arrays to use the ORE types.

    Though it will be removed soon, also the raid engine
    is modified to actually compile, possibly run, with
    the new types. So it is the same old raid engine but
    with some new ORE types.

    For Ease of reviewing, some of the old code is
    "#if 0" but is not removed so the diff command works
    better. The old code will be removed in the 3rd patch.

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Trond Myklebust

    Boaz Harrosh
     
  • * All instances of objlayout_io_state => objlayout_io_res
    * All instances of state => oir;
    * All instances of ol_state => oir;

    Big but nothing to it

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Trond Myklebust

    Boaz Harrosh
     
  • This is part of moving objio_osd to use the ORE.

    objlayout_io_state had two functions:
    1. It was used in the error reporting mechanism at layout_return.
    This function is kept intact.
    (Later patch will rename objlayout_io_state => objlayout_io_res)
    2. Carrier of rw io members into the objio_read/write_pagelist API.
    This is removed in this patch.

    The {r,w}data received from NFS are passed directly to the
    objio_{read,write}_pagelist API. The io_engine now allocates
    its own IO state as part of the read/write. The minimal
    functionality that was part of the generic allocation is passed
    to the io_engine.

    So part of this patch is rename of:
    ios->ol_state.foo => ios->foo

    At objlayout_{read,write}_done an objlayout_io_state is passed that
    denotes the result of the IO. (Hence the later name change).
    If the IO is successful objlayout calls an objio_free_result() API
    immediately (Which for objio_osd causes the release of the io_state).
    If the IO ended in an error it is held until reported in
    layout_return and is released later through the objio_free_result()
    API. (All this is not new, just renamed and cleaned up.)

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Trond Myklebust

    Boaz Harrosh
     
  • The objlayout driver was always returning PNFS_ATTEMPTED from its
    read/write_pagelist operations, even on error. Fix that.

    Start by establishing an error return API from the io-engine, by
    not returning ssize_t (length-or-error) but returning "int":
    0=OK, <0=Error. And clean up all return types in the io-engine.

    Then, if the io-engine returns an error, return PNFS_NOT_ATTEMPTED
    to the generic layer. (With a dprint)

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Trond Myklebust

    Boaz Harrosh
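
    The return-code convention described in the entry above can be
    illustrated with a minimal user-space sketch. Everything prefixed
    toy_/TOY_ is a hypothetical stand-in invented for this example and is
    not the objlayout code itself; it only assumes an io-engine that
    returns 0 on success and a negative errno on failure.

```c
#include <errno.h>

/* Hypothetical stand-in for the two outcomes reported to the generic layer. */
enum toy_try_status { TOY_PNFS_ATTEMPTED, TOY_PNFS_NOT_ATTEMPTED };

/* Hypothetical io-engine entry point: 0 = OK, negative errno = error. */
static int toy_io_engine_read(int simulate_failure)
{
    return simulate_failure ? -EIO : 0;
}

/* Layout-driver wrapper: an engine error means the IO was not attempted. */
static enum toy_try_status toy_read_pagelist(int simulate_failure)
{
    int err = toy_io_engine_read(simulate_failure);

    if (err < 0)
        return TOY_PNFS_NOT_ATTEMPTED;
    return TOY_PNFS_ATTEMPTED;
}
```

    With a plain int there is no need to distinguish a byte count from an
    error code, which is presumably why the ssize_t return was dropped.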
     
  • The EOF calculation was done on .read_pagelist(), cached
    in objlayout_io_state->eof, and set in objlayout_read_done()
    into nfs_read_data->res.eof.

    So set it directly into nfs_read_data->res.eof and avoid
    the extra member.

    Signed-off-by: Boaz Harrosh
    Signed-off-by: Trond Myklebust

    Boaz Harrosh
     
  • When CONFIG_NFS=y and CONFIG_NFS_V3_{,V4}=n we get the following warning.

    fs/nfs/write.c: In function ‘nfs_writeback_done’:
    fs/nfs/write.c:1246:21: warning: unused variable ‘server’

    Remove the variable 'server' to fix the above warning.

    Signed-off-by: Rakib Mullick
    Signed-off-by: Trond Myklebust

    Rakib Mullick
     
  • Fix the following unused variable warning.

    fs/nfs/file.c: In function ‘nfs_file_release’:
    fs/nfs/file.c:140:17: warning: unused variable ‘dentry’
    fs/nfs/file.c: In function ‘nfs_file_read’:
    fs/nfs/file.c:237:9: warning: unused variable ‘count’

    Signed-off-by: Rakib Mullick
    Signed-off-by: Trond Myklebust

    Rakib Mullick
     

02 Nov, 2011

2 commits


01 Nov, 2011

2 commits

  • Some files were using the complete module.h infrastructure without
    actually including the header at all. Fix them up in advance so
    once the implicit presence is removed, we won't get failures like this:

    CC [M] fs/nfsd/nfssvc.o
    fs/nfsd/nfssvc.c: In function 'nfsd_create_serv':
    fs/nfsd/nfssvc.c:335: error: 'THIS_MODULE' undeclared (first use in this function)
    fs/nfsd/nfssvc.c:335: error: (Each undeclared identifier is reported only once
    fs/nfsd/nfssvc.c:335: error: for each function it appears in.)
    fs/nfsd/nfssvc.c: In function 'nfsd':
    fs/nfsd/nfssvc.c:555: error: implicit declaration of function 'module_put_and_exit'
    make[3]: *** [fs/nfsd/nfssvc.o] Error 1

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     
  • These files were getting <linux/module.h> via an implicit include
    path, but we want to crush those out of existence since they cost
    time during compiles of processing thousands of lines of headers
    for no reason. Give them the lightweight header that just contains
    the EXPORT_SYMBOL infrastructure.

    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

31 Oct, 2011

5 commits


29 Oct, 2011

1 commit

  • * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/hch/vfs-queue: (21 commits)
    leases: fix write-open/read-lease race
    nfs: drop unnecessary locking in llseek
    ext4: replace cut'n'pasted llseek code with generic_file_llseek_size
    vfs: add generic_file_llseek_size
    vfs: do (nearly) lockless generic_file_llseek
    direct-io: merge direct_io_walker into __blockdev_direct_IO
    direct-io: inline the complete submission path
    direct-io: separate map_bh from dio
    direct-io: use a slab cache for struct dio
    direct-io: rearrange fields in dio/dio_submit to avoid holes
    direct-io: fix a wrong comment
    direct-io: separate fields only used in the submission path from struct dio
    vfs: fix spinning prevention in prune_icache_sb
    vfs: add a comment to inode_permission()
    vfs: pass all mask flags check_acl and posix_acl_permission
    vfs: add hex format for MAY_* flag values
    vfs: indicate that the permission functions take all the MAY_* flags
    compat: sync compat_stats with statfs.
    vfs: add "device" tag to /proc/self/mountstats
    cleanup: vfs: small comment fix for block_invalidatepage
    ...

    Fix up trivial conflict in fs/gfs2/file.c (llseek changes)

    Linus Torvalds
     

28 Oct, 2011

2 commits

  • This makes NFS follow the standard generic_file_llseek locking scheme.

    Cc: Trond.Myklebust@netapp.com
    Signed-off-by: Andi Kleen
    Signed-off-by: Christoph Hellwig

    Andi Kleen
     
  • The i_mutex lock use of generic_file_llseek hurts. Independent processes
    accessing the same file synchronize over a single lock, even though
    they have no need for synchronization at all.

    Under high utilization this can cause llseek to scale very poorly on larger
    systems.

    This patch does some rethinking of the llseek locking model:

    First the 64bit f_pos is not necessarily atomic without locks
    on 32bit systems. This can already cause races with read() today.
    This was discussed on linux-kernel in the past and deemed acceptable.
    The patch does not change that.

    Let's look at the different seek variants:

    SEEK_SET: Doesn't really need any locking.
    If there's a race one writer wins, the other loses.

    For 32bit the non atomic update races against read()
    stay the same. Without a lock they can also happen
    against write() now. The read() race was deemed
    acceptable in past discussions, and I think if it's
    ok for read it's ok for write too.

    => Don't need a lock.

    SEEK_END: This behaves like SEEK_SET plus it reads
    the maximum size too. Reading the maximum size would have the
    32bit atomic problem. But luckily we already have a way to read
    the maximum size without locking (i_size_read), so we
    can just use that instead.

    Without i_mutex there is no synchronization with write() anymore,
    however since the write() update is atomic on 64bit it just behaves
    like another racy SEEK_SET. On non atomic 32bit it's the same
    as SEEK_SET.

    => Don't need a lock, but need to use i_size_read()

    SEEK_CUR: This has a read-modify-write race window
    on the same file. One could argue that any application
    doing unsynchronized seeks on the same file is already broken.
    But for the sake of not adding a regression here I'm
    using the file->f_lock to synchronize this. Using this
    lock is much better than the inode mutex because it doesn't
    synchronize between processes.

    => So still need a lock, but can use a f_lock.

    This patch implements this new scheme in generic_file_llseek.
    I dropped generic_file_llseek_unlocked and changed all callers.

    Signed-off-by: Andi Kleen
    Signed-off-by: Christoph Hellwig

    Andi Kleen
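
    The per-variant locking rules in the entry above can be sketched in
    user space as follows. This is only an illustration under stated
    assumptions, not the kernel implementation: struct toy_file, its
    fields, and toy_llseek are hypothetical names, a pthread mutex stands
    in for file->f_lock, a plain field stands in for the value returned by
    i_size_read(), and overflow and maximum-size checks are omitted.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical user-space model of an open file. */
struct toy_file {
    int64_t pos;          /* models file->f_pos */
    int64_t size;         /* models what i_size_read() would return */
    pthread_mutex_t lock; /* models file->f_lock (per open file) */
};

enum { TOY_SEEK_SET, TOY_SEEK_CUR, TOY_SEEK_END };

static int64_t toy_llseek(struct toy_file *f, int64_t off, int whence)
{
    int64_t newpos;

    switch (whence) {
    case TOY_SEEK_SET:
        /* No lock: if two seeks race, one writer simply wins. */
        newpos = off;
        break;
    case TOY_SEEK_END:
        /* No lock: only needs a consistent read of the size. */
        newpos = f->size + off;
        break;
    case TOY_SEEK_CUR:
        /* Read-modify-write of the position: take the per-file lock. */
        pthread_mutex_lock(&f->lock);
        newpos = f->pos + off;
        if (newpos >= 0)
            f->pos = newpos;
        pthread_mutex_unlock(&f->lock);
        return newpos < 0 ? -EINVAL : newpos;
    default:
        return -EINVAL;
    }

    if (newpos < 0)
        return -EINVAL;
    f->pos = newpos;
    return newpos;
}
```

    Only the SEEK_CUR branch serializes, and it does so on a lock that
    belongs to the open file rather than the inode, so independent
    processes with their own open files do not contend.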
     

25 Oct, 2011

2 commits

  • * 'nfs-for-3.2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (26 commits)
    Check validity of cl_rpcclient in nfs_server_list_show
    NFS: Get rid of the nfs_rdata_mempool
    NFS: Don't rely on PageError in nfs_readpage_release_partial
    NFS: Get rid of unnecessary calls to ClearPageError() in read code
    NFS: Get rid of nfs_restart_rpc()
    NFS: Get rid of the unused nfs_write_data->flags field
    NFS: Get rid of the unused nfs_read_data->flags field
    NFSv4: Translate NFS4ERR_BADNAME into ENOENT when applied to a lookup
    NFS: Remove the unused "lookupfh()" version of nfs4_proc_lookup()
    NFS: Use the inode->i_version to cache NFSv4 change attribute information
    SUNRPC: Remove unnecessary export of rpc_sockaddr2uaddr
    SUNRPC: Fix rpc_sockaddr2uaddr
    nfs/super.c: local functions should be static
    pnfsblock: fix writeback deadlock
    pnfsblock: fix NULL pointer dereference
    pnfs: recoalesce when ld read pagelist fails
    pnfs: recoalesce when ld write pagelist fails
    pnfs: make _set_lo_fail generic
    pnfsblock: add missing rpc_put_mount and path_put
    SUNRPC/NFS: make rpc pipe upcall generic
    ...

    Linus Torvalds
     
  • * 'for-3.2' of git://linux-nfs.org/~bfields/linux: (103 commits)
    nfs41: implement DESTROY_CLIENTID operation
    nfsd4: typo logical vs bitwise negate for want_mask
    nfsd4: allow NFS4_SHARE_SIGNAL_DELEG_WHEN_RESRC_AVAIL | NFS4_SHARE_PUSH_DELEG_WHEN_UNCONTENDED
    nfsd4: seq->status_flags may be used unitialized
    nfsd41: use SEQ4_STATUS_BACKCHANNEL_FAULT when cb_sequence is invalid
    nfsd4: implement new 4.1 open reclaim types
    nfsd4: remove unneeded CLAIM_DELEGATE_CUR workaround
    nfsd4: warn on open failure after create
    nfsd4: preallocate open stateid in process_open1()
    nfsd4: do idr preallocation with stateid allocation
    nfsd4: preallocate nfs4_file in process_open1()
    nfsd4: clean up open owners on OPEN failure
    nfsd4: simplify process_open1 logic
    nfsd4: make is_open_owner boolean
    nfsd4: centralize renew_client() calls
    nfsd4: typo logical vs bitwise negate
    nfs: fix bug about IPv6 address scope checking
    nfsd4: more robust ignoring of WANT bits in OPEN
    nfsd4: move name-length checks to xdr
    nfsd4: move access/deny validity checks to xdr code
    ...

    Linus Torvalds
     

21 Oct, 2011

1 commit


20 Oct, 2011

4 commits


19 Oct, 2011

7 commits

  • Both LOOKUP and OPEN operations may return NFS4ERR_BADNAME if we send
    an invalid name as a filename argument. As far as the application is
    concerned, it just has to know that the file doesn't exist, and so
    ENOENT would be the appropriate reply. We should only return EINVAL
    if the filename is being used to _create_ a new object on the
    remote filesystem.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
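
    The translation rule described in the entry above might look roughly
    like this in isolation. The function toy_map_nfs4_status and the
    "creating" flag are hypothetical names for this sketch; the numeric
    value 10041 is the NFS4ERR_BADNAME code from the NFSv4 specification,
    and the fallback to -EIO for every other error is purely a placeholder.

```c
#include <errno.h>

#define TOY_NFS4ERR_BADNAME 10041 /* NFS4ERR_BADNAME in the NFSv4 spec */

/*
 * Map a protocol status to a local errno.
 * creating != 0 means the name is being used to create a new object.
 */
static int toy_map_nfs4_status(int status, int creating)
{
    if (status == 0)
        return 0;
    if (status == TOY_NFS4ERR_BADNAME)
        return creating ? -EINVAL : -ENOENT;
    return -EIO; /* placeholder for all other errors in this sketch */
}
```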
     
  • ...and also remove the associated nfs_v4_clientops entry.

    Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • Signed-off-by: Trond Myklebust

    Trond Myklebust
     
  • commit ae50c0b5 "pnfs: client stats" added additional information to
    the output of /proc/self/mountstats. The new functions introduced are
    only used in this file and should be marked static.

    If CONFIG_NFS_V4_1 is not defined, empty stub functions are used. If
    CONFIG_NFS_V4 is not defined these stub functions are not used at all.
    Adding static for the functions results in compile warnings:

    fs/nfs/super.c:743: warning: 'show_sessions' defined but not used
    fs/nfs/super.c:756: warning: 'show_pnfs' defined but not used

    Fix this by adding a #ifdef CONFIG_NFS_V4 guard around the two
    show_ functions.

    Signed-off-by: H Hartley Sweeten
    Cc: Trond Myklebust
    Signed-off-by: Trond Myklebust

    H Hartley Sweeten
     
  • We should check if the sector is already initialized before
    trying to grab the page from page cache. Otherwise when two
    pages of the same block are written back by two threads each
    calling from writepage_locked, it can cause deadlock like below.

    [ 1080.972099] INFO: task kswapd0:25 blocked for more than 120 seconds.
    [ 1080.972377] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    [ 1080.972812] kswapd0 D ffff88000c4926c0 0 25 2 0x00000000
    [ 1080.972816] ffff88000df276b0 0000000000000046 ffff88000df27640 ffffffff81013ba7
    [ 1080.972821] ffff88000c492310 ffff88000df27fd8 ffff88000df27fd8 00000000001d3440
    [ 1080.972824] ffff88000c378000 ffff88000c492310 ffff8800175d3d40 ffff880017fc75a8
    [ 1080.972828] Call Trace:
    [ 1080.972860] [] ? read_tsc+0x9/0x19
    [ 1080.972877] [] ? lock_page+0x2b/0x2b
    [ 1080.972899] [] io_schedule+0x63/0x7e
    [ 1080.972902] [] sleep_on_page+0xe/0x12
    [ 1080.972905] [] __wait_on_bit_lock+0x46/0x8f
    [ 1080.972916] [] ? lock_release_holdtime.part.7+0x6b/0x72
    [ 1080.972919] [] __lock_page+0x66/0x68
    [ 1080.972928] [] ? autoremove_wake_function+0x3d/0x3d
    [ 1080.972932] [] lock_page+0x27/0x2b
    [ 1080.972934] [] find_lock_page+0x34/0x57
    [ 1080.972937] [] find_or_create_page+0x34/0x8a
    [ 1080.972947] [] bl_write_pagelist+0x205/0x6da [blocklayoutdriver]
    [ 1080.972951] [] ? bl_free_lseg+0x38/0x38 [blocklayoutdriver]
    [ 1080.972995] [] ? nfs_write_rpcsetup+0x118/0x123 [nfs]
    [ 1080.973033] [] pnfs_generic_pg_writepages+0x10b/0x1f4 [nfs]
    [ 1080.973089] [] nfs_pageio_doio+0x1a/0x43 [nfs]
    [ 1080.973098] [] nfs_pageio_complete+0x16/0x2d [nfs]
    [ 1080.973108] [] nfs_writepage_locked+0xa0/0xbf [nfs]
    [ 1080.973119] [] nfs_writepage+0x16/0x2b [nfs]
    [ 1080.973122] [] ? clear_page_dirty_for_io+0x87/0x9a
    [ 1080.973133] [] shrink_page_list+0x39b/0x6c8
    [ 1080.973139] [] shrink_inactive_list+0x22c/0x39e
    [ 1080.973144] [] ? lock_release_holdtime.part.7+0x6b/0x72
    [ 1080.973148] [] shrink_zone+0x445/0x588
    [ 1080.973152] [] balance_pgdat+0x2c2/0x56b
    [ 1080.973170] [] ? __bitmap_weight+0x34/0x80
    [ 1080.973175] [] kswapd+0x2be/0x2fa
    [ 1080.973179] [] ? __init_waitqueue_head+0x4b/0x4b
    [ 1080.973183] [] ? balance_pgdat+0x56b/0x56b
    [ 1080.973187] [] kthread+0xa8/0xb0
    [ 1080.973200] [] kernel_thread_helper+0x4/0x10
    [ 1080.973205] [] ? __init_kthread_worker+0x5a/0x5a
    [ 1080.973210] [] ? gs_change+0x13/0x13
    [ 1080.973213] no locks held by kswapd0/25.

    Signed-off-by: Peng Tao
    Signed-off-by: Jim Rees
    Cc: stable@kernel.org [3.0]
    Signed-off-by: Trond Myklebust

    Peng Tao
     
  • bl_add_page_to_bio returns error pointer. bio should be reset to
    NULL in failure cases as the out path always calls bl_submit_bio.

    Signed-off-by: Peng Tao
    Signed-off-by: Jim Rees
    Cc: stable@kernel.org [3.0]
    Signed-off-by: Trond Myklebust

    Peng Tao
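
    The failure mode in the entry above, an error pointer reaching a
    submit call on the common exit path, can be modelled in user space.
    This is a sketch under assumptions: the toy_ helpers imitate the
    kernel's error-pointer idiom, and struct toy_bio, toy_add_page,
    toy_submit and toy_write are hypothetical names, not the block layout
    driver's functions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095

/* User-space imitation of the error-pointer idiom. */
static void *toy_err_ptr(long err) { return (void *)(intptr_t)err; }
static long toy_ptr_err(const void *p) { return (long)(intptr_t)p; }
static int toy_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-TOY_MAX_ERRNO);
}

struct toy_bio { int pages; };

static int toy_submitted; /* counts real submissions */

/* Returns the bio on success or an error pointer on failure. */
static struct toy_bio *toy_add_page(struct toy_bio *bio, int fail)
{
    if (fail)
        return toy_err_ptr(-ENOMEM);
    bio->pages++;
    return bio;
}

/* The exit path always calls this, so it must tolerate NULL. */
static void toy_submit(struct toy_bio *bio)
{
    if (bio)
        toy_submitted++;
}

static long toy_write(struct toy_bio *bio, int fail)
{
    long err = 0;

    bio = toy_add_page(bio, fail);
    if (toy_is_err(bio)) {
        err = toy_ptr_err(bio);
        bio = NULL; /* reset so the exit path does not submit garbage */
    }
    toy_submit(bio);
    return err;
}
```

    Without the reset to NULL, toy_submit would be handed a value that is
    neither NULL nor a valid object.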
     
  • For pnfs pagelist read failure, we need to pg_recoalesce and resend IO to
    mds.

    Signed-off-by: Peng Tao
    Signed-off-by: Jim Rees
    Cc: stable@kernel.org [3.0]
    Signed-off-by: Trond Myklebust

    Peng Tao