27 May, 2014

1 commit

  • If nfsd4_check_resp_size() returns an error then we should really be
    truncating the reply here, otherwise we may leave extra garbage at the
    end of the rpc reply.

    Also add a warning to catch any cases where our reply-size estimates may
    be wrong in the case of a non-idempotent operation.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
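
The truncation described in this commit can be sketched as follows. This is a minimal illustration, not the nfsd code: `encode_op`, `ERR_TOO_BIG`, and the byte layout are hypothetical names and choices made for the example. The idea is to remember where an operation's result starts and cut the reply back to that point when the size check fails.

```python
# Hypothetical status values, for illustration only.
NFS_OK = 0
ERR_TOO_BIG = 1  # placeholder for whatever error the size check returns


def encode_op(reply: bytearray, payload: bytes, max_reply: int) -> int:
    """Append one operation result, truncating back on a size failure."""
    start = len(reply)  # remember where this operation's result begins
    reply += NFS_OK.to_bytes(4, "big")
    reply += payload
    if len(reply) > max_reply:
        # Size check failed: drop everything this op encoded so that no
        # partial result is left at the end of the reply, then record
        # only the error status.
        del reply[start:]
        reply += ERR_TOO_BIG.to_bytes(4, "big")
        return ERR_TOO_BIG
    return NFS_OK
```

Without the `del reply[start:]` step, the oversized payload would stay in the buffer after the error status, which is the "extra garbage" the commit message refers to.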
     

23 May, 2014

16 commits


22 May, 2014

2 commits

  • We're not cleaning up everything we need to on error. In particular,
    we're not removing our lease. Among other problems this can cause the
    struct nfs4_file used as fl_owner to be referenced after it has been
    destroyed.

    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • We're clearing the SUID/SGID bits on write by hand in nfsd_vfs_write,
    even though the subsequent vfs_writev() call will end up doing this for
    us (through file system write methods eventually calling
    file_remove_suid(), e.g., from __generic_file_aio_write).

    So, remove the redundant nfsd code.

    The only change in behavior is when the write is by root, in which case
    we previously cleared SUID/SGID, but will now leave it alone. The new
    behavior is the behavior of every filesystem we've checked.

    It seems better to be consistent with local filesystem behavior. And
    the security advantage seems limited as root could always restore these
    bits by hand if it wanted.

    With (root, local ext4), SUID/SGID are not cleared after writing data:
    File: ‘test’
    Size: 0 Blocks: 0 IO Block: 4096 regular empty file
    Device: 803h/2051d Inode: 1200137 Links: 1
    Access: (4777/-rwsrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
    Context: unconfined_u:object_r:admin_home_t:s0
    Access: 2014-04-18 21:36:31.016029014 +0800
    Modify: 2014-04-18 21:36:31.016029014 +0800
    Change: 2014-04-18 21:36:31.026030285 +0800
    Birth: -
    File: ‘test’
    Size: 5 Blocks: 8 IO Block: 4096 regular file
    Device: 803h/2051d Inode: 1200137 Links: 1
    Access: (4777/-rwsrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
    Context: unconfined_u:object_r:admin_home_t:s0
    Access: 2014-04-18 21:36:31.016029014 +0800
    Modify: 2014-04-18 21:36:31.040032065 +0800
    Change: 2014-04-18 21:36:31.040032065 +0800
    Birth: -

    With no_root_squash, (root, remote ext4), SUID/SGID are cleared:
    File: ‘test’
    Size: 0 Blocks: 0 IO Block: 262144 regular empty file
    Device: 24h/36d Inode: 786439 Links: 1
    Access: (4777/-rwsrwxrwx) Uid: ( 1000/ test) Gid: ( 1000/ test)
    Context: system_u:object_r:nfs_t:s0
    Access: 2014-04-18 21:45:32.155805097 +0800
    Modify: 2014-04-18 21:45:32.155805097 +0800
    Change: 2014-04-18 21:45:32.168806749 +0800
    Birth: -
    File: ‘test’
    Size: 5 Blocks: 8 IO Block: 262144 regular file
    Device: 24h/36d Inode: 786439 Links: 1
    Access: (0777/-rwxrwxrwx) Uid: ( 1000/ test) Gid: ( 1000/ test)
    Context: system_u:object_r:nfs_t:s0
    Access: 2014-04-18 21:45:32.155805097 +0800
    Modify: 2014-04-18 21:45:32.184808783 +0800
    Change: 2014-04-18 21:45:32.184808783 +0800
    Birth: -

    Signed-off-by: Kinglong Mee
    Signed-off-by: J. Bruce Fields

    Kinglong Mee
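
The behavior change in this commit can be modeled with a small function. This is a sketch under stated assumptions, not kernel code: `mode_after_write` and `has_fsetid` are hypothetical names, and the rule used here (SUID always dropped, SGID dropped only when group-execute is set, nothing dropped for a privileged writer) is assumed to match the generic VFS check that the commit relies on.

```python
import stat


def mode_after_write(mode: int, has_fsetid: bool) -> int:
    """Model of a regular file's mode after a successful write."""
    if not stat.S_ISREG(mode) or has_fsetid:
        return mode  # privileged writers (e.g. root) keep SUID/SGID
    kill = 0
    if mode & stat.S_ISUID:
        kill |= stat.S_ISUID
    # Assumption: SGID without group-execute is left alone.
    if (mode & stat.S_ISGID) and (mode & stat.S_IXGRP):
        kill |= stat.S_ISGID
    return mode & ~kill
```

With this model, a mode of 4777 stays 4777 when root writes (the local ext4 case above) and becomes 0777 for an unprivileged writer. The old nfsd code produced 0777 even for root, which is the inconsistency the commit removes.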
     

21 May, 2014

2 commits

  • The current code assumes a one-to-one lockowner<->lock stateid
    correspondence.

    Cc: stable@vger.kernel.org
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
     
  • The nfsv4 state code has always assumed a one-to-one correspondence
    between lock stateids and lockowners even if it appears not to in some
    places.

    We may actually change that, but for now when FREE_STATEID releases a
    lock stateid it also needs to release the parent lockowner.

    Symptoms were a subsequent LOCK crashing in find_lockowner_str when it
    calls same_lockowner_ino on a lockowner that unexpectedly has an empty
    so_stateids list.

    Cc: stable@vger.kernel.org
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields
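
The invariant behind this fix can be sketched as below. The classes and method names (`StateTable`, `free_stateid`, `find_lockowner`) are hypothetical stand-ins for the nfsd state structures; only `so_stateids` is a name taken from the commit message. The point is that a lookup which reads the first stateid of a lockowner is only safe if freeing the last stateid also removes the lockowner.

```python
class Lockowner:
    def __init__(self, name):
        self.name = name
        self.so_stateids = []  # lock stateids held by this owner


class StateTable:
    def __init__(self):
        self.lockowners = {}

    def add_lock_stateid(self, owner_name, stateid):
        lo = self.lockowners.setdefault(owner_name, Lockowner(owner_name))
        lo.so_stateids.append(stateid)
        return lo

    def free_stateid(self, owner_name, stateid):
        lo = self.lockowners[owner_name]
        lo.so_stateids.remove(stateid)
        if not lo.so_stateids:
            # Release the parent lockowner along with its last stateid,
            # so no owner with an empty so_stateids list stays findable.
            del self.lockowners[owner_name]

    def find_lockowner(self, owner_name):
        lo = self.lockowners.get(owner_name)
        if lo is not None:
            # The lookup relies on at least one stateid being present;
            # this would raise IndexError on an empty list.
            _ = lo.so_stateids[0]
        return lo
```

If `free_stateid` skipped the `del`, a later `find_lockowner` call would index into an empty list, which mirrors the crash described in the commit message.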
     

16 May, 2014

1 commit


09 May, 2014

7 commits

  • Signed-off-by: Kinglong Mee
    Signed-off-by: J. Bruce Fields

    Kinglong Mee
     
  • Signed-off-by: Kinglong Mee
    Signed-off-by: J. Bruce Fields

    Kinglong Mee
     
  • Signed-off-by: Kinglong Mee
    Signed-off-by: J. Bruce Fields

    Kinglong Mee
     
  • J. Bruce Fields
     
  • Use fh_fsid when referring to the fsid part of the filehandle. The
    variable length auth field envisioned in nfsfh wasn't ever implemented.
    Also clean up some loose ends around this and document the file handle
    format better.

    Btw, why do we even export nfsfh.h to userspace? The file handle very
    much is kernel private, and nothing in nfs-utils includes the header
    either.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: J. Bruce Fields

    Christoph Hellwig
     
  • Commit 4ac7249ea5a0ceef9f8269f63f33cc873c3fac61 removed all uses of
    EXPORT_SYMBOL, so linux/export.h is no longer needed; remove the include.

    Signed-off-by: Kinglong Mee
    Signed-off-by: J. Bruce Fields

    Kinglong Mee
     
  • After setting an ACL on a directory, I hit two problems caused
    by the cached zero-length default POSIX ACL.

    This patch makes sure nfsd4_set_nfs4_acl calls ->set_acl
    with a NULL ACL structure if there are no entries.

    Thanks to Christoph Hellwig for the advice.

    First problem:
    ............ hang ...........

    Second problem:
    [ 1610.167668] ------------[ cut here ]------------
    [ 1610.168320] kernel BUG at /root/nfs/linux/fs/nfsd/nfs4acl.c:239!
    [ 1610.168320] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
    [ 1610.168320] Modules linked in: nfsv4(OE) nfs(OE) nfsd(OE)
    rpcsec_gss_krb5 fscache ip6t_rpfilter ip6t_REJECT cfg80211 xt_conntrack
    rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables
    ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6
    ip6table_mangle ip6table_security ip6table_raw ip6table_filter
    ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4
    nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw
    auth_rpcgss nfs_acl snd_intel8x0 ppdev lockd snd_ac97_codec ac97_bus
    snd_pcm snd_timer e1000 pcspkr parport_pc snd parport serio_raw joydev
    i2c_piix4 sunrpc(OE) microcode soundcore i2c_core ata_generic pata_acpi
    [last unloaded: nfsd]
    [ 1610.168320] CPU: 0 PID: 27397 Comm: nfsd Tainted: G OE
    3.15.0-rc1+ #15
    [ 1610.168320] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
    VirtualBox 12/01/2006
    [ 1610.168320] task: ffff88005ab653d0 ti: ffff88005a944000 task.ti:
    ffff88005a944000
    [ 1610.168320] RIP: 0010:[] []
    _posix_to_nfsv4_one+0x3cd/0x3d0 [nfsd]
    [ 1610.168320] RSP: 0018:ffff88005a945b00 EFLAGS: 00010293
    [ 1610.168320] RAX: 0000000000000001 RBX: ffff88006700bac0 RCX:
    0000000000000000
    [ 1610.168320] RDX: 0000000000000000 RSI: ffff880067c83f00 RDI:
    ffff880068233300
    [ 1610.168320] RBP: ffff88005a945b48 R08: ffffffff81c64830 R09:
    0000000000000000
    [ 1610.168320] R10: ffff88004ea85be0 R11: 000000000000f475 R12:
    ffff880068233300
    [ 1610.168320] R13: 0000000000000003 R14: 0000000000000002 R15:
    ffff880068233300
    [ 1610.168320] FS: 0000000000000000(0000) GS:ffff880077800000(0000)
    knlGS:0000000000000000
    [ 1610.168320] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    [ 1610.168320] CR2: 00007f5bcbd3b0b9 CR3: 0000000001c0f000 CR4:
    00000000000006f0
    [ 1610.168320] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
    0000000000000000
    [ 1610.168320] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
    0000000000000400
    [ 1610.168320] Stack:
    [ 1610.168320] ffffffff00000000 0000000b67c83500 000000076700bac0
    0000000000000000
    [ 1610.168320] ffff88006700bac0 ffff880068233300 ffff88005a945c08
    0000000000000002
    [ 1610.168320] 0000000000000000 ffff88005a945b88 ffffffffa034e2d5
    000000065a945b68
    [ 1610.168320] Call Trace:
    [ 1610.168320] [] nfsd4_get_nfs4_acl+0x95/0x150 [nfsd]
    [ 1610.168320] [] nfsd4_encode_fattr+0x646/0x1e70 [nfsd]
    [ 1610.168320] [] ? kmemleak_alloc+0x4e/0xb0
    [ 1610.168320] [] ?
    nfsd_setuser_and_check_port+0x52/0x80 [nfsd]
    [ 1610.168320] [] ? selinux_cred_prepare+0x1b/0x30
    [ 1610.168320] [] nfsd4_encode_getattr+0x5a/0x60 [nfsd]
    [ 1610.168320] [] nfsd4_encode_operation+0x67/0x110
    [nfsd]
    [ 1610.168320] [] nfsd4_proc_compound+0x21d/0x810 [nfsd]
    [ 1610.168320] [] nfsd_dispatch+0xbb/0x200 [nfsd]
    [ 1610.168320] [] svc_process_common+0x46d/0x6d0 [sunrpc]
    [ 1610.168320] [] svc_process+0x103/0x170 [sunrpc]
    [ 1610.168320] [] nfsd+0xbf/0x130 [nfsd]
    [ 1610.168320] [] ? nfsd_destroy+0x80/0x80 [nfsd]
    [ 1610.168320] [] kthread+0xd2/0xf0
    [ 1610.168320] [] ? insert_kthread_work+0x40/0x40
    [ 1610.168320] [] ret_from_fork+0x7c/0xb0
    [ 1610.168320] [] ? insert_kthread_work+0x40/0x40
    [ 1610.168320] Code: 78 02 e9 e7 fc ff ff 31 c0 31 d2 31 c9 66 89 45 ce
    41 8b 04 24 66 89 55 d0 66 89 4d d2 48 8d 04 80 49 8d 5c 84 04 e9 37 fd
    ff ff 0b 90 0f 1f 44 00 00 55 8b 56 08 c7 07 00 00 00 00 8b 46 0c
    [ 1610.168320] RIP [] _posix_to_nfsv4_one+0x3cd/0x3d0
    [nfsd]
    [ 1610.168320] RSP
    [ 1610.257313] ---[ end trace 838254e3e352285b ]---

    Signed-off-by: Kinglong Mee
    Cc: stable@vger.kernel.org
    Signed-off-by: J. Bruce Fields

    Kinglong Mee
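
The rule this patch enforces can be sketched in a few lines. The names here (`acl_for_set`, `Inode`, the entry tuple) are hypothetical and only model the idea: an ACL with no entries is passed to the setter as "no ACL" rather than as an empty object that could later be cached.

```python
from typing import Optional, Sequence, Tuple

AclEntry = Tuple[str, int]  # hypothetical (tag, permission bits) pair


def acl_for_set(entries: Sequence[AclEntry]) -> Optional[list]:
    """Return None for an empty ACL so the setter stores no ACL at all."""
    return list(entries) if entries else None


class Inode:
    """Toy inode that caches whatever default ACL it is given."""

    def __init__(self):
        self.default_acl = None

    def set_acl(self, acl):
        self.default_acl = acl
```

Calling `inode.set_acl(acl_for_set([]))` leaves `default_acl` as `None`, so later readers never see a cached zero-length ACL.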
     

07 May, 2014

10 commits


18 Apr, 2014

1 commit

  • Since we're still limiting attributes to a page, the result here is that
    a large getattr result will return NFS4ERR_REP_TOO_BIG/TOO_BIG_TO_CACHE
    instead of NFS4ERR_RESOURCE.

    Both error returns are wrong, and the real bug here is the arbitrary
    limit on getattr results, fixed by as-yet out-of-tree patches. But at a
    minimum we can make life easier for clients by sticking to one broken
    behavior in released kernels instead of two....

    Trond says:

    one immediate consequence of this patch will be that NFSv4.1
    clients will now report EIO instead of EREMOTEIO if they hit the
    problem. That may make debugging a little less obvious.

    Another consequence will be that if we ever do try to add client
    side handling of NFS4ERR_REP_TOO_BIG, then we now have to deal
    with the “handle existing buggy server” syndrome.

    Reported-by: Trond Myklebust
    Signed-off-by: J. Bruce Fields

    J. Bruce Fields