01 Nov, 2011

1 commit


04 Aug, 2011

1 commit


27 May, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (25 commits)
    cifs: remove unnecessary dentry_unhash on rmdir/rename_dir
    ocfs2: remove unnecessary dentry_unhash on rmdir/rename_dir
    exofs: remove unnecessary dentry_unhash on rmdir/rename_dir
    nfs: remove unnecessary dentry_unhash on rmdir/rename_dir
    ext2: remove unnecessary dentry_unhash on rmdir/rename_dir
    ext3: remove unnecessary dentry_unhash on rmdir/rename_dir
    ext4: remove unnecessary dentry_unhash on rmdir/rename_dir
    btrfs: remove unnecessary dentry_unhash in rmdir/rename_dir
    ceph: remove unnecessary dentry_unhash calls
    vfs: clean up vfs_rename_other
    vfs: clean up vfs_rename_dir
    vfs: clean up vfs_rmdir
    vfs: fix vfs_rename_dir for FS_RENAME_DOES_D_MOVE filesystems
    libfs: drop unneeded dentry_unhash
    vfs: update dentry_unhash() comment
    vfs: push dentry_unhash on rename_dir into file systems
    vfs: push dentry_unhash on rmdir into file systems
    vfs: remove dget() from dentry_unhash()
    vfs: dentry_unhash immediately prior to rmdir
    vfs: Block mmapped writes while the fs is frozen
    ...

    Linus Torvalds
     

26 May, 2011

2 commits

  • Commit 990d6c2d7aee921e3bce22b2d6a750fd552262be ("vfs: Add name to file
    handle conversion support") changed EXPORTFS to be a bool.
    This was needed for earlier revisions of the original patch, but the actual
    commit put the code needing it into its own file that only gets compiled
    when FHANDLE is selected which in turn selects EXPORTFS.
    So EXPORTFS can be safely compiled as a module when not selecting FHANDLE.

    Signed-off-by: Jonas Gorski
    Acked-by: Aneesh Kumar K.V
    Signed-off-by: Al Viro

    Jonas Gorski
     
  • Choosing TMPFS_XATTR default N was switching off TMPFS_POSIX_ACL,
    even if it had been Y in oldconfig; and Linus reports that PulseAudio
    goes subtly wrong unless it can use ACLs on /dev/shm.

    Make TMPFS_POSIX_ACL select TMPFS_XATTR (and depend upon TMPFS),
    and move the TMPFS_POSIX_ACL entry before the TMPFS_XATTR entry,
    to avoid asking unnecessary questions then ignoring their answers.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

25 May, 2011

1 commit

  • Implement generic xattrs for tmpfs filesystems. The Fedora project, while
    trying to replace suid apps with file capabilities, realized that tmpfs,
    which is used on the build systems, does not support file capabilities and
    thus cannot be used to build packages which use file capabilities. Xattrs
    are also needed for overlayfs.

    The xattr interface is a bit odd. If a filesystem does not implement any
    {get,set,list}xattr functions the VFS will call into some random LSM hooks
    and the running LSM can then implement some method for handling xattrs.
    SELinux for example provides a method to support security.selinux but no
    other security.* xattrs.

    As it stands today when one enables CONFIG_TMPFS_POSIX_ACL tmpfs will have
    xattr handler routines specifically to handle acls. Because of this tmpfs
    would lose the VFS/LSM helpers to support the running LSM. To make up
    for that tmpfs had stub functions that did nothing but call into the LSM
    hooks which implement the helpers.

    This new patch does not use the LSM fallback functions and instead just
    implements a native get/set/list xattr feature for the full security.* and
    trusted.* namespace like a normal filesystem. This means that tmpfs can
    now support both security.selinux and security.capability, which was not
    previously possible.

    The basic implementation is that I attach a:

    struct shmem_xattr {
            struct list_head list; /* anchored by shmem_inode_info->xattr_list */
            char *name;
            size_t size;
            char value[0];
    };

    Into the struct shmem_inode_info for each xattr that is set. This
    implementation could easily support the user.* namespace as well, except
    some care needs to be taken to prevent large amounts of unswappable memory
    being allocated for unprivileged users.
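
    As a rough user-space illustration of the record described above (not the
    kernel code from the patch), the sketch below keeps one heap-allocated
    node per attribute, with the value stored inline after the header. The
    names xattr_node, xattr_set and xattr_get are hypothetical, and a plain
    singly linked list stands in for the kernel's list_head.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* User-space stand-in for the per-inode xattr record (hypothetical names). */
struct xattr_node {
    struct xattr_node *next;  /* the patch anchors a list_head instead */
    char *name;
    size_t size;
    char value[];             /* flexible array member holding the value */
};

/* Set or replace an attribute; returns 0 on success, -1 if allocation fails. */
int xattr_set(struct xattr_node **head, const char *name,
              const void *value, size_t size)
{
    assert(head != NULL && name != NULL);
    assert(value != NULL || size == 0);

    struct xattr_node *node = malloc(sizeof(*node) + size);
    if (node == NULL)
        return -1;
    node->name = malloc(strlen(name) + 1);
    if (node->name == NULL) {
        free(node);
        return -1;
    }
    strcpy(node->name, name);
    node->size = size;
    if (size > 0)
        memcpy(node->value, value, size);

    /* Unlink any existing entry with the same name before inserting. */
    for (struct xattr_node **pp = head; *pp != NULL; pp = &(*pp)->next) {
        if (strcmp((*pp)->name, name) == 0) {
            struct xattr_node *old = *pp;
            *pp = old->next;
            free(old->name);
            free(old);
            break;
        }
    }
    node->next = *head;
    *head = node;
    return 0;
}

/* Look up an attribute by name; returns NULL if it is not set. */
const struct xattr_node *xattr_get(const struct xattr_node *head,
                                   const char *name)
{
    assert(name != NULL);
    for (; head != NULL; head = head->next)
        if (strcmp(head->name, name) == 0)
            return head;
    return NULL;
}
```

    Because every node is ordinary allocated memory that cannot be swapped,
    an unbounded number of such nodes is exactly the concern raised above for
    the user.* namespace.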

    [mszeredi@suse.cz: new config option, support trusted.*, support symlinks]
    Signed-off-by: Eric Paris
    Signed-off-by: Miklos Szeredi
    Acked-by: Serge Hallyn
    Tested-by: Serge Hallyn
    Cc: Kyle McMartin
    Acked-by: Hugh Dickins
    Tested-by: Jordi Pujol
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Paris
     

17 Mar, 2011

2 commits


15 Mar, 2011

1 commit


21 Jan, 2011

1 commit

  • The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option
    is used to configure any non-standard kernel with a much larger scope than
    only small devices.

    This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes
    references to the option throughout the kernel. A new CONFIG_EMBEDDED
    option is added that automatically selects CONFIG_EXPERT when enabled and
    can be used in the future to isolate options that should only be
    considered for embedded systems (RISC architectures, SLOB, etc).

    Calling the option "EXPERT" more accurately represents its intention: only
    expert users who understand the impact of the configuration changes they
    are making should enable it.

    Reviewed-by: Ingo Molnar
    Acked-by: David Woodhouse
    Signed-off-by: David Rientjes
    Cc: Greg KH
    Cc: "David S. Miller"
    Cc: Jens Axboe
    Cc: Arnd Bergmann
    Cc: Robin Holt
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     

17 Jan, 2011

1 commit

  • - Fix a kconfig unmet dependency warning.
    - Remove the comment that identifies which filesystems use POSIX ACL
    utility routines.
    - Move the FS_POSIX_ACL symbol outside of the BLOCK symbol if/endif block
    because its functions do not depend on BLOCK and some of the filesystems
    that use it do not depend on BLOCK.

    warning: (GENERIC_ACL && JFFS2_FS_POSIX_ACL && NFSD_V4 && NFS_ACL_SUPPORT && 9P_FS_POSIX_ACL) selects FS_POSIX_ACL which has unmet direct dependencies (BLOCK)

    Signed-off-by: Randy Dunlap
    Cc: Al Viro
    Signed-off-by: Al Viro

    Randy Dunlap
     

29 Dec, 2010

1 commit

  • Some platforms have a small amount of non-volatile storage that
    can be used to store information useful to diagnose the cause of
    a system crash. This is the generic part of a file system interface
    that presents information from the crash as a series of files in
    /dev/pstore. Once the information has been seen, the underlying
    storage is freed by deleting the files.

    Signed-off-by: Tony Luck

    Tony Luck
     

29 Oct, 2010

1 commit


28 Oct, 2010

2 commits


27 Oct, 2010

1 commit

  • Move the EXPORTFS kconfig symbol out of the NETWORK_FILESYSTEMS block
    since it provides a library function that can be (and is) used by other
    (non-network) filesystems.

    This also eliminates a kconfig dependency warning:

    warning: (XFS_FS && BLOCK || NFSD && NETWORK_FILESYSTEMS && INET && FILE_LOCKING && BKL) selects EXPORTFS which has unmet direct dependencies (NETWORK_FILESYSTEMS)

    Signed-off-by: Randy Dunlap
    Cc: Dave Chinner
    Cc: Al Viro
    Cc: Alex Elder
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

21 Oct, 2010

1 commit

  • With all the patches we have queued in the BKL removal tree, only a
    few dozen modules are left that actually rely on the BKL, and even
    among those there is plenty of low-hanging fruit. We need to decide
    what to do about them; this patch illustrates one of the options:

    Every user of the BKL is marked as 'depends on BKL' in Kconfig,
    and the CONFIG_BKL becomes a user-visible option. If it gets
    disabled, no BKL using module can be built any more and the BKL
    code itself is compiled out.

    The one exception is file locking, which is practically always
    enabled and does a 'select BKL' instead. This effectively forces
    CONFIG_BKL to be enabled until we have solved the fs/lockd
    mess and can apply the patch that removes the BKL from fs/locks.c.

    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

06 Oct, 2010

2 commits

  • smbfs has been scheduled for removal in 2.6.27, so
    maybe we can now move it to drivers/staging on the
    way out.

    smbfs still uses the big kernel lock and nobody
    is going to fix that, so we should be getting
    rid of it soon.

    This removes the 32 bit compat mount and ioctl
    handling code, which is implemented in common fs
    code, and moves all smbfs related files into
    drivers/staging/smbfs.

    Signed-off-by: Arnd Bergmann
    Acked-by: Jeff Layton
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     
  • Nobody appears to be interested in fixing autofs3 bugs
    any more and it uses the BKL, which is going away.

    Move this to staging for retirement. Unless someone
    complains before 2.6.38, we can remove it for good.

    The include/linux/auto_fs.h header file is still used
    by autofs4, so it remains in place.

    Signed-off-by: Arnd Bergmann
    Cc: Ian Kent
    Cc: autofs@linux.kernel.org
    Cc: "H. Peter Anvin"
    Acked-by: H. Peter Anvin
    Signed-off-by: Greg Kroah-Hartman

    Arnd Bergmann
     

20 Jul, 2010

1 commit


20 Mar, 2010

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (205 commits)
    ceph: update for write_inode API change
    ceph: reset osd after relevant messages timed out
    ceph: fix flush_dirty_caps race with caps migration
    ceph: include migrating caps in issued set
    ceph: fix osdmap decoding when pools include (removed) snaps
    ceph: return EBADF if waiting for caps on closed file
    ceph: set osd request message front length correctly
    ceph: reset front len on return to msgpool; BUG on mismatched front iov
    ceph: fix snaptrace decoding on cap migration between mds
    ceph: use single osd op reply msg
    ceph: reset bits on connection close
    ceph: remove bogus mds forward warning
    ceph: remove fragile __map_osds optimization
    ceph: fix connection fault STANDBY check
    ceph: invalidate_authorizer without con->mutex held
    ceph: don't clobber write return value when using O_SYNC
    ceph: fix client_request_forward decoding
    ceph: drop messages on unregistered mds sessions; cleanup
    ceph: fix comments, locking in destroy_inode
    ceph: move dereference after NULL test
    ...

    Fix trivial conflicts in Documentation/ioctl/ioctl-number.txt

    Linus Torvalds
     

21 Nov, 2009

1 commit


30 Oct, 2009

2 commits


27 Oct, 2009

2 commits

  • Signed-off-by: Kumar Gala
    Signed-off-by: Benjamin Herrenschmidt

    Kumar Gala
     
  • The hugetlb dependencies presently depend on SUPERH && MMU while the
    hugetlb page size definitions depend on CPU_SH4 or CPU_SH5. This
    unfortunately allows SH-3 + MMU configurations to enable hugetlbfs
    without a corresponding HPAGE_SHIFT definition, resulting in the build
    blowing up.

    As SH-3 doesn't support variable page sizes, we tighten up the
    dependencies a bit to prevent hugetlbfs from being enabled. These days
    we also have a shiny new SYS_SUPPORTS_HUGETLBFS, so switch to using
    that rather than adding to the list of corner cases in fs/Kconfig.

    Reported-by: Kristoffer Ericson
    Signed-off-by: Paul Mundt

    Paul Mundt
     

07 Oct, 2009

1 commit


22 Sep, 2009

1 commit

  • CONFIG_SHMEM off gives you (ramfs masquerading as) tmpfs, even when
    CONFIG_TMPFS is off: that's a little anomalous, and I'd intended to make
    more sense of it by removing CONFIG_TMPFS altogether, always enabling its
    code when CONFIG_SHMEM; but so many defconfigs have CONFIG_SHMEM on
    CONFIG_TMPFS off that we'd better leave that as is.

    But there is no point in asking for CONFIG_TMPFS if CONFIG_SHMEM is off:
    make TMPFS depend on SHMEM, which also prevents TMPFS_POSIX_ACL
    shmem_acl.o being pointlessly built into the kernel when SHMEM is off.

    And a selfish change, to prevent the world from being rebuilt when I
    switch between CONFIG_SHMEM on and off: the only CONFIG_SHMEM in the
    header files is mm.h shmem_lock() - give that a shmem.c stub instead.

    Signed-off-by: Hugh Dickins
    Acked-by: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

14 Sep, 2009

1 commit

  • Some people asked me questions like the following:

    On Wed, 15 Jul 2009 13:11:21 +0200, Leon Woestenberg wrote:
    > just wondering, any reasons why NILFS2 is one of the miscellaneous
    > filesystems and, for example, btrfs, is not in Kconfig?

    Actually, nilfs is NOT a filesystem that came from other operating
    systems, but a filesystem created purely for Linux. Nor is it a flash
    filesystem; it is one for generic block devices.

    So, this moves nilfs outside the misc category as I responded in LKML
    "Re: Why does NILFS2 hide under Miscellaneous filesystems?"
    (Message-Id: ).

    Signed-off-by: Ryusuke Konishi

    Ryusuke Konishi
     

14 Jul, 2009

1 commit

  • fs/Kconfig file was split into individual fs/*/Kconfig files before
    nilfs was merged. I've found the current config entry of nilfs is
    tainting the work. Sorry, I didn't notice. This fixes the violation.

    Signed-off-by: Ryusuke Konishi
    Cc: Alexey Dobriyan

    Ryusuke Konishi
     

23 Jun, 2009

1 commit

  • * 'for-2.6.31' of git://fieldses.org/git/linux-nfsd: (60 commits)
    SUNRPC: Fix the TCP server's send buffer accounting
    nfsd41: Backchannel: minorversion support for the back channel
    nfsd41: Backchannel: cleanup nfs4.0 callback encode routines
    nfsd41: Remove ip address collision detection case
    nfsd: optimise the starting of zero threads when none are running.
    nfsd: don't take nfsd_mutex twice when setting number of threads.
    nfsd41: sanity check client drc maxreqs
    nfsd41: move channel attributes from nfsd4_session to a nfsd4_channel_attr struct
    NFS: kill off complicated macro 'PROC'
    sunrpc: potential memory leak in function rdma_read_xdr
    nfsd: minor nfsd_vfs_write cleanup
    nfsd: Pull write-gathering code out of nfsd_vfs_write
    nfsd: track last inode only in use_wgather case
    sunrpc: align cache_clean work's timer
    nfsd: Use write gathering only with NFSv2
    NFSv4: kill off complicated macro 'PROC'
    NFSv4: do exact check about attribute specified
    knfsd: remove unreported filehandle stats counters
    knfsd: fix reply cache memory corruption
    knfsd: reply cache cleanups
    ...

    Linus Torvalds
     

17 Jun, 2009

2 commits


09 Jun, 2009

1 commit

  • CUSE enables implementing character devices in userspace. With recent
    additions of ioctl and poll support, FUSE already has most of what's
    necessary to implement character devices. All CUSE has to do is
    bond all those components - FUSE, chardev and the driver model -
    together nicely.

    When a client opens /dev/cuse, the kernel starts a conversation with
    CUSE_INIT. The client tells CUSE which device it wants to create. As
    the previous patch made fuse_file usable without an associated
    fuse_inode, CUSE doesn't create a super block or inodes. It attaches
    fuse_file to cdev file->private_data during open and sets ff->fi to
    NULL. The rest of the operation is almost identical to the FUSE direct
    IO case.

    Each CUSE device has a corresponding directory /sys/class/cuse/DEVNAME
    (which is symlink to /sys/devices/virtual/class/DEVNAME if
    SYSFS_DEPRECATED is turned off) which hosts "waiting" and "abort"
    among other things. Those two files have the same meaning as the FUSE
    control files.

    The only notable feature lacking compared to an in-kernel implementation
    is mmap support.

    Signed-off-by: Tejun Heo
    Signed-off-by: Miklos Szeredi

    Tejun Heo
     

14 May, 2009

1 commit

  • lockd/svclock.c is missing a header file .

    is missing a definition of locks_release_private()
    for the config case of FILE_LOCKING=n, causing a build error:

    fs/lockd/svclock.c:330: error: implicit declaration of function 'locks_release_private'

    lockd without FILE_LOCKING doesn't make sense, so make LOCKD and LOCKD_V4
    depend on FILE_LOCKING, and make NFS depend on FILE_LOCKING.

    Signed-off-by: Randy Dunlap
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: J. Bruce Fields

    Randy Dunlap
     

07 Apr, 2009

1 commit


04 Apr, 2009

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-fscache: (41 commits)
    NFS: Add mount options to enable local caching on NFS
    NFS: Display local caching state
    NFS: Store pages from an NFS inode into a local cache
    NFS: Read pages from FS-Cache into an NFS inode
    NFS: nfs_readpage_async() needs to be accessible as a fallback for local caching
    NFS: Add read context retention for FS-Cache to call back with
    NFS: FS-Cache page management
    NFS: Add some new I/O counters for FS-Cache doing things for NFS
    NFS: Invalidate FsCache page flags when cache removed
    NFS: Use local disk inode cache
    NFS: Define and create inode-level cache objects
    NFS: Define and create superblock-level objects
    NFS: Define and create server-level objects
    NFS: Register NFS for caching and retrieve the top-level index
    NFS: Permit local filesystem caching to be enabled for NFS
    NFS: Add FS-Cache option bit and debug bit
    NFS: Add comment banners to some NFS functions
    FS-Cache: Make kAFS use FS-Cache
    CacheFiles: A cache that backs onto a mounted filesystem
    CacheFiles: Export things for CacheFiles
    ...

    Linus Torvalds
     

03 Apr, 2009

2 commits

  • Add an FS-Cache cache-backend that permits a mounted filesystem to be used as a
    backing store for the cache.

    CacheFiles uses a userspace daemon to do some of the cache management - such as
    reaping stale nodes and culling. This is called cachefilesd and lives in
    /sbin. The source for the daemon can be downloaded from:

    http://people.redhat.com/~dhowells/cachefs/cachefilesd.c

    And an example configuration from:

    http://people.redhat.com/~dhowells/cachefs/cachefilesd.conf

    The filesystem and data integrity of the cache are only as good as those of the
    filesystem providing the backing services. Note that CacheFiles does not
    attempt to journal anything since the journalling interfaces of the various
    filesystems are very specific in nature.

    CacheFiles creates a misc character device - "/dev/cachefiles" - that is used
    to communicate with the daemon. Only one thing may have this open at once,
    and whilst it is open, a cache is at least partially in existence. The daemon
    opens this and sends commands down it to control the cache.

    CacheFiles is currently limited to a single cache.

    CacheFiles attempts to maintain at least a certain percentage of free space on
    the filesystem, shrinking the cache by culling the objects it contains to make
    space if necessary - see the "Cache Culling" section. This means it can be
    placed on the same medium as a live set of data, and will expand to make use of
    spare space and automatically contract when the set of data requires more
    space.

    ============
    REQUIREMENTS
    ============

    The use of CacheFiles and its daemon requires the following features to be
    available in the system and in the cache filesystem:

    - dnotify.

    - extended attributes (xattrs).

    - openat() and friends.

    - bmap() support on files in the filesystem (FIBMAP ioctl).

    - The use of bmap() to detect a partial page at the end of the file.

    It is strongly recommended that the "dir_index" option is enabled on Ext3
    filesystems being used as a cache.

    =============
    CONFIGURATION
    =============

    The cache is configured by a script in /etc/cachefilesd.conf. These commands
    set up the cache ready for use. The following script commands are available:

    (*) brun %
    (*) bcull %
    (*) bstop %
    (*) frun %
    (*) fcull %
    (*) fstop %

    Configure the culling limits. Optional. See the section on culling.
    The defaults are 7% (run), 5% (cull) and 1% (stop) respectively.

    The commands beginning with a 'b' are file space (block) limits, those
    beginning with an 'f' are file count limits.

    (*) dir

    Specify the directory containing the root of the cache. Mandatory.

    (*) tag

    Specify a tag to FS-Cache to use in distinguishing multiple caches.
    Optional. The default is "CacheFiles".

    (*) debug

    Specify a numeric bitmask to control debugging in the kernel module.
    Optional. The default is zero (all off). The following values can be
    OR'd into the mask to collect various information:

    1       Turn on trace of function entry (_enter() macros)
    2       Turn on trace of function exit (_leave() macros)
    4       Turn on trace of internal debug points (_debug())

    This mask can also be set through sysfs, eg:

    echo 5 >/sys/module/cachefiles/parameters/debug
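
    As a small illustration of how the mask values combine, the sketch below
    ORs bits together and tests them. The bit values come from the list
    above; the enum and function names are hypothetical and do not appear in
    the kernel module.

```c
#include <assert.h>

/* Bit values from the list above; the identifiers are illustrative only. */
enum cachefiles_debug_bit {
    DEBUG_ENTER = 1,  /* trace function entry (_enter() macros) */
    DEBUG_LEAVE = 2,  /* trace function exit (_leave() macros) */
    DEBUG_POINT = 4   /* trace internal debug points (_debug()) */
};

/* Returns non-zero when every bit in `bits` is set in `mask`. */
int debug_enabled(unsigned int mask, unsigned int bits)
{
    assert(bits != 0);
    return (mask & bits) == bits;
}

/* The mask written by the "echo 5" example: entry tracing plus debug points. */
unsigned int example_mask(void)
{
    return DEBUG_ENTER | DEBUG_POINT;
}
```

    With the value 5, entry tracing and internal debug points are on while
    exit tracing stays off.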

    ==================
    STARTING THE CACHE
    ==================

    The cache is started by running the daemon. The daemon opens the cache device,
    configures the cache and tells it to begin caching. At that point the cache
    binds to fscache and the cache becomes live.

    The daemon is run as follows:

    /sbin/cachefilesd [-d]* [-s] [-n] [-f ]

    The flags are:

    (*) -d

    Increase the debugging level. This can be specified multiple times and
    is cumulative with itself.

    (*) -s

    Send messages to stderr instead of syslog.

    (*) -n

    Don't daemonise and go into background.

    (*) -f

    Use an alternative configuration file rather than the default one.

    ===============
    THINGS TO AVOID
    ===============

    Do not mount other things within the cache as this will cause problems. The
    kernel module contains its own very cut-down path walking facility that ignores
    mountpoints, but the daemon can't avoid them.

    Do not create, rename or unlink files and directories in the cache whilst the
    cache is active, as this may cause the state to become uncertain.

    Renaming files in the cache might make objects appear to be other objects (the
    filename is part of the lookup key).

    Do not change or remove the extended attributes attached to cache files by the
    cache as this will cause the cache state management to get confused.

    Do not create files or directories in the cache, lest the cache get confused or
    serve incorrect data.

    Do not chmod files in the cache. The module creates things with minimal
    permissions to prevent random users being able to access them directly.

    =============
    CACHE CULLING
    =============

    The cache may need culling occasionally to make space. This involves
    discarding objects from the cache that have been used less recently than
    anything else. Culling is based on the access time of data objects. Empty
    directories are culled if not in use.

    Cache culling is done on the basis of the percentage of blocks and the
    percentage of files available in the underlying filesystem. There are six
    "limits":

    (*) brun
    (*) frun

    If the amount of free space and the number of available files in the cache
    rises above both these limits, then culling is turned off.

    (*) bcull
    (*) fcull

    If the amount of available space or the number of available files in the
    cache falls below either of these limits, then culling is started.

    (*) bstop
    (*) fstop

    If the amount of available space or the number of available files in the
    cache falls below either of these limits, then no further allocation of
    disk space or files is permitted until culling has raised things above
    these limits again.

    These must be configured thusly:

    0 < bcull < brun < 100
    0 < fcull < frun < 100

    Note that these are percentages of available space and available files, and do
    _not_ appear as 100 minus the percentage displayed by the "df" program.

    The userspace daemon scans the cache to build up a table of cullable objects.
    These are then culled in least recently used order. A new scan of the cache is
    started as soon as space is made in the table. Objects will be skipped if
    their atimes have changed or if the kernel module says it is still using them.

    ===============
    CACHE STRUCTURE
    ===============

    The CacheFiles module will create two directories in the directory it was
    given:

    (*) cache/

    (*) graveyard/

    The active cache objects all reside in the first directory. The CacheFiles
    kernel module moves any retired or culled objects that it can't simply unlink
    to the graveyard from which the daemon will actually delete them.

    The daemon uses dnotify to monitor the graveyard directory, and will delete
    anything that appears therein.

    The module represents index objects as directories with the filename "I..." or
    "J...". Note that the "cache/" directory is itself a special index.

    Data objects are represented as files if they have no children, or directories
    if they do. Their filenames all begin "D..." or "E...". If represented as a
    directory, data objects will have a file in the directory called "data" that
    actually holds the data.

    Special objects are similar to data objects, except their filenames begin
    "S..." or "T...".

    If an object has children, then it will be represented as a directory.
    Immediately in the representative directory are a collection of directories
    named for hash values of the child object keys with an '@' prepended. Into
    this directory, if possible, will be placed the representations of the child
    objects:

    INDEX     INDEX      INDEX                             DATA FILES
    ========= ========== ================================= ================
    cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400
    cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...DB1ry
    cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...N22ry
    cache/@4a/I03nfs/@30/Ji000000000000000--fHg8hi8400/@75/Es0g000w...FP1ry

    If the key is so long that it exceeds NAME_MAX with the decorations added on to
    it, then it will be cut into pieces, the first few of which will be used to
    make a nest of directories, and the last one of which will be the objects
    inside the last directory. The names of the intermediate directories will have
    '+' prepended:

    J1223/@23/+xy...z/+kl...m/Epqr

    Note that keys are raw data, and not only may they exceed NAME_MAX in size,
    they may also contain things like '/' and NUL characters, and so they may not
    be suitable for turning directly into a filename.

    To handle this, CacheFiles will use a suitably printable filename directly and
    "base-64" encode ones that aren't directly suitable. The two versions of
    object filenames indicate the encoding:

    OBJECT TYPE     PRINTABLE       ENCODED
    =============== =============== ===============
    Index           "I..."          "J..."
    Data            "D..."          "E..."
    Special         "S..."          "T..."

    Intermediate directories are always "@" or "+" as appropriate.
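
    The prefix selection in the table above can be sketched as follows. The
    function and enum names are hypothetical, and the test used here for a
    "suitably printable" key (non-empty, graphic ASCII only, no '/') is an
    assumption made for illustration; the module's actual rule may differ.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum object_type { OBJ_INDEX, OBJ_DATA, OBJ_SPECIAL };

/* Assumed printability rule: non-empty, graphic characters only, no '/'. */
int key_is_printable(const unsigned char *key, size_t len)
{
    assert(key != NULL);
    if (len == 0)
        return 0;
    for (size_t i = 0; i < len; i++)
        if (!isgraph(key[i]) || key[i] == '/')
            return 0;
    return 1;
}

/* Prefix letter per the table: I/D/S for printable keys, J/E/T for encoded. */
char object_prefix(enum object_type type, int printable)
{
    static const char prefixes[3][2] = {
        { 'J', 'I' },  /* index */
        { 'E', 'D' },  /* data */
        { 'T', 'S' }   /* special */
    };
    assert(type == OBJ_INDEX || type == OBJ_DATA || type == OBJ_SPECIAL);
    return prefixes[type][printable ? 1 : 0];
}
```

    A key containing '/' or a NUL byte fails the check and therefore takes
    the encoded prefix for its object type.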

    Each object in the cache has an extended attribute label that holds the object
    type ID (required to distinguish special objects) and the auxiliary data from
    the netfs. The latter is used to detect stale objects in the cache and update
    or retire them.

    Note that CacheFiles will erase from the cache any file it doesn't recognise or
    any file of an incorrect type (such as a FIFO file or a device file).

    ==========================
    SECURITY MODEL AND SELINUX
    ==========================

    CacheFiles is implemented to deal properly with the LSM security features of
    the Linux kernel and the SELinux facility.

    One of the problems that CacheFiles faces is that it is generally acting on
    behalf of a process, and running in that process's context, and that includes a
    security context that is not appropriate for accessing the cache - either
    because the files in the cache are inaccessible to that process, or because if
    the process creates a file in the cache, that file may be inaccessible to other
    processes.

    The way CacheFiles works is to temporarily change the security context (fsuid,
    fsgid and actor security label) that the process acts as - without changing the
    security context of the process when it is the target of an operation performed by
    some other process (so signalling and suchlike still work correctly).

    When the CacheFiles module is asked to bind to its cache, it:

    (1) Finds the security label attached to the root cache directory and uses
    that as the security label with which it will create files. By default,
    this is:

    cachefiles_var_t

    (2) Finds the security label of the process which issued the bind request
    (presumed to be the cachefilesd daemon), which by default will be:

    cachefilesd_t

    and asks LSM to supply a security ID as which it should act given the
    daemon's label. By default, this will be:

    cachefiles_kernel_t

    SELinux transitions the daemon's security ID to the module's security ID
    based on a rule of this form in the policy.

    type_transition ;

    For instance:

    type_transition cachefilesd_t kernel_t : process cachefiles_kernel_t;

    The module's security ID gives it permission to create, move and remove files
    and directories in the cache, to find and access directories and files in the
    cache, to set and access extended attributes on cache objects, and to read and
    write files in the cache.

    The daemon's security ID gives it only a very restricted set of permissions: it
    may scan directories, stat files and erase files and directories. It may
    not read or write files in the cache, and so it is precluded from accessing the
    data cached therein; nor is it permitted to create new files in the cache.

    There are policy source files available in:

    http://people.redhat.com/~dhowells/fscache/cachefilesd-0.8.tar.bz2

    and later versions. In that tarball, see the files:

    cachefilesd.te
    cachefilesd.fc
    cachefilesd.if

    They are built and installed directly by the RPM.

    If a non-RPM based system is being used, then copy the above files to their own
    directory and run:

    make -f /usr/share/selinux/devel/Makefile
    semodule -i cachefilesd.pp

    You will need checkpolicy and selinux-policy-devel installed prior to the
    build.

    By default, the cache is located in /var/fscache, but if it is desirable that
    it should be elsewhere, then either the above policy files must be altered, or
    an auxiliary policy must be installed to label the alternate location of the
    cache.

    For instructions on how to add an auxiliary policy to enable the cache to be
    located elsewhere when SELinux is in enforcing mode, please see:

    /usr/share/doc/cachefilesd-*/move-cache.txt

    When the cachefilesd rpm is installed; alternatively, the document can be found
    in the sources.

    ==================
    A NOTE ON SECURITY
    ==================

    CacheFiles makes use of the split security in the task_struct. It allocates
    its own task_security structure, and redirects current->act_as to point to it
    when it acts on behalf of another process, in that process's context.

    The reason it does this is that it calls vfs_mkdir() and suchlike rather than
    bypassing security and calling inode ops directly. Therefore the VFS and LSM
    may deny the CacheFiles access to the cache data because under some
    circumstances the caching code is running in the security context of whatever
    process issued the original syscall on the netfs.

    Furthermore, should CacheFiles create a file or directory, the security
    parameters with which that object is created (UID, GID, security label) would be
    derived from that process that issued the system call, thus potentially
    preventing other processes from accessing the cache - including CacheFiles's
    cache management daemon (cachefilesd).

    What is required is to temporarily override the security of the process that
    issued the system call. We can't, however, just do an in-place change of the
    security data as that affects the process as an object, not just as a subject.
    This means it may lose signals or ptrace events for example, and affects what
    the process looks like in /proc.

    So CacheFiles makes use of a logical split in the security between the
    objective security (task->sec) and the subjective security (task->act_as). The
    objective security holds the intrinsic security properties of a process and is
    never overridden. This is what appears in /proc, and is what is used when a
    process is the target of an operation by some other process (SIGKILL for
    example).

    The subjective security holds the active security properties of a process, and
    may be overridden. This is not seen externally, and is used when a process
    acts upon another object, for example SIGKILLing another process or opening a
    file.
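
    The split can be pictured with the user-space model below. It is only a
    sketch of the idea: struct sec_ctx, struct task and the two helper
    functions are hypothetical names, with the objective and subjective
    fields standing in for task->sec and task->act_as respectively.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bundle of the properties that get overridden. */
struct sec_ctx {
    unsigned int fsuid;
    unsigned int fsgid;
    const char *label;
};

struct task {
    const struct sec_ctx *objective;   /* seen by others; never overridden */
    const struct sec_ctx *subjective;  /* used when acting; may be overridden */
};

/* Act as the cache's context; returns the context to restore afterwards. */
const struct sec_ctx *override_subjective(struct task *t,
                                          const struct sec_ctx *cache_ctx)
{
    assert(t != NULL && cache_ctx != NULL);
    const struct sec_ctx *saved = t->subjective;
    t->subjective = cache_ctx;
    return saved;
}

/* Restore the subjective context saved by override_subjective(). */
void revert_subjective(struct task *t, const struct sec_ctx *saved)
{
    assert(t != NULL && saved != NULL);
    t->subjective = saved;
}
```

    While the override is in place, anything that targets the task (signals,
    /proc) still consults the untouched objective pointer.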

    LSM hooks exist that allow SELinux (or Smack or whatever) to reject a request
    for CacheFiles to run in a context of a specific security label, or to create
    files and directories with another security label.

    This documentation is added by the patch to:

    Documentation/filesystems/caching/cachefiles.txt

    Signed-Off-By: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     
  • Add the main configuration option, allowing FS-Cache to be selected; the
    module entry and exit functions and the debugging stuff used by these patches.

    The two configuration options added are:

    CONFIG_FSCACHE
    CONFIG_FSCACHE_DEBUG

    The first enables the facility, and the second makes the debugging statements
    enableable through the "debug" module parameter. The value of this parameter
    is a bitmask as described in:

    Documentation/filesystems/caching/fscache.txt

    The module can be loaded at this point, but all it will do at this point in
    the patch series is to start up the slow work facility and shut it down again.

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Acked-by: Trond Myklebust
    Acked-by: Al Viro
    Tested-by: Daire Byrne

    David Howells
     

01 Apr, 2009

1 commit