05 Apr, 2016

1 commit

  • PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
    ago with the promise that one day it would be possible to implement the
    page cache with chunks bigger than PAGE_SIZE.

    This promise never materialized, and it is unlikely that it ever will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE. And it's a constant source of confusion whether a
    PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
    especially on the border between fs and mm.

    Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straight-forward:

    - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();

    This patch contains automated changes generated with coccinelle using
    the script below. For some reason, coccinelle doesn't patch header
    files. I've called spatch for them manually.

    The only adjustment after coccinelle is a revert of the changes to the
    PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code that coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation will
    also be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
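
The identities behind this conversion can be sketched in userspace C. This is
only an illustration, not kernel code: the 4 KiB page size is an assumption,
and index_old(), index_new() and offset_in_page_sketch() are hypothetical
helper names.

```c
/* Assumed for illustration: 4 KiB pages. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Old-style aliases, defined as the equalities the commit relies on. */
#define PAGE_CACHE_SHIFT PAGE_SHIFT
#define PAGE_CACHE_SIZE  PAGE_SIZE

/* Pre-conversion form: the shift count is always 0. */
unsigned long index_old(unsigned long index)
{
    return index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
}

/* Post-conversion form: the no-op shift is simply dropped. */
unsigned long index_new(unsigned long index)
{
    return index;
}

/* Byte offset of a file position within its page. */
unsigned long offset_in_page_sketch(unsigned long pos)
{
    return pos & ~PAGE_MASK;
}
```

Because the two shifts are equal, index_old() and index_new() return the same
value for every input, which is why the script can delete the shift
expressions outright.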
     

23 Jan, 2016

1 commit

  • parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
    inode_foo(inode) being mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.

    Signed-off-by: Al Viro

    Al Viro
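
The wrapper pattern can be sketched in userspace C as follows. The toy_mutex
type is a hypothetical single-threaded stand-in for the kernel's struct mutex,
and the bool return types are a simplification; only the inode_foo() naming
comes from the commit message.

```c
#include <stdbool.h>

/* Hypothetical single-threaded stand-in for the kernel's struct mutex. */
struct toy_mutex { bool locked; };

void toy_mutex_lock(struct toy_mutex *m)   { m->locked = true; }
void toy_mutex_unlock(struct toy_mutex *m) { m->locked = false; }

bool toy_mutex_trylock(struct toy_mutex *m)
{
    if (m->locked)
        return false;
    m->locked = true;
    return true;
}

struct inode { struct toy_mutex i_mutex; };

/* Wrappers: callers no longer name ->i_mutex directly. */
void inode_lock(struct inode *inode)      { toy_mutex_lock(&inode->i_mutex); }
void inode_unlock(struct inode *inode)    { toy_mutex_unlock(&inode->i_mutex); }
bool inode_trylock(struct inode *inode)   { return toy_mutex_trylock(&inode->i_mutex); }
bool inode_is_locked(struct inode *inode) { return inode->i_mutex.locked; }
```

Once every caller goes through the wrappers, the type of the lock can be
changed in one place, which is what the planned switch to an rwsem needs.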
     

15 Jan, 2016

1 commit

  • Mark those kmem allocations that are known to be easily triggered from
    userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
    memcg. For the list, see below:

    - threadinfo
    - task_struct
    - task_delay_info
    - pid
    - cred
    - mm_struct
    - vm_area_struct and vm_region (nommu)
    - anon_vma and anon_vma_chain
    - signal_struct
    - sighand_struct
    - fs_struct
    - files_struct
    - fdtable and fdtable->full_fds_bits
    - dentry and external_name
    - inode for all filesystems. This is the most tedious part, because
    most filesystems overwrite the alloc_inode method.

    The list is far from complete, so feel free to add more objects.
    Nevertheless, it should be close to the "account everything" approach
    and keep most workloads within bounds. Malevolent users will be able to
    breach the limit, but this was possible even with the former "account
    everything" approach (simply because it did not in fact account
    everything).

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Vladimir Davydov
    Acked-by: Johannes Weiner
    Acked-by: Michal Hocko
    Cc: Tejun Heo
    Cc: Greg Thelen
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     

12 Jan, 2016

1 commit

  • Pull vfs xattr updates from Al Viro:
    "Andreas' xattr cleanup series.

    It's a followup to his xattr work that went in last cycle; -0.5KLoC"

    * 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    xattr handlers: Simplify list operation
    ocfs2: Replace list xattr handler operations
    nfs: Move call to security_inode_listsecurity into nfs_listxattr
    xfs: Change how listxattr generates synthetic attributes
    tmpfs: listxattr should include POSIX ACL xattrs
    tmpfs: Use xattr handler infrastructure
    btrfs: Use xattr handler infrastructure
    vfs: Distinguish between full xattr names and proper prefixes
    posix acls: Remove duplicate xattr name definitions
    gfs2: Remove gfs2_xattr_acl_chmod
    vfs: Remove vfs_xattr_cmp

    Linus Torvalds
     

09 Dec, 2015

1 commit

  • kmap() in page_follow_link_light() needed to go: allowing an arbitrary
    number of kmaps to be held for a long time is a great way to deadlock
    the system.

    A new helper (inode_nohighmem(inode)) needs to be used for pagecache
    symlink inodes; this is done for all in-tree cases.
    page_follow_link_light() is instrumented to yell about anything missed.

    Signed-off-by: Al Viro

    Al Viro
     

07 Dec, 2015

2 commits


14 Nov, 2015

2 commits

  • The xattr_handler operations are currently all passed a file system
    specific flags value which the operations can use to disambiguate between
    different handlers; some file systems use that to distinguish the xattr
    namespace, for example. In some operations, it would be useful to also have
    access to the handler prefix. To allow that, pass a pointer to the handler
    to operations instead of the flags value alone.

    Signed-off-by: Andreas Gruenbacher
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Andreas Gruenbacher
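
A minimal sketch of why passing the handler helps, in userspace C. The
two-field struct is a simplified model of the kernel's struct xattr_handler,
and build_full_name() is a hypothetical helper, not a kernel function.

```c
#include <stdio.h>

/* Simplified model of a handler: only the two fields discussed above. */
struct xattr_handler {
    const char *prefix; /* e.g. "user." */
    int flags;          /* filesystem-specific value */
};

/*
 * An operation that receives the handler itself can use both the flags
 * and the prefix; here it rebuilds the full attribute name.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int build_full_name(const struct xattr_handler *handler, const char *name,
                    char *buf, size_t len)
{
    int n = snprintf(buf, len, "%s%s", handler->prefix, name);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With only the flags value, the operation would need its own table to map
flags back to a prefix; with the handler pointer, both are at hand.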
     
  • The list operations can never be called; they are even documented to be
    unused.

    Signed-off-by: Andreas Gruenbacher
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     

11 Sep, 2015

1 commit

  • Pages looked up by __hfs_bnode_create() (called by hfs_bnode_create() and
    hfs_bnode_find() for finding or creating pages corresponding to an inode)
    are immediately kmap()'ed and used (both read and write) and kunmap()'ed,
    and should not be page_cache_release()'ed until hfs_bnode_free().

    This patch fixes a problem I first saw in July 2012: merely running "du"
    on a large hfsplus-mounted directory a few times on a reasonably loaded
    system would get the hfsplus driver all confused and complaining about
    B-tree inconsistencies, and generate a "BUG: Bad page state". Most
    recently, I can generate this problem on up-to-date Fedora 22 with shipped
    kernel 4.0.5, by running "du /" (="/" + "/home" + "/mnt" + other smaller
    mounts) and "du /mnt" simultaneously on two windows, where /mnt is a
    lightly-used QEMU VM image of the full Mac OS X 10.9:

    $ df -i / /home /mnt
    Filesystem                  Inodes   IUsed      IFree IUse% Mounted on
    /dev/mapper/fedora-root    3276800  551665    2725135   17% /
    /dev/mapper/fedora-home   52879360  716221   52163139    2% /home
    /dev/nbd0p2             4294967295 1387818 4293579477    1% /mnt

    After applying the patch, I was able to run "du /" (60+ times) and "du
    /mnt" (150+ times) continuously and simultaneously for 6+ hours.

    There are many reports of the hfsplus driver getting confused under load
    and generating "BUG: Bad page state" or other similar issues over the
    years. [1]

    The unpatched code [2] has always been wrong since it entered the kernel
    tree. The only reason why it gets away with it is that the
    kmap/memcpy/kunmap follow very quickly after the page_cache_release() so
    the kernel has not had a chance to reuse the memory for something else,
    most of the time.

    The current RW driver appears to have followed the design and development
    of the earlier read-only hfsplus driver [3], whereby version 0.1 (Dec
    2001) had a B-tree node-centric approach to
    read_cache_page()/page_cache_release() per bnode_get()/bnode_put(),
    migrating towards version 0.2 (June 2002) of caching and releasing pages
    per inode extents. When the current RW code first entered the kernel [2]
    in 2005, there was an REF_PAGES conditional (and "//" commented out code)
    to switch between B-node centric paging and inode-centric paging. There
    was a mistake with the direction of one of the REF_PAGES conditionals in
    __hfs_bnode_create(). In a subsequent "remove debug code" commit [4], the
    read_cache_page()/page_cache_release() per bnode_get()/bnode_put() were
    removed, but a page_cache_release() was mistakenly left in (propagating
    the "REF_PAGES !REF_PAGE" mistake), and the commented-out
    page_cache_release() in bnode_release() (which should be spanned by
    !REF_PAGES) was never enabled.

    References:
    [1]:
    Michael Fox, Apr 2013
    http://www.spinics.net/lists/linux-fsdevel/msg63807.html
    ("hfsplus volume suddenly inaccessable after 'hfs: recoff %d too large'")

    Sasha Levin, Feb 2015
    http://lkml.org/lkml/2015/2/20/85 ("use after free")

    https://bugs.launchpad.net/ubuntu/+source/linux/+bug/740814
    https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1027887
    https://bugzilla.kernel.org/show_bug.cgi?id=42342
    https://bugzilla.kernel.org/show_bug.cgi?id=63841
    https://bugzilla.kernel.org/show_bug.cgi?id=78761

    [2]:
    http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\
    fs/hfs/bnode.c?id=d1081202f1d0ee35ab0beb490da4b65d4bc763db
    commit d1081202f1d0ee35ab0beb490da4b65d4bc763db
    Author: Andrew Morton
    Date: Wed Feb 25 16:17:36 2004 -0800

    [PATCH] HFS rewrite

    http://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/\
    fs/hfsplus/bnode.c?id=91556682e0bf004d98a529bf829d339abb98bbbd

    commit 91556682e0bf004d98a529bf829d339abb98bbbd
    Author: Andrew Morton
    Date: Wed Feb 25 16:17:48 2004 -0800

    [PATCH] HFS+ support

    [3]:
    http://sourceforge.net/projects/linux-hfsplus/

    http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.1/
    http://sourceforge.net/projects/linux-hfsplus/files/Linux%202.4.x%20patch/hfsplus%200.2/

    http://linux-hfsplus.cvs.sourceforge.net/viewvc/linux-hfsplus/linux/\
    fs/hfsplus/bnode.c?r1=1.4&r2=1.5

    Date: Thu Jun 6 09:45:14 2002 +0000
    Use buffer cache instead of page cache in bnode.c. Cache inode extents.

    [4]:
    http://git.kernel.org/cgit/linux/kernel/git/\
    stable/linux-stable.git/commit/?id=a5e3985fa014029eb6795664c704953720cc7f7d

    commit a5e3985fa014029eb6795664c704953720cc7f7d
    Author: Roman Zippel
    Date: Tue Sep 6 15:18:47 2005 -0700

    [PATCH] hfs: remove debug code

    Signed-off-by: Hin-Tak Leung
    Signed-off-by: Sergei Antonov
    Reviewed-by: Anton Altaparmakov
    Reported-by: Sasha Levin
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Vyacheslav Dubeyko
    Cc: Sougata Santra
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hin-Tak Leung
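
The lifetime rule described in this commit can be modelled with a toy
reference count in userspace C. Everything here (toy_page, toy_bnode and the
helper names) is hypothetical and only mirrors the idea that the node keeps
its page references until it is freed.

```c
#include <stddef.h>

/* Toy page with a reference count (hypothetical userspace model). */
struct toy_page { int refcount; };

void toy_get_page(struct toy_page *p) { p->refcount++; }
void toy_put_page(struct toy_page *p) { p->refcount--; }

#define PAGES_PER_NODE 2

struct toy_bnode { struct toy_page *page[PAGES_PER_NODE]; };

/* Lookup takes a reference on each page; the node keeps it while it lives. */
void toy_bnode_create(struct toy_bnode *node, struct toy_page *pages)
{
    for (int i = 0; i < PAGES_PER_NODE; i++) {
        toy_get_page(&pages[i]);
        node->page[i] = &pages[i];
        /* The buggy pattern dropped the reference here, right after lookup. */
    }
}

/* References are dropped only when the node itself is freed. */
void toy_bnode_free(struct toy_bnode *node)
{
    for (int i = 0; i < PAGES_PER_NODE; i++) {
        toy_put_page(node->page[i]);
        node->page[i] = NULL;
    }
}
```

In this model, dropping the reference inside the create loop would leave
node->page[] pointing at pages that the node no longer holds, the same shape
as the use-after-free reported above.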
     

05 Sep, 2015

1 commit

  • Many file systems that implement the show_options hook fail to correctly
    escape their output which could lead to unescaped characters (e.g. new
    lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files. This
    could lead to confusion, spoofed entries (resulting in things like
    systemd issuing false d-bus "mount" notifications), and who knows what
    else. This looks like it would only be the root user stepping on
    themselves, but it's possible weird things could happen in containers or
    in other situations with delegated mount privileges.

    Here's an example using overlay with setuid fusermount trusting the
    contents of /proc/mounts (via the /etc/mtab symlink). Imagine the use
    of "sudo" is something more sneaky:

    $ BASE="ovl"
    $ MNT="$BASE/mnt"
    $ LOW="$BASE/lower"
    $ UP="$BASE/upper"
    $ WORK="$BASE/work/ 0 0
    none /proc fuse.pwn user_id=1000"
    $ mkdir -p "$LOW" "$UP" "$WORK"
    $ sudo mount -t overlay -o "lowerdir=$LOW,upperdir=$UP,workdir=$WORK" none /mnt
    $ cat /proc/mounts
    none /root/ovl/mnt overlay rw,relatime,lowerdir=ovl/lower,upperdir=ovl/upper,workdir=ovl/work/ 0 0
    none /proc fuse.pwn user_id=1000 0 0
    $ fusermount -u /proc
    $ cat /proc/mounts
    cat: /proc/mounts: No such file or directory

    This fixes the problem by adding new seq_show_option and
    seq_show_option_n helpers, and updating the vulnerable show_option
    handlers to use them as needed. Some, like SELinux, need to be open
    coded due to unusual existing escape mechanisms.

    [akpm@linux-foundation.org: add lost chunk, per Kees]
    [keescook@chromium.org: seq_show_option should be using const parameters]
    Signed-off-by: Kees Cook
    Acked-by: Serge Hallyn
    Acked-by: Jan Kara
    Acked-by: Paul Moore
    Cc: J. R. Okajima
    Signed-off-by: Kees Cook
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kees Cook
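
The kind of escaping involved can be sketched in userspace C. This is not the
kernel's seq_show_option(); escape_option() is a hypothetical helper, and both
the octal output format and the caller-supplied escape set are assumptions
made for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, replacing every character found in esc with a
 * backslash followed by its three-digit octal code.
 * Returns 0 on success, -1 if dst is too small.
 */
int escape_option(char *dst, size_t len, const char *src, const char *esc)
{
    size_t used = 0;

    if (len == 0)
        return -1;

    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;

        if (strchr(esc, c)) {
            if (used + 4 >= len)
                return -1;
            dst[used++] = '\\';
            dst[used++] = (char)('0' + ((c >> 6) & 7));
            dst[used++] = (char)('0' + ((c >> 3) & 7));
            dst[used++] = (char)('0' + (c & 7));
        } else {
            if (used + 1 >= len)
                return -1;
            dst[used++] = (char)c;
        }
    }
    dst[used] = '\0';
    return 0;
}
```

Applied to the workdir value from the example above, the embedded spaces and
newline become \040 and \012, so the value can no longer start a second line
in /proc/mounts.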
     

05 Jul, 2015

1 commit

  • Pull more vfs updates from Al Viro:
    "Assorted VFS fixes and related cleanups (IMO the most interesting in
    that part are f_path-related things and Eric's descriptor-related
    stuff). UFS regression fixes (it got broken last cycle). 9P fixes.
    fs-cache series, DAX patches, Jan's file_remove_suid() work"

    [ I'd say this is much more than "fixes and related cleanups". The
    file_table locking rule change by Eric Dumazet is a rather big and
    fundamental update even if the patch isn't huge. - Linus ]

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (49 commits)
    9p: cope with bogus responses from server in p9_client_{read,write}
    p9_client_write(): avoid double p9_free_req()
    9p: forgetting to cancel request on interrupted zero-copy RPC
    dax: bdev_direct_access() may sleep
    block: Add support for DAX reads/writes to block devices
    dax: Use copy_from_iter_nocache
    dax: Add block size note to documentation
    fs/file.c: __fget() and dup2() atomicity rules
    fs/file.c: don't acquire files->file_lock in fd_install()
    fs:super:get_anon_bdev: fix race condition could cause dev exceed its upper limitation
    vfs: avoid creation of inode number 0 in get_next_ino
    namei: make set_root_rcu() return void
    make simple_positive() public
    ufs: use dir_pages instead of ufs_dir_pages()
    pagemap.h: move dir_pages() over there
    remove the pointless include of lglock.h
    fs: cleanup slight list_entry abuse
    xfs: Correctly lock inode when removing suid and file capabilities
    fs: Call security_ops->inode_killpriv on truncate
    fs: Provide function telling whether file_remove_privs() will do anything
    ...

    Linus Torvalds
     

24 Jun, 2015

1 commit

  • list_entry is just a wrapper for container_of, but it is arguably
    wrong (and slightly confusing) to use it when the pointed-to struct
    member is not a struct list_head. Use container_of directly instead.

    Signed-off-by: Rasmus Villemoes
    Signed-off-by: Al Viro

    Rasmus Villemoes
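
The distinction can be shown in userspace C. The container_of macro below is
an approximation built on offsetof (the kernel version adds type checking),
and toy_inode, toy_fs_inode and TOY_I() are hypothetical names.

```c
#include <stddef.h>

/* Userspace approximation of the kernel macro (without its type checking). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct list_head { struct list_head *next, *prev; };

/* list_entry is the same computation under a list-specific name. */
#define list_entry(ptr, type, member) container_of(ptr, type, member)

struct toy_inode { long ino; };      /* hypothetical embedded member */

struct toy_fs_inode {                /* hypothetical containing struct */
    int flags;
    struct list_head link;
    struct toy_inode vfs_inode;
};

/* The member is not a list_head, so container_of states the intent directly. */
struct toy_fs_inode *TOY_I(struct toy_inode *inode)
{
    return container_of(inode, struct toy_fs_inode, vfs_inode);
}
```

Both macros compute the same address; the difference is only what the name
tells the reader about the member being a list link.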
     

02 Jun, 2015

1 commit

  • With the planned cgroup writeback support, backing-dev related
    declarations will be more widely used across block and cgroup;
    unfortunately, including backing-dev.h from include/linux/blkdev.h
    makes cyclic include dependency quite likely.

    This patch separates out backing-dev-defs.h which only has the
    essential definitions and updates blkdev.h to include it. c files
    which need access to more backing-dev details now include
    backing-dev.h directly. This takes backing-dev.h off the common
    include dependency chain making it a lot easier to use it across block
    and cgroup.

    v2: fs/fat build failure fixed.

    Signed-off-by: Tejun Heo
    Reviewed-by: Jan Kara
    Cc: Jens Axboe
    Signed-off-by: Jens Axboe

    Tejun Heo
     

27 Apr, 2015

1 commit

  • Pull fourth vfs update from Al Viro:
    "d_inode() annotations from David Howells (sat in for-next since before
    the beginning of merge window) + four assorted fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    RCU pathwalk breakage when running into a symlink overmounting something
    fix I_DIO_WAKEUP definition
    direct-io: only inc/dec inode->i_dio_count for file systems
    fs/9p: fix readdir()
    VFS: assorted d_backing_inode() annotations
    VFS: fs/inode.c helpers: d_inode() annotations
    VFS: fs/cachefiles: d_backing_inode() annotations
    VFS: fs library helpers: d_inode() annotations
    VFS: assorted weird filesystems: d_inode() annotations
    VFS: normal filesystems (and lustre): d_inode() annotations
    VFS: security/: d_inode() annotations
    VFS: security/: d_backing_inode() annotations
    VFS: net/: d_inode() annotations
    VFS: net/unix: d_backing_inode() annotations
    VFS: kernel/: d_inode() annotations
    VFS: audit: d_backing_inode() annotations
    VFS: Fix up some ->d_inode accesses in the chelsio driver
    VFS: Cachefiles should perform fs modifications on the top layer only
    VFS: AF_UNIX sockets should call mknod on the top layer only

    Linus Torvalds
     

17 Apr, 2015

10 commits

  • Merge third patchbomb from Andrew Morton:

    - various misc things

    - a couple of lib/ optimisations

    - provide DIV_ROUND_CLOSEST_ULL()

    - checkpatch updates

    - rtc tree

    - befs, nilfs2, hfs, hfsplus, fatfs, adfs, affs, bfs

    - ptrace fixes

    - fork() fixes

    - seccomp cleanups

    - more mmap_sem hold time reductions from Davidlohr

    * emailed patches from Andrew Morton: (138 commits)
    proc: show locks in /proc/pid/fdinfo/X
    docs: add missing and new /proc/PID/status file entries, fix typos
    drivers/rtc/rtc-at91rm9200.c: make IO endian agnostic
    Documentation/spi/spidev_test.c: fix warning
    drivers/rtc/rtc-s5m.c: allow usage on device type different than main MFD type
    .gitignore: ignore *.tar
    MAINTAINERS: add Mediatek SoC mailing list
    tomoyo: reduce mmap_sem hold for mm->exe_file
    powerpc/oprofile: reduce mmap_sem hold for exe_file
    oprofile: reduce mmap_sem hold for mm->exe_file
    mips: ip32: add platform data hooks to use DS1685 driver
    lib/Kconfig: fix up HAVE_ARCH_BITREVERSE help text
    x86: switch to using asm-generic for seccomp.h
    sparc: switch to using asm-generic for seccomp.h
    powerpc: switch to using asm-generic for seccomp.h
    parisc: switch to using asm-generic for seccomp.h
    mips: switch to using asm-generic for seccomp.h
    microblaze: use asm-generic for seccomp.h
    arm: use asm-generic for seccomp.h
    seccomp: allow COMPAT sigreturn overrides
    ...

    Linus Torvalds
     
  • On Mac OS X, HFS+ extended attributes are not namespaced. Since we want
    to be compatible with OS X filesystems and yet still support the Linux
    namespacing system, the hfsplus driver implements a special "osx"
    namespace that is reported for any attribute that is not namespaced
    on-disk. However, the current code for getting and setting these
    unprefixed attributes is broken.

    hfsplus_osx_setattr() and hfsplus_osx_getattr() are passed names that have
    already had their "osx." prefixes stripped by the generic functions. The
    functions first, quite correctly, check those names to make sure that they
    aren't prefixed with a known namespace, which would allow namespace access
    restrictions to be bypassed. However, the functions then prepend "osx."
    to the name they're given before passing it on to hfsplus_getattr() and
    hfsplus_setattr(). Not only does this cause the "osx." prefix to be
    stored on-disk, defeating its purpose, it also breaks the check for the
    special "com.apple.FinderInfo" attribute, which is reported for all files,
    and as a consequence makes some userspace applications (e.g. GNU patch)
    fail even when extended attributes are not otherwise in use.

    There are five commits which have touched this particular code:

    127e5f5ae51e ("hfsplus: rework functionality of getting, setting and deleting of extended attributes")
    b168fff72109 ("hfsplus: use xattr handlers for removexattr")
    bf29e886b242 ("hfsplus: correct usage of HFSPLUS_ATTR_MAX_STRLEN for non-English attributes")
    fcacbd95e121 ("fs/hfsplus: move xattr_name allocation in hfsplus_getxattr()")
    ec1bbd346f18 ("fs/hfsplus: move xattr_name allocation in hfsplus_setxattr()")

    The first commit creates the functions to begin with. The namespace is
    prepended by the original code, which I believe was correct at the time,
    since hfsplus_?etattr() stripped the prefix if found. The second commit
    removes this behavior from hfsplus_?etattr() and appears to have been
    intended to also remove the prefixing from hfsplus_osx_?etattr().
    However, what it actually does is remove a necessary strncpy() call
    completely, breaking the osx namespace entirely. The third commit re-adds
    the strncpy() call as it was originally, but doesn't mention it in its
    commit message. The final two commits refactor the code and don't affect
    its functionality.

    This commit does what b168fff attempted to do (prevent the prefix from
    being added), but does it properly, instead of passing in an empty buffer
    (which is what b168fff actually did).

    Fixes: b168fff72109 ("hfsplus: use xattr handlers for removexattr")
    Signed-off-by: Thomas Hebb
    Cc: Hin-Tak Leung
    Cc: Sergei Antonov
    Cc: Anton Altaparmakov
    Cc: Fabian Frederick
    Cc: Christian Kujau
    Cc: Christoph Hellwig
    Cc: Al Viro
    Cc: Viacheslav Dubeyko
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thomas Hebb
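
The intended behaviour can be sketched in userspace C. The list of known
namespace prefixes is an assumption, and osx_on_disk_name() is a hypothetical
helper that only models the name handling, not the driver's actual functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if name starts with one of the standard Linux xattr namespaces. */
bool is_known_namespace(const char *name)
{
    static const char *const prefixes[] = {
        "system.", "user.", "security.", "trusted.",
    };

    for (size_t i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++)
        if (strncmp(name, prefixes[i], strlen(prefixes[i])) == 0)
            return true;
    return false;
}

/*
 * name arrives with "osx." already stripped. Returns the name to use
 * on disk, or NULL when the request must be refused.
 */
const char *osx_on_disk_name(const char *name)
{
    if (is_known_namespace(name))
        return NULL;   /* would bypass namespace access restrictions */
    return name;       /* stored unprefixed: no "osx." is prepended */
}
```

Under this model "osx.com.apple.FinderInfo" reaches the lower layer as
"com.apple.FinderInfo", so a check for that special attribute can match.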
     
  • Fix a bug which is reproduced as follows. Create a file:

    echo abc > test_file

    Try to expand the file beyond available space:

    truncate --size= test_file

    Since HFS+ does not support file size > allocated size, truncate should
    fail. However, it completes successfully: the driver returns success
    despite having been unable to allocate the requested space for the file.
    A filesystem check also finds an error:

    Checking catalog file.
    Incorrect size for file test_file
    (It should be 469094400 instead of 1000000000)

    Add a piece of code analogous to the code in the fat driver. Now a
    proper error is returned and the filesystem remains consistent.

    Signed-off-by: Sergei Antonov
    Cc: Vyacheslav Dubeyko
    Cc: Hin-Tak Leung
    Reviewed-by: Anton Altaparmakov
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Sougata Santra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sergei Antonov
     
  • In case of a memory allocation error, the return value should be
    -ENOMEM instead of -ENOSPC.

    Signed-off-by: Chengyu Song
    Reviewed-by: Sergei Antonov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chengyu Song
     
  • Signed-off-by: Fabian Frederick
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • is_known_namespace() only returns true/false. Also remove inline and let
    the compiler decide what to do with static functions.

    Signed-off-by: Fabian Frederick
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • According to commit 5f16f3225b06 ("ext4: atomically set inode->i_flags in
    ext4_set_inode_flags()").

    Signed-off-by: Fabian Frederick
    Cc: "Theodore Ts'o"
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • security/trusted/user/osx setxattr did the same
    xattr_name initialization. Move that operation into hfsplus_setxattr().

    Tested with security/trusted/user getfattr/setfattr

    Signed-off-by: Fabian Frederick
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • security/trusted/user/osx getxattr did the same
    xattr_name initialization. Move that operation into hfsplus_getxattr().

    Tested with security/trusted/user getfattr/setfattr

    Signed-off-by: Fabian Frederick
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • This doesn't change how the code works, but clearly the curly braces were
    intended.

    Signed-off-by: Dan Carpenter
    Cc: Vyacheslav Dubeyko
    Cc: Sougata Santra
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Carpenter
     

16 Apr, 2015

1 commit


12 Apr, 2015

4 commits


09 Apr, 2015

1 commit


26 Mar, 2015

2 commits

  • struct kiocb now is a generic I/O container, so move it to fs.h.
    Also do a #include diet for aio.h while we're at it.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Fix B-tree corruption when a new record is inserted at position 0 in the
    node in hfs_brec_insert(). In this case a hfs_brec_update_parent() is
    called to update the parent index node (if it exists) and it is passed
    hfs_find_data with a search_key containing a newly inserted key instead
    of the key to be updated. This results in an inconsistent index node.
    The bug reproduces on my machine after an extents overflow record for
    the catalog file (CNID=4) is inserted into the extents overflow B-tree.
    Because of a low (reserved) value of CNID=4, it has to become the first
    record in the first leaf node.

    The resulting first leaf node is correct:

    ----------------------------------------------------
    | key0.CNID=4 | key1.CNID=123 | key2.CNID=456, ... |
    ----------------------------------------------------

    But the parent index key0 still contains the previous key CNID=123:

    -----------------------
    | key0.CNID=123 | ... |
    -----------------------

    A change in hfs_brec_insert() makes hfs_brec_update_parent() work
    correctly by preventing it from getting fd->record=-1 value from
    __hfs_brec_find().

    Along the way, I removed duplicate code with unification of the if
    condition. The resulting code is equivalent to the original code
    because node is never 0.

    Also hfs_brec_update_parent() will now return an error after getting a
    negative fd->record value. However, the return value of
    hfs_brec_update_parent() is not checked anywhere in the file and I'm
    leaving it unchanged by this patch. brec.c lacks error checking after
    some other calls too, but this issue is of less importance than the one
    being fixed by this patch.

    Signed-off-by: Sergei Antonov
    Cc: Joe Perches
    Reviewed-by: Vyacheslav Dubeyko
    Acked-by: Hin-Tak Leung
    Cc: Anton Altaparmakov
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sergei Antonov
     

23 Feb, 2015

1 commit

  • Convert the following where appropriate:

    (1) S_ISLNK(dentry->d_inode) to d_is_symlink(dentry).

    (2) S_ISREG(dentry->d_inode) to d_is_reg(dentry).

    (3) S_ISDIR(dentry->d_inode) to d_is_dir(dentry). This is actually more
    complicated than it appears as some calls should be converted to
    d_can_lookup() instead. The difference is whether the directory in
    question is a real dir with a ->lookup op or whether it's a fake dir with
    a ->d_automount op.

    In some circumstances, we can subsume checks for dentry->d_inode not being
    NULL into this, provided the code isn't in a filesystem that expects
    d_inode to be NULL if the dirent really *is* negative (ie. if we're going to
    use d_inode() rather than d_backing_inode() to get the inode pointer).

    Note that the dentry type field may be set to something other than
    DCACHE_MISS_TYPE when d_inode is NULL in the case of unionmount, where the VFS
    manages the fall-through from a negative dentry to a lower layer. In such a
    case, the dentry type of the negative union dentry is set to the same as the
    type of the lower dentry.

    However, if you know d_inode is not NULL at the call site, then you can use
    the d_is_xxx() functions even in a filesystem.

    There is one further complication: a 0,0 chardev dentry may be labelled
    DCACHE_WHITEOUT_TYPE rather than DCACHE_SPECIAL_TYPE. Strictly, this was
    intended for special directory entry types that don't have attached inodes.

    The following perl+coccinelle script was used:

    use strict;

    my @callers;
    open(my $fd, 'git grep -l \'S_IS[A-Z].*->d_inode\' |') ||
        die "Can't grep for S_ISDIR and co. callers";
    @callers = <$fd>;
    close($fd);
    unless (@callers) {
        print "No matches\n";
        exit(0);
    }

    my @cocci = (
        '@@',
        'expression E;',
        '@@',
        '',
        '- S_ISLNK(E->d_inode->i_mode)',
        '+ d_is_symlink(E)',
        '',
        '@@',
        'expression E;',
        '@@',
        '',
        '- S_ISDIR(E->d_inode->i_mode)',
        '+ d_is_dir(E)',
        '',
        '@@',
        'expression E;',
        '@@',
        '',
        '- S_ISREG(E->d_inode->i_mode)',
        '+ d_is_reg(E)' );

    my $coccifile = "tmp.sp.cocci";
    open($fd, ">$coccifile") || die $coccifile;
    print($fd "$_\n") || die $coccifile foreach (@cocci);
    close($fd);

    foreach my $file (@callers) {
        chomp $file;
        print "Processing ", $file, "\n";
        system("spatch", "--sp-file", $coccifile, $file, "--in-place", "--no-show-diff") == 0 ||
            die "spatch failed";
    }

    [AV: overlayfs parts skipped]

    Signed-off-by: David Howells
    Signed-off-by: Al Viro

    David Howells
     

19 Dec, 2014

1 commit

  • Long names are not handled correctly by the hfsplus driver. If an
    attempt is made to create a file/directory with a long name (>255), it
    succeeds by creating a file/directory with HFSPLUS_MAX_STRLEN and an
    incorrect catalog key, thus leaving the volume in an inconsistent
    state. This patch fixes this issue.

    Lookup is always called first to create a negative entry, so just
    doing a check in lookup would probably fix this issue. Nevertheless, I
    chose to propagate the error to other iops as well.

    Please NOTE: I have factored out hfsplus_cat_build_key_with_cnid from
    hfsplus_cat_build_key, to avoid unnecessary branching.

    Thanks a lot.

    TEST:
    ------
    dir="TEST_DIR"
    cdir=`pwd`
    name255="_123456789_123456789_123456789_123456789_123456789_123456789\
    _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
    _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
    _123456789_123456789_123456789_123456789_123456789_1234"
    name256="${name255}5"

    mkdir $dir
    cd $dir
    touch $name255
    rm -f $name255
    touch $name256
    ls -la
    cd $cdir
    rm -rf $dir

    RESULT:
    -------
    [sougata@ultrabook tmp]$ cdir=`pwd`
    [sougata@ultrabook tmp]$
    name255="_123456789_123456789_123456789_123456789_123456789_123456789\
    > _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
    > _123456789_123456789_123456789_123456789_123456789_123456789_123456789\
    > _123456789_123456789_123456789_123456789_123456789_1234"
    [sougata@ultrabook tmp]$ name256="${name255}5"
    [sougata@ultrabook tmp]$
    [sougata@ultrabook tmp]$ mkdir $dir
    [sougata@ultrabook tmp]$ cd $dir
    [sougata@ultrabook TEST_DIR]$ touch $name255
    [sougata@ultrabook TEST_DIR]$ rm -f $name255
    [sougata@ultrabook TEST_DIR]$ touch $name256
    [sougata@ultrabook TEST_DIR]$ ls -la
    ls: cannot access
    _123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234:
    No such file or directory
    total 0
    drwxrwxr-x 1 sougata sougata 3 Feb 20 19:56 .
    drwxrwxrwx 1 root root 6 Feb 20 19:56 ..
    -????????? ? ? ? ? ?
    _123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_123456789_1234
    [sougata@ultrabook TEST_DIR]$ cd $cdir
    [sougata@ultrabook tmp]$ rm -rf $dir
    rm: cannot remove `TEST_DIR': Directory not empty

    -ENAMETOOLONG returned from hfsplus_asc2uni was not propagated to iops.
    This allowed hfsplus to create files/directories with HFSPLUS_MAX_STRLEN
    and incorrect keys, leaving the FS in an inconsistent state. This patch
    fixes this issue.
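
The propagation described above can be illustrated with a small userspace sketch. It is not the hfsplus code: every `sketch_` name is hypothetical, the 255 limit is assumed from the name255/name256 test above, and the length check counts bytes, whereas the real conversion produces Unicode units. The point is only that each layer returns the error it receives instead of carrying on with a truncated key.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Assumed limit, taken from the name255/name256 test in the commit message. */
#define SKETCH_MAX_STRLEN 255

/* Hypothetical stand-in for a catalog key; not the on-disk layout. */
struct sketch_key {
	size_t length;
	char name[SKETCH_MAX_STRLEN];
};

/* Stand-in for the name conversion step: rejects names that are too long. */
static int sketch_asc2uni(struct sketch_key *key, const char *name)
{
	size_t len = strlen(name);

	if (len > SKETCH_MAX_STRLEN)
		return -ENAMETOOLONG;
	memcpy(key->name, name, len);
	key->length = len;
	return 0;
}

/* Key builder: passes the conversion error up instead of dropping it. */
static int sketch_build_key(struct sketch_key *key, const char *name)
{
	int err = sketch_asc2uni(key, name);

	if (err)
		return err;
	return 0;
}

/* Stand-in for an inode operation such as create: stops on a bad key. */
static int sketch_create(const char *name)
{
	struct sketch_key key;
	int err = sketch_build_key(&key, name);

	if (err)
		return err;	/* nothing is written with an incorrect key */
	/* A real implementation would insert the catalog record here. */
	return 0;
}
```

With this shape, a 255-character name is accepted by `sketch_create()` and a 256-character name is refused with `-ENAMETOOLONG` before any record could be created.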

    Signed-off-by: Sougata Santra
    Reviewed-by: Christoph Hellwig
    Cc: Vyacheslav Dubeyko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sougata Santra
     

13 Jun, 2014

1 commit

  • Pull vfs updates from Al Viro:
    "This is the bunch that sat in -next + lock_parent() fix. This is the
    minimal set; there's more pending stuff.

    In particular, I really hope to get acct.c fixes merged this cycle -
    we need that to deal sanely with delayed-mntput stuff. In the next
    pile, hopefully - that series is fairly short and localized
    (kernel/acct.c, fs/super.c and fs/namespace.c). In this pile: more
    iov_iter work. Most of prereqs for ->splice_write with sane locking
    order are there and Kent's dio rewrite would also fit nicely on top of
    this pile"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (70 commits)
    lock_parent: don't step on stale ->d_parent of all-but-freed one
    kill generic_file_splice_write()
    ceph: switch to iter_file_splice_write()
    shmem: switch to iter_file_splice_write()
    nfs: switch to iter_splice_write_file()
    fs/splice.c: remove unneeded exports
    ocfs2: switch to iter_file_splice_write()
    ->splice_write() via ->write_iter()
    bio_vec-backed iov_iter
    optimize copy_page_{to,from}_iter()
    bury generic_file_aio_{read,write}
    lustre: get rid of messing with iovecs
    ceph: switch to ->write_iter()
    ceph_sync_direct_write: stop poking into iov_iter guts
    ceph_sync_read: stop poking into iov_iter guts
    new helper: copy_page_from_iter()
    fuse: switch to ->write_iter()
    btrfs: switch to ->write_iter()
    ocfs2: switch to ->write_iter()
    xfs: switch to ->write_iter()
    ...

    Linus Torvalds
     

07 Jun, 2014

4 commits

  • Commit a99b7069aab8 ("hfsplus: Fix undefined __divdi3 in
    hfsplus_init_header_node()") introduced do_div() to xattr.c and the
    warning below too.

    As Geert remarked: "tmp" is "loff_t" which is "__kernel_loff_t", which
    is "long long", i.e. signed, while include/asm-generic/div64.h compares
    its type with "uint64_t". As inode sizes are positive, it should be
    safe to change the type of "tmp" to "u64".

    In file included from
    arch/powerpc/include/asm/div64.h:1:0,
    from include/linux/kernel.h:124,
    from include/asm-generic/bug.h:13,
    from arch/powerpc/include/asm/bug.h:127,
    from include/linux/bug.h:4,
    from include/linux/thread_info.h:11,
    from include/asm-generic/preempt.h:4,
    from arch/powerpc/include/generated/asm/preempt.h:1,
    from include/linux/preempt.h:18,
    from include/linux/spinlock.h:50,
    from include/linux/wait.h:8,
    from include/linux/fs.h:6,
    from fs/hfsplus/hfsplus_fs.h:19,
    from fs/hfsplus/xattr.c:9:
    fs/hfsplus/xattr.c: In function 'hfsplus_init_header_node':
    include/asm-generic/div64.h:43:28: warning: comparison of distinct pointer types lacks a cast [enabled by default]
    (void)(((typeof((n)) *)0) == ((uint64_t *)0)); \
    ^
    fs/hfsplus/xattr.c:86:2: note: in expansion of macro 'do_div'
    do_div(tmp, node_size);
    ^
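
The warning above comes from a compile-time type check inside do_div(). The following userspace sketch imitates that check; `sketch_do_div` and `sketch_split` are hypothetical names, and the macro relies on the GNU C `__typeof__` and statement-expression extensions. It is a simplified illustration under those assumptions, not the kernel macro.

```c
#include <stdint.h>

/*
 * Simplified imitation of the generic do_div(): divides n in place and
 * evaluates to the remainder. The pointer comparison exists only so the
 * compiler warns when n is not a uint64_t.
 */
#define sketch_do_div(n, base) ({				\
	uint32_t base_ = (base);				\
	uint32_t rem_;						\
	(void)(((__typeof__((n)) *)0) == ((uint64_t *)0));	\
	rem_ = (uint32_t)((n) % base_);				\
	(n) /= base_;						\
	rem_;							\
})

/*
 * Hypothetical helper: splits a size into whole nodes plus a remainder.
 * Declaring tmp as uint64_t keeps the type check quiet; declaring it as
 * a signed 64-bit type would trigger the "distinct pointer types" warning.
 */
static uint32_t sketch_split(uint64_t size, uint32_t node_size,
			     uint64_t *nodes)
{
	uint64_t tmp = size;
	uint32_t rem = sketch_do_div(tmp, node_size);

	*nodes = tmp;
	return rem;
}
```

For example, `sketch_split(10000, 4096, &nodes)` leaves 2 in `nodes` and returns a remainder of 1808.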

    Signed-off-by: Christian Kujau
    Signed-off-by: Geert Uytterhoeven
    Acked-by: Sergei Antonov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christian Kujau
     
  • Signed-off-by: Fabian Frederick
    Suggested-By: Vyacheslav Dubeyko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • Some function declarations in hfsplus_fs.h had argument names,
    some had none, and some were mixed. This patch adds argument names
    everywhere, sorts the functions in the order they appear in the .c files,
    and moves hfs_part_find() to a proper section.

    Auto-formatting and sorting was done with:
    cfunctions *.c | indent -linux | sed "s| \* | \*|"

    Signed-off-by: Sergei Antonov
    Cc: Vyacheslav Dubeyko
    Cc: Hin-Tak Leung
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sergei Antonov
     
  • Replace the while loop that shifts blocksize with ilog2()
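
As a rough illustration of why the two forms are interchangeable, the sketch below compares a shift-counting loop with a direct log2 computation. The loop shape is an assumption about what such code typically looks like, not a quote from the patch, and `sketch_ilog2` is a hypothetical userspace stand-in built on the GCC/Clang builtin `__builtin_clz`, since the kernel's ilog2() is not available outside the kernel.

```c
#include <stdint.h>

/* Counts how many times blocksize can be halved before reaching zero. */
static unsigned int sketch_shift_by_loop(uint32_t blocksize)
{
	unsigned int shift = 0;

	while ((blocksize >>= 1) != 0)
		shift++;
	return shift;
}

/*
 * Floor of log2(v) via count-leading-zeros. Like the loop above, it gives
 * the exact shift for a power-of-two block size. Undefined for v == 0.
 */
static unsigned int sketch_ilog2(uint32_t v)
{
	return 31u - (unsigned int)__builtin_clz(v);
}
```

Both functions return 9 for a block size of 512 and 12 for 4096.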

    Signed-off-by: Fabian Frederick
    Cc: Vyacheslav Dubeyko
    Cc: Joe Perches
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick