06 Jan, 2021

1 commit

  • commit dc889b8d4a8122549feabe99eead04e6b23b6513 upstream.

    Make the printk() [bfs "printf" macro] seem less severe by changing
    "WARNING:" to "NOTE:".

    warns us about using WARNING or BUG in a format string
    other than in WARN() or BUG() family macros. bfs/inode.c is doing just
    that in a normal printk() call, so change the "WARNING" string to be
    "NOTE".

    Link: https://lkml.kernel.org/r/20201203212634.17278-1-rdunlap@infradead.org
    Reported-by: syzbot+3fd34060f26e766536ff@syzkaller.appspotmail.com
    Signed-off-by: Randy Dunlap
    Cc: Dmitry Vyukov
    Cc: Al Viro
    Cc: "Tigran A. Aivazian"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Randy Dunlap
     

19 Sep, 2020

1 commit


30 Aug, 2019

1 commit

  • Fill in the appropriate limits to avoid inconsistencies
    in the vfs cached inode times when timestamps are
    outside the permitted range.

    Even though some filesystems are read-only, fill in the
    timestamps to reflect the on-disk representation.

    Signed-off-by: Deepa Dinamani
    Reviewed-by: Darrick J. Wong
    Acked-By: Tigran Aivazian
    Acked-by: Jeff Layton
    Cc: aivazian.tigran@gmail.com
    Cc: al@alarsen.net
    Cc: coda@cs.cmu.edu
    Cc: darrick.wong@oracle.com
    Cc: dushistov@mail.ru
    Cc: dwmw2@infradead.org
    Cc: hch@infradead.org
    Cc: jack@suse.com
    Cc: jaharkes@cs.cmu.edu
    Cc: luisbg@kernel.org
    Cc: nico@fluxnic.net
    Cc: phillip@squashfs.org.uk
    Cc: richard@nod.at
    Cc: salah.triki@gmail.com
    Cc: shaggy@kernel.org
    Cc: linux-xfs@vger.kernel.org
    Cc: codalist@coda.cs.cmu.edu
    Cc: linux-ext4@vger.kernel.org
    Cc: linux-mtd@lists.infradead.org
    Cc: jfs-discussion@lists.sourceforge.net
    Cc: reiserfs-devel@vger.kernel.org

    Deepa Dinamani
     

21 May, 2019

1 commit

  • Add SPDX license identifiers to all files which:

    - Have no license information of any form

    - Have MODULE_LICENCE("GPL*") inside which was used in the initial
    scan/conversion to ignore the file

    These files fall under the project license, GPL v2 only. The resulting SPDX
    license identifier is:

    GPL-2.0-only

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

02 May, 2019

1 commit


05 Jan, 2019

1 commit

  • Strengthen validation of BFS superblock against corruption. Make
    in-core inode bitmap static part of superblock info structure. Print a
    warning when mounting a BFS filesystem created with "-N 512" option as
    only 510 files can be created in the root directory. Make the kernel
    messages more uniform. Update the 'prefix' passed to bfs_dump_imap() to
    match the current naming of operations. White space and comments
    cleanup.

    Link: http://lkml.kernel.org/r/CAK+_RLkFZMduoQF36wZFd3zLi-6ZutWKsydjeHFNdtRvZZEb4w@mail.gmail.com
    Signed-off-by: Tigran Aivazian
    Reported-by: Tetsuo Handa
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tigran Aivazian
     

04 Nov, 2018

1 commit

  • syzbot is reporting too large memory allocation at bfs_fill_super() [1].
    Since file system image is corrupted such that bfs_sb->s_start == 0,
    bfs_fill_super() is trying to allocate 8MB of continuous memory. Fix
    this by adding a sanity check on bfs_sb->s_start, __GFP_NOWARN and
    printf().

    [1] https://syzkaller.appspot.com/bug?id=16a87c236b951351374a84c8a32f40edbc034e96

    Link: http://lkml.kernel.org/r/1525862104-3407-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
    Signed-off-by: Tetsuo Handa
    Reported-by: syzbot
    Reviewed-by: Andrew Morton
    Cc: Tigran Aivazian
    Cc: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     

13 Jul, 2017

1 commit

  • Mount fails if file system image has empty files because of sanity check
    while reading superblock. For empty files disk offset to end of file
    (i_eoffset) is cpu_to_le32(-1). Sanity check comparison, which compares
    disk offset with file system size isn't valid for this value and hence
    is ignored with this patch.

    Steps to reproduce:

    $ dd if=/dev/zero of=bfs-image count=204800
    $ mkfs.bfs bfs-image
    $ mkdir bfs-mount-point
    $ sudo mount -t bfs -o loop bfs-image bfs-mount-point/
    $ cd bfs-mount-point/
    $ sudo touch a
    $ cd ..
    $ sudo umount bfs-mount-point/
    $ sudo mount -t bfs -o loop bfs-image bfs-mount-point/
    mount: /dev/loop0: can't read superblock

    $ dmesg
    [25526.689580] BFS-fs: bfs_fill_super(): Inode 0x00000003 corrupted

    Tigran said:
    "If you had created the filesystem with the proper mkfs under SCO
    UnixWare 7 you (probably) wouldn't encounter this issue. But since
    commercial Unix-es are now part of history and the only proper way is
    the Linux mkfs.bfs utility, your patch is fine"

    Link: http://lkml.kernel.org/r/20170505201625.GA3097@hercules.tuxera.com
    Signed-off-by: Rakesh Pandit
    Acked-by: Tigran Aivazian
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rakesh Pandit
     

13 May, 2017

1 commit


25 Dec, 2016

1 commit


15 Jan, 2016

1 commit

  • Mark those kmem allocations that are known to be easily triggered from
    userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
    memcg. For the list, see below:

    - threadinfo
    - task_struct
    - task_delay_info
    - pid
    - cred
    - mm_struct
    - vm_area_struct and vm_region (nommu)
    - anon_vma and anon_vma_chain
    - signal_struct
    - sighand_struct
    - fs_struct
    - files_struct
    - fdtable and fdtable->full_fds_bits
    - dentry and external_name
    - inode for all filesystems. This is the most tedious part, because
    most filesystems overwrite the alloc_inode method.

    The list is far from complete, so feel free to add more objects.
    Nevertheless, it should be close to "account everything" approach and
    keep most workloads within bounds. Malevolent users will be able to
    breach the limit, but this was possible even with the former "account
    everything" approach (simply because it did not account everything in
    fact).

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Vladimir Davydov
    Acked-by: Johannes Weiner
    Acked-by: Michal Hocko
    Cc: Tejun Heo
    Cc: Greg Thelen
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     

26 Mar, 2015

1 commit


09 Aug, 2014

1 commit

  • All bfs related functions use bfs_ prefix. This patch also moves extern
    declaration to bfs.h and removes prototype from inode.c

    This fixes checkpatch warning:

    WARNING: externs should be avoided in .c files

    Signed-off-by: Fabian Frederick
    Cc: "Tigran A. Aivazian"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     

08 Apr, 2014

1 commit


04 Apr, 2014

1 commit

  • Reclaim will be leaving shadow entries in the page cache radix tree upon
    evicting the real page. As those pages are found from the LRU, an
    iput() can lead to the inode being freed concurrently. At this point,
    reclaim must no longer install shadow pages because the inode freeing
    code needs to ensure the page tree is really empty.

    Add an address_space flag, AS_EXITING, that the inode freeing code sets
    under the tree lock before doing the final truncate. Reclaim will check
    for this flag before installing shadow pages.

    Signed-off-by: Johannes Weiner
    Reviewed-by: Rik van Riel
    Reviewed-by: Minchan Kim
    Cc: Andrea Arcangeli
    Cc: Bob Liu
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Greg Thelen
    Cc: Hugh Dickins
    Cc: Jan Kara
    Cc: KOSAKI Motohiro
    Cc: Luigi Semenzato
    Cc: Mel Gorman
    Cc: Metin Doslu
    Cc: Michel Lespinasse
    Cc: Ozgun Erdogan
    Cc: Peter Zijlstra
    Cc: Roman Gushchin
    Cc: Ryan Mallon
    Cc: Tejun Heo
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

25 Aug, 2013

1 commit


04 Mar, 2013

1 commit

  • Modify the request_module to prefix the file system type with "fs-"
    and add aliases to all of the filesystems that can be built as modules
    to match.

    A common practice is to build all of the kernel code and leave code
    that is not commonly needed as modules, with the result that many
    users are exposed to any bug anywhere in the kernel.

    Looking for filesystems with a fs- prefix limits the pool of possible
    modules that can be loaded by mount to just filesystems trivially
    making things safer with no real cost.

    Using aliases means user space can control the policy of which
    filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
    with blacklist and alias directives. Allowing simple, safe,
    well understood work-arounds to known problematic software.

    This also addresses a rare but unfortunate problem where the filesystem
    name is not the same as it's module name and module auto-loading
    would not work. While writing this patch I saw a handful of such
    cases. The most significant being autofs that lives in the module
    autofs4.

    This is relevant to user namespaces because we can reach the request
    module in get_fs_type() without having any special permissions, and
    people get uncomfortable when a user specified string (in this case
    the filesystem type) goes all of the way to request_module.

    After having looked at this issue I don't think there is any
    particular reason to perform any filtering or permission checks beyond
    making it clear in the module request that we want a filesystem
    module. The common pattern in the kernel is to call request_module()
    without regards to the users permissions. In general all a filesystem
    module does once loaded is call register_filesystem() and go to sleep.
    Which means there is not much attack surface exposed by loading a
    filesytem module unless the filesystem is mounted. In a user
    namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
    which most filesystems do not set today.

    Acked-by: Serge Hallyn
    Acked-by: Kees Cook
    Reported-by: Kees Cook
    Signed-off-by: "Eric W. Biederman"

    Eric W. Biederman
     

03 Oct, 2012

2 commits

  • Pull vfs update from Al Viro:

    - big one - consolidation of descriptor-related logics; almost all of
    that is moved to fs/file.c

    (BTW, I'm seriously tempted to rename the result to fd.c. As it is,
    we have a situation when file_table.c is about handling of struct
    file and file.c is about handling of descriptor tables; the reasons
    are historical - file_table.c used to be about a static array of
    struct file we used to have way back).

    A lot of stray ends got cleaned up and converted to saner primitives,
    disgusting mess in android/binder.c is still disgusting, but at least
    doesn't poke so much in descriptor table guts anymore. A bunch of
    relatively minor races got fixed in process, plus an ext4 struct file
    leak.

    - related thing - fget_light() partially unuglified; see fdget() in
    there (and yes, it generates the code as good as we used to have).

    - also related - bits of Cyrill's procfs stuff that got entangled into
    that work; _not_ all of it, just the initial move to fs/proc/fd.c and
    switch of fdinfo to seq_file.

    - Alex's fs/coredump.c spiltoff - the same story, had been easier to
    take that commit than mess with conflicts. The rest is a separate
    pile, this was just a mechanical code movement.

    - a few misc patches all over the place. Not all for this cycle,
    there'll be more (and quite a few currently sit in akpm's tree)."

    Fix up trivial conflicts in the android binder driver, and some fairly
    simple conflicts due to two different changes to the sock_alloc_file()
    interface ("take descriptor handling from sock_alloc_file() to callers"
    vs "net: Providing protocol type via system.sockprotoname xattr of
    /proc/PID/fd entries" adding a dentry name to the socket)

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
    MAX_LFS_FILESIZE should be a loff_t
    compat: fs: Generic compat_sys_sendfile implementation
    fs: push rcu_barrier() from deactivate_locked_super() to filesystems
    btrfs: reada_extent doesn't need kref for refcount
    coredump: move core dump functionality into its own file
    coredump: prevent double-free on an error path in core dumper
    usb/gadget: fix misannotations
    fcntl: fix misannotations
    ceph: don't abuse d_delete() on failure exits
    hypfs: ->d_parent is never NULL or negative
    vfs: delete surplus inode NULL check
    switch simple cases of fget_light to fdget
    new helpers: fdget()/fdput()
    switch o2hb_region_dev_write() to fget_light()
    proc_map_files_readdir(): don't bother with grabbing files
    make get_file() return its argument
    vhost_set_vring(): turn pollstart/pollstop into bool
    switch prctl_set_mm_exe_file() to fget_light()
    switch xfs_find_handle() to fget_light()
    switch xfs_swapext() to fget_light()
    ...

    Linus Torvalds
     
  • There's no reason to call rcu_barrier() on every
    deactivate_locked_super(). We only need to make sure that all delayed rcu
    free inodes are flushed before we destroy related cache.

    Removing rcu_barrier() from deactivate_locked_super() affects some fast
    paths. E.g. on my machine exit_group() of a last process in IPC
    namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that time.

    Signed-off-by: Kirill A. Shutemov
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    Kirill A. Shutemov
     

21 Sep, 2012

1 commit


06 May, 2012

1 commit

  • After we moved inode_sync_wait() from end_writeback() it doesn't make sense
    to call the function end_writeback() anymore. Rename it to clear_inode()
    which well says what the function really does - set I_CLEAR flag.

    Signed-off-by: Jan Kara
    Signed-off-by: Fengguang Wu

    Jan Kara
     

21 Mar, 2012

1 commit


04 Jan, 2012

1 commit

  • Seeing that just about every destructor got that INIT_LIST_HEAD() copied into
    it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once();
    the cost of taking it into inode_init_always() will be negligible for pipes
    and sockets and negative for everything else. Not to mention the removal of
    boilerplate code from ->destroy_inode() instances...

    Signed-off-by: Al Viro

    Al Viro
     

02 Nov, 2011

1 commit


07 Jan, 2011

1 commit

  • RCU free the struct inode. This will allow:

    - Subsequent store-free path walking patch. The inode must be consulted for
    permissions when walking, so an RCU inode reference is a must.
    - sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
    to take i_lock no longer need to take sb_inode_list_lock to walk the list in
    the first place. This will simplify and optimize locking.
    - Could remove some nested trylock loops in dcache code
    - Could potentially simplify things a bit in VM land. Do not need to take the
    page lock to follow page->mapping.

    The downsides of this is the performance cost of using RCU. In a simple
    creat/unlink microbenchmark, performance drops by about 10% due to inability to
    reuse cache-hot slab objects. As iterations increase and RCU freeing starts
    kicking over, this increases to about 20%.

    In cases where inode lifetimes are longer (ie. many inodes may be allocated
    during the average life span of a single inode), a lot of this cache reuse is
    not applicable, so the regression caused by this patch is smaller.

    The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
    however this adds some complexity to list walking and store-free path walking,
    so I prefer to implement this at a later date, if it is shown to be a win in
    real situations. I haven't found a regression in any non-micro benchmark so I
    doubt it will be a problem.

    Signed-off-by: Nick Piggin

    Nick Piggin
     

29 Oct, 2010

1 commit


05 Oct, 2010

2 commits

  • The BKL is only used in put_super and fill_super that are both protected by
    the superblocks s_umount rw_semaphore. Therefore it is safe to remove the BKL
    entirely.

    Signed-off-by: Jan Blunck
    Signed-off-by: Arnd Bergmann

    Jan Blunck
     
  • This patch is a preparation necessary to remove the BKL from do_new_mount().
    It explicitly adds calls to lock_kernel()/unlock_kernel() around
    get_sb/fill_super operations for filesystems that still uses the BKL.

    I've read through all the code formerly covered by the BKL inside
    do_kern_mount() and have satisfied myself that it doesn't need the BKL
    any more.

    do_kern_mount() is already called without the BKL when mounting the rootfs
    and in nfsctl. do_kern_mount() calls vfs_kern_mount(), which is called
    from various places without BKL: simple_pin_fs(), nfs_do_clone_mount()
    through nfs_follow_mountpoint(), afs_mntpt_do_automount() through
    afs_mntpt_follow_link(). Both later functions are actually the filesystems
    follow_link inode operation. vfs_kern_mount() is calling the specified
    get_sb function and lets the filesystem do its job by calling the given
    fill_super function.

    Therefore I think it is safe to push down the BKL from the VFS to the
    low-level filesystems get_sb/fill_super operation.

    [arnd: do not add the BKL to those file systems that already
    don't use it elsewhere]

    Signed-off-by: Jan Blunck
    Signed-off-by: Arnd Bergmann
    Cc: Matthew Wilcox
    Cc: Christoph Hellwig

    Jan Blunck
     

10 Aug, 2010

2 commits

  • BFS is a very simple FS and its superblocks contains only static
    information and is never changed. However, the BFS code for some
    misterious reasons marked its buffer head as dirty from time to
    time, but nothing in that buffer was ever changed.

    This patch removes all the BFS superblock manipulation, simply
    because it is not needed. It removes:

    1. The si_sbh filed from 'struct bfs_sb_info' because it is not
    needed. We only need to read the SB once on mount to get the
    start of data blocks and the FS size. After this, we can forget
    about the SB.
    2. All instances of 'mark_buffer_dirty(sbh)' for BFS SB because
    it is never changed.
    3. The '->sync_fs()' method because there is nothing to sync
    (inodes are synched by VFS).
    4. The '->write_super()' method, again, because the SB is never
    changed.

    Tested-by: Artem Bityutskiy
    Signed-off-by: Artem Bityutskiy
    Signed-off-by: Al Viro

    Artem Bityutskiy
     
  • Signed-off-by: Al Viro

    Al Viro
     

06 Mar, 2010

1 commit

  • This gives the filesystem more information about the writeback that
    is happening. Trond requested this for the NFS unstable write handling,
    and other filesystems might benefit from this too by beeing able to
    distinguish between the different callers in more detail.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

27 Jan, 2010

1 commit


12 Jun, 2009

4 commits

  • Add a ->sync_fs method for data integrity syncs, and reimplement
    ->write_super ontop of it.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • Signed-off-by: Al Viro

    Al Viro
     
  • Move BKL into ->put_super from the only caller. A couple of
    filesystems had trivial enough ->put_super (only kfree and NULLing of
    s_fs_info + stuff in there) to not get any locking: coda, cramfs, efs,
    hugetlbfs, omfs, qnx4, shmem, all others got the full treatment. Most
    of them probably don't need it, but I'd rather sort that out individually.
    Preferably after all the other BKL pushdowns in that area.

    [AV: original used to move lock_super() down as well; these changes are
    removed since we don't do lock_super() at all in generic_shutdown_super()
    now]
    [AV: fuse, btrfs and xfs are known to need no damn BKL, exempt]

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     
  • We just did a full fs writeout using sync_filesystem before, and if
    that's not enough for the filesystem it can perform it's own writeout
    in ->put_super, which many filesystems already do.

    Move a call to foofs_write_super into every foofs_put_super for now to
    guarantee identical behaviour until it's cleaned up by the individual
    filesystem maintainers.

    Exceptions:

    - affs already has identical copy & pasted code at the beginning of
    affs_put_super so no need to do it twice.
    - xfs does the right thing without it and I have changes pending for
    the xfs tree touching this are so I don't really need conflicts
    here..

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

07 Jan, 2009

2 commits

  • Since all sanity checks rely on the validity of s_start which gets only
    checked to be smaller than s_end, we should also check if s_end is sane.
    Now we also try to retrieve the last block of the filesystem, which is
    computed by s_end. If this fails, something is bogus.

    Signed-off-by: Eric Sesterhenn
    Acked-by: Tigran Aivazian

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sesterhenn
     
  • bfs_fill_super() already touches all inodes, so we can easily add some
    cheap sanity checks and check if the inode start and end blocks are
    smaller than the maximum number of blocks, the inode start block lies
    behind the end block or the file end offset is behind the end of the
    filesystem. Also check if the start of data offset in the super block
    fits the filesystem.

    The added sanity checks catch softlockup issues early when we try to
    sb_bread() lots of blocks in a loop in bfs_readdir() and bfs_find_entry().
    In addition an oom issue in bfs_fill_super() is prevented by this when
    s_start is corrupted, which influences imap_len and we try to allocate a
    huge info->si_imap.

    Signed-off-by: Eric Sesterhenn
    Acked-by: Tigran Aivazian

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sesterhenn
     

27 Jul, 2008

2 commits

  • Kmem cache passed to constructor is only needed for constructors that are
    themselves multiplexeres. Nobody uses this "feature", nor does anybody uses
    passed kmem cache in non-trivial way, so pass only pointer to object.

    Non-trivial places are:
    arch/powerpc/mm/init_64.c
    arch/powerpc/mm/hugetlbpage.c

    This is flag day, yes.

    Signed-off-by: Alexey Dobriyan
    Acked-by: Pekka Enberg
    Acked-by: Christoph Lameter
    Cc: Jon Tollefson
    Cc: Nick Piggin
    Cc: Matt Mackall
    [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
    [akpm@linux-foundation.org: fix mm/slab.c]
    [akpm@linux-foundation.org: fix ubifs]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Replace the BKL-based locking scheme used in the bfs driver by a private
    filesystem-wide mutex.

    Signed-off-by: Dmitri Vorobiev
    Cc: Tigran Aivazian
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitri Vorobiev