18 Apr, 2017

11 commits


02 Mar, 2017

1 commit


25 Dec, 2016

1 commit


23 Dec, 2016

1 commit

  • ... and fix the minor buglet in compat io_submit() - native one
    kills ioctx as cleanup when put_user() fails. Get rid of
    bogus compat_... in !CONFIG_AIO case, while we are at it - they
    should simply fail with ENOSYS, same as for native counterparts.

    Signed-off-by: Al Viro

    Al Viro
     

08 Dec, 2016

1 commit

  • put_compat_statfs64() does NOT return -1 and setting errno to EOVERFLOW
    when some variables(like: f_bsize) overflowed in the returned struct.

    The reason is that the ubuf->f_blocks is __u64 type, it couldn't be
    4bits as the judgement in put_comat_statfs64(). Here correct the
    __u32 variables(in struct compat_statfs64) for comparison.

    reproducer:
    step1. mount hugetlbfs with two different pagesize on ppc64 arch.

    $ hugeadm --pool-pages-max 16M:0
    $ hugeadm --create-mount
    $ mount | grep -i hugetlbfs
    none on /var/lib/hugetlbfs/pagesize-16MB type hugetlbfs (rw,relatime,seclabel,pagesize=16777216)
    none on /var/lib/hugetlbfs/pagesize-16GB type hugetlbfs (rw,relatime,seclabel,pagesize=17179869184)

    step2. compile & run this C program.

    $ cat statfs64_test.c

    #define _LARGEFILE64_SOURCE
    #include
    #include
    #include

    int main()
    {
    struct statfs64 sb;
    int err;

    err = syscall(SYS_statfs64, "/var/lib/hugetlbfs/pagesize-16GB", sizeof(sb), &sb);
    if (err)
    return -1;

    printf("sizeof f_bsize = %d, f_bsize=%ld\n", sizeof(sb.f_bsize), sb.f_bsize);

    return 0;
    }

    $ gcc -m32 statfs64_test.c
    $ ./a.out
    sizeof f_bsize = 4, f_bsize=0

    Signed-off-by: Li Wang
    Reviewed-by: Andreas Dilger
    Signed-off-by: Al Viro

    Li Wang
     

28 Sep, 2016

2 commits

  • After 7e8e385aaf6e ("x86/compat: Remove sys32_vm86_warning"), this
    function has become unused, so we can remove it as well.

    Link: http://lkml.kernel.org/r/20160617142903.3070388-1-arnd@arndb.de
    Signed-off-by: Arnd Bergmann
    Cc: Alexander Viro
    Cc: "Theodore Ts'o"
    Cc: Arnaldo Carvalho de Melo
    Signed-off-by: Andrew Morton

    Arnd Bergmann
     
  • nr_segs should never be less than zero as its type
    is unsigned long, so let's remove this check.

    Signed-off-by: Shawn Lin
    Signed-off-by: Al Viro

    Shawn Lin
     

25 May, 2016

1 commit

  • Pull ext4 updates from Ted Ts'o:
    "Fix a number of bugs, most notably a potential stale data exposure
    after a crash and a potential BUG_ON crash if a file has the data
    journalling flag enabled while it has dirty delayed allocation blocks
    that haven't been written yet. Also fix a potential crash in the new
    project quota code and a maliciously corrupted file system.

    In addition, fix some DAX-specific bugs, including when there is a
    transient ENOSPC situation and races between writes via direct I/O and
    an mmap'ed segment that could lead to lost I/O.

    Finally the usual set of miscellaneous cleanups"

    * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (23 commits)
    ext4: pre-zero allocated blocks for DAX IO
    ext4: refactor direct IO code
    ext4: fix race in transient ENOSPC detection
    ext4: handle transient ENOSPC properly for DAX
    dax: call get_blocks() with create == 1 for write faults to unwritten extents
    ext4: remove unmeetable inconsisteny check from ext4_find_extent()
    jbd2: remove excess descriptions for handle_s
    ext4: remove unnecessary bio get/put
    ext4: silence UBSAN in ext4_mb_init()
    ext4: address UBSAN warning in mb_find_order_for_block()
    ext4: fix oops on corrupted filesystem
    ext4: fix check of dqget() return value in ext4_ioctl_setproject()
    ext4: clean up error handling when orphan list is corrupted
    ext4: fix hang when processing corrupted orphaned inode list
    ext4: remove trailing \n from ext4_warning/ext4_error calls
    ext4: fix races between changing inode journal mode and ext4_writepages
    ext4: handle unwritten or delalloc buffers before enabling data journaling
    ext4: fix jbd2 handle extension in ext4_ext_truncate_extend_restart()
    ext4: do not ask jbd2 to write data for delalloc buffers
    jbd2: add support for avoiding data writes during transaction commits
    ...

    Linus Torvalds
     

03 May, 2016

1 commit


24 Apr, 2016

1 commit

  • If a directory has a large number of empty blocks, iterating over all
    of them can take a long time, leading to scheduler warnings and users
    getting irritated when they can't kill a process in the middle of one
    of these long-running readdir operations. Fix this by adding checks to
    ext4_readdir() and ext4_htree_fill_tree().

    This was reverted earlier due to a typo in the original commit where I
    experimented with using signal_pending() instead of
    fatal_signal_pending(). The test was in the wrong place if we were
    going to return signal_pending() since we would end up returning
    duplicant entries. See 9f2394c9be47 for a more detailed explanation.

    Added fix as suggested by Linus to check for signal_pending() in
    in the filldir() functions.

    Reported-by: Benjamin LaHaise
    Google-Bug-Id: 27880676
    Signed-off-by: Theodore Ts'o

    Theodore Ts'o
     

04 Jan, 2016

1 commit


01 Nov, 2014

1 commit


09 Oct, 2014

2 commits

  • It would make more sense to pass char __user * instead of
    char * in callers of do_mount() and do getname() inside do_mount().

    Suggested-by: Al Viro
    Signed-off-by: Seunghun Lee
    Signed-off-by: Al Viro

    Seunghun Lee
     
  • The gcc version 4.9.1 compiler complains Even though it isn't possible for
    these variables to not get initialized before they are used.

    fs/namespace.c: In function ‘SyS_mount’:
    fs/namespace.c:2720:8: warning: ‘kernel_dev’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    ret = do_mount(kernel_dev, kernel_dir->name, kernel_type, flags,
    ^
    fs/namespace.c:2699:8: note: ‘kernel_dev’ was declared here
    char *kernel_dev;
    ^
    fs/namespace.c:2720:8: warning: ‘kernel_type’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    ret = do_mount(kernel_dev, kernel_dir->name, kernel_type, flags,
    ^
    fs/namespace.c:2697:8: note: ‘kernel_type’ was declared here
    char *kernel_type;
    ^

    Fix the warnings by simplifying copy_mount_string() as suggested by Al Viro.

    Cc: Alexander Viro
    Signed-off-by: Tim Gardner
    Signed-off-by: Al Viro

    Tim Gardner
     

22 Apr, 2014

1 commit

  • File-private locks have been merged into Linux for v3.15, and *now*
    people are commenting that the name and macro definitions for the new
    file-private locks suck.

    ...and I can't even disagree. The names and command macros do suck.

    We're going to have to live with these for a long time, so it's
    important that we be happy with the names before we're stuck with them.
    The consensus on the lists so far is that they should be rechristened as
    "open file description locks".

    The name isn't a big deal for the kernel, but the command macros are not
    visually distinct enough from the traditional POSIX lock macros. The
    glibc and documentation folks are recommending that we change them to
    look like F_OFD_{GETLK|SETLK|SETLKW}. That lessens the chance that a
    programmer will typo one of the commands wrong, and also makes it easier
    to spot this difference when reading code.

    This patch makes the following changes that I think are necessary before
    v3.15 ships:

    1) rename the command macros to their new names. These end up in the uapi
    headers and so are part of the external-facing API. It turns out that
    glibc doesn't actually use the fcntl.h uapi header, but it's hard to
    be sure that something else won't. Changing it now is safest.

    2) make the the /proc/locks output display these as type "OFDLCK"

    Cc: Michael Kerrisk
    Cc: Christoph Hellwig
    Cc: Carlos O'Donell
    Cc: Stefan Metzmacher
    Cc: Andy Lutomirski
    Cc: Frank Filz
    Cc: Theodore Ts'o
    Signed-off-by: Jeff Layton

    Jeff Layton
     

05 Apr, 2014

1 commit

  • Pull file locking updates from Jeff Layton:
    "Highlights:

    - maintainership change for fs/locks.c. Willy's not interested in
    maintaining it these days, and is OK with Bruce and I taking it.
    - fix for open vs setlease race that Al ID'ed
    - cleanup and consolidation of file locking code
    - eliminate unneeded BUG() call
    - merge of file-private lock implementation"

    * 'locks-3.15' of git://git.samba.org/jlayton/linux:
    locks: make locks_mandatory_area check for file-private locks
    locks: fix locks_mandatory_locked to respect file-private locks
    locks: require that flock->l_pid be set to 0 for file-private locks
    locks: add new fcntl cmd values for handling file private locks
    locks: skip deadlock detection on FL_FILE_PVT locks
    locks: pass the cmd value to fcntl_getlk/getlk64
    locks: report l_pid as -1 for FL_FILE_PVT locks
    locks: make /proc/locks show IS_FILE_PVT locks as type "FLPVT"
    locks: rename locks_remove_flock to locks_remove_file
    locks: consolidate checks for compatible filp->f_mode values in setlk handlers
    locks: fix posix lock range overflow handling
    locks: eliminate BUG() call when there's an unexpected lock on file close
    locks: add __acquires and __releases annotations to locks_start and locks_stop
    locks: remove "inline" qualifier from fl_link manipulation functions
    locks: clean up comment typo
    locks: close potential race between setlease and open
    MAINTAINERS: update entry for fs/locks.c

    Linus Torvalds
     

03 Apr, 2014

1 commit

  • Pull compat time conversion changes from Peter Anvin:
    "Despite the branch name this is really neither an x86 nor an
    x32-specific patchset, although it the implementation of the
    discussions that followed the x32 security hole a few months ago.

    This removes get/put_compat_timespec/val() and replaces them with
    compat_get/put_timespec/val() which are savvy as to the current status
    of COMPAT_USE_64BIT_TIME.

    It removes several unused and/or incorrect/misleading functions (like
    compat_put_timeval_convert which doesn't in fact do any conversion)
    and also replaces several open-coded implementations what is now
    called compat_convert_timespec() with that function"

    * 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    compat: Fix sparse address space warnings
    compat: Get rid of (get|put)_compat_time(val|spec)

    Linus Torvalds
     

31 Mar, 2014

1 commit

  • Due to some unfortunate history, POSIX locks have very strange and
    unhelpful semantics. The thing that usually catches people by surprise
    is that they are dropped whenever the process closes any file descriptor
    associated with the inode.

    This is extremely problematic for people developing file servers that
    need to implement byte-range locks. Developers often need a "lock
    management" facility to ensure that file descriptors are not closed
    until all of the locks associated with the inode are finished.

    Additionally, "classic" POSIX locks are owned by the process. Locks
    taken between threads within the same process won't conflict with one
    another, which renders them useless for synchronization between threads.

    This patchset adds a new type of lock that attempts to address these
    issues. These locks conflict with classic POSIX read/write locks, but
    have semantics that are more like BSD locks with respect to inheritance
    and behavior on close.

    This is implemented primarily by changing how fl_owner field is set for
    these locks. Instead of having them owned by the files_struct of the
    process, they are instead owned by the filp on which they were acquired.
    Thus, they are inherited across fork() and are only released when the
    last reference to a filp is put.

    These new semantics prevent them from being merged with classic POSIX
    locks, even if they are acquired by the same process. These locks will
    also conflict with classic POSIX locks even if they are acquired by
    the same process or on the same file descriptor.

    The new locks are managed using a new set of cmd values to the fcntl()
    syscall. The initial implementation of this converts these values to
    "classic" cmd values at a fairly high level, and the details are not
    exposed to the underlying filesystem. We may eventually want to push
    this handing out to the lower filesystem code but for now I don't
    see any need for it.

    Also, note that with this implementation the new cmd values are only
    available via fcntl64() on 32-bit arches. There's little need to
    add support for legacy apps on a new interface like this.

    Signed-off-by: Jeff Layton

    Jeff Layton
     

06 Mar, 2014

2 commits

  • Some fs compat system calls have unsigned long parameters instead of
    compat_ulong_t.
    In order to allow the COMPAT_SYSCALL_DEFINE macro generate code that
    performs proper zero and sign extension convert all 64 bit parameters
    their corresponding 32 bit counterparts.

    compat_sys_io_getevents() is a bit different: the non-compat version
    has signed parameters for the "min_nr" and "nr" parameters while the
    compat version has unsigned parameters.
    So change this as well. For all practical purposes this shouldn't make
    any difference (doesn't fix a real bug).
    Also introduce a generic compat_aio_context_t type which can be used
    everywhere.
    The access_ok() check within compat_sys_io_getevents() got also removed
    since the non-compat sys_io_getevents() should be able to handle
    everything anyway.

    Signed-off-by: Heiko Carstens

    Heiko Carstens
     
  • Convert all compat system call functions where all parameter types
    have a size of four or less than four bytes, or are pointer types
    to COMPAT_SYSCALL_DEFINE.
    The implicit casts within COMPAT_SYSCALL_DEFINE will perform proper
    zero and sign extension to 64 bit of all parameters if needed.

    Signed-off-by: Heiko Carstens

    Heiko Carstens
     

04 Mar, 2014

1 commit

  • For architecture dependent compat syscalls in common code an architecture
    must define something like __ARCH_WANT_ if it wants to use the
    code.
    This however is not true for compat_sys_getdents64 for which architectures
    must define __ARCH_OMIT_COMPAT_SYS_GETDENTS64 if they do not want the code.

    This leads to the situation where all architectures, except mips, get the
    compat code but only x86_64, arm64 and the generic syscall architectures
    actually use it.

    So invert the logic, so that architectures actively must do something to
    get the compat code.

    This way a couple of architectures get rid of otherwise dead code.

    Signed-off-by: Heiko Carstens

    Heiko Carstens
     

03 Feb, 2014

1 commit

  • We have two APIs for compatiblity timespec/val, with confusingly
    similar names. compat_(get|put)_time(val|spec) *do* handle the case
    where COMPAT_USE_64BIT_TIME is set, whereas
    (get|put)_compat_time(val|spec) do not. This is an accident waiting
    to happen.

    Clean it up by favoring the full-service version; the limited version
    is replaced with double-underscore versions static to kernel/compat.c.

    A common pattern is to convert a struct timespec to kernel format in
    an allocation on the user stack. Unfortunately it is open-coded in
    several places. Since this allocation isn't actually needed if
    COMPAT_USE_64BIT_TIME is true (since user format == kernel format)
    encapsulate that whole pattern into the function
    compat_convert_timespec(). An equivalent function should be written
    for struct timeval if it is needed in the future.

    Finally, get rid of compat_(get|put)_timeval_convert(): each was only
    used once, and the latter was not even doing what the function said
    (no conversion actually was being done.) Moving the conversion into
    compat_sys_settimeofday() itself makes the code much more similar to
    sys_settimeofday() itself.

    v3: Remove unused compat_convert_timeval().

    v2: Drop bogus "const" in the destination argument for
    compat_convert_time*().

    Cc: Mauro Carvalho Chehab
    Cc: Alexander Viro
    Cc: Hans Verkuil
    Cc: Andrew Morton
    Cc: Heiko Carstens
    Cc: Manfred Spraul
    Cc: Mateusz Guzik
    Cc: Rafael Aquini
    Cc: Davidlohr Bueso
    Cc: Stephen Rothwell
    Cc: Dan Carpenter
    Cc: Arnd Bergmann
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Linus Torvalds
    Cc: Catalin Marinas
    Cc: Will Deacon
    Tested-by: H.J. Lu
    Signed-off-by: H. Peter Anvin

    H. Peter Anvin
     

29 Jun, 2013

3 commits

  • Signed-off-by: Al Viro

    Al Viro
     
  • New method - ->iterate(file, ctx). That's the replacement for ->readdir();
    it takes callback from ctx->actor, uses ctx->pos instead of file->f_pos and
    calls dir_emit(ctx, ...) instead of filldir(data, ...). It does *not*
    update file->f_pos (or look at it, for that matter); iterate_dir() does the
    update.

    Note that dir_emit() takes the offset from ctx->pos (and eventually
    filldir_t will lose that argument).

    Signed-off-by: Al Viro

    Al Viro
     
  • iterate_dir(): new helper, replacing vfs_readdir().

    struct dir_context: contains the readdir callback (and will get more stuff
    in it), embedded into whatever data that callback wants to deal with;
    eventually, we'll be passing it to ->readdir() replacement instead of
    (data,filldir) pair.

    Signed-off-by: Al Viro

    Al Viro
     

08 May, 2013

1 commit

  • Faster kernel compiles by way of fewer unnecessary includes.

    [akpm@linux-foundation.org: fix fallout]
    [akpm@linux-foundation.org: fix build]
    Signed-off-by: Kent Overstreet
    Cc: Zach Brown
    Cc: Felipe Balbi
    Cc: Greg Kroah-Hartman
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Rusty Russell
    Cc: Jens Axboe
    Cc: Asai Thambi S P
    Cc: Selvan Mani
    Cc: Sam Bradshaw
    Cc: Jeff Moyer
    Cc: Al Viro
    Cc: Benjamin LaHaise
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kent Overstreet
     

05 May, 2013

1 commit


02 May, 2013

1 commit

  • Pull VFS updates from Al Viro,

    Misc cleanups all over the place, mainly wrt /proc interfaces (switch
    create_proc_entry to proc_create(), get rid of the deprecated
    create_proc_read_entry() in favor of using proc_create_data() and
    seq_file etc).

    7kloc removed.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
    don't bother with deferred freeing of fdtables
    proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
    proc: Make the PROC_I() and PDE() macros internal to procfs
    proc: Supply a function to remove a proc entry by PDE
    take cgroup_open() and cpuset_open() to fs/proc/base.c
    ppc: Clean up scanlog
    ppc: Clean up rtas_flash driver somewhat
    hostap: proc: Use remove_proc_subtree()
    drm: proc: Use remove_proc_subtree()
    drm: proc: Use minor->index to label things, not PDE->name
    drm: Constify drm_proc_list[]
    zoran: Don't print proc_dir_entry data in debug
    reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
    proc: Supply an accessor for getting the data from a PDE's parent
    airo: Use remove_proc_subtree()
    rtl8192u: Don't need to save device proc dir PDE
    rtl8187se: Use a dir under /proc/net/r8180/
    proc: Add proc_mkdir_data()
    proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
    proc: Move PDE_NET() to fs/proc/proc_net.c
    ...

    Linus Torvalds
     

01 May, 2013

1 commit

  • Pull compat cleanup from Al Viro:
    "Mostly about syscall wrappers this time; there will be another pile
    with patches in the same general area from various people, but I'd
    rather push those after both that and vfs.git pile are in."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
    syscalls.h: slightly reduce the jungles of macros
    get rid of union semop in sys_semctl(2) arguments
    make do_mremap() static
    sparc: no need to sign-extend in sync_file_range() wrapper
    ppc compat wrappers for add_key(2) and request_key(2) are pointless
    x86: trim sys_ia32.h
    x86: sys32_kill and sys32_mprotect are pointless
    get rid of compat_sys_semctl() and friends in case of ARCH_WANT_OLD_COMPAT_IPC
    merge compat sys_ipc instances
    consolidate compat lookup_dcookie()
    convert vmsplice to COMPAT_SYSCALL_DEFINE
    switch getrusage() to COMPAT_SYSCALL_DEFINE
    switch epoll_pwait to COMPAT_SYSCALL_DEFINE
    convert sendfile{,64} to COMPAT_SYSCALL_DEFINE
    switch signalfd{,4}() to COMPAT_SYSCALL_DEFINE
    make SYSCALL_DEFINE-generated wrappers do asmlinkage_protect
    make HAVE_SYSCALL_WRAPPERS unconditional
    consolidate cond_syscall and SYSCALL_ALIAS declarations
    teach SYSCALL_DEFINE how to deal with long long/unsigned long long
    get rid of duplicate logics in __SC_....[1-6] definitions

    Linus Torvalds
     

10 Apr, 2013

1 commit