20 Aug, 2011

1 commit


15 Mar, 2011

1 commit

  • New flag for open(2) - O_PATH. Semantics:
      * pathname is resolved, but the file itself is _NOT_ opened
        as far as the filesystem is concerned.
      * almost all operations on the resulting descriptors shall
        fail with -EBADF. Exceptions are:
          1) operations on descriptors themselves (i.e.
             close(), dup(), dup2(), dup3(), fcntl(fd, F_DUPFD),
             fcntl(fd, F_DUPFD_CLOEXEC, ...), fcntl(fd, F_GETFD),
             fcntl(fd, F_SETFD, ...))
          2) fcntl(fd, F_GETFL), for a common non-destructive way to
             check if a descriptor is open
          3) "dfd" arguments of ...at(2) syscalls, i.e. the starting
             points of pathname resolution
      * closing such a descriptor does *NOT* affect dnotify or
        posix locks.
      * permissions are checked as usual along the way to the file;
        no permission checks are applied to the file itself. Of course,
        passing such a descriptor to a syscall will result in permission
        checks (at the moment that means checking that the starting point
        of ...at() is a directory and the caller has exec permission on it).

    fget() and fget_light() return NULL on such descriptors; use of
    fget_raw() and fget_raw_light() is needed to get them. That protects
    existing code from dealing with those things.

    There are two things still missing (they come in the next commits):
    one is handling of symlinks (right now we refuse to open them that
    way; see the next commit for semantics related to those) and another
    is descriptor passing via SCM_RIGHTS datagrams.
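
As a hedged illustration of the semantics listed above, the following sketch opens a directory with O_PATH and records which operations still work. It is not part of the commit: `probe_o_path` and `struct o_path_report` are hypothetical names, and the fallback value for O_PATH is the asm-generic one, which does not hold on every architecture.

```c
#define _GNU_SOURCE             /* glibc exposes O_PATH only with this */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#ifndef O_PATH
# define O_PATH 010000000       /* asm-generic value; an assumption elsewhere */
#endif

/* Hypothetical helper type: what happened on an O_PATH descriptor. */
struct o_path_report {
    int read_errno;   /* errno left by read(); EBADF is expected */
    int getfl_ok;     /* 1 if fcntl(fd, F_GETFL) succeeded */
    int dfd_ok;       /* 1 if the descriptor worked as an openat() dfd */
};

/* Returns 0 and fills *r, or -1 if the O_PATH open itself failed. */
int probe_o_path(const char *dir, struct o_path_report *r)
{
    char byte;
    int fd = open(dir, O_PATH | O_DIRECTORY);
    if (fd < 0)
        return -1;

    /* Ordinary I/O is refused on an O_PATH descriptor. */
    r->read_errno = (read(fd, &byte, 1) < 0) ? errno : 0;

    /* Exception 2: F_GETFL still answers. */
    r->getfl_ok = (fcntl(fd, F_GETFL) != -1);

    /* Exception 3: use as a starting point; re-open "." relative to it. */
    int real = openat(fd, ".", O_RDONLY | O_DIRECTORY);
    r->dfd_ok = (real >= 0);
    if (real >= 0)
        close(real);

    close(fd);
    return 0;
}
```

Calling `probe_o_path("/", &r)` on a kernel with O_PATH support should leave `r.read_errno` equal to EBADF while the other two fields are 1.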

    Signed-off-by: Al Viro

    Al Viro
     

10 Oct, 2010

1 commit

  • All 'pid_t' were changed to '__kernel_pid_t' in a previous commit:
    make exported headers use strict posix types

    A number of standard posix types are used in exported headers,
    which is not allowed if __STRICT_KERNEL_NAMES is defined. In order
    to get rid of the non-__STRICT_KERNEL_NAMES part and to make sane
    headers the default, we have to change them all to safe types.

    but a later change introduced 'pid_t' again:
    fcntl: add F_[SG]ETOWN_EX

    This makes asm-generic/fcntl.h use strict posix types again.

    Signed-off-by: Lucian Adrian Grijincu
    Signed-off-by: Andrew Morton
    Signed-off-by: Arnd Bergmann

    Lucian Adrian Grijincu
     

11 Aug, 2010

1 commit

  • The O_* bit numbers are defined in 20+ arch/*, and can silently overlap.
    Add a compile time check to ensure the uniqueness as suggested by David
    Miller.
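
One way to express such a check, sketched here in userspace C11 rather than copied from the kernel: OR the flags together and compare the population count with the number of flags. If any two flags share a bit, the count comes up short. The `HW*` macros, `CHECKED_FLAGS` and `checked_flag_weight` are illustrative names, and only a handful of single-bit flags are covered.

```c
#include <fcntl.h>

/* Constant-expression population count, built up in halves. */
#define HW2(x)  (((x) & 1u) + (((x) >> 1) & 1u))
#define HW4(x)  (HW2(x)  + HW2((x) >> 2))
#define HW8(x)  (HW4(x)  + HW4((x) >> 4))
#define HW16(x) (HW8(x)  + HW8((x) >> 8))
#define HW32(x) (HW16(x) + HW16((x) >> 16))

/* Seven flags that are expected to occupy one bit each. */
#define CHECKED_FLAGS ((unsigned)(O_WRONLY | O_RDWR | O_CREAT | O_EXCL | \
                                  O_NOCTTY | O_TRUNC | O_APPEND))
#define CHECKED_COUNT 7

/* Fails the build if two of the flags above overlap. */
_Static_assert(HW32(CHECKED_FLAGS) == CHECKED_COUNT,
               "O_* flag bits overlap");

/* Runtime view of the same number, handy for experiments. */
unsigned checked_flag_weight(void)
{
    return HW32(CHECKED_FLAGS);
}
```

Multi-bit flags such as O_SYNC are deliberately left out of the list, since they would need to be counted by their individual bits.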

    Signed-off-by: Wu Fengguang
    Cc: David Miller
    Cc: Stephen Rothwell
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Eric Paris
    Cc: Roland Dreier
    Cc: Jamie Lokier
    Cc: Andreas Schwab
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     

28 Jul, 2010

2 commits

  • sparc used the same value as FMODE_NONOTIFY so change FMODE_NONOTIFY to be
    something unique.

    Signed-off-by: Wu Fengguang
    Signed-off-by: Eric Paris

    Signed-off-by: Wu Fengguang
     
  • This is a new f_mode which can only be set by the kernel. It indicates
    that the fd was opened by fanotify and should not cause future fanotify
    events. This is needed to prevent fanotify livelock. An example of
    obvious livelock is from fanotify close events.

    Process A closes file1
    This creates a close event for file1.
    fanotify opens file1 for Listener X
    Listener X deals with the event and closes its fd for file1.
    This creates a close event for file1.
    fanotify opens file1 for Listener X
    Listener X deals with the event and closes its fd for file1.
    This creates a close event for file1.
    fanotify opens file1 for Listener X
    Listener X deals with the event and closes its fd for file1.
    notice a pattern?

    The fix is to add the FMODE_NONOTIFY bit to the open filp done by the kernel
    for fanotify. Thus when that file is used it will not generate future
    events.

    This patch simply defines the bit.
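
The loop described above can be modelled in a few lines of C. This is a toy simulation, not kernel code: `simulate_close_events` and its parameters are made up for illustration, and `cap` exists only so that the unfixed case terminates.

```c
/*
 * Count the close events triggered by one user-space close of file1.
 * kernel_fd_nonotify: 1 if the fd that fanotify opens for the listener
 *                     is marked as not generating events, 0 otherwise.
 * cap:                upper bound on the number of simulated events.
 */
int simulate_close_events(int kernel_fd_nonotify, int cap)
{
    int events = 0;
    int pending_closes = 1;          /* Process A closes file1 */

    while (pending_closes > 0 && events < cap) {
        pending_closes--;
        events++;                    /* a close event is queued */

        /* fanotify opens file1 for Listener X, which then closes it. */
        if (!kernel_fd_nonotify)
            pending_closes++;        /* that close is reported as well */
    }
    return events;
}
```

With the mark set, a single close yields a single event; without it, the count runs until `cap` is reached.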

    Signed-off-by: Eric Paris

    Eric Paris
     

18 Dec, 2009

1 commit


10 Dec, 2009

1 commit

  • While Linux provided an O_SYNC flag basically since day 1, it took until
    Linux 2.4.0-test12pre2 to actually get it implemented for filesystems.
    Since that day we have had generic_osync_around with only minor changes
    and the great "For now, when the user asks for O_SYNC, we'll actually give
    O_DSYNC" comment. This patch intends to actually give us real O_SYNC
    semantics in addition to the O_DSYNC semantics. After Jan's O_SYNC
    patches, which are required before this patch, it's actually surprisingly
    simple: we just need to figure out when to set the datasync flag to
    vfs_fsync_range and when not.

    This patch renames the existing O_SYNC flag to O_DSYNC while keeping its
    numerical value to keep binary compatibility, and adds a new real O_SYNC
    flag. To guarantee backwards compatibility it is defined as expanding to
    both the O_DSYNC and the new additional binary flag (__O_SYNC) to make
    sure we are backwards-compatible when compiled against the new headers.

    This also means that all places that don't care about the differences can
    just check O_DSYNC and get the right behaviour for O_SYNC, too - only
    places that actually care need to check __O_SYNC in addition. Drivers and
    network filesystems have been updated in a fail safe way to always do the
    full sync magic if O_DSYNC is set. The few places setting O_SYNC for
    lower layers are kept that way for now to stay failsafe.

    We enforce that O_DSYNC is set when __O_SYNC is set early in the open path
    to make sure we always get these sane options.

    Note that parisc really screwed up their headers as they already define a
    O_DSYNC that has always been a no-op. We try to repair it by using it for
    the new O_DSYNC and redefining O_SYNC to send both the traditional
    O_SYNC numerical value _and_ the O_DSYNC one.
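
The flag layout described above can be sketched as follows. The constants use the asm-generic octal values and carry an `EX_` prefix so they do not clash with the system headers; other architectures use different numbers, and the helper names are hypothetical.

```c
/* asm-generic values; architecture-specific headers may differ. */
#define EX_O_DSYNC     00010000u                  /* the old O_SYNC bit */
#define EX_O_SYNC_BIT  04000000u                  /* stands in for __O_SYNC */
#define EX_O_SYNC      (EX_O_SYNC_BIT | EX_O_DSYNC)

/* Mirror of the open-path rule: __O_SYNC never appears without O_DSYNC. */
unsigned normalize_sync_flags(unsigned flags)
{
    if (flags & EX_O_SYNC_BIT)
        flags |= EX_O_DSYNC;
    return flags;
}

/* True for both O_DSYNC and O_SYNC opens: enough for most callers. */
int wants_data_sync(unsigned flags)
{
    return (flags & EX_O_DSYNC) != 0;
}

/* True only when full O_SYNC semantics were requested. */
int wants_full_sync(unsigned flags)
{
    return (flags & EX_O_SYNC_BIT) != 0;
}
```

A binary built against the old headers passes only the old bit, so it lands in the `wants_data_sync` case, which matches the behaviour it always had.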

    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Grant Grundler
    Cc: "David S. Miller"
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Cc: Al Viro
    Cc: Andreas Dilger
    Acked-by: Trond Myklebust
    Acked-by: Kyle McMartin
    Acked-by: Ulrich Drepper
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Jan Kara

    Christoph Hellwig
     

18 Nov, 2009

1 commit

  • This is for consistency with various ioctl() operations that include the
    suffix "PGRP" in their names, and also for consistency with PRIO_PGRP,
    used with setpriority() and getpriority(). Also, using PGRP instead of
    GID avoids confusion with the common abbreviation of "group ID".

    I'm fine with anything that makes it more consistent, and if PGRP is what
    is the predominant abbreviation then I see no need to further confuse
    matters by adding a third one.

    Signed-off-by: Peter Zijlstra
    Acked-by: Michael Kerrisk
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

12 Nov, 2009

1 commit

  • Fix a bug in

    commit ba0a6c9f6fceed11c6a99e8326f0477fe383e6b5
    Author: Peter Zijlstra
    AuthorDate: Wed Sep 23 15:57:03 2009 -0700
    Commit: Linus Torvalds
    CommitDate: Thu Sep 24 07:21:01 2009 -0700

    fcntl: add F_[SG]ETOWN_EX

    In asm-generic/fcntl.h, F_SETOWN_EX and F_GETLK64 both have value 12, and
    F_GETOWN_EX and F_SETLK64 both have value 13.
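
A duplicate like this is easy to catch mechanically. The sketch below is an illustration only: the table type and `first_collision` are hypothetical, the values 12 and 13 come from the message above, and 15 and 16 are the numbers asm-generic/fcntl.h uses for the two commands after the fix.

```c
#include <stddef.h>

struct fcntl_cmd {
    const char *name;
    int value;
};

/* Returns the index of the first entry whose value repeats an earlier
 * entry, or -1 if all values are distinct. */
int first_collision(const struct fcntl_cmd *cmds, size_t n)
{
    for (size_t i = 1; i < n; i++)
        for (size_t j = 0; j < i; j++)
            if (cmds[i].value == cmds[j].value)
                return (int)i;
    return -1;
}

/* Values as described in the bug report. */
static const struct fcntl_cmd before_fix[] = {
    { "F_GETLK64", 12 },   { "F_SETLK64", 13 },
    { "F_SETOWN_EX", 12 }, { "F_GETOWN_EX", 13 },
};

/* Values after the fix. */
static const struct fcntl_cmd after_fix[] = {
    { "F_GETLK64", 12 },   { "F_SETLK64", 13 },
    { "F_SETOWN_EX", 15 }, { "F_GETOWN_EX", 16 },
};
```

Running `first_collision` over `before_fix` points at the F_SETOWN_EX entry, while `after_fix` comes back clean.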

    Reported-by: "Joseph S. Myers"
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Andreas Schwab
    Signed-off-by: Peter Zijlstra
    Acked-by: Ulrich Drepper
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

24 Sep, 2009

1 commit

  • In order to direct the SIGIO signal to a particular thread of a
    multi-threaded application we cannot, as suggested by the manpage, put a
    TID into the regular fcntl(F_SETOWN) call. It will still be sent to the
    whole process of which that thread is part.

    Since people do want to properly direct SIGIO we introduce F_SETOWN_EX.

    The need to direct SIGIO comes from self-monitoring profiling such as with
    perf-counters. Perf-counters uses SIGIO to notify that new sample data is
    available. If the signal is delivered to the same task that generated the
    new sample it can augment that data by inspecting the task's user-space
    state right after it returns from the kernel. This is esp. convenient
    for interpreted or virtual machine driven environments.

    Both F_SETOWN_EX and F_GETOWN_EX take a pointer to a struct f_owner_ex
    as argument:

    struct f_owner_ex {
            int type;
            pid_t pid;
    };

    Where type is one of F_OWNER_TID, F_OWNER_PID or F_OWNER_GID.
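
A minimal userspace sketch of the interface, assuming a glibc that exposes F_SETOWN_EX under _GNU_SOURCE. The helper names `sigio_to_this_thread` and `sigio_owner` are hypothetical, and the thread ID is fetched with the raw gettid system call.

```c
#define _GNU_SOURCE             /* for F_SETOWN_EX and struct f_owner_ex */
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask for SIGIO on fd to go to the calling thread only.
 * Returns 0 on success, -1 on failure with errno set by fcntl(). */
int sigio_to_this_thread(int fd)
{
    struct f_owner_ex owner = {
        .type = F_OWNER_TID,
        .pid  = (pid_t)syscall(SYS_gettid),
    };
    return fcntl(fd, F_SETOWN_EX, &owner);
}

/* Read the current owner back into *out. Returns 0 on success. */
int sigio_owner(int fd, struct f_owner_ex *out)
{
    return fcntl(fd, F_GETOWN_EX, out);
}
```

SIGIO is only generated once O_ASYNC has also been set on the descriptor with F_SETFL; that step is left out here.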

    Signed-off-by: Peter Zijlstra
    Reviewed-by: Oleg Nesterov
    Tested-by: stephane eranian
    Cc: Michael Kerrisk
    Cc: Roland McGrath
    Cc: Al Viro
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

27 Mar, 2009

1 commit

  • A number of standard posix types are used in exported headers, which
    is not allowed if __STRICT_KERNEL_NAMES is defined. In order to
    get rid of the non-__STRICT_KERNEL_NAMES part and to make sane headers
    the default, we have to change them all to safe types.

    There are also still some leftovers in reiserfs_fs.h, elfcore.h
    and coda.h, but these files have not compiled in user space for
    a long time.

    This leaves out the various integer types ({u_,u,}int{8,16,32,64}_t),
    which we take care of separately.

    Signed-off-by: Arnd Bergmann
    Acked-by: Mauro Carvalho Chehab
    Cc: David Airlie
    Cc: Arnaldo Carvalho de Melo
    Cc: YOSHIFUJI Hideaki
    Cc: netdev@vger.kernel.org
    Cc: linux-ppp@vger.kernel.org
    Cc: Jaroslav Kysela
    Cc: Takashi Iwai
    Cc: David Woodhouse
    Signed-off-by: H. Peter Anvin
    Signed-off-by: Ingo Molnar

    Arnd Bergmann
     

17 Jul, 2007

1 commit

  • The problem is as follows: in multi-threaded code (or more correctly: all
    code using clone() with CLONE_FILES) we have a race when exec'ing.

    thread #1                       thread #2

    fd=open()

                                    fork + exec

    fcntl(fd,F_SETFD,FD_CLOEXEC)

    In some applications this can happen frequently. Take a web browser. One
    thread opens a file and another thread starts, say, an external PDF viewer.
    The result can even be a security issue if that open file descriptor
    refers to a sensitive file and the external program can somehow be tricked
    into using that descriptor.

    Just adding O_CLOEXEC support to open() doesn't solve the whole set of
    problems. There are other ways to create file descriptors (socket,
    epoll_create, Unix domain socket transfer, etc). These can and should be
    addressed separately though. open() is such an easy case that it makes
    little sense to put the fix off.

    The test program:

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #ifndef O_CLOEXEC
    # define O_CLOEXEC 02000000
    #endif

    int
    main (int argc, char *argv[])
    {
      int fd;
      if (argc > 1)
        {
          /* Child: the fd number arrives in argv[1] and must be closed.  */
          fd = atol (argv[1]);
          printf ("child: fd = %d\n", fd);
          if (fcntl (fd, F_GETFD) == 0 || errno != EBADF)
            {
              puts ("file descriptor valid in child");
              return 1;
            }
          return 0;
        }

      /* Parent: open with O_CLOEXEC, then re-exec with the fd number.  */
      fd = open ("/proc/self/exe", O_RDONLY | O_CLOEXEC);
      printf ("in parent: new fd = %d\n", fd);
      char buf[20];
      snprintf (buf, sizeof (buf), "%d", fd);
      execl ("/proc/self/exe", argv[0], buf, NULL);
      puts ("execl failed");
      return 1;
    }

    [kyle@parisc-linux.org: parisc fix]
    Signed-off-by: Ulrich Drepper
    Acked-by: Ingo Molnar
    Cc: Davide Libenzi
    Cc: Michael Kerrisk
    Cc: Chris Zankel
    Signed-off-by: Kyle McMartin
    Acked-by: David S. Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     

26 Apr, 2006

1 commit


08 Sep, 2005

5 commits