30 Jan, 2020

1 commit


31 May, 2019

1 commit

  • Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation either version 2 of the license or at
    your option any later version

    extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-or-later

    has been chosen to replace the boilerplate/reference in 3029 file(s).

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Allison Randal
    Cc: linux-spdx@vger.kernel.org
    Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

13 Jul, 2017

2 commits

  • kcmp syscall is build iif CONFIG_CHECKPOINT_RESTORE is selected, so wrap
    appropriate helpers in epoll code with the config to build it
    conditionally.

    Link: http://lkml.kernel.org/r/20170513083456.GG1881@uranus.lan
    Signed-off-by: Cyrill Gorcunov
    Reported-by: Andrew Morton
    Cc: Andrey Vagin
    Cc: Al Viro
    Cc: Pavel Emelyanov
    Cc: Michael Kerrisk
    Cc: Jason Baron
    Cc: Andy Lutomirski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cyrill Gorcunov
     
  • With current epoll architecture target files are addressed with
    file_struct and file descriptor number, where the last is not unique.
    Moreover files can be transferred from another process via unix socket,
    added into queue and closed then so we won't find this descriptor in the
    task fdinfo list.

    Thus to checkpoint and restore such processes CRIU needs to find out
    where exactly the target file is present to add it into epoll queue.
    For this sake one can use kcmp call where some particular target file
    from the queue is compared with arbitrary file passed as an argument.

    Because epoll target files can have same file descriptor number but
    different file_struct a caller should explicitly specify the offset
    within.

    To test if some particular file is matching entry inside epoll one have
    to

    - fill kcmp_epoll_slot structure with epoll file descriptor,
    target file number and target file offset (in case if only
    one target is present then it should be 0)

    - call kcmp as kcmp(pid1, pid2, KCMP_EPOLL_TFD, fd, &kcmp_epoll_slot)
    - the kernel fetch file pointer matching file descriptor @fd of pid1
    - lookups for file struct in epoll queue of pid2 and returns traditional
    0,1,2 result for sorting purpose

    Link: http://lkml.kernel.org/r/20170424154423.511592110@gmail.com
    Signed-off-by: Cyrill Gorcunov
    Acked-by: Andrey Vagin
    Cc: Al Viro
    Cc: Pavel Emelyanov
    Cc: Michael Kerrisk
    Cc: Jason Baron
    Cc: Andy Lutomirski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Cyrill Gorcunov
     

13 Oct, 2012

1 commit


06 Oct, 2012

1 commit

  • Enhanced epoll_ctl to support EPOLL_CTL_DISABLE, which disables an epoll
    item. If epoll_ctl doesn't return -EBUSY in this case, it is then safe to
    delete the epoll item in a multi-threaded environment. Also added a new
    test_epoll self- test app to both demonstrate the need for this feature
    and test it.

    Signed-off-by: Paton J. Lewis
    Cc: Alexander Viro
    Cc: Jason Baron
    Cc: Paul Holland
    Cc: Davide Libenzi
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paton J. Lewis
     

18 Jul, 2012

1 commit

  • As discussed in
    http://thread.gmane.org/gmane.linux.kernel/1249726/focus=1288990,
    the capability introduced in 4d7e30d98939a0340022ccd49325a3d70f7e0238
    to govern EPOLLWAKEUP seems misnamed: this capability is about governing
    the ability to suspend the system, not using a particular API flag
    (EPOLLWAKEUP). We should make the name of the capability more general
    to encourage reuse in related cases. (Whether or not this capability
    should also be used to govern the use of /sys/power/wake_lock is a
    question that needs to be separately resolved.)

    This patch renames the capability to CAP_BLOCK_SUSPEND. In order to ensure
    that the old capability name doesn't make it out into the wild, could you
    please apply and push up the tree to ensure that it is incorporated
    for the 3.5 release.

    Signed-off-by: Michael Kerrisk
    Acked-by: Serge Hallyn
    Signed-off-by: Rafael J. Wysocki

    Michael Kerrisk
     

06 May, 2012

1 commit


13 Jan, 2012

1 commit

  • The current epoll code can be tickled to run basically indefinitely in
    both loop detection path check (on ep_insert()), and in the wakeup paths.
    The programs that tickle this behavior set up deeply linked networks of
    epoll file descriptors that cause the epoll algorithms to traverse them
    indefinitely. A couple of these sample programs have been previously
    posted in this thread: https://lkml.org/lkml/2011/2/25/297.

    To fix the loop detection path check algorithms, I simply keep track of
    the epoll nodes that have been already visited. Thus, the loop detection
    becomes proportional to the number of epoll file descriptor and links.
    This dramatically decreases the run-time of the loop check algorithm. In
    one diabolical case I tried it reduced the run-time from 15 mintues (all
    in kernel time) to .3 seconds.

    Fixing the wakeup paths could be done at wakeup time in a similar manner
    by keeping track of nodes that have already been visited, but the
    complexity is harder, since there can be multiple wakeups on different
    cpus...Thus, I've opted to limit the number of possible wakeup paths when
    the paths are created.

    This is accomplished, by noting that the end file descriptor points that
    are found during the loop detection pass (from the newly added link), are
    actually the sources for wakeup events. I keep a list of these file
    descriptors and limit the number and length of these paths that emanate
    from these 'source file descriptors'. In the current implemetation I
    allow 1000 paths of length 1, 500 of length 2, 100 of length 3, 50 of
    length 4 and 10 of length 5. Note that it is sufficient to check the
    'source file descriptors' reachable from the newly added link, since no
    other 'source file descriptors' will have newly added links. This allows
    us to check only the wakeup paths that may have gotten too long, and not
    re-check all possible wakeup paths on the system.

    In terms of the path limit selection, I think its first worth noting that
    the most common case for epoll, is probably the model where you have 1
    epoll file descriptor that is monitoring n number of 'source file
    descriptors'. In this case, each 'source file descriptor' has a 1 path of
    length 1. Thus, I believe that the limits I'm proposing are quite
    reasonable and in fact may be too generous. Thus, I'm hoping that the
    proposed limits will not prevent any workloads that currently work to
    fail.

    In terms of locking, I have extended the use of the 'epmutex' to all
    epoll_ctl add and remove operations. Currently its only used in a subset
    of the add paths. I need to hold the epmutex, so that we can correctly
    traverse a coherent graph, to check the number of paths. I believe that
    this additional locking is probably ok, since its in the setup/teardown
    paths, and doesn't affect the running paths, but it certainly is going to
    add some extra overhead. Also, worth noting is that the epmuex was
    recently added to the ep_ctl add operations in the initial path loop
    detection code using the argument that it was not on a critical path.

    Another thing to note here, is the length of epoll chains that is allowed.
    Currently, eventpoll.c defines:

    /* Maximum number of nesting allowed inside epoll sets */
    #define EP_MAX_NESTS 4

    This basically means that I am limited to a graph depth of 5 (EP_MAX_NESTS
    + 1). However, this limit is currently only enforced during the loop
    check detection code, and only when the epoll file descriptors are added
    in a certain order. Thus, this limit is currently easily bypassed. The
    newly added check for wakeup paths, stricly limits the wakeup paths to a
    length of 5, regardless of the order in which ep's are linked together.
    Thus, a side-effect of the new code is a more consistent enforcement of
    the graph depth.

    Thus far, I've tested this, using the sample programs previously
    mentioned, which now either return quickly or return -EINVAL. I've also
    testing using the piptest.c epoll tester, which showed no difference in
    performance. I've also created a number of different epoll networks and
    tested that they behave as expectded.

    I believe this solves the original diabolical test cases, while still
    preserving the sane epoll nesting.

    Signed-off-by: Jason Baron
    Cc: Nelson Elhage
    Cc: Davide Libenzi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jason Baron
     

31 Mar, 2011

1 commit


16 Mar, 2009

1 commit

  • This lock moves out of the CONFIG_EPOLL ifdef and becomes f_lock. For now,
    epoll remains the only user, but a future patch will use it to protect
    f_flags as well.

    Cc: Davide Libenzi
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jonathan Corbet

    Jonathan Corbet
     

25 Jul, 2008

2 commits

  • Remove the size parameter from the new epoll_create syscall and renames the
    syscall itself. The updated test program follows.

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #include
    #include
    #include
    #include
    #include

    #ifndef __NR_epoll_create2
    # ifdef __x86_64__
    # define __NR_epoll_create2 291
    # elif defined __i386__
    # define __NR_epoll_create2 329
    # else
    # error "need __NR_epoll_create2"
    # endif
    #endif

    #define EPOLL_CLOEXEC O_CLOEXEC

    int
    main (void)
    {
    int fd = syscall (__NR_epoll_create2, 0);
    if (fd == -1)
    {
    puts ("epoll_create2(0) failed");
    return 1;
    }
    int coe = fcntl (fd, F_GETFD);
    if (coe == -1)
    {
    puts ("fcntl failed");
    return 1;
    }
    if (coe & FD_CLOEXEC)
    {
    puts ("epoll_create2(0) set close-on-exec flag");
    return 1;
    }
    close (fd);

    fd = syscall (__NR_epoll_create2, EPOLL_CLOEXEC);
    if (fd == -1)
    {
    puts ("epoll_create2(EPOLL_CLOEXEC) failed");
    return 1;
    }
    coe = fcntl (fd, F_GETFD);
    if (coe == -1)
    {
    puts ("fcntl failed");
    return 1;
    }
    if ((coe & FD_CLOEXEC) == 0)
    {
    puts ("epoll_create2(EPOLL_CLOEXEC) set close-on-exec flag");
    return 1;
    }
    close (fd);

    puts ("OK");

    return 0;
    }
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     
  • This patch adds the new epoll_create2 syscall. It extends the old epoll_create
    syscall by one parameter which is meant to hold a flag value. In this
    patch the only flag support is EPOLL_CLOEXEC which causes the close-on-exec
    flag for the returned file descriptor to be set.

    A new name EPOLL_CLOEXEC is introduced which in this implementation must
    have the same value as O_CLOEXEC.

    The following test must be adjusted for architectures other than x86 and
    x86-64 and in case the syscall numbers changed.

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    #include
    #include
    #include
    #include
    #include

    #ifndef __NR_epoll_create2
    # ifdef __x86_64__
    # define __NR_epoll_create2 291
    # elif defined __i386__
    # define __NR_epoll_create2 329
    # else
    # error "need __NR_epoll_create2"
    # endif
    #endif

    #define EPOLL_CLOEXEC O_CLOEXEC

    int
    main (void)
    {
    int fd = syscall (__NR_epoll_create2, 1, 0);
    if (fd == -1)
    {
    puts ("epoll_create2(0) failed");
    return 1;
    }
    int coe = fcntl (fd, F_GETFD);
    if (coe == -1)
    {
    puts ("fcntl failed");
    return 1;
    }
    if (coe & FD_CLOEXEC)
    {
    puts ("epoll_create2(0) set close-on-exec flag");
    return 1;
    }
    close (fd);

    fd = syscall (__NR_epoll_create2, 1, EPOLL_CLOEXEC);
    if (fd == -1)
    {
    puts ("epoll_create2(EPOLL_CLOEXEC) failed");
    return 1;
    }
    coe = fcntl (fd, F_GETFD);
    if (coe == -1)
    {
    puts ("fcntl failed");
    return 1;
    }
    if ((coe & FD_CLOEXEC) == 0)
    {
    puts ("epoll_create2(EPOLL_CLOEXEC) set close-on-exec flag");
    return 1;
    }
    close (fd);

    puts ("OK");

    return 0;
    }
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Signed-off-by: Ulrich Drepper
    Acked-by: Davide Libenzi
    Cc: Michael Kerrisk
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     

29 Oct, 2007

1 commit

  • Don't undef __i386__/__x86_64__ in uml anymore, make sure that (few) places
    that required adjusting the ifdefs got those.

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
     

28 Mar, 2007

1 commit

  • UML/x86_64 needs the same packing of struct epoll_event as x86_64.

    Signed-off-by: Jeff Dike
    Cc: Davide Libenzi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     

26 Jun, 2006

1 commit

  • A few days ago Arjan signaled a lockdep red flag on epoll locks, and
    precisely between the epoll's device structure lock (->lock) and the wait
    queue head lock (->lock).

    Like I explained in another email, and directly to Arjan, this can't happen
    in reality because of the explicit check at eventpoll.c:592, that does not
    allow to drop an epoll fd inside the same epoll fd. Since lockdep is
    working on per-structure locks, it will never be able to know of policies
    enforced in other parts of the code.

    It was decided time ago of having the ability to drop epoll fds inside
    other epoll fds, that triggers a very trick wakeup operations (due to
    possibly reentrant callback-driven wakeups) handled by the
    ep_poll_safewake() function. While looking again at the code though, I
    noticed that all the operations done on the epoll's main structure wait
    queue head (->wq) are already protected by the epoll lock (->lock), so that
    locked-style functions can be used to manipulate the ->wq member. This
    makes both a lock-acquire save, and lockdep happy.

    Running totalmess on my dual opteron for a while did not reveal any problem
    so far:

    http://www.xmailserver.org/totalmess.c

    Signed-off-by: Davide Libenzi
    Cc: Arjan van de Ven
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Libenzi
     

23 Mar, 2006

1 commit

  • Eliminate a handful of cache references by keeping current in a register
    instead of reloading (helps x86) and avoiding the overhead of a function
    call. Inlining eventpoll_init_file() saves 24 bytes. Also reorder file
    initialization to make writes occur more sequentially.

    Signed-off-by: Benjamin LaHaise
    Cc: Davide Libenzi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Benjamin LaHaise
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds