20 Jul, 2007

1 commit

  • Remove the arg+env limit of MAX_ARG_PAGES by copying the strings directly from
    the old mm into the new mm.

    We create the new mm before the binfmt code runs, and place the new stack at
    the very top of the address space. Once the binfmt code runs and figures out
    where the stack should be, we move it downwards.

    It is a bit peculiar in that we have one task with two mm's, one of which is
    inactive.

    [a.p.zijlstra@chello.nl: limit stack size]
    Signed-off-by: Ollie Wild
    Signed-off-by: Peter Zijlstra
    Cc:
    Cc: Hugh Dickins
    [bunk@stusta.de: unexport bprm_mm_init]
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ollie Wild
     

24 May, 2007

1 commit


17 May, 2007

1 commit


11 May, 2007

2 commits


09 May, 2007

3 commits

  • Implement utimensat(2) which is an extension to futimesat(2) in that it

    a) supports nanosecond resolution for the timestamps
    b) allows selectively ignoring the atime/mtime value
    c) allows selectively using the current time for either atime or mtime
    d) supports changing the atime/mtime of a symlink itself along the lines
    of the BSD lutimes(3) function

    For this change the internally used do_utimes() function was changed to
    accept a timespec time value and an additional flags parameter.

    Additionally the sys_utime function was changed to match compat_sys_utime,
    which already uses do_utimes instead of duplicating the work.

    Also, the completely missing futimensat() functionality is added. We have
    such a function in glibc but we have to resort to using /proc/self/fd/* which
    not everybody likes (chroot etc).

    Test application (the syscall number will need per-arch editing):

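    #define _GNU_SOURCE
    #include <errno.h>
    #include <error.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <sys/time.h>
    #include <unistd.h>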

    #define __NR_utimensat 280

    #define UTIME_NOW ((1l << 30) - 1l)
    #define UTIME_OMIT ((1l << 30) - 2l)

    int
    main(void)
    {
    int status = 0;

    int fd = open("ttt", O_RDWR|O_CREAT|O_EXCL, 0666);
    if (fd == -1)
    error (1, errno, "failed to create test file \"ttt\"");

    struct stat64 st1;
    if (fstat64 (fd, &st1) != 0)
    error (1, errno, "fstat failed");

    struct timespec t[2];
    t[0].tv_sec = 0;
    t[0].tv_nsec = 0;
    t[1].tv_sec = 0;
    t[1].tv_nsec = 0;
    if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

    struct stat64 st2;
    if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

    if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0)
    {
    puts ("atim not reset to zero");
    status = 1;
    }
    if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
    puts ("mtim not reset to zero");
    status = 1;
    }
    if (status != 0)
    goto out;

    t[0] = st1.st_atim;
    t[1].tv_sec = 0;
    t[1].tv_nsec = UTIME_OMIT;
    if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

    if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

    if (st2.st_atim.tv_sec != st1.st_atim.tv_sec
    || st2.st_atim.tv_nsec != st1.st_atim.tv_nsec)
    {
    puts ("atim not set");
    status = 1;
    }
    if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
    puts ("mtim changed from zero");
    status = 1;
    }
    if (status != 0)
    goto out;

    t[0].tv_sec = 0;
    t[0].tv_nsec = UTIME_OMIT;
    t[1] = st1.st_mtim;
    if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

    if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

    if (st2.st_atim.tv_sec != st1.st_atim.tv_sec
    || st2.st_atim.tv_nsec != st1.st_atim.tv_nsec)
    {
    puts ("atim changed from original time");
    status = 1;
    }
    if (st2.st_mtim.tv_sec != st1.st_mtim.tv_sec
    || st2.st_mtim.tv_nsec != st1.st_mtim.tv_nsec)
    {
    puts ("mtim not set");
    status = 1;
    }
    if (status != 0)
    goto out;

    sleep (2);

    t[0].tv_sec = 0;
    t[0].tv_nsec = UTIME_NOW;
    t[1].tv_sec = 0;
    t[1].tv_nsec = UTIME_NOW;
    if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

    if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

    struct timeval tv;
    gettimeofday(&tv,NULL);

    if (st2.st_atim.tv_sec <= st1.st_atim.tv_sec
    || st2.st_atim.tv_sec > tv.tv_sec)
    {
    puts ("atim not set to NOW");
    status = 1;
    }
    if (st2.st_mtim.tv_sec <= st1.st_mtim.tv_sec
    || st2.st_mtim.tv_sec > tv.tv_sec)
    {
    puts ("mtim not set to NOW");
    status = 1;
    }

    if (symlink ("ttt", "tttsym") != 0)
    error (1, errno, "cannot create symlink");

    t[0].tv_sec = 0;
    t[0].tv_nsec = 0;
    t[1].tv_sec = 0;
    t[1].tv_nsec = 0;
    if (syscall(__NR_utimensat, AT_FDCWD, "tttsym", t, AT_SYMLINK_NOFOLLOW) != 0)
    error (1, errno, "utimensat failed");

    if (lstat64 ("tttsym", &st2) != 0)
    error (1, errno, "lstat failed");

    if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0)
    {
    puts ("symlink atim not reset to zero");
    status = 1;
    }
    if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
    puts ("symlink mtim not reset to zero");
    status = 1;
    }
    if (status != 0)
    goto out;

    t[0].tv_sec = 1;
    t[0].tv_nsec = 0;
    t[1].tv_sec = 1;
    t[1].tv_nsec = 0;
    if (syscall(__NR_utimensat, fd, NULL, t, 0) != 0)
    error (1, errno, "utimensat failed");

    if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

    if (st2.st_atim.tv_sec != 1 || st2.st_atim.tv_nsec != 0)
    {
    puts ("atim not reset to one");
    status = 1;
    }
    if (st2.st_mtim.tv_sec != 1 || st2.st_mtim.tv_nsec != 0)
    {
    puts ("mtim not reset to one");
    status = 1;
    }

    if (status == 0)
    puts ("all OK");

    out:
    close (fd);
    unlink ("ttt");
    unlink ("tttsym");

    return status;
    }

    [akpm@linux-foundation.org: add missing i386 syscall table entry]
    Signed-off-by: Ulrich Drepper
    Cc: Alexey Dobriyan
    Cc: Michael Kerrisk
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     
  • Merge all compat ioctl handling into compat_ioctl.c instead of splitting it
    over compat.c and compat_ioctl.c. This also lets us get rid of ioctl32.h.

    Signed-off-by: Christoph Hellwig
    Looks-good-to: Andi Kleen
    Acked-by: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • ROUND_UP macro cleanup: use ALIGN or DIV_ROUND_UP wherever appropriate.

    Signed-off-by: Milind Arun Choudhary
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Milind Arun Choudhary
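
For readers unfamiliar with the two helpers named in the ROUND_UP cleanup above, the following is a minimal userspace sketch of what they compute. The SKETCH_ prefixed macros and bitmap_bytes() are hypothetical names for illustration, the alignment macro assumes a power-of-two alignment, and the kernel's own definitions differ in detail.

```c
#include <stddef.h>

/* Simplified userspace restatements, not the kernel's definitions. */

/* Round x up to the next multiple of a. Assumes a is a power of two. */
#define SKETCH_ALIGN(x, a)        (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/* Number of d-sized chunks needed to hold n items (ceiling division). */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Bytes needed for a bitmap of nbits bits, stored in whole unsigned longs. */
static size_t bitmap_bytes(size_t nbits)
{
    size_t bits_per_long = 8 * sizeof(unsigned long);

    return SKETCH_DIV_ROUND_UP(nbits, bits_per_long) * sizeof(unsigned long);
}
```

The first macro fits "pad a size to a boundary" cases, the second fits "how many units do I need" cases, which is roughly the distinction the cleanup draws.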
     

03 May, 2007

1 commit


08 Mar, 2007

1 commit

  • IA64 and ARM-OABI are currently using their own version of epoll compat_
    code.

    An architecture needs epoll_event translation if alignof(u64) in 32 bit
    mode is different from alignof(u64) in 64 bit mode. If an architecture
    needs epoll_event translation, it must define struct compat_epoll_event in
    asm/compat.h and set CONFIG_HAVE_COMPAT_EPOLL_EVENT and use
    compat_sys_epoll_ctl and compat_sys_epoll_wait.

    All 64 bit architectures should use compat_sys_epoll_pwait.

    [sfr: restructure and move to fs/compat.c, remove MIPS version
    of compat_sys_epoll_pwait, use __put_user_unaligned]

    Signed-off-by: Stephen Rothwell
    Cc: David Woodhouse
    Cc: Russell King
    Cc: "Luck, Tony"
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Libenzi
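
The alignment condition in the epoll commit above can be made concrete with a small sketch. It contrasts a naturally aligned layout with a packed one and copies between them field by field. The struct and function names are hypothetical, and __attribute__((packed)) is a GCC/Clang extension assumed here to model an ABI where a 64-bit member is only 4-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout with the compiler's natural alignment for the 64-bit member. */
struct event_natural {
    uint32_t events;
    uint64_t data;
};

/* Layout as seen by an ABI that places the 64-bit member at offset 4. */
struct event_packed {
    uint32_t events;
    uint64_t data;
} __attribute__((packed));

/* Nonzero when the two layouts disagree, i.e. when translation is needed. */
static int needs_translation(void)
{
    return offsetof(struct event_natural, data) !=
           offsetof(struct event_packed, data);
}

/* Field-by-field copy, which is what a translation layer boils down to. */
static struct event_natural from_packed(const struct event_packed *in)
{
    struct event_natural out;

    out.events = in->events;
    out.data = in->data;
    return out;
}
```

Whether needs_translation() returns nonzero depends on the platform the sketch is compiled for, which mirrors the per-architecture nature of the config option.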
     

11 Dec, 2006

1 commit

  • Currently, each fdtable supports three dynamically-sized arrays of data: the
    fdarray and two fdsets. The code allows the number of fds supported by the
    fdarray (fdtable->max_fds) to differ from the number of fds supported by each
    of the fdsets (fdtable->max_fdset).

    In practice, it is wasteful for these two sizes to differ: whenever we hit a
    limit on the smaller-capacity structure, we will reallocate the entire fdtable
    and all the dynamic arrays within it, so any delta in the memory used by the
    larger-capacity structure will never be touched at all.

    Rather than hogging this excess, we shouldn't even allocate it in the first
    place, and should keep the capacities of the fdarray and the fdsets equal. This
    patch removes fdtable->max_fdset. As an added bonus, most of the supporting
    code becomes simpler.

    Signed-off-by: Vadim Lobanov
    Cc: Christoph Hellwig
    Cc: Al Viro
    Cc: Dipankar Sarma
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vadim Lobanov
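
The idea of one capacity governing all three arrays can be sketched in userspace as follows. This is a hypothetical, simplified stand-in and not the kernel's fdtable: the struct, the helper names and the rounding policy are assumptions made for illustration.

```c
#include <stdlib.h>

#define SKETCH_BITS_PER_LONG (8 * sizeof(unsigned long))

/* Hypothetical, simplified stand-in for an fd table. */
struct fdtable_sketch {
    unsigned int max_fds;          /* single capacity for all three arrays */
    void **fd;                     /* the fdarray */
    unsigned long *open_fds;       /* fdset: which fds are open */
    unsigned long *close_on_exec;  /* fdset: which fds close on exec */
};

/* Allocate all three arrays for at least nr fds (nr must be nonzero). */
static int fdtable_sketch_init(struct fdtable_sketch *t, unsigned int nr)
{
    /* Round up so the fdsets consist of whole unsigned longs. */
    size_t longs = (nr + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG;
    size_t capacity = longs * SKETCH_BITS_PER_LONG;

    t->fd = calloc(capacity, sizeof(*t->fd));
    t->open_fds = calloc(longs, sizeof(*t->open_fds));
    t->close_on_exec = calloc(longs, sizeof(*t->close_on_exec));
    if (!t->fd || !t->open_fds || !t->close_on_exec) {
        free(t->fd);
        free(t->open_fds);
        free(t->close_on_exec);
        return -1;
    }
    t->max_fds = (unsigned int)capacity;
    return 0;
}

static void fdtable_sketch_free(struct fdtable_sketch *t)
{
    free(t->fd);
    free(t->open_fds);
    free(t->close_on_exec);
}
```

Because the fdarray and both fdsets are sized from the same number, no array carries capacity that the others cannot use.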
     

09 Dec, 2006

1 commit

  • This patch changes struct file to use struct path instead of having
    independent pointers to struct dentry and struct vfsmount, and converts all
    users of f_{dentry,vfsmnt} in fs/ to use f_path.{dentry,mnt}.

    Additionally, it adds two #define's to make the transition easier for users of
    the f_dentry and f_vfsmnt.

    Signed-off-by: Josef "Jeff" Sipek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef "Jeff" Sipek
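
A rough userspace sketch of the shape of the struct path change follows. The types are opaque stand-ins, file_sketch and same_location() are hypothetical names, and the two macros are assumed to resemble the transitional #defines the message mentions.

```c
#include <stddef.h>

/* Opaque stand-ins for the kernel types; only pointers to them are used. */
struct dentry;
struct vfsmount;

/* The pair that previously lived in the file as two separate pointers. */
struct path {
    struct vfsmount *mnt;
    struct dentry *dentry;
};

/* Hypothetical, heavily trimmed stand-in for struct file. */
struct file_sketch {
    struct path f_path;
    /* other members elided */
};

/* Transitional aliases so that old-style member accesses keep compiling. */
#define f_dentry f_path.dentry
#define f_vfsmnt f_path.mnt

/* Mixes old-style and new-style accesses to show that they are equivalent. */
static int same_location(const struct file_sketch *a,
                         const struct file_sketch *b)
{
    return a->f_dentry == b->f_path.dentry && a->f_vfsmnt == b->f_path.mnt;
}
```

Macros of this kind let out-of-tree or not yet converted code build unchanged while in-tree users move to the f_path spelling.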
     

08 Dec, 2006

2 commits

  • Signed-off-by: Heiko Carstens
    Cc: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
     
  • OpenVZ Linux kernel team has found a problem with mounting in compat mode.

    Simple command "mount -t smbfs ..." on Fedora Core 5 distro in 32-bit mode
    leads to oops:

    Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: compat_sys_mount+0xd6/0x290
    Process mount (pid: 14656, veid=300, threadinfo ffff810034d30000, task ffff810034c86bc0)
    Call Trace: ia32_sysret+0x0/0xa

    The problem is that data_page pointer can be NULL, so we should skip data
    conversion in this case.

    Signed-off-by: Andrey Mirkin
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrey Mirkin
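
The shape of the compat mount fix above, skipping the conversion when no data page was supplied, can be sketched as below. convert_mount_data() is a hypothetical helper, and the "conversion" it performs (terminating the buffer) is only a placeholder for filesystem-specific work.

```c
#include <stddef.h>

/*
 * Hypothetical converter for mount data.
 * Returns 1 if a conversion was performed, 0 if there was nothing to do.
 */
static int convert_mount_data(char *data_page, size_t len)
{
    /* No data was passed in, so there is nothing to convert. */
    if (data_page == NULL)
        return 0;

    /* Placeholder for the real work: make sure the buffer is terminated. */
    if (len > 0)
        data_page[len - 1] = '\0';
    return 1;
}
```

The guard comes first so that every later access can assume a valid buffer.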
     

04 Dec, 2006

2 commits


04 Nov, 2006

1 commit

  • 758333458aa719bfc26ec16eafd4ad3a9e96014d fixes the unchecked copy_to_user
    return value in compat_sys_pselect7. I ran into this too because of an old
    source tree, but my fix would look quite a bit different from Andi's fix.

    The reason is that the compat function IMHO should behave the very same as
    the non-compat function if possible. Since sys_pselect7 does not return
    -EFAULT in this specific case, change the compat code so it behaves like
    sys_pselect7.

    Cc: David Woodhouse
    Cc: Andi Kleen
    Signed-off-by: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Heiko Carstens
     

11 Oct, 2006

1 commit


03 Oct, 2006

1 commit

  • These patches make the kernel pass 64-bit inode numbers internally when
    communicating to userspace, even on a 32-bit system. They are required
    because some filesystems have intrinsic 64-bit inode numbers: NFS3+ and XFS
    for example. The 64-bit inode numbers are then propagated to userspace
    automatically where the arch supports it.

    Problems have been seen with userspace (eg: ld.so) using the 64-bit inode
    number returned by stat64() or getdents64() to differentiate files, and
    failing because the 64-bit inode number space was compressed to 32-bits, and
    so overlaps occur.

    This patch:

    Make filldir_t take a 64-bit inode number and struct kstat carry a 64-bit
    inode number so that 64-bit inode numbers can be passed back to userspace.

    The stat functions then return the full 64-bit inode number where
    available and where possible. If it is not possible to represent the inode
    number supplied by the filesystem in the field provided by userspace, then
    error EOVERFLOW will be issued.

    Similarly, the getdents/readdir functions now pass the full 64-bit inode
    number to userspace where possible, returning EOVERFLOW instead when a
    directory entry is encountered that can't be properly represented.

    Note that this means that some inodes will not be stat'able on a 32-bit
    system with old libraries where they were before - but it does mean that
    there will be no ambiguity over what a 32-bit inode number refers to.

    Note similarly that directory scans may be cut short with an error on a
    32-bit system with old libraries where the scan would work before for the
    same reasons.

    It is judged unlikely that this situation will occur because modern glibc
    uses 64-bit capable versions of stat and getdents class functions
    exclusively, and that older systems are unlikely to encounter
    unrepresentable inode numbers anyway.

    [akpm: alpha build fix]
    Signed-off-by: David Howells
    Cc: Trond Myklebust
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
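
The EOVERFLOW rule described in the inode-number commit above amounts to a checked narrowing. A minimal sketch, assuming a 32-bit destination field and a hypothetical helper name put_ino32():

```c
#include <errno.h>
#include <stdint.h>

/*
 * Store a 64-bit inode number into a 32-bit field.
 * Returns 0 on success, or -EOVERFLOW if the value does not fit,
 * in which case *out is left untouched.
 */
static int put_ino32(uint64_t ino, uint32_t *out)
{
    uint32_t narrow = (uint32_t)ino;

    /* If widening the truncated value does not give ino back, bits were lost. */
    if ((uint64_t)narrow != ino)
        return -EOVERFLOW;

    *out = narrow;
    return 0;
}
```

Reporting an error instead of silently truncating is what removes the ambiguity between two files whose inode numbers share the same low 32 bits.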
     

02 Oct, 2006

1 commit

  • Revert Andrew Morton's patch to temporarily hack around the lack of a
    declaration of sigset_t in linux/compat.h to make the block-disablement
    patches build on IA64. This got accidentally pushed to Linus and should
    be fixed in a different manner.

    Also make linux/compat.h #include asm/signal.h to gain a definition of
    sigset_t so that it can externally declare sigset_from_compat().

    This has been compile-tested for i386, x86_64, ia64, mips, mips64, frv, ppc and
    ppc64 and run-tested on frv.

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
     

01 Oct, 2006

3 commits


26 Sep, 2006

1 commit

  • Fix

    linux/fs/compat.c: In function compat_sys_pselect7
    linux/fs/compat.c:1869: warning: ignoring return value of copy_to_user, declared with attribute warn_unused_result

    To make it easier to handle I changed the semantics to not try to
    write out a timespec if an error occurred. I hope that's ok.

    Cc: dwmw2@infradead.org

    Signed-off-by: Andi Kleen

    Andi Kleen
     

27 Jun, 2006

1 commit


23 Jun, 2006

1 commit

  • Give the statfs superblock operation a dentry pointer rather than a superblock
    pointer.

    This complements the get_sb() patch. That reduced the significance of
    sb->s_root, allowing NFS to place a fake root there. However, NFS does
    require a dentry to use as a target for the statfs operation. This permits
    the root in the vfsmount to be used instead.

    linux/mount.h has been added where necessary to make allyesconfig build
    successfully.

    Interest has also been expressed for use with the FUSE and XFS filesystems.

    Signed-off-by: David Howells
    Acked-by: Al Viro
    Cc: Nathan Scott
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

22 May, 2006

1 commit

  • Functions compat_nfs_svc_trans, compat_nfs_clnt_trans,
    compat_nfs_exp_trans, compat_nfs_getfd_trans and compat_nfs_getfs_trans,
    which are called by compat_sys_nfsservctl (fs/compat.c), don't handle the
    return value of access_ok properly. access_ok returns 1 when the addr is
    valid, and 0 when it's not, but these functions have the reversed
    understanding. When the address is valid, they always return -EFAULT to
    compat_sys_nfsservctl.

    An example is to run /usr/sbin/rpc.nfsd (a 32-bit program on Power5). It
    doesn't function as expected. strace shows that nfsservctl returns
    -EFAULT.

    The patch fixes this by correcting the error handling on the return value
    of access_ok in the five functions.

    Signed-off-by: Lin Feng Shen
    Cc: Trond Myklebust
    Acked-by: Neil Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lin Feng Shen
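
The reversed check described in the nfsservctl commit above is easy to reproduce in a sketch. range_ok() is a hypothetical userspace stand-in that follows the same convention the message gives for access_ok (1 for a valid range, 0 otherwise); trans_buggy() and trans_fixed() are illustrative names only.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for access_ok(): 1 if the range is usable, 0 if not. */
static int range_ok(const void *addr, size_t size)
{
    return addr != NULL && size > 0;
}

/* The reversed understanding: fails exactly when the range is valid. */
static int trans_buggy(const void *arg, size_t size)
{
    if (range_ok(arg, size))
        return -EFAULT;
    return 0;
}

/* The corrected check: fail only when the range is not valid. */
static int trans_fixed(const void *arg, size_t size)
{
    if (!range_ok(arg, size))
        return -EFAULT;
    return 0;
}
```

A predicate that returns nonzero for "good" reads naturally with a negation in the error path, which is the form the fixed version uses.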
     

16 May, 2006

1 commit


04 May, 2006

1 commit


02 May, 2006

1 commit


26 Apr, 2006

1 commit

  • This patch addresses a flaw in LSM, where there is no mediation of readv()
    and writev() for 32-bit compatible apps using a 64-bit kernel.

    This bug was discovered and fixed initially in the native readv/writev
    code [1], but was not fixed in the compat code. Thanks to Al for spotting
    this one.

    [1] http://lwn.net/Articles/154282/

    Signed-off-by: James Morris
    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    James Morris
     

29 Mar, 2006

1 commit


26 Mar, 2006

2 commits


24 Mar, 2006

1 commit


18 Feb, 2006

1 commit

  • I got all of these backwards. We want to return

    min(input timeout, new timeout)

    to userspace to prevent increasing the time-remaining value.

    Thanks to Ernst Herzberg for reporting and diagnosing.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
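
The min(input timeout, new timeout) rule above can be sketched for struct timeval as follows. The helper names are hypothetical, and both values are assumed to be normalized (tv_usec in the range 0 to 999999).

```c
#include <sys/time.h>

/* Nonzero when a represents a longer interval than b (both normalized). */
static int timeval_longer(const struct timeval *a, const struct timeval *b)
{
    if (a->tv_sec != b->tv_sec)
        return a->tv_sec > b->tv_sec;
    return a->tv_usec > b->tv_usec;
}

/* Value to report back to the caller: min(input timeout, new timeout). */
static struct timeval timeout_to_report(struct timeval input,
                                        struct timeval remaining)
{
    return timeval_longer(&remaining, &input) ? input : remaining;
}
```

Clamping to the input means that internal rounding up can never make the remaining time appear larger than what the caller originally asked for.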
     

12 Feb, 2006

1 commit

  • With David Woodhouse

    select() presently has a habit of increasing the value of the user's
    `timeout' argument on return.

    We were writing back a timeout larger than the original. We _deliberately_
    round up, since we know we must wait at _least_ as long as the caller asks
    us to.

    The patch adds a couple of helper functions for magnitude comparison of
    timespecs and of timevals, and uses them to prevent the various poll and
    select functions from returning a timeout which is larger than the one which
    was passed in.

    The patch also fixes a bug in compat_sys_pselect7(): it was adding the new
    timeout value to the old one and was returning that. It should just return
    the new timeout value.

    (We have various handy timespec/timeval-to-from-nsec conversion functions in
    time.h. But this code open-codes it all).

    Cc: "David S. Miller"
    Cc: Andi Kleen
    Cc: Ulrich Drepper
    Cc: Thomas Gleixner
    Cc: george anzinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

02 Feb, 2006

2 commits

  • Most of the 64 bit architectures will zero extend the first argument to
    compat_sys_{openat,newfstatat,futimesat} which will fail if the 32 bit
    syscall was passed AT_FDCWD (which is a small negative number). Declare
    the first argument to be an unsigned int which will force the correct
    sign extension when the internal functions are called in each case.

    Also, do some small white space cleanups in fs/compat.c.

    Signed-off-by: Stephen Rothwell
    Acked-by: David S. Miller
    Signed-off-by: Linus Torvalds

    Stephen Rothwell
     
  • fs/compat.c: In function `compat_sys_pselect7':
    fs/compat.c:1820: warning: passing arg 5 of `compat_core_sys_select' from incompatible pointer type

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
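
The AT_FDCWD problem from the compat_sys_{openat,newfstatat,futimesat} commit above can be modelled in plain C by treating a 64-bit register as a uint64_t. This is only a model with hypothetical helper names; SKETCH_AT_FDCWD restates the Linux value of AT_FDCWD, and the final conversion back to int is implementation-defined in ISO C (it wraps on GCC and Clang).

```c
#include <stdint.h>

#define SKETCH_AT_FDCWD (-100)   /* value of AT_FDCWD on Linux */

/* Model of a 64-bit register holding a zero-extended 32-bit argument. */
static uint64_t zero_extended(int32_t arg32)
{
    return (uint64_t)(uint32_t)arg32;
}

/* What a callee sees if it trusts all 64 bits of the register. */
static int64_t seen_without_fix(uint64_t reg)
{
    return (int64_t)reg;
}

/*
 * Taking the argument as unsigned int keeps only the low 32 bits;
 * passing that on as an int then restores the negative value.
 */
static int seen_with_fix(uint64_t reg)
{
    unsigned int dfd = (unsigned int)reg;

    return (int)dfd;
}
```

Without the truncation step the special value is seen as a large positive number, so it no longer compares equal to AT_FDCWD.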
     

20 Jan, 2006

1 commit

  • The compat layer timeout handling changes in:

    9f72949f679df06021c9e43886c9191494fdb007

    are busted. This is most easily seen with an X application
    that uses sub-second select/poll timeout such as emacs. You
    hit a key and it takes a second or so before the app responds.

    The two ROUND_UP() calls upon entry are using {tv,ts}_sec where it
    should instead be using {tv_usec,ts_nsec}, which perfectly explains
    the observed incorrect behavior.

    Another bug shot down with git bisect.

    Signed-off-by: David S. Miller
    Signed-off-by: Linus Torvalds

    David S. Miller
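
The class of mistake described above, rounding the seconds field where the sub-second field was meant, can be illustrated with a small sketch. This does not reproduce the kernel code: SKETCH_HZ is a hypothetical tick rate, and the macro and function names are made up for the example.

```c
#include <sys/time.h>

#define SKETCH_HZ      100                      /* hypothetical tick rate */
#define USEC_PER_TICK  (1000000 / SKETCH_HZ)
#define ROUND_UP_DIV(x, y) (((x) + (y) - 1) / (y))

/* Intended: seconds scale by the tick rate, microseconds round up to ticks. */
static long timeout_ticks(const struct timeval *tv)
{
    return (long)tv->tv_sec * SKETCH_HZ +
           ROUND_UP_DIV((long)tv->tv_usec, USEC_PER_TICK);
}

/* The mistake: the seconds field is fed to the rounding step instead. */
static long timeout_ticks_buggy(const struct timeval *tv)
{
    return (long)tv->tv_sec * SKETCH_HZ +
           ROUND_UP_DIV((long)tv->tv_sec, USEC_PER_TICK);
}
```

In the buggy variant the sub-second part of the timeout never influences the result, so any caller relying on sub-second timeouts gets the wrong wait.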