17 Jan, 2012

1 commit

  • * 'x86-syscall-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86: Move from trace_syscalls.c to asm/syscall.h
    x86, um: Fix typo in 32-bit system call modifications
    um: Use $(srctree) not $(KBUILD_SRC)
    x86, um: Mark system call tables readonly
    x86, um: Use the same style generated syscall tables as native
    um: Generate headers before generating user-offsets.s
    um: Run host archheaders, allow use of host generated headers
    kbuild, headers.sh: Don't make archheaders explicitly
    x86, syscall: Allow syscall offset to be symbolic
    x86, syscall: Re-fix typo in comment
    x86: Simplify syscallhdr.sh
    x86: Generate system call tables and unistd_*.h from tables
    checksyscalls: Use arch/x86/syscalls/syscall_32.tbl as source
    x86: Machine-readable syscall tables and scripts to process them
    trace: Include in trace_syscalls.c
    x86-64, ia32: Move compat_ni_syscall into C and its own file
    x86-64, syscall: Adjust comment spacing and remove typo
    kbuild: Add support for an "archheaders" target
    kbuild: Add support for installing generated asm headers

    Linus Torvalds
     

06 Dec, 2011

2 commits

  • system_call_after_swapgs doesn't really benefit from forcing
    alignment from it - quite the opposite, native code needlessly
    so far got a big NOP instruction inserted in front of it. Xen
    being the only user of the separate entry point can well live
    with the branch going to three bytes into a cache line.

    The compatibility mode ptregs entry points for one can make use
    of the GLOBAL() macro, and should be suitably aligned. Their
    shared continuation point (ia32_ptregs_common) otoh doesn't need
    to be global at all, but should continue to be properly aligned.

    Signed-off-by: Jan Beulich
    Reviewed-by: Andi Kleen
    Link: http://lkml.kernel.org/r/4ED4CEEA020000780006407D@nat28.tlf.novell.com
    Signed-off-by: Ingo Molnar

    Jan Beulich
     
  • GET_THREAD_INFO() involves a memory read immediately followed by
    a "sub" on the value read, in turn (in several cases)
    immediately followed by a use of the calculated value as the
    base address of a memory access. This combination of
    instructions has a non-negligible potential for stalls.

    In the system call entry point code, however, the (fixed) offset
    of the stack pointer from the end of the stack is generally
    known, and hence we can instead avoid the memory load and
    subtract, and instead do the memory reference using %rsp as the
    base register. To do so in a legible fashion, introduce a
    THREAD_INFO() macro which, provided a register (generally %rsp)
    and the known offset from the end of the stack, produces a
    suitable memory access operand.
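
    The address arithmetic described above can be modelled as follows. This
    is only a sketch: it assumes an 8 KiB THREAD_SIZE with thread_info at the
    lowest address of the kernel stack, the function names are illustrative,
    and any extra constant the real macro folds in is left out.

```python
THREAD_SIZE = 8192  # assumed kernel stack size in bytes


def thread_info_via_load(stack_end: int) -> int:
    """GET_THREAD_INFO() style: read the end-of-stack value, then subtract."""
    return stack_end - THREAD_SIZE


def thread_info_via_rsp(rsp: int, offset_from_end: int) -> int:
    """THREAD_INFO() style: one %rsp-relative operand, no load and no sub.

    offset_from_end is the known distance from %rsp to the end of the stack,
    so the displacement below is a constant at assembly time.
    """
    displacement = offset_from_end - THREAD_SIZE
    return rsp + displacement


stack_end = 0xFFFF880000004000   # made-up address for illustration
rsp = stack_end - 80             # 80 bytes already pushed at this point
assert thread_info_via_rsp(rsp, 80) == thread_info_via_load(stack_end)
```

    Both routes reach the same address; the second one needs only the stack
    pointer that is already in a register.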

    The patch attempts to only touch the fast paths (no auditing and
    alike), but manages to do so only in the 64-bit entry point
    case; the compatibility mode entry points have so many
    interdependencies between their various branch targets that it
    was necessary to also adjust the slow paths to eliminate the
    risk of having missed some register dependency during code
    analysis.

    Signed-off-by: Jan Beulich
    Reviewed-by: Andi Kleen
    Link: http://lkml.kernel.org/r/4ED4CD690200007800064075@nat28.tlf.novell.com
    Signed-off-by: Ingo Molnar

    Jan Beulich
     

19 Nov, 2011

1 commit

  • Fix the same typo as was fixed in:

    b7641d2c x86-64, syscall: Adjust comment spacing and remove typo

    ... for the new versions of this file (32-bit and IA32 compat).

    Signed-off-by: H. Peter Anvin
    Link: http://lkml.kernel.org/r/1321569446-20433-4-git-send-email-hpa@linux.intel.com

    H. Peter Anvin
     

18 Nov, 2011

2 commits

  • Generate system call tables and unistd_*.h automatically from the
    tables in arch/x86/syscalls. All other information, like NR_syscalls,
    is auto-generated, some of which is in asm-offsets_*.c.

    This allows us to keep all the system call information in one place,
    and allows for kernel space and user space to see different
    information; this is currently used for the ia32 system call numbers
    when building the 64-bit kernel, but will be used by the x32 ABI in
    the near future.

    This also removes some gratuitous differences between i386, x86-64
    and ia32; in particular, now all system call tables are generated with
    the same mechanism.
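
    The idea of deriving headers from a machine-readable table can be
    sketched as below. This is not the kernel's actual script: the column
    layout (number, ABI, name, entry point) and the sample rows are
    assumptions made for illustration, and the helper names are hypothetical.

```python
SAMPLE_TABLE = """\
# <number> <abi> <name> <entry point>
0 i386 restart_syscall sys_restart_syscall
1 i386 exit sys_exit
3 i386 read sys_read
"""


def parse_table(text):
    """Return (number, abi, name, entry) tuples, skipping comments and blanks."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        number, abi, name = int(fields[0]), fields[1], fields[2]
        # A row without an entry point is treated as "not implemented".
        entry = fields[3] if len(fields) > 3 else "sys_ni_syscall"
        entries.append((number, abi, name, entry))
    return entries


def unistd_lines(entries):
    """Emit one #define per system call, as a unistd-style header would."""
    return [f"#define __NR_{name} {number}" for number, _abi, name, _entry in entries]


def nr_syscalls(entries):
    """Derive the table size from the highest number instead of hand-editing it."""
    return max(number for number, *_rest in entries) + 1
```

    Because every output is computed from the one table, the header and the
    table size cannot drift apart.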

    Cc: H. J. Lu
    Cc: Sam Ravnborg
    Cc: Michal Marek
    Signed-off-by: H. Peter Anvin

    H. Peter Anvin
     
  • Move compat_ni_syscall out of ia32entry.S and into its own .c file.
    Although this is a trivial function, it is not performance-critical,
    and this will simplify further cleanups.

    Signed-off-by: H. Peter Anvin

    H. Peter Anvin
     

01 Nov, 2011

1 commit

  • The basic idea behind cross memory attach is to allow MPI programs doing
    intra-node communication to do a single copy of the message rather than a
    double copy of the message via shared memory.

    The following patch attempts to achieve this by allowing a destination
    process, given an address and size from a source process, to copy memory
    directly from the source process into its own address space via a system
    call. There is also a symmetrical ability to copy from the current
    process's address space into a destination process's address space.
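
    The scatter/gather semantics can be illustrated with a pure-Python model.
    It does not invoke the system call: the two bytearrays stand in for the
    two address spaces, and vm_readv with its (base, length) pairs is a
    hypothetical helper that only mirrors the iovec idea.

```python
def vm_readv(local_mem, local_iov, remote_mem, remote_iov):
    """Copy the bytes named by remote_iov into the regions named by local_iov.

    Each iov is a list of (base, length) pairs. Data moves directly from the
    remote buffer to the local one, segment by segment, with no staging copy.
    Returns the number of bytes copied.
    """
    copied = 0
    li = ri = 0        # index of the current local / remote segment
    loff = roff = 0    # bytes already consumed in each current segment
    while li < len(local_iov) and ri < len(remote_iov):
        lbase, llen = local_iov[li]
        rbase, rlen = remote_iov[ri]
        n = min(llen - loff, rlen - roff)
        local_mem[lbase + loff:lbase + loff + n] = \
            remote_mem[rbase + roff:rbase + roff + n]
        copied += n
        loff += n
        roff += n
        if loff == llen:
            li, loff = li + 1, 0
        if roff == rlen:
            ri, roff = ri + 1, 0
    return copied


remote = bytearray(b"hello, world")
local = bytearray(8)
n = vm_readv(local, [(0, 3), (4, 4)], remote, [(0, 5), (7, 5)])
```

    Neither side has to be contiguous, which is the property the
    /proc/pid/mem approach discussed below lacks.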

    - Use of /proc/pid/mem has been considered, but there are issues with
      using it:
      - Does not allow for specifying iovecs for both src and dest; assuming
        preadv or pwritev was implemented, either the area read from or
        written to would need to be contiguous.
      - Currently mem_read allows only processes who are currently
        ptrace'ing the target and are still able to ptrace the target to read
        from the target. This check could possibly be moved to the open call,
        but it's not clear exactly what race this restriction is stopping
        (reason appears to have been lost).
      - Having to send the fd of /proc/self/mem via SCM_RIGHTS on a unix
        domain socket is a bit ugly from a userspace point of view,
        especially when you may have hundreds if not (eventually) thousands
        of processes that all need to do this with each other.
      - Doesn't allow for some future use of the interface we would like to
        consider adding (see below).
      - Interestingly, reading from /proc/pid/mem currently actually
        involves two copies! (But this could be fixed pretty easily.)

    As mentioned previously, use of vmsplice instead was considered, but it
    has problems. Since the reader and writer need to work co-operatively,
    you block if the pipe is not drained, which requires some wrapping to do
    non-blocking on the send side or polling on the receive side. In
    all-to-all communication it requires ordering, otherwise you can
    deadlock. And in the example of many MPI tasks writing to one MPI task,
    vmsplice serialises the copying.

    There are some cases of MPI collectives where even a single copy interface
    does not get us the performance gain we could. For example in an
    MPI_Reduce rather than copy the data from the source we would like to
    instead use it directly in a mathops (say the reduce is doing a sum) as
    this would save us doing a copy. We don't need to keep a copy of the data
    from the source. I haven't implemented this, but I think this interface
    could in the future do all this through the use of the flags - eg could
    specify the math operation and type and the kernel rather than just
    copying the data would apply the specified operation between the source
    and destination and store it in the destination.

    Although we don't have a "second user" of the interface (though I've had
    some nibbles from people who may be interested in using it for intra
    process messaging which is not MPI), this interface is something which
    hardware vendors are already doing for their custom drivers to implement
    fast local communication. So in addition to this being useful for
    OpenMPI, it would mean the driver maintainers don't have to fix things up
    when the mm changes.

    There was some discussion about how much faster a true zero copy would
    go. Here's a link back to the email with some testing I did on that:

    http://marc.info/?l=linux-mm&m=130105930902915&w=2

    There is a basic man page for the proposed interface here:

    http://ozlabs.org/~cyeoh/cma/process_vm_readv.txt

    This has been implemented for x86 and powerpc, other architectures should
    mainly (I think) just need to add syscall numbers for the process_vm_readv
    and process_vm_writev. There are 32 bit compatibility versions for
    64-bit kernels.

    For arch maintainers there are some simple tests to be able to quickly
    verify that the syscalls are working correctly here:

    http://ozlabs.org/~cyeoh/cma/cma-test-20110718.tgz

    Signed-off-by: Chris Yeoh
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Cc: Arnd Bergmann
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: David Howells
    Cc: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christopher Yeoh
     

27 Aug, 2011

1 commit


27 Jul, 2011

1 commit

  • This allows us to move duplicated code in
    (atomic_inc_not_zero() for now) to

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

23 Jul, 2011

1 commit


15 Jul, 2011

1 commit


04 Jun, 2011

2 commits


29 May, 2011

1 commit

  • 32bit and 64bit on x86 are tested and working. The rest I have looked
    at closely and I can't find any problems.

    setns is an easy system call to wire up. It just takes two ints so I
    don't expect any weird architecture porting problems.

    While doing this I have noticed that we have some architectures that are
    very slow to get new system calls. cris seems to be the slowest where
    the last system calls wired up were preadv and pwritev. avr32 is weird
    in that recvmmsg was wired up but never declared in unistd.h. frv is
    behind with perf_event_open being the last syscall wired up. On h8300
    the last system call wired up was epoll_wait. On m32r the last system
    call wired up was fallocate. mn10300 has recvmmsg as the last system
    call wired up. The rest seem to at least have syncfs wired up which was
    new in 2.6.39.

    v2: Most of the architecture support added by Daniel Lezcano
    v3: ported to v2.6.36-rc4 by: Eric W. Biederman
    v4: Moved wiring up of the system call to another patch
    v5: ported to v2.6.39-rc6
    v6: rebased onto parisc-next and net-next to avoid syscall conflicts.
    v7: ported to Linus's latest post 2.6.39 tree.

    >  arch/blackfin/include/asm/unistd.h     |    3 ++-
    >  arch/blackfin/mach-common/entry.S      |    1 +
    Acked-by: Mike Frysinger

    Oh - ia64 wiring looks good.
    Acked-by: Tony Luck

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

06 May, 2011

1 commit

  • This patch adds a multiple message send syscall and is the send
    version of the existing recvmmsg syscall. This is heavily
    based on the patch by Arnaldo that added recvmmsg.

    I wrote a microbenchmark to test the performance gains of using
    this new syscall:

    http://ozlabs.org/~anton/junkcode/sendmmsg_test.c

    The test was run on a ppc64 box with a 10 Gbit network card. The
    benchmark can send both UDP and RAW ethernet packets.

    64B UDP

    batch    pkts/sec
        1      804570
        2      872800  (+ 8 %)
        4      916556  (+14 %)
        8      939712  (+17 %)
       16      952688  (+18 %)
       32      956448  (+19 %)
       64      964800  (+20 %)

    64B raw socket

    batch    pkts/sec
        1     1201449
        2     1350028  (+12 %)
        4     1461416  (+22 %)
        8     1513080  (+26 %)
       16     1541216  (+28 %)
       32     1553440  (+29 %)
       64     1557888  (+30 %)

    We see a 20% improvement in throughput on UDP send and 30%
    on raw socket send.
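
    The quoted percentages can be re-derived from the packet rates above.
    The snippet below is just that arithmetic; the dictionary and function
    names are illustrative.

```python
# Packets per second by batch size, copied from the results above.
UDP = {1: 804570, 2: 872800, 4: 916556, 8: 939712,
       16: 952688, 32: 956448, 64: 964800}
RAW = {1: 1201449, 2: 1350028, 4: 1461416, 8: 1513080,
       16: 1541216, 32: 1553440, 64: 1557888}


def gain_percent(results, batch):
    """Throughput gain of a batch size relative to sending one message per call."""
    return (results[batch] / results[1] - 1.0) * 100.0


udp_gain = gain_percent(UDP, 64)   # a little under 20
raw_gain = gain_percent(RAW, 64)   # a little under 30
```

    Most of the gain is already reached by a batch size of 8; larger batches
    add only a few more percentage points.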

    [ Add sparc syscall entries. -DaveM ]

    Signed-off-by: Anton Blanchard
    Signed-off-by: David S. Miller

    Anton Blanchard
     

24 Mar, 2011

1 commit


21 Mar, 2011

1 commit

  • It is frequently useful to sync a single file system, instead of all
    mounted file systems via sync(2):

    - On machines with many mounts, it is not at all uncommon for some of
    them to hang (e.g. unresponsive NFS server). sync(2) will get stuck on
    those and may never get to the one you do care about (e.g., /).
    - Some applications write lots of data to the file system and then
    want to make sure it is flushed to disk. Calling fsync(2) on each
    file introduces unnecessary ordering constraints that result in a large
    amount of sub-optimal writeback/flush/commit behavior by the file
    system.

    There are currently two ways (that I know of) to sync a single super_block:

    - BLKFLSBUF ioctl on the block device: That also invalidates the bdev
    mapping, which isn't usually desirable, and doesn't work for non-block
    file systems.
    - 'mount -o remount,rw' will call sync_filesystem as an artifact of the
    current implementation. Relying on this little-known side effect for
    something like data safety sounds foolish.

    Both of these approaches require root privileges, which some applications
    do not have (nor should they need?) given that sync(2) is an unprivileged
    operation.

    This patch introduces a new system call syncfs(2) that takes an fd and
    syncs only the file system it references. Maybe someday we can

    $ sync /some/path

    and not get

    sync: ignoring all arguments

    The syscall is motivated by comments by Al and Christoph at the last LSF.
    syncfs(2) seems like an appropriate name given statfs(2).

    A similar ioctl was also proposed a while back, see
    http://marc.info/?l=linux-fsdevel&m=127970513829285&w=2

    Signed-off-by: Sage Weil
    Signed-off-by: Al Viro

    Sage Weil
     

16 Mar, 2011

3 commits

  • * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, binutils, xen: Fix another wrong size directive
    x86: Remove dead config option X86_CPU
    x86: Really print supported CPUs if PROCESSOR_SELECT=y
    x86: Fix a bogus unwind annotation in lib/semaphore_32.S
    um, x86-64: Fix UML build after adding CFI annotations to lib/rwsem_64.S
    x86: Remove unused bits from lib/thunk_*.S
    x86: Use {push,pop}_cfi in more places
    x86-64: Add CFI annotations to lib/rwsem_64.S
    x86, asm: Cleanup unnecssary macros in asm-offsets.c
    x86, system.h: Drop unused __SAVE/__RESTORE macros
    x86: Use bitmap library functions
    x86: Partly unify asm-offsets_{32,64}.c
    x86: Reduce back the alignment of the per-CPU data section

    Linus Torvalds
     
  • …l/git/tip/linux-2.6-tip

    * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (62 commits)
    posix-clocks: Check write permissions in posix syscalls
    hrtimer: Remove empty hrtimer_init_hres_timer()
    hrtimer: Update hrtimer->state documentation
    hrtimer: Update base[CLOCK_BOOTTIME].offset correctly
    timers: Export CLOCK_BOOTTIME via the posix timers interface
    timers: Add CLOCK_BOOTTIME hrtimer base
    time: Extend get_xtime_and_monotonic_offset() to also return sleep
    time: Introduce get_monotonic_boottime and ktime_get_boottime
    hrtimers: extend hrtimer base code to handle more then 2 clockids
    ntp: Remove redundant and incorrect parameter check
    mn10300: Switch do_timer() to xtimer_update()
    posix clocks: Introduce dynamic clocks
    posix-timers: Cleanup namespace
    posix-timers: Add support for fd based clocks
    x86: Add clock_adjtime for x86
    posix-timers: Introduce a syscall for clock tuning.
    time: Splitout compat timex accessors
    ntp: Add ADJ_SETOFFSET mode bit
    time: Introduce timekeeping_inject_offset
    posix-timer: Update comment
    ...

    Fix up new system-call-related conflicts in
    arch/x86/ia32/ia32entry.S
    arch/x86/include/asm/unistd_32.h
    arch/x86/include/asm/unistd_64.h
    arch/x86/kernel/syscall_table_32.S
    (name_to_handle_at()/open_by_handle_at() vs clock_adjtime()), and some
    due to movement of get_jiffies_64() in:
    kernel/time.c

    Linus Torvalds
     
  • …git/tip/linux-2.6-tip

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (184 commits)
    perf probe: Clean up probe_point_lazy_walker() return value
    tracing: Fix irqoff selftest expanding max buffer
    tracing: Align 4 byte ints together in struct tracer
    tracing: Export trace_set_clr_event()
    tracing: Explain about unstable clock on resume with ring buffer warning
    ftrace/graph: Trace function entry before updating index
    ftrace: Add .ref.text as one of the safe areas to trace
    tracing: Adjust conditional expression latency formatting.
    tracing: Fix event alignment: skb:kfree_skb
    tracing: Fix event alignment: mce:mce_record
    tracing: Fix event alignment: kvm:kvm_hv_hypercall
    tracing: Fix event alignment: module:module_request
    tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeup
    tracing: Remove lock_depth from event entry
    perf header: Stop using 'self'
    perf session: Use evlist/evsel for managing perf.data attributes
    perf top: Don't let events to eat up whole header line
    perf top: Fix events overflow in top command
    ring-buffer: Remove unused #include <linux/trace_irq.h>
    tracing: Add an 'overwrite' trace_option.
    ...

    Linus Torvalds
     

15 Mar, 2011

1 commit


09 Mar, 2011

1 commit

  • Put x86 entry code into a separate link section: .entry.text.

    Separating the entry text section seems to have performance
    benefits - caused by more efficient instruction cache usage.

    Running hackbench with perf stat --repeat showed that the change
    compresses the icache footprint. The icache load miss rate went
    down by about 15%:

    before patch:
    19417627 L1-icache-load-misses ( +- 0.147% )

    after patch:
    16490788 L1-icache-load-misses ( +- 0.180% )
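
    The "about 15%" figure follows directly from the two miss counts. A
    small check, with illustrative variable names:

```python
# L1-icache-load-misses reported before and after the patch.
misses_before = 19417627
misses_after = 16490788

# Relative reduction in misses, as a percentage of the original count.
reduction_percent = (misses_before - misses_after) / misses_before * 100.0
```

    The reduction is far larger than the run-to-run variation of under 0.2%
    reported next to each count.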

    The motivation of the patch was to fix a particular kprobes
    bug that relates to the entry text section, the performance
    advantage was discovered accidentally.

    Whole perf output follows:

    - results for current tip tree:

    Performance counter stats for './hackbench/hackbench 10' (500 runs):

    19417627 L1-icache-load-misses ( +- 0.147% )
    2676914223 instructions # 0.497 IPC ( +- 0.079% )
    5389516026 cycles ( +- 0.144% )

    0.206267711 seconds time elapsed ( +- 0.138% )

    - results for current tip tree with the patch applied:

    Performance counter stats for './hackbench/hackbench 10' (500 runs):

    16490788 L1-icache-load-misses ( +- 0.180% )
    2717734941 instructions # 0.502 IPC ( +- 0.079% )
    5414756975 cycles ( +- 0.148% )

    0.206747566 seconds time elapsed ( +- 0.137% )

    Signed-off-by: Jiri Olsa
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Cc: Nick Piggin
    Cc: Eric Dumazet
    Cc: masami.hiramatsu.pt@hitachi.com
    Cc: ananth@in.ibm.com
    Cc: davem@davemloft.net
    Cc: 2nddept-manager@sdl.hitachi.co.jp
    Signed-off-by: Ingo Molnar

    Jiri Olsa
     

01 Mar, 2011

1 commit


02 Feb, 2011

1 commit


18 Nov, 2010

1 commit


15 Oct, 2010

1 commit

  • akiphie points out that a.out core-dumps have that odd task struct
    dumping that was never used and was never really a good idea (it goes
    back into the mists of history, probably the original core-dumping
    code). Just remove it.

    Also do the access_ok() check on dump_write(). It probably doesn't
    matter (since normal filesystems all seem to do it anyway), but he
    points out that it's normally done by the VFS layer, so ...

    [ I suspect that we should possibly do "vfs_write()" instead of
    calling ->write directly. That also does the whole fsnotify and write
    statistics thing, which may or may not be a good idea. ]

    And just to be anal, do this all for the x86-64 32-bit a.out emulation
    code too, even though it's not enabled (and won't currently even
    compile)

    Reported-by: akiphie
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

15 Sep, 2010

2 commits

  • In commit d4d6715, we reopened an old hole for a 64-bit ptracer touching a
    32-bit tracee in system call entry. A %rax value set via ptrace at the
    entry tracing stop gets used whole as a 32-bit syscall number, while we
    only check the low 32 bits for validity.

    Fix it by truncating %rax back to 32 bits after syscall_trace_enter,
    in addition to testing the full 64 bits as has already been added.
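
    The hole can be modelled with plain integers. This is a simplified
    sketch, not the entry code: NR_SYSCALLS is a hypothetical table size and
    the functions only mimic the order of "check" and "use".

```python
NR_SYSCALLS = 337          # hypothetical number of table entries
MASK32 = 0xFFFFFFFF


def buggy_dispatch(rax):
    """Validate only the low 32 bits, but index the table with all 64."""
    if (rax & MASK32) >= NR_SYSCALLS:
        return None        # rejected as an invalid system call
    return rax             # index that would actually be used


def fixed_dispatch(rax):
    """Truncate to 32 bits first, then validate and use the same value."""
    rax &= MASK32
    if rax >= NR_SYSCALLS:
        return None
    return rax


# A value a 64-bit ptracer could store: valid low half, non-zero high half.
crafted = (1 << 32) | 5
```

    With the crafted value, buggy_dispatch returns an index far beyond the
    table, while fixed_dispatch returns 5.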

    Reported-by: Ben Hawkes
    Signed-off-by: Roland McGrath
    Signed-off-by: H. Peter Anvin

    Roland McGrath
     
  • On 64 bits, we always, by necessity, jump through the system call
    table via %rax. For 32-bit system calls, in theory the system call
    number is stored in %eax, and the code was testing %eax for a valid
    system call number. At one point we loaded the stored value back from
    the stack to enforce zero-extension, but that was removed in checkin
    d4d67150165df8bf1cc05e532f6efca96f907cab. An actual 32-bit process
    will not be able to introduce a non-zero-extended number, but it can
    happen via ptrace.

    Instead of re-introducing the zero-extension, test what we are
    actually going to use, i.e. %rax. This only adds a handful of REX
    prefixes to the code.

    Reported-by: Ben Hawkes
    Signed-off-by: H. Peter Anvin
    Cc: Roland McGrath
    Cc: Andrew Morton

    H. Peter Anvin
     

14 Aug, 2010

1 commit

  • Mark arguments to certain system calls as being const where they should be but
    aren't. The list includes:

    (*) The filename arguments of various stat syscalls, execve(), various utimes
    syscalls and some mount syscalls.

    (*) The filename arguments of some syscall helpers relating to the above.

    (*) The buffer argument of various write syscalls.

    Signed-off-by: David Howells
    Acked-by: David S. Miller
    Signed-off-by: Linus Torvalds

    David Howells
     

11 Aug, 2010

2 commits

    As pointed out by Jiri Slaby: when I resolved the 32-bit x86 system
    call entry tables for prlimit (due to the conflict with fanotify), I
    forgot to add the numbering in comments that we do for every fifth entry.

    Reported-by: Jiri Slaby
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • * 'writable_limits' of git://decibel.fi.muni.cz/~xslaby/linux:
    unistd: add __NR_prlimit64 syscall numbers
    rlimits: implement prlimit64 syscall
    rlimits: switch more rlimit syscalls to do_prlimit
    rlimits: redo do_setrlimit to more generic do_prlimit
    rlimits: add rlimit64 structure
    rlimits: do security check under task_lock
    rlimits: allow setrlimit to non-current tasks
    rlimits: split sys_setrlimit
    rlimits: selinux, do rlimits changes under task_lock
    rlimits: make sure ->rlim_max never grows in sys_setrlimit
    rlimits: add task_struct to update_rlimit_cpu
    rlimits: security, add task_struct to setrlimit

    Fix up various system call number conflicts. We not only added fanotify
    system calls in the meantime, but asm-generic/unistd.h added a wait4
    along with a range of reserved per-architecture system calls.

    Linus Torvalds
     

28 Jul, 2010

2 commits


16 Jul, 2010

1 commit


21 Apr, 2010

1 commit

  • Before commit e28cbf22933d0c0ccaf3c4c27a1a263b41f73859 ("improve
    sys_newuname() for compat architectures") 64-bit x86 had a private
    implementation of sys_uname which was just called sys_uname, which other
    architectures used for the old uname.

    Due to some merge issues with the uname refactoring patches we ended up
    calling the old uname version for both the old and new system call
    slots, which led to the domainname field never being set, which caused
    failures with libnss_nis.

    Reported-and-tested-by: Andy Isaacson
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch a large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.
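
    The first rule above (gfp.h if only gfp is used, slab.h if slab is used)
    can be sketched like this. It is not the slabh-sweep.py script itself:
    the regular expressions are assumptions that cover only a few common
    identifiers, and needed_include is a hypothetical helper.

```python
import re

# Illustrative patterns only; a real sweep would need a fuller list.
GFP_RE = re.compile(r"\b(?:GFP_[A-Z_]+|__GFP_[A-Z_]+|gfp_t)\b")
SLAB_RE = re.compile(r"\b(?:k[mzc]alloc|kfree|kmem_cache_\w+)\b")


def needed_include(source):
    """Return the single header a file should include, or None.

    Slab usage wins because slab.h itself pulls in gfp.h.
    """
    if SLAB_RE.search(source):
        return "linux/slab.h"
    if GFP_RE.search(source):
        return "linux/gfp.h"
    return None
```

    For example, a file that calls kmalloc(n, GFP_KERNEL) gets only
    linux/slab.h, while a file that merely passes a gfp_t around gets
    linux/gfp.h.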

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

13 Mar, 2010

3 commits

  • Add generic implementations of the old and really old uname system calls.
    Note that sh only implements sys_olduname but not sys_oldolduname, but I'm
    not going to bother with another ifdef for that special case.

    m32r implemented an old uname but never wired it up, so kill it, too.

    Signed-off-by: Christoph Hellwig
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Hirokazu Takata
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: H. Peter Anvin
    Cc: Al Viro
    Cc: Arnd Bergmann
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: "Luck, Tony"
    Cc: James Morris
    Cc: Andreas Schwab
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Add a generic implementation of the old mmap() syscall, which expects its
    argument in a memory block and switch all architectures over to use it.
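
    What "argument in a memory block" means can be shown by packing such a
    block by hand. The field order and the 32-bit little-endian layout below
    are assumptions for illustration, and the helper names are hypothetical.

```python
import struct

# Assumed layout: six consecutive 32-bit words, in this order.
MMAP_ARG_FIELDS = ("addr", "len", "prot", "flags", "fd", "offset")
MMAP_ARG_FORMAT = "<6I"


def pack_mmap_args(**values):
    """Build the single block the old call reads all its arguments from."""
    return struct.pack(MMAP_ARG_FORMAT, *(values[name] for name in MMAP_ARG_FIELDS))


def unpack_mmap_args(block):
    """Recover the individual arguments, as a generic helper would do."""
    return dict(zip(MMAP_ARG_FIELDS, struct.unpack(MMAP_ARG_FORMAT, block)))


# An anonymous private read/write mapping of one page; fd -1 stored as 0xFFFFFFFF.
block = pack_mmap_args(addr=0, len=4096, prot=0x3, flags=0x22,
                       fd=0xFFFFFFFF, offset=0)
```

    The caller passes only a pointer to this block, so one shared routine can
    unpack it and hand the six values to the regular mmap path.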

    Signed-off-by: Christoph Hellwig
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Hirokazu Takata
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Reviewed-by: H. Peter Anvin
    Cc: Al Viro
    Cc: Arnd Bergmann
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: "Luck, Tony"
    Cc: James Morris
    Cc: Andreas Schwab
    Acked-by: Jesper Nilsson
    Acked-by: Russell King
    Acked-by: Greg Ungerer
    Acked-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Add a generic implementation of the old select() syscall, which expects
    its argument in a memory block and switch all architectures over to use
    it.

    Signed-off-by: Christoph Hellwig
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Hirokazu Takata
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Reviewed-by: H. Peter Anvin
    Cc: Al Viro
    Cc: Arnd Bergmann
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: "Luck, Tony"
    Cc: James Morris
    Acked-by: Andreas Schwab
    Acked-by: Russell King
    Acked-by: Greg Ungerer
    Acked-by: David Howells
    Cc: Andreas Schwab
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

04 Mar, 2010

1 commit

  • * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits)
    x86: Fix out of order of gsi
    x86: apic: Fix mismerge, add arch_probe_nr_irqs() again
    x86, irq: Keep chip_data in create_irq_nr and destroy_irq
    xen: Remove unnecessary arch specific xen irq functions.
    smp: Use nr_cpus= to set nr_cpu_ids early
    x86, irq: Remove arch_probe_nr_irqs
    sparseirq: Use radix_tree instead of ptrs array
    sparseirq: Change irq_desc_ptrs to static
    init: Move radix_tree_init() early
    irq: Remove unnecessary bootmem code
    x86: Add iMac9,1 to pci_reboot_dmi_table
    x86: Convert i8259_lock to raw_spinlock
    x86: Convert nmi_lock to raw_spinlock
    x86: Convert ioapic_lock and vector_lock to raw_spinlock
    x86: Avoid race condition in pci_enable_msix()
    x86: Fix SCI on IOAPIC != 0
    x86, ia32_aout: do not kill argument mapping
    x86, irq: Move __setup_vector_irq() before the first irq enable in cpu online path
    x86, irq: Update the vector domain for legacy irqs handled by io-apic
    x86, irq: Don't block IRQ0_VECTOR..IRQ15_VECTOR's on all cpu's
    ...

    Linus Torvalds