12 Oct, 2006

40 commits

  • I was looking at lockdep-design.txt and I guess I am confused by the
    changes made in fd7bcea35e7efb108c34ee2b3840942a3749cadb. It
    says

    + '.' acquired while irqs enabled
    + '+' acquired in irq context
    + '-' acquired in process context with irqs disabled
    + '?' read-acquired both with irqs enabled and in irq context
    +

    But the get_usage_chars() function does this for '-'
    if (class->usage_mask & LOCKF_ENABLED_HARDIRQS)
    *c1 = '-';

    So I guess the correct description would be
    '.' acquired while irqs disabled
    '+' acquired in irq context
    '-' acquired with irqs enabled
    '?' read acquired in irq context with irqs enabled.

    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aneesh Kumar
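
The corrected legend above can be sketched as a small lookup, shown here only to make the mapping concrete. This is a user-space illustration, not the kernel's get_usage_chars(): the bit values are hypothetical, and every flag name other than LOCKF_ENABLED_HARDIRQS is an assumption modelled on it.

```c
/* Hypothetical bit values, chosen only for this sketch. */
enum {
    LOCKF_USED_IN_HARDIRQ       = 1 << 0,
    LOCKF_ENABLED_HARDIRQS      = 1 << 1,
    LOCKF_USED_IN_HARDIRQ_READ  = 1 << 2,
    LOCKF_ENABLED_HARDIRQS_READ = 1 << 3
};

/* Character for ordinary (write) acquisitions. */
char hardirq_char(unsigned int usage_mask)
{
    if (usage_mask & LOCKF_USED_IN_HARDIRQ)
        return '+';               /* acquired in irq context */
    if (usage_mask & LOCKF_ENABLED_HARDIRQS)
        return '-';               /* acquired with irqs enabled */
    return '.';                   /* acquired while irqs disabled */
}

/* Character for read acquisitions; '?' means both cases were seen. */
char hardirq_read_char(unsigned int usage_mask)
{
    int in_irq  = (usage_mask & LOCKF_USED_IN_HARDIRQ_READ) != 0;
    int enabled = (usage_mask & LOCKF_ENABLED_HARDIRQS_READ) != 0;

    if (in_irq && enabled)
        return '?';
    if (in_irq)
        return '+';
    if (enabled)
        return '-';
    return '.';
}
```

With these definitions, a class that was only ever taken with irqs enabled maps to '-', which is the case the original documentation described incorrectly.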
     
  • Signed-off-by: Alexey Dobriyan
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Signed-off-by: Alexey Dobriyan
    Cc: David Woodhouse
    Cc: David Howells
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • CONFIG_PCI=n, CONFIG_HT_IRQ=y results in the following compile error:

    ...
    LD vmlinux
    arch/i386/mach-generic/built-in.o: In function `apicid_to_node':
    summit.c:(.text+0x53): undefined reference to `apicid_2_node'
    arch/i386/kernel/built-in.o: In function `arch_setup_ht_irq':
    (.text+0xcf79): undefined reference to `write_ht_irq_low'
    arch/i386/kernel/built-in.o: In function `arch_setup_ht_irq':
    (.text+0xcf85): undefined reference to `write_ht_irq_high'
    arch/i386/kernel/built-in.o: In function `k7nops':
    alternative.c:(.data+0x1358): undefined reference to `mask_ht_irq'
    alternative.c:(.data+0x1360): undefined reference to `unmask_ht_irq'
    make[1]: *** [vmlinux] Error 1

    Bug report by Jesper Juhl.

    Signed-off-by: Adrian Bunk
    Cc: "Eric W. Biederman"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aneesh Kumar
     
  • There's nothing arch-specific about check_signature(), so move it to
    <linux/io.h>. Use a cross between the Alpha and i386 implementations as
    the generic one.

    Signed-off-by: Matthew Wilcox
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
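
A minimal user-space sketch of what a generic signature check does, for illustration only. It assumes a plain memory buffer in place of I/O memory, so an ordinary pointer read stands in for the kernel's readb(); the function name is hypothetical and this is not the code the patch adds.

```c
#include <stddef.h>

/*
 * Return 1 if the `length` bytes at `addr` match `signature`, else 0.
 * A plain volatile read stands in for readb() in this sketch.
 */
int check_signature_sketch(const volatile unsigned char *addr,
                           const unsigned char *signature,
                           size_t length)
{
    while (length--) {
        if (*addr++ != *signature++)
            return 0;   /* first mismatch ends the comparison */
    }
    return 1;
}
```

Nothing in the loop depends on the architecture, which is the point the commit message makes.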
     
  • In preparation for moving check_signature, change these users from asm/io.h
    to linux/io.h

    Signed-off-by: Matthew Wilcox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • A couple of flush_dcache_page()s are missing on the I/O-error paths.

    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Monakhov Dmitriy
     
  • Since all callers dereference sb, and this function does so earlier too, we
    don't need the check.

    Signed-off-by: Eric Sesterhenn
    Acked-by: OGAWA Hirofumi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sesterhenn
     
  • If try_to_release_page() is called with a zero gfp mask, then the
    filesystem is effectively denied the possibility of sleeping while
    attempting to release the page. There doesn't appear to be any valid
    reason why this should be banned, given that we're not calling this from a
    memory allocation context.

    For this reason, change the gfp_mask argument of the call to GFP_KERNEL.

    Signed-off-by: Trond Myklebust
    Cc: Steve Dickson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Trond Myklebust
     
  • The pipe-a-coredump-to-a-program feature was undocumented.
    *Grumble*.

    NB: a good enhancement to that patch would be: save all the stuff that a
    core file can get from the %x expansions in the environment.

    Signed-off-by: Matthias Urlichs
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthias Urlichs
     
  • lib/bitmap.c:bitmap_parse() is a library function that received as input a
    user buffer. This seemed to have originated from the way the write_proc
    function of the /proc filesystem operates.

    This has been reworked to not use kmalloc and eliminates a lot of
    get_user() overhead by performing one access_ok before using __get_user().

    We need to test if we are in kernel or user space (is_user) and access the
    buffer differently. We cannot use __get_user() to access kernel addresses
    in all cases, for example in architectures with separate address space for
    kernel and user.

    This function will be useful for other uses as well; for example, taking
    input for /sysfs instead of /proc, so it was changed to accept kernel
    buffers. We have this use for the Linux UWB project, as part of the
    upcoming bandwidth allocator code.

    Only a few routines used this function and they were changed too.

    Signed-off-by: Reinette Chatre
    Signed-off-by: Inaky Perez-Gonzalez
    Cc: Paul Jackson
    Cc: Joe Korty
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Reinette Chatre
     
  • A couple of HDIO IOCTLs are not yet handled and a few others are marked
    as using a pointer rather than an unsigned long. The former include:

    HDIO_GET_WCACHE, HDIO_GET_ACOUSTIC, HDIO_GET_ADDRESS and
    HDIO_GET_BUSSTATE. The latter are: HDIO_SET_MULTCOUNT,
    HDIO_SET_UNMASKINTR, HDIO_SET_KEEPSETTINGS, HDIO_SET_32BIT,
    HDIO_SET_NOWERR, HDIO_SET_DMA, HDIO_SET_PIO_MODE and HDIO_SET_NICE.

    Additionally 0x330 used to be HDIO_GETGEO_BIG and may be issued by 32-bit
    `hdparm' run on a 64-bit kernel making Linux complain loudly.

    This is a fix for these issues.

    Signed-off-by: Maciej W. Rozycki
    Cc: Alan Cox
    Acked-by: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Maciej W. Rozycki
     
  • Signed-off-by: Jeff Garzik
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Garzik
     
  • A failure in invalidate_inode_pages2_range() can result in unpleasant things
    happening in NFS (at least). Stick a WARN_ON_ONCE() in there so we can find
    out if it happens, and maybe why.

    (akpm: might be a -mm-only patch, we'll see..)

    Cc: Chuck Lever
    Cc: Trond Myklebust
    Cc: Steve Dickson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • This likely profiling is pretty fun. I found a few possible problems
    in sched.c.

    The effect of this patch may not be measurable, but when I did measure
    long ago, nooping (un)likely cost a couple of % on scheduler heavy
    benchmarks, so it all adds up.

    Tweak some branch hints:

    - the 2nd 64 bits in the bitmask is likely to be populated, because it
    contains the first 28 bits (nearly 3/4) of the normal priorities.
    (ratio of 669669:691 ~= 1000:1).

    - it isn't unlikely that context switching switches to another process. it
    might be very rapidly switching to and from the idle process (ratio of
    475815:419004 and 471330:423544). Let the branch predictor decide.

    - preempt_enable seems to be very often called in a nested preempt_disable
    or with interrupts disabled (ratio of 3567760:87965 ~= 40:1)

    Signed-off-by: Nick Piggin
    Acked-by: Ingo Molnar
    Cc: Daniel Walker
    Cc: Hua Zhong
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
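
For readers unfamiliar with branch hints, here is a rough sketch of the first tweak. It assumes the GCC/Clang builtins __builtin_expect and __builtin_ctzll; the function name, the three-word layout, and the hint on the first word are assumptions made for illustration, not the scheduler's actual code.

```c
#include <stdint.h>

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Find the lowest set bit in a 3 x 64-bit priority bitmap.
 * The caller must guarantee that at least one bit is set.
 */
int find_first_prio_bit(const uint64_t b[3])
{
    if (unlikely(b[0]))            /* assumed rare in this sketch */
        return __builtin_ctzll(b[0]);
    if (likely(b[1]))              /* usually populated, per the profile */
        return __builtin_ctzll(b[1]) + 64;
    return __builtin_ctzll(b[2]) + 128;
}
```

Where the measured ratio is close to 1:1, as in the context-switch case, the hint is simply dropped so the hardware branch predictor decides.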
     
  • - handle sysfs error
    - handle driver model errors
    - de-obfuscate platform_device_register_simple() call, which included an
    assignment in between two function calls, in the same C statement.

    Signed-off-by: Jeff Garzik
    Acked-by: Kylene Hall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Garzik
     
  • Current error behaviour for ext2 and ext3 filesystems does not fully
    correspond to the documentation and should be fixed.

    According to man 8 mount, ext2 and ext3 file systems allow one of 3
    different on-error behaviours to be set:

    ---- start of quote man 8 mount ----

    errors=continue / errors=remount-ro / errors=panic

    Define the behaviour when an error is encountered. (Either ignore
    errors and just mark the file system erroneous and continue, or remount
    the file system read-only, or panic and halt the system.) The default is
    set in the filesystem superblock, and can be changed using tune2fs(8).

    ---- end of quote ----

    However, EXT3_ERRORS_CONTINUE is not read from the superblock, and thus
    ERRORS_CONT is not saved in sbi->s_mount_opt. This leads to incorrect
    handling of errors on ext3.

    Then we've checked corresponding code in ext2 and discovered that it is buggy
    as well:

    - EXT2_ERRORS_CONTINUE is not read from the superblock (the same);

    - parse_option() does not clean the alternative values and thus something
    like (ERRORS_CONT|ERRORS_RO) can be set;

    - if options are omitted, parse_option() does not set any of these options.

    Therefore it is possible to set any combination of these options on the ext2:

    - none of them may be set: EXT2_ERRORS_CONTINUE on superblock / empty mount
    options;

    - any of them may be set using mount options;

    - any 2 options may be set: by using EXT2_ERRORS_RO/EXT2_ERRORS_PANIC on the
    superblock and another value in mount options;

    - and finally all three options may be set by adding third option in remount.

    Currently ext2 uses these values only in ext2_error() and this does not lead
    to any noticeable trouble. However, somebody may be discouraged when they try
    to work around EXT2_ERRORS_PANIC on the superblock by using errors=continue in
    mount options.

    This patch:

    EXT2_ERRORS_CONTINUE should be read from the superblock as default value for
    error behaviour. parse_option() should clean the alternative options and
    should not change default value taken from the superblock.

    Signed-off-by: Vasily Averin
    Acked-by: Kirill Korotaev
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vasily Averin
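
The intended behaviour (superblock default first, then a mount option that replaces rather than accumulates) can be sketched as follows. This is an illustration under stated assumptions: the flag names, bit values, and both helper functions are hypothetical and do not reproduce the ext2 code.

```c
/* Hypothetical option bits for this sketch. */
enum {
    MOUNT_ERRORS_CONT  = 1 << 0,
    MOUNT_ERRORS_RO    = 1 << 1,
    MOUNT_ERRORS_PANIC = 1 << 2,
    MOUNT_ERRORS_MASK  = MOUNT_ERRORS_CONT | MOUNT_ERRORS_RO |
                         MOUNT_ERRORS_PANIC
};

/* Clear every alternative first, so exactly one policy is ever set. */
unsigned int set_errors_policy(unsigned int mount_opt, unsigned int policy)
{
    mount_opt &= ~(unsigned int)MOUNT_ERRORS_MASK;
    return mount_opt | policy;
}

/*
 * Start from the superblock default; override it only when the mount
 * options actually name a policy (0 means "not given").
 */
unsigned int resolve_errors_policy(unsigned int sb_default,
                                   unsigned int mount_policy)
{
    unsigned int opt = set_errors_policy(0, sb_default);

    if (mount_policy)
        opt = set_errors_policy(opt, mount_policy);
    return opt;
}
```

With this shape, errors=continue in the mount options cleanly overrides a PANIC default from the superblock, and omitting the option leaves the superblock default untouched.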
     
  • Current error behaviour for ext2 and ext3 filesystems does not fully
    correspond to the documentation and should be fixed.

    According to man 8 mount, ext2 and ext3 file systems allow one of 3
    different on-error behaviours to be set:

    ---- start of quote man 8 mount ----

    errors=continue / errors=remount-ro / errors=panic

    Define the behaviour when an error is encountered. (Either ignore
    errors and just mark the file system erroneous and continue, or remount
    the file system read-only, or panic and halt the system.) The default is
    set in the filesystem superblock, and can be changed using tune2fs(8).

    ---- end of quote ----

    However, EXT3_ERRORS_CONTINUE is not read from the superblock, and thus
    ERRORS_CONT is not saved in sbi->s_mount_opt. This leads to incorrect
    handling of errors on ext3.

    Then we've checked corresponding code in ext2 and discovered that it is buggy
    as well:

    - EXT2_ERRORS_CONTINUE is not read from the superblock (the same);

    - parse_option() does not clean the alternative values and thus something
    like (ERRORS_CONT|ERRORS_RO) can be set;

    - if options are omitted, parse_option() does not set any of these options.

    Therefore it is possible to set any combination of these options on the ext2:

    - none of them may be set: EXT2_ERRORS_CONTINUE on superblock / empty mount
    options;

    - any of them may be set using mount options;

    - any 2 options may be set: by using EXT2_ERRORS_RO/EXT2_ERRORS_PANIC on the
    superblock and another value in mount options;

    - and finally all three options may be set by adding third option in remount.

    Currently ext2 uses these values only in ext2_error() and this does not lead
    to any noticeable trouble. However, somebody may be discouraged when they try
    to work around EXT2_ERRORS_PANIC on the superblock by using errors=continue in
    mount options.

    This patch:

    EXT3_ERRORS_CONTINUE should be taken from the superblock as default value for
    error behaviour.

    Signed-off-by: Dmitry Mishin
    Acked-by: Vasily Averin
    Acked-by: Kirill Korotaev
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Mishin
     
  • Module taint flags listing in Oops/panic has a couple of issues:

    * taint_flags() doesn't null-terminate the buffer after printing the flags

    * per-module taints are only set if the kernel is not already tainted
    (with that particular flag) => only the first offending module gets its
    taint info correctly updated

    Some additional changes:

    * 'license_gplok' is no longer needed - equivalent to !(taints &
    TAINT_PROPRIETARY_MODULE) - so we can drop it from struct module

    * exporting module taint info via /proc/modules:

    pwc 88576 0 - Live 0xf8c32000
    evilmod 6784 1 pwc, Live 0xf8bbf000 (PF)

    Signed-off-by: Florin Malita
    Cc: "Randy.Dunlap"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Florin Malita
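
The first issue (a missing terminator) is easy to picture with a small sketch. The bit values and the function name are hypothetical, and TAINT_FORCED_MODULE is assumed here as the flag behind the 'F' in the sample output; this is not the kernel's taint_flags().

```c
/* Hypothetical bit values for this sketch. */
enum {
    TAINT_PROPRIETARY_MODULE = 1 << 0,   /* printed as 'P' */
    TAINT_FORCED_MODULE      = 1 << 1    /* printed as 'F' (assumed) */
};

/*
 * Format the taint bits as "(PF)"-style text into buf, which must hold
 * at least 5 bytes. The buffer is always NUL-terminated, including
 * when no flag is set.
 */
char *taint_flags_sketch(unsigned int taints, char *buf)
{
    int bx = 0;

    if (taints) {
        buf[bx++] = '(';
        if (taints & TAINT_PROPRIETARY_MODULE)
            buf[bx++] = 'P';
        if (taints & TAINT_FORCED_MODULE)
            buf[bx++] = 'F';
        buf[bx++] = ')';
    }
    buf[bx] = '\0';   /* the step the original code was missing */
    return buf;
}
```

Without the final assignment, whatever bytes were already in the caller's buffer would be printed after the flags.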
     
  • Some people find their Jmicron pata port reports it's disabled even
    though it has devices on it and was boot probed. Fix this.

    (Candidate for 2.6.18.*, less so for 2.6.19 as we've got a proper
    jmicron driver on the merge for that to replace ide-generic support)

    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     
  • If grow_buffers() is for some reason passed a block number which wants to lie
    outside the maximum-addressable pagecache range (PAGE_SIZE * 4G bytes) then it
    will accidentally truncate `index' and will then instantiate a page at the
    wrong pagecache offset. This causes __getblk_slow() to go into an infinite
    loop.

    This can happen with corrupted disks, or with software errors elsewhere.

    Detect that, and handle it.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
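
A sketch of how such a truncation can be detected, assuming a 64-bit block number and a 32-bit page index (as on a 32-bit kernel). The function name and the fixed-width types are assumptions for illustration, not the code in grow_buffers().

```c
#include <stdint.h>

/*
 * Convert a block number to a page-cache index.
 * Returns 0 and stores the index on success, or -1 when the index
 * would not fit in the narrower index type.
 */
int block_to_page_index(uint64_t block, unsigned int sizebits,
                        uint32_t *index_out)
{
    uint64_t wide = block >> sizebits;
    uint32_t index = (uint32_t)wide;      /* narrowing may truncate */

    if ((uint64_t)index != wide)
        return -1;                        /* out of addressable range */
    *index_out = index;
    return 0;
}
```

Comparing the narrowed value against the full-width one catches the overflow before a page is instantiated at the wrong offset.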
     
  • This is a follow-up patch based on the review for perfmon2. This patch
    adds the carta_random32() library routine + carta_random32.h header file.

    This is a fast, simple, and efficient pseudo-random number generator. We
    use it in perfmon2 to randomize the sampling periods. In this context, we
    do not need any fancy randomizer.

    Signed-off-by: stephane eranian
    Cc: David Mosberger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephane Eranian
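
For context, a sketch of Carta's well-known formulation of the Park-Miller "minimal standard" generator (multiplier 16807, modulus 2^31 - 1), which avoids a division. It is assumed, not confirmed by the text above, that the routine follows this formulation; the function name is hypothetical.

```c
#include <stdint.h>

/* One step of the generator; seed should be in [1, 2^31 - 2]. */
uint64_t carta_random32_sketch(uint64_t seed)
{
    const uint64_t a = 16807;
    const uint64_t m = (uint64_t)1 << 31;
    uint64_t prod = a * seed;

    uint64_t p = (prod >> 31) & (m - 1);  /* high part of the product */
    uint64_t q = prod & (m - 1);          /* low 31 bits */
    uint64_t s = p + q;

    if (s >= m)
        s = s - m + 1;                    /* fold back modulo 2^31 - 1 */
    return s;
}
```

Starting from a seed of 1, successive calls yield 16807, 282475249, 1622650073, the usual opening of the minimal standard sequence.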
     
  • Implement the epoll_pwait system call, which extends the event wait mechanism
    with the same logic that ppoll and pselect use. The definition of epoll_pwait
    is:

    int epoll_pwait(int epfd, struct epoll_event *events, int maxevents,
    int timeout, const sigset_t *sigmask, size_t sigsetsize);

    The difference between the vanilla epoll_wait and epoll_pwait is that the
    latter allows the caller to specify a signal mask to be set while waiting
    for events. Hence epoll_pwait will wait until either a monitored event
    or an unmasked signal happens. If sigmask is NULL, the epoll_pwait system
    call will act exactly like epoll_wait. For the POSIX definition of
    pselect, information is available here:

    http://www.opengroup.org/onlinepubs/009695399/functions/select.html

    Signed-off-by: Davide Libenzi
    Cc: David Woodhouse
    Cc: Andi Kleen
    Cc: Michael Kerrisk
    Cc: Ulrich Drepper
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Davide Libenzi
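
A short user-space sketch of how the call might be used, assuming a Linux system whose C library provides an epoll_pwait() wrapper. The glibc wrapper takes five arguments and supplies sigsetsize itself, unlike the raw system call shown above. The helper name is hypothetical.

```c
#include <signal.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for fd to become readable, with SIGINT blocked
 * only for the duration of the wait. Returns the number of ready
 * events (0 on timeout) or -1 on error.
 */
int wait_readable_sigint_blocked(int fd, int timeout_ms)
{
    struct epoll_event ev = { .events = EPOLLIN };
    struct epoll_event ready;
    sigset_t mask;
    int epfd, n;

    epfd = epoll_create(1);
    if (epfd < 0)
        return -1;

    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(epfd);
        return -1;
    }

    sigemptyset(&mask);
    sigaddset(&mask, SIGINT);

    /* The mask is installed atomically with the wait, as in pselect. */
    n = epoll_pwait(epfd, &ready, 1, timeout_ms, &mask);

    close(epfd);
    return n;
}
```

Passing NULL instead of &mask would make the call behave like plain epoll_wait.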
     
  • In order to encourage people to notice when they break the exported
    headers, add a config option which automatically runs the sanity checks
    when building vmlinux. That way, those who use allyesconfig will notice
    failures.

    Signed-off-by: David Woodhouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Woodhouse
     
  • Now that various memory splits are enabled, add a config option allowing
    users to compile UML for their needs - HOST_2G_2G allowed choosing either
    3G/1G or 2G/2G, and enabling it reduced the usable virtual memory.

    Detecting this at run time should be implemented in the future, but we must
    make the stop-gap measure work well enough (this is valid in _many_ cases).

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • Deprecate TT mode in Kconfig so that users won't select it, update the
    MODE_SKAS description (it was largely obsolete and misleading) and btw describe
    advantages for high memory usage with CONFIG_STATIC_LINK.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • The export is together with the definition, in arch/x86_64/lib/csum-partial.c,
    which is compiled in by arch/um/sys-x86_64/Makefile.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • Unify macros common to x86 and x86_64 kernel-offsets.h files.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • Enable compilation of x86_64 crypto code, and add the needed constant to make
    the code compile again (that macro was added to i386 asm-offsets between
    2.6.17 and 2.6.18, in 6c2bb98bc33ae33c7a33a133a4cd5a06395fece5).

    Cc: Herbert Xu
    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • Declare UML partial support for LOCKDEP - however IRQFLAGS tracing requires
    some coding which nobody did yet, so we cannot run full lockdep on UML. Grep
    for CONFIG_TRACE_IRQFLAGS on i386 code to find their implementation.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • On a 64-bit UML, if run under "setarch i386" (which a user did), uname()
    currently returns the obtained i686 as machine - fix that. Btw, I'm quite
    surprised that under setarch i386 a 64-bit binary can run.

    Cc: Andi Kleen
    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • Makes UML compile on any possible processor choice. The two problems were:

    *) x86 code, when 386 is selected, checks at runtime boot_cpuflags, which we do
    not have.

    *) 3Dnow support for memcpy() et al. does not compile currently and fixing this
    is not trivial, so simply disable it; with this change, if one selects MK7
    UML compiles (while it did not).

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • If enable is moved by GCC into a register its value may not be preserved after
    coming back there with longjmp(). So, mark it as volatile to prevent this;
    this is suggested (it seems) in info gcc, when it talks about -Wuninitialized.
    I re-read this and it seems to say something different, but I still believe
    this may be needed.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
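
The rule behind this change comes from the C standard: a non-volatile automatic variable modified between setjmp() and longjmp() has an indeterminate value after the jump. A minimal sketch, with a hypothetical function name and no relation to the UML code beyond the variable name:

```c
#include <setjmp.h>

/*
 * Modify a local after setjmp(), jump back, and read it again.
 * Because `enable` is volatile, the value written before longjmp()
 * is guaranteed to be the one observed afterwards.
 */
int enable_survives_longjmp(void)
{
    jmp_buf env;
    volatile int enable = 0;

    if (setjmp(env) == 0) {
        enable = 1;          /* changed between setjmp and longjmp */
        longjmp(env, 1);     /* control returns to the setjmp above */
    }
    return enable;
}
```

Without the volatile qualifier the compiler is free to keep the variable in a register that longjmp() restores to its old contents.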
     
  • Make TT mode compile after the introduction of klibc's implementation of
    setjmp.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • This was forgotten in a previous patch, so UML does not compile with TT mode
    enabled.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • Correct commit 5906e4171ad61ce68de95e51b773146707671f80 - this makes more
    sense: we turn pte_mkexec + pte_wrprotect to pte_mkread. However, due to a
    bug in pte_mkread, it does the exact same thing as pte_mkwrite, so this patch
    improves the code but does not change anything in practice. The pte_mkread
    bug is fixed separately, as it may have big impact.

    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • Andi Kleen pointed out that -mcmodel=kernel does not make sense for userspace
    code and would stop everything from working, and pointed out the correct fix
    for the original bug (not easy to do for me).

    Reverts part of commit 06837504de7b4883e92af207dbbab4310d0db0ed.

    Cc: Andi Kleen
    Signed-off-by: Paolo 'Blaisorblade' Giarrusso
    Cc: Jeff Dike
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paolo 'Blaisorblade' Giarrusso
     
  • Move the lock debug checks below the page reserved checks. Also, having
    debug_check_no_locks_freed in kernel_map_pages is wrong.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin