30 Sep, 2006

40 commits

  • This is an updated version of Eric Biederman's is_init() patch.
    (http://lkml.org/lkml/2006/2/6/280). It applies cleanly to 2.6.18-rc3 and
    replaces a few more instances of ->pid == 1 with is_init().

    Further, is_init() checks pid and thus removes dependency on Eric's other
    patches for now.

    Eric's original description:

    There are a lot of places in the kernel where we test for init
    because we give it special properties. Most significantly init
    must not die. This results in code all over the kernel testing
    ->pid == 1.

    Introduce is_init to capture this case.

    With multiple pid spaces for all of the cases affected we are
    looking for only the first process on the system, not some other
    process that has pid == 1.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Sukadev Bhattiprolu
    Cc: Dave Hansen
    Cc: Serge Hallyn
    Cc: Cedric Le Goater
    Acked-by: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sukadev Bhattiprolu
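
The idea of the helper can be sketched in userspace as follows. This is only an illustration: `struct task` and `may_kill()` are hypothetical stand-ins, not the kernel's `struct task_struct` or any real kernel function, and the pid test mirrors the "checks pid" wording of the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's task structure. */
struct task {
    int pid;
};

/* Returns nonzero if tsk is the first process on the system (pid 1). */
int is_init(const struct task *tsk)
{
    assert(tsk != NULL);
    return tsk->pid == 1;
}

/* Hypothetical caller: init must not die, so refuse to kill it. */
int may_kill(const struct task *tsk)
{
    return !is_init(tsk);
}
```

Callers then write `is_init(tsk)` instead of an open-coded `tsk->pid == 1`, so the definition of "init" can later change in one place.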
     
  • This appears to be the only usage of is_init in the kernel besides the
    usage in sched.h. On ia64 the same function is called in_init. So to
    remove the conflict and make the kernel more consistent, rename is_init,
    is_core, is_local and is_local_section to in_init, in_core, in_local and
    in_local_section respectively.

    Thanks to Adrian Bunk who spotted this, and to Matthew Wilcox
    who suggested this fix.

    Signed-off-by: Eric Biederman
    Cc: Kyle McMartin
    Cc: Matthew Wilcox
    Cc: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Biederman
     
  • Fix a race on put_files_struct() between exec and proc. Restoring files on
    current in the error path may leave proc with a pointer to an already
    kfree-d files_struct.

    The ->files changes in exit.c and kthread.c are safe, as exit_files() does
    everything under the lock.

    Found during OpenVZ stress testing.

    [akpm@osdl.org: add export]
    Signed-off-by: Pavel Emelianov
    Signed-off-by: Kirill Korotaev
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill Korotaev
     
  • Convert i386 apm.c from kernel_thread(), whose export is deprecated, to
    kthread API.

    Signed-off-by: Serge E. Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn
     
  • The current kernel serializes console resizes but does not serialize the
    resize against the tty structure updates. This means that while two
    parallel resizes cannot mess up the console you can get incorrect results
    reported.

    Secondly while doing this I added vc_lock_resize() to lock and resize the
    console. This leaves all knowledge of the console_sem in the vt/console
    driver and kicks it out of the tty layer, which is good.

    Thirdly while doing this I decided I couldn't stand "disallocate" any
    longer so I switched it to "deallocate".

    Signed-off-by: Alan Cox
    Cc: Paul Fulghum
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     
  • FAT is commonly used on removable media. Mounting with -o flush tells the
    FS to write things to disk as quickly as possible. It is like -o sync, but
    much faster (and not as safe).

    Signed-off-by: Chris Mason
    Cc: OGAWA Hirofumi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Mason
     
  • Need to enable/disable all the counters instead of just counter 0.

    This affects all cpus with family=6, including i386/core. Usual symptom:
    only counter 0 provides samples. Other counters don't produce samples.

    Signed-off-by: Arun Sharma
    Cc: Philippe Elie
    Cc: John Levon
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     
  • [assuming BSD security levels are deleted]
    The only user of i_security, f_security, s_security fields is SELinux,
    however, quite a few security modules are trying to get into kernel.
    So, wrap them under CONFIG_SECURITY. Adding config option for each
    security field is likely an overkill.

    Following Stephen Smalley's suggestion, i_security initialization is
    moved to security_inode_alloc() to not clutter core code with ifdefs
    and make alloc_inode() codepath tiny little bit smaller and faster.

    The user of (highly greppable) struct fown_struct::security field is
    still to be found. I've checked every "fown_struct" and every "f_owner"
    occurrence. Additionally, its removal doesn't break the i386 allmodconfig
    build.

    struct inode, struct file, struct super_block, struct fown_struct
    become smaller.

    P.S. Combined with two reiserfs inode shrinking patches sent to
    linux-fsdevel, I can finally suck 12 reiserfs inodes into one page.

    /proc/slabinfo

    -ext2_inode_cache 388 10
    +ext2_inode_cache 384 10
    -inode_cache 280 14
    +inode_cache 276 14
    -proc_inode_cache 296 13
    +proc_inode_cache 292 13
    -reiser_inode_cache 336 11
    +reiser_inode_cache 332 12
    Cc: Stephen Smalley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Shrink reiserfs inode more (by 8 bytes) for ACL non-users:

    -reiser_inode_cache 344 11
    +reiser_inode_cache 336 11

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Shrink reiserfs inode by 12 bytes for xattr non-users (me).

    -reiser_inode_cache 356 11
    +reiser_inode_cache 344 11

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Fix "variable defined but not used" compiler warning in unwind.c when
    CONFIG_MODULES is not set.

    Signed-off-by: Chuck Ebbert
    Cc: Jan Beulich
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chuck Ebbert
     
  • All suppliers of ->quota_read, ->quota_write (I've found ext2, ext3, UFS,
    reiserfs) already have them properly ifdeffed. All callers of
    ->quota_read, ->quota_write are under CONFIG_QUOTA umbrella, so...

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • ReiserFS does periodic cleanup of old transactions in order to limit the
    length of time a journal replay may take after a crash. Sometimes, writing
    metadata from an old (already committed) transaction may require committing
    a newer transaction, which also requires writing all data=ordered buffers.
    This can cause very long stalls on journal_begin.

    This patch makes sure new transactions will not need to be committed before
    trying a periodic reclaim of an old transaction. It is low risk because if
    a bad decision is made, it just means a slightly longer journal replay
    after a crash.

    Signed-off-by: Chris Mason
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Mason
     
  • make sure that reiserfs_fsync only triggers barriers when mounted with -o
    barrier=flush

    Signed-off-by: Chris Mason
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Mason
     
  • Fix utf-8 mode so alternate charset modes always work according to control
    sequences interpreted in do_con_trol function preserving backward US-ASCII
    and VT100 semigraphics compatibility.

    Malformed utf-8 sequences are represented as sequences of replacement
    glyphs, original codes or '?' as a last resort.

    unicode-xterm, gnome-terminal, kconsole and other terminal emulators in
    utf-8 mode respect acsc, enacs, rmacs sequences. Also I found that some
    important system programs (from the Debian distro) use acsc in utf-8 mode:
    dselect, aptitude and w3m, for example.

    Signed-off-by: Adam Tlalka
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adam Tlalka
     
  • ucb1x00-ts: handle errors from input_register_device()

    Signed-off-by: Dmitry Torokhov
    Cc: Russell King
    Cc: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Torokhov
     
  • In cases where we detect a single bit has been flipped, we spew the usual
    slab corruption message, which users instantly think is a kernel bug. In a
    lot of cases, single bit errors are down to bad memory, or other hardware
    failure.

    This patch adds an extra line to the slab debug messages in those cases, in
    the hope that users will try memtest before they report a bug.

    000: 6b 6b 6b 6b 6a 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
    Single bit error detected. Possibly bad RAM. Run memtest86.

    [akpm@osdl.org: cleanups]
    Signed-off-by: Dave Jones
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Jones
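
One way to detect the situation shown in the dump above can be sketched like this. It is a userspace illustration, not the patch itself: the function name is hypothetical, and the poison value 0x6b is taken from the sample line in the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define POISON_FREE 0x6b /* poison byte seen in the dump above */

/*
 * Returns 1 if exactly one byte in data[0..len) differs from the poison
 * value and that byte differs in exactly one bit; otherwise returns 0.
 */
int is_single_bit_error(const unsigned char *data, size_t len)
{
    size_t i, bad_count = 0;
    unsigned char diff = 0;

    assert(data != NULL);
    for (i = 0; i < len; i++) {
        if (data[i] != POISON_FREE) {
            diff = data[i] ^ POISON_FREE;
            bad_count++;
        }
    }
    if (bad_count != 1)
        return 0;
    /* A nonzero value with a single bit set shares no bits with value - 1. */
    return (diff & (diff - 1)) == 0;
}
```

For the sample line, the only mismatching byte is 0x6a, and 0x6a XOR 0x6b is 0x01, so the check reports a single flipped bit.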
     
  • Just comment and next "while" look _very_ wrong. Place { correctly to hint
    unsuspecting ones that it's the end of the loop actually.

    Signed-off-by: Alexey Dobriyan
    Cc: Dave Jones
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • This code has suffered from broken core design and lack of developer
    attention. Broken security modules are too dangerous to leave around. It
    is time to remove this one.

    Signed-off-by: Chris Wright
    Acked-by: Michael Halcrow
    Acked-by: Serge Hallyn
    Cc: Davi Arnaut
    Acked-by: Greg Kroah-Hartman
    Acked-by: James Morris
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Wright
     
  • Now that the generic DMA code has a function to decide if a given DMA
    mapping is valid, use it. This will catch cases where direction is not any
    of the defined enum values but some random number outside the valid range.
    The current implementation will only catch the defined but invalid case
    DMA_NONE.

    Signed-off-by: Rolf Eike Beer
    Acked-by: Muli Ben-Yehuda
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rolf Eike Beer
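
The difference between the two checks can be sketched in userspace as follows. The enum values and the name valid_dma_direction() are assumed to match the kernel's definitions, old_style_check() and require_valid_direction() are hypothetical helpers for illustration, and assert() stands in for the kernel's BUG_ON().

```c
#include <assert.h>

/* Assumed to mirror the kernel's enum dma_data_direction. */
enum dma_data_direction {
    DMA_BIDIRECTIONAL = 0,
    DMA_TO_DEVICE = 1,
    DMA_FROM_DEVICE = 2,
    DMA_NONE = 3
};

/* Accepts only the three directions a mapping may legitimately use. */
int valid_dma_direction(int dma_direction)
{
    return dma_direction == DMA_BIDIRECTIONAL ||
           dma_direction == DMA_TO_DEVICE ||
           dma_direction == DMA_FROM_DEVICE;
}

/* Older style of check: rejects only the defined-but-invalid DMA_NONE. */
int old_style_check(int dma_direction)
{
    return dma_direction != DMA_NONE;
}

/* Hypothetical caller; assert() plays the role of BUG_ON() here. */
void require_valid_direction(int dma_direction)
{
    assert(valid_dma_direction(dma_direction));
}
```

A stray value such as 42 passes old_style_check() but is rejected by valid_dma_direction().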
     
  • As suggested by Muli Ben-Yehuda this function is moved to generic code as
    it may be useful for all archs.

    [akpm@osdl.org: fix]
    Signed-off-by: Rolf Eike Beer
    Cc: Muli Ben-Yehuda
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rolf Eike Beer
     
  • Jon Smirl noted a couple of tty driver functions now are quite misleadingly
    named with the death of devfs. A quick grep found another case in the lp
    driver.

    Signed-off-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     
  • Some of the kerneldoc comments in this file are ignored since the lead-in
    is malformed, using either "/*" or "/***" instead of "/**".

    [rdunlap@xenotime.net: kerneldoc fixes]
    Signed-off-by: Rolf Eike Beer
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rolf Eike Beer
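
For illustration, here is what a well-formed lead-in looks like next to a comment that the kernel-doc tooling would skip. Both functions are hypothetical examples and are not taken from the file touched by this commit.

```c
#include <assert.h>

/**
 * clamp_int - limit a value to an inclusive range
 * @val: value to clamp
 * @lo: lowest allowed result
 * @hi: highest allowed result
 *
 * Returns @val limited to the range [@lo, @hi]. The caller must
 * ensure that @lo does not exceed @hi.
 */
int clamp_int(int val, int lo, int hi)
{
    assert(lo <= hi);
    if (val < lo)
        return lo;
    if (val > hi)
        return hi;
    return val;
}

/*
 * ignored_example - an ordinary comment
 *
 * This block opens with a single asterisk, so it is treated as a
 * plain comment rather than as kernel-doc documentation.
 */
int ignored_example(void)
{
    return 0;
}
```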
     
  • I was looking for a way around an OOM-problem, and found a couple of
    undocumented new features for tuning the OOM-score of individual processes.
    Here's a small documentation patch for /proc/<pid>/oom_adj and
    /proc/<pid>/oom_score.

    Signed-off-by: Jan-Frode Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan-Frode Myklebust
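
A minimal userspace sketch of reading these per-process files might look like the following. The helper names are hypothetical, and whether a given file exists depends on the running kernel, so the reader simply reports failure instead of assuming the file is present.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "/proc/<pid>/<name>" in buf; returns 0 on success, -1 if it does not fit. */
int proc_pid_path(char *buf, size_t len, int pid, const char *name)
{
    int n;

    assert(buf != NULL && name != NULL);
    if (strlen(name) == 0)
        return -1;
    n = snprintf(buf, len, "/proc/%d/%s", pid, name);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read one integer from /proc/<pid>/<name>; returns 0 on success, -1 on failure. */
int read_proc_pid_long(int pid, const char *name, long *out)
{
    char path[64];
    FILE *f;
    int ok;

    assert(out != NULL);
    if (proc_pid_path(path, sizeof(path), pid, name) != 0)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

A caller could pass "oom_score" or "oom_adj" as the name and must check the return value before using the result.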
     
  • Oleg brought up some interesting points about grabbing the pi_lock for some
    protections. In this discussion, I realized that there are some places
    that the pi_lock is being grabbed when it really wasn't necessary. Also
    this patch does a little bit of clean up.

    This patch basically does three things:

    1) renames the "boost" variable to "chain_walk", since it is also used in
    the debugging case when the task isn't going to be boosted. The new name
    better describes what the test is going to do if it succeeds.

    2) moves get_task_struct to just before the unlocking of the wait_lock.
    This removes duplicate code, and makes it a little easier to read. The
    owner won't go away while either the pi_lock or the wait_lock is held.

    3) removes the pi_locking and owner blocked checking completely from the
    debugging case. This is because grabbing the lock, doing the check, and
    then releasing the lock is just so full of races. It's just as
    good to go ahead and call the pi_chain_walk function, since after
    releasing the lock the owner can then block anyway, and we would have
    missed that. For the debug case, we really do want to do the chain walk
    to test for deadlocks anyway.

    [oleg@tv-sign.ru: more of the same]
    Signed-off-by: Steven Rostedt
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Oleg Nesterov
    Cc: Esben Nielsen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Steven Rostedt
     
  • Previously, since determination whether there was an Intel random number
    generator was based on a single bit, on systems with a matching bridge
    device but without a firmware hub, there was a 50% chance that the code
    would incorrectly decide that the system had an RNG. This patch adds
    detection of the firmware hub to better qualify the existence of an RNG.

    There is one issue with the patch: I was unable to determine the LPC
    equivalent for the PCI bridge 8086:2430. (Since the old code didn't care
    which of the many devices provided by the ICH/ESB it used, it chose the
    PCI bridge device, but the FWH settings live in the LPC device, so the
    device list needed to be changed.)

    Signed-off-by: Jan Beulich
    Signed-off-by: Michael Buesch
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Beulich
     
  • There's a bug where a UDF_PART_FLAG_READ_ONLY udf partition gets mounted
    read-write, then subsequent problems happen; files seem to be able to be
    removed, but file creation results in EIO or worse, oops.

    EIO is coming from udf_new_block(), which returns EIO if the right flags
    aren't set; only UDF_PART_FLAG_READ_ONLY is set in this case. We probably
    should not have gotten this far...

    Attached patch seems to fix it - and includes a printk to alert the user
    that their "rw" mount request has been converted to "ro."

    Here's the testcase I used:

    [root@magnesium ~]# mkisofs -R -J -udf -o testiso /tmp/
    ...
    Total translation table size: 0
    Total rockridge attributes bytes: 342923
    Total directory bytes: 382312
    Path table size(bytes): 104
    Max brk space used 103000
    105059 extents written (205 MB)

    [root@magnesium ~]# mount -o loop testiso /mnt/test/
    [root@magnesium ~]# ls /mnt/test/fsfile
    /mnt/test/fsfile
    [root@magnesium ~]# rm /mnt/test/fsfile
    [root@magnesium ~]# ls /mnt/test/fsfile
    ls: /mnt/test/fsfile: No such file or directory
    [root@magnesium ~]# touch /mnt/test/fsfile
    touch: cannot touch `/mnt/test/fsfile': Input/output error
    [root@magnesium tmp]# grep udf /proc/mounts
    /dev/loop1 /mnt/test udf rw 0 0

    Force readonly mounts of UDF partitions marked as read-only.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Sandeen
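
The forced read-only behaviour described above can be sketched as follows. This is a userspace illustration only: the flag values, the function name and the message text are hypothetical and are not copied from the kernel sources.

```c
#include <assert.h>
#include <stdio.h>

/* Illustrative flag values, not taken from the kernel headers. */
#define PART_FLAG_READ_ONLY 0x10u
#define MOUNT_RDONLY        0x01u

/*
 * Returns the mount flags to use. If the partition is marked read-only
 * but a read-write mount was requested, force read-only and tell the user.
 */
unsigned int fixup_mount_flags(unsigned int part_flags, unsigned int mount_flags)
{
    if ((part_flags & PART_FLAG_READ_ONLY) && !(mount_flags & MOUNT_RDONLY)) {
        fprintf(stderr, "partition marked read-only; forcing read-only mount\n");
        mount_flags |= MOUNT_RDONLY;
    }
    /* Postcondition: a read-only partition is never mounted read-write. */
    assert(!(part_flags & PART_FLAG_READ_ONLY) || (mount_flags & MOUNT_RDONLY));
    return mount_flags;
}
```

Making the decision at mount time means later operations such as file creation never reach the block allocator on a partition that cannot be written.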
     
  • Signed-off-by: Alexey Dobriyan
    Acked-by: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • ipc/sem.c only.

    $ agrep sysvsem -w -n
    ipc/sem.c:912: undo_list = current->sysvsem.undo_list;
    ipc/sem.c:932: undo_list = current->sysvsem.undo_list;
    ipc/sem.c:954: undo_list = current->sysvsem.undo_list;
    ipc/sem.c:963: current->sysvsem.undo_list = undo_list;
    ipc/sem.c:1247: tsk->sysvsem.undo_list = undo_list;
    ipc/sem.c:1249: tsk->sysvsem.undo_list = NULL;
    ipc/sem.c:1271: undo_list = tsk->sysvsem.undo_list;
    include/linux/sched.h:876: struct sysv_sem sysvsem;

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Neither the on-disk data structures nor the filesystem layout of AIX are
    known. There is an msdos partition signature at the end of the first
    block, and the kernel recognizes 3 small (and overlapping) partitions, but
    they are not usable. Maybe the firmware uses it to find the bootloader for
    AIX, but AIX also boots if the first block is cleared.

    This is the content of the partition table:
    # dd if=/dev/sdb count=$(( 4 * 16 )) bs=1 skip=$(( 0x1be )) | xxd
    0000000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
    0000010: 80ff ffff 41ff ffff 1b11 0000 381b 0000 ....A.......8...
    0000020: 00ff ffff 41ff ffff 0211 0000 1900 0000 ....A...........
    0000030: 80ff ffff 41ff ffff 1b11 0000 381b 0000 ....A.......8...

    Handle the whole disk as an empty disk.

    This also fixes YaST, which compares the output from parted (and formerly
    fdisk) with /proc/partitions. fdisk has recognized the AIX label for a
    long time; SuSE has a patch for parted to handle the disk label as unknown.

    dmesg will look like this:
    sda: [AIX] unknown partition table

    Tested on an IBM B50 with AIX V4.3.3.

    Signed-off-by: Olaf Hering
    Cc: Albert Cahalan
    Cc: OGAWA Hirofumi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Olaf Hering
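
To see why the entries in the dump above are unusable, they can be decoded with a small userspace sketch. It assumes the standard 16-byte msdos partition entry layout (status at byte 0, type at byte 4, little-endian start LBA at bytes 8-11, sector count at bytes 12-15); the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mbr_entry {
    uint8_t  boot_flag;
    uint8_t  type;
    uint32_t start_lba;
    uint32_t num_sectors;
};

/* Decode a 32-bit little-endian value. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one 16-byte partition table entry. */
void parse_mbr_entry(const uint8_t *raw, struct mbr_entry *e)
{
    assert(raw != NULL && e != NULL);
    e->boot_flag = raw[0];
    e->type = raw[4];
    e->start_lba = read_le32(raw + 8);
    e->num_sectors = read_le32(raw + 12);
}

/* Returns 1 if two non-empty entries share at least one sector. */
int entries_overlap(const struct mbr_entry *a, const struct mbr_entry *b)
{
    uint64_t a_end = (uint64_t)a->start_lba + a->num_sectors;
    uint64_t b_end = (uint64_t)b->start_lba + b->num_sectors;

    if (a->num_sectors == 0 || b->num_sectors == 0)
        return 0;
    return a->start_lba < b_end && b->start_lba < a_end;
}
```

Decoded this way, the second and fourth entries in the dump are byte-for-byte identical (start 4379, 6968 sectors), so they overlap completely, while the third (start 4354, 25 sectors) ends exactly where they begin.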
     
  • Unused: tty_struct.max_flip_cnt

    $ git grep max_flip_cnt
    include/linux/tty.h: int max_flip_cnt;
    $

    Cc: Paul Fulghum
    Acked-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthias Urlichs
     
  • This teaches dmi_decode() how to decode and save OEM Strings (type 11) DMI
    information, which is currently discarded silently. Existing code using
    DMI is not affected. Follows the "System Management BIOS (SMBIOS)
    Specification" (http://www.dmtf.org/standards/smbios), and also the
    userspace dmidecode.c code.

    OEM Strings are the only safe way to identify some hardware, e.g., the
    ThinkPad embedded controller used by the soon-to-be-submitted tp_smapi
    driver. This will also let us eliminate the long whitelist in the mainline
    hdaps driver (in a future patch).

    Signed-off-by: Shem Multinymous
    Cc: Bjorn Helgaas
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shem Multinymous
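
The decoding step can be sketched in userspace like this. It assumes the SMBIOS structure layout (type at byte 0, formatted-area length at byte 1, string count at offset 4 for type 11, NUL-terminated strings after the formatted area, an empty string ending the set) and a well-formed buffer with no bounds checking; the function names are hypothetical and this is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DMI_TYPE_OEM_STRINGS 11

/*
 * Return the index-th (1-based) string that follows the formatted area
 * of the SMBIOS structure at s, or NULL if there is no such string.
 */
const char *dmi_nth_string(const uint8_t *s, unsigned int index)
{
    const char *p;

    assert(s != NULL);
    if (index == 0)
        return NULL;
    p = (const char *)s + s[1]; /* s[1] is the formatted-area length */
    while (*p != '\0') {
        if (--index == 0)
            return p;
        p += strlen(p) + 1; /* skip this string and its terminator */
    }
    return NULL;
}

/* Number of OEM strings declared by a type 11 structure. */
unsigned int dmi_oem_string_count(const uint8_t *s)
{
    assert(s != NULL && s[0] == DMI_TYPE_OEM_STRINGS);
    return s[4];
}
```

A driver-style caller would loop from 1 to dmi_oem_string_count() and compare each string against the identifier it is looking for.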
     
  • lock_timer_base acquires a lock and returns with that lock held. Add a
    lock annotation to this function so that sparse can check callers for lock
    pairing, and so that sparse will not complain about this function since it
    intentionally uses the lock in this manner.

    Signed-off-by: Josh Triplett
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josh Triplett
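
The annotation style can be sketched outside the kernel as follows. The macro definitions are assumed to follow the kernel's pattern of expanding to a context attribute only when sparse runs (when __CHECKER__ is defined); the toy lock, which uses a flag instead of a real spinlock, and its function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Without sparse (__CHECKER__ undefined) the annotations expand to nothing. */
#ifdef __CHECKER__
# define __acquires(x) __attribute__((context(x, 0, 1)))
# define __releases(x) __attribute__((context(x, 1, 0)))
#else
# define __acquires(x)
# define __releases(x)
#endif

/* Toy lock: a flag stands in for a real spinlock. */
struct toy_base {
    int locked;
};

struct toy_timer {
    struct toy_base *base;
};

/* Returns with timer->base locked; the caller is expected to unlock it. */
struct toy_base *lock_toy_base(struct toy_timer *timer)
    __acquires(timer->base)
{
    assert(timer != NULL && timer->base != NULL);
    assert(!timer->base->locked);
    timer->base->locked = 1;
    return timer->base;
}

/* Called with base locked; releases it. */
void unlock_toy_base(struct toy_base *base)
    __releases(base)
{
    assert(base != NULL && base->locked);
    base->locked = 0;
}
```

The annotations document the intentional imbalance inside each function, so a checker can focus on whether callers pair the two calls.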
     
  • In the "operation does permission checking" model used by fuse, chdir
    permission is not checked, since there's no chdir method.

    For this case set a lookup flag, which will be passed to ->permission(), so
    fuse can distinguish it from permission checks for other operations.

    Signed-off-by: Miklos Szeredi
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Some filesystems may want to report different values depending on the path
    within the filesystem, i.e. one mount is actually several filesystems. This
    can be the case for a network filesystem exported by an unprivileged server
    (e.g. sshfs).

    This is now possible, thanks to David Howells' "VFS: Permit filesystem to
    perform statfs with a known root dentry" patch.

    This change is backward compatible, so no need to change interface version.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     
  • Initialize module_subsys earlier (or at least earlier than devices) since
    it could be used very early in the boot process if kmod loads a module
    before the device initcalls. Otherwise, kmod will crash in
    kernel/module.c:mod_sysfs_setup() since the kset in module_subsys is not
    initialized yet.

    I only noticed this problem because occasionally, kmod loads the modules
    for my SCSI and Ethernet adapters very early, during the boot process
    itself. I don't quite understand why it loads them sometimes and doesn't
    load them other times. Or who is telling kmod to do so. Can someone
    explain?

    Signed-off-by: Mark Huang
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mark Huang
     
  • Files supported by fs/proc/base.c, i.e. /proc/<pid>/*, are not capable of
    meeting the validity checks in ELF load_elf_*() handling because they have
    no mmap handler which is required by ELF. In order to stop a.out
    executables being used as part of an exploit attack against /proc-related
    vulnerabilities, we make a.out executables depend on ->mmap() existing.

    Signed-off-by: Eugene Teo
    Signed-off-by: Marcel Holtmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eugene Teo
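
The kind of check described above can be sketched as follows. The structures are simplified, hypothetical stand-ins for the kernel's struct file and struct file_operations, and the -ENOEXEC return value is an assumption made for illustration, not something the commit message states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. */
struct file_ops {
    int (*mmap)(void);
};

struct file {
    const struct file_ops *f_op;
};

/* Placeholder mmap handler for files that support mapping. */
int dummy_mmap(void)
{
    return 0;
}

/* Refuse to treat a file as an executable unless it has an mmap handler. */
int check_exec_file(const struct file *file)
{
    assert(file != NULL);
    if (file->f_op == NULL || file->f_op->mmap == NULL)
        return -ENOEXEC; /* assumed error code */
    return 0;
}
```

Files without an mmap handler, such as the /proc entries mentioned above, are rejected before any loading starts.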
     
  • Replace kernel_thread() call in drivers/base/firmware_class.c with
    kthread_create() since kernel_thread() is deprecated in drivers.

    Signed-off-by: Sukadev Bhattiprolu
    Cc: Cedric Le Goater
    Cc: Serge E. Hallyn
    Cc: Dave Hansen
    Cc: Manuel Estrada Sainz
    Acked-by: Marcel Holtmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sukadev Bhattiprolu
     
  • rcu_torture_read_lock and rcu_bh_torture_read_lock acquire locks without
    releasing them, and the matching functions rcu_torture_read_unlock and
    rcu_bh_torture_read_unlock get called with the corresponding locks held and
    release them. Add lock annotations to these four functions so that sparse
    can check callers for lock pairing, and so that sparse will not complain
    about these functions since they intentionally use locks in this manner.

    Signed-off-by: Josh Triplett
    Acked-by: Paul McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josh Triplett
     
  • grab_super gets called with sb_lock held, and releases it. Add a lock
    annotation to this function so that sparse can check callers for lock
    pairing, and so that sparse will not complain about this function since it
    intentionally uses the lock in this manner.

    Signed-off-by: Josh Triplett
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josh Triplett