28 Jun, 2006

40 commits

  • This patch reverts notifier_block changes made in 2.6.17

    Signed-off-by: Chandra Seetharaman
    Cc: Ashok Raj
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chandra Seetharaman
     
  • In 2.6.17, there was a problem with cpu_notifiers and XFS. I provided a
    band-aid solution to solve that problem. In the process, I undid all the
    changes you both were making to ensure that these notifiers were available
    only at init time (unless CONFIG_HOTPLUG_CPU is defined).

    We deferred the real fix to 2.6.18. Here is a set of patches that fixes the
    XFS problem cleanly and makes the cpu notifiers available only at init time
    (unless CONFIG_HOTPLUG_CPU is defined).

    If CONFIG_HOTPLUG_CPU is defined then cpu notifiers are available at run
    time.

    This patch reverts the notifier_call changes made in 2.6.17

    Signed-off-by: Chandra Seetharaman
    Cc: Ashok Raj
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chandra Seetharaman
     
  • We need to serialize access to the global rtc_idr even in this error path.

    Signed-off-by: Sonny Rao
    Acked-by: Alessandro Zummo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Sonny Rao
     
  • There are two locking sets involved. One locks the board mappings and the
    other is the tty open/close locking. The low-level code was clearly
    designed to be ported to OSes with spin locks already, so it pretty much
    comes out in the wash.

    Signed-off-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     
  • Corey Minyard

    The kthread used to speed up polling for IPMI was using udelay in its
    busy-wait polling loop when the lower-level state machine told it to do a
    short delay. This just used CPU and didn't help scheduling, thus causing
    bad problems with other tasks. Call schedule() instead.

    Signed-off-by: Corey Minyard
    Acked-by: Matt Domsch
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    akpm@osdl.org
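
A minimal user-space sketch of the idea in the IPMI entry above: a polling loop that gives up the CPU between polls instead of spinning through a delay. The names `poll_with_yield`, `poll_fn` and `ready_after_n` are hypothetical, and POSIX `sched_yield()` is only a stand-in for the kernel's `schedule()`; this is not the driver's code.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical poll callback: returns nonzero once the device is ready. */
typedef int (*poll_fn)(void *ctx);

/*
 * Poll until ready or until max_polls attempts have been made.
 * Between attempts the thread yields the CPU (sched_yield() is the
 * user-space stand-in for schedule()) rather than burning cycles in a
 * delay loop. Returns the number of polls made, or -1 on timeout.
 */
int poll_with_yield(poll_fn poll, void *ctx, int max_polls)
{
    assert(poll != NULL);
    for (int i = 1; i <= max_polls; i++) {
        if (poll(ctx))
            return i;
        sched_yield();  /* let other runnable tasks make progress */
    }
    return -1;
}

/* Example callback: reports ready once the counter reaches zero. */
int ready_after_n(void *ctx)
{
    int *remaining = ctx;
    if (*remaining == 0)
        return 1;
    (*remaining)--;
    return 0;
}
```

With a counter of 3, `poll_with_yield(ready_after_n, &n, 10)` makes four polls before the callback reports ready; with too small a budget it returns -1.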
     
  • Add operations for the call_rcu_bh() variant of RCU. Also add an
    rcu_batches_completed_bh() function, which is needed by rcutorture.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     
  • Add an ops vector to rcutorture, and add the ops for Classic RCU. Update
    the rcutorture documentation to reflect slight change to the dmesg formats.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
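
As an illustration of what an "ops vector" means in the rcutorture entry above, the sketch below routes a generic test pass through a table of function pointers, one row per RCU flavor. All names here (`struct torture_ops`, `find_ops`, the stub functions) are hypothetical and simplified; the real rcutorture structure has different fields and calls real RCU primitives.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical ops vector: one entry per RCU flavor under test. */
struct torture_ops {
    const char *name;
    void (*read_lock)(int *depth);
    void (*read_unlock)(int *depth);
    long (*batches_completed)(void);
};

/* Stub implementations standing in for the real primitives. */
static void stub_lock(int *depth)   { (*depth)++; }
static void stub_unlock(int *depth) { (*depth)--; }
static long classic_completed(void) { return 1; }
static long bh_completed(void)      { return 2; }

static const struct torture_ops ops_table[] = {
    { "rcu",    stub_lock, stub_unlock, classic_completed },
    { "rcu_bh", stub_lock, stub_unlock, bh_completed },
};

/* Select a flavor by name; returns NULL if the name is unknown. */
const struct torture_ops *find_ops(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(ops_table) / sizeof(ops_table[0]); i++)
        if (strcmp(ops_table[i].name, name) == 0)
            return &ops_table[i];
    return NULL;
}

/* One generic read-side pass, written once against the ops vector. */
long torture_read_once(const struct torture_ops *ops)
{
    int depth = 0;
    ops->read_lock(&depth);
    long seen = ops->batches_completed();
    ops->read_unlock(&depth);
    return depth == 0 ? seen : -1;  /* -1 if lock/unlock were unbalanced */
}
```

The point of the design is that adding a new flavor means adding one table row, while the test loop itself stays unchanged.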
     
  • This just catches the RCU torture documentation up with the recent fixes
    that test RCU for architectures that turn off the scheduling-clock interrupt
    for idle CPUs and the addition of a SUCCESS/FAILURE indication, fixing up
    an obsolete comment as well.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     
  • Fix kernel-doc parameters in kernel/

    Warning(/var/linsrc/linux-2617-g9//kernel/auditsc.c:1376): No description found for parameter 'u_abs_timeout'
    Warning(/var/linsrc/linux-2617-g9//kernel/auditsc.c:1420): No description found for parameter 'u_msg_prio'
    Warning(/var/linsrc/linux-2617-g9//kernel/auditsc.c:1420): No description found for parameter 'u_abs_timeout'
    Warning(/var/linsrc/linux-2617-g9//kernel/acct.c:526): No description found for parameter 'pacct'

    Signed-off-by: Randy Dunlap
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Fix TCSBRK comment to prevent confusion or accidental removal.

    Signed-off-by: Paul Fulghum
    Cc: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Fulghum
     
  • Add the missing ufsi->i_dir_start_lookup initialization to ufs_read_inode
    for the UFS2 case. Also clean up ufs_read_inode to prevent this kind of
    omission in the future: the parts that depend on the UFS type move into
    separate functions, leaving only generic code in ufs_read_inode. This
    tidies the code and avoids duplication.

    Signed-off-by: Evgeniy Dushistov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Evgeniy Dushistov
     
  • Signed-off-by: Atsushi Nemoto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Atsushi Nemoto
     
  • generic_file_buffered_write() prefaults user pages in order to avoid a
    deadlock when copying from the same page that the write goes to.

    However, there appears to be a problem when the write is vectored:
    fault_in_pages_readable brings in the current segment or part of it
    (maxlen). OTOH, filemap_copy_from_user_iovec is called to copy a number of
    bytes (bytes) which may exceed the current segment, so
    filemap_copy_from_user_iovec switches to the next segment, which has not
    been brought in yet, and a pagefault is generated. That causes a deadlock
    if the pagefault is for the same page the write goes to: the page being
    written is locked and not uptodate, so the pagefault will deadlock trying
    to lock the already locked page.

    [akpm@osdl.org: somewhat rewritten]
    Cc: Neil Brown
    Cc: Martin Schwidefsky
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir V. Saveliev
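
One way to picture the problem in the entry above is that the copy length and the prefaulted range must agree. The sketch below limits a copy step so that it never runs past the end of the current iovec segment, which is the range that was prefaulted. It is an illustration of the idea only, not the code in the patch, and `clamp_to_segment` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Given the number of bytes the caller wants to copy in one step and the
 * current offset inside the current iovec segment, return a length that
 * never crosses the end of that segment. Only that range has been
 * prefaulted, so only that range is safe to copy while the page is locked.
 */
size_t clamp_to_segment(size_t bytes, const struct iovec *cur, size_t seg_off)
{
    assert(cur != NULL && seg_off <= cur->iov_len);
    size_t left = cur->iov_len - seg_off;
    return bytes < left ? bytes : left;
}
```

For a 100-byte segment with 60 bytes already consumed, a request for 4096 bytes is cut down to the remaining 40.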
     
  • We include config.h on the compiler command line. There's no need for it
    to be included again.

    Signed-off-by: David Woodhouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Woodhouse
     
  • locking init cleanups:

    - convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK()
    - convert rwlocks in a similar manner

    this patch was generated automatically.

    Motivation:

    - cleanliness
    - lockdep needs control of lock initialization, which the open-coded
    variants do not give
    - it's also useful for -rt and for lock debugging in general

    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
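
The sketch below shows, in user space, why initialization through a macro or function gives a debugging layer something to hook into, whereas an open-coded `= SPIN_LOCK_UNLOCKED` assignment does not. The `spinlock_t` type and the macros here are simplified stand-ins built on C11 atomics, not the kernel's definitions; the `name` field is an assumption used to show the kind of per-lock information an initializer can record.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* User-space stand-in; the kernel's real spinlock_t is not reproduced here. */
typedef struct {
    atomic_int locked;   /* 0 = free, 1 = held */
    const char *name;    /* recorded at init time, e.g. for a lock debugger */
} spinlock_t;

/* Static definition: the lock's name is captured at the definition site. */
#define DEFINE_SPINLOCK(n) spinlock_t n = { 0, #n }

/* Run-time initialization for locks embedded in other objects. */
#define spin_lock_init(l) do {          \
        atomic_init(&(l)->locked, 0);   \
        (l)->name = #l;                 \
    } while (0)

void spin_lock(spinlock_t *l)
{
    while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire))
        ;   /* spin until the previous holder releases the lock */
}

void spin_unlock(spinlock_t *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* A statically defined lock. */
DEFINE_SPINLOCK(table_lock);

/* A lock embedded in an object that is set up at run time. */
struct device_state {
    spinlock_t lock;
    int users;
};

void device_state_init(struct device_state *d)
{
    assert(d != NULL);
    spin_lock_init(&d->lock);
    d->users = 0;
}
```

Because every lock passes through `DEFINE_SPINLOCK()` or `spin_lock_init()`, extra bookkeeping can later be added in one place without touching the callers.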
     
  • - add a proper prototype for the following global function:
      - buffer_init()

    - make the following needlessly global function static:
      - end_buffer_async_write()

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • Add more poison values to include/linux/poison.h. It's not clear to me
    whether some others should be added or not, so I haven't added any of
    these:

    ./include/linux/libata.h:#define ATA_TAG_POISON 0xfafbfcfdU
    ./arch/ppc/8260_io/fcc_enet.c:1918: memset((char *)(&(immap->im_dprambase[(mem_addr+64)])), 0x88, 32);
    ./drivers/usb/mon/mon_text.c:429: memset(mem, 0xe5, sizeof(struct mon_event_text));
    ./drivers/char/ftape/lowlevel/ftape-ctl.c:738: memset(ft_buffer[i]->address, 0xAA, FT_BUFF_SIZE);
    ./drivers/block/sx8.c:/* 0xf is just arbitrary, non-zero noise; this is sorta like poisoning */

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Update two drivers to use poison.h.

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Localize poison values into one header file for better documentation and
    easier/quicker debugging and so that the same values won't be used for
    multiple purposes.

    Use these constants in core arch., mm, driver, and fs code.

    Signed-off-by: Randy Dunlap
    Acked-by: Matt Mackall
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: "David S. Miller"
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
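
A small user-space sketch of how centrally defined poison bytes help debugging: a freed object is filled with a known pattern, and a later check reports the first byte that changed. The three byte values are the slab poison values as they appear in include/linux/poison.h; the helper functions `poison_free_obj` and `check_poison` are hypothetical and written only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Slab-style poison bytes, gathered in one place as the entry describes. */
#define POISON_INUSE 0x5a   /* for use-uninitialised poisoning */
#define POISON_FREE  0x6b   /* for use-after-free poisoning */
#define POISON_END   0xa5   /* end-byte of poisoning */

/* Fill a freed object with POISON_FREE, terminated by POISON_END. */
void poison_free_obj(unsigned char *obj, size_t size)
{
    assert(obj != NULL && size > 0);
    memset(obj, POISON_FREE, size - 1);
    obj[size - 1] = POISON_END;
}

/*
 * Verify the poison pattern. Returns -1 if it is intact, otherwise the
 * offset of the first byte that was modified after the object was freed.
 */
long check_poison(const unsigned char *obj, size_t size)
{
    assert(obj != NULL && size > 0);
    for (size_t i = 0; i < size; i++) {
        unsigned char want = (i == size - 1) ? POISON_END : POISON_FREE;
        if (obj[i] != want)
            return (long)i;
    }
    return -1;
}
```

Seeing 0x6b bytes in a crash dump then points directly at a use-after-free, which is much harder to recognize when each subsystem picks its own magic value.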
     
  • Move the i386 VDSO down into a vma and thus randomize it.

    Besides the security implications, this feature also helps debuggers, which
    can COW a vma-backed VDSO just like a normal DSO and can thus do
    single-stepping and other debugging features.

    It's good for hypervisors (Xen, VMWare) too, which typically live in the same
    high-mapped address space as the VDSO, hence whenever the VDSO is used, they
    get lots of guest pagefaults and have to fix such guest accesses up - which
    slows things down instead of speeding things up (the primary purpose of the
    VDSO).

    There's a new CONFIG_COMPAT_VDSO (default=y) option, which provides support
    for older glibcs that still rely on a prelinked high-mapped VDSO. Newer
    distributions (using glibc 2.3.3 or later) can turn this option off. Turning
    it off is also recommended for security reasons: attackers cannot use the
    predictable high-mapped VDSO page as syscall trampoline anymore.

    There is a new vdso=[0|1] boot option as well, and a runtime
    /proc/sys/vm/vdso_enabled sysctl switch, that allows the VDSO to be turned
    on/off.

    (This version of the VDSO-randomization patch also has working ELF
    coredumping, the previous patch crashed in the coredumping code.)

    This code is a combined work of the exec-shield VDSO randomization
    code and Gerd Hoffmann's hypervisor-centric VDSO patch. Rusty Russell
    started this patch and I completed it.

    [akpm@osdl.org: cleanups]
    [akpm@osdl.org: compile fix]
    [akpm@osdl.org: compile fix 2]
    [akpm@osdl.org: compile fix 3]
    [akpm@osdl.org: revert MAXMEM change]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Arjan van de Ven
    Cc: Gerd Hoffmann
    Cc: Rusty Russell
    Cc: Zachary Amsden
    Cc: Andi Kleen
    Cc: Jan Beulich
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     
  • The following

    [PATCH] Clean up and refactor i386 sub-architecture setup

    Doesn't quite work, since it leaves out an include of asm/io.h, without
    which the use of inb/outb in the setup file won't work. This corrects that
    and also removes a spurious acpi reference that apparently crept in ages
    ago but should never have been there.

    Signed-off-by: James Bottomley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    James Bottomley
     
  • Commit 1e9f28fa1eb9773bf65bae08288c6a0a38eef4a7 ("[PATCH] sched: new
    sched domain for representing multi-core") incorrectly made SCHED_SMT
    and some of the structures it uses dependent on SMP.

    However, this is wrong: the structures are only defined if X86_HT is set,
    so SCHED_SMT has to depend on that as well.

    The patch broke voyager, since it doesn't provide any of the multi-core
    or hyperthreading structures.

    Signed-off-by: James Bottomley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    James Bottomley
     
  • Commit c3ff8ec31c1249d268cd11390649768a12bec1b9 ("[PATCH] i386: Don't
    miss pending signals returning to user mode after signal processing")
    meant that vm86 interrupt/signal handling got broken for the case when
    vm86 is called from kernel space.

    In this scenario, if a signal is pending because of a vm86 interrupt,
    do_notify_resume/do_signal exits immediately due to the user_mode() check,
    without processing any signals. Thus, the resume_userspace handler spins
    in a tight loop with the signal pending and TIF_SIGPENDING set. Previously
    everything worked OK.

    No in-tree usage of vm86() from kernel space exists, but I've heard
    about a number of projects out there which use vm86 calls from kernel,
    one of them being this, for instance:

    http://dev.gentoo.org/~spock/projects/vesafb-tng/

    The following patch fixes the issue.

    Signed-off-by: Aleksey Gorelov
    Cc: Atsushi Nemoto
    Cc: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aleksey Gorelov
     
  • Using C code for current_thread_info() lets the compiler optimize it.
    With gcc 4.0.2, kernel is smaller:

       text    data     bss     dec     hex filename
    3645212  555556  312024 4512792  44dc18 2.6.17-rc6-nb-post/vmlinux
    3647276  555556  312024 4514856  44e428 2.6.17-rc6-nb/vmlinux
    -------
      -2064

    Signed-off-by: Chuck Ebbert
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chuck Ebbert
     
  • Move phys_core_id and cpu_core_id into the cpuinfo_x86 structure. A similar
    patch for x86_64 was already accepted by Andi earlier this week.

    [akpm@osdl.org: fix warning]
    Signed-off-by: Rohit Seth
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rohit Seth
     
  • Signed-off-by: Andreas Mohr
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andreas Mohr
     
  • Remove the limit of 256 interrupt vectors by changing the value stored in
    orig_{e,r}ax to be the complemented interrupt vector. The orig_{e,r}ax
    needs to be < 0 to allow the signal code to distinguish between return from
    interrupt and return from syscall. With this change applied, NR_IRQS can
    be > 256.

    Xen extends the IRQ numbering space to include room for dynamically
    allocated virtual interrupts (in the range 256-511), which requires a more
    permissive interface to do_IRQ.

    Signed-off-by: Ian Pratt
    Signed-off-by: Christian Limpach
    Signed-off-by: Chris Wright
    Signed-off-by: Rusty Russell
    Cc: "Protasevich, Natalie"
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rusty Russell
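
The encoding described in the entry above can be checked with a few lines of C: the bitwise complement of any non-negative vector is negative, so it cannot be confused with a system call number, and complementing again recovers the vector regardless of how large it is. The function names below are hypothetical; the kernel does this in its entry code, not through helpers like these.

```c
#include <assert.h>

/*
 * Encode an interrupt vector for storage in the orig_eax/orig_rax slot.
 * The complement of a non-negative value is always negative, so the
 * result can be told apart from a system call number (which is >= 0).
 */
long encode_irq_vector(unsigned int vector)
{
    assert(vector <= 0x7fffffff);
    return ~(long)vector;
}

/* Recover the vector; complementing twice is the identity. */
unsigned int decode_irq_vector(long orig_ax)
{
    assert(orig_ax < 0);
    return (unsigned int)~orig_ax;
}

/* The test the signal code relies on: negative means "not a syscall". */
int is_from_interrupt(long orig_ax)
{
    return orig_ax < 0;
}
```

Vector 0 encodes to -1 and vector 511 to -512, so the range 256-511 mentioned for Xen's virtual interrupts fits without any special casing.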
     
  • The patch fixes two issues:

    1. cpu_init is called with interrupts disabled. Allocating the gdt table
    there isn't good at runtime.

    2. The gdt table page causes a memory leak in the CPU hotplug case.

    Signed-off-by: Shaohua Li
    Cc: Ashok Raj
    Cc: Zachary Amsden
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shaohua Li
     
  • Update SELinux to cause the keycreate process attribute held in
    /proc/self/attr/keycreate to be inherited across a fork and reset upon
    execve. This is consistent with the handling of the other process
    attributes provided by SELinux and also makes it simpler to adapt logon
    programs to properly handle the keycreate attribute.

    Signed-off-by: Michael LeMay
    Signed-off-by: David Howells
    Acked-by: Stephen Smalley
    Acked-by: James Morris
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael LeMay
     
  • With Goto-san's patch, we can add a new pgdat/node at runtime. I'm now
    considering node-hot-add with cpu + memory on ACPI.

    I found that the acpi container, which describes a node, could evaluate cpu
    before memory. This means cpu-hot-add occurs before memory hot add.

    For the most part, cpu-hot-add doesn't depend on node hot add. But
    register_cpu(), which creates a symbolic link from node to cpu, requires
    that the node be onlined before register_cpu() is called. When a node is
    onlined, its pgdat should be there.

    This patch-set holds off creating the symbolic link from node to cpu
    until the node is onlined.

    This removes the node argument from register_cpu().

    Currently, register_cpu() requires a 'struct node' as its argument. But the
    array of struct node is now unified in drivers/base/node.c (by Goto's node
    hotplug patch), so we can get the struct node in a generic way and this
    argument is no longer necessary.

    This patch also guarantees that a cpu is added under a node only when the
    node is onlined. This is necessary for the node-hot-add vs. cpu-hot-add
    patch following this.

    Moreover, register_cpu calculates cpu->node_id by cpu_to_node() without regard
    to its 'struct node *root' argument. This patch removes it.

    Also modify callers of register_cpu()/unregister_cpu, whose args are changed
    by the register-cpu-remove-node-struct patch.

    [Brice.Goglin@ens-lyon.org: fix it]
    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Cc: Ashok Raj
    Cc: Dave Hansen
    Signed-off-by: Brice Goglin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     
  • This is a patch to allocate the pgdat and per-node data area for ia64. Their
    size can be calculated by compute_pernodesize().

    Signed-off-by: Yasunori Goto
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     
  • This refreshes the node_data[] array for ia64. As I mentioned in previous
    patches, ia64 keeps a copy of the pgdat address array on each node as
    per-node data.

    In v2 of node_add, this function used stop_machine_run() to update them. (I
    wanted them to be copied as safely as possible.) But in this patch, the
    arrays are simply copied, and the node_online_map bit is set after
    pgdat initialization completes.

    So, the kernel must touch the NODE_DATA() macro only after checking
    node_online_map(). (Current code already does this.) This is a simpler way
    for just hot-add.

    Note: This will be a problem when hot-remove occurs,
    because, even if the online_map bit is set, the kernel may
    touch NODE_DATA() due to a race condition. :-(

    Signed-off-by: Yasunori Goto
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     
  • This is a preparatory patch to share the code that updates NODE_DATA() on
    ia64 between boot time and hotplug.

    The current code remembers the pgdat address in mem_data, which is used only
    at boot time. But this information can be used at hotplug time by moving it
    to a global value. The next patch uses this array.

    Signed-off-by: Yasunori Goto
    Cc: "Luck, Tony"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     
  • When a new node is enabled by hot-add, a new sysfs file must be created for
    it. So, if a new node is enabled by add_memory(), register_one_node() is
    called to create it. In addition, i386's arch_register_node() and a part of
    register_nodes() of powerpc are consolidated into register_one_node() as
    generic code.

    This was tested on Tiger4 (IPF) with node hot-plug emulation.

    Signed-off-by: Keiichiro Tokunaga
    Signed-off-by: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     
  • Fix "undefined reference to `arch_add_memory'" on sparc64 allmodconfig.

    sparc64 doesn't support memory hotplug. But we want it to support
    sparsemem.

    Signed-off-by: Yasunori Goto
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     
  • This patch allows hot-adding memory which is not aligned to a section.

    Currently, hot-added memory has to be aligned to the section size. For
    archs with a big section size, this is not useful.

    When hot-added memory is registered as an iomem resource by the iomem
    resource patch, we can make use of that information to detect the valid
    memory range.

    Note: With this, non-aligned memory can be registered. To allow hot-adding
    memory with holes, we have to do more work around add_memory().
    (It doesn't allow adding memory to an already existing mem section.)

    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     
  • Register hot-added memory to iomem_resource. With this, /proc/iomem can
    show hot-added memory.

    Note: kdump uses /proc/iomem to capture the memory ranges when it is
    installed. So, kdump should be re-installed after /proc/iomem changes.

    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Vivek Goyal
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     
  • Add node-hot-add support to add_memory().

    Node hot-add uses this sequence:
    1. allocate pgdat.
    2. refresh NODE_DATA()
    3. call free_area_init_node() to initialize
    4. create sysfs entry
    5. add memory (old add_memory())
    6. set node online
    7. run kswapd for new node.
    (8). update zonelist after pages are onlined. (This is already merged in -mm
    because its update phase is different.)

    Note:
    To share as much code as possible,
    there are 2 changes from v2.
    - The old add_memory(), which is defined by each arch,
    is renamed to arch_add_memory(). The new add_memory() becomes
    the common-code caller of the arch-dependent function.

    - This patch changes add_memory()'s interface
    From: add_memory(start, end)
    To  : add_memory(nid, start, end).
    The reason is that similar code for finding the node id from a
    physical address was inside the old add_memory() on each arch.

    In addition, the acpi memory hotplug driver can find the node id more easily.
    In v2, it had to walk the DSDT's _CRS, matching the physical address to
    get the handle of its memory device, then get _PXM and the node id,
    because the input was just a physical address.
    However, in v3, the acpi driver can use the handle to get _PXM and the node id
    for the new memory device. It can pass just the node id to add_memory().

    The fix for the interface of arch_add_memory() is in the next patch.

    Signed-off-by: Yasunori Goto
    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Dave Hansen
    Cc: "Brown, Len"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
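
The numbered hot-add sequence in the entry above can be sketched as a common entry point that takes the node id explicitly and calls a per-architecture hook for the memory itself. Everything below is a user-space toy: `struct node_state`, `arch_add_memory_stub` and `add_memory_sketch` are hypothetical names, the steps are reduced to flags, and the undo path on failure is an assumption of this sketch rather than something the entry describes.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 8

/* Hypothetical per-node bookkeeping standing in for pgdat and friends. */
struct node_state {
    int has_pgdat;       /* steps 1-3: pgdat allocated and initialized */
    int has_sysfs;       /* step 4: sysfs entry created */
    int online;          /* step 6: node marked online */
    int kswapd_running;  /* step 7: kswapd started for the node */
    unsigned long bytes; /* memory added so far */
};

static struct node_state nodes[MAX_NODES];

/* Stand-in for the per-architecture hook (step 5). Rejects empty ranges. */
static int arch_add_memory_stub(int nid, unsigned long start, unsigned long end)
{
    if (end <= start)
        return -1;
    nodes[nid].bytes += end - start;
    return 0;
}

/*
 * Common entry point taking the node id explicitly, following the
 * add_memory(nid, start, end) shape. A node that is not yet online is
 * set up first; an already online node only gains memory.
 */
int add_memory_sketch(int nid, unsigned long start, unsigned long end)
{
    assert(nid >= 0 && nid < MAX_NODES);
    struct node_state *n = &nodes[nid];
    int new_node = !n->online;

    if (new_node) {
        n->has_pgdat = 1;   /* 1-3: allocate pgdat, refresh NODE_DATA(), init */
        n->has_sysfs = 1;   /* 4: create sysfs entry */
    }
    if (arch_add_memory_stub(nid, start, end) != 0) {   /* 5: add memory */
        if (new_node)       /* assumed undo of the setup done above */
            n->has_pgdat = n->has_sysfs = 0;
        return -1;
    }
    if (new_node) {
        n->online = 1;          /* 6: set node online */
        n->kswapd_running = 1;  /* 7: run kswapd for the new node */
    }
    return 0;
}
```

Passing `nid` in, rather than deriving it from the physical address inside each architecture's code, is what lets the node setup live in one common function.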
     
  • When a node is hot-added, kswapd for the node should start. This exports the
    kswapd start function as kswapd_run() for use in add_memory().

    [akpm@osdl.org: daemonize() isn't needed when using the kthread API]
    Signed-off-by: Yasunori Goto
    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Dave Hansen
    Cc: "Brown, Len"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     
  • Refresh NODE_DATA() for generic archs. In this case, NODE_DATA(nid) ==
    node_data[nid]. node_data[] is an array of pgdat addresses. So, the refresh
    is quite simple.

    Signed-off-by: Yasunori Goto
    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Dave Hansen
    Cc: "Brown, Len"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto