11 Feb, 2009

1 commit

  • When we check whether a task has been switched out since the last scan, we
    can hit a race in the following scenario:

    - the task is freshly created and scheduled

    - it sets its state to TASK_UNINTERRUPTIBLE and has not yet been switched out

    - check_hung_task() scans this task and will report a false positive because
    t->nvcsw + t->nivcsw == t->last_switch_count == 0

    Add a check for such cases.

    Signed-off-by: Frederic Weisbecker
    Acked-by: Mandeep Singh Baines
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
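    A minimal sketch of the added check, assuming the scan logic lives in
    check_hung_task() as in kernel/hung_task.c; this is illustrative, not the
    literal patch:

        static void check_hung_task(struct task_struct *t, unsigned long timeout)
        {
                unsigned long switch_count = t->nvcsw + t->nivcsw;

                /*
                 * A freshly created task can already be in
                 * TASK_UNINTERRUPTIBLE without ever having been switched
                 * out: nvcsw and nivcsw are both still 0, matching the
                 * initial last_switch_count, so skip it rather than
                 * reporting a false positive.
                 */
                if (unlikely(!switch_count))
                        return;

                if (switch_count != t->last_switch_count) {
                        /* the task ran since the last scan: not hung */
                        t->last_switch_count = switch_count;
                        return;
                }

                /* no context switch for a whole timeout period: report it */
                sched_show_task(t);
        }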
     

09 Feb, 2009

1 commit


06 Feb, 2009

2 commits

  • Since the tasklist is protected by rcu list operations, it is safe
    to convert the read_lock()s to rcu_read_lock().

    Suggested-by: Peter Zijlstra
    Signed-off-by: Mandeep Singh Baines
    Signed-off-by: Ingo Molnar

    Mandeep Singh Baines
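    A sketch of the conversion, assuming the scan iterates the task list with
    do_each_thread()/while_each_thread(); the surrounding context is
    illustrative:

        struct task_struct *g, *t;

        /* before: the scan held the tasklist lock for reading */
        read_lock(&tasklist_lock);
        do_each_thread(g, t) {
                check_hung_task(t, timeout);
        } while_each_thread(g, t);
        read_unlock(&tasklist_lock);

        /* after: the task list is an RCU-protected list, so an RCU
         * read-side critical section is enough for a read-only scan */
        rcu_read_lock();
        do_each_thread(g, t) {
                check_hung_task(t, timeout);
        } while_each_thread(g, t);
        rcu_read_unlock();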
     
  • Impact: extend the scope of hung-task checks

    The default value of hung_task_check_count is changed to PID_MAX_LIMIT.
    A new hung_task_batch_count puts an upper bound on the RCU critical
    section: every hung_task_batch_count checks, the RCU lock is dropped and
    re-acquired, so it is never held for too long.

    Keeping the critical section small minimizes the time preemption is
    disabled and keeps RCU grace periods short.

    To prevent following a stale pointer, get_task_struct is called on g and t.
    To verify that g and t have not been unhashed while outside the critical
    section, the task states are checked.

    The design was proposed by Frédéric Weisbecker.

    Signed-off-by: Mandeep Singh Baines
    Suggested-by: Frédéric Weisbecker
    Acked-by: Andrew Morton
    Signed-off-by: Ingo Molnar

    Mandeep Singh Baines
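    A sketch of the batched scan described above; the helper and constant
    names are modelled on the description and may not match the patch exactly:

        /*
         * Pin g and t, drop the RCU lock so preemption can occur and a
         * grace period can end, then re-acquire it. The caller must check
         * that g and t are still alive before continuing the walk.
         */
        static void rcu_lock_break(struct task_struct *g, struct task_struct *t)
        {
                get_task_struct(g);
                get_task_struct(t);
                rcu_read_unlock();
                cond_resched();
                rcu_read_lock();
                put_task_struct(t);
                put_task_struct(g);
        }

        /* in the scan loop: */
        int batch_count = HUNG_TASK_BATCHING;
        struct task_struct *g, *t;

        rcu_read_lock();
        do_each_thread(g, t) {
                if (!--batch_count) {
                        batch_count = HUNG_TASK_BATCHING;
                        rcu_lock_break(g, t);
                        /* g or t may have been unhashed while unlocked */
                        if (t->state == TASK_DEAD || g->state == TASK_DEAD)
                                goto unlock;
                }
                if (t->state == TASK_UNINTERRUPTIBLE)
                        check_hung_task(t, timeout);
        } while_each_thread(g, t);
 unlock:
        rcu_read_unlock();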
     

19 Jan, 2009

1 commit


17 Jan, 2009

1 commit


16 Jan, 2009

1 commit

  • Decoupling allows:

    * hung tasks check to happen at very low priority

    * hung tasks check and softlockup to be enabled/disabled independently
    at compile and/or run-time

    * individual panic settings to be enabled/disabled independently
    at compile and/or run-time

    * softlockup threshold to be reduced without increasing hung tasks
    poll frequency (hung task check is expensive relative to the softlockup watchdog)

    * hung task check to be zero-overhead when disabled at run-time

    Signed-off-by: Mandeep Singh Baines
    Signed-off-by: Ingo Molnar

    Mandeep Singh Baines
     

14 Jan, 2009

3 commits


13 Jan, 2009

2 commits


12 Jan, 2009

1 commit


11 Jan, 2009

6 commits


10 Jan, 2009

8 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/arjan/linux-2.6-async-2:
    async: make async a command line option for now
    partial revert of asynchronous inode delete

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-nommu:
    NOMMU: Support XIP on initramfs
    NOMMU: Teach kobjsize() about VMA regions.
    FLAT: Don't attempt to expand the userspace stack to fill the space allocated
    FDPIC: Don't attempt to expand the userspace stack to fill the space allocated
    NOMMU: Improve procfs output using per-MM VMAs
    NOMMU: Make mmap allocation page trimming behaviour configurable.
    NOMMU: Make VMAs per MM as for MMU-mode linux
    NOMMU: Delete askedalloc and realalloc variables
    NOMMU: Rename ARM's struct vm_region
    NOMMU: Fix cleanup handling in ramfs_nommu_get_umapped_area()

    Linus Torvalds
     
  • …s/security-testing-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
    CRED: Fix commit_creds() on a process that has no mm

    Linus Torvalds
     
  • ... and have it default off.
    This does allow people to work with it for testing.

    Signed-off-by: Arjan van de Ven

    Arjan van de Ven
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile: (31 commits)
    powerpc/oprofile: fix whitespaces in op_model_cell.c
    powerpc/oprofile: IBM CELL: add SPU event profiling support
    powerpc/oprofile: fix cell/pr_util.h
    powerpc/oprofile: IBM CELL: cleanup and restructuring
    oprofile: make new cpu buffer functions part of the api
    oprofile: remove #ifdef CONFIG_OPROFILE_IBS in non-ibs code
    ring_buffer: fix ring_buffer_event_length()
    oprofile: use new data sample format for ibs
    oprofile: add op_cpu_buffer_get_data()
    oprofile: add op_cpu_buffer_add_data()
    oprofile: rework implementation of cpu buffer events
    oprofile: modify op_cpu_buffer_read_entry()
    oprofile: add op_cpu_buffer_write_reserve()
    oprofile: rename variables in add_ibs_begin()
    oprofile: rename add_sample() in cpu_buffer.c
    oprofile: rename variable ibs_allowed to has_ibs in op_model_amd.c
    oprofile: making add_sample_entry() inline
    oprofile: remove backtrace code for ibs
    oprofile: remove unused ibs macro
    oprofile: remove unused components in struct oprofile_cpu_buffer
    ...

    Linus Torvalds
     
  • * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (94 commits)
    ACPICA: hide private headers
    ACPICA: create acpica/ directory
    ACPI: fix build warning
    ACPI : Use RSDT instead of XSDT by adding boot option of "acpi=rsdt"
    ACPI: Avoid array address overflow when _CST MWAIT hint bits are set
    fujitsu-laptop: Simplify SBLL/SBL2 backlight handling
    fujitsu-laptop: Add BL power, LED control and radio state information
    ACPICA: delete utcache.c
    ACPICA: delete acdisasm.h
    ACPICA: Update version to 20081204.
    ACPICA: FADT: Update error msgs for consistency
    ACPICA: FADT: set acpi_gbl_use_default_register_widths to TRUE by default
    ACPICA: FADT parsing changes and fixes
    ACPICA: Add ACPI_MUTEX_TYPE configuration option
    ACPICA: Fixes for various ACPI data tables
    ACPICA: Restructure includes into public/private
    ACPI: remove private acpica headers from driver files
    ACPI: reboot.c: use new acpi_reset interface
    ACPICA: New: acpi_reset interface - write to reset register
    ACPICA: Move all public H/W interfaces to new hwxface
    ...

    Linus Torvalds
     
  • The newly allocated creds in prepare_kernel_cred() must be initialised
    before get_uid() and get_group_info() can access them. They should be
    copied from the old credentials.

    Reported-by: Steve Dickson
    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Signed-off-by: Linus Torvalds

    David Howells
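    A condensed sketch of the ordering the fix establishes in
    prepare_kernel_cred(); details such as how the old credentials are
    obtained are simplified here:

        struct cred *prepare_kernel_cred(struct task_struct *daemon)
        {
                const struct cred *old;
                struct cred *new;

                new = kmem_cache_alloc(cred_jar, GFP_KERNEL);
                if (!new)
                        return NULL;

                old = daemon ? get_cred(daemon->cred) : get_cred(&init_cred);

                /*
                 * Copy the old credentials into the new structure *before*
                 * taking references, otherwise get_uid()/get_group_info()
                 * dereference uninitialised pointers.
                 */
                *new = *old;
                get_uid(new->user);
                get_group_info(new->group_info);

                /* ... remaining initialisation and security hooks omitted ... */
                return new;
        }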
     
  • Missing put_cred() in the error handling path of prepare_kernel_cred().

    Signed-off-by: David Howells
    Acked-by: Steve Dickson
    Signed-off-by: Linus Torvalds

    David Howells
     

09 Jan, 2009

13 commits

  • Len Brown
     
  • Len Brown
     
  • It turns out that there are real problems with allowing async
    tasks that are scheduled from async tasks to run after
    async_synchronize_full() returns.

    This patch makes the _full variant stricter: a complete
    synchronization. Later I might need to add back a lighter
    form of synchronization for other uses, but not right now.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Linus Torvalds

    Arjan van de Ven
     
  • Use the new generic implementation.

    Signed-off-by: Wu Fengguang
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • Currently task_active_pid_ns is not safe to call after a task becomes a
    zombie and exit_task_namespaces is called, as nsproxy becomes NULL. By
    reading the pid namespace from the pid of the task we can trivially solve
    this problem at the cost of one extra memory read in what should be the
    same cacheline as we read the namespace from.

    When moving things around I have made task_active_pid_ns out of line
    because keeping it in pid_namespace.h would require adding includes of
    pid.h and sched.h that I don't think we want.

    This change does make task_active_pid_ns unsafe to call during
    copy_process until we attach a pid on the task_struct which seems to be a
    reasonable trade off.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Sukadev Bhattiprolu
    Cc: Oleg Nesterov
    Cc: Roland McGrath
    Cc: Bastian Blank
    Cc: Pavel Emelyanov
    Cc: Nadia Derbey
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
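    The change amounts to deriving the namespace from the task's struct pid
    rather than from tsk->nsproxy; a sketch of the now out-of-line helper:

        /* Safe after exit_task_namespaces() has set nsproxy to NULL,
         * because the task's pid (and its namespace) outlives that point. */
        struct pid_namespace *task_active_pid_ns(struct task_struct *tsk)
        {
                return ns_of_pid(task_pid(tsk));
        }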
     
  • Impact: cleanups, use new cpumask API

    Final trivial cleanups: mainly s/cpumask_t/struct cpumask

    Note there is a FIXME in generate_sched_domains(). A future patch will
    change struct cpumask *doms to struct cpumask *doms[].
    (I suppose Rusty will do this.)

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • Impact: use new cpumask API

    This patch mainly does the following things:
    - change cs->cpus_allowed from cpumask_t to cpumask_var_t
    - call alloc_bootmem_cpumask_var() for top_cpuset in cpuset_init_early()
    - call alloc_cpumask_var() for other cpusets
    - replace cpus_xxx() with cpumask_xxx()

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
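    A sketch of the conversion pattern; the exact call sites are illustrative:

        /* top_cpuset is needed before the slab allocator is available */
        alloc_bootmem_cpumask_var(&top_cpuset.cpus_allowed);

        /* dynamically created cpusets allocate their mask at creation time */
        if (!alloc_cpumask_var(&cs->cpus_allowed, GFP_KERNEL)) {
                kfree(cs);
                return ERR_PTR(-ENOMEM);
        }

        /* operations move from the cpus_*() macros to the cpumask_*() API,
         * e.g. cpus_clear(cs->cpus_allowed) becomes: */
        cpumask_clear(cs->cpus_allowed);

        /* and the mask is freed when the cpuset is destroyed */
        free_cpumask_var(cs->cpus_allowed);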
     
  • Impact: cleanups, reduce stack usage

    This patch prepares for the next patch. When we convert
    cpuset.cpus_allowed to cpumask_var_t, (trialcs = *cs) no longer works.

    Another result of this patch is reducing stack usage of trialcs.
    sizeof(*cs) can be as large as 148 bytes on x86_64, so it's really not
    good to have it on the stack.

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
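    A sketch of the preparation: the on-stack copy (trialcs = *cs) is replaced
    by a heap-allocated trial cpuset; the helper names are illustrative:

        /* replaces "struct cpuset trialcs = *cs;" (~148 bytes of stack) */
        static struct cpuset *alloc_trial_cpuset(const struct cpuset *cs)
        {
                return kmemdup(cs, sizeof(*cs), GFP_KERNEL);
        }

        static void free_trial_cpuset(struct cpuset *trial)
        {
                kfree(trial);
        }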
     
  • Impact: reduce stack usage

    Allocate a global cpumask_var_t at boot, and use it in cpuset_attach(), so
    we won't fail cpuset_attach().

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • Impact: reduce stack usage

    Just use cs->cpus_allowed; there is no need to allocate a cpumask_var_t.

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • This patchset converts cpuset to use the new cpumask API, and thus removes
    on-stack cpumask_t usage to reduce stack usage.

    Before:
    # cat kernel/cpuset.c include/linux/cpuset.h | grep -c cpumask_t
    21
    After:
    # cat kernel/cpuset.c include/linux/cpuset.h | grep -c cpumask_t
    0

    This patch:

    Impact: reduce stack usage

    It's safe to call cpulist_scnprintf while holding callback_mutex, and thus
    we can just remove the cpumask_t; there is no need to allocate a
    cpumask_var_t.

    Signed-off-by: Li Zefan
    Cc: Ingo Molnar
    Cc: Rusty Russell
    Acked-by: Mike Travis
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • I found a bug on my dual-cpu box. I created a sub cpuset in the top cpuset
    and assigned CPU 1 to its cpus. Then we attach some tasks to this sub
    cpuset. After this, we offline CPU1. Now the tasks in this new cpuset are
    moved into the top cpuset automatically because there is no cpu left in
    the sub cpuset. When we online CPU1 again, we find that all the tasks
    which did not originally belong to the top cpuset run only on CPU0.

    We fix this bug by setting the task's cpus_allowed to cpu_possible_map
    when attaching it to the top cpuset. This does not modify the current
    behavior of cpusets on CPU hotplug, and all tasks in the top cpuset use
    cpu_possible_map to initialize their cpus_allowed.

    Signed-off-by: Miao Xie
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miao Xie
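    A sketch of the fix in cpuset_attach(), written against the pre-cpumask-API
    style of the time; the surrounding details are illustrative:

        cpumask_t cpus;

        /*
         * Tasks attached to the top cpuset may run on any possible CPU;
         * don't inherit a mask that was shrunk by an earlier CPU-offline.
         */
        if (cs == &top_cpuset)
                cpus = cpu_possible_map;
        else
                guarantee_online_cpus(cs, &cpus);
        set_cpus_allowed_ptr(tsk, &cpus);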
     
  • task_cs() calls task_subsys_state().

    We must use rcu_read_lock() to protect cgroup_subsys_state().

    It's correct that top_cpuset is never freed, but cgroup_subsys_state()
    accesses the css_set, and this css_set may already be freed by the time
    task_cs() is called.

    We use rcu_read_lock() to protect it.

    Signed-off-by: Lai Jiangshan
    Acked-by: Paul Menage
    Cc: KAMEZAWA Hiroyuki
    Cc: Pavel Emelyanov
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lai Jiangshan
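    A sketch of the usage pattern the fix enforces: dereference task_cs() only
    inside an RCU read-side critical section, so the css_set it walks cannot
    be freed underneath the caller:

        struct cpuset *cs;

        rcu_read_lock();
        cs = task_cs(tsk);      /* task_subsys_state() -> css_set -> cpuset */
        /* ... read fields of cs while still under rcu_read_lock() ... */
        rcu_read_unlock();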