21 Dec, 2011

1 commit

  • Mike reported a 13% drop in netperf TCP_RR performance due to the
    new remote wakeup code. Suresh also noticed performance issues
    with it.

    Sending the wakeup IPI only when the wakeup crosses a cache domain
    solves the observed performance issues (see the sketch below).

    Reported-by: Suresh Siddha
    Reported-by: Mike Galbraith
    Acked-by: Suresh Siddha
    Acked-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    Cc: Chris Mason
    Cc: Dave Kleikamp
    Link: http://lkml.kernel.org/r/1323338531.17673.7.camel@twins
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
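
    A minimal user-space sketch of the idea (not the kernel code): queue a
    remote-wakeup IPI only when the waking CPU and the target CPU do not
    share a last-level cache. The cpu_llc_id[] table and the helper names
    (cpus_share_cache(), queue_wakeup_ipi(), do_local_wakeup(),
    ttwu_queue_sketch()) are illustrative assumptions.

        #include <stdbool.h>
        #include <stdio.h>

        #define NR_CPUS 8

        /* Illustrative: which last-level-cache domain each CPU belongs to. */
        static const int cpu_llc_id[NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

        /* True if both CPUs sit under the same LLC domain. */
        static bool cpus_share_cache(int this_cpu, int that_cpu)
        {
            return cpu_llc_id[this_cpu] == cpu_llc_id[that_cpu];
        }

        static void queue_wakeup_ipi(int cpu) { printf("IPI -> cpu %d\n", cpu); }
        static void do_local_wakeup(int cpu)  { printf("local wakeup on cpu %d\n", cpu); }

        /* The gist of the fix: only pay for an IPI when the wakeup actually
         * crosses a cache domain; otherwise handle the wakeup locally. */
        static void ttwu_queue_sketch(int waking_cpu, int target_cpu)
        {
            if (!cpus_share_cache(waking_cpu, target_cpu))
                queue_wakeup_ipi(target_cpu);
            else
                do_local_wakeup(target_cpu);
        }

        int main(void)
        {
            ttwu_queue_sketch(0, 1);   /* same LLC: no IPI   */
            ttwu_queue_sketch(0, 5);   /* cross LLC: use IPI */
            return 0;
        }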
     

07 Dec, 2011

3 commits

  • Now that we initialize jump_labels before sched_init(), we can use them
    for the debug features without having to worry about a window where
    they have the wrong setting (see the sketch below).

    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-vpreo4hal9e0kzqmg5y0io2k@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
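
    For illustration, a kernel-style sketch of gating a scheduler debug
    feature behind a jump label, so the disabled case costs only a patched
    NOP instead of a load and a branch. It uses today's static-key API names
    (DEFINE_STATIC_KEY_FALSE / static_branch_unlikely) rather than the 2011
    jump_label interface, and sched_feat_debug_example / account_debug_stats()
    are made-up names, not the actual sched_feat() implementation.

        #include <linux/jump_label.h>

        /* Hypothetical debug hook, normally unused on production kernels. */
        static void account_debug_stats(void) { }

        /* Defaults to off; can be flipped at runtime, e.g. from a debugfs knob. */
        static DEFINE_STATIC_KEY_FALSE(sched_feat_debug_example);

        static void scheduler_hot_path(void)
        {
            /* Compiles to a straight-line NOP that is patched into a jump
             * only when the key is enabled, so the common (disabled) case
             * pays essentially nothing. */
            if (static_branch_unlikely(&sched_feat_debug_example))
                account_debug_stats();
        }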
     
  • Right now, after we collect tick statistics for user and system time
    and store them in a well-known location, we account the same statistics
    again for cpuacct. Since cpuacct is hierarchical, the numbers for the
    root cgroup should be exactly equal to the system-wide numbers.

    So it would be better to just reuse them: this patch changes cpuacct
    accounting so that the cpustat statistics are kept in a percpu array of
    struct kernel_cpustat. In the root cgroup case, we simply point it at
    the main array (see the sketch below). The rest of the hierarchy walk
    can later be disabled entirely with a static branch, but that is not
    done here.

    Signed-off-by: Glauber Costa
    Signed-off-by: Peter Zijlstra
    Cc: Paul Turner
    Link: http://lkml.kernel.org/r/1322498719-2255-4-git-send-email-glommer@parallels.com
    Signed-off-by: Ingo Molnar

    Glauber Costa
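
    A self-contained sketch of the arrangement described above: per-cpu
    cpustat arrays, with the root cgroup's cpuacct pointing straight at the
    system-wide array so that charging the root group and updating the
    system-wide numbers are the same writes. The struct layout, NR_CPUS and
    NR_STATS values, and the cpuacct_create()/cpuacct_account_field()
    helpers are simplified assumptions, not the kernel's actual definitions.

        #include <stdio.h>
        #include <stdlib.h>

        #define NR_CPUS 4
        enum cpu_usage_stat { CPUTIME_USER, CPUTIME_SYSTEM, CPUTIME_IDLE, NR_STATS };

        /* Mirrors the idea of struct kernel_cpustat: one array of counters per cpu. */
        struct kernel_cpustat { unsigned long long cpustat[NR_STATS]; };

        /* The system-wide, "well-known location" statistics. */
        static struct kernel_cpustat kernel_cpustat[NR_CPUS];

        /* Simplified cpuacct cgroup: root points at the global array,
         * child groups own their own per-cpu arrays. */
        struct cpuacct {
            struct cpuacct *parent;
            struct kernel_cpustat *cpustat;   /* per-cpu array, NR_CPUS entries */
        };

        static struct cpuacct root_cpuacct = { .parent = NULL, .cpustat = kernel_cpustat };

        static struct cpuacct *cpuacct_create(struct cpuacct *parent)
        {
            struct cpuacct *ca = malloc(sizeof(*ca));
            ca->parent = parent;
            ca->cpustat = calloc(NR_CPUS, sizeof(struct kernel_cpustat));
            return ca;
        }

        /* Charge time to a cgroup and all of its ancestors. For the root
         * group this writes straight into kernel_cpustat, so the system-wide
         * numbers come for free. */
        static void cpuacct_account_field(struct cpuacct *ca, int cpu,
                                          enum cpu_usage_stat idx,
                                          unsigned long long delta)
        {
            for (; ca; ca = ca->parent)
                ca->cpustat[cpu].cpustat[idx] += delta;
        }

        int main(void)
        {
            struct cpuacct *child = cpuacct_create(&root_cpuacct);

            cpuacct_account_field(child, 0, CPUTIME_USER, 1000);
            printf("root/user on cpu0: %llu\n",
                   kernel_cpustat[0].cpustat[CPUTIME_USER]);
            return 0;
        }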
     
  • hrtick_start_fair() shows up in profiles even when disabled (see the
    sketch below).

    v3.0.6, taskset -c 3 pipe-test:

    PerfTop: 997 irqs/sec kernel:89.5% exact: 0.0% [1000Hz cycles], (all, CPU: 3)
    ------------------------------------------------------------------------------------------------

                     Virgin                                    Patched
    samples  pcnt function                    samples  pcnt function
    _______ _____ ___________________________ _______ _____ ___________________________

    2880.00 10.2% __schedule                  3136.00 11.3% __schedule
    1634.00  5.8% pipe_read                   1615.00  5.8% pipe_read
    1458.00  5.2% system_call                 1534.00  5.5% system_call
    1382.00  4.9% _raw_spin_lock_irqsave      1412.00  5.1% _raw_spin_lock_irqsave
    1202.00  4.3% pipe_write                  1255.00  4.5% copy_user_generic_string
    1164.00  4.1% copy_user_generic_string    1241.00  4.5% __switch_to
    1097.00  3.9% __switch_to                  929.00  3.3% mutex_lock
     872.00  3.1% mutex_lock                   846.00  3.0% mutex_unlock
     687.00  2.4% mutex_unlock                 804.00  2.9% pipe_write
     682.00  2.4% native_sched_clock           713.00  2.6% native_sched_clock
     643.00  2.3% system_call_after_swapgs     653.00  2.3% _raw_spin_unlock_irqrestore
     617.00  2.2% sched_clock_local            633.00  2.3% fsnotify
     612.00  2.2% fsnotify                     605.00  2.2% sched_clock_local
     596.00  2.1% _raw_spin_unlock_irqrestore  593.00  2.1% system_call_after_swapgs
     542.00  1.9% sysret_check                 559.00  2.0% sysret_check
     467.00  1.7% fget_light                   472.00  1.7% fget_light
     462.00  1.6% finish_task_switch           461.00  1.7% finish_task_switch
     437.00  1.5% vfs_write                    442.00  1.6% vfs_write
     431.00  1.5% do_sync_write                428.00  1.5% do_sync_write
     413.00  1.5% select_task_rq_fair          404.00  1.5% _raw_spin_lock_irq
     386.00  1.4% update_curr                  402.00  1.4% update_curr
     385.00  1.4% rw_verify_area               389.00  1.4% do_sync_read
     377.00  1.3% _raw_spin_lock_irq           378.00  1.4% vfs_read
     369.00  1.3% do_sync_read                 340.00  1.2% pipe_iov_copy_from_user
     360.00  1.3% vfs_read                     316.00  1.1% __wake_up_sync_key
  *  342.00  1.2% hrtick_start_fair            313.00  1.1% __wake_up_common

    Signed-off-by: Mike Galbraith
    [ fixed !CONFIG_SCHED_HRTICK borkage ]
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1321971607.6855.17.camel@marge.simson.net
    Signed-off-by: Ingo Molnar

    Mike Galbraith
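
    The profile above is the motivation; the patch itself is essentially a
    guard change. As a rough, hypothetical illustration of the pattern (not
    the actual fix), checking a cheap feature flag in the caller keeps the
    disabled path from ever reaching the function that shows up in the
    profile:

        #include <stdbool.h>

        /* Hypothetical stand-in for the HRTICK scheduler feature flag. */
        static bool hrtick_enabled_flag;

        static void hrtick_start_fair_sketch(void)
        {
            /* ... the comparatively expensive hrtimer setup would live here ... */
        }

        /* Bail out on the cheap flag first, so a disabled feature costs
         * neither the call nor the work inside it. */
        static void hrtick_update_sketch(void)
        {
            if (!hrtick_enabled_flag)
                return;
            hrtick_start_fair_sketch();
        }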
     

06 Dec, 2011

3 commits

  • Introduce nr_busy_cpus in struct sched_group_power (not in struct
    sched_group, because sched groups are duplicated for the SD_OVERLAP
    scheduler domain). For each cpu that enters or exits idle, this field
    is updated in every scheduler group of the scheduler domains that the
    cpu belongs to.

    To avoid updating this state too frequently as the cpu enters and
    exits idle, the update on idle exit is delayed until the first timer
    tick after the cpu becomes busy. This is done using the NOHZ_IDLE flag
    in struct rq's nohz_flags (see the sketch below).

    Signed-off-by: Suresh Siddha
    Signed-off-by: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20111202010832.555984323@sbsiddha-desk.sc.intel.com
    Signed-off-by: Ingo Molnar

    Suresh Siddha
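
    A runnable sketch of the bookkeeping described above: a shared busy-cpu
    counter adjusted as cpus go idle or busy, with a per-cpu flag (playing
    the role of NOHZ_IDLE) ensuring the busy-side update happens only once,
    at the first tick after idle exit. The single flat counter, the array
    sizes and the function bodies are simplifications assumed for the
    sketch; the real counter lives in sched_group_power and is updated per
    scheduler-domain level.

        #include <stdatomic.h>
        #include <stdbool.h>
        #include <stdio.h>

        #define NR_CPUS 4

        /* Stand-in for the per-group nr_busy_cpus added to sched_group_power. */
        static atomic_int nr_busy_cpus = NR_CPUS;

        /* Per-cpu flag playing the role of NOHZ_IDLE in rq->nohz_flags. */
        static bool cpu_sd_idle[NR_CPUS];

        /* Called when a cpu stops the tick and goes tickless idle. */
        static void set_cpu_sd_state_idle(int cpu)
        {
            if (cpu_sd_idle[cpu])
                return;                 /* already accounted as idle */
            cpu_sd_idle[cpu] = true;
            atomic_fetch_sub(&nr_busy_cpus, 1);
        }

        /* Called from the first timer tick after the cpu becomes busy again,
         * rather than on every idle exit, so the update stays infrequent. */
        static void set_cpu_sd_state_busy(int cpu)
        {
            if (!cpu_sd_idle[cpu])
                return;                 /* never went idle, nothing to undo */
            cpu_sd_idle[cpu] = false;
            atomic_fetch_add(&nr_busy_cpus, 1);
        }

        int main(void)
        {
            set_cpu_sd_state_idle(2);
            set_cpu_sd_state_idle(3);
            printf("busy cpus: %d\n", atomic_load(&nr_busy_cpus));  /* 2 */
            set_cpu_sd_state_busy(3);
            printf("busy cpus: %d\n", atomic_load(&nr_busy_cpus));  /* 3 */
            return 0;
        }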
     
  • Introduce nohz_flags in struct rq; for now it tracks the following two
    flags.

    NOHZ_TICK_STOPPED records that the tick has been stopped. It is used to
    update the nohz idle load balancer data structures during the first busy
    tick after the tick is restarted; at that first busy tick after tickless
    idle, the flag is cleared again. This minimizes the nohz idle load
    balancer status updates that currently happen on every tickless exit,
    making it more scalable when many logical cpus enter and exit idle
    frequently.

    NOHZ_BALANCE_KICK tracks whether a nohz idle load balance is needed on
    this rq. It replaces the old nohz_balance_kick field in the rq, which
    was not being updated atomically (see the sketch below).

    Signed-off-by: Suresh Siddha
    Signed-off-by: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20111202010832.499438999@sbsiddha-desk.sc.intel.com
    Signed-off-by: Ingo Molnar

    Suresh Siddha
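
    A small runnable sketch of keeping both flags in one word that is
    updated atomically, which is the property the old nohz_balance_kick
    field lacked. The bit names follow the commit; the C11 atomics and the
    helper names are stand-ins for the kernel's atomic bitops.

        #include <stdatomic.h>
        #include <stdbool.h>
        #include <stdio.h>

        /* The two flags tracked in rq->nohz_flags. */
        enum {
            NOHZ_TICK_STOPPED = 1u << 0,
            NOHZ_BALANCE_KICK = 1u << 1,
        };

        /* Stand-in for one rq's nohz_flags word. */
        static atomic_uint nohz_flags;

        /* Set a flag; returns true if it was already set (test-and-set),
         * so only one caller actually performs the balance kick. */
        static bool nohz_test_and_set(unsigned int flag)
        {
            return atomic_fetch_or(&nohz_flags, flag) & flag;
        }

        static void nohz_clear(unsigned int flag)
        {
            atomic_fetch_and(&nohz_flags, ~flag);
        }

        int main(void)
        {
            /* Tick gets stopped when the cpu goes tickless idle. */
            nohz_test_and_set(NOHZ_TICK_STOPPED);

            /* Only the first kick request needs to do the work. */
            if (!nohz_test_and_set(NOHZ_BALANCE_KICK))
                printf("kick nohz idle balancer\n");
            if (!nohz_test_and_set(NOHZ_BALANCE_KICK))
                printf("this second kick is suppressed\n");

            /* First busy tick after the tick restarts: clear the stopped state. */
            nohz_clear(NOHZ_TICK_STOPPED);
            printf("flags now: %#x\n", atomic_load(&nohz_flags));
            return 0;
        }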
     
  • Instead of walking the scheduler domain hierarchy multiple times (to
    give priority to an idle core over an idle SMT sibling in a busy core),
    start with the highest scheduler domain that has the
    SD_SHARE_PKG_RESOURCES flag and traverse the hierarchy downwards until
    we find an idle group (see the sketch below).

    This cleanup also addresses an issue reported by Mike where the recent
    changes returned a busy thread even when an idle SMT sibling was present
    on single-socket platforms.

    Signed-off-by: Suresh Siddha
    Tested-by: Mike Galbraith
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1321556904.15339.25.camel@sbsiddha-desk.sc.intel.com
    Signed-off-by: Ingo Molnar

    Suresh Siddha
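
    A self-contained sketch of the traversal described above: start from the
    topmost level that shares package resources and walk downwards, taking
    the first group whose cpus are all idle, so a fully idle core is
    preferred over a lone idle SMT sibling. The two-level topology, the
    group layout and all names are illustrative assumptions, not the
    kernel's sched-domain structures.

        #include <stdbool.h>
        #include <stdio.h>

        #define NR_CPUS 8

        /* Illustrative idle map: cpu1 is an idle SMT sibling in a busy core,
         * cpus 4 and 5 form a completely idle core. */
        static bool cpu_idle[NR_CPUS] = {
            false, true,  false, false,
            true,  true,  false, false,
        };

        struct group { int cpus[2]; int nr; };

        /* Two domain levels under the SD_SHARE_PKG_RESOURCES span:
         * level 0 = cores (groups of SMT siblings), level 1 = single threads. */
        static struct group core_level[] = { {{0,1},2}, {{2,3},2}, {{4,5},2}, {{6,7},2} };
        static struct group smt_level[]  = { {{0},1}, {{1},1}, {{2},1}, {{3},1},
                                             {{4},1}, {{5},1}, {{6},1}, {{7},1} };

        static bool group_all_idle(const struct group *g)
        {
            for (int i = 0; i < g->nr; i++)
                if (!cpu_idle[g->cpus[i]])
                    return false;
            return true;
        }

        /* Walk from the highest shared-cache level down; an entirely idle
         * core is found (and preferred) before lone idle SMT siblings. */
        static int select_idle_sibling_sketch(int target)
        {
            const struct group *levels[] = { core_level, smt_level };
            const int counts[] = { 4, 8 };

            for (int lvl = 0; lvl < 2; lvl++)
                for (int g = 0; g < counts[lvl]; g++)
                    if (group_all_idle(&levels[lvl][g]))
                        return levels[lvl][g].cpus[0];
            return target;              /* nothing idle: stay put */
        }

        int main(void)
        {
            printf("picked cpu %d\n", select_idle_sibling_sketch(0));  /* 4: the idle core */
            return 0;
        }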
     

17 Nov, 2011

1 commit