23 Nov, 2010

4 commits


18 Nov, 2010

13 commits

  • Formerly sched_group_set_shares would force a rebalance by overflowing domain
    share sums. Now that per-cpu averages are maintained we can set the true value
    by issuing an update_cfs_shares() following a tg->shares update.

    Also initialize tg se->load to 0 for consistency since we'll now set correct
    weights on enqueue.

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Turner
     
  • Refactor the global load updates from update_shares_cpu() so that
    update_cfs_load() can update global load when it is more than ~10%
    out of sync.

    The new global_load parameter allows us to force an update, regardless of
    the error factor so that we can synchronize w/ update_shares().

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Turner
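A minimal sketch of the "update when out of sync, or when forced" rule described above. The struct and function names are hypothetical stand-ins, not the scheduler's own, and the 1/8 threshold is an assumed approximation of "~10%".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the scheduler structures. */
struct group_load {
    long total;         /* group-wide load sum shared by all CPUs */
};

struct cpu_load {
    long avg;           /* this CPU's current load average */
    long contribution;  /* what this CPU last folded into total */
};

/*
 * Fold this CPU's load into the group total when the published
 * contribution has drifted too far, or when the caller forces it.
 * The 1/8 threshold is an assumption standing in for "~10%".
 * Returns true if the global sum was updated.
 */
static bool sync_global_load(struct group_load *g, struct cpu_load *c,
                             bool force_update)
{
    long delta;

    assert(g != NULL && c != NULL);
    delta = c->avg - c->contribution;

    if (!force_update && labs(delta) <= c->contribution / 8)
        return false;

    g->total += delta;
    c->contribution += delta;
    return true;
}
```

Small drifts stay local to the CPU, so the shared sum is only written when the error is large enough to matter or when a caller needs an exact value.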
     
  • When the system is busy, dilation of rq->next_balance makes lb->update_shares()
    insufficiently frequent for threads which don't sleep (no dequeue/enqueue
    updates). Adjust for this by issuing demand-based updates once enough
    execution time has accumulated to wrap our averaging window.

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Turner
     
  • Since shares updates are no longer expensive and effectively local, update them
    at idle_balance(). This allows us to more quickly redistribute shares to
    another cpu when our load becomes idle.

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Turner
     
  • Introduce a new sysctl for the shares window and disambiguate it from
    sched_time_avg.

    A 10ms window appears to be a good compromise between accuracy and performance.

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Turner
     
  • Avoid duplicate shares update calls by ensuring children always appear before
    parents in rq->leaf_cfs_rq_list.

    This allows us to do a single in-order traversal for update_shares().

    Since we always enqueue in bottom-up order this reduces to 2 cases:

    1) Our parent is already in the list, e.g.

    root
        \
         b
        / \
       c   d* (root->b->c already enqueued)

    Since d's parent is enqueued we push it to the head of the list, implicitly ahead of b.

    2) Our parent does not appear in the list (or we have no parent)

    In this case we enqueue to the tail of the list; if our parent is subsequently
    enqueued (bottom-up) it will appear to our right by the same rule.

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Turner
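The two insertion cases above can be sketched with a plain singly linked list. This is an illustration only: struct node, struct leaf_list and both functions are hypothetical names, and the real code uses the kernel's list primitives rather than this hand-rolled list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a group run queue on one CPU. */
struct node {
    const char *name;
    struct node *parent;  /* NULL for the root */
    struct node *next;    /* link in the leaf list */
    bool on_list;
};

struct leaf_list {
    struct node *head;
    struct node *tail;
};

static void leaf_list_add(struct leaf_list *l, struct node *n)
{
    assert(!n->on_list);
    n->next = NULL;

    if (n->parent && n->parent->on_list) {
        /* Case 1: parent already listed, so go to the head (ahead of it). */
        n->next = l->head;
        l->head = n;
        if (!l->tail)
            l->tail = n;
    } else {
        /* Case 2: no listed parent, so go to the tail; a parent added
         * later also lands at the tail, i.e. to our right. */
        if (l->tail)
            l->tail->next = n;
        else
            l->head = n;
        l->tail = n;
    }
    n->on_list = true;
}

/* Returns true if no node appears before one of its own children. */
static bool children_before_parents(const struct leaf_list *l)
{
    for (const struct node *n = l->head; n; n = n->next)
        for (const struct node *m = n->next; m; m = m->next)
            if (m->parent == n)
                return false;
    return true;
}
```

Enqueueing c, b, root and then d (the example from the message) yields the order d, c, b, root, so one in-order pass visits every child before its parent.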
     
  • Using cfs_rq->nr_running is not sufficient to synchronize update_cfs_load with
    the put path since nr_running accounting occurs at deactivation.

    It's also not safe to make the removal decision based on load_avg as this fails
    with both high periods and low shares. Resolve this by clipping history after
    4 periods without activity.

    Note: the above will always occur from update_shares() since in the
    last-task-sleep-case that task will still be cfs_rq->curr when update_cfs_load
    is called.

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Turner
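A rough sketch of a windowed load average that clips its history after four idle windows, as described above. The field and function names are hypothetical and the arithmetic is simplified; it is not the scheduler's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified per-queue load-tracking state. */
struct load_track {
    uint64_t stamp;   /* time of the last update */
    uint64_t last;    /* time the load was last non-zero */
    uint64_t period;  /* time accumulated in the current window */
    uint64_t avg;     /* load-weighted time accumulated so far */
};

static void track_update(struct load_track *t, uint64_t now,
                         uint64_t load, uint64_t window)
{
    uint64_t delta;

    assert(window > 0 && now >= t->stamp);
    delta = now - t->stamp;

    /* Clip history once more than four windows passed without activity. */
    if (t->stamp > t->last && now - t->last > 4 * window) {
        t->period = 0;
        t->avg = 0;
    }

    t->stamp = now;
    t->period += delta;
    if (load) {
        t->last = now;
        t->avg += delta * load;
    }

    /* Decay by halving whenever the accumulated time wraps the window. */
    while (t->period > window) {
        t->period /= 2;
        t->avg /= 2;
    }
}
```

Keying the decision on elapsed idle time, rather than on the decayed average reaching zero, keeps the rule independent of the window length and of how small the group's shares are.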
     
  • As part of enqueue_entity both a new entity weight and its contribution to the
    queuing cfs_rq / rq are updated. Since update_cfs_shares will only update the
    queueing weights when the entity is on_rq (which in this case it is not yet),
    there's a dependency loop here:

    update_cfs_shares needs account_entity_enqueue to update cfs_rq->load.weight
    account_entity_enqueue needs the updated weight for the queuing cfs_rq load[*]

    Fix this and avoid spurious dequeue/enqueues by issuing update_cfs_shares as
    if we had accounted the enqueue already.

    This was also resulting in rq->load corruption previously.

    [*]: this dependency also exists when using the group cfs_rq w/
    update_cfs_shares as the weight of the enqueued entity changes
    without the load being updated.

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Turner
     
  • Make tg_shares_up() use the active cgroup list. This means we cannot
    do a strict bottom-up walk of the hierarchy, but assuming it's a very
    wide tree with a small number of active groups it should be a win.

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Make certain load-balance actions scale per number of active cgroups
    instead of the number of existing cgroups.

    This makes wakeup/sleep paths more expensive, but is a win for systems
    where the vast majority of existing cgroups are idle.

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • By tracking a per-cpu load-avg for each cfs_rq and folding it into a
    global task_group load on each tick we can rework tg_shares_up to be
    strictly per-cpu.

    This should improve cpu-cgroup performance for smp systems
    significantly.

    [ Paul: changed to use queueing cfs_rq + bug fixes ]

    Signed-off-by: Paul Turner
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • While discussing the need for sched_idle_next(), Oleg remarked that
    since try_to_wake_up() ensures sleeping tasks will end up running on a
    sane cpu, we can do away with migrate_live_tasks().

    If we then extend the existing hack of migrating current from
    CPU_DYING to migrating the full rq worth of tasks from CPU_DYING, the
    need for the sched_idle_next() abomination disappears as well, since
    idle will be the only possible thread left after the migration thread
    stops.

    This greatly simplifies the hot-unplug task migration path, as can be
    seen from the resulting code reduction (and about half the new lines
    are comments).

    Suggested-by: Oleg Nesterov
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Merge reason: Move to a .37-rc base.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

16 Nov, 2010

23 commits

  • Linus Torvalds
     
  • The addition of CONFIG_SECURITY_DMESG_RESTRICT resulted in a build
    failure when CONFIG_PRINTK=n. This is because the capabilities code
    which used the new option was built even though the variable in question
    didn't exist.

    The patch here fixes this by moving the capabilities checks out of the
    LSM and into the caller. All (known) LSMs should have been calling the
    capabilities hook already so it actually makes the code organization
    better to eliminate the hook altogether.

    Signed-off-by: Eric Paris
    Acked-by: James Morris
    Signed-off-by: Linus Torvalds

    Eric Paris
     
  • …/git/tmlind/linux-omap-2.6

    * 'omap-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6:
    arm: omap1: devices: need to return with a value
    OMAP1: camera.h: add missing include
    omap: dma: Add read-back to DMA interrupt handler to avoid spurious interrupts
    OMAP2: Devkit8000: Fix mmc regulator failure

    Linus Torvalds
     
  • * 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
    hwmon: (w83795) Check for BEEP pin availability
    hwmon: (w83795) Clear intrusion alarm immediately
    hwmon: (w83795) Read the intrusion state properly
    hwmon: (w83795) Print the actual temperature channels as sources
    hwmon: (w83795) List all usable temperature sources
    hwmon: (w83795) Expose fan control method
    hwmon: (w83795) Fix fan control mode attributes
    hwmon: (lm95241) Check validity of input values
    hwmon: Change mail address of Hans J. Koch

    Linus Torvalds
     
  • * 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
    i2c: Sanity checks on adapter registration
    i2c: Mark i2c_adapter.id as deprecated
    i2c: Drivers shouldn't include
    i2c: Delete unused adapter IDs
    i2c: Remove obsolete cleanup for clientdata

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
    PCI: sysfs: fix printk warnings
    PCI: fix pci_bus_alloc_resource() hang, prefer positive decode
    PCI: read current power state at enable time
    PCI: fix size checks for mmap() on /proc/bus/pci files
    x86/PCI: coalesce overlapping host bridge windows
    PCI hotplug: ibmphp: Add check to prevent reading beyond mapped area

    Linus Torvalds
     
  • Make sure I2C adapters being registered have the required struct
    fields set. If they don't, problems will happen later.

    Signed-off-by: Jean Delvare

    Jean Delvare
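As an illustration of this kind of registration-time check, here is a minimal userspace sketch. The struct layout, the choice of required fields (a name and an operations table) and the function name are assumptions made for the example, not the I2C core's actual definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical operations table an adapter must provide. */
struct bus_ops {
    int (*xfer)(void *msg);
};

/* Hypothetical adapter descriptor. */
struct adapter {
    char name[48];
    const struct bus_ops *ops;
    int registered;
};

/* Refuse adapters with missing required fields instead of failing later. */
static int adapter_register(struct adapter *a)
{
    assert(a != NULL);

    if (a->name[0] == '\0')
        return -EINVAL;   /* unnamed adapter */
    if (a->ops == NULL)
        return -EINVAL;   /* no way to talk to the bus */

    a->registered = 1;
    return 0;
}
```

Rejecting the adapter up front turns a hard-to-trace failure at first use into an immediate error at the call site that caused it.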
     
  • It's about time to make it clear that i2c_adapter.id is deprecated.
    Hopefully this will remind the last user to move over to a different
    strategy.

    Signed-off-by: Jean Delvare
    Acked-by: Jarod Wilson
    Acked-by: Mauro Carvalho Chehab
    Acked-by: Hans Verkuil

    Jean Delvare
     
  • Drivers don't need to include , especially not when
    they don't use anything that header file provides.

    Signed-off-by: Jean Delvare
    Cc: Michael Hunold
    Acked-by: Mauro Carvalho Chehab

    Jean Delvare
     
  • Delete unused I2C adapter IDs. Special cases are:

    * I2C_HW_B_RIVA was still set in driver rivafb, however no other
    driver is ever looking for this value, so we can safely remove it.
    * I2C_HW_B_HDPVR is used in staging driver lirc_zilog, however no
    adapter ID is ever set to this value, so the code in question never
    runs. As the code additionally expects that I2C_HW_B_HDPVR may not
    be defined, we can delete it now and let the lirc_zilog driver
    maintainer rewrite this piece of code.

    Big thanks to Hans Verkuil for doing all the hard work :)

    Signed-off-by: Jean Delvare
    Acked-by: Jarod Wilson
    Acked-by: Mauro Carvalho Chehab
    Acked-by: Hans Verkuil

    Jean Delvare
     
  • A few new i2c-drivers came into the kernel which clear the clientdata-pointer
    on exit. This is now obsolete, so fix it and hope the word will spread.

    Signed-off-by: Wolfram Sang
    Acked-by: Alan Cox
    Acked-by: Guennadi Liakhovetski
    Acked-by: Greg Kroah-Hartman
    Signed-off-by: Jean Delvare

    Wolfram Sang
     
  • Move the logging bits from kernel.h into printk.h so that
    there is a bit more logical separation of the generic from
    the printk logging specific parts.

    Signed-off-by: Joe Perches
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Commit 6b4e81db2552 ("i8k: Tell gcc that *regs gets clobbered") worked
    around gcc miscompiling i8k.c by adding "+m (*regs)", but that caused
    register pressure problems and a build failure.

    Changing the 'asm' statement to 'asm volatile' instead should prevent
    that and works around the gcc bug as well, so we can remove the "+m".

    [ Background on the gcc bug: a memory clobber fails to mark the function
    the asm resides in as non-pure (aka "__attribute__((const))"), so if
    the function does nothing else that triggers the non-pure logic, gcc
    will think that that function has no side effects at all. As a result,
    callers will be mis-compiled.

    Adding the "+m" made gcc see that it's not a pure function, and so
    does "asm volatile". The problem was never really the need to mark
    "*regs" as changed, since the memory clobber did that part - the
    problem was just a bug in the gcc "pure" function analysis - Linus ]

    Signed-off-by: Jim Bos
    Acked-by: Jakub Jelinek
    Cc: Andi Kleen
    Cc: Andreas Schwab
    Signed-off-by: Linus Torvalds

    Jim Bos
     
  • On the W83795ADG, there's a single pin for BEEP and OVT#, so you
    can't have both. Check the configuration and don't create beep
    attributes when BEEP pin is not available.

    The W83795G has a dedicated BEEP pin so the functionality is always
    available there.

    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • When asked to clear the intrusion alarm, do so immediately. We have to
    invalidate the cache to make sure the new status will be read. But we
    also have to read from the status register once to clear the pending
    alarm, as writing to CLR_CHS surprisingly won't clear it automatically.

    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • We can't read the intrusion state from the real-time alarm registers
    as we do for all other alarm flags, because real-time alarm bits don't
    stick (by definition) and the intrusion state has to stick until
    explicitly cleared (otherwise it has little value).

    So we have to use the interrupt status register instead, which is read
    from the same address but with a configuration bit flipped in another
    register.

    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • Don't expose raw register values to user-space. Decode and encode
    temperature channels selected as temperature sources as needed.

    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • Temperature sources are not correlated directly with temperature
    channels. A look-up table is required to find out which temperature
    sources can be used depending on which temperature channels (both
    analog and digital) are enabled.

    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • Expose fan control method (DC vs. PWM) using the standard sysfs
    attributes. I've made it read-only as the board should be wired for
    a given mode, the BIOS should have set up the chip for this mode, and
    you shouldn't have to change it. But it would be easy enough to make
    it changeable if someone comes up with a use case.

    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • There were two bugs:
    * Speed cruise mode was improperly reported for all fans but fan1.
    * Fan control method (PWM vs. DC) was mixed with the control mode.
    It will be added back as a separate attribute, as per the standard
    sysfs interface.

    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • This clears the following build-time warnings I was seeing:

    drivers/hwmon/lm95241.c: In function "set_interval":
    drivers/hwmon/lm95241.c:132:15: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
    drivers/hwmon/lm95241.c: In function "set_max2":
    drivers/hwmon/lm95241.c:278:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
    drivers/hwmon/lm95241.c: In function "set_max1":
    drivers/hwmon/lm95241.c:277:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
    drivers/hwmon/lm95241.c: In function "set_min2":
    drivers/hwmon/lm95241.c:249:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
    drivers/hwmon/lm95241.c: In function "set_min1":
    drivers/hwmon/lm95241.c:248:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
    drivers/hwmon/lm95241.c: In function "set_type2":
    drivers/hwmon/lm95241.c:220:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result
    drivers/hwmon/lm95241.c: In function "set_type1":
    drivers/hwmon/lm95241.c:219:1: warning: ignoring return value of "strict_strtol", declared with attribute warn_unused_result

    This also fixes a small race in set_interval() as a side effect: by
    working with a temporary local variable we prevent data->interval from
    being accessed at a time it contains the interval value in the wrong
    unit.

    Signed-off-by: Jean Delvare
    Cc: Davide Rizzo

    Jean Delvare
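The pattern described above (check the parse result, convert in a local variable, store only the final value) can be sketched in userspace with strtol(). Everything here is a hypothetical analogue: struct sensor, TICKS_PER_SEC and set_interval_ms() are not taken from the lm95241 driver.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define TICKS_PER_SEC 100  /* assumed tick rate for the example */

/* Hypothetical device state; interval_ticks is always kept in ticks. */
struct sensor {
    long interval_ticks;
};

/*
 * Parse a millisecond interval from text. The shared field is written
 * once, after validation and unit conversion, so it never holds a
 * value in the wrong unit.
 */
static int set_interval_ms(struct sensor *s, const char *buf)
{
    char *end;
    long val;

    assert(s != NULL && buf != NULL);

    errno = 0;
    val = strtol(buf, &end, 10);
    if (errno != 0 || end == buf || (*end != '\0' && *end != '\n'))
        return -EINVAL;   /* not a clean decimal number */
    if (val < 0 || val > LONG_MAX / TICKS_PER_SEC)
        return -EINVAL;   /* out of range for the conversion */

    s->interval_ticks = val * TICKS_PER_SEC / 1000;
    return 0;
}
```

On a rejected input the stored interval is left untouched, which is the behaviour the warnings were pointing at: the parse result must be checked before the value is used.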
     
  • My old mail address doesn't exist anymore. This changes all occurrences
    to my new address.

    Signed-off-by: Hans J. Koch
    Signed-off-by: Jean Delvare

    Hans J. Koch
     
  • Cast pci_resource_start() and pci_resource_len() to u64 for printk.

    drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 9 has type 'resource_size_t'
    drivers/pci/pci-sysfs.c:753: warning: format '%16Lx' expects type 'long long unsigned int', but argument 10 has type 'resource_size_t'

    Signed-off-by: Randy Dunlap
    Signed-off-by: Jesse Barnes

    Randy Dunlap
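A userspace sketch of the same cast, using snprintf() in place of printk(). The resource_size_t typedef here is a hypothetical stand-in (the kernel type can be 32 or 64 bits wide depending on configuration), and format_range() is an invented helper.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a type whose width depends on configuration. */
typedef unsigned long resource_size_t;

/*
 * Cast to unsigned long long so the arguments always match the
 * "%llx" conversion, whatever the width of resource_size_t.
 */
static int format_range(char *buf, size_t len,
                        resource_size_t start, resource_size_t size)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "start=0x%16llx size=0x%16llx",
                    (unsigned long long)start,
                    (unsigned long long)size);
}
```

The explicit cast keeps the format string and the argument types in agreement on every configuration, which is what silences the warning.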