20 May, 2011

9 commits

  • …/gregkh/driver-core-2.6

    * 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (44 commits)
    debugfs: Silence DEBUG_STRICT_USER_COPY_CHECKS=y warning
    sysfs: remove "last sysfs file:" line from the oops messages
    drivers/base/memory.c: fix warning due to "memory hotplug: Speed up add/remove when blocks are larger than PAGES_PER_SECTION"
    memory hotplug: Speed up add/remove when blocks are larger than PAGES_PER_SECTION
    SYSFS: Fix erroneous comments for sysfs_update_group().
    driver core: remove the driver-model structures from the documentation
    driver core: Add the device driver-model structures to kerneldoc
    Translated Documentation/email-clients.txt
    RAW driver: Remove call to kobject_put().
    reboot: disable usermodehelper to prevent fs access
    efivars: prevent oops on unload when efi is not enabled
    Allow setting of number of raw devices as a module parameter
    Introduce CONFIG_GOOGLE_FIRMWARE
    driver: Google Memory Console
    driver: Google EFI SMI
    x86: Better comments for get_bios_ebda()
    x86: get_bios_ebda_length()
    misc: fix ti-st build issues
    params.c: Use new strtobool function to process boolean inputs
    debugfs: move to new strtobool
    ...

    Fix up trivial conflicts in fs/debugfs/file.c due to the same patch
    being applied twice, and an unrelated cleanup nearby.

    Linus Torvalds
     
  • * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits)
    Revert "rcu: Decrease memory-barrier usage based on semi-formal proof"
    net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree
    batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu
    batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree()
    batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu
    net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu()
    net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu()
    net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu()
    net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu()
    net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu()
    perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu()
    perf,rcu: convert call_rcu(free_ctx) to kfree_rcu()
    net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu()
    net,rcu: convert call_rcu(net_generic_release) to kfree_rcu()
    net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu()
    net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu()
    security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu()
    net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu()
    net,rcu: convert call_rcu(xps_map_release) to kfree_rcu()
    net,rcu: convert call_rcu(rps_map_release) to kfree_rcu()
    ...

    Linus Torvalds
     
  • * 'x86-smep-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, cpu: Enable/disable Supervisor Mode Execution Protection
    x86, cpu: Add SMEP CPU feature in CR4
    x86, cpufeature: Add cpufeature flag for SMEP

    Linus Torvalds
     
  • * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, cpu: Fix detection of Celeron Covington stepping A1 and B0
    Documentation, ABI: Update L3 cache index disable text
    x86, AMD, cacheinfo: Fix L3 cache index disable checks
    x86, AMD, cacheinfo: Fix fallout caused by max3 conversion
    x86, cpu: Change NOP selection for certain Intel CPUs
    x86, cpu: Clean up and unify the NOP selection infrastructure
    x86, percpu: Use ASM_NOP4 instead of hardcoding P6_NOP4
    x86, cpu: Move AMD Elan Kconfig under "Processor family"

    Fix up trivial conflicts in alternative handling (commit dc326fca2b64
    "x86, cpu: Clean up and unify the NOP selection infrastructure" removed
    some hacky 5-byte instruction stuff, while commit d430d3d7e646 "jump
    label: Introduce static_branch() interface" renamed HAVE_JUMP_LABEL to
    CONFIG_JUMP_LABEL in the code that went away)

    Linus Torvalds
     
  • …kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
    sched: Fix and optimise calculation of the weight-inverse
    sched: Avoid going ahead if ->cpus_allowed is not changed
    sched, rt: Update rq clock when unthrottling of an otherwise idle CPU
    sched: Remove unused parameters from sched_fork() and wake_up_new_task()
    sched: Shorten the construction of the span cpu mask of sched domain
    sched: Wrap the 'cfs_rq->nr_spread_over' field with CONFIG_SCHED_DEBUG
    sched: Remove unused 'this_best_prio arg' from balance_tasks()
    sched: Remove noop in alloc_rt_sched_group()
    sched: Get rid of lock_depth
    sched: Remove obsolete comment from scheduler_tick()
    sched: Fix sched_domain iterations vs. RCU
    sched: Next buddy hint on sleep and preempt path
    sched: Make set_*_buddy() work on non-task entities
    sched: Remove need_migrate_task()
    sched: Move the second half of ttwu() to the remote cpu
    sched: Restructure ttwu() some more
    sched: Rename ttwu_post_activation() to ttwu_do_wakeup()
    sched: Remove rq argument from ttwu_stat()
    sched: Remove rq->lock from the first half of ttwu()
    sched: Drop rq->lock from sched_exec()
    ...

    * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched: Fix rt_rq runtime leakage bug

    Linus Torvalds
     
  • * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    irq: Export functions to allow modular irq drivers
    genirq: Uninline and sanity check generic_handle_irq()
    genirq: Remove pointless ifdefs
    genirq: Make generic irq chip depend on CONFIG_GENERIC_IRQ_CHIP
    genirq: Add chip suspend and resume callbacks
    genirq: Implement a generic interrupt chip
    genirq: Support per-IRQ thread disabling.
    genirq: irq_desc: Document preflow_handler and affinity_hint
    genirq: Update DocBook comments
    genirq: Forgotten updates/deletions after removal of compat code

    Linus Torvalds
     
  • …/git/tip/linux-2.6-tip

    * 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, gart: Rename pci-gart_64.c to amd_gart_64.c
    x86/amd-iommu: Use threaded interupt handler
    arch/x86/kernel/pci-iommu_table.c: Convert sprintf_symbol to %pS
    x86/amd-iommu: Add support for invalidate_all command
    x86/amd-iommu: Add extended feature detection
    x86/amd-iommu: Add ATS enable/disable code
    x86/amd-iommu: Add flag to indicate IOTLB support
    x86/amd-iommu: Flush device IOTLB if ATS is enabled
    x86/amd-iommu: Select PCI_IOV with AMD IOMMU driver
    PCI: Move ATS declarations in seperate header file
    dma-debug: print information about leaked entry
    x86/amd-iommu: Flush all internal TLBs when IOMMUs are enabled
    x86/amd-iommu: Rename iommu_flush_device
    x86/amd-iommu: Improve handling of full command buffer
    x86/amd-iommu: Rename iommu_flush* to domain_flush*
    x86/amd-iommu: Remove command buffer resetting logic
    x86/amd-iommu: Cleanup completion-wait handling
    x86/amd-iommu: Cleanup inv_pages command handling
    x86/amd-iommu: Move inv-dte command building to own function
    x86/amd-iommu: Move compl-wait command building to own function

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (34 commits)
    PM: Introduce generic prepare and complete callbacks for subsystems
    PM: Allow drivers to allocate memory from .prepare() callbacks safely
    PM: Remove CONFIG_PM_VERBOSE
    Revert "PM / Hibernate: Reduce autotuned default image size"
    PM / Hibernate: Add sysfs knob to control size of memory for drivers
    PM / Wakeup: Remove useless synchronize_rcu() call
    kmod: always provide usermodehelper_disable()
    PM / ACPI: Remove acpi_sleep=s4_nonvs
    PM / Wakeup: Fix build warning related to the "wakeup" sysfs file
    PM: Print a warning if firmware is requested when tasks are frozen
    PM / Runtime: Rework runtime PM handling during driver removal
    Freezer: Use SMP barriers
    PM / Suspend: Do not ignore error codes returned by suspend_enter()
    PM: Fix build issue in clock_ops.c for CONFIG_PM_RUNTIME unset
    PM: Revert "driver core: platform_bus: allow runtime override of dev_pm_ops"
    OMAP1 / PM: Use generic clock manipulation routines for runtime PM
    PM: Remove sysdev suspend, resume and shutdown operations
    PM / PowerPC: Use struct syscore_ops instead of sysdevs for PM
    PM / UNICORE32: Use struct syscore_ops instead of sysdevs for PM
    PM / AVR32: Use struct syscore_ops instead of sysdevs for PM
    ...

    Linus Torvalds
     
  • This reverts commit e59fb3120becfb36b22ddb8bd27d065d3cdca499.

    This reversion was due to (extreme) boot-time slowdowns on SPARC seen by
    Yinghai Lu and on x86 by Ingo.
    This is a non-trivial reversion due to intervening commits.

    Conflicts:

    Documentation/RCU/trace.txt
    kernel/rcutree.c

    Signed-off-by: Ingo Molnar

    Paul E. McKenney
     

18 May, 2011

4 commits

  • Enable/disable the newly documented SMEP (Supervisor Mode Execution
    Protection) CPU feature in the kernel. CR4.SMEP (bit 20) is 0 at power-on.
    If the feature is supported by the CPU (X86_FEATURE_SMEP), enable SMEP by
    setting CR4.SMEP. The new kernel option nosmep disables the feature even
    if the CPU supports it.

    [ hpa: moved the call to setup_smep() until after the vendor-specific
    initialization; that ensures that CPUID features are unmasked. We
    will still run it before we have userspace (never mind uncontrolled
    userspace). ]
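The enable/disable rule above can be sketched in plain C as a pure function on a CR4 value. This is only an illustration, not the kernel's code: the helper name smep_adjust_cr4() and its boolean arguments are hypothetical stand-ins for the X86_FEATURE_SMEP check and the nosmep option, and nothing here touches a real control register.

```c
#include <stdbool.h>
#include <stdint.h>

/* CR4.SMEP is bit 20, as stated in the commit message. */
#define CR4_SMEP_BIT 20
#define CR4_SMEP     (UINT64_C(1) << CR4_SMEP_BIT)

/*
 * Hypothetical helper: compute the CR4 value to load during CPU setup.
 * cpu_has_smep stands in for the X86_FEATURE_SMEP check and nosmep for
 * the "nosmep" kernel option.
 */
static uint64_t smep_adjust_cr4(uint64_t cr4, bool cpu_has_smep, bool nosmep)
{
	if (!cpu_has_smep)
		return cr4;             /* nothing to enable or disable */
	if (nosmep)
		return cr4 & ~CR4_SMEP; /* supported, but the user opted out */
	return cr4 | CR4_SMEP;          /* supported: turn protection on */
}
```

For example, smep_adjust_cr4(0, true, false) yields a value with only bit 20 set, while passing nosmep as true leaves that bit clear.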

    Signed-off-by: Fenghua Yu
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Fenghua Yu
     
  • If device drivers allocate substantial amounts of memory (above 1 MB)
    in their hibernate .freeze() callbacks (or in their legacy suspend
    callbacks during hibernation), the subsequent creation of the hibernate
    image may fail for lack of memory. This happens because
    the drivers' .freeze() callbacks are executed after the hibernate
    memory preallocation has been carried out and the preallocated amount
    of memory may be too small to cover the new driver allocations.
    Unfortunately, the drivers' .prepare() callbacks also are executed
    after the hibernate memory preallocation has completed, so they are
    not suitable for allocating additional memory either. Thus the only
    way a driver can safely allocate memory during hibernation is to use
    a hibernate/suspend notifier. However, the notifiers are called
    before the freezing of user space and the drivers wanting to use them
    for allocating additional memory may not know how much memory needs
    to be allocated at that point.

    To let device drivers overcome this difficulty, rework the hibernation
    sequence so that the memory preallocation is carried out after the
    drivers' .prepare() callbacks have been executed, so that the
    .prepare() callbacks can be used for allocating additional memory
    to be used by the drivers' .freeze() callbacks. Update documentation
    to match the new behavior of the code.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     
  • Martin reports that on his system hibernation occasionally fails due
    to the lack of memory, because the radeon driver apparently allocates
    too much of it during the device freeze stage. It turns out that the
    amount of memory allocated by radeon during hibernation (and
    presumably during system suspend too) depends on the utilization of
    the GPU (e.g. hibernating while there are two KDE 4 sessions with
    compositing enabled causes radeon to allocate more memory than for
    one KDE 4 session).

    In principle it should be possible to use image_size to make the
    memory preallocation mechanism free enough memory for the radeon
    driver, but in practice it is not easy to guess the right value
    because of the way the preallocation code uses image_size. For this
    reason, it seems reasonable to allow users to control the amount of
    memory reserved for driver allocations made after the hibernate
    preallocation, which currently is constant and amounts to 1 MB.

    Introduce a new sysfs file, /sys/power/reserved_size, whose value
    will be used as the amount of memory to reserve for the
    post-preallocation reservations made by device drivers, in bytes.
    For backwards compatibility, set its default (and initial) value to
    the currently used number (1 MB).
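A user-space program could read or adjust the new file along the following lines. This is a minimal sketch that assumes the file holds a single decimal byte count, as the description above suggests; the helper names and the RESERVED_SIZE_DEFAULT macro are made up for illustration, and the path is passed in so the helpers can be tried against an ordinary file.

```c
#include <stdio.h>

/* Default named in the commit message: 1 MB, expressed in bytes. */
#define RESERVED_SIZE_DEFAULT (1024UL * 1024UL)

/*
 * Read a decimal byte count from a sysfs-style file such as
 * /sys/power/reserved_size. Returns 0 on success, -1 on failure.
 */
static int read_reserved_size(const char *path, unsigned long *bytes)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%lu", bytes) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Write a new byte count; on a real system this usually needs root. */
static int write_reserved_size(const char *path, unsigned long bytes)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = (fprintf(f, "%lu\n", bytes) > 0);
	/* fclose() flushes the buffer; a failed flush means a failed write. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A caller that sees hibernation fail because a driver needs more memory might raise the value, for instance to 4 * RESERVED_SIZE_DEFAULT, before retrying.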

    References: https://bugzilla.kernel.org/show_bug.cgi?id=34102
    Reported-and-tested-by: Martin Steigerwald
    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     
  • acpi_sleep=s4_nonvs is superseded by acpi_sleep=nonvs, so remove it.

    Signed-off-by: WANG Cong
    Acked-by: Pavel Machek
    Acked-by: Len Brown
    Signed-off-by: Rafael J. Wysocki

    Amerigo Wang
     

17 May, 2011

1 commit


12 May, 2011

1 commit


10 May, 2011

1 commit


07 May, 2011

6 commits


06 May, 2011

9 commits

  • Increment a per-CPU counter on each pass through rcu_cpu_kthread()'s
    service loop, and add it to the rcudata trace output.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • This commit adds the age in jiffies of the current grace period along
    with the duration in jiffies of the longest grace period since boot
    to the rcu/rcugp debugfs file. It also adds an additional "O" state
    to kthread tracing to differentiate between the kthread waiting due to
    having nothing to do on the one hand and waiting due to being on the
    wrong CPU on the other hand.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit documents the new debugfs rcu/rcutorture and rcu/rcuboost
    trace files. The description has been updated as suggested by Josh
    Triplett.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney

    Paul E. McKenney
     
  • This commit adds an indication of the state of the callback queue using
    a string of four characters following the "ql=" integer queue length.
    The first character is "N" if there are callbacks that have been
    queued that are not yet ready to be handled by the next grace period, or
    "." otherwise. The second character is "R" if there are callbacks queued
    that are ready to be handled by the next grace period, or "." otherwise.
    The third character is "W" if there are callbacks waiting for the current
    grace period, or "." otherwise. Finally, the fourth character is "D"
    if there are callbacks that have been handled by a prior grace period
    and are waiting to be invoked, or ".".
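The four-character encoding described above can be sketched as follows. The struct and function names are hypothetical; the real trace code derives these conditions from the rcu_data callback list rather than from precomputed flags.

```c
#include <stdbool.h>

/* Hypothetical summary of one CPU's callback queue segments. */
struct cb_queue_state {
	bool has_next;       /* queued, not yet ready for the next grace period */
	bool has_next_ready; /* ready to be handled by the next grace period */
	bool has_wait;       /* waiting for the current grace period */
	bool has_done;       /* grace period over, waiting to be invoked */
};

/* Fill out[5] with the four-character state string plus a terminator. */
static void format_cb_state(const struct cb_queue_state *s, char out[5])
{
	/* Indexing a two-character literal with 0 or 1 picks '.' or the letter. */
	out[0] = ".N"[s->has_next];
	out[1] = ".R"[s->has_next_ready];
	out[2] = ".W"[s->has_wait];
	out[3] = ".D"[s->has_done];
	out[4] = '\0';
}
```

With only has_next and has_done set, the resulting string is "N..D"; an empty queue gives "....".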

    Note that callbacks that are in the process of being invoked are
    not shown. These callbacks would have been removed from the rcu_data
    structure's list by rcu_do_batch() prior to being executed. (These
    callbacks are also not reflected in the "ql=" total, FWIW.)

    Also, document the new callback-queue trace information.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • The trace.txt file had obsolete output for the debugfs rcu/rcudata
    file, so update it.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • If RCU priority boosting is to be meaningful, callback invocation must
    be boosted in addition to preempted RCU readers. Otherwise, in the presence
    of CPU real-time threads, the grace period ends, but the callbacks don't
    get invoked. If the callbacks don't get invoked, the associated memory
    doesn't get freed, so the system is still subject to OOM.

    But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
    moves the callback invocations to a kthread, which can be boosted easily.

    Also add comments and properly synchronize all accesses to
    rcu_cpu_kthread_task, as suggested by Lai Jiangshan.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • Combine the current TREE_PREEMPT_RCU ->blocked_tasks[] lists in the
    rcu_node structure into a single ->blkd_tasks list with ->gp_tasks
    and ->exp_tasks tail pointers. This is in preparation for RCU priority
    boosting, which will add a third dimension to the combinatorial explosion
    in the ->blocked_tasks[] case, but simply a third pointer in the new
    ->blkd_tasks case.
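The idea of one list with extra pointers into it can be sketched as below. This is a simplified user-space model, not the kernel's data structure: the type names are hypothetical, and it assumes that each marker points at the first entry of a segment that runs to the end of the list, so that adding one more category costs one more pointer rather than another set of lists.

```c
#include <stddef.h>

/* Hypothetical stand-in for a blocked task's list linkage. */
struct blkd_task {
	int pid;
	struct blkd_task *next;
};

/* Hypothetical, simplified view of the per-node bookkeeping. */
struct node_lists {
	struct blkd_task *blkd_tasks; /* every blocked task, newest first */
	struct blkd_task *gp_tasks;   /* first task blocking the normal grace period */
	struct blkd_task *exp_tasks;  /* first task blocking the expedited one */
};

/* Count the entries from a marker to the end of the list. */
static size_t count_from(const struct blkd_task *marker)
{
	size_t n = 0;

	for (; marker; marker = marker->next)
		n++;
	return n;
}

/*
 * Add a task that does not block any grace period already in progress.
 * It goes on the head, so the existing markers keep covering exactly
 * the same entries.
 */
static void add_blocked(struct node_lists *nl, struct blkd_task *t)
{
	t->next = nl->blkd_tasks;
	nl->blkd_tasks = t;
}
```

In this model a NULL marker means that no task is blocking the corresponding grace period.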

    Also update the documentation to reflect the blocked_tasks[] merge.

    Signed-off-by: Paul E. McKenney
    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • Commit d09b62d fixed grace-period synchronization, but left some smp_mb()
    invocations in rcu_process_callbacks() that were no longer needed; sheer
    paranoia prevented them from being removed. This commit removes
    them and provides a proof of correctness in their absence. It also adds
    a memory barrier to rcu_report_qs_rsp() immediately before the update to
    rsp->completed in order to handle the theoretical possibility that the
    compiler or CPU might move massive quantities of code into a lock-based
    critical section. This also proves that the sheer paranoia was not
    entirely unjustified, at least from a theoretical point of view.

    In addition, the old dyntick-idle synchronization depended on the fact
    that grace periods were many milliseconds in duration, so that it could
    be assumed that no dyntick-idle CPU could reorder a memory reference
    across an entire grace period. Unfortunately for this design, the
    addition of expedited grace periods breaks this assumption, which has
    the unfortunate side-effect of requiring atomic operations in the
    functions that track dyntick-idle state for RCU. (There is some hope
    that the algorithms used in user-level RCU might be applied here, but
    some work is required to handle the NMIs that user-space applications
    can happily ignore. For the short term, better safe than sorry.)

    This proof assumes that neither compiler nor CPU will allow a lock
    acquisition and release to be reordered, as doing so can result in
    deadlock. The proof is as follows:

    1. A given CPU declares a quiescent state under the protection of
    its leaf rcu_node's lock.

    2. If there is more than one level of rcu_node hierarchy, the
    last CPU to declare a quiescent state will also acquire the
    ->lock of the next rcu_node up in the hierarchy, but only
    after releasing the lower level's lock. The acquisition of this
    lock clearly cannot occur prior to the acquisition of the leaf
    node's lock.

    3. Step 2 repeats until we reach the root rcu_node structure.
    Please note again that only one lock is held at a time through
    this process. The acquisition of the root rcu_node's ->lock
    must occur after the release of that of the leaf rcu_node.

    4. At this point, we set the ->completed field in the rcu_state
    structure in rcu_report_qs_rsp(). However, if the rcu_node
    hierarchy contains only one rcu_node, then in theory the code
    preceding the quiescent state could leak into the critical
    section. We therefore precede the update of ->completed with a
    memory barrier. All CPUs will therefore agree that any updates
    preceding any report of a quiescent state will have happened
    before the update of ->completed.

    5. Regardless of whether a new grace period is needed, rcu_start_gp()
    will propagate the new value of ->completed to all of the leaf
    rcu_node structures, under the protection of each rcu_node's ->lock.
    If a new grace period is needed immediately, this propagation
    will occur in the same critical section that ->completed was
    set in, but courtesy of the memory barrier in #4 above, is still
    seen to follow any pre-quiescent-state activity.

    6. When a given CPU invokes __rcu_process_gp_end(), it becomes
    aware of the end of the old grace period and therefore makes
    any RCU callbacks that were waiting on that grace period eligible
    for invocation.

    If this CPU is the same one that detected the end of the grace
    period, and if there is but a single rcu_node in the hierarchy,
    we will still be in the single critical section. In this case,
    the memory barrier in step #4 guarantees that all callbacks will
    be seen to execute after each CPU's quiescent state.

    On the other hand, if this is a different CPU, it will acquire
    the leaf rcu_node's ->lock, and will again be serialized after
    each CPU's quiescent state for the old grace period.
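Steps 1 through 3 of the proof (report at the leaf, then climb the hierarchy while holding only one lock at a time) can be modelled in user space as below. This is a toy sketch under loud assumptions: struct node is a hypothetical, heavily simplified stand-in for rcu_node, the "lock" is a plain flag checked with assertions rather than a real spinlock, and the memory barrier from step 4 is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-simplified rcu_node: a lock flag and a pending mask. */
struct node {
	struct node *parent;   /* NULL at the root */
	unsigned long qsmask;  /* CPUs or children still owing a quiescent state */
	unsigned long grpmask; /* this node's bit in parent->qsmask */
	int locked;            /* stand-in for ->lock */
};

static int locks_held;         /* tracks the one-lock-at-a-time rule */

static void node_lock(struct node *n)
{
	assert(!n->locked);
	n->locked = 1;
	locks_held++;
	assert(locks_held == 1); /* never hold two levels at once */
}

static void node_unlock(struct node *n)
{
	assert(n->locked);
	n->locked = 0;
	locks_held--;
}

/*
 * Report a quiescent state for the CPU or child identified by mask,
 * starting at node n. Returns 1 if this report emptied the root's mask
 * (the grace period can end) and 0 otherwise.
 */
static int report_qs(struct node *n, unsigned long mask)
{
	for (;;) {
		node_lock(n);
		n->qsmask &= ~mask;
		if (n->qsmask != 0 || n->parent == NULL) {
			int done = (n->qsmask == 0);

			node_unlock(n);
			return done;
		}
		/* Last reporter at this level: drop the lock, then move up. */
		mask = n->grpmask;
		node_unlock(n);
		n = n->parent;
	}
}
```

In a two-level tree, only the report that clears the last bit of a leaf goes on to touch the root, and the assertion in node_lock() would fire if the walk ever held two levels' locks together.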

    On the strength of this proof, this commit therefore removes the memory
    barriers from rcu_process_callbacks() and adds one to rcu_report_qs_rsp().
    The effect is to reduce the number of memory barriers by one and to
    reduce the frequency of execution from about once per scheduling tick
    per CPU to once per grace period.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     
  • The RCU CPU stall warnings can now be controlled using the
    rcu_cpu_stall_suppress boot-time parameter or via the same parameter
    from sysfs. There is therefore no longer any reason to have
    kernel config parameters for this feature. This commit therefore
    removes the RCU_CPU_STALL_DETECTOR and RCU_CPU_STALL_DETECTOR_RUNNABLE
    kernel config parameters. The RCU_CPU_STALL_TIMEOUT parameter remains
    to allow the timeout to be tuned and the RCU_CPU_STALL_VERBOSE parameter
    remains to allow task-stall information to be suppressed if desired.

    Signed-off-by: Paul E. McKenney
    Reviewed-by: Josh Triplett

    Paul E. McKenney
     

04 May, 2011

1 commit


01 May, 2011

1 commit


30 Apr, 2011

2 commits

  • This patch introduces the 'memconsole' driver.

    Our firmware gives us access to an in-memory log of the firmware's
    output. This gives us visibility in a data-center of headless machines
    as to what the firmware is doing.

    The memory console is found by the driver by finding a header block in
    the EBDA. The buffer is then copied out, and is exported to userland in
    the file /sys/firmware/log.

    Signed-off-by: San Mehat
    Signed-off-by: Mike Waychison
    Signed-off-by: Greg Kroah-Hartman

    Mike Waychison
     
  • The "gsmi" driver bridges userland with firmware specific routines for
    accessing hardware.

    Currently, this driver only supports NVRAM and eventlog information.
    Deprecated functions have been removed from the driver, though their
    op-codes are left in place so that they are not re-used.

    This driver works by trampolining into the firmware via the smi_command
    outlined in the FADT table. Three protocols are used due to various
    limitations over time, but all are included herein.

    This driver should only ever load on Google boards, identified by either
    a "Google, Inc." board vendor string in DMI, or "GOOGLE" in the OEM
    strings of the FADT ACPI table. This logic happens in
    gsmi_system_valid().
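The board check described above amounts to two string tests. The sketch below is not the driver's gsmi_system_valid(); the function name is hypothetical, and it assumes the caller has already copied the DMI board vendor and the FADT OEM string into NUL-terminated buffers (either may be NULL if the table is absent).

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical check mirroring the rule in the commit message: accept
 * the machine if DMI reports the board vendor "Google, Inc." or if the
 * FADT OEM string contains "GOOGLE".
 */
static bool looks_like_google_board(const char *dmi_board_vendor,
				    const char *fadt_oem_id)
{
	if (dmi_board_vendor && strcmp(dmi_board_vendor, "Google, Inc.") == 0)
		return true;
	if (fadt_oem_id && strstr(fadt_oem_id, "GOOGLE") != NULL)
		return true;
	return false;
}
```

Refusing to load unless one of the two tests passes keeps the driver from issuing firmware calls on machines whose firmware does not implement them.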

    Signed-off-by: Duncan Laurie
    Signed-off-by: Aaron Durbin
    Signed-off-by: Mike Waychison
    Signed-off-by: Greg Kroah-Hartman

    Mike Waychison
     

29 Apr, 2011

5 commits

  • Recent Xeon processor thermal sensors are supported by the coretemp
    driver and not the adm1021 driver. Only one old generation of Xeon
    processors (the first Netburst ones) is supported by the adm1021
    driver.

    Reported-by: Darren Hart
    Signed-off-by: Jean Delvare
    Acked-by: Guenter Roeck

    Jean Delvare
     
  • The lm90 driver's attribute update interval is configurable.
    Reflect this information in the driver documentation.

    Signed-off-by: Guenter Roeck
    Signed-off-by: Jean Delvare

    Guenter Roeck
     
  • This patch adds support for ADT7461A and NCT1008 to the lm90 driver.
    Both chips have identical functionality and report the same manufacturing ID
    and device ID values.

    Signed-off-by: Guenter Roeck
    Signed-off-by: Jean Delvare

    Guenter Roeck
     
  • Change flex_array_prealloc to take the number of elements for which space
    should be allocated instead of the last (inclusive) element. Users
    and documentation are updated accordingly. flex_arrays got introduced before
    they had users. When folks started using them, they ended up needing a
    different API than was coded up originally. This swaps over to the API that
    folks apparently need.
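For a call site being converted, the argument change is an off-by-one adjustment: an inclusive last index becomes an element count. The helper below is a hypothetical illustration of that arithmetic only; it is not part of the flex_array API.

```c
#include <assert.h>

/*
 * Hypothetical helper for converting call sites: given the old-style
 * arguments (first element, last element inclusive), return the element
 * count that the new interface expects.
 */
static unsigned int prealloc_count(unsigned int start, unsigned int last)
{
	assert(last >= start); /* an inclusive range is never empty */
	return last - start + 1;
}
```

A caller that used to preallocate elements 0 through 9 by passing 9 as the last element would now pass prealloc_count(0, 9), that is, 10.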

    Based-on-patch-by: Steffen Klassert
    Signed-off-by: Eric Paris
    Tested-by: Chris Richards
    Acked-by: Dave Hansen
    Cc: stable@kernel.org [2.6.38+]

    Eric Paris
     
  • Since 569b846d ("memcg: coalesce uncharge during unmap/truncate"), we do
    batched (delayed) uncharge at truncation/unmap. And since cdec2e42
    ("memcg: coalesce charging via percpu storage"), we have a percpu cache
    for res_counter.

    These changes greatly improved the performance of the memory cgroup, but
    they usually make res_counter->usage larger than the actual memory usage.
    As a result, the *.usage_in_bytes files, which show res_counter->usage,
    no longer give precise values of memory (and swap) usage.

    Instead of removing these files completely (because we cannot know
    res_counter->usage without them), this patch updates the meaning of those
    files.

    Signed-off-by: Daisuke Nishimura
    Acked-by: KAMEZAWA Hiroyuki
    Acked-by: Michal Hocko
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daisuke Nishimura