08 Aug, 2011

1 commit

  • task->cred is declared as __rcu, and access to other tasks' ->cred is,
    indeed, protected. Access to current->cred does not need rcu_dereference()
    at all, since only the task itself can change its ->cred. sparse, of
    course, has no way of knowing that...

    Add force-cast in current_cred(), make current_fsuid() et al. use it.
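
    A minimal user-space sketch of the idea, not the kernel's actual
    definitions: the annotations are stubbed out as empty macros (an
    assumption about non-sparse builds), and the structures are cut down
    to one field each for illustration.

```c
#include <assert.h>

/* Assumption: outside a sparse run these annotations expand to nothing. */
#ifndef __rcu
#define __rcu
#endif
#ifndef __force
#define __force
#endif

/* Cut-down stand-ins for the kernel structures (illustration only). */
struct cred { unsigned int fsuid; };
struct task_struct { const struct cred __rcu *cred; };

static struct cred init_cred = { .fsuid = 1000 };
static struct task_struct init_task = { .cred = &init_cred };
static struct task_struct *current = &init_task;

/*
 * Only the task itself changes current->cred, so no rcu_dereference()
 * is needed here; the force-cast merely strips the __rcu annotation so
 * that sparse does not complain about the plain access.
 */
#define current_cred()  ((__force const struct cred *)current->cred)
#define current_fsuid() (current_cred()->fsuid)
```

    Accessors for other tasks' ->cred would still go through
    rcu_dereference(); only the current-task helpers take this shortcut.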

    Signed-off-by: Al Viro
    Signed-off-by: Linus Torvalds

    Al Viro
     

07 Aug, 2011

9 commits

  • * 'for-linus' of git://git.open-osd.org/linux-open-osd:
    ore: Make ore its own module
    exofs: Rename raid engine from exofs/ios.c => ore
    exofs: ios: Move to a per inode components & device-table
    exofs: Move exofs specific osd operations out of ios.c
    exofs: Add offset/length to exofs_get_io_state
    exofs: Fix truncate for the raid-groups case
    exofs: Small cleanup of exofs_fill_super
    exofs: BUG: Avoid sbi realloc
    exofs: Remove pnfs-osd private definitions
    nfs_xdr: Move nfs4_string definition out of #ifdef CONFIG_NFS_V4

    Linus Torvalds
     
  • The inode structure layout is largely random, and some of the vfs paths
    really do care. The path lookup in particular is already quite D$
    intensive, and profiles show that accessing the 'inode->i_op->xyz'
    fields is quite costly.

    We already optimized the dcache to not unnecessarily load the d_op
    structure for members that are often NULL using the DCACHE_OP_xyz bits
    in dentry->d_flags, and this does something very similar for the inode
    ops that are used during pathname lookup.

    It also re-orders the fields so that the fields accessed by 'stat' are
    together at the beginning of the inode structure, and roughly in the
    order accessed.

    The effect of this seems to be in the 1-2% range for an empty kernel
    "make -j" run (which is fairly kernel-intensive, mostly in filename
    lookup), so it's visible. The numbers are fairly noisy, though, and
    likely depend a lot on exact microarchitecture. So there's more tuning
    to be done.
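
    The flag-caching idea can be sketched in user space as follows. The
    names (opflags, OPF_FASTPERM, the *_sketch helpers) are hypothetical,
    and the locking a real inode would need when setting the bit is left
    out.

```c
#include <assert.h>
#include <stddef.h>

struct inode;

struct inode_operations {
	int (*permission)(struct inode *inode, int mask);
};

/* Hypothetical flag bit: set once ->permission is known to be NULL. */
#define OPF_FASTPERM 0x0001u

struct inode {
	unsigned int opflags;                 /* hot: lives in the inode itself */
	const struct inode_operations *i_op;  /* cold: a separate cache line */
};

static int i_op_loads;  /* counts i_op dereferences, for illustration only */

static int generic_permission_sketch(int mask)
{
	(void)mask;
	return 0;  /* stand-in for the generic check; always grants access */
}

static int inode_permission_sketch(struct inode *inode, int mask)
{
	if (inode->opflags & OPF_FASTPERM)
		return generic_permission_sketch(mask);  /* i_op never touched */

	i_op_loads++;
	if (inode->i_op->permission)
		return inode->i_op->permission(inode, mask);

	inode->opflags |= OPF_FASTPERM;  /* remember that the op is NULL */
	return generic_permission_sketch(mask);
}
```

    After the first lookup on an inode without a ->permission op, later
    calls are decided from the flag word alone and never load i_op.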

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Gcc tends to generate better code with small integers, including the
    DCACHE_xyz flag tests - so move the common ones to be first in the list.
    Also just remove the unused DCACHE_INOTIFY_PARENT_WATCHED and
    DCACHE_AUTOFS_PENDING values, their users no longer exist in the source
    tree.

    And add a "unlikely()" to the DCACHE_OP_COMPARE test, since we want the
    common case to be a nice straight-line fall-through.
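
    A small sketch of the pattern, assuming a GCC-compatible compiler for
    __builtin_expect(). The flag values and the dentry_sketch structure
    are illustrative, not copied from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Branch hint: tell the compiler the condition is rarely true. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Illustrative values: the commonly tested flags get the small integers. */
#define DCACHE_OP_HASH    0x0001u
#define DCACHE_OP_COMPARE 0x0002u

typedef int (*compare_fn)(const char *a, const char *b, size_t len);

struct dentry_sketch {
	unsigned int d_flags;
	compare_fn d_compare;  /* only valid when DCACHE_OP_COMPARE is set */
	const char *name;
	size_t len;
};

static int names_match(const struct dentry_sketch *d,
		       const char *name, size_t len)
{
	/* Rare case: the filesystem supplies its own comparison. */
	if (unlikely(d->d_flags & DCACHE_OP_COMPARE))
		return d->d_compare(d->name, name, len) == 0;

	/* Common case: straight-line fall-through to a plain memcmp(). */
	return d->len == len && memcmp(d->name, name, len) == 0;
}
```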

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
    net: Compute protocol sequence numbers and fragment IDs using MD5.
    crypto: Move md5_transform to lib/md5.c

    Linus Torvalds
     
  • ORE stands for "Objects Raid Engine"

    This patch is a mechanical rename of everything that was in ios.c
    and its API declaration to an ore.c and an osd_ore.h header. The ore
    engine will later be used by the pnfs objects layout driver.

    * File ios.c => ore.c

    * Declaration of types and API are moved from exofs.h to a new
    osd_ore.h

    * All used types are prefixed by ore_ from their exofs_ name.

    * Shift includes from exofs.h to osd_ore.h so osd_ore.h is
    independent, include it from exofs.h.

    Other than a pure rename there are no other changes. Next patch
    will move the ore into its own module and will export the API
    to be used by exofs and later the layout driver

    Signed-off-by: Boaz Harrosh

    Boaz Harrosh
     
  • Computers have become a lot faster since we compromised on the
    partial MD4 hash which we use currently for performance reasons.

    MD5 is a much safer choice, and is in line with both RFC1948 and
    other ISS generators (OpenBSD, Solaris, etc.)

    Furthermore, only having 24-bits of the sequence number be truly
    unpredictable is a very serious limitation. So the periodic
    regeneration and 8-bit counter have been removed. We compute and
    use a full 32-bit sequence number.

    For ipv6, DCCP was found to use a 32-bit truncated initial sequence
    number (it needs 48 bits) and that is fixed here as well.

    Reported-by: Dan Kaminsky
    Tested-by: Willy Tarreau
    Signed-off-by: David S. Miller

    David S. Miller
     
  • We are going to use this for TCP/IP sequence number and fragment ID
    generation.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86: (38 commits)
    acer-wmi: support Lenovo ideapad S205 wifi switch
    acerhdf.c: spaces in aliased changed to *
    platform-drivers-x86: ideapad-laptop: add missing ideapad_input_exit in ideapad_acpi_add error path
    x86 driver: fix typo in TDP override enabling
    Platform: fix samsung-laptop DMI identification for N150/N210/220/N230
    dell-wmi: Add keys for Dell XPS L502X
    platform-drivers-x86: samsung-q10: make dmi_check_callback return 1
    Platform: Samsung Q10 backlight driver
    platform-drivers-x86: intel_scu_ipc: convert to DEFINE_PCI_DEVICE_TABLE
    platform-drivers-x86: intel_rar_register: convert to DEFINE_PCI_DEVICE_TABLE
    platform-drivers-x86: intel_menlow: add missing return AE_OK for intel_menlow_register_sensor()
    platform-drivers-x86: intel_mid_thermal: fix memory leak
    platform-drivers-x86: msi-wmi: add missing sparse_keymap_free in msi_wmi_init error path
    Samsung Laptop platform driver: support N510
    asus-wmi: add uwb rfkill support
    asus-wmi: add gps rfkill support
    asus-wmi: add CWAP support and clarify the meaning of WAPF bits
    asus-wmi: return proper value in store_cpufv()
    asus-wmi: check for temp1 presence
    asus-wmi: add thermal sensor
    ...

    Linus Torvalds
     
  • For ChromiumOS, we use SHA-1 to verify the integrity of the root
    filesystem. The speed of the kernel sha-1 implementation has a major
    impact on our boot performance.

    To improve boot performance, we investigated using the heavily optimized
    sha-1 implementation used in git. With the git sha-1 implementation, we
    see an 11.7% improvement in boot time.

    10 reboots, remove slowest/fastest.

    Before:

    Mean: 6.58 seconds Stdev: 0.14

    After (with git sha-1, this patch):

    Mean: 5.89 seconds Stdev: 0.07

    The other cool thing about the git SHA-1 implementation is that it only
    needs 64 bytes of stack for the workspace while the original kernel
    implementation needed 320 bytes.
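
    The stack saving comes from how the SHA-1 message schedule is stored.
    The sketch below is not the patch itself: it only compares the classic
    80-word expansion (80 * 4 = 320 bytes) with a rolling 16-word window
    (16 * 4 = 64 bytes) and checks that both feed the same word to every
    round. The out[] array exists purely for that comparison.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static uint32_t rol32(uint32_t v, unsigned int s)
{
	return (v << s) | (v >> (32 - s));
}

/* Classic layout: expand 16 input words into an 80-word array. */
static void expand_full(const uint32_t in[16], uint32_t w[80])
{
	memcpy(w, in, 16 * sizeof(uint32_t));
	for (int t = 16; t < 80; t++)
		w[t] = rol32(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1);
}

/* Rolling layout: keep only the last 16 words, indexed modulo 16. */
static void expand_rolling(const uint32_t in[16], uint32_t out[80])
{
	uint32_t w[16];  /* the whole workspace: 64 bytes */

	memcpy(w, in, sizeof(w));
	for (int t = 0; t < 80; t++) {
		if (t >= 16)
			w[t & 15] = rol32(w[(t + 13) & 15] ^ w[(t + 8) & 15] ^
					  w[(t + 2) & 15] ^ w[t & 15], 1);
		out[t] = w[t & 15];  /* the word round t would consume */
	}
}

/* Returns 1 when both layouts produce identical schedules. */
static int schedules_agree(void)
{
	uint32_t in[16], full[80], rolling[80];

	for (int i = 0; i < 16; i++)
		in[i] = (uint32_t)i * 0x01010101u + 1u;  /* arbitrary test input */
	expand_full(in, full);
	expand_rolling(in, rolling);
	return memcmp(full, rolling, sizeof(full)) == 0;
}
```

    The index arithmetic works because t-3, t-8, t-14 and t-16 are
    congruent to t+13, t+8, t+2 and t modulo 16.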

    Signed-off-by: Mandeep Singh Baines
    Cc: Ramsay Jones
    Cc: Nicolas Pitre
    Cc: Herbert Xu
    Cc: David S. Miller
    Cc: linux-crypto@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Mandeep Singh Baines
     

06 Aug, 2011

3 commits

  • I suspect that this works on T410.

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Matthew Garrett

    Andy Lutomirski
     
  • * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (55 commits)
    Revert "drm/i915: Try enabling RC6 by default (again)"
    drm/radeon: Extended DDC Probing for ECS A740GM-M DVI-D Connector
    drm/radeon: Log Subsystem Vendor and Device Information
    drm/radeon: Extended DDC Probing for Connectors with Improperly Wired DDC Lines (here: Asus M2A-VM HDMI)
    drm: Separate EDID Header Check from EDID Block Check
    drm: Add NULL check about irq functions
    drm: Fix irq install error handling
    drm/radeon: fix potential NULL dereference in drivers/gpu/drm/radeon/atom.c
    drm/radeon: clean reg header files
    drm/debugfs: Initialise empty variable
    drm/radeon/kms: add thermal chip quirk for asus 9600xt
    drm/radeon: off by one in check_reg() functions
    drm/radeon/kms: fix version comment due to merge timing
    drm/i915: allow cache sharing policy control
    drm/i915/hdmi: HDMI source product description infoframe support
    drm/i915/hdmi: split infoframe setting from infoframe type code
    drm: track CEA version number if present
    drm/i915: Try enabling RC6 by default (again)
    Revert "drm/i915/dp: Zero the DPCD data before connection probe"
    drm/i915/dp: wait for previous AUX channel activity to clear
    ...

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits)
    ipv6: check for IPv4 mapped addresses when connecting IPv6 sockets
    mlx4: decreasing ref count when removing mac
    net: Fix security_socket_sendmsg() bypass problem.
    net: Cap number of elements for sendmmsg
    net: sendmmsg should only return an error if no messages were sent
    ixgbe: fix PHY link setup for 82599
    ixgbe: fix __ixgbe_notify_dca() bail out code
    igb: fix WOL on second port of i350 device
    e1000e: minor re-order of #include files
    e1000e: remove unnecessary check for NULL pointer
    intel drivers: repair missing flush operations
    macb: restore wrap bit when performing underrun cleanup
    cdc_ncm: fix endianness problem.
    irda: use PCI_VENDOR_ID_*
    mlx4: Fixing Ethernet unicast packet steering
    net: fix NULL dereferences in check_peer_redir()
    bnx2x: Clear MDIO access warning during first driver load
    bnx2x: Fix BCM578xx MAC test
    bnx2x: Fix BCM54618se invalid link indication
    bnx2x: Fix BCM84833 link
    ...

    Linus Torvalds
     

05 Aug, 2011

4 commits


04 Aug, 2011

20 commits

  • Provides function drm_edid_header_is_valid() for EDID header check
    and replaces EDID header check part of function drm_edid_block_valid()
    by a call of drm_edid_header_is_valid().
    This is a prerequisite to extend DDC probing, e. g. in function
    radeon_ddc_probe() for Radeon devices, by a central EDID header check.

    Tested for kernel 2.6.35, 2.6.38 and 3.0

    Cc:
    Signed-off-by: Thomas Reim
    Reviewed-by: Alex Deucher
    Acked-by: Stephen Michaels
    Signed-off-by: Dave Airlie

    Thomas Reim
     
  • …t/keithp/linux-2.6 into drm-fixes

    * 'drm-intel-next' of ssh://master.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6: (42 commits)
    drm/i915: allow cache sharing policy control
    drm/i915/hdmi: HDMI source product description infoframe support
    drm/i915/hdmi: split infoframe setting from infoframe type code
    drm: track CEA version number if present
    drm/i915: Try enabling RC6 by default (again)
    Revert "drm/i915/dp: Zero the DPCD data before connection probe"
    drm/i915/dp: wait for previous AUX channel activity to clear
    drm/i915: don't use uninitialized EDID bpc values when picking pipe bpp
    drm/i915/pch: Save/restore PCH_PORT_HOTPLUG across suspend
    drm/i915: apply phase pointer override on SNB+ too
    drm/i915: Add quirk to disable SSC on Sony Vaio Y2
    drm/i915: provide more error output when mode sets fail
    drm/i915: add GPU max frequency control file
    i915: add Dell OptiPlex FX170 to intel_no_lvds
    drm/i915: Ignore GPU wedged errors while pinning scanout buffers
    drm/i915/hdmi: send AVI info frames on ILK+ as well
    drm/i915: fix CB tuning check for ILK+
    drm/i915: Flush other plane register writes
    drm/i915: flush plane control changes on ILK+ as well
    drm/i915: apply timing generator bug workaround on CPT and PPT
    ...

    Dave Airlie
     
  • This reverts commit 750f463a749e28464151ad26938d11b07b1c43cb.

    of_alias_* still needs work to be generalized for 'promtree' dt
    platforms, and to not implicitly create entries for available ids.

    Signed-off-by: Grant Likely

    Grant Likely
     
  • * 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
    cpuidle: stop depending on pm_idle
    x86 idle: move mwait_idle_with_hints() to where it is used
    cpuidle: replace xen access to x86 pm_idle and default_idle
    cpuidle: create bootparam "cpuidle.off=1"
    mrst_pmu: driver for Intel Moorestown Power Management Unit

    Linus Torvalds
     
  • * 'apei-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
    ACPI, APEI, EINJ Param support is disabled by default
    APEI GHES: 32-bit buildfix
    ACPI: APEI build fix
    ACPI, APEI, GHES: Add hardware memory error recovery support
    HWPoison: add memory_failure_queue()
    ACPI, APEI, GHES, Error records content based throttle
    ACPI, APEI, GHES, printk support for recoverable error via NMI
    lib, Make gen_pool memory allocator lockless
    lib, Add lock-less NULL terminated single list
    Add Kconfig option ARCH_HAVE_NMI_SAFE_CMPXCHG
    ACPI, APEI, Add WHEA _OSC support
    ACPI, APEI, Add APEI bit support in generic _OSC call
    ACPI, APEI, GHES, Support disable GHES at boot time
    ACPI, APEI, GHES, Prevent GHES to be built as module
    ACPI, APEI, Use apei_exec_run_optional in APEI EINJ and ERST
    ACPI, APEI, Add apei_exec_run_optional
    ACPI, APEI, GHES, Do not ratelimit fatal error printk before panic
    ACPI, APEI, ERST, Fix erst-dbg long record reading issue
    ACPI, APEI, ERST, Prevent erst_dbg from loading if ERST is disabled

    Linus Torvalds
     
  • * 'devicetree/next' of git://git.secretlab.ca/git/linux-2.6:
    dt: add of_alias_scan and of_alias_get_id

    Linus Torvalds
     
  • Drivers need to know the CEA version number in addition to other display
    info (like whether the display is an HDMI sink) before enabling certain
    features. So track the CEA version number in the display info
    structure.

    Signed-off-by: Jesse Barnes
    Signed-off-by: Keith Packard

    Jesse Barnes
     
  • We have already acknowledged that swapoff of a tmpfs file is slower than
    it was before conversion to the generic radix_tree: a little slower
    there will be acceptable, if the hotter paths are faster.

    But it was a shock to find swapoff of a 500MB file 20 times slower on my
    laptop, taking 10 minutes; and at that rate it significantly slows down
    my testing.

    Now, most of that turned out to be overhead from PROVE_LOCKING and
    PROVE_RCU: without those it was only 4 times slower than before; and
    more realistic tests on other machines don't fare as badly.

    I've tried a number of things to improve it, including tagging the swap
    entries, then doing lookup by tag: I'd expected that to halve the time,
    but in practice it's erratic, and often counter-productive.

    The only change I've so far found to make a consistent improvement, is
    to short-circuit the way we go back and forth, gang lookup packing
    entries into the array supplied, then shmem scanning that array for the
    target entry. Scanning in place doubles the speed, so it's now only
    twice as slow as before (or three times slower when the PROVEs are on).

    So, add radix_tree_locate_item() as an expedient, once-off,
    single-caller hack to do the lookup directly in place. #ifdef it on
    CONFIG_SHMEM and CONFIG_SWAP, as much to document its limited
    applicability as save space in other configurations. And, sadly,
    #include sched.h for cond_resched().

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • But we've not yet removed the old swp_entry_t i_direct[16] from
    shmem_inode_info. That's because it was still being shared with the
    inline symlink. Remove it now (saving 64 or 128 bytes from shmem inode
    size), and use kmemdup() for short symlinks, say, those up to 128 bytes.

    I wonder why mpol_free_shared_policy() is done in shmem_destroy_inode()
    rather than shmem_evict_inode(), where we usually do such freeing? I
    guess it doesn't matter, and I'm not into NUMA mpol testing right now.

    Signed-off-by: Hugh Dickins
    Acked-by: Rik van Riel
    Reviewed-by: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Remove mem_cgroup_shmem_charge_fallback(): it was only required when we
    had to move swappage to filecache with GFP_NOWAIT.

    Remove the GFP_NOWAIT special case from mem_cgroup_cache_charge(), by
    moving its call out from shmem_add_to_page_cache() to two of its three
    callers. But leave it doing mem_cgroup_uncharge_cache_page() on error:
    although asymmetrical, it's easier for all 3 callers to handle.

    These two changes would also be appropriate if anyone were to start
    using shmem_read_mapping_page_gfp() with GFP_NOWAIT.

    Remove mem_cgroup_get_shmem_target(): mc_handle_file_pte() can test
    radix_tree_exceptional_entry() to get what it needs for itself.

    Signed-off-by: Hugh Dickins
    Acked-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • While it's at its least, make a number of boring nitpicky cleanups to
    shmem.c, mostly for consistency of variable naming. Things like "swap"
    instead of "entry", "pgoff_t index" instead of "unsigned long idx".

    And since everything else here is prefixed "shmem_", better change
    init_tmpfs() to shmem_init().

    Signed-off-by: Hugh Dickins
    Acked-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • The maximum size of a shmem/tmpfs file has been limited by the maximum
    size of its triple-indirect swap vector. With 4kB page size, maximum
    filesize was just over 2TB on a 32-bit kernel, but sadly one eighth of
    that on a 64-bit kernel. (With 8kB page size, maximum filesize was just
    over 4TB on a 64-bit kernel, but 16TB on a 32-bit kernel,
    MAX_LFS_FILESIZE being then more restrictive than swap vector layout.)

    It's a shame that tmpfs should be more restrictive than ramfs, and this
    limitation has now been noticed. Add another level to the swap vector?
    No, it became obscure and hard to maintain, once I complicated it to
    make use of highmem pages nine years ago: better choose another way.

    Surely, if 2.4 had had the radix tree pagecache introduced in 2.5, then
    tmpfs would never have invented its own peculiar radix tree: we would
    have fitted swap entries into the common radix tree instead, in much the
    same way as we fit swap entries into page tables.

    And why should each file have a separate radix tree for its pages and
    for its swap entries? The swap entries are required precisely where and
    when the pages are not. We want to put them together in a single radix
    tree: which can then avoid much of the locking which was needed to
    prevent them from being exchanged underneath us.

    This also avoids the waste of memory devoted to swap vectors, first in
    the shmem_inode itself, then at least two more pages once a file grew
    beyond 16 data pages (pages accounted by df and du, but not by memcg).
    Allocated upfront, to avoid allocation when under swapping pressure, but
    pure waste when CONFIG_SWAP is not set - I have never spattered around
    the ifdefs to prevent that, preferring this move to sharing the common
    radix tree instead.

    There are three downsides to sharing the radix tree. One, that it binds
    tmpfs more tightly to the rest of mm, either requiring knowledge of swap
    entries in radix tree there, or duplication of its code here in shmem.c.
    I believe that the simplifications and memory savings (and probable higher
    performance, not yet measured) justify that.

    Two, that on HIGHMEM systems with SWAP enabled, it's the lowmem radix
    nodes that cannot be freed under memory pressure - whereas before it was
    the less precious highmem swap vector pages that could not be freed.
    I'm hoping that 64-bit has now been accessible for long enough, that the
    highmem argument has grown much less persuasive.

    Three, that swapoff is slower than it used to be on tmpfs files, since
    it's using a simple generic mechanism not tailored to it: I find this
    noticeable, and shall want to improve, but maybe nobody else will
    notice.

    So... now remove most of the old swap vector code from shmem.c. But,
    for the moment, keep the simple i_direct vector of 16 pages, with simple
    accessors shmem_put_swap() and shmem_get_swap(), as a toy implementation
    to help mark where swap needs to be handled in subsequent patches.

    Signed-off-by: Hugh Dickins
    Acked-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • If swap entries are to be stored along with struct page pointers in a
    radix tree, they need to be distinguished as exceptional entries.

    Most of the handling of swap entries in radix tree will be contained in
    shmem.c, but a few functions in filemap.c's common code need to check
    for their appearance: find_get_page(), find_lock_page(),
    find_get_pages() and find_get_pages_contig().

    So as not to slow their fast paths, tuck those checks inside the
    existing checks for unlikely radix_tree_deref_slot(); except for
    find_lock_page(), where it is an added test. And make it a BUG in
    find_get_pages_tag(), which is not applied to tmpfs files.

    A part of the reason for eliminating shmem_readpage() earlier, was to
    minimize the places where common code would need to allow for swap
    entries.

    The swp_entry_t known to swapfile.c must be massaged into a slightly
    different form when stored in the radix tree, just as it gets massaged
    into a pte_t when stored in page tables.

    In an i386 kernel this limits its information (type and page offset) to
    30 bits: given 32 "types" of swapfile and 4kB pagesize, that's a maximum
    swapfile size of 128GB. Which is less than the 512GB we previously
    allowed with X86_PAE (where the swap entry can occupy the entire upper
    32 bits of a pte_t), but not a new limitation on 32-bit without PAE; and
    there's not a new limitation on 64-bit (where swap filesize is already
    limited to 16TB by a 32-bit page offset). Thirty areas of 128GB is
    probably still enough swap for a 64GB 32-bit machine.

    Provide swp_to_radix_entry() and radix_to_swp_entry() conversions, and
    enforce filesize limit in read_swap_header(), just as for ptes.
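
    A user-space sketch of the two conversions and of the 128GB
    arithmetic. The constant names and the shift of 2 are assumptions
    chosen to match the "30 bits on a 32-bit kernel" figure above; the
    real definitions may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed encoding: low two bits are tag bits, the rest is payload. */
#define EXCEPTIONAL_ENTRY 2u
#define EXCEPTIONAL_SHIFT 2

typedef struct { uintptr_t val; } swp_entry_t;

/* Pack a swap entry so it cannot be mistaken for a struct page pointer. */
static void *swp_to_radix_entry(swp_entry_t entry)
{
	uintptr_t value = entry.val << EXCEPTIONAL_SHIFT;

	return (void *)(value | EXCEPTIONAL_ENTRY);
}

/* Undo the packing: drop the tag bits again. */
static swp_entry_t radix_to_swp_entry(void *arg)
{
	swp_entry_t entry;

	entry.val = (uintptr_t)arg >> EXCEPTIONAL_SHIFT;
	return entry;
}

/*
 * Largest swapfile in bytes when info_bits hold the type and the page
 * offset together: 2^(info_bits - type_bits) pages of 2^page_shift bytes.
 */
static uint64_t max_swapfile_bytes(unsigned int info_bits,
				   unsigned int type_bits,
				   unsigned int page_shift)
{
	return UINT64_C(1) << (info_bits - type_bits + page_shift);
}
```

    With 30 information bits, 5 type bits (32 swapfile "types") and 4kB
    pages this gives 2^37 bytes, the 128GB quoted above.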

    Signed-off-by: Hugh Dickins
    Acked-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • A patchset to extend tmpfs to MAX_LFS_FILESIZE by abandoning its
    peculiar swap vector, instead keeping a file's swap entries in the same
    radix tree as its struct page pointers: thus saving memory, and
    simplifying its code and locking.

    This patch:

    The radix_tree is used by several subsystems for different purposes. A
    major use is to store the struct page pointers of a file's pagecache for
    memory management. But what if mm wanted to store something other than
    page pointers there too?

    The low bit of a radix_tree entry is already used to denote an indirect
    pointer, for internal use, and the unlikely radix_tree_deref_retry()
    case.

    Define the next bit as denoting an exceptional entry, and supply inline
    functions radix_tree_exception() to return non-0 in either unlikely
    case, and radix_tree_exceptional_entry() to return non-0 in the second
    case.

    If a subsystem already uses radix_tree with that bit set, no problem: it
    does not affect internal workings at all, but is defined for the
    convenience of those storing well-aligned pointers in the radix_tree.

    The radix_tree_gang_lookups have an implicit assumption that the caller
    can deduce the offset of each entry returned e.g. by the page->index of
    a struct page. But that may not be feasible for some kinds of item to
    be stored there.

    Make radix_tree_gang_lookup_slot() allow for an optional indices argument,
    an output array in which to return those offsets. The same could be added
    to other radix_tree_gang_lookups, but for now keep it to the only one
    for which we need it.
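
    The two tests described above can be sketched in user space like
    this. The bit names are shortened stand-ins, and the helpers operate
    on raw pointer values rather than on a real radix tree.

```c
#include <assert.h>
#include <stdint.h>

#define INDIRECT_PTR      1u  /* low bit: internal indirect pointer */
#define EXCEPTIONAL_ENTRY 2u  /* next bit: exceptional (non-pointer) entry */

/* Non-zero in either unlikely case: indirect pointer or exceptional entry. */
static int radix_tree_exception(void *arg)
{
	return (int)((uintptr_t)arg & (INDIRECT_PTR | EXCEPTIONAL_ENTRY));
}

/* Non-zero only when the entry is exceptional. */
static int radix_tree_exceptional_entry(void *arg)
{
	return (int)((uintptr_t)arg & EXCEPTIONAL_ENTRY);
}
```

    A well-aligned pointer has both low bits clear, so it passes through
    both tests as an ordinary entry.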

    Signed-off-by: Hugh Dickins
    Acked-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • - Current implementation tests wrong value for setting
    aat2870_bl->max_current.

    - In the current implementation, we cannot differentiate between 2 cases:

    a) if pdata->max_current is not set, or

    b) pdata->max_current is set to AAT2870_CURRENT_0_45 (which is also 0).

    Fix it by setting AAT2870_CURRENT_0_45 to be 1 and adjust the equation in
    aat2870_brightness() accordingly.
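
    The underlying pattern, sketched with hypothetical names: start the
    enum at 1 so that 0 can only mean "not set", and subtract 1 again
    where a zero-based value is needed. Falling back to the largest
    setting when nothing is configured is an assumption made for this
    sketch, as is the shape of the conversion helper.

```c
#include <assert.h>

/* Hypothetical current settings; 0 is deliberately left unused. */
enum sketch_current {
	SKETCH_CURRENT_0_45 = 1,   /* lowest setting, no longer equal to 0 */
	SKETCH_CURRENT_0_91,
	SKETCH_CURRENT_27_9 = 32   /* highest setting */
};

struct sketch_pdata {
	int max_current;  /* 0 when the platform data leaves it unset */
};

/* Assumed policy: use the highest setting when nothing was configured. */
static int effective_max_current(const struct sketch_pdata *pdata)
{
	if (pdata->max_current > 0)
		return pdata->max_current;
	return SKETCH_CURRENT_27_9;
}

/* The enum is now 1-based, so shift back to a zero-based step index. */
static int current_to_step(int current_setting)
{
	return current_setting - 1;
}
```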

    Signed-off-by: Axel Lin
    Cc: Richard Purdie
    Cc: Samuel Ortiz
    Tested-by: Jin Park
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Axel Lin
     
  • __GFP_OTHER_NODE is used for NUMA allocations on behalf of other nodes.
    It's supposed to be passed through from the page allocator to
    zone_statistics(), but it never gets there as gfp_allowed_mask is not
    wide enough and masks out the flag early in the allocation path.

    The result is an accounting glitch where successful NUMA allocations
    by-agent are not properly attributed as local.

    Increase __GFP_BITS_SHIFT so that it includes __GFP_OTHER_NODE.
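
    Why a too-narrow mask silently drops the flag, as a sketch. The bit
    position and the two shift values are illustrative assumptions, not
    quoted from the kernel headers.

```c
#include <assert.h>

/* Illustrative flag sitting just above the old mask width. */
#define SKETCH_GFP_OTHER_NODE (1u << 23)

/* A mask covering the low bits_shift bits, like gfp_allowed_mask. */
static unsigned int gfp_mask_for(unsigned int bits_shift)
{
	return (1u << bits_shift) - 1u;
}

/* What survives of the caller's flags after masking. */
static unsigned int apply_allowed_mask(unsigned int flags,
				       unsigned int bits_shift)
{
	return flags & gfp_mask_for(bits_shift);
}
```

    With a 23-bit mask the flag is cleared before the statistics code can
    see it; widening the mask by one bit lets it through.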

    Signed-off-by: Johannes Weiner
    Acked-by: Andi Kleen
    Reviewed-by: Minchan Kim
    Acked-by: Mel Gorman
    Reviewed-by: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • The current hyper-optimized functions are overkill if you simply want to
    allocate an id for a device. Create versions which use an internal
    lock.

    In followup patches, numerous drivers are converted to use this
    interface.
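
    What "versions which use an internal lock" means for a caller can be
    sketched in user space. Everything here is hypothetical: a 64-entry
    bitmap replaces the real allocator, and a C11 atomic_flag spinlock
    stands in for the internal lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MAX_IDS 64

static atomic_flag id_lock = ATOMIC_FLAG_INIT;  /* the internal lock */
static uint64_t id_bitmap;                      /* bit i set: id i in use */

static void lock_ids(void)
{
	while (atomic_flag_test_and_set_explicit(&id_lock, memory_order_acquire))
		;  /* spin until the lock is free */
}

static void unlock_ids(void)
{
	atomic_flag_clear_explicit(&id_lock, memory_order_release);
}

/* Returns the lowest free id, or -1 when all ids are taken. */
static int simple_id_get(void)
{
	int id = -1;

	lock_ids();
	for (int i = 0; i < MAX_IDS; i++) {
		if (!(id_bitmap & (UINT64_C(1) << i))) {
			id_bitmap |= UINT64_C(1) << i;
			id = i;
			break;
		}
	}
	unlock_ids();
	return id;
}

/* Releases an id so that it can be handed out again. */
static void simple_id_remove(int id)
{
	lock_ids();
	id_bitmap &= ~(UINT64_C(1) << id);
	unlock_ids();
}
```

    The caller never sees the lock: a driver just asks for an id at probe
    time and gives it back on removal.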

    Thanks to Tejun for feedback.

    Signed-off-by: Rusty Russell
    Acked-by: Tejun Heo
    Acked-by: Jonathan Cameron
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rusty Russell
     
  • init_fault_attr_dentries() is used to export fault_attr via debugfs.
    But it can only export it in debugfs root directory.

    Per Forlin is working on mmc_fail_request which adds support to inject
    data errors after a completed host transfer in MMC subsystem.

    The fault_attr for mmc_fail_request should be defined per mmc host and
    exported in a per-host debugfs directory, like
    /sys/kernel/debug/mmc0/mmc_fail_request.

    init_fault_attr_dentries() doesn't help for mmc_fail_request. So this
    introduces fault_create_debugfs_attr(), which can create the attribute
    directory under an arbitrary parent directory, and replaces
    init_fault_attr_dentries() with it.

    [akpm@linux-foundation.org: extraneous semicolon, per Randy]
    Signed-off-by: Akinobu Mita
    Tested-by: Per Forlin
    Cc: Jens Axboe
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Matt Mackall
    Cc: Randy Dunlap
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • cpuidle users should call cpuidle_call_idle() directly
    rather than via (pm_idle)() function pointer.

    An architecture may choose to continue using (pm_idle)(),
    but cpuidle need not depend on it:

    my_arch_cpu_idle()
            ...
            if (cpuidle_call_idle())
                    pm_idle();
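
    The same fallback pattern as a compilable user-space sketch. The
    stubs, the cpuidle_available switch and the counter are hypothetical;
    the only behaviour taken from the text above is that a non-zero
    return from cpuidle_call_idle() sends the caller to pm_idle().

```c
#include <assert.h>

static int cpuidle_available;  /* hypothetical switch for the sketch */
static int fallback_calls;     /* counts how often pm_idle() ran */

/* Stub: returns 0 when cpuidle handled the idle period, non-zero if not. */
static int cpuidle_call_idle(void)
{
	if (!cpuidle_available)
		return -1;
	return 0;
}

/* Stub for the architecture's own default idle routine. */
static void pm_idle(void)
{
	fallback_calls++;
}

/* Call cpuidle directly; fall back to pm_idle() only when it declines. */
static void my_arch_cpu_idle(void)
{
	if (cpuidle_call_idle())
		pm_idle();
}
```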

    cc: Kevin Hilman
    cc: Paul Mundt
    cc: x86@kernel.org
    Acked-by: H. Peter Anvin
    Signed-off-by: Len Brown

    Len Brown
     
  • When a Xen Dom0 kernel boots on a hypervisor, it gets access
    to the raw-hardware ACPI tables. While it parses the idle tables
    for the hypervisor's benefit, it uses HLT for its own idle.

    Rather than have xen scribble on pm_idle and access default_idle,
    have it simply disable_cpuidle() so acpi_idle will not load and
    architecture default HLT will be used.

    cc: xen-devel@lists.xensource.com
    Tested-by: Konrad Rzeszutek Wilk
    Acked-by: H. Peter Anvin
    Signed-off-by: Len Brown

    Len Brown
     

03 Aug, 2011

3 commits

  • Some trivial conflicts due to other various merges
    adding to the end of common lists sooner than this one.

    arch/ia64/Kconfig
    arch/powerpc/Kconfig
    arch/x86/Kconfig
    lib/Kconfig
    lib/Makefile

    Signed-off-by: Len Brown

    Len Brown
     
  • as GHES is optional...

    When # CONFIG_ACPI_APEI_GHES is not set:

    (.init.text+0x4c22): undefined reference to `ghes_disable'

    Reported-by: Randy Dunlap
    Acked-by: Randy Dunlap
    Signed-off-by: Len Brown

    Len Brown
     
  • memory_failure() is the entry point for HWPoison memory error
    recovery. It must be called in process context. But commonly
    hardware memory errors are notified via MCE or NMI, so some delayed
    execution mechanism must be used. In MCE handler, a work queue + ring
    buffer mechanism is used.

    In addition to MCE, now APEI (ACPI Platform Error Interface) GHES
    (Generic Hardware Error Source) can be used to report memory errors
    too. To add support to APEI GHES memory recovery, a mechanism similar
    to that of MCE is implemented. memory_failure_queue() is the new
    entry point that can be called in IRQ context. The next step is to
    make the MCE handler use this interface too.
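
    The deferral scheme can be sketched as a small ring buffer that is
    filled from the restricted context and drained later. This is a
    single-threaded illustration with hypothetical *_sketch names; the
    per-CPU queues, the work queue and all synchronisation are left out.

```c
#include <assert.h>

#define RING_SIZE 8u  /* power of two, so the indices wrap cleanly */

struct failure_entry {
	unsigned long pfn;
	int flags;
};

static struct failure_entry ring[RING_SIZE];
static unsigned int head, tail;  /* head: next write, tail: next read */

static int nr_handled;           /* bookkeeping for the illustration */
static unsigned long last_pfn;

/* Stand-in for memory_failure(): must only run in process context. */
static void memory_failure_sketch(unsigned long pfn, int flags)
{
	(void)flags;
	last_pfn = pfn;
	nr_handled++;
}

/* Safe to call from "IRQ context": only records the request. */
static int memory_failure_queue_sketch(unsigned long pfn, int flags)
{
	if (head - tail == RING_SIZE)
		return -1;  /* ring full, request dropped */
	ring[head % RING_SIZE].pfn = pfn;
	ring[head % RING_SIZE].flags = flags;
	head++;
	return 0;
}

/* The deferred work: drains the ring in "process context". */
static void memory_failure_work_sketch(void)
{
	while (tail != head) {
		struct failure_entry e = ring[tail % RING_SIZE];

		tail++;
		memory_failure_sketch(e.pfn, e.flags);
	}
}
```

    Nothing is handled at queue time; the recovery work runs only once
    the drain function is called.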

    Signed-off-by: Huang Ying
    Cc: Andi Kleen
    Cc: Wu Fengguang
    Cc: Andrew Morton
    Signed-off-by: Len Brown

    Huang Ying