18 Nov, 2011

1 commit

  • The buf_lock cannot be held while populating the inodes, so make the backend
    pass forward an allocated and filled buffer instead. This solves the following
    backtrace. The effect is that "buf" is only ever used to notify the backends
    that something was written to it, and shouldn't be used in the read path.

    To replace the buf_lock during the read path, isolate the open/read/close
    loop with a separate mutex to maintain serialized access to the backend.

    Note that it is up to the pstore backend to cope if the (*write)() path is
    called in the middle of the read path.

    [ 59.691019] BUG: sleeping function called from invalid context at .../mm/slub.c:847
    [ 59.691019] in_atomic(): 0, irqs_disabled(): 1, pid: 1819, name: mount
    [ 59.691019] Pid: 1819, comm: mount Not tainted 3.0.8 #1
    [ 59.691019] Call Trace:
    [ 59.691019] [] __might_sleep+0xc3/0xca
    [ 59.691019] [] kmem_cache_alloc+0x32/0xf3
    [ 59.691019] [] ? __d_lookup_rcu+0x6f/0xf4
    [ 59.691019] [] alloc_inode+0x2a/0x64
    [ 59.691019] [] new_inode+0x18/0x43
    [ 59.691019] [] pstore_get_inode.isra.1+0x11/0x98
    [ 59.691019] [] pstore_mkfile+0xae/0x26f
    [ 59.691019] [] ? kmem_cache_free+0x19/0xb1
    [ 59.691019] [] ? ida_get_new_above+0x140/0x158
    [ 59.691019] [] ? __init_rwsem+0x1e/0x2c
    [ 59.691019] [] ? inode_init_always+0x111/0x1b0
    [ 59.691019] [] ? should_resched+0xd/0x27
    [ 59.691019] [] ? _cond_resched+0xd/0x21
    [ 59.691019] [] pstore_get_records+0x52/0xa7
    [ 59.691019] [] pstore_fill_super+0x7d/0x91
    [ 59.691019] [] mount_single+0x46/0x82
    [ 59.691019] [] pstore_mount+0x15/0x17
    [ 59.691019] [] ? pstore_get_inode.isra.1+0x98/0x98
    [ 59.691019] [] mount_fs+0x5a/0x12d
    [ 59.691019] [] ? alloc_vfsmnt+0xa4/0x14a
    [ 59.691019] [] vfs_kern_mount+0x4f/0x7d
    [ 59.691019] [] do_kern_mount+0x34/0xb2
    [ 59.691019] [] do_mount+0x5fc/0x64a
    [ 59.691019] [] ? strndup_user+0x2e/0x3f
    [ 59.691019] [] sys_mount+0x66/0x99
    [ 59.691019] [] sysenter_do_call+0x12/0x26

    Signed-off-by: Kees Cook
    Signed-off-by: Tony Luck

    Kees Cook
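
The locking change described above can be pictured with a small userspace sketch. It is not the kernel code: struct backend, backend_read(), get_records() and count_bytes() are hypothetical stand-ins, and a pthread mutex plays the role of the separate read-path mutex. The backend hands back an allocated, filled buffer for each record, and the open/read/close loop is serialized by its own mutex, so the consumer (the analogue of inode creation) is free to allocate or sleep.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a pstore backend (not struct pstore_info). */
struct backend {
    pthread_mutex_t read_mutex;   /* serializes the open/read/close loop */
    const char **records;         /* records held by the backend */
    size_t count;
    size_t pos;                   /* read cursor, reset by open */
};

static void backend_open(struct backend *b)  { b->pos = 0; }
static void backend_close(struct backend *b) { (void)b; }

/*
 * Hand the caller a freshly allocated, filled buffer instead of a shared one.
 * Returns the record length, or -1 at end of records or on allocation failure.
 */
static long backend_read(struct backend *b, char **out)
{
    size_t len;
    char *copy;

    if (b->pos >= b->count)
        return -1;
    len = strlen(b->records[b->pos]);
    copy = malloc(len + 1);
    if (!copy)
        return -1;
    memcpy(copy, b->records[b->pos], len + 1);
    b->pos++;
    *out = copy;
    return (long)len;
}

/* Example consumer: totals the bytes it is given. */
static size_t total_bytes;
static void count_bytes(const char *buf, size_t len)
{
    (void)buf;
    total_bytes += len;
}

/* Walk every record under read_mutex; no spinlock is held around consume(). */
static size_t get_records(struct backend *b,
                          void (*consume)(const char *, size_t))
{
    size_t n = 0;
    char *buf;
    long len;

    pthread_mutex_lock(&b->read_mutex);
    backend_open(b);
    while ((len = backend_read(b, &buf)) >= 0) {
        consume(buf, (size_t)len);
        free(buf);                /* the reader owns the buffer */
        n++;
    }
    backend_close(b);
    pthread_mutex_unlock(&b->read_mutex);
    return n;
}
```

In this sketch the shared write buffer never appears in the read path at all, which is the effect the commit message describes.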
     

13 Nov, 2011

1 commit

  • After commit e978aa7d7d57 ("cpuidle: Move dev->last_residency update to
    driver enter routine; remove dev->last_state") setting acpi_idle_suspend
    to 1 by acpi_processor_suspend() causes the ACPI cpuidle routines to
    return error codes continuously, which in turn causes cpuidle to lock up
    (hard).

    However, acpi_idle_suspend doesn't appear to be useful for any
    particular purpose (it's racy and doesn't really provide any real
    protection), so it can be removed, which makes the problem go away.

    Reported-and-tested-by: Tomas M.
    Reported-and-tested-by: Ferenc Wagner
    Tested-by: Arnd Bergmann
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     

08 Nov, 2011

1 commit

  • * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
    cpuidle: Single/Global registration of idle states
    cpuidle: Split cpuidle_state structure and move per-cpu statistics fields
    cpuidle: Remove CPUIDLE_FLAG_IGNORE and dev->prepare()
    cpuidle: Move dev->last_residency update to driver enter routine; remove dev->last_state
    ACPI: Fix CONFIG_ACPI_DOCK=n compiler warning
    ACPI: Export FADT pm_profile integer value to userspace
    thermal: Prevent polling from happening during system suspend
    ACPI: Drop ACPI_NO_HARDWARE_INIT
    ACPI atomicio: Convert width in bits to bytes in __acpi_ioremap_fast()
    PNPACPI: Simplify disabled resource registration
    ACPI: Fix possible recursive locking in hwregs.c
    ACPI: use kstrdup()
    mrst pmu: update comment
    tools/power turbostat: less verbose debugging

    Linus Torvalds
     

07 Nov, 2011

11 commits

  • * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
    Revert "tracing: Include module.h in define_trace.h"
    irq: don't put module.h into irq.h for tracking irqgen modules.
    bluetooth: macroize two small inlines to avoid module.h
    ip_vs.h: fix implicit use of module_get/module_put from module.h
    nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
    include: replace linux/module.h with "struct module" wherever possible
    include: convert various register fcns to macros to avoid include chaining
    crypto.h: remove unused crypto_tfm_alg_modname() inline
    uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
    pm_runtime.h: explicitly requires notifier.h
    linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
    miscdevice.h: fix up implicit use of lists and types
    stop_machine.h: fix implicit use of smp.h for smp_processor_id
    of: fix implicit use of errno.h in include/linux/of.h
    of_platform.h: delete needless include
    acpi: remove module.h include from platform/aclinux.h
    miscdevice.h: delete unnecessary inclusion of module.h
    device_cgroup.h: delete needless include
    net: sch_generic remove redundant use of
    net: inet_timewait_sock doesnt need
    ...

    Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
    - drivers/media/dvb/frontends/dibx000_common.c
    - drivers/media/video/{mt9m111.c,ov6650.c}
    - drivers/mfd/ab3550-core.c
    - include/linux/dmaengine.h

    Linus Torvalds
     
  • Len Brown
     
  • Len Brown
     
  • This patch makes the cpuidle_states structure global (single copy)
    instead of per-cpu. The statistics needed on per-cpu basis
    by the governor are kept per-cpu. This simplifies the cpuidle
    subsystem as state registration is done by single cpu only.
    Having single copy of cpuidle_states saves memory. Rare case
    of asymmetric C-states can be handled within the cpuidle driver
    and architectures such as POWER do not have asymmetric C-states.

    Having single/global registration of all the idle states,
    dynamic C-state transitions on x86 are handled by
    the boot cpu. Here, the boot cpu would disable all the devices,
    re-populate the states and later enable all the devices,
    irrespective of the cpu that would receive the notification first.

    Reference:
    https://lkml.org/lkml/2011/4/25/83

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Trinabh Gupta
    Tested-by: Jean Pihet
    Reviewed-by: Kevin Hilman
    Acked-by: Arjan van de Ven
    Acked-by: Kevin Hilman
    Signed-off-by: Len Brown

    Deepthi Dharwar
     
  • This is the first step towards global registration of cpuidle
    states. The statistics used primarily by the governor are per-cpu
    and have to be split from rest of the fields inside cpuidle_state,
    which would be made global i.e. single copy. The driver_data field
    is also per-cpu and moved.

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Trinabh Gupta
    Tested-by: Jean Pihet
    Reviewed-by: Kevin Hilman
    Acked-by: Arjan van de Ven
    Acked-by: Kevin Hilman
    Signed-off-by: Len Brown

    Deepthi Dharwar
     
  • Cpuidle governor only suggests the state to enter using the
    governor->select() interface, but allows the low level driver to
    override the recommended state. The actual entered state
    may be different because of software or hardware demotion. Software
    demotion is done by the back-end cpuidle driver and can be accounted
    correctly. Current cpuidle code uses last_state field to capture the
    actual state entered and based on that updates the statistics for the
    state entered.

    Ideally the driver enter routine should update the counters,
    and it should return the state actually entered rather than the time
    spent there. The generic cpuidle code should simply handle where
    the counters live in the sysfs namespace, not updating the counters.

    Reference:
    https://lkml.org/lkml/2011/3/25/52

    Signed-off-by: Deepthi Dharwar
    Signed-off-by: Trinabh Gupta
    Tested-by: Jean Pihet
    Reviewed-by: Kevin Hilman
    Acked-by: Arjan van de Ven
    Acked-by: Kevin Hilman
    Signed-off-by: Len Brown

    Deepthi Dharwar
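
As a rough illustration of the interface change described above, the sketch below has the driver's enter routine record the residency itself and return the index of the state it actually entered, and the generic code accounts against that index rather than against the one it suggested. All names (idle_device, demo_enter, idle_once) and the residency numbers are made up for the example and are not the kernel's definitions.

```c
/* Hypothetical per-state counters. */
struct idle_state_usage {
    unsigned long long usage;     /* times the state was entered */
    unsigned long long time_us;   /* total time spent in the state */
};

/* Hypothetical per-CPU device data. */
struct idle_device {
    int last_residency_us;                  /* filled in by the driver */
    struct idle_state_usage states_usage[3];
};

/*
 * Hypothetical driver enter routine. It may demote the request to a
 * shallower state, records the residency itself, and returns the index
 * of the state that was really entered.
 */
static int demo_enter(struct idle_device *dev, int index, int deepest_allowed)
{
    int entered = index > deepest_allowed ? deepest_allowed : index;

    dev->last_residency_us = 100 * (entered + 1);   /* made-up residency */
    return entered;
}

/* Generic side: charge the statistics to the state the driver reports. */
static int idle_once(struct idle_device *dev, int suggested, int deepest_allowed)
{
    int entered = demo_enter(dev, suggested, deepest_allowed);

    if (entered >= 0) {
        dev->states_usage[entered].usage++;
        dev->states_usage[entered].time_us += dev->last_residency_us;
    }
    return entered;
}
```

With a suggested index of 2 and a demotion limit of 1, the counters for state 1 are the ones that change, so no separate last_state field is needed to find out what happened.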
     
  • There are a lot of userspace approaches to detect the usage of the
    platform (laptop, workstation, server, ...) and adjust kernel tunables
    accordingly (io/process scheduler, power management, ...).

    These approaches need constant maintaining and are ugly to implement
    (detect PCMCIA controller -> laptop,
    does not work on recent systems anymore, ...)
    On ACPI systems there is an easy and reliable way (if implemented
    in BIOS and most recent platforms have this value set).
    -> export it to userspace.

    Signed-off-by: Thomas Renninger
    Acked-by: Rafael J. Wysocki
    Signed-off-by: Len Brown

    Thomas Renninger
     
  • ACPI_NO_HARDWARE_INIT is only used by acpi_early_init() and
    acpi_bus_init() when calling acpi_enable_subsystem(), but
    acpi_enable_subsystem() doesn't check that flag, so it can be
    dropped.

    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Len Brown

    Rafael J. Wysocki
     
  • Callers to __acpi_ioremap_fast() pass the bit_width that they found in the
    acpi_generic_address structure. Convert from bits to bytes when passing to
    __acpi_find_iomap() - as it wants to see bytes, not bits.

    cc: stable@kernel.org
    Signed-off-by: Tony Luck
    Signed-off-by: Len Brown

    Luck, Tony
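
The unit mismatch is easy to see in a small sketch. The range check below is written in the spirit of a mapping lookup that expects a size in bytes; struct iomap_range, range_covers() and lookup_fast() are hypothetical names for the example, not the kernel's functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one existing mapping. */
struct iomap_range {
    uint64_t phys;   /* start of the mapped physical range */
    uint64_t size;   /* length of the mapping in bytes */
};

/* True if [phys, phys + size_bytes) lies entirely inside the mapping. */
static bool range_covers(const struct iomap_range *map,
                         uint64_t phys, uint64_t size_bytes)
{
    return map->phys <= phys &&
           phys + size_bytes <= map->phys + map->size;
}

/* The caller knows the register width in bits; convert before the lookup. */
static bool lookup_fast(const struct iomap_range *map,
                        uint64_t phys, unsigned int bit_width)
{
    return range_covers(map, phys, bit_width / 8);
}
```

For an 8-byte mapping at 0x1000, a 32-bit register at 0x1004 fits when the width is converted to 4 bytes, but the same lookup fails if 32 is passed through as if it were a byte count.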
     
  • Calling pm-suspend might trigger a recursive lock in its code path.
    In function acpi_hw_clear_acpi_status, acpi_os_acquire_lock holds
    the lock acpi_gbl_hardware_lock before calling acpi_hw_register_write(),
    then without releasing acpi_gbl_hardware_lock, this function calls
    acpi_ev_walk_gpe_list, which tries to hold acpi_gbl_gpe_lock.
    Both acpi_gbl_hardware_lock and acpi_gbl_gpe_lock are in the same
    lock class, which might cause a lock recursion deadlock.

    Following patch fixes this scenario by just releasing
    acpi_gbl_hardware_lock before calling acpi_ev_walk_gpe_list.

    Changes since v0 (https://lkml.org/lkml/2011/9/21/355):
    - Fix changelog, thanks to Lin Ming.

    Changes since v1 (https://lkml.org/lkml/2011/11/3/89):
    - Update changelog and rename goto label, courtesy Srivatsa S. Bhat.

    Signed-off-by: Rakib Mullick
    Reviewed-by: Srivatsa S. Bhat
    Acked-by: Rafael J. Wysocki
    Signed-off-by: Len Brown

    Rakib Mullick
     
  • Use kstrdup rather than duplicating its implementation

    The semantic patch that makes this output is available
    in scripts/coccinelle/api/kstrdup.cocci.

    More information about semantic patching is available at
    http://coccinelle.lip6.fr/

    Signed-off-by: Thomas Meyer
    Signed-off-by: Len Brown

    Thomas Meyer
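
The kind of rewrite such a semantic patch performs can be shown with a userspace analogue. Here str_dup() is a hypothetical stand-in for kstrdup() (which in the kernel also takes allocation flags), and struct device_info with its two setters exists only for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder of a duplicated string. */
struct device_info {
    char *name;
};

/* Userspace stand-in for kstrdup(): returns a heap copy of s, or NULL. */
static char *str_dup(const char *s)
{
    size_t len;
    char *copy;

    if (!s)
        return NULL;
    len = strlen(s) + 1;
    copy = malloc(len);
    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Before: the call site re-implements string duplication by hand. */
static int set_name_open_coded(struct device_info *d, const char *name)
{
    d->name = malloc(strlen(name) + 1);
    if (!d->name)
        return -1;
    strcpy(d->name, name);
    return 0;
}

/* After: the same behaviour expressed through the helper. */
static int set_name_with_helper(struct device_info *d, const char *name)
{
    d->name = str_dup(name);
    return d->name ? 0 : -1;
}
```

Both setters leave an identical copy in d->name; the second simply keeps the length calculation and the copy in one place.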
     

05 Nov, 2011

1 commit


02 Nov, 2011

1 commit


01 Nov, 2011

4 commits


29 Oct, 2011

1 commit

  • * 'next-rebase' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci:
    PCI: Clean-up MPS debug output
    pci: Clamp pcie_set_readrq() when using "performance" settings
    PCI: enable MPS "performance" setting to properly handle bridge MPS
    PCI: Workaround for Intel MPS errata
    PCI: Add support for PASID capability
    PCI: Add implementation for PRI capability
    PCI: Export ATS functions to modules
    PCI: Move ATS implementation into own file
    PCI / PM: Remove unnecessary error variable from acpi_dev_run_wake()
    PCI hotplug: acpiphp: Prevent deadlock on PCI-to-PCI bridge remove
    PCI / PM: Extend PME polling to all PCI devices
    PCI quirk: mmc: Always check for lower base frequency quirk for Ricoh 1180:e823
    PCI: Make pci_setup_bridge() non-static for use by arch code
    x86: constify PCI raw ops structures
    PCI: Add quirk for known incorrect MPSS
    PCI: Add Solarflare vendor ID and SFC4000 device IDs

    Linus Torvalds
     

26 Oct, 2011

3 commits

  • * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
    llist: Add back llist_add_batch() and llist_del_first() prototypes
    sched: Don't use tasklist_lock for debug prints
    sched: Warn on rt throttling
    sched: Unify the ->cpus_allowed mask copy
    sched: Wrap scheduler p->cpus_allowed access
    sched: Request for idle balance during nohz idle load balance
    sched: Use resched IPI to kick off the nohz idle balance
    sched: Fix idle_cpu()
    llist: Remove cpu_relax() usage in cmpxchg loops
    sched: Convert to struct llist
    llist: Add llist_next()
    irq_work: Use llist in the struct irq_work logic
    llist: Return whether list is empty before adding in llist_add()
    llist: Move cpu_relax() to after the cmpxchg()
    llist: Remove the platform-dependent NMI checks
    llist: Make some llist functions inline
    sched, tracing: Show PREEMPT_ACTIVE state in trace_sched_switch
    sched: Remove redundant test in check_preempt_tick()
    sched: Add documentation for bandwidth control
    sched: Return unused runtime on group dequeue
    ...

    Linus Torvalds
     
  • * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (121 commits)
    perf symbols: Increase symbol KSYM_NAME_LEN size
    perf hists browser: Refuse 'a' hotkey on non symbolic views
    perf ui browser: Use libslang to read keys
    perf tools: Fix tracing info recording
    perf hists browser: Elide DSO column when it is set to just one DSO, ditto for threads
    perf hists: Don't consider filtered entries when calculating column widths
    perf hists: Don't decay total_period for filtered entries
    perf hists browser: Honour symbol_conf.show_{nr_samples,total_period}
    perf hists browser: Do not exit on tab key with single event
    perf annotate browser: Don't change selection line when returning from callq
    perf tools: handle endianness of feature bitmap
    perf tools: Add prelink suggestion to dso update message
    perf script: Fix unknown feature comment
    perf hists browser: Apply the dso and thread filters when merging new batches
    perf hists: Move the dso and thread filters from hist_browser
    perf ui browser: Honour the xterm colors
    perf top tui: Give color hints just on the percentage, like on --stdio
    perf ui browser: Make the colors configurable and change the defaults
    perf tui: Remove unneeded call to newtCls on startup
    perf hists: Don't format the percentage on hist_entry__snprintf
    ...

    Fix up conflicts in arch/x86/kernel/kprobes.c manually.

    Ingo's tree did the insane "add volatile to const array", which just
    doesn't make sense ("volatile const"?). But we could remove the const
    *and* make the array volatile to make doubly sure that gcc doesn't
    optimize it away..

    Also fix up kernel/trace/ring_buffer.c non-data-conflicts manually: the
    reader_lock has been turned into a raw lock by the core locking merge,
    and there was a new user of it introduced in this perf core merge. Make
    sure that new use also uses the raw accessor functions.

    Linus Torvalds
     
  • * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
    rtmutex: Add missing rcu_read_unlock() in debug_rt_mutex_print_deadlock()
    lockdep: Comment all warnings
    lib: atomic64: Change the type of local lock to raw_spinlock_t
    locking, lib/atomic64: Annotate atomic64_lock::lock as raw
    locking, x86, iommu: Annotate qi->q_lock as raw
    locking, x86, iommu: Annotate irq_2_ir_lock as raw
    locking, x86, iommu: Annotate iommu->register_lock as raw
    locking, dma, ipu: Annotate bank_lock as raw
    locking, ARM: Annotate low level hw locks as raw
    locking, drivers/dca: Annotate dca_lock as raw
    locking, powerpc: Annotate uic->lock as raw
    locking, x86: mce: Annotate cmci_discover_lock as raw
    locking, ACPI: Annotate c3_lock as raw
    locking, oprofile: Annotate oprofilefs lock as raw
    locking, video: Annotate vga console lock as raw
    locking, latencytop: Annotate latency_lock as raw
    locking, timer_stats: Annotate table_lock as raw
    locking, rwsem: Annotate inner lock as raw
    locking, semaphores: Annotate inner lock as raw
    locking, sched: Annotate thread_group_cputimer as raw
    ...

    Fix up conflicts in kernel/posix-cpu-timers.c manually: making
    cputimer->cputime a raw lock conflicted with the ABBA fix in commit
    bcd5cff7216f ("cputimer: Cure lock inversion").

    Linus Torvalds
     

22 Oct, 2011

1 commit


17 Oct, 2011

2 commits


15 Oct, 2011

1 commit

  • I originally submitted a patch to work around this by pushing all Ejection
    Requests and Device Checks onto the kacpi_hotplug queue.

    http://marc.info/?l=linux-acpi&m=131678270930105&w=2

    The patch is still insufficient in that Bus Checks also need to be added.

    Rather than add all events, including non-PCI-hotplug events, to the
    hotplug queue, mjg suggested that a better approach would be to modify
    the acpiphp driver so only acpiphp events would be added to the
    kacpi_hotplug queue.

    It's a longer patch, but at least we maintain the benefit of having separate
    queues in ACPI. This, of course, is still only a workaround for the problem.
    As Bjorn and mjg pointed out, we have to refactor a lot of this code to do
    the right thing, but at this point it is better to have this code working.

    The acpi core places all events on the kacpi_notify queue. When the acpiphp
    driver is loaded and a PCI card with a PCI-to-PCI bridge is removed the
    following call sequence occurs:

    cleanup_p2p_bridge()
    -> cleanup_bridge()
    -> acpi_remove_notify_handler()
    -> acpi_os_wait_events_complete()
    -> flush_workqueue(kacpi_notify_wq)

    which is the queue we are currently executing on and the process will hang.

    Move all hotplug acpiphp events onto the kacpi_hotplug workqueue. In
    handle_hotplug_event_bridge() and handle_hotplug_event_func() we can simply
    push the rest of the work onto the kacpi_hotplug queue and then avoid the
    deadlock.

    Signed-off-by: Prarit Bhargava
    Cc: mjg@redhat.com
    Cc: bhelgaas@google.com
    Cc: linux-acpi@vger.kernel.org
    Signed-off-by: Jesse Barnes

    Prarit Bhargava
     

13 Oct, 2011

1 commit


10 Oct, 2011

1 commit

  • Just convert all the files that have an nmi handler to the new routines.
    Most of it is a straightforward conversion. A couple of places needed some
    tweaking, like kgdb, which separates the debug notifier from the nmi handler,
    and mce, which removes a call to notify_die.

    [Thanks to Ying for finding out the history behind that mce call

    https://lkml.org/lkml/2010/5/27/114

    And Boris responding that he would like to remove that call because of it

    https://lkml.org/lkml/2011/9/21/163]

    The things that get converted are the registration/unregistration routines
    and the nmi handler itself has its args changed along with code removal
    to check which list it is on (most are on one NMI list except for kgdb
    which has both an NMI routine and an NMI Unknown routine).

    Signed-off-by: Don Zickus
    Signed-off-by: Peter Zijlstra
    Acked-by: Corey Minyard
    Cc: Jason Wessel
    Cc: Andi Kleen
    Cc: Robert Richter
    Cc: Huang Ying
    Cc: Corey Minyard
    Cc: Jack Steiner
    Link: http://lkml.kernel.org/r/1317409584-23662-4-git-send-email-dzickus@redhat.com
    Signed-off-by: Ingo Molnar

    Don Zickus
     

08 Oct, 2011

1 commit

  • * pm-qos:
    PM / QoS: Update Documentation for the pm_qos and dev_pm_qos frameworks
    PM / QoS: Add function dev_pm_qos_read_value() (v3)
    PM QoS: Add global notification mechanism for device constraints
    PM QoS: Implement per-device PM QoS constraints
    PM QoS: Generalize and export constraints management code
    PM QoS: Reorganize data structs
    PM QoS: Code reorganization
    PM QoS: Minor clean-ups
    PM QoS: Move and rename the implementation files

    Rafael J. Wysocki
     

04 Oct, 2011

1 commit

  • Because llist code will be used in performance critical scheduler
    code path, make llist_add() and llist_del_all() inline to avoid
    function calling overhead and related 'glue' overhead.

    Signed-off-by: Huang Ying
    Acked-by: Mathieu Desnoyers
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1315461646-1379-2-git-send-email-ying.huang@intel.com
    Signed-off-by: Ingo Molnar

    Huang Ying
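
For readers unfamiliar with the two operations being inlined, here is a minimal lock-less list sketch using C11 atomics. It is an illustration under assumptions, not the kernel's llist implementation: the names lnode, lhead, list_add and list_del_all are hypothetical, and the return value of list_add (whether the list was empty before the add) follows a behaviour mentioned elsewhere in this log.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct lnode {
    struct lnode *next;
};

struct lhead {
    _Atomic(struct lnode *) first;
};

/*
 * Push n onto the front of the list with a compare-and-swap loop.
 * Returns true if the list was empty before the add.
 */
static inline bool list_add(struct lnode *n, struct lhead *h)
{
    struct lnode *old = atomic_load(&h->first);

    do {
        n->next = old;   /* on CAS failure, old holds the current head */
    } while (!atomic_compare_exchange_weak(&h->first, &old, n));

    return old == NULL;
}

/* Detach the whole list in one atomic exchange and return its old head. */
static inline struct lnode *list_del_all(struct lhead *h)
{
    return atomic_exchange(&h->first, (struct lnode *)NULL);
}
```

Both functions are only a few instructions around one atomic operation, which is why call overhead is a noticeable fraction of their cost on a hot path.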
     

13 Sep, 2011

2 commits


30 Aug, 2011

1 commit


25 Aug, 2011

1 commit


17 Aug, 2011

1 commit

  • pstore was using mutex locking to protect read/write access to the
    backend plug-ins. This causes problems when pstore is executed in
    an NMI context through panic() -> kmsg_dump().

    This patch changes the mutex to a spin_lock_irqsave then also checks to
    see if we are in an NMI context. If we are in an NMI and can't get the
    lock, just print a message stating that and blow by the locking.

    All this is probably a hack around the bigger locking problem but it
    solves my current situation of trying to sleep in an NMI context.

    Tested by loading the lkdtm module and executing a HARDLOCKUP which
    will cause the machine to panic inside the nmi handler.

    Signed-off-by: Don Zickus
    Acked-by: Matthew Garrett
    Signed-off-by: Tony Luck

    Don Zickus
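
The "try the lock, and carry on without it in an NMI" approach can be sketched in userspace as follows. This is an assumption-laden illustration: an atomic_flag stands in for the spinlock, the in_nmi argument replaces the kernel's context check, and dump_record(), buf_try_lock(), buf_unlock() and count_write() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag buf_lock = ATOMIC_FLAG_INIT;

/* Returns true if the lock was taken. Never blocks. */
static bool buf_try_lock(void)
{
    return !atomic_flag_test_and_set_explicit(&buf_lock, memory_order_acquire);
}

static void buf_unlock(void)
{
    atomic_flag_clear_explicit(&buf_lock, memory_order_release);
}

/* Example writer: counts how many records were written. */
static int records_written;
static void count_write(void)
{
    records_written++;
}

/*
 * Write one record. Outside NMI context, spin until the lock is held.
 * In NMI context, try once and proceed without the lock if it is busy,
 * since spinning or sleeping there could hang the machine.
 * Returns true if the write happened under the lock.
 */
static bool dump_record(bool in_nmi, void (*write_record)(void))
{
    bool locked;

    if (in_nmi) {
        locked = buf_try_lock();
        /* the real code reports here that it is bypassing the lock */
    } else {
        while (!buf_try_lock())
            ;                       /* busy-wait, as a spinlock would */
        locked = true;
    }

    write_record();

    if (locked)
        buf_unlock();
    return locked;
}
```

The unlocked write is a deliberate trade: during a panic, getting the record out is judged more useful than strict exclusion, which matches the commit's own description of the change as a hack around a larger locking problem.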
     

12 Aug, 2011

2 commits

  • IRQ_WORK is used by GHES, but it is selected by PERF_EVENT.
    For now PERF_EVENT is selected by x86 by default, but
    in concept, IRQ_WORK should be selected by GHES, not by others.

    Signed-off-by: Chen Gong
    Signed-off-by: Len Brown

    Chen Gong
     
  • Bit 0 of the support parameter to the OSC call should be set in order to
    indicate that the OS supports the WHEA mechanism. Stuart Hayes tracked
    an APEI issue on some Dell platforms down to this.

    Reported-by: Stuart Hayes
    Signed-off-by: Matthew Garrett
    Signed-off-by: Len Brown

    Matthew Garrett
     

06 Aug, 2011

1 commit