15 Aug, 2009

1 commit


14 Aug, 2009

10 commits

  • Reason: Martin's timekeeping cleanup series depends on both
    timers/core and mainline changes.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Linus Torvalds
     
  • free_irq() can remove an irqaction while the corresponding interrupt
    is in progress, but free_irq() sets action->thread to NULL
    unconditionally, which might lead to a NULL pointer dereference in
    handle_IRQ_event() when the hard interrupt context tries to wake up
    the handler thread.

    Prevent this by moving the thread stop after synchronize_irq(). No
    need to set action->thread to NULL either as action is going to be
    freed anyway.

    This fixes a boot crash reported against preempt-rt which uses the
    mainline irq threads code to implement full irq threading.

    [ tglx: removed local irqthread variable ]

    Signed-off-by: Linus Torvalds
    Signed-off-by: Thomas Gleixner

    Linus Torvalds
     
  • …x/kernel/git/tip/linux-2.6-tip

    * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    perf_counter: Report the cloning task as parent on perf_counter_fork()
    perf_counter: Fix an ipi-deadlock
    perf: Rework/fix the whole read vs group stuff
    perf_counter: Fix swcounter context invariance
    perf report: Don't show unresolved DSOs and symbols when -S/-d is used
    perf tools: Add a general option to enable raw sample records
    perf tools: Add a per tracepoint counter attribute to get raw sample
    perf_counter: Provide hw_perf_counter_setup_online() APIs
    perf list: Fix large list output by using the pager
    perf_counter, x86: Fix/improve apic fallback
    perf record: Add missing -C option support for specifying profile cpu
    perf tools: Fix dso__new handle() to handle deleted DSOs
    perf tools: Fix fallback to cplus_demangle() when bfd_demangle() is not available
    perf report: Show the tid too in -D
    perf record: Fix .tid and .pid fill-in when synthesizing events
    perf_counter, x86: Fix generic cache events on P6-mobile CPUs
    perf_counter, x86: Fix lapic printk message

    Linus Torvalds
     
  • …/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    futex: Fix handling of bad requeue syscall pairing
    futex: Fix compat_futex to be same as futex for REQUEUE_PI
    locking, sched: Give waitqueue spinlocks their own lockdep classes
    futex: Update futex_q lock_ptr on requeue proxy lock

    Linus Torvalds
     
  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86: Fix oops in identify_cpu() on CPUs without CPUID
    x86: Clear incorrectly forced X86_FEATURE_LAHF_LM flag
    x86, mce: therm_throt - change when we print messages
    x86: Add reboot quirk for every 5 series MacBook/Pro

    Linus Torvalds
     
  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: (22 commits)
    ocfs2: Fix possible deadlock when extending quota file
    ocfs2: keep index within status_map[]
    ocfs2: Initialize the cluster we're writing to in a non-sparse extend
    ocfs2: Remove redundant BUG_ON in __dlm_queue_ast()
    ocfs2/quota: Release lock for error in ocfs2_quota_write.
    ocfs2: Define credit counts for quota operations
    ocfs2: Remove syncjiff field from quota info
    ocfs2: Fix initialization of blockcheck stats
    ocfs2: Zero out padding of on disk dquot structure
    ocfs2: Initialize blocks allocated to local quota file
    ocfs2: Mark buffer uptodate before calling ocfs2_journal_access_dq()
    ocfs2: Make global quota files blocksize aligned
    ocfs2: Use ocfs2_rec_clusters in ocfs2_adjust_adjacent_records.
    ocfs2: Fix deadlock on umount
    ocfs2: Add extra credits and access the modified bh in update_edge_lengths.
    ocfs2: Fail ocfs2_get_block() immediately when a block needs allocation
    ocfs2: Fix error return in ocfs2_write_cluster()
    ocfs2: Fix compilation warning for fs/ocfs2/xattr.c
    ocfs2: Initialize count in aio_write before generic_write_checks
    ocfs2: log the actual return value of ocfs2_file_aio_write()
    ...

    Linus Torvalds
     
  • * 'for-linus' of git://neil.brown.name/md:
    md: allow upper limit for resync/reshape to be set when array is read-only
    md/raid5: Properly remove excess drives after shrinking a raid5/6
    md/raid5: make sure a reshape restarts at the correct address.
    md/raid5: allow new reshape modes to be restarted in the middle.
    md: never advance 'events' counter by more than 1.
    Remove deadlock potential in md_open

    Linus Torvalds
     
  • * 'sh/for-2.6.31' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
    sh: fix i2c init order on ap325rxa V2
    sh: fix i2c init order on Migo-R V2
    sh: convert processor device setup functions to arch_initcall()

    Linus Torvalds
     
  • kernel_sendpage() does the proper default case handling for when the
    socket doesn't have a native sendpage implementation.

    Now, arguably this might be something that we could instead solve by
    just specifying that all protocols should do it themselves at the
    protocol level, but we really only care about the common protocols.
    Does anybody really care about sendpage on something like Appletalk? Not
    likely.

    Acked-by: David S. Miller
    Acked-by: Julien TINNES
    Acked-by: Tavis Ormandy
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Linus Torvalds
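The default-case handling described above can be sketched in userspace as a wrapper that falls back to a generic path when a protocol has no native sendpage. This is only an illustration: the `sketch_*` names and the simplified types are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stddef.h>
#include <sys/types.h>

struct sketch_sock;

/* Hypothetical, simplified stand-in for a protocol's operations table. */
struct sketch_proto_ops {
    /* NULL when the protocol has no native sendpage implementation. */
    ssize_t (*sendpage)(struct sketch_sock *sock, const void *page, size_t size);
};

struct sketch_sock {
    const struct sketch_proto_ops *ops;
};

/* Generic fallback; here it simply pretends all bytes were sent. */
static ssize_t sketch_no_sendpage(struct sketch_sock *sock, const void *page, size_t size)
{
    (void)sock;
    (void)page;
    return (ssize_t)size;
}

/* Callers use this wrapper instead of dereferencing ops->sendpage directly. */
ssize_t sketch_kernel_sendpage(struct sketch_sock *sock, const void *page, size_t size)
{
    if (sock->ops->sendpage)
        return sock->ops->sendpage(sock, page, size);
    return sketch_no_sendpage(sock, page, size);
}
```

A caller that always goes through the wrapper never jumps through a NULL function pointer, whichever protocol sits behind the socket.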
     

13 Aug, 2009

19 commits

  • A bug in (9f498cc: perf_counter: Full task tracing) makes
    profiling multi-threaded apps go belly up.

    [ output as: (PID:TID):(PPID:PTID) ]

    # ./perf report -D | grep FORK
    0x4b0 [0x18]: PERF_EVENT_FORK: (3237:3237):(3236:3236)
    0xa10 [0x18]: PERF_EVENT_FORK: (3237:3238):(3236:3236)
    0xa70 [0x18]: PERF_EVENT_FORK: (3237:3239):(3236:3236)
    0xad0 [0x18]: PERF_EVENT_FORK: (3237:3240):(3236:3236)
    0xb18 [0x18]: PERF_EVENT_FORK: (3237:3241):(3236:3236)

    Shows us that the test (27d028d perf report: Update for the new
    FORK/EXIT events) in builtin-report.c:

    /*
     * A thread clone will have the same PID for both
     * parent and child.
     */
    if (thread == parent)
            return 0;

    Will clearly fail.

    The problem is that perf_counter_fork() reports the actual
    parent, instead of the cloning thread.

    Fixing that (with the below patch), yields:

    # ./perf report -D | grep FORK
    0x4c8 [0x18]: PERF_EVENT_FORK: (1590:1590):(1589:1589)
    0xbd8 [0x18]: PERF_EVENT_FORK: (1590:1591):(1590:1590)
    0xc80 [0x18]: PERF_EVENT_FORK: (1590:1592):(1590:1590)
    0x3338 [0x18]: PERF_EVENT_FORK: (1590:1593):(1590:1590)
    0x66b0 [0x18]: PERF_EVENT_FORK: (1590:1594):(1590:1590)

    Which both makes more sense and doesn't confuse perf report
    anymore.

    Reported-by: Pekka Enberg
    Signed-off-by: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: paulus@samba.org
    Cc: Anton Blanchard
    Cc: Arjan van de Ven
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
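To make the before/after dumps concrete, here is a minimal sketch of the thread-clone check applied to the (PID:TID):(PPID:PTID) tuples. The `fork_sample` struct and `is_thread_clone()` are hypothetical; the real check in builtin-report.c compares thread objects rather than raw PIDs.

```c
#include <stdbool.h>

/* Hypothetical record mirroring the (PID:TID):(PPID:PTID) dump format. */
struct fork_sample {
    int pid, tid;    /* the new task */
    int ppid, ptid;  /* the task reported as its parent */
};

/* A thread clone shares its PID with the task that created it. */
bool is_thread_clone(const struct fork_sample *s)
{
    return s->pid == s->ppid;
}
```

With the pre-fix event (3237:3238):(3236:3236) the check is false even though TID 3238 is a thread of 3237; with the post-fix event (1590:1591):(1590:1590) it is true.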
     
  • perf_pending_counter() is called from IRQ context and will call
    perf_counter_disable(), however perf_counter_disable() uses
    smp_call_function_single() which doesn't fancy being used with
    IRQs disabled due to IPI deadlocks.

    Fix this by making it use the local __perf_counter_disable()
    call and teaching the counter_sched_out() code about pending
    disables as well.

    This should cover the case where a counter migrates before the
    pending queue gets processed.

    Signed-off-by: Peter Zijlstra
    Cc: Corey J Ashford
    Cc: Paul Mackerras
    Cc: stephane eranian
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Replace PERF_SAMPLE_GROUP with PERF_SAMPLE_READ and introduce
    PERF_FORMAT_GROUP to deal with group reads in a more generic
    way.

    This allows you to get group reads out of read() as well.

    Signed-off-by: Peter Zijlstra
    Cc: Corey J Ashford
    Cc: Paul Mackerras
    Cc: stephane eranian
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • perf_swcounter_is_counting() uses a lock, which means we cannot
    use swcounters from NMI or when holding that particular lock,
    this is unintended.

    The below removes the lock, this opens up a race window, but not
    worse than the swcounters already experience due to RCU
    traversal of the context in perf_swcounter_ctx_event().

    This also fixes the hard lockups while opening a lockdep
    tracepoint counter.

    Signed-off-by: Peter Zijlstra
    Acked-by: Frederic Weisbecker
    Cc: Paul Mackerras
    Cc: stephane eranian
    Cc: Corey J Ashford
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • We're interested in just those symbols/DSOs, so filter out the
    unresolved ones.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • While we can enable the perf sample records per tracepoint
    counter, we may also want to enable this option for every
    tracepoint counter we open, so that we don't need to add a
    :record flag for all of them.

    Add the -R, --raw-samples options for this purpose.

    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Mike Galbraith
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • Add a new flag field while opening a tracepoint perf counter:

    -e tracepoint_subsystem:tracepoint_name:flags

    This is intended to be generic although for now it only supports the
    r[e[c[o[r[d]]]]] flag:

    ./perf record -e workqueue:workqueue_insertion:record
    ./perf record -e workqueue:workqueue_insertion:r

    will have the same effect: enabling the raw samples record for
    the given tracepoint counter.

    In the future, we may want to support further flags, separated
    by commas.

    Signed-off-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    Cc: Arnaldo Carvalho de Melo
    Cc: Mike Galbraith
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
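The r[e[c[o[r[d]]]]] notation means any non-empty prefix of "record" is accepted. A minimal sketch of such a match, assuming a hypothetical helper name (`is_record_flag`) that is not taken from the perf sources:

```c
#include <stdbool.h>
#include <string.h>

/* True when flag is a non-empty prefix of "record" ("r", "rec", "record", ...). */
bool is_record_flag(const char *flag)
{
    size_t len = strlen(flag);

    /* Comparing len bytes also rejects strings longer than "record". */
    return len > 0 && strncmp(flag, "record", len) == 0;
}
```

Under this sketch both `:record` and `:r` from the examples above select the same behaviour, while something like `:rd` or `:records` does not.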
     
  • Provide weak aliases for hw_perf_counter_setup_online(). This is
    used by the BTS patches (for v2.6.32), but it interacts with
    fixes so propagate this upstream. (it has no effect as of yet)

    Also export perf_counter_output() to architecture code.

    Cc: Peter Zijlstra
    Cc: Paul Mackerras
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • When /sys/kernel/debug is mounted the list can be immense, so
    use the pager like the other tools.

    Signed-off-by: Arnaldo Carvalho de Melo
    Acked-by: Frederic Weisbecker
    Cc: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • Convert the AP325RXA board code to register devices at
    arch_initcall() time instead of device_initcall(). This
    fix unbreaks pcf8563 RTC driver support.

    Signed-off-by: Magnus Damm
    Signed-off-by: Paul Mundt

    Magnus Damm
     
  • Convert the Migo-R board code to register devices at
    arch_initcall() time instead of __initcall(). This fix
    unbreaks migor_ts touch screen driver support.

    Signed-off-by: Magnus Damm
    Signed-off-by: Paul Mundt

    Magnus Damm
     
  • Convert the processor platform device setup
    functions from __initcall() and sometimes
    device_initcall() to arch_initcall().

    This makes sure that the platform devices are
    registered a bit earlier so the devices are
    available when drivers register using initcall
    levels earlier than device_initcall().

    A good example is platform devices needed by
    i2c-sh_mobile.c which registers a bit earlier
    using subsys_initcall().

    Signed-off-by: Magnus Damm
    Signed-off-by: Paul Mundt

    Magnus Damm
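The ordering argument can be sketched with the initcall levels as plain integers. The enum below mirrors the kernel's level order (arch before subsys before device, with __initcall() being an alias for device_initcall()); the helper name is hypothetical, and link order within a single level is deliberately not modelled.

```c
#include <stdbool.h>

/* Initcall levels in the order the kernel runs them at boot. */
enum initcall_level {
    LEVEL_PURE = 0,
    LEVEL_CORE,
    LEVEL_POSTCORE,
    LEVEL_ARCH,
    LEVEL_SUBSYS,
    LEVEL_FS,
    LEVEL_DEVICE,   /* __initcall() maps to this level */
    LEVEL_LATE
};

/* True when a device registered at dev_level is guaranteed to exist
 * before a driver that registers at drv_level runs. */
bool device_registered_first(enum initcall_level dev_level,
                             enum initcall_level drv_level)
{
    return dev_level < drv_level;
}
```

For a driver at subsys_initcall() level, such as i2c-sh_mobile.c, devices registered at device level come too late, while devices registered at arch level are already present.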
     
  • Normally we only allow the upper limit for a reshape to be decreased
    when the array is not performing a sync/recovery/reshape, otherwise there
    could be races. But if an array is part-way through a reshape when it
    is assembled the reshape is started immediately leaving no window
    to set an upper bound.

    If the array is started read-only, the reshape will be suspended until
    the array becomes writable, so that provides a window during which it
    is perfectly safe to reduce the upper limit of a reshape.

    So: allow the upper limit (sync_max) to be reduced even if the reshape
    thread is running, as long as the array is still read-only.

    Signed-off-by: NeilBrown

    NeilBrown
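The rule described above can be written as a small predicate. This is a sketch under assumptions: the function and parameter names are hypothetical, and it assumes that raising the limit is always permitted, which the message implies but does not state outright.

```c
#include <stdbool.h>

/* May the upper limit (sync_max) be changed from cur_max to new_max? */
bool may_set_sync_max(unsigned long long new_max, unsigned long long cur_max,
                      bool sync_running, bool read_only)
{
    /* Assumption: not lowering the limit is always safe. */
    if (new_max >= cur_max)
        return true;

    /* Lowering is safe when nothing is running, or when the array is
     * still read-only and the reshape is therefore suspended. */
    return !sync_running || read_only;
}
```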
     
  • We were removing the drives from the array, but not
    removing symlinks from /sys/.... and not marking the device
    as having been removed.

    Signed-off-by: NeilBrown

    NeilBrown
     
  • This "if" doesn't allow for the possibility that the number of devices
    doesn't change, and so sector_nr isn't set correctly in that case.
    So change '>' to '>='.

    Signed-off-by: NeilBrown

    NeilBrown
     
  • md/raid5 doesn't allow a reshape to restart if it involves writing
    over the same part of disk that it would be reading from.
    This happens at the beginning of a reshape that increases the number
    of devices, at the end of a reshape that decreases the number of
    devices, and continuously for a reshape that does not change the
    number of devices.

    The current code is correct for the "increase number of devices"
    case as the critical section at the start is handled by userspace
    performing a backup.

    It does not work for reducing the number of devices, or the
    no-change case.
    For 'reducing', we need to invert the test. For no-change we cannot
    really be sure things will be safe, so simply require the array
    to be read-only, which is how the user-space code which carefully
    starts such arrays works.

    Signed-off-by: NeilBrown

    NeilBrown
     
  • When assembling arrays, md allows two devices to have different event
    counts as long as the difference is only '1'. This is to cope with
    a system failure between updating the metadata on two different
    devices.

    However there are currently times when we update the event count by
    2. This was done to keep the event count even when the array is clean
    and odd when it is dirty, which allows us to avoid writing common
    updates to spare devices and so allow those spares to go to sleep.

    This is bad for the above reason. So change it to never increase by
    two. This means that the alignment between 'odd/even' and
    'clean/dirty' might take a little longer to attain, but that is only a
    small cost. The spares will get a few more updates but they will
    still be spared (;-) most updates and can still go to sleep.

    Prior to this patch there was a small chance that after a crash an
    array would fail to assemble due to the overly large event count
    mismatch.

    Signed-off-by: NeilBrown

    NeilBrown
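The assembly rule can be sketched as a tolerance check on two event counts. The function name is hypothetical; it only illustrates why a step of 2 is risky when a crash lands between the metadata writes to two devices.

```c
#include <stdbool.h>
#include <stdint.h>

/* Two devices may be assembled together when their event counts
 * differ by at most 1. */
bool events_compatible(uint64_t a, uint64_t b)
{
    uint64_t diff = a > b ? a - b : b - a;

    return diff <= 1;
}
```

If the count goes from 10 to 12 in one step and only one device is written before a crash, the devices hold 10 and 12 and fail the check; with steps of 1 the worst case is 10 and 11, which passes.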
     
  • * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
    Remove double removal of blktrace directory

    Linus Torvalds
     
  • commit fd51d251e4cdb21f68e9dbc4336514d64a105a79
    Author: Stefan Raspl
    Date: Tue May 19 09:59:08 2009 +0200

    blktrace: remove debugfs entries on bad path

    added an explicit invocation of debugfs_remove for bt->dir;
    blk_remove_buf_file_callback also removes the directory. On
    occasion I am seeing memory corruption that I have bisected down to
    this commit. [The testing involves a (long) series of I/O benchmarks
    with blktrace invoked around the actual runs.] I believe that this
    committed patch is correct, but the problem actually lies in the code
    in blk_remove_buf_file_callback.

    With this patch I am able to consistently get complete runs whereas
    previously I could not get a single run to complete.

    The first part of the patch simply moves the debugfs_remove below the
    relay_close: the relay_close call will remove files under bt->dir, and
    so we should not remove the directory until all the files we created
    have been removed. (Note: This is not sufficient to fix the problem -
    the file system code has ref counts on the directory, so our invocation
    does not cause the directory to actually be removed. Nonetheless, we
    should not rely upon that feature.)

    Signed-off-by: Alan D. Brunelle
    Signed-off-by: Jens Axboe

    Alan D. Brunelle
     

12 Aug, 2009

10 commits

  • * 'for-linus' of git://oss.sgi.com/xfs/xfs:
    xfs: fix spin_is_locked assert on uni-processor builds
    xfs: check for dinode realtime flag corruption
    use XFS_CORRUPTION_ERROR in xfs_btree_check_sblock
    xfs: switch to NOFS allocation under i_lock in xfs_attr_rmtval_get
    xfs: switch to NOFS allocation under i_lock in xfs_readlink_bmap
    xfs: switch to NOFS allocation under i_lock in xfs_attr_rmtval_set
    xfs: switch to NOFS allocation under i_lock in xfs_buf_associate_memory
    xfs: switch to NOFS allocation under i_lock in xfs_dir_cilookup_result
    xfs: switch to NOFS allocation under i_lock in xfs_da_buf_make
    xfs: switch to NOFS allocation under i_lock in xfs_da_state_alloc
    xfs: switch to NOFS allocation under i_lock in xfs_getbmap
    xfs: avoid memory allocation under m_peraglock in growfs code

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
    ALSA: hda - Don't override ADC definitions for ALC codecs
    ALSA: hda - Add missing vmaster initialization for ALC269
    ASoC: Add missing DRV_NAME definitions for fsl/* drivers

    Linus Torvalds
     
  • * 'zerolen' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/misc-2.6:
    Remove zero-length file drivers/mtd/maps/sbc8240.c

    Linus Torvalds
     
  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
    ahci: add workaround for on-board 5723s on some gigabyte boards
    ahci: Soften up the dmesg on SB600 PMP softreset failure recovery
    Documentation/kernel-parameters.txt: document libata's ignore_hpa option
    sata_nv: MSI support, disabled by default
    libata: OCZ Vertex can't do HPA
    pata_atiixp: fix second channel support
    pata_at91: fix resource release

    Linus Torvalds
     
  • We can't call nfs_readdata_release()/nfs_writedata_release() without
    first initialising and referencing args.context. Doing so inside
    nfs_direct_read_schedule_segment()/nfs_direct_write_schedule_segment()
    causes an Oops.

    We should rather be calling nfs_readdata_free()/nfs_writedata_free() in
    those cases.

    Looking at the O_DIRECT code, the "struct nfs_direct_req" is already
    referencing the nfs_open_context for us. Since the readdata and writedata
    structures carry a reference to that, we can simplify things by getting rid
    of the extra nfs_open_context references, so that we can replace all
    instances of nfs_readdata_release()/nfs_writedata_release().

    Reported-by: Catalin Marinas
    Signed-off-by: Trond Myklebust
    Tested-by: Catalin Marinas
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Trond Myklebust
     
  • Johannes Stezenbach reported that his Pentium-M based
    laptop does not have the local APIC enabled by default,
    and hence perfcounters do not get initialized.

    Add a fallback for this case: allow non-sampled counters
    and return with an error on sampled counters. This allows
    'perf stat' to work out of box - and allows 'perf top'
    and 'perf record' to fall back on a hrtimer based sampling
    method.

    ( Passing 'lapic' on the boot line will allow hardware
    sampling to occur - but if the APIC is disabled
    permanently by the hardware then this fallback still
    allows more systems to use perfcounters. )

    Also decouple perfcounter support from X86_LOCAL_APIC.

    -v2: fix typo breaking counters on all other systems ...

    Reported-by: Johannes Stezenbach
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
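The fallback policy above amounts to one check at counter setup time. A minimal sketch, assuming a hypothetical function name and the -EOPNOTSUPP error code (the message only says "return with an error"):

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns 0 when the counter can be set up, or a negative errno.
 * sample_period == 0 means a non-sampled (counting-only) counter. */
int sketch_hw_counter_init(bool apic_available, uint64_t sample_period)
{
    /* Sampling needs the local APIC to deliver overflow interrupts. */
    if (!apic_available && sample_period != 0)
        return -EOPNOTSUPP;   /* assumed error code */

    return 0;
}
```

A counting-only user such as 'perf stat' succeeds either way, while a sampling user sees the error and can switch to an hrtimer based method.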
     
  • perf top supports a -C for setting the profile CPU, but perf
    record does not. This adds the same option for perf record,
    allowing the user to specify a specific target profile CPU.

    Signed-off-by: Jens Axboe
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Paul Mackerras
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Jens Axboe
     
  • It is better than showing the map addr, this way at least we
    know that we can't get the symtabs because the DSO was deleted
    (system update) while an app still used such DSO.

    Yeah, don't do that, but if you do, you'll figure it out
    quicker this way.

    [acme@doppio linux-2.6-tip]$ perf report | head -15
    # Samples: 3796
    #
    # Overhead Command Shared Object Symbol
    # ........ ....... ................................................................... ......
    #
    23.55% pidgin /lib64/libglib-2.0.so.0.2000.4.#prelink#.Pd98lu (deleted) [.] 0x00000000038844
    21.55% pidgin /lib64/libpthread-2.10.1.so.#prelink#.AFwK8Q (deleted) [.] 0x0000000000a42d
    10.85% pidgin [kernel] [.] vread_hpet
    7.85% pidgin /lib64/libgobject-2.0.so.0.2000.4.#prelink#.o1vpU7 (deleted) [.] 0x00000000014de8
    3.35% pidgin /lib64/libc-2.10.1.so (deleted) [.] 0x0000000007a875
    3.19% pidgin /lib64/libdbus-1.so.3.4.0.#prelink#.6mwgZP (deleted) [.] 0x0000000001d254
    3.06% pidgin /usr/lib64/libgtk-x11-2.0.so.0.1600.5.#prelink#.511hAl (deleted) [.] 0x000000002334e7
    2.90% pidgin /usr/lib64/libgdk-x11-2.0.so.0.1600.5.#prelink#.5qlMo1 (deleted) [.] 0x00000000037b2d
    1.84% pidgin [kernel] [k] do_sys_poll
    1.45% pidgin /usr/lib64/libX11.so.6.2.0.#prelink#.iR59Rx (deleted) [.] 0x0000000004c751
    [acme@doppio linux-2.6-tip]$

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Luis Claudio R. Gonçalves
    Cc: Clark Williams
    Cc: H. Peter Anvin
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Frédéric Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • In old binutils we can't access bfd_demangle(), use
    cplus_demangle() just like oprofile.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: Luis Claudio R. Gonçalves
    Cc: H. Peter Anvin
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Frédéric Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo
     
  • This made it easier to find the firefox threading related
    bug.

    Signed-off-by: Arnaldo Carvalho de Melo
    Cc: "H. Peter Anvin"
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Frédéric Weisbecker
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Arnaldo Carvalho de Melo