14 Feb, 2012

40 commits

  • commit dc9086004b3d5db75997a645b3fe08d9138b7ad0 upstream.

    When isolating pages for migration, migration starts at the start of a
    zone while the free scanner starts at the end of the zone. Migration
    avoids entering a new zone by never going beyond the free scanner.

    Unfortunately, in very rare cases nodes can overlap. When this happens,
    migration isolates pages without the LRU lock held, corrupting lists
    which will trigger errors in reclaim or during page free such as in the
    following oops

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
    IP: [] free_pcppages_bulk+0xcc/0x450
    PGD 1dda554067 PUD 1e1cb58067 PMD 0
    Oops: 0000 [#1] SMP
    CPU 37
    Pid: 17088, comm: memcg_process_s Tainted: G X
    RIP: free_pcppages_bulk+0xcc/0x450
    Process memcg_process_s (pid: 17088, threadinfo ffff881c2926e000, task ffff881c2926c0c0)
    Call Trace:
    free_hot_cold_page+0x17e/0x1f0
    __pagevec_free+0x90/0xb0
    release_pages+0x22a/0x260
    pagevec_lru_move_fn+0xf3/0x110
    putback_lru_page+0x66/0xe0
    unmap_and_move+0x156/0x180
    migrate_pages+0x9e/0x1b0
    compact_zone+0x1f3/0x2f0
    compact_zone_order+0xa2/0xe0
    try_to_compact_pages+0xdf/0x110
    __alloc_pages_direct_compact+0xee/0x1c0
    __alloc_pages_slowpath+0x370/0x830
    __alloc_pages_nodemask+0x1b1/0x1c0
    alloc_pages_vma+0x9b/0x160
    do_huge_pmd_anonymous_page+0x160/0x270
    do_page_fault+0x207/0x4c0
    page_fault+0x25/0x30

    The "X" in the taint flag means that external modules were loaded, but
    this is unrelated to the bug triggering. The real problem was that the PFN
    layout looks like this:

    Zone PFN ranges:
    DMA 0x00000010 -> 0x00001000
    DMA32 0x00001000 -> 0x00100000
    Normal 0x00100000 -> 0x01e80000
    Movable zone start PFN for each node
    early_node_map[14] active PFN ranges
    0: 0x00000010 -> 0x0000009b
    0: 0x00000100 -> 0x0007a1ec
    0: 0x0007a354 -> 0x0007a379
    0: 0x0007f7ff -> 0x0007f800
    0: 0x00100000 -> 0x00680000
    1: 0x00680000 -> 0x00e80000
    0: 0x00e80000 -> 0x01080000
    1: 0x01080000 -> 0x01280000
    0: 0x01280000 -> 0x01480000
    1: 0x01480000 -> 0x01680000
    0: 0x01680000 -> 0x01880000
    1: 0x01880000 -> 0x01a80000
    0: 0x01a80000 -> 0x01c80000
    1: 0x01c80000 -> 0x01e80000

    The fix is straightforward. isolate_migratepages() has to make a check
    similar to the one in isolate_freepages() to ensure that it never isolates
    pages from a zone it does not hold the LRU lock for.
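
    The overlap can be read directly from the early_node_map output above.
    The following is a minimal user-space sketch, not the kernel patch: the
    table is copied from the log, range ends are treated as exclusive, and
    struct pfn_range, node_of_pfn() and may_isolate() are hypothetical names
    used only for illustration. It models nodes rather than zones, but it
    shows why a linear PFN walk needs a per-page ownership check: PFNs that
    lie between node 0's first and last page can belong to node 1.

```c
#include <stddef.h>

/* One active PFN range from the boot log. */
struct pfn_range {
	int node;
	unsigned long start;	/* first PFN in the range */
	unsigned long end;	/* one past the last PFN (assumed exclusive) */
};

/* Copied from the early_node_map[14] output quoted above. */
static const struct pfn_range node_map[] = {
	{ 0, 0x00000010UL, 0x0000009bUL },
	{ 0, 0x00000100UL, 0x0007a1ecUL },
	{ 0, 0x0007a354UL, 0x0007a379UL },
	{ 0, 0x0007f7ffUL, 0x0007f800UL },
	{ 0, 0x00100000UL, 0x00680000UL },
	{ 1, 0x00680000UL, 0x00e80000UL },
	{ 0, 0x00e80000UL, 0x01080000UL },
	{ 1, 0x01080000UL, 0x01280000UL },
	{ 0, 0x01280000UL, 0x01480000UL },
	{ 1, 0x01480000UL, 0x01680000UL },
	{ 0, 0x01680000UL, 0x01880000UL },
	{ 1, 0x01880000UL, 0x01a80000UL },
	{ 0, 0x01a80000UL, 0x01c80000UL },
	{ 1, 0x01c80000UL, 0x01e80000UL },
};

/* Return the node that owns pfn, or -1 if pfn falls into a hole. */
int node_of_pfn(unsigned long pfn)
{
	size_t i;

	for (i = 0; i < sizeof(node_map) / sizeof(node_map[0]); i++) {
		if (pfn >= node_map[i].start && pfn < node_map[i].end)
			return node_map[i].node;
	}
	return -1;
}

/*
 * Hypothetical stand-in for the missing check: a page may only be
 * isolated if it belongs to the node whose lock the scanner holds.
 */
int may_isolate(unsigned long pfn, int locked_node)
{
	return node_of_pfn(pfn) == locked_node;
}
```

    With this table, a scanner that holds node 0's lock and walks upwards
    from 0x00100000 reaches node 1's pages at 0x00680000, and may_isolate()
    rejects them.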

    This was discovered in a 3.0-based kernel but it affects 3.1.x, 3.2.x
    and current mainline.

    Signed-off-by: Mel Gorman
    Acked-by: Michal Nazarewicz
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Mel Gorman
     
  • commit 05df1f3c2afaef5672627f2b7095f0d4c4dbc3a0 upstream.

    Error handling in msm_iommu_unmap() is broken. On some error
    conditions retval is set to a non-zero value which causes
    the function to return 'len' at the end. This hides the
    error from the user. Zero should be returned in those error
    cases.
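
    The intended return convention can be sketched as follows. This is a
    simplified stand-alone model, not the driver code: unmap_return_value()
    is a hypothetical helper, and ret stands in for the internal status that
    the text above calls retval.

```c
#include <stddef.h>

/*
 * Hypothetical model of the convention described above: report the
 * number of bytes unmapped on success and 0 when an error occurred,
 * so that the caller can tell the two cases apart.
 */
size_t unmap_return_value(int ret, size_t len)
{
	return ret ? 0 : len;
}
```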

    Cc: David Brown
    Cc: Stepan Moskovchenko
    Signed-off-by: Joerg Roedel
    Acked-by: David Brown
    Signed-off-by: Greg Kroah-Hartman

    Joerg Roedel
     
  • commit af1be04901e27ce669b4ecde1c953d5c939498f5 upstream.

    On some systems the IVRS table does not contain all PCI
    devices present in the system. In case a device not present
    in the IVRS table is translated by the IOMMU no DMA is
    possible from that device by default.
    This patch fixes this by removing the DTE entry for every
    PCI device present in the system and not covered by IVRS.

    Signed-off-by: Joerg Roedel
    Signed-off-by: Greg Kroah-Hartman

    Joerg Roedel
     
  • commit 2492250e4412c6411324c14ab289629360640b0a upstream.

    The driver accidentally exchanged the left/right fields for stereo AC'97
    mixer registers. This affected only the aux and CD inputs because the
    line input bypasses the AC'97 codec and the mic input is mono; cards
    without AC'97 (Xonar DS/DG/HDAV Slim, HG2PCI, HiFier) were not affected.
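
    For illustration, a small sketch of how a stereo AC'97 mixer value is
    packed, assuming the standard AC'97 layout with the left channel in bits
    12..8 and the right channel in bits 4..0. ac97_pack_stereo() is a
    hypothetical helper, not a function from the driver; exchanging its two
    arguments reproduces the kind of swap described above.

```c
#include <stdint.h>

/*
 * Pack two 5-bit attenuation values into one stereo mixer register
 * value: left goes into the high byte, right into the low byte.
 */
uint16_t ac97_pack_stereo(uint8_t left, uint8_t right)
{
	return (uint16_t)(((left & 0x1f) << 8) | (right & 0x1f));
}
```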

    Reported-and-tested-by: Abby Cedar
    Signed-off-by: Clemens Ladisch
    Signed-off-by: Takashi Iwai
    Signed-off-by: Greg Kroah-Hartman

    Clemens Ladisch
     
  • commit 025e4ab3db07fcbf62c01e4f30d1012234beb980 upstream.

    This fixes a memory-corrupting bug: not only does it cause the warning,
    but as a result of dropping the refcount to zero, it causes the
    pcmcia_socket0 device structure to be freed while it still has
    references, causing slab cache corruption. A fatal oops quickly
    follows this warning - often even just a 'dmesg' following the warning
    causes the kernel to oops.

    While testing suspend/resume on an ARM device with PCMCIA support, and a
    CF card inserted, I found that after five suspend and resumes, the
    kernel would complain, and die shortly after with slab corruption.

    WARNING: at include/linux/kref.h:41 kobject_get+0x28/0x50()

    As the message doesn't give a clue about which kobject, and the built-in
    debugging in drivers/base/power/main.c happens too late, this was added
    right before each get_device():

    printk("%s: %p [%s] %u\n", __func__, dev, kobject_name(&dev->kobj), atomic_read(&dev->kobj.kref.refcount));

    and the following behaviour was observed.

    On the 3rd suspend/resume cycle:

    dpm_prepare: c1a0d998 [pcmcia_socket0] 3
    dpm_suspend: c1a0d998 [pcmcia_socket0] 3
    dpm_suspend_noirq: c1a0d998 [pcmcia_socket0] 3
    dpm_resume_noirq: c1a0d998 [pcmcia_socket0] 3
    dpm_resume: c1a0d998 [pcmcia_socket0] 3
    dpm_complete: c1a0d998 [pcmcia_socket0] 2

    4th:

    dpm_prepare: c1a0d998 [pcmcia_socket0] 2
    dpm_suspend: c1a0d998 [pcmcia_socket0] 2
    dpm_suspend_noirq: c1a0d998 [pcmcia_socket0] 2
    dpm_resume_noirq: c1a0d998 [pcmcia_socket0] 2
    dpm_resume: c1a0d998 [pcmcia_socket0] 2
    dpm_complete: c1a0d998 [pcmcia_socket0] 1

    5th:

    dpm_prepare: c1a0d998 [pcmcia_socket0] 1
    dpm_suspend: c1a0d998 [pcmcia_socket0] 1
    dpm_suspend_noirq: c1a0d998 [pcmcia_socket0] 1
    dpm_resume_noirq: c1a0d998 [pcmcia_socket0] 1
    dpm_resume: c1a0d998 [pcmcia_socket0] 1
    dpm_complete: c1a0d998 [pcmcia_socket0] 0
    ------------[ cut here ]------------
    WARNING: at include/linux/kref.h:41 kobject_get+0x28/0x50()
    Modules linked in: ucb1x00_core
    Backtrace:
    [] (dump_backtrace+0x0/0x110) from [] (dump_stack+0x18/0x1c)
    [] (dump_stack+0x0/0x1c) from [] (warn_slowpath_common+0x50/0x68)
    [] (warn_slowpath_common+0x0/0x68) from [] (warn_slowpath_null+0x24/0x28)
    [] (warn_slowpath_null+0x0/0x28) from [] (kobject_get+0x28/0x50)
    [] (kobject_get+0x0/0x50) from [] (get_device+0x1c/0x24)
    [] (dpm_complete+0x0/0x1a0) from [] (dpm_resume_end+0x1c/0x20)
    ...

    Looking at commit 7b24e7988263 ("pcmcia: split up central event handler"),
    the following change was made to cs.c:

             return 0;
     }
     #endif
    -
    -        send_event(skt, CS_EVENT_PM_RESUME, CS_EVENT_PRI_LOW);
    +        if (!(skt->state & SOCKET_CARDBUS) && (skt->callback))
    +                skt->callback->early_resume(skt);
             return 0;
     }

    And the corresponding change in ds.c is from:

    -static int ds_event(struct pcmcia_socket *skt, event_t event, int priority)
    -{
    -        struct pcmcia_socket *s = pcmcia_get_socket(skt);
    ...
    -        switch (event) {
    ...
    -        case CS_EVENT_PM_RESUME:
    -                if (verify_cis_cache(skt) != 0) {
    -                        dev_dbg(&skt->dev, "cis mismatch - different card\n");
    -                        /* first, remove the card */
    -                        ds_event(skt, CS_EVENT_CARD_REMOVAL, CS_EVENT_PRI_HIGH);
    -                        mutex_lock(&s->ops_mutex);
    -                        destroy_cis_cache(skt);
    -                        kfree(skt->fake_cis);
    -                        skt->fake_cis = NULL;
    -                        s->functions = 0;
    -                        mutex_unlock(&s->ops_mutex);
    -                        /* now, add the new card */
    -                        ds_event(skt, CS_EVENT_CARD_INSERTION,
    -                                 CS_EVENT_PRI_LOW);
    -                }
    -                break;
    ...
    -        }

    -        pcmcia_put_socket(s);

    -        return 0;
    -} /* ds_event */

    to:

    +static int pcmcia_bus_early_resume(struct pcmcia_socket *skt)
    +{
    +        if (!verify_cis_cache(skt)) {
    +                pcmcia_put_socket(skt);
    +                return 0;
    +        }

    +        dev_dbg(&skt->dev, "cis mismatch - different card\n");

    +        /* first, remove the card */
    +        pcmcia_bus_remove(skt);
    +        mutex_lock(&skt->ops_mutex);
    +        destroy_cis_cache(skt);
    +        kfree(skt->fake_cis);
    +        skt->fake_cis = NULL;
    +        skt->functions = 0;
    +        mutex_unlock(&skt->ops_mutex);

    +        /* now, add the new card */
    +        pcmcia_bus_add(skt);
    +        return 0;
    +}

    As can be seen, the original function called pcmcia_get_socket() and
    pcmcia_put_socket() around the guts, whereas the replacement code
    calls pcmcia_put_socket() only in one path. This creates an imbalance
    in the refcounting.
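
    The effect of the imbalance can be reproduced with a toy user-space
    model. This is only a sketch: struct socket_model and the functions
    below are hypothetical and do not use the kernel's kref API. One resume
    with an unchanged card drops one reference in the unbalanced variant and
    none in the balanced one, which matches the 3, 2, 1, 0 progression in the
    log above.

```c
/* Toy stand-in for a refcounted socket object. */
struct socket_model {
	int refcount;
};

void model_get(struct socket_model *s)
{
	s->refcount++;
}

void model_put(struct socket_model *s)
{
	s->refcount--;
}

/* Shape of the original ds_event(): get and put bracket the work. */
void resume_balanced(struct socket_model *s)
{
	model_get(s);
	/* ... verify the card, handle a mismatch ... */
	model_put(s);
}

/* Shape of the replacement on the "same card" path: a put with no get. */
void resume_unbalanced(struct socket_model *s)
{
	model_put(s);
}

/* Run a number of resume cycles and return the final reference count. */
int refcount_after_cycles(int start, int cycles, int balanced)
{
	struct socket_model s = { start };
	int i;

	for (i = 0; i < cycles; i++) {
		if (balanced)
			resume_balanced(&s);
		else
			resume_unbalanced(&s);
	}
	return s.refcount;
}
```

    Starting from a count of 5, five unbalanced cycles reach zero, which is
    consistent with the failure appearing after five suspend/resume cycles.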

    Testing with the pcmcia_put_socket() call removed shows that the bug is gone:

    dpm_suspend: c1a10998 [pcmcia_socket0] 5
    dpm_suspend_noirq: c1a10998 [pcmcia_socket0] 5
    dpm_resume_noirq: c1a10998 [pcmcia_socket0] 5
    dpm_resume: c1a10998 [pcmcia_socket0] 5
    dpm_complete: c1a10998 [pcmcia_socket0] 5

    Tested-by: Russell King
    Signed-off-by: Russell King
    Cc: Dominik Brodowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Russell King
     
  • commit f647e1526fd6c7c8ab720781c40d11e11f930e93 upstream.

    The VMID ramp rate is supposed to be 0x3, not 11b. Fix that.

    Signed-off-by: Mark Brown
    Signed-off-by: Greg Kroah-Hartman

    Mark Brown
     
  • commit db966f8abb9ba74f7d5a7230f51572f52c31c4e5 upstream.

    We can enable VMID independently of the bias in some use cases so we need
    to ensure that the core device is powered up.

    Signed-off-by: Mark Brown
    Signed-off-by: Greg Kroah-Hartman

    Mark Brown
     
  • commit 2b6712b19531e22455e7fa18371c5ba9eec76699 upstream.

    Signed-off-by: Susan Gao
    Signed-off-by: Mark Brown
    Signed-off-by: Greg Kroah-Hartman

    Susan Gao
     
  • commit 43b6cec27e1e50a1de3eff47e66e502f3fe7e66e upstream.

    The second line output mixer has the controls for the line input bypasses
    in the opposite order.

    Signed-off-by: Mark Brown
    Signed-off-by: Greg Kroah-Hartman

    Mark Brown
     
  • commit ee76744c51ec342df9822b4a85dbbfc3887b6d60 upstream.

    IN1L/R is routed to both line output mixers, we don't route IN1 to LINEOUT1
    and IN2 to LINEOUT2.

    Signed-off-by: Mark Brown
    Signed-off-by: Greg Kroah-Hartman

    Mark Brown
     
  • commit 2f9bc894c67dbacae5a6a9875818d2a18a918d18 upstream.

    This patch addresses a bug with sendtargets discovery where INADDR_ANY (0.0.0.0)
    + IN6ADDR_ANY_INIT ([0:0:0:0:0:0:0:0]) network portals were incorrectly being
    reported back to initiators instead of the address of the connecting interface.
    To address this, save local socket ->getname() output during iscsi login setup,
    and make iscsit_build_sendtargets_response() return these TargetAddress keys
    when INADDR_ANY or IN6ADDR_ANY_INIT portals are in use.
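
    The selection rule can be sketched for the IPv4 case as below. This is
    an illustrative user-space model that assumes POSIX socket headers;
    advertised_addr() is a hypothetical helper, not the function used by the
    target code, and both arguments are in network byte order.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Pick the IPv4 address to report for a portal: a wildcard portal is
 * replaced by the local address of the connection the request came in
 * on, any other portal address is reported unchanged.
 */
in_addr_t advertised_addr(in_addr_t portal_addr, in_addr_t local_sock_addr)
{
	if (portal_addr == htonl(INADDR_ANY))
		return local_sock_addr;
	return portal_addr;
}
```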

    Reported-by: Dax Kelson
    Reported-by: Andy Grover
    Cc: David S. Miller
    Signed-off-by: Nicholas Bellinger
    Signed-off-by: Greg Kroah-Hartman

    Nicholas Bellinger
     
  • commit cd931ee62fd0258fc85c76a7c5499fe85e0f3436 upstream.

    This patch fixes a bug where the iscsit_add_reject_from_cmd() call
    from a failure to iscsit_alloc_buffs() was incorrectly passing
    add_to_conn=1 and causing a double list_add after iscsi_cmd->i_list
    had already been added in iscsit_handle_scsi_cmd().

    Signed-off-by: Nicholas Bellinger
    Signed-off-by: Greg Kroah-Hartman

    Nicholas Bellinger
     
  • commit c1ce4bd56f2846de55043374598fd929ad3b711b upstream.

    This patch addresses a bug where iscsit_free_cmd() was incorrectly calling
    iscsit_release_cmd() for ISCSI_OP_REJECT because iscsi_add_reject*() will
    overwrite the original iscsi_cmd->iscsi_opcode assignment. This bug was
    introduced with the following commit:

    commit 0be67f2ed8f577d2c72d917928394c5885fa9134
    Author: Nicholas Bellinger
    Date: Sun Oct 9 01:48:14 2011 -0700

    iscsi-target: Remove SCF_SE_LUN_CMD flag abuses

    and was manifesting itself as list corruption with the following:

    [ 131.191092] ------------[ cut here ]------------
    [ 131.191092] WARNING: at lib/list_debug.c:53 __list_del_entry+0x8d/0x98()
    [ 131.191092] Hardware name: VMware Virtual Platform
    [ 131.191092] list_del corruption. prev->next should be ffff880022d3c100, but was 6b6b6b6b6b6b6b6b
    [ 131.191092] Modules linked in: tcm_vhost ib_srpt ib_cm ib_sa ib_mad ib_core tcm_qla2xxx qla2xxx tcm_loop tcm_fc libfc scsi_transport_fc crc32c iscsi_target_mod target_core_stgt scsi_tgt target_core_pscsi target_core_file target_core_iblock target_core_mod configfs ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi sr_mod cdrom sd_mod e1000 ata_piix libata mptspi mptscsih mptbase [last unloaded: scsi_wait_scan]
    [ 131.191092] Pid: 2250, comm: iscsi_ttx Tainted: G W 3.2.0-rc4+ #42
    [ 131.191092] Call Trace:
    [ 131.191092] [] warn_slowpath_common+0x80/0x98
    [ 131.191092] [] warn_slowpath_fmt+0x41/0x43
    [ 131.191092] [] __list_del_entry+0x8d/0x98
    [ 131.191092] [] transport_lun_remove_cmd+0x9b/0xb7 [target_core_mod]
    [ 131.191092] [] transport_generic_free_cmd+0x5d/0x71 [target_core_mod]
    [ 131.191092] [] iscsit_free_cmd+0x1e/0x27 [iscsi_target_mod]
    [ 131.191092] [] iscsit_close_connection+0x14d/0x5b2 [iscsi_target_mod]
    [ 131.191092] [] iscsit_take_action_for_connection_exit+0xdb/0xe0 [iscsi_target_mod]
    [ 131.191092] [] iscsi_target_tx_thread+0x15cb/0x1608 [iscsi_target_mod]
    [ 131.191092] [] ? check_preempt_wakeup+0x121/0x185
    [ 131.191092] [] ? __dequeue_entity+0x2e/0x33
    [ 131.191092] [] ? iscsit_send_text_rsp+0x25f/0x25f [iscsi_target_mod]
    [ 131.191092] [] ? iscsit_send_text_rsp+0x25f/0x25f [iscsi_target_mod]
    [ 131.191092] [] ? schedule+0x55/0x57
    [ 131.191092] [] kthread+0x7d/0x85
    [ 131.191092] [] kernel_thread_helper+0x4/0x10
    [ 131.191092] [] ? kthread_worker_fn+0x16d/0x16d
    [ 131.191092] [] ? gs_change+0x13/0x13

    Reported-by:
    Signed-off-by: Nicholas Bellinger
    Signed-off-by: Greg Kroah-Hartman

    Nicholas Bellinger
     
  • commit 9ec84acee1e221d99dc33237bff5e82839d10cc0 upstream.

    We do want to allow lock debugging for GPL-compatible modules
    that are not (yet) built in-tree. This was disabled as a
    side-effect of commit 2449b8ba0745327c5fa49a8d9acffe03b2eded69
    ('module,bug: Add TAINT_OOT_MODULE flag for modules not built
    in-tree'). Lock debug warnings now include taint flags, so
    kernel developers should still be able to deflect warnings
    caused by out-of-tree modules.

    The TAINT_PROPRIETARY_MODULE flag for non-GPL-compatible modules
    will still disable lock debugging.

    Signed-off-by: Ben Hutchings
    Cc: Nick Bowler
    Cc: Dave Jones
    Cc: Rusty Russell
    Cc: Randy Dunlap
    Cc: Debian kernel maintainers
    Cc: Peter Zijlstra
    Cc: Alan Cox
    Link: http://lkml.kernel.org/r/1323268258.18450.11.camel@deadeye
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Ben Hutchings
     
  • commit df754e6af2f237a6c020c0daff55a1a609338e31 upstream.

    It's unlikely that TAINT_FIRMWARE_WORKAROUND causes false
    lockdep messages, so do not disable lockdep in that case.
    We still want to keep lockdep disabled in the
    TAINT_OOT_MODULE case:

    - bin-only modules can cause various instabilities in
    their and in unrelated kernel code

    - they are impossible to debug for kernel developers

    - they also typically do not have the copyright license
    permission to link to the GPL-ed lockdep code.

    Suggested-by: Ben Hutchings
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-xopopjjens57r0i13qnyh2yo@git.kernel.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Peter Zijlstra
     
  • commit 9f1065032ceb7e86c7c9f16bb86518857e88a172 upstream.

    There was an error in the saving of the CONTRAST_CTR register
    across suspend/resume.

    Signed-off-by: Hubert Feurstein
    Signed-off-by: Nicolas Ferre
    Acked-by: Jean-Christophe PLAGNIOL-VILLARD
    Signed-off-by: Florian Tobias Schandinat
    Signed-off-by: Greg Kroah-Hartman

    Hubert Feurstein
     
  • commit de47a4176c532ef5961b8a46a2d541a3517412d3 upstream.

    For null user mounts, do not invoke string length function
    during session setup.

    Reported-and-Tested-by: Chris Clayton
    Acked-by: Jeff Layton
    Signed-off-by: Shirish Pargaonkar
    Signed-off-by: Steve French
    Signed-off-by: Greg Kroah-Hartman

    Shirish Pargaonkar
     
  • commit 585c0fd8216e0c9f98e2434092af7ec0f999522d upstream.

    NCT6776F can select fan input pins for fans 3 to 5 with a secondary set of
    chip register bits. Check that second set of bits in addition to the first set
    to detect if fans 3..5 are monitored.

    Signed-off-by: Guenter Roeck
    Acked-by: Jean Delvare
    Signed-off-by: Greg Kroah-Hartman

    Guenter Roeck
     
  • commit 684a3ff7e69acc7c678d1a1394fe9e757993fd34 upstream.

    ecryptfs_write() can enter an infinite loop when truncating a file to a
    size larger than 4G. This only happens on architectures where size_t is
    represented by 32 bits.

    This was caused by a size_t overflow due to it incorrectly being used to
    store the result of a calculation which uses potentially large values of
    type loff_t.
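
    The truncation itself is easy to demonstrate outside the kernel. The
    sketch below uses uint32_t as a stand-in for a 32-bit size_t and int64_t
    as a stand-in for loff_t; the function names are hypothetical and the
    code is not taken from ecryptfs_write(). It shows one way such a value
    can mislead a loop: a remaining length of exactly 4G appears as zero.

```c
#include <stdint.h>

/* Remaining bytes stored in a 32-bit size type: the high bits are lost. */
uint32_t remaining_as_size32(int64_t new_size, int64_t pos)
{
	return (uint32_t)(new_size - pos);
}

/* Remaining bytes kept in a 64-bit offset type: no truncation. */
int64_t remaining_as_loff(int64_t new_size, int64_t pos)
{
	return new_size - pos;
}
```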

    [tyhicks@canonical.com: rewrite subject and commit message]
    Signed-off-by: Li Wang
    Signed-off-by: Yunchuan Wen
    Reviewed-by: Cong Wang
    Signed-off-by: Tyler Hicks
    Signed-off-by: Greg Kroah-Hartman

    Li Wang
     
  • commit 9f1f46a45a681d357d1ceedecec3671a5ae957f4 upstream.

    The problem this patch solves is that the forcewake accounting
    necessary for register reads is protected by dev->struct_mutex. But the
    hangcheck and error_capture code need to access registers without
    grabbing this mutex because we hold it while waiting for the gpu.
    So a new lock is required. Because currently the error_state capture
    is called from the error irq handler and the hangcheck code runs from
    a timer, it needs to be an irqsafe spinlock (note that the registers
    used by the irq handler (neglecting the error handling part) only uses
    registers that don't need the forcewake dance).

    We could tune this down to a normal spinlock when we rework the
    error_state capture and hangcheck code to run from a workqueue. But
    we don't have any read in a fastpath that needs forcewake, so I've
    decided to not care much about overhead.

    This prevents tests/gem_hangcheck_forcewake from i-g-t from killing my
    snb on recent kernels - something must have slightly changed the
    timings. On previous kernels it only triggered a WARN about the broken
    locking.

    v2: Drop the previous patch for the register writes.

    v3: Improve the commit message per Chris Wilson's suggestions.

    Signed-Off-by: Daniel Vetter
    Reviewed-by: Chris Wilson
    Reviewed-by: Eugeni Dodonov
    Signed-off-by: Keith Packard
    Signed-off-by: Eugeni Dodonov
    Signed-off-by: Greg Kroah-Hartman

    Daniel Vetter
     
  • commit 8109021313c7a3d8947677391ce6ab9cd0bb1d28 upstream.

    This was forgotten in the original multi-threaded forcewake
    conversion:

    commit 8d715f0024f64ad1b1be85d8c081cf577944c847
    Author: Keith Packard
    Date: Fri Nov 18 20:39:01 2011 -0800

    drm/i915: add multi-threaded forcewake support

    Signed-off-by: Daniel Vetter
    Reviewed-by: Eugeni Dodonov
    Signed-off-by: Keith Packard
    Signed-off-by: Eugeni Dodonov
    Signed-off-by: Greg Kroah-Hartman

    Daniel Vetter
     
  • commit 07c1e8c1462fa7324de4c36ae9e55da2abd79cee upstream.

    We don't need to check 3rd pipe specifically, as it shares PLL with some
    other one.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41977
    Signed-off-by: Eugeni Dodonov
    Reviewed-by: Jesse Barnes
    Signed-off-by: Keith Packard
    Signed-off-by: Greg Kroah-Hartman

    Eugeni Dodonov
     
  • commit 23bd15ec662344dc10e9918fdd0dbc58bc71526d upstream.

    TV Out refresh rate was half of the specification for almost all modes.
    As a result, the pixel clock was too low for some modes, causing a flickering screen.

    Signed-off-by: Rodrigo Vivi
    Reviewed-by: Jesse Barnes
    Signed-off-by: Keith Packard
    Signed-off-by: Eugeni Dodonov
    Signed-off-by: Greg Kroah-Hartman

    Rodrigo Vivi
     
  • commit 097354eb14fa94d31a09c64d640643f58e4a5a9a upstream.

    Otherwise hangcheck spuriously fires when running blitter/bsd-only
    workloads.

    Contrary to a similar patch by Ben Widawsky this does not check
    INSTDONE of the other rings. Chris Wilson implied that doing so resulted
    in a failure to detect a hang, most likely because INSTDONE was
    fluctuating. Thus only
    check ACTHD, which as far as I know is rather reliable. Also, blitter
    and bsd rings can't launch complex tasks from a single instruction
    (like 3D_PRIM on the render with complex or even infinite shaders).

    This fixes spurious gpu hang detection when running
    tests/gem_hangcheck_forcewake on snb/ivb.

    Signed-Off-by: Daniel Vetter
    Reviewed-by: Chris Wilson
    Signed-off-by: Keith Packard
    Signed-off-by: Eugeni Dodonov
    Signed-off-by: Greg Kroah-Hartman

    Daniel Vetter
     
  • commit 832afda6a7d7235ef0e09f4ec46736861540da6d upstream.

    On DP monitor hot remove, clear DP_AUDIO_OUTPUT_ENABLE accordingly,
    so that the audio driver will receive hot plug events and take action
    to refresh its device state and ELD contents.

    Note that the DP_AUDIO_OUTPUT_ENABLE bit may be enabled or disabled
    only when the link training is complete and set to "Normal".

    Tested OK for both hot plug/remove and DPMS on/off.

    Signed-off-by: Wu Fengguang
    Signed-off-by: Keith Packard
    Signed-off-by: Eugeni Dodonov
    Signed-off-by: Greg Kroah-Hartman

    Wu Fengguang
     
  • commit 2deed761188d7480eb5f7efbfe7aa77f09322ed8 upstream.

    On HDMI monitor hot remove, clear SDVO_AUDIO_ENABLE accordingly, so that
    the audio driver will receive hot plug events and take action to refresh
    its device state and ELD contents.

    The cleared SDVO_AUDIO_ENABLE bit needs to be restored to prevent losing
    HDMI audio after DPMS on.

    CC: Wang Zhenyu
    Signed-off-by: Wu Fengguang
    Signed-off-by: Keith Packard
    Signed-off-by: Eugeni Dodonov
    Signed-off-by: Greg Kroah-Hartman

    Wu Fengguang
     
  • commit 853a0c25baf96b028de1654bea1e0c8857eadf3d upstream.

    When we hit EIO while writing LVID, the buffer uptodate bit is cleared.
    This then results in an annoying warning from mark_buffer_dirty() when we
    write the buffer again. So just set uptodate flag unconditionally.

    Reviewed-by: Namjae Jeon
    Signed-off-by: Jan Kara
    Cc: Dave Jones
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
     
  • commit b189e810619a676e6b931a942a3e8387f3d39c21 upstream.

    The driver uses __napi_complete and napi_gro_receive. Without it, the
    driver hits the BUG_ON(n->gro_list) assertion hard in __napi_complete.

    Signed-off-by: Francois Romieu
    Tested-by: Marin Glibic
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Francois Romieu
     
  • commit fe9161db2e6053da21e4649d77bbefaf3030b11d upstream.

    In the SNAPSHOT_CREATE_IMAGE ioctl, if the call to hibernation_snapshot()
    fails, the frozen tasks are not thawed.

    And in the case of success, if we happen to exit due to a successful freezer
    test, all tasks (including those of userspace) are thawed, whereas actually
    we should have thawed only the kernel threads at that point. Fix both these
    issues.

    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Greg Kroah-Hartman

    Srivatsa S. Bhat
     
  • commit 97819a26224f019e73d88bb2fd4eb5a614860461 upstream.

    Commit 2aede851ddf08666f68ffc17be446420e9d2a056 (PM / Hibernate: Freeze
    kernel threads after preallocating memory) moved the freezing of kernel
    threads to hibernation_snapshot() function.

    So now, if the call to hibernation_snapshot() returns early due to a
    successful hibernation test, the caller has to thaw processes to ensure
    that the system gets back to its original state.

    But in SNAPSHOT_CREATE_IMAGE hibernation ioctl, the caller does not thaw
    processes in case hibernation_snapshot() returned due to a successful
    freezer test. Fix this issue. But note we still send the value of 'in_suspend'
    (which is now 0) to userspace, because we are not in an error path per-se,
    and moreover, the value of in_suspend correctly depicts the situation here.

    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Greg Kroah-Hartman

    Srivatsa S. Bhat
     
  • commit cb297a3e433dbdcf7ad81e0564e7b804c941ff0d upstream.

    This issue happens under the following conditions:

    1. preemption is off
    2. __ARCH_WANT_INTERRUPTS_ON_CTXSW is defined
    3. RT scheduling class
    4. SMP system

    Sequence is as follows:

    1. Suppose the current task is A, and schedule() starts.
    2. Task A is enqueued as a pushable task at the entry of schedule().
       __schedule
       prev = rq->curr;
       ...
       put_prev_task
       put_prev_task_rt
       enqueue_pushable_task
    3. Task B is picked as the next task.
       next = pick_next_task(rq);
    4. rq->curr is set to task B and context_switch is started.
       rq->curr = next;
    5. At the entry of context_switch, this CPU's rq->lock is released.
       context_switch
       prepare_task_switch
       prepare_lock_switch
       raw_spin_unlock_irq(&rq->lock);
    6. Shortly after rq->lock is released, an interrupt occurs and IRQ context starts.
    7. try_to_wake_up(), which is called by the ISR, acquires rq->lock.
       try_to_wake_up
       ttwu_remote
       rq = __task_rq_lock(p)
       ttwu_do_wakeup(rq, p, wake_flags);
       task_woken_rt
    8. push_rt_task picks task A, which was enqueued before.
       task_woken_rt
       push_rt_tasks(rq)
       next_task = pick_next_pushable_task(rq)
    9. At find_lock_lowest_rq(), if double_lock_balance() returns 0,
       lowest_rq can be the remote rq.
       (But if preemption is on, double_lock_balance always returns 1 and it
       doesn't happen.)
       push_rt_task
       find_lock_lowest_rq
       if (double_lock_balance(rq, lowest_rq))..
    10. find_lock_lowest_rq returns the available rq. Task A is migrated to
        the remote cpu/rq.
        push_rt_task
        ...
        deactivate_task(rq, next_task, 0);
        set_task_cpu(next_task, lowest_rq->cpu);
        activate_task(lowest_rq, next_task, 0);
    11. But task A is still in IRQ context on this CPU.
        So task A is scheduled by two CPUs at the same time until it returns from the IRQ.
        Task A's stack is corrupted.

    To fix it, don't migrate an RT task if it's still running.
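
    The rule can be illustrated with a small user-space model. This is a
    sketch under stated assumptions, not the scheduler change itself: struct
    task_model and pick_pushable() are hypothetical, and on_cpu stands for
    "the task is still executing on some CPU".

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-in for an RT task on the pushable list. */
struct task_model {
	int id;
	bool on_cpu;	/* still executing on a CPU, e.g. in IRQ context */
};

/*
 * Return the index of the first task that may be pushed to another
 * run queue, or -1 if none qualifies. Tasks that are still running
 * are skipped, so they can never be migrated mid-switch.
 */
int pick_pushable(const struct task_model *tasks, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!tasks[i].on_cpu)
			return (int)i;
	}
	return -1;
}
```

    In the sequence above, task A would still be marked as running in the
    final steps, so such a picker would skip it instead of handing it to the
    remote run queue.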

    Signed-off-by: Chanho Min
    Signed-off-by: Peter Zijlstra
    Acked-by: Steven Rostedt
    Link: http://lkml.kernel.org/r/CAOAMb1BHA=5fm7KTewYyke6u-8DP0iUuJMpgQw54vNeXFsGpoQ@mail.gmail.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Chanho Min
     
  • commit 304a48400d9718f74ec35ae46f30868a5f4c4516 upstream.

    Different versions of the DP to LVDS bridge chip
    need different panel mode settings depending on
    the chip version used.

    Fixes:
    https://bugs.freedesktop.org/show_bug.cgi?id=41569

    Signed-off-by: Alex Deucher
    Signed-off-by: Dave Airlie
    Signed-off-by: Greg Kroah-Hartman

    Alex Deucher
     
  • commit 86698c20f71d488b32c49ed4687fb3cf8a88a5ca upstream.

    Polling the outputs when the device is suspended can result in erroneous
    status updates. Disable output polling during suspend to prevent this
    from happening.

    Signed-off-by: Seth Forshee
    Reviewed-by: Alex Deucher
    Signed-off-by: Dave Airlie
    Signed-off-by: Greg Kroah-Hartman

    Seth Forshee
     
  • commit 525895ba388c949aa906f26e3ec5cb1ab041f56b upstream.

    Due to a race it was possible for a fence to be destroyed while another
    thread was trying to synchronise with it. If this happened in the fallback
    non-semaphore path, it led to the following oops due to fence->channel
    being NULL.

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [] nouveau_fence_update+0xe/0xe0 [nouveau]
    *pde = a649c067
    SMP
    Modules linked in: fuse nouveau(O) ttm(O) drm_kms_helper(O) drm(O) mxm_wmi video wmi netconsole configfs lockd bnep bluetooth rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ip6table_filter ip6_tables snd_hda_codec_realtek snd_hda_intel snd_hda_cobinfmt_misc uinput ata_generic pata_acpi pata_aet2c_algo_bit i2c_core [last unloaded: wmi]

    Pid: 2255, comm: gnome-shell Tainted: G O 3.2.0-0.rc5.git0.1.fc17.i686 #1 System manufacturer System Product Name/M2A-VM
    EIP: 0060:[] EFLAGS: 00010296 CPU: 1
    EIP is at nouveau_fence_update+0xe/0xe0 [nouveau]
    EAX: 00000000 EBX: ddfc6dd0 ECX: dd111580 EDX: 00000000
    ESI: 00003e80 EDI: dd111580 EBP: dd121d00 ESP: dd121ce8
    DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
    Process gnome-shell (pid: 2255, ti=dd120000 task=dd111580 task.ti=dd120000)
    Stack:
    7dc86c76 00000000 00003e80 ddfc6dd0 00003e80 dd111580 dd121d0c fa96371f
    00000000 dd121d3c fa963773 dd111580 01000246 000ec53d 00000000 ddfc6dd0
    00001f40 00000000 ddfc6dd0 00000010 dc7df840 dd121d6c fa9639a0 00000000
    Call Trace:
    [] __nouveau_fence_signalled+0x1f/0x30 [nouveau]
    [] __nouveau_fence_wait+0x43/0xd0 [nouveau]
    [] nouveau_fence_sync+0x1a0/0x1c0 [nouveau]
    [] validate_list+0x176/0x300 [nouveau]
    [] ? ttm_bo_mem_put+0x30/0x30 [ttm]
    [] nouveau_gem_ioctl_pushbuf+0x48a/0xfd0 [nouveau]
    [] ? die+0x31/0x80
    [] drm_ioctl+0x388/0x490 [drm]
    [] ? die+0x31/0x80
    [] ? nouveau_gem_ioctl_new+0x150/0x150 [nouveau]
    [] ? file_has_perm+0xcb/0xe0
    [] ? drm_copy_field+0x80/0x80 [drm]
    [] do_vfs_ioctl+0x86/0x5b0
    [] ? die+0x31/0x80
    [] ? selinux_file_ioctl+0x62/0x130
    [] ? fget_light+0x30/0x340
    [] sys_ioctl+0x6f/0x80
    [] syscall_call+0x7/0xb
    [] ? die+0x31/0x80
    [] ? die+0x31/0x80

    Signed-off-by: Ben Skeggs
    Signed-off-by: Greg Kroah-Hartman

    Ben Skeggs
     
  • commit 1b61925061660009f5b8047f93c5297e04541273 upstream.

    The value of this register is transferred to the V_COUNTER register at the
    beginning of vertical blank. V_COUNTER is the reference for VLINE waits and
    goes from VIEWPORT_Y_START to VIEWPORT_Y_START+VIEWPORT_HEIGHT during scanout,
    so if VIEWPORT_Y_START is not 0, V_COUNTER actually went backwards at the
    beginning of vertical blank, and VLINE waits excluding the whole scanout area
    could never finish (possibly only if VIEWPORT_Y_START is larger than the length
    of vertical blank in scanlines). Setting DESKTOP_HEIGHT to the framebuffer
    height should prevent this for any kind of VLINE wait.

    Fixes https://bugs.freedesktop.org/show_bug.cgi?id=45329 .

    Signed-off-by: Michel Dänzer
    Reviewed-by: Alex Deucher
    Signed-off-by: Dave Airlie
    Signed-off-by: Greg Kroah-Hartman

    Michel Dänzer
     
  • commit d020283dc694c9ec31b410f522252f7a8397e67d upstream.

    Looks like change "PM QoS: Move and rename the implementation files"
    merged during the 3.2 development cycle made PM QoS depend on
    CONFIG_PM which depends on (PM_SLEEP || PM_RUNTIME).

    That breaks CPU C-states with kernels not having these CONFIGs, causing CPUs
    to spend time in Polling loop idle instead of going into deep C-states,
    consuming far more power. This happens with either acpi idle or intel idle
    enabled.

    Either CONFIG_PM should be enabled with any pm_qos users or
    the !CONFIG_PM pm_qos_request() should return sane defaults so as not to
    break the existing users. Here is the patch for the latter option.
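
    To illustrate why the stub's return value matters, here is a minimal
    user-space sketch. It is not the upstream patch: every toy_ name, the
    class numbering, the default values, and the idle-state table are
    assumptions made for illustration. It models an idle governor that picks
    the deepest state whose exit latency fits within the reported QoS limit,
    so a stub that reports a limit of 0 would pin the CPU to the polling state.

```c
/* Toy model only: names and values below are illustrative assumptions. */
#define TOY_USEC_PER_SEC 1000000

enum toy_pm_qos_class {
	TOY_CPU_DMA_LATENCY = 1,
	TOY_NETWORK_LATENCY,
	TOY_NETWORK_THROUGHPUT,
};

/* Stub for a build without PM support: report "no constraint" per class. */
static int toy_pm_qos_request(int pm_qos_class)
{
	switch (pm_qos_class) {
	case TOY_CPU_DMA_LATENCY:
	case TOY_NETWORK_LATENCY:
		return 2000 * TOY_USEC_PER_SEC;	/* effectively unlimited latency */
	case TOY_NETWORK_THROUGHPUT:
		return 0;			/* no minimum throughput requested */
	default:
		return -1;			/* unknown class */
	}
}

/*
 * Return the index of the deepest idle state whose exit latency (in
 * microseconds) does not exceed latency_limit_us. States are ordered from
 * shallowest (index 0, polling) to deepest.
 */
static int toy_deepest_state(const int *exit_latency_us, int nr_states,
			     int latency_limit_us)
{
	int i, best = 0;

	for (i = 1; i < nr_states; i++)
		if (exit_latency_us[i] <= latency_limit_us)
			best = i;
	return best;
}
```

    With a hypothetical table of exit latencies {0, 1, 20, 200}, a limit of 0
    selects index 0 (polling), while the large default returned by the stub
    allows the deepest state to be chosen.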

    [rjw: Modified the changelog slightly.]

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Greg Kroah-Hartman

    Venkatesh Pallipadi
     
  • commit 181e9bdef37bfcaa41f3ab6c948a2a0d60a268b5 upstream.

    Commit 2aede851ddf08666f68ffc17be446420e9d2a056

    PM / Hibernate: Freeze kernel threads after preallocating memory

    introduced a mechanism by which kernel threads were frozen after
    the preallocation of hibernate image memory to avoid problems with
    frozen kernel threads not responding to memory freeing requests.
    However, it overlooked the s2disk code path in which the
    SNAPSHOT_CREATE_IMAGE ioctl was run directly after SNAPSHOT_FREE,
    which caused freeze_workqueues_begin() to BUG(), because it saw
    that workqueues had already been frozen.

    Although in principle this issue might be addressed by removing
    the relevant BUG_ON() from freeze_workqueues_begin(), that would
    reintroduce the very problem that commit 2aede851ddf08666f68ffc17be4
    attempted to avoid into that particular code path. For this reason,
    to fix the issue at hand, introduce thaw_kernel_threads() and make
    the SNAPSHOT_FREE ioctl execute it.
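
    The ordering problem can be modelled with a small user-space sketch. This
    is a toy state machine, not kernel code: the toy_ functions, the state
    struct, and the thaw_on_free switch are all hypothetical names introduced
    here, and the -1 return stands in for the BUG() described above.

```c
#include <stdbool.h>

/* Toy model only: tracks whether "kernel threads" are currently frozen. */
struct toy_snapshot_state {
	bool kthreads_frozen;
};

/* Freezing twice without a thaw in between is treated as an error. */
static int toy_freeze_kernel_threads(struct toy_snapshot_state *s)
{
	if (s->kthreads_frozen)
		return -1;	/* the real code would hit BUG() here */
	s->kthreads_frozen = true;
	return 0;
}

static void toy_thaw_kernel_threads(struct toy_snapshot_state *s)
{
	s->kthreads_frozen = false;
}

/* Stand-in for the create-image step, which freezes kernel threads. */
static int toy_snapshot_create_image(struct toy_snapshot_state *s)
{
	return toy_freeze_kernel_threads(s);
}

/* Stand-in for the free step; thaw_on_free selects the fixed behaviour. */
static void toy_snapshot_free(struct toy_snapshot_state *s, bool thaw_on_free)
{
	if (thaw_on_free)
		toy_thaw_kernel_threads(s);
}
```

    In this model, create, free, create fails on the second create unless the
    free step thaws the threads first, which is the behaviour the fix adds.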

    Special thanks to Srivatsa S. Bhat for detailed analysis of the
    problem.

    Reported-and-tested-by: Jiri Slaby
    Signed-off-by: Rafael J. Wysocki
    Acked-by: Srivatsa S. Bhat
    Signed-off-by: Greg Kroah-Hartman

    Rafael J. Wysocki
     
  • …ing isolation for migration

    commit 0bf380bc70ecba68cb4d74dc656cc2fa8c4d801a upstream.

    When isolating for migration, migration starts at the start of a zone
    which is not necessarily pageblock aligned. Further, it stops isolating
    when COMPACT_CLUSTER_MAX pages are isolated so migrate_pfn is generally
    not aligned. This allows isolate_migratepages() to call pfn_to_page() on
    an invalid PFN which can result in a crash. This was originally reported
    against a 3.0-based kernel with the following trace in a crash dump.

    PID: 9902 TASK: d47aecd0 CPU: 0 COMMAND: "memcg_process_s"
    #0 [d72d3ad0] crash_kexec at c028cfdb
    #1 [d72d3b24] oops_end at c05c5322
    #2 [d72d3b38] __bad_area_nosemaphore at c0227e60
    #3 [d72d3bec] bad_area at c0227fb6
    #4 [d72d3c00] do_page_fault at c05c72ec
    #5 [d72d3c80] error_code (via page_fault) at c05c47a4
    EAX: 00000000 EBX: 000c0000 ECX: 00000001 EDX: 00000807 EBP: 000c0000
    DS: 007b ESI: 00000001 ES: 007b EDI: f3000a80 GS: 6f50
    CS: 0060 EIP: c030b15a ERR: ffffffff EFLAGS: 00010002
    #6 [d72d3cb4] isolate_migratepages at c030b15a
    #7 [d72d3d14] zone_watermark_ok at c02d26cb
    #8 [d72d3d2c] compact_zone at c030b8de
    #9 [d72d3d68] compact_zone_order at c030bba1
    #10 [d72d3db4] try_to_compact_pages at c030bc84
    #11 [d72d3ddc] __alloc_pages_direct_compact at c02d61e7
    #12 [d72d3e08] __alloc_pages_slowpath at c02d66c7
    #13 [d72d3e78] __alloc_pages_nodemask at c02d6a97
    #14 [d72d3eb8] alloc_pages_vma at c030a845
    #15 [d72d3ed4] do_huge_pmd_anonymous_page at c03178eb
    #16 [d72d3f00] handle_mm_fault at c02f36c6
    #17 [d72d3f30] do_page_fault at c05c70ed
    #18 [d72d3fb0] error_code (via page_fault) at c05c47a4
    EAX: b71ff000 EBX: 00000001 ECX: 00001600 EDX: 00000431
    DS: 007b ESI: 08048950 ES: 007b EDI: bfaa3788
    SS: 007b ESP: bfaa36e0 EBP: bfaa3828 GS: 6f50
    CS: 0073 EIP: 080487c8 ERR: ffffffff EFLAGS: 00010202

    It was also reported by Herbert van den Bergh against a 3.1-based kernel
    with the following snippet from the console log.

    BUG: unable to handle kernel paging request at 01c00008
    IP: [<c0522399>] isolate_migratepages+0x119/0x390
    *pdpt = 000000002f7ce001 *pde = 0000000000000000

    It is expected that it also affects 3.2.x and current mainline.

    The problem is that pfn_valid is only called on the first PFN being
    checked and that PFN is not necessarily aligned. Let's say we have a case
    like this

    H = MAX_ORDER_NR_PAGES boundary
    | = pageblock boundary
    m = cc->migrate_pfn
    f = cc->free_pfn
    o = memory hole

    H------|------H------|----m-Hoooooo|ooooooH-f----|------H

    The migrate_pfn is just below a memory hole and the free scanner is beyond
    the hole. When isolate_migratepages starts, it scans from migrate_pfn to
    migrate_pfn+pageblock_nr_pages which is now in a memory hole. It checks
    pfn_valid() on the first PFN but then scans into the hole where there are
    not necessarily valid struct pages.

    This patch ensures that isolate_migratepages calls pfn_valid when
    necessary.
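
    The idea of re-validating at each aligned boundary can be sketched in user
    space as follows. This is a toy model under stated assumptions, not the
    kernel change itself: the block size of 8, the single-hole layout, and all
    toy_ names are hypothetical. It also assumes the hole starts and ends on a
    block boundary, as in the diagram above, and that the starting PFN is
    valid.

```c
#include <stdbool.h>

/* Toy stand-in for MAX_ORDER_NR_PAGES; must be a power of two. */
#define TOY_MAX_ORDER_NR_PAGES 8UL

/* A single memory hole covering PFNs in [start, end). */
struct toy_hole {
	unsigned long start;
	unsigned long end;
};

static bool toy_pfn_valid(unsigned long pfn, const struct toy_hole *hole)
{
	return pfn < hole->start || pfn >= hole->end;
}

/*
 * Scan [low_pfn, end_pfn) and return how many PFNs were examined.
 * Validity is rechecked whenever the scan enters a new aligned block, and
 * an invalid block is skipped as a whole. *touched_hole reports whether any
 * PFN inside the hole was examined anyway.
 */
static unsigned long toy_scan(unsigned long low_pfn, unsigned long end_pfn,
			      const struct toy_hole *hole, bool *touched_hole)
{
	unsigned long examined = 0;

	*touched_hole = false;
	for (; low_pfn < end_pfn; low_pfn++) {
		if ((low_pfn & (TOY_MAX_ORDER_NR_PAGES - 1)) == 0 &&
		    !toy_pfn_valid(low_pfn, hole)) {
			/* Jump to the last PFN of the block; the loop's ++ moves on. */
			low_pfn += TOY_MAX_ORDER_NR_PAGES - 1;
			continue;
		}
		if (!toy_pfn_valid(low_pfn, hole))
			*touched_hole = true;
		examined++;
	}
	return examined;
}
```

    Starting the scan at an unaligned PFN just below the hole, the boundary
    check skips both blocks of the hole and resumes on the far side, so no
    PFN inside the hole is examined.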

    Reported-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
    Tested-by: Herbert van den Bergh <herbert.van.den.bergh@oracle.com>
    Signed-off-by: Mel Gorman <mgorman@suse.de>
    Acked-by: Michal Nazarewicz <mina86@mina86.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

    Mel Gorman
     
  • commit 99f02ef1f18631eb0a4e0ea0a3d56878dbcb4b90 upstream.

    Fix a race condition that shows in conjunction with xip_file_fault() when
    two threads of the same user process fault on the same memory page.

    In this case, the race winner will install the page table entry and the
    unlucky loser will cause an oops: xip_file_fault calls vm_insert_pfn (via
    vm_insert_mixed) which drops out at this check:

    retval = -EBUSY;
    if (!pte_none(*pte))
            goto out_unlock;

    The resulting -EBUSY return value will trigger a BUG_ON() in
    xip_file_fault.

    This fix simply considers the fault as fixed in this case, because the
    race winner has successfully installed the pte.
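
    A minimal user-space sketch of that decision, assuming a hypothetical
    helper that maps the insert result to a fault outcome. The toy_ names and
    the enum are illustrative and are not the kernel's types; only the errno
    values come from the standard headers.

```c
#include <errno.h>

/* Toy fault outcomes; hypothetical stand-ins for the kernel's fault codes. */
enum toy_fault_result {
	TOY_FAULT_HANDLED,	/* a pte is in place, nothing more to do */
	TOY_FAULT_OOM,		/* out of memory while installing the pte */
	TOY_FAULT_BUG,		/* unexpected error; the real code would BUG */
};

/*
 * Map the return value of the pte insert to a fault outcome. -EBUSY means
 * another thread already installed the pte for this address, so the fault
 * is treated as handled rather than as an error.
 */
static enum toy_fault_result toy_fault_outcome(int insert_err)
{
	if (insert_err == -ENOMEM)
		return TOY_FAULT_OOM;
	if (insert_err == 0 || insert_err == -EBUSY)
		return TOY_FAULT_HANDLED;
	return TOY_FAULT_BUG;
}
```

    The race loser therefore sees the same outcome as the winner, and only
    errors other than -EBUSY and -ENOMEM remain fatal in this model.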

    [akpm@linux-foundation.org: use conventional (and consistent) comment layout]
    Reported-by: David Sadler
    Signed-off-by: Carsten Otte
    Reported-by: Louis Alex Eisner
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Carsten Otte
     
  • commit bda3a47c886664e86ee14eb79e9072b9e341f575 upstream.

    commit 463894705e4089d0ff69e7d877312d496ac70e5b deleted redundant
    chan_id and chancnt initialization in dma drivers as this is done
    in dma_async_device_register().

    However, atc_enable_irq() relied on chan_id set before registering
    the device, which left only channel 0 functional for this driver.

    This patch introduces atc_enable/disable_chan_irq() as a variant
    of atc_enable/disable_irq() with the channel as explicit argument.
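
    The difference between the two variants can be sketched in user space.
    This is a toy model: the bit layout (one completion bit and one error bit
    per channel), the struct, and all toy_ names are assumptions made for
    illustration, not the driver's actual register definitions.

```c
#include <stdint.h>

/* Hypothetical per-channel interrupt bits in a 32-bit enable register. */
#define TOY_BTC(ch)	(UINT32_C(1) << (ch))		/* transfer complete */
#define TOY_ERR(ch)	(UINT32_C(1) << (16 + (ch)))	/* transfer error */

/* Toy channel; chan_id is only meaningful once registration has set it. */
struct toy_chan {
	int chan_id;
};

static uint32_t toy_irq_mask(int chan_id)
{
	return TOY_BTC(chan_id) | TOY_ERR(chan_id);
}

/* Old style: trusts the channel's own chan_id, which may still be 0. */
static uint32_t toy_enable_irq(const struct toy_chan *chan)
{
	return toy_irq_mask(chan->chan_id);
}

/* New style: the caller passes the channel index explicitly. */
static uint32_t toy_enable_chan_irq(int chan_id)
{
	return toy_irq_mask(chan_id);
}
```

    If chan_id has not been assigned yet, the old-style helper computes the
    mask for channel 0 no matter which channel is being set up, whereas the
    explicit variant yields the bits for the intended channel.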

    Signed-off-by: Nikolaus Voss
    Signed-off-by: Nicolas Ferre
    Signed-off-by: Vinod Koul
    Signed-off-by: Greg Kroah-Hartman

    Nikolaus Voss