25 Sep, 2014

3 commits

  • Pull one last block fix from Jens Axboe:
    "We've had an issue with scsi-mq where probing takes forever. This was
    bisected down to the percpu changes for blk_mq_queue_enter(), and the
    fact we now suffer an RCU grace period when killing a queue. SCSI
    creates and destroys tons of queues, so this led to 10s of seconds of
    stalls at boot for some.

    Tejun has a real fix for this, but it's too involved for 3.17. So
    this is a temporary workaround to expedite the queue killing until we
    can fold in the real fix for 3.18 when that merge window opens"

    * 'for-linus' of git://git.kernel.dk/linux-block:
    blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe

    Linus Torvalds
     
  • Pull crypto fixes from Herbert Xu:
    "This fixes three issues:

    - if ccp is loaded on a machine without ccp, it will incorrectly
    activate causing all requests to fail. Fixed by preventing ccp
    from loading if hardware isn't available.

    - not all IRQs were enabled for the qat driver, leading to potential
    stalls when it is used

    - disabled buggy AVX CTR implementation in aesni"

    * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
    crypto: aesni - disable "by8" AVX CTR optimization
    crypto: ccp - Check for CCP before registering crypto algs
    crypto: qat - Enable all 32 IRQs

    Linus Torvalds
     
  • Pull media fixes from Mauro Carvalho Chehab:
    "For some last-minute fixes:
    - a regression detected on Kernel 3.16 related to VBI Teletext
    application breakage on drivers using videobuf2 (see
    https://bugzilla.kernel.org/show_bug.cgi?id=84401). The bug was
    noticed on saa7134 (migrated to VB2 on 3.16), but also affects
    em28xx (migrated on 3.9 to VB2);
    - two additional sanity checks at videobuf2;
    - two fixups to restore proper VBI support at the em28xx driver;
    - two Kernel oops fixups (at cx24123 and cx2341x drivers);
    - a bug at adv7604 where an if was doing just the opposite as it
    would be expected;
    - some documentation fixups to match the behavior defined at the
    Kernel"

    * tag 'media/v3.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
    [media] em28xx-v4l: get rid of field "users" in struct em28xx_v4l2"
    [media] em28xx: fix VBI handling logic
    [media] DocBook media: improve the poll() documentation
    [media] DocBook media: fix the poll() 'no QBUF' documentation
    [media] vb2: fix VBI/poll regression
    [media] cx2341x: fix kernel oops
    [media] cx24123: fix kernel oops due to missing parent pointer
    [media] adv7604: fix inverted condition
    [media] media/radio: fix radio-miropcm20.c build with io.h header file
    [media] vb2: fix plane index sanity check in vb2_plane_cookie()
    [media] DocBook media: update version number and V4L2 changes
    [media] DocBook media: fix fieldname in struct v4l2_subdev_selection
    [media] vb2: fix vb2 state check when start_streaming fails
    [media] videobuf2-core.h: fix comment
    [media] videobuf2-core: add comments before the WARN_ON
    [media] videobuf2-dma-sg: fix for wrong GFP mask to sg_alloc_table_from_pages

    Linus Torvalds
     

24 Sep, 2014

3 commits

  • blk-mq uses percpu_ref for its usage counter, which tracks the number
    of in-flight commands and is used to synchronously drain the queue on
    freeze. percpu_ref shutdown takes measurable wallclock time as it
    involves a sched RCU grace period. This means that draining a blk-mq
    takes measurable wallclock time. One would think that this shouldn't
    matter as queue shutdown should be a rare event which takes place
    asynchronously w.r.t. userland.

    Unfortunately, SCSI probing involves synchronously setting up and then
    tearing down a lot of request_queues back-to-back for non-existent
    LUNs. This means that SCSI probing may take more than ten seconds
    when scsi-mq is used.

    This will be properly fixed by implementing a mechanism to keep
    q->mq_usage_counter in atomic mode till genhd registration; however,
    that involves rather big updates to percpu_ref which is difficult to
    apply late in the devel cycle (v3.17-rc6 at the moment). As a
    stop-gap measure till the proper fix can be implemented in the next
    cycle, this patch introduces __percpu_ref_kill_expedited() and makes
    blk_mq_freeze_queue() use it. This is heavy-handed but should work
    for testing the experimental SCSI blk-mq implementation.

    Signed-off-by: Tejun Heo
    Reported-by: Christoph Hellwig
    Link: http://lkml.kernel.org/g/20140919113815.GA10791@lst.de
    Fixes: add703fda981 ("blk-mq: use percpu_ref for mq usage count")
    Cc: Kent Overstreet
    Cc: Jens Axboe
    Tested-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • If the ccp is built as a built-in module, then ccp-crypto (whether
    built as a module or a built-in module) will be able to load and
    it will register its crypto algorithms. If the system does not have
    a CCP this will result in -ENODEV being returned whenever a command
    is attempted to be queued by the registered crypto algorithms.

    Add an API, ccp_present(), that checks for the presence of a CCP
    on the system. The ccp-crypto module can use this to determine if it
    should register its crypto algorithms.
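
    The check described above can be sketched in user space as follows.
    This is only a model: ccp_device_found, registered_algs and
    register_algs() are hypothetical stand-ins, and the return contract
    of ccp_present() (0 when a CCP exists, -ENODEV otherwise) is an
    assumption made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for "a CCP device was probed on this system". */
static bool ccp_device_found;

/* Hypothetical counter of algorithms the model has registered. */
static int registered_algs;

/* Assumed contract: 0 when a CCP exists, -ENODEV otherwise. */
static int ccp_present(void)
{
    return ccp_device_found ? 0 : -ENODEV;
}

/* Hypothetical registration step; the real module registers several algs. */
static int register_algs(void)
{
    registered_algs = 3;
    return 0;
}

/* Model of the ccp-crypto init path: bail out before registering anything. */
static int ccp_crypto_init(void)
{
    int ret = ccp_present();

    if (ret)
        return ret;

    return register_algs();
}
```

    With no device present the init path fails early, so no algorithm is
    left registered that could only ever return -ENODEV.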

    Cc: stable@vger.kernel.org
    Reported-by: Scot Doyle
    Signed-off-by: Tom Lendacky
    Tested-by: Scot Doyle
    Signed-off-by: Herbert Xu

    Tom Lendacky
     
  • Pull infiniband/rdma fixes from Roland Dreier:
    "Last late set of InfiniBand/RDMA fixes for 3.17:

    - fixes for the new memory region re-registration support
    - iSER initiator error path fixes
    - grab bag of small fixes for the qib and ocrdma hardware drivers
    - larger set of fixes for mlx4, especially in RoCE mode"

    * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (26 commits)
    IB/mlx4: Fix VF mac handling in RoCE
    IB/mlx4: Do not allow APM under RoCE
    IB/mlx4: Don't update QP1 in native mode
    IB/mlx4: Avoid accessing netdevice when building RoCE qp1 header
    mlx4: Fix mlx4 reg/unreg mac to work properly with 0-mac addresses
    IB/core: When marshaling uverbs path, clear unused fields
    IB/mlx4: Avoid executing gid task when device is being removed
    IB/mlx4: Fix lockdep splat for the iboe lock
    IB/mlx4: Get upper dev addresses as RoCE GIDs when port comes up
    IB/mlx4: Reorder steps in RoCE GID table initialization
    IB/mlx4: Don't duplicate the default RoCE GID
    IB/mlx4: Avoid null pointer dereference in mlx4_ib_scan_netdevs()
    IB/iser: Bump version to 1.4.1
    IB/iser: Allow bind only when connection state is UP
    IB/iser: Fix RX/TX CQ resource leak on error flow
    RDMA/ocrdma: Use right macro in query AH
    RDMA/ocrdma: Resolve L2 address when creating user AH
    mlx4: Correct error flows in rereg_mr
    IB/qib: Correct reference counting in debugfs qp_stats
    IPoIB: Remove unnecessary port query
    ...

    Linus Torvalds
     

23 Sep, 2014

3 commits

  • Pull networking fixes from David Miller:

    1) If the user gives us a msg_namelen of 0, don't try to interpret
    anything pointed to by msg_name. From Ani Sinha.

    2) Fix some bnx2i/bnx2fc randconfig compilation errors.

    The gist of the issue is that we firstly have drivers that span both
    SCSI and networking. And at the top of that chain of dependencies
    we have things like SCSI_FC_ATTRS and SCSI_NETLINK which are
    selected.

    But since select is a sledgehammer and ignores dependencies,
    everything that selects SCSI_FC_ATTRS and/or SCSI_NETLINK has to also
    explicitly select their dependencies and so on and so forth.

    Generally speaking 'select' is supposed to only be used for child
    nodes, those which have no dependencies of their own. And this
    whole chain of dependencies in the scsi layer violates that rather
    strongly.

    So just make SCSI_NETLINK depend upon its dependencies, and so on
    and so forth for the things selecting it (either directly or
    indirectly).

    From Anish Bhatt and Randy Dunlap.

    3) Fix generation of blackhole routes in IPSEC, from Steffen Klassert.

    4) Actually notice netdev feature changes in rtl_open() code, from
    Hayes Wang.

    5) Fix divide by zero in bond enslaving, from Nikolay Aleksandrov.

    6) Missing memory barrier in sunvnet driver, from David Stevens.

    7) Don't leave anycast addresses around when ipv6 interface is
    destroyed, from Sabrina Dubroca.

    8) Don't call efx_{arch}_filter_sync_rx_mode before addr_list_lock is
    initialized in SFC driver, from Edward Cree.

    9) Fix missing DMA error checking in 3c59x, from Neal Horman.

    10) Openvswitch doesn't emit OVS_FLOW_CMD_NEW notifications accidentally,
    fix from Samuel Gauthier.

    11) pch_gbe needs to select NET_PTP_CLASSIFY otherwise we can get a
    build error.

    12) Fix macvlan regression wherein we stopped emitting
    broadcast/multicast frames over software devices. From Nicolas
    Dichtel.

    13) Fix infiniband bug due to unintended overflow of skb->cb[], from
    Eric Dumazet. And add an assertion so this doesn't happen again.

    14) dm9000_parse_dt() should return error pointers, not NULL. From
    Tobias Klauser.

    15) IP tunneling code uses this_cpu_ptr() in preemptible contexts, fix
    from Eric Dumazet.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
    net: bcmgenet: call bcmgenet_dma_teardown in bcmgenet_fini_dma
    net: bcmgenet: fix TX reclaim accounting for fragments
    ipv4: do not use this_cpu_ptr() in preemptible context
    dm9000: Return an ERR_PTR() in all error conditions of dm9000_parse_dt()
    r8169: fix an if condition
    r8152: disable ALDPS
    ipoib: validate struct ipoib_cb size
    net: sched: shrink struct qdisc_skb_cb to 28 bytes
    tg3: Work around HW/FW limitations with vlan encapsulated frames
    macvlan: allow to enqueue broadcast pkt on virtual device
    pch_gbe: 'select' NET_PTP_CLASSIFY.
    scsi: Use 'depends' with LIBFC instead of 'select'.
    openvswitch: restore OVS_FLOW_CMD_NEW notifications
    genetlink: add function genl_has_listeners()
    lib: rhashtable: remove second linux/log2.h inclusion
    net: allow macvlans to move to net namespace
    3c59x: Fix bad offset spec in skb_frag_dma_map
    3c59x: Add dma error checking and recovery
    sparc: bpf_jit: fix support for ldx/stx mem and SKF_AD_VLAN_TAG
    can: at91_can: add missing prepare and unprepare of the clock
    ...

    Linus Torvalds
     
  • Steffen Klassert says:

    ====================
    pull request (net): ipsec 2014-09-22

    We generate a blackhole or queueing route if a packet
    matches an IPsec policy but a state can't be resolved.
    Here we assume that dst_output() is called to kill
    these packets. Unfortunately this assumption is not
    true in all cases, so it is possible that these packets
    leave the system without the necessary transformations.

    This pull request contains two patches to fix this issue:

    1) Fix for blackhole routed packets.

    2) Fix for queue routed packets.

    Both patches are serious stable candidates.

    Please pull or let me know if there are problems.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • We cannot make struct qdisc_skb_cb bigger without impacting IPoIB,
    or increasing skb->cb[] size.

    Commit e0f31d849867 ("flow_keys: Record IP layer protocol in
    skb_flow_dissect()") broke IPoIB.

    The only current offender is sch_choke, and this one does not need an
    absolutely precise flow key.

    If we store 17 bytes of flow key, it's more than enough. (It's the actual
    size of flow_keys if it was a packed structure, but we might add new
    fields at the end of it later)
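
    The size constraint can be illustrated with a compile-time check.
    This is a sketch under assumptions: the struct layouts, field names
    and the 48-byte size of skb->cb[] below are modeled for illustration
    and are not copied from the kernel headers.

```c
#include <stddef.h>
#include <stdint.h>

#define SKB_CB_SIZE       48  /* assumed size of skb->cb[] */
#define QDISC_CB_PRIV_LEN 20  /* assumed private area for qdiscs */
#define INFINIBAND_ALEN   20  /* IPoIB hardware address length */

/* Modeled qdisc control block: 4 + 2 + 2 + 20 = 28 bytes. */
struct qdisc_skb_cb_model {
    uint32_t pkt_len;
    uint16_t slave_dev_queue_mapping;
    uint16_t pad;
    unsigned char data[QDISC_CB_PRIV_LEN];
};

/* Modeled IPoIB control block: it embeds the qdisc one. */
struct ipoib_cb_model {
    struct qdisc_skb_cb_model qdisc_cb;
    uint8_t hwaddr[INFINIBAND_ALEN];
};

/* Fail the build, not the running system, if a layout grows too far. */
_Static_assert(sizeof(struct qdisc_skb_cb_model) == 28,
               "qdisc cb model must stay at 28 bytes");
_Static_assert(sizeof(struct ipoib_cb_model) <= SKB_CB_SIZE,
               "ipoib cb model must fit in skb->cb[]");

/* Run-time helper: bytes left in cb[] after the IPoIB control block. */
static size_t ipoib_cb_slack(void)
{
    return SKB_CB_SIZE - sizeof(struct ipoib_cb_model);
}
```

    In this model there is no slack left, which is why growing the qdisc
    part by even a few bytes overflows the buffer for IPoIB.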

    Signed-off-by: Eric Dumazet
    Fixes: e0f31d849867 ("flow_keys: Record IP layer protocol in skb_flow_dissect()")
    Signed-off-by: David S. Miller

    Eric Dumazet
     

22 Sep, 2014

3 commits

  • Pull workqueue fix from Tejun Heo:
    "create_singlethread_workqueue() is the old interface which is kept
    around for backward compatibility - each should be reviewed to
    determine whether singlethread usage was to save worker threads or for
    ordering guarantee and whether it's depended upon by memory reclaim
    path.

    While adding NUMA support for unbound workqueues during v3.10, I
    forgot to update it breaking the singlethread and ordering properties
    on NUMA setups. The breakage was unfortunately rather subtle and went
    without being reported until now.

    The only missing piece is __WQ_ORDERED flag which makes the unbounded
    workqueue use a single backend queue across different NUMA nodes.
    It's fixed by making create_singlethread_workqueue() wrap
    alloc_ordered_workqueue() so that possible future updates are
    inherited automatically"

    * 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    workqueue: apply __WQ_ORDERED to create_singlethread_workqueue()

    Linus Torvalds
     
  • The recent conversion of saa7134 to vb2 uncovered a poll() bug that
    broke the teletext applications alevt and mtt. These applications
    expect that calling poll() without having called VIDIOC_STREAMON will
    cause poll() to return POLLERR. That did not happen in vb2.

    This patch fixes that behavior. It also fixes what should happen when
    poll() is called when STREAMON is called but no buffers have been
    queued. In that case poll() will also return POLLERR, but only for
    capture queues since output queues will always return POLLOUT
    anyway in that situation.

    This brings the vb2 behavior in line with the old videobuf behavior.
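
    The cases described above can be written down as a small decision
    function. This is a user-space model only: struct queue_model and
    poll_model() are hypothetical, and the final "return 0" branch is a
    simplification covering everything the text does not describe.

```c
#include <poll.h>
#include <stdbool.h>

/* Hypothetical snapshot of the queue state that the decision depends on. */
struct queue_model {
    bool streaming;       /* STREAMON has been called */
    bool is_output;       /* output queue rather than capture queue */
    unsigned int queued;  /* buffers queued by the application */
};

/* Model of the result poll() reports for the cases in the text. */
static short poll_model(const struct queue_model *q)
{
    /* poll() before STREAMON is an error. */
    if (!q->streaming)
        return POLLERR;

    if (q->queued == 0) {
        /* An output queue with nothing queued is always writable. */
        if (q->is_output)
            return POLLOUT;
        /* A capture queue with nothing queued can never complete. */
        return POLLERR;
    }

    /* Simplification: buffers are queued, the caller would wait here. */
    return 0;
}
```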

    Signed-off-by: Hans Verkuil
    Acked-by: Laurent Pinchart
    Signed-off-by: Mauro Carvalho Chehab

    Hans Verkuil
     
  • The comment for start_streaming that tells the developer with which vb2 state
    buffers should be returned to vb2 gave the wrong state. Very confusing.

    Signed-off-by: Hans Verkuil
    Acked-by: Laurent Pinchart
    Signed-off-by: Mauro Carvalho Chehab

    Hans Verkuil
     

21 Sep, 2014

2 commits

  • Pull staging / IIO fixes from Greg KH:
    "Here are some IIO and Staging driver fixes for 3.17-rc6. They are all
    pretty simple, and resolve reported issues"

    * tag 'staging-3.17-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
    staging: vt6655: buffer overflow in ioctl
    iio:magnetometer: bugfix magnetometers gain values
    iio: adc: at91: don't use the last converted data register
    iio: adc: xilinx-xadc: assign auxiliary channels address correctly
    iio: meter: ade7758: Fix indio_dev->trig assignment
    iio: inv_mpu6050: Fix indio_dev->trig assignment
    iio: gyro: itg3200: Fix indio_dev->trig assignment
    iio: st_sensors: Fix indio_dev->trig assignment
    iio: hid_sensor_hub: Fix indio_dev->trig assignment
    iio: adc: ad_sigma_delta: Fix indio_dev->trig assignment
    iio: accel: bma180: Fix indio_dev->trig assignment
    iio:trigger: modify return value for iio_trigger_get
    iio:inkern: fix overwritten -EPROBE_DEFER in of_iio_channel_get_by_name

    Linus Torvalds
     
  • Pull drm fixes from Dave Airlie:
    "A bunch of radeon fixes for oops on module unload, and problems with
    resetting the dma engine, one nouveau fix for black boxes in rendering
    on my mbp retina, one sti fix, and a couple of intel fixes"

    * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
    drm/nouveau: ltc/gf100-: fix cbc issues on certain boards
    drm/bochs: add missing drm_connector_register call
    drm/cirrus: add missing drm_connector_register call
    drm/radeon: Fix typo 'addr' -> 'entry' in rs400_gart_set_page
    drm/nouveau/runpm: fix module unload
    drm/radeon/px: fix module unload
    vgaswitcheroo: add vga_switcheroo_fini_domain_pm_ops
    drm/radeon: don't reset dma on r6xx-evergreen init
    drm/radeon: don't reset sdma on CIK init
    drm/radeon: don't reset dma on NI/SI init
    drm/radeon/dpm: fix resume on mullins
    drm/radeon: Disable HDP flush before every CS again for < r600
    drm/radeon: delete unused PTE_* defines
    drm/i915: Add limited color range readout for HDMI/DP ports on g4x/vlv/chv
    drm: sti: do not iterate over the info frame array
    drm/i915: Fix SRC_COPY width on 830/845g

    Linus Torvalds
     

20 Sep, 2014

4 commits

  • …23/iio into staging-linus

    Jonathan writes:

    First round of IIO fixes for the 3.17 cycle.

    * Fix an overwritten error return that can prevent deferred probing when
    using of_iio_channel_get_by_name
    * A series that deals with an incorrect reference count when the default
    trigger is set within the main probe routine for a driver. Can result
    in a double free if the trigger is changed.
    * Fix a buglet with xilinx-xadc concerning setup of the address for an
    aux channel.
    * At91 adc driver could sometimes get a touchscreen reading rather than
    the intended adc channel. This is fixed by using the channel data register
    instead.
    * Fix some ST magnetometer gain values that differ in production parts from
    the prerelease ones used for driver development.

    Greg Kroah-Hartman
     
  • This function is the counterpart of the function netlink_has_listeners().

    Signed-off-by: Nicolas Dichtel
    Acked-by: Pravin B Shelar
    Signed-off-by: David S. Miller

    Nicolas Dichtel
     
  • Pull PCI fixes from Bjorn Helgaas:
    "These fix:

    - Boot video device detection on dual-GPU Apple systems
    - Hotplug fiascos on VGA switcheroo with radeon & nouveau drivers
    - Boot hang on Freescale i.MX6 systems
    - Excessive "no hotplug settings from platform" warnings

    In particular:

    Enumeration
    - Don't default exclusively to first video device (Bruno Prémont)

    PCI device hotplug
    - Remove "no hotplug settings from platform" warning (Bjorn Helgaas)
    - Add pci_ignore_hotplug() for VGA switcheroo (Bjorn Helgaas)

    Freescale i.MX6
    - Put LTSSM in "Detect" state before disabling (Lucas Stach)"

    * tag 'pci-v3.17-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
    vgaarb: Drop obsolete #ifndef
    vgaarb: Don't default exclusively to first video device with mem+io
    ACPIPHP / radeon / nouveau: Remove acpi_bus_no_hotplug()
    PCI: Remove "no hotplug settings from platform" warning
    PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device
    PCI: imx6: Put LTSSM in "Detect" state before disabling it
    MAINTAINERS: Add Lucas Stach as co-maintainer for i.MX6 PCI driver

    Linus Torvalds
     
  • In debugging an application that receives -ENOMEM from ib_reg_mr(), I
    found that ib_umem_get() can fail because the pinned_vm count has
    wrapped causing it to always be larger than the lock limit even with
    RLIMIT_MEMLOCK set to RLIM_INFINITY.

    The wrapping of pinned_vm occurs because the process that calls
    ib_reg_mr() will have its mm->pinned_vm count incremented. Later a
    different process with a different mm_struct than the one that
    allocated the ib_umem struct ends up releasing it which results in
    decrementing the new process's mm->pinned_vm count past zero and
    wrapping.

    I'm not entirely sure what circumstances cause a different process to
    release the ib_umem than the one that allocated it but the kernel
    stack trace of the freeing process from my situation looks like the
    following:

    Call Trace:
    [] dump_stack+0x19/0x1b
    [] ib_umem_release+0x1f5/0x200 [ib_core]
    [] mlx4_ib_destroy_qp+0x241/0x440 [mlx4_ib]
    [] ib_destroy_qp+0x12c/0x170 [ib_core]
    [] ib_uverbs_close+0x259/0x4e0 [ib_uverbs]
    [] __fput+0xba/0x240
    [] ____fput+0xe/0x10
    [] task_work_run+0xc4/0xe0
    [] do_notify_resume+0x95/0xa0
    [] int_signal+0x12/0x17

    The following patch fixes the issue by storing the pid struct of the
    process that calls ib_umem_get() so that ib_umem_release and/or
    ib_umem_account() can properly decrement the pinned_vm count of the
    correct mm_struct.
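
    The wrap can be reproduced with a small user-space model. Everything
    here is hypothetical: struct mm_model and struct umem_model stand in
    for mm_struct and ib_umem, and the model remembers the owner with a
    plain pointer where the patch stores a pid struct.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pinned page accounting of one mm_struct. */
struct mm_model {
    unsigned long pinned_vm;
};

/* Hypothetical stand-in for ib_umem; owner models the stored pid. */
struct umem_model {
    unsigned long npages;
    struct mm_model *owner;
};

/* Pin pages: charge the caller and remember who was charged. */
static void umem_get(struct umem_model *u, struct mm_model *caller,
                     unsigned long npages)
{
    u->npages = npages;
    u->owner = caller;
    caller->pinned_vm += npages;
}

/* Buggy release: uncharges whichever mm happens to be running. */
static void umem_release_buggy(struct umem_model *u, struct mm_model *current_mm)
{
    current_mm->pinned_vm -= u->npages; /* may wrap below zero */
}

/* Fixed release: uncharges the mm that was charged in umem_get(). */
static void umem_release_fixed(struct umem_model *u)
{
    u->owner->pinned_vm -= u->npages;
}

/* Limit check in pages, modeled on "locked > lock_limit". */
static bool over_limit(const struct mm_model *mm, unsigned long npages,
                       unsigned long limit_pages)
{
    return mm->pinned_vm + npages > limit_pages;
}
```

    After a buggy release from the wrong process, that process's counter
    sits just below ULONG_MAX, so even an unlimited RLIMIT_MEMLOCK
    (shifted down to pages) is exceeded by any further registration.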

    Signed-off-by: Shawn Bohrer
    Reviewed-by: Shachar Raindel
    Signed-off-by: Roland Dreier

    Shawn Bohrer
     

19 Sep, 2014

4 commits


17 Sep, 2014

1 commit

  • Commit 20cde694027e ("x86, ia64: Move EFI_FB vga_default_device()
    initialization to pci_vga_fixup()") moved boot video device detection from
    efifb to x86 and ia64 pci/fixup.c.

    Remove the left-over #ifndef check that will always match since the
    corresponding arch-specific define is gone with above patch.

    Signed-off-by: Bruno Prémont
    Signed-off-by: Bjorn Helgaas
    CC: Matthew Garrett

    Bruno Prémont
     

16 Sep, 2014

3 commits

  • Currently we generate a queueing route if we have matching policies
    but can not resolve the states and the sysctl xfrm_larval_drop is
    disabled. Here we assume that dst_output() is called to kill the
    queued packets. Unfortunately this assumption is not true in all
    cases, so it is possible that these packets leave the system unwanted.

    We fix this by generating queueing routes only from the
    route lookup functions, here we can guarantee a call to
    dst_output() afterwards.

    Fixes: a0073fe18e71 ("xfrm: Add a state resolution packet queue")
    Reported-by: Konstantinos Kolelis
    Signed-off-by: Steffen Klassert

    Steffen Klassert
     
  • Currently we generate a blackhole route whenever we have
    matching policies but can not resolve the states. Here we assume
    that dst_output() is called to kill the blackholed packets.
    Unfortunately this assumption is not true in all cases, so
    it is possible that these packets leave the system unwanted.

    We fix this by generating blackhole routes only from the
    route lookup functions, here we can guarantee a call to
    dst_output() afterwards.

    Fixes: 2774c131b1d ("xfrm: Handle blackhole route creation via afinfo.")
    Reported-by: Konstantinos Kolelis
    Signed-off-by: Steffen Klassert

    Steffen Klassert
     
  • Revert parts of f244d8b623da ("ACPIPHP / radeon / nouveau: Fix VGA
    switcheroo problem related to hotplug").

    A previous commit 5493b31f0b55 ("PCI: Add pci_ignore_hotplug() to ignore
    hotplug events for a device") added equivalent functionality implemented in
    a different way for both acpiphp and pciehp.

    Signed-off-by: Bjorn Helgaas
    Acked-by: Alex Deucher
    Acked-by: Rafael J. Wysocki
    Acked-by: Dave Airlie
    Acked-by: Rajat Jain

    Bjorn Helgaas
     

15 Sep, 2014

2 commits

  • Pull crypto fixes from Herbert Xu:
    "This fixes the newly added drbg generator so that it actually works on
    32-bit machines. Previously the code was only tested on 64-bit and on
    32-bit it overflowed and simply doesn't work"

    * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
    crypto: drbg - remove check for uninitialized DRBG handle
    crypto: drbg - backport "fix maximum value checks on 32 bit systems"

    Linus Torvalds
     
  • The performance regression that Josef Bacik reported in the pathname
    lookup (see commit 99d263d4c5b2 "vfs: fix bad hashing of dentries") made
    me look at performance stability of the dcache code, just to verify that
    the problem was actually fixed. That turned up a few other problems in
    this area.

    There are a few cases where we exit RCU lookup mode and go to the slow
    serializing case when we shouldn't, Al has fixed those and they'll come
    in with the next VFS pull.

    But my performance verification also shows that link_path_walk() turns
    out to have a very unfortunate 32-bit store of the length and hash of
    the name we look up, followed by a 64-bit read of the combined hash_len
    field. That screws up the processor store to load forwarding, causing
    an unnecessary hiccup in this critical routine.

    It's caused by the ugly calling convention for the "hash_name()"
    function, and easily fixed by just making hash_name() fill in the whole
    'struct qstr' rather than passing it a pointer to just the hash value.

    With that, the profile for this function looks much smoother.
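
    The idea of producing hash and length as one 64-bit value can be
    sketched as follows. This is an illustration only: struct qstr_model,
    fill_name() and its toy hash are hypothetical, and the helper names
    are modeled here rather than taken from the kernel headers.

```c
#include <stdint.h>

/* Pack length (high 32 bits) and hash (low 32 bits) into one value. */
static inline uint64_t hashlen_create(uint32_t hash, uint32_t len)
{
    return ((uint64_t)len << 32) | hash;
}

static inline uint32_t hashlen_hash(uint64_t hash_len)
{
    return (uint32_t)hash_len;
}

static inline uint32_t hashlen_len(uint64_t hash_len)
{
    return (uint32_t)(hash_len >> 32);
}

/* Hypothetical stand-in for struct qstr with only the combined field. */
struct qstr_model {
    uint64_t hash_len;
    const char *name;
};

/*
 * Hash one path component and fill in the whole structure. The combined
 * field is written with a single 64-bit store, so a later 64-bit load of
 * hash_len does not have to merge two separate 32-bit stores.
 */
static void fill_name(struct qstr_model *q, const char *name)
{
    uint32_t hash = 0, len = 0;

    /* Toy hash, not the kernel's; stops at '/' or end of string. */
    while (name[len] != '\0' && name[len] != '/') {
        hash = hash * 31 + (unsigned char)name[len];
        len++;
    }

    q->name = name;
    q->hash_len = hashlen_create(hash, len);
}
```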

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

14 Sep, 2014

2 commits

  • …/git.kernel.org/pub/scm/linux/kernel/git/tip/tip

    Pull futex and timer fixes from Thomas Gleixner:
    "A oneliner bugfix for the jinxed futex code:

    - Drop hash bucket lock in the error exit path. I really could slap
    myself for introducing that bug while fixing all the other horror
    in that code three months ago ...

    and the timer department is not too proud about the following fixes:

    - Deal with a long standing rounding bug in the timeval to jiffies
    conversion. It's a real issue and this fix fell through the cracks
    for quite some time.

    - Another round of alarmtimer fixes. Finally this code gets used
    more widely and the subtle issues hidden for quite some time are
    noticed and fixed. Nothing really exciting, just the itty bitty
    details which bite the serious users here and there"

    * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    futex: Unlock hb->lock in futex_wait_requeue_pi() error path

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    alarmtimer: Lock k_itimer during timer callback
    alarmtimer: Do not signal SIGEV_NONE timers
    alarmtimer: Return relative times in timer_gettime
    jiffies: Fix timeval conversion to jiffies

    Linus Torvalds
     
  • The hash_64() function historically does the multiply by the
    GOLDEN_RATIO_PRIME_64 number with explicit shifts and adds, because
    unlike the 32-bit case, gcc seems unable to turn the constant multiply
    into the more appropriate shift and adds when required.

    However, that means that we generate those shifts and adds even when the
    architecture has a fast multiplier, and could just do it better in
    hardware.

    Use the now-cleaned-up CONFIG_ARCH_HAS_FAST_MULTIPLIER (together with
    "is it a 64-bit architecture") to decide whether to use an integer
    multiply or the explicit sequence of shift/add instructions.
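
    For illustration, the two variants can be compared in user space.
    This is a sketch assuming the prime is 0x9e37fffffffc0001, which
    decomposes as 1 - 2^18 - 2^51 + 2^54 - 2^57 + 2^61 + 2^63; the
    function names are hypothetical, and bits must be between 1 and 64.

```c
#include <stdint.h>

/* Assumed value of GOLDEN_RATIO_PRIME_64. */
#define GOLDEN_RATIO_PRIME_64 0x9e37fffffffc0001ULL

/* Variant for machines with a fast multiplier: one integer multiply. */
static uint64_t hash_64_multiply(uint64_t val, unsigned int bits)
{
    uint64_t hash = val * GOLDEN_RATIO_PRIME_64;

    /* The high bits are the best mixed, so keep those. */
    return hash >> (64 - bits);
}

/* Variant for slow multipliers: the same product as shifts and adds. */
static uint64_t hash_64_shift_add(uint64_t val, unsigned int bits)
{
    uint64_t hash = val;
    uint64_t n = val;

    n <<= 18;   /* n = val << 18 */
    hash -= n;
    n <<= 33;   /* n = val << 51 */
    hash -= n;
    n <<= 3;    /* n = val << 54 */
    hash += n;
    n <<= 3;    /* n = val << 57 */
    hash -= n;
    n <<= 4;    /* n = val << 61 */
    hash += n;
    n <<= 2;    /* n = val << 63 */
    hash += n;

    return hash >> (64 - bits);
}
```

    Both functions compute the same value modulo 2^64, so the choice
    between them only affects the generated instructions.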

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

13 Sep, 2014

5 commits

  • …linux/kernel/git/xen/tip

    Pull Xen ARM bugfix from Stefano Stabellini:
    "The patches fix the "xen_add_mach_to_phys_entry: cannot add" bug that
    has been affecting xen on arm and arm64 guests since 3.16. They
    require a few hypervisor side changes that just went in xen-unstable.

    A couple of days ago David sent out a pull request with a few other
    Xen fixes (it is already in master). Sorry we didn't synchronize
    better among us"

    * tag 'stable/for-linus-3.17-b-rc4-arm-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen/arm: remove mach_to_phys rbtree
    xen/arm: reimplement xen_dma_unmap_page & friends
    xen/arm: introduce XENFEAT_grant_map_identity

    Linus Torvalds
     
  • If we try to rmmod the driver for an interface while sockets with
    setsockopt(JOIN_ANYCAST) are alive, some refcounts aren't cleaned up
    and we get stuck on:

    unregister_netdevice: waiting for ens3 to become free. Usage count = 1

    If we LEAVE_ANYCAST/close everything before rmmod'ing, there is no
    problem.

    We need to perform a cleanup similar to the one for multicast in
    addrconf_ifdown(how == 1).

    Signed-off-by: Sabrina Dubroca
    Acked-by: Hannes Frederic Sowa
    Signed-off-by: David S. Miller

    Sabrina Dubroca
     
  • timeval_to_jiffies tried to round a timeval up to an integral number
    of jiffies, but the logic for doing so was incorrect: intervals
    corresponding to exactly N jiffies would become N+1. This manifested
    itself particularly when repeatedly stopping/starting an itimer:

    setitimer(ITIMER_PROF, &val, NULL);
    setitimer(ITIMER_PROF, NULL, &val);

    would add a full tick to val, _even if it was exactly representable in
    terms of jiffies_ (say, the result of a previous rounding.) Doing
    this repeatedly would cause unbounded growth in val. So fix the math.

    Here's what was wrong with the conversion: we essentially computed
    (eliding seconds)

    jiffies = usec * (NSEC_PER_USEC/TICK_NSEC)

    by using scaling arithmetic, which took the best approximation of
    NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC =
    x/(2^USEC_JIFFIE_SC), and computed:

    jiffies = (usec * x) >> USEC_JIFFIE_SC

    and rounded this calculation up in the intermediate form (since we
    can't necessarily exactly represent TICK_NSEC in usec.) But the
    scaling arithmetic is a (very slight) *over*approximation of the true
    value; that is, instead of dividing by (1 usec/ 1 jiffie), we
    effectively divided by (1 usec/1 jiffie)-epsilon (rounding
    down). This would normally be fine, but we want to round timeouts up,
    and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this
    would be fine if our division was exact, but dividing this by the
    slightly smaller factor was equivalent to adding just _over_ 1 to the
    final result (instead of just _under_ 1, as desired.)

    In particular, with HZ=1000, we consistently computed that 10000 usec
    was 11 jiffies; the same was true for any exact multiple of
    TICK_NSEC.

    We could possibly still round in the intermediate form, adding
    something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
    convert usec->nsec, round in nanoseconds, and then convert using
    time*spec*_to_jiffies. This adds one constant multiplication, and is
    not observably slower in microbenchmarks on recent x86 hardware.
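
    The arithmetic above can be checked with a small user-space model.
    This is a sketch under assumptions: HZ=1000, TICK_NSEC taken as
    exactly 1000000, and a scaling shift of 40; the function names are
    hypothetical and the second function uses an exact division instead
    of the kernel's scaled timespec conversion.

```c
#include <stdint.h>

/* Assumed constants for HZ=1000; not copied from the kernel headers. */
#define NSEC_PER_USEC   1000ULL
#define TICK_NSEC       1000000ULL
#define USEC_JIFFIE_SC  40

/* x = NSEC_PER_USEC/TICK_NSEC scaled by 2^USEC_JIFFIE_SC, rounded up. */
static uint64_t usec_conversion(void)
{
    return ((NSEC_PER_USEC << USEC_JIFFIE_SC) + TICK_NSEC - 1) / TICK_NSEC;
}

/* Old scheme: round up in the scaled intermediate form. */
static uint64_t usec_to_jiffies_scaled(uint64_t usec)
{
    uint64_t round = (1ULL << USEC_JIFFIE_SC) - 1;

    /* x is slightly too large, so adding (2^SC - 1) can overshoot. */
    return (usec * usec_conversion() + round) >> USEC_JIFFIE_SC;
}

/* Fixed idea: convert to nanoseconds first, then round up to ticks. */
static uint64_t usec_to_jiffies_nsec(uint64_t usec)
{
    uint64_t nsec = usec * NSEC_PER_USEC;

    return (nsec + TICK_NSEC - 1) / TICK_NSEC;
}
```

    Under these assumptions usec_to_jiffies_scaled(10000) yields 11,
    while usec_to_jiffies_nsec(10000) yields 10. Here x is 1099511628, so
    10000 * x exceeds 10 * 2^40 by 2240, and adding 2^40 - 1 pushes the
    sum past 11 * 2^40.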

    Tested: the following program:

    #include <stdio.h>
    #include <sys/time.h>

    int main(void) {
        struct itimerval zero = {{0, 0}, {0, 0}};
        /* Initially set to 10 ms. */
        struct itimerval initial = zero;
        initial.it_interval.tv_usec = 10000;
        setitimer(ITIMER_PROF, &initial, NULL);
        /* Save and restore several times. */
        for (size_t i = 0; i < 10; ++i) {
            struct itimerval prev;
            setitimer(ITIMER_PROF, &zero, &prev);
            /* on old kernels, this goes up by TICK_USEC every iteration */
            printf("previous value: %ld %ld %ld %ld\n",
                   (long)prev.it_interval.tv_sec, (long)prev.it_interval.tv_usec,
                   (long)prev.it_value.tv_sec, (long)prev.it_value.tv_usec);
            setitimer(ITIMER_PROF, &prev, NULL);
        }
        return 0;
    }

    Cc: stable@vger.kernel.org
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: Paul Turner
    Cc: Richard Cochran
    Cc: Prarit Bhargava
    Reviewed-by: Paul Turner
    Reported-by: Aaron Jacobs
    Signed-off-by: Andrew Hunter
    [jstultz: Tweaked to apply to 3.17-rc]
    Signed-off-by: John Stultz

    Andrew Hunter
     
  • create_singlethread_workqueue() is a compat interface for single
    threaded workqueue which maps to ordered workqueue w/ rescuer in the
    current implementation. create_singlethread_workqueue() is currently
    implemented by invoking alloc_workqueue() w/ appropriate parameters.

    8719dceae2f9 ("workqueue: reject adjusting max_active or applying
    attrs to ordered workqueues") introduced __WQ_ORDERED to protect
    ordered workqueues against dynamic attribute changes which can break
    ordering guarantees but forgot to apply it to
    create_singlethread_workqueue(). This in itself is okay as nobody
    currently uses dynamic attribute change on workqueues created with
    create_singlethread_workqueue().

    However, 4c16bd327c ("workqueue: implement NUMA affinity for unbound
    workqueues") broke the single-threaded guarantee for ordered workqueues
    by allocating a separate pool_workqueue on each NUMA node by
    default. A later change 8a2b75384444 ("workqueue: fix ordered
    workqueues in NUMA setups") fixed it by allocating only one global
    pool_workqueue if __WQ_ORDERED is set.

    Combined, the __WQ_ORDERED omission in create_singlethread_workqueue()
    became critical, breaking its single-threadedness and ordering
    guarantee.

    Let's make create_singlethread_workqueue() wrap
    alloc_ordered_workqueue() instead so that it inherits __WQ_ORDERED and
    can implicitly track future ordered_workqueue changes.
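
    The effect of the change can be sketched in user space. This is a
    minimal mock, not kernel code: every sk_ name, the flag values, and
    the exact flag set passed by the old wrapper are assumptions made for
    illustration. It only shows why wrapping the ordered allocator makes
    the ordered flag impossible to forget.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits only; the real values are defined by the kernel. */
enum {
    SK_WQ_UNBOUND     = 1 << 0,
    SK_WQ_MEM_RECLAIM = 1 << 1,
    SK_WQ_ORDERED     = 1 << 2   /* stands in for __WQ_ORDERED */
};

/* Hypothetical stand-in for a workqueue: just records how it was created. */
struct sk_workqueue {
    unsigned int flags;
    int max_active;
};

/* Stand-in for alloc_workqueue(): stores its arguments unchanged. */
static struct sk_workqueue sk_alloc_workqueue(unsigned int flags, int max_active)
{
    struct sk_workqueue wq = { flags, max_active };
    return wq;
}

/* Stand-in for alloc_ordered_workqueue(): always adds the ordered flag. */
static struct sk_workqueue sk_alloc_ordered_workqueue(unsigned int flags)
{
    return sk_alloc_workqueue(SK_WQ_UNBOUND | SK_WQ_ORDERED | flags, 1);
}

/* Old shape (assumed): open-coded call that omits the ordered flag. */
static struct sk_workqueue sk_create_singlethread_old(void)
{
    return sk_alloc_workqueue(SK_WQ_UNBOUND | SK_WQ_MEM_RECLAIM, 1);
}

/* New shape: wrap the ordered allocator so the flag is inherited. */
static struct sk_workqueue sk_create_singlethread_new(void)
{
    return sk_alloc_ordered_workqueue(SK_WQ_MEM_RECLAIM);
}

static bool sk_is_ordered(const struct sk_workqueue *wq)
{
    return (wq->flags & SK_WQ_ORDERED) != 0;
}

/* Compare both shapes: only the new wrapper yields an ordered queue. */
static void sk_demo(void)
{
    struct sk_workqueue old_wq = sk_create_singlethread_old();
    struct sk_workqueue new_wq = sk_create_singlethread_new();

    assert(!sk_is_ordered(&old_wq));
    assert(sk_is_ordered(&new_wq));
}
```

    Any later behavior keyed on the ordered flag then applies to the
    wrapper automatically, which is the "implicitly track" point above.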

    v2: I missed that __WQ_ORDERED now protects against pwq splitting
    across NUMA nodes and incorrectly described the patch as a
    nice-to-have fix to protect against future dynamic attribute
    usages. Oleg pointed out that this is actually a critical
    breakage due to 8a2b75384444 ("workqueue: fix ordered workqueues
    in NUMA setups").

    Signed-off-by: Tejun Heo
    Reported-by: Mike Anderson
    Cc: Oleg Nesterov
    Cc: Gustavo Luiz Duarte
    Cc: Tomas Henzl
    Cc: stable@vger.kernel.org
    Fixes: 4c16bd327c ("workqueue: implement NUMA affinity for unbound workqueues")

    Tejun Heo
     
  • Pull USB fixes from Greg KH:
    "Here are some USB and PHY fixes for 3.17-rc5.

    Nothing major here, just a number of tiny fixes for reported issues,
    and some new device ids as well.

    All have been tested in linux-next"

    * tag 'usb-3.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (46 commits)
    xhci: fix oops when xhci resumes from hibernate with hw lpm capable devices
    usb: xhci: Fix OOPS in xhci error handling code
    xhci: Fix null pointer dereference if xhci initialization fails
    storage: Add single-LUN quirk for Jaz USB Adapter
    uas: Add missing le16_to_cpu calls to asm1051 / asm1053 usb-id check
    usb: chipidea: msm: Initialize PHY on reset event
    usb: chipidea: msm: Use USB PHY API to control PHY state
    usb: hub: take hub->hdev reference when processing from eventlist
    uas: Disable uas on ASM1051 devices
    usb: dwc2/gadget: avoid disabling ep0
    usb: dwc2/gadget: delay enabling irq once hardware is configured properly
    usb: dwc2/gadget: do not call disconnect method in pullup
    usb: dwc2/gadget: break infinite loop in endpoint disable code
    usb: dwc2/gadget: fix phy initialization sequence
    usb: dwc2/gadget: fix phy disable sequence
    uwb: init beacon cache entry before registering uwb device
    USB: ftdi_sio: Add support for GE Healthcare Nemo Tracker device
    USB: document the 'u' flag for usb-storage quirks parameter
    usb: host: xhci: fix compliance mode workaround
    usb: dwc3: fix TRB completion when multiple TRBs are started
    ...

    Linus Torvalds
     

12 Sep, 2014

2 commits

  • The flag tells us that the hypervisor maps a grant page to guest
    physical address == machine address of the page in addition to the
    normal grant mapping address. It is needed to properly issue cache
    maintenance operation at the completion of a DMA operation involving a
    foreign grant.

    Signed-off-by: Stefano Stabellini
    Tested-by: Denis Schneider

    Stefano Stabellini
     
  • Pull input updates from Dmitry Torokhov:
    "An update to Synaptics PS/2 driver to handle "ForcePads" (currently
    found in HP EliteBook 1040 laptops), a change for Elan PS/2 driver to
    detect newer touchpads, a bunch of devices get annotated as Trackpoint
    and/or Pointer to help userspace classify and handle them, plus
    assorted driver fixes"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
    Input: serport - add compat handling for SPIOCSTYPE ioctl
    Input: atmel_mxt_ts - fix double free of input device
    Input: synaptics - add support for ForcePads
    Input: matrix_keypad - use request_any_context_irq()
    Input: atmel_mxt_ts - downgrade warning about empty interrupts
    Input: wm971x - fix typo in module parameter description
    Input: cap1106 - fix register definition
    Input: add missing POINTER / DIRECT properties to a bunch of drivers
    Input: add INPUT_PROP_POINTING_STICK property
    Input: elantech - fix detection of touchpad on ASUS s301l

    Linus Torvalds
     

11 Sep, 2014

3 commits

  • The new header file memfd.h from commit 9183df25fe7b ("shm: add
    memfd_create() syscall") should be exported.

    Signed-off-by: David Drysdale
    Reviewed-by: David Herrmann
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Drysdale
     
  • Changing the vlan stripping policy of the QP isn't supported by older
    firmware versions for the INIT2RTR command. Nevertheless, we've used it.

    Fix that by doing this policy change using INIT2RTR only if the firmware
    supports it; otherwise, call the UPDATE_QP command to do the task.

    Fixes: 7677fc9 ('net/mlx4: Strengthen VLAN tags/priorities enforcement in VST mode')
    Signed-off-by: Matan Barak
    Signed-off-by: Or Gerlitz
    Signed-off-by: David S. Miller

    Matan Barak
     
  • Powering off a hot-pluggable device, e.g., with pci_set_power_state(D3cold),
    normally generates a hot-remove event that unbinds the driver.

    Some drivers expect to remain bound to a device even while they power it
    off and back on again. This can be dangerous, because if the device is
    removed or replaced while it is powered off, the driver doesn't know that
    anything changed. But some drivers accept that risk.

    Add pci_ignore_hotplug() for use by drivers that know their device cannot
    be removed. Using pci_ignore_hotplug() tells the PCI core that hot-plug
    events for the device should be ignored.

    The radeon and nouveau drivers use this to switch between a low-power,
    integrated GPU and a higher-power, higher-performance discrete GPU. They
    power off the unused GPU, but they want to remain bound to it.
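
    The intended behavior can be sketched with a user-space mock. This is
    not the kernel implementation: the sk_ structure, its fields, and the
    hot-remove handler are hypothetical stand-ins, and the per-device
    flag is an assumption about how the opt-out might be recorded.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock device with only the state this sketch needs; not struct pci_dev. */
struct sk_pci_dev {
    bool ignore_hotplug;   /* assumed per-device opt-out flag */
    bool driver_bound;     /* whether a driver is still attached */
};

/* Stand-in for pci_ignore_hotplug(): the driver opts out of hot-plug events. */
static void sk_pci_ignore_hotplug(struct sk_pci_dev *dev)
{
    dev->ignore_hotplug = true;
}

/* Stand-in for the hot-plug core reacting to a "card not present" event. */
static void sk_handle_hot_remove(struct sk_pci_dev *dev)
{
    if (dev->ignore_hotplug)
        return;                 /* driver asked to stay bound */
    dev->driver_bound = false;  /* normal path: unbind the driver */
}

/* A driver powering its device off: the slot then reports removal. */
static void sk_driver_power_off(struct sk_pci_dev *dev)
{
    sk_pci_ignore_hotplug(dev);
    sk_handle_hot_remove(dev);
}

/* Contrast a driver that opts out with one that does not. */
static void sk_demo(void)
{
    struct sk_pci_dev gpu = { false, true };
    struct sk_pci_dev nic = { false, true };

    sk_driver_power_off(&gpu);
    sk_handle_hot_remove(&nic);

    assert(gpu.driver_bound);
    assert(!nic.driver_bound);
}
```

    The opt-out is set by the driver rather than the core, which matches
    the point that only drivers accepting the risk should use it.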

    This is a reimplementation of f244d8b623da ("ACPIPHP / radeon / nouveau:
    Fix VGA switcheroo problem related to hotplug") but extends it to work with
    both acpiphp and pciehp.

    This fixes problems on dual GPU systems where the radeon driver becomes
    unusable, freezing every few seconds (see bugzillas below). The failure
    can occur while suspending the device, as in bug 79701:

    [drm] radeon: finishing device.
    radeon 0000:01:00.0: Userspace still has active objects !
    radeon 0000:01:00.0: ffff8800cb4ec288 ffff8800cb4ec000 16384 4294967297 force free
    ...
    WARNING: CPU: 0 PID: 67 at /home/apw/COD/linux/drivers/gpu/drm/radeon/radeon_gart.c:234 radeon_gart_unbind+0xd2/0xe0 [radeon]()
    trying to unbind memory from uninitialized GART !

    or while resuming it, as in bug 77261:

    radeon 0000:01:00.0: ring 0 stalled for more than 10158msec
    radeon 0000:01:00.0: GPU lockup ...
    radeon 0000:01:00.0: GPU pci config reset
    pciehp 0000:00:01.0:pcie04: Card not present on Slot(1-1)
    radeon 0000:01:00.0: GPU reset succeeded, trying to resume
    *ERROR* radeon: dpm resume failed
    radeon 0000:01:00.0: Wait for MC idle timedout !

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=77261
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=79701
    Reported-by: Shawn Starr
    Reported-by: Jose P.
    Signed-off-by: Bjorn Helgaas
    Acked-by: Alex Deucher
    Acked-by: Rajat Jain
    Acked-by: Rafael J. Wysocki
    Acked-by: Dave Airlie
    CC: stable@vger.kernel.org # v3.15+

    Bjorn Helgaas