24 Sep, 2014
8 commits
-
Pull final block fixes from Jens Axboe:
"This week and last we've been fixing some corner cases related to
blk-mq, mostly. I ended up pulling most of that out of for-linus
yesterday, which is why the branch looks fresh. The rest were
postponed for 3.18.This pull request contains:
- Fix from Christoph, avoiding a stack overflow when FUA insertion
would recursive infinitely.- Fix from David Hildenbrand on races between the timeout handler and
uninitialized requests. Fixes a real issue that virtio_blk has run
into.- A few fixes from me:
- Ensure that request deadline/timeout is ordered before the
request is marked as started.- A potential oops on out-of-memory, when we scale the queue
depth of the device and retry.- A hang fix on requeue from SCSI, where the hardware queue
would be stopped when we attempt to re-run it (and hence
nothing would happen, stalling progress).- A fix for commit 2da78092, where the cleanup path was moved
to RCU, but a debug might_sleep() was inadvertently left in
the code. This causes warnings for people"* 'for-linus' of git://git.kernel.dk/linux-block:
genhd: fix leftover might_sleep() in blk_free_devt()
blk-mq: use blk_mq_start_hw_queues() when running requeue work
blk-mq: fix potential oops on out-of-memory in __blk_mq_alloc_rq_maps()
blk-mq: avoid infinite recursion with the FUA flag
blk-mq: Avoid race condition with uninitialized requests
blk-mq: request deadline must be visible before marking rq as started -
Pull parisc fixes from Helge Deller:
"We avoid using -mfast-indirect-calls for 64bit kernel builds to
prevent building an unbootable kernel due to latest gcc changes.In the pdc_stable/firmware-access driver we fix a few possible stack
overflows and we now call secure_computing_strict() instead of
secure_computing() which fixes upcoming SECCOMP patches in the
for-next trees"* 'parisc-3.17-7' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Only use -mfast-indirect-calls option for 32-bit kernel builds
parisc: pdc_stable.c: Avoid potential stack overflows
parisc: pdc_stable.c: Cleaning up unnecessary use of memset in conjunction with strncpy
parisc: ptrace: use secure_computing_strict() -
In spite of what the GCC manual says, the -mfast-indirect-calls has
never been supported in the 64-bit parisc compiler. Indirect calls have
always been done using function descriptors irrespective of the
-mfast-indirect-calls option.Recently, it was noticed that a function descriptor was always requested
when the -mfast-indirect-calls option was specified. This caused
problems when the option was used in application code and doesn't make
any sense because the whole point of the option is to avoid using a
function descriptor for indirect calls.Fixing this broke 64-bit kernel builds.
I will fix GCC but for now we need the attached change. This results in
the same kernel code as before.Signed-off-by: John David Anglin
Cc: stable@vger.kernel.org # v3.0+
Signed-off-by: Helge Deller -
Pull ia64 defconfig update from Tony Luck:
"Need to rebuild defconfig files to cope with removal of "select NET"
in drivers/scsi/Kconfig"* tag 'please-pull-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
[IA64] refresh arch/ia64/configs/* using "make savedefconfig" -
Pull hwmon fixes from Guenter Roeck:
- Fix a resource leak in tmp103 driver
- Add support for two more processors to fam15h_power driver
- Also fix a bug in the same driver to only report the power level on
chips which actually support reporting it* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (tmp103) Fix resource leak bug in tmp103 temperature sensor driver
hwmon: (fam15h_power) Add support for two more processors
hwmon: (fam15h_power) Make actual power reporting conditional -
Prompted by a change to drivers/scsi/Kconfig which used to do a
"select NET" but now does a "depends on NET". This meant that some
configurations ended up without CONFIG_NET=ySigned-off-by Tony Luck
-
Pull another kvm fix from Paolo Bonzini:
"Another fix for 3.17 arrived at just the wrong time, after I had sent
yesterday's pull request. Normally I would have waited for some other
patches to pile up, but since 3.17 might be short here it is"* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
arm/arm64: KVM: Fix unaligned access bug on gicv2 access -
Pull cgroup fix from Tejun Heo:
"One late fix for cgroup.I was waiting for another set of fixes for a long-standing obscure
cpuset bug but am not sure whether they'll be ready before v3.17
release. This one is a simple fix for a mutex unlock balance bug in
an allocation failure path in pidlist_array_load().The bug was introduced in v3.14 and the fix is tagged for -stable"
* 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: fix unbalanced locking
23 Sep, 2014
28 commits
-
…/kernel/git/kvmarm/kvmarm into kvm-master
Fixes unaligned access to the gicv2 virtual cpu status.
-
This reverts commit 9cb0e394234d244fe5a97e743ec9dd7ddff7e64b.
It causes my Sony Vaio Pro 11 to immediately reboot at startup.
Acked-by: Ingo Molnar
Cc: Peter Anvin
Cc: Maarten Lankhorst
Cc: Ard Biesheuvel
Cc: Matt Fleming
Signed-off-by: Linus Torvalds -
Pull networking fixes from David Miller:
1) If the user gives us a msg_namelen of 0, don't try to interpret
anything pointed to by msg_name. From Ani Sinha.2) Fix some bnx2i/bnx2fc randconfig compilation errors.
The gist of the issue is that we firstly have drivers that span both
SCSI and networking. And at the top of that chain of dependencies
we have things like SCSI_FC_ATTRS and SCSI_NETLINK which are
selected.But since select is a sledgehammer and ignores dependencies,
everything to select's SCSI_FC_ATTRS and/or SCSI_NETLINK has to also
explicitly select their dependencies and so on and so forth.Generally speaking 'select' is supposed to only be used for child
nodes, those which have no dependencies of their own. And this
whole chain of dependencies in the scsi layer violates that rather
strongly.So just make SCSI_NETLINK depend upon it's dependencies, and so on
and so forth for the things selecting it (either directly or
indirectly).From Anish Bhatt and Randy Dunlap.
3) Fix generation of blackhole routes in IPSEC, from Steffen Klassert.
4) Actually notice netdev feature changes in rtl_open() code, from
Hayes Wang.5) Fix divide by zero in bond enslaving, from Nikolay Aleksandrov.
6) Missing memory barrier in sunvnet driver, from David Stevens.
7) Don't leave anycast addresses around when ipv6 interface is
destroyed, from Sabrina Dubroca.8) Don't call efx_{arch}_filter_sync_rx_mode before addr_list_lock is
initialized in SFC driver, from Edward Cree.9) Fix missing DMA error checking in 3c59x, from Neal Horman.
10) Openvswitch doesn't emit OVS_FLOW_CMD_NEW notifications accidently,
fix from Samuel Gauthier.11) pch_gbe needs to select NET_PTP_CLASSIFY otherwise we can get a
build error.12) Fix macvlan regression wherein we stopped emitting
broadcast/multicast frames over software devices. From Nicolas
Dichtel.13) Fix infiniband bug due to unintended overflow of skb->cb[], from
Eric Dumazet. And add an assertion so this doesn't happen again.14) dm9000_parse_dt() should return error pointers, not NULL. From
Tobias Klauser.15) IP tunneling code uses this_cpu_ptr() in preemptible contexts, fix
from Eric Dumazet.* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
net: bcmgenet: call bcmgenet_dma_teardown in bcmgenet_fini_dma
net: bcmgenet: fix TX reclaim accounting for fragments
ipv4: do not use this_cpu_ptr() in preemptible context
dm9000: Return an ERR_PTR() in all error conditions of dm9000_parse_dt()
r8169: fix an if condition
r8152: disable ALDPS
ipoib: validate struct ipoib_cb size
net: sched: shrink struct qdisc_skb_cb to 28 bytes
tg3: Work around HW/FW limitations with vlan encapsulated frames
macvlan: allow to enqueue broadcast pkt on virtual device
pch_gbe: 'select' NET_PTP_CLASSIFY.
scsi: Use 'depends' with LIBFC instead of 'select'.
openvswitch: restore OVS_FLOW_CMD_NEW notifications
genetlink: add function genl_has_listeners()
lib: rhashtable: remove second linux/log2.h inclusion
net: allow macvlans to move to net namespace
3c59x: Fix bad offset spec in skb_frag_dma_map
3c59x: Add dma error checking and recovery
sparc: bpf_jit: fix support for ldx/stx mem and SKF_AD_VLAN_TAG
can: at91_can: add missing prepare and unprepare of the clock
... -
Pull clock layer fixes from Mike Turquette:
"The fixes for the clock tree are mostly run-time bugs in clock
drivers.The fixes for TI DRA7 remove divide-by-zero errors. The recently
merged AT91 clock driver fixes some bad error checking and the QCOM
driver fix restores audio for that platform, a clear regression. A
list iteration bug in the framework core was hit recently and is fixed
up here. Finally a compilation warning is fixed for efm32gg, which is
also a regression fix"* tag 'clk-fixes-for-linus' of git://git.linaro.org/people/mike.turquette/linux:
clk/efm32gg: fix dt init prototype
clk: prevent erronous parsing of children during rate change
clk: rockchip: Fix the clocks for i2c1 and i2c2
clk: qcom: Fix sdc 144kHz frequency entry
clk: at91: fix num_parents test in at91sam9260 slow clk implementation
clk: ti: dra7-atl: Provide error check for incoming parameters in set_rate
clk: ti: divider: Provide error check for incoming parameters in set_rate -
…git/dhowells/linux-fs
Pull fs-cache fixes from David Howells:
- Put a timeout in releasepage() to deal with a recursive hang between
the memory allocator, writeback, ext4 and fscache under memory
pressure.- Fix a pair of refcount bugs in the fscache error handling.
- Remove a couple of unused pagevecs.
- The cachefiles requirement that the base directory support rename
should permit rename2 as an alternative - otherwise certain
filesystems cannot now be used as backing stores (such as ext4).* tag 'fscache-fixes-20140917' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
CacheFiles: Handle rename2
cachefiles: remove two unused pagevecs.
FS-Cache: refcount becomes corrupt under vma pressure.
FS-Cache: Reduce cookie ref count if submit fails.
FS-Cache: Timeout for releasepage() -
Florian Fainelli says:
====================
net: bcmgenet: TX reclaim and DMA fixesThis patch set contains one fix for an accounting problem while reclaiming
transmitted buffers having fragments, and the second fix is to make sure
that the DMA shutdown is properly controlled.
====================Signed-off-by: David S. Miller
-
We should not be manipulaging the DMA_CTRL registers directly by writing
0 to them to disable DMA. This is an operation that needs to be timed to
make sure the DMA engines have been properly stopped since their state
machine stops on a packet boundary, not immediately.Make sure that tha bcmgenet_fini_dma() calls bcmgenet_dma_teardown() to
ensure a proper DMA engine state. As a result, we need to reorder the
function bodies to resolve the use dependency.Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Florian Fainelli
Signed-off-by: David S. Miller -
The GENET driver supports SKB fragments, and succeeds in transmitting
them properly, but when reclaiming these transmitted fragments, we will
only update the count of free buffer descriptors by 1, even for SKBs
with fragments. This leads to the networking stack thinking it has more
room than the hardware has when pushing new SKBs, and backing off
consequently because we return NETDEV_TX_BUSY.Fix this by accounting for the SKB nr_frags plus one (itself) and update
ring->free_bds accordingly with that value for each iteration loop in
__bcmgenet_tx_reclaim().Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
Signed-off-by: Florian Fainelli
Signed-off-by: David S. Miller -
this_cpu_ptr() in preemptible context is generally bad
Sep 22 05:05:55 br kernel: [ 94.608310] BUG: using smp_processor_id()
in
preemptible [00000000] code: ip/2261
Sep 22 05:05:55 br kernel: [ 94.608316] caller is
tunnel_dst_set.isra.28+0x20/0x60 [ip_tunnel]
Sep 22 05:05:55 br kernel: [ 94.608319] CPU: 3 PID: 2261 Comm: ip Not
tainted
3.17.0-rc5 #82We can simply use raw_cpu_ptr(), as preemption is safe in these
contexts.Should fix https://bugzilla.kernel.org/show_bug.cgi?id=84991
Signed-off-by: Eric Dumazet
Reported-by: Joe
Fixes: 9a4aa9af447f ("ipv4: Use percpu Cache route in IP tunnels")
Acked-by: Tom Herbert
Signed-off-by: David S. Miller -
We were using an atomic bitop on the vgic_v2.vgic_elrsr field which was
not aligned to the natural size on 64-bit platforms. This bug showed up
after QEMU correctly identifies the pl011 line as being level-triggered,
and not edge-triggered.These data structures are protected by a spinlock so simply use a
non-atomic version of the accessor instead.Tested-by: Joel Schopp
Reported-by: Riku Voipio
Signed-off-by: Christoffer Dall -
Commit 2da78092 changed the locking from a mutex to a spinlock,
so we now longer sleep in this context. But there was a leftover
might_sleep() in there, which now triggers since we do the final
free from an RCU callback. Get rid of it.Reported-by: Pontus Fuchs
Signed-off-by: Jens Axboe -
Steffen Klassert says:
====================
pull request (net): ipsec 2014-09-22We generate a blackhole or queueing route if a packet
matches an IPsec policy but a state can't be resolved.
Here we assume that dst_output() is called to kill
these packets. Unfortunately this assumption is not
true in all cases, so it is possible that these packets
leave the system without the necessary transformations.This pull request contains two patches to fix this issue:
1) Fix for blackhole routed packets.
2) Fix for queue routed packets.
Both patches are serious stable candidates.
Please pull or let me know if there are problems.
====================Signed-off-by: David S. Miller
-
In one error condition dm9000_parse_dt() returns NULL, however the
return value is checked using IS_ERR() in dm9000_probe(), leading to the
error not being properly propagated if CONFIG_OF is not enabled or the
device tree data is not available. Fix this by also returning an
ERR_PTR() in this case.Fixes: 0b8bf1baabe5 (net: dm9000: Allow instantiation using device tree)
Signed-off-by: Tobias Klauser
Signed-off-by: David S. Miller -
There is an extra semi-colon so __rtl8169_set_features() is called every
time.Fixes: 929a031dfd62 ('r8169: adjust __rtl8169_set_features')
Signed-off-by: Dan Carpenter
Acked-by: Hayes Wang --
Signed-off-by: David S. Miller -
Pull KVM fixes from Paolo Bonzini:
"Two very simple bugfixes, affecting all supported architectures"[ Two? There's three commits in here. Oh well, I guess Paolo didn't
count the preparatory symbol export ]* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: correct null pid check in kvm_vcpu_yield_to()
KVM: check for !is_zero_pfn() in kvm_is_mmio_pfn()
mm: export symbol dependencies of is_zero_pfn() -
If the hw is in ALDPS mode, the hw may have no response for accessing
the most registers. Therefore, the ALDPS should be disabled before
accessing the hw in rtl_ops.init(), rtl_ops.disable(), rtl_ops.up(),
and rtl_ops.down(). Regardless of rtl_ops.enable(), because the hw
wouldn't enter ALDPS mode when linking on. The hw would enter the
ALDPS mode after several seconds when link down occurs and the ALDPS
is enabled.Signed-off-by: Hayes Wang
Signed-off-by: David S. Miller -
To catch future errors sooner.
Signed-off-by: Eric Dumazet
Signed-off-by: David S. Miller -
We cannot make struct qdisc_skb_cb bigger without impacting IPoIB,
or increasing skb->cb[] size.Commit e0f31d849867 ("flow_keys: Record IP layer protocol in
skb_flow_dissect()") broke IPoIB.Only current offender is sch_choke, and this one do not need an
absolutely precise flow key.If we store 17 bytes of flow key, its more than enough. (Its the actual
size of flow_keys if it was a packed structure, but we might add new
fields at the end of it later)Signed-off-by: Eric Dumazet
Fixes: e0f31d849867 ("flow_keys: Record IP layer protocol in skb_flow_dissect()")
Signed-off-by: David S. Miller -
TG3 appears to have an issue performing TSO and checksum offloading
correclty when the frame has been vlan encapsulated (non-accelrated).
In these cases, tcp checksum is not correctly updated.This patch attempts to work around this issue. After the patch,
802.1ad vlans start working correctly over tg3 devices.CC: Prashant Sreedharan
CC: Michael Chan
Signed-off-by: Vladislav Yasevich
Signed-off-by: David S. Miller -
tmp103 temperature sensor driver registers with the hwmon framework by calling
hwmon_device_register_with_groups but does not have a .remove method to call
hwmon_device_unregister to unregister from the framework when the device is no
longer needed. Fix this by calling devm_hwmon_device_register_with_groups.Signed-off-by: Sundar J Dev
Signed-off-by: Guenter Roeck -
Since commit 412ca1550cbe ("macvlan: Move broadcasts into a work queue"), the
driver uses tx_queue_len of the master device as the limit of packets enqueuing.
Problem is that virtual drivers have this value set to 0, thus all broadcast
packets were rejected.
Because tx_queue_len was arbitrarily chosen, I replace it with a static limit
of 1000 (also arbitrarily chosen).CC: Herbert Xu
Reported-by: Thibaut Collet
Suggested-by: Thibaut Collet
Tested-by: Thibaut Collet
Signed-off-by: Nicolas Dichtel
Acked-by: Herbert Xu
Signed-off-by: David S. Miller -
When requests are retried due to hw or sw resource shortages,
we often stop the associated hardware queue. So ensure that we
restart the queues when running the requeue work, otherwise the
queue run will be a no-op.Signed-off-by: Jens Axboe
-
__blk_mq_alloc_rq_maps() can be invoked multiple times, if we scale
back the queue depth if we are low on memory. So don't clear
set->tags when we fail, this is handled directly in
the parent function, blk_mq_alloc_tag_set().Reported-by: Robert Elliott
Signed-off-by: Jens Axboe -
We should not insert requests into the flush state machine from
blk_mq_insert_request. All incoming flush requests come through
blk_{m,s}q_make_request and are handled there, while blk_execute_rq_nowait
should only be called for BLOCK_PC requests. All other callers
deal with requests that already went through the flush statemchine
and shouldn't be reinserted into it.Reported-by: Robert Elliott
Debugged-by: Ming Lei
Signed-off-by: Christoph Hellwig
Signed-off-by: Jens Axboe -
This patch should fix the bug reported in
https://lkml.org/lkml/2014/9/11/249.We have to initialize at least the atomic_flags and the cmd_flags when
allocating storage for the requests.Otherwise blk_mq_timeout_check() might dereference uninitialized
pointers when racing with the creation of a request.Also move the reset of cmd_flags for the initializing code to the point
where a request is freed. So we will never end up with pending flush
request indicators that might trigger dereferences of invalid pointers
in blk_mq_timeout_check().Cc: stable@vger.kernel.org
Signed-off-by: David Hildenbrand
Reported-by: Paulo De Rezende Pinatti
Tested-by: Paulo De Rezende Pinatti
Acked-by: Christian Borntraeger
Signed-off-by: Jens Axboe -
When we start the request, we set the deadline and flip the bits
marking the request as started and non-complete. However, it's
important that the deadline store is ordered before flipping the
bits, otherwise we could have a small window where the request is
marked started but with an invalid deadline. This can confuse the
timeout handling.Suggested-by: Ming Lei
Signed-off-by: Jens Axboe -
Fixes the following randconfig build failure:
> drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c: In function
> ‘pch_ptp_match’:
> drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:130:2: error:
> implicit declaration of function ‘ptp_classify_raw’
> [-Werror=implicit-function-declaration]
> if (ptp_classify_raw(skb) == PTP_CLASS_NONE)
> ^
> cc1: some warnings being treated as errors
> make[5]: *** [drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.o] Error 1Reported-by: Jim Davis
Signed-off-by: David S. Miller -
LIBFC depends upon SCSI_FC_ATTRS and select's CRC32C.
The only alternative would be to 'select' CRC32C and all of
SCSI_FC_ATTRS direct and indirect dependencies in the Kconfig section
for every LIBFCOE user which makes little sense.Subsequently, use 'depends' instead of 'select' for LIBFCOE too.
Signed-off-by: David S. Miller
22 Sep, 2014
4 commits
-
Pull workqueue fix from Tejun Heo:
"create_singlethread_workqueue() is the old interface which is kept
around for backward compatibility - each should be reviewed to
determine whether singlethread usage was to save worker threads or for
ordering guarantee and whether it's depended upon by memory reclaim
path.While adding NUMA support for unbound workqueues during v3.10, I
forgot to update it breaking the singlethread and ordering properties
on NUMA setups. The breakage was unfortunately rather subtle and went
without being reported until now.The only missing piece is __WQ_ORDERED flag which makes the unbounded
workqueue use a single backend queue across different NUMA nodes.
It's fixed by making create_singlethread_workqueue() wrap
alloc_ordered_workqueue() so that possible future updates are
inherited automatically"* 'for-3.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: apply __WQ_ORDERED to create_singlethread_workqueue() -
On 32-bit architectures, the legacy buffer_head functions are not always
handling the sector number with the proper 64-bit types, and will thus
fail on 4TB+ disks.Any code that uses __getblk() (and thus bread(), breadahead(),
sb_bread(), sb_breadahead(), sb_getblk()), and calls it using a 64-bit
block on a 32-bit arch (where "long" is 32-bit) causes an inifinite loop
in __getblk_slow() with an infinite stream of errors logged to dmesg
like this:__find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648
b_state=0x00000020, b_size=512
device sda1 blocksize: 512Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the
top 32-bits are missing (in this case the 0x1 at the top).This is because grow_dev_page() is broken and has a 32-bit overflow due
to shifting the page index value (a pgoff_t - which is just 32 bits on
32-bit architectures) left-shifted as the block number. But the top
bits to get lost as the pgoff_t is not type cast to sector_t / 64-bit
before the shift.This patch fixes this issue by type casting "index" to sector_t before
doing the left shift.Note this is not a theoretical bug but has been seen in the field on a
4TiB hard drive with logical sector size 512 bytes.This patch has been verified to fix the infinite loop problem on 3.17-rc5
kernel using a 4TB disk image mounted using "-o loop". Without this patch
doing a "find /nt" where /nt is an NTFS volume causes the inifinite loop
100% reproducibly whilst with the patch it works fine as expected.Signed-off-by: Anton Altaparmakov
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds -
Correct a simple mistake of checking the wrong variable
before a dereference, resulting in the dereference not being
properly protected by rcu_dereference().Signed-off-by: Sam Bobroff
Signed-off-by: Paolo Bonzini