29 Jul, 2014
1 commit
-
If the worker is already executing a work item when another is queued,
we can safely skip wakeup without worrying about stalling queue thus
avoiding waking up the busy worker spuriously. Spurious wakeups
should be fine but still isn't nice and avoiding it is trivial here.tj: Updated description.
Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo
23 Jul, 2014
6 commits
-
They are the same and nr_node_ids is provided by the memory subsystem.
Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo -
After the locking was moved up to the caller of the get_unbound_pool(),
out_unlock label doesn't need to do any unlock operation and the name
became bad, so we just remove this label, and the only usage-site
"goto out_unlock" is subsituted to "return pool".Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo -
In 75ccf5950f82 ("workqueue: prepare flush_workqueue() for dynamic
creation and destrucion of unbound pool_workqueues"), a comment
about the synchronization for the pwq in pwq_unbound_release_workfn()
was added. The comment claimed the flush_mutex wasn't strictly
necessary, it was correct in that time, due to the pwq was protected
by workqueue_lock.But it is incorrect now since the wq->flush_mutex was renamed to
wq->mutex and workqueue_lock was removed, the wq->mutex is strictly
needed. But the comment was miss-updated when the synchronization
was changed.This patch removes the incorrect comments and doesn't add any new
comment to explain why wq->mutex is needed here, which is definitely
obvious and wq->pwqs_node has "WQ" notation in its definition which is
better comment.The old commit mentioned above also introduced a comment in link_pwq()
about the synchronization. This comment is also removed in this patch
since the whole link_pwq() is proteced by wq->mutex.Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo -
In 51697d393922 ("workqueue: use generic attach/detach routine for
rescuers"), The rescuer detaches itself from the pool before put_pwq()
so that the put_unbound_pool() will not destroy the rescuer-attached
pool.It is unnecessary. worker_detach_from_pool() can be used as the last
statement to access to the pool just like the regular workers,
put_unbound_pool() will wait for it to detach and then free the pool.So we move the worker_detach_from_pool() down, make it coincide with
the regular workers.tj: Minor description update.
Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo -
Simply unfold the code of start_worker() into create_worker() and
remove the original start_worker() and create_and_start_worker().The only trade-off is the introduced overhead that the pool->lock
is released and regrabbed after the newly worker is started.
The overhead is acceptible since the manager is slow path.And because this new locking behavior, the newly created worker
may grab the lock earlier than the manager and go to process
work items. In this case, the recheck need_to_create_worker() may be
true as expected and the manager goes to restart which is the
correct behavior.tj: Minor updates to description and comments.
Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo -
worker_set_flags() has only two callers, each specifying %true and
%false for @wakeup. Let's push the wake up to the caller and remove
@wakeup from worker_set_flags(). The caller can use the following
instead if wakeup is necessary:worker_set_flags();
if (need_more_worker(pool))
wake_up_worker(pool);This makes the code simpler. This patch doesn't introduce behavior
changes.tj: Updated description and comments.
Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo
22 Jul, 2014
1 commit
-
In process_one_work():
if ((worker->flags & WORKER_UNBOUND) && need_more_worker(pool))
wake_up_worker(pool);the first test is unneeded. Even if the first test is removed, it
doesn't affect the wake-up logic for WORKER_UNBOUND, and it will not
introduce any useless wake-ups for normal per-cpu workers since
nr_running is always >= 1. It will introduce useless/redundant
wake-ups for CPU_INTENSIVE, but this case is rare and the next patch
will also remove this redundant wake-up.tj: Minor updates to the description and comment.
Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo
19 Jul, 2014
1 commit
-
We don't need to wake up regular worker when nr_running==1,
so need_more_worker() is sufficient here.And need_more_worker() gives us better readability due to the name of
"keep_working()" implies the rescuer should keep working now but
the rescuer is actually leaving.Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo
15 Jul, 2014
1 commit
-
When the create_worker() is called from non-manager, the struct worker
is allocated from the node of the caller which may be different from the
node of pool->node.So we add a node ID argument for the alloc_worker() to ensure the
struct worker is allocated from the preferable node.tj: @nid renamed to @node for consistency.
Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo
11 Jul, 2014
1 commit
-
try_to_grab_pending() was re-calculating the associated pwq using
get_work_pwq() when it already has it cached in a local varible and
the association can't change. Reuse the local variable instead.This doesn't introduce any functional changes.
tj: Updated description.
Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo
02 Jul, 2014
2 commits
-
When POOL_DISASSOCIATED is cleared, the running worker's local CPU should
be the same as pool->cpu without any exception even during cpu-hotplug.This patch changes "(proposition_A && proposition_B && proposition_C)"
to "(proposition_B && proposition_C)", so if the old compound
proposition is true, the new one must be true too. so this won't hide
any possible bug which can be hit by old test.tj: Minor description update and dropped the obvious comment.
CC: Jason J. Herne
CC: Sasha Levin
Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo -
a9ab775bcadf ("workqueue: directly restore CPU affinity of workers
from CPU_ONLINE") moved pool locking into rebind_workers() but left
"pool->flags &= ~POOL_DISASSOCIATED" in workqueue_cpu_up_callback().There is nothing necessarily wrong with it, but there is no benefit
either. Let's move it into rebind_workers() and achieve the following
benefits:1) better readability, POOL_DISASSOCIATED is cleared in rebind_workers()
as expected.2) we can guarantee that, when POOL_DISASSOCIATED is clear, the
running workers of the pool are on the local CPU (pool->cpu).tj: Minor description update.
Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo
20 Jun, 2014
6 commits
-
In theory, pool->cpu is equals to @cpu in wq_worker_sleeping() after
worker->flags is checked.And "pool->cpu != cpu" sanity check will help us if something wrong.
Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo -
When a worker is detached, the worker->flags may still have WORKER_UNBOUND
or WORKER_REBOUND, it is OK for all cases:
1) if it is a normal worker, the worker will be dead, it is OK.
2) if it is a rescuer, it may re-attach to a pool with this leftover flag[s],
it is still correct except it may cause unneeded wakeup.It is correct but not good, so we just remove the leftover flags.
Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo -
The @cpu is fetched via smp_processor_id() in this function,
so the check is useless.Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo -
schedule_timeout_interruptible(CREATE_COOLDOWN) is exactly the same as
the original code.Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo -
The commit ea1abd6197d5 ("workqueue: reimplement idle worker rebinding")
used a trick which simply removes all to-be-bound idle workers from the
idle list and lets them add themselves back after completing rebinding.And this trick caused the @worker_pool->nr_idle may deviate than the actual
number of idle workers on @worker_pool->idle_list. More specifically,
nr_idle may be non-zero while ->idle_list is empty. All users of
->nr_idle and ->idle_list are audited. The only affected one is
too_many_workers() which is updated to check %false if ->idle_list is
empty regardless of ->nr_idle.The commit/trick was complicated due to it just tried to simplify an even
more complicated problem (workers had to rebind itself). But the commit
a9ab775bcadf ("workqueue: directly restore CPU affinity of workers
from CPU_ONLINE") fixed all these problems and the mentioned trick was
useless and is gone.So, now the @worker_pool->nr_idle is exactly the actual number of workers
on @worker_pool->idle_list. too_many_workers() should recover as it was
before the trick. So we remove the empty check.Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo -
There is a piece of sanity checks code in the put_unbound_pool().
The meaning of this code is "if it is not an unbound pool, it will complain
and return" IIUC. But the code uses "pool->flags & POOL_DISASSOCIATED"
imprecisely due to a non-unbound pool may also have this flags.We should use "pool->cpu < 0" to stand for an unbound pool, so we covert the
code to it.There is no strictly wrong if we still keep "pool->flags & POOL_DISASSOCIATED"
here, but it is just a noise if we keep it:
1) we focus on "unbound" here, not "[dis]association".
2) "pool->cpu < 0" already implies "pool->flags & POOL_DISASSOCIATED".Signed-off-by: Lai Jiangshan
Signed-off-by: Tejun Heo
17 Jun, 2014
1 commit
-
This fixes use-after-free of epi->fllink.next inside list loop macro.
This loop actually releases elements in the body. The list is
rcu-protected but here we cannot hold rcu_read_lock because we need to
lock mutex inside.The obvious solution is to use list_for_each_entry_safe(). RCU-ness
isn't essential because nobody can change this list under us, it's final
fput for this file.The bug was introduced by ae10b2b4eb01 ("epoll: optimize EPOLL_CTL_DEL
using rcu")Signed-off-by: Konstantin Khlebnikov
Reported-by: Cyrill Gorcunov
Cc: Stable # 3.13+
Cc: Sasha Levin
Cc: Jason Baron
Signed-off-by: Linus Torvalds
16 Jun, 2014
5 commits
-
This reverts commit e1edf18b20076da83dd231dbd2146cbbc31c0b14.
This patch was a misguided attempt at fixing offb for LE ppc64
kernels on BE qemu but is just wrong ... it breaks real LE/LE
setups, LE with real HW, and existing mixed endian systems
that did the fight thing with the appropriate device-tree
property. Bad reviewing on my part, sorry.The right fix is to either make qemu change its endian when
the guest changes endian (working on that) or to use the
existing foreign endian support.Signed-off-by: Benjamin Herrenschmidt
CC: [v3.13+]
--- -
Pull networking fixes from David Miller:
1) Fix checksumming regressions, from Tom Herbert.
2) Undo unintentional permissions changes for SCTP rto_alpha and
rto_beta sysfs knobs, from Denial Borkmann.3) VXLAN, like other IP tunnels, should advertize it's encapsulation
size using dev->needed_headroom instead of dev->hard_header_len.
From Cong Wang.* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
net: sctp: fix permissions for rto_alpha and rto_beta knobs
vxlan: Checksum fixes
net: add skb_pop_rcv_encapsulation
udp: call __skb_checksum_complete when doing full checksum
net: Fix save software checksum complete
net: Fix GSO constants to match NETIF flags
udp: ipv4: do not waste time in __udp4_lib_mcast_demux_lookup
vxlan: use dev->needed_headroom instead of dev->hard_header_len
MAINTAINERS: update cxgb4 maintainer -
Pull more clock framework updates from Mike Turquette:
"This contains the second half the of the clk changes for 3.16.They are simply fixes and code refactoring for the OMAP clock drivers.
The sunxi clock driver changes include splitting out the one
mega-driver into several smaller pieces and adding support for the A31
SoC clocks"* tag 'clk-for-linus-3.16-part2' of git://git.linaro.org/people/mike.turquette/linux: (25 commits)
clk: sunxi: document PRCM clock compatible strings
clk: sunxi: add PRCM (Power/Reset/Clock Management) clks support
clk: sun6i: Protect SDRAM gating bit
clk: sun6i: Protect CPU clock
clk: sunxi: Rework clock protection code
clk: sunxi: Move the GMAC clock to a file of its own
clk: sunxi: Move the 24M oscillator to a file of its own
clk: sunxi: Remove calls to clk_put
clk: sunxi: document new A31 USB clock compatible
clk: sunxi: Implement A31 USB clock
ARM: dts: OMAP5/DRA7: use omap5-mpu-dpll-clock capable of dealing with higher frequencies
CLK: TI: dpll: support OMAP5 MPU DPLL that need special handling for higher frequencies
ARM: OMAP5+: dpll: support Duty Cycle Correction(DCC)
CLK: TI: clk-54xx: Set the rate for dpll_abe_m2x2_ck
CLK: TI: Driver for DRA7 ATL (Audio Tracking Logic)
dt:/bindings: DRA7 ATL (Audio Tracking Logic) clock bindings
ARM: dts: dra7xx-clocks: Correct name for atl clkin3 clock
CLK: TI: gate: add composite interface clock to OMAP2 only build
ARM: OMAP2: clock: add DT boot support for cpufreq_ck
CLK: TI: OMAP2: add clock init support
... -
Pull NVMe update from Matthew Wilcox:
"Mostly bugfixes again for the NVMe driver. I'd like to call out the
exported tracepoint in the block layer; I believe Keith has cleared
this with Jens.We've had a few reports from people who're really pounding on NVMe
devices at scale, hence the timeout changes (and new module
parameters), hotplug cpu deadlock, tracepoints, and minor performance
tweaks"[ Jens hadn't seen that tracepoint thing, but is ok with it - it will
end up going away when mq conversion happens ]* git://git.infradead.org/users/willy/linux-nvme: (22 commits)
NVMe: Fix START_STOP_UNIT Scsi->NVMe translation.
NVMe: Use Log Page constants in SCSI emulation
NVMe: Define Log Page constants
NVMe: Fix hot cpu notification dead lock
NVMe: Rename io_timeout to nvme_io_timeout
NVMe: Use last bytes of f/w rev SCSI Inquiry
NVMe: Adhere to request queue block accounting enable/disable
NVMe: Fix nvme get/put queue semantics
NVMe: Delete NVME_GET_FEAT_TEMP_THRESH
NVMe: Make admin timeout a module parameter
NVMe: Make iod bio timeout a parameter
NVMe: Prevent possible NULL pointer dereference
NVMe: Fix the buffer size passed in GetLogPage(CDW10.NUMD)
NVMe: Update data structures for NVMe 1.2
NVMe: Enable BUILD_BUG_ON checks
NVMe: Update namespace and controller identify structures to the 1.1a spec
NVMe: Flush with data support
NVMe: Configure support for block flush
NVMe: Add tracepoints
NVMe: Protect against badly formatted CQEs
...
15 Jun, 2014
15 commits
-
Commit 3fd091e73b81 ("[SCTP]: Remove multiple levels of msecs
to jiffies conversions.") has silently changed permissions for
rto_alpha and rto_beta knobs from 0644 to 0444. The purpose of
this was to discourage users from tweaking rto_alpha and
rto_beta knobs in production environments since they are key
to correctly compute rtt/srtt.RFC4960 under section 6.3.1. RTO Calculation says regarding
rto_alpha and rto_beta under rule C3 and C4:[...]
C3) When a new RTT measurement R' is made, setRTTVAR
Cc: Vlad Yasevich
Signed-off-by: David S. Miller -
Tom Herbert says:
====================
Fixes related to some recent checksum modifications.- Fix GSO constants to match NETIF flags
- Fix logic in saving checksum complete in __skb_checksum_complete
- Call __skb_checksum_complete from UDP if we are checksumming over
whole packet in order to save checksum.
- Fixes to VXLAN to work correctly with checksum complete
====================Signed-off-by: David S. Miller
-
Call skb_pop_rcv_encapsulation and postpull_rcsum for the Ethernet
header to work properly with checksum complete.Signed-off-by: Tom Herbert
Signed-off-by: David S. Miller -
This function is used by UDP encapsulation protocols in RX when
crossing encapsulation boundary. If ip_summed is set to
CHECKSUM_UNNECESSARY and encapsulation is not set, change to
CHECKSUM_NONE since the checksum has not been validated within the
encapsulation. Clears csum_valid by the same rationale.Signed-off-by: Tom Herbert
Signed-off-by: David S. Miller -
In __udp_lib_checksum_complete check if checksum is being done over all
the data (len is equal to skb->len) and if it is call
__skb_checksum_complete instead of __skb_checksum_complete_head. This
allows checksum to be saved in checksum complete.Signed-off-by: Tom Herbert
Signed-off-by: David S. Miller -
Geert reported issues regarding checksum complete and UDP.
The logic introduced in commit 7e3cead5172927732f51fde
("net: Save software checksum complete") is not correct.This patch:
1) Restores code in __skb_checksum_complete_header except for setting
CHECKSUM_UNNECESSARY. This function may be calculating checksum on
something less than skb->len.
2) Adds saving checksum to __skb_checksum_complete. The full packet
checksum 0..skb->len is calculated without adding in pseudo header.
This value is saved in skb->csum and then the pseudo header is added
to that to derive the checksum for validation.
3) In both __skb_checksum_complete_header and __skb_checksum_complete,
set skb->csum_valid to whether checksum of zero was computed. This
allows skb_csum_unnecessary to return true without changing to
CHECKSUM_UNNECESSARY which was done previously.
4) Copy new csum related bits in __copy_skb_header.Reported-by: Geert Uytterhoeven
Signed-off-by: Tom Herbert
Signed-off-by: David S. Miller -
Joseph Gasparakis reported that VXLAN GSO offload stopped working with
i40e device after recent UDP changes. The problem is that the
SKB_GSO_* bits are out of sync with the corresponding NETIF flags. This
patch fixes that. Also, we add BUILD_BUG_ONs in net_gso_ok for several
GSO constants that were missing to avoid the problem in the future.Reported-by: Joseph Gasparakis
Signed-off-by: Tom Herbert
Signed-off-by: David S. Miller -
Pull more SCSI updates from James Bottomley:
"This is just a couple of drivers (hpsa and lpfc) that got left out for
further testing in linux-next. We also have one fix to a prior
submission (qla2xxx sparse)"* tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (36 commits)
qla2xxx: fix sparse warnings introduced by previous target mode t10-dif patch
lpfc: Update lpfc version to driver version 10.2.8001.0
lpfc: Fix ExpressLane priority setup
lpfc: mark old devices as obsolete
lpfc: Fix for initializing RRQ bitmap
lpfc: Fix for cleaning up stale ring flag and sp_queue_event entries
lpfc: Update lpfc version to driver version 10.2.8000.0
lpfc: Update Copyright on changed files from 8.3.45 patches
lpfc: Update Copyright on changed files
lpfc: Fixed locking for scsi task management commands
lpfc: Convert runtime references to old xlane cfg param to fof cfg param
lpfc: Fix FW dump using sysfs
lpfc: Fix SLI4 s abort loop to process all FCP rings and under ring_lock
lpfc: Fixed kernel panic in lpfc_abort_handler
lpfc: Fix locking for postbufq when freeing
lpfc: Fix locking for lpfc_hba_down_post
lpfc: Fix dynamic transitions of FirstBurst from on to off
hpsa: fix handling of hpsa_volume_offline return value
hpsa: return -ENOMEM not -1 on kzalloc failure in hpsa_get_device_id
hpsa: remove messages about volume status VPD inquiry page not supported
... -
Pull more btrfs updates from Chris Mason:
"This has a few fixes since our last pull and a new ioctl for doing
btree searches from userland. It's very similar to the existing
ioctl, but lets us return larger items back down to the app"* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
btrfs: fix error handling in create_pending_snapshot
btrfs: fix use of uninit "ret" in end_extent_writepage()
btrfs: free ulist in qgroup_shared_accounting() error path
Btrfs: fix qgroups sanity test crash or hang
btrfs: prevent RCU warning when dereferencing radix tree slot
Btrfs: fix unfinished readahead thread for raid5/6 degraded mounting
btrfs: new ioctl TREE_SEARCH_V2
btrfs: tree_search, search_ioctl: direct copy to userspace
btrfs: new function read_extent_buffer_to_user
btrfs: tree_search, copy_to_sk: return needed size on EOVERFLOW
btrfs: tree_search, copy_to_sk: return EOVERFLOW for too small buffer
btrfs: tree_search, search_ioctl: accept varying buffer
btrfs: tree_search: eliminate redundant nr_items check -
Pull aio fix and cleanups from Ben LaHaise:
"This consists of a couple of code cleanups plus a minor bug fix"* git://git.kvack.org/~bcrl/aio-next:
aio: cleanup: flatten kill_ioctx()
aio: report error from io_destroy() when threads race in io_destroy()
fs/aio.c: Remove ctx parameter in kiocb_cancel -
Tetsuo Handa wrote:
"Commit 62a8067a7f35 ("bio_vec-backed iov_iter") introduced an unnamed
union inside a struct which gcc-4.4.7 cannot handle. Name the unnamed
union as u in order to fix build failure"Let's do this instead: there is only one place in the entire tree that
steps into this breakage. Anon structs and unions work in older gcc
versions; as the matter of fact, we have those in the tree - see e.g.
struct ieee80211_tx_info in include/net/mac80211.hWhat doesn't work is handling their initializers:
struct {
int a;
union {
int b;
char c;
};
} x[2] = {{.a = 1, .c = 'a'}, {.a = 0, .b = 1}};is the obvious syntax for initializer, perfectly fine for C11 and
handled correctly by gcc-4.7 or later.Earlier versions, though, break on it - declaration is fine and so's
access to fields (i.e. x[0].c = 'a'; would produce the right code), but
members of the anon structs and unions are not inserted into the right
namespace. Tellingly, those older versions will not barf on struct {int
a; struct {int a;};}; - looks like they just have it hacked up somewhere
around the handling of . and -> instead of doing the right thing.The easiest way to deal with that crap is to turn initialization of
those fields (in the only place where we have such initializer of
iov_iter) into plain assignment.Reported-by: Tetsuo Handa
Reported-by: Russell King
Signed-off-by: Al Viro
Signed-off-by: Linus Torvalds -
Pull HSI build fixes from Sebastian Reichel:
- tighten dependency between ssi-protocol and omap-ssi to fix build
failures with randconfig.
- use normal module refcounting in omap driver to fix build with
disabled module support* tag 'hsi-for-3.16-fixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-hsi:
hsi: omap_ssi_port: use normal module refcounting
HSI: fix omap ssi driver dependency -
Pull GPIO fix from Linus Walleij:
"A first GPIO fix for the v3.16 series, this was serious since it
blocks the OMAP boot.Sending you this vital fix before leaving for a short vacation so it
does not sit collecting dust in my tree for no good reason.Apart from this, our v3.16 cycle looks like a good start"
* tag 'gpio-v3.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: of: Fix handling for deferred probe for -gpio suffix -
Pull x86 vdso fixes from Peter Anvin:
"Fixes for x86/vdso.One is a simple build fix for bigendian hosts, one is to make "make
vdso_install" work again, and the rest is about working around a bug
in Google's Go language -- two are documentation patches that improves
the sample code that the Go coders took, modified, and broke; the
other two implements a workaround that keeps existing Go binaries from
segfaulting at least"* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vdso: Fix vdso_install
x86/vdso: Hack to keep 64-bit Go programs working
x86/vdso: Add PUT_LE to store little-endian values
x86/vdso/doc: Make vDSO examples more portable
x86/vdso/doc: Rename vdso_test.c to vdso_standalone_test_x86.c
x86, vdso: Remove one final use of htole16() -
Pull hwmon updates from Guenter Roeck:
- new driver for Sensirion SHTC1 humidity / temperature sensor
- convert ltc4151 and vexpress drivers to use devm functions
- drop generic chip detection from lm85 driver
- avoid forward declarations in atxp1 driver
- fix sign extensions in ina2xx driver* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: vexpress: Use devm helper for hwmon device registration
hwmon: (atxp1) Avoid forward declaration
hwmon: add support for Sensirion SHTC1 sensor
hwmon: (ltc4151) Convert to devm_hwmon_device_register_with_groups
hwmon: (lm85) Drop generic detection
hwmon: (ina2xx) Cast to s16 on shunt and current regs