23 May, 2017
38 commits
-
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.

Adjust the system_state check in async_run_entry_fn() and
async_synchronize_cookie_domain() to handle the extra states.

Tested-by: Mark Rutland
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Arjan van de Ven
Cc: Greg Kroah-Hartman
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Steven Rostedt
Link: http://lkml.kernel.org/r/20170516184735.865155020@linutronix.de
Signed-off-by: Ingo Molnar -
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.

Adjust the system_state check in of_iommu_driver_present() to handle the
extra states.

Tested-by: Mark Rutland
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Joerg Roedel
Acked-by: Robin Murphy
Cc: Greg Kroah-Hartman
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Steven Rostedt
Cc: iommu@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20170516184735.788023442@linutronix.de
Signed-off-by: Ingo Molnar -
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.

Adjust the system_state checks in dmar_parse_one_atsr() and
dmar_iommu_notify_scope_dev() to handle the extra states.

Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Joerg Roedel
Cc: David Woodhouse
Cc: Greg Kroah-Hartman
Cc: Linus Torvalds
Cc: Mark Rutland
Cc: Peter Zijlstra
Cc: Steven Rostedt
Cc: iommu@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20170516184735.712365947@linutronix.de
Signed-off-by: Ingo Molnar -
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.

Adjust the system_state check in pas_cpufreq_cpu_exit() to handle the extra
states.

Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Viresh Kumar
Cc: Greg Kroah-Hartman
Cc: Linus Torvalds
Cc: Mark Rutland
Cc: Peter Zijlstra
Cc: Rafael J. Wysocki
Cc: Steven Rostedt
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20170516184735.620023128@linutronix.de
Signed-off-by: Ingo Molnar -
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.

get_nid_for_pfn() checks for system_state == BOOTING to decide whether to
use early_pfn_to_nid() when CONFIG_DEFERRED_STRUCT_PAGE_INIT=y.

That check is dubious, because the switch to state RUNNING happens way after
page_alloc_init_late() has been invoked.

Change the check to less than RUNNING state so it covers the new
intermediate states as well.

Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Greg Kroah-Hartman
Cc: Linus Torvalds
Cc: Mark Rutland
Cc: Mel Gorman
Cc: Peter Zijlstra
Cc: Steven Rostedt
Link: http://lkml.kernel.org/r/20170516184735.528279534@linutronix.de
Signed-off-by: Ingo Molnar -
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.

Make the decision whether a pci root is hotplugged depend on SYSTEM_RUNNING
instead of !SYSTEM_BOOTING. It makes no sense to cover states greater than
SYSTEM_RUNNING as there are no hotplug events on reboot and poweroff.

Tested-by: Mark Rutland
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Steven Rostedt (VMware)
Cc: Greg Kroah-Hartman
Cc: Len Brown
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Rafael J. Wysocki
Link: http://lkml.kernel.org/r/20170516184735.446455652@linutronix.de
Signed-off-by: Ingo Molnar -
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.

Adjust the system_state check in smp_generic_cpu_bootable() to handle the
extra states.

Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Michael Ellerman
Cc: Benjamin Herrenschmidt
Cc: Greg Kroah-Hartman
Cc: Linus Torvalds
Cc: Mark Rutland
Cc: Paul Mackerras
Cc: Peter Zijlstra
Cc: Steven Rostedt
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20170516184735.359536998@linutronix.de
Signed-off-by: Ingo Molnar -
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.

Adjust the system_state check in stop_this_cpu() to handle the extra states.

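Why an equality test against SYSTEM_BOOTING stops being sufficient once intermediate states exist can be shown with a small userspace sketch. The enum follows the ordering described above. The name SYSTEM_SCHEDULING for the intermediate state and both helper names are assumptions made for illustration; they are not taken from this commit.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * States in boot order. SYSTEM_SCHEDULING is an assumed name standing in
 * for any state added between SYSTEM_BOOTING and SYSTEM_RUNNING.
 */
enum system_states {
	SYSTEM_BOOTING,
	SYSTEM_SCHEDULING,
	SYSTEM_RUNNING,
	SYSTEM_HALT,
	SYSTEM_POWER_OFF,
	SYSTEM_RESTART,
};

/* Equality check: only true in the very first state. */
bool still_booting_eq(enum system_states s)
{
	return s == SYSTEM_BOOTING;
}

/* Range check: true for every state that precedes SYSTEM_RUNNING. */
bool still_booting_lt(enum system_states s)
{
	return s < SYSTEM_RUNNING;
}
```

With the intermediate state in place, still_booting_eq(SYSTEM_SCHEDULING) is false while still_booting_lt(SYSTEM_SCHEDULING) is true, which is the kind of difference these adjustments have to account for.
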
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Greg Kroah-Hartman
Cc: James Hogan
Cc: Linus Torvalds
Cc: Mark Rutland
Cc: Peter Zijlstra
Cc: Steven Rostedt
Link: http://lkml.kernel.org/r/20170516184735.283420315@linutronix.de
Signed-off-by: Ingo Molnar -
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.

Adjust the system_state check in announce_cpu() to handle the extra states.

Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Steven Rostedt (VMware)
Cc: Greg Kroah-Hartman
Cc: Linus Torvalds
Cc: Mark Rutland
Cc: Peter Zijlstra
Link: http://lkml.kernel.org/r/20170516184735.191715856@linutronix.de
Signed-off-by: Ingo Molnar -
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.

Adjust the system_state check in smp_send_stop() to handle the extra states.

Tested-by: Mark Rutland
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Mark Rutland
Acked-by: Catalin Marinas
Cc: Greg Kroah-Hartman
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Steven Rostedt
Cc: Will Deacon
Link: http://lkml.kernel.org/r/20170516184735.112589728@linutronix.de
Signed-off-by: Ingo Molnar -
To enable smp_processor_id() and might_sleep() debug checks earlier, it's
required to add system states between SYSTEM_BOOTING and SYSTEM_RUNNING.

Adjust the system_state check in ipi_cpu_stop() to handle the extra states.

Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Greg Kroah-Hartman
Cc: Linus Torvalds
Cc: Mark Rutland
Cc: Peter Zijlstra
Cc: Russell King
Cc: Steven Rostedt
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20170516184735.020718977@linutronix.de
Signed-off-by: Ingo Molnar -
Some of the boot code in kernel_init_freeable() which runs before SMP
bringup assumes (rightfully) that it runs on the boot CPU and therefore can
use smp_processor_id() in preemptible context.

That works so far because the smp_processor_id() check starts to be
effective after smp bringup. That's just wrong. Starting with SMP bringup
and the ability to move threads around, smp_processor_id() in preemptible
context is broken.

Aside from that it does not make sense to allow init to run on all CPUs
before sched_init_smp() has been run.

Pin the init to the boot CPU so the existing code can continue to use
smp_processor_id() without triggering the checks when the enabling of those
checks starts earlier.

Tested-by: Mark Rutland
Signed-off-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Greg Kroah-Hartman
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Steven Rostedt
Link: http://lkml.kernel.org/r/20170516184734.943149935@linutronix.de
Signed-off-by: Ingo Molnar -
A customer has reported a soft-lockup when running an intensive
memory stress test, where the trace on multiple CPUs looks like this:

RIP: 0010:[]
[] native_queued_spin_lock_slowpath+0x10e/0x190
...
Call Trace:
[] queued_spin_lock_slowpath+0x7/0xa
[] change_protection_range+0x3b1/0x930
[] change_prot_numa+0x18/0x30
[] task_numa_work+0x1fe/0x310
[] task_work_run+0x72/0x90

Further investigation showed that the lock contention here is pmd_lock().
The task_numa_work() function makes sure that only one thread is allowed to perform
the work in a single scan period (via cmpxchg), but if there's a thread with
mmap_sem locked for writing for several periods, multiple threads in
task_numa_work() can build up a convoy waiting for mmap_sem for read and then
all get unblocked at once.

This patch changes the down_read() to the trylock version, which prevents the
build up. For a workload experiencing mmap_sem contention, it's probably better
to postpone the NUMA balancing work anyway. This seems to have fixed the soft
lockups involving pmd_lock(), which is in line with the convoy theory.

Signed-off-by: Vlastimil Babka
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Rik van Riel
Acked-by: Mel Gorman
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20170515131316.21909-1-vbabka@suse.cz
Signed-off-by: Ingo Molnar -
With CONFIG_RT_GROUP_SCHED=y, do_sched_rt_period_timer() sequentially
takes each CPU's rq->lock. On a large, busy system, the cumulative time it
takes to acquire each lock can be excessive, even triggering a watchdog
timeout.

If rt_rq->rt_time and rt_rq->rt_nr_running are both zero, this function does
nothing while holding the lock, so don't bother taking it at all.

Signed-off-by: Dave Kleikamp
Signed-off-by: Peter Zijlstra (Intel)
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/a767637b-df85-912f-ba69-c90ee00a3fb6@oracle.com
Signed-off-by: Ingo Molnar -
When priority inheritance was added back in 2.6.18 to sched_setscheduler(), it
added a path to taking an rt-mutex wait_lock, which is not IRQ safe. As PI
is not a common occurrence, lockdep will likely never trigger if
sched_setscheduler was called from interrupt context. A BUG_ON() was added
to trigger if __sched_setscheduler() was ever called from interrupt context
because there was a possibility to take the wait_lock.

Today the wait_lock is irq safe, but the path to taking it in
sched_setscheduler() is the same as the path to taking it from normal
context. The wait_lock is taken with raw_spin_lock_irq() and released with
raw_spin_unlock_irq() which will indiscriminately enable interrupts,
which would be bad in interrupt context.

The problem is that normalize_rt_tasks, which is called by triggering the
sysrq nice-all-RT-tasks was changed to call __sched_setscheduler(), and this
is done from interrupt context!

Now __sched_setscheduler() takes a "pi" parameter that is used to know if
the priority inheritance should be called or not. As the BUG_ON() only cares
about calling the PI code, it should only bug if called from interrupt
context with the "pi" parameter set to true.

Reported-by: Laurent Dufour
Tested-by: Laurent Dufour
Signed-off-by: Steven Rostedt (VMware)
Signed-off-by: Peter Zijlstra (Intel)
Cc: Andrew Morton
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Fixes: dbc7f069b93a ("sched: Use replace normalize_task() with __sched_setscheduler()")
Link: http://lkml.kernel.org/r/20170308124654.10e598f2@gandalf.local.home
Signed-off-by: Ingo Molnar -
pick_next_pushable_dl_task(rq) has BUG_ON(rq->cpu != task_cpu(task))
when it returns a task other than NULL, which means that task_cpu(task)
must be rq->cpu. So if task == next_task, then task_cpu(next_task) must
be rq->cpu as well. Remove the redundant condition and make the code simpler.

This way one unnecessary branch and two LOAD operations can be avoided.

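The reasoning can be checked with a small userspace model. The struct layouts and function names below are hypothetical stand-ins, and assert() plays the role of BUG_ON(). The sketch only shows that once the picker enforces the CPU invariant, the extra CPU comparison cannot change the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the scheduler structures. */
struct task { int cpu; };
struct rq { int cpu; struct task *next_pushable; };

/*
 * Returns the next pushable task, or NULL. A non-NULL result is
 * guaranteed to sit on rq->cpu (assert stands in for BUG_ON).
 */
struct task *pick_next_pushable(struct rq *rq)
{
	struct task *task = rq->next_pushable;

	if (!task)
		return NULL;
	assert(rq->cpu == task->cpu);
	return task;
}

/* Old form: compares both the CPU and the pointer. */
bool same_task_with_cpu_check(struct rq *rq, struct task *next_task)
{
	struct task *task = pick_next_pushable(rq);

	return next_task->cpu == rq->cpu && task == next_task;
}

/* Simplified form: the pointer comparison alone is enough. */
bool same_task(struct rq *rq, struct task *next_task)
{
	return pick_next_pushable(rq) == next_task;
}
```

If next_task has moved to another CPU it cannot be the task returned by the picker, so both forms return false in that case as well.
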
Signed-off-by: Byungchul Park
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Steven Rostedt (VMware)
Reviewed-by: Juri Lelli
Reviewed-by: Daniel Bristot de Oliveira
Cc:
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1494551159-22367-1-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar -
pick_next_pushable_task(rq) has BUG_ON(rq->cpu != task_cpu(task)) when
it returns a task other than NULL, which means that task_cpu(task) must
be rq->cpu. So if task == next_task, then task_cpu(next_task) must be
rq->cpu as well. Remove the redundant condition and make the code simpler.

This way one unnecessary branch and two LOAD operations can be avoided.

Signed-off-by: Byungchul Park
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Steven Rostedt (VMware)
Reviewed-by: Juri Lelli
Reviewed-by: Daniel Bristot de Oliveira
Cc:
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1494551143-22219-1-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar -
Now that we've added llist_for_each_entry_safe(), use it to simplify
an open coded version of it in sched_ttwu_pending().

Signed-off-by: Byungchul Park
Signed-off-by: Peter Zijlstra (Intel)
Cc:
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1494549584-11730-1-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar -
Sometimes we have to dereference the next field of an llist node before entering
the loop body because the node might be deleted or the next field might be
modified within the loop. So this adds the safe version of llist_for_each(),
that is, llist_for_each_safe().

Signed-off-by: Byungchul Park
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Huang, Ying
Cc:
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1494549416-10539-1-git-send-email-byungchul.park@lge.com
Signed-off-by: Ingo Molnar -
The cpumasks in smp_call_function_many() are private and not subject
to concurrency, so atomic bitops are pointless and expensive.

Signed-off-by: Peter Zijlstra (Intel)
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar -
An Inter-Processor Interrupt (IPI) is needed when a page is unmapped and the
process' mm_cpumask() shows the process has ever run on other CPUs. Page
migration and page reclaim both need IPIs. The number of IPIs that need to be
sent to different CPUs is especially large for multi-threaded workloads since
mm_cpumask() is per process.

For smp_call_function_many(), whenever a CPU queues a CSD to a target
CPU, it will send an IPI to let the target CPU handle the work.
This isn't necessary - we need only send an IPI when queueing a CSD
to an empty call_single_queue.

The reason:
flush_smp_call_function_queue() that is called upon a CPU receiving an
IPI will empty the queue and then handle all of the CSDs there. So if
the target CPU's call_single_queue is not empty, we know that:
i. An IPI for the target CPU has already been sent by 'previous queuers';
ii. flush_smp_call_function_queue() hasn't emptied that CPU's queue yet.
Thus, it's safe for us to just queue our CSD there without sending an
additional IPI. And for the 'previous queuers', we can limit it to the
first queuer.

To demonstrate the effect of this patch, a multi-thread workload that
spawns 80 threads to equally consume 100G memory is used. This is tested
on a 2 node broadwell-EP which has 44cores/88threads and 32G memory. So
after 32G memory is used up, page reclaiming starts to happen a lot.

With this patch, IPI number dropped 88% and throughput increased about
15% for the above workload.

Signed-off-by: Aaron Lu
Signed-off-by: Peter Zijlstra (Intel)
Cc: Dave Hansen
Cc: Huang Ying
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Tim Chen
Link: http://lkml.kernel.org/r/20170519075331.GE2084@aaronlu.sh.intel.com
Signed-off-by: Ingo Molnar -
Signed-off-by: Ingo Molnar
-
Pull crypto fix from Herbert Xu:
"This fixes a regression in the skcipher interface that allows bogus
key parameters to hit underlying implementations which can cause
crashes"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: skcipher - Add missing API setkey checks -
Pull pstore fix from Kees Cook:
"Marta noticed another misbehavior in EFI pstore, which this fixes.

Hopefully this is the last of the v4.12 fixes for pstore!"

* tag 'pstore-v4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
efi-pstore: Fix write/erase id tracking -
Pull ACPI fixes from Rafael Wysocki:
"These revert a 4.11 change that turned out to be problematic and add a
.gitignore file.

Specifics:

 - Revert a 4.11 commit related to the ACPI-based handling of laptop
   lids that made changes incompatible with existing user space stacks
   and broke things there (Lv Zheng).

 - Add .gitignore to the ACPI tools directory (Prarit Bhargava)"

* tag 'acpi-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI / button: Remove lid_init_state=method mode"
tools/power/acpi: Add .gitignore file -
Pull power management fixes from Rafael Wysocki:
"These fix RTC wakeup from suspend-to-idle broken recently, fix CPU
idleness detection condition in the schedutil cpufreq governor, fix a
cpufreq driver build failure, fix an error code path in the power
capping framework, clean up the hibernate core and update the
intel_pstate documentation.

Specifics:

 - Fix RTC wakeup from suspend-to-idle broken by the recent rework of
   ACPI wakeup handling (Rafael Wysocki).

 - Update intel_pstate driver documentation to reflect the current
   code and explain how it works in more detail (Rafael Wysocki).

 - Fix an issue related to CPU idleness detection on systems with
   shared cpufreq policies in the schedutil governor (Juri Lelli).

 - Fix a possible build issue in the dbx500 cpufreq driver (Arnd
   Bergmann).

 - Fix a function in the power capping framework core to return an
   error code instead of 0 when there's an error (Dan Carpenter).

 - Clean up variable definition in the hibernation core (Pushkar
   Jambhlekar)"

* tag 'pm-4.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: dbx500: add a Kconfig symbol
PM / hibernate: Declare variables as static
PowerCap: Fix an error code in powercap_register_zone()
RTC: rtc-cmos: Fix wakeup from suspend-to-idle
PM / wakeup: Fix up wakeup_source_report_event()
cpufreq: intel_pstate: Document the current behavior and user interface
cpufreq: schedutil: use now as reference when aggregating shared policy requests -
We need to initialize those variables to 0 for platforms that do not
provide ACPI parameters. Otherwise, we set sda_hold_time to random
values, breaking e.g. Galileo and IOT2000 boards.

Reported-and-tested-by: Linus Torvalds
Reported-by: Tobias Klausmann
Fixes: 9d6408433019 ("i2c: designware: don't infer timings described by ACPI from clock rate")
Signed-off-by: Jan Kiszka
Reviewed-by: Ard Biesheuvel
Acked-by: Jarkko Nikula
Signed-off-by: Wolfram Sang
Signed-off-by: Linus Torvalds -
Prior to the pstore interface refactoring, the "id" generated during
a backend pstore_write() was only retained by the internal pstore
inode tracking list. Additionally the "part" was ignored, so EFI
would encode this in the id. This corrects the misunderstandings
and correctly sets "id" during pstore_write(), and uses "part"
directly during pstore_erase().

Reported-by: Marta Lofstedt
Fixes: 76cc9580e3fb ("pstore: Replace arguments for write() API")
Fixes: a61072aae693 ("pstore: Replace arguments for erase() API")
Signed-off-by: Kees Cook
Tested-by: Marta Lofstedt -
Pull networking fixes from David Miller:
"Mostly netfilter bug fixes in here, but we have some bits elsewhere as
well.

 1) Don't do SNAT replies for non-NATed connections in IPVS, from
    Julian Anastasov.

 2) Don't delete conntrack helpers while they are still in use, from
    Liping Zhang.

 3) Fix zero padding in xtables's xt_data_to_user(), from Willem de
    Bruijn.

 4) Add proper RCU protection to nf_tables_dump_set() because we
    cannot guarantee that we hold the NFNL_SUBSYS_NFTABLES lock. From
    Liping Zhang.

 5) Initialize rcv_mss in tcp_disconnect(), from Wei Wang.

 6) smsc95xx devices can't handle IPV6 checksums fully, so don't
    advertise support for offloading them. From Nisar Sayed.

 7) Fix out-of-bounds access in __ip6_append_data(), from Eric
    Dumazet.

 8) Make atl2_probe() propagate the error code properly on failures,
    from Alexey Khoroshilov.

 9) arp_target[] in bond_check_params() is used uninitialized. This
    got changed from a global static to a local variable, which is how
    this mistake happened. Fix from Jarod Wilson.

10) Fix fallout from unnecessary NULL check removal in cls_matchall,
    from Jiri Pirko. This is definitely brown paper bag territory..."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
net: sched: cls_matchall: fix null pointer dereference
vsock: use new wait API for vsock_stream_sendmsg()
bonding: fix randomly populated arp target array
net: Make IP alignment calulations clearer.
bonding: fix accounting of active ports in 3ad
net: atheros: atl2: don't return zero on failure path in atl2_probe()
ipv6: fix out of bound writes in __ip6_append_data()
bridge: start hello_timer when enabling KERNEL_STP in br_stp_start
smsc95xx: Support only IPv4 TCP/UDP csum offload
arp: always override existing neigh entries with gratuitous ARP
arp: postpone addr_type calculation to as late as possible
arp: decompose is_garp logic into a separate function
arp: fixed error in a comment
tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0
netfilter: xtables: fix build failure from COMPAT_XT_ALIGN outside CONFIG_COMPAT
ebtables: arpreply: Add the standard target sanity check
netfilter: nf_tables: revisit chain/object refcounting from elements
netfilter: nf_tables: missing sanitization in data from userspace
netfilter: nf_tables: can't assume lock is acquired when dumping set elems
netfilter: synproxy: fix conntrackd interaction
... -
Since the head is guaranteed by the check above to be null, the call_rcu
would explode. Remove the previously logically dead code that was made
logically very much alive and kicking.

Fixes: 985538eee06f ("net/sched: remove redundant null check on head")
Signed-off-by: Jiri Pirko
Signed-off-by: David S. Miller -
As reported by Michal, vsock_stream_sendmsg() could still
sleep at vsock_stream_has_space() after prepare_to_wait():

  vsock_stream_has_space
    vmci_transport_stream_has_space
      vmci_qpair_produce_free_space
        qp_lock
          qp_acquire_queue_mutex
            mutex_lock

Just switch to the new wait API like we did for commit
d9dc8b0f8b4e ("net: fix sleeping for sk_wait_event()").

Reported-by: Michal Kubecek
Cc: Stefan Hajnoczi
Cc: Jorgen Hansen
Cc: "Michael S. Tsirkin"
Cc: Claudio Imbrenda
Signed-off-by: Cong Wang
Reviewed-by: Stefan Hajnoczi
Signed-off-by: David S. Miller -
In commit dc9c4d0fe023, the arp_target array moved from a static global
to a local variable. By the nature of static globals, the array used to
be initialized to all 0. At present, it's full of random data, which
gets interpreted as arp_target values, when none have actually been
specified. Systems end up booting with spew along these lines:

[ 32.161783] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.168475] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.175089] 8021q: adding VLAN 0 to HW filter on device lacp0
[ 32.193091] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.204892] lacp0: Setting MII monitoring interval to 100
[ 32.211071] lacp0: Removing ARP target 216.124.228.17
[ 32.216824] lacp0: Removing ARP target 218.160.255.255
[ 32.222646] lacp0: Removing ARP target 185.170.136.184
[ 32.228496] lacp0: invalid ARP target 255.255.255.255 specified for removal
[ 32.236294] lacp0: option arp_ip_target: invalid value (-255.255.255.255)
[ 32.243987] lacp0: Removing ARP target 56.125.228.17
[ 32.249625] lacp0: Removing ARP target 218.160.255.255
[ 32.255432] lacp0: Removing ARP target 15.157.233.184
[ 32.261165] lacp0: invalid ARP target 255.255.255.255 specified for removal
[ 32.268939] lacp0: option arp_ip_target: invalid value (-255.255.255.255)
[ 32.276632] lacp0: Removing ARP target 16.0.0.0
[ 32.281755] lacp0: Removing ARP target 218.160.255.255
[ 32.287567] lacp0: Removing ARP target 72.125.228.17
[ 32.293165] lacp0: Removing ARP target 218.160.255.255
[ 32.298970] lacp0: Removing ARP target 8.125.228.17
[ 32.304458] lacp0: Removing ARP target 218.160.255.255

None of these were actually specified as ARP targets, and the driver does
seem to clean up the mess okay, but it's rather noisy and confusing, leaks
values to userspace, and the 255.255.255.255 spew shows up even when debug
prints are disabled.

The fix: just zero out arp_target at init time.
While we're in here, init arp_all_targets_value in the right place.
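
The difference between the old static array and the new local one can be sketched in plain C. The array size and all function names below are hypothetical and not taken from the bonding driver. The point is only that a local array needs explicit zeroing before its entries are interpreted.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_ARP_TARGETS 16	/* hypothetical size for this sketch */

/* Static storage duration: guaranteed to start out as all zeroes. */
static uint32_t static_targets[MAX_ARP_TARGETS];

/* Count the entries that would be treated as configured (non-zero) targets. */
int count_targets(const uint32_t *targets, size_t n)
{
	int count = 0;

	for (size_t i = 0; i < n; i++)
		if (targets[i] != 0)
			count++;
	return count;
}

int count_static_targets(void)
{
	return count_targets(static_targets, MAX_ARP_TARGETS);
}

/*
 * A local array has indeterminate contents until it is initialized,
 * so clear it before anything reads it.
 */
int count_local_targets(void)
{
	uint32_t local_targets[MAX_ARP_TARGETS];

	memset(local_targets, 0, sizeof(local_targets));
	return count_targets(local_targets, MAX_ARP_TARGETS);
}
```

Without the memset() call, count_local_targets() would read indeterminate values, which corresponds to the random ARP targets in the log above.
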
Fixes: dc9c4d0fe023 ("bonding: reduce scope of some global variables")
CC: Mahesh Bandewar
CC: Jay Vosburgh
CC: Veaceslav Falico
CC: Andy Gospodarek
CC: netdev@vger.kernel.org
CC: stable@vger.kernel.org
Signed-off-by: Jarod Wilson
Acked-by: Andy Gospodarek
Signed-off-by: David S. Miller -
* pm-sleep:
PM / hibernate: Declare variables as static
RTC: rtc-cmos: Fix wakeup from suspend-to-idle
PM / wakeup: Fix up wakeup_source_report_event()

* powercap:
PowerCap: Fix an error code in powercap_register_zone() -
* acpi-button:
Revert "ACPI / button: Remove lid_init_state=method mode"

* acpi-tools:
tools/power/acpi: Add .gitignore file -
* intel_pstate:
cpufreq: intel_pstate: Document the current behavior and user interface

* pm-cpufreq:
cpufreq: dbx500: add a Kconfig symbol

* pm-cpufreq-sched:
cpufreq: schedutil: use now as reference when aggregating shared policy requests -
The assignment:

	ip_align = strict ? 2 : NET_IP_ALIGN;

in compare_pkt_ptr_alignment() trips up Coverity because we can only
get to this code when strict is true, therefore ip_align will always
be 2 regardless of NET_IP_ALIGN's value.

So just assign directly to '2' and explain the situation in the
comment above.

Reported-by: "Gustavo A. R. Silva"
Signed-off-by: David S. Miller -
As of 7bb11dc9f59d and 0622cab0341c, bond slaves in a 3ad bond are not
removed from the aggregator when they are down, and the active slave count
is NOT equal to number of ports in the aggregator, but rather the number
of ports in the aggregator that are still enabled. The sysfs spew for
bonding_show_ad_num_ports() has a comment that says "Show number of active
802.3ad ports.", but it's currently showing total number of ports, both
active and inactive. Remedy it by using the same logic introduced in
0622cab0341c in __bond_3ad_get_active_agg_info(), so sysfs, procfs and
netlink all report the number of active ports. Note that this means that
IFLA_BOND_AD_INFO_NUM_PORTS really means NUM_ACTIVE_PORTS instead of
NUM_PORTS, and thus perhaps should be renamed for clarity.

Lightly tested on a dual i40e lacp bond, simulating link downs with an ip
link set dev down, was able to produce the state where I could
see both in the same aggregator, but a number of ports count of 1.

MII Status: up
Active Aggregator Info:
Aggregator ID: 1
Number of ports: 2
CC: Veaceslav Falico
CC: Andy Gospodarek
CC: netdev@vger.kernel.org
Signed-off-by: Jarod Wilson
Signed-off-by: David S. Miller -
If dma mask checks fail in atl2_probe(), it breaks off initialization,
deallocates all resources, but returns zero.

The patch adds a proper error code return value and
makes the error code setup unified.

Found by Linux Driver Verification project (linuxtesting.org).

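The bug class described above, a failure path that cleans up but still reports success, can be sketched as follows. The function name, the parameter and the chosen error value are hypothetical and do not come from the atl2 driver. The sketch only shows a single err variable that is set before every jump to the cleanup label.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical probe routine: dma_mask_ok stands in for the DMA mask checks. */
int example_probe(bool dma_mask_ok)
{
	int err;
	void *res = malloc(64);	/* stand-in for resources acquired early */

	if (!res)
		return -ENOMEM;

	if (!dma_mask_ok) {
		err = -EIO;	/* set the error before unwinding */
		goto err_free;
	}

	/* ... the rest of initialization would go here ... */
	free(res);		/* the sketch keeps nothing alive after probing */
	return 0;

err_free:
	free(res);
	return err;		/* never 0 on the failure path */
}
```

A caller can then rely on a non-zero return value whenever initialization was abandoned, for example example_probe(false) returning -EIO.
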
Signed-off-by: Alexey Khoroshilov
Signed-off-by: David S. Miller
22 May, 2017
2 commits
-
Andrey Konovalov and idaifish@gmail.com reported crashes caused by
one skb shared_info being overwritten from __ip6_append_data().

Andrey's program leads to the following state:

copy -4200 datalen 2000 fraglen 2040
maxfraglen 2040 alloclen 2048 transhdrlen 0 offset 0 fraggap 6200

The skb_copy_and_csum_bits(skb_prev, maxfraglen, data + transhdrlen,
fraggap, 0); is overwriting skb->head and skb_shared_info.

Since we apparently detect this rare condition too late, move the
code earlier to even avoid allocating skb and risking crashes.

Once again, many thanks to Andrey and syzkaller team.

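The idea of detecting the condition before anything is allocated can be sketched with the numbers reported above. The relation copy = datalen - transhdrlen - fraggap is an assumption that happens to match the reported values (2000 - 0 - 6200 = -4200). The function names and the choice of -EINVAL are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <errno.h>

/* Assumed relation between the reported values. */
int compute_copy(int datalen, int transhdrlen, int fraggap)
{
	return datalen - transhdrlen - fraggap;
}

/*
 * Validate the length before any buffer is allocated.
 * Returns 0 and stores the length on success, -EINVAL when it is negative.
 */
int check_copy_before_alloc(int datalen, int transhdrlen, int fraggap,
			    int *copy_out)
{
	int copy = compute_copy(datalen, transhdrlen, fraggap);

	if (copy < 0)
		return -EINVAL;	/* bail out early: nothing allocated yet */

	*copy_out = copy;
	return 0;
}
```

With the reported state, check_copy_before_alloc(2000, 0, 6200, &copy) fails before any buffer exists, so there is nothing that a later copy could overwrite.
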
Signed-off-by: Eric Dumazet
Reported-by: Andrey Konovalov
Tested-by: Andrey Konovalov
Reported-by:
Signed-off-by: David S. Miller