05 Dec, 2009
1 commit
-
Change totalram_pages when a single page is added to or removed from the
ballooned list. This avoids totalram_pages being set erroneously to
max_pfn at boot.

Signed-off-by: Gianluca Guida
Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel
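As a rough illustration of the accounting described in the balloon commit above, the following userspace sketch adjusts a counter one page at a time. The struct and function names are hypothetical stand-ins chosen for this example and are not the driver's actual code.

```c
#include <assert.h>

/* Hypothetical model: the kernel keeps totalram_pages as a global counter. */
struct balloon_model {
    unsigned long totalram_pages; /* pages the guest may currently use */
    unsigned long ballooned;      /* pages currently held by the balloon */
};

/* A page handed to the balloon is no longer usable RAM. */
static void balloon_add_page(struct balloon_model *m)
{
    assert(m->totalram_pages > 0);
    m->ballooned++;
    m->totalram_pages--;
}

/* A page taken back from the balloon becomes usable RAM again. */
static void balloon_remove_page(struct balloon_model *m)
{
    assert(m->ballooned > 0);
    m->ballooned--;
    m->totalram_pages++;
}
```

Updating the counter per page keeps it consistent at every step, instead of relying on a single bulk assignment that could start from the wrong base value.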
04 Dec, 2009
16 commits
-
I have observed cases where the implicit stop_machine_destroy() done by
stop_machine() hangs while destroying the workqueues, specifically in
kthread_stop(). This seems to be because timer ticks are not restarted
until after stop_machine() returns.

Fortunately stop_machine provides a facility to pre-create/post-destroy
the workqueues so use this to ensure that workqueues are only destroyed
after everything is really up and running again.

I only actually observed this failure with 2.6.30. It seems that newer
kernels are somehow more robust against doing kthread_stop() without timer
interrupts (I tried some backports of some likely looking candidates but
did not track down the commit which added this robustness). However this
change seems like a reasonable belt&braces thing to do.

Signed-off-by: Ian Campbell
Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel -
The existing error handling has a few issues:
- If freeze_processes() fails it exits with shutting_down = SHUTDOWN_SUSPEND.
- If dpm_suspend_noirq() fails it exits without resuming xenbus.
- If stop_machine() fails it exits without resuming xenbus or calling
  dpm_resume_end().
- xs_suspend()/xs_resume() and dpm_suspend_noirq()/dpm_resume_noirq() were not
  nested in the obvious way.

Fix by ensuring each failure case goto's the correct label. Treat a failure of
stop_machine() as a cancelled suspend in order to follow the correct resume
path.

Signed-off-by: Ian Campbell
Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel -
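The goto-based unwinding that the suspend error-handling commit above relies on can be sketched in userspace C. This is a simplified model, not the kernel code: the stage names, their order, and the label names are assumptions loosely based on the functions named in the commit message, and each stage is reduced to a bit in a mask.

```c
#include <assert.h>

/* Hypothetical stages, loosely named after the functions in the message. */
enum stage { FREEZE, DPM_START, XS, DPM_NOIRQ, STOP_MACHINE, NR_STAGES };

struct sim {
    int fail_at;         /* stage that should fail, or -1 for none */
    unsigned int active; /* bitmask of stages entered and not yet undone */
};

/* Enter a stage; fails without side effects if this stage is set to fail. */
static int enter(struct sim *s, enum stage st)
{
    if (s->fail_at == (int)st)
        return -1;
    s->active |= 1u << st;
    return 0;
}

/* Undo a stage; asserts that we only undo what was actually done. */
static void leave(struct sim *s, enum stage st)
{
    assert(s->active & (1u << st));
    s->active &= ~(1u << st);
}

static int do_suspend_model(struct sim *s)
{
    int err;

    err = enter(s, FREEZE);
    if (err)
        goto out;
    err = enter(s, DPM_START);
    if (err)
        goto out_thaw;
    err = enter(s, XS);
    if (err)
        goto out_dpm_end;
    err = enter(s, DPM_NOIRQ);
    if (err)
        goto out_xs_resume;

    /* A failure here is treated like a cancelled suspend: the same
     * resume path below runs whether or not this stage succeeded. */
    err = enter(s, STOP_MACHINE);
    if (!err)
        leave(s, STOP_MACHINE);

    leave(s, DPM_NOIRQ);
out_xs_resume:
    leave(s, XS);
out_dpm_end:
    leave(s, DPM_START);
out_thaw:
    leave(s, FREEZE);
out:
    return err;
}
```

Because the labels appear in the reverse order of the stages, jumping to the label that matches the failed stage undoes exactly the stages that completed, which is the nesting property the commit message calls out.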
On resume irq_info[*].evtchn is reset to 0 since event channel mappings
are not preserved over suspend/resume. The other contents of irq_info
are preserved to allow rebind_evtchn_irq() to function.

However when a device resumes it will try to unbind from the
previous IRQ (e.g. blkfront goes blkfront_resume() -> blkif_free() ->
unbind_from_irqhandler() -> unbind_from_irq()). This will fail due to the
check for VALID_EVTCHN in unbind_from_irq() and the IRQ is leaked. The
device will then continue to resume and allocate a new IRQ, eventually
leading to find_unbound_irq() panic()ing.

Fix this by changing unbind_from_irq() to handle teardown of interrupts
which have type!=IRQT_UNBOUND but are not currently bound to a specific
event channel.

Signed-off-by: Ian Campbell
Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel -
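To make the IRQ leak from the unbind_from_irq() commit above concrete, here is a small userspace model of the old and fixed checks. The struct layout and the two function names are hypothetical; only IRQT_UNBOUND and VALID_EVTCHN are names taken from the commit message, and their definitions here are assumptions for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified, hypothetical model of one irq_info entry. */
enum irq_type_model { IRQT_UNBOUND = 0, IRQT_EVTCHN };

struct irq_info_model {
    enum irq_type_model type; /* what the IRQ slot is used for */
    unsigned int evtchn;      /* 0 means "no event channel bound" */
};

#define VALID_EVTCHN(chn) ((chn) != 0)

/* Old behaviour: without a valid channel nothing is torn down,
 * so after resume (evtchn == 0) the slot stays allocated. */
static bool unbind_old(struct irq_info_model *info)
{
    if (!VALID_EVTCHN(info->evtchn))
        return false;
    info->evtchn = 0;
    info->type = IRQT_UNBOUND;
    return true;
}

/* Fixed behaviour: close the channel if there is one, then release
 * the slot whenever it is in use, bound to a channel or not. */
static bool unbind_fixed(struct irq_info_model *info)
{
    assert(info);
    if (VALID_EVTCHN(info->evtchn))
        info->evtchn = 0;
    if (info->type == IRQT_UNBOUND)
        return false;
    info->type = IRQT_UNBOUND;
    return true;
}
```

With an entry in the post-resume state (type set, evtchn 0), unbind_old() leaves the slot occupied while unbind_fixed() frees it, which is the difference that stops slots from running out.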
tick_resume() is never called on secondary processors. Presumably this
is because they are offlined for suspend on native and so this is
normally taken care of in the CPU onlining path. Under Xen we keep all
CPUs online over a suspend.

This patch papers over the issue for me but I will investigate a more
generic, less hacky, way of doing the same.

tick_suspend is also only called on the boot CPU which I presume should
be fixed too.

Signed-off-by: Ian Campbell
Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel
Cc: Thomas Gleixner -
If Xen wants to return to a 32b usermode with sysret it must use the
right form. When using VGCF_in_syscall to trigger this, it looks at
the code segment and does a 32b sysret if it is FLAT_USER_CS32.
However, this is different from __USER32_CS, so it fails to return
properly if we use the normal Linux segment.

So avoid the whole mess by dropping VGCF_in_syscall and simply use
plain iret to return to usermode.

Signed-off-by: Jeremy Fitzhardinge
Acked-by: Jan Beulich
Cc: Stable Kernel -
dpm_resume_noirq() takes a mutex, so it can't be called from a no-interrupt
context. Don't call it from within the stop-machine function, but just
afterwards, since we're resuming anyway, regardless of what happened.

Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel -
printk timestamping uses sched_clock, which in turn relies on runstate
info under Xen. So make sure we set it up before any printks can
be called.

Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel -
The commit "xen: re-register runstate area earlier on resume" caused us
to never try to set up the runstate area for secondary CPUs. Ensure that
we do this...

Signed-off-by: Ian Campbell
Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel -
Otherwise the timer is disabled by dpm_suspend_noirq() which in turn prevents
correct operation of stop_machine on multi-processor systems and breaks
suspend.

Signed-off-by: Ian Campbell
Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel -
pvops kernels >= 2.6.30 can currently only be saved and restored once. The
second attempt to save results in:

ERROR Internal error: Frame# in pfn-to-mfn frame list is not in pseudophys
ERROR Internal error: entry 0: p2m_frame_list[0] is 0xf2c2c2c2, max 0x120000
ERROR Internal error: Failed to map/save the p2m frame list

I finally narrowed it down to:

    commit cdaead6b4e657f960d6d6f9f380e7dfeedc6a09b
    Author: Jeremy Fitzhardinge
    Date: Fri Feb 27 15:34:59 2009 -0800

    xen: split construction of p2m mfn tables from registration

    Build the p2m_mfn_list_list early with the rest of the p2m table, but
    register it later when the real shared_info structure is in place.

    Signed-off-by: Jeremy Fitzhardinge

The unforeseen side-effect of this change was to cause the mfn list list to not
be rebuilt on resume. Prior to this change it would have been rebuilt via
xen_post_suspend() -> xen_setup_shared_info() -> xen_setup_mfn_list_list().

Fix by explicitly calling xen_build_mfn_list_list() from xen_post_suspend().
Signed-off-by: Ian Campbell
Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel -
Even if have_vcpu_info_placement is not set, we still need to set up
the runstate area on each resumed vcpu.

Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel -
This is necessary to ensure the runstate area is available to
xen_sched_clock before any calls to printk which will require it in
order to provide a timestamp.

I chose to pull the xen_setup_runstate_info out of xen_time_init into
the caller in order to maintain parity with calling
xen_setup_runstate_info separately from calling xen_time_resume.

Signed-off-by: Ian Campbell
Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel -
Increases the device timeout from 10s to 5 minutes, giving the user a
visual indication during that time in case there are problems. The patch
is a backport of changesets 144 and 150 in the Xenbits tree.

Cc: Jeremy Fitzhardinge
Signed-off-by: Paolo Bonzini
Signed-off-by: Jeremy Fitzhardinge -
When printing a warning about a timed-out device, print the
current state of both ends of the device connection (i.e., backend as
well as frontend). This backports half of changeset 146 from the
Xenbits tree.

Cc: Jeremy Fitzhardinge
Signed-off-by: Paolo Bonzini
Signed-off-by: Jeremy Fitzhardinge -
The logic of is_disconnected_device/exists_disconnected_device is wrong
in that they are used to test whether a device is trying to connect (i.e.
connecting). For this reason the patch fixes them to not consider a
Closing or Closed device to be connecting. At the same time the patch
also renames the functions according to what they really do; you could
say a closed device is "disconnected" (the old name), but not "connecting"
(the new name).

This patch is a backport of changeset 909 from the Xenbits tree.
Cc: Jeremy Fitzhardinge
Signed-off-by: Paolo Bonzini
Signed-off-by: Jeremy Fitzhardinge -
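The distinction drawn in the is_disconnected_device commit above, between "not connected" and "still connecting", can be shown with a small predicate pair. This is a sketch under assumptions: the enum is a local copy that follows the usual xenbus state ordering, the function names are hypothetical, and the old test is assumed to have been a plain "not Connected" comparison.

```c
#include <assert.h>
#include <stdbool.h>

/* Local illustration of the xenbus state ordering (assumed values). */
enum xenbus_state_model {
    StateUnknown = 0,
    StateInitialising = 1,
    StateInitWait = 2,
    StateInitialised = 3,
    StateConnected = 4,
    StateClosing = 5,
    StateClosed = 6,
};

/* Old-style test: everything except Connected counts,
 * so Closing and Closed devices are reported as well. */
static bool is_disconnected_model(enum xenbus_state_model s)
{
    return s != StateConnected;
}

/* New-style test: only states that precede Connected count. */
static bool is_connecting_model(enum xenbus_state_model s)
{
    assert(s <= StateClosed);
    return s < StateConnected;
}
```

A caller that waits for devices to finish connecting would block on a Closed device with the first predicate but not with the second.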
They don't need to be global, and may cause linker clashes.
Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel
04 Nov, 2009
2 commits
-
A Xen guest never needs to know about extended topology, and knowing
would just confuse it.

This patch just zeros ebx in leaf 0xb which indicates no topology info,
preventing a crash under Xen on cpus which support this leaf.

Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel -
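The masking described in the extended-topology commit above amounts to filtering one register of one CPUID leaf. The following userspace sketch shows that idea; the struct and function names are hypothetical, and it operates on values passed in rather than executing the CPUID instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the four CPUID result registers. */
struct cpuid_regs_model {
    uint32_t eax, ebx, ecx, edx;
};

/* Filter a CPUID result before the guest sees it. */
static void filter_cpuid_model(uint32_t leaf, struct cpuid_regs_model *r)
{
    uint32_t maskebx = ~(uint32_t)0; /* by default, pass ebx through */

    assert(r);
    switch (leaf) {
    case 0xb:
        maskebx = 0;                 /* ebx == 0 signals no topology info */
        break;
    default:
        break;
    }
    r->ebx &= maskebx;
}
```

Only ebx of leaf 0xb is cleared; every other leaf and register passes through unchanged.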
We never want to rely on the hvc workqueue to emit output, because the
most interesting output is when the kernel is broken. This will
improve oops/crash/console messages for better debugging.

Instead, we force-poll until all output is emitted.
Signed-off-by: Jeremy Fitzhardinge
Cc: Stable Kernel
28 Oct, 2009
1 commit
-
xen_setup_stackprotector() ends up trying to set page protections,
so we need to have pv_mmu_ops set up before trying to do so.
Failing to do so causes an early boot crash.

[ Impact: Fix early crash under Xen. ]
Signed-off-by: Jeremy Fitzhardinge
23 Oct, 2009
2 commits
-
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus:
move virtrng_remove to .devexit.text
move virtballoon_remove to .devexit.text
virtio_blk: Revert serial number support
virtio: let header files include virtio_ids.h
virtio_blk: revert QUEUE_FLAG_VIRT addition -
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
niu: VLAN_ETH_HLEN should be used to make sure that the whole MAC header was copied to the head buffer in the Vlan packets case
KS8851: Fix ks8851_set_rx_mode() for IFF_MULTICAST
KS8851: Fix MAC address write order
KS8851: Add soft reset at probe time
net: fix section mismatch in fec.c
net: Fix struct inet_timewait_sock bitfield annotation
tcp: Try to catch MSG_PEEK bug
net: Fix IP_MULTICAST_IF
bluetooth: static lock key fix
bluetooth: scheduling while atomic bug fix
tcp: fix TCP_DEFER_ACCEPT retrans calculation
tcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPT
tcp: accept socket after TCP_DEFER_ACCEPT period
Revert "tcp: fix tcp_defer_accept to consider the timeout"
AF_UNIX: Fix deadlock on connecting to shutdown socket
ethoc: clear only pending irqs
ethoc: inline regs access
vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n
virtio_net: use dev_kfree_skb_any() in free_old_xmit_skbs()
be2net: fix support for PCI hot plug
...
22 Oct, 2009
16 commits
-
The function virtrng_remove is used only wrapped by __devexit_p so define
it using __devexit.

Signed-off-by: Uwe Kleine-König
Acked-by: Sam Ravnborg
Cc: Rusty Russell
Cc: Michael S. Tsirkin
Acked-by: Christian Borntraeger
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Rusty Russell -
The function virtballoon_remove is used only wrapped by __devexit_p so
define it using __devexit.

Signed-off-by: Uwe Kleine-König
Acked-by: Sam Ravnborg
Acked-by: Michael S. Tsirkin
Signed-off-by: Rusty Russell -
This reverts "Add serial number support for virtio_blk, V4a".

Turns out that virtio_pci, lguest and s/390 all have an 8 bit limit
on virtio config space, so no one could ever use this.

This is coming back later in a cleaner form.
Signed-off-by: Rusty Russell
Cc: john cooper
Cc: Jens Axboe -
Rusty,

commit 3ca4f5ca73057a617f9444a91022d7127041970a
virtio: add virtio IDs file
moved all device IDs into a single file. While the change itself is
a very good one, it can break userspace applications. For example
if a userspace tool wanted to get the ID of virtio_net it used to
include virtio_net.h. This no longer works, since virtio_net.h
does not include virtio_ids.h.

This patch moves all "#include <linux/virtio_ids.h>" from the C
files into the header files, making the header files compatible with
the old ones.

In addition, this patch exports virtio_ids.h to userspace.
CC: Fernando Luis Vazquez Cao
Signed-off-by: Christian Borntraeger
Signed-off-by: Rusty Russell -
It seems like the addition of QUEUE_FLAG_VIRT causes major performance
regressions for Fedora users:

https://bugzilla.redhat.com/show_bug.cgi?id=509383
https://bugzilla.redhat.com/show_bug.cgi?id=505695

While I can't reproduce those extreme regressions myself I think the flag
is wrong.

Rationale:

QUEUE_FLAG_VIRT expands to QUEUE_FLAG_NONROT which causes the queue to be
unplugged immediately. This is not a good behaviour for at least
qemu and kvm where we do have significant overhead for every
I/O operation. Even with all the latest speedups (native AIO,
MSI support, zero copy) we can only get native speed for up to 128kb
I/O requests; we are already down to 66% of native performance for 4kb
requests even on my laptop running the Intel X25-M SSD for which the
QUEUE_FLAG_NONROT was designed.

If we ever get virtio-blk overhead low enough that this flag makes
sense it should only be set based on a feature flag set by the host.

Signed-off-by: Christoph Hellwig
Signed-off-by: Rusty Russell -
…ied to the head buffer in the Vlan packets case
Signed-off-by: Joyce Yu <joyce.yu@sun.com>
Signed-off-by: David S. Miller <davem@davemloft.net> -
* 'for-linus' of git://git.infradead.org/users/eparis/notify:
dnotify: ignore FS_EVENT_ON_CHILD
inotify: fix coalesce duplicate events into a single event in special case
inotify: deprecate the inotify kernel interface
fsnotify: do not set group for a mark before it is on the i_list -
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: hp_sdc_rtc - fix test in hp_sdc_rtc_read_rt()
Input: atkbd - consolidate force release quirks for volume keys
Input: logips2pp - model 73 is actually TrackMan FX
Input: i8042 - add Sony Vaio VGN-FZ240E to the nomux list
Input: fix locking issue in /proc/bus/input/ handlers
Input: atkbd - postpone restoring LED/repeat rate at resume
Input: atkbd - restore resetting LED state at startup
Input: i8042 - make pnp_data_busted variable boolean instead of int
Input: synaptics - add another Protege M300 to rate blacklist -
* 'kvm-updates/2.6.32' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Prevent kvm_init from corrupting debugfs structures
KVM: MMU: fix pointer cast
KVM: use proper hrtimer function to retrieve expiration time -
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
dm snapshot: allow chunk size to be less than page size
dm snapshot: use unsigned integer chunk size
dm snapshot: lock snapshot while supplying status
dm exception store: fix failed set_chunk_size error path
dm snapshot: require non zero chunk size by end of ctr
dm: dec_pending needs locking to save error value
dm: add missing del_gendisk to alloc_dev error path
dm log: userspace fix incorrect luid cast in userspace_ctr
dm snapshot: free exception store on init failure
dm snapshot: sort by chunk size to fix race -
Increase TEST_SUSPEND_SECONDS to 10 so the warning in
suspend_test_finish() doesn't annoy the users of slower systems so much.

Also, make the warning print the suspend-resume cycle time, so that we
know why the warning actually triggered.

Patch prepared during the hacking session at the Kernel Summit in Tokyo.
Signed-off-by: Rafael J. Wysocki
Signed-off-by: Linus Torvalds -
This fixes a compile bug introduced in
6ef297f (ARM: 5720/1: Move MMCI header to amba include dir)

That commit moved arch/arm/include/asm/mach/mmc.h to
include/linux/amba/mmci.h. Just removing the include was enough.

Signed-off-by: Uwe Kleine-König
Acked-by: Linus Walleij
Acked-by: Nicolas Ferre
Acked-by: Bill Gatliff
Cc: Catalin Marinas
Cc: Russell King
Cc: Pierre Ossman
Cc: linux-arm-kernel@lists.infradead.org
Cc: Andrew Morton
Signed-off-by: Linus Torvalds -
* 'sh/for-2.6.32' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
sh: Kill off stray HAVE_FTRACE_SYSCALLS reference.
sh: Remove BKL from landisk gio.
sh: disabled cache handling fix.
sh: Fix up single page flushing to use PAGE_SIZE. -
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: aesni-intel - Fix irq_fpu_usable usage
crypto: padlock-sha - Fix stack alignment -
Fix a (small) memory leak in one of the error paths of the NFS mount
options parsing code.

Regression introduced in 2.6.30 by commit a67d18f (NFS: load the
rpc/rdma transport module automatically).

Reported-by: Yinghai Lu
Reported-by: Pekka Enberg
Signed-off-by: Ingo Molnar
Signed-off-by: Trond Myklebust
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds -
This patch fixes a null pointer exception in pipe_rdwr_open() which
generates the stack trace:

> Unable to handle kernel NULL pointer dereference at 0000000000000028 RIP:
> [] pipe_rdwr_open+0x35/0x70
> [] __dentry_open+0x13c/0x230
> [] do_filp_open+0x2d/0x40
> [] do_sys_open+0x5a/0x100
> [] sysenter_do_call+0x1b/0x67

The failure mode is triggered by an attempt to open an anonymous
pipe via /proc/pid/fd/* as exemplified by this script:

=============================================================
while : ; do
    { echo y ; sleep 1 ; } | { while read ; do echo z$REPLY; done ; } &
    PID=$!
    OUT=$(ps -efl | grep 'sleep 1' | grep -v grep |
        { read PID REST ; echo $PID; } )
    OUT="${OUT%% *}"
    DELAY=$((RANDOM * 1000 / 32768))
    usleep $((DELAY * 1000 + RANDOM % 1000 ))
    echo n > /proc/$OUT/fd/1 # Trigger defect
done
=============================================================

Note that the failure window is quite small and I could only
reliably reproduce the defect by inserting a small delay
in pipe_rdwr_open(). For example:

static int
pipe_rdwr_open(struct inode *inode, struct file *filp)
{
        msleep(100);
        mutex_lock(&inode->i_mutex);

Although the defect was observed in pipe_rdwr_open(), I think it
makes sense to replicate the change through all the pipe_*_open()
functions.

The core of the change is to verify that inode->i_pipe has not
been released before attempting to manipulate it. If inode->i_pipe
is no longer present, return ENOENT to indicate so.

The comment about potentially using atomic_t for i_pipe->readers
and i_pipe->writers has also been removed because it is no longer
relevant in this context. The inode->i_mutex lock must be used so
that inode->i_pipe can be dealt with correctly.

Signed-off-by: Earl Chew
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds
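The core check described in the pipe commit above can be sketched as a userspace model. The struct and function names are hypothetical, the real locking is reduced to comments, and the file-mode distinction between readers and writers is left out, so this shows only the shape of the check and not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the pipe and inode structures. */
struct pipe_model {
    int readers;
    int writers;
};

struct inode_model {
    struct pipe_model *i_pipe; /* NULL once the pipe has been released */
};

/* Open for read and write: only touch the pipe if it still exists. */
static int pipe_rdwr_open_model(struct inode_model *inode)
{
    int ret = -ENOENT;

    assert(inode);
    /* the kernel code takes inode->i_mutex here */
    if (inode->i_pipe) {
        ret = 0;
        inode->i_pipe->readers++;
        inode->i_pipe->writers++;
    }
    /* ...and drops inode->i_mutex here */
    return ret;
}
```

Starting from -ENOENT and switching to 0 only when the pointer is present keeps the release case on the default path, so a late opener gets an error instead of dereferencing NULL.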
21 Oct, 2009
2 commits
-
In ks8851_set_rx_mode() the case handling IFF_MULTICAST was also setting
the RXCR1_AE bit by accident. This meant that all unicast frames were
being accepted by the device. Remove RXCR1_AE from this case.

Note, RXCR1_AE was also masking a problem with setting the MAC address
properly, so needs to be applied after fixing the MAC write order.

Fixes a bug reported by Doong, Ping of Micrel. This version of the
patch avoids setting RXCR1_ME for all cases.

Signed-off-by: Ben Dooks
Signed-off-by: David S. Miller -
The MAC address register was being written in the wrong order, so add
a new address macro to convert mac-address byte to register address and
a ks8851_wrreg8() function to write each byte without having to worry
about any difficult byte swapping.

Fixes a bug reported by Doong, Ping of Micrel.
Signed-off-by: Ben Dooks
Signed-off-by: David S. Miller
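The approach in the MAC write-order commit above, mapping each address byte to its own register and writing bytes one at a time, can be illustrated with a userspace model. The register base, the reversed byte layout, and all names below are assumptions made for this sketch; the driver's actual macro and register map may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register map: six MAC byte registers starting at 0x10,
 * with byte 0 of the address stored at the highest register. */
#define MODEL_MAR_BASE 0x10u
#define MODEL_MAR(i)   (MODEL_MAR_BASE + 5u - (i))

/* Stand-in for the device's register file. */
struct regs_model {
    uint8_t reg[256];
};

/* Write one 8-bit register (models a byte-wide register write). */
static void wrreg8_model(struct regs_model *r, unsigned int addr, uint8_t val)
{
    assert(addr < sizeof(r->reg));
    r->reg[addr] = val;
}

/* Write the MAC address byte by byte; the macro hides the ordering. */
static void write_mac_model(struct regs_model *r, const uint8_t mac[6])
{
    for (unsigned int i = 0; i < 6; i++)
        wrreg8_model(r, MODEL_MAR(i), mac[i]);
}
```

Keeping the byte-to-register mapping in one macro means the loop never has to assemble or swap multi-byte words, which is the kind of mistake the commit set out to remove.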