11 Oct, 2008
13 commits
-
Conflicts:
arch/x86/kernel/pci-gart_64.c
include/asm-x86/dma-mapping.h -
Conflicts:
arch/x86/mm/init_64.c -
Jeremy Fitzhardinge wrote:
> I'd noticed that current tip/master hasn't been booting under Xen, and I
> just got around to bisecting it down to this change.
>
> commit 065ae73c5462d42e9761afb76f2b52965ff45bd6
> Author: Suresh Siddha
>
> x86, cpa: make the kernel physical mapping initialization a two pass sequence
>
> This patch is causing Xen to fail various pagetable updates because it
> ends up remapping pagetables to RW, which Xen explicitly prohibits (as
> that would allow guests to make arbitrary changes to pagetables, rather
> than have them mediated by the hypervisor).Instead of making init a two pass sequence, to satisfy the Intel's TLB
Application note (developer.intel.com/design/processor/applnots/317080.pdf
Section 6 page 26), we preserve the original page permissions
when fragmenting the large mappings and don't touch the existing memory
mapping (which satisfies Xen's requirements).Only open issue is: on a native linux kernel, we will go back to mapping
the first 0-1GB kernel identity mapping as executable (because of the
static mapping setup in head_64.S). We can fix this in a different
patch if needed.Signed-off-by: Suresh Siddha
Acked-by: Jeremy Fitzhardinge
Signed-off-by: Ingo Molnar -
clean up recently added code to be more consistent with other x86 code.
Signed-off-by: Ingo Molnar
-
Fix _end alignment check - can trigger a crash if _end happens to be
on a page boundary.Signed-off-by: Ingo Molnar
-
Track the memtype for RAM pages in page struct instead of using the
memtype list. This avoids the explosion in the number of entries in
memtype list (of the order of 20,000 with AGP) and makes the PAT
tracking simpler.We are using PG_arch_1 bit in page->flags.
We still use the memtype list for non RAM pages.
Signed-off-by: Suresh Siddha
Signed-off-by: Venkatesh Pallipadi
Signed-off-by: Ingo Molnar -
Do a global flush tlb after splitting the large page and before we do the
actual change page attribute in the PTE.With out this, we violate the TLB application note, which says
"The TLBs may contain both ordinary and large-page translations for
a 4-KByte range of linear addresses. This may occur if software
modifies the paging structures so that the page size used for the
address range changes. If the two translations differ with respect
to page frame or attributes (e.g., permissions), processor behavior
is undefined and may be implementation-specific."And also serialize cpa() (for !DEBUG_PAGEALLOC which uses large identity
mappings) using cpa_lock. So that we don't allow any other cpu, with stale
large tlb entries change the page attribute in parallel to some other cpu
splitting a large page entry along with changing the attribute.Signed-off-by: Suresh Siddha
Cc: Suresh Siddha
Cc: arjan@linux.intel.com
Cc: venkatesh.pallipadi@intel.com
Cc: jeremy@goop.org
Signed-off-by: Ingo Molnar -
Interrupt context no longer splits large page in cpa(). So we can do away
with cpa memory pool code.Signed-off-by: Suresh Siddha
Cc: Suresh Siddha
Cc: arjan@linux.intel.com
Cc: venkatesh.pallipadi@intel.com
Cc: jeremy@goop.org
Signed-off-by: Ingo Molnar -
No alias checking needed for setting present/not-present mapping. Otherwise,
we may need to break large pages for 64-bit kernel text mappings (this adds to
complexity if we want to do this from atomic context especially, for ex:
with CONFIG_DEBUG_PAGEALLOC). Let's keep it simple!Signed-off-by: Suresh Siddha
Cc: Suresh Siddha
Cc: arjan@linux.intel.com
Cc: venkatesh.pallipadi@intel.com
Cc: jeremy@goop.org
Signed-off-by: Ingo Molnar -
Don't use large pages for kernel identity mapping with DEBUG_PAGEALLOC.
This will remove the need to split the large page for the
allocated kernel page in the interrupt context.This will simplify cpa code(as we don't do the split any more from the
interrupt context). cpa code simplication in the subsequent patches.Signed-off-by: Suresh Siddha
Cc: Suresh Siddha
Cc: arjan@linux.intel.com
Cc: venkatesh.pallipadi@intel.com
Cc: jeremy@goop.org
Signed-off-by: Ingo Molnar -
In the first pass, kernel physical mapping will be setup using large or
small pages but uses the same PTE attributes as that of the early
PTE attributes setup by early boot code in head_[32|64].SAfter flushing TLB's, we go through the second pass, which setups the
direct mapped PTE's with the appropriate attributes (like NX, GLOBAL etc)
which are runtime detectable.This two pass mechanism conforms to the TLB app note which says:
"Software should not write to a paging-structure entry in a way that would
change, for any linear address, both the page size and either the page frame
or attributes."Signed-off-by: Suresh Siddha
Cc: Suresh Siddha
Cc: arjan@linux.intel.com
Cc: venkatesh.pallipadi@intel.com
Cc: jeremy@goop.org
Signed-off-by: Ingo Molnar -
remove USER from the PTE/PDE attributes for the very early identity
mapping. We overwrite these mappings with KERNEL attribute later
in the boot. Just being paranoid here as there is no need for USER bit
to be set.If this breaks something(don't know the history), then we can simply drop
this change.Signed-off-by: Suresh Siddha
Cc: Suresh Siddha
Cc: arjan@linux.intel.com
Cc: venkatesh.pallipadi@intel.com
Cc: jeremy@goop.org
Signed-off-by: Ingo Molnar -
Signed-off-by: Suresh Siddha
Cc: Suresh Siddha
Cc: arjan@linux.intel.com
Cc: venkatesh.pallipadi@intel.com
Cc: jeremy@goop.org
Signed-off-by: Ingo Molnar
10 Oct, 2008
9 commits
-
This merges phase 1 of the x86 tree, which is a collection of branches:
x86/alternatives, x86/cleanups, x86/commandline, x86/crashdump,
x86/debug, x86/defconfig, x86/doc, x86/exports, x86/fpu, x86/gart,
x86/idle, x86/mm, x86/mtrr, x86/nmi-watchdog, x86/oprofile,
x86/paravirt, x86/reboot, x86/sparse-fixes, x86/tsc, x86/urgent and
x86/vmallocand as Ingo says: "these are the easiest, purely independent x86 topics
with no conflicts, in one nice Octopus merge".* 'x86-v28-for-linus-phase1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (147 commits)
x86: mtrr_cleanup: treat WRPROT as UNCACHEABLE
x86: mtrr_cleanup: first 1M may be covered in var mtrrs
x86: mtrr_cleanup: print out correct type v2
x86: trivial printk fix in efi.c
x86, debug: mtrr_cleanup print out var mtrr before change it
x86: mtrr_cleanup try gran_size to less than 1M, v3
x86: mtrr_cleanup try gran_size to less than 1M, cleanup
x86: change MTRR_SANITIZER to def_bool y
x86, debug printouts: IOMMU setup failures should not be KERN_ERR
x86: export set_memory_ro and set_memory_rw
x86: mtrr_cleanup try gran_size to less than 1M
x86: mtrr_cleanup prepare to make gran_size to less 1M
x86: mtrr_cleanup safe to get more spare regs now
x86_64: be less annoying on boot, v2
x86: mtrr_cleanup hole size should be less than half of chunk_size, v2
x86: add mtrr_cleanup_debug command line
x86: mtrr_cleanup optimization, v2
x86: don't need to go to chunksize to 4G
x86_64: be less annoying on boot
x86, olpc: fix endian bug in openfirmware workaround
... -
We already did that a long time ago for pnp_system_init, but
pnpacpi_init and pnpbios_init remained as subsys_initcalls, and get
linked into the kernel before the arch-specific routines that finalize
the PCI resources (pci_subsys_init).This means that the PnP routines would either register their resources
before the PCI layer could, or would be unable to check whether a PCI
resource had already been registered. Both are problematic.I wanted to do this before 2.6.27, but every time we change something
like this, something breaks. That said, _every_ single time we trust
some firmware (like PnP tables) more than we trust the hardware itself
(like PCI probing), the problems have been worse.Signed-off-by: Linus Torvalds
-
* 'upstream-2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
ata_piix: IDE Mode SATA patch for Intel Ibex Peak DeviceIDs
libata-eh: clear UNIT ATTENTION after reset
ata_piix: add Hercules EC-900 mini-notebook to ich_laptop short cable list
libata: reorder ata_device to remove 8 bytes of padding on 64 bits
[libata] pata_bf54x: Add proper PM operation
pata_sil680: convert CONFIG_PPC_MERGE to CONFIG_PPC
libata: Implement disk shock protection support
[libata] Introduce ata_id_has_unload()
PATA: RPC now selects HAVE_PATA_PLATFORM for pata platform driver
ata_piix: drop merged SCR access and use slave_link instead
libata: implement slave_link
libata: misc updates to prepare for slave link
libata: reimplement link iterator
libata: make SCR access ops per-link -
This is debatable, but while we're debating it, let's disallow the
combination of splice and an O_APPEND destination.It's not entirely clear what the semantics of O_APPEND should be, and
POSIX apparently expects pwrite() to ignore O_APPEND, for example. So
we could make up any semantics we want, including the old ones.But Miklos convinced me that we should at least give it some thought,
and that accepting writes at arbitrary offsets is wrong at least for
IS_APPEND() files (which always have O_APPEND set, even if the reverse
isn't true: you can obviously have O_APPEND set on a regular file).So disallow O_APPEND entirely for now. I doubt anybody cares, and this
way we have one less gray area to worry about.Reported-and-argued-for-by: Miklos Szeredi
Acked-by: Jens Axboe
Signed-off-by: Linus Torvalds -
* 'hwmon-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6:
hwmon: (abituguru3) Enable DMI probing feature on Abit AT8 32X
hwmon: (abituguru3) Enable reading from AUX3 fan on Abit AT8 32X
hwmon: (adt7473) Fix some bogosity in documentation file
hwmon: Define sysfs interface for energy consumption register
hwmon: (it87) Prevent power-off on Shuttle SN68PT
eeepc-laptop: Fix hwmon interface -
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[CPUFREQ] correct broken links and email addresses -
This fixes the previous fix, which was completely wrong on closer
inspection. This version has been manually tested with a user-space
test harness and generates sane values. A nearly identical patch has
been boot-tested.The problem arose from changing how kmalloc/kfree handled alignment
padding without updating ksize to match. This brings it in sync.Signed-off-by: Matt Mackall
Signed-off-by: Linus Torvalds -
Replace the no longer working links and email address in the
documentation and in source code.Signed-off-by: Márton Németh
Signed-off-by: Dave Jones
09 Oct, 2008
9 commits
-
Enable driver checking of the DMI product name (when enabled) on
an Abit AT8 32X, instead of falling back to a manual probe. This
eliminates false negatives and eventually will help avoid
unnecessary bus probes on unsupported mainboards.Signed-off-by: Alistair John Strachan
Tested-by: Daniel Exner
Acked-by: Hans de Goede
Signed-off-by: Jean Delvare -
The table for the Abit AT8 32X was incorrectly missing an entry
for the sixth ("AUX3") fan. Add this entry, exporting the fan
reading to userspace.Closes lm-sensors.org ticket #2339.
Signed-off-by: Alistair John Strachan
Tested-by: Daniel Exner
Acked-by: Hans de Goede
Signed-off-by: Jean Delvare -
Signed-off-by: Darrick J. Wong
Signed-off-by: Jean Delvare -
Describe the sysfs files that were introduced in the ibmaem driver.
Signed-off-by: Darrick J. Wong
Signed-off-by: Jean Delvare -
On the Shuttle SN68PT, FAN_CTL2 is apparently not connected to a fan,
but to something else. One user has reported instant system power-off
when changing the PWM2 duty cycle, so we disable it.I use the board name string as the trigger in case the same board is
ever used in other systems.This closes lm-sensors ticket #2349:
pwmconfig causes a hard poweroff
http://www.lm-sensors.org/ticket/2349Signed-off-by: Jean Delvare
-
Creates a name file in the sysfs directory, that
is needed for the libsensors library to work.
Also rename fan1_pwm to pwm1 and scale its value as needed.This fixes bug #11520:
http://bugzilla.kernel.org/show_bug.cgi?id=11520Signed-off-by: Corentin Chary
Signed-off-by: Jean Delvare -
* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
[MIPS] Sibyte: Register PIO PATA device only for Swarm and Litte Sur -
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
tcp: Fix tcp_hybla zero congestion window growth with small rho and large cwnd.
net: Fix netdev_run_todo dead-lock
tcp: Fix possible double-ack w/ user dma
net: only invoke dev->change_rx_flags when device is UP
netrom: Fix sock_orphan() use in nr_release
ax25: Quick fix for making sure unaccepted sockets get destroyed.
Revert "ax25: Fix std timer socket destroy handling."
[Bluetooth] Add reset quirk for A-Link BlueUSB21 dongle
[Bluetooth] Add reset quirk for new Targus and Belkin dongles
[Bluetooth] Fix double frees on error paths of btusb and bpa10x drivers -
Symbol name spaghetti which is too complicated to cleanup on this stage
of the release cycle breaks the build on BCM1480 platforms.Signed-off-by: Ralf Baechle
08 Oct, 2008
6 commits
-
Because of rounding, in certain conditions, i.e. when in congestion
avoidance state rho is smaller than 1/128 of the current cwnd, TCP
Hybla congestion control starves and the cwnd is kept constant
forever.This patch forces an increment by one segment after #send_cwnd calls
without increments(newreno behavior).Signed-off-by: Daniele Lacamera
Signed-off-by: David S. Miller -
Benjamin Thery tracked down a bug that explains many instances
of the errorunregister_netdevice: waiting for %s to become free. Usage count = %d
It turns out that netdev_run_todo can dead-lock with itself if
a second instance of it is run in a thread that will then free
a reference to the device waited on by the first instance.The problem is really quite silly. We were trying to create
parallelism where none was required. As netdev_run_todo always
follows a RTNL section, and that todo tasks can only be added
with the RTNL held, by definition you should only need to wait
for the very ones that you've added and be done with it.There is no need for a second mutex or spinlock.
This is exactly what the following patch does.
Signed-off-by: Herbert Xu
Signed-off-by: David S. Miller -
From: Ali Saidi
When TCP receive copy offload is enabled it's possible that
tcp_rcv_established() will cause two acks to be sent for a single
packet. In the case that a tcp_dma_early_copy() is successful,
copied_early is set to true which causes tcp_cleanup_rbuf() to be
called early which can send an ack. Further along in
tcp_rcv_established(), __tcp_ack_snd_check() is called and will
schedule a delayed ACK. If no packets are processed before the delayed
ack timer expires the packet will be acked twice.Signed-off-by: David S. Miller
-
Jesper Dangaard Brouer reported a bug when setting a VLAN
device down that is in promiscous mode:When the VLAN device is set down, the promiscous count on the real
device is decremented by one by vlan_dev_stop(). When removing the
promiscous flag from the VLAN device afterwards, the promiscous
count on the real device is decremented a second time by the
vlan_change_rx_flags() callback.The root cause for this is that the ->change_rx_flags() callback is
invoked while the device is down. The synchronization is meant to mirror
the behaviour of the ->set_rx_mode callbacks, meaning the ->open function
is responsible for doing a full sync on open, the ->close() function is
responsible for doing full cleanup on ->stop() and ->change_rx_flags()
is meant to do incremental changes while the device is UP.Only invoke ->change_rx_flags() while the device is UP to provide the
intended behaviour.Tested-by: Jesper Dangaard Brouer
Signed-off-by: Patrick McHardy
Signed-off-by: David S. Miller -
SLOB's ksize calculation was braindamaged and generally harmlessly
underreported the allocation size. But for very small buffers, it could
in fact overreport them, leading code depending on krealloc to overrun
the allocation and trample other data.Signed-off-by: Matt Mackall
Tested-by: Peter Zijlstra
Signed-off-by: Linus Torvalds
07 Oct, 2008
3 commits
-
This reverts commit 135aedc38e812b922aa56096f36a3d72ffbcf2fb, as
requested by Hans Verkuil.It was a patch for 2.6.28 where the BKL was pushed down from v4l core to
the drivers, not for 2.6.27!Requested-by: Hans Verkuil
Cc: Mauro Carvalho Chehab
Signed-of-by: Linus Torvalds -
* Theodore Ts'o (tytso@mit.edu) wrote:
>
> I've been playing with adding some markers into ext4 to see if they
> could be useful in solving some problems along with Systemtap. It
> appears, though, that as of 2.6.27-rc8, markers defined in code which is
> compiled directly into the kernel (i.e., not as modules) don't show up
> in Module.markers:
>
> kvm_trace_entryexit arch/x86/kvm/kvm-intel %u %p %u %u %u %u %u %u
> kvm_trace_handler arch/x86/kvm/kvm-intel %u %p %u %u %u %u %u %u
> kvm_trace_entryexit arch/x86/kvm/kvm-amd %u %p %u %u %u %u %u %u
> kvm_trace_handler arch/x86/kvm/kvm-amd %u %p %u %u %u %u %u %u
>
> (Note the lack of any of the kernel_sched_* markers, and the markers I
> added for ext4_* and jbd2_* are missing as wel.)
>
> Systemtap apparently depends on in-kernel trace_mark being recorded in
> Module.markers, and apparently it's been claimed that it used to be
> there. Is this a bug in systemtap, or in how Module.markers is getting
> built? And is there a file that contains the equivalent information
> for markers located in non-modules code?I think the problem comes from "markers: fix duplicate modpost entry"
(commit d35cb360c29956510b2fe1a953bd4968536f7216)Especially :
- add_marker(mod, marker, fmt);
+ if (!mod->skip)
+ add_marker(mod, marker, fmt);
}
return;
fail:Here is a fix that should take care if this problem.
Thanks for the bug report!
Signed-off-by: Mathieu Desnoyers
Tested-by: "Theodore Ts'o"
CC: Greg KH
CC: David Smith
CC: Roland McGrath
CC: Sam Ravnborg
CC: Wenji Huang
CC: Takashi Nishiie
Signed-off-by: Linus Torvalds