12 Feb, 2007
14 commits
-
Values are readily available via ZVC per node and global sums.
Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Function is unnecessary now. We can use the summing features of the ZVCs to
get the values we need.Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
nr_free_pages is now a simple access to a global variable. Make it a macro
instead of a function.The nr_free_pages now requires vmstat.h to be included. There is one
occurrence in power management where we need to add the include. Directly
refrer to global_page_state() there to clarify why the #include was added.[akpm@osdl.org: arm build fix]
[akpm@osdl.org: sparc64 build fix]
Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The global and per zone counter sums are in arrays of longs. Reorder the ZVCs
so that the most frequently used ZVCs are put into the same cacheline. That
way calculations of the global, node and per zone vm state touches only a
single cacheline. This is mostly important for 64 bit systems were one 128
byte cacheline takes only 8 longs.Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This is again simplifies some of the VM counter calculations through the use
of the ZVC consolidated counters.[michal.k.k.piotrowski@gmail.com: build fix]
Signed-off-by: Christoph Lameter
Signed-off-by: Michal Piotrowski
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The determination of the dirty ratio to determine writeback behavior is
currently based on the number of total pages on the system.However, not all pages in the system may be dirtied. Thus the ratio is always
too low and can never reach 100%. The ratio may be particularly skewed if
large hugepage allocations, slab allocations or device driver buffers make
large sections of memory not available anymore. In that case we may get into
a situation in which f.e. the background writeback ratio of 40% cannot be
reached anymore which leads to undesired writeback behavior.This patchset fixes that issue by determining the ratio based on the actual
pages that may potentially be dirty. These are the pages on the active and
the inactive list plus free pages.The problem with those counts has so far been that it is expensive to
calculate these because counts from multiple nodes and multiple zones will
have to be summed up. This patchset makes these counters ZVC counters. This
means that a current sum per zone, per node and for the whole system is always
available via global variables and not expensive anymore to calculate.The patchset results in some other good side effects:
- Removal of the various functions that sum up free, active and inactive
page counts- Cleanup of the functions that display information via the proc filesystem.
This patch:
The use of a ZVC for nr_inactive and nr_active allows a simplification of some
counter operations. More ZVC functionality is used for sums etc in the
following patches.[akpm@osdl.org: UP build fix]
Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
After do_wp_page has tested page_mkwrite, it must release old_page after
acquiring page table lock, not before: at some stage that ordering got
reversed, leaving a (very unlikely) window in which old_page might be
truncated, freed, and reused in the same position.Signed-off-by: Hugh Dickins
Acked-by: Nick Piggin
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
With CONFIG_SPARSEMEM=y:
mm/rmap.c:579: warning: format '%lx' expects type 'long unsigned int', but argument 2 has type 'int'
Make __page_to_pfn() return unsigned long.
Signed-off-by: Randy Dunlap
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This early break prevents us from displaying info for the vm stats thresholds
if the zone doesn't have any pages in its per-cpu pagesets.So my 800MB i386 box says:
Node 0, zone DMA
pages free 2365
min 16
low 20
high 24
active 0
inactive 0
scanned 0 (a: 0 i: 0)
spanned 4096
present 4044
nr_anon_pages 0
nr_mapped 1
nr_file_pages 0
nr_slab_reclaimable 0
nr_slab_unreclaimable 0
nr_page_table_pages 0
nr_dirty 0
nr_writeback 0
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
protection: (0, 868, 868)
pagesets
all_unreclaimable: 0
prev_priority: 12
start_pfn: 0
Node 0, zone Normal
pages free 199713
min 934
low 1167
high 1401
active 10215
inactive 4507
scanned 0 (a: 0 i: 0)
spanned 225280
present 222420
nr_anon_pages 2685
nr_mapped 1110
nr_file_pages 12055
nr_slab_reclaimable 2216
nr_slab_unreclaimable 1527
nr_page_table_pages 213
nr_dirty 0
nr_writeback 0
nr_unstable 0
nr_bounce 0
nr_vmscan_write 0
protection: (0, 0, 0)
pagesets
cpu: 0 pcp: 0
count: 152
high: 186
batch: 31
cpu: 0 pcp: 1
count: 13
high: 62
batch: 15
vm stats threshold: 16
cpu: 1 pcp: 0
count: 34
high: 186
batch: 31
cpu: 1 pcp: 1
count: 10
high: 62
batch: 15
vm stats threshold: 16
all_unreclaimable: 0
prev_priority: 12
start_pfn: 4096Just nuke all that search-for-the-first-non-empty-pageset code. Dunno why it
was there in the first place..Cc: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
find_min_pfn_for_node() and find_min_pfn_with_active_regions() sort
early_node_map[] on every call. This is an excessive amount of sorting and
that can be avoided. This patch always searches the whole early_node_map[]
in find_min_pfn_for_node() instead of returning the first value found. The
map is then only sorted once when required. Successfully boot tested on a
number of machines.[akpm@osdl.org: cleanup]
Signed-off-by: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove the last vestiges of the long-deprecated "MAP_ANON" page protection
flag: use "MAP_ANONYMOUS" instead.Signed-off-by: Robert P. J. Day
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Use the pointer passed to cache_reap to determine the work pointer and
consolidate exit paths.Signed-off-by: Christoph Lameter
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Clean up __cache_alloc and __cache_alloc_node functions a bit. We no
longer need to do NUMA_BUILD tricks and the UMA allocation path is much
simpler. No functional changes in this patch.Note: saves few kernel text bytes on x86 NUMA build due to using gotos in
__cache_alloc_node() and moving __GFP_THISNODE check in to
fallback_alloc().Cc: Andy Whitcroft
Cc: Christoph Hellwig
Cc: Manfred Spraul
Acked-by: Christoph Lameter
Cc: Paul Jackson
Signed-off-by: Pekka Enberg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The PageSlab debug check in kfree_debugcheck() is broken for compound
pages. It is also redundant as we already do BUG_ON for non-slab pages in
page_get_cache() and page_get_slab() which are always called before we free
any actual objects.Signed-off-by: Pekka Enberg
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
10 Feb, 2007
26 commits
-
The ATA_ENABLE_PATA define was never meant to be permanent, and in
recent kernels, it's already been unconditionally enabled. Remove.Signed-off-by: Jeff Garzik
-
Signed-off-by: Jeff Garzik
-
If we are doing a PIO setup for a CFA card and it blows up with a device
error then assume it is an older CFA card which doesn't support this
rather than failing the device out of existance.Stands seperate to the quieting patch but that is obviously useful with
this change.Signed-off-by: Alan Cox
Signed-off-by: Jeff Garzik -
ata_pci_device_do_resume can fail if the PCI device couldn't be re-enabled.
Update sata_nv to propagate the return value from this call and to not try
to do any other resume activities if it fails. Fixes a compile warning.Signed-off-by: Robert Hancock
Signed-off-by: Andrew Morton
Signed-off-by: Jeff Garzik -
Update sata_nv to wait for the controller to indicate via the status
register that it has entered the requested state when switching between
ADMA mode and register mode. This issue came up recently when debugging
some problems with cache flush command timeouts and while it didn't appear
to fix that problem, this is something we should likely be doing in any
case.Signed-off-by: Robert Hancock
Cc: Tejun Heo
Cc: Jeff Garzik
Signed-off-by: Andrew Morton
Signed-off-by: Jeff Garzik -
Some problems showed up recently with cache flush commands timing out on
sata_nv. Previously these commands were always handled by transitioning to
legacy mode from ADMA mode first. The timeout problem was worked around
already by a change to the interrupt handling code for legacy mode, but for
non-data commands like these it appears we can handle them in ADMA mode, so
the switch to legacy mode is not needed.This patch changes the behavior so that we use ADMA mode to submit
interrupt-driven commands with ATA_PROT_NODATA protocol. In addition to
avoiding the problem mentioned above entirely, this avoids the overhead of
switching to legacy mode and back to ADMA mode for handling cache flushes.
When handling non-DMA-mapped commands, we leave the APRD blank and clear
the NV_CPB_CTL_APRD_VALID field in the CPB so the controller does not
attempt to read it.Signed-off-by: Robert Hancock
Cc: Jeff Garzik
Cc: Tejun Heo
Signed-off-by: Andrew Morton
Signed-off-by: Jeff Garzik -
This cleans up a few issues with the error handling in sata_nv in ADMA mode
to make it more consistent with other NCQ-capable drivers like ahci and
sata_sil24:- When a command failed, we would effectively set AC_ERR_DEV on the
queued command always. In the case of NCQ commands this prevents libata
from doing a log page query to determine the details of the failed
command, since it thinks we've already analyzed. Just set flags in the
port ehi->err_mask, then freeze or abort and let libata figure out what
went wrong.- The code handled NV_ADMA_STAT_CPBERR as a "really bad error" which
caused it to set error flags on every queued command. I don't know
exactly what this flag means (no docs, grr!) but from what I can guess
from the standard ADMA spec, it just means that one or more of the CPBs
had an error, so we just need to go through and do our normal checks in
this case.- In the error_handler function the code would always dump the state of
all the CPBs. This output seems redundant at this point since libata
already dumps the state of all active commands on errors (and it also
triggers at times when it shouldn't, like when suspending). Take this
out.[akpm@osdl.org: many coding-style fixes]
Signed-off-by: Robert Hancock
Cc: Jeff Garzik
Cc: Tejun Heo
Cc: Allen Martin
Signed-off-by: Andrew Morton
Signed-off-by: Jeff Garzik -
MPIIX has only single channel IDE which can be configured for either primary or
secondary legacy I/O ports and IRQ. So, get rid of the unneeded second probe
entry in mpiix_init_one() and of the invalid (but unused anyway) enable bits in
mpiix_pre_reset().Warning: this cleanup has only been compile-tested...
Signed-off-by: Sergei Shtylyov
Signed-off-by: Jeff Garzik -
Fix clearing/setting the wrong TIME/IE/PPE bits for a slave drive caused by a
wrong shift count.
Fix the PIO mode 1 being overclocked by wrongly selecting the fast timing bank.
Also, fix/rephrase some comments while at it.Signed-off-by: Sergei Shtylyov
Signed-off-by: Jeff Garzik -
Fix the PIO mode 2 using mode 0 timings -- this driver should enable the
fast timing bank starting with PIO2, just like the ata_piix driver does.
Also, fix/rephrase some comments while at it.Signed-off-by: Sergei Shtylyov
Signed-off-by: Jeff Garzik -
IORDY and IORDY enable/disable flags.
Signed-off-by: Alan Cox
Signed-off-by: Jeff Garzik -
People are getting confused about which drivers to enable for PATA PIIX
type devices. Change the ATA_PIIX line and help to make it clearer.Signed-off-by: Alan Cox
Signed-off-by: Jeff Garzik -
* Hardreset must not exit without actually performing reset regardless
of link status. We're resetting the link after all.* Minor message update.
* 150ms delay is meaningful iff link is online after reset is
complete.Signed-off-by: Tejun Heo
Signed-off-by: Jeff Garzik -
Follow the old SRST rule and delay 150ms between completion of
hardreset and status checking. Debouncing delay should usually cover
this but debounce duration could be shorter than 150ms under certain
circumstances.Usefulness depends on host controller implementation but it can't hurt
and serves as a reminder that 2s delay for GoVault should also be
added here.Signed-off-by: Tejun Heo
Signed-off-by: Jeff Garzik -
Per Jeff's suggestion, this patch rearranges the info printed for ATA
drives into dmesg to add the full ATA firmware revision and model
information, while keeping the output to 2 lines.Signed-off-by: Eric D. Mudama
Signed-off-by: Jeff Garzik -
Fix the wrong "compatible" PIO mode choices: MWDMA0 has 480 ns cycle while PIO1
only has 383 ns cycle, and MWDMA2 timings matchs those of PIO4 exactly.Signed-off-by: Jeff Garzik
-
This patch is against each libata driver.
Two IRQ calls are added in ata_port_operations.
- irq_on() is used to enable interrupts.
- irq_ack() is used to acknowledge a device interrupt.In most drivers, ata_irq_on() and ata_irq_ack() are used for
irq_on and irq_ack respectively.In some drivers (ex: ahci, sata_sil24) which cannot use them
as is, ata_dummy_irq_on() and ata_dummy_irq_ack() are used.Signed-off-by: Kou Ishizaki
Signed-off-by: Akira Iguchi
Signed-off-by: Jeff Garzik -
This patch is against the libata core and headers.
Two IRQ calls are added in ata_port_operations.
- irq_on() is used to enable interrupts.
- irq_ack() is used to acknowledge a device interrupt.In most drivers, ata_irq_on() and ata_irq_ack() are used for
irq_on and irq_ack respectively.In some drivers (ex: ahci, sata_sil24) which cannot use them
as is, ata_dummy_irq_on() and ata_dummy_irq_ack() are used.Signed-off-by: Kou Ishizaki
Signed-off-by: Akira Iguchi
Signed-off-by: Jeff Garzik -
In file included from drivers/infiniband/hw/ipath/ipath_diag.c:44:
include/linux/io.h:35: warning: 'struct device' declared inside parameter list
include/linux/io.h:35: warning: its scope is only this definition or declarationCc: Jeff Garzik
Signed-off-by: Andrew Morton
Signed-off-by: Jeff Garzik -
Convert libata core layer and LLDs to use iomap.
* managed iomap is used. Pointer to pcim_iomap_table() is cached at
host->iomap and used through out LLDs. This basically replaces
host->mmio_base.* if possible, pcim_iomap_regions() is used
Most iomap operation conversions are taken from Jeff Garzik
's iomap branch.Signed-off-by: Tejun Heo
Signed-off-by: Jeff Garzik -
devres updates for pata_platform were dropped while merging devres
patches due to merge conflict. This is the updated version.Signed-off-by: Tejun Heo
Signed-off-by: Jeff Garzik -
devres change moved iomap.o from obj-$(CONFIG_GENERIC_IOMAP) to lib-y
making it not linked if no in-kernel driver uses it. Fix it.Signed-off-by: Tejun Heo
Signed-off-by: Jeff Garzik -
Signed-off-by: Jeff Garzik
-
Implement pcim_iomap_regions(). This function takes mask of BARs to
request and iomap. No BAR should have length of zero. BARs are
iomapped using pcim_iomap_table().Signed-off-by: Tejun Heo
Signed-off-by: Jeff Garzik -
Now that all LLDs are converted to use devres, default stop callbacks
are unused. Remove them.Signed-off-by: Tejun Heo
Signed-off-by: Jeff Garzik -
Update libata LLDs to use devres. Core layer is already converted to
support managed LLDs. This patch simplifies initialization and fixes
many resource related bugs in init failure and detach path. For
example, all converted drivers now handle ata_device_add() failure
gracefully without excessive resource rollback code.As most resources are released automatically on driver detach, many
drivers don't need or can do with much simpler ->{port|host}_stop().
In general, stop callbacks are need iff port or host needs to be given
commands to shut it down. Note that freezing is enough in many cases
and ports are automatically frozen before being detached.Signed-off-by: Tejun Heo
Signed-off-by: Jeff Garzik