Commit 3297e760776af18a26bf30046cbaaae2e730c5c2

Authored by Nicolas Pitre
1 parent 3835f6cb64

highmem: atomic highmem kmap page pinning

Most ARM machines have a non-IO-coherent cache, meaning that the
dma_map_*() set of functions must clean and/or invalidate the affected
memory manually before DMA occurs.  And because the majority of those
machines have a VIVT cache, the cache maintenance operations must be
performed using virtual addresses.
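
As an illustration (not part of this patch), a driver typically hands a
page to the device with dma_map_page(); the cache maintenance described
above happens inside that call.  In this sketch, "dev" and "page" are
assumed to be set up elsewhere by the driver:

	dma_addr_t handle;

	handle = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE);
	if (dma_mapping_error(dev, handle))
		return -ENOMEM;
	/* program the device with "handle" and start the transfer... */
	dma_unmap_page(dev, handle, PAGE_SIZE, DMA_FROM_DEVICE);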

When a highmem page is kunmap'd, its mapping (and cache) remains in place
in case it is kmap'd again.  However, if dma_map_page() is then called with
such a page, some cache maintenance on the remaining mapping must be
performed.  In that case, page_address(page) is non-null and we can use
that address to synchronize the cache.
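
Concretely, the highmem leg of an arch dma_map_page() implementation
amounts to something like the sketch below, and this is exactly the
pattern the race described next defeats.  Here "page" and "dir" come
from the dma_map_page() arguments, and arch_sync_cache_range() is a
hypothetical stand-in for the architecture's clean/invalidate primitive,
not an existing function:

	void *vaddr = page_address(page);	/* still set after a kunmap() */

	if (vaddr)
		arch_sync_cache_range(vaddr, PAGE_SIZE, dir);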

It is unlikely but still possible for kmap() to race and recycle the
virtual address obtained above, and use it for another page, before an
ongoing cache invalidation loop in dma_map_page() is done.  In that case,
the new mapping could end up with dirty cache lines for another page,
and the unsuspecting cache invalidation loop in dma_map_page() might
simply discard those dirty cache lines, resulting in data loss.

For example, let's consider this sequence of events:

	- dma_map_page(..., DMA_FROM_DEVICE) is called on a highmem page.

	-->	- vaddr = page_address(page) is non-null. In this case
		it is likely that the page has valid cache lines
		associated with vaddr. Remember that the cache is VIVT.

		-->	for (i = vaddr; i < vaddr + PAGE_SIZE; i += 32)
				invalidate_cache_line(i);

	*** preemption occurs in the middle of the loop above ***

	- kmap_high() is called for a different page.

	-->	- last_pkmap_nr wraps to zero and flush_all_zero_pkmaps()
		  is called.  The pkmap_count value for the page passed
		  to dma_map_page() above happens to be 1, so the page
		  is unmapped.  But prior to that, flush_cache_kmaps()
		  cleared the cache for it.  So far so good.

		- A fresh pkmap entry is assigned for this kmap request.
		  Murphy's law says this pkmap entry will eventually
		  happen to use the same vaddr as the one which used to
		  belong to the other page being processed by
		  dma_map_page() in the preempted thread above.

	- The kmap_high() caller starts dirtying the cache using the
	  just-assigned virtual mapping for its page.

	*** the first thread is rescheduled ***

			- The for(...) loop is resumed, but now cached
			  data belonging to a different physical page is
			  being discarded!

And this is not only a preemption issue: ARM can be SMP as well, making
the above scenario just as likely on a multiprocessor system.  Hence the
need for some kind of pkmap page pinning which can be used in any context,
primarily for the benefit of dma_map_page() on ARM.

This patch provides the necessary interface to cope with the above issue
when ARCH_NEEDS_KMAP_HIGH_GET is defined; otherwise the resulting code is
unchanged.
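
With ARCH_NEEDS_KMAP_HIGH_GET defined, the arch DMA code can pin the
existing mapping of a highmem page instead of trusting a bare
page_address() lookup.  A minimal sketch of the intended usage follows;
"page" and "dir" are again assumed to come from the dma_map_page()
caller, and arch_sync_cache_range() remains a hypothetical cache helper:

	void *vaddr = kmap_high_get(page);	/* raises pkmap_count under kmap_lock */

	if (vaddr) {
		/* the pkmap entry cannot be recycled while pinned */
		arch_sync_cache_range(vaddr, PAGE_SIZE, dir);
		kunmap_high(page);		/* drop the extra reference */
	}

Because kunmap_high() now takes the lock with lock_kmap_any(), the
pin/unpin pair above may be used from any context, including interrupt
context.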

Signed-off-by: Nicolas Pitre <nico@marvell.com>
Reviewed-by: MinChan Kim <minchan.kim@gmail.com>
Acked-by: Andrew Morton <akpm@linux-foundation.org>

Showing 1 changed file with 57 additions and 8 deletions

@@ -67,6 +67,25 @@
 
 static DECLARE_WAIT_QUEUE_HEAD(pkmap_map_wait);
 
+/*
+ * Most architectures have no use for kmap_high_get(), so let's abstract
+ * the disabling of IRQ out of the locking in that case to save on a
+ * potential useless overhead.
+ */
+#ifdef ARCH_NEEDS_KMAP_HIGH_GET
+#define lock_kmap()             spin_lock_irq(&kmap_lock)
+#define unlock_kmap()           spin_unlock_irq(&kmap_lock)
+#define lock_kmap_any(flags)    spin_lock_irqsave(&kmap_lock, flags)
+#define unlock_kmap_any(flags)  spin_unlock_irqrestore(&kmap_lock, flags)
+#else
+#define lock_kmap()             spin_lock(&kmap_lock)
+#define unlock_kmap()           spin_unlock(&kmap_lock)
+#define lock_kmap_any(flags)    \
+		do { spin_lock(&kmap_lock); (void)(flags); } while (0)
+#define unlock_kmap_any(flags)  \
+		do { spin_unlock(&kmap_lock); (void)(flags); } while (0)
+#endif
+
 static void flush_all_zero_pkmaps(void)
 {
 	int i;

@@ -113,9 +132,9 @@
  */
 void kmap_flush_unused(void)
 {
-	spin_lock(&kmap_lock);
+	lock_kmap();
 	flush_all_zero_pkmaps();
-	spin_unlock(&kmap_lock);
+	unlock_kmap();
 }
 
 static inline unsigned long map_new_virtual(struct page *page)

@@ -145,10 +164,10 @@
 
 			__set_current_state(TASK_UNINTERRUPTIBLE);
 			add_wait_queue(&pkmap_map_wait, &wait);
-			spin_unlock(&kmap_lock);
+			unlock_kmap();
 			schedule();
 			remove_wait_queue(&pkmap_map_wait, &wait);
-			spin_lock(&kmap_lock);
+			lock_kmap();
 
 			/* Somebody else might have mapped it while we slept */
 			if (page_address(page))

@@ -184,29 +203,59 @@
 	 * For highmem pages, we can't trust "virtual" until
 	 * after we have the lock.
 	 */
-	spin_lock(&kmap_lock);
+	lock_kmap();
 	vaddr = (unsigned long)page_address(page);
 	if (!vaddr)
 		vaddr = map_new_virtual(page);
 	pkmap_count[PKMAP_NR(vaddr)]++;
 	BUG_ON(pkmap_count[PKMAP_NR(vaddr)] < 2);
-	spin_unlock(&kmap_lock);
+	unlock_kmap();
 	return (void*) vaddr;
 }
 
 EXPORT_SYMBOL(kmap_high);
 
+#ifdef ARCH_NEEDS_KMAP_HIGH_GET
 /**
+ * kmap_high_get - pin a highmem page into memory
+ * @page: &struct page to pin
+ *
+ * Returns the page's current virtual memory address, or NULL if no mapping
+ * exists. When and only when a non null address is returned then a
+ * matching call to kunmap_high() is necessary.
+ *
+ * This can be called from any context.
+ */
+void *kmap_high_get(struct page *page)
+{
+	unsigned long vaddr, flags;
+
+	lock_kmap_any(flags);
+	vaddr = (unsigned long)page_address(page);
+	if (vaddr) {
+		BUG_ON(pkmap_count[PKMAP_NR(vaddr)] < 1);
+		pkmap_count[PKMAP_NR(vaddr)]++;
+	}
+	unlock_kmap_any(flags);
+	return (void*) vaddr;
+}
+#endif
+
+/**
  * kunmap_high - map a highmem page into memory
  * @page: &struct page to unmap
+ *
+ * If ARCH_NEEDS_KMAP_HIGH_GET is not defined then this may be called
+ * only from user context.
  */
 void kunmap_high(struct page *page)
 {
 	unsigned long vaddr;
 	unsigned long nr;
+	unsigned long flags;
 	int need_wakeup;
 
-	spin_lock(&kmap_lock);
+	lock_kmap_any(flags);
 	vaddr = (unsigned long)page_address(page);
 	BUG_ON(!vaddr);
 	nr = PKMAP_NR(vaddr);

@@ -232,7 +281,7 @@
 		 */
 		need_wakeup = waitqueue_active(&pkmap_map_wait);
 	}
-	spin_unlock(&kmap_lock);
+	unlock_kmap_any(flags);
 
 	/* do wake-up, if needed, race-free outside of the spin lock */
 	if (need_wakeup)
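
For reference, an architecture opts into this interface by defining
ARCH_NEEDS_KMAP_HIGH_GET in its own highmem header (pulled in through
linux/highmem.h) and exposing the new function to its users.  A hedged
sketch of what such a header fragment could look like, illustrative
only and not the ARM side of this work:

	/* asm/highmem.h (architecture-specific, hypothetical excerpt) */
	#define ARCH_NEEDS_KMAP_HIGH_GET

	extern void *kmap_high_get(struct page *page);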