  ==================================
  Memory Attribute Aliasing on IA-64
  ==================================

  Bjorn Helgaas <bjorn.helgaas@hp.com>

  May 4, 2006

  
  Memory Attributes
  =================
  
      Itanium supports several attributes for virtual memory references.
      The attribute is part of the virtual translation, i.e., it is
      contained in the TLB entry.  The ones of most interest to the Linux
      kernel are:
  	==		======================
  	WB		Write-back (cacheable)
  	UC		Uncacheable
  	WC		Write-coalescing
  	==		======================
  
      System memory typically uses the WB attribute.  The UC attribute is
      used for memory-mapped I/O devices.  The WC attribute is uncacheable
      like UC is, but writes may be delayed and combined to increase
      performance for things like frame buffers.
  
      The Itanium architecture requires that we avoid accessing the same
      page with both a cacheable mapping and an uncacheable mapping[1].
  
      The design of the chipset determines which attributes are supported
      on which regions of the address space.  For example, some chipsets
      support either WB or UC access to main memory, while others support
      only WB access.
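
      The aliasing rule above can be written as a small predicate.  This
      is only an illustrative sketch: the enum and function names are
      hypothetical and do not come from the kernel sources.  It treats WB
      as the only cacheable attribute of the three, as described above.

```c
#include <stdbool.h>

/* Hypothetical names for the three attributes discussed above. */
enum mem_attr { ATTR_WB, ATTR_UC, ATTR_WC };

/* Of the three attributes, only WB is cacheable. */
static bool attr_is_cacheable(enum mem_attr a)
{
    return a == ATTR_WB;
}

/*
 * Two mappings of the same page conflict when one of them is cacheable
 * and the other is not.
 */
static bool attrs_conflict(enum mem_attr a, enum mem_attr b)
{
    return attr_is_cacheable(a) != attr_is_cacheable(b);
}
```

      Under this sketch, a WB mapping conflicts with a UC or WC mapping
      of the same page, while a UC mapping and a WC mapping do not
      conflict with each other.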
  Memory Map
  ==========
  
      Platform firmware describes the physical memory map and the
      supported attributes for each region.  At boot time, the kernel
      retrieves this map through the EFI GetMemoryMap() interface.  ACPI
      can also describe memory devices and the attributes they support,
      but Linux/ia64 currently doesn't use this information.
  
      The kernel uses the efi_memmap table returned from GetMemoryMap() to
      learn the attributes supported by each region of physical address
      space.  Unfortunately, this table does not completely describe the
      address space because some machines omit some or all of the MMIO
      regions from the map.
  
      The kernel maintains another table, kern_memmap, which describes the
      memory Linux is actually using and the attribute for each region.
      This contains only system memory; it does not contain MMIO space.
  
      The kern_memmap table typically contains only a subset of the system
      memory described by the efi_memmap.  Linux/ia64 can't use all memory
      in the system because of constraints imposed by the identity mapping
      scheme.
  
      The efi_memmap table is preserved unmodified because the original
      boot-time information is required for kexec.
  Kernel Identity Mappings
  ========================
  
      Linux/ia64 identity mappings are done with large pages, currently
      either 16MB or 64MB, referred to as "granules."  Cacheable mappings
      are speculative[2], so the processor can read any location in the
      page at any time, independent of the programmer's intentions.  This
      means that to avoid attribute aliasing, Linux can create a cacheable
      identity mapping only when the entire granule supports cacheable
      access.
  
      Therefore, kern_memmap contains only full granule-sized regions that
      can be referenced safely by an identity mapping.
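
      The whole-granule requirement can be sketched as a walk over a
      firmware-style memory map.  The structure, the attribute flags, and
      the function name below are hypothetical, and the sketch assumes a
      map sorted by start address with non-overlapping entries and a
      power-of-two granule size.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute flags for this sketch. */
#define ATTR_WB (1u << 0)
#define ATTR_UC (1u << 1)

/* One described region: [start, end) and the attributes it supports. */
struct region {
    uint64_t start;
    uint64_t end;
    unsigned int attrs;
};

/*
 * Return true only if every byte of the granule containing addr is
 * described by the map and supports WB.  Assumes the map is sorted by
 * start address, has no overlaps, and granule_size is a power of two.
 */
static bool granule_supports_wb(const struct region *map, size_t n,
                                uint64_t addr, uint64_t granule_size)
{
    uint64_t pos = addr & ~(granule_size - 1);  /* granule start */
    uint64_t end = pos + granule_size;
    size_t i;

    for (i = 0; i < n && pos < end; i++) {
        if (map[i].end <= pos)
            continue;               /* region lies below the cursor */
        if (map[i].start > pos)
            return false;           /* hole: range not described */
        if (!(map[i].attrs & ATTR_WB))
            return false;           /* described, but not cacheable */
        pos = map[i].end;           /* advance past this region */
    }
    return pos >= end;
}
```

      With a 16MB granule, a UC-only VGA frame buffer at 0xA0000 makes
      the first granule fail this check even though most of it is WB
      memory.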
  
      Uncacheable mappings are not speculative, so the processor will
      generate UC accesses only to locations explicitly referenced by
      software.  This allows UC identity mappings to cover granules that
      are only partially populated, or populated with a combination of UC
      and WB regions.
  User Mappings
  =============
  
      User mappings are typically done with 16K or 64K pages.  The smaller
      page size allows more flexibility because only 16K or 64K has to be
      homogeneous with respect to memory attributes.
  Potential Attribute Aliasing Cases
  ==================================
  
      There are several ways the kernel creates new mappings:
  mmap of /dev/mem
  ----------------
  
  	This uses remap_pfn_range(), which creates user mappings.  These
  	mappings may be either WB or UC.  If the region being mapped
  	happens to be in kern_memmap, meaning that it may also be mapped
  	by a kernel identity mapping, the user mapping must use the same
  	attribute as the kernel mapping.
  
  	If the region is not in kern_memmap, the user mapping should use
  	an attribute reported as being supported in the EFI memory map.
  
  	Since the EFI memory map does not describe MMIO on some
  	machines, this should use an uncacheable mapping as a fallback.
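
  	The selection order described above can be sketched as follows.
  	The structure and function names are hypothetical, not kernel
  	APIs.  Preferring WB when the EFI memory map reports more than one
  	supported attribute is an assumption of this sketch.

```c
#include <stdbool.h>

enum mem_attr { ATTR_WB, ATTR_UC };

/* Hypothetical summary of what the two tables say about one region. */
struct region_info {
    bool in_kern_memmap;      /* may be covered by a kernel identity mapping */
    enum mem_attr kern_attr;  /* meaningful only if in_kern_memmap is set */
    bool in_efi_memmap;       /* described by the EFI memory map */
    bool efi_wb;              /* EFI reports WB as supported */
    bool efi_uc;              /* EFI reports UC as supported */
};

static enum mem_attr devmem_mmap_attr(const struct region_info *r)
{
    /* Must match the kernel mapping to avoid attribute aliasing. */
    if (r->in_kern_memmap)
        return r->kern_attr;

    /* Otherwise use an attribute the EFI memory map reports. */
    if (r->in_efi_memmap) {
        if (r->efi_wb)
            return ATTR_WB;   /* assumption: prefer WB when available */
        if (r->efi_uc)
            return ATTR_UC;
    }

    /* Not described at all (e.g. unreported MMIO): fall back to UC. */
    return ATTR_UC;
}
```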
  mmap of /sys/class/pci_bus/.../legacy_mem
  -----------------------------------------
  
  	This is very similar to mmap of /dev/mem, except that legacy_mem
  	only allows mmap of the one megabyte "legacy MMIO" area for a
  	specific PCI bus.  Typically this is the first megabyte of
  	physical address space, but it may be different on machines with
  	several VGA devices.
  
  	"X" uses this to access VGA frame buffers.  Using legacy_mem
  	rather than /dev/mem allows multiple instances of X to talk to
  	different VGA cards.
  
  	The /dev/mem mmap constraints apply.
  mmap of /proc/bus/pci/.../??.?
  ------------------------------

  	This is an MMIO mmap of PCI functions, for which the caller may
  	optionally request the WC attribute.
  
  	If WC is requested, and the region in kern_memmap is either WC
  	or UC, and the EFI memory map designates the region as WC, then
  	the WC mapping is allowed.
  
  	Otherwise, the user mapping must use the same attribute as the
  	kernel mapping.
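
  	The WC rule above reduces to a single condition.  The sketch
  	below uses hypothetical names and takes the kernel attribute and
  	the EFI WC capability as plain parameters rather than looking them
  	up in kern_memmap and the EFI memory map.

```c
#include <stdbool.h>

enum mem_attr { ATTR_WB, ATTR_UC, ATTR_WC };

/*
 * want_wc:   the caller requested a WC mapping
 * kern_attr: attribute recorded for the region in kern_memmap
 * efi_wc:    the EFI memory map designates the region as WC
 */
static enum mem_attr pci_mmap_attr(bool want_wc, enum mem_attr kern_attr,
                                   bool efi_wc)
{
    bool kern_ok = (kern_attr == ATTR_WC || kern_attr == ATTR_UC);

    if (want_wc && kern_ok && efi_wc)
        return ATTR_WC;

    /* Otherwise the user mapping must match the kernel mapping. */
    return kern_attr;
}
```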
  read/write of /dev/mem
  ----------------------
  
  	This uses copy_from_user(), which implicitly uses a kernel
  	identity mapping.  This is obviously safe for things in
  	kern_memmap.
  
  	There may be corner cases of things that are not in kern_memmap,
  	but could be accessed this way.  For example, registers in MMIO
  	space are not in kern_memmap, but could be accessed with a UC
  	mapping.  This would not cause attribute aliasing.  But
  	registers typically can be accessed only with four-byte or
  	eight-byte accesses, and the copy_from_user() path doesn't allow
  	any control over the access size, so this would be dangerous.
  ioremap()
  ---------

  	This returns a mapping for use inside the kernel.
  
  	If the region is in kern_memmap, we should use the attribute
  	specified there.
  
  	If the EFI memory map reports that the entire granule supports
  	WB, we should use that (granules that are partially reserved
  	or occupied by firmware do not appear in kern_memmap).
  
  	If the granule contains non-WB memory, but we can cover the
  	region safely with kernel page table mappings, we can use
  	ioremap_page_range() as most other architectures do.
  
  	Failing all of the above, we have to fall back to a UC mapping.
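
  	The four-step fallback chain above can be sketched as a decision
  	function.  The enum, the structure, and the function name are
  	hypothetical; the real checks consult kern_memmap and the EFI
  	memory map, which this sketch replaces with precomputed booleans.

```c
#include <stdbool.h>

/* Hypothetical labels for the four outcomes described above. */
enum ioremap_kind {
    IOREMAP_KERN_ATTR,   /* region in kern_memmap: reuse its attribute */
    IOREMAP_WB_GRANULE,  /* whole granule supports WB: use WB */
    IOREMAP_PAGE_TABLE,  /* kernel page table mapping, as with
                            ioremap_page_range() */
    IOREMAP_UC,          /* last resort: UC mapping */
};

/* Precomputed facts about the region being remapped. */
struct ioremap_facts {
    bool in_kern_memmap;
    bool granule_all_wb;  /* EFI reports the entire granule as WB */
    bool page_table_ok;   /* region can be covered safely by page tables */
};

static enum ioremap_kind ioremap_choose(const struct ioremap_facts *f)
{
    if (f->in_kern_memmap)
        return IOREMAP_KERN_ATTR;
    if (f->granule_all_wb)
        return IOREMAP_WB_GRANULE;
    if (f->page_table_ok)
        return IOREMAP_PAGE_TABLE;
    return IOREMAP_UC;
}
```

  	The order of the tests matters: each later option is considered
  	only when every earlier one is unavailable.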

  Past Problem Cases
  ==================

  mmap of various MMIO regions from /dev/mem by "X" on Intel platforms
  --------------------------------------------------------------------
  
        The EFI memory map may not report these MMIO regions.
  
        These must be allowed so that X will work.  This means that
        when the EFI memory map is incomplete, every /dev/mem mmap must
        succeed.  It may create either WB or UC user mappings, depending
        on whether the region is in kern_memmap or the EFI memory map.
  mmap of 0x0-0x9FFFF /dev/mem by "hwinfo" on HP sx1000 with VGA enabled
  ----------------------------------------------------------------------

        The EFI memory map reports the following attributes:
  
          =============== ======= ==================
          0x00000-0x9FFFF WB only
          0xA0000-0xBFFFF UC only (VGA frame buffer)
          0xC0000-0xFFFFF WB only
          =============== ======= ==================
  
        This mmap is done with user pages, not kernel identity mappings,
        so it is safe to use WB mappings.
  
        The kernel VGA driver may ioremap the VGA frame buffer at 0xA0000,
        which uses a granule-sized UC mapping.  This granule will cover some
        WB-only memory, but since UC is non-speculative, the processor will
        never generate an uncacheable reference to the WB-only areas unless
        the driver explicitly touches them.

  mmap of 0x0-0xFFFFF legacy_mem by "X"
  -------------------------------------

        If the EFI memory map reports that the entire range supports the
        same attributes, we can allow the mmap (and we will prefer WB if
        supported, as is the case with HP sx[12]000 machines with VGA
        disabled).

        If EFI reports the range as partly WB and partly UC (as on sx[12]000
        machines with VGA enabled), we must fail the mmap because there's no
        safe attribute to use.

        If EFI reports some of the range but not all (as on Intel firmware
        that doesn't report the VGA frame buffer at all), we should fail the
        mmap and force the user to map just the specific region of interest.
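
        The three outcomes above for a 0x0-0xFFFFF mapping can be
        sketched as a scan of the EFI-reported regions.  All names here
        are hypothetical, and the map is assumed to be sorted and
        non-overlapping.  A range that is not described at all also
        returns MMAP_FAIL in this sketch; the text above does not cover
        that situation, so treat it as a simplification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum mmap_result { MMAP_WB, MMAP_UC, MMAP_FAIL };

/* One EFI-described region: [start, end) and which attributes it supports. */
struct region {
    uint64_t start;
    uint64_t end;
    bool wb;
    bool uc;
};

static enum mmap_result legacy_mem_attr(const struct region *map, size_t n,
                                        uint64_t start, uint64_t end)
{
    bool all_wb = true, all_uc = true;
    uint64_t pos = start;
    size_t i;

    for (i = 0; i < n && pos < end; i++) {
        if (map[i].end <= pos)
            continue;               /* region lies below the cursor */
        if (map[i].start > pos)
            return MMAP_FAIL;       /* part of the range is unreported */
        all_wb = all_wb && map[i].wb;
        all_uc = all_uc && map[i].uc;
        pos = map[i].end;
    }
    if (pos < end)
        return MMAP_FAIL;           /* range runs past the described map */

    if (all_wb)
        return MMAP_WB;             /* prefer WB when the whole range allows it */
    if (all_uc)
        return MMAP_UC;
    return MMAP_FAIL;               /* mixed WB and UC: no safe attribute */
}
```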

  mmap of 0xA0000-0xBFFFF legacy_mem by "X" on HP sx1000 with VGA disabled
  ------------------------------------------------------------------------
  
        The EFI memory map reports the following attributes::

          0x00000-0xFFFFF WB only (no VGA MMIO hole)
  
        This is a special case of the previous case, and the mmap should
        fail for the same reason as above.
  read of /sys/devices/.../rom
  ----------------------------
  
        For VGA devices, this may cause an ioremap() of 0xC0000.  This
        used to be done with a UC mapping, because the VGA frame buffer
        at 0xA0000 prevents use of a WB granule.  The UC mapping causes
        an MCA on HP sx[12]000 chipsets.
  
        We should use WB page table mappings to avoid covering the VGA
        frame buffer.
  Notes
  =====
  
      [1] SDM rev 2.2, vol 2, sec 4.4.1.
      [2] SDM rev 2.2, vol 2, sec 4.4.6.