Commit 0d9ea75443dc7e37843e656b8ebc947a6d16d618

Authored by Jon Tollefson
Committed by Linus Torvalds
1 parent f4a67cceee

powerpc: support multiple hugepage sizes

Instead of using the variable mmu_huge_psize to keep track of the huge
page size, we use an array of MMU_PAGE_* values.  For each supported huge
page size we need to know the hugepte_shift value and have a
pgtable_cache.  The hstate or an mmu_huge_psizes index is passed to
functions so that they know which huge page size they should use.
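
The data-structure change can be pictured with a small, self-contained C sketch.
This is an illustration only, assuming hypothetical names (mmu_psize_entry, the
hugepte_shift and pgtable_cache fields) and made-up shift values; it models
per-size metadata kept in an array indexed by a page-size index, so callers pass
the index instead of relying on a single global huge page size.

    /*
     * Illustrative user-space sketch, not the kernel patch itself: per-size
     * metadata lives in an array indexed by a page-size index, and callers
     * pass that index so helpers know which huge page size they operate on.
     * The struct layout, field names and shift values are assumptions.
     */
    #include <stdio.h>

    enum mmu_psize { MMU_PAGE_64K, MMU_PAGE_16M, MMU_PAGE_16G, MMU_PAGE_COUNT };

    struct mmu_psize_entry {
        const char *name;
        unsigned int page_shift;    /* log2 of the huge page size */
        unsigned int hugepte_shift; /* stand-in for a per-size table shift */
        void *pgtable_cache;        /* stand-in for a per-size allocator cache */
    };

    static struct mmu_psize_entry psizes[MMU_PAGE_COUNT] = {
        [MMU_PAGE_64K] = { "64K", 16, 5, NULL },
        [MMU_PAGE_16M] = { "16M", 24, 7, NULL },
        [MMU_PAGE_16G] = { "16G", 34, 9, NULL },
    };

    static unsigned long long huge_page_bytes(enum mmu_psize idx)
    {
        return 1ULL << psizes[idx].page_shift;
    }

    int main(void)
    {
        for (int i = 0; i < MMU_PAGE_COUNT; i++)
            printf("%s: %llu bytes, hugepte_shift=%u\n", psizes[i].name,
                   huge_page_bytes(i), psizes[i].hugepte_shift);
        return 0;
    }

Indexing by size rather than storing one global value is what lets the same
helpers serve 64K, 16M, and 16G pages once the caller supplies the index.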

The hugepage sizes 16M and 64K are set up (if available on the hardware) so
that they don't have to be specified on the boot command line in order to
use them.  The number of 16G pages, however, has to be specified at boot
time (e.g. hugepagesz=16G hugepages=5).
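
A rough, user-space model of how such a command line pairs each hugepagesz=
with the hugepages= count that follows it is sketched below. It is an
assumption-based illustration of the pairing behaviour described above, not
the kernel's actual option parser.

    /*
     * Simplified model (an assumption, not the kernel parser) of how
     * "hugepagesz=<size> hugepages=<n>" pairs select a size and then set
     * the boot-time pool count for that size.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    struct pool { const char *size; unsigned long count; };

    int main(void)
    {
        const char *cmdline[] = { "hugepagesz=16M", "hugepages=100",
                                  "hugepagesz=16G", "hugepages=5" };
        struct pool pools[8];
        int npools = 0, cur = -1;

        for (size_t i = 0; i < sizeof(cmdline) / sizeof(cmdline[0]); i++) {
            if (strncmp(cmdline[i], "hugepagesz=", 11) == 0) {
                cur = npools++;
                pools[cur].size = cmdline[i] + 11;
                pools[cur].count = 0;
            } else if (strncmp(cmdline[i], "hugepages=", 10) == 0 && cur >= 0) {
                /* "hugepages=" applies to the most recently named size. */
                pools[cur].count = strtoul(cmdline[i] + 10, NULL, 10);
            }
        }

        for (int i = 0; i < npools; i++)
            printf("reserve %lu pages of size %s at boot\n",
                   pools[i].count, pools[i].size);
        return 0;
    }

Running it prints one per-size reservation per hugepagesz=/hugepages= pair, in
the order the pairs appear, mirroring the example given above.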

Signed-off-by: Jon Tollefson <kniht@linux.vnet.ibm.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Showing 9 changed files with 199 additions and 118 deletions

Documentation/kernel-parameters.txt
776 776
777 hugepages= [HW,X86-32,IA-64] HugeTLB pages to allocate at boot. 777 hugepages= [HW,X86-32,IA-64] HugeTLB pages to allocate at boot.
778 hugepagesz= [HW,IA-64,PPC,X86-64] The size of the HugeTLB pages. 778 hugepagesz= [HW,IA-64,PPC,X86-64] The size of the HugeTLB pages.
779 On x86 this option can be specified multiple times 779 On x86-64 and powerpc, this option can be specified
780 interleaved with hugepages= to reserve huge pages 780 multiple times interleaved with hugepages= to reserve
781 of different sizes. Valid page sizes on 781 huge pages of different sizes. Valid page sizes on
782 are 2M (when the CPU supports "pse") and 1G (when the 782 x86-64 are 2M (when the CPU supports "pse") and 1G
783 CPU supports the "pdpe1gb" cpuinfo flag). 783 (when the CPU supports the "pdpe1gb" cpuinfo flag).
784 Note that 1GB pages can only be allocated at boot time 784 Note that 1GB pages can only be allocated at boot time
785 using hugepages= and not freed afterwards. 785 using hugepages= and not freed afterwards.
786 default_hugepagesz= 786 default_hugepagesz=
787 [same as hugepagesz=] The default 787 [same as hugepagesz=] The default
788 HugeTLB page size. This is the size represented by 788 HugeTLB page size. This is the size represented by
789 the legacy /proc/ hugepages APIs, used for SHM, and 789 the legacy /proc/ hugepages APIs, used for SHM, and
790 the default size when mounting hugetlbfs filesystems. 790 the default size when mounting hugetlbfs filesystems.
791 Defaults to the architecture's default huge page size 791 Defaults to the architecture's default huge page size
792 if not specified. 792 if not specified.
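		Illustrative example: assuming the hardware supports both
		sizes, a powerpc boot line reserving two huge page sizes
		might look like
			hugepagesz=16M hugepages=128 hugepagesz=16G hugepages=2 default_hugepagesz=16M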
793 793
794 i8042.direct [HW] Put keyboard port into non-translated mode 794 i8042.direct [HW] Put keyboard port into non-translated mode
795 i8042.dumbkbd [HW] Pretend that controller can only read data from 795 i8042.dumbkbd [HW] Pretend that controller can only read data from
796 keyboard and cannot control its state 796 keyboard and cannot control its state
797 (Don't attempt to blink the leds) 797 (Don't attempt to blink the leds)
798 i8042.noaux [HW] Don't check for auxiliary (== mouse) port 798 i8042.noaux [HW] Don't check for auxiliary (== mouse) port
799 i8042.nokbd [HW] Don't check/create keyboard port 799 i8042.nokbd [HW] Don't check/create keyboard port
800 i8042.noloop [HW] Disable the AUX Loopback command while probing 800 i8042.noloop [HW] Disable the AUX Loopback command while probing
801 for the AUX port 801 for the AUX port
802 i8042.nomux [HW] Don't check presence of an active multiplexing 802 i8042.nomux [HW] Don't check presence of an active multiplexing
803 controller 803 controller
804 i8042.nopnp [HW] Don't use ACPIPnP / PnPBIOS to discover KBD/AUX 804 i8042.nopnp [HW] Don't use ACPIPnP / PnPBIOS to discover KBD/AUX
805 controllers 805 controllers
806 i8042.panicblink= 806 i8042.panicblink=
807 [HW] Frequency with which keyboard LEDs should blink 807 [HW] Frequency with which keyboard LEDs should blink
808 when kernel panics (default is 0.5 sec) 808 when kernel panics (default is 0.5 sec)
809 i8042.reset [HW] Reset the controller during init and cleanup 809 i8042.reset [HW] Reset the controller during init and cleanup
810 i8042.unlock [HW] Unlock (ignore) the keylock 810 i8042.unlock [HW] Unlock (ignore) the keylock
811 811
812 i810= [HW,DRM] 812 i810= [HW,DRM]
813 813
814 i8k.ignore_dmi [HW] Continue probing hardware even if DMI data 814 i8k.ignore_dmi [HW] Continue probing hardware even if DMI data
815 indicates that the driver is running on unsupported 815 indicates that the driver is running on unsupported
816 hardware. 816 hardware.
817 i8k.force [HW] Activate i8k driver even if SMM BIOS signature 817 i8k.force [HW] Activate i8k driver even if SMM BIOS signature
818 does not match list of supported models. 818 does not match list of supported models.
819 i8k.power_status 819 i8k.power_status
820 [HW] Report power status in /proc/i8k 820 [HW] Report power status in /proc/i8k
821 (disabled by default) 821 (disabled by default)
822 i8k.restricted [HW] Allow controlling fans only if SYS_ADMIN 822 i8k.restricted [HW] Allow controlling fans only if SYS_ADMIN
823 capability is set. 823 capability is set.
824 824
825 ibmmcascsi= [HW,MCA,SCSI] IBM MicroChannel SCSI adapter 825 ibmmcascsi= [HW,MCA,SCSI] IBM MicroChannel SCSI adapter
826 See Documentation/mca.txt. 826 See Documentation/mca.txt.
827 827
828 icn= [HW,ISDN] 828 icn= [HW,ISDN]
829 Format: <io>[,<membase>[,<icn_id>[,<icn_id2>]]] 829 Format: <io>[,<membase>[,<icn_id>[,<icn_id2>]]]
830 830
831 ide= [HW] (E)IDE subsystem 831 ide= [HW] (E)IDE subsystem
832 Format: ide=nodma or ide=doubler 832 Format: ide=nodma or ide=doubler
833 See Documentation/ide/ide.txt. 833 See Documentation/ide/ide.txt.
834 834
835 idebus= [HW] (E)IDE subsystem - VLB/PCI bus speed 835 idebus= [HW] (E)IDE subsystem - VLB/PCI bus speed
836 See Documentation/ide/ide.txt. 836 See Documentation/ide/ide.txt.
837 837
838 idle= [X86] 838 idle= [X86]
839 Format: idle=poll or idle=mwait, idle=halt, idle=nomwait 839 Format: idle=poll or idle=mwait, idle=halt, idle=nomwait
840 Poll forces a polling idle loop that can slightly improve the performance 840 Poll forces a polling idle loop that can slightly improve the performance
841 of waking up an idle CPU, but will use a lot of power and make the system 841 of waking up an idle CPU, but will use a lot of power and make the system
842 run hot. Not recommended. 842 run hot. Not recommended.
843 idle=mwait: On systems which support MONITOR/MWAIT but where the kernel chose 843 idle=mwait: On systems which support MONITOR/MWAIT but where the kernel chose
844 not to use it because it doesn't save as much power as a normal idle loop, 844 not to use it because it doesn't save as much power as a normal idle loop,
845 use the MONITOR/MWAIT idle loop anyway. Performance should be the same 845 use the MONITOR/MWAIT idle loop anyway. Performance should be the same
846 as idle=poll. 846 as idle=poll.
847 idle=halt: Force halt to be used for CPU idle. 847 idle=halt: Force halt to be used for CPU idle.
848 In that case, C2/C3 power states won't be used. 848 In that case, C2/C3 power states won't be used.
849 idle=nomwait: Disable mwait for CPU C-states. 849 idle=nomwait: Disable mwait for CPU C-states.
850 850
851 ide-pci-generic.all-generic-ide [HW] (E)IDE subsystem 851 ide-pci-generic.all-generic-ide [HW] (E)IDE subsystem
852 Claim all unknown PCI IDE storage controllers. 852 Claim all unknown PCI IDE storage controllers.
853 853
854 ignore_loglevel [KNL] 854 ignore_loglevel [KNL]
855 Ignore loglevel setting - this will print /all/ 855 Ignore loglevel setting - this will print /all/
856 kernel messages to the console. Useful for debugging. 856 kernel messages to the console. Useful for debugging.
857 857
858 ihash_entries= [KNL] 858 ihash_entries= [KNL]
859 Set number of hash buckets for inode cache. 859 Set number of hash buckets for inode cache.
860 860
861 in2000= [HW,SCSI] 861 in2000= [HW,SCSI]
862 See header of drivers/scsi/in2000.c. 862 See header of drivers/scsi/in2000.c.
863 863
864 init= [KNL] 864 init= [KNL]
865 Format: <full_path> 865 Format: <full_path>
866 Run specified binary instead of /sbin/init as init 866 Run specified binary instead of /sbin/init as init
867 process. 867 process.
868 868
869 initcall_debug [KNL] Trace initcalls as they are executed. Useful 869 initcall_debug [KNL] Trace initcalls as they are executed. Useful
870 for working out where the kernel is dying during 870 for working out where the kernel is dying during
871 startup. 871 startup.
872 872
873 initrd= [BOOT] Specify the location of the initial ramdisk 873 initrd= [BOOT] Specify the location of the initial ramdisk
874 874
875 inport.irq= [HW] Inport (ATI XL and Microsoft) busmouse driver 875 inport.irq= [HW] Inport (ATI XL and Microsoft) busmouse driver
876 Format: <irq> 876 Format: <irq>
877 877
878 inttest= [IA64] 878 inttest= [IA64]
879 879
880 iommu= [x86] 880 iommu= [x86]
881 off 881 off
882 force 882 force
883 noforce 883 noforce
884 biomerge 884 biomerge
885 panic 885 panic
886 nopanic 886 nopanic
887 merge 887 merge
888 nomerge 888 nomerge
889 forcesac 889 forcesac
890 soft 890 soft
891 891
892 892
893 intel_iommu= [DMAR] Intel IOMMU driver (DMAR) option 893 intel_iommu= [DMAR] Intel IOMMU driver (DMAR) option
894 off 894 off
895 Disable intel iommu driver. 895 Disable intel iommu driver.
896 igfx_off [Default Off] 896 igfx_off [Default Off]
897 By default, gfx is mapped as a normal device. If a gfx 897 By default, gfx is mapped as a normal device. If a gfx
898 device has a dedicated DMAR unit, the DMAR unit is 898 device has a dedicated DMAR unit, the DMAR unit is
899 bypassed by not enabling DMAR with this option. In 899 bypassed by not enabling DMAR with this option. In
900 this case, the gfx device will use physical addresses 900 this case, the gfx device will use physical addresses
901 for DMA. 901 for DMA.
902 forcedac [x86_64] 902 forcedac [x86_64]
903 With this option, the IOMMU will not optimize by looking 903 With this option, the IOMMU will not optimize by looking
904 for I/O virtual addresses below 32 bits, forcing a dual 904 for I/O virtual addresses below 32 bits, forcing a dual
905 address cycle on the PCI bus for cards supporting greater 905 address cycle on the PCI bus for cards supporting greater
906 than 32-bit addressing. The default is to look for a 906 than 32-bit addressing. The default is to look for a
907 translation below 32 bits and, if none is available, to 907 translation below 32 bits and, if none is available, to
908 look in the higher range. 908 look in the higher range.
909 strict [Default Off] 909 strict [Default Off]
910 With this option on, every unmap_single operation will 910 With this option on, every unmap_single operation will
911 result in a hardware IOTLB flush operation as opposed 911 result in a hardware IOTLB flush operation as opposed
912 to batching them for performance. 912 to batching them for performance.
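		Illustrative usage (each sub-option is passed as the
		value of intel_iommu=; whether several sub-options can be
		combined on one boot line is not covered here):
			intel_iommu=off
			intel_iommu=strict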
913 913
914 io_delay= [X86-32,X86-64] I/O delay method 914 io_delay= [X86-32,X86-64] I/O delay method
915 0x80 915 0x80
916 Standard port 0x80 based delay 916 Standard port 0x80 based delay
917 0xed 917 0xed
918 Alternate port 0xed based delay (needed on some systems) 918 Alternate port 0xed based delay (needed on some systems)
919 udelay 919 udelay
920 Simple two microseconds delay 920 Simple two microseconds delay
921 none 921 none
922 No delay 922 No delay
923 923
924 io7= [HW] IO7 for Marvel based alpha systems 924 io7= [HW] IO7 for Marvel based alpha systems
925 See comment before marvel_specify_io7 in 925 See comment before marvel_specify_io7 in
926 arch/alpha/kernel/core_marvel.c. 926 arch/alpha/kernel/core_marvel.c.
927 927
928 ip= [IP_PNP] 928 ip= [IP_PNP]
929 See Documentation/filesystems/nfsroot.txt. 929 See Documentation/filesystems/nfsroot.txt.
930 930
931 ip2= [HW] Set IO/IRQ pairs for up to 4 IntelliPort boards 931 ip2= [HW] Set IO/IRQ pairs for up to 4 IntelliPort boards
932 See comment before ip2_setup() in 932 See comment before ip2_setup() in
933 drivers/char/ip2/ip2base.c. 933 drivers/char/ip2/ip2base.c.
934 934
935 ips= [HW,SCSI] Adaptec / IBM ServeRAID controller 935 ips= [HW,SCSI] Adaptec / IBM ServeRAID controller
936 See header of drivers/scsi/ips.c. 936 See header of drivers/scsi/ips.c.
937 937
938 ports= [IP_VS_FTP] IPVS ftp helper module 938 ports= [IP_VS_FTP] IPVS ftp helper module
939 Default is 21. 939 Default is 21.
940 Up to 8 (IP_VS_APP_MAX_PORTS) ports 940 Up to 8 (IP_VS_APP_MAX_PORTS) ports
941 may be specified. 941 may be specified.
942 Format: <port>,<port>.... 942 Format: <port>,<port>....
943 943
944 irqfixup [HW] 944 irqfixup [HW]
945 When an interrupt is not handled search all handlers 945 When an interrupt is not handled search all handlers
946 for it. Intended to get systems with badly broken 946 for it. Intended to get systems with badly broken
947 firmware running. 947 firmware running.
948 948
949 irqpoll [HW] 949 irqpoll [HW]
950 When an interrupt is not handled search all handlers 950 When an interrupt is not handled search all handlers
951 for it. Also check all handlers each timer 951 for it. Also check all handlers each timer
952 interrupt. Intended to get systems with badly broken 952 interrupt. Intended to get systems with badly broken
953 firmware running. 953 firmware running.
954 954
955 isapnp= [ISAPNP] 955 isapnp= [ISAPNP]
956 Format: <RDP>,<reset>,<pci_scan>,<verbosity> 956 Format: <RDP>,<reset>,<pci_scan>,<verbosity>
957 957
958 isolcpus= [KNL,SMP] Isolate CPUs from the general scheduler. 958 isolcpus= [KNL,SMP] Isolate CPUs from the general scheduler.
959 Format: 959 Format:
960 <cpu number>,...,<cpu number> 960 <cpu number>,...,<cpu number>
961 or 961 or
962 <cpu number>-<cpu number> (must be a positive range in ascending order) 962 <cpu number>-<cpu number> (must be a positive range in ascending order)
963 or a mixture 963 or a mixture
964 <cpu number>,...,<cpu number>-<cpu number> 964 <cpu number>,...,<cpu number>-<cpu number>
965 This option can be used to specify one or more CPUs 965 This option can be used to specify one or more CPUs
966 to isolate from the general SMP balancing and scheduling 966 to isolate from the general SMP balancing and scheduling
967 algorithms. The only way to move a process onto or off 967 algorithms. The only way to move a process onto or off
968 an "isolated" CPU is via the CPU affinity syscalls. 968 an "isolated" CPU is via the CPU affinity syscalls.
969 <cpu number> begins at 0 and the maximum value is 969 <cpu number> begins at 0 and the maximum value is
970 "number of CPUs in system - 1". 970 "number of CPUs in system - 1".
971 971
972 This option is the preferred way to isolate CPUs. The 972 This option is the preferred way to isolate CPUs. The
973 alternative -- manually setting the CPU mask of all 973 alternative -- manually setting the CPU mask of all
974 tasks in the system -- can cause problems and 974 tasks in the system -- can cause problems and
975 suboptimal load balancer performance. 975 suboptimal load balancer performance.
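		Illustrative example, assuming a machine with at least
		12 CPUs: isolate CPUs 1, 2, 10 and 11 from the general
		scheduler with
			isolcpus=1,2,10-11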
976 976
977 iucv= [HW,NET] 977 iucv= [HW,NET]
978 978
979 js= [HW,JOY] Analog joystick 979 js= [HW,JOY] Analog joystick
980 See Documentation/input/joystick.txt. 980 See Documentation/input/joystick.txt.
981 981
982 kernelcore=nn[KMG] [KNL,X86-32,IA-64,PPC,X86-64] This parameter 982 kernelcore=nn[KMG] [KNL,X86-32,IA-64,PPC,X86-64] This parameter
983 specifies the amount of memory usable by the kernel 983 specifies the amount of memory usable by the kernel
984 for non-movable allocations. The requested amount is 984 for non-movable allocations. The requested amount is
985 spread evenly throughout all nodes in the system. The 985 spread evenly throughout all nodes in the system. The
986 remaining memory in each node is used for Movable 986 remaining memory in each node is used for Movable
987 pages. In the event a node is too small to have both 987 pages. In the event a node is too small to have both
988 kernelcore and Movable pages, kernelcore pages will 988 kernelcore and Movable pages, kernelcore pages will
989 take priority and other nodes will have a larger number 989 take priority and other nodes will have a larger number
990 of kernelcore pages. The Movable zone is used for the 990 of kernelcore pages. The Movable zone is used for the
991 allocation of pages that may be reclaimed or moved 991 allocation of pages that may be reclaimed or moved
992 by the page migration subsystem. This means that 992 by the page migration subsystem. This means that
993 HugeTLB pages may not be allocated from this zone. 993 HugeTLB pages may not be allocated from this zone.
994 Note that allocations like PTEs-from-HighMem still 994 Note that allocations like PTEs-from-HighMem still
995 use the HighMem zone if it exists, and the Normal 995 use the HighMem zone if it exists, and the Normal
996 zone if it does not. 996 zone if it does not.
997 997
998 movablecore=nn[KMG] [KNL,X86-32,IA-64,PPC,X86-64] This parameter 998 movablecore=nn[KMG] [KNL,X86-32,IA-64,PPC,X86-64] This parameter
999 is similar to kernelcore except it specifies the 999 is similar to kernelcore except it specifies the
1000 amount of memory used for migratable allocations. 1000 amount of memory used for migratable allocations.
1001 If both kernelcore and movablecore are specified, 1001 If both kernelcore and movablecore are specified,
1002 then kernelcore will be at *least* the specified 1002 then kernelcore will be at *least* the specified
1003 value but may be more. If movablecore on its own 1003 value but may be more. If movablecore on its own
1004 is specified, the administrator must be careful 1004 is specified, the administrator must be careful
1005 that the amount of memory usable for all allocations 1005 that the amount of memory usable for all allocations
1006 is not too small. 1006 is not too small.
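		Illustrative example: reserve roughly 512MB for
		non-movable kernel allocations, spread across the nodes,
		and leave the remaining memory for Movable pages with
			kernelcore=512M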
1007 1007
1008 keepinitrd [HW,ARM] 1008 keepinitrd [HW,ARM]
1009 1009
1010 kstack=N [X86-32,X86-64] Print N words from the kernel stack 1010 kstack=N [X86-32,X86-64] Print N words from the kernel stack
1011 in oops dumps. 1011 in oops dumps.
1012 1012
1013 kgdboc= [HW] kgdb over consoles. 1013 kgdboc= [HW] kgdb over consoles.
1014 Requires a tty driver that supports console polling. 1014 Requires a tty driver that supports console polling.
1015 (only serial supported for now) 1015 (only serial supported for now)
1016 Format: <serial_device>[,baud] 1016 Format: <serial_device>[,baud]
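		Illustrative example, assuming the console sits on a
		polling-capable serial port named ttyS0:
			kgdboc=ttyS0,115200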
1017 1017
1018 l2cr= [PPC] 1018 l2cr= [PPC]
1019 1019
1020 l3cr= [PPC] 1020 l3cr= [PPC]
1021 1021
1022 lapic [X86-32,APIC] Enable the local APIC even if BIOS 1022 lapic [X86-32,APIC] Enable the local APIC even if BIOS
1023 disabled it. 1023 disabled it.
1024 1024
1025 lapic_timer_c2_ok [X86-32,x86-64,APIC] trust the local apic timer in 1025 lapic_timer_c2_ok [X86-32,x86-64,APIC] trust the local apic timer in
1026 C2 power state. 1026 C2 power state.
1027 1027
1028 libata.dma= [LIBATA] DMA control 1028 libata.dma= [LIBATA] DMA control
1029 libata.dma=0 Disable all PATA and SATA DMA 1029 libata.dma=0 Disable all PATA and SATA DMA
1030 libata.dma=1 PATA and SATA Disk DMA only 1030 libata.dma=1 PATA and SATA Disk DMA only
1031 libata.dma=2 ATAPI (CDROM) DMA only 1031 libata.dma=2 ATAPI (CDROM) DMA only
1032 libata.dma=4 Compact Flash DMA only 1032 libata.dma=4 Compact Flash DMA only
1033 Combinations also work, so libata.dma=3 enables DMA 1033 Combinations also work, so libata.dma=3 enables DMA
1034 for disks and CDROMs, but not CFs. 1034 for disks and CDROMs, but not CFs.
1035 1035
1036 libata.noacpi [LIBATA] Disables use of ACPI in libata suspend/resume 1036 libata.noacpi [LIBATA] Disables use of ACPI in libata suspend/resume
1037 when set. 1037 when set.
1038 Format: <int> 1038 Format: <int>
1039 1039
1040 libata.force= [LIBATA] Force configurations. The format is comma 1040 libata.force= [LIBATA] Force configurations. The format is comma
1041 separated list of "[ID:]VAL" where ID is 1041 separated list of "[ID:]VAL" where ID is
1042 PORT[:DEVICE]. PORT and DEVICE are decimal numbers 1042 PORT[:DEVICE]. PORT and DEVICE are decimal numbers
1043 matching port, link or device. Basically, it matches 1043 matching port, link or device. Basically, it matches
1044 the ATA ID string printed on console by libata. If 1044 the ATA ID string printed on console by libata. If
1045 the whole ID part is omitted, the last PORT and DEVICE 1045 the whole ID part is omitted, the last PORT and DEVICE
1046 values are used. If ID hasn't been specified yet, the 1046 values are used. If ID hasn't been specified yet, the
1047 configuration applies to all ports, links and devices. 1047 configuration applies to all ports, links and devices.
1048 1048
1049 If only DEVICE is omitted, the parameter applies to 1049 If only DEVICE is omitted, the parameter applies to
1050 the port and all links and devices behind it. DEVICE 1050 the port and all links and devices behind it. DEVICE
1051 number of 0 either selects the first device or the 1051 number of 0 either selects the first device or the
1052 first fan-out link behind PMP device. It does not 1052 first fan-out link behind PMP device. It does not
1053 select the host link. DEVICE number of 15 selects the 1053 select the host link. DEVICE number of 15 selects the
1054 host link and device attached to it. 1054 host link and device attached to it.
1055 1055
1056 The VAL specifies the configuration to force. As long 1056 The VAL specifies the configuration to force. As long
1057 as there's no ambiguity shortcut notation is allowed. 1057 as there's no ambiguity shortcut notation is allowed.
1058 For example, both 1.5 and 1.5G would work for 1.5Gbps. 1058 For example, both 1.5 and 1.5G would work for 1.5Gbps.
1059 The following configurations can be forced. 1059 The following configurations can be forced.
1060 1060
1061 * Cable type: 40c, 80c, short40c, unk, ign or sata. 1061 * Cable type: 40c, 80c, short40c, unk, ign or sata.
1062 Any ID with matching PORT is used. 1062 Any ID with matching PORT is used.
1063 1063
1064 * SATA link speed limit: 1.5Gbps or 3.0Gbps. 1064 * SATA link speed limit: 1.5Gbps or 3.0Gbps.
1065 1065
1066 * Transfer mode: pio[0-7], mwdma[0-4] and udma[0-7]. 1066 * Transfer mode: pio[0-7], mwdma[0-4] and udma[0-7].
1067 udma[/][16,25,33,44,66,100,133] notation is also 1067 udma[/][16,25,33,44,66,100,133] notation is also
1068 allowed. 1068 allowed.
1069 1069
1070 * [no]ncq: Turn on or off NCQ. 1070 * [no]ncq: Turn on or off NCQ.
1071 1071
1072 If there are multiple matching configurations changing 1072 If there are multiple matching configurations changing
1073 the same attribute, the last one is used. 1073 the same attribute, the last one is used.
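		Illustrative examples following the format above (the
		port number is a placeholder; use the ATA IDs printed by
		libata on your system):
			libata.force=3.0G	limit all links to 3.0Gbps
			libata.force=1:noncq	turn off NCQ on port 1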
1074 1074
1075 load_ramdisk= [RAM] List of ramdisks to load from floppy 1075 load_ramdisk= [RAM] List of ramdisks to load from floppy
1076 See Documentation/ramdisk.txt. 1076 See Documentation/ramdisk.txt.
1077 1077
1078 lockd.nlm_grace_period=P [NFS] Assign grace period. 1078 lockd.nlm_grace_period=P [NFS] Assign grace period.
1079 Format: <integer> 1079 Format: <integer>
1080 1080
1081 lockd.nlm_tcpport=N [NFS] Assign TCP port. 1081 lockd.nlm_tcpport=N [NFS] Assign TCP port.
1082 Format: <integer> 1082 Format: <integer>
1083 1083
1084 lockd.nlm_timeout=T [NFS] Assign timeout value. 1084 lockd.nlm_timeout=T [NFS] Assign timeout value.
1085 Format: <integer> 1085 Format: <integer>
1086 1086
1087 lockd.nlm_udpport=M [NFS] Assign UDP port. 1087 lockd.nlm_udpport=M [NFS] Assign UDP port.
1088 Format: <integer> 1088 Format: <integer>
1089 1089
1090 logibm.irq= [HW,MOUSE] Logitech Bus Mouse Driver 1090 logibm.irq= [HW,MOUSE] Logitech Bus Mouse Driver
1091 Format: <irq> 1091 Format: <irq>
1092 1092
1093 loglevel= All Kernel Messages with a loglevel smaller than the 1093 loglevel= All Kernel Messages with a loglevel smaller than the
1094 console loglevel will be printed to the console. It can 1094 console loglevel will be printed to the console. It can
1095 also be changed with klogd or other programs. The 1095 also be changed with klogd or other programs. The
1096 loglevels are defined as follows: 1096 loglevels are defined as follows:
1097 1097
1098 0 (KERN_EMERG) system is unusable 1098 0 (KERN_EMERG) system is unusable
1099 1 (KERN_ALERT) action must be taken immediately 1099 1 (KERN_ALERT) action must be taken immediately
1100 2 (KERN_CRIT) critical conditions 1100 2 (KERN_CRIT) critical conditions
1101 3 (KERN_ERR) error conditions 1101 3 (KERN_ERR) error conditions
1102 4 (KERN_WARNING) warning conditions 1102 4 (KERN_WARNING) warning conditions
1103 5 (KERN_NOTICE) normal but significant condition 1103 5 (KERN_NOTICE) normal but significant condition
1104 6 (KERN_INFO) informational 1104 6 (KERN_INFO) informational
1105 7 (KERN_DEBUG) debug-level messages 1105 7 (KERN_DEBUG) debug-level messages
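		Illustrative example: with
			loglevel=4
		only messages below that level (KERN_EMERG through
		KERN_ERR) are printed to the console.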
1106 1106
1107 log_buf_len=n Sets the size of the printk ring buffer, in bytes. 1107 log_buf_len=n Sets the size of the printk ring buffer, in bytes.
1108 Format: { n | nk | nM } 1108 Format: { n | nk | nM }
1109 n must be a power of two. The default size 1109 n must be a power of two. The default size
1110 is set in the kernel config file. 1110 is set in the kernel config file.
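		Illustrative example: enlarge the ring buffer to 1MB
		(a power of two) with
			log_buf_len=1M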
1111 1111
1112 logo.nologo [FB] Disables display of the built-in Linux logo. 1112 logo.nologo [FB] Disables display of the built-in Linux logo.
1113 This may be used to provide more screen space for 1113 This may be used to provide more screen space for
1114 kernel log messages and is useful when debugging 1114 kernel log messages and is useful when debugging
1115 kernel boot problems. 1115 kernel boot problems.
1116 1116
1117 lp=0 [LP] Specify parallel ports to use, e.g., 1117 lp=0 [LP] Specify parallel ports to use, e.g.,
1118 lp=port[,port...] lp=none,parport0 (lp0 not configured, lp1 uses 1118 lp=port[,port...] lp=none,parport0 (lp0 not configured, lp1 uses
1119 lp=reset first parallel port). 'lp=0' disables the 1119 lp=reset first parallel port). 'lp=0' disables the
1120 lp=auto printer driver. 'lp=reset' (which can be 1120 lp=auto printer driver. 'lp=reset' (which can be
1121 specified in addition to the ports) causes 1121 specified in addition to the ports) causes
1122 attached printers to be reset. Using 1122 attached printers to be reset. Using
1123 lp=port1,port2,... specifies the parallel ports 1123 lp=port1,port2,... specifies the parallel ports
1124 to associate lp devices with, starting with 1124 to associate lp devices with, starting with
1125 lp0. A port specification may be 'none' to skip 1125 lp0. A port specification may be 'none' to skip
1126 that lp device, or a parport name such as 1126 that lp device, or a parport name such as
1127 'parport0'. Specifying 'lp=auto' instead of a 1127 'parport0'. Specifying 'lp=auto' instead of a
1128 port specification list means that device IDs 1128 port specification list means that device IDs
1129 from each port should be examined, to see if 1129 from each port should be examined, to see if
1130 an IEEE 1284-compliant printer is attached; if 1130 an IEEE 1284-compliant printer is attached; if
1131 so, the driver will manage that printer. 1131 so, the driver will manage that printer.
1132 See also header of drivers/char/lp.c. 1132 See also header of drivers/char/lp.c.
1133 1133
1134 lpj=n [KNL] 1134 lpj=n [KNL]
1135 Sets loops_per_jiffy to given constant, thus avoiding 1135 Sets loops_per_jiffy to given constant, thus avoiding
1136 time-consuming boot-time autodetection (up to 250 ms per 1136 time-consuming boot-time autodetection (up to 250 ms per
1137 CPU). 0 enables autodetection (default). To determine 1137 CPU). 0 enables autodetection (default). To determine
1138 the correct value for your kernel, boot with normal 1138 the correct value for your kernel, boot with normal
1139 autodetection and see what value is printed. Note that 1139 autodetection and see what value is printed. Note that
1140 on SMP systems the preset will be applied to all CPUs, 1140 on SMP systems the preset will be applied to all CPUs,
1141 which is likely to cause problems if your CPUs need 1141 which is likely to cause problems if your CPUs need
1142 significantly divergent settings. An incorrect value 1142 significantly divergent settings. An incorrect value
1143 will cause delays in the kernel to be wrong, leading to 1143 will cause delays in the kernel to be wrong, leading to
1144 unpredictable I/O errors and other breakage. Although 1144 unpredictable I/O errors and other breakage. Although
1145 unlikely, in the extreme case this might damage your 1145 unlikely, in the extreme case this might damage your
1146 hardware. 1146 hardware.
1147 1147
1148 ltpc= [NET] 1148 ltpc= [NET]
1149 Format: <io>,<irq>,<dma> 1149 Format: <io>,<irq>,<dma>
1150 1150
1151 mac5380= [HW,SCSI] Format: 1151 mac5380= [HW,SCSI] Format:
1152 <can_queue>,<cmd_per_lun>,<sg_tablesize>,<hostid>,<use_tags> 1152 <can_queue>,<cmd_per_lun>,<sg_tablesize>,<hostid>,<use_tags>
1153 1153
1154 machvec= [IA64] Force the use of a particular machine-vector 1154 machvec= [IA64] Force the use of a particular machine-vector
1155 (machvec) in a generic kernel. 1155 (machvec) in a generic kernel.
1156 Example: machvec=hpzx1_swiotlb 1156 Example: machvec=hpzx1_swiotlb
1157 1157
1158 max_loop= [LOOP] Maximum number of loopback devices that can 1158 max_loop= [LOOP] Maximum number of loopback devices that can
1159 be mounted 1159 be mounted
1160 Format: <1-256> 1160 Format: <1-256>
1161 1161
1162 maxcpus= [SMP] Maximum number of processors that an SMP kernel 1162 maxcpus= [SMP] Maximum number of processors that an SMP kernel
1163 should make use of. maxcpus=n : n >= 0 limits the 1163 should make use of. maxcpus=n : n >= 0 limits the
1164 kernel to using 'n' processors. n=0 is a special case, 1164 kernel to using 'n' processors. n=0 is a special case,
1165 it is equivalent to "nosmp", which also disables 1165 it is equivalent to "nosmp", which also disables
1166 the IO APIC. 1166 the IO APIC.
1167 1167
1168 max_addr=[KMG] [KNL,BOOT,ia64] All physical memory greater than or 1168 max_addr=[KMG] [KNL,BOOT,ia64] All physical memory greater than or
1169 equal to this physical address is ignored. 1169 equal to this physical address is ignored.
1170 1170
1171 max_luns= [SCSI] Maximum number of LUNs to probe. 1171 max_luns= [SCSI] Maximum number of LUNs to probe.
1172 Should be between 1 and 2^32-1. 1172 Should be between 1 and 2^32-1.
1173 1173
1174 max_report_luns= 1174 max_report_luns=
1175 [SCSI] Maximum number of LUNs received. 1175 [SCSI] Maximum number of LUNs received.
1176 Should be between 1 and 16384. 1176 Should be between 1 and 16384.
1177 1177
1178 mcatest= [IA-64] 1178 mcatest= [IA-64]
1179 1179
1180 mce [X86-32] Machine Check Exception 1180 mce [X86-32] Machine Check Exception
1181 1181
1182 mce=option [X86-64] See Documentation/x86_64/boot-options.txt 1182 mce=option [X86-64] See Documentation/x86_64/boot-options.txt
1183 1183
1184 md= [HW] RAID subsystems devices and level 1184 md= [HW] RAID subsystems devices and level
1185 See Documentation/md.txt. 1185 See Documentation/md.txt.
1186 1186
1187 mdacon= [MDA] 1187 mdacon= [MDA]
1188 Format: <first>,<last> 1188 Format: <first>,<last>
1189 Specifies range of consoles to be captured by the MDA. 1189 Specifies range of consoles to be captured by the MDA.
1190 1190
1191 mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory 1191 mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory
1192 Amount of memory to be used when the kernel is not able 1192 Amount of memory to be used when the kernel is not able
1193 to see the whole system memory or for testing. 1193 to see the whole system memory or for testing.
1194 [X86-32] Use together with memmap= to avoid physical 1194 [X86-32] Use together with memmap= to avoid physical
1195 address space collisions. Without memmap= PCI devices 1195 address space collisions. Without memmap= PCI devices
1196 could be placed at addresses belonging to unused RAM. 1196 could be placed at addresses belonging to unused RAM.
1197 1197
1198 mem=nopentium [BUGS=X86-32] Disable usage of 4MB pages for kernel 1198 mem=nopentium [BUGS=X86-32] Disable usage of 4MB pages for kernel
1199 memory. 1199 memory.
1200 1200
1201 memmap=exactmap [KNL,X86-32,X86_64] Enable setting of an exact 1201 memmap=exactmap [KNL,X86-32,X86_64] Enable setting of an exact
1202 E820 memory map, as specified by the user. 1202 E820 memory map, as specified by the user.
1203 Such memmap=exactmap lines can be constructed based on 1203 Such memmap=exactmap lines can be constructed based on
1204 BIOS output or other requirements. See the memmap=nn@ss 1204 BIOS output or other requirements. See the memmap=nn@ss
1205 option description. 1205 option description.
1206 1206
1207 memmap=nn[KMG]@ss[KMG] 1207 memmap=nn[KMG]@ss[KMG]
1208 [KNL] Force usage of a specific region of memory 1208 [KNL] Force usage of a specific region of memory
1209 Region of memory to be used, from ss to ss+nn. 1209 Region of memory to be used, from ss to ss+nn.
1210 1210
1211 memmap=nn[KMG]#ss[KMG] 1211 memmap=nn[KMG]#ss[KMG]
1212 [KNL,ACPI] Mark specific memory as ACPI data. 1212 [KNL,ACPI] Mark specific memory as ACPI data.
1213 Region of memory to be used, from ss to ss+nn. 1213 Region of memory to be used, from ss to ss+nn.
1214 1214
1215 memmap=nn[KMG]$ss[KMG] 1215 memmap=nn[KMG]$ss[KMG]
1216 [KNL,ACPI] Mark specific memory as reserved. 1216 [KNL,ACPI] Mark specific memory as reserved.
1217 Region of memory to be used, from ss to ss+nn. 1217 Region of memory to be used, from ss to ss+nn.
1218 Example: Exclude memory from 0x18690000-0x1869ffff 1218 Example: Exclude memory from 0x18690000-0x1869ffff
1219 memmap=64K$0x18690000 1219 memmap=64K$0x18690000
1220 or 1220 or
1221 memmap=0x10000$0x18690000 1221 memmap=0x10000$0x18690000
1222 1222
1223 memtest= [KNL,X86] Enable memtest 1223 memtest= [KNL,X86] Enable memtest
1224 Format: <integer> 1224 Format: <integer>
1225 range: 0,4 : pattern number 1225 range: 0,4 : pattern number
1226 default : 0 <disable> 1226 default : 0 <disable>
1227 1227
1228 meye.*= [HW] Set MotionEye Camera parameters 1228 meye.*= [HW] Set MotionEye Camera parameters
1229 See Documentation/video4linux/meye.txt. 1229 See Documentation/video4linux/meye.txt.
1230 1230
1231 mfgpt_irq= [IA-32] Specify the IRQ to use for the 1231 mfgpt_irq= [IA-32] Specify the IRQ to use for the
1232 Multi-Function General Purpose Timers on AMD Geode 1232 Multi-Function General Purpose Timers on AMD Geode
1233 platforms. 1233 platforms.
1234 1234
1235 mfgptfix [X86-32] Fix MFGPT timers on AMD Geode platforms when 1235 mfgptfix [X86-32] Fix MFGPT timers on AMD Geode platforms when
1236 the BIOS has incorrectly applied a workaround. TinyBIOS 1236 the BIOS has incorrectly applied a workaround. TinyBIOS
1237 version 0.98 is known to be affected, 0.99 fixes the 1237 version 0.98 is known to be affected, 0.99 fixes the
1238 problem by letting the user disable the workaround. 1238 problem by letting the user disable the workaround.
1239 1239
1240 mga= [HW,DRM] 1240 mga= [HW,DRM]
1241 1241
1242 mminit_loglevel= 1242 mminit_loglevel=
1243 [KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this 1243 [KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
1244 parameter allows control of the logging verbosity for 1244 parameter allows control of the logging verbosity for
1245 the additional memory initialisation checks. A value 1245 the additional memory initialisation checks. A value
1246 of 0 disables mminit logging and a level of 4 will 1246 of 0 disables mminit logging and a level of 4 will
1247 log everything. Information is printed at KERN_DEBUG 1247 log everything. Information is printed at KERN_DEBUG
1248 so loglevel=8 may also need to be specified. 1248 so loglevel=8 may also need to be specified.
1249 1249
1250 mousedev.tap_time= 1250 mousedev.tap_time=
1251 [MOUSE] Maximum time between finger touching and 1251 [MOUSE] Maximum time between finger touching and
1252 leaving touchpad surface for touch to be considered 1252 leaving touchpad surface for touch to be considered
1253 a tap and be reported as a left button click (for 1253 a tap and be reported as a left button click (for
1254 touchpads working in absolute mode only). 1254 touchpads working in absolute mode only).
1255 Format: <msecs> 1255 Format: <msecs>
1256 mousedev.xres= [MOUSE] Horizontal screen resolution, used for devices 1256 mousedev.xres= [MOUSE] Horizontal screen resolution, used for devices
1257 reporting absolute coordinates, such as tablets 1257 reporting absolute coordinates, such as tablets
1258 mousedev.yres= [MOUSE] Vertical screen resolution, used for devices 1258 mousedev.yres= [MOUSE] Vertical screen resolution, used for devices
1259 reporting absolute coordinates, such as tablets 1259 reporting absolute coordinates, such as tablets
1260 1260
1261 mpu401= [HW,OSS] 1261 mpu401= [HW,OSS]
1262 Format: <io>,<irq> 1262 Format: <io>,<irq>
1263 1263
1264 MTD_Partition= [MTD] 1264 MTD_Partition= [MTD]
1265 Format: <name>,<region-number>,<size>,<offset> 1265 Format: <name>,<region-number>,<size>,<offset>
1266 1266
1267 MTD_Region= [MTD] Format: 1267 MTD_Region= [MTD] Format:
1268 <name>,<region-number>[,<base>,<size>,<buswidth>,<altbuswidth>] 1268 <name>,<region-number>[,<base>,<size>,<buswidth>,<altbuswidth>]
1269 1269
1270 mtdparts= [MTD] 1270 mtdparts= [MTD]
1271 See drivers/mtd/cmdlinepart.c. 1271 See drivers/mtd/cmdlinepart.c.
1272 1272
1273 mtdset= [ARM] 1273 mtdset= [ARM]
1274 ARM/S3C2412 JIVE boot control 1274 ARM/S3C2412 JIVE boot control
1275 1275
1276 See arch/arm/mach-s3c2412/mach-jive.c 1276 See arch/arm/mach-s3c2412/mach-jive.c
1277 1277
1278 mtouchusb.raw_coordinates= 1278 mtouchusb.raw_coordinates=
1279 [HW] Make the MicroTouch USB driver use raw coordinates 1279 [HW] Make the MicroTouch USB driver use raw coordinates
1280 ('y', default) or cooked coordinates ('n') 1280 ('y', default) or cooked coordinates ('n')
1281 1281
1282 n2= [NET] SDL Inc. RISCom/N2 synchronous serial card 1282 n2= [NET] SDL Inc. RISCom/N2 synchronous serial card
1283 1283
1284 NCR_D700= [HW,SCSI] 1284 NCR_D700= [HW,SCSI]
1285 See header of drivers/scsi/NCR_D700.c. 1285 See header of drivers/scsi/NCR_D700.c.
1286 1286
1287 ncr5380= [HW,SCSI] 1287 ncr5380= [HW,SCSI]
1288 1288
1289 ncr53c400= [HW,SCSI] 1289 ncr53c400= [HW,SCSI]
1290 1290
1291 ncr53c400a= [HW,SCSI] 1291 ncr53c400a= [HW,SCSI]
1292 1292
1293 ncr53c406a= [HW,SCSI] 1293 ncr53c406a= [HW,SCSI]
1294 1294
1295 ncr53c8xx= [HW,SCSI] 1295 ncr53c8xx= [HW,SCSI]
1296 1296
1297 netdev= [NET] Network devices parameters 1297 netdev= [NET] Network devices parameters
1298 Format: <irq>,<io>,<mem_start>,<mem_end>,<name> 1298 Format: <irq>,<io>,<mem_start>,<mem_end>,<name>
1299 Note that mem_start is often overloaded to mean 1299 Note that mem_start is often overloaded to mean
1300 something different and driver-specific. 1300 something different and driver-specific.
1301 This usage is only documented in each driver source 1301 This usage is only documented in each driver source
1302 file if at all. 1302 file if at all.
1303 1303
1304 nf_conntrack.acct= 1304 nf_conntrack.acct=
1305 [NETFILTER] Enable connection tracking flow accounting 1305 [NETFILTER] Enable connection tracking flow accounting
1306 0 to disable accounting 1306 0 to disable accounting
1307 1 to enable accounting 1307 1 to enable accounting
1308 Default value depends on CONFIG_NF_CT_ACCT, which is 1308 Default value depends on CONFIG_NF_CT_ACCT, which is
1309 going to be removed in 2.6.29. 1309 going to be removed in 2.6.29.
1310 1310
1311 nfsaddrs= [NFS] 1311 nfsaddrs= [NFS]
1312 See Documentation/filesystems/nfsroot.txt. 1312 See Documentation/filesystems/nfsroot.txt.
1313 1313
1314 nfsroot= [NFS] nfs root filesystem for disk-less boxes. 1314 nfsroot= [NFS] nfs root filesystem for disk-less boxes.
1315 See Documentation/filesystems/nfsroot.txt. 1315 See Documentation/filesystems/nfsroot.txt.
1316 1316
1317 nfs.callback_tcpport= 1317 nfs.callback_tcpport=
1318 [NFS] set the TCP port on which the NFSv4 callback 1318 [NFS] set the TCP port on which the NFSv4 callback
1319 channel should listen. 1319 channel should listen.
1320 1320
1321 nfs.idmap_cache_timeout= 1321 nfs.idmap_cache_timeout=
1322 [NFS] set the maximum lifetime for idmapper cache 1322 [NFS] set the maximum lifetime for idmapper cache
1323 entries. 1323 entries.
1324 1324
1325 nfs.enable_ino64= 1325 nfs.enable_ino64=
1326 [NFS] enable 64-bit inode numbers. 1326 [NFS] enable 64-bit inode numbers.
1327 If zero, the NFS client will fake up a 32-bit inode 1327 If zero, the NFS client will fake up a 32-bit inode
1328 number for the readdir() and stat() syscalls instead 1328 number for the readdir() and stat() syscalls instead
1329 of returning the full 64-bit number. 1329 of returning the full 64-bit number.
1330 The default is to return 64-bit inode numbers. 1330 The default is to return 64-bit inode numbers.
1331 1331
1332 nmi_debug= [KNL,AVR32] Specify one or more actions to take 1332 nmi_debug= [KNL,AVR32] Specify one or more actions to take
1333 when a NMI is triggered. 1333 when a NMI is triggered.
1334 Format: [state][,regs][,debounce][,die] 1334 Format: [state][,regs][,debounce][,die]
1335 1335
1336 nmi_watchdog= [KNL,BUGS=X86-32] Debugging features for SMP kernels 1336 nmi_watchdog= [KNL,BUGS=X86-32] Debugging features for SMP kernels
1337 1337
1338 no387 [BUGS=X86-32] Tells the kernel to use the 387 maths 1338 no387 [BUGS=X86-32] Tells the kernel to use the 387 maths
1339 emulation library even if a 387 maths coprocessor 1339 emulation library even if a 387 maths coprocessor
1340 is present. 1340 is present.
1341 1341
1342 noaliencache [MM, NUMA, SLAB] Disables the allocation of alien 1342 noaliencache [MM, NUMA, SLAB] Disables the allocation of alien
1343 caches in the slab allocator. Saves per-node memory, 1343 caches in the slab allocator. Saves per-node memory,
1344 but will impact performance. 1344 but will impact performance.
1345 1345
1346 noalign [KNL,ARM] 1346 noalign [KNL,ARM]
1347 1347
1348 noapic [SMP,APIC] Tells the kernel to not make use of any 1348 noapic [SMP,APIC] Tells the kernel to not make use of any
1349 IOAPICs that may be present in the system. 1349 IOAPICs that may be present in the system.
1350 1350
1351 nobats [PPC] Do not use BATs for mapping kernel lowmem 1351 nobats [PPC] Do not use BATs for mapping kernel lowmem
1352 on "Classic" PPC cores. 1352 on "Classic" PPC cores.
1353 1353
1354 nocache [ARM] 1354 nocache [ARM]
1355 1355
1356 nodelayacct [KNL] Disable per-task delay accounting 1356 nodelayacct [KNL] Disable per-task delay accounting
1357 1357
1358 nodisconnect [HW,SCSI,M68K] Disables SCSI disconnects. 1358 nodisconnect [HW,SCSI,M68K] Disables SCSI disconnects.
1359 1359
1360 noefi [X86-32,X86-64] Disable EFI runtime services support. 1360 noefi [X86-32,X86-64] Disable EFI runtime services support.
1361 1361
1362 noexec [IA-64] 1362 noexec [IA-64]
1363 1363
1364 noexec [X86-32,X86-64] 1364 noexec [X86-32,X86-64]
1365 On X86-32 available only on PAE configured kernels. 1365 On X86-32 available only on PAE configured kernels.
1366 noexec=on: enable non-executable mappings (default) 1366 noexec=on: enable non-executable mappings (default)
1367 noexec=off: disable non-executable mappings 1367 noexec=off: disable non-executable mappings
1368 1368
1369 noexec32 [X86-64] 1369 noexec32 [X86-64]
1370 This affects only 32-bit executables. 1370 This affects only 32-bit executables.
1371 noexec32=on: enable non-executable mappings (default) 1371 noexec32=on: enable non-executable mappings (default)
1372 read doesn't imply executable mappings 1372 read doesn't imply executable mappings
1373 noexec32=off: disable non-executable mappings 1373 noexec32=off: disable non-executable mappings
1374 read implies executable mappings 1374 read implies executable mappings
1375 1375
1376 nofxsr [BUGS=X86-32] Disables x86 floating point extended 1376 nofxsr [BUGS=X86-32] Disables x86 floating point extended
1377 register save and restore. The kernel will only save 1377 register save and restore. The kernel will only save
1378 legacy floating-point registers on task switch. 1378 legacy floating-point registers on task switch.
1379 1379
1380 noclflush [BUGS=X86] Don't use the CLFLUSH instruction 1380 noclflush [BUGS=X86] Don't use the CLFLUSH instruction
1381 1381
1382 nohlt [BUGS=ARM] 1382 nohlt [BUGS=ARM]
1383 1383
1384 no-hlt [BUGS=X86-32] Tells the kernel that the hlt 1384 no-hlt [BUGS=X86-32] Tells the kernel that the hlt
1385 instruction doesn't work correctly and not to 1385 instruction doesn't work correctly and not to
1386 use it. 1386 use it.
1387 1387
1388 nohalt [IA-64] Tells the kernel not to use the power saving 1388 nohalt [IA-64] Tells the kernel not to use the power saving
1389 function PAL_HALT_LIGHT when idle. This increases 1389 function PAL_HALT_LIGHT when idle. This increases
1390 power-consumption. On the positive side, it reduces 1390 power-consumption. On the positive side, it reduces
1391 interrupt wake-up latency, which may improve performance 1391 interrupt wake-up latency, which may improve performance
1392 in certain environments such as networked servers or 1392 in certain environments such as networked servers or
1393 real-time systems. 1393 real-time systems.
1394 1394
1395 nohz= [KNL] Boottime enable/disable dynamic ticks 1395 nohz= [KNL] Boottime enable/disable dynamic ticks
1396 Valid arguments: on, off 1396 Valid arguments: on, off
1397 Default: on 1397 Default: on
1398 1398
1399 noirqbalance [X86-32,SMP,KNL] Disable kernel irq balancing 1399 noirqbalance [X86-32,SMP,KNL] Disable kernel irq balancing
1400 1400
1401 noirqdebug [X86-32] Disables the code which attempts to detect and 1401 noirqdebug [X86-32] Disables the code which attempts to detect and
1402 disable unhandled interrupt sources. 1402 disable unhandled interrupt sources.
1403 1403
1404 no_timer_check [X86-32,X86_64,APIC] Disables the code which tests for 1404 no_timer_check [X86-32,X86_64,APIC] Disables the code which tests for
1405 broken timer IRQ sources. 1405 broken timer IRQ sources.
1406 1406
1407 noisapnp [ISAPNP] Disables ISA PnP code. 1407 noisapnp [ISAPNP] Disables ISA PnP code.
1408 1408
1409 noinitrd [RAM] Tells the kernel not to load any configured 1409 noinitrd [RAM] Tells the kernel not to load any configured
1410 initial RAM disk. 1410 initial RAM disk.
1411 1411
1412 nointroute [IA-64] 1412 nointroute [IA-64]
1413 1413
1414 nojitter [IA64] Disables jitter checking for ITC timers. 1414 nojitter [IA64] Disables jitter checking for ITC timers.
1415 1415
1416 nolapic [X86-32,APIC] Do not enable or use the local APIC. 1416 nolapic [X86-32,APIC] Do not enable or use the local APIC.
1417 1417
1418 nolapic_timer [X86-32,APIC] Do not use the local APIC timer. 1418 nolapic_timer [X86-32,APIC] Do not use the local APIC timer.
1419 1419
1420 noltlbs [PPC] Do not use large page/tlb entries for kernel 1420 noltlbs [PPC] Do not use large page/tlb entries for kernel
1421 lowmem mapping on PPC40x. 1421 lowmem mapping on PPC40x.
1422 1422
1423 nomca [IA-64] Disable machine check abort handling 1423 nomca [IA-64] Disable machine check abort handling
1424 1424
1425 nomce [X86-32] Machine Check Exception 1425 nomce [X86-32] Machine Check Exception
1426 1426
1427 nomfgpt [X86-32] Disable Multi-Function General Purpose 1427 nomfgpt [X86-32] Disable Multi-Function General Purpose
1428 Timer usage (for AMD Geode machines). 1428 Timer usage (for AMD Geode machines).
1429 1429
1430 noreplace-paravirt [X86-32,PV_OPS] Don't patch paravirt_ops 1430 noreplace-paravirt [X86-32,PV_OPS] Don't patch paravirt_ops
1431 1431
1432 noreplace-smp [X86-32,SMP] Don't replace SMP instructions 1432 noreplace-smp [X86-32,SMP] Don't replace SMP instructions
1433 with UP alternatives 1433 with UP alternatives
1434 1434
1435 noresidual [PPC] Don't use residual data on PReP machines. 1435 noresidual [PPC] Don't use residual data on PReP machines.
1436 1436
1437 noresume [SWSUSP] Disables resume and restores original swap 1437 noresume [SWSUSP] Disables resume and restores original swap
1438 space. 1438 space.
1439 1439
1440 no-scroll [VGA] Disables scrollback. 1440 no-scroll [VGA] Disables scrollback.
1441 This is required for the Braillex ib80-piezo Braille 1441 This is required for the Braillex ib80-piezo Braille
1442 reader made by F.H. Papenmeier (Germany). 1442 reader made by F.H. Papenmeier (Germany).
1443 1443
1444 nosbagart [IA-64] 1444 nosbagart [IA-64]
1445 1445
1446 nosep [BUGS=X86-32] Disables x86 SYSENTER/SYSEXIT support. 1446 nosep [BUGS=X86-32] Disables x86 SYSENTER/SYSEXIT support.
1447 1447
1448 nosmp [SMP] Tells an SMP kernel to act as a UP kernel, 1448 nosmp [SMP] Tells an SMP kernel to act as a UP kernel,
1449 and disable the IO APIC. Legacy alias for "maxcpus=0". 1449 and disable the IO APIC. Legacy alias for "maxcpus=0".
1450 1450
1451 nosoftlockup [KNL] Disable the soft-lockup detector. 1451 nosoftlockup [KNL] Disable the soft-lockup detector.
1452 1452
1453 nosync [HW,M68K] Disables sync negotiation for all devices. 1453 nosync [HW,M68K] Disables sync negotiation for all devices.
1454 1454
1455 notsc [BUGS=X86-32] Disable Time Stamp Counter 1455 notsc [BUGS=X86-32] Disable Time Stamp Counter
1456 1456
1457 nousb [USB] Disable the USB subsystem 1457 nousb [USB] Disable the USB subsystem
1458 1458
1459 nowb [ARM] 1459 nowb [ARM]
1460 1460
1461 nptcg= [IA64] Override max number of concurrent global TLB 1461 nptcg= [IA64] Override max number of concurrent global TLB
1462 purges which is reported from either PAL_VM_SUMMARY or 1462 purges which is reported from either PAL_VM_SUMMARY or
1463 SAL PALO. 1463 SAL PALO.
1464 1464
1465 numa_zonelist_order= [KNL, BOOT] Select zonelist order for NUMA. 1465 numa_zonelist_order= [KNL, BOOT] Select zonelist order for NUMA.
1466 one of ['zone', 'node', 'default'] can be specified 1466 one of ['zone', 'node', 'default'] can be specified
1467 This can be set from sysctl after boot. 1467 This can be set from sysctl after boot.
1468 See Documentation/sysctl/vm.txt for details. 1468 See Documentation/sysctl/vm.txt for details.
1469 1469
1470 nr_uarts= [SERIAL] maximum number of UARTs to be registered. 1470 nr_uarts= [SERIAL] maximum number of UARTs to be registered.
1471 1471
1472 olpc_ec_timeout= [OLPC] ms delay when issuing EC commands 1472 olpc_ec_timeout= [OLPC] ms delay when issuing EC commands
1473 Rather than timing out after 20 ms if an EC 1473 Rather than timing out after 20 ms if an EC
1474 command is not properly ACKed, override the length 1474 command is not properly ACKed, override the length
1475 of the timeout. We have interrupts disabled while 1475 of the timeout. We have interrupts disabled while
1476 waiting for the ACK, so if this is set too high 1476 waiting for the ACK, so if this is set too high
1477 interrupts *may* be lost! 1477 interrupts *may* be lost!
1478 1478
1479 opl3= [HW,OSS] 1479 opl3= [HW,OSS]
1480 Format: <io> 1480 Format: <io>
1481 1481
1482 oprofile.timer= [HW] 1482 oprofile.timer= [HW]
1483 Use timer interrupt instead of performance counters 1483 Use timer interrupt instead of performance counters
1484 1484
1485 osst= [HW,SCSI] SCSI Tape Driver 1485 osst= [HW,SCSI] SCSI Tape Driver
1486 Format: <buffer_size>,<write_threshold> 1486 Format: <buffer_size>,<write_threshold>
1487 See also Documentation/scsi/st.txt. 1487 See also Documentation/scsi/st.txt.
1488 1488
1489 panic= [KNL] Kernel behaviour on panic 1489 panic= [KNL] Kernel behaviour on panic
1490 Format: <timeout> 1490 Format: <timeout>
1491 1491
1492 parkbd.port= [HW] Parallel port number the keyboard adapter is 1492 parkbd.port= [HW] Parallel port number the keyboard adapter is
1493 connected to, default is 0. 1493 connected to, default is 0.
1494 Format: <parport#> 1494 Format: <parport#>
1495 parkbd.mode= [HW] Parallel port keyboard adapter mode of operation, 1495 parkbd.mode= [HW] Parallel port keyboard adapter mode of operation,
1496 0 for XT, 1 for AT (default is AT). 1496 0 for XT, 1 for AT (default is AT).
1497 Format: <mode> 1497 Format: <mode>
1498 1498
1499 parport= [HW,PPT] Specify parallel ports. 0 disables. 1499 parport= [HW,PPT] Specify parallel ports. 0 disables.
1500 Format: { 0 | auto | 0xBBB[,IRQ[,DMA]] } 1500 Format: { 0 | auto | 0xBBB[,IRQ[,DMA]] }
1501 Use 'auto' to force the driver to use any 1501 Use 'auto' to force the driver to use any
1502 IRQ/DMA settings detected (the default is to 1502 IRQ/DMA settings detected (the default is to
1503 ignore detected IRQ/DMA settings because of 1503 ignore detected IRQ/DMA settings because of
1504 possible conflicts). You can specify the base 1504 possible conflicts). You can specify the base
1505 address, IRQ, and DMA settings; IRQ and DMA 1505 address, IRQ, and DMA settings; IRQ and DMA
1506 should be numbers, or 'auto' (for using detected 1506 should be numbers, or 'auto' (for using detected
1507 settings on that particular port), or 'nofifo' 1507 settings on that particular port), or 'nofifo'
1508 (to avoid using a FIFO even if it is detected). 1508 (to avoid using a FIFO even if it is detected).
1509 Parallel ports are assigned in the order they 1509 Parallel ports are assigned in the order they
1510 are specified on the command line, starting 1510 are specified on the command line, starting
1511 with parport0. 1511 with parport0.
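		Illustrative example (typical legacy port addresses and
		IRQs; adjust to the actual hardware):
			parport=0x3bc parport=0x378,7 parport=0x278,auto,auto
		This would register parport0 at 0x3bc, parport1 at 0x378
		with IRQ 7, and parport2 at 0x278 with auto-detected IRQ
		and DMA settings.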
1512 1512
1513 parport_init_mode= [HW,PPT] 1513 parport_init_mode= [HW,PPT]
1514 Configure VIA parallel port to operate in 1514 Configure VIA parallel port to operate in
1515 a specific mode. This is necessary on the Pegasos 1515 a specific mode. This is necessary on the Pegasos
1516 computer, where the firmware has no option for setting 1516 computer, where the firmware has no option for setting
1517 up the parallel port mode and sets it to spp. 1517 up the parallel port mode and sets it to spp.
1518 Currently this function knows 686a and 8231 chips. 1518 Currently this function knows 686a and 8231 chips.
1519 Format: [spp|ps2|epp|ecp|ecpepp] 1519 Format: [spp|ps2|epp|ecp|ecpepp]
1520 1520
1521 pas2= [HW,OSS] Format: 1521 pas2= [HW,OSS] Format:
1522 <io>,<irq>,<dma>,<dma16>,<sb_io>,<sb_irq>,<sb_dma>,<sb_dma16> 1522 <io>,<irq>,<dma>,<dma16>,<sb_io>,<sb_irq>,<sb_dma>,<sb_dma16>
1523 1523
1524 pas16= [HW,SCSI] 1524 pas16= [HW,SCSI]
1525 See header of drivers/scsi/pas16.c. 1525 See header of drivers/scsi/pas16.c.
1526 1526
1527 pause_on_oops= 1527 pause_on_oops=
1528 Halt all CPUs after the first oops has been printed for 1528 Halt all CPUs after the first oops has been printed for
1529 the specified number of seconds. This is to be used if 1529 the specified number of seconds. This is to be used if
1530 your oopses keep scrolling off the screen. 1530 your oopses keep scrolling off the screen.
1531 1531
1532 pcbit= [HW,ISDN] 1532 pcbit= [HW,ISDN]
1533 1533
1534 pcd. [PARIDE] 1534 pcd. [PARIDE]
1535 See header of drivers/block/paride/pcd.c. 1535 See header of drivers/block/paride/pcd.c.
1536 See also Documentation/paride.txt. 1536 See also Documentation/paride.txt.
1537 1537
1538 pci=option[,option...] [PCI] various PCI subsystem options: 1538 pci=option[,option...] [PCI] various PCI subsystem options:
1539 off [X86-32] don't probe for the PCI bus 1539 off [X86-32] don't probe for the PCI bus
1540 bios [X86-32] force use of PCI BIOS, don't access 1540 bios [X86-32] force use of PCI BIOS, don't access
1541 the hardware directly. Use this if your machine 1541 the hardware directly. Use this if your machine
1542 has a non-standard PCI host bridge. 1542 has a non-standard PCI host bridge.
1543 nobios [X86-32] disallow use of PCI BIOS, only direct 1543 nobios [X86-32] disallow use of PCI BIOS, only direct
1544 hardware access methods are allowed. Use this 1544 hardware access methods are allowed. Use this
1545 if you experience crashes upon bootup and you 1545 if you experience crashes upon bootup and you
1546 suspect they are caused by the BIOS. 1546 suspect they are caused by the BIOS.
1547 conf1 [X86-32] Force use of PCI Configuration 1547 conf1 [X86-32] Force use of PCI Configuration
1548 Mechanism 1. 1548 Mechanism 1.
1549 conf2 [X86-32] Force use of PCI Configuration 1549 conf2 [X86-32] Force use of PCI Configuration
1550 Mechanism 2. 1550 Mechanism 2.
1551 noaer [PCIE] If the PCIEAER kernel config parameter is 1551 noaer [PCIE] If the PCIEAER kernel config parameter is
1552 enabled, this kernel boot option can be used to 1552 enabled, this kernel boot option can be used to
1553 disable the use of PCIE advanced error reporting. 1553 disable the use of PCIE advanced error reporting.
1554 nodomains [PCI] Disable support for multiple PCI 1554 nodomains [PCI] Disable support for multiple PCI
1555 root domains (aka PCI segments, in ACPI-speak). 1555 root domains (aka PCI segments, in ACPI-speak).
1556 nommconf [X86-32,X86_64] Disable use of MMCONFIG for PCI 1556 nommconf [X86-32,X86_64] Disable use of MMCONFIG for PCI
1557 Configuration 1557 Configuration
1558 nomsi [MSI] If the PCI_MSI kernel config parameter is 1558 nomsi [MSI] If the PCI_MSI kernel config parameter is
1559 enabled, this kernel boot option can be used to 1559 enabled, this kernel boot option can be used to
1560 disable the use of MSI interrupts system-wide. 1560 disable the use of MSI interrupts system-wide.
1561 biosirq [X86-32] Use PCI BIOS calls to get the interrupt 1561 biosirq [X86-32] Use PCI BIOS calls to get the interrupt
1562 routing table. These calls are known to be buggy 1562 routing table. These calls are known to be buggy
1563 on several machines and they hang the machine 1563 on several machines and they hang the machine
1564 when used, but on other computers it's the only 1564 when used, but on other computers it's the only
1565 way to get the interrupt routing table. Try 1565 way to get the interrupt routing table. Try
1566 this option if the kernel is unable to allocate 1566 this option if the kernel is unable to allocate
1567 IRQs or discover secondary PCI buses on your 1567 IRQs or discover secondary PCI buses on your
1568 motherboard. 1568 motherboard.
1569 rom [X86-32] Assign address space to expansion ROMs. 1569 rom [X86-32] Assign address space to expansion ROMs.
1570 Use with caution as certain devices share 1570 Use with caution as certain devices share
1571 address decoders between ROMs and other 1571 address decoders between ROMs and other
1572 resources. 1572 resources.
1573 norom [X86-32,X86_64] Do not assign address space to 1573 norom [X86-32,X86_64] Do not assign address space to
1574 expansion ROMs that do not already have 1574 expansion ROMs that do not already have
1575 BIOS assigned address ranges. 1575 BIOS assigned address ranges.
1576 irqmask=0xMMMM [X86-32] Set a bit mask of IRQs allowed to be 1576 irqmask=0xMMMM [X86-32] Set a bit mask of IRQs allowed to be
1577 assigned automatically to PCI devices. You can 1577 assigned automatically to PCI devices. You can
1578 make the kernel exclude IRQs of your ISA cards 1578 make the kernel exclude IRQs of your ISA cards
1579 this way. 1579 this way.
1580 pirqaddr=0xAAAAA [X86-32] Specify the physical address 1580 pirqaddr=0xAAAAA [X86-32] Specify the physical address
1581 of the PIRQ table (normally generated 1581 of the PIRQ table (normally generated
1582 by the BIOS) if it is outside the 1582 by the BIOS) if it is outside the
1583 F0000h-100000h range. 1583 F0000h-100000h range.
1584 lastbus=N [X86-32] Scan all buses thru bus #N. Can be 1584 lastbus=N [X86-32] Scan all buses thru bus #N. Can be
1585 useful if the kernel is unable to find your 1585 useful if the kernel is unable to find your
1586 secondary buses and you want to tell it 1586 secondary buses and you want to tell it
1587 explicitly which ones they are. 1587 explicitly which ones they are.
1588 assign-busses [X86-32] Always assign all PCI bus 1588 assign-busses [X86-32] Always assign all PCI bus
1589 numbers ourselves, overriding 1589 numbers ourselves, overriding
1590 whatever the firmware may have done. 1590 whatever the firmware may have done.
1591 usepirqmask [X86-32] Honor the possible IRQ mask stored 1591 usepirqmask [X86-32] Honor the possible IRQ mask stored
1592 in the BIOS $PIR table. This is needed on 1592 in the BIOS $PIR table. This is needed on
1593 some systems with broken BIOSes, notably 1593 some systems with broken BIOSes, notably
1594 some HP Pavilion N5400 and Omnibook XE3 1594 some HP Pavilion N5400 and Omnibook XE3
1595 notebooks. This will have no effect if ACPI 1595 notebooks. This will have no effect if ACPI
1596 IRQ routing is enabled. 1596 IRQ routing is enabled.
1597 noacpi [X86-32] Do not use ACPI for IRQ routing 1597 noacpi [X86-32] Do not use ACPI for IRQ routing
1598 or for PCI scanning. 1598 or for PCI scanning.
1599 use_crs [X86-32] Use _CRS for PCI resource 1599 use_crs [X86-32] Use _CRS for PCI resource
1600 allocation. 1600 allocation.
1601 routeirq Do IRQ routing for all PCI devices. 1601 routeirq Do IRQ routing for all PCI devices.
1602 This is normally done in pci_enable_device(), 1602 This is normally done in pci_enable_device(),
1603 so this option is a temporary workaround 1603 so this option is a temporary workaround
1604 for broken drivers that don't call it. 1604 for broken drivers that don't call it.
1605 skip_isa_align [X86] do not align io start addr, so can 1605 skip_isa_align [X86] do not align io start addr, so can
1606 handle more pci cards 1606 handle more pci cards
1607 firmware [ARM] Do not re-enumerate the bus but instead 1607 firmware [ARM] Do not re-enumerate the bus but instead
1608 just use the configuration from the 1608 just use the configuration from the
1609 bootloader. This is currently used on 1609 bootloader. This is currently used on
1610 IXP2000 systems where the bus has to be 1610 IXP2000 systems where the bus has to be
1611 configured a certain way for adjunct CPUs. 1611 configured a certain way for adjunct CPUs.
1612 noearly [X86] Don't do any early type 1 scanning. 1612 noearly [X86] Don't do any early type 1 scanning.
1613 This might help on some broken boards which 1613 This might help on some broken boards which
1614 machine check when some devices' config space 1614 machine check when some devices' config space
1615 is read. But various workarounds are disabled 1615 is read. But various workarounds are disabled
1616 and some IOMMU drivers will not work. 1616 and some IOMMU drivers will not work.
1617 bfsort Sort PCI devices into breadth-first order. 1617 bfsort Sort PCI devices into breadth-first order.
1618 This sorting is done to get a device 1618 This sorting is done to get a device
1619 order compatible with older (<= 2.4) kernels. 1619 order compatible with older (<= 2.4) kernels.
1620 nobfsort Don't sort PCI devices into breadth-first order. 1620 nobfsort Don't sort PCI devices into breadth-first order.
1621 cbiosize=nn[KMG] The fixed amount of bus space which is 1621 cbiosize=nn[KMG] The fixed amount of bus space which is
1622 reserved for the CardBus bridge's IO window. 1622 reserved for the CardBus bridge's IO window.
1623 The default value is 256 bytes. 1623 The default value is 256 bytes.
1624 cbmemsize=nn[KMG] The fixed amount of bus space which is 1624 cbmemsize=nn[KMG] The fixed amount of bus space which is
1625 reserved for the CardBus bridge's memory 1625 reserved for the CardBus bridge's memory
1626 window. The default value is 64 megabytes. 1626 window. The default value is 64 megabytes.
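As a purely illustrative combination of the options listed above (the grouping is a made-up example, not a recommendation), several options can be joined with commas, e.g. pci=nomsi,nommconf,routeirq, which disables MSI interrupts, avoids MMCONFIG accesses and routes IRQs for every PCI device at boot.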
1627 1627
1628 pcmv= [HW,PCMCIA] BadgePAD 4 1628 pcmv= [HW,PCMCIA] BadgePAD 4
1629 1629
1630 pd. [PARIDE] 1630 pd. [PARIDE]
1631 See Documentation/paride.txt. 1631 See Documentation/paride.txt.
1632 1632
1633 pdcchassis= [PARISC,HW] Disable/Enable PDC Chassis Status codes at 1633 pdcchassis= [PARISC,HW] Disable/Enable PDC Chassis Status codes at
1634 boot time. 1634 boot time.
1635 Format: { 0 | 1 } 1635 Format: { 0 | 1 }
1636 See arch/parisc/kernel/pdc_chassis.c 1636 See arch/parisc/kernel/pdc_chassis.c
1637 1637
1638 pf. [PARIDE] 1638 pf. [PARIDE]
1639 See Documentation/paride.txt. 1639 See Documentation/paride.txt.
1640 1640
1641 pg. [PARIDE] 1641 pg. [PARIDE]
1642 See Documentation/paride.txt. 1642 See Documentation/paride.txt.
1643 1643
1644 pirq= [SMP,APIC] Manual mp-table setup 1644 pirq= [SMP,APIC] Manual mp-table setup
1645 See Documentation/i386/IO-APIC.txt. 1645 See Documentation/i386/IO-APIC.txt.
1646 1646
1647 plip= [PPT,NET] Parallel port network link 1647 plip= [PPT,NET] Parallel port network link
1648 Format: { parport<nr> | timid | 0 } 1648 Format: { parport<nr> | timid | 0 }
1649 See also Documentation/parport.txt. 1649 See also Documentation/parport.txt.
1650 1650
1651 pmtmr= [X86] Manual setup of pmtmr I/O Port. 1651 pmtmr= [X86] Manual setup of pmtmr I/O Port.
1652 Override pmtimer IOPort with a hex value. 1652 Override pmtimer IOPort with a hex value.
1653 e.g. pmtmr=0x508 1653 e.g. pmtmr=0x508
1654 1654
1655 pnpacpi= [ACPI] 1655 pnpacpi= [ACPI]
1656 { off } 1656 { off }
1657 1657
1658 pnpbios= [ISAPNP] 1658 pnpbios= [ISAPNP]
1659 { on | off | curr | res | no-curr | no-res } 1659 { on | off | curr | res | no-curr | no-res }
1660 1660
1661 pnp_reserve_irq= 1661 pnp_reserve_irq=
1662 [ISAPNP] Exclude IRQs for the autoconfiguration 1662 [ISAPNP] Exclude IRQs for the autoconfiguration
1663 1663
1664 pnp_reserve_dma= 1664 pnp_reserve_dma=
1665 [ISAPNP] Exclude DMAs for the autoconfiguration 1665 [ISAPNP] Exclude DMAs for the autoconfiguration
1666 1666
1667 pnp_reserve_io= [ISAPNP] Exclude I/O ports for the autoconfiguration 1667 pnp_reserve_io= [ISAPNP] Exclude I/O ports for the autoconfiguration
1668 Ranges are in pairs (I/O port base and size). 1668 Ranges are in pairs (I/O port base and size).
1669 1669
1670 pnp_reserve_mem= 1670 pnp_reserve_mem=
1671 [ISAPNP] Exclude memory regions for the 1671 [ISAPNP] Exclude memory regions for the
1672 autoconfiguration. 1672 autoconfiguration.
1673 Ranges are in pairs (memory base and size). 1673 Ranges are in pairs (memory base and size).
1674 1674
1675 print-fatal-signals= 1675 print-fatal-signals=
1676 [KNL] debug: print fatal signals 1676 [KNL] debug: print fatal signals
1677 print-fatal-signals=1: print segfault info to 1677 print-fatal-signals=1: print segfault info to
1678 the kernel console. 1678 the kernel console.
1679 default: off. 1679 default: off.
1680 1680
1681 printk.time= Show timing data prefixed to each printk message line 1681 printk.time= Show timing data prefixed to each printk message line
1682 Format: <bool> (1/Y/y=enable, 0/N/n=disable) 1682 Format: <bool> (1/Y/y=enable, 0/N/n=disable)
1683 1683
1684 profile= [KNL] Enable kernel profiling via /proc/profile 1684 profile= [KNL] Enable kernel profiling via /proc/profile
1685 Format: [schedule,]<number> 1685 Format: [schedule,]<number>
1686 Param: "schedule" - profile schedule points. 1686 Param: "schedule" - profile schedule points.
1687 Param: <number> - step/bucket size as a power of 2 for 1687 Param: <number> - step/bucket size as a power of 2 for
1688 statistical time based profiling. 1688 statistical time based profiling.
1689 Param: "sleep" - profile D-state sleeping (millisecs). 1689 Param: "sleep" - profile D-state sleeping (millisecs).
1690 Requires CONFIG_SCHEDSTATS 1690 Requires CONFIG_SCHEDSTATS
1691 Param: "kvm" - profile VM exits. 1691 Param: "kvm" - profile VM exits.
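Reading the format above literally, a hypothetical profile=2 enables statistical time-based profiling with a step/bucket size of 2^2 = 4, and profile=schedule,2 does the same for schedule points; the numbers are examples only, and the results are read back through /proc/profile.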
1692 1692
1693 processor.max_cstate= [HW,ACPI] 1693 processor.max_cstate= [HW,ACPI]
1694 Limit processor to maximum C-state 1694 Limit processor to maximum C-state
1695 max_cstate=9 overrides any DMI blacklist limit. 1695 max_cstate=9 overrides any DMI blacklist limit.
1696 1696
1697 processor.nocst [HW,ACPI] 1697 processor.nocst [HW,ACPI]
1698 Ignore the _CST method to determine C-states, 1698 Ignore the _CST method to determine C-states,
1699 instead using the legacy FADT method 1699 instead using the legacy FADT method
1700 1700
1701 prompt_ramdisk= [RAM] List of RAM disks to prompt for floppy disk 1701 prompt_ramdisk= [RAM] List of RAM disks to prompt for floppy disk
1702 before loading. 1702 before loading.
1703 See Documentation/ramdisk.txt. 1703 See Documentation/ramdisk.txt.
1704 1704
1705 psmouse.proto= [HW,MOUSE] Highest PS2 mouse protocol extension to 1705 psmouse.proto= [HW,MOUSE] Highest PS2 mouse protocol extension to
1706 probe for; one of (bare|imps|exps|lifebook|any). 1706 probe for; one of (bare|imps|exps|lifebook|any).
1707 psmouse.rate= [HW,MOUSE] Set desired mouse report rate, in reports 1707 psmouse.rate= [HW,MOUSE] Set desired mouse report rate, in reports
1708 per second. 1708 per second.
1709 psmouse.resetafter= [HW,MOUSE] 1709 psmouse.resetafter= [HW,MOUSE]
1710 Try to reset the device after so many bad packets 1710 Try to reset the device after so many bad packets
1711 (0 = never). 1711 (0 = never).
1712 psmouse.resolution= 1712 psmouse.resolution=
1713 [HW,MOUSE] Set desired mouse resolution, in dpi. 1713 [HW,MOUSE] Set desired mouse resolution, in dpi.
1714 psmouse.smartscroll= 1714 psmouse.smartscroll=
1715 [HW,MOUSE] Controls Logitech smartscroll autorepeat. 1715 [HW,MOUSE] Controls Logitech smartscroll autorepeat.
1716 0 = disabled, 1 = enabled (default). 1716 0 = disabled, 1 = enabled (default).
1717 1717
1718 pss= [HW,OSS] Personal Sound System (ECHO ESC614) 1718 pss= [HW,OSS] Personal Sound System (ECHO ESC614)
1719 Format: 1719 Format:
1720 <io>,<mss_io>,<mss_irq>,<mss_dma>,<mpu_io>,<mpu_irq> 1720 <io>,<mss_io>,<mss_irq>,<mss_dma>,<mpu_io>,<mpu_irq>
1721 1721
1722 pt. [PARIDE] 1722 pt. [PARIDE]
1723 See Documentation/paride.txt. 1723 See Documentation/paride.txt.
1724 1724
1725 pty.legacy_count= 1725 pty.legacy_count=
1726 [KNL] Number of legacy pty's. Overwrites compiled-in 1726 [KNL] Number of legacy pty's. Overwrites compiled-in
1727 default number. 1727 default number.
1728 1728
1729 quiet [KNL] Disable most log messages 1729 quiet [KNL] Disable most log messages
1730 1730
1731 r128= [HW,DRM] 1731 r128= [HW,DRM]
1732 1732
1733 raid= [HW,RAID] 1733 raid= [HW,RAID]
1734 See Documentation/md.txt. 1734 See Documentation/md.txt.
1735 1735
1736 ramdisk_blocksize= [RAM] 1736 ramdisk_blocksize= [RAM]
1737 See Documentation/ramdisk.txt. 1737 See Documentation/ramdisk.txt.
1738 1738
1739 ramdisk_size= [RAM] Sizes of RAM disks in kilobytes 1739 ramdisk_size= [RAM] Sizes of RAM disks in kilobytes
1740 See Documentation/ramdisk.txt. 1740 See Documentation/ramdisk.txt.
1741 1741
1742 rcupdate.blimit= [KNL,BOOT] 1742 rcupdate.blimit= [KNL,BOOT]
1743 Set maximum number of finished RCU callbacks to process 1743 Set maximum number of finished RCU callbacks to process
1744 in one batch. 1744 in one batch.
1745 1745
1746 rcupdate.qhimark= [KNL,BOOT] 1746 rcupdate.qhimark= [KNL,BOOT]
1747 Set threshold of queued 1747 Set threshold of queued
1748 RCU callbacks over which batch limiting is disabled. 1748 RCU callbacks over which batch limiting is disabled.
1749 1749
1750 rcupdate.qlowmark= [KNL,BOOT] 1750 rcupdate.qlowmark= [KNL,BOOT]
1751 Set threshold of queued RCU callbacks below which 1751 Set threshold of queued RCU callbacks below which
1752 batch limiting is re-enabled. 1752 batch limiting is re-enabled.
1753 1753
1754 rdinit= [KNL] 1754 rdinit= [KNL]
1755 Format: <full_path> 1755 Format: <full_path>
1756 Run specified binary instead of /init from the ramdisk, 1756 Run specified binary instead of /init from the ramdisk,
1757 used for early userspace startup. See initrd. 1757 used for early userspace startup. See initrd.
1758 1758
1759 reboot= [BUGS=X86-32,BUGS=ARM,BUGS=IA-64] Rebooting mode 1759 reboot= [BUGS=X86-32,BUGS=ARM,BUGS=IA-64] Rebooting mode
1760 Format: <reboot_mode>[,<reboot_mode2>[,...]] 1760 Format: <reboot_mode>[,<reboot_mode2>[,...]]
1761 See arch/*/kernel/reboot.c or arch/*/kernel/process.c 1761 See arch/*/kernel/reboot.c or arch/*/kernel/process.c
1762 1762
1763 relax_domain_level= 1763 relax_domain_level=
1764 [KNL, SMP] Set scheduler's default relax_domain_level. 1764 [KNL, SMP] Set scheduler's default relax_domain_level.
1765 See Documentation/cpusets.txt. 1765 See Documentation/cpusets.txt.
1766 1766
1767 reserve= [KNL,BUGS] Force the kernel to ignore some iomem area 1767 reserve= [KNL,BUGS] Force the kernel to ignore some iomem area
1768 1768
1769 reservetop= [X86-32] 1769 reservetop= [X86-32]
1770 Format: nn[KMG] 1770 Format: nn[KMG]
1771 Reserves a hole at the top of the kernel virtual 1771 Reserves a hole at the top of the kernel virtual
1772 address space. 1772 address space.
1773 1773
1774 reset_devices [KNL] Force drivers to reset the underlying device 1774 reset_devices [KNL] Force drivers to reset the underlying device
1775 during initialization. 1775 during initialization.
1776 1776
1777 resume= [SWSUSP] 1777 resume= [SWSUSP]
1778 Specify the partition device for software suspend 1778 Specify the partition device for software suspend
1779 1779
1780 resume_offset= [SWSUSP] 1780 resume_offset= [SWSUSP]
1781 Specify the offset from the beginning of the partition 1781 Specify the offset from the beginning of the partition
1782 given by "resume=" at which the swap header is located, 1782 given by "resume=" at which the swap header is located,
1783 in <PAGE_SIZE> units (needed only for swap files). 1783 in <PAGE_SIZE> units (needed only for swap files).
1784 See Documentation/power/swsusp-and-swap-files.txt 1784 See Documentation/power/swsusp-and-swap-files.txt
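A sketch of the combined usage (device name and offset are invented for illustration; the real offset has to be taken from the swap file's on-disk location as described in the document above): resume=/dev/sda1 resume_offset=38912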
1785 1785
1786 retain_initrd [RAM] Keep initrd memory after extraction 1786 retain_initrd [RAM] Keep initrd memory after extraction
1787 1787
1788 rhash_entries= [KNL,NET] 1788 rhash_entries= [KNL,NET]
1789 Set number of hash buckets for route cache 1789 Set number of hash buckets for route cache
1790 1790
1791 riscom8= [HW,SERIAL] 1791 riscom8= [HW,SERIAL]
1792 Format: <io_board1>[,<io_board2>[,...<io_boardN>]] 1792 Format: <io_board1>[,<io_board2>[,...<io_boardN>]]
1793 1793
1794 ro [KNL] Mount root device read-only on boot 1794 ro [KNL] Mount root device read-only on boot
1795 1795
1796 root= [KNL] Root filesystem 1796 root= [KNL] Root filesystem
1797 1797
1798 rootdelay= [KNL] Delay (in seconds) to pause before attempting to 1798 rootdelay= [KNL] Delay (in seconds) to pause before attempting to
1799 mount the root filesystem 1799 mount the root filesystem
1800 1800
1801 rootflags= [KNL] Set root filesystem mount option string 1801 rootflags= [KNL] Set root filesystem mount option string
1802 1802
1803 rootfstype= [KNL] Set root filesystem type 1803 rootfstype= [KNL] Set root filesystem type
1804 1804
1805 rootwait [KNL] Wait (indefinitely) for root device to show up. 1805 rootwait [KNL] Wait (indefinitely) for root device to show up.
1806 Useful for devices that are detected asynchronously 1806 Useful for devices that are detected asynchronously
1807 (e.g. USB and MMC devices). 1807 (e.g. USB and MMC devices).
1808 1808
1809 root_plug.vendor_id= 1809 root_plug.vendor_id=
1810 [ROOTPLUG] Override the default vendor ID 1810 [ROOTPLUG] Override the default vendor ID
1811 1811
1812 root_plug.product_id= 1812 root_plug.product_id=
1813 [ROOTPLUG] Override the default product ID 1813 [ROOTPLUG] Override the default product ID
1814 1814
1815 root_plug.debug= 1815 root_plug.debug=
1816 [ROOTPLUG] Enable debugging output 1816 [ROOTPLUG] Enable debugging output
1817 1817
1818 rw [KNL] Mount root device read-write on boot 1818 rw [KNL] Mount root device read-write on boot
1819 1819
1820 S [KNL] Run init in single mode 1820 S [KNL] Run init in single mode
1821 1821
1822 sa1100ir [NET] 1822 sa1100ir [NET]
1823 See drivers/net/irda/sa1100_ir.c. 1823 See drivers/net/irda/sa1100_ir.c.
1824 1824
1825 sbni= [NET] Granch SBNI12 leased line adapter 1825 sbni= [NET] Granch SBNI12 leased line adapter
1826 1826
1827 sc1200wdt= [HW,WDT] SC1200 WDT (watchdog) driver 1827 sc1200wdt= [HW,WDT] SC1200 WDT (watchdog) driver
1828 Format: <io>[,<timeout>[,<isapnp>]] 1828 Format: <io>[,<timeout>[,<isapnp>]]
1829 1829
1830 scsi_debug_*= [SCSI] 1830 scsi_debug_*= [SCSI]
1831 See drivers/scsi/scsi_debug.c. 1831 See drivers/scsi/scsi_debug.c.
1832 1832
1833 scsi_default_dev_flags= 1833 scsi_default_dev_flags=
1834 [SCSI] SCSI default device flags 1834 [SCSI] SCSI default device flags
1835 Format: <integer> 1835 Format: <integer>
1836 1836
1837 scsi_dev_flags= [SCSI] Black/white list entry for vendor and model 1837 scsi_dev_flags= [SCSI] Black/white list entry for vendor and model
1838 Format: <vendor>:<model>:<flags> 1838 Format: <vendor>:<model>:<flags>
1839 (flags are integer value) 1839 (flags are integer value)
1840 1840
1841 scsi_logging_level= [SCSI] a bit mask of logging levels 1841 scsi_logging_level= [SCSI] a bit mask of logging levels
1842 See drivers/scsi/scsi_logging.h for bits. Also 1842 See drivers/scsi/scsi_logging.h for bits. Also
1843 settable via sysctl at dev.scsi.logging_level 1843 settable via sysctl at dev.scsi.logging_level
1844 (/proc/sys/dev/scsi/logging_level). 1844 (/proc/sys/dev/scsi/logging_level).
1845 There is also a nice 'scsi_logging_level' script in the 1845 There is also a nice 'scsi_logging_level' script in the
1846 S390-tools package, available for download at 1846 S390-tools package, available for download at
1847 http://www-128.ibm.com/developerworks/linux/linux390/s390-tools-1.5.4.html 1847 http://www-128.ibm.com/developerworks/linux/linux390/s390-tools-1.5.4.html
1848 1848
1849 scsi_mod.scan= [SCSI] sync (default) scans SCSI busses as they are 1849 scsi_mod.scan= [SCSI] sync (default) scans SCSI busses as they are
1850 discovered. async scans them in kernel threads, 1850 discovered. async scans them in kernel threads,
1851 allowing boot to proceed. none ignores them, expecting 1851 allowing boot to proceed. none ignores them, expecting
1852 user space to do the scan. 1852 user space to do the scan.
1853 1853
1854 selinux [SELINUX] Disable or enable SELinux at boot time. 1854 selinux [SELINUX] Disable or enable SELinux at boot time.
1855 Format: { "0" | "1" } 1855 Format: { "0" | "1" }
1856 See security/selinux/Kconfig help text. 1856 See security/selinux/Kconfig help text.
1857 0 -- disable. 1857 0 -- disable.
1858 1 -- enable. 1858 1 -- enable.
1859 Default value is set via kernel config option. 1859 Default value is set via kernel config option.
1860 If enabled at boot time, /selinux/disable can be used 1860 If enabled at boot time, /selinux/disable can be used
1861 later to disable prior to initial policy load. 1861 later to disable prior to initial policy load.
1862 1862
1863 selinux_compat_net = 1863 selinux_compat_net =
1864 [SELINUX] Set initial selinux_compat_net flag value. 1864 [SELINUX] Set initial selinux_compat_net flag value.
1865 Format: { "0" | "1" } 1865 Format: { "0" | "1" }
1866 0 -- use new secmark-based packet controls 1866 0 -- use new secmark-based packet controls
1867 1 -- use legacy packet controls 1867 1 -- use legacy packet controls
1868 Default value is 0 (preferred). 1868 Default value is 0 (preferred).
1869 Value can be changed at runtime via 1869 Value can be changed at runtime via
1870 /selinux/compat_net. 1870 /selinux/compat_net.
1871 1871
1872 serialnumber [BUGS=X86-32] 1872 serialnumber [BUGS=X86-32]
1873 1873
1874 shapers= [NET] 1874 shapers= [NET]
1875 Maximal number of shapers. 1875 Maximal number of shapers.
1876 1876
1877 sim710= [SCSI,HW] 1877 sim710= [SCSI,HW]
1878 See header of drivers/scsi/sim710.c. 1878 See header of drivers/scsi/sim710.c.
1879 1879
1880 simeth= [IA-64] 1880 simeth= [IA-64]
1881 simscsi= 1881 simscsi=
1882 1882
1883 slram= [HW,MTD] 1883 slram= [HW,MTD]
1884 1884
1885 slub_debug[=options[,slabs]] [MM, SLUB] 1885 slub_debug[=options[,slabs]] [MM, SLUB]
1886 Enabling slub_debug allows one to determine the 1886 Enabling slub_debug allows one to determine the
1887 culprit if slab objects become corrupted. Enabling 1887 culprit if slab objects become corrupted. Enabling
1888 slub_debug can create guard zones around objects and 1888 slub_debug can create guard zones around objects and
1889 may poison objects when not in use. Also tracks the 1889 may poison objects when not in use. Also tracks the
1890 last alloc / free. For more information see 1890 last alloc / free. For more information see
1891 Documentation/vm/slub.txt. 1891 Documentation/vm/slub.txt.
1892 1892
1893 slub_max_order= [MM, SLUB] 1893 slub_max_order= [MM, SLUB]
1894 Determines the maximum allowed order for slabs. 1894 Determines the maximum allowed order for slabs.
1895 A high setting may cause OOMs due to memory 1895 A high setting may cause OOMs due to memory
1896 fragmentation. For more information see 1896 fragmentation. For more information see
1897 Documentation/vm/slub.txt. 1897 Documentation/vm/slub.txt.
1898 1898
1899 slub_min_objects= [MM, SLUB] 1899 slub_min_objects= [MM, SLUB]
1900 The minimum number of objects per slab. SLUB will 1900 The minimum number of objects per slab. SLUB will
1901 increase the slab order up to slub_max_order to 1901 increase the slab order up to slub_max_order to
1902 generate a sufficiently large slab able to contain 1902 generate a sufficiently large slab able to contain
1903 the number of objects indicated. The higher the number 1903 the number of objects indicated. The higher the number
1904 of objects the smaller the overhead of tracking slabs 1904 of objects the smaller the overhead of tracking slabs
1905 and the less frequently locks need to be acquired. 1905 and the less frequently locks need to be acquired.
1906 For more information see Documentation/vm/slub.txt. 1906 For more information see Documentation/vm/slub.txt.
1907 1907
1908 slub_min_order= [MM, SLUB] 1908 slub_min_order= [MM, SLUB]
1909 Determines the minimum page order for slabs. Must be 1909 Determines the minimum page order for slabs. Must be
1910 lower than slub_max_order. 1910 lower than slub_max_order.
1911 For more information see Documentation/vm/slub.txt. 1911 For more information see Documentation/vm/slub.txt.
1912 1912
1913 slub_nomerge [MM, SLUB] 1913 slub_nomerge [MM, SLUB]
1914 Disable merging of slabs with similar size. May be 1914 Disable merging of slabs with similar size. May be
1915 necessary if there is some reason to distinguish 1915 necessary if there is some reason to distinguish
1916 allocs to different slabs. Debug options disable 1916 allocs to different slabs. Debug options disable
1917 merging on their own. 1917 merging on their own.
1918 For more information see Documentation/vm/slub.txt. 1918 For more information see Documentation/vm/slub.txt.
1919 1919
1920 smart2= [HW] 1920 smart2= [HW]
1921 Format: <io1>[,<io2>[,...,<io8>]] 1921 Format: <io1>[,<io2>[,...,<io8>]]
1922 1922
1923 smp-alt-once [X86-32,SMP] On a hotplug CPU system, only 1923 smp-alt-once [X86-32,SMP] On a hotplug CPU system, only
1924 attempt to substitute SMP alternatives once at boot. 1924 attempt to substitute SMP alternatives once at boot.
1925 1925
1926 smsc-ircc2.nopnp [HW] Don't use PNP to discover SMC devices 1926 smsc-ircc2.nopnp [HW] Don't use PNP to discover SMC devices
1927 smsc-ircc2.ircc_cfg= [HW] Device configuration I/O port 1927 smsc-ircc2.ircc_cfg= [HW] Device configuration I/O port
1928 smsc-ircc2.ircc_sir= [HW] SIR base I/O port 1928 smsc-ircc2.ircc_sir= [HW] SIR base I/O port
1929 smsc-ircc2.ircc_fir= [HW] FIR base I/O port 1929 smsc-ircc2.ircc_fir= [HW] FIR base I/O port
1930 smsc-ircc2.ircc_irq= [HW] IRQ line 1930 smsc-ircc2.ircc_irq= [HW] IRQ line
1931 smsc-ircc2.ircc_dma= [HW] DMA channel 1931 smsc-ircc2.ircc_dma= [HW] DMA channel
1932 smsc-ircc2.ircc_transceiver= [HW] Transceiver type: 1932 smsc-ircc2.ircc_transceiver= [HW] Transceiver type:
1933 0: Toshiba Satellite 1800 (GP data pin select) 1933 0: Toshiba Satellite 1800 (GP data pin select)
1934 1: Fast pin select (default) 1934 1: Fast pin select (default)
1935 2: ATC IRMode 1935 2: ATC IRMode
1936 1936
1937 snd-ad1816a= [HW,ALSA] 1937 snd-ad1816a= [HW,ALSA]
1938 1938
1939 snd-ad1848= [HW,ALSA] 1939 snd-ad1848= [HW,ALSA]
1940 1940
1941 snd-ali5451= [HW,ALSA] 1941 snd-ali5451= [HW,ALSA]
1942 1942
1943 snd-als100= [HW,ALSA] 1943 snd-als100= [HW,ALSA]
1944 1944
1945 snd-als4000= [HW,ALSA] 1945 snd-als4000= [HW,ALSA]
1946 1946
1947 snd-azt2320= [HW,ALSA] 1947 snd-azt2320= [HW,ALSA]
1948 1948
1949 snd-cmi8330= [HW,ALSA] 1949 snd-cmi8330= [HW,ALSA]
1950 1950
1951 snd-cmipci= [HW,ALSA] 1951 snd-cmipci= [HW,ALSA]
1952 1952
1953 snd-cs4231= [HW,ALSA] 1953 snd-cs4231= [HW,ALSA]
1954 1954
1955 snd-cs4232= [HW,ALSA] 1955 snd-cs4232= [HW,ALSA]
1956 1956
1957 snd-cs4236= [HW,ALSA] 1957 snd-cs4236= [HW,ALSA]
1958 1958
1959 snd-cs4281= [HW,ALSA] 1959 snd-cs4281= [HW,ALSA]
1960 1960
1961 snd-cs46xx= [HW,ALSA] 1961 snd-cs46xx= [HW,ALSA]
1962 1962
1963 snd-dt019x= [HW,ALSA] 1963 snd-dt019x= [HW,ALSA]
1964 1964
1965 snd-dummy= [HW,ALSA] 1965 snd-dummy= [HW,ALSA]
1966 1966
1967 snd-emu10k1= [HW,ALSA] 1967 snd-emu10k1= [HW,ALSA]
1968 1968
1969 snd-ens1370= [HW,ALSA] 1969 snd-ens1370= [HW,ALSA]
1970 1970
1971 snd-ens1371= [HW,ALSA] 1971 snd-ens1371= [HW,ALSA]
1972 1972
1973 snd-es968= [HW,ALSA] 1973 snd-es968= [HW,ALSA]
1974 1974
1975 snd-es1688= [HW,ALSA] 1975 snd-es1688= [HW,ALSA]
1976 1976
1977 snd-es18xx= [HW,ALSA] 1977 snd-es18xx= [HW,ALSA]
1978 1978
1979 snd-es1938= [HW,ALSA] 1979 snd-es1938= [HW,ALSA]
1980 1980
1981 snd-es1968= [HW,ALSA] 1981 snd-es1968= [HW,ALSA]
1982 1982
1983 snd-fm801= [HW,ALSA] 1983 snd-fm801= [HW,ALSA]
1984 1984
1985 snd-gusclassic= [HW,ALSA] 1985 snd-gusclassic= [HW,ALSA]
1986 1986
1987 snd-gusextreme= [HW,ALSA] 1987 snd-gusextreme= [HW,ALSA]
1988 1988
1989 snd-gusmax= [HW,ALSA] 1989 snd-gusmax= [HW,ALSA]
1990 1990
1991 snd-hdsp= [HW,ALSA] 1991 snd-hdsp= [HW,ALSA]
1992 1992
1993 snd-ice1712= [HW,ALSA] 1993 snd-ice1712= [HW,ALSA]
1994 1994
1995 snd-intel8x0= [HW,ALSA] 1995 snd-intel8x0= [HW,ALSA]
1996 1996
1997 snd-interwave= [HW,ALSA] 1997 snd-interwave= [HW,ALSA]
1998 1998
1999 snd-interwave-stb= 1999 snd-interwave-stb=
2000 [HW,ALSA] 2000 [HW,ALSA]
2001 2001
2002 snd-korg1212= [HW,ALSA] 2002 snd-korg1212= [HW,ALSA]
2003 2003
2004 snd-maestro3= [HW,ALSA] 2004 snd-maestro3= [HW,ALSA]
2005 2005
2006 snd-mpu401= [HW,ALSA] 2006 snd-mpu401= [HW,ALSA]
2007 2007
2008 snd-mtpav= [HW,ALSA] 2008 snd-mtpav= [HW,ALSA]
2009 2009
2010 snd-nm256= [HW,ALSA] 2010 snd-nm256= [HW,ALSA]
2011 2011
2012 snd-opl3sa2= [HW,ALSA] 2012 snd-opl3sa2= [HW,ALSA]
2013 2013
2014 snd-opti92x-ad1848= 2014 snd-opti92x-ad1848=
2015 [HW,ALSA] 2015 [HW,ALSA]
2016 2016
2017 snd-opti92x-cs4231= 2017 snd-opti92x-cs4231=
2018 [HW,ALSA] 2018 [HW,ALSA]
2019 2019
2020 snd-opti93x= [HW,ALSA] 2020 snd-opti93x= [HW,ALSA]
2021 2021
2022 snd-pmac= [HW,ALSA] 2022 snd-pmac= [HW,ALSA]
2023 2023
2024 snd-rme32= [HW,ALSA] 2024 snd-rme32= [HW,ALSA]
2025 2025
2026 snd-rme96= [HW,ALSA] 2026 snd-rme96= [HW,ALSA]
2027 2027
2028 snd-rme9652= [HW,ALSA] 2028 snd-rme9652= [HW,ALSA]
2029 2029
2030 snd-sb8= [HW,ALSA] 2030 snd-sb8= [HW,ALSA]
2031 2031
2032 snd-sb16= [HW,ALSA] 2032 snd-sb16= [HW,ALSA]
2033 2033
2034 snd-sbawe= [HW,ALSA] 2034 snd-sbawe= [HW,ALSA]
2035 2035
2036 snd-serial= [HW,ALSA] 2036 snd-serial= [HW,ALSA]
2037 2037
2038 snd-sgalaxy= [HW,ALSA] 2038 snd-sgalaxy= [HW,ALSA]
2039 2039
2040 snd-sonicvibes= [HW,ALSA] 2040 snd-sonicvibes= [HW,ALSA]
2041 2041
2042 snd-sun-amd7930= 2042 snd-sun-amd7930=
2043 [HW,ALSA] 2043 [HW,ALSA]
2044 2044
2045 snd-sun-cs4231= [HW,ALSA] 2045 snd-sun-cs4231= [HW,ALSA]
2046 2046
2047 snd-trident= [HW,ALSA] 2047 snd-trident= [HW,ALSA]
2048 2048
2049 snd-usb-audio= [HW,ALSA,USB] 2049 snd-usb-audio= [HW,ALSA,USB]
2050 2050
2051 snd-via82xx= [HW,ALSA] 2051 snd-via82xx= [HW,ALSA]
2052 2052
2053 snd-virmidi= [HW,ALSA] 2053 snd-virmidi= [HW,ALSA]
2054 2054
2055 snd-wavefront= [HW,ALSA] 2055 snd-wavefront= [HW,ALSA]
2056 2056
2057 snd-ymfpci= [HW,ALSA] 2057 snd-ymfpci= [HW,ALSA]
2058 2058
2059 softlockup_panic= 2059 softlockup_panic=
2060 [KNL] Should the soft-lockup detector generate panics. 2060 [KNL] Should the soft-lockup detector generate panics.
2061 2061
2062 sonypi.*= [HW] Sony Programmable I/O Control Device driver 2062 sonypi.*= [HW] Sony Programmable I/O Control Device driver
2063 See Documentation/sonypi.txt 2063 See Documentation/sonypi.txt
2064 2064
2065 specialix= [HW,SERIAL] Specialix multi-serial port adapter 2065 specialix= [HW,SERIAL] Specialix multi-serial port adapter
2066 See Documentation/specialix.txt. 2066 See Documentation/specialix.txt.
2067 2067
2068 spia_io_base= [HW,MTD] 2068 spia_io_base= [HW,MTD]
2069 spia_fio_base= 2069 spia_fio_base=
2070 spia_pedr= 2070 spia_pedr=
2071 spia_peddr= 2071 spia_peddr=
2072 2072
2073 sscape= [HW,OSS] 2073 sscape= [HW,OSS]
2074 Format: <io>,<irq>,<dma>,<mpu_io>,<mpu_irq> 2074 Format: <io>,<irq>,<dma>,<mpu_io>,<mpu_irq>
2075 2075
2076 st= [HW,SCSI] SCSI tape parameters (buffers, etc.) 2076 st= [HW,SCSI] SCSI tape parameters (buffers, etc.)
2077 See Documentation/scsi/st.txt. 2077 See Documentation/scsi/st.txt.
2078 2078
2079 sti= [PARISC,HW] 2079 sti= [PARISC,HW]
2080 Format: <num> 2080 Format: <num>
2081 Set the STI (builtin display/keyboard on the HP-PARISC 2081 Set the STI (builtin display/keyboard on the HP-PARISC
2082 machines) console (graphic card) which should be used 2082 machines) console (graphic card) which should be used
2083 as the initial boot-console. 2083 as the initial boot-console.
2084 See also comment in drivers/video/console/sticore.c. 2084 See also comment in drivers/video/console/sticore.c.
2085 2085
2086 sti_font= [HW] 2086 sti_font= [HW]
2087 See comment in drivers/video/console/sticore.c. 2087 See comment in drivers/video/console/sticore.c.
2088 2088
2089 stifb= [HW] 2089 stifb= [HW]
2090 Format: bpp:<bpp1>[:<bpp2>[:<bpp3>...]] 2090 Format: bpp:<bpp1>[:<bpp2>[:<bpp3>...]]
2091 2091
2092 sunrpc.pool_mode= 2092 sunrpc.pool_mode=
2093 [NFS] 2093 [NFS]
2094 Control how the NFS server code allocates CPUs to 2094 Control how the NFS server code allocates CPUs to
2095 service thread pools. Depending on how many NICs 2095 service thread pools. Depending on how many NICs
2096 you have and where their interrupts are bound, this 2096 you have and where their interrupts are bound, this
2097 option will affect which CPUs will do NFS serving. 2097 option will affect which CPUs will do NFS serving.
2098 Note: this parameter cannot be changed while the 2098 Note: this parameter cannot be changed while the
2099 NFS server is running. 2099 NFS server is running.
2100 2100
2101 auto the server chooses an appropriate mode 2101 auto the server chooses an appropriate mode
2102 automatically using heuristics 2102 automatically using heuristics
2103 global a single global pool contains all CPUs 2103 global a single global pool contains all CPUs
2104 percpu one pool for each CPU 2104 percpu one pool for each CPU
2105 pernode one pool for each NUMA node (equivalent 2105 pernode one pool for each NUMA node (equivalent
2106 to global on non-NUMA machines) 2106 to global on non-NUMA machines)
2107 2107
2108 swiotlb= [IA-64] Number of I/O TLB slabs 2108 swiotlb= [IA-64] Number of I/O TLB slabs
2109 2109
2110 switches= [HW,M68k] 2110 switches= [HW,M68k]
2111 2111
2112 sym53c416= [HW,SCSI] 2112 sym53c416= [HW,SCSI]
2113 See header of drivers/scsi/sym53c416.c. 2113 See header of drivers/scsi/sym53c416.c.
2114 2114
2115 sysrq_always_enabled 2115 sysrq_always_enabled
2116 [KNL] 2116 [KNL]
2117 Ignore sysrq setting - this boot parameter will 2117 Ignore sysrq setting - this boot parameter will
2118 neutralize any effect of /proc/sys/kernel/sysrq. 2118 neutralize any effect of /proc/sys/kernel/sysrq.
2119 Useful for debugging. 2119 Useful for debugging.
2120 2120
2121 t128= [HW,SCSI] 2121 t128= [HW,SCSI]
2122 See header of drivers/scsi/t128.c. 2122 See header of drivers/scsi/t128.c.
2123 2123
2124 tdfx= [HW,DRM] 2124 tdfx= [HW,DRM]
2125 2125
2126 thash_entries= [KNL,NET] 2126 thash_entries= [KNL,NET]
2127 Set number of hash buckets for TCP connection 2127 Set number of hash buckets for TCP connection
2128 2128
2129 thermal.act= [HW,ACPI] 2129 thermal.act= [HW,ACPI]
2130 -1: disable all active trip points in all thermal zones 2130 -1: disable all active trip points in all thermal zones
2131 <degrees C>: override all lowest active trip points 2131 <degrees C>: override all lowest active trip points
2132 2132
2133 thermal.crt= [HW,ACPI] 2133 thermal.crt= [HW,ACPI]
2134 -1: disable all critical trip points in all thermal zones 2134 -1: disable all critical trip points in all thermal zones
2135 <degrees C>: lower all critical trip points 2135 <degrees C>: lower all critical trip points
2136 2136
2137 thermal.nocrt= [HW,ACPI] 2137 thermal.nocrt= [HW,ACPI]
2138 Set to disable actions on ACPI thermal zone 2138 Set to disable actions on ACPI thermal zone
2139 critical and hot trip points. 2139 critical and hot trip points.
2140 2140
2141 thermal.off= [HW,ACPI] 2141 thermal.off= [HW,ACPI]
2142 1: disable ACPI thermal control 2142 1: disable ACPI thermal control
2143 2143
2144 thermal.psv= [HW,ACPI] 2144 thermal.psv= [HW,ACPI]
2145 -1: disable all passive trip points 2145 -1: disable all passive trip points
2146 <degrees C>: override all passive trip points to this value 2146 <degrees C>: override all passive trip points to this value
2147 2147
2148 thermal.tzp= [HW,ACPI] 2148 thermal.tzp= [HW,ACPI]
2149 Specify global default ACPI thermal zone polling rate 2149 Specify global default ACPI thermal zone polling rate
2150 <deci-seconds>: poll at this frequency 2150 <deci-seconds>: poll at this frequency
2151 0: no polling (default) 2151 0: no polling (default)
2152 2152
2153 tipar.timeout= [HW,PPT] 2153 tipar.timeout= [HW,PPT]
2154 Set communications timeout in tenths of a second 2154 Set communications timeout in tenths of a second
2155 (default 15). 2155 (default 15).
2156 2156
2157 tipar.delay= [HW,PPT] 2157 tipar.delay= [HW,PPT]
2158 Set inter-bit delay in microseconds (default 10). 2158 Set inter-bit delay in microseconds (default 10).
2159 2159
2160 tmscsim= [HW,SCSI] 2160 tmscsim= [HW,SCSI]
2161 See comment before function dc390_setup() in 2161 See comment before function dc390_setup() in
2162 drivers/scsi/tmscsim.c. 2162 drivers/scsi/tmscsim.c.
2163 2163
2164 tp720= [HW,PS2] 2164 tp720= [HW,PS2]
2165 2165
2166 trix= [HW,OSS] MediaTrix AudioTrix Pro 2166 trix= [HW,OSS] MediaTrix AudioTrix Pro
2167 Format: 2167 Format:
2168 <io>,<irq>,<dma>,<dma2>,<sb_io>,<sb_irq>,<sb_dma>,<mpu_io>,<mpu_irq> 2168 <io>,<irq>,<dma>,<dma2>,<sb_io>,<sb_irq>,<sb_dma>,<mpu_io>,<mpu_irq>
2169 2169
2170 turbografx.map[2|3]= [HW,JOY] 2170 turbografx.map[2|3]= [HW,JOY]
2171 TurboGraFX parallel port interface 2171 TurboGraFX parallel port interface
2172 Format: 2172 Format:
2173 <port#>,<js1>,<js2>,<js3>,<js4>,<js5>,<js6>,<js7> 2173 <port#>,<js1>,<js2>,<js3>,<js4>,<js5>,<js6>,<js7>
2174 See also Documentation/input/joystick-parport.txt 2174 See also Documentation/input/joystick-parport.txt
2175 2175
2176 u14-34f= [HW,SCSI] UltraStor 14F/34F SCSI host adapter 2176 u14-34f= [HW,SCSI] UltraStor 14F/34F SCSI host adapter
2177 See header of drivers/scsi/u14-34f.c. 2177 See header of drivers/scsi/u14-34f.c.
2178 2178
2179 uart401= [HW,OSS] 2179 uart401= [HW,OSS]
2180 Format: <io>,<irq> 2180 Format: <io>,<irq>
2181 2181
2182 uart6850= [HW,OSS] 2182 uart6850= [HW,OSS]
2183 Format: <io>,<irq> 2183 Format: <io>,<irq>
2184 2184
2185 uhci-hcd.ignore_oc= 2185 uhci-hcd.ignore_oc=
2186 [USB] Ignore overcurrent events (default N). 2186 [USB] Ignore overcurrent events (default N).
2187 Some badly-designed motherboards generate lots of 2187 Some badly-designed motherboards generate lots of
2188 bogus events, for ports that aren't wired to 2188 bogus events, for ports that aren't wired to
2189 anything. Set this parameter to avoid log spamming. 2189 anything. Set this parameter to avoid log spamming.
2190 Note that genuine overcurrent events won't be 2190 Note that genuine overcurrent events won't be
2191 reported either. 2191 reported either.
2192 2192
2193 unknown_nmi_panic 2193 unknown_nmi_panic
2194 [X86-32,X86-64] 2194 [X86-32,X86-64]
2195 Set unknown_nmi_panic=1 early on boot. 2195 Set unknown_nmi_panic=1 early on boot.
2196 2196
2197 usbcore.autosuspend= 2197 usbcore.autosuspend=
2198 [USB] The autosuspend time delay (in seconds) used 2198 [USB] The autosuspend time delay (in seconds) used
2199 for newly-detected USB devices (default 2). This 2199 for newly-detected USB devices (default 2). This
2200 is the time required before an idle device will be 2200 is the time required before an idle device will be
2201 autosuspended. Devices for which the delay is set 2201 autosuspended. Devices for which the delay is set
2202 to a negative value won't be autosuspended at all. 2202 to a negative value won't be autosuspended at all.
2203 2203
2204 usbhid.mousepoll= 2204 usbhid.mousepoll=
2205 [USBHID] The interval at which mice are to be polled. 2205 [USBHID] The interval at which mice are to be polled.
2206 2206
2207 add_efi_memmap [EFI,X86-32,X86-64] Include EFI memory map in 2207 add_efi_memmap [EFI,X86-32,X86-64] Include EFI memory map in
2208 kernel's map of available physical RAM. 2208 kernel's map of available physical RAM.
2209 2209
2210 vdso= [X86-32,SH,X86-64] 2210 vdso= [X86-32,SH,X86-64]
2211 vdso=2: enable compat VDSO (default with COMPAT_VDSO) 2211 vdso=2: enable compat VDSO (default with COMPAT_VDSO)
2212 vdso=1: enable VDSO (default) 2212 vdso=1: enable VDSO (default)
2213 vdso=0: disable VDSO mapping 2213 vdso=0: disable VDSO mapping
2214 2214
2215 vdso32= [X86-32,X86-64] 2215 vdso32= [X86-32,X86-64]
2216 vdso32=2: enable compat VDSO (default with COMPAT_VDSO) 2216 vdso32=2: enable compat VDSO (default with COMPAT_VDSO)
2217 vdso32=1: enable 32-bit VDSO (default) 2217 vdso32=1: enable 32-bit VDSO (default)
2218 vdso32=0: disable 32-bit VDSO mapping 2218 vdso32=0: disable 32-bit VDSO mapping
2219 2219
2220 vector= [IA-64,SMP] 2220 vector= [IA-64,SMP]
2221 vector=percpu: enable percpu vector domain 2221 vector=percpu: enable percpu vector domain
2222 2222
2223 video= [FB] Frame buffer configuration 2223 video= [FB] Frame buffer configuration
2224 See Documentation/fb/modedb.txt. 2224 See Documentation/fb/modedb.txt.
2225 2225
2226 vga= [BOOT,X86-32] Select a particular video mode 2226 vga= [BOOT,X86-32] Select a particular video mode
2227 See Documentation/i386/boot.txt and 2227 See Documentation/i386/boot.txt and
2228 Documentation/svga.txt. 2228 Documentation/svga.txt.
2229 Use vga=ask for menu. 2229 Use vga=ask for menu.
2230 This is actually a boot loader parameter; the value is 2230 This is actually a boot loader parameter; the value is
2231 passed to the kernel using a special protocol. 2231 passed to the kernel using a special protocol.
2232 2232
2233 vmalloc=nn[KMG] [KNL,BOOT] Forces the vmalloc area to have an exact 2233 vmalloc=nn[KMG] [KNL,BOOT] Forces the vmalloc area to have an exact
2234 size of <nn>. This can be used to increase the 2234 size of <nn>. This can be used to increase the
2235 minimum size (128MB on x86). It can also be used to 2235 minimum size (128MB on x86). It can also be used to
2236 decrease the size and leave more room for directly 2236 decrease the size and leave more room for directly
2237 mapped kernel RAM. 2237 mapped kernel RAM.
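For example, vmalloc=256M (an arbitrary value, shown only to illustrate the nn[KMG] format) would force a 256MB vmalloc area.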
2238 2238
2239 vmhalt= [KNL,S390] Perform z/VM CP command after system halt. 2239 vmhalt= [KNL,S390] Perform z/VM CP command after system halt.
2240 Format: <command> 2240 Format: <command>
2241 2241
2242 vmpanic= [KNL,S390] Perform z/VM CP command after kernel panic. 2242 vmpanic= [KNL,S390] Perform z/VM CP command after kernel panic.
2243 Format: <command> 2243 Format: <command>
2244 2244
2245 vmpoff= [KNL,S390] Perform z/VM CP command after power off. 2245 vmpoff= [KNL,S390] Perform z/VM CP command after power off.
2246 Format: <command> 2246 Format: <command>
2247 2247
2248 waveartist= [HW,OSS] 2248 waveartist= [HW,OSS]
2249 Format: <io>,<irq>,<dma>,<dma2> 2249 Format: <io>,<irq>,<dma>,<dma2>
2250 2250
2251 wd33c93= [HW,SCSI] 2251 wd33c93= [HW,SCSI]
2252 See header of drivers/scsi/wd33c93.c. 2252 See header of drivers/scsi/wd33c93.c.
2253 2253
2254 wd7000= [HW,SCSI] 2254 wd7000= [HW,SCSI]
2255 See header of drivers/scsi/wd7000.c. 2255 See header of drivers/scsi/wd7000.c.
2256 2256
2257 wdt= [WDT] Watchdog 2257 wdt= [WDT] Watchdog
2258 See Documentation/watchdog/wdt.txt. 2258 See Documentation/watchdog/wdt.txt.
2259 2259
2260 xd= [HW,XT] Original XT pre-IDE (RLL encoded) disks. 2260 xd= [HW,XT] Original XT pre-IDE (RLL encoded) disks.
2261 xd_geo= See header of drivers/block/xd.c. 2261 xd_geo= See header of drivers/block/xd.c.
2262 2262
2263 xirc2ps_cs= [NET,PCMCIA] 2263 xirc2ps_cs= [NET,PCMCIA]
2264 Format: 2264 Format:
2265 <irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]] 2265 <irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]]
2266 2266
2267 norandmaps Don't use address space randomization 2267 norandmaps Don't use address space randomization
2268 Equivalent to echo 0 > /proc/sys/kernel/randomize_va_space 2268 Equivalent to echo 0 > /proc/sys/kernel/randomize_va_space
2269 2269
2270 ______________________________________________________________________ 2270 ______________________________________________________________________
2271 2271
2272 TODO: 2272 TODO:
2273 2273
2274 Add documentation for ALSA options. 2274 Add documentation for ALSA options.
2275 Add more DRM drivers. 2275 Add more DRM drivers.
2276 2276
arch/powerpc/mm/hash_utils_64.c
1 /* 1 /*
2 * PowerPC64 port by Mike Corrigan and Dave Engebretsen 2 * PowerPC64 port by Mike Corrigan and Dave Engebretsen
3 * {mikejc|engebret}@us.ibm.com 3 * {mikejc|engebret}@us.ibm.com
4 * 4 *
5 * Copyright (c) 2000 Mike Corrigan <mikejc@us.ibm.com> 5 * Copyright (c) 2000 Mike Corrigan <mikejc@us.ibm.com>
6 * 6 *
7 * SMP scalability work: 7 * SMP scalability work:
8 * Copyright (C) 2001 Anton Blanchard <anton@au.ibm.com>, IBM 8 * Copyright (C) 2001 Anton Blanchard <anton@au.ibm.com>, IBM
9 * 9 *
10 * Module name: htab.c 10 * Module name: htab.c
11 * 11 *
12 * Description: 12 * Description:
13 * PowerPC Hashed Page Table functions 13 * PowerPC Hashed Page Table functions
14 * 14 *
15 * This program is free software; you can redistribute it and/or 15 * This program is free software; you can redistribute it and/or
16 * modify it under the terms of the GNU General Public License 16 * modify it under the terms of the GNU General Public License
17 * as published by the Free Software Foundation; either version 17 * as published by the Free Software Foundation; either version
18 * 2 of the License, or (at your option) any later version. 18 * 2 of the License, or (at your option) any later version.
19 */ 19 */
20 20
21 #undef DEBUG 21 #undef DEBUG
22 #undef DEBUG_LOW 22 #undef DEBUG_LOW
23 23
24 #include <linux/spinlock.h> 24 #include <linux/spinlock.h>
25 #include <linux/errno.h> 25 #include <linux/errno.h>
26 #include <linux/sched.h> 26 #include <linux/sched.h>
27 #include <linux/proc_fs.h> 27 #include <linux/proc_fs.h>
28 #include <linux/stat.h> 28 #include <linux/stat.h>
29 #include <linux/sysctl.h> 29 #include <linux/sysctl.h>
30 #include <linux/ctype.h> 30 #include <linux/ctype.h>
31 #include <linux/cache.h> 31 #include <linux/cache.h>
32 #include <linux/init.h> 32 #include <linux/init.h>
33 #include <linux/signal.h> 33 #include <linux/signal.h>
34 #include <linux/lmb.h> 34 #include <linux/lmb.h>
35 35
36 #include <asm/processor.h> 36 #include <asm/processor.h>
37 #include <asm/pgtable.h> 37 #include <asm/pgtable.h>
38 #include <asm/mmu.h> 38 #include <asm/mmu.h>
39 #include <asm/mmu_context.h> 39 #include <asm/mmu_context.h>
40 #include <asm/page.h> 40 #include <asm/page.h>
41 #include <asm/types.h> 41 #include <asm/types.h>
42 #include <asm/system.h> 42 #include <asm/system.h>
43 #include <asm/uaccess.h> 43 #include <asm/uaccess.h>
44 #include <asm/machdep.h> 44 #include <asm/machdep.h>
45 #include <asm/prom.h> 45 #include <asm/prom.h>
46 #include <asm/abs_addr.h> 46 #include <asm/abs_addr.h>
47 #include <asm/tlbflush.h> 47 #include <asm/tlbflush.h>
48 #include <asm/io.h> 48 #include <asm/io.h>
49 #include <asm/eeh.h> 49 #include <asm/eeh.h>
50 #include <asm/tlb.h> 50 #include <asm/tlb.h>
51 #include <asm/cacheflush.h> 51 #include <asm/cacheflush.h>
52 #include <asm/cputable.h> 52 #include <asm/cputable.h>
53 #include <asm/sections.h> 53 #include <asm/sections.h>
54 #include <asm/spu.h> 54 #include <asm/spu.h>
55 #include <asm/udbg.h> 55 #include <asm/udbg.h>
56 56
57 #ifdef DEBUG 57 #ifdef DEBUG
58 #define DBG(fmt...) udbg_printf(fmt) 58 #define DBG(fmt...) udbg_printf(fmt)
59 #else 59 #else
60 #define DBG(fmt...) 60 #define DBG(fmt...)
61 #endif 61 #endif
62 62
63 #ifdef DEBUG_LOW 63 #ifdef DEBUG_LOW
64 #define DBG_LOW(fmt...) udbg_printf(fmt) 64 #define DBG_LOW(fmt...) udbg_printf(fmt)
65 #else 65 #else
66 #define DBG_LOW(fmt...) 66 #define DBG_LOW(fmt...)
67 #endif 67 #endif
68 68
69 #define KB (1024) 69 #define KB (1024)
70 #define MB (1024*KB) 70 #define MB (1024*KB)
71 #define GB (1024L*MB) 71 #define GB (1024L*MB)
72 72
73 /* 73 /*
74 * Note: pte --> Linux PTE 74 * Note: pte --> Linux PTE
75 * HPTE --> PowerPC Hashed Page Table Entry 75 * HPTE --> PowerPC Hashed Page Table Entry
76 * 76 *
77 * Execution context: 77 * Execution context:
78 * htab_initialize is called with the MMU off (of course), but 78 * htab_initialize is called with the MMU off (of course), but
79 * the kernel has been copied down to zero so it can directly 79 * the kernel has been copied down to zero so it can directly
80 * reference global data. At this point it is very difficult 80 * reference global data. At this point it is very difficult
81 * to print debug info. 81 * to print debug info.
82 * 82 *
83 */ 83 */
84 84
85 #ifdef CONFIG_U3_DART 85 #ifdef CONFIG_U3_DART
86 extern unsigned long dart_tablebase; 86 extern unsigned long dart_tablebase;
87 #endif /* CONFIG_U3_DART */ 87 #endif /* CONFIG_U3_DART */
88 88
89 static unsigned long _SDR1; 89 static unsigned long _SDR1;
90 struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT]; 90 struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
91 91
92 struct hash_pte *htab_address; 92 struct hash_pte *htab_address;
93 unsigned long htab_size_bytes; 93 unsigned long htab_size_bytes;
94 unsigned long htab_hash_mask; 94 unsigned long htab_hash_mask;
95 int mmu_linear_psize = MMU_PAGE_4K; 95 int mmu_linear_psize = MMU_PAGE_4K;
96 int mmu_virtual_psize = MMU_PAGE_4K; 96 int mmu_virtual_psize = MMU_PAGE_4K;
97 int mmu_vmalloc_psize = MMU_PAGE_4K; 97 int mmu_vmalloc_psize = MMU_PAGE_4K;
98 #ifdef CONFIG_SPARSEMEM_VMEMMAP 98 #ifdef CONFIG_SPARSEMEM_VMEMMAP
99 int mmu_vmemmap_psize = MMU_PAGE_4K; 99 int mmu_vmemmap_psize = MMU_PAGE_4K;
100 #endif 100 #endif
101 int mmu_io_psize = MMU_PAGE_4K; 101 int mmu_io_psize = MMU_PAGE_4K;
102 int mmu_kernel_ssize = MMU_SEGSIZE_256M; 102 int mmu_kernel_ssize = MMU_SEGSIZE_256M;
103 int mmu_highuser_ssize = MMU_SEGSIZE_256M; 103 int mmu_highuser_ssize = MMU_SEGSIZE_256M;
104 u16 mmu_slb_size = 64; 104 u16 mmu_slb_size = 64;
105 #ifdef CONFIG_HUGETLB_PAGE 105 #ifdef CONFIG_HUGETLB_PAGE
106 int mmu_huge_psize = MMU_PAGE_16M;
107 unsigned int HPAGE_SHIFT; 106 unsigned int HPAGE_SHIFT;
108 #endif 107 #endif
109 #ifdef CONFIG_PPC_64K_PAGES 108 #ifdef CONFIG_PPC_64K_PAGES
110 int mmu_ci_restrictions; 109 int mmu_ci_restrictions;
111 #endif 110 #endif
112 #ifdef CONFIG_DEBUG_PAGEALLOC 111 #ifdef CONFIG_DEBUG_PAGEALLOC
113 static u8 *linear_map_hash_slots; 112 static u8 *linear_map_hash_slots;
114 static unsigned long linear_map_hash_count; 113 static unsigned long linear_map_hash_count;
115 static DEFINE_SPINLOCK(linear_map_hash_lock); 114 static DEFINE_SPINLOCK(linear_map_hash_lock);
116 #endif /* CONFIG_DEBUG_PAGEALLOC */ 115 #endif /* CONFIG_DEBUG_PAGEALLOC */
117 116
118 /* There are definitions of page sizes arrays to be used when none 117 /* There are definitions of page sizes arrays to be used when none
119 * is provided by the firmware. 118 * is provided by the firmware.
120 */ 119 */
121 120
122 /* Pre-POWER4 CPUs (4k pages only) 121 /* Pre-POWER4 CPUs (4k pages only)
123 */ 122 */
124 static struct mmu_psize_def mmu_psize_defaults_old[] = { 123 static struct mmu_psize_def mmu_psize_defaults_old[] = {
125 [MMU_PAGE_4K] = { 124 [MMU_PAGE_4K] = {
126 .shift = 12, 125 .shift = 12,
127 .sllp = 0, 126 .sllp = 0,
128 .penc = 0, 127 .penc = 0,
129 .avpnm = 0, 128 .avpnm = 0,
130 .tlbiel = 0, 129 .tlbiel = 0,
131 }, 130 },
132 }; 131 };
133 132
134 /* POWER4, GPUL, POWER5 133 /* POWER4, GPUL, POWER5
135 * 134 *
136 * Support for 16Mb large pages 135 * Support for 16Mb large pages
137 */ 136 */
138 static struct mmu_psize_def mmu_psize_defaults_gp[] = { 137 static struct mmu_psize_def mmu_psize_defaults_gp[] = {
139 [MMU_PAGE_4K] = { 138 [MMU_PAGE_4K] = {
140 .shift = 12, 139 .shift = 12,
141 .sllp = 0, 140 .sllp = 0,
142 .penc = 0, 141 .penc = 0,
143 .avpnm = 0, 142 .avpnm = 0,
144 .tlbiel = 1, 143 .tlbiel = 1,
145 }, 144 },
146 [MMU_PAGE_16M] = { 145 [MMU_PAGE_16M] = {
147 .shift = 24, 146 .shift = 24,
148 .sllp = SLB_VSID_L, 147 .sllp = SLB_VSID_L,
149 .penc = 0, 148 .penc = 0,
150 .avpnm = 0x1UL, 149 .avpnm = 0x1UL,
151 .tlbiel = 0, 150 .tlbiel = 0,
152 }, 151 },
153 }; 152 };
154 153
155 154
156 int htab_bolt_mapping(unsigned long vstart, unsigned long vend, 155 int htab_bolt_mapping(unsigned long vstart, unsigned long vend,
157 unsigned long pstart, unsigned long mode, 156 unsigned long pstart, unsigned long mode,
158 int psize, int ssize) 157 int psize, int ssize)
159 { 158 {
160 unsigned long vaddr, paddr; 159 unsigned long vaddr, paddr;
161 unsigned int step, shift; 160 unsigned int step, shift;
162 unsigned long tmp_mode; 161 unsigned long tmp_mode;
163 int ret = 0; 162 int ret = 0;
164 163
165 shift = mmu_psize_defs[psize].shift; 164 shift = mmu_psize_defs[psize].shift;
166 step = 1 << shift; 165 step = 1 << shift;
167 166
168 for (vaddr = vstart, paddr = pstart; vaddr < vend; 167 for (vaddr = vstart, paddr = pstart; vaddr < vend;
169 vaddr += step, paddr += step) { 168 vaddr += step, paddr += step) {
170 unsigned long hash, hpteg; 169 unsigned long hash, hpteg;
171 unsigned long vsid = get_kernel_vsid(vaddr, ssize); 170 unsigned long vsid = get_kernel_vsid(vaddr, ssize);
172 unsigned long va = hpt_va(vaddr, vsid, ssize); 171 unsigned long va = hpt_va(vaddr, vsid, ssize);
173 172
174 tmp_mode = mode; 173 tmp_mode = mode;
175 174
176 /* Make non-kernel text non-executable */ 175 /* Make non-kernel text non-executable */
177 if (!in_kernel_text(vaddr)) 176 if (!in_kernel_text(vaddr))
178 tmp_mode = mode | HPTE_R_N; 177 tmp_mode = mode | HPTE_R_N;
179 178
180 hash = hpt_hash(va, shift, ssize); 179 hash = hpt_hash(va, shift, ssize);
181 hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP); 180 hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP);
182 181
183 DBG("htab_bolt_mapping: calling %p\n", ppc_md.hpte_insert); 182 DBG("htab_bolt_mapping: calling %p\n", ppc_md.hpte_insert);
184 183
185 BUG_ON(!ppc_md.hpte_insert); 184 BUG_ON(!ppc_md.hpte_insert);
186 ret = ppc_md.hpte_insert(hpteg, va, paddr, 185 ret = ppc_md.hpte_insert(hpteg, va, paddr,
187 tmp_mode, HPTE_V_BOLTED, psize, ssize); 186 tmp_mode, HPTE_V_BOLTED, psize, ssize);
188 187
189 if (ret < 0) 188 if (ret < 0)
190 break; 189 break;
191 #ifdef CONFIG_DEBUG_PAGEALLOC 190 #ifdef CONFIG_DEBUG_PAGEALLOC
192 if ((paddr >> PAGE_SHIFT) < linear_map_hash_count) 191 if ((paddr >> PAGE_SHIFT) < linear_map_hash_count)
193 linear_map_hash_slots[paddr >> PAGE_SHIFT] = ret | 0x80; 192 linear_map_hash_slots[paddr >> PAGE_SHIFT] = ret | 0x80;
194 #endif /* CONFIG_DEBUG_PAGEALLOC */ 193 #endif /* CONFIG_DEBUG_PAGEALLOC */
195 } 194 }
196 return ret < 0 ? ret : 0; 195 return ret < 0 ? ret : 0;
197 } 196 }
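/*
 * Illustrative, stand-alone sketch (plain user-space C, not part of this
 * file): the bolt loop above advances by exactly one page of the selected
 * MMU size, i.e. by 1 << mmu_psize_defs[psize].shift bytes per inserted
 * HPTE.  The shift values below are copied from the default tables earlier
 * in this file; the 1GB region is an arbitrary example.  The factor-of-4096
 * difference in insert counts is one reason large pages are preferred for
 * the linear mapping when the hardware supports them.
 */
#include <stdio.h>

int main(void)
{
	unsigned int shifts[] = { 12, 24 };	/* 4K and 16M page sizes */
	const char *names[] = { "4K", "16M" };
	unsigned long region = 1UL << 30;	/* a 1GB linear region */
	int i;

	for (i = 0; i < 2; i++)
		printf("%s pages: step %u bytes, %lu HPTE inserts for 1GB\n",
		       names[i], 1u << shifts[i], region >> shifts[i]);
	return 0;
}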
198 197
199 #ifdef CONFIG_MEMORY_HOTPLUG 198 #ifdef CONFIG_MEMORY_HOTPLUG
200 static int htab_remove_mapping(unsigned long vstart, unsigned long vend, 199 static int htab_remove_mapping(unsigned long vstart, unsigned long vend,
201 int psize, int ssize) 200 int psize, int ssize)
202 { 201 {
203 unsigned long vaddr; 202 unsigned long vaddr;
204 unsigned int step, shift; 203 unsigned int step, shift;
205 204
206 shift = mmu_psize_defs[psize].shift; 205 shift = mmu_psize_defs[psize].shift;
207 step = 1 << shift; 206 step = 1 << shift;
208 207
209 if (!ppc_md.hpte_removebolted) { 208 if (!ppc_md.hpte_removebolted) {
210 printk(KERN_WARNING "Platform doesn't implement " 209 printk(KERN_WARNING "Platform doesn't implement "
211 "hpte_removebolted\n"); 210 "hpte_removebolted\n");
212 return -EINVAL; 211 return -EINVAL;
213 } 212 }
214 213
215 for (vaddr = vstart; vaddr < vend; vaddr += step) 214 for (vaddr = vstart; vaddr < vend; vaddr += step)
216 ppc_md.hpte_removebolted(vaddr, psize, ssize); 215 ppc_md.hpte_removebolted(vaddr, psize, ssize);
217 216
218 return 0; 217 return 0;
219 } 218 }
220 #endif /* CONFIG_MEMORY_HOTPLUG */ 219 #endif /* CONFIG_MEMORY_HOTPLUG */
221 220
222 static int __init htab_dt_scan_seg_sizes(unsigned long node, 221 static int __init htab_dt_scan_seg_sizes(unsigned long node,
223 const char *uname, int depth, 222 const char *uname, int depth,
224 void *data) 223 void *data)
225 { 224 {
226 char *type = of_get_flat_dt_prop(node, "device_type", NULL); 225 char *type = of_get_flat_dt_prop(node, "device_type", NULL);
227 u32 *prop; 226 u32 *prop;
228 unsigned long size = 0; 227 unsigned long size = 0;
229 228
230 /* We are scanning "cpu" nodes only */ 229 /* We are scanning "cpu" nodes only */
231 if (type == NULL || strcmp(type, "cpu") != 0) 230 if (type == NULL || strcmp(type, "cpu") != 0)
232 return 0; 231 return 0;
233 232
234 prop = (u32 *)of_get_flat_dt_prop(node, "ibm,processor-segment-sizes", 233 prop = (u32 *)of_get_flat_dt_prop(node, "ibm,processor-segment-sizes",
235 &size); 234 &size);
236 if (prop == NULL) 235 if (prop == NULL)
237 return 0; 236 return 0;
238 for (; size >= 4; size -= 4, ++prop) { 237 for (; size >= 4; size -= 4, ++prop) {
239 if (prop[0] == 40) { 238 if (prop[0] == 40) {
240 DBG("1T segment support detected\n"); 239 DBG("1T segment support detected\n");
241 cur_cpu_spec->cpu_features |= CPU_FTR_1T_SEGMENT; 240 cur_cpu_spec->cpu_features |= CPU_FTR_1T_SEGMENT;
242 return 1; 241 return 1;
243 } 242 }
244 } 243 }
245 cur_cpu_spec->cpu_features &= ~CPU_FTR_NO_SLBIE_B; 244 cur_cpu_spec->cpu_features &= ~CPU_FTR_NO_SLBIE_B;
246 return 0; 245 return 0;
247 } 246 }
248 247
249 static void __init htab_init_seg_sizes(void) 248 static void __init htab_init_seg_sizes(void)
250 { 249 {
251 of_scan_flat_dt(htab_dt_scan_seg_sizes, NULL); 250 of_scan_flat_dt(htab_dt_scan_seg_sizes, NULL);
252 } 251 }
253 252
254 static int __init htab_dt_scan_page_sizes(unsigned long node, 253 static int __init htab_dt_scan_page_sizes(unsigned long node,
255 const char *uname, int depth, 254 const char *uname, int depth,
256 void *data) 255 void *data)
257 { 256 {
258 char *type = of_get_flat_dt_prop(node, "device_type", NULL); 257 char *type = of_get_flat_dt_prop(node, "device_type", NULL);
259 u32 *prop; 258 u32 *prop;
260 unsigned long size = 0; 259 unsigned long size = 0;
261 260
262 /* We are scanning "cpu" nodes only */ 261 /* We are scanning "cpu" nodes only */
263 if (type == NULL || strcmp(type, "cpu") != 0) 262 if (type == NULL || strcmp(type, "cpu") != 0)
264 return 0; 263 return 0;
265 264
266 prop = (u32 *)of_get_flat_dt_prop(node, 265 prop = (u32 *)of_get_flat_dt_prop(node,
267 "ibm,segment-page-sizes", &size); 266 "ibm,segment-page-sizes", &size);
268 if (prop != NULL) { 267 if (prop != NULL) {
269 DBG("Page sizes from device-tree:\n"); 268 DBG("Page sizes from device-tree:\n");
270 size /= 4; 269 size /= 4;
271 cur_cpu_spec->cpu_features &= ~(CPU_FTR_16M_PAGE); 270 cur_cpu_spec->cpu_features &= ~(CPU_FTR_16M_PAGE);
272 while(size > 0) { 271 while(size > 0) {
273 unsigned int shift = prop[0]; 272 unsigned int shift = prop[0];
274 unsigned int slbenc = prop[1]; 273 unsigned int slbenc = prop[1];
275 unsigned int lpnum = prop[2]; 274 unsigned int lpnum = prop[2];
276 unsigned int lpenc = 0; 275 unsigned int lpenc = 0;
277 struct mmu_psize_def *def; 276 struct mmu_psize_def *def;
278 int idx = -1; 277 int idx = -1;
279 278
280 size -= 3; prop += 3; 279 size -= 3; prop += 3;
281 while(size > 0 && lpnum) { 280 while(size > 0 && lpnum) {
282 if (prop[0] == shift) 281 if (prop[0] == shift)
283 lpenc = prop[1]; 282 lpenc = prop[1];
284 prop += 2; size -= 2; 283 prop += 2; size -= 2;
285 lpnum--; 284 lpnum--;
286 } 285 }
287 switch(shift) { 286 switch(shift) {
288 case 0xc: 287 case 0xc:
289 idx = MMU_PAGE_4K; 288 idx = MMU_PAGE_4K;
290 break; 289 break;
291 case 0x10: 290 case 0x10:
292 idx = MMU_PAGE_64K; 291 idx = MMU_PAGE_64K;
293 break; 292 break;
294 case 0x14: 293 case 0x14:
295 idx = MMU_PAGE_1M; 294 idx = MMU_PAGE_1M;
296 break; 295 break;
297 case 0x18: 296 case 0x18:
298 idx = MMU_PAGE_16M; 297 idx = MMU_PAGE_16M;
299 cur_cpu_spec->cpu_features |= CPU_FTR_16M_PAGE; 298 cur_cpu_spec->cpu_features |= CPU_FTR_16M_PAGE;
300 break; 299 break;
301 case 0x22: 300 case 0x22:
302 idx = MMU_PAGE_16G; 301 idx = MMU_PAGE_16G;
303 break; 302 break;
304 } 303 }
305 if (idx < 0) 304 if (idx < 0)
306 continue; 305 continue;
307 def = &mmu_psize_defs[idx]; 306 def = &mmu_psize_defs[idx];
308 def->shift = shift; 307 def->shift = shift;
309 if (shift <= 23) 308 if (shift <= 23)
310 def->avpnm = 0; 309 def->avpnm = 0;
311 else 310 else
312 def->avpnm = (1 << (shift - 23)) - 1; 311 def->avpnm = (1 << (shift - 23)) - 1;
313 def->sllp = slbenc; 312 def->sllp = slbenc;
314 def->penc = lpenc; 313 def->penc = lpenc;
315 /* We don't know for sure what's up with tlbiel, so 314 /* We don't know for sure what's up with tlbiel, so
316 * for now we only set it for 4K and 64K pages 315 * for now we only set it for 4K and 64K pages
317 */ 316 */
318 if (idx == MMU_PAGE_4K || idx == MMU_PAGE_64K) 317 if (idx == MMU_PAGE_4K || idx == MMU_PAGE_64K)
319 def->tlbiel = 1; 318 def->tlbiel = 1;
320 else 319 else
321 def->tlbiel = 0; 320 def->tlbiel = 0;
322 321
323 DBG(" %d: shift=%02x, sllp=%04x, avpnm=%08x, " 322 DBG(" %d: shift=%02x, sllp=%04x, avpnm=%08x, "
324 "tlbiel=%d, penc=%d\n", 323 "tlbiel=%d, penc=%d\n",
325 idx, shift, def->sllp, def->avpnm, def->tlbiel, 324 idx, shift, def->sllp, def->avpnm, def->tlbiel,
326 def->penc); 325 def->penc);
327 } 326 }
328 return 1; 327 return 1;
329 } 328 }
330 return 0; 329 return 0;
331 } 330 }
332 331
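htab_dt_scan_page_sizes() above decodes "ibm,segment-page-sizes" as a sequence of { base shift, SLB encoding, count } headers, each followed by count pairs of { actual page shift, hash PTE encoding }; only the pair whose shift matches the base shift is kept as def->penc. A stand-alone sketch of that walk over a made-up property array (the cell values below are invented for illustration, not real firmware data):

#include <stdint.h>
#include <stdio.h>

/* Walk a fake "ibm,segment-page-sizes"-style array: for each base size,
 * { shift, slbenc, lpnum } followed by lpnum pairs of { shift, lpenc }. */
int main(void)
{
	/* Invented example: 4K base with one encoding, 64K base with two. */
	uint32_t prop[] = {
		0x0c, 0x000, 1,  0x0c, 0x0,
		0x10, 0x110, 2,  0x10, 0x1,  0x0c, 0x5,
	};
	long size = sizeof(prop) / sizeof(prop[0]);
	uint32_t *p = prop;

	while (size > 0) {
		uint32_t shift = p[0], slbenc = p[1], lpnum = p[2];

		size -= 3; p += 3;
		printf("base shift=0x%x slbenc=0x%x:\n", shift, slbenc);
		while (size > 0 && lpnum) {
			printf("  sub-shift=0x%x penc=0x%x\n", p[0], p[1]);
			p += 2; size -= 2;
			lpnum--;
		}
	}
	return 0;
}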
333 /* Scan for 16G memory blocks that have been set aside for huge pages 332 /* Scan for 16G memory blocks that have been set aside for huge pages
334 * and reserve those blocks for 16G huge pages. 333 * and reserve those blocks for 16G huge pages.
335 */ 334 */
336 static int __init htab_dt_scan_hugepage_blocks(unsigned long node, 335 static int __init htab_dt_scan_hugepage_blocks(unsigned long node,
337 const char *uname, int depth, 336 const char *uname, int depth,
338 void *data) { 337 void *data) {
339 char *type = of_get_flat_dt_prop(node, "device_type", NULL); 338 char *type = of_get_flat_dt_prop(node, "device_type", NULL);
340 unsigned long *addr_prop; 339 unsigned long *addr_prop;
341 u32 *page_count_prop; 340 u32 *page_count_prop;
342 unsigned int expected_pages; 341 unsigned int expected_pages;
343 long unsigned int phys_addr; 342 long unsigned int phys_addr;
344 long unsigned int block_size; 343 long unsigned int block_size;
345 344
346 /* We are scanning "memory" nodes only */ 345 /* We are scanning "memory" nodes only */
347 if (type == NULL || strcmp(type, "memory") != 0) 346 if (type == NULL || strcmp(type, "memory") != 0)
348 return 0; 347 return 0;
349 348
350 /* This property is the log base 2 of the number of virtual pages that 349 /* This property is the log base 2 of the number of virtual pages that
351 * will represent this memory block. */ 350 * will represent this memory block. */
352 page_count_prop = of_get_flat_dt_prop(node, "ibm,expected#pages", NULL); 351 page_count_prop = of_get_flat_dt_prop(node, "ibm,expected#pages", NULL);
353 if (page_count_prop == NULL) 352 if (page_count_prop == NULL)
354 return 0; 353 return 0;
355 expected_pages = (1 << page_count_prop[0]); 354 expected_pages = (1 << page_count_prop[0]);
356 addr_prop = of_get_flat_dt_prop(node, "reg", NULL); 355 addr_prop = of_get_flat_dt_prop(node, "reg", NULL);
357 if (addr_prop == NULL) 356 if (addr_prop == NULL)
358 return 0; 357 return 0;
359 phys_addr = addr_prop[0]; 358 phys_addr = addr_prop[0];
360 block_size = addr_prop[1]; 359 block_size = addr_prop[1];
361 if (block_size != (16 * GB)) 360 if (block_size != (16 * GB))
362 return 0; 361 return 0;
363 printk(KERN_INFO "Huge page(16GB) memory: " 362 printk(KERN_INFO "Huge page(16GB) memory: "
364 "addr = 0x%lX size = 0x%lX pages = %d\n", 363 "addr = 0x%lX size = 0x%lX pages = %d\n",
365 phys_addr, block_size, expected_pages); 364 phys_addr, block_size, expected_pages);
366 lmb_reserve(phys_addr, block_size * expected_pages); 365 lmb_reserve(phys_addr, block_size * expected_pages);
367 add_gpage(phys_addr, block_size, expected_pages); 366 add_gpage(phys_addr, block_size, expected_pages);
368 return 0; 367 return 0;
369 } 368 }
370 369
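The reservation above is plain arithmetic: "ibm,expected#pages" gives log2 of the page count, so the scan reserves block_size * (1 << prop) bytes starting at the block's base address, and only for blocks that are exactly 16GB. A quick sketch with invented numbers:

#include <stdio.h>

int main(void)
{
	unsigned long long gb = 1ULL << 30;
	unsigned long long block_size = 16 * gb;	/* must be exactly 16G */
	unsigned int log2_pages = 3;			/* hypothetical "ibm,expected#pages" value */
	unsigned int expected_pages = 1u << log2_pages;
	unsigned long long reserved = block_size * expected_pages;

	printf("reserving %u x 16G pages = 0x%llx bytes\n",
	       expected_pages, reserved);
	return 0;
}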
371 static void __init htab_init_page_sizes(void) 370 static void __init htab_init_page_sizes(void)
372 { 371 {
373 int rc; 372 int rc;
374 373
375 /* Default to 4K pages only */ 374 /* Default to 4K pages only */
376 memcpy(mmu_psize_defs, mmu_psize_defaults_old, 375 memcpy(mmu_psize_defs, mmu_psize_defaults_old,
377 sizeof(mmu_psize_defaults_old)); 376 sizeof(mmu_psize_defaults_old));
378 377
379 /* 378 /*
380 * Try to find the available page sizes in the device-tree 379 * Try to find the available page sizes in the device-tree
381 */ 380 */
382 rc = of_scan_flat_dt(htab_dt_scan_page_sizes, NULL); 381 rc = of_scan_flat_dt(htab_dt_scan_page_sizes, NULL);
383 if (rc != 0) /* Found */ 382 if (rc != 0) /* Found */
384 goto found; 383 goto found;
385 384
386 /* 385 /*
387 * Not in the device-tree, let's fall back on the known size 386 * Not in the device-tree, let's fall back on the known size
388 * list for 16M capable GP & GR 387 * list for 16M capable GP & GR
389 */ 388 */
390 if (cpu_has_feature(CPU_FTR_16M_PAGE)) 389 if (cpu_has_feature(CPU_FTR_16M_PAGE))
391 memcpy(mmu_psize_defs, mmu_psize_defaults_gp, 390 memcpy(mmu_psize_defs, mmu_psize_defaults_gp,
392 sizeof(mmu_psize_defaults_gp)); 391 sizeof(mmu_psize_defaults_gp));
393 found: 392 found:
394 #ifndef CONFIG_DEBUG_PAGEALLOC 393 #ifndef CONFIG_DEBUG_PAGEALLOC
395 /* 394 /*
396 * Pick a size for the linear mapping. Currently, we only support 395 * Pick a size for the linear mapping. Currently, we only support
397 * 16M, 1M and 4K which is the default 396 * 16M, 1M and 4K which is the default
398 */ 397 */
399 if (mmu_psize_defs[MMU_PAGE_16M].shift) 398 if (mmu_psize_defs[MMU_PAGE_16M].shift)
400 mmu_linear_psize = MMU_PAGE_16M; 399 mmu_linear_psize = MMU_PAGE_16M;
401 else if (mmu_psize_defs[MMU_PAGE_1M].shift) 400 else if (mmu_psize_defs[MMU_PAGE_1M].shift)
402 mmu_linear_psize = MMU_PAGE_1M; 401 mmu_linear_psize = MMU_PAGE_1M;
403 #endif /* CONFIG_DEBUG_PAGEALLOC */ 402 #endif /* CONFIG_DEBUG_PAGEALLOC */
404 403
405 #ifdef CONFIG_PPC_64K_PAGES 404 #ifdef CONFIG_PPC_64K_PAGES
406 /* 405 /*
407 * Pick a size for the ordinary pages. Default is 4K, we support 406 * Pick a size for the ordinary pages. Default is 4K, we support
408 * 64K for user mappings and vmalloc if supported by the processor. 407 * 64K for user mappings and vmalloc if supported by the processor.
409 * We only use 64k for ioremap if the processor 408 * We only use 64k for ioremap if the processor
410 * (and firmware) support cache-inhibited large pages. 409 * (and firmware) support cache-inhibited large pages.
411 * If not, we use 4k and set mmu_ci_restrictions so that 410 * If not, we use 4k and set mmu_ci_restrictions so that
412 * hash_page knows to switch processes that use cache-inhibited 411 * hash_page knows to switch processes that use cache-inhibited
413 * mappings to 4k pages. 412 * mappings to 4k pages.
414 */ 413 */
415 if (mmu_psize_defs[MMU_PAGE_64K].shift) { 414 if (mmu_psize_defs[MMU_PAGE_64K].shift) {
416 mmu_virtual_psize = MMU_PAGE_64K; 415 mmu_virtual_psize = MMU_PAGE_64K;
417 mmu_vmalloc_psize = MMU_PAGE_64K; 416 mmu_vmalloc_psize = MMU_PAGE_64K;
418 if (mmu_linear_psize == MMU_PAGE_4K) 417 if (mmu_linear_psize == MMU_PAGE_4K)
419 mmu_linear_psize = MMU_PAGE_64K; 418 mmu_linear_psize = MMU_PAGE_64K;
420 if (cpu_has_feature(CPU_FTR_CI_LARGE_PAGE)) { 419 if (cpu_has_feature(CPU_FTR_CI_LARGE_PAGE)) {
421 /* 420 /*
422 * Don't use 64k pages for ioremap on pSeries, since 421 * Don't use 64k pages for ioremap on pSeries, since
423 * that would stop us accessing the HEA ethernet. 422 * that would stop us accessing the HEA ethernet.
424 */ 423 */
425 if (!machine_is(pseries)) 424 if (!machine_is(pseries))
426 mmu_io_psize = MMU_PAGE_64K; 425 mmu_io_psize = MMU_PAGE_64K;
427 } else 426 } else
428 mmu_ci_restrictions = 1; 427 mmu_ci_restrictions = 1;
429 } 428 }
430 #endif /* CONFIG_PPC_64K_PAGES */ 429 #endif /* CONFIG_PPC_64K_PAGES */
431 430
432 #ifdef CONFIG_SPARSEMEM_VMEMMAP 431 #ifdef CONFIG_SPARSEMEM_VMEMMAP
433 /* We try to use 16M pages for vmemmap if that is supported 432 /* We try to use 16M pages for vmemmap if that is supported
434 * and we have at least 1G of RAM at boot 433 * and we have at least 1G of RAM at boot
435 */ 434 */
436 if (mmu_psize_defs[MMU_PAGE_16M].shift && 435 if (mmu_psize_defs[MMU_PAGE_16M].shift &&
437 lmb_phys_mem_size() >= 0x40000000) 436 lmb_phys_mem_size() >= 0x40000000)
438 mmu_vmemmap_psize = MMU_PAGE_16M; 437 mmu_vmemmap_psize = MMU_PAGE_16M;
439 else if (mmu_psize_defs[MMU_PAGE_64K].shift) 438 else if (mmu_psize_defs[MMU_PAGE_64K].shift)
440 mmu_vmemmap_psize = MMU_PAGE_64K; 439 mmu_vmemmap_psize = MMU_PAGE_64K;
441 else 440 else
442 mmu_vmemmap_psize = MMU_PAGE_4K; 441 mmu_vmemmap_psize = MMU_PAGE_4K;
443 #endif /* CONFIG_SPARSEMEM_VMEMMAP */ 442 #endif /* CONFIG_SPARSEMEM_VMEMMAP */
444 443
445 printk(KERN_DEBUG "Page orders: linear mapping = %d, " 444 printk(KERN_DEBUG "Page orders: linear mapping = %d, "
446 "virtual = %d, io = %d" 445 "virtual = %d, io = %d"
447 #ifdef CONFIG_SPARSEMEM_VMEMMAP 446 #ifdef CONFIG_SPARSEMEM_VMEMMAP
448 ", vmemmap = %d" 447 ", vmemmap = %d"
449 #endif 448 #endif
450 "\n", 449 "\n",
451 mmu_psize_defs[mmu_linear_psize].shift, 450 mmu_psize_defs[mmu_linear_psize].shift,
452 mmu_psize_defs[mmu_virtual_psize].shift, 451 mmu_psize_defs[mmu_virtual_psize].shift,
453 mmu_psize_defs[mmu_io_psize].shift 452 mmu_psize_defs[mmu_io_psize].shift
454 #ifdef CONFIG_SPARSEMEM_VMEMMAP 453 #ifdef CONFIG_SPARSEMEM_VMEMMAP
455 ,mmu_psize_defs[mmu_vmemmap_psize].shift 454 ,mmu_psize_defs[mmu_vmemmap_psize].shift
456 #endif 455 #endif
457 ); 456 );
458 457
459 #ifdef CONFIG_HUGETLB_PAGE 458 #ifdef CONFIG_HUGETLB_PAGE
460 /* Reserve 16G huge page memory sections for huge pages */ 459 /* Reserve 16G huge page memory sections for huge pages */
461 of_scan_flat_dt(htab_dt_scan_hugepage_blocks, NULL); 460 of_scan_flat_dt(htab_dt_scan_hugepage_blocks, NULL);
462 461
463 /* Init large page size. Currently, we pick 16M or 1M depending 462 /* Set default large page size. Currently, we pick 16M or 1M depending
464 * on what is available 463 * on what is available
465 */ 464 */
466 if (mmu_psize_defs[MMU_PAGE_16M].shift) 465 if (mmu_psize_defs[MMU_PAGE_16M].shift)
467 set_huge_psize(MMU_PAGE_16M); 466 HPAGE_SHIFT = mmu_psize_defs[MMU_PAGE_16M].shift;
468 /* With 4k/4level pagetables, we can't (for now) cope with a 467 /* With 4k/4level pagetables, we can't (for now) cope with a
469 * huge page size < PMD_SIZE */ 468 * huge page size < PMD_SIZE */
470 else if (mmu_psize_defs[MMU_PAGE_1M].shift) 469 else if (mmu_psize_defs[MMU_PAGE_1M].shift)
471 set_huge_psize(MMU_PAGE_1M); 470 HPAGE_SHIFT = mmu_psize_defs[MMU_PAGE_1M].shift;
472 #endif /* CONFIG_HUGETLB_PAGE */ 471 #endif /* CONFIG_HUGETLB_PAGE */
473 } 472 }
474 473
475 static int __init htab_dt_scan_pftsize(unsigned long node, 474 static int __init htab_dt_scan_pftsize(unsigned long node,
476 const char *uname, int depth, 475 const char *uname, int depth,
477 void *data) 476 void *data)
478 { 477 {
479 char *type = of_get_flat_dt_prop(node, "device_type", NULL); 478 char *type = of_get_flat_dt_prop(node, "device_type", NULL);
480 u32 *prop; 479 u32 *prop;
481 480
482 /* We are scanning "cpu" nodes only */ 481 /* We are scanning "cpu" nodes only */
483 if (type == NULL || strcmp(type, "cpu") != 0) 482 if (type == NULL || strcmp(type, "cpu") != 0)
484 return 0; 483 return 0;
485 484
486 prop = (u32 *)of_get_flat_dt_prop(node, "ibm,pft-size", NULL); 485 prop = (u32 *)of_get_flat_dt_prop(node, "ibm,pft-size", NULL);
487 if (prop != NULL) { 486 if (prop != NULL) {
488 /* pft_size[0] is the NUMA CEC cookie */ 487 /* pft_size[0] is the NUMA CEC cookie */
489 ppc64_pft_size = prop[1]; 488 ppc64_pft_size = prop[1];
490 return 1; 489 return 1;
491 } 490 }
492 return 0; 491 return 0;
493 } 492 }
494 493
495 static unsigned long __init htab_get_table_size(void) 494 static unsigned long __init htab_get_table_size(void)
496 { 495 {
497 unsigned long mem_size, rnd_mem_size, pteg_count; 496 unsigned long mem_size, rnd_mem_size, pteg_count;
498 497
499 /* If hash size isn't already provided by the platform, we try to 498 /* If hash size isn't already provided by the platform, we try to
500 * retrieve it from the device-tree. If it's not there either, we 499 * retrieve it from the device-tree. If it's not there either, we
501 * calculate it now based on the total RAM size 500 * calculate it now based on the total RAM size
502 */ 501 */
503 if (ppc64_pft_size == 0) 502 if (ppc64_pft_size == 0)
504 of_scan_flat_dt(htab_dt_scan_pftsize, NULL); 503 of_scan_flat_dt(htab_dt_scan_pftsize, NULL);
505 if (ppc64_pft_size) 504 if (ppc64_pft_size)
506 return 1UL << ppc64_pft_size; 505 return 1UL << ppc64_pft_size;
507 506
508 /* round mem_size up to next power of 2 */ 507 /* round mem_size up to next power of 2 */
509 mem_size = lmb_phys_mem_size(); 508 mem_size = lmb_phys_mem_size();
510 rnd_mem_size = 1UL << __ilog2(mem_size); 509 rnd_mem_size = 1UL << __ilog2(mem_size);
511 if (rnd_mem_size < mem_size) 510 if (rnd_mem_size < mem_size)
512 rnd_mem_size <<= 1; 511 rnd_mem_size <<= 1;
513 512
514 /* # pages / 2 */ 513 /* # pages / 2 */
515 pteg_count = max(rnd_mem_size >> (12 + 1), 1UL << 11); 514 pteg_count = max(rnd_mem_size >> (12 + 1), 1UL << 11);
516 515
517 return pteg_count << 7; 516 return pteg_count << 7;
518 } 517 }
519 518
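With no ibm,pft-size property, the fallback above sizes the hash table at one 128-byte PTEG per two 4K pages of RAM, after rounding the RAM size up to a power of two and enforcing a floor of 2^11 PTEGs (a 256KB table). A stand-alone sketch of that arithmetic (htab_size_for() is an illustrative helper, not a kernel function):

#include <stdio.h>

/* Mirror the fallback path above for a given RAM size in bytes. */
static unsigned long long htab_size_for(unsigned long long mem_size)
{
	unsigned long long rnd = 1;
	unsigned long long pteg_count;

	while (rnd < mem_size)		/* round up to a power of two */
		rnd <<= 1;

	pteg_count = rnd >> (12 + 1);	/* one PTEG per two 4K pages */
	if (pteg_count < (1ULL << 11))
		pteg_count = 1ULL << 11;

	return pteg_count << 7;		/* 128 bytes per PTEG */
}

int main(void)
{
	/* e.g. 1 GB of RAM -> 2^17 PTEGs -> a 16 MB hash table */
	printf("1GB  -> 0x%llx\n", htab_size_for(1ULL << 30));
	printf("13GB -> 0x%llx\n", htab_size_for(13ULL << 30));
	return 0;
}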
520 #ifdef CONFIG_MEMORY_HOTPLUG 519 #ifdef CONFIG_MEMORY_HOTPLUG
521 void create_section_mapping(unsigned long start, unsigned long end) 520 void create_section_mapping(unsigned long start, unsigned long end)
522 { 521 {
523 BUG_ON(htab_bolt_mapping(start, end, __pa(start), 522 BUG_ON(htab_bolt_mapping(start, end, __pa(start),
524 _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_COHERENT | PP_RWXX, 523 _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_COHERENT | PP_RWXX,
525 mmu_linear_psize, mmu_kernel_ssize)); 524 mmu_linear_psize, mmu_kernel_ssize));
526 } 525 }
527 526
528 int remove_section_mapping(unsigned long start, unsigned long end) 527 int remove_section_mapping(unsigned long start, unsigned long end)
529 { 528 {
530 return htab_remove_mapping(start, end, mmu_linear_psize, 529 return htab_remove_mapping(start, end, mmu_linear_psize,
531 mmu_kernel_ssize); 530 mmu_kernel_ssize);
532 } 531 }
533 #endif /* CONFIG_MEMORY_HOTPLUG */ 532 #endif /* CONFIG_MEMORY_HOTPLUG */
534 533
535 static inline void make_bl(unsigned int *insn_addr, void *func) 534 static inline void make_bl(unsigned int *insn_addr, void *func)
536 { 535 {
537 unsigned long funcp = *((unsigned long *)func); 536 unsigned long funcp = *((unsigned long *)func);
538 int offset = funcp - (unsigned long)insn_addr; 537 int offset = funcp - (unsigned long)insn_addr;
539 538
540 *insn_addr = (unsigned int)(0x48000001 | (offset & 0x03fffffc)); 539 *insn_addr = (unsigned int)(0x48000001 | (offset & 0x03fffffc));
541 flush_icache_range((unsigned long)insn_addr, 4+ 540 flush_icache_range((unsigned long)insn_addr, 4+
542 (unsigned long)insn_addr); 541 (unsigned long)insn_addr);
543 } 542 }
544 543
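make_bl() above patches the call sites in hash_low_64.S with an unconditional branch-and-link: 0x48000001 is opcode 18 with the LK bit set, and the word-aligned displacement goes into the LI field, hence the 0x03fffffc mask (low two bits zero, roughly +/-32MB of reach). A minimal sketch of just the encoding step (encode_bl() is an illustrative helper and does no range checking):

#include <stdint.h>
#include <stdio.h>

/* Encode a PowerPC "bl" from insn address to target (both byte addresses). */
static uint32_t encode_bl(uint64_t insn_addr, uint64_t target)
{
	int64_t offset = (int64_t)(target - insn_addr);

	return 0x48000001u | ((uint32_t)offset & 0x03fffffcu);
}

int main(void)
{
	/* A forward branch of 0x100 bytes: expect 0x48000101. */
	printf("0x%08x\n", encode_bl(0x1000, 0x1100));
	return 0;
}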
545 static void __init htab_finish_init(void) 544 static void __init htab_finish_init(void)
546 { 545 {
547 extern unsigned int *htab_call_hpte_insert1; 546 extern unsigned int *htab_call_hpte_insert1;
548 extern unsigned int *htab_call_hpte_insert2; 547 extern unsigned int *htab_call_hpte_insert2;
549 extern unsigned int *htab_call_hpte_remove; 548 extern unsigned int *htab_call_hpte_remove;
550 extern unsigned int *htab_call_hpte_updatepp; 549 extern unsigned int *htab_call_hpte_updatepp;
551 550
552 #ifdef CONFIG_PPC_HAS_HASH_64K 551 #ifdef CONFIG_PPC_HAS_HASH_64K
553 extern unsigned int *ht64_call_hpte_insert1; 552 extern unsigned int *ht64_call_hpte_insert1;
554 extern unsigned int *ht64_call_hpte_insert2; 553 extern unsigned int *ht64_call_hpte_insert2;
555 extern unsigned int *ht64_call_hpte_remove; 554 extern unsigned int *ht64_call_hpte_remove;
556 extern unsigned int *ht64_call_hpte_updatepp; 555 extern unsigned int *ht64_call_hpte_updatepp;
557 556
558 make_bl(ht64_call_hpte_insert1, ppc_md.hpte_insert); 557 make_bl(ht64_call_hpte_insert1, ppc_md.hpte_insert);
559 make_bl(ht64_call_hpte_insert2, ppc_md.hpte_insert); 558 make_bl(ht64_call_hpte_insert2, ppc_md.hpte_insert);
560 make_bl(ht64_call_hpte_remove, ppc_md.hpte_remove); 559 make_bl(ht64_call_hpte_remove, ppc_md.hpte_remove);
561 make_bl(ht64_call_hpte_updatepp, ppc_md.hpte_updatepp); 560 make_bl(ht64_call_hpte_updatepp, ppc_md.hpte_updatepp);
562 #endif /* CONFIG_PPC_HAS_HASH_64K */ 561 #endif /* CONFIG_PPC_HAS_HASH_64K */
563 562
564 make_bl(htab_call_hpte_insert1, ppc_md.hpte_insert); 563 make_bl(htab_call_hpte_insert1, ppc_md.hpte_insert);
565 make_bl(htab_call_hpte_insert2, ppc_md.hpte_insert); 564 make_bl(htab_call_hpte_insert2, ppc_md.hpte_insert);
566 make_bl(htab_call_hpte_remove, ppc_md.hpte_remove); 565 make_bl(htab_call_hpte_remove, ppc_md.hpte_remove);
567 make_bl(htab_call_hpte_updatepp, ppc_md.hpte_updatepp); 566 make_bl(htab_call_hpte_updatepp, ppc_md.hpte_updatepp);
568 } 567 }
569 568
570 void __init htab_initialize(void) 569 void __init htab_initialize(void)
571 { 570 {
572 unsigned long table; 571 unsigned long table;
573 unsigned long pteg_count; 572 unsigned long pteg_count;
574 unsigned long mode_rw; 573 unsigned long mode_rw;
575 unsigned long base = 0, size = 0, limit; 574 unsigned long base = 0, size = 0, limit;
576 int i; 575 int i;
577 576
578 DBG(" -> htab_initialize()\n"); 577 DBG(" -> htab_initialize()\n");
579 578
580 /* Initialize segment sizes */ 579 /* Initialize segment sizes */
581 htab_init_seg_sizes(); 580 htab_init_seg_sizes();
582 581
583 /* Initialize page sizes */ 582 /* Initialize page sizes */
584 htab_init_page_sizes(); 583 htab_init_page_sizes();
585 584
586 if (cpu_has_feature(CPU_FTR_1T_SEGMENT)) { 585 if (cpu_has_feature(CPU_FTR_1T_SEGMENT)) {
587 mmu_kernel_ssize = MMU_SEGSIZE_1T; 586 mmu_kernel_ssize = MMU_SEGSIZE_1T;
588 mmu_highuser_ssize = MMU_SEGSIZE_1T; 587 mmu_highuser_ssize = MMU_SEGSIZE_1T;
589 printk(KERN_INFO "Using 1TB segments\n"); 588 printk(KERN_INFO "Using 1TB segments\n");
590 } 589 }
591 590
592 /* 591 /*
593 * Calculate the required size of the htab. We want the number of 592 * Calculate the required size of the htab. We want the number of
594 * PTEGs to equal one half the number of real pages. 593 * PTEGs to equal one half the number of real pages.
595 */ 594 */
596 htab_size_bytes = htab_get_table_size(); 595 htab_size_bytes = htab_get_table_size();
597 pteg_count = htab_size_bytes >> 7; 596 pteg_count = htab_size_bytes >> 7;
598 597
599 htab_hash_mask = pteg_count - 1; 598 htab_hash_mask = pteg_count - 1;
600 599
601 if (firmware_has_feature(FW_FEATURE_LPAR)) { 600 if (firmware_has_feature(FW_FEATURE_LPAR)) {
602 /* Using a hypervisor which owns the htab */ 601 /* Using a hypervisor which owns the htab */
603 htab_address = NULL; 602 htab_address = NULL;
604 _SDR1 = 0; 603 _SDR1 = 0;
605 } else { 604 } else {
606 /* Find storage for the HPT. Must be contiguous in 605 /* Find storage for the HPT. Must be contiguous in
607 * the absolute address space. On cell we want it to be 606 * the absolute address space. On cell we want it to be
608 * in the first 2 Gig so we can use it for IOMMU hacks. 607 * in the first 2 Gig so we can use it for IOMMU hacks.
609 */ 608 */
610 if (machine_is(cell)) 609 if (machine_is(cell))
611 limit = 0x80000000; 610 limit = 0x80000000;
612 else 611 else
613 limit = 0; 612 limit = 0;
614 613
615 table = lmb_alloc_base(htab_size_bytes, htab_size_bytes, limit); 614 table = lmb_alloc_base(htab_size_bytes, htab_size_bytes, limit);
616 615
617 DBG("Hash table allocated at %lx, size: %lx\n", table, 616 DBG("Hash table allocated at %lx, size: %lx\n", table,
618 htab_size_bytes); 617 htab_size_bytes);
619 618
620 htab_address = abs_to_virt(table); 619 htab_address = abs_to_virt(table);
621 620
622 /* htab absolute addr + encoded htabsize */ 621 /* htab absolute addr + encoded htabsize */
623 _SDR1 = table + __ilog2(pteg_count) - 11; 622 _SDR1 = table + __ilog2(pteg_count) - 11;
624 623
625 /* Initialize the HPT with no entries */ 624 /* Initialize the HPT with no entries */
626 memset((void *)table, 0, htab_size_bytes); 625 memset((void *)table, 0, htab_size_bytes);
627 626
628 /* Set SDR1 */ 627 /* Set SDR1 */
629 mtspr(SPRN_SDR1, _SDR1); 628 mtspr(SPRN_SDR1, _SDR1);
630 } 629 }
631 630
632 mode_rw = _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_COHERENT | PP_RWXX; 631 mode_rw = _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_COHERENT | PP_RWXX;
633 632
634 #ifdef CONFIG_DEBUG_PAGEALLOC 633 #ifdef CONFIG_DEBUG_PAGEALLOC
635 linear_map_hash_count = lmb_end_of_DRAM() >> PAGE_SHIFT; 634 linear_map_hash_count = lmb_end_of_DRAM() >> PAGE_SHIFT;
636 linear_map_hash_slots = __va(lmb_alloc_base(linear_map_hash_count, 635 linear_map_hash_slots = __va(lmb_alloc_base(linear_map_hash_count,
637 1, lmb.rmo_size)); 636 1, lmb.rmo_size));
638 memset(linear_map_hash_slots, 0, linear_map_hash_count); 637 memset(linear_map_hash_slots, 0, linear_map_hash_count);
639 #endif /* CONFIG_DEBUG_PAGEALLOC */ 638 #endif /* CONFIG_DEBUG_PAGEALLOC */
640 639
641 /* On U3 based machines, we need to reserve the DART area and 640 /* On U3 based machines, we need to reserve the DART area and
642 * _NOT_ map it to avoid cache paradoxes as it's remapped non 641 * _NOT_ map it to avoid cache paradoxes as it's remapped non
643 * cacheable later on 642 * cacheable later on
644 */ 643 */
645 644
646 /* create the bolted linear mapping in the hash table */ 645 /* create the bolted linear mapping in the hash table */
647 for (i=0; i < lmb.memory.cnt; i++) { 646 for (i=0; i < lmb.memory.cnt; i++) {
648 base = (unsigned long)__va(lmb.memory.region[i].base); 647 base = (unsigned long)__va(lmb.memory.region[i].base);
649 size = lmb.memory.region[i].size; 648 size = lmb.memory.region[i].size;
650 649
651 DBG("creating mapping for region: %lx : %lx\n", base, size); 650 DBG("creating mapping for region: %lx : %lx\n", base, size);
652 651
653 #ifdef CONFIG_U3_DART 652 #ifdef CONFIG_U3_DART
654 /* Do not map the DART space. Fortunately, it will be aligned 653 /* Do not map the DART space. Fortunately, it will be aligned
655 * in such a way that it will not cross two lmb regions and 654 * in such a way that it will not cross two lmb regions and
656 * will fit within a single 16Mb page. 655 * will fit within a single 16Mb page.
657 * The DART space is assumed to be a full 16Mb region even if 656 * The DART space is assumed to be a full 16Mb region even if
658 * we only use 2Mb of that space. We will use more of it later 657 * we only use 2Mb of that space. We will use more of it later
659 * for AGP GART. We have to use a full 16Mb large page. 658 * for AGP GART. We have to use a full 16Mb large page.
660 */ 659 */
661 DBG("DART base: %lx\n", dart_tablebase); 660 DBG("DART base: %lx\n", dart_tablebase);
662 661
663 if (dart_tablebase != 0 && dart_tablebase >= base 662 if (dart_tablebase != 0 && dart_tablebase >= base
664 && dart_tablebase < (base + size)) { 663 && dart_tablebase < (base + size)) {
665 unsigned long dart_table_end = dart_tablebase + 16 * MB; 664 unsigned long dart_table_end = dart_tablebase + 16 * MB;
666 if (base != dart_tablebase) 665 if (base != dart_tablebase)
667 BUG_ON(htab_bolt_mapping(base, dart_tablebase, 666 BUG_ON(htab_bolt_mapping(base, dart_tablebase,
668 __pa(base), mode_rw, 667 __pa(base), mode_rw,
669 mmu_linear_psize, 668 mmu_linear_psize,
670 mmu_kernel_ssize)); 669 mmu_kernel_ssize));
671 if ((base + size) > dart_table_end) 670 if ((base + size) > dart_table_end)
672 BUG_ON(htab_bolt_mapping(dart_tablebase+16*MB, 671 BUG_ON(htab_bolt_mapping(dart_tablebase+16*MB,
673 base + size, 672 base + size,
674 __pa(dart_table_end), 673 __pa(dart_table_end),
675 mode_rw, 674 mode_rw,
676 mmu_linear_psize, 675 mmu_linear_psize,
677 mmu_kernel_ssize)); 676 mmu_kernel_ssize));
678 continue; 677 continue;
679 } 678 }
680 #endif /* CONFIG_U3_DART */ 679 #endif /* CONFIG_U3_DART */
681 BUG_ON(htab_bolt_mapping(base, base + size, __pa(base), 680 BUG_ON(htab_bolt_mapping(base, base + size, __pa(base),
682 mode_rw, mmu_linear_psize, mmu_kernel_ssize)); 681 mode_rw, mmu_linear_psize, mmu_kernel_ssize));
683 } 682 }
684 683
685 /* 684 /*
686 * If we have a memory_limit and we've allocated TCEs then we need to 685 * If we have a memory_limit and we've allocated TCEs then we need to
687 * explicitly map the TCE area at the top of RAM. We also cope with the 686 * explicitly map the TCE area at the top of RAM. We also cope with the
688 * case that the TCEs start below memory_limit. 687 * case that the TCEs start below memory_limit.
689 * tce_alloc_start/end are 16MB aligned so the mapping should work 688 * tce_alloc_start/end are 16MB aligned so the mapping should work
690 * for either 4K or 16MB pages. 689 * for either 4K or 16MB pages.
691 */ 690 */
692 if (tce_alloc_start) { 691 if (tce_alloc_start) {
693 tce_alloc_start = (unsigned long)__va(tce_alloc_start); 692 tce_alloc_start = (unsigned long)__va(tce_alloc_start);
694 tce_alloc_end = (unsigned long)__va(tce_alloc_end); 693 tce_alloc_end = (unsigned long)__va(tce_alloc_end);
695 694
696 if (base + size >= tce_alloc_start) 695 if (base + size >= tce_alloc_start)
697 tce_alloc_start = base + size + 1; 696 tce_alloc_start = base + size + 1;
698 697
699 BUG_ON(htab_bolt_mapping(tce_alloc_start, tce_alloc_end, 698 BUG_ON(htab_bolt_mapping(tce_alloc_start, tce_alloc_end,
700 __pa(tce_alloc_start), mode_rw, 699 __pa(tce_alloc_start), mode_rw,
701 mmu_linear_psize, mmu_kernel_ssize)); 700 mmu_linear_psize, mmu_kernel_ssize));
702 } 701 }
703 702
704 htab_finish_init(); 703 htab_finish_init();
705 704
706 DBG(" <- htab_initialize()\n"); 705 DBG(" <- htab_initialize()\n");
707 } 706 }
708 #undef KB 707 #undef KB
709 #undef MB 708 #undef MB
710 709
711 void htab_initialize_secondary(void) 710 void htab_initialize_secondary(void)
712 { 711 {
713 if (!firmware_has_feature(FW_FEATURE_LPAR)) 712 if (!firmware_has_feature(FW_FEATURE_LPAR))
714 mtspr(SPRN_SDR1, _SDR1); 713 mtspr(SPRN_SDR1, _SDR1);
715 } 714 }
716 715
717 /* 716 /*
718 * Called by asm hashtable.S for doing lazy icache flush 717 * Called by asm hashtable.S for doing lazy icache flush
719 */ 718 */
720 unsigned int hash_page_do_lazy_icache(unsigned int pp, pte_t pte, int trap) 719 unsigned int hash_page_do_lazy_icache(unsigned int pp, pte_t pte, int trap)
721 { 720 {
722 struct page *page; 721 struct page *page;
723 722
724 if (!pfn_valid(pte_pfn(pte))) 723 if (!pfn_valid(pte_pfn(pte)))
725 return pp; 724 return pp;
726 725
727 page = pte_page(pte); 726 page = pte_page(pte);
728 727
729 /* page is dirty */ 728 /* page is dirty */
730 if (!test_bit(PG_arch_1, &page->flags) && !PageReserved(page)) { 729 if (!test_bit(PG_arch_1, &page->flags) && !PageReserved(page)) {
731 if (trap == 0x400) { 730 if (trap == 0x400) {
732 __flush_dcache_icache(page_address(page)); 731 __flush_dcache_icache(page_address(page));
733 set_bit(PG_arch_1, &page->flags); 732 set_bit(PG_arch_1, &page->flags);
734 } else 733 } else
735 pp |= HPTE_R_N; 734 pp |= HPTE_R_N;
736 } 735 }
737 return pp; 736 return pp;
738 } 737 }
739 738
740 #ifdef CONFIG_PPC_MM_SLICES 739 #ifdef CONFIG_PPC_MM_SLICES
741 unsigned int get_paca_psize(unsigned long addr) 740 unsigned int get_paca_psize(unsigned long addr)
742 { 741 {
743 unsigned long index, slices; 742 unsigned long index, slices;
744 743
745 if (addr < SLICE_LOW_TOP) { 744 if (addr < SLICE_LOW_TOP) {
746 slices = get_paca()->context.low_slices_psize; 745 slices = get_paca()->context.low_slices_psize;
747 index = GET_LOW_SLICE_INDEX(addr); 746 index = GET_LOW_SLICE_INDEX(addr);
748 } else { 747 } else {
749 slices = get_paca()->context.high_slices_psize; 748 slices = get_paca()->context.high_slices_psize;
750 index = GET_HIGH_SLICE_INDEX(addr); 749 index = GET_HIGH_SLICE_INDEX(addr);
751 } 750 }
752 return (slices >> (index * 4)) & 0xF; 751 return (slices >> (index * 4)) & 0xF;
753 } 752 }
754 753
755 #else 754 #else
756 unsigned int get_paca_psize(unsigned long addr) 755 unsigned int get_paca_psize(unsigned long addr)
757 { 756 {
758 return get_paca()->context.user_psize; 757 return get_paca()->context.user_psize;
759 } 758 }
760 #endif 759 #endif
761 760
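In the CONFIG_PPC_MM_SLICES variant above, low_slices_psize/high_slices_psize pack one 4-bit MMU page-size index per slice into a single word, so a lookup is a shift by 4*index and a mask with 0xF. A small sketch of that packing (slice_get()/slice_set() are illustrative helpers, not the kernel's slice API):

#include <assert.h>
#include <stdio.h>

/* 4 bits per slice, so 16 slices fit in a 64-bit unsigned long. */
static unsigned int slice_get(unsigned long slices, unsigned int index)
{
	return (slices >> (index * 4)) & 0xF;
}

static unsigned long slice_set(unsigned long slices, unsigned int index,
			       unsigned int psize)
{
	slices &= ~(0xFUL << (index * 4));
	return slices | ((unsigned long)psize << (index * 4));
}

int main(void)
{
	unsigned long slices = 0;

	slices = slice_set(slices, 0, 2);	/* e.g. slice 0 -> psize index 2 */
	slices = slice_set(slices, 3, 4);	/* e.g. slice 3 -> psize index 4 */
	assert(slice_get(slices, 0) == 2);
	assert(slice_get(slices, 3) == 4);
	printf("packed: 0x%lx\n", slices);
	return 0;
}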
762 /* 761 /*
763 * Demote a segment to using 4k pages. 762 * Demote a segment to using 4k pages.
764 * For now this makes the whole process use 4k pages. 763 * For now this makes the whole process use 4k pages.
765 */ 764 */
766 #ifdef CONFIG_PPC_64K_PAGES 765 #ifdef CONFIG_PPC_64K_PAGES
767 void demote_segment_4k(struct mm_struct *mm, unsigned long addr) 766 void demote_segment_4k(struct mm_struct *mm, unsigned long addr)
768 { 767 {
769 if (get_slice_psize(mm, addr) == MMU_PAGE_4K) 768 if (get_slice_psize(mm, addr) == MMU_PAGE_4K)
770 return; 769 return;
771 slice_set_range_psize(mm, addr, 1, MMU_PAGE_4K); 770 slice_set_range_psize(mm, addr, 1, MMU_PAGE_4K);
772 #ifdef CONFIG_SPU_BASE 771 #ifdef CONFIG_SPU_BASE
773 spu_flush_all_slbs(mm); 772 spu_flush_all_slbs(mm);
774 #endif 773 #endif
775 if (get_paca_psize(addr) != MMU_PAGE_4K) { 774 if (get_paca_psize(addr) != MMU_PAGE_4K) {
776 get_paca()->context = mm->context; 775 get_paca()->context = mm->context;
777 slb_flush_and_rebolt(); 776 slb_flush_and_rebolt();
778 } 777 }
779 } 778 }
780 #endif /* CONFIG_PPC_64K_PAGES */ 779 #endif /* CONFIG_PPC_64K_PAGES */
781 780
782 #ifdef CONFIG_PPC_SUBPAGE_PROT 781 #ifdef CONFIG_PPC_SUBPAGE_PROT
783 /* 782 /*
784 * This looks up a 2-bit protection code for a 4k subpage of a 64k page. 783 * This looks up a 2-bit protection code for a 4k subpage of a 64k page.
785 * Userspace sets the subpage permissions using the subpage_prot system call. 784 * Userspace sets the subpage permissions using the subpage_prot system call.
786 * 785 *
787 * Result is 0: full permissions, _PAGE_RW: read-only, 786 * Result is 0: full permissions, _PAGE_RW: read-only,
788 * _PAGE_USER or _PAGE_USER|_PAGE_RW: no access. 787 * _PAGE_USER or _PAGE_USER|_PAGE_RW: no access.
789 */ 788 */
790 static int subpage_protection(pgd_t *pgdir, unsigned long ea) 789 static int subpage_protection(pgd_t *pgdir, unsigned long ea)
791 { 790 {
792 struct subpage_prot_table *spt = pgd_subpage_prot(pgdir); 791 struct subpage_prot_table *spt = pgd_subpage_prot(pgdir);
793 u32 spp = 0; 792 u32 spp = 0;
794 u32 **sbpm, *sbpp; 793 u32 **sbpm, *sbpp;
795 794
796 if (ea >= spt->maxaddr) 795 if (ea >= spt->maxaddr)
797 return 0; 796 return 0;
798 if (ea < 0x100000000) { 797 if (ea < 0x100000000) {
799 /* addresses below 4GB use spt->low_prot */ 798 /* addresses below 4GB use spt->low_prot */
800 sbpm = spt->low_prot; 799 sbpm = spt->low_prot;
801 } else { 800 } else {
802 sbpm = spt->protptrs[ea >> SBP_L3_SHIFT]; 801 sbpm = spt->protptrs[ea >> SBP_L3_SHIFT];
803 if (!sbpm) 802 if (!sbpm)
804 return 0; 803 return 0;
805 } 804 }
806 sbpp = sbpm[(ea >> SBP_L2_SHIFT) & (SBP_L2_COUNT - 1)]; 805 sbpp = sbpm[(ea >> SBP_L2_SHIFT) & (SBP_L2_COUNT - 1)];
807 if (!sbpp) 806 if (!sbpp)
808 return 0; 807 return 0;
809 spp = sbpp[(ea >> PAGE_SHIFT) & (SBP_L1_COUNT - 1)]; 808 spp = sbpp[(ea >> PAGE_SHIFT) & (SBP_L1_COUNT - 1)];
810 809
811 /* extract 2-bit bitfield for this 4k subpage */ 810 /* extract 2-bit bitfield for this 4k subpage */
812 spp >>= 30 - 2 * ((ea >> 12) & 0xf); 811 spp >>= 30 - 2 * ((ea >> 12) & 0xf);
813 812
814 /* turn 0,1,2,3 into combination of _PAGE_USER and _PAGE_RW */ 813 /* turn 0,1,2,3 into combination of _PAGE_USER and _PAGE_RW */
815 spp = ((spp & 2) ? _PAGE_USER : 0) | ((spp & 1) ? _PAGE_RW : 0); 814 spp = ((spp & 2) ? _PAGE_USER : 0) | ((spp & 1) ? _PAGE_RW : 0);
816 return spp; 815 return spp;
817 } 816 }
818 817
819 #else /* CONFIG_PPC_SUBPAGE_PROT */ 818 #else /* CONFIG_PPC_SUBPAGE_PROT */
820 static inline int subpage_protection(pgd_t *pgdir, unsigned long ea) 819 static inline int subpage_protection(pgd_t *pgdir, unsigned long ea)
821 { 820 {
822 return 0; 821 return 0;
823 } 822 }
824 #endif 823 #endif
825 824
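The lookup above ends with pure bit manipulation: the 32-bit word fetched from the subpage_prot tables holds sixteen 2-bit codes, subpage 0 in the top two bits, and each code maps to the combinations listed in the comment (0 = full access, 1 = read-only, 2 or 3 = no access). A stand-alone sketch of that extraction (FAKE_PAGE_* are placeholder flag values, not the kernel's _PAGE_* bits):

#include <stdint.h>
#include <stdio.h>

#define FAKE_PAGE_RW   0x1	/* placeholder flag values for illustration */
#define FAKE_PAGE_USER 0x2

/* spp_word holds 16 two-bit codes, subpage 0 in bits 31:30. */
static unsigned int subpage_code(uint32_t spp_word, unsigned long ea)
{
	unsigned int sub = (ea >> 12) & 0xf;	/* which 4K subpage of the 64K page */
	unsigned int spp = spp_word >> (30 - 2 * sub);

	return ((spp & 2) ? FAKE_PAGE_USER : 0) | ((spp & 1) ? FAKE_PAGE_RW : 0);
}

int main(void)
{
	/* Code 3 (no access) for subpage 0, code 1 (read-only) for subpage 1. */
	uint32_t word = (3u << 30) | (1u << 28);

	printf("subpage 0 -> 0x%x\n", subpage_code(word, 0x0000));
	printf("subpage 1 -> 0x%x\n", subpage_code(word, 0x1000));
	printf("subpage 2 -> 0x%x\n", subpage_code(word, 0x2000));
	return 0;
}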
826 /* Result code is: 825 /* Result code is:
827 * 0 - handled 826 * 0 - handled
828 * 1 - normal page fault 827 * 1 - normal page fault
829 * -1 - critical hash insertion error 828 * -1 - critical hash insertion error
830 * -2 - access not permitted by subpage protection mechanism 829 * -2 - access not permitted by subpage protection mechanism
831 */ 830 */
832 int hash_page(unsigned long ea, unsigned long access, unsigned long trap) 831 int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
833 { 832 {
834 void *pgdir; 833 void *pgdir;
835 unsigned long vsid; 834 unsigned long vsid;
836 struct mm_struct *mm; 835 struct mm_struct *mm;
837 pte_t *ptep; 836 pte_t *ptep;
838 cpumask_t tmp; 837 cpumask_t tmp;
839 int rc, user_region = 0, local = 0; 838 int rc, user_region = 0, local = 0;
840 int psize, ssize; 839 int psize, ssize;
841 840
842 DBG_LOW("hash_page(ea=%016lx, access=%lx, trap=%lx\n", 841 DBG_LOW("hash_page(ea=%016lx, access=%lx, trap=%lx\n",
843 ea, access, trap); 842 ea, access, trap);
844 843
845 if ((ea & ~REGION_MASK) >= PGTABLE_RANGE) { 844 if ((ea & ~REGION_MASK) >= PGTABLE_RANGE) {
846 DBG_LOW(" out of pgtable range !\n"); 845 DBG_LOW(" out of pgtable range !\n");
847 return 1; 846 return 1;
848 } 847 }
849 848
850 /* Get region & vsid */ 849 /* Get region & vsid */
851 switch (REGION_ID(ea)) { 850 switch (REGION_ID(ea)) {
852 case USER_REGION_ID: 851 case USER_REGION_ID:
853 user_region = 1; 852 user_region = 1;
854 mm = current->mm; 853 mm = current->mm;
855 if (! mm) { 854 if (! mm) {
856 DBG_LOW(" user region with no mm !\n"); 855 DBG_LOW(" user region with no mm !\n");
857 return 1; 856 return 1;
858 } 857 }
859 psize = get_slice_psize(mm, ea); 858 psize = get_slice_psize(mm, ea);
860 ssize = user_segment_size(ea); 859 ssize = user_segment_size(ea);
861 vsid = get_vsid(mm->context.id, ea, ssize); 860 vsid = get_vsid(mm->context.id, ea, ssize);
862 break; 861 break;
863 case VMALLOC_REGION_ID: 862 case VMALLOC_REGION_ID:
864 mm = &init_mm; 863 mm = &init_mm;
865 vsid = get_kernel_vsid(ea, mmu_kernel_ssize); 864 vsid = get_kernel_vsid(ea, mmu_kernel_ssize);
866 if (ea < VMALLOC_END) 865 if (ea < VMALLOC_END)
867 psize = mmu_vmalloc_psize; 866 psize = mmu_vmalloc_psize;
868 else 867 else
869 psize = mmu_io_psize; 868 psize = mmu_io_psize;
870 ssize = mmu_kernel_ssize; 869 ssize = mmu_kernel_ssize;
871 break; 870 break;
872 default: 871 default:
873 /* Not a valid range 872 /* Not a valid range
874 * Send the problem up to do_page_fault 873 * Send the problem up to do_page_fault
875 */ 874 */
876 return 1; 875 return 1;
877 } 876 }
878 DBG_LOW(" mm=%p, mm->pgdir=%p, vsid=%016lx\n", mm, mm->pgd, vsid); 877 DBG_LOW(" mm=%p, mm->pgdir=%p, vsid=%016lx\n", mm, mm->pgd, vsid);
879 878
880 /* Get pgdir */ 879 /* Get pgdir */
881 pgdir = mm->pgd; 880 pgdir = mm->pgd;
882 if (pgdir == NULL) 881 if (pgdir == NULL)
883 return 1; 882 return 1;
884 883
885 /* Check CPU locality */ 884 /* Check CPU locality */
886 tmp = cpumask_of_cpu(smp_processor_id()); 885 tmp = cpumask_of_cpu(smp_processor_id());
887 if (user_region && cpus_equal(mm->cpu_vm_mask, tmp)) 886 if (user_region && cpus_equal(mm->cpu_vm_mask, tmp))
888 local = 1; 887 local = 1;
889 888
890 #ifdef CONFIG_HUGETLB_PAGE 889 #ifdef CONFIG_HUGETLB_PAGE
891 /* Handle hugepage regions */ 890 /* Handle hugepage regions */
892 if (HPAGE_SHIFT && psize == mmu_huge_psize) { 891 if (HPAGE_SHIFT && mmu_huge_psizes[psize]) {
893 DBG_LOW(" -> huge page !\n"); 892 DBG_LOW(" -> huge page !\n");
894 return hash_huge_page(mm, access, ea, vsid, local, trap); 893 return hash_huge_page(mm, access, ea, vsid, local, trap);
895 } 894 }
896 #endif /* CONFIG_HUGETLB_PAGE */ 895 #endif /* CONFIG_HUGETLB_PAGE */
897 896
898 #ifndef CONFIG_PPC_64K_PAGES 897 #ifndef CONFIG_PPC_64K_PAGES
899 /* If we use 4K pages and our psize is not 4K, then we are hitting 898 /* If we use 4K pages and our psize is not 4K, then we are hitting
900 * a special driver mapping, we need to align the address before 899 * a special driver mapping, we need to align the address before
901 * we fetch the PTE 900 * we fetch the PTE
902 */ 901 */
903 if (psize != MMU_PAGE_4K) 902 if (psize != MMU_PAGE_4K)
904 ea &= ~((1ul << mmu_psize_defs[psize].shift) - 1); 903 ea &= ~((1ul << mmu_psize_defs[psize].shift) - 1);
905 #endif /* CONFIG_PPC_64K_PAGES */ 904 #endif /* CONFIG_PPC_64K_PAGES */
906 905
907 /* Get PTE and page size from page tables */ 906 /* Get PTE and page size from page tables */
908 ptep = find_linux_pte(pgdir, ea); 907 ptep = find_linux_pte(pgdir, ea);
909 if (ptep == NULL || !pte_present(*ptep)) { 908 if (ptep == NULL || !pte_present(*ptep)) {
910 DBG_LOW(" no PTE !\n"); 909 DBG_LOW(" no PTE !\n");
911 return 1; 910 return 1;
912 } 911 }
913 912
914 #ifndef CONFIG_PPC_64K_PAGES 913 #ifndef CONFIG_PPC_64K_PAGES
915 DBG_LOW(" i-pte: %016lx\n", pte_val(*ptep)); 914 DBG_LOW(" i-pte: %016lx\n", pte_val(*ptep));
916 #else 915 #else
917 DBG_LOW(" i-pte: %016lx %016lx\n", pte_val(*ptep), 916 DBG_LOW(" i-pte: %016lx %016lx\n", pte_val(*ptep),
918 pte_val(*(ptep + PTRS_PER_PTE))); 917 pte_val(*(ptep + PTRS_PER_PTE)));
919 #endif 918 #endif
920 /* Pre-check access permissions (will be re-checked atomically 919 /* Pre-check access permissions (will be re-checked atomically
921 * in __hash_page_XX but this pre-check is a fast path 920 * in __hash_page_XX but this pre-check is a fast path
922 */ 921 */
923 if (access & ~pte_val(*ptep)) { 922 if (access & ~pte_val(*ptep)) {
924 DBG_LOW(" no access !\n"); 923 DBG_LOW(" no access !\n");
925 return 1; 924 return 1;
926 } 925 }
927 926
928 /* Do actual hashing */ 927 /* Do actual hashing */
929 #ifdef CONFIG_PPC_64K_PAGES 928 #ifdef CONFIG_PPC_64K_PAGES
930 /* If _PAGE_4K_PFN is set, make sure this is a 4k segment */ 929 /* If _PAGE_4K_PFN is set, make sure this is a 4k segment */
931 if ((pte_val(*ptep) & _PAGE_4K_PFN) && psize == MMU_PAGE_64K) { 930 if ((pte_val(*ptep) & _PAGE_4K_PFN) && psize == MMU_PAGE_64K) {
932 demote_segment_4k(mm, ea); 931 demote_segment_4k(mm, ea);
933 psize = MMU_PAGE_4K; 932 psize = MMU_PAGE_4K;
934 } 933 }
935 934
936 /* If this PTE is non-cacheable and we have restrictions on 935 /* If this PTE is non-cacheable and we have restrictions on
937 * using non cacheable large pages, then we switch to 4k 936 * using non cacheable large pages, then we switch to 4k
938 */ 937 */
939 if (mmu_ci_restrictions && psize == MMU_PAGE_64K && 938 if (mmu_ci_restrictions && psize == MMU_PAGE_64K &&
940 (pte_val(*ptep) & _PAGE_NO_CACHE)) { 939 (pte_val(*ptep) & _PAGE_NO_CACHE)) {
941 if (user_region) { 940 if (user_region) {
942 demote_segment_4k(mm, ea); 941 demote_segment_4k(mm, ea);
943 psize = MMU_PAGE_4K; 942 psize = MMU_PAGE_4K;
944 } else if (ea < VMALLOC_END) { 943 } else if (ea < VMALLOC_END) {
945 /* 944 /*
946 * some driver did a non-cacheable mapping 945 * some driver did a non-cacheable mapping
947 * in vmalloc space, so switch vmalloc 946 * in vmalloc space, so switch vmalloc
948 * to 4k pages 947 * to 4k pages
949 */ 948 */
950 printk(KERN_ALERT "Reducing vmalloc segment " 949 printk(KERN_ALERT "Reducing vmalloc segment "
951 "to 4kB pages because of " 950 "to 4kB pages because of "
952 "non-cacheable mapping\n"); 951 "non-cacheable mapping\n");
953 psize = mmu_vmalloc_psize = MMU_PAGE_4K; 952 psize = mmu_vmalloc_psize = MMU_PAGE_4K;
954 #ifdef CONFIG_SPU_BASE 953 #ifdef CONFIG_SPU_BASE
955 spu_flush_all_slbs(mm); 954 spu_flush_all_slbs(mm);
956 #endif 955 #endif
957 } 956 }
958 } 957 }
959 if (user_region) { 958 if (user_region) {
960 if (psize != get_paca_psize(ea)) { 959 if (psize != get_paca_psize(ea)) {
961 get_paca()->context = mm->context; 960 get_paca()->context = mm->context;
962 slb_flush_and_rebolt(); 961 slb_flush_and_rebolt();
963 } 962 }
964 } else if (get_paca()->vmalloc_sllp != 963 } else if (get_paca()->vmalloc_sllp !=
965 mmu_psize_defs[mmu_vmalloc_psize].sllp) { 964 mmu_psize_defs[mmu_vmalloc_psize].sllp) {
966 get_paca()->vmalloc_sllp = 965 get_paca()->vmalloc_sllp =
967 mmu_psize_defs[mmu_vmalloc_psize].sllp; 966 mmu_psize_defs[mmu_vmalloc_psize].sllp;
968 slb_vmalloc_update(); 967 slb_vmalloc_update();
969 } 968 }
970 #endif /* CONFIG_PPC_64K_PAGES */ 969 #endif /* CONFIG_PPC_64K_PAGES */
971 970
972 #ifdef CONFIG_PPC_HAS_HASH_64K 971 #ifdef CONFIG_PPC_HAS_HASH_64K
973 if (psize == MMU_PAGE_64K) 972 if (psize == MMU_PAGE_64K)
974 rc = __hash_page_64K(ea, access, vsid, ptep, trap, local, ssize); 973 rc = __hash_page_64K(ea, access, vsid, ptep, trap, local, ssize);
975 else 974 else
976 #endif /* CONFIG_PPC_HAS_HASH_64K */ 975 #endif /* CONFIG_PPC_HAS_HASH_64K */
977 { 976 {
978 int spp = subpage_protection(pgdir, ea); 977 int spp = subpage_protection(pgdir, ea);
979 if (access & spp) 978 if (access & spp)
980 rc = -2; 979 rc = -2;
981 else 980 else
982 rc = __hash_page_4K(ea, access, vsid, ptep, trap, 981 rc = __hash_page_4K(ea, access, vsid, ptep, trap,
983 local, ssize, spp); 982 local, ssize, spp);
984 } 983 }
985 984
986 #ifndef CONFIG_PPC_64K_PAGES 985 #ifndef CONFIG_PPC_64K_PAGES
987 DBG_LOW(" o-pte: %016lx\n", pte_val(*ptep)); 986 DBG_LOW(" o-pte: %016lx\n", pte_val(*ptep));
988 #else 987 #else
989 DBG_LOW(" o-pte: %016lx %016lx\n", pte_val(*ptep), 988 DBG_LOW(" o-pte: %016lx %016lx\n", pte_val(*ptep),
990 pte_val(*(ptep + PTRS_PER_PTE))); 989 pte_val(*(ptep + PTRS_PER_PTE)));
991 #endif 990 #endif
992 DBG_LOW(" -> rc=%d\n", rc); 991 DBG_LOW(" -> rc=%d\n", rc);
993 return rc; 992 return rc;
994 } 993 }
995 EXPORT_SYMBOL_GPL(hash_page); 994 EXPORT_SYMBOL_GPL(hash_page);
996 995
997 void hash_preload(struct mm_struct *mm, unsigned long ea, 996 void hash_preload(struct mm_struct *mm, unsigned long ea,
998 unsigned long access, unsigned long trap) 997 unsigned long access, unsigned long trap)
999 { 998 {
1000 unsigned long vsid; 999 unsigned long vsid;
1001 void *pgdir; 1000 void *pgdir;
1002 pte_t *ptep; 1001 pte_t *ptep;
1003 cpumask_t mask; 1002 cpumask_t mask;
1004 unsigned long flags; 1003 unsigned long flags;
1005 int local = 0; 1004 int local = 0;
1006 int ssize; 1005 int ssize;
1007 1006
1008 BUG_ON(REGION_ID(ea) != USER_REGION_ID); 1007 BUG_ON(REGION_ID(ea) != USER_REGION_ID);
1009 1008
1010 #ifdef CONFIG_PPC_MM_SLICES 1009 #ifdef CONFIG_PPC_MM_SLICES
1011 /* We only prefault standard pages for now */ 1010 /* We only prefault standard pages for now */
1012 if (unlikely(get_slice_psize(mm, ea) != mm->context.user_psize)) 1011 if (unlikely(get_slice_psize(mm, ea) != mm->context.user_psize))
1013 return; 1012 return;
1014 #endif 1013 #endif
1015 1014
1016 DBG_LOW("hash_preload(mm=%p, mm->pgdir=%p, ea=%016lx, access=%lx," 1015 DBG_LOW("hash_preload(mm=%p, mm->pgdir=%p, ea=%016lx, access=%lx,"
1017 " trap=%lx\n", mm, mm->pgd, ea, access, trap); 1016 " trap=%lx\n", mm, mm->pgd, ea, access, trap);
1018 1017
1019 /* Get Linux PTE if available */ 1018 /* Get Linux PTE if available */
1020 pgdir = mm->pgd; 1019 pgdir = mm->pgd;
1021 if (pgdir == NULL) 1020 if (pgdir == NULL)
1022 return; 1021 return;
1023 ptep = find_linux_pte(pgdir, ea); 1022 ptep = find_linux_pte(pgdir, ea);
1024 if (!ptep) 1023 if (!ptep)
1025 return; 1024 return;
1026 1025
1027 #ifdef CONFIG_PPC_64K_PAGES 1026 #ifdef CONFIG_PPC_64K_PAGES
1028 /* If either _PAGE_4K_PFN or _PAGE_NO_CACHE is set (and we are on 1027 /* If either _PAGE_4K_PFN or _PAGE_NO_CACHE is set (and we are on
1029 * a 64K kernel), then we don't preload, hash_page() will take 1028 * a 64K kernel), then we don't preload, hash_page() will take
1030 * care of it once we actually try to access the page. 1029 * care of it once we actually try to access the page.
1031 * That way we don't have to duplicate all of the logic for segment 1030 * That way we don't have to duplicate all of the logic for segment
1032 * page size demotion here 1031 * page size demotion here
1033 */ 1032 */
1034 if (pte_val(*ptep) & (_PAGE_4K_PFN | _PAGE_NO_CACHE)) 1033 if (pte_val(*ptep) & (_PAGE_4K_PFN | _PAGE_NO_CACHE))
1035 return; 1034 return;
1036 #endif /* CONFIG_PPC_64K_PAGES */ 1035 #endif /* CONFIG_PPC_64K_PAGES */
1037 1036
1038 /* Get VSID */ 1037 /* Get VSID */
1039 ssize = user_segment_size(ea); 1038 ssize = user_segment_size(ea);
1040 vsid = get_vsid(mm->context.id, ea, ssize); 1039 vsid = get_vsid(mm->context.id, ea, ssize);
1041 1040
1042 /* Hash doesn't like irqs */ 1041 /* Hash doesn't like irqs */
1043 local_irq_save(flags); 1042 local_irq_save(flags);
1044 1043
1045 /* Is that local to this CPU ? */ 1044 /* Is that local to this CPU ? */
1046 mask = cpumask_of_cpu(smp_processor_id()); 1045 mask = cpumask_of_cpu(smp_processor_id());
1047 if (cpus_equal(mm->cpu_vm_mask, mask)) 1046 if (cpus_equal(mm->cpu_vm_mask, mask))
1048 local = 1; 1047 local = 1;
1049 1048
1050 /* Hash it in */ 1049 /* Hash it in */
1051 #ifdef CONFIG_PPC_HAS_HASH_64K 1050 #ifdef CONFIG_PPC_HAS_HASH_64K
1052 if (mm->context.user_psize == MMU_PAGE_64K) 1051 if (mm->context.user_psize == MMU_PAGE_64K)
1053 __hash_page_64K(ea, access, vsid, ptep, trap, local, ssize); 1052 __hash_page_64K(ea, access, vsid, ptep, trap, local, ssize);
1054 else 1053 else
1055 #endif /* CONFIG_PPC_HAS_HASH_64K */ 1054 #endif /* CONFIG_PPC_HAS_HASH_64K */
1056 __hash_page_4K(ea, access, vsid, ptep, trap, local, ssize, 1055 __hash_page_4K(ea, access, vsid, ptep, trap, local, ssize,
1057 subpage_protection(pgdir, ea)); 1056 subpage_protection(pgdir, ea));
1058 1057
1059 local_irq_restore(flags); 1058 local_irq_restore(flags);
1060 } 1059 }
1061 1060
1062 /* WARNING: This is called from hash_low_64.S, if you change this prototype, 1061 /* WARNING: This is called from hash_low_64.S, if you change this prototype,
1063 * do not forget to update the assembly call site ! 1062 * do not forget to update the assembly call site !
1064 */ 1063 */
1065 void flush_hash_page(unsigned long va, real_pte_t pte, int psize, int ssize, 1064 void flush_hash_page(unsigned long va, real_pte_t pte, int psize, int ssize,
1066 int local) 1065 int local)
1067 { 1066 {
1068 unsigned long hash, index, shift, hidx, slot; 1067 unsigned long hash, index, shift, hidx, slot;
1069 1068
1070 DBG_LOW("flush_hash_page(va=%016x)\n", va); 1069 DBG_LOW("flush_hash_page(va=%016x)\n", va);
1071 pte_iterate_hashed_subpages(pte, psize, va, index, shift) { 1070 pte_iterate_hashed_subpages(pte, psize, va, index, shift) {
1072 hash = hpt_hash(va, shift, ssize); 1071 hash = hpt_hash(va, shift, ssize);
1073 hidx = __rpte_to_hidx(pte, index); 1072 hidx = __rpte_to_hidx(pte, index);
1074 if (hidx & _PTEIDX_SECONDARY) 1073 if (hidx & _PTEIDX_SECONDARY)
1075 hash = ~hash; 1074 hash = ~hash;
1076 slot = (hash & htab_hash_mask) * HPTES_PER_GROUP; 1075 slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
1077 slot += hidx & _PTEIDX_GROUP_IX; 1076 slot += hidx & _PTEIDX_GROUP_IX;
1078 DBG_LOW(" sub %d: hash=%x, hidx=%x\n", index, slot, hidx); 1077 DBG_LOW(" sub %d: hash=%x, hidx=%x\n", index, slot, hidx);
1079 ppc_md.hpte_invalidate(slot, va, psize, ssize, local); 1078 ppc_md.hpte_invalidate(slot, va, psize, ssize, local);
1080 } pte_iterate_hashed_end(); 1079 } pte_iterate_hashed_end();
1081 } 1080 }
1082 1081
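Both flush_hash_page() above and kernel_unmap_linear_page() below turn a hash value plus the per-subpage hidx into an HPTE slot the same way: flip the hash for entries that landed in the secondary group, mask with htab_hash_mask to select a group of 8 HPTEs, then add the 3-bit index within that group. A sketch with made-up constants (the two flag values stand in for _PTEIDX_SECONDARY/_PTEIDX_GROUP_IX):

#include <stdio.h>

#define HPTES_PER_GROUP   8
#define PTEIDX_SECONDARY  0x8	/* illustrative stand-ins for the hidx flags */
#define PTEIDX_GROUP_IX   0x7

static unsigned long hpte_slot(unsigned long hash, unsigned long hidx,
			       unsigned long htab_hash_mask)
{
	if (hidx & PTEIDX_SECONDARY)
		hash = ~hash;
	return (hash & htab_hash_mask) * HPTES_PER_GROUP
		+ (hidx & PTEIDX_GROUP_IX);
}

int main(void)
{
	unsigned long mask = (1UL << 17) - 1;	/* e.g. 2^17 PTEGs */

	printf("primary   slot: %lu\n", hpte_slot(0x12345, 0x3, mask));
	printf("secondary slot: %lu\n", hpte_slot(0x12345, 0x8 | 0x3, mask));
	return 0;
}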
1083 void flush_hash_range(unsigned long number, int local) 1082 void flush_hash_range(unsigned long number, int local)
1084 { 1083 {
1085 if (ppc_md.flush_hash_range) 1084 if (ppc_md.flush_hash_range)
1086 ppc_md.flush_hash_range(number, local); 1085 ppc_md.flush_hash_range(number, local);
1087 else { 1086 else {
1088 int i; 1087 int i;
1089 struct ppc64_tlb_batch *batch = 1088 struct ppc64_tlb_batch *batch =
1090 &__get_cpu_var(ppc64_tlb_batch); 1089 &__get_cpu_var(ppc64_tlb_batch);
1091 1090
1092 for (i = 0; i < number; i++) 1091 for (i = 0; i < number; i++)
1093 flush_hash_page(batch->vaddr[i], batch->pte[i], 1092 flush_hash_page(batch->vaddr[i], batch->pte[i],
1094 batch->psize, batch->ssize, local); 1093 batch->psize, batch->ssize, local);
1095 } 1094 }
1096 } 1095 }
1097 1096
1098 /* 1097 /*
1099 * low_hash_fault is called when the low level hash code failed 1098 * low_hash_fault is called when the low level hash code failed
1100 * to insert a PTE due to a hypervisor error 1099 * to insert a PTE due to a hypervisor error
1101 */ 1100 */
1102 void low_hash_fault(struct pt_regs *regs, unsigned long address, int rc) 1101 void low_hash_fault(struct pt_regs *regs, unsigned long address, int rc)
1103 { 1102 {
1104 if (user_mode(regs)) { 1103 if (user_mode(regs)) {
1105 #ifdef CONFIG_PPC_SUBPAGE_PROT 1104 #ifdef CONFIG_PPC_SUBPAGE_PROT
1106 if (rc == -2) 1105 if (rc == -2)
1107 _exception(SIGSEGV, regs, SEGV_ACCERR, address); 1106 _exception(SIGSEGV, regs, SEGV_ACCERR, address);
1108 else 1107 else
1109 #endif 1108 #endif
1110 _exception(SIGBUS, regs, BUS_ADRERR, address); 1109 _exception(SIGBUS, regs, BUS_ADRERR, address);
1111 } else 1110 } else
1112 bad_page_fault(regs, address, SIGBUS); 1111 bad_page_fault(regs, address, SIGBUS);
1113 } 1112 }
1114 1113
1115 #ifdef CONFIG_DEBUG_PAGEALLOC 1114 #ifdef CONFIG_DEBUG_PAGEALLOC
1116 static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi) 1115 static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi)
1117 { 1116 {
1118 unsigned long hash, hpteg; 1117 unsigned long hash, hpteg;
1119 unsigned long vsid = get_kernel_vsid(vaddr, mmu_kernel_ssize); 1118 unsigned long vsid = get_kernel_vsid(vaddr, mmu_kernel_ssize);
1120 unsigned long va = hpt_va(vaddr, vsid, mmu_kernel_ssize); 1119 unsigned long va = hpt_va(vaddr, vsid, mmu_kernel_ssize);
1121 unsigned long mode = _PAGE_ACCESSED | _PAGE_DIRTY | 1120 unsigned long mode = _PAGE_ACCESSED | _PAGE_DIRTY |
1122 _PAGE_COHERENT | PP_RWXX | HPTE_R_N; 1121 _PAGE_COHERENT | PP_RWXX | HPTE_R_N;
1123 int ret; 1122 int ret;
1124 1123
1125 hash = hpt_hash(va, PAGE_SHIFT, mmu_kernel_ssize); 1124 hash = hpt_hash(va, PAGE_SHIFT, mmu_kernel_ssize);
1126 hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP); 1125 hpteg = ((hash & htab_hash_mask) * HPTES_PER_GROUP);
1127 1126
1128 ret = ppc_md.hpte_insert(hpteg, va, __pa(vaddr), 1127 ret = ppc_md.hpte_insert(hpteg, va, __pa(vaddr),
1129 mode, HPTE_V_BOLTED, 1128 mode, HPTE_V_BOLTED,
1130 mmu_linear_psize, mmu_kernel_ssize); 1129 mmu_linear_psize, mmu_kernel_ssize);
1131 BUG_ON (ret < 0); 1130 BUG_ON (ret < 0);
1132 spin_lock(&linear_map_hash_lock); 1131 spin_lock(&linear_map_hash_lock);
1133 BUG_ON(linear_map_hash_slots[lmi] & 0x80); 1132 BUG_ON(linear_map_hash_slots[lmi] & 0x80);
1134 linear_map_hash_slots[lmi] = ret | 0x80; 1133 linear_map_hash_slots[lmi] = ret | 0x80;
1135 spin_unlock(&linear_map_hash_lock); 1134 spin_unlock(&linear_map_hash_lock);
1136 } 1135 }
1137 1136
1138 static void kernel_unmap_linear_page(unsigned long vaddr, unsigned long lmi) 1137 static void kernel_unmap_linear_page(unsigned long vaddr, unsigned long lmi)
1139 { 1138 {
1140 unsigned long hash, hidx, slot; 1139 unsigned long hash, hidx, slot;
1141 unsigned long vsid = get_kernel_vsid(vaddr, mmu_kernel_ssize); 1140 unsigned long vsid = get_kernel_vsid(vaddr, mmu_kernel_ssize);
1142 unsigned long va = hpt_va(vaddr, vsid, mmu_kernel_ssize); 1141 unsigned long va = hpt_va(vaddr, vsid, mmu_kernel_ssize);
1143 1142
1144 hash = hpt_hash(va, PAGE_SHIFT, mmu_kernel_ssize); 1143 hash = hpt_hash(va, PAGE_SHIFT, mmu_kernel_ssize);
1145 spin_lock(&linear_map_hash_lock); 1144 spin_lock(&linear_map_hash_lock);
1146 BUG_ON(!(linear_map_hash_slots[lmi] & 0x80)); 1145 BUG_ON(!(linear_map_hash_slots[lmi] & 0x80));
1147 hidx = linear_map_hash_slots[lmi] & 0x7f; 1146 hidx = linear_map_hash_slots[lmi] & 0x7f;
1148 linear_map_hash_slots[lmi] = 0; 1147 linear_map_hash_slots[lmi] = 0;
1149 spin_unlock(&linear_map_hash_lock); 1148 spin_unlock(&linear_map_hash_lock);
1150 if (hidx & _PTEIDX_SECONDARY) 1149 if (hidx & _PTEIDX_SECONDARY)
1151 hash = ~hash; 1150 hash = ~hash;
1152 slot = (hash & htab_hash_mask) * HPTES_PER_GROUP; 1151 slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
1153 slot += hidx & _PTEIDX_GROUP_IX; 1152 slot += hidx & _PTEIDX_GROUP_IX;
1154 ppc_md.hpte_invalidate(slot, va, mmu_linear_psize, mmu_kernel_ssize, 0); 1153 ppc_md.hpte_invalidate(slot, va, mmu_linear_psize, mmu_kernel_ssize, 0);
1155 } 1154 }
1156 1155
1157 void kernel_map_pages(struct page *page, int numpages, int enable) 1156 void kernel_map_pages(struct page *page, int numpages, int enable)
1158 { 1157 {
1159 unsigned long flags, vaddr, lmi; 1158 unsigned long flags, vaddr, lmi;
1160 int i; 1159 int i;
1161 1160
1162 local_irq_save(flags); 1161 local_irq_save(flags);
1163 for (i = 0; i < numpages; i++, page++) { 1162 for (i = 0; i < numpages; i++, page++) {
1164 vaddr = (unsigned long)page_address(page); 1163 vaddr = (unsigned long)page_address(page);
1165 lmi = __pa(vaddr) >> PAGE_SHIFT; 1164 lmi = __pa(vaddr) >> PAGE_SHIFT;
1166 if (lmi >= linear_map_hash_count) 1165 if (lmi >= linear_map_hash_count)
1167 continue; 1166 continue;
1168 if (enable) 1167 if (enable)
1169 kernel_map_linear_page(vaddr, lmi); 1168 kernel_map_linear_page(vaddr, lmi);
1170 else 1169 else
1171 kernel_unmap_linear_page(vaddr, lmi); 1170 kernel_unmap_linear_page(vaddr, lmi);
1172 } 1171 }
1173 local_irq_restore(flags); 1172 local_irq_restore(flags);
1174 } 1173 }
1175 #endif /* CONFIG_DEBUG_PAGEALLOC */ 1174 #endif /* CONFIG_DEBUG_PAGEALLOC */
1176 1175
arch/powerpc/mm/hugetlbpage.c
1 /* 1 /*
2 * PPC64 (POWER4) Huge TLB Page Support for Kernel. 2 * PPC64 (POWER4) Huge TLB Page Support for Kernel.
3 * 3 *
4 * Copyright (C) 2003 David Gibson, IBM Corporation. 4 * Copyright (C) 2003 David Gibson, IBM Corporation.
5 * 5 *
6 * Based on the IA-32 version: 6 * Based on the IA-32 version:
7 * Copyright (C) 2002, Rohit Seth <rohit.seth@intel.com> 7 * Copyright (C) 2002, Rohit Seth <rohit.seth@intel.com>
8 */ 8 */
9 9
10 #include <linux/init.h> 10 #include <linux/init.h>
11 #include <linux/fs.h> 11 #include <linux/fs.h>
12 #include <linux/mm.h> 12 #include <linux/mm.h>
13 #include <linux/hugetlb.h> 13 #include <linux/hugetlb.h>
14 #include <linux/pagemap.h> 14 #include <linux/pagemap.h>
15 #include <linux/slab.h> 15 #include <linux/slab.h>
16 #include <linux/err.h> 16 #include <linux/err.h>
17 #include <linux/sysctl.h> 17 #include <linux/sysctl.h>
18 #include <asm/mman.h> 18 #include <asm/mman.h>
19 #include <asm/pgalloc.h> 19 #include <asm/pgalloc.h>
20 #include <asm/tlb.h> 20 #include <asm/tlb.h>
21 #include <asm/tlbflush.h> 21 #include <asm/tlbflush.h>
22 #include <asm/mmu_context.h> 22 #include <asm/mmu_context.h>
23 #include <asm/machdep.h> 23 #include <asm/machdep.h>
24 #include <asm/cputable.h> 24 #include <asm/cputable.h>
25 #include <asm/spu.h> 25 #include <asm/spu.h>
26 26
27 #define PAGE_SHIFT_64K 16 27 #define PAGE_SHIFT_64K 16
28 #define PAGE_SHIFT_16M 24 28 #define PAGE_SHIFT_16M 24
29 #define PAGE_SHIFT_16G 34 29 #define PAGE_SHIFT_16G 34
30 30
31 #define NUM_LOW_AREAS (0x100000000UL >> SID_SHIFT) 31 #define NUM_LOW_AREAS (0x100000000UL >> SID_SHIFT)
32 #define NUM_HIGH_AREAS (PGTABLE_RANGE >> HTLB_AREA_SHIFT) 32 #define NUM_HIGH_AREAS (PGTABLE_RANGE >> HTLB_AREA_SHIFT)
33 #define MAX_NUMBER_GPAGES 1024 33 #define MAX_NUMBER_GPAGES 1024
34 34
35 /* Tracks the 16G pages after the device tree is scanned and before the 35 /* Tracks the 16G pages after the device tree is scanned and before the
36 * huge_boot_pages list is ready. */ 36 * huge_boot_pages list is ready. */
37 static unsigned long gpage_freearray[MAX_NUMBER_GPAGES]; 37 static unsigned long gpage_freearray[MAX_NUMBER_GPAGES];
38 static unsigned nr_gpages; 38 static unsigned nr_gpages;
39 39
40 unsigned int hugepte_shift; 40 /* Array of valid huge page sizes - non-zero value(hugepte_shift) is
41 #define PTRS_PER_HUGEPTE (1 << hugepte_shift) 41 * stored for the huge page sizes that are valid.
42 #define HUGEPTE_TABLE_SIZE (sizeof(pte_t) << hugepte_shift) 42 */
43 unsigned int mmu_huge_psizes[MMU_PAGE_COUNT] = { }; /* initialize all to 0 */
43 44
44 #define HUGEPD_SHIFT (HPAGE_SHIFT + hugepte_shift) 45 #define hugepte_shift mmu_huge_psizes
45 #define HUGEPD_SIZE (1UL << HUGEPD_SHIFT) 46 #define PTRS_PER_HUGEPTE(psize) (1 << hugepte_shift[psize])
46 #define HUGEPD_MASK (~(HUGEPD_SIZE-1)) 47 #define HUGEPTE_TABLE_SIZE(psize) (sizeof(pte_t) << hugepte_shift[psize])
47 48
48 #define huge_pgtable_cache (pgtable_cache[HUGEPTE_CACHE_NUM]) 49 #define HUGEPD_SHIFT(psize) (mmu_psize_to_shift(psize) \
50 + hugepte_shift[psize])
51 #define HUGEPD_SIZE(psize) (1UL << HUGEPD_SHIFT(psize))
52 #define HUGEPD_MASK(psize) (~(HUGEPD_SIZE(psize)-1))
49 53
54 /* Subtract one from array size because we don't need a cache for 4K since
55 * it is not a huge page size */
56 #define huge_pgtable_cache(psize) (pgtable_cache[HUGEPTE_CACHE_NUM \
57 + psize-1])
58 #define HUGEPTE_CACHE_NAME(psize) (huge_pgtable_cache_name[psize])
59
60 static const char *huge_pgtable_cache_name[MMU_PAGE_COUNT] = {
61 "unused_4K", "hugepte_cache_64K", "unused_64K_AP",
62 "hugepte_cache_1M", "hugepte_cache_16M", "hugepte_cache_16G"
63 };
64
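A quick worked example of the per-psize macros above. The numbers are illustrative only: a 16M huge page (shift 24) and a hugepte table with 12 index bits. The real hugepte_shift[] values are filled in later by set_huge_psize() and depend on the base page size, and pte_t is replaced by unsigned long here.

#include <stdio.h>

int main(void)
{
        unsigned int page_shift    = 24;  /* PAGE_SHIFT_16M */
        unsigned int hugepte_shift = 12;  /* illustrative value only */

        unsigned long ptrs         = 1UL << hugepte_shift;                   /* PTRS_PER_HUGEPTE */
        unsigned long table_bytes  = sizeof(unsigned long) << hugepte_shift; /* ~HUGEPTE_TABLE_SIZE */
        unsigned int  hugepd_shift = page_shift + hugepte_shift;             /* HUGEPD_SHIFT */
        unsigned long hugepd_size  = 1UL << hugepd_shift;                    /* HUGEPD_SIZE */
        unsigned long hugepd_mask  = ~(hugepd_size - 1);                     /* HUGEPD_MASK */

        printf("%lu hugeptes, %lu-byte table, one hugepd maps %lu MB, mask %#lx\n",
               ptrs, table_bytes, hugepd_size >> 20, hugepd_mask);
        return 0;
}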
50 /* Flag to mark huge PD pointers. This means pmd_bad() and pud_bad() 65 /* Flag to mark huge PD pointers. This means pmd_bad() and pud_bad()
51 * will choke on pointers to hugepte tables, which is handy for 66 * will choke on pointers to hugepte tables, which is handy for
52 * catching screwups early. */ 67 * catching screwups early. */
53 #define HUGEPD_OK 0x1 68 #define HUGEPD_OK 0x1
54 69
55 typedef struct { unsigned long pd; } hugepd_t; 70 typedef struct { unsigned long pd; } hugepd_t;
56 71
57 #define hugepd_none(hpd) ((hpd).pd == 0) 72 #define hugepd_none(hpd) ((hpd).pd == 0)
58 73
74 static inline int shift_to_mmu_psize(unsigned int shift)
75 {
76 switch (shift) {
77 #ifndef CONFIG_PPC_64K_PAGES
78 case PAGE_SHIFT_64K:
79 return MMU_PAGE_64K;
80 #endif
81 case PAGE_SHIFT_16M:
82 return MMU_PAGE_16M;
83 case PAGE_SHIFT_16G:
84 return MMU_PAGE_16G;
85 }
86 return -1;
87 }
88
89 static inline unsigned int mmu_psize_to_shift(unsigned int mmu_psize)
90 {
91 if (mmu_psize_defs[mmu_psize].shift)
92 return mmu_psize_defs[mmu_psize].shift;
93 BUG();
94 }
95
59 static inline pte_t *hugepd_page(hugepd_t hpd) 96 static inline pte_t *hugepd_page(hugepd_t hpd)
60 { 97 {
61 BUG_ON(!(hpd.pd & HUGEPD_OK)); 98 BUG_ON(!(hpd.pd & HUGEPD_OK));
62 return (pte_t *)(hpd.pd & ~HUGEPD_OK); 99 return (pte_t *)(hpd.pd & ~HUGEPD_OK);
63 } 100 }
64 101
65 static inline pte_t *hugepte_offset(hugepd_t *hpdp, unsigned long addr) 102 static inline pte_t *hugepte_offset(hugepd_t *hpdp, unsigned long addr,
103 struct hstate *hstate)
66 { 104 {
67 unsigned long idx = ((addr >> HPAGE_SHIFT) & (PTRS_PER_HUGEPTE-1)); 105 unsigned int shift = huge_page_shift(hstate);
106 int psize = shift_to_mmu_psize(shift);
107 unsigned long idx = ((addr >> shift) & (PTRS_PER_HUGEPTE(psize)-1));
68 pte_t *dir = hugepd_page(*hpdp); 108 pte_t *dir = hugepd_page(*hpdp);
69 109
70 return dir + idx; 110 return dir + idx;
71 } 111 }
72 112
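The index computed by hugepte_offset() is just the huge-page number of the address modulo the table size. A minimal standalone sketch with assumed numbers (16M pages, 12-bit hugepte index):

#include <stdio.h>

int main(void)
{
        unsigned int  shift         = 24;        /* huge_page_shift(hstate) for 16M */
        unsigned long ptrs_per_hpte = 1UL << 12; /* PTRS_PER_HUGEPTE(psize), assumed */
        unsigned long addr          = 0x123456789000UL;

        unsigned long idx = (addr >> shift) & (ptrs_per_hpte - 1);
        printf("addr %#lx -> hugepte index %lu\n", addr, idx);
        return 0;
}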
73 static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp, 113 static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp,
74 unsigned long address) 114 unsigned long address, unsigned int psize)
75 { 115 {
76 pte_t *new = kmem_cache_alloc(huge_pgtable_cache, 116 pte_t *new = kmem_cache_alloc(huge_pgtable_cache(psize),
77 GFP_KERNEL|__GFP_REPEAT); 117 GFP_KERNEL|__GFP_REPEAT);
78 118
79 if (! new) 119 if (! new)
80 return -ENOMEM; 120 return -ENOMEM;
81 121
82 spin_lock(&mm->page_table_lock); 122 spin_lock(&mm->page_table_lock);
83 if (!hugepd_none(*hpdp)) 123 if (!hugepd_none(*hpdp))
84 kmem_cache_free(huge_pgtable_cache, new); 124 kmem_cache_free(huge_pgtable_cache(psize), new);
85 else 125 else
86 hpdp->pd = (unsigned long)new | HUGEPD_OK; 126 hpdp->pd = (unsigned long)new | HUGEPD_OK;
87 spin_unlock(&mm->page_table_lock); 127 spin_unlock(&mm->page_table_lock);
88 return 0; 128 return 0;
89 } 129 }
90 130
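__hugepte_alloc() allocates the new hugepte table before taking mm->page_table_lock, then either publishes it or frees it again if another thread populated the hugepd first. The following is a generic userspace sketch of that install-or-discard pattern (a pthread mutex and calloc/free standing in for the kernel lock and slab cache), not the kernel API itself.

#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static void *slot;                          /* plays the role of hpdp->pd */

static int install_table(size_t size)
{
        void *new_table = calloc(1, size);  /* like kmem_cache_alloc() + zero_ctor */

        if (!new_table)
                return -1;                  /* -ENOMEM in the kernel version */

        pthread_mutex_lock(&table_lock);
        if (slot)                           /* lost the race: discard our copy */
                free(new_table);
        else                                /* won the race: publish the table */
                slot = new_table;
        pthread_mutex_unlock(&table_lock);
        return 0;
}

int main(void)
{
        install_table(4096);
        install_table(4096);                /* second call frees its allocation */
        free(slot);
        return 0;
}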
91 /* Base page size affects how we walk hugetlb page tables */ 131 /* Base page size affects how we walk hugetlb page tables */
92 #ifdef CONFIG_PPC_64K_PAGES 132 #ifdef CONFIG_PPC_64K_PAGES
93 #define hpmd_offset(pud, addr) pmd_offset(pud, addr) 133 #define hpmd_offset(pud, addr, h) pmd_offset(pud, addr)
94 #define hpmd_alloc(mm, pud, addr) pmd_alloc(mm, pud, addr) 134 #define hpmd_alloc(mm, pud, addr, h) pmd_alloc(mm, pud, addr)
95 #else 135 #else
96 static inline 136 static inline
97 pmd_t *hpmd_offset(pud_t *pud, unsigned long addr) 137 pmd_t *hpmd_offset(pud_t *pud, unsigned long addr, struct hstate *hstate)
98 { 138 {
99 if (HPAGE_SHIFT == PAGE_SHIFT_64K) 139 if (huge_page_shift(hstate) == PAGE_SHIFT_64K)
100 return pmd_offset(pud, addr); 140 return pmd_offset(pud, addr);
101 else 141 else
102 return (pmd_t *) pud; 142 return (pmd_t *) pud;
103 } 143 }
104 static inline 144 static inline
105 pmd_t *hpmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long addr) 145 pmd_t *hpmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long addr,
146 struct hstate *hstate)
106 { 147 {
107 if (HPAGE_SHIFT == PAGE_SHIFT_64K) 148 if (huge_page_shift(hstate) == PAGE_SHIFT_64K)
108 return pmd_alloc(mm, pud, addr); 149 return pmd_alloc(mm, pud, addr);
109 else 150 else
110 return (pmd_t *) pud; 151 return (pmd_t *) pud;
111 } 152 }
112 #endif 153 #endif
113 154
114 /* Build list of addresses of gigantic pages. This function is used in early 155 /* Build list of addresses of gigantic pages. This function is used in early
115 * boot before the buddy or bootmem allocator is setup. 156 * boot before the buddy or bootmem allocator is setup.
116 */ 157 */
117 void add_gpage(unsigned long addr, unsigned long page_size, 158 void add_gpage(unsigned long addr, unsigned long page_size,
118 unsigned long number_of_pages) 159 unsigned long number_of_pages)
119 { 160 {
120 if (!addr) 161 if (!addr)
121 return; 162 return;
122 while (number_of_pages > 0) { 163 while (number_of_pages > 0) {
123 gpage_freearray[nr_gpages] = addr; 164 gpage_freearray[nr_gpages] = addr;
124 nr_gpages++; 165 nr_gpages++;
125 number_of_pages--; 166 number_of_pages--;
126 addr += page_size; 167 addr += page_size;
127 } 168 }
128 } 169 }
129 170
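add_gpage() only records physical addresses; a standalone version of that bookkeeping is shown below with illustrative addresses and an extra bounds check that the early-boot kernel code does not carry.

#include <stdio.h>

#define MAX_NUMBER_GPAGES 1024

static unsigned long gpage_freearray[MAX_NUMBER_GPAGES];
static unsigned int nr_gpages;

/* Same walk as add_gpage(): remember each page's physical address and step
 * forward by the page size.  The bounds check is extra, for the sketch. */
static void record_gpages(unsigned long addr, unsigned long page_size,
                          unsigned long number_of_pages)
{
        if (!addr)
                return;
        while (number_of_pages > 0 && nr_gpages < MAX_NUMBER_GPAGES) {
                gpage_freearray[nr_gpages++] = addr;
                number_of_pages--;
                addr += page_size;
        }
}

int main(void)
{
        unsigned int i;

        record_gpages(0x400000000UL, 1UL << 34, 3);  /* three 16G pages */
        for (i = 0; i < nr_gpages; i++)
                printf("gpage %u at %#lx\n", i, gpage_freearray[i]);
        return 0;
}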
130 /* Moves the gigantic page addresses from the temporary list to the 171 /* Moves the gigantic page addresses from the temporary list to the
131 * huge_boot_pages list. */ 172 * huge_boot_pages list.
132 int alloc_bootmem_huge_page(struct hstate *h) 173 */
174 int alloc_bootmem_huge_page(struct hstate *hstate)
133 { 175 {
134 struct huge_bootmem_page *m; 176 struct huge_bootmem_page *m;
135 if (nr_gpages == 0) 177 if (nr_gpages == 0)
136 return 0; 178 return 0;
137 m = phys_to_virt(gpage_freearray[--nr_gpages]); 179 m = phys_to_virt(gpage_freearray[--nr_gpages]);
138 gpage_freearray[nr_gpages] = 0; 180 gpage_freearray[nr_gpages] = 0;
139 list_add(&m->list, &huge_boot_pages); 181 list_add(&m->list, &huge_boot_pages);
140 m->hstate = h; 182 m->hstate = hstate;
141 return 1; 183 return 1;
142 } 184 }
143 185
144 186
145 /* Modelled after find_linux_pte() */ 187 /* Modelled after find_linux_pte() */
146 pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr) 188 pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
147 { 189 {
148 pgd_t *pg; 190 pgd_t *pg;
149 pud_t *pu; 191 pud_t *pu;
150 pmd_t *pm; 192 pmd_t *pm;
151 193
152 BUG_ON(get_slice_psize(mm, addr) != mmu_huge_psize); 194 unsigned int psize;
195 unsigned int shift;
196 unsigned long sz;
197 struct hstate *hstate;
198 psize = get_slice_psize(mm, addr);
199 shift = mmu_psize_to_shift(psize);
200 sz = ((1UL) << shift);
201 hstate = size_to_hstate(sz);
153 202
154 addr &= HPAGE_MASK; 203 addr &= hstate->mask;
155 204
156 pg = pgd_offset(mm, addr); 205 pg = pgd_offset(mm, addr);
157 if (!pgd_none(*pg)) { 206 if (!pgd_none(*pg)) {
158 pu = pud_offset(pg, addr); 207 pu = pud_offset(pg, addr);
159 if (!pud_none(*pu)) { 208 if (!pud_none(*pu)) {
160 pm = hpmd_offset(pu, addr); 209 pm = hpmd_offset(pu, addr, hstate);
161 if (!pmd_none(*pm)) 210 if (!pmd_none(*pm))
162 return hugepte_offset((hugepd_t *)pm, addr); 211 return hugepte_offset((hugepd_t *)pm, addr,
212 hstate);
163 } 213 }
164 } 214 }
165 215
166 return NULL; 216 return NULL;
167 } 217 }
168 218
169 pte_t *huge_pte_alloc(struct mm_struct *mm, 219 pte_t *huge_pte_alloc(struct mm_struct *mm,
170 unsigned long addr, unsigned long sz) 220 unsigned long addr, unsigned long sz)
171 { 221 {
172 pgd_t *pg; 222 pgd_t *pg;
173 pud_t *pu; 223 pud_t *pu;
174 pmd_t *pm; 224 pmd_t *pm;
175 hugepd_t *hpdp = NULL; 225 hugepd_t *hpdp = NULL;
226 struct hstate *hstate;
227 unsigned int psize;
228 hstate = size_to_hstate(sz);
176 229
177 BUG_ON(get_slice_psize(mm, addr) != mmu_huge_psize); 230 psize = get_slice_psize(mm, addr);
231 BUG_ON(!mmu_huge_psizes[psize]);
178 232
179 addr &= HPAGE_MASK; 233 addr &= hstate->mask;
180 234
181 pg = pgd_offset(mm, addr); 235 pg = pgd_offset(mm, addr);
182 pu = pud_alloc(mm, pg, addr); 236 pu = pud_alloc(mm, pg, addr);
183 237
184 if (pu) { 238 if (pu) {
185 pm = hpmd_alloc(mm, pu, addr); 239 pm = hpmd_alloc(mm, pu, addr, hstate);
186 if (pm) 240 if (pm)
187 hpdp = (hugepd_t *)pm; 241 hpdp = (hugepd_t *)pm;
188 } 242 }
189 243
190 if (! hpdp) 244 if (! hpdp)
191 return NULL; 245 return NULL;
192 246
193 if (hugepd_none(*hpdp) && __hugepte_alloc(mm, hpdp, addr)) 247 if (hugepd_none(*hpdp) && __hugepte_alloc(mm, hpdp, addr, psize))
194 return NULL; 248 return NULL;
195 249
196 return hugepte_offset(hpdp, addr); 250 return hugepte_offset(hpdp, addr, hstate);
197 } 251 }
198 252
199 int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep) 253 int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep)
200 { 254 {
201 return 0; 255 return 0;
202 } 256 }
203 257
204 static void free_hugepte_range(struct mmu_gather *tlb, hugepd_t *hpdp) 258 static void free_hugepte_range(struct mmu_gather *tlb, hugepd_t *hpdp,
259 unsigned int psize)
205 { 260 {
206 pte_t *hugepte = hugepd_page(*hpdp); 261 pte_t *hugepte = hugepd_page(*hpdp);
207 262
208 hpdp->pd = 0; 263 hpdp->pd = 0;
209 tlb->need_flush = 1; 264 tlb->need_flush = 1;
210 pgtable_free_tlb(tlb, pgtable_free_cache(hugepte, HUGEPTE_CACHE_NUM, 265 pgtable_free_tlb(tlb, pgtable_free_cache(hugepte,
266 HUGEPTE_CACHE_NUM+psize-1,
211 PGF_CACHENUM_MASK)); 267 PGF_CACHENUM_MASK));
212 } 268 }
213 269
214 static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud, 270 static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
215 unsigned long addr, unsigned long end, 271 unsigned long addr, unsigned long end,
216 unsigned long floor, unsigned long ceiling) 272 unsigned long floor, unsigned long ceiling,
273 unsigned int psize)
217 { 274 {
218 pmd_t *pmd; 275 pmd_t *pmd;
219 unsigned long next; 276 unsigned long next;
220 unsigned long start; 277 unsigned long start;
221 278
222 start = addr; 279 start = addr;
223 pmd = pmd_offset(pud, addr); 280 pmd = pmd_offset(pud, addr);
224 do { 281 do {
225 next = pmd_addr_end(addr, end); 282 next = pmd_addr_end(addr, end);
226 if (pmd_none(*pmd)) 283 if (pmd_none(*pmd))
227 continue; 284 continue;
228 free_hugepte_range(tlb, (hugepd_t *)pmd); 285 free_hugepte_range(tlb, (hugepd_t *)pmd, psize);
229 } while (pmd++, addr = next, addr != end); 286 } while (pmd++, addr = next, addr != end);
230 287
231 start &= PUD_MASK; 288 start &= PUD_MASK;
232 if (start < floor) 289 if (start < floor)
233 return; 290 return;
234 if (ceiling) { 291 if (ceiling) {
235 ceiling &= PUD_MASK; 292 ceiling &= PUD_MASK;
236 if (!ceiling) 293 if (!ceiling)
237 return; 294 return;
238 } 295 }
239 if (end - 1 > ceiling - 1) 296 if (end - 1 > ceiling - 1)
240 return; 297 return;
241 298
242 pmd = pmd_offset(pud, start); 299 pmd = pmd_offset(pud, start);
243 pud_clear(pud); 300 pud_clear(pud);
244 pmd_free_tlb(tlb, pmd); 301 pmd_free_tlb(tlb, pmd);
245 } 302 }
246 303
247 static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd, 304 static void hugetlb_free_pud_range(struct mmu_gather *tlb, pgd_t *pgd,
248 unsigned long addr, unsigned long end, 305 unsigned long addr, unsigned long end,
249 unsigned long floor, unsigned long ceiling) 306 unsigned long floor, unsigned long ceiling)
250 { 307 {
251 pud_t *pud; 308 pud_t *pud;
252 unsigned long next; 309 unsigned long next;
253 unsigned long start; 310 unsigned long start;
311 unsigned int shift;
312 unsigned int psize = get_slice_psize(tlb->mm, addr);
313 shift = mmu_psize_to_shift(psize);
254 314
255 start = addr; 315 start = addr;
256 pud = pud_offset(pgd, addr); 316 pud = pud_offset(pgd, addr);
257 do { 317 do {
258 next = pud_addr_end(addr, end); 318 next = pud_addr_end(addr, end);
259 #ifdef CONFIG_PPC_64K_PAGES 319 #ifdef CONFIG_PPC_64K_PAGES
260 if (pud_none_or_clear_bad(pud)) 320 if (pud_none_or_clear_bad(pud))
261 continue; 321 continue;
262 hugetlb_free_pmd_range(tlb, pud, addr, next, floor, ceiling); 322 hugetlb_free_pmd_range(tlb, pud, addr, next, floor, ceiling,
323 psize);
263 #else 324 #else
264 if (HPAGE_SHIFT == PAGE_SHIFT_64K) { 325 if (shift == PAGE_SHIFT_64K) {
265 if (pud_none_or_clear_bad(pud)) 326 if (pud_none_or_clear_bad(pud))
266 continue; 327 continue;
267 hugetlb_free_pmd_range(tlb, pud, addr, next, floor, ceiling); 328 hugetlb_free_pmd_range(tlb, pud, addr, next, floor,
329 ceiling, psize);
268 } else { 330 } else {
269 if (pud_none(*pud)) 331 if (pud_none(*pud))
270 continue; 332 continue;
271 free_hugepte_range(tlb, (hugepd_t *)pud); 333 free_hugepte_range(tlb, (hugepd_t *)pud, psize);
272 } 334 }
273 #endif 335 #endif
274 } while (pud++, addr = next, addr != end); 336 } while (pud++, addr = next, addr != end);
275 337
276 start &= PGDIR_MASK; 338 start &= PGDIR_MASK;
277 if (start < floor) 339 if (start < floor)
278 return; 340 return;
279 if (ceiling) { 341 if (ceiling) {
280 ceiling &= PGDIR_MASK; 342 ceiling &= PGDIR_MASK;
281 if (!ceiling) 343 if (!ceiling)
282 return; 344 return;
283 } 345 }
284 if (end - 1 > ceiling - 1) 346 if (end - 1 > ceiling - 1)
285 return; 347 return;
286 348
287 pud = pud_offset(pgd, start); 349 pud = pud_offset(pgd, start);
288 pgd_clear(pgd); 350 pgd_clear(pgd);
289 pud_free_tlb(tlb, pud); 351 pud_free_tlb(tlb, pud);
290 } 352 }
291 353
292 /* 354 /*
293 * This function frees user-level page tables of a process. 355 * This function frees user-level page tables of a process.
294 * 356 *
295 * Must be called with pagetable lock held. 357 * Must be called with pagetable lock held.
296 */ 358 */
297 void hugetlb_free_pgd_range(struct mmu_gather *tlb, 359 void hugetlb_free_pgd_range(struct mmu_gather *tlb,
298 unsigned long addr, unsigned long end, 360 unsigned long addr, unsigned long end,
299 unsigned long floor, unsigned long ceiling) 361 unsigned long floor, unsigned long ceiling)
300 { 362 {
301 pgd_t *pgd; 363 pgd_t *pgd;
302 unsigned long next; 364 unsigned long next;
303 unsigned long start; 365 unsigned long start;
304 366
305 /* 367 /*
306 * Comments below taken from the normal free_pgd_range(). They 368 * Comments below taken from the normal free_pgd_range(). They
307 * apply here too. The tests against HUGEPD_MASK below are 369 * apply here too. The tests against HUGEPD_MASK below are
308 * essential, because we *don't* test for this at the bottom 370 * essential, because we *don't* test for this at the bottom
309 * level. Without them we'll attempt to free a hugepte table 371 * level. Without them we'll attempt to free a hugepte table
310 * when we unmap just part of it, even if there are other 372 * when we unmap just part of it, even if there are other
311 * active mappings using it. 373 * active mappings using it.
312 * 374 *
313 * The next few lines have given us lots of grief... 375 * The next few lines have given us lots of grief...
314 * 376 *
315 * Why are we testing HUGEPD* at this top level? Because 377 * Why are we testing HUGEPD* at this top level? Because
316 * often there will be no work to do at all, and we'd prefer 378 * often there will be no work to do at all, and we'd prefer
317 * not to go all the way down to the bottom just to discover 379 * not to go all the way down to the bottom just to discover
318 * that. 380 * that.
319 * 381 *
320 * Why all these "- 1"s? Because 0 represents both the bottom 382 * Why all these "- 1"s? Because 0 represents both the bottom
321 * of the address space and the top of it (using -1 for the 383 * of the address space and the top of it (using -1 for the
322 * top wouldn't help much: the masks would do the wrong thing). 384 * top wouldn't help much: the masks would do the wrong thing).
323 * The rule is that addr 0 and floor 0 refer to the bottom of 385 * The rule is that addr 0 and floor 0 refer to the bottom of
324 * the address space, but end 0 and ceiling 0 refer to the top 386 * the address space, but end 0 and ceiling 0 refer to the top
325 * Comparisons need to use "end - 1" and "ceiling - 1" (though 387 * Comparisons need to use "end - 1" and "ceiling - 1" (though
326 * that end 0 case should be mythical). 388 * that end 0 case should be mythical).
327 * 389 *
328 * Wherever addr is brought up or ceiling brought down, we 390 * Wherever addr is brought up or ceiling brought down, we
329 * must be careful to reject "the opposite 0" before it 391 * must be careful to reject "the opposite 0" before it
330 * confuses the subsequent tests. But what about where end is 392 * confuses the subsequent tests. But what about where end is
331 * brought down by HUGEPD_SIZE below? no, end can't go down to 393 * brought down by HUGEPD_SIZE below? no, end can't go down to
332 * 0 there. 394 * 0 there.
333 * 395 *
334 * Whereas we round start (addr) and ceiling down, by different 396 * Whereas we round start (addr) and ceiling down, by different
335 * masks at different levels, in order to test whether a table 397 * masks at different levels, in order to test whether a table
336 * now has no other vmas using it, so can be freed, we don't 398 * now has no other vmas using it, so can be freed, we don't
337 * bother to round floor or end up - the tests don't need that. 399 * bother to round floor or end up - the tests don't need that.
338 */ 400 */
401 unsigned int psize = get_slice_psize(tlb->mm, addr);
339 402
340 addr &= HUGEPD_MASK; 403 addr &= HUGEPD_MASK(psize);
341 if (addr < floor) { 404 if (addr < floor) {
342 addr += HUGEPD_SIZE; 405 addr += HUGEPD_SIZE(psize);
343 if (!addr) 406 if (!addr)
344 return; 407 return;
345 } 408 }
346 if (ceiling) { 409 if (ceiling) {
347 ceiling &= HUGEPD_MASK; 410 ceiling &= HUGEPD_MASK(psize);
348 if (!ceiling) 411 if (!ceiling)
349 return; 412 return;
350 } 413 }
351 if (end - 1 > ceiling - 1) 414 if (end - 1 > ceiling - 1)
352 end -= HUGEPD_SIZE; 415 end -= HUGEPD_SIZE(psize);
353 if (addr > end - 1) 416 if (addr > end - 1)
354 return; 417 return;
355 418
356 start = addr; 419 start = addr;
357 pgd = pgd_offset(tlb->mm, addr); 420 pgd = pgd_offset(tlb->mm, addr);
358 do { 421 do {
359 BUG_ON(get_slice_psize(tlb->mm, addr) != mmu_huge_psize); 422 psize = get_slice_psize(tlb->mm, addr);
423 BUG_ON(!mmu_huge_psizes[psize]);
360 next = pgd_addr_end(addr, end); 424 next = pgd_addr_end(addr, end);
361 if (pgd_none_or_clear_bad(pgd)) 425 if (pgd_none_or_clear_bad(pgd))
362 continue; 426 continue;
363 hugetlb_free_pud_range(tlb, pgd, addr, next, floor, ceiling); 427 hugetlb_free_pud_range(tlb, pgd, addr, next, floor, ceiling);
364 } while (pgd++, addr = next, addr != end); 428 } while (pgd++, addr = next, addr != end);
365 } 429 }
366 430
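The "- 1" comparisons discussed in the comment block above are easy to verify in isolation: subtracting one first makes a ceiling of 0 behave like the very top of the address space under unsigned wrap-around. A small demonstration:

#include <stdio.h>

/* With unsigned arithmetic, ceiling 0 wraps to ULONG_MAX after the "- 1",
 * so nothing ever compares above it; any finite ceiling behaves normally. */
static int above_ceiling(unsigned long end, unsigned long ceiling)
{
        return end - 1 > ceiling - 1;
}

int main(void)
{
        printf("end=0x20000 ceiling=0x10000 -> %d (end lies above the ceiling)\n",
               above_ceiling(0x20000UL, 0x10000UL));
        printf("end=0x20000 ceiling=0       -> %d (ceiling 0 means the very top)\n",
               above_ceiling(0x20000UL, 0UL));
        return 0;
}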
367 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, 431 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
368 pte_t *ptep, pte_t pte) 432 pte_t *ptep, pte_t pte)
369 { 433 {
370 if (pte_present(*ptep)) { 434 if (pte_present(*ptep)) {
371 /* We open-code pte_clear because we need to pass the right 435 /* We open-code pte_clear because we need to pass the right
372 * argument to hpte_need_flush (huge / !huge). Might not be 436 * argument to hpte_need_flush (huge / !huge). Might not be
373 * necessary anymore if we make hpte_need_flush() get the 437 * necessary anymore if we make hpte_need_flush() get the
374 * page size from the slices 438 * page size from the slices
375 */ 439 */
376 pte_update(mm, addr & HPAGE_MASK, ptep, ~0UL, 1); 440 unsigned int psize = get_slice_psize(mm, addr);
441 unsigned int shift = mmu_psize_to_shift(psize);
442 unsigned long sz = ((1UL) << shift);
443 struct hstate *hstate = size_to_hstate(sz);
444 pte_update(mm, addr & hstate->mask, ptep, ~0UL, 1);
377 } 445 }
378 *ptep = __pte(pte_val(pte) & ~_PAGE_HPTEFLAGS); 446 *ptep = __pte(pte_val(pte) & ~_PAGE_HPTEFLAGS);
379 } 447 }
380 448
381 pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, 449 pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
382 pte_t *ptep) 450 pte_t *ptep)
383 { 451 {
384 unsigned long old = pte_update(mm, addr, ptep, ~0UL, 1); 452 unsigned long old = pte_update(mm, addr, ptep, ~0UL, 1);
385 return __pte(old); 453 return __pte(old);
386 } 454 }
387 455
388 struct page * 456 struct page *
389 follow_huge_addr(struct mm_struct *mm, unsigned long address, int write) 457 follow_huge_addr(struct mm_struct *mm, unsigned long address, int write)
390 { 458 {
391 pte_t *ptep; 459 pte_t *ptep;
392 struct page *page; 460 struct page *page;
461 unsigned int mmu_psize = get_slice_psize(mm, address);
393 462
394 if (get_slice_psize(mm, address) != mmu_huge_psize) 463 /* Verify it is a huge page else bail. */
464 if (!mmu_huge_psizes[mmu_psize])
395 return ERR_PTR(-EINVAL); 465 return ERR_PTR(-EINVAL);
396 466
397 ptep = huge_pte_offset(mm, address); 467 ptep = huge_pte_offset(mm, address);
398 page = pte_page(*ptep); 468 page = pte_page(*ptep);
399 if (page) 469 if (page) {
400 page += (address % HPAGE_SIZE) / PAGE_SIZE; 470 unsigned int shift = mmu_psize_to_shift(mmu_psize);
471 unsigned long sz = ((1UL) << shift);
472 page += (address % sz) / PAGE_SIZE;
473 }
401 474
402 return page; 475 return page;
403 } 476 }
404 477
405 int pmd_huge(pmd_t pmd) 478 int pmd_huge(pmd_t pmd)
406 { 479 {
407 return 0; 480 return 0;
408 } 481 }
409 482
410 int pud_huge(pud_t pud) 483 int pud_huge(pud_t pud)
411 { 484 {
412 return 0; 485 return 0;
413 } 486 }
414 487
415 struct page * 488 struct page *
416 follow_huge_pmd(struct mm_struct *mm, unsigned long address, 489 follow_huge_pmd(struct mm_struct *mm, unsigned long address,
417 pmd_t *pmd, int write) 490 pmd_t *pmd, int write)
418 { 491 {
419 BUG(); 492 BUG();
420 return NULL; 493 return NULL;
421 } 494 }
422 495
423 496
424 unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr, 497 unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
425 unsigned long len, unsigned long pgoff, 498 unsigned long len, unsigned long pgoff,
426 unsigned long flags) 499 unsigned long flags)
427 { 500 {
428 return slice_get_unmapped_area(addr, len, flags, 501 struct hstate *hstate = hstate_file(file);
429 mmu_huge_psize, 1, 0); 502 int mmu_psize = shift_to_mmu_psize(huge_page_shift(hstate));
503 return slice_get_unmapped_area(addr, len, flags, mmu_psize, 1, 0);
430 } 504 }
431 505
432 /* 506 /*
433 * Called by asm hashtable.S for doing lazy icache flush 507 * Called by asm hashtable.S for doing lazy icache flush
434 */ 508 */
435 static unsigned int hash_huge_page_do_lazy_icache(unsigned long rflags, 509 static unsigned int hash_huge_page_do_lazy_icache(unsigned long rflags,
436 pte_t pte, int trap) 510 pte_t pte, int trap, unsigned long sz)
437 { 511 {
438 struct page *page; 512 struct page *page;
439 int i; 513 int i;
440 514
441 if (!pfn_valid(pte_pfn(pte))) 515 if (!pfn_valid(pte_pfn(pte)))
442 return rflags; 516 return rflags;
443 517
444 page = pte_page(pte); 518 page = pte_page(pte);
445 519
446 /* page is dirty */ 520 /* page is dirty */
447 if (!test_bit(PG_arch_1, &page->flags) && !PageReserved(page)) { 521 if (!test_bit(PG_arch_1, &page->flags) && !PageReserved(page)) {
448 if (trap == 0x400) { 522 if (trap == 0x400) {
449 for (i = 0; i < (HPAGE_SIZE / PAGE_SIZE); i++) 523 for (i = 0; i < (sz / PAGE_SIZE); i++)
450 __flush_dcache_icache(page_address(page+i)); 524 __flush_dcache_icache(page_address(page+i));
451 set_bit(PG_arch_1, &page->flags); 525 set_bit(PG_arch_1, &page->flags);
452 } else { 526 } else {
453 rflags |= HPTE_R_N; 527 rflags |= HPTE_R_N;
454 } 528 }
455 } 529 }
456 return rflags; 530 return rflags;
457 } 531 }
458 532
459 int hash_huge_page(struct mm_struct *mm, unsigned long access, 533 int hash_huge_page(struct mm_struct *mm, unsigned long access,
460 unsigned long ea, unsigned long vsid, int local, 534 unsigned long ea, unsigned long vsid, int local,
461 unsigned long trap) 535 unsigned long trap)
462 { 536 {
463 pte_t *ptep; 537 pte_t *ptep;
464 unsigned long old_pte, new_pte; 538 unsigned long old_pte, new_pte;
465 unsigned long va, rflags, pa; 539 unsigned long va, rflags, pa, sz;
466 long slot; 540 long slot;
467 int err = 1; 541 int err = 1;
468 int ssize = user_segment_size(ea); 542 int ssize = user_segment_size(ea);
543 unsigned int mmu_psize;
544 int shift;
545 mmu_psize = get_slice_psize(mm, ea);
469 546
547 if (!mmu_huge_psizes[mmu_psize])
548 goto out;
470 ptep = huge_pte_offset(mm, ea); 549 ptep = huge_pte_offset(mm, ea);
471 550
472 /* Search the Linux page table for a match with va */ 551 /* Search the Linux page table for a match with va */
473 va = hpt_va(ea, vsid, ssize); 552 va = hpt_va(ea, vsid, ssize);
474 553
475 /* 554 /*
476 * If no pte found or not present, send the problem up to 555 * If no pte found or not present, send the problem up to
477 * do_page_fault 556 * do_page_fault
478 */ 557 */
479 if (unlikely(!ptep || pte_none(*ptep))) 558 if (unlikely(!ptep || pte_none(*ptep)))
480 goto out; 559 goto out;
481 560
482 /* 561 /*
483 * Check the user's access rights to the page. If access should be 562 * Check the user's access rights to the page. If access should be
484 * prevented then send the problem up to do_page_fault. 563 * prevented then send the problem up to do_page_fault.
485 */ 564 */
486 if (unlikely(access & ~pte_val(*ptep))) 565 if (unlikely(access & ~pte_val(*ptep)))
487 goto out; 566 goto out;
488 /* 567 /*
489 * At this point, we have a pte (old_pte) which can be used to build 568 * At this point, we have a pte (old_pte) which can be used to build
490 * or update an HPTE. There are 2 cases: 569 * or update an HPTE. There are 2 cases:
491 * 570 *
492 * 1. There is a valid (present) pte with no associated HPTE (this is 571 * 1. There is a valid (present) pte with no associated HPTE (this is
493 * the most common case) 572 * the most common case)
494 * 2. There is a valid (present) pte with an associated HPTE. The 573 * 2. There is a valid (present) pte with an associated HPTE. The
495 * current values of the pp bits in the HPTE prevent access 574 * current values of the pp bits in the HPTE prevent access
496 * because we are doing software DIRTY bit management and the 575 * because we are doing software DIRTY bit management and the
497 * page is currently not DIRTY. 576 * page is currently not DIRTY.
498 */ 577 */
499 578
500 579
501 do { 580 do {
502 old_pte = pte_val(*ptep); 581 old_pte = pte_val(*ptep);
503 if (old_pte & _PAGE_BUSY) 582 if (old_pte & _PAGE_BUSY)
504 goto out; 583 goto out;
505 new_pte = old_pte | _PAGE_BUSY | _PAGE_ACCESSED; 584 new_pte = old_pte | _PAGE_BUSY | _PAGE_ACCESSED;
506 } while(old_pte != __cmpxchg_u64((unsigned long *)ptep, 585 } while(old_pte != __cmpxchg_u64((unsigned long *)ptep,
507 old_pte, new_pte)); 586 old_pte, new_pte));
508 587
509 rflags = 0x2 | (!(new_pte & _PAGE_RW)); 588 rflags = 0x2 | (!(new_pte & _PAGE_RW));
510 /* _PAGE_EXEC -> HW_NO_EXEC since it's inverted */ 589 /* _PAGE_EXEC -> HW_NO_EXEC since it's inverted */
511 rflags |= ((new_pte & _PAGE_EXEC) ? 0 : HPTE_R_N); 590 rflags |= ((new_pte & _PAGE_EXEC) ? 0 : HPTE_R_N);
591 shift = mmu_psize_to_shift(mmu_psize);
592 sz = ((1UL) << shift);
512 if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE)) 593 if (!cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
513 /* No CPU has hugepages but lacks no execute, so we 594 /* No CPU has hugepages but lacks no execute, so we
514 * don't need to worry about that case */ 595 * don't need to worry about that case */
515 rflags = hash_huge_page_do_lazy_icache(rflags, __pte(old_pte), 596 rflags = hash_huge_page_do_lazy_icache(rflags, __pte(old_pte),
516 trap); 597 trap, sz);
517 598
518 /* Check if pte already has an hpte (case 2) */ 599 /* Check if pte already has an hpte (case 2) */
519 if (unlikely(old_pte & _PAGE_HASHPTE)) { 600 if (unlikely(old_pte & _PAGE_HASHPTE)) {
520 /* There MIGHT be an HPTE for this pte */ 601 /* There MIGHT be an HPTE for this pte */
521 unsigned long hash, slot; 602 unsigned long hash, slot;
522 603
523 hash = hpt_hash(va, HPAGE_SHIFT, ssize); 604 hash = hpt_hash(va, shift, ssize);
524 if (old_pte & _PAGE_F_SECOND) 605 if (old_pte & _PAGE_F_SECOND)
525 hash = ~hash; 606 hash = ~hash;
526 slot = (hash & htab_hash_mask) * HPTES_PER_GROUP; 607 slot = (hash & htab_hash_mask) * HPTES_PER_GROUP;
527 slot += (old_pte & _PAGE_F_GIX) >> 12; 608 slot += (old_pte & _PAGE_F_GIX) >> 12;
528 609
529 if (ppc_md.hpte_updatepp(slot, rflags, va, mmu_huge_psize, 610 if (ppc_md.hpte_updatepp(slot, rflags, va, mmu_psize,
530 ssize, local) == -1) 611 ssize, local) == -1)
531 old_pte &= ~_PAGE_HPTEFLAGS; 612 old_pte &= ~_PAGE_HPTEFLAGS;
532 } 613 }
533 614
534 if (likely(!(old_pte & _PAGE_HASHPTE))) { 615 if (likely(!(old_pte & _PAGE_HASHPTE))) {
535 unsigned long hash = hpt_hash(va, HPAGE_SHIFT, ssize); 616 unsigned long hash = hpt_hash(va, shift, ssize);
536 unsigned long hpte_group; 617 unsigned long hpte_group;
537 618
538 pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT; 619 pa = pte_pfn(__pte(old_pte)) << PAGE_SHIFT;
539 620
540 repeat: 621 repeat:
541 hpte_group = ((hash & htab_hash_mask) * 622 hpte_group = ((hash & htab_hash_mask) *
542 HPTES_PER_GROUP) & ~0x7UL; 623 HPTES_PER_GROUP) & ~0x7UL;
543 624
544 /* clear HPTE slot information in new PTE */ 625 /* clear HPTE slot information in new PTE */
545 #ifdef CONFIG_PPC_64K_PAGES 626 #ifdef CONFIG_PPC_64K_PAGES
546 new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HPTE_SUB0; 627 new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HPTE_SUB0;
547 #else 628 #else
548 new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HASHPTE; 629 new_pte = (new_pte & ~_PAGE_HPTEFLAGS) | _PAGE_HASHPTE;
549 #endif 630 #endif
550 /* Add in WIMG bits */ 631 /* Add in WIMG bits */
551 rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE | 632 rflags |= (new_pte & (_PAGE_WRITETHRU | _PAGE_NO_CACHE |
552 _PAGE_COHERENT | _PAGE_GUARDED)); 633 _PAGE_COHERENT | _PAGE_GUARDED));
553 634
554 /* Insert into the hash table, primary slot */ 635 /* Insert into the hash table, primary slot */
555 slot = ppc_md.hpte_insert(hpte_group, va, pa, rflags, 0, 636 slot = ppc_md.hpte_insert(hpte_group, va, pa, rflags, 0,
556 mmu_huge_psize, ssize); 637 mmu_psize, ssize);
557 638
558 /* Primary is full, try the secondary */ 639 /* Primary is full, try the secondary */
559 if (unlikely(slot == -1)) { 640 if (unlikely(slot == -1)) {
560 hpte_group = ((~hash & htab_hash_mask) * 641 hpte_group = ((~hash & htab_hash_mask) *
561 HPTES_PER_GROUP) & ~0x7UL; 642 HPTES_PER_GROUP) & ~0x7UL;
562 slot = ppc_md.hpte_insert(hpte_group, va, pa, rflags, 643 slot = ppc_md.hpte_insert(hpte_group, va, pa, rflags,
563 HPTE_V_SECONDARY, 644 HPTE_V_SECONDARY,
564 mmu_huge_psize, ssize); 645 mmu_psize, ssize);
565 if (slot == -1) { 646 if (slot == -1) {
566 if (mftb() & 0x1) 647 if (mftb() & 0x1)
567 hpte_group = ((hash & htab_hash_mask) * 648 hpte_group = ((hash & htab_hash_mask) *
568 HPTES_PER_GROUP)&~0x7UL; 649 HPTES_PER_GROUP)&~0x7UL;
569 650
570 ppc_md.hpte_remove(hpte_group); 651 ppc_md.hpte_remove(hpte_group);
571 goto repeat; 652 goto repeat;
572 } 653 }
573 } 654 }
574 655
575 if (unlikely(slot == -2)) 656 if (unlikely(slot == -2))
576 panic("hash_huge_page: pte_insert failed\n"); 657 panic("hash_huge_page: pte_insert failed\n");
577 658
578 new_pte |= (slot << 12) & (_PAGE_F_SECOND | _PAGE_F_GIX); 659 new_pte |= (slot << 12) & (_PAGE_F_SECOND | _PAGE_F_GIX);
579 } 660 }
580 661
581 /* 662 /*
582 * No need to use ldarx/stdcx here 663 * No need to use ldarx/stdcx here
583 */ 664 */
584 *ptep = __pte(new_pte & ~_PAGE_BUSY); 665 *ptep = __pte(new_pte & ~_PAGE_BUSY);
585 666
586 err = 0; 667 err = 0;
587 668
588 out: 669 out:
589 return err; 670 return err;
590 } 671 }
591 672
592 void set_huge_psize(int psize) 673 void set_huge_psize(int psize)
593 { 674 {
594 /* Check that it is a page size supported by the hardware and 675 /* Check that it is a page size supported by the hardware and
595 * that it fits within pagetable limits. */ 676 * that it fits within pagetable limits. */
596 if (mmu_psize_defs[psize].shift && 677 if (mmu_psize_defs[psize].shift &&
597 mmu_psize_defs[psize].shift < SID_SHIFT_1T && 678 mmu_psize_defs[psize].shift < SID_SHIFT_1T &&
598 (mmu_psize_defs[psize].shift > MIN_HUGEPTE_SHIFT || 679 (mmu_psize_defs[psize].shift > MIN_HUGEPTE_SHIFT ||
599 mmu_psize_defs[psize].shift == PAGE_SHIFT_64K || 680 mmu_psize_defs[psize].shift == PAGE_SHIFT_64K ||
600 mmu_psize_defs[psize].shift == PAGE_SHIFT_16G)) { 681 mmu_psize_defs[psize].shift == PAGE_SHIFT_16G)) {
601 /* Return if huge page size is the same as the 682 /* Return if huge page size has already been setup or is the
602 * base page size. */ 683 * same as the base page size. */
603 if (mmu_psize_defs[psize].shift == PAGE_SHIFT) 684 if (mmu_huge_psizes[psize] ||
685 mmu_psize_defs[psize].shift == PAGE_SHIFT)
604 return; 686 return;
687 hugetlb_add_hstate(mmu_psize_defs[psize].shift - PAGE_SHIFT);
605 688
606 HPAGE_SHIFT = mmu_psize_defs[psize].shift; 689 switch (mmu_psize_defs[psize].shift) {
607 mmu_huge_psize = psize;
608
609 switch (HPAGE_SHIFT) {
610 case PAGE_SHIFT_64K: 690 case PAGE_SHIFT_64K:
611 /* We only allow 64k hpages with 4k base page, 691 /* We only allow 64k hpages with 4k base page,
612 * which was checked above, and always put them 692 * which was checked above, and always put them
613 * at the PMD */ 693 * at the PMD */
614 hugepte_shift = PMD_SHIFT; 694 hugepte_shift[psize] = PMD_SHIFT;
615 break; 695 break;
616 case PAGE_SHIFT_16M: 696 case PAGE_SHIFT_16M:
617 /* 16M pages can be at two different levels 697 /* 16M pages can be at two different levels
618 * of pagetables based on base page size */ 698 * of pagetables based on base page size */
619 if (PAGE_SHIFT == PAGE_SHIFT_64K) 699 if (PAGE_SHIFT == PAGE_SHIFT_64K)
620 hugepte_shift = PMD_SHIFT; 700 hugepte_shift[psize] = PMD_SHIFT;
621 else /* 4k base page */ 701 else /* 4k base page */
622 hugepte_shift = PUD_SHIFT; 702 hugepte_shift[psize] = PUD_SHIFT;
623 break; 703 break;
624 case PAGE_SHIFT_16G: 704 case PAGE_SHIFT_16G:
625 /* 16G pages are always at PGD level */ 705 /* 16G pages are always at PGD level */
626 hugepte_shift = PGDIR_SHIFT; 706 hugepte_shift[psize] = PGDIR_SHIFT;
627 break; 707 break;
628 } 708 }
629 hugepte_shift -= HPAGE_SHIFT; 709 hugepte_shift[psize] -= mmu_psize_defs[psize].shift;
630 } else 710 } else
631 HPAGE_SHIFT = 0; 711 hugepte_shift[psize] = 0;
632 } 712 }
633 713
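After the switch in set_huge_psize() picks a page-table level, hugepte_shift[psize] ends up as "level shift minus huge page shift", i.e. the number of index bits in one hugepte table. A worked example with an assumed level shift of 30 (the real PMD/PUD/PGDIR shifts depend on the base page size and are not repeated here):

#include <stdio.h>

int main(void)
{
        unsigned int level_shift = 30;   /* assumed page-table level shift */
        unsigned int page_shift  = 24;   /* 16M huge page */

        unsigned int hugepte_shift   = level_shift - page_shift;
        unsigned long ptes_per_table = 1UL << hugepte_shift;
        unsigned long bytes_mapped   = ptes_per_table << page_shift;

        printf("hugepte_shift=%u -> %lu entries mapping %lu MB per table\n",
               hugepte_shift, ptes_per_table, bytes_mapped >> 20);
        return 0;
}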
634 static int __init hugepage_setup_sz(char *str) 714 static int __init hugepage_setup_sz(char *str)
635 { 715 {
636 unsigned long long size; 716 unsigned long long size;
637 int mmu_psize = -1; 717 int mmu_psize;
638 int shift; 718 int shift;
639 719
640 size = memparse(str, &str); 720 size = memparse(str, &str);
641 721
642 shift = __ffs(size); 722 shift = __ffs(size);
643 switch (shift) { 723 mmu_psize = shift_to_mmu_psize(shift);
644 #ifndef CONFIG_PPC_64K_PAGES 724 if (mmu_psize >= 0 && mmu_psize_defs[mmu_psize].shift)
645 case PAGE_SHIFT_64K:
646 mmu_psize = MMU_PAGE_64K;
647 break;
648 #endif
649 case PAGE_SHIFT_16M:
650 mmu_psize = MMU_PAGE_16M;
651 break;
652 case PAGE_SHIFT_16G:
653 mmu_psize = MMU_PAGE_16G;
654 break;
655 }
656
657 if (mmu_psize >= 0 && mmu_psize_defs[mmu_psize].shift) {
658 set_huge_psize(mmu_psize); 725 set_huge_psize(mmu_psize);
659 hugetlb_add_hstate(shift - PAGE_SHIFT);
660 }
661 else 726 else
662 printk(KERN_WARNING "Invalid huge page size specified(%llu)\n", size); 727 printk(KERN_WARNING "Invalid huge page size specified(%llu)\n", size);
663 728
664 return 1; 729 return 1;
665 } 730 }
666 __setup("hugepagesz=", hugepage_setup_sz); 731 __setup("hugepagesz=", hugepage_setup_sz);
667 732
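hugepage_setup_sz() takes the parsed size's lowest set bit as the page shift and only accepts shifts that map to a supported psize (16 for 64K where the base page size allows it, 24 for 16M, 34 for 16G). A hedged sketch of that step, using __builtin_ctzll() as a stand-in for __ffs():

#include <stdio.h>

/* Lowest set bit of the parsed size, standing in for __ffs(). */
static int shift_of(unsigned long long size)
{
        return size ? __builtin_ctzll(size) : -1;
}

int main(void)
{
        unsigned long long sizes[] = { 64ULL << 10, 16ULL << 20, 16ULL << 30, 1ULL << 20 };
        int i;

        for (i = 0; i < 4; i++) {
                int shift = shift_of(sizes[i]);
                const char *verdict = (shift == 16 || shift == 24 || shift == 34)
                                        ? "maps to a huge psize" : "rejected";
                printf("%llu bytes -> shift %d (%s)\n", sizes[i], shift, verdict);
        }
        return 0;
}

On the kernel command line this is the value given to hugepagesz=, e.g. hugepagesz=16M; the 64K case is only accepted when the base page size is 4K.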
668 static void zero_ctor(struct kmem_cache *cache, void *addr) 733 static void zero_ctor(struct kmem_cache *cache, void *addr)
669 { 734 {
670 memset(addr, 0, kmem_cache_size(cache)); 735 memset(addr, 0, kmem_cache_size(cache));
671 } 736 }
672 737
673 static int __init hugetlbpage_init(void) 738 static int __init hugetlbpage_init(void)
674 { 739 {
740 unsigned int psize;
741
675 if (!cpu_has_feature(CPU_FTR_16M_PAGE)) 742 if (!cpu_has_feature(CPU_FTR_16M_PAGE))
676 return -ENODEV; 743 return -ENODEV;
744 /* Add supported huge page sizes. Need to change HUGE_MAX_HSTATE
745 * and adjust PTE_NONCACHE_NUM if the number of supported huge page
746 * sizes changes.
747 */
748 set_huge_psize(MMU_PAGE_16M);
749 set_huge_psize(MMU_PAGE_64K);
750 set_huge_psize(MMU_PAGE_16G);
677 751
678 huge_pgtable_cache = kmem_cache_create("hugepte_cache", 752 for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
arch/powerpc/mm/init_64.c
1 /* 1 /*
2 * PowerPC version 2 * PowerPC version
3 * Copyright (C) 1995-1996 Gary Thomas (gdt@linuxppc.org) 3 * Copyright (C) 1995-1996 Gary Thomas (gdt@linuxppc.org)
4 * 4 *
5 * Modifications by Paul Mackerras (PowerMac) (paulus@cs.anu.edu.au) 5 * Modifications by Paul Mackerras (PowerMac) (paulus@cs.anu.edu.au)
6 * and Cort Dougan (PReP) (cort@cs.nmt.edu) 6 * and Cort Dougan (PReP) (cort@cs.nmt.edu)
7 * Copyright (C) 1996 Paul Mackerras 7 * Copyright (C) 1996 Paul Mackerras
8 * 8 *
9 * Derived from "arch/i386/mm/init.c" 9 * Derived from "arch/i386/mm/init.c"
10 * Copyright (C) 1991, 1992, 1993, 1994 Linus Torvalds 10 * Copyright (C) 1991, 1992, 1993, 1994 Linus Torvalds
11 * 11 *
12 * Dave Engebretsen <engebret@us.ibm.com> 12 * Dave Engebretsen <engebret@us.ibm.com>
13 * Rework for PPC64 port. 13 * Rework for PPC64 port.
14 * 14 *
15 * This program is free software; you can redistribute it and/or 15 * This program is free software; you can redistribute it and/or
16 * modify it under the terms of the GNU General Public License 16 * modify it under the terms of the GNU General Public License
17 * as published by the Free Software Foundation; either version 17 * as published by the Free Software Foundation; either version
18 * 2 of the License, or (at your option) any later version. 18 * 2 of the License, or (at your option) any later version.
19 * 19 *
20 */ 20 */
21 21
22 #undef DEBUG 22 #undef DEBUG
23 23
24 #include <linux/signal.h> 24 #include <linux/signal.h>
25 #include <linux/sched.h> 25 #include <linux/sched.h>
26 #include <linux/kernel.h> 26 #include <linux/kernel.h>
27 #include <linux/errno.h> 27 #include <linux/errno.h>
28 #include <linux/string.h> 28 #include <linux/string.h>
29 #include <linux/types.h> 29 #include <linux/types.h>
30 #include <linux/mman.h> 30 #include <linux/mman.h>
31 #include <linux/mm.h> 31 #include <linux/mm.h>
32 #include <linux/swap.h> 32 #include <linux/swap.h>
33 #include <linux/stddef.h> 33 #include <linux/stddef.h>
34 #include <linux/vmalloc.h> 34 #include <linux/vmalloc.h>
35 #include <linux/init.h> 35 #include <linux/init.h>
36 #include <linux/delay.h> 36 #include <linux/delay.h>
37 #include <linux/bootmem.h> 37 #include <linux/bootmem.h>
38 #include <linux/highmem.h> 38 #include <linux/highmem.h>
39 #include <linux/idr.h> 39 #include <linux/idr.h>
40 #include <linux/nodemask.h> 40 #include <linux/nodemask.h>
41 #include <linux/module.h> 41 #include <linux/module.h>
42 #include <linux/poison.h> 42 #include <linux/poison.h>
43 #include <linux/lmb.h> 43 #include <linux/lmb.h>
44 44
45 #include <asm/pgalloc.h> 45 #include <asm/pgalloc.h>
46 #include <asm/page.h> 46 #include <asm/page.h>
47 #include <asm/prom.h> 47 #include <asm/prom.h>
48 #include <asm/rtas.h> 48 #include <asm/rtas.h>
49 #include <asm/io.h> 49 #include <asm/io.h>
50 #include <asm/mmu_context.h> 50 #include <asm/mmu_context.h>
51 #include <asm/pgtable.h> 51 #include <asm/pgtable.h>
52 #include <asm/mmu.h> 52 #include <asm/mmu.h>
53 #include <asm/uaccess.h> 53 #include <asm/uaccess.h>
54 #include <asm/smp.h> 54 #include <asm/smp.h>
55 #include <asm/machdep.h> 55 #include <asm/machdep.h>
56 #include <asm/tlb.h> 56 #include <asm/tlb.h>
57 #include <asm/eeh.h> 57 #include <asm/eeh.h>
58 #include <asm/processor.h> 58 #include <asm/processor.h>
59 #include <asm/mmzone.h> 59 #include <asm/mmzone.h>
60 #include <asm/cputable.h> 60 #include <asm/cputable.h>
61 #include <asm/sections.h> 61 #include <asm/sections.h>
62 #include <asm/system.h> 62 #include <asm/system.h>
63 #include <asm/iommu.h> 63 #include <asm/iommu.h>
64 #include <asm/abs_addr.h> 64 #include <asm/abs_addr.h>
65 #include <asm/vdso.h> 65 #include <asm/vdso.h>
66 66
67 #include "mmu_decl.h" 67 #include "mmu_decl.h"
68 68
69 #if PGTABLE_RANGE > USER_VSID_RANGE 69 #if PGTABLE_RANGE > USER_VSID_RANGE
70 #warning Limited user VSID range means pagetable space is wasted 70 #warning Limited user VSID range means pagetable space is wasted
71 #endif 71 #endif
72 72
73 #if (TASK_SIZE_USER64 < PGTABLE_RANGE) && (TASK_SIZE_USER64 < USER_VSID_RANGE) 73 #if (TASK_SIZE_USER64 < PGTABLE_RANGE) && (TASK_SIZE_USER64 < USER_VSID_RANGE)
74 #warning TASK_SIZE is smaller than it needs to be. 74 #warning TASK_SIZE is smaller than it needs to be.
75 #endif 75 #endif
76 76
77 phys_addr_t memstart_addr = ~0; 77 phys_addr_t memstart_addr = ~0;
78 phys_addr_t kernstart_addr; 78 phys_addr_t kernstart_addr;
79 79
80 void free_initmem(void) 80 void free_initmem(void)
81 { 81 {
82 unsigned long addr; 82 unsigned long addr;
83 83
84 addr = (unsigned long)__init_begin; 84 addr = (unsigned long)__init_begin;
85 for (; addr < (unsigned long)__init_end; addr += PAGE_SIZE) { 85 for (; addr < (unsigned long)__init_end; addr += PAGE_SIZE) {
86 memset((void *)addr, POISON_FREE_INITMEM, PAGE_SIZE); 86 memset((void *)addr, POISON_FREE_INITMEM, PAGE_SIZE);
87 ClearPageReserved(virt_to_page(addr)); 87 ClearPageReserved(virt_to_page(addr));
88 init_page_count(virt_to_page(addr)); 88 init_page_count(virt_to_page(addr));
89 free_page(addr); 89 free_page(addr);
90 totalram_pages++; 90 totalram_pages++;
91 } 91 }
92 printk ("Freeing unused kernel memory: %luk freed\n", 92 printk ("Freeing unused kernel memory: %luk freed\n",
93 ((unsigned long)__init_end - (unsigned long)__init_begin) >> 10); 93 ((unsigned long)__init_end - (unsigned long)__init_begin) >> 10);
94 } 94 }
95 95
96 #ifdef CONFIG_BLK_DEV_INITRD 96 #ifdef CONFIG_BLK_DEV_INITRD
97 void free_initrd_mem(unsigned long start, unsigned long end) 97 void free_initrd_mem(unsigned long start, unsigned long end)
98 { 98 {
99 if (start < end) 99 if (start < end)
100 printk ("Freeing initrd memory: %ldk freed\n", (end - start) >> 10); 100 printk ("Freeing initrd memory: %ldk freed\n", (end - start) >> 10);
101 for (; start < end; start += PAGE_SIZE) { 101 for (; start < end; start += PAGE_SIZE) {
102 ClearPageReserved(virt_to_page(start)); 102 ClearPageReserved(virt_to_page(start));
103 init_page_count(virt_to_page(start)); 103 init_page_count(virt_to_page(start));
104 free_page(start); 104 free_page(start);
105 totalram_pages++; 105 totalram_pages++;
106 } 106 }
107 } 107 }
108 #endif 108 #endif
109 109
110 #ifdef CONFIG_PROC_KCORE 110 #ifdef CONFIG_PROC_KCORE
111 static struct kcore_list kcore_vmem; 111 static struct kcore_list kcore_vmem;
112 112
113 static int __init setup_kcore(void) 113 static int __init setup_kcore(void)
114 { 114 {
115 int i; 115 int i;
116 116
117 for (i=0; i < lmb.memory.cnt; i++) { 117 for (i=0; i < lmb.memory.cnt; i++) {
118 unsigned long base, size; 118 unsigned long base, size;
119 struct kcore_list *kcore_mem; 119 struct kcore_list *kcore_mem;
120 120
121 base = lmb.memory.region[i].base; 121 base = lmb.memory.region[i].base;
122 size = lmb.memory.region[i].size; 122 size = lmb.memory.region[i].size;
123 123
124 /* GFP_ATOMIC to avoid might_sleep warnings during boot */ 124 /* GFP_ATOMIC to avoid might_sleep warnings during boot */
125 kcore_mem = kmalloc(sizeof(struct kcore_list), GFP_ATOMIC); 125 kcore_mem = kmalloc(sizeof(struct kcore_list), GFP_ATOMIC);
126 if (!kcore_mem) 126 if (!kcore_mem)
127 panic("%s: kmalloc failed\n", __func__); 127 panic("%s: kmalloc failed\n", __func__);
128 128
129 kclist_add(kcore_mem, __va(base), size); 129 kclist_add(kcore_mem, __va(base), size);
130 } 130 }
131 131
132 kclist_add(&kcore_vmem, (void *)VMALLOC_START, VMALLOC_END-VMALLOC_START); 132 kclist_add(&kcore_vmem, (void *)VMALLOC_START, VMALLOC_END-VMALLOC_START);
133 133
134 return 0; 134 return 0;
135 } 135 }
136 module_init(setup_kcore); 136 module_init(setup_kcore);
137 #endif 137 #endif
138 138
139 static void zero_ctor(struct kmem_cache *cache, void *addr) 139 static void zero_ctor(struct kmem_cache *cache, void *addr)
140 { 140 {
141 memset(addr, 0, kmem_cache_size(cache)); 141 memset(addr, 0, kmem_cache_size(cache));
142 } 142 }
143 143
144 static const unsigned int pgtable_cache_size[2] = { 144 static const unsigned int pgtable_cache_size[2] = {
145 PGD_TABLE_SIZE, PMD_TABLE_SIZE 145 PGD_TABLE_SIZE, PMD_TABLE_SIZE
146 }; 146 };
147 static const char *pgtable_cache_name[ARRAY_SIZE(pgtable_cache_size)] = { 147 static const char *pgtable_cache_name[ARRAY_SIZE(pgtable_cache_size)] = {
148 #ifdef CONFIG_PPC_64K_PAGES 148 #ifdef CONFIG_PPC_64K_PAGES
149 "pgd_cache", "pmd_cache", 149 "pgd_cache", "pmd_cache",
150 #else 150 #else
151 "pgd_cache", "pud_pmd_cache", 151 "pgd_cache", "pud_pmd_cache",
152 #endif /* CONFIG_PPC_64K_PAGES */ 152 #endif /* CONFIG_PPC_64K_PAGES */
153 }; 153 };
154 154
155 #ifdef CONFIG_HUGETLB_PAGE 155 #ifdef CONFIG_HUGETLB_PAGE
156 /* Hugepages need one extra cache, initialized in hugetlbpage.c. We 156 /* Hugepages need an extra cache per hugepagesize, initialized in
157 * can't put into the tables above, because HPAGE_SHIFT is not compile 157 * hugetlbpage.c. We can't put into the tables above, because HPAGE_SHIFT
158 * time constant. */ 158 * is not compile time constant. */
159 struct kmem_cache *pgtable_cache[ARRAY_SIZE(pgtable_cache_size)+1]; 159 struct kmem_cache *pgtable_cache[ARRAY_SIZE(pgtable_cache_size)+MMU_PAGE_COUNT];
160 #else 160 #else
161 struct kmem_cache *pgtable_cache[ARRAY_SIZE(pgtable_cache_size)]; 161 struct kmem_cache *pgtable_cache[ARRAY_SIZE(pgtable_cache_size)];
162 #endif 162 #endif
163 163
164 void pgtable_cache_init(void) 164 void pgtable_cache_init(void)
165 { 165 {
166 int i; 166 int i;
167 167
168 for (i = 0; i < ARRAY_SIZE(pgtable_cache_size); i++) { 168 for (i = 0; i < ARRAY_SIZE(pgtable_cache_size); i++) {
169 int size = pgtable_cache_size[i]; 169 int size = pgtable_cache_size[i];
170 const char *name = pgtable_cache_name[i]; 170 const char *name = pgtable_cache_name[i];
171 171
172 pr_debug("Allocating page table cache %s (#%d) " 172 pr_debug("Allocating page table cache %s (#%d) "
173 "for size: %08x...\n", name, i, size); 173 "for size: %08x...\n", name, i, size);
174 pgtable_cache[i] = kmem_cache_create(name, 174 pgtable_cache[i] = kmem_cache_create(name,
175 size, size, 175 size, size,
176 SLAB_PANIC, 176 SLAB_PANIC,
177 zero_ctor); 177 zero_ctor);
178 } 178 }
179 } 179 }
180 180
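The array growth above can be sanity-checked with a little arithmetic. The sketch assumes HUGEPTE_CACHE_NUM equals the number of base page-table caches (two) and that MMU_PAGE_COUNT covers the six sizes listed in hugetlbpage.c's cache-name table; both are assumptions of the sketch, not taken from the headers.

#include <assert.h>
#include <stdio.h>

#define NUM_BASE_CACHES   2                 /* ARRAY_SIZE(pgtable_cache_size) */
#define MMU_PAGE_COUNT    6                 /* assumed: six supported page sizes */
#define HUGEPTE_CACHE_NUM NUM_BASE_CACHES   /* assumption made for this sketch */

int main(void)
{
        int array_size = NUM_BASE_CACHES + MMU_PAGE_COUNT;
        int psize;

        /* psize 0 is the base 4K size and never gets a huge cache, so the
         * huge indices HUGEPTE_CACHE_NUM + psize - 1 run from 2 upwards
         * and stay inside the enlarged array. */
        for (psize = 1; psize < MMU_PAGE_COUNT; psize++) {
                int idx = HUGEPTE_CACHE_NUM + psize - 1;
                assert(idx < array_size);
                printf("psize %d -> pgtable_cache[%d]\n", psize, idx);
        }
        return 0;
}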
181 #ifdef CONFIG_SPARSEMEM_VMEMMAP 181 #ifdef CONFIG_SPARSEMEM_VMEMMAP
182 /* 182 /*
183 * Given an address within the vmemmap, determine the pfn of the page that 183 * Given an address within the vmemmap, determine the pfn of the page that
184 * represents the start of the section it is within. Note that we have to 184 * represents the start of the section it is within. Note that we have to
185 * do this by hand as the proffered address may not be correctly aligned. 185 * do this by hand as the proffered address may not be correctly aligned.
186 * Subtraction of non-aligned pointers produces undefined results. 186 * Subtraction of non-aligned pointers produces undefined results.
187 */ 187 */
188 static unsigned long __meminit vmemmap_section_start(unsigned long page) 188 static unsigned long __meminit vmemmap_section_start(unsigned long page)
189 { 189 {
190 unsigned long offset = page - ((unsigned long)(vmemmap)); 190 unsigned long offset = page - ((unsigned long)(vmemmap));
191 191
192 /* Return the pfn of the start of the section. */ 192 /* Return the pfn of the start of the section. */
193 return (offset / sizeof(struct page)) & PAGE_SECTION_MASK; 193 return (offset / sizeof(struct page)) & PAGE_SECTION_MASK;
194 } 194 }
195 195
196 /* 196 /*
197 * Check if this vmemmap page is already initialised. If any section 197 * Check if this vmemmap page is already initialised. If any section
198 * which overlaps this vmemmap page is initialised then this page is 198 * which overlaps this vmemmap page is initialised then this page is
199 * initialised already. 199 * initialised already.
200 */ 200 */
201 static int __meminit vmemmap_populated(unsigned long start, int page_size) 201 static int __meminit vmemmap_populated(unsigned long start, int page_size)
202 { 202 {
203 unsigned long end = start + page_size; 203 unsigned long end = start + page_size;
204 204
205 for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page))) 205 for (; start < end; start += (PAGES_PER_SECTION * sizeof(struct page)))
206 if (pfn_valid(vmemmap_section_start(start))) 206 if (pfn_valid(vmemmap_section_start(start)))
207 return 1; 207 return 1;
208 208
209 return 0; 209 return 0;
210 } 210 }
211 211
212 int __meminit vmemmap_populate(struct page *start_page, 212 int __meminit vmemmap_populate(struct page *start_page,
213 unsigned long nr_pages, int node) 213 unsigned long nr_pages, int node)
214 { 214 {
215 unsigned long mode_rw; 215 unsigned long mode_rw;
216 unsigned long start = (unsigned long)start_page; 216 unsigned long start = (unsigned long)start_page;
217 unsigned long end = (unsigned long)(start_page + nr_pages); 217 unsigned long end = (unsigned long)(start_page + nr_pages);
218 unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift; 218 unsigned long page_size = 1 << mmu_psize_defs[mmu_vmemmap_psize].shift;
219 219
220 mode_rw = _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_COHERENT | PP_RWXX; 220 mode_rw = _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_COHERENT | PP_RWXX;
221 221
222 /* Align to the page size of the linear mapping. */ 222 /* Align to the page size of the linear mapping. */
223 start = _ALIGN_DOWN(start, page_size); 223 start = _ALIGN_DOWN(start, page_size);
224 224
225 for (; start < end; start += page_size) { 225 for (; start < end; start += page_size) {
226 int mapped; 226 int mapped;
227 void *p; 227 void *p;
228 228
229 if (vmemmap_populated(start, page_size)) 229 if (vmemmap_populated(start, page_size))
230 continue; 230 continue;
231 231
232 p = vmemmap_alloc_block(page_size, node); 232 p = vmemmap_alloc_block(page_size, node);
233 if (!p) 233 if (!p)
234 return -ENOMEM; 234 return -ENOMEM;
235 235
236 pr_debug("vmemmap %08lx allocated at %p, physical %08lx.\n", 236 pr_debug("vmemmap %08lx allocated at %p, physical %08lx.\n",
237 start, p, __pa(p)); 237 start, p, __pa(p));
238 238
239 mapped = htab_bolt_mapping(start, start + page_size, 239 mapped = htab_bolt_mapping(start, start + page_size,
240 __pa(p), mode_rw, mmu_vmemmap_psize, 240 __pa(p), mode_rw, mmu_vmemmap_psize,
241 mmu_kernel_ssize); 241 mmu_kernel_ssize);
242 BUG_ON(mapped < 0); 242 BUG_ON(mapped < 0);
243 } 243 }
244 244
245 return 0; 245 return 0;
246 } 246 }
247 #endif /* CONFIG_SPARSEMEM_VMEMMAP */ 247 #endif /* CONFIG_SPARSEMEM_VMEMMAP */
248 248
arch/powerpc/mm/tlb_64.c
1 /* 1 /*
2 * This file contains the routines for flushing entries from the 2 * This file contains the routines for flushing entries from the
3 * TLB and MMU hash table. 3 * TLB and MMU hash table.
4 * 4 *
5 * Derived from arch/ppc64/mm/init.c: 5 * Derived from arch/ppc64/mm/init.c:
6 * Copyright (C) 1995-1996 Gary Thomas (gdt@linuxppc.org) 6 * Copyright (C) 1995-1996 Gary Thomas (gdt@linuxppc.org)
7 * 7 *
8 * Modifications by Paul Mackerras (PowerMac) (paulus@cs.anu.edu.au) 8 * Modifications by Paul Mackerras (PowerMac) (paulus@cs.anu.edu.au)
9 * and Cort Dougan (PReP) (cort@cs.nmt.edu) 9 * and Cort Dougan (PReP) (cort@cs.nmt.edu)
10 * Copyright (C) 1996 Paul Mackerras 10 * Copyright (C) 1996 Paul Mackerras
11 * 11 *
12 * Derived from "arch/i386/mm/init.c" 12 * Derived from "arch/i386/mm/init.c"
13 * Copyright (C) 1991, 1992, 1993, 1994 Linus Torvalds 13 * Copyright (C) 1991, 1992, 1993, 1994 Linus Torvalds
14 * 14 *
15 * Dave Engebretsen <engebret@us.ibm.com> 15 * Dave Engebretsen <engebret@us.ibm.com>
16 * Rework for PPC64 port. 16 * Rework for PPC64 port.
17 * 17 *
18 * This program is free software; you can redistribute it and/or 18 * This program is free software; you can redistribute it and/or
19 * modify it under the terms of the GNU General Public License 19 * modify it under the terms of the GNU General Public License
20 * as published by the Free Software Foundation; either version 20 * as published by the Free Software Foundation; either version
21 * 2 of the License, or (at your option) any later version. 21 * 2 of the License, or (at your option) any later version.
22 */ 22 */
23 23
24 #include <linux/kernel.h> 24 #include <linux/kernel.h>
25 #include <linux/mm.h> 25 #include <linux/mm.h>
26 #include <linux/init.h> 26 #include <linux/init.h>
27 #include <linux/percpu.h> 27 #include <linux/percpu.h>
28 #include <linux/hardirq.h> 28 #include <linux/hardirq.h>
29 #include <asm/pgalloc.h> 29 #include <asm/pgalloc.h>
30 #include <asm/tlbflush.h> 30 #include <asm/tlbflush.h>
31 #include <asm/tlb.h> 31 #include <asm/tlb.h>
32 #include <asm/bug.h> 32 #include <asm/bug.h>
33 33
34 DEFINE_PER_CPU(struct ppc64_tlb_batch, ppc64_tlb_batch); 34 DEFINE_PER_CPU(struct ppc64_tlb_batch, ppc64_tlb_batch);
35 35
36 /* This is declared as we are using the more or less generic 36 /* This is declared as we are using the more or less generic
37 * include/asm-powerpc/tlb.h file -- tgall 37 * include/asm-powerpc/tlb.h file -- tgall
38 */ 38 */
39 DEFINE_PER_CPU(struct mmu_gather, mmu_gathers); 39 DEFINE_PER_CPU(struct mmu_gather, mmu_gathers);
40 static DEFINE_PER_CPU(struct pte_freelist_batch *, pte_freelist_cur); 40 static DEFINE_PER_CPU(struct pte_freelist_batch *, pte_freelist_cur);
41 static unsigned long pte_freelist_forced_free; 41 static unsigned long pte_freelist_forced_free;
42 42
43 struct pte_freelist_batch 43 struct pte_freelist_batch
44 { 44 {
45 struct rcu_head rcu; 45 struct rcu_head rcu;
46 unsigned int index; 46 unsigned int index;
47 pgtable_free_t tables[0]; 47 pgtable_free_t tables[0];
48 }; 48 };
49 49
50 #define PTE_FREELIST_SIZE \ 50 #define PTE_FREELIST_SIZE \
51 ((PAGE_SIZE - sizeof(struct pte_freelist_batch)) \ 51 ((PAGE_SIZE - sizeof(struct pte_freelist_batch)) \
52 / sizeof(pgtable_free_t)) 52 / sizeof(pgtable_free_t))
53 53
54 static void pte_free_smp_sync(void *arg) 54 static void pte_free_smp_sync(void *arg)
55 { 55 {
56 /* Do nothing, just ensure we sync with all CPUs */ 56 /* Do nothing, just ensure we sync with all CPUs */
57 } 57 }
58 58
59 /* This is only called when we are critically out of memory 59 /* This is only called when we are critically out of memory
60 * (and fail to get a page in pte_free_tlb). 60 * (and fail to get a page in pte_free_tlb).
61 */ 61 */
62 static void pgtable_free_now(pgtable_free_t pgf) 62 static void pgtable_free_now(pgtable_free_t pgf)
63 { 63 {
64 pte_freelist_forced_free++; 64 pte_freelist_forced_free++;
65 65
66 smp_call_function(pte_free_smp_sync, NULL, 1); 66 smp_call_function(pte_free_smp_sync, NULL, 1);
67 67
68 pgtable_free(pgf); 68 pgtable_free(pgf);
69 } 69 }
70 70
71 static void pte_free_rcu_callback(struct rcu_head *head) 71 static void pte_free_rcu_callback(struct rcu_head *head)
72 { 72 {
73 struct pte_freelist_batch *batch = 73 struct pte_freelist_batch *batch =
74 container_of(head, struct pte_freelist_batch, rcu); 74 container_of(head, struct pte_freelist_batch, rcu);
75 unsigned int i; 75 unsigned int i;
76 76
77 for (i = 0; i < batch->index; i++) 77 for (i = 0; i < batch->index; i++)
78 pgtable_free(batch->tables[i]); 78 pgtable_free(batch->tables[i]);
79 79
80 free_page((unsigned long)batch); 80 free_page((unsigned long)batch);
81 } 81 }
82 82
83 static void pte_free_submit(struct pte_freelist_batch *batch) 83 static void pte_free_submit(struct pte_freelist_batch *batch)
84 { 84 {
85 INIT_RCU_HEAD(&batch->rcu); 85 INIT_RCU_HEAD(&batch->rcu);
86 call_rcu(&batch->rcu, pte_free_rcu_callback); 86 call_rcu(&batch->rcu, pte_free_rcu_callback);
87 } 87 }
88 88
89 void pgtable_free_tlb(struct mmu_gather *tlb, pgtable_free_t pgf) 89 void pgtable_free_tlb(struct mmu_gather *tlb, pgtable_free_t pgf)
90 { 90 {
91 /* This is safe since tlb_gather_mmu has disabled preemption */ 91 /* This is safe since tlb_gather_mmu has disabled preemption */
92 cpumask_t local_cpumask = cpumask_of_cpu(smp_processor_id()); 92 cpumask_t local_cpumask = cpumask_of_cpu(smp_processor_id());
93 struct pte_freelist_batch **batchp = &__get_cpu_var(pte_freelist_cur); 93 struct pte_freelist_batch **batchp = &__get_cpu_var(pte_freelist_cur);
94 94
95 if (atomic_read(&tlb->mm->mm_users) < 2 || 95 if (atomic_read(&tlb->mm->mm_users) < 2 ||
96 cpus_equal(tlb->mm->cpu_vm_mask, local_cpumask)) { 96 cpus_equal(tlb->mm->cpu_vm_mask, local_cpumask)) {
97 pgtable_free(pgf); 97 pgtable_free(pgf);
98 return; 98 return;
99 } 99 }
100 100
101 if (*batchp == NULL) { 101 if (*batchp == NULL) {
102 *batchp = (struct pte_freelist_batch *)__get_free_page(GFP_ATOMIC); 102 *batchp = (struct pte_freelist_batch *)__get_free_page(GFP_ATOMIC);
103 if (*batchp == NULL) { 103 if (*batchp == NULL) {
104 pgtable_free_now(pgf); 104 pgtable_free_now(pgf);
105 return; 105 return;
106 } 106 }
107 (*batchp)->index = 0; 107 (*batchp)->index = 0;
108 } 108 }
109 (*batchp)->tables[(*batchp)->index++] = pgf; 109 (*batchp)->tables[(*batchp)->index++] = pgf;
110 if ((*batchp)->index == PTE_FREELIST_SIZE) { 110 if ((*batchp)->index == PTE_FREELIST_SIZE) {
111 pte_free_submit(*batchp); 111 pte_free_submit(*batchp);
112 *batchp = NULL; 112 *batchp = NULL;
113 } 113 }
114 } 114 }
115 115
116 /* 116 /*
117 * A linux PTE was changed and the corresponding hash table entry 117 * A linux PTE was changed and the corresponding hash table entry
118 * needs to be flushed. This function will either perform the flush 118 * needs to be flushed. This function will either perform the flush
119 * immediately or will batch it up if the current CPU has an active 119 * immediately or will batch it up if the current CPU has an active
120 * batch on it. 120 * batch on it.
121 * 121 *
122 * Must be called from within some kind of spinlock/non-preempt region... 122 * Must be called from within some kind of spinlock/non-preempt region...
123 */ 123 */
124 void hpte_need_flush(struct mm_struct *mm, unsigned long addr, 124 void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
125 pte_t *ptep, unsigned long pte, int huge) 125 pte_t *ptep, unsigned long pte, int huge)
126 { 126 {
127 struct ppc64_tlb_batch *batch = &__get_cpu_var(ppc64_tlb_batch); 127 struct ppc64_tlb_batch *batch = &__get_cpu_var(ppc64_tlb_batch);
128 unsigned long vsid, vaddr; 128 unsigned long vsid, vaddr;
129 unsigned int psize; 129 unsigned int psize;
130 int ssize; 130 int ssize;
131 real_pte_t rpte; 131 real_pte_t rpte;
132 int i; 132 int i;
133 133
134 i = batch->index; 134 i = batch->index;
135 135
136 /* We mask the address for the base page size. Huge pages will 136 /* We mask the address for the base page size. Huge pages will
137 * have applied their own masking already 137 * have applied their own masking already
138 */ 138 */
139 addr &= PAGE_MASK; 139 addr &= PAGE_MASK;
140 140
141 /* Get page size (maybe move back to caller). 141 /* Get page size (maybe move back to caller).
142 * 142 *
143 * NOTE: when using special 64K mappings in 4K environment like 143 * NOTE: when using special 64K mappings in 4K environment like
144 * for SPEs, we obtain the page size from the slice, which thus 144 * for SPEs, we obtain the page size from the slice, which thus
145 * must still exist (and thus the VMA not reused) at the time 145 * must still exist (and thus the VMA not reused) at the time
146 * of this call 146 * of this call
147 */ 147 */
148 if (huge) { 148 if (huge) {
149 #ifdef CONFIG_HUGETLB_PAGE 149 #ifdef CONFIG_HUGETLB_PAGE
150 psize = mmu_huge_psize; 150 psize = get_slice_psize(mm, addr);
151 #else 151 #else
152 BUG(); 152 BUG();
153 psize = pte_pagesize_index(mm, addr, pte); /* shutup gcc */ 153 psize = pte_pagesize_index(mm, addr, pte); /* shutup gcc */
154 #endif 154 #endif
155 } else 155 } else
156 psize = pte_pagesize_index(mm, addr, pte); 156 psize = pte_pagesize_index(mm, addr, pte);
157 157
158 /* Build full vaddr */ 158 /* Build full vaddr */
159 if (!is_kernel_addr(addr)) { 159 if (!is_kernel_addr(addr)) {
160 ssize = user_segment_size(addr); 160 ssize = user_segment_size(addr);
161 vsid = get_vsid(mm->context.id, addr, ssize); 161 vsid = get_vsid(mm->context.id, addr, ssize);
162 WARN_ON(vsid == 0); 162 WARN_ON(vsid == 0);
163 } else { 163 } else {
164 vsid = get_kernel_vsid(addr, mmu_kernel_ssize); 164 vsid = get_kernel_vsid(addr, mmu_kernel_ssize);
165 ssize = mmu_kernel_ssize; 165 ssize = mmu_kernel_ssize;
166 } 166 }
167 vaddr = hpt_va(addr, vsid, ssize); 167 vaddr = hpt_va(addr, vsid, ssize);
168 rpte = __real_pte(__pte(pte), ptep); 168 rpte = __real_pte(__pte(pte), ptep);
169 169
170 /* 170 /*
171 * Check if we have an active batch on this CPU. If not, just 171 * Check if we have an active batch on this CPU. If not, just
172 * flush now and return. For now, we do global invalidates 172 * flush now and return. For now, we do global invalidates
173 * in that case, might be worth testing the mm cpu mask though 173 * in that case, might be worth testing the mm cpu mask though
174 * and decide to use local invalidates instead... 174 * and decide to use local invalidates instead...
175 */ 175 */
176 if (!batch->active) { 176 if (!batch->active) {
177 flush_hash_page(vaddr, rpte, psize, ssize, 0); 177 flush_hash_page(vaddr, rpte, psize, ssize, 0);
178 return; 178 return;
179 } 179 }
180 180
181 /* 181 /*
182 * This can happen when we are in the middle of a TLB batch and 182 * This can happen when we are in the middle of a TLB batch and
183 * we encounter memory pressure (eg copy_page_range when it tries 183 * we encounter memory pressure (eg copy_page_range when it tries
184 * to allocate a new pte). If we have to reclaim memory and end 184 * to allocate a new pte). If we have to reclaim memory and end
185 * up scanning and resetting referenced bits then our batch context 185 * up scanning and resetting referenced bits then our batch context
186 * will change mid stream. 186 * will change mid stream.
187 * 187 *
188 * We also need to ensure only one page size is present in a given 188 * We also need to ensure only one page size is present in a given
189 * batch 189 * batch
190 */ 190 */
191 if (i != 0 && (mm != batch->mm || batch->psize != psize || 191 if (i != 0 && (mm != batch->mm || batch->psize != psize ||
192 batch->ssize != ssize)) { 192 batch->ssize != ssize)) {
193 __flush_tlb_pending(batch); 193 __flush_tlb_pending(batch);
194 i = 0; 194 i = 0;
195 } 195 }
196 if (i == 0) { 196 if (i == 0) {
197 batch->mm = mm; 197 batch->mm = mm;
198 batch->psize = psize; 198 batch->psize = psize;
199 batch->ssize = ssize; 199 batch->ssize = ssize;
200 } 200 }
201 batch->pte[i] = rpte; 201 batch->pte[i] = rpte;
202 batch->vaddr[i] = vaddr; 202 batch->vaddr[i] = vaddr;
203 batch->index = ++i; 203 batch->index = ++i;
204 if (i >= PPC64_TLB_BATCH_NR) 204 if (i >= PPC64_TLB_BATCH_NR)
205 __flush_tlb_pending(batch); 205 __flush_tlb_pending(batch);
206 } 206 }
207 207
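A minimal sketch of the expected calling pattern (hypothetical caller; the real call site is the pte_update() path in include/asm-powerpc/pgtable-ppc64.h, where the PTE store is done atomically rather than with a plain assignment):

	/* Illustration only -- must run with preemption disabled, e.g. under
	 * the PTE lock, as required by the comment above. */
	static inline void example_clear_and_flush(struct mm_struct *mm,
						   unsigned long addr,
						   pte_t *ptep, int huge)
	{
		unsigned long old = pte_val(*ptep);

		*ptep = __pte(0);		/* update the linux PTE (sketch)  */
		if (old & _PAGE_HASHPTE)	/* only hashed PTEs need a flush  */
			hpte_need_flush(mm, addr, ptep, old, huge);
	}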
208 /* 208 /*
209 * This function is called when terminating an mmu batch or when a batch 209 * This function is called when terminating an mmu batch or when a batch
210 * is full. It will perform the flush of all the entries currently stored 210 * is full. It will perform the flush of all the entries currently stored
211 * in a batch. 211 * in a batch.
212 * 212 *
213 * Must be called from within some kind of spinlock/non-preempt region... 213 * Must be called from within some kind of spinlock/non-preempt region...
214 */ 214 */
215 void __flush_tlb_pending(struct ppc64_tlb_batch *batch) 215 void __flush_tlb_pending(struct ppc64_tlb_batch *batch)
216 { 216 {
217 cpumask_t tmp; 217 cpumask_t tmp;
218 int i, local = 0; 218 int i, local = 0;
219 219
220 i = batch->index; 220 i = batch->index;
221 tmp = cpumask_of_cpu(smp_processor_id()); 221 tmp = cpumask_of_cpu(smp_processor_id());
222 if (cpus_equal(batch->mm->cpu_vm_mask, tmp)) 222 if (cpus_equal(batch->mm->cpu_vm_mask, tmp))
223 local = 1; 223 local = 1;
224 if (i == 1) 224 if (i == 1)
225 flush_hash_page(batch->vaddr[0], batch->pte[0], 225 flush_hash_page(batch->vaddr[0], batch->pte[0],
226 batch->psize, batch->ssize, local); 226 batch->psize, batch->ssize, local);
227 else 227 else
228 flush_hash_range(i, local); 228 flush_hash_range(i, local);
229 batch->index = 0; 229 batch->index = 0;
230 } 230 }
231 231
232 void pte_free_finish(void) 232 void pte_free_finish(void)
233 { 233 {
234 /* This is safe since tlb_gather_mmu has disabled preemption */ 234 /* This is safe since tlb_gather_mmu has disabled preemption */
235 struct pte_freelist_batch **batchp = &__get_cpu_var(pte_freelist_cur); 235 struct pte_freelist_batch **batchp = &__get_cpu_var(pte_freelist_cur);
236 236
237 if (*batchp == NULL) 237 if (*batchp == NULL)
238 return; 238 return;
239 pte_free_submit(*batchp); 239 pte_free_submit(*batchp);
240 *batchp = NULL; 240 *batchp = NULL;
241 } 241 }
242 242
243 /** 243 /**
244 * __flush_hash_table_range - Flush all HPTEs for a given address range 244 * __flush_hash_table_range - Flush all HPTEs for a given address range
245 * from the hash table (and the TLB). But keeps 245 * from the hash table (and the TLB). But keeps
246 * the linux PTEs intact. 246 * the linux PTEs intact.
247 * 247 *
248 * @mm : mm_struct of the target address space (generally init_mm) 248 * @mm : mm_struct of the target address space (generally init_mm)
249 * @start : starting address 249 * @start : starting address
250 * @end : ending address (not included in the flush) 250 * @end : ending address (not included in the flush)
251 * 251 *
252 * This function is mostly to be used by some IO hotplug code in order 252 * This function is mostly to be used by some IO hotplug code in order
253 * to remove all hash entries from a given address range used to map IO 253 * to remove all hash entries from a given address range used to map IO
254 * space on a removed PCI-PCI bridge without tearing down the full mapping 254 * space on a removed PCI-PCI bridge without tearing down the full mapping
255 * since 64K pages may overlap with other bridges when using 64K pages 255 * since 64K pages may overlap with other bridges when using 64K pages
256 * with 4K HW pages on IO space. 256 * with 4K HW pages on IO space.
257 * 257 *
258 * Because of that usage pattern, it's only available with CONFIG_HOTPLUG 258 * Because of that usage pattern, it's only available with CONFIG_HOTPLUG
259 * and is implemented for small size rather than speed. 259 * and is implemented for small size rather than speed.
260 */ 260 */
261 #ifdef CONFIG_HOTPLUG 261 #ifdef CONFIG_HOTPLUG
262 262
263 void __flush_hash_table_range(struct mm_struct *mm, unsigned long start, 263 void __flush_hash_table_range(struct mm_struct *mm, unsigned long start,
264 unsigned long end) 264 unsigned long end)
265 { 265 {
266 unsigned long flags; 266 unsigned long flags;
267 267
268 start = _ALIGN_DOWN(start, PAGE_SIZE); 268 start = _ALIGN_DOWN(start, PAGE_SIZE);
269 end = _ALIGN_UP(end, PAGE_SIZE); 269 end = _ALIGN_UP(end, PAGE_SIZE);
270 270
271 BUG_ON(!mm->pgd); 271 BUG_ON(!mm->pgd);
272 272
273 /* Note: Normally, we should only ever use a batch within a 273 /* Note: Normally, we should only ever use a batch within a
274 * PTE locked section. This violates the rule, but will work 274 * PTE locked section. This violates the rule, but will work
275 * since we don't actually modify the PTEs, we just flush the 275 * since we don't actually modify the PTEs, we just flush the
276 * hash while leaving the PTEs intact (including their reference 276 * hash while leaving the PTEs intact (including their reference
277 * to being hashed). This is not the most performance oriented 277 * to being hashed). This is not the most performance oriented
278 * way to do things but is fine for our needs here. 278 * way to do things but is fine for our needs here.
279 */ 279 */
280 local_irq_save(flags); 280 local_irq_save(flags);
281 arch_enter_lazy_mmu_mode(); 281 arch_enter_lazy_mmu_mode();
282 for (; start < end; start += PAGE_SIZE) { 282 for (; start < end; start += PAGE_SIZE) {
283 pte_t *ptep = find_linux_pte(mm->pgd, start); 283 pte_t *ptep = find_linux_pte(mm->pgd, start);
284 unsigned long pte; 284 unsigned long pte;
285 285
286 if (ptep == NULL) 286 if (ptep == NULL)
287 continue; 287 continue;
288 pte = pte_val(*ptep); 288 pte = pte_val(*ptep);
289 if (!(pte & _PAGE_HASHPTE)) 289 if (!(pte & _PAGE_HASHPTE))
290 continue; 290 continue;
291 hpte_need_flush(mm, start, ptep, pte, 0); 291 hpte_need_flush(mm, start, ptep, pte, 0);
292 } 292 }
293 arch_leave_lazy_mmu_mode(); 293 arch_leave_lazy_mmu_mode();
294 local_irq_restore(flags); 294 local_irq_restore(flags);
295 } 295 }
296 296
297 #endif /* CONFIG_HOTPLUG */ 297 #endif /* CONFIG_HOTPLUG */
298 298
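A hedged usage sketch of the helper above (the bridge window address and size are hypothetical; the point is only that the hash entries are dropped while the Linux PTEs stay in place):

	#ifdef CONFIG_HOTPLUG
	/* Illustration only: flush the HPTEs covering an ioremap'ed window of
	 * a PCI-PCI bridge that is being removed. */
	static void example_flush_bridge_window(unsigned long io_start,
						unsigned long io_size)
	{
		__flush_hash_table_range(&init_mm, io_start, io_start + io_size);
	}
	#endif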
include/asm-powerpc/hugetlb.h
1 #ifndef _ASM_POWERPC_HUGETLB_H 1 #ifndef _ASM_POWERPC_HUGETLB_H
2 #define _ASM_POWERPC_HUGETLB_H 2 #define _ASM_POWERPC_HUGETLB_H
3 3
4 #include <asm/page.h> 4 #include <asm/page.h>
5 5
6 6
7 int is_hugepage_only_range(struct mm_struct *mm, unsigned long addr, 7 int is_hugepage_only_range(struct mm_struct *mm, unsigned long addr,
8 unsigned long len); 8 unsigned long len);
9 9
10 void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned long addr, 10 void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
11 unsigned long end, unsigned long floor, 11 unsigned long end, unsigned long floor,
12 unsigned long ceiling); 12 unsigned long ceiling);
13 13
14 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, 14 void set_huge_pte_at(struct mm_struct *mm, unsigned long addr,
15 pte_t *ptep, pte_t pte); 15 pte_t *ptep, pte_t pte);
16 16
17 pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, 17 pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
18 pte_t *ptep); 18 pte_t *ptep);
19 19
20 /* 20 /*
21 * If the arch doesn't supply something else, assume that hugepage 21 * If the arch doesn't supply something else, assume that hugepage
22 * size aligned regions are ok without further preparation. 22 * size aligned regions are ok without further preparation.
23 */ 23 */
24 static inline int prepare_hugepage_range(struct file *file, 24 static inline int prepare_hugepage_range(struct file *file,
25 unsigned long addr, unsigned long len) 25 unsigned long addr, unsigned long len)
26 { 26 {
27 if (len & ~HPAGE_MASK) 27 struct hstate *h = hstate_file(file);
28 if (len & ~huge_page_mask(h))
28 return -EINVAL; 29 return -EINVAL;
29 if (addr & ~HPAGE_MASK) 30 if (addr & ~huge_page_mask(h))
30 return -EINVAL; 31 return -EINVAL;
31 return 0; 32 return 0;
32 } 33 }
33 34
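Worked out for a hypothetical 16M hstate (huge_page_shift(h) == 24; hstate_file() and huge_page_mask() come from the generic hugetlb code):

	/* huge_page_mask(h) == ~((1UL << 24) - 1) == 0xffffffffff000000UL, so:
	 *
	 *   prepare_hugepage_range(file, 0x10000000, 0x02000000)  -> 0
	 *                               (both addr and len are 16M aligned)
	 *   prepare_hugepage_range(file, 0x10001000, 0x02000000)  -> -EINVAL
	 *                               (addr is not 16M aligned)
	 */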
34 static inline void hugetlb_prefault_arch_hook(struct mm_struct *mm) 35 static inline void hugetlb_prefault_arch_hook(struct mm_struct *mm)
35 { 36 {
36 } 37 }
37 38
38 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma, 39 static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
39 unsigned long addr, pte_t *ptep) 40 unsigned long addr, pte_t *ptep)
40 { 41 {
41 } 42 }
42 43
43 static inline int huge_pte_none(pte_t pte) 44 static inline int huge_pte_none(pte_t pte)
44 { 45 {
45 return pte_none(pte); 46 return pte_none(pte);
46 } 47 }
47 48
48 static inline pte_t huge_pte_wrprotect(pte_t pte) 49 static inline pte_t huge_pte_wrprotect(pte_t pte)
49 { 50 {
50 return pte_wrprotect(pte); 51 return pte_wrprotect(pte);
51 } 52 }
52 53
53 static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma, 54 static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
54 unsigned long addr, pte_t *ptep, 55 unsigned long addr, pte_t *ptep,
55 pte_t pte, int dirty) 56 pte_t pte, int dirty)
56 { 57 {
57 return ptep_set_access_flags(vma, addr, ptep, pte, dirty); 58 return ptep_set_access_flags(vma, addr, ptep, pte, dirty);
58 } 59 }
59 60
60 static inline pte_t huge_ptep_get(pte_t *ptep) 61 static inline pte_t huge_ptep_get(pte_t *ptep)
61 { 62 {
62 return *ptep; 63 return *ptep;
63 } 64 }
64 65
65 static inline int arch_prepare_hugepage(struct page *page) 66 static inline int arch_prepare_hugepage(struct page *page)
66 { 67 {
67 return 0; 68 return 0;
68 } 69 }
69 70
70 static inline void arch_release_hugepage(struct page *page) 71 static inline void arch_release_hugepage(struct page *page)
71 { 72 {
72 } 73 }
73 74
74 #endif /* _ASM_POWERPC_HUGETLB_H */ 75 #endif /* _ASM_POWERPC_HUGETLB_H */
75 76
include/asm-powerpc/mmu-hash64.h
1 #ifndef _ASM_POWERPC_MMU_HASH64_H_ 1 #ifndef _ASM_POWERPC_MMU_HASH64_H_
2 #define _ASM_POWERPC_MMU_HASH64_H_ 2 #define _ASM_POWERPC_MMU_HASH64_H_
3 /* 3 /*
4 * PowerPC64 memory management structures 4 * PowerPC64 memory management structures
5 * 5 *
6 * Dave Engebretsen & Mike Corrigan <{engebret|mikejc}@us.ibm.com> 6 * Dave Engebretsen & Mike Corrigan <{engebret|mikejc}@us.ibm.com>
7 * PPC64 rework. 7 * PPC64 rework.
8 * 8 *
9 * This program is free software; you can redistribute it and/or 9 * This program is free software; you can redistribute it and/or
10 * modify it under the terms of the GNU General Public License 10 * modify it under the terms of the GNU General Public License
11 * as published by the Free Software Foundation; either version 11 * as published by the Free Software Foundation; either version
12 * 2 of the License, or (at your option) any later version. 12 * 2 of the License, or (at your option) any later version.
13 */ 13 */
14 14
15 #include <asm/asm-compat.h> 15 #include <asm/asm-compat.h>
16 #include <asm/page.h> 16 #include <asm/page.h>
17 17
18 /* 18 /*
19 * Segment table 19 * Segment table
20 */ 20 */
21 21
22 #define STE_ESID_V 0x80 22 #define STE_ESID_V 0x80
23 #define STE_ESID_KS 0x20 23 #define STE_ESID_KS 0x20
24 #define STE_ESID_KP 0x10 24 #define STE_ESID_KP 0x10
25 #define STE_ESID_N 0x08 25 #define STE_ESID_N 0x08
26 26
27 #define STE_VSID_SHIFT 12 27 #define STE_VSID_SHIFT 12
28 28
29 /* Location of cpu0's segment table */ 29 /* Location of cpu0's segment table */
30 #define STAB0_PAGE 0x6 30 #define STAB0_PAGE 0x6
31 #define STAB0_OFFSET (STAB0_PAGE << 12) 31 #define STAB0_OFFSET (STAB0_PAGE << 12)
32 #define STAB0_PHYS_ADDR (STAB0_OFFSET + PHYSICAL_START) 32 #define STAB0_PHYS_ADDR (STAB0_OFFSET + PHYSICAL_START)
33 33
34 #ifndef __ASSEMBLY__ 34 #ifndef __ASSEMBLY__
35 extern char initial_stab[]; 35 extern char initial_stab[];
36 #endif /* ! __ASSEMBLY */ 36 #endif /* ! __ASSEMBLY */
37 37
38 /* 38 /*
39 * SLB 39 * SLB
40 */ 40 */
41 41
42 #define SLB_NUM_BOLTED 3 42 #define SLB_NUM_BOLTED 3
43 #define SLB_CACHE_ENTRIES 8 43 #define SLB_CACHE_ENTRIES 8
44 44
45 /* Bits in the SLB ESID word */ 45 /* Bits in the SLB ESID word */
46 #define SLB_ESID_V ASM_CONST(0x0000000008000000) /* valid */ 46 #define SLB_ESID_V ASM_CONST(0x0000000008000000) /* valid */
47 47
48 /* Bits in the SLB VSID word */ 48 /* Bits in the SLB VSID word */
49 #define SLB_VSID_SHIFT 12 49 #define SLB_VSID_SHIFT 12
50 #define SLB_VSID_SHIFT_1T 24 50 #define SLB_VSID_SHIFT_1T 24
51 #define SLB_VSID_SSIZE_SHIFT 62 51 #define SLB_VSID_SSIZE_SHIFT 62
52 #define SLB_VSID_B ASM_CONST(0xc000000000000000) 52 #define SLB_VSID_B ASM_CONST(0xc000000000000000)
53 #define SLB_VSID_B_256M ASM_CONST(0x0000000000000000) 53 #define SLB_VSID_B_256M ASM_CONST(0x0000000000000000)
54 #define SLB_VSID_B_1T ASM_CONST(0x4000000000000000) 54 #define SLB_VSID_B_1T ASM_CONST(0x4000000000000000)
55 #define SLB_VSID_KS ASM_CONST(0x0000000000000800) 55 #define SLB_VSID_KS ASM_CONST(0x0000000000000800)
56 #define SLB_VSID_KP ASM_CONST(0x0000000000000400) 56 #define SLB_VSID_KP ASM_CONST(0x0000000000000400)
57 #define SLB_VSID_N ASM_CONST(0x0000000000000200) /* no-execute */ 57 #define SLB_VSID_N ASM_CONST(0x0000000000000200) /* no-execute */
58 #define SLB_VSID_L ASM_CONST(0x0000000000000100) 58 #define SLB_VSID_L ASM_CONST(0x0000000000000100)
59 #define SLB_VSID_C ASM_CONST(0x0000000000000080) /* class */ 59 #define SLB_VSID_C ASM_CONST(0x0000000000000080) /* class */
60 #define SLB_VSID_LP ASM_CONST(0x0000000000000030) 60 #define SLB_VSID_LP ASM_CONST(0x0000000000000030)
61 #define SLB_VSID_LP_00 ASM_CONST(0x0000000000000000) 61 #define SLB_VSID_LP_00 ASM_CONST(0x0000000000000000)
62 #define SLB_VSID_LP_01 ASM_CONST(0x0000000000000010) 62 #define SLB_VSID_LP_01 ASM_CONST(0x0000000000000010)
63 #define SLB_VSID_LP_10 ASM_CONST(0x0000000000000020) 63 #define SLB_VSID_LP_10 ASM_CONST(0x0000000000000020)
64 #define SLB_VSID_LP_11 ASM_CONST(0x0000000000000030) 64 #define SLB_VSID_LP_11 ASM_CONST(0x0000000000000030)
65 #define SLB_VSID_LLP (SLB_VSID_L|SLB_VSID_LP) 65 #define SLB_VSID_LLP (SLB_VSID_L|SLB_VSID_LP)
66 66
67 #define SLB_VSID_KERNEL (SLB_VSID_KP) 67 #define SLB_VSID_KERNEL (SLB_VSID_KP)
68 #define SLB_VSID_USER (SLB_VSID_KP|SLB_VSID_KS|SLB_VSID_C) 68 #define SLB_VSID_USER (SLB_VSID_KP|SLB_VSID_KS|SLB_VSID_C)
69 69
70 #define SLBIE_C (0x08000000) 70 #define SLBIE_C (0x08000000)
71 #define SLBIE_SSIZE_SHIFT 25 71 #define SLBIE_SSIZE_SHIFT 25
72 72
73 /* 73 /*
74 * Hash table 74 * Hash table
75 */ 75 */
76 76
77 #define HPTES_PER_GROUP 8 77 #define HPTES_PER_GROUP 8
78 78
79 #define HPTE_V_SSIZE_SHIFT 62 79 #define HPTE_V_SSIZE_SHIFT 62
80 #define HPTE_V_AVPN_SHIFT 7 80 #define HPTE_V_AVPN_SHIFT 7
81 #define HPTE_V_AVPN ASM_CONST(0x3fffffffffffff80) 81 #define HPTE_V_AVPN ASM_CONST(0x3fffffffffffff80)
82 #define HPTE_V_AVPN_VAL(x) (((x) & HPTE_V_AVPN) >> HPTE_V_AVPN_SHIFT) 82 #define HPTE_V_AVPN_VAL(x) (((x) & HPTE_V_AVPN) >> HPTE_V_AVPN_SHIFT)
83 #define HPTE_V_COMPARE(x,y) (!(((x) ^ (y)) & 0xffffffffffffff80UL)) 83 #define HPTE_V_COMPARE(x,y) (!(((x) ^ (y)) & 0xffffffffffffff80UL))
84 #define HPTE_V_BOLTED ASM_CONST(0x0000000000000010) 84 #define HPTE_V_BOLTED ASM_CONST(0x0000000000000010)
85 #define HPTE_V_LOCK ASM_CONST(0x0000000000000008) 85 #define HPTE_V_LOCK ASM_CONST(0x0000000000000008)
86 #define HPTE_V_LARGE ASM_CONST(0x0000000000000004) 86 #define HPTE_V_LARGE ASM_CONST(0x0000000000000004)
87 #define HPTE_V_SECONDARY ASM_CONST(0x0000000000000002) 87 #define HPTE_V_SECONDARY ASM_CONST(0x0000000000000002)
88 #define HPTE_V_VALID ASM_CONST(0x0000000000000001) 88 #define HPTE_V_VALID ASM_CONST(0x0000000000000001)
89 89
90 #define HPTE_R_PP0 ASM_CONST(0x8000000000000000) 90 #define HPTE_R_PP0 ASM_CONST(0x8000000000000000)
91 #define HPTE_R_TS ASM_CONST(0x4000000000000000) 91 #define HPTE_R_TS ASM_CONST(0x4000000000000000)
92 #define HPTE_R_RPN_SHIFT 12 92 #define HPTE_R_RPN_SHIFT 12
93 #define HPTE_R_RPN ASM_CONST(0x3ffffffffffff000) 93 #define HPTE_R_RPN ASM_CONST(0x3ffffffffffff000)
94 #define HPTE_R_FLAGS ASM_CONST(0x00000000000003ff) 94 #define HPTE_R_FLAGS ASM_CONST(0x00000000000003ff)
95 #define HPTE_R_PP ASM_CONST(0x0000000000000003) 95 #define HPTE_R_PP ASM_CONST(0x0000000000000003)
96 #define HPTE_R_N ASM_CONST(0x0000000000000004) 96 #define HPTE_R_N ASM_CONST(0x0000000000000004)
97 #define HPTE_R_C ASM_CONST(0x0000000000000080) 97 #define HPTE_R_C ASM_CONST(0x0000000000000080)
98 #define HPTE_R_R ASM_CONST(0x0000000000000100) 98 #define HPTE_R_R ASM_CONST(0x0000000000000100)
99 99
100 #define HPTE_V_1TB_SEG ASM_CONST(0x4000000000000000) 100 #define HPTE_V_1TB_SEG ASM_CONST(0x4000000000000000)
101 #define HPTE_V_VRMA_MASK ASM_CONST(0x4001ffffff000000) 101 #define HPTE_V_VRMA_MASK ASM_CONST(0x4001ffffff000000)
102 102
103 /* Values for PP (assumes Ks=0, Kp=1) */ 103 /* Values for PP (assumes Ks=0, Kp=1) */
104 /* pp0 will always be 0 for linux */ 104 /* pp0 will always be 0 for linux */
105 #define PP_RWXX 0 /* Supervisor read/write, User none */ 105 #define PP_RWXX 0 /* Supervisor read/write, User none */
106 #define PP_RWRX 1 /* Supervisor read/write, User read */ 106 #define PP_RWRX 1 /* Supervisor read/write, User read */
107 #define PP_RWRW 2 /* Supervisor read/write, User read/write */ 107 #define PP_RWRW 2 /* Supervisor read/write, User read/write */
108 #define PP_RXRX 3 /* Supervisor read, User read */ 108 #define PP_RXRX 3 /* Supervisor read, User read */
109 109
110 #ifndef __ASSEMBLY__ 110 #ifndef __ASSEMBLY__
111 111
112 struct hash_pte { 112 struct hash_pte {
113 unsigned long v; 113 unsigned long v;
114 unsigned long r; 114 unsigned long r;
115 }; 115 };
116 116
117 extern struct hash_pte *htab_address; 117 extern struct hash_pte *htab_address;
118 extern unsigned long htab_size_bytes; 118 extern unsigned long htab_size_bytes;
119 extern unsigned long htab_hash_mask; 119 extern unsigned long htab_hash_mask;
120 120
121 /* 121 /*
122 * Page size definition 122 * Page size definition
123 * 123 *
124 * shift : is the "PAGE_SHIFT" value for that page size 124 * shift : is the "PAGE_SHIFT" value for that page size
125 * sllp : is a bit mask with the value of SLB L || LP to be or'ed 125 * sllp : is a bit mask with the value of SLB L || LP to be or'ed
126 * directly to a slbmte "vsid" value 126 * directly to a slbmte "vsid" value
127 * penc : is the HPTE encoding mask for the "LP" field: 127 * penc : is the HPTE encoding mask for the "LP" field:
128 * 128 *
129 */ 129 */
130 struct mmu_psize_def 130 struct mmu_psize_def
131 { 131 {
132 unsigned int shift; /* number of bits */ 132 unsigned int shift; /* number of bits */
133 unsigned int penc; /* HPTE encoding */ 133 unsigned int penc; /* HPTE encoding */
134 unsigned int tlbiel; /* tlbiel supported for that page size */ 134 unsigned int tlbiel; /* tlbiel supported for that page size */
135 unsigned long avpnm; /* bits to mask out in AVPN in the HPTE */ 135 unsigned long avpnm; /* bits to mask out in AVPN in the HPTE */
136 unsigned long sllp; /* SLB L||LP (exact mask to use in slbmte) */ 136 unsigned long sllp; /* SLB L||LP (exact mask to use in slbmte) */
137 }; 137 };
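For orientation, a plausible 16M entry (a sketch only, in the spirit of the mmu_psize_defaults_gp[] table in hash_utils_64.c; the real array is filled in from the firmware/device tree at boot):

	static const struct mmu_psize_def example_16M_def = {
		.shift	= 24,		/* 2^24 bytes == 16M pages            */
		.penc	= 0,		/* LP encoding placed in the HPTE     */
		.tlbiel	= 0,		/* tlbiel not assumed for this size   */
		.avpnm	= 0x1UL,	/* AVPN bits masked out for 16M       */
		.sllp	= SLB_VSID_L,	/* L set, LP = 0 in the SLB entry     */
	};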
138 138
139 #endif /* __ASSEMBLY__ */ 139 #endif /* __ASSEMBLY__ */
140 140
141 /* 141 /*
142 * The kernel uses the constants below to index into the page sizes array. 142 * The kernel uses the constants below to index into the page sizes array.
143 * The use of fixed constants for this purpose is better for performance 143 * The use of fixed constants for this purpose is better for performance
144 * of the low level hash refill handlers. 144 * of the low level hash refill handlers.
145 * 145 *
146 * A non supported page size has a "shift" field set to 0 146 * A non supported page size has a "shift" field set to 0
147 * 147 *
148 * Any new page size being implemented can get a new entry in here. Whether 148 * Any new page size being implemented can get a new entry in here. Whether
149 * the kernel will use it or not is a different matter though. The actual page 149 * the kernel will use it or not is a different matter though. The actual page
150 * size used by hugetlbfs is not defined here and may be made variable 150 * size used by hugetlbfs is not defined here and may be made variable
151 */ 151 */
152 152
153 #define MMU_PAGE_4K 0 /* 4K */ 153 #define MMU_PAGE_4K 0 /* 4K */
154 #define MMU_PAGE_64K 1 /* 64K */ 154 #define MMU_PAGE_64K 1 /* 64K */
155 #define MMU_PAGE_64K_AP 2 /* 64K Admixed (in a 4K segment) */ 155 #define MMU_PAGE_64K_AP 2 /* 64K Admixed (in a 4K segment) */
156 #define MMU_PAGE_1M 3 /* 1M */ 156 #define MMU_PAGE_1M 3 /* 1M */
157 #define MMU_PAGE_16M 4 /* 16M */ 157 #define MMU_PAGE_16M 4 /* 16M */
158 #define MMU_PAGE_16G 5 /* 16G */ 158 #define MMU_PAGE_16G 5 /* 16G */
159 #define MMU_PAGE_COUNT 6 159 #define MMU_PAGE_COUNT 6
160 160
161 /* 161 /*
162 * Segment sizes. 162 * Segment sizes.
163 * These are the values used by hardware in the B field of 163 * These are the values used by hardware in the B field of
164 * SLB entries and the first dword of MMU hashtable entries. 164 * SLB entries and the first dword of MMU hashtable entries.
165 * The B field is 2 bits; the values 2 and 3 are unused and reserved. 165 * The B field is 2 bits; the values 2 and 3 are unused and reserved.
166 */ 166 */
167 #define MMU_SEGSIZE_256M 0 167 #define MMU_SEGSIZE_256M 0
168 #define MMU_SEGSIZE_1T 1 168 #define MMU_SEGSIZE_1T 1
169 169
170 170
171 #ifndef __ASSEMBLY__ 171 #ifndef __ASSEMBLY__
172 172
173 /* 173 /*
174 * The current system page and segment sizes 174 * The current system page and segment sizes
175 */ 175 */
176 extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT]; 176 extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
177 extern int mmu_linear_psize; 177 extern int mmu_linear_psize;
178 extern int mmu_virtual_psize; 178 extern int mmu_virtual_psize;
179 extern int mmu_vmalloc_psize; 179 extern int mmu_vmalloc_psize;
180 extern int mmu_vmemmap_psize; 180 extern int mmu_vmemmap_psize;
181 extern int mmu_io_psize; 181 extern int mmu_io_psize;
182 extern int mmu_kernel_ssize; 182 extern int mmu_kernel_ssize;
183 extern int mmu_highuser_ssize; 183 extern int mmu_highuser_ssize;
184 extern u16 mmu_slb_size; 184 extern u16 mmu_slb_size;
185 extern unsigned long tce_alloc_start, tce_alloc_end; 185 extern unsigned long tce_alloc_start, tce_alloc_end;
186 186
187 /* 187 /*
188 * If the processor supports 64k normal pages but not 64k cache 188 * If the processor supports 64k normal pages but not 64k cache
189 * inhibited pages, we have to be prepared to switch processes 189 * inhibited pages, we have to be prepared to switch processes
190 * to use 4k pages when they create cache-inhibited mappings. 190 * to use 4k pages when they create cache-inhibited mappings.
191 * If this is the case, mmu_ci_restrictions will be set to 1. 191 * If this is the case, mmu_ci_restrictions will be set to 1.
192 */ 192 */
193 extern int mmu_ci_restrictions; 193 extern int mmu_ci_restrictions;
194 194
195 #ifdef CONFIG_HUGETLB_PAGE 195 #ifdef CONFIG_HUGETLB_PAGE
196 /* 196 /*
197 * The page size index of the huge pages for use by hugetlbfs 197 * The page size indexes of the huge pages for use by hugetlbfs
198 */ 198 */
199 extern int mmu_huge_psize; 199 extern unsigned int mmu_huge_psizes[MMU_PAGE_COUNT];
200 200
201 #endif /* CONFIG_HUGETLB_PAGE */ 201 #endif /* CONFIG_HUGETLB_PAGE */
202 202
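A minimal sketch of how a caller might test the new per-size array (the convention that a configured size has a non-zero entry follows how the hugetlbpage.c code in this commit uses it):

	/* Illustration only. */
	static inline int example_huge_psize_ok(unsigned int psize)
	{
		return psize < MMU_PAGE_COUNT && mmu_huge_psizes[psize];
	}

	/* e.g. example_huge_psize_ok(MMU_PAGE_16M) becomes true once the 16M
	 * huge page size has been registered at boot. */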
203 /* 203 /*
204 * This function sets the AVPN and L fields of the HPTE appropriately 204 * This function sets the AVPN and L fields of the HPTE appropriately
205 * for the page size 205 * for the page size
206 */ 206 */
207 static inline unsigned long hpte_encode_v(unsigned long va, int psize, 207 static inline unsigned long hpte_encode_v(unsigned long va, int psize,
208 int ssize) 208 int ssize)
209 { 209 {
210 unsigned long v; 210 unsigned long v;
211 v = (va >> 23) & ~(mmu_psize_defs[psize].avpnm); 211 v = (va >> 23) & ~(mmu_psize_defs[psize].avpnm);
212 v <<= HPTE_V_AVPN_SHIFT; 212 v <<= HPTE_V_AVPN_SHIFT;
213 if (psize != MMU_PAGE_4K) 213 if (psize != MMU_PAGE_4K)
214 v |= HPTE_V_LARGE; 214 v |= HPTE_V_LARGE;
215 v |= ((unsigned long) ssize) << HPTE_V_SSIZE_SHIFT; 215 v |= ((unsigned long) ssize) << HPTE_V_SSIZE_SHIFT;
216 return v; 216 return v;
217 } 217 }
218 218
219 /* 219 /*
220 * This function sets the ARPN, and LP fields of the HPTE appropriately 220 * This function sets the ARPN, and LP fields of the HPTE appropriately
221 * for the page size. We assume the pa is already "clean" that is properly 221 * for the page size. We assume the pa is already "clean" that is properly
222 * aligned for the requested page size 222 * aligned for the requested page size
223 */ 223 */
224 static inline unsigned long hpte_encode_r(unsigned long pa, int psize) 224 static inline unsigned long hpte_encode_r(unsigned long pa, int psize)
225 { 225 {
226 unsigned long r; 226 unsigned long r;
227 227
228 /* A 4K page needs no special encoding */ 228 /* A 4K page needs no special encoding */
229 if (psize == MMU_PAGE_4K) 229 if (psize == MMU_PAGE_4K)
230 return pa & HPTE_R_RPN; 230 return pa & HPTE_R_RPN;
231 else { 231 else {
232 unsigned int penc = mmu_psize_defs[psize].penc; 232 unsigned int penc = mmu_psize_defs[psize].penc;
233 unsigned int shift = mmu_psize_defs[psize].shift; 233 unsigned int shift = mmu_psize_defs[psize].shift;
234 return (pa & ~((1ul << shift) - 1)) | (penc << 12); 234 return (pa & ~((1ul << shift) - 1)) | (penc << 12);
235 } 235 }
236 return r; 236 return r;
237 } 237 }
238 238
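As a hedged worked example for a 16M page (shift 24; the actual penc value is hardware dependent and comes from mmu_psize_defs):

	/*
	 * hpte_encode_r(pa, MMU_PAGE_16M)
	 *	== (pa & ~((1ul << 24) - 1)) | (penc << 12)
	 *
	 * i.e. the low 24 bits of the (16M-aligned) physical address are
	 * cleared and the per-size LP encoding is or'ed in at bit 12, while
	 * hpte_encode_v() additionally sets HPTE_V_LARGE because the size is
	 * not MMU_PAGE_4K.
	 */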
239 /* 239 /*
240 * Build a VA given VSID, EA and segment size 240 * Build a VA given VSID, EA and segment size
241 */ 241 */
242 static inline unsigned long hpt_va(unsigned long ea, unsigned long vsid, 242 static inline unsigned long hpt_va(unsigned long ea, unsigned long vsid,
243 int ssize) 243 int ssize)
244 { 244 {
245 if (ssize == MMU_SEGSIZE_256M) 245 if (ssize == MMU_SEGSIZE_256M)
246 return (vsid << 28) | (ea & 0xfffffffUL); 246 return (vsid << 28) | (ea & 0xfffffffUL);
247 return (vsid << 40) | (ea & 0xffffffffffUL); 247 return (vsid << 40) | (ea & 0xffffffffffUL);
248 } 248 }
249 249
250 /* 250 /*
251 * This hashes a virtual address 251 * This hashes a virtual address
252 */ 252 */
253 253
254 static inline unsigned long hpt_hash(unsigned long va, unsigned int shift, 254 static inline unsigned long hpt_hash(unsigned long va, unsigned int shift,
255 int ssize) 255 int ssize)
256 { 256 {
257 unsigned long hash, vsid; 257 unsigned long hash, vsid;
258 258
259 if (ssize == MMU_SEGSIZE_256M) { 259 if (ssize == MMU_SEGSIZE_256M) {
260 hash = (va >> 28) ^ ((va & 0x0fffffffUL) >> shift); 260 hash = (va >> 28) ^ ((va & 0x0fffffffUL) >> shift);
261 } else { 261 } else {
262 vsid = va >> 40; 262 vsid = va >> 40;
263 hash = vsid ^ (vsid << 25) ^ ((va & 0xffffffffffUL) >> shift); 263 hash = vsid ^ (vsid << 25) ^ ((va & 0xffffffffffUL) >> shift);
264 } 264 }
265 return hash & 0x7fffffffffUL; 265 return hash & 0x7fffffffffUL;
266 } 266 }
267 267
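A small sketch tying hpt_va() and hpt_hash() together for the base 4K page size in a 256M segment (the helper name is hypothetical):

	static inline unsigned long example_hash_bucket(unsigned long ea,
							unsigned long vsid)
	{
		/* Illustration only: shift 12 == 4K pages. */
		unsigned long va = hpt_va(ea, vsid, MMU_SEGSIZE_256M);

		return hpt_hash(va, 12, MMU_SEGSIZE_256M) & htab_hash_mask;
	}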
268 extern int __hash_page_4K(unsigned long ea, unsigned long access, 268 extern int __hash_page_4K(unsigned long ea, unsigned long access,
269 unsigned long vsid, pte_t *ptep, unsigned long trap, 269 unsigned long vsid, pte_t *ptep, unsigned long trap,
270 unsigned int local, int ssize, int subpage_prot); 270 unsigned int local, int ssize, int subpage_prot);
271 extern int __hash_page_64K(unsigned long ea, unsigned long access, 271 extern int __hash_page_64K(unsigned long ea, unsigned long access,
272 unsigned long vsid, pte_t *ptep, unsigned long trap, 272 unsigned long vsid, pte_t *ptep, unsigned long trap,
273 unsigned int local, int ssize); 273 unsigned int local, int ssize);
274 struct mm_struct; 274 struct mm_struct;
275 extern int hash_page(unsigned long ea, unsigned long access, unsigned long trap); 275 extern int hash_page(unsigned long ea, unsigned long access, unsigned long trap);
276 extern int hash_huge_page(struct mm_struct *mm, unsigned long access, 276 extern int hash_huge_page(struct mm_struct *mm, unsigned long access,
277 unsigned long ea, unsigned long vsid, int local, 277 unsigned long ea, unsigned long vsid, int local,
278 unsigned long trap); 278 unsigned long trap);
279 279
280 extern int htab_bolt_mapping(unsigned long vstart, unsigned long vend, 280 extern int htab_bolt_mapping(unsigned long vstart, unsigned long vend,
281 unsigned long pstart, unsigned long mode, 281 unsigned long pstart, unsigned long mode,
282 int psize, int ssize); 282 int psize, int ssize);
283 extern void set_huge_psize(int psize); 283 extern void set_huge_psize(int psize);
284 extern void add_gpage(unsigned long addr, unsigned long page_size, 284 extern void add_gpage(unsigned long addr, unsigned long page_size,
285 unsigned long number_of_pages); 285 unsigned long number_of_pages);
286 extern void demote_segment_4k(struct mm_struct *mm, unsigned long addr); 286 extern void demote_segment_4k(struct mm_struct *mm, unsigned long addr);
287 287
288 extern void htab_initialize(void); 288 extern void htab_initialize(void);
289 extern void htab_initialize_secondary(void); 289 extern void htab_initialize_secondary(void);
290 extern void hpte_init_native(void); 290 extern void hpte_init_native(void);
291 extern void hpte_init_lpar(void); 291 extern void hpte_init_lpar(void);
292 extern void hpte_init_iSeries(void); 292 extern void hpte_init_iSeries(void);
293 extern void hpte_init_beat(void); 293 extern void hpte_init_beat(void);
294 extern void hpte_init_beat_v3(void); 294 extern void hpte_init_beat_v3(void);
295 295
296 extern void stabs_alloc(void); 296 extern void stabs_alloc(void);
297 extern void slb_initialize(void); 297 extern void slb_initialize(void);
298 extern void slb_flush_and_rebolt(void); 298 extern void slb_flush_and_rebolt(void);
299 extern void stab_initialize(unsigned long stab); 299 extern void stab_initialize(unsigned long stab);
300 300
301 extern void slb_vmalloc_update(void); 301 extern void slb_vmalloc_update(void);
302 #endif /* __ASSEMBLY__ */ 302 #endif /* __ASSEMBLY__ */
303 303
304 /* 304 /*
305 * VSID allocation 305 * VSID allocation
306 * 306 *
307 * We first generate a 36-bit "proto-VSID". For kernel addresses this 307 * We first generate a 36-bit "proto-VSID". For kernel addresses this
308 * is equal to the ESID, for user addresses it is: 308 * is equal to the ESID, for user addresses it is:
309 * (context << 15) | (esid & 0x7fff) 309 * (context << 15) | (esid & 0x7fff)
310 * 310 *
311 * The two forms are distinguishable because the top bit is 0 for user 311 * The two forms are distinguishable because the top bit is 0 for user
312 * addresses, whereas the top two bits are 1 for kernel addresses. 312 * addresses, whereas the top two bits are 1 for kernel addresses.
313 * Proto-VSIDs with the top two bits equal to 0b10 are reserved for 313 * Proto-VSIDs with the top two bits equal to 0b10 are reserved for
314 * now. 314 * now.
315 * 315 *
316 * The proto-VSIDs are then scrambled into real VSIDs with the 316 * The proto-VSIDs are then scrambled into real VSIDs with the
317 * multiplicative hash: 317 * multiplicative hash:
318 * 318 *
319 * VSID = (proto-VSID * VSID_MULTIPLIER) % VSID_MODULUS 319 * VSID = (proto-VSID * VSID_MULTIPLIER) % VSID_MODULUS
320 * where VSID_MULTIPLIER = 268435399 = 0xFFFFFC7 320 * where VSID_MULTIPLIER = 268435399 = 0xFFFFFC7
321 * VSID_MODULUS = 2^36-1 = 0xFFFFFFFFF 321 * VSID_MODULUS = 2^36-1 = 0xFFFFFFFFF
322 * 322 *
323 * This scramble is only well defined for proto-VSIDs below 323 * This scramble is only well defined for proto-VSIDs below
324 * 0xFFFFFFFFF, so both proto-VSID and actual VSID 0xFFFFFFFFF are 324 * 0xFFFFFFFFF, so both proto-VSID and actual VSID 0xFFFFFFFFF are
325 * reserved. VSID_MULTIPLIER is prime, so in particular it is 325 * reserved. VSID_MULTIPLIER is prime, so in particular it is
326 * co-prime to VSID_MODULUS, making this a 1:1 scrambling function. 326 * co-prime to VSID_MODULUS, making this a 1:1 scrambling function.
327 * Because the modulus is 2^n-1 we can compute it efficiently without 327 * Because the modulus is 2^n-1 we can compute it efficiently without
328 * a divide or extra multiply (see below). 328 * a divide or extra multiply (see below).
329 * 329 *
330 * This scheme has several advantages over older methods: 330 * This scheme has several advantages over older methods:
331 * 331 *
332 * - We have VSIDs allocated for every kernel address 332 * - We have VSIDs allocated for every kernel address
333 * (i.e. everything above 0xC000000000000000), except the very top 333 * (i.e. everything above 0xC000000000000000), except the very top
334 * segment, which simplifies several things. 334 * segment, which simplifies several things.
335 * 335 *
336 * - We allow for 15 significant bits of ESID and 20 bits of 336 * - We allow for 15 significant bits of ESID and 20 bits of
337 * context for user addresses. i.e. 8T (43 bits) of address space for 337 * context for user addresses. i.e. 8T (43 bits) of address space for
338 * up to 1M contexts (although the page table structure and context 338 * up to 1M contexts (although the page table structure and context
339 * allocation will need changes to take advantage of this). 339 * allocation will need changes to take advantage of this).
340 * 340 *
341 * - The scramble function gives robust scattering in the hash 341 * - The scramble function gives robust scattering in the hash
342 * table (at least based on some initial results). The previous 342 * table (at least based on some initial results). The previous
343 * method was more susceptible to pathological cases giving excessive 343 * method was more susceptible to pathological cases giving excessive
344 * hash collisions. 344 * hash collisions.
345 */ 345 */
346 /* 346 /*
347 * WARNING - If you change these you must make sure the asm 347 * WARNING - If you change these you must make sure the asm
348 * implementations in slb_allocate (slb_low.S), do_stab_bolted 348 * implementations in slb_allocate (slb_low.S), do_stab_bolted
349 * (head.S) and ASM_VSID_SCRAMBLE (below) are changed accordingly. 349 * (head.S) and ASM_VSID_SCRAMBLE (below) are changed accordingly.
350 * 350 *
351 * You'll also need to change the precomputed VSID values in head.S 351 * You'll also need to change the precomputed VSID values in head.S
352 * which are used by the iSeries firmware. 352 * which are used by the iSeries firmware.
353 */ 353 */
354 354
355 #define VSID_MULTIPLIER_256M ASM_CONST(200730139) /* 28-bit prime */ 355 #define VSID_MULTIPLIER_256M ASM_CONST(200730139) /* 28-bit prime */
356 #define VSID_BITS_256M 36 356 #define VSID_BITS_256M 36
357 #define VSID_MODULUS_256M ((1UL<<VSID_BITS_256M)-1) 357 #define VSID_MODULUS_256M ((1UL<<VSID_BITS_256M)-1)
358 358
359 #define VSID_MULTIPLIER_1T ASM_CONST(12538073) /* 24-bit prime */ 359 #define VSID_MULTIPLIER_1T ASM_CONST(12538073) /* 24-bit prime */
360 #define VSID_BITS_1T 24 360 #define VSID_BITS_1T 24
361 #define VSID_MODULUS_1T ((1UL<<VSID_BITS_1T)-1) 361 #define VSID_MODULUS_1T ((1UL<<VSID_BITS_1T)-1)
362 362
363 #define CONTEXT_BITS 19 363 #define CONTEXT_BITS 19
364 #define USER_ESID_BITS 16 364 #define USER_ESID_BITS 16
365 #define USER_ESID_BITS_1T 4 365 #define USER_ESID_BITS_1T 4
366 366
367 #define USER_VSID_RANGE (1UL << (USER_ESID_BITS + SID_SHIFT)) 367 #define USER_VSID_RANGE (1UL << (USER_ESID_BITS + SID_SHIFT))
368 368
369 /* 369 /*
370 * This macro generates asm code to compute the VSID scramble 370 * This macro generates asm code to compute the VSID scramble
371 * function. Used in slb_allocate() and do_stab_bolted. The function 371 * function. Used in slb_allocate() and do_stab_bolted. The function
372 * computed is: (protovsid*VSID_MULTIPLIER) % VSID_MODULUS 372 * computed is: (protovsid*VSID_MULTIPLIER) % VSID_MODULUS
373 * 373 *
374 * rt = register containing the proto-VSID and into which the 374 * rt = register containing the proto-VSID and into which the
375 * VSID will be stored 375 * VSID will be stored
376 * rx = scratch register (clobbered) 376 * rx = scratch register (clobbered)
377 * 377 *
378 * - rt and rx must be different registers 378 * - rt and rx must be different registers
379 * - The answer will end up in the low VSID_BITS bits of rt. The higher 379 * - The answer will end up in the low VSID_BITS bits of rt. The higher
380 * bits may contain other garbage, so you may need to mask the 380 * bits may contain other garbage, so you may need to mask the
381 * result. 381 * result.
382 */ 382 */
383 #define ASM_VSID_SCRAMBLE(rt, rx, size) \ 383 #define ASM_VSID_SCRAMBLE(rt, rx, size) \
384 lis rx,VSID_MULTIPLIER_##size@h; \ 384 lis rx,VSID_MULTIPLIER_##size@h; \
385 ori rx,rx,VSID_MULTIPLIER_##size@l; \ 385 ori rx,rx,VSID_MULTIPLIER_##size@l; \
386 mulld rt,rt,rx; /* rt = rt * MULTIPLIER */ \ 386 mulld rt,rt,rx; /* rt = rt * MULTIPLIER */ \
387 \ 387 \
388 srdi rx,rt,VSID_BITS_##size; \ 388 srdi rx,rt,VSID_BITS_##size; \
389 clrldi rt,rt,(64-VSID_BITS_##size); \ 389 clrldi rt,rt,(64-VSID_BITS_##size); \
390 add rt,rt,rx; /* add high and low bits */ \ 390 add rt,rt,rx; /* add high and low bits */ \
391 /* Now, r3 == VSID (mod 2^36-1), and lies between 0 and \ 391 /* Now, r3 == VSID (mod 2^36-1), and lies between 0 and \
392 * 2^36-1+2^28-1. That in particular means that if r3 >= \ 392 * 2^36-1+2^28-1. That in particular means that if r3 >= \
393 * 2^36-1, then r3+1 has the 2^36 bit set. So, if r3+1 has \ 393 * 2^36-1, then r3+1 has the 2^36 bit set. So, if r3+1 has \
394 * the bit clear, r3 already has the answer we want, if it \ 394 * the bit clear, r3 already has the answer we want, if it \
395 * doesn't, the answer is the low 36 bits of r3+1. So in all \ 395 * doesn't, the answer is the low 36 bits of r3+1. So in all \
396 * cases the answer is the low 36 bits of (r3 + ((r3+1) >> 36))*/\ 396 * cases the answer is the low 36 bits of (r3 + ((r3+1) >> 36))*/\
397 addi rx,rt,1; \ 397 addi rx,rt,1; \
398 srdi rx,rx,VSID_BITS_##size; /* extract 2^VSID_BITS bit */ \ 398 srdi rx,rx,VSID_BITS_##size; /* extract 2^VSID_BITS bit */ \
399 add rt,rt,rx 399 add rt,rt,rx
400 400
401 401
402 #ifndef __ASSEMBLY__ 402 #ifndef __ASSEMBLY__
403 403
404 typedef unsigned long mm_context_id_t; 404 typedef unsigned long mm_context_id_t;
405 405
406 typedef struct { 406 typedef struct {
407 mm_context_id_t id; 407 mm_context_id_t id;
408 u16 user_psize; /* page size index */ 408 u16 user_psize; /* page size index */
409 409
410 #ifdef CONFIG_PPC_MM_SLICES 410 #ifdef CONFIG_PPC_MM_SLICES
411 u64 low_slices_psize; /* SLB page size encodings */ 411 u64 low_slices_psize; /* SLB page size encodings */
412 u64 high_slices_psize; /* 4 bits per slice for now */ 412 u64 high_slices_psize; /* 4 bits per slice for now */
413 #else 413 #else
414 u16 sllp; /* SLB page size encoding */ 414 u16 sllp; /* SLB page size encoding */
415 #endif 415 #endif
416 unsigned long vdso_base; 416 unsigned long vdso_base;
417 } mm_context_t; 417 } mm_context_t;
418 418
419 419
420 #if 0 420 #if 0
421 /* 421 /*
422 * The code below is equivalent to this function for arguments 422 * The code below is equivalent to this function for arguments
423 * < 2^VSID_BITS, which is all this should ever be called 423 * < 2^VSID_BITS, which is all this should ever be called
424 * with. However gcc is not clever enough to compute the 424 * with. However gcc is not clever enough to compute the
425 * modulus (2^n-1) without a second multiply. 425 * modulus (2^n-1) without a second multiply.
426 */ 426 */
427 #define vsid_scramble(protovsid, size) \ 427 #define vsid_scramble(protovsid, size) \
428 ((((protovsid) * VSID_MULTIPLIER_##size) % VSID_MODULUS_##size)) 428 ((((protovsid) * VSID_MULTIPLIER_##size) % VSID_MODULUS_##size))
429 429
430 #else /* 1 */ 430 #else /* 1 */
431 #define vsid_scramble(protovsid, size) \ 431 #define vsid_scramble(protovsid, size) \
432 ({ \ 432 ({ \
433 unsigned long x; \ 433 unsigned long x; \
434 x = (protovsid) * VSID_MULTIPLIER_##size; \ 434 x = (protovsid) * VSID_MULTIPLIER_##size; \
435 x = (x >> VSID_BITS_##size) + (x & VSID_MODULUS_##size); \ 435 x = (x >> VSID_BITS_##size) + (x & VSID_MODULUS_##size); \
436 (x + ((x+1) >> VSID_BITS_##size)) & VSID_MODULUS_##size; \ 436 (x + ((x+1) >> VSID_BITS_##size)) & VSID_MODULUS_##size; \
437 }) 437 })
438 #endif /* 1 */ 438 #endif /* 1 */
439 439
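As a sanity check (illustration only, not kernel code): for any proto-VSID below 2^36 the branch-free fold above agrees with the plain modulus, since the product fits in 64 bits:

	static inline int example_scramble_matches(unsigned long protovsid)
	{
		unsigned long folded = vsid_scramble(protovsid, 256M);
		unsigned long exact  = (protovsid * VSID_MULTIPLIER_256M)
					% VSID_MODULUS_256M;

		return folded == exact;	/* expected to hold for all inputs */
	}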
440 /* This is only valid for addresses >= KERNELBASE */ 440 /* This is only valid for addresses >= KERNELBASE */
441 static inline unsigned long get_kernel_vsid(unsigned long ea, int ssize) 441 static inline unsigned long get_kernel_vsid(unsigned long ea, int ssize)
442 { 442 {
443 if (ssize == MMU_SEGSIZE_256M) 443 if (ssize == MMU_SEGSIZE_256M)
444 return vsid_scramble(ea >> SID_SHIFT, 256M); 444 return vsid_scramble(ea >> SID_SHIFT, 256M);
445 return vsid_scramble(ea >> SID_SHIFT_1T, 1T); 445 return vsid_scramble(ea >> SID_SHIFT_1T, 1T);
446 } 446 }
447 447
448 /* Returns the segment size indicator for a user address */ 448 /* Returns the segment size indicator for a user address */
449 static inline int user_segment_size(unsigned long addr) 449 static inline int user_segment_size(unsigned long addr)
450 { 450 {
451 /* Use 1T segments if possible for addresses >= 1T */ 451 /* Use 1T segments if possible for addresses >= 1T */
452 if (addr >= (1UL << SID_SHIFT_1T)) 452 if (addr >= (1UL << SID_SHIFT_1T))
453 return mmu_highuser_ssize; 453 return mmu_highuser_ssize;
454 return MMU_SEGSIZE_256M; 454 return MMU_SEGSIZE_256M;
455 } 455 }
456 456
457 /* This is only valid for user addresses (which are below 2^44) */ 457 /* This is only valid for user addresses (which are below 2^44) */
458 static inline unsigned long get_vsid(unsigned long context, unsigned long ea, 458 static inline unsigned long get_vsid(unsigned long context, unsigned long ea,
459 int ssize) 459 int ssize)
460 { 460 {
461 if (ssize == MMU_SEGSIZE_256M) 461 if (ssize == MMU_SEGSIZE_256M)
462 return vsid_scramble((context << USER_ESID_BITS) 462 return vsid_scramble((context << USER_ESID_BITS)
463 | (ea >> SID_SHIFT), 256M); 463 | (ea >> SID_SHIFT), 256M);
464 return vsid_scramble((context << USER_ESID_BITS_1T) 464 return vsid_scramble((context << USER_ESID_BITS_1T)
465 | (ea >> SID_SHIFT_1T), 1T); 465 | (ea >> SID_SHIFT_1T), 1T);
466 } 466 }
467 467
468 /* 468 /*
469 * This is only used on legacy iSeries in lparmap.c, 469 * This is only used on legacy iSeries in lparmap.c,
470 * hence the 256MB segment assumption. 470 * hence the 256MB segment assumption.
471 */ 471 */
472 #define VSID_SCRAMBLE(pvsid) (((pvsid) * VSID_MULTIPLIER_256M) % \ 472 #define VSID_SCRAMBLE(pvsid) (((pvsid) * VSID_MULTIPLIER_256M) % \
473 VSID_MODULUS_256M) 473 VSID_MODULUS_256M)
474 #define KERNEL_VSID(ea) VSID_SCRAMBLE(GET_ESID(ea)) 474 #define KERNEL_VSID(ea) VSID_SCRAMBLE(GET_ESID(ea))
475 475
476 #endif /* __ASSEMBLY__ */ 476 #endif /* __ASSEMBLY__ */
477 477
478 #endif /* _ASM_POWERPC_MMU_HASH64_H_ */ 478 #endif /* _ASM_POWERPC_MMU_HASH64_H_ */
479 479
include/asm-powerpc/page_64.h
1 #ifndef _ASM_POWERPC_PAGE_64_H 1 #ifndef _ASM_POWERPC_PAGE_64_H
2 #define _ASM_POWERPC_PAGE_64_H 2 #define _ASM_POWERPC_PAGE_64_H
3 3
4 /* 4 /*
5 * Copyright (C) 2001 PPC64 Team, IBM Corp 5 * Copyright (C) 2001 PPC64 Team, IBM Corp
6 * 6 *
7 * This program is free software; you can redistribute it and/or 7 * This program is free software; you can redistribute it and/or
8 * modify it under the terms of the GNU General Public License 8 * modify it under the terms of the GNU General Public License
9 * as published by the Free Software Foundation; either version 9 * as published by the Free Software Foundation; either version
10 * 2 of the License, or (at your option) any later version. 10 * 2 of the License, or (at your option) any later version.
11 */ 11 */
12 12
13 /* 13 /*
14 * We always define HW_PAGE_SHIFT to 12 as use of 64K pages remains Linux 14 * We always define HW_PAGE_SHIFT to 12 as use of 64K pages remains Linux
15 * specific, every notion of page number shared with the firmware, TCEs, 15 * specific, every notion of page number shared with the firmware, TCEs,
16 * iommu, etc... still uses a page size of 4K. 16 * iommu, etc... still uses a page size of 4K.
17 */ 17 */
18 #define HW_PAGE_SHIFT 12 18 #define HW_PAGE_SHIFT 12
19 #define HW_PAGE_SIZE (ASM_CONST(1) << HW_PAGE_SHIFT) 19 #define HW_PAGE_SIZE (ASM_CONST(1) << HW_PAGE_SHIFT)
20 #define HW_PAGE_MASK (~(HW_PAGE_SIZE-1)) 20 #define HW_PAGE_MASK (~(HW_PAGE_SIZE-1))
21 21
22 /* 22 /*
23 * PAGE_FACTOR is the number of bits factor between PAGE_SHIFT and 23 * PAGE_FACTOR is the number of bits factor between PAGE_SHIFT and
24 * HW_PAGE_SHIFT, that is 4K pages. 24 * HW_PAGE_SHIFT, that is 4K pages.
25 */ 25 */
26 #define PAGE_FACTOR (PAGE_SHIFT - HW_PAGE_SHIFT) 26 #define PAGE_FACTOR (PAGE_SHIFT - HW_PAGE_SHIFT)
27 27
28 /* Segment size; normal 256M segments */ 28 /* Segment size; normal 256M segments */
29 #define SID_SHIFT 28 29 #define SID_SHIFT 28
30 #define SID_MASK ASM_CONST(0xfffffffff) 30 #define SID_MASK ASM_CONST(0xfffffffff)
31 #define ESID_MASK 0xfffffffff0000000UL 31 #define ESID_MASK 0xfffffffff0000000UL
32 #define GET_ESID(x) (((x) >> SID_SHIFT) & SID_MASK) 32 #define GET_ESID(x) (((x) >> SID_SHIFT) & SID_MASK)
33 33
34 /* 1T segments */ 34 /* 1T segments */
35 #define SID_SHIFT_1T 40 35 #define SID_SHIFT_1T 40
36 #define SID_MASK_1T 0xffffffUL 36 #define SID_MASK_1T 0xffffffUL
37 #define ESID_MASK_1T 0xffffff0000000000UL 37 #define ESID_MASK_1T 0xffffff0000000000UL
38 #define GET_ESID_1T(x) (((x) >> SID_SHIFT_1T) & SID_MASK_1T) 38 #define GET_ESID_1T(x) (((x) >> SID_SHIFT_1T) & SID_MASK_1T)
39 39
40 #ifndef __ASSEMBLY__ 40 #ifndef __ASSEMBLY__
41 #include <asm/cache.h> 41 #include <asm/cache.h>
42 42
43 typedef unsigned long pte_basic_t; 43 typedef unsigned long pte_basic_t;
44 44
45 static __inline__ void clear_page(void *addr) 45 static __inline__ void clear_page(void *addr)
46 { 46 {
47 unsigned long lines, line_size; 47 unsigned long lines, line_size;
48 48
49 line_size = ppc64_caches.dline_size; 49 line_size = ppc64_caches.dline_size;
50 lines = ppc64_caches.dlines_per_page; 50 lines = ppc64_caches.dlines_per_page;
51 51
52 __asm__ __volatile__( 52 __asm__ __volatile__(
53 "mtctr %1 # clear_page\n\ 53 "mtctr %1 # clear_page\n\
54 1: dcbz 0,%0\n\ 54 1: dcbz 0,%0\n\
55 add %0,%0,%3\n\ 55 add %0,%0,%3\n\
56 bdnz+ 1b" 56 bdnz+ 1b"
57 : "=r" (addr) 57 : "=r" (addr)
58 : "r" (lines), "0" (addr), "r" (line_size) 58 : "r" (lines), "0" (addr), "r" (line_size)
59 : "ctr", "memory"); 59 : "ctr", "memory");
60 } 60 }
61 61
62 extern void copy_4K_page(void *to, void *from); 62 extern void copy_4K_page(void *to, void *from);
63 63
64 #ifdef CONFIG_PPC_64K_PAGES 64 #ifdef CONFIG_PPC_64K_PAGES
65 static inline void copy_page(void *to, void *from) 65 static inline void copy_page(void *to, void *from)
66 { 66 {
67 unsigned int i; 67 unsigned int i;
68 for (i=0; i < (1 << (PAGE_SHIFT - 12)); i++) { 68 for (i=0; i < (1 << (PAGE_SHIFT - 12)); i++) {
69 copy_4K_page(to, from); 69 copy_4K_page(to, from);
70 to += 4096; 70 to += 4096;
71 from += 4096; 71 from += 4096;
72 } 72 }
73 } 73 }
74 #else /* CONFIG_PPC_64K_PAGES */ 74 #else /* CONFIG_PPC_64K_PAGES */
75 static inline void copy_page(void *to, void *from) 75 static inline void copy_page(void *to, void *from)
76 { 76 {
77 copy_4K_page(to, from); 77 copy_4K_page(to, from);
78 } 78 }
79 #endif /* CONFIG_PPC_64K_PAGES */ 79 #endif /* CONFIG_PPC_64K_PAGES */
80 80
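Worked out for the 64K configuration (assuming PAGE_SHIFT == 16 under CONFIG_PPC_64K_PAGES):

	/* (1 << (PAGE_SHIFT - 12)) == (1 << 4) == 16, so copy_page() copies a
	 * 64K page as sixteen consecutive copy_4K_page() calls. */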
81 /* Log 2 of page table size */ 81 /* Log 2 of page table size */
82 extern u64 ppc64_pft_size; 82 extern u64 ppc64_pft_size;
83 83
84 /* Large pages size */ 84 /* Large pages size */
85 #ifdef CONFIG_HUGETLB_PAGE 85 #ifdef CONFIG_HUGETLB_PAGE
86 extern unsigned int HPAGE_SHIFT; 86 extern unsigned int HPAGE_SHIFT;
87 #else 87 #else
88 #define HPAGE_SHIFT PAGE_SHIFT 88 #define HPAGE_SHIFT PAGE_SHIFT
89 #endif 89 #endif
90 #define HPAGE_SIZE ((1UL) << HPAGE_SHIFT) 90 #define HPAGE_SIZE ((1UL) << HPAGE_SHIFT)
91 #define HPAGE_MASK (~(HPAGE_SIZE - 1)) 91 #define HPAGE_MASK (~(HPAGE_SIZE - 1))
92 #define HUGETLB_PAGE_ORDER (HPAGE_SHIFT - PAGE_SHIFT) 92 #define HUGETLB_PAGE_ORDER (HPAGE_SHIFT - PAGE_SHIFT)
93 #define HUGE_MAX_HSTATE 3
93 94
94 #endif /* __ASSEMBLY__ */ 95 #endif /* __ASSEMBLY__ */
95 96
96 #ifdef CONFIG_PPC_MM_SLICES 97 #ifdef CONFIG_PPC_MM_SLICES
97 98
98 #define SLICE_LOW_SHIFT 28 99 #define SLICE_LOW_SHIFT 28
99 #define SLICE_HIGH_SHIFT 40 100 #define SLICE_HIGH_SHIFT 40
100 101
101 #define SLICE_LOW_TOP (0x100000000ul) 102 #define SLICE_LOW_TOP (0x100000000ul)
102 #define SLICE_NUM_LOW (SLICE_LOW_TOP >> SLICE_LOW_SHIFT) 103 #define SLICE_NUM_LOW (SLICE_LOW_TOP >> SLICE_LOW_SHIFT)
103 #define SLICE_NUM_HIGH (PGTABLE_RANGE >> SLICE_HIGH_SHIFT) 104 #define SLICE_NUM_HIGH (PGTABLE_RANGE >> SLICE_HIGH_SHIFT)
104 105
105 #define GET_LOW_SLICE_INDEX(addr) ((addr) >> SLICE_LOW_SHIFT) 106 #define GET_LOW_SLICE_INDEX(addr) ((addr) >> SLICE_LOW_SHIFT)
106 #define GET_HIGH_SLICE_INDEX(addr) ((addr) >> SLICE_HIGH_SHIFT) 107 #define GET_HIGH_SLICE_INDEX(addr) ((addr) >> SLICE_HIGH_SHIFT)
107 108
108 #ifndef __ASSEMBLY__ 109 #ifndef __ASSEMBLY__
109 110
110 struct slice_mask { 111 struct slice_mask {
111 u16 low_slices; 112 u16 low_slices;
112 u16 high_slices; 113 u16 high_slices;
113 }; 114 };
114 115
115 struct mm_struct; 116 struct mm_struct;
116 117
117 extern unsigned long slice_get_unmapped_area(unsigned long addr, 118 extern unsigned long slice_get_unmapped_area(unsigned long addr,
118 unsigned long len, 119 unsigned long len,
119 unsigned long flags, 120 unsigned long flags,
120 unsigned int psize, 121 unsigned int psize,
121 int topdown, 122 int topdown,
122 int use_cache); 123 int use_cache);
123 124
124 extern unsigned int get_slice_psize(struct mm_struct *mm, 125 extern unsigned int get_slice_psize(struct mm_struct *mm,
125 unsigned long addr); 126 unsigned long addr);
126 127
127 extern void slice_init_context(struct mm_struct *mm, unsigned int psize); 128 extern void slice_init_context(struct mm_struct *mm, unsigned int psize);
128 extern void slice_set_user_psize(struct mm_struct *mm, unsigned int psize); 129 extern void slice_set_user_psize(struct mm_struct *mm, unsigned int psize);
129 extern void slice_set_range_psize(struct mm_struct *mm, unsigned long start, 130 extern void slice_set_range_psize(struct mm_struct *mm, unsigned long start,
130 unsigned long len, unsigned int psize); 131 unsigned long len, unsigned int psize);
131 132
132 #define slice_mm_new_context(mm) ((mm)->context.id == 0) 133 #define slice_mm_new_context(mm) ((mm)->context.id == 0)
133 134
134 #endif /* __ASSEMBLY__ */ 135 #endif /* __ASSEMBLY__ */
135 #else 136 #else
136 #define slice_init() 137 #define slice_init()
137 #define get_slice_psize(mm, addr) ((mm)->context.user_psize) 138 #define get_slice_psize(mm, addr) ((mm)->context.user_psize)
138 #define slice_set_user_psize(mm, psize) \ 139 #define slice_set_user_psize(mm, psize) \
139 do { \ 140 do { \
140 (mm)->context.user_psize = (psize); \ 141 (mm)->context.user_psize = (psize); \
141 (mm)->context.sllp = SLB_VSID_USER | mmu_psize_defs[(psize)].sllp; \ 142 (mm)->context.sllp = SLB_VSID_USER | mmu_psize_defs[(psize)].sllp; \
142 } while (0) 143 } while (0)
143 #define slice_set_range_psize(mm, start, len, psize) \ 144 #define slice_set_range_psize(mm, start, len, psize) \
144 slice_set_user_psize((mm), (psize)) 145 slice_set_user_psize((mm), (psize))
145 #define slice_mm_new_context(mm) 1 146 #define slice_mm_new_context(mm) 1
146 #endif /* CONFIG_PPC_MM_SLICES */ 147 #endif /* CONFIG_PPC_MM_SLICES */
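Both halves of this #ifdef look the same to a caller: with CONFIG_PPC_MM_SLICES each slice of an mm can carry its own page size, while without it the whole mm uses context.user_psize. A hedged usage sketch follows; the range values are made up and this is not a caller added by the patch, it only shows how the interface is meant to be used:

    /* Sketch: mark a (made-up) range as backed by 16M pages and read
     * the page size back for an address inside it.                    */
    static void example_slice_usage(struct mm_struct *mm)
    {
            unsigned int psize;

            slice_set_range_psize(mm, 0x30000000UL, 0x10000000UL, MMU_PAGE_16M);
            psize = get_slice_psize(mm, 0x30001000UL);  /* MMU_PAGE_16M */
            (void)psize;
    }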
147 148
148 #ifdef CONFIG_HUGETLB_PAGE 149 #ifdef CONFIG_HUGETLB_PAGE
149 150
150 #define HAVE_ARCH_HUGETLB_UNMAPPED_AREA 151 #define HAVE_ARCH_HUGETLB_UNMAPPED_AREA
151 152
152 #endif /* !CONFIG_HUGETLB_PAGE */ 153 #endif /* !CONFIG_HUGETLB_PAGE */
153 154
154 #ifdef MODULE 155 #ifdef MODULE
155 #define __page_aligned __attribute__((__aligned__(PAGE_SIZE))) 156 #define __page_aligned __attribute__((__aligned__(PAGE_SIZE)))
156 #else 157 #else
157 #define __page_aligned \ 158 #define __page_aligned \
158 __attribute__((__aligned__(PAGE_SIZE), \ 159 __attribute__((__aligned__(PAGE_SIZE), \
159 __section__(".data.page_aligned"))) 160 __section__(".data.page_aligned")))
160 #endif 161 #endif
161 162
162 #define VM_DATA_DEFAULT_FLAGS \ 163 #define VM_DATA_DEFAULT_FLAGS \
163 (test_thread_flag(TIF_32BIT) ? \ 164 (test_thread_flag(TIF_32BIT) ? \
164 VM_DATA_DEFAULT_FLAGS32 : VM_DATA_DEFAULT_FLAGS64) 165 VM_DATA_DEFAULT_FLAGS32 : VM_DATA_DEFAULT_FLAGS64)
165 166
166 /* 167 /*
167 * This is the default if a program doesn't have a PT_GNU_STACK 168 * This is the default if a program doesn't have a PT_GNU_STACK
168 * program header entry. The PPC64 ELF ABI has a non-executable stack 169 * program header entry. The PPC64 ELF ABI has a non-executable stack
169 * by default, so in the absence of a PT_GNU_STACK program header 170 * by default, so in the absence of a PT_GNU_STACK program header
170 * we turn execute permission off. 171 * we turn execute permission off.
171 */ 172 */
172 #define VM_STACK_DEFAULT_FLAGS32 (VM_READ | VM_WRITE | VM_EXEC | \ 173 #define VM_STACK_DEFAULT_FLAGS32 (VM_READ | VM_WRITE | VM_EXEC | \
173 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC) 174 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
174 175
175 #define VM_STACK_DEFAULT_FLAGS64 (VM_READ | VM_WRITE | \ 176 #define VM_STACK_DEFAULT_FLAGS64 (VM_READ | VM_WRITE | \
176 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC) 177 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
177 178
178 #define VM_STACK_DEFAULT_FLAGS \ 179 #define VM_STACK_DEFAULT_FLAGS \
179 (test_thread_flag(TIF_32BIT) ? \ 180 (test_thread_flag(TIF_32BIT) ? \
180 VM_STACK_DEFAULT_FLAGS32 : VM_STACK_DEFAULT_FLAGS64) 181 VM_STACK_DEFAULT_FLAGS32 : VM_STACK_DEFAULT_FLAGS64)
181 182
182 #include <asm-generic/page.h> 183 #include <asm-generic/page.h>
183 184
184 #endif /* _ASM_POWERPC_PAGE_64_H */ 185 #endif /* _ASM_POWERPC_PAGE_64_H */
185 186
include/asm-powerpc/pgalloc-64.h
1 #ifndef _ASM_POWERPC_PGALLOC_64_H 1 #ifndef _ASM_POWERPC_PGALLOC_64_H
2 #define _ASM_POWERPC_PGALLOC_64_H 2 #define _ASM_POWERPC_PGALLOC_64_H
3 /* 3 /*
4 * This program is free software; you can redistribute it and/or 4 * This program is free software; you can redistribute it and/or
5 * modify it under the terms of the GNU General Public License 5 * modify it under the terms of the GNU General Public License
6 * as published by the Free Software Foundation; either version 6 * as published by the Free Software Foundation; either version
7 * 2 of the License, or (at your option) any later version. 7 * 2 of the License, or (at your option) any later version.
8 */ 8 */
9 9
10 #include <linux/mm.h> 10 #include <linux/mm.h>
11 #include <linux/slab.h> 11 #include <linux/slab.h>
12 #include <linux/cpumask.h> 12 #include <linux/cpumask.h>
13 #include <linux/percpu.h> 13 #include <linux/percpu.h>
14 14
15 #ifndef CONFIG_PPC_SUBPAGE_PROT 15 #ifndef CONFIG_PPC_SUBPAGE_PROT
16 static inline void subpage_prot_free(pgd_t *pgd) {} 16 static inline void subpage_prot_free(pgd_t *pgd) {}
17 #endif 17 #endif
18 18
19 extern struct kmem_cache *pgtable_cache[]; 19 extern struct kmem_cache *pgtable_cache[];
20 20
21 #define PGD_CACHE_NUM 0 21 #define PGD_CACHE_NUM 0
22 #define PUD_CACHE_NUM 1 22 #define PUD_CACHE_NUM 1
23 #define PMD_CACHE_NUM 1 23 #define PMD_CACHE_NUM 1
24 #define HUGEPTE_CACHE_NUM 2 24 #define HUGEPTE_CACHE_NUM 2
25 #define PTE_NONCACHE_NUM 3 /* from GFP rather than kmem_cache */ 25 #define PTE_NONCACHE_NUM 7 /* from GFP rather than kmem_cache */
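The noncache marker moves from 3 to 7 here (and PGF_CACHENUM_MASK widens from 0x3 to 0x7 further down) so that extra kmem_cache slots can sit above HUGEPTE_CACHE_NUM, one hugepte cache per supported huge page size, while the highest encodable value still flags pages taken straight from the page allocator. An assumed sketch of the resulting index space; the exact hugepte slot assignment lives in the hugetlb code, not this header:

    /*
     * Assumed layout of pgtable_cache[] indices after this patch:
     *   0      PGD_CACHE_NUM
     *   1      PUD_CACHE_NUM / PMD_CACHE_NUM (shared cache)
     *   2..    hugepte caches, starting at HUGEPTE_CACHE_NUM,
     *          one per supported huge page size
     *   7      PTE_NONCACHE_NUM - from __get_free_page(), not a cache
     * Every index must fit under PGF_CACHENUM_MASK (0x7), because it is
     * stored in the low bits of the table pointer by pgtable_free_cache().
     */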
26 26
27 static inline pgd_t *pgd_alloc(struct mm_struct *mm) 27 static inline pgd_t *pgd_alloc(struct mm_struct *mm)
28 { 28 {
29 return kmem_cache_alloc(pgtable_cache[PGD_CACHE_NUM], GFP_KERNEL); 29 return kmem_cache_alloc(pgtable_cache[PGD_CACHE_NUM], GFP_KERNEL);
30 } 30 }
31 31
32 static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd) 32 static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
33 { 33 {
34 subpage_prot_free(pgd); 34 subpage_prot_free(pgd);
35 kmem_cache_free(pgtable_cache[PGD_CACHE_NUM], pgd); 35 kmem_cache_free(pgtable_cache[PGD_CACHE_NUM], pgd);
36 } 36 }
37 37
38 #ifndef CONFIG_PPC_64K_PAGES 38 #ifndef CONFIG_PPC_64K_PAGES
39 39
40 #define pgd_populate(MM, PGD, PUD) pgd_set(PGD, PUD) 40 #define pgd_populate(MM, PGD, PUD) pgd_set(PGD, PUD)
41 41
42 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr) 42 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
43 { 43 {
44 return kmem_cache_alloc(pgtable_cache[PUD_CACHE_NUM], 44 return kmem_cache_alloc(pgtable_cache[PUD_CACHE_NUM],
45 GFP_KERNEL|__GFP_REPEAT); 45 GFP_KERNEL|__GFP_REPEAT);
46 } 46 }
47 47
48 static inline void pud_free(struct mm_struct *mm, pud_t *pud) 48 static inline void pud_free(struct mm_struct *mm, pud_t *pud)
49 { 49 {
50 kmem_cache_free(pgtable_cache[PUD_CACHE_NUM], pud); 50 kmem_cache_free(pgtable_cache[PUD_CACHE_NUM], pud);
51 } 51 }
52 52
53 static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd) 53 static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
54 { 54 {
55 pud_set(pud, (unsigned long)pmd); 55 pud_set(pud, (unsigned long)pmd);
56 } 56 }
57 57
58 #define pmd_populate(mm, pmd, pte_page) \ 58 #define pmd_populate(mm, pmd, pte_page) \
59 pmd_populate_kernel(mm, pmd, page_address(pte_page)) 59 pmd_populate_kernel(mm, pmd, page_address(pte_page))
60 #define pmd_populate_kernel(mm, pmd, pte) pmd_set(pmd, (unsigned long)(pte)) 60 #define pmd_populate_kernel(mm, pmd, pte) pmd_set(pmd, (unsigned long)(pte))
61 #define pmd_pgtable(pmd) pmd_page(pmd) 61 #define pmd_pgtable(pmd) pmd_page(pmd)
62 62
63 63
64 #else /* CONFIG_PPC_64K_PAGES */ 64 #else /* CONFIG_PPC_64K_PAGES */
65 65
66 #define pud_populate(mm, pud, pmd) pud_set(pud, (unsigned long)pmd) 66 #define pud_populate(mm, pud, pmd) pud_set(pud, (unsigned long)pmd)
67 67
68 static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd, 68 static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
69 pte_t *pte) 69 pte_t *pte)
70 { 70 {
71 pmd_set(pmd, (unsigned long)pte); 71 pmd_set(pmd, (unsigned long)pte);
72 } 72 }
73 73
74 #define pmd_populate(mm, pmd, pte_page) \ 74 #define pmd_populate(mm, pmd, pte_page) \
75 pmd_populate_kernel(mm, pmd, page_address(pte_page)) 75 pmd_populate_kernel(mm, pmd, page_address(pte_page))
76 #define pmd_pgtable(pmd) pmd_page(pmd) 76 #define pmd_pgtable(pmd) pmd_page(pmd)
77 77
78 #endif /* CONFIG_PPC_64K_PAGES */ 78 #endif /* CONFIG_PPC_64K_PAGES */
79 79
80 static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr) 80 static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
81 { 81 {
82 return kmem_cache_alloc(pgtable_cache[PMD_CACHE_NUM], 82 return kmem_cache_alloc(pgtable_cache[PMD_CACHE_NUM],
83 GFP_KERNEL|__GFP_REPEAT); 83 GFP_KERNEL|__GFP_REPEAT);
84 } 84 }
85 85
86 static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd) 86 static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
87 { 87 {
88 kmem_cache_free(pgtable_cache[PMD_CACHE_NUM], pmd); 88 kmem_cache_free(pgtable_cache[PMD_CACHE_NUM], pmd);
89 } 89 }
90 90
91 static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm, 91 static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
92 unsigned long address) 92 unsigned long address)
93 { 93 {
94 return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_REPEAT | __GFP_ZERO); 94 return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_REPEAT | __GFP_ZERO);
95 } 95 }
96 96
97 static inline pgtable_t pte_alloc_one(struct mm_struct *mm, 97 static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
98 unsigned long address) 98 unsigned long address)
99 { 99 {
100 struct page *page; 100 struct page *page;
101 pte_t *pte; 101 pte_t *pte;
102 102
103 pte = pte_alloc_one_kernel(mm, address); 103 pte = pte_alloc_one_kernel(mm, address);
104 if (!pte) 104 if (!pte)
105 return NULL; 105 return NULL;
106 page = virt_to_page(pte); 106 page = virt_to_page(pte);
107 pgtable_page_ctor(page); 107 pgtable_page_ctor(page);
108 return page; 108 return page;
109 } 109 }
110 110
111 static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte) 111 static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
112 { 112 {
113 free_page((unsigned long)pte); 113 free_page((unsigned long)pte);
114 } 114 }
115 115
116 static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage) 116 static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
117 { 117 {
118 pgtable_page_dtor(ptepage); 118 pgtable_page_dtor(ptepage);
119 __free_page(ptepage); 119 __free_page(ptepage);
120 } 120 }
121 121
122 #define PGF_CACHENUM_MASK 0x3 122 #define PGF_CACHENUM_MASK 0x7
123 123
124 typedef struct pgtable_free { 124 typedef struct pgtable_free {
125 unsigned long val; 125 unsigned long val;
126 } pgtable_free_t; 126 } pgtable_free_t;
127 127
128 static inline pgtable_free_t pgtable_free_cache(void *p, int cachenum, 128 static inline pgtable_free_t pgtable_free_cache(void *p, int cachenum,
129 unsigned long mask) 129 unsigned long mask)
130 { 130 {
131 BUG_ON(cachenum > PGF_CACHENUM_MASK); 131 BUG_ON(cachenum > PGF_CACHENUM_MASK);
132 132
133 return (pgtable_free_t){.val = ((unsigned long) p & ~mask) | cachenum}; 133 return (pgtable_free_t){.val = ((unsigned long) p & ~mask) | cachenum};
134 } 134 }
135 135
136 static inline void pgtable_free(pgtable_free_t pgf) 136 static inline void pgtable_free(pgtable_free_t pgf)
137 { 137 {
138 void *p = (void *)(pgf.val & ~PGF_CACHENUM_MASK); 138 void *p = (void *)(pgf.val & ~PGF_CACHENUM_MASK);
139 int cachenum = pgf.val & PGF_CACHENUM_MASK; 139 int cachenum = pgf.val & PGF_CACHENUM_MASK;
140 140
141 if (cachenum == PTE_NONCACHE_NUM) 141 if (cachenum == PTE_NONCACHE_NUM)
142 free_page((unsigned long)p); 142 free_page((unsigned long)p);
143 else 143 else
144 kmem_cache_free(pgtable_cache[cachenum], p); 144 kmem_cache_free(pgtable_cache[cachenum], p);
145 } 145 }
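pgtable_free_cache() can stash the cache number in the pointer because every table is aligned to at least its table size, so the low PGF_CACHENUM_MASK bits are guaranteed to be zero; pgtable_free() masks them back off and frees through the matching kmem_cache, or through free_page() for the PTE_NONCACHE_NUM sentinel. A round-trip sketch mirroring the __pmd_free_tlb macro defined just below (illustrative only, not new kernel code):

    /* Sketch: what a deferred pmd free boils down to. */
    static void example_deferred_pmd_free(struct mmu_gather *tlb, pmd_t *pmd)
    {
            /* low bits of the aligned pointer carry PMD_CACHE_NUM ...       */
            pgtable_free_t pgf = pgtable_free_cache(pmd, PMD_CACHE_NUM,
                                                    PMD_TABLE_SIZE - 1);

            /* ... and pgtable_free_tlb()/pgtable_free() strip them off and
             * end in kmem_cache_free(pgtable_cache[PMD_CACHE_NUM], pmd).    */
            pgtable_free_tlb(tlb, pgf);
    }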
146 146
147 extern void pgtable_free_tlb(struct mmu_gather *tlb, pgtable_free_t pgf); 147 extern void pgtable_free_tlb(struct mmu_gather *tlb, pgtable_free_t pgf);
148 148
149 #define __pte_free_tlb(tlb,ptepage) \ 149 #define __pte_free_tlb(tlb,ptepage) \
150 do { \ 150 do { \
151 pgtable_page_dtor(ptepage); \ 151 pgtable_page_dtor(ptepage); \
152 pgtable_free_tlb(tlb, pgtable_free_cache(page_address(ptepage), \ 152 pgtable_free_tlb(tlb, pgtable_free_cache(page_address(ptepage), \
153 PTE_NONCACHE_NUM, PTE_TABLE_SIZE-1)); \ 153 PTE_NONCACHE_NUM, PTE_TABLE_SIZE-1)); \
154 } while (0) 154 } while (0)
155 #define __pmd_free_tlb(tlb, pmd) \ 155 #define __pmd_free_tlb(tlb, pmd) \
156 pgtable_free_tlb(tlb, pgtable_free_cache(pmd, \ 156 pgtable_free_tlb(tlb, pgtable_free_cache(pmd, \
157 PMD_CACHE_NUM, PMD_TABLE_SIZE-1)) 157 PMD_CACHE_NUM, PMD_TABLE_SIZE-1))
158 #ifndef CONFIG_PPC_64K_PAGES 158 #ifndef CONFIG_PPC_64K_PAGES
159 #define __pud_free_tlb(tlb, pud) \ 159 #define __pud_free_tlb(tlb, pud) \
160 pgtable_free_tlb(tlb, pgtable_free_cache(pud, \ 160 pgtable_free_tlb(tlb, pgtable_free_cache(pud, \
161 PUD_CACHE_NUM, PUD_TABLE_SIZE-1)) 161 PUD_CACHE_NUM, PUD_TABLE_SIZE-1))
162 #endif /* CONFIG_PPC_64K_PAGES */ 162 #endif /* CONFIG_PPC_64K_PAGES */
163 163
164 #define check_pgt_cache() do { } while (0) 164 #define check_pgt_cache() do { } while (0)
165 165
166 #endif /* _ASM_POWERPC_PGALLOC_64_H */ 166 #endif /* _ASM_POWERPC_PGALLOC_64_H */
167 167