.. SPDX-License-Identifier: GPL-2.0

======================
Memory Protection Keys
======================

Memory Protection Keys for Userspace (PKU, also known as PKEYs) is a
feature found on Intel's Skylake "Scalable Processor" server CPUs.
It will be available in future non-server parts.

For anyone wishing to test or use this feature, it is available in
Amazon's EC2 C5 instances and is known to work there using an Ubuntu
17.04 image.
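One way to check for the feature at run time is the CPUID instruction.
The following is a minimal sketch, not taken from the kernel sources: it
assumes a GCC- or Clang-compatible compiler that provides cpuid.h, and the
function names are hypothetical. It tests the PKU and OSPKE bits (CPUID
leaf 7, subleaf 0, ECX bits 3 and 4); OSPKE indicates that the operating
system has enabled the feature.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID.(EAX=7,ECX=0):ECX feature bits. */
#define CPUID_7_ECX_PKU   (1u << 3)  /* CPU implements protection keys */
#define CPUID_7_ECX_OSPKE (1u << 4)  /* OS has enabled them (CR4.PKE) */

/* Hypothetical helper: decide usability from a raw ECX value. */
bool pku_usable_from_ecx(uint32_t ecx)
{
    uint32_t need = CPUID_7_ECX_PKU | CPUID_7_ECX_OSPKE;

    return (ecx & need) == need;
}

/* Hypothetical helper: query the running CPU; false on non-x86 builds. */
bool cpu_has_usable_pku(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    if (__get_cpuid_max(0, NULL) < 7)
        return false;            /* leaf 7 is not implemented */
    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    (void)eax; (void)ebx; (void)edx;
    return pku_usable_from_ecx(ecx);
#else
    return false;
#endif
}
```

A program would typically call cpu_has_usable_pku() once at startup and
fall back to plain mprotect() when it returns false.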

Memory Protection Keys provides a mechanism for enforcing page-based
protections, but without requiring modification of the page tables
when an application changes protection domains.  It works by
dedicating 4 previously ignored bits in each page table entry to a
"protection key", giving 16 possible keys.

There is also a new user-accessible register (PKRU) with two separate
bits (Access Disable and Write Disable) for each key.  Being a CPU
register, PKRU is inherently thread-local, potentially giving each
thread a different set of protections from every other thread.

There are two new instructions (RDPKRU/WRPKRU) for reading and writing
to the new register.  The feature is only available in 64-bit mode,
even though there is theoretically space in the PAE PTEs.  These
permissions are enforced on data access only and have no effect on
instruction fetches.
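The register layout described above can be illustrated with a little bit
arithmetic. This is only a sketch: the helper names are hypothetical, and
it assumes the usual encoding in which key N owns bits 2N (Access Disable)
and 2N+1 (Write Disable) of the 32-bit PKRU value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PKEY_DISABLE_ACCESS 0x1u  /* Access Disable (AD) bit of a key */
#define PKEY_DISABLE_WRITE  0x2u  /* Write Disable (WD) bit of a key */
#define PKRU_BITS_PER_PKEY  2
#define NR_PKEYS            16    /* 4 PTE bits give 16 keys */

/* Extract the two rights bits that belong to one key. */
uint32_t pkru_rights_for_key(uint32_t pkru, int pkey)
{
    assert(pkey >= 0 && pkey < NR_PKEYS);
    return (pkru >> (pkey * PKRU_BITS_PER_PKEY)) & 0x3u;
}

/* Data reads are allowed unless Access Disable is set. */
bool pkru_allows_read(uint32_t pkru, int pkey)
{
    return !(pkru_rights_for_key(pkru, pkey) & PKEY_DISABLE_ACCESS);
}

/* Data writes are allowed only if neither AD nor WD is set. */
bool pkru_allows_write(uint32_t pkru, int pkey)
{
    uint32_t deny = PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE;

    return !(pkru_rights_for_key(pkru, pkey) & deny);
}
```

For example, a PKRU value of 0x55555554 has Access Disable set for keys 1
through 15 and leaves key 0 fully accessible.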

Syscalls
========

There are 3 system calls which directly interact with pkeys::

    int pkey_alloc(unsigned long flags, unsigned long init_access_rights);
    int pkey_free(int pkey);
    int pkey_mprotect(unsigned long start, size_t len,
                      unsigned long prot, int pkey);

Before a pkey can be used, it must first be allocated with
pkey_alloc().  An application calls the WRPKRU instruction
directly in order to change access permissions to memory covered
with a key.  In this example WRPKRU is wrapped by a C function
called pkey_set().
::

    int real_prot = PROT_READ|PROT_WRITE;
    pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
    ptr = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
    ret = pkey_mprotect(ptr, PAGE_SIZE, real_prot, pkey);
    ... application runs here

Now, if the application needs to update the data at 'ptr', it can
gain access, do the update, then remove its write access::

    pkey_set(pkey, 0); // clear PKEY_DISABLE_WRITE
    *ptr = foo; // assign something
    pkey_set(pkey, PKEY_DISABLE_WRITE); // set PKEY_DISABLE_WRITE again

Now when it frees the memory, it will also free the pkey since it
is no longer in use::

    munmap(ptr, PAGE_SIZE);
    pkey_free(pkey);

.. note:: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
          An example implementation can be found in
          tools/testing/selftests/x86/protection_keys.c.
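As a rough illustration of what such a wrapper does, the sketch below
performs a read-modify-write of PKRU. It is not the selftest code: the
names pkru_with_rights() and pkey_set_sketch() are hypothetical, the
instruction wrappers are only built for x86-64, and executing them on a
system where the OS has not enabled protection keys raises an invalid
opcode fault.

```c
#include <assert.h>
#include <stdint.h>

#define PKEY_DISABLE_ACCESS 0x1u
#define PKEY_DISABLE_WRITE  0x2u
#define PKEY_RIGHTS_MASK    (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE)

/* Pure part: return 'pkru' with the two bits of 'pkey' set to 'rights'. */
uint32_t pkru_with_rights(uint32_t pkru, int pkey, uint32_t rights)
{
    assert(pkey >= 0 && pkey < 16);
    assert((rights & ~PKEY_RIGHTS_MASK) == 0);

    unsigned int shift = (unsigned int)pkey * 2;

    pkru &= ~(PKEY_RIGHTS_MASK << shift);  /* clear the old AD/WD bits */
    pkru |= rights << shift;               /* install the new ones */
    return pkru;
}

#if defined(__x86_64__)
/* RDPKRU needs ECX = 0; it returns PKRU in EAX and clears EDX. */
static inline uint32_t rdpkru(void)
{
    uint32_t eax, edx;

    __asm__ volatile(".byte 0x0f, 0x01, 0xee"
                     : "=a"(eax), "=d"(edx)
                     : "c"(0));
    (void)edx;
    return eax;
}

/* WRPKRU needs ECX = 0 and EDX = 0; it loads PKRU from EAX. */
static inline void wrpkru(uint32_t pkru)
{
    __asm__ volatile(".byte 0x0f, 0x01, 0xef"
                     :
                     : "a"(pkru), "c"(0), "d"(0)
                     : "memory");
}

/* Hypothetical stand-in for pkey_set(): affects the calling thread only. */
static inline void pkey_set_sketch(int pkey, uint32_t rights)
{
    wrpkru(pkru_with_rights(rdpkru(), pkey, rights));
}
#endif
```

Keeping the bit manipulation in a separate pure function makes it
possible to test the logic on machines without protection-key support.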

Behavior
========

The kernel attempts to make protection keys consistent with the
behavior of a plain mprotect().  For instance if you do this::

    mprotect(ptr, size, PROT_NONE);
    something(ptr);

you can expect the same effects with protection keys when doing this::

    pkey = pkey_alloc(0, PKEY_DISABLE_ACCESS);
    pkey_mprotect(ptr, size, PROT_READ|PROT_WRITE, pkey);
    something(ptr);

That should be true whether something() is a direct access to 'ptr'
like::

    *ptr = foo;

or when the kernel does the access on the application's behalf like
with a read()::

    read(fd, ptr, 1);

The kernel will send a SIGSEGV in both cases, but si_code will be set
to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
the plain mprotect() permissions are violated.