Commit 37b8304642c7f91df54888955c373ae89b577fcc

Authored by Nicolas Pitre
Committed by Nicolas Pitre
1 parent 2c53b436a3

ARM: kuser: move interface documentation out of the source code

Digging into an assembly file in order to get information about the
kuser helpers is not that convenient.  Let's move that information to
a better-formatted file in Documentation/arm/ and improve on it a bit.

Thanks to Dave Martin <dave.martin@linaro.org> for the initial cleanup and
clarifications.

Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Acked-by: Dave Martin <dave.martin@linaro.org>

Showing 2 changed files with 204 additions and 152 deletions

Documentation/arm/kernel_user_helpers.txt
  1 +Kernel-provided User Helpers
  2 +============================
  3 +
  4 +These are segments of kernel-provided user code reachable from user space
  5 +at a fixed address in kernel memory. They are used to provide user space
  6 +with some operations which require kernel help because the necessary
  7 +native features and/or instructions are unimplemented in many ARM CPUs.
  8 +The idea is for this code to be executed directly in user mode for best
  9 +efficiency, while remaining too intimate with the kernel counterpart to
  10 +be left to user libraries. In fact this code might even differ from one
  11 +CPU to another depending on the available instruction set, or whether it
  12 +is an SMP system. In other words, the kernel reserves the right to change
  13 +this code as needed without warning. Only the entry points and their
  14 +results as documented here are guaranteed to be stable.
  15 +
  16 +This is different from (but doesn't preclude) a full-blown VDSO
  17 +implementation; however, a VDSO would prevent some assembly tricks with
  18 +constants that allow for efficient branching to those code segments. And
  19 +since those code segments only use a few cycles before returning to user
  20 +code, an indirect far call through a VDSO would add measurable overhead
  21 +to such minimalistic operations.
  22 +
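   +One example of such a trick, carried over from the comments this file
   +replaces in arch/arm/kernel/entry-armv.S: the helper address is built
   +entirely from immediate operands (0xffff0fff - 31 = 0xffff0fe0, the
   +__kuser_get_tls entry point), so no literal pool load or indirect call
   +is needed. A sketch in GCC inline assembly, assuming an ARM-state caller
   +(the __kernel_* naming is the one used in those old comments):
   +
   +#define __kernel_get_tls() \
   +	({ register unsigned int __val asm("r0"); \
   +	   asm( "mov r0, #0xffff0fff; mov lr, pc; sub pc, r0, #31" \
   +	        : "=r" (__val) : : "lr","cc" ); \
   +	   __val; })
   +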
  23 +User space is expected to bypass those helpers and implement those things
  24 +inline (either in the code emitted directly by the compiler, or as part
  25 +of the implementation of a library call) when optimizing for a recent
  26 +enough processor that has the necessary native support, but only if the
  27 +resulting binaries are already going to be incompatible with earlier ARM
  28 +processors due to the use of similar native instructions for other
  29 +things. In other words, don't make binaries unable to run on earlier
  30 +processors just for the sake of not using these kernel helpers if your
  31 +compiled code is not going to use new instructions for other purposes.
  32 +
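   +For instance, a binary built only for ARMv7 can let the compiler emit
   +ldrex/strex directly instead of calling the cmpxchg helper described
   +below. A hypothetical dispatch is sketched here; the function name, the
   +GCC __sync builtin and the __ARM_ARCH_7A__ predefine are illustrative
   +assumptions, not part of this interface:
   +
   +static inline int my_cmpxchg(int oldval, int newval, volatile int *ptr)
   +{
   +#if defined(__ARM_ARCH_7A__)
   +	/* Recent target: the compiler inlines ldrex/strex, no kernel help. */
   +	return !__sync_bool_compare_and_swap(ptr, oldval, newval);
   +#else
   +	/* Older CPUs possible: use the kernel helper (returns 0 on success). */
   +	typedef int (__kuser_cmpxchg_t)(int, int, volatile int *);
   +	return (*(__kuser_cmpxchg_t *)0xffff0fc0)(oldval, newval, ptr);
   +#endif
   +}
   +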
  33 +New helpers may be added over time, so an older kernel may be missing some
  34 +helpers present in a newer kernel. For this reason, programs must check
  35 +the value of __kuser_helper_version (see below) before assuming that it is
  36 +safe to call any particular helper. This check should ideally be
  37 +performed only once at process startup time, and execution aborted early
  38 +if the required helpers are not provided by the kernel version the
  39 +process is running on.
  40 +
  41 +kuser_helper_version
  42 +--------------------
  43 +
  44 +Location: 0xffff0ffc
  45 +
  46 +Reference declaration:
  47 +
  48 + extern int32_t __kuser_helper_version;
  49 +
  50 +Definition:
  51 +
  52 + This field contains the number of helpers being implemented by the
  53 + running kernel. User space may read this to determine the availability
  54 + of a particular helper.
  55 +
  56 +Usage example:
  57 +
  58 +#define __kuser_helper_version (*(int32_t *)0xffff0ffc)
  59 +
  60 +void check_kuser_version(void)
  61 +{
  62 +	if (__kuser_helper_version < 2) {
  63 +		fprintf(stderr, "can't do atomic operations, kernel too old\n");
  64 +		abort();
  65 +	}
  66 +}
  67 +
  68 +Notes:
  69 +
  70 + User space may assume that the value of this field never changes
  71 + during the lifetime of any single process. This means that this
  72 + field can be read once during the initialisation of a library or
  73 + startup phase of a program.
  74 +
  75 +kuser_get_tls
  76 +-------------
  77 +
  78 +Location: 0xffff0fe0
  79 +
  80 +Reference prototype:
  81 +
  82 + void * __kuser_get_tls(void);
  83 +
  84 +Input:
  85 +
  86 + lr = return address
  87 +
  88 +Output:
  89 +
  90 + r0 = TLS value
  91 +
  92 +Clobbered registers:
  93 +
  94 + none
  95 +
  96 +Definition:
  97 +
  98 + Get the TLS value as previously set via the __ARM_NR_set_tls syscall.
  99 +
  100 +Usage example:
  101 +
  102 +typedef void * (__kuser_get_tls_t)(void);
  103 +#define __kuser_get_tls (*(__kuser_get_tls_t *)0xffff0fe0)
  104 +
  105 +void foo()
  106 +{
  107 +	void *tls = __kuser_get_tls();
  108 +	printf("TLS = %p\n", tls);
  109 +}
  110 +
  111 +Notes:
  112 +
  113 + - Valid only if __kuser_helper_version >= 1 (from kernel version 2.6.12).
  114 +
  115 +kuser_cmpxchg
  116 +-------------
  117 +
  118 +Location: 0xffff0fc0
  119 +
  120 +Reference prototype:
  121 +
  122 + int __kuser_cmpxchg(int32_t oldval, int32_t newval, volatile int32_t *ptr);
  123 +
  124 +Input:
  125 +
  126 + r0 = oldval
  127 + r1 = newval
  128 + r2 = ptr
  129 + lr = return address
  130 +
  131 +Output:
  132 +
  133 + r0 = success code (zero or non-zero)
  134 + C flag = set if r0 == 0, clear if r0 != 0
  135 +
  136 +Clobbered registers:
  137 +
  138 + r3, ip, flags
  139 +
  140 +Definition:
  141 +
  142 + Atomically store newval in *ptr only if *ptr is equal to oldval.
  143 + Return zero if *ptr was changed or non-zero if no exchange happened.
  144 + The C flag is also set if *ptr was changed to allow for assembly
  145 + optimization in the calling code.
  146 +
  147 +Usage example:
  148 +
  149 +typedef int (__kuser_cmpxchg_t)(int oldval, int newval, volatile int *ptr);
  150 +#define __kuser_cmpxchg (*(__kuser_cmpxchg_t *)0xffff0fc0)
  151 +
  152 +int atomic_add(volatile int *ptr, int val)
  153 +{
  154 +	int old, new;
  155 +
  156 +	do {
  157 +		old = *ptr;
  158 +		new = old + val;
  159 +	} while (__kuser_cmpxchg(old, new, ptr));
  160 +
  161 +	return new;
  162 +}
  163 +
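   +The C flag output exists so that an assembly caller can retry without a
   +separate comparison. The comments this file replaces in
   +arch/arm/kernel/entry-armv.S sketched an atomic_add written that way
   +(GCC inline assembly, ARM state assumed):
   +
   +#define atomic_add(ptr, val) \
   +	({ register unsigned int *__ptr asm("r2") = (ptr); \
   +	   register unsigned int __result asm("r1"); \
   +	   asm volatile ( \
   +	       "1: @ atomic_add\n\t" \
   +	       "ldr	r0, [r2]\n\t" \
   +	       "mov	r3, #0xffff0fff\n\t" \
   +	       "add	lr, pc, #4\n\t" \
   +	       "add	r1, r0, %2\n\t" \
   +	       "add	pc, r3, #(0xffff0fc0 - 0xffff0fff)\n\t" \
   +	       "bcc	1b" \
   +	       : "=&r" (__result) \
   +	       : "r" (__ptr), "rIL" (val) \
   +	       : "r0","r3","ip","lr","cc","memory" ); \
   +	   __result; })
   +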
  164 +Notes:
  165 +
  166 + - This routine already includes memory barriers as needed.
  167 +
  168 + - Valid only if __kuser_helper_version >= 2 (from kernel version 2.6.12).
  169 +
  170 +kuser_memory_barrier
  171 +--------------------
  172 +
  173 +Location: 0xffff0fa0
  174 +
  175 +Reference prototype:
  176 +
  177 + void __kuser_memory_barrier(void);
  178 +
  179 +Input:
  180 +
  181 + lr = return address
  182 +
  183 +Output:
  184 +
  185 + none
  186 +
  187 +Clobbered registers:
  188 +
  189 + none
  190 +
  191 +Definition:
  192 +
  193 + Apply any needed memory barrier to preserve consistency between data
  194 + modified manually and uses of __kuser_cmpxchg.
  195 +
  196 +Usage example:
  197 +
  198 +typedef void (__kuser_dmb_t)(void);
  199 +#define __kuser_dmb (*(__kuser_dmb_t *)0xffff0fa0)
  200 +
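   +The comments this file replaces in arch/arm/kernel/entry-armv.S also
   +showed the bare call sequence (0xffff0fff - 95 = 0xffff0fa0, the helper's
   +entry point; GCC inline assembly, ARM state assumed, with the __kernel_*
   +naming used in those old comments):
   +
   +#define __kernel_dmb() \
   +	asm volatile ( "mov r0, #0xffff0fff; mov lr, pc; sub pc, r0, #95" \
   +	    : : : "r0", "lr", "cc" )
   +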
  201 +Notes:
  202 +
  203 + - Valid only if __kuser_helper_version >= 3 (from kernel version 2.6.15).
arch/arm/kernel/entry-armv.S
... ... @@ -754,31 +754,12 @@
754 754 /*
755 755 * User helpers.
756 756 *
757   - * These are segment of kernel provided user code reachable from user space
758   - * at a fixed address in kernel memory. This is used to provide user space
759   - * with some operations which require kernel help because of unimplemented
760   - * native feature and/or instructions in many ARM CPUs. The idea is for
761   - * this code to be executed directly in user mode for best efficiency but
762   - * which is too intimate with the kernel counter part to be left to user
763   - * libraries. In fact this code might even differ from one CPU to another
764   - * depending on the available instruction set and restrictions like on
765   - * SMP systems. In other words, the kernel reserves the right to change
766   - * this code as needed without warning. Only the entry points and their
767   - * results are guaranteed to be stable.
768   - *
769 757 * Each segment is 32-byte aligned and will be moved to the top of the high
770 758 * vector page. New segments (if ever needed) must be added in front of
771 759 * existing ones. This mechanism should be used only for things that are
772 760 * really small and justified, and not be abused freely.
773 761 *
774   - * User space is expected to implement those things inline when optimizing
775   - * for a processor that has the necessary native support, but only if such
776   - * resulting binaries are already to be incompatible with earlier ARM
777   - * processors due to the use of unsupported instructions other than what
778   - * is provided here. In other words don't make binaries unable to run on
779   - * earlier processors just for the sake of not using these kernel helpers
780   - * if your compiled code is not going to use the new instructions for other
781   - * purpose.
  762 + * See Documentation/arm/kernel_user_helpers.txt for formal definitions.
782 763 */
783 764 THUMB( .arm )
784 765  
785 766  
... ... @@ -794,98 +775,12 @@
794 775 .globl __kuser_helper_start
795 776 __kuser_helper_start:
796 777  
797   -/*
798   - * Reference prototype:
799   - *
800   - * void __kernel_memory_barrier(void)
801   - *
802   - * Input:
803   - *
804   - * lr = return address
805   - *
806   - * Output:
807   - *
808   - * none
809   - *
810   - * Clobbered:
811   - *
812   - * none
813   - *
814   - * Definition and user space usage example:
815   - *
816   - * typedef void (__kernel_dmb_t)(void);
817   - * #define __kernel_dmb (*(__kernel_dmb_t *)0xffff0fa0)
818   - *
819   - * Apply any needed memory barrier to preserve consistency with data modified
820   - * manually and __kuser_cmpxchg usage.
821   - *
822   - * This could be used as follows:
823   - *
824   - * #define __kernel_dmb() \
825   - * asm volatile ( "mov r0, #0xffff0fff; mov lr, pc; sub pc, r0, #95" \
826   - * : : : "r0", "lr","cc" )
827   - */
828   -
829 778 __kuser_memory_barrier: @ 0xffff0fa0
830 779 smp_dmb arm
831 780 usr_ret lr
832 781  
833 782 .align 5
834 783  
835   -/*
836   - * Reference prototype:
837   - *
838   - * int __kernel_cmpxchg(int oldval, int newval, int *ptr)
839   - *
840   - * Input:
841   - *
842   - * r0 = oldval
843   - * r1 = newval
844   - * r2 = ptr
845   - * lr = return address
846   - *
847   - * Output:
848   - *
849   - * r0 = returned value (zero or non-zero)
850   - * C flag = set if r0 == 0, clear if r0 != 0
851   - *
852   - * Clobbered:
853   - *
854   - * r3, ip, flags
855   - *
856   - * Definition and user space usage example:
857   - *
858   - * typedef int (__kernel_cmpxchg_t)(int oldval, int newval, int *ptr);
859   - * #define __kernel_cmpxchg (*(__kernel_cmpxchg_t *)0xffff0fc0)
860   - *
861   - * Atomically store newval in *ptr if *ptr is equal to oldval for user space.
862   - * Return zero if *ptr was changed or non-zero if no exchange happened.
863   - * The C flag is also set if *ptr was changed to allow for assembly
864   - * optimization in the calling code.
865   - *
866   - * Notes:
867   - *
868   - * - This routine already includes memory barriers as needed.
869   - *
870   - * For example, a user space atomic_add implementation could look like this:
871   - *
872   - * #define atomic_add(ptr, val) \
873   - * ({ register unsigned int *__ptr asm("r2") = (ptr); \
874   - * register unsigned int __result asm("r1"); \
875   - * asm volatile ( \
876   - * "1: @ atomic_add\n\t" \
877   - * "ldr r0, [r2]\n\t" \
878   - * "mov r3, #0xffff0fff\n\t" \
879   - * "add lr, pc, #4\n\t" \
880   - * "add r1, r0, %2\n\t" \
881   - * "add pc, r3, #(0xffff0fc0 - 0xffff0fff)\n\t" \
882   - * "bcc 1b" \
883   - * : "=&r" (__result) \
884   - * : "r" (__ptr), "rIL" (val) \
885   - * : "r0","r3","ip","lr","cc","memory" ); \
886   - * __result; })
887   - */
888   -
889 784 __kuser_cmpxchg: @ 0xffff0fc0
890 785  
891 786 #if defined(CONFIG_NEEDS_SYSCALL_FOR_CMPXCHG)
... ... @@ -959,39 +854,6 @@
959 854  
960 855 .align 5
961 856  
962   -/*
963   - * Reference prototype:
964   - *
965   - * int __kernel_get_tls(void)
966   - *
967   - * Input:
968   - *
969   - * lr = return address
970   - *
971   - * Output:
972   - *
973   - * r0 = TLS value
974   - *
975   - * Clobbered:
976   - *
977   - * none
978   - *
979   - * Definition and user space usage example:
980   - *
981   - * typedef int (__kernel_get_tls_t)(void);
982   - * #define __kernel_get_tls (*(__kernel_get_tls_t *)0xffff0fe0)
983   - *
984   - * Get the TLS value as previously set via the __ARM_NR_set_tls syscall.
985   - *
986   - * This could be used as follows:
987   - *
988   - * #define __kernel_get_tls() \
989   - * ({ register unsigned int __val asm("r0"); \
990   - * asm( "mov r0, #0xffff0fff; mov lr, pc; sub pc, r0, #31" \
991   - * : "=r" (__val) : : "lr","cc" ); \
992   - * __val; })
993   - */
994   -
995 857 __kuser_get_tls: @ 0xffff0fe0
996 858 ldr r0, [pc, #(16 - 8)] @ read TLS, set in kuser_get_tls_init
997 859 usr_ret lr
... ... @@ -999,19 +861,6 @@
999 861 .rep 4
1000 862 .word 0 @ 0xffff0ff0 software TLS value, then
1001 863 .endr @ pad up to __kuser_helper_version
1002   -
1003   -/*
1004   - * Reference declaration:
1005   - *
1006   - * extern unsigned int __kernel_helper_version;
1007   - *
1008   - * Definition and user space usage example:
1009   - *
1010   - * #define __kernel_helper_version (*(unsigned int *)0xffff0ffc)
1011   - *
1012   - * User space may read this to determine the curent number of helpers
1013   - * available.
1014   - */
1015 864  
1016 865 __kuser_helper_version: @ 0xffff0ffc
1017 866 .word ((__kuser_helper_end - __kuser_helper_start) >> 5)