Commit 14897e35fdc045fff9baabf0354570da22386706

Authored by Linus Torvalds

Merge branch 'docs' of git://git.lwn.net/linux-2.6

* 'docs' of git://git.lwn.net/linux-2.6:
  Add additional examples in Documentation/spinlocks.txt
  Move sched-rt-group.txt to scheduler/
  Documentation: move rpc-cache.txt to filesystems/
  Documentation: move nfsroot.txt to filesystems/
  Spell out behavior of atomic_dec_and_lock() in kerneldoc
  Fix a typo in highres.txt
  Fixes to the seq_file document
  Fill out information on patch tags in SubmittingPatches
  Add the seq_file documentation

Showing 18 changed files

Documentation/00-INDEX
... ... @@ -271,8 +271,6 @@
271 271 - directory with information on the NetLabel subsystem.
272 272 networking/
273 273 - directory with info on various aspects of networking with Linux.
274   -nfsroot.txt
275   - - short guide on setting up a diskless box with NFS root filesystem.
276 274 nmi_watchdog.txt
277 275 - info on NMI watchdog for SMP systems.
278 276 nommu-mmap.txt
... ... @@ -321,8 +319,6 @@
321 319 - a description of what robust futexes are.
322 320 rocket.txt
323 321 - info on the Comtrol RocketPort multiport serial driver.
324   -rpc-cache.txt
325   - - introduction to the caching mechanisms in the sunrpc layer.
326 322 rt-mutex-design.txt
327 323 - description of the RealTime mutex implementation design.
328 324 rt-mutex.txt
Documentation/SubmittingPatches
... ... @@ -328,7 +328,7 @@
328 328 point out some special detail about the sign-off.
329 329  
330 330  
331   -13) When to use Acked-by:
  331 +13) When to use Acked-by: and Cc:
332 332  
333 333 The Signed-off-by: tag indicates that the signer was involved in the
334 334 development of the patch, or that he/she was in the patch's delivery path.
335 335  
336 336  
... ... @@ -349,11 +349,59 @@
349 349 For example, if a patch affects multiple subsystems and has an Acked-by: from
350 350 one subsystem maintainer then this usually indicates acknowledgement of just
351 351 the part which affects that maintainer's code. Judgement should be used here.
352   - When in doubt people should refer to the original discussion in the mailing
  352 +When in doubt people should refer to the original discussion in the mailing
353 353 list archives.
354 354  
  355 +If a person has had the opportunity to comment on a patch, but has not
  356 +provided such comments, you may optionally add a "Cc:" tag to the patch.
  357 +This is the only tag which might be added without an explicit action by the
  358 +person it names. This tag documents that potentially interested parties
  359 +have been included in the discussion.
355 360  
356   -14) The canonical patch format
  361 +
  362 +14) Using Tested-by: and Reviewed-by:
  363 +
  364 +A Tested-by: tag indicates that the patch has been successfully tested (in
  365 +some environment) by the person named. This tag informs maintainers that
  366 +some testing has been performed, provides a means to locate testers for
  367 +future patches, and ensures credit for the testers.
  368 +
  369 +Reviewed-by:, instead, indicates that the patch has been reviewed and found
  370 +acceptable according to the Reviewer's Statement:
  371 +
  372 + Reviewer's statement of oversight
  373 +
  374 + By offering my Reviewed-by: tag, I state that:
  375 +
  376 + (a) I have carried out a technical review of this patch to
  377 + evaluate its appropriateness and readiness for inclusion into
  378 + the mainline kernel.
  379 +
  380 + (b) Any problems, concerns, or questions relating to the patch
  381 + have been communicated back to the submitter. I am satisfied
  382 + with the submitter's response to my comments.
  383 +
  384 + (c) While there may be things that could be improved with this
  385 + submission, I believe that it is, at this time, (1) a
  386 + worthwhile modification to the kernel, and (2) free of known
  387 + issues which would argue against its inclusion.
  388 +
  389 + (d) While I have reviewed the patch and believe it to be sound, I
  390 + do not (unless explicitly stated elsewhere) make any
  391 + warranties or guarantees that it will achieve its stated
  392 + purpose or function properly in any given situation.
  393 +
  394 +A Reviewed-by tag is a statement of opinion that the patch is an
  395 +appropriate modification of the kernel without any remaining serious
  396 +technical issues. Any interested reviewer (who has done the work) can
  397 +offer a Reviewed-by tag for a patch. This tag serves to give credit to
  398 +reviewers and to inform maintainers of the degree of review which has been
  399 +done on the patch. Reviewed-by: tags, when supplied by reviewers known to
  400 +understand the subject area and to perform thorough reviews, will normally
  401 +increase the likelihood of your patch getting into the kernel.
  402 +
  403 +
  404 +15) The canonical patch format
357 405  
358 406 The canonical patch subject line is:
359 407  
Documentation/filesystems/00-INDEX
... ... @@ -66,6 +66,8 @@
66 66 - info on the Linux implementation of Sys V mandatory file locking.
67 67 ncpfs.txt
68 68 - info on Novell Netware(tm) filesystem using NCP protocol.
  69 +nfsroot.txt
  70 + - short guide on setting up a diskless box with NFS root filesystem.
69 71 ntfs.txt
70 72 - info and mount options for the NTFS filesystem (Windows NT).
71 73 ocfs2.txt
... ... @@ -82,6 +84,10 @@
82 84 - info on relay, for efficient streaming from kernel to user space.
83 85 romfs.txt
84 86 - description of the ROMFS filesystem.
  87 +rpc-cache.txt
  88 + - introduction to the caching mechanisms in the sunrpc layer.
  89 +seq_file.txt
  90 + - how to use the seq_file API
85 91 sharedsubtree.txt
86 92 - a description of shared subtrees for namespaces.
87 93 smbfs.txt
Documentation/filesystems/nfsroot.txt
  1 +Mounting the root filesystem via NFS (nfsroot)
  2 +===============================================
  3 +
  4 +Written 1996 by Gero Kuhlmann <gero@gkminix.han.de>
  5 +Updated 1997 by Martin Mares <mj@atrey.karlin.mff.cuni.cz>
  6 +Updated 2006 by Nico Schottelius <nico-kernel-nfsroot@schottelius.org>
  7 +Updated 2006 by Horms <horms@verge.net.au>
  8 +
  9 +
  10 +
  11 +In order to use a diskless system, such as an X-terminal or printer server,
  12 +it is necessary for the root filesystem to be present on a
  13 +non-disk device. This may be an initramfs (see Documentation/filesystems/
  14 +ramfs-rootfs-initramfs.txt), a ramdisk (see Documentation/initrd.txt) or a
  15 +filesystem mounted via NFS. The following text describes how to use NFS
  16 +for the root filesystem. For the rest of this text 'client' means the
  17 +diskless system, and 'server' means the NFS server.
  18 +
  19 +
  20 +
  21 +
  22 +1.) Enabling nfsroot capabilities
  23 + -----------------------------
  24 +
  25 +In order to use nfsroot, NFS client support needs to be selected as
  26 +built-in during configuration. Once this has been selected, the nfsroot
  27 +option will become available, which should also be selected.
  28 +
  29 +In the networking options, kernel level autoconfiguration can be selected,
  30 +along with the types of autoconfiguration to support. Selecting all of
  31 +DHCP, BOOTP and RARP is safe.
  32 +
  33 +
  34 +
  35 +
  36 +2.) Kernel command line
  37 + -------------------
  38 +
  39 +When the kernel has been loaded by a boot loader (see below) it needs to be
  40 +told what root fs device to use and, in the case of nfsroot, where to find
  41 +both the server and the name of the directory on the server to mount as root.
  42 +This can be established using the following kernel command line parameters:
  43 +
  44 +
  45 +root=/dev/nfs
  46 +
  47 + This is necessary to enable the pseudo-NFS-device. Note that it's not a
  48 + real device but just a synonym to tell the kernel to use NFS instead of
  49 + a real device.
  50 +
  51 +
  52 +nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]
  53 +
  54 + If the `nfsroot' parameter is NOT given on the command line,
  55 + the default "/tftpboot/%s" will be used.
  56 +
  57 + <server-ip> Specifies the IP address of the NFS server.
  58 + The default address is determined by the `ip' parameter
  59 + (see below). This parameter allows the use of different
  60 + servers for IP autoconfiguration and NFS.
  61 +
  62 + <root-dir> Name of the directory on the server to mount as root.
  63 + If there is a "%s" token in the string, it will be
  64 + replaced by the ASCII-representation of the client's
  65 + IP address.
  66 +
  67 + <nfs-options> Standard NFS options. All options are separated by commas.
  68 + The following defaults are used:
  69 + port = as given by server portmap daemon
  70 + rsize = 4096
  71 + wsize = 4096
  72 + timeo = 7
  73 + retrans = 3
  74 + acregmin = 3
  75 + acregmax = 60
  76 + acdirmin = 30
  77 + acdirmax = 60
  78 + flags = hard, nointr, noposix, cto, ac
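
The "%s" substitution described for <root-dir> can be illustrated with a
small user-space sketch. This is not the kernel's code: the helper name
expand_root_dir() and the sample address in the usage note are assumptions
made purely for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel function): expand the first "%s" token
 * in an nfsroot <root-dir> template with the client's IP address in ASCII
 * form.  Returns 0 on success, -1 if the result does not fit in 'out'.
 */
static int expand_root_dir(const char *tmpl, const char *client_ip,
			   char *out, size_t outlen)
{
	const char *tok = strstr(tmpl, "%s");
	int n;

	if (!tok)	/* no token: the path is used as given */
		n = snprintf(out, outlen, "%s", tmpl);
	else		/* text before the token, the address, text after it */
		n = snprintf(out, outlen, "%.*s%s%s",
			     (int)(tok - tmpl), tmpl, client_ip, tok + 2);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With the default template "/tftpboot/%s" and an assumed client address of
192.168.1.10, the sketch produces "/tftpboot/192.168.1.10".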
  79 +
  80 +
  81 +ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>
  82 +
  83 + This parameter tells the kernel how to configure IP addresses of devices
  84 + and also how to set up the IP routing table. It was originally called
  85 + `nfsaddrs', but now the boot-time IP configuration works independently of
  86 + NFS, so it was renamed to `ip' and the old name remained as an alias for
  87 + compatibility reasons.
  88 +
  89 + If this parameter is missing from the kernel command line, all fields are
  90 + assumed to be empty, and the defaults mentioned below apply. In general
  91 + this means that the kernel tries to configure everything using
  92 + autoconfiguration.
  93 +
  94 + The <autoconf> parameter can appear alone as the value to the `ip'
  95 + parameter (without all the ':' characters before). If the value is
  96 + "ip=off" or "ip=none", no autoconfiguration will take place, otherwise
  97 + autoconfiguration will take place. The most common way to use this
  98 + is "ip=dhcp".
  99 +
  100 + <client-ip> IP address of the client.
  101 +
  102 + Default: Determined using autoconfiguration.
  103 +
  104 + <server-ip> IP address of the NFS server. If RARP is used to determine
  105 + the client address and this parameter is NOT empty only
  106 + replies from the specified server are accepted.
  107 +
  108 + Only required for NFS root. That is, autoconfiguration
  109 + will not be triggered if it is missing and NFS root is not
  110 + in operation.
  111 +
  112 + Default: Determined using autoconfiguration.
  113 + The address of the autoconfiguration server is used.
  114 +
  115 + <gw-ip> IP address of a gateway if the server is on a different subnet.
  116 +
  117 + Default: Determined using autoconfiguration.
  118 +
  119 + <netmask> Netmask for local network interface. If unspecified
  120 + the netmask is derived from the client IP address assuming
  121 + classful addressing.
  122 +
  123 + Default: Determined using autoconfiguration.
  124 +
  125 + <hostname> Name of the client. May be supplied by autoconfiguration,
  126 + but its absence will not trigger autoconfiguration.
  127 +
  128 + Default: Client IP address is used in ASCII notation.
  129 +
  130 + <device> Name of network device to use.
  131 +
  132 + Default: If the host only has one device, it is used.
  133 + Otherwise the device is determined using
  134 + autoconfiguration. This is done by sending
  135 + autoconfiguration requests out of all devices,
  136 + and using the device that received the first reply.
  137 +
  138 + <autoconf> Method to use for autoconfiguration. In the case of options
  139 + which specify multiple autoconfiguration protocols,
  140 + requests are sent using all protocols, and the first one
  141 + to reply is used.
  142 +
  143 + Only autoconfiguration protocols that have been compiled
  144 + into the kernel will be used, regardless of the value of
  145 + this option.
  146 +
  147 + off or none: don't use autoconfiguration
  148 + (do static IP assignment instead)
  149 + on or any: use any protocol available in the kernel
  150 + (default)
  151 + dhcp: use DHCP
  152 + bootp: use BOOTP
  153 + rarp: use RARP
  154 + both: use both BOOTP and RARP but not DHCP
  155 + (old option kept for backwards compatibility)
  156 +
  157 + Default: any
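
The field layout of the `ip' parameter can be sketched in user space as
follows. This is only an illustration of the syntax described above, not the
kernel's parser; split_ip_param(), is_autoconf_keyword() and the F_* names
are hypothetical.

```c
#include <string.h>

#define IP_NFIELDS 7

/* Field order as given in the ip= syntax line. */
enum { F_CLIENT, F_SERVER, F_GW, F_NETMASK, F_HOSTNAME, F_DEVICE, F_AUTOCONF };

/* Return 1 if 's' is one of the documented <autoconf> keywords. */
static int is_autoconf_keyword(const char *s)
{
	static const char *const kw[] = {
		"off", "none", "on", "any", "dhcp", "bootp", "rarp", "both"
	};
	size_t i;

	for (i = 0; i < sizeof(kw) / sizeof(kw[0]); i++)
		if (strcmp(s, kw[i]) == 0)
			return 1;
	return 0;
}

/*
 * Split the value of an ip= parameter into its seven fields.  'val' is
 * modified in place; absent fields are left as empty strings.  A lone
 * <autoconf> keyword (e.g. "dhcp") fills only the autoconf field.
 */
static void split_ip_param(char *val, const char *field[IP_NFIELDS])
{
	int i;

	for (i = 0; i < IP_NFIELDS; i++)
		field[i] = "";

	if (is_autoconf_keyword(val)) {
		field[F_AUTOCONF] = val;
		return;
	}

	for (i = 0; i < IP_NFIELDS && val; i++) {
		char *colon = strchr(val, ':');

		if (colon)
			*colon++ = '\0';	/* terminate this field */
		field[i] = val;
		val = colon;			/* NULL once no ':' remains */
	}
}
```

Empty fields, as in ":::::eth0:dhcp", stay empty so that the defaults listed
above would apply to them.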
  158 +
  159 +
  160 +
  161 +
  162 +3.) Boot Loader
  163 + ----------
  164 +
  165 +To get the kernel into memory different approaches can be used.
  166 +They depend on various facilities being available:
  167 +
  168 +
  169 +3.1) Booting from a floppy using syslinux
  170 +
  171 + When building kernels, an easy way to create a boot floppy that uses
  172 + syslinux is to use the zdisk or bzdisk make targets which use zimage
  173 + and bzimage images respectively. Both targets accept the
  174 + FDARGS parameter which can be used to set the kernel command line.
  175 +
  176 + e.g.
  177 + make bzdisk FDARGS="root=/dev/nfs"
  178 +
  179 + Note that the user running this command will need to have
  180 + access to the floppy drive device, /dev/fd0
  181 +
  182 + For more information on syslinux, including how to create bootdisks
  183 + for prebuilt kernels, see http://syslinux.zytor.com/
  184 +
  185 + N.B: Previously it was possible to write a kernel directly to
  186 + a floppy using dd, configure the boot device using rdev, and
  187 + boot using the resulting floppy. Linux no longer supports this
  188 + method of booting.
  189 +
  190 +3.2) Booting from a cdrom using isolinux
  191 +
  192 + When building kernels, an easy way to create a bootable cdrom that
  193 + uses isolinux is to use the isoimage target which uses a bzimage
  194 + image. Like zdisk and bzdisk, this target accepts the FDARGS
  195 + parameter which can be used to set the kernel command line.
  196 +
  197 + e.g.
  198 + make isoimage FDARGS="root=/dev/nfs"
  199 +
  200 + The resulting iso image will be arch/<ARCH>/boot/image.iso
  201 + This can be written to a cdrom using a variety of tools including
  202 + cdrecord.
  203 +
  204 + e.g.
  205 + cdrecord dev=ATAPI:1,0,0 arch/i386/boot/image.iso
  206 +
  207 + For more information on isolinux, including how to create bootdisks
  208 + for prebuilt kernels, see http://syslinux.zytor.com/
  209 +
  210 +3.3) Using LILO
  211 + When using LILO all the necessary command line parameters may be
  212 + specified using the 'append=' directive in the LILO configuration
  213 + file.
  214 +
  215 + However, to use the 'root=' directive you also need to create
  216 + a dummy root device, which may be removed after LILO is run.
  217 +
  218 + mknod /dev/boot255 c 0 255
  219 +
  220 + For information on configuring LILO, please refer to its documentation.
  221 +
  222 +3.4) Using GRUB
  223 + When using GRUB, kernel parameters are simply appended after the kernel
  224 + specification: kernel <kernel> <parameters>
  225 +
  226 +3.5) Using loadlin
  227 + loadlin may be used to boot Linux from a DOS command prompt without
  228 + requiring a local hard disk to mount as root. This has not been
  229 + thoroughly tested by the authors of this document, but in general
  230 + it should be possible to configure the kernel command line similarly
  231 + to the configuration of LILO.
  232 +
  233 + Please refer to the loadlin documentation for further information.
  234 +
  235 +3.6) Using a boot ROM
  236 + This is probably the most elegant way of booting a diskless client.
  237 + With a boot ROM the kernel is loaded using the TFTP protocol. The
  238 + authors of this document are not aware of any commercial boot
  239 + ROMs that support booting Linux over the network. However, there
  240 + are two free implementations of a boot ROM, netboot-nfs and
  241 + etherboot, both of which are available on sunsite.unc.edu, and both
  242 + of which contain everything you need to boot a diskless Linux client.
  243 +
  244 +3.7) Using pxelinux
  245 + Pxelinux may be used to boot linux using the PXE boot loader
  246 + which is present on many modern network cards.
  247 +
  248 + When using pxelinux, the kernel image is specified using
  249 + "kernel <relative-path-below /tftpboot>". The nfsroot parameters
  250 + are passed to the kernel by adding them to the "append" line.
  251 + It is common to use serial console in conjunction with pxelinux,
  252 + see Documentation/serial-console.txt for more information.
  253 +
  254 + For more information on pxelinux, including how to create bootdisks
  255 + for prebuilt kernels, see http://syslinux.zytor.com/
  256 +
  257 +
  258 +
  259 +
  260 +4.) Credits
  261 + -------
  262 +
  263 + The nfsroot code in the kernel and the RARP support have been written
  264 + by Gero Kuhlmann <gero@gkminix.han.de>.
  265 +
  266 + The rest of the IP layer autoconfiguration code has been written
  267 + by Martin Mares <mj@atrey.karlin.mff.cuni.cz>.
  268 +
  269 + In order to write the initial version of nfsroot I would like to thank
  270 + Jens-Uwe Mager <jum@anubis.han.de> for his help.
Documentation/filesystems/rpc-cache.txt
  1 + This document gives a brief introduction to the caching
  2 +mechanisms in the sunrpc layer that is used, in particular,
  3 +for NFS authentication.
  4 +
  5 +CACHES
  6 +======
  7 +The caching replaces the old exports table and allows for
  8 +a wide variety of values to be cached.
  9 +
  10 +There are a number of caches that are similar in structure though
  11 +quite possibly very different in content and use. There is a corpus
  12 +of common code for managing these caches.
  13 +
  14 +Examples of caches that are likely to be needed are:
  15 + - mapping from IP address to client name
  16 + - mapping from client name and filesystem to export options
  17 + - mapping from UID to list of GIDs, to work around NFS's limitation
  18 + of 16 gids.
  19 + - mappings between local UID/GID and remote UID/GID for sites that
  20 + do not have uniform uid assignment
  21 + - mapping from network identity to public key for crypto authentication.
  22 +
  23 +The common code handles such things as:
  24 + - general cache lookup with correct locking
  25 + - supporting 'NEGATIVE' as well as positive entries
  26 + - allowing an EXPIRED time on cache items, and removing
  27 + items after they expire, and are no longer in-use.
  28 + - making requests to user-space to fill in cache entries
  29 + - allowing user-space to directly set entries in the cache
  30 + - delaying RPC requests that depend on as-yet incomplete
  31 + cache entries, and replaying those requests when the cache entry
  32 + is complete.
  33 + - clean out old entries as they expire.
  34 +
  35 +Creating a Cache
  36 +----------------
  37 +
  38 +1/ A cache needs a datum to store. This is in the form of a
  39 + structure definition that must contain a
  40 + struct cache_head
  41 + as an element, usually the first.
  42 + It will also contain a key and some content.
  43 + Each cache element is reference counted and contains
  44 + expiry and update times for use in cache management.
  45 +2/ A cache needs a "cache_detail" structure that
  46 + describes the cache. This stores the hash table, some
  47 + parameters for cache management, and some operations detailing how
  48 + to work with particular cache items.
  49 + The operations required are:
  50 + struct cache_head *alloc(void)
  51 + This simply allocates appropriate memory and returns
  52 + a pointer to the cache_head embedded within the
  53 + structure
  54 + void cache_put(struct kref *)
  55 + This is called when the last reference to an item is
  56 + dropped. The pointer passed is to the 'ref' field
  57 + in the cache_head. cache_put should release any
  58 + references created by 'cache_init' and, if CACHE_VALID
  59 + is set, any references created by cache_update.
  60 + It should then release the memory allocated by
  61 + 'alloc'.
  62 + int match(struct cache_head *orig, struct cache_head *new)
  63 + test if the keys in the two structures match. Return
  64 + 1 if they do, 0 if they don't.
  65 + void init(struct cache_head *orig, struct cache_head *new)
  66 + Set the 'key' fields in 'new' from 'orig'. This may
  67 + include taking references to shared objects.
  68 + void update(struct cache_head *orig, struct cache_head *new)
  69 + Set the 'content' fields in 'new' from 'orig'.
  70 + int cache_show(struct seq_file *m, struct cache_detail *cd,
  71 + struct cache_head *h)
  72 + Optional. Used to provide a /proc file that lists the
  73 + contents of a cache. This should show one item,
  74 + usually on just one line.
  75 + int cache_request(struct cache_detail *cd, struct cache_head *h,
  76 + char **bpp, int *blen)
  77 + Format a request to be sent to user-space for an item
  78 + to be instantiated. *bpp is a buffer of size *blen.
  79 + *bpp should be moved forward over the encoded message,
  80 + and *blen should be reduced to show how much free
  81 + space remains. Return 0 on success or <0 if not
  82 + enough room or other problem.
  83 + int cache_parse(struct cache_detail *cd, char *buf, int len)
  84 + A message from user space has arrived to fill out a
  85 + cache entry. It is in 'buf' of length 'len'.
  86 + cache_parse should parse this, find the item in the
  87 + cache with sunrpc_cache_lookup, and update the item
  88 + with sunrpc_cache_update.
  89 +
  90 +
  91 +3/ A cache needs to be registered using cache_register(). This
  92 + includes it on a list of caches that will be regularly
  93 + cleaned to discard old data.
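
The relationship between a datum, its embedded cache_head, and the match,
init and update operations can be sketched in user space as follows. The
struct cache_head below is a reduced mock rather than the kernel's
definition, and struct name_ent with its name_* functions is a hypothetical
cache invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Mock of the kernel type, reduced to what this sketch needs. */
struct cache_head {
	long expiry_time;
	int flags;
};

/* Hypothetical datum: maps a client name (key) to an option word (content). */
struct name_ent {
	struct cache_head h;	/* embedded head, placed first */
	char name[32];		/* key */
	int options;		/* content */
};

/* Recover the enclosing datum from a pointer to its embedded head. */
#define to_name_ent(p) \
	((struct name_ent *)((char *)(p) - offsetof(struct name_ent, h)))

/* match: return 1 if the keys of the two items match, 0 if they don't. */
static int name_match(struct cache_head *orig, struct cache_head *new)
{
	return strcmp(to_name_ent(orig)->name, to_name_ent(new)->name) == 0;
}

/* init: set the key fields in 'new' from 'orig'. */
static void name_init(struct cache_head *orig, struct cache_head *new)
{
	memcpy(to_name_ent(new)->name, to_name_ent(orig)->name,
	       sizeof(to_name_ent(new)->name));
}

/* update: set the content fields in 'new' from 'orig'. */
static void name_update(struct cache_head *orig, struct cache_head *new)
{
	to_name_ent(new)->options = to_name_ent(orig)->options;
}
```

Keeping key handling (init, match) separate from content handling (update)
mirrors the split described above: an entry can exist with only its key
filled in while the content is still being fetched.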
  94 +
  95 +Using a cache
  96 +-------------
  97 +
  98 +To find a value in a cache, call sunrpc_cache_lookup passing a pointer
  99 +to the cache_head in a sample item with the 'key' fields filled in.
  100 +This will be passed to ->match to identify the target entry. If no
  101 +entry is found, a new entry will be created, added to the cache, and
  102 +marked as not containing valid data.
  103 +
  104 +The item returned is typically passed to cache_check which will check
  105 +if the data is valid, and may initiate an up-call to get fresh data.
  106 +cache_check will return -ENOENT if the entry is negative or if an upcall
  107 +is needed but not possible, -EAGAIN if an upcall is pending,
  108 +or 0 if the data is valid.
  109 +
  110 +cache_check can be passed a "struct cache_req *". This structure is
  111 +typically embedded in the actual request and can be used to create a
  112 +deferred copy of the request (struct cache_deferred_req). This is
  113 +done when the found cache item is not uptodate, but there is reason to
  114 +believe that userspace might provide information soon. When the cache
  115 +item does become valid, the deferred copy of the request will be
  116 +revisited (->revisit). It is expected that this method will
  117 +reschedule the request for processing.
  118 +
  119 +The value returned by sunrpc_cache_lookup can also be passed to
  120 +sunrpc_cache_update to set the content for the item. A second item is
  121 +passed which should hold the content. If the item found by _lookup
  122 +has valid data, then it is discarded and a new item is created. This
  123 +saves any user of an item from worrying about content changing while
  124 +it is being inspected. If the item found by _lookup does not contain
  125 +valid data, then the content is copied across and CACHE_VALID is set.
  126 +
  127 +Populating a cache
  128 +------------------
  129 +
  130 +Each cache has a name, and when the cache is registered, a directory
  131 +with that name is created in /proc/net/rpc
  132 +
  133 +This directory contains a file called 'channel' which is a channel
  134 +for communicating between kernel and user for populating the cache.
  135 +This directory may later contain other files for interacting
  136 +with the cache.
  137 +
  138 +The 'channel' works a bit like a datagram socket. Each 'write' is
  139 +passed as a whole to the cache for parsing and interpretation.
  140 +Each cache can treat the write requests differently, but it is
  141 +expected that a message written will contain:
  142 + - a key
  143 + - an expiry time
  144 + - a content.
  145 +with the intention that an item in the cache with the given key
  146 +should be created or updated to have the given content, and the
  147 +expiry time should be set on that item.
  148 +
  149 +Reading from a channel is a bit more interesting. When a cache
  150 +lookup fails, or when it succeeds but finds an entry that may soon
  151 +expire, a request is lodged for that cache item to be updated by
  152 +user-space. These requests appear in the channel file.
  153 +
  154 +Successive reads will return successive requests.
  155 +If there are no more requests to return, read will return EOF, but a
  156 +select or poll for read will block waiting for another request to be
  157 +added.
  158 +
  159 +Thus a user-space helper is likely to:
  160 + open the channel.
  161 + select for readable
  162 + read a request
  163 + write a response
  164 + loop.
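
The "select for readable, read a request" steps of such a helper might look
like the following user-space sketch. It uses only poll() and read(); the
function name wait_request() is hypothetical, and the behaviour of a real
channel file is assumed to be as described in this document.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper step: wait up to 'timeout_ms' for the descriptor to
 * become readable, then read one request into 'buf'.
 * Returns the number of bytes read, 0 if nothing arrived in time (or the
 * read hit EOF), or -1 on error.
 */
static ssize_t wait_request(int fd, int timeout_ms, char *buf, size_t len)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ready = poll(&pfd, 1, timeout_ms);

	if (ready < 0)
		return -1;
	if (ready == 0)
		return 0;	/* timed out, no request pending */
	return read(fd, buf, len);
}
```

A helper would open the 'channel' file of the cache it serves, call a
function like this in a loop, and answer each request with a single write(),
since each write is passed to the cache as a whole.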
  165 +
  166 +If it dies and needs to be restarted, any requests that have not been
  167 +answered will still appear in the file and will be read by the new
  168 +instance of the helper.
  169 +
  170 +Each cache should define a "cache_parse" method which takes a message
  171 +written from user-space and processes it. It should return an error
  172 +(which propagates back to the write syscall) or 0.
  173 +
  174 +Each cache should also define a "cache_request" method which
  175 +takes a cache item and encodes a request into the buffer
  176 +provided.
  177 +
  178 +Note: If a cache has no active readers on the channel, and has had no
  179 +active readers for more than 60 seconds, further requests will not be
  180 +added to the channel but instead all lookups that do not find a valid
  181 +entry will fail. This is partly for backward compatibility: The
  182 +previous nfs exports table was deemed to be authoritative and a
  183 +failed lookup meant a definite 'no'.
  184 +
  185 +request/response format
  186 +-----------------------
  187 +
  188 +While each cache is free to use its own format for requests
  189 +and responses over channel, the following is recommended as
  190 +appropriate and support routines are available to help:
  191 +Each request or response record should be printable ASCII
  192 +with precisely one newline character which should be at the end.
  193 +Fields within the record should be separated by spaces, normally one.
  194 +If spaces, newlines, or nul characters are needed in a field they
  195 +must be quoted. Two mechanisms are available:
  196 +1/ If a field begins '\x' then it must contain an even number of
  197 + hex digits, and pairs of these digits provide the bytes in the
  198 + field.
  199 +2/ otherwise a \ in the field must be followed by 3 octal digits
  200 + which give the code for a byte. Other characters are treated
  201 + as themselves. At the very least, space, newline, nul, and
  202 + '\' must be quoted in this way.
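
The second (octal) quoting mechanism can be sketched in user space as
follows. This is an illustration only, not one of the kernel's support
routines: quote_field() is a hypothetical name, and escaping every
non-printable byte (not just the four that must be quoted) is a choice made
here to keep the record printable ASCII.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical encoder: copy 'len' bytes from 'in' to 'out', replacing
 * space, newline, nul, '\' and any other non-printable byte with a
 * backslash followed by 3 octal digits.  The result is NUL-terminated.
 * Returns the encoded length, or -1 if 'out' is too small.
 */
static int quote_field(const unsigned char *in, size_t len,
		       char *out, size_t outlen)
{
	size_t i, o = 0;

	if (outlen == 0)
		return -1;

	for (i = 0; i < len; i++) {
		unsigned char c = in[i];
		int plain = c > ' ' && c < 0x7f && c != '\\';
		size_t need = plain ? 1 : 4;

		if (o + need >= outlen)	/* keep room for the final NUL */
			return -1;
		if (plain)
			out[o++] = (char)c;
		else
			o += (size_t)sprintf(out + o, "\\%03o", (unsigned int)c);
	}
	out[o] = '\0';
	return (int)o;
}
```

Because a leading '\' is itself written as \134, a field encoded this way
never begins with '\x' and so cannot be mistaken for the hex form.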
Documentation/filesystems/seq_file.txt
  1 +The seq_file interface
  2 +
  3 + Copyright 2003 Jonathan Corbet <corbet@lwn.net>
  4 + This file is originally from the LWN.net Driver Porting series at
  5 + http://lwn.net/Articles/driver-porting/
  6 +
  7 +
  8 +There are numerous ways for a device driver (or other kernel component) to
  9 +provide information to the user or system administrator. One useful
  10 +technique is the creation of virtual files, in debugfs, /proc or elsewhere.
  11 +Virtual files can provide human-readable output that is easy to get at
  12 +without any special utility programs; they can also make life easier for
  13 +script writers. It is not surprising that the use of virtual files has
  14 +grown over the years.
  15 +
  16 +Creating those files correctly has always been a bit of a challenge,
  17 +however. It is not that hard to make a virtual file which returns a
  18 +string. But life gets trickier if the output is long - anything greater
  19 +than an application is likely to read in a single operation. Handling
  20 +multiple reads (and seeks) requires careful attention to the reader's
  21 +position within the virtual file - that position is, likely as not, in the
  22 +middle of a line of output. The kernel has traditionally had a number of
  23 +implementations that got this wrong.
  24 +
  25 +The 2.6 kernel contains a set of functions (implemented by Alexander Viro)
  26 +which are designed to make it easy for virtual file creators to get it
  27 +right.
  28 +
  29 +The seq_file interface is available via <linux/seq_file.h>. There are
  30 +three aspects to seq_file:
  31 +
  32 + * An iterator interface which lets a virtual file implementation
  33 + step through the objects it is presenting.
  34 +
  35 + * Some utility functions for formatting objects for output without
  36 + needing to worry about things like output buffers.
  37 +
  38 + * A set of canned file_operations which implement most operations on
  39 + the virtual file.
  40 +
  41 +We'll look at the seq_file interface via an extremely simple example: a
  42 +loadable module which creates a file called /proc/sequence. The file, when
  43 +read, simply produces a set of increasing integer values, one per line. The
  44 +sequence will continue until the user loses patience and finds something
  45 +better to do. The file is seekable, in that one can do something like the
  46 +following:
  47 +
  48 + dd if=/proc/sequence of=out1 count=1
  49 + dd if=/proc/sequence skip=1 of=out2 count=1
  50 +
  51 +Then concatenate the output files out1 and out2 and get the right
  52 +result. Yes, it is a thoroughly useless module, but the point is to show
  53 +how the mechanism works without getting lost in other details. (Those
  54 +wanting to see the full source for this module can find it at
  55 +http://lwn.net/Articles/22359/).
  56 +
  57 +
  58 +The iterator interface
  59 +
  60 +Modules implementing a virtual file with seq_file must implement a simple
  61 +iterator object that allows stepping through the data of interest.
  62 +Iterators must be able to move to a specific position - like the file they
  63 +implement - but the interpretation of that position is up to the iterator
  64 +itself. A seq_file implementation that is formatting firewall rules, for
  65 +example, could interpret position N as the Nth rule in the chain.
  66 +Positioning can thus be done in whatever way makes the most sense for the
  67 +generator of the data, which need not be aware of how a position translates
  68 +to an offset in the virtual file. The one obvious exception is that a
  69 +position of zero should indicate the beginning of the file.
  70 +
  71 +The /proc/sequence iterator just uses the count of the next number it
  72 +will output as its position.
  73 +
  74 +Four functions must be implemented to make the iterator work. The first,
  75 +called start(), takes a position as an argument and returns an iterator
  76 +which will start reading at that position. For our simple sequence example,
  77 +the start() function looks like:
  78 +
  79 + static void *ct_seq_start(struct seq_file *s, loff_t *pos)
  80 + {
  81 + loff_t *spos = kmalloc(sizeof(loff_t), GFP_KERNEL);
  82 + if (! spos)
  83 + return NULL;
  84 + *spos = *pos;
  85 + return spos;
  86 + }
  87 +
  88 +The entire data structure for this iterator is a single loff_t value
  89 +holding the current position. There is no upper bound for the sequence
  90 +iterator, but that will not be the case for most other seq_file
  91 +implementations; in most cases the start() function should check for a
  92 +"past end of file" condition and return NULL if need be.
  93 +
  94 +For more complicated applications, the private field of the seq_file
  95 +structure can be used. There is also a special value which can be returned
  96 +by the start() function called SEQ_START_TOKEN; it can be used if you wish
  97 +to instruct your show() function (described below) to print a header at the
  98 +top of the output. SEQ_START_TOKEN should only be used if the offset is
  99 +zero, however.
  100 +
  101 +The next function to implement is called, amazingly, next(); its job is to
  102 +move the iterator forward to the next position in the sequence. The
  103 +example module can simply increment the position by one; more useful
  104 +modules will do what is needed to step through some data structure. The
  105 +next() function returns a new iterator, or NULL if the sequence is
  106 +complete. Here's the example version:
  107 +
  108 + static void *ct_seq_next(struct seq_file *s, void *v, loff_t *pos)
  109 + {
  110 + loff_t *spos = v;
  111 + *pos = ++*spos;
  112 + return spos;
  113 + }
  114 +
  115 +The stop() function is called when iteration is complete; its job, of
  116 +course, is to clean up. If dynamic memory is allocated for the iterator,
  117 +stop() is the place to free it.
  118 +
  119 + static void ct_seq_stop(struct seq_file *s, void *v)
  120 + {
  121 + kfree(v);
  122 + }
  123 +
  124 +Finally, the show() function should format the object currently pointed to
  125 +by the iterator for output. It should return zero, or an error code if
  126 +something goes wrong. The example module's show() function is:
  127 +
  128 + static int ct_seq_show(struct seq_file *s, void *v)
  129 + {
  130 + loff_t *spos = v;
  131 + seq_printf(s, "%lld\n", (long long)*spos);
  132 + return 0;
  133 + }
  134 +
  135 +We will look at seq_printf() in a moment. But first, the definition of the
  136 +seq_file iterator is finished by creating a seq_operations structure with
  137 +the four functions we have just defined:
  138 +
  139 + static const struct seq_operations ct_seq_ops = {
  140 + .start = ct_seq_start,
  141 + .next = ct_seq_next,
  142 + .stop = ct_seq_stop,
  143 + .show = ct_seq_show
  144 + };
  145 +
  146 +This structure will be needed to tie our iterator to the /proc file in
  147 +a little bit.
  148 +
  149 +It's worth noting that the iterator value returned by start() and
  150 +manipulated by the other functions is considered to be completely opaque by
  151 +the seq_file code. It can thus be anything that is useful in stepping
  152 +through the data to be output. Counters can be useful, but it could also be
  153 +a direct pointer into an array or linked list. Anything goes, as long as
  154 +the programmer is aware that things can happen between calls to the
  155 +iterator function. However, the seq_file code (by design) will not sleep
  156 +between the calls to start() and stop(), so holding a lock during that time
  157 +is a reasonable thing to do. The seq_file code will also avoid taking any
  158 +other locks while the iterator is active.
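
The calling sequence described above can be sketched in userspace. This is
only an illustrative mock under stated assumptions, not the kernel's
seq_file code: pos_t stands in for loff_t, struct mock_seq for struct
seq_file, and mock_read() is a hypothetical driver invented here to show
the order in which start(), show(), next() and stop() are expected to run.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef long long pos_t;		/* stand-in for the kernel's loff_t */
struct mock_seq { FILE *out; };		/* stand-in for struct seq_file */

/* Allocate an iterator that starts at *pos, as ct_seq_start() does. */
static void *ct_start(struct mock_seq *s, pos_t *pos)
{
	pos_t *spos = malloc(sizeof(*spos));

	(void)s;
	if (!spos)
		return NULL;
	*spos = *pos;
	return spos;
}

/* Advance the iterator by one and report the new position. */
static void *ct_next(struct mock_seq *s, void *v, pos_t *pos)
{
	pos_t *spos = v;

	(void)s;
	*pos = ++*spos;
	return spos;
}

/* Release whatever start() allocated; free(NULL) is harmless. */
static void ct_stop(struct mock_seq *s, void *v)
{
	(void)s;
	free(v);
}

/* Format the current item; zero means success. */
static int ct_show(struct mock_seq *s, void *v)
{
	pos_t *spos = v;

	fprintf(s->out, "%lld\n", *spos);
	return 0;
}

/*
 * Hypothetical driver: emit at most max_items records beginning at *pos.
 * Returns the number of records shown and leaves *pos at the next record.
 */
static int mock_read(struct mock_seq *s, pos_t *pos, int max_items)
{
	int shown = 0;
	void *v;

	assert(max_items >= 0);
	v = ct_start(s, pos);
	while (v && shown < max_items) {
		if (ct_show(s, v))
			break;
		shown++;
		v = ct_next(s, v, pos);
	}
	ct_stop(s, v);
	return shown;
}
```

Calling mock_read() twice with the same pos variable resumes where the
first call left off, which is the property the position argument exists to
provide.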
  159 +
  160 +
  161 +Formatted output
  162 +
  163 +The seq_file code manages positioning within the output created by the
  164 +iterator and getting it into the user's buffer. But, for that to work, that
  165 +output must be passed to the seq_file code. Some utility functions have
  166 +been defined which make this task easy.
  167 +
  168 +Most code will simply use seq_printf(), which works pretty much like
  169 +printk(), but which requires the seq_file pointer as an argument. It is
  170 +common to ignore the return value from seq_printf(), but a function
  171 +producing complicated output may want to check that value and quit if
  172 +something non-zero is returned; an error return means that the seq_file
  173 +buffer has been filled and further output will be discarded.
  174 +
  175 +For straight character output, the following functions may be used:
  176 +
  177 + int seq_putc(struct seq_file *m, char c);
  178 + int seq_puts(struct seq_file *m, const char *s);
  179 + int seq_escape(struct seq_file *m, const char *s, const char *esc);
  180 +
  181 +The first two output a single character and a string, just like one would
  182 +expect. seq_escape() is like seq_puts(), except that any character in s
  183 +which is in the string esc will be represented in octal form in the output.
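
To make the escaping rule concrete, here is a small userspace sketch. It
assumes that "octal form" means a backslash followed by three octal
digits; escape_octal() is a hypothetical helper written for illustration
and is not part of the seq_file interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy src into dst, replacing every byte that appears in esc with a
 * backslash and three octal digits. Returns the length written (not
 * counting the terminating NUL), or -1 if dst is too small.
 */
static int escape_octal(char *dst, size_t size, const char *src,
			const char *esc)
{
	size_t used = 0;

	if (size == 0)
		return -1;
	for (; *src; src++) {
		unsigned char c = (unsigned char)*src;

		if (strchr(esc, c)) {
			/* four output bytes plus room for the final NUL */
			if (used + 4 >= size)
				return -1;
			used += (size_t)snprintf(dst + used, size - used,
						 "\\%03o", (unsigned int)c);
		} else {
			if (used + 1 >= size)
				return -1;
			dst[used++] = (char)c;
		}
	}
	dst[used] = '\0';
	return (int)used;
}
```

With esc set to " \n", a space is emitted as \040 and a newline as \012,
so a consumer splitting records on whitespace never sees those bytes
inside a field.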
  184 +
  185 +There is also a function for printing filenames:
  186 +
  187 + int seq_path(struct seq_file *m, struct path *path, char *esc);
  188 +
  189 +Here, path indicates the file of interest, and esc is a set of characters
  190 +which should be escaped in the output.
  191 +
  192 +
  193 +Making it all work
  194 +
  195 +So far, we have a nice set of functions which can produce output within the
  196 +seq_file system, but we have not yet turned them into a file that a user
  197 +can see. Creating a file within the kernel requires, of course, the
  198 +creation of a set of file_operations which implement the operations on that
  199 +file. The seq_file interface provides a set of canned operations which do
  200 +most of the work. The virtual file author still must implement the open()
  201 +method, however, to hook everything up. The open function is often a single
  202 +line, as in the example module:
  203 +
  204 + static int ct_open(struct inode *inode, struct file *file)
  205 + {
  206 + return seq_open(file, &ct_seq_ops);
  207 + }
  208 +
  209 +Here, the call to seq_open() takes the seq_operations structure we created
  210 +before, and gets set up to iterate through the virtual file.
  211 +
  212 +On a successful open, seq_open() stores the struct seq_file pointer in
  213 +file->private_data. If you have an application where the same iterator can
  214 +be used for more than one file, you can store an arbitrary pointer in the
  215 +private field of the seq_file structure; that value can then be retrieved
  216 +by the iterator functions.
  217 +
  218 +The other operations of interest - read(), llseek(), and release() - are
  219 +all implemented by the seq_file code itself. So a virtual file's
  220 +file_operations structure will look like:
  221 +
  222 + static const struct file_operations ct_file_ops = {
  223 + .owner = THIS_MODULE,
  224 + .open = ct_open,
  225 + .read = seq_read,
  226 + .llseek = seq_lseek,
  227 + .release = seq_release
  228 + };
  229 +
  230 +There is also a seq_release_private() which passes the contents of the
  231 +seq_file private field to kfree() before releasing the structure.
  232 +
  233 +The final step is the creation of the /proc file itself. In the example
  234 +code, that is done in the initialization code in the usual way:
  235 +
  236 + static int ct_init(void)
  237 + {
  238 + struct proc_dir_entry *entry;
  239 +
  240 + entry = create_proc_entry("sequence", 0, NULL);
  241 + if (entry)
  242 + entry->proc_fops = &ct_file_ops;
  243 + return 0;
  244 + }
  245 +
  246 + module_init(ct_init);
  247 +
  248 +And that is pretty much it.
  249 +
  250 +
  251 +seq_list
  252 +
  253 +If your file will be iterating through a linked list, you may find these
  254 +routines useful:
  255 +
  256 + struct list_head *seq_list_start(struct list_head *head,
  257 + loff_t pos);
  258 + struct list_head *seq_list_start_head(struct list_head *head,
  259 + loff_t pos);
  260 + struct list_head *seq_list_next(void *v, struct list_head *head,
  261 + loff_t *ppos);
  262 +
  263 +These helpers will interpret pos as a position within the list and iterate
  264 +accordingly. Your start() and next() functions need only invoke the
  265 +seq_list_* helpers with a pointer to the appropriate list_head structure.
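
The behaviour described for these helpers can be modelled in userspace as
follows. This is a sketch under assumptions, not the kernel source: struct
node stands in for struct list_head, long long for loff_t, and the
sketch_list_* names are hypothetical. It assumes a circular list with a
sentinel head, where position 0 is the first real entry.

```c
#include <assert.h>
#include <stddef.h>

/* Circular doubly linked list with a sentinel head node. */
struct node {
	struct node *next, *prev;
};

static void node_init(struct node *head)
{
	head->next = head->prev = head;
}

static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Return the entry at position pos, or NULL if pos is past the end. */
static struct node *sketch_list_start(struct node *head, long long pos)
{
	struct node *n;

	assert(pos >= 0);
	for (n = head->next; n != head; n = n->next)
		if (pos-- == 0)
			return n;
	return NULL;
}

/* Step to the entry after v, bump *ppos, and return NULL at the end. */
static struct node *sketch_list_next(void *v, struct node *head,
				     long long *ppos)
{
	struct node *n = ((struct node *)v)->next;

	++*ppos;
	return n == head ? NULL : n;
}
```

A start() built on this model would simply return
sketch_list_start(head, *pos), and next() would return
sketch_list_next(v, head, pos).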
  266 +
  267 +
  268 +The extra-simple version
  269 +
  270 +For extremely simple virtual files, there is an even easier interface. A
  271 +module can define only the show() function, which should create all the
  272 +output that the virtual file will contain. The file's open() method then
  273 +calls:
  274 +
  275 + int single_open(struct file *file,
  276 + int (*show)(struct seq_file *m, void *p),
  277 + void *data);
  278 +
  279 +When output time comes, the show() function will be called once. The data
  280 +value given to single_open() can be found in the private field of the
  281 +seq_file structure. When using single_open(), the programmer should use
  282 +single_release() instead of seq_release() in the file_operations structure
  283 +to avoid a memory leak.
Documentation/hrtimers/highres.txt
... ... @@ -98,7 +98,7 @@
98 98 event devices are used to provide local CPU functionality such as process
99 99 accounting, profiling, and high resolution timers.
100 100  
101   -The management layer assignes one or more of the folliwing functions to a clock
  101 +The management layer assigns one or more of the following functions to a clock
102 102 event device:
103 103 - system global periodic tick (jiffies update)
104 104 - cpu local update_process_times
Documentation/kernel-parameters.txt
... ... @@ -844,7 +844,7 @@
844 844 arch/alpha/kernel/core_marvel.c.
845 845  
846 846 ip= [IP_PNP]
847   - See Documentation/nfsroot.txt.
  847 + See Documentation/filesystems/nfsroot.txt.
848 848  
849 849 ip2= [HW] Set IO/IRQ pairs for up to 4 IntelliPort boards
850 850 See comment before ip2_setup() in
851 851  
... ... @@ -1198,10 +1198,10 @@
1198 1198 file if at all.
1199 1199  
1200 1200 nfsaddrs= [NFS]
1201   - See Documentation/nfsroot.txt.
  1201 + See Documentation/filesystems/nfsroot.txt.
1202 1202  
1203 1203 nfsroot= [NFS] nfs root filesystem for disk-less boxes.
1204   - See Documentation/nfsroot.txt.
  1204 + See Documentation/filesystems/nfsroot.txt.
1205 1205  
1206 1206 nfs.callback_tcpport=
1207 1207 [NFS] set the TCP port on which the NFSv4 callback
Documentation/nfsroot.txt
1   -Mounting the root filesystem via NFS (nfsroot)
2   -===============================================
3   -
4   -Written 1996 by Gero Kuhlmann <gero@gkminix.han.de>
5   -Updated 1997 by Martin Mares <mj@atrey.karlin.mff.cuni.cz>
6   -Updated 2006 by Nico Schottelius <nico-kernel-nfsroot@schottelius.org>
7   -Updated 2006 by Horms <horms@verge.net.au>
8   -
9   -
10   -
11   -In order to use a diskless system, such as an X-terminal or printer server
12   -for example, it is necessary for the root filesystem to be present on a
13   -non-disk device. This may be an initramfs (see Documentation/filesystems/
14   -ramfs-rootfs-initramfs.txt), a ramdisk (see Documentation/initrd.txt) or a
15   -filesystem mounted via NFS. The following text describes on how to use NFS
16   -for the root filesystem. For the rest of this text 'client' means the
17   -diskless system, and 'server' means the NFS server.
18   -
19   -
20   -
21   -
22   -1.) Enabling nfsroot capabilities
23   - -----------------------------
24   -
25   -In order to use nfsroot, NFS client support needs to be selected as
26   -built-in during configuration. Once this has been selected, the nfsroot
27   -option will become available, which should also be selected.
28   -
29   -In the networking options, kernel level autoconfiguration can be selected,
30   -along with the types of autoconfiguration to support. Selecting all of
31   -DHCP, BOOTP and RARP is safe.
32   -
33   -
34   -
35   -
36   -2.) Kernel command line
37   - -------------------
38   -
39   -When the kernel has been loaded by a boot loader (see below) it needs to be
40   -told what root fs device to use. And in the case of nfsroot, where to find
41   -both the server and the name of the directory on the server to mount as root.
42   -This can be established using the following kernel command line parameters:
43   -
44   -
45   -root=/dev/nfs
46   -
47   - This is necessary to enable the pseudo-NFS-device. Note that it's not a
48   - real device but just a synonym to tell the kernel to use NFS instead of
49   - a real device.
50   -
51   -
52   -nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]
53   -
54   - If the `nfsroot' parameter is NOT given on the command line,
55   - the default "/tftpboot/%s" will be used.
56   -
57   - <server-ip> Specifies the IP address of the NFS server.
58   - The default address is determined by the `ip' parameter
59   - (see below). This parameter allows the use of different
60   - servers for IP autoconfiguration and NFS.
61   -
62   - <root-dir> Name of the directory on the server to mount as root.
63   - If there is a "%s" token in the string, it will be
64   - replaced by the ASCII-representation of the client's
65   - IP address.
66   -
67   - <nfs-options> Standard NFS options. All options are separated by commas.
68   - The following defaults are used:
69   - port = as given by server portmap daemon
70   - rsize = 4096
71   - wsize = 4096
72   - timeo = 7
73   - retrans = 3
74   - acregmin = 3
75   - acregmax = 60
76   - acdirmin = 30
77   - acdirmax = 60
78   - flags = hard, nointr, noposix, cto, ac
79   -
80   -
81   -ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>
82   -
83   - This parameter tells the kernel how to configure IP addresses of devices
84   - and also how to set up the IP routing table. It was originally called
85   - `nfsaddrs', but now the boot-time IP configuration works independently of
86   - NFS, so it was renamed to `ip' and the old name remained as an alias for
87   - compatibility reasons.
88   -
89   - If this parameter is missing from the kernel command line, all fields are
90   - assumed to be empty, and the defaults mentioned below apply. In general
91   - this means that the kernel tries to configure everything using
92   - autoconfiguration.
93   -
94   - The <autoconf> parameter can appear alone as the value to the `ip'
95   - parameter (without all the ':' characters before). If the value is
96   - "ip=off" or "ip=none", no autoconfiguration will take place, otherwise
97   - autoconfiguration will take place. The most common way to use this
98   - is "ip=dhcp".
99   -
100   - <client-ip> IP address of the client.
101   -
102   - Default: Determined using autoconfiguration.
103   -
104   - <server-ip> IP address of the NFS server. If RARP is used to determine
105   - the client address and this parameter is NOT empty only
106   - replies from the specified server are accepted.
107   -
108   - Only required for for NFS root. That is autoconfiguration
109   - will not be triggered if it is missing and NFS root is not
110   - in operation.
111   -
112   - Default: Determined using autoconfiguration.
113   - The address of the autoconfiguration server is used.
114   -
115   - <gw-ip> IP address of a gateway if the server is on a different subnet.
116   -
117   - Default: Determined using autoconfiguration.
118   -
119   - <netmask> Netmask for local network interface. If unspecified
120   - the netmask is derived from the client IP address assuming
121   - classful addressing.
122   -
123   - Default: Determined using autoconfiguration.
124   -
125   - <hostname> Name of the client. May be supplied by autoconfiguration,
126   - but its absence will not trigger autoconfiguration.
127   -
128   - Default: Client IP address is used in ASCII notation.
129   -
130   - <device> Name of network device to use.
131   -
132   - Default: If the host only has one device, it is used.
133   - Otherwise the device is determined using
134   - autoconfiguration. This is done by sending
135   - autoconfiguration requests out of all devices,
136   - and using the device that received the first reply.
137   -
138   - <autoconf> Method to use for autoconfiguration. In the case of options
139   - which specify multiple autoconfiguration protocols,
140   - requests are sent using all protocols, and the first one
141   - to reply is used.
142   -
143   - Only autoconfiguration protocols that have been compiled
144   - into the kernel will be used, regardless of the value of
145   - this option.
146   -
147   - off or none: don't use autoconfiguration
148   - (do static IP assignment instead)
149   - on or any: use any protocol available in the kernel
150   - (default)
151   - dhcp: use DHCP
152   - bootp: use BOOTP
153   - rarp: use RARP
154   - both: use both BOOTP and RARP but not DHCP
155   - (old option kept for backwards compatibility)
156   -
157   - Default: any
158   -
159   -
160   -
161   -
162   -3.) Boot Loader
163   - ----------
164   -
165   -To get the kernel into memory different approaches can be used.
166   -They depend on various facilities being available:
167   -
168   -
169   -3.1) Booting from a floppy using syslinux
170   -
171   - When building kernels, an easy way to create a boot floppy that uses
172   - syslinux is to use the zdisk or bzdisk make targets which use
173   - and bzimage images respectively. Both targets accept the
174   - FDARGS parameter which can be used to set the kernel command line.
175   -
176   - e.g.
177   - make bzdisk FDARGS="root=/dev/nfs"
178   -
179   - Note that the user running this command will need to have
180   - access to the floppy drive device, /dev/fd0
181   -
182   - For more information on syslinux, including how to create bootdisks
183   - for prebuilt kernels, see http://syslinux.zytor.com/
184   -
185   - N.B: Previously it was possible to write a kernel directly to
186   - a floppy using dd, configure the boot device using rdev, and
187   - boot using the resulting floppy. Linux no longer supports this
188   - method of booting.
189   -
190   -3.2) Booting from a cdrom using isolinux
191   -
192   - When building kernels, an easy way to create a bootable cdrom that
193   - uses isolinux is to use the isoimage target which uses a bzimage
194   - image. Like zdisk and bzdisk, this target accepts the FDARGS
195   - parameter which can be used to set the kernel command line.
196   -
197   - e.g.
198   - make isoimage FDARGS="root=/dev/nfs"
199   -
200   - The resulting iso image will be arch/<ARCH>/boot/image.iso
201   - This can be written to a cdrom using a variety of tools including
202   - cdrecord.
203   -
204   - e.g.
205   - cdrecord dev=ATAPI:1,0,0 arch/i386/boot/image.iso
206   -
207   - For more information on isolinux, including how to create bootdisks
208   - for prebuilt kernels, see http://syslinux.zytor.com/
209   -
210   -3.2) Using LILO
211   - When using LILO all the necessary command line parameters may be
212   - specified using the 'append=' directive in the LILO configuration
213   - file.
214   -
215   - However, to use the 'root=' directive you also need to create
216   - a dummy root device, which may be removed after LILO is run.
217   -
218   - mknod /dev/boot255 c 0 255
219   -
220   - For information on configuring LILO, please refer to its documentation.
221   -
222   -3.3) Using GRUB
223   - When using GRUB, kernel parameter are simply appended after the kernel
224   - specification: kernel <kernel> <parameters>
225   -
226   -3.4) Using loadlin
227   - loadlin may be used to boot Linux from a DOS command prompt without
228   - requiring a local hard disk to mount as root. This has not been
229   - thoroughly tested by the authors of this document, but in general
230   - it should be possible configure the kernel command line similarly
231   - to the configuration of LILO.
232   -
233   - Please refer to the loadlin documentation for further information.
234   -
235   -3.5) Using a boot ROM
236   - This is probably the most elegant way of booting a diskless client.
237   - With a boot ROM the kernel is loaded using the TFTP protocol. The
238   - authors of this document are not aware of any no commercial boot
239   - ROMs that support booting Linux over the network. However, there
240   - are two free implementations of a boot ROM, netboot-nfs and
241   - etherboot, both of which are available on sunsite.unc.edu, and both
242   - of which contain everything you need to boot a diskless Linux client.
243   -
244   -3.6) Using pxelinux
245   - Pxelinux may be used to boot linux using the PXE boot loader
246   - which is present on many modern network cards.
247   -
248   - When using pxelinux, the kernel image is specified using
249   - "kernel <relative-path-below /tftpboot>". The nfsroot parameters
250   - are passed to the kernel by adding them to the "append" line.
251   - It is common to use serial console in conjunction with pxelinux,
252   - see Documentation/serial-console.txt for more information.
253   -
254   - For more information on isolinux, including how to create bootdisks
255   - for prebuilt kernels, see http://syslinux.zytor.com/
256   -
257   -
258   -
259   -
260   -4.) Credits
261   - -------
262   -
263   - The nfsroot code in the kernel and the RARP support have been written
264   - by Gero Kuhlmann <gero@gkminix.han.de>.
265   -
266   - The rest of the IP layer autoconfiguration code has been written
267   - by Martin Mares <mj@atrey.karlin.mff.cuni.cz>.
268   -
269   - In order to write the initial version of nfsroot I would like to thank
270   - Jens-Uwe Mager <jum@anubis.han.de> for his help.
Documentation/rpc-cache.txt
1   - This document gives a brief introduction to the caching
2   -mechanisms in the sunrpc layer that is used, in particular,
3   -for NFS authentication.
4   -
5   -CACHES
6   -======
7   -The caching replaces the old exports table and allows for
8   -a wide variety of values to be caches.
9   -
10   -There are a number of caches that are similar in structure though
11   -quite possibly very different in content and use. There is a corpus
12   -of common code for managing these caches.
13   -
14   -Examples of caches that are likely to be needed are:
15   - - mapping from IP address to client name
16   - - mapping from client name and filesystem to export options
17   - - mapping from UID to list of GIDs, to work around NFS's limitation
18   - of 16 gids.
19   - - mappings between local UID/GID and remote UID/GID for sites that
20   - do not have uniform uid assignment
21   - - mapping from network identify to public key for crypto authentication.
22   -
23   -The common code handles such things as:
24   - - general cache lookup with correct locking
25   - - supporting 'NEGATIVE' as well as positive entries
26   - - allowing an EXPIRED time on cache items, and removing
27   - items after they expire, and are no longer in-use.
28   - - making requests to user-space to fill in cache entries
29   - - allowing user-space to directly set entries in the cache
30   - - delaying RPC requests that depend on as-yet incomplete
31   - cache entries, and replaying those requests when the cache entry
32   - is complete.
33   - - clean out old entries as they expire.
34   -
35   -Creating a Cache
36   -----------------
37   -
38   -1/ A cache needs a datum to store. This is in the form of a
39   - structure definition that must contain a
40   - struct cache_head
41   - as an element, usually the first.
42   - It will also contain a key and some content.
43   - Each cache element is reference counted and contains
44   - expiry and update times for use in cache management.
45   -2/ A cache needs a "cache_detail" structure that
46   - describes the cache. This stores the hash table, some
47   - parameters for cache management, and some operations detailing how
48   - to work with particular cache items.
49   - The operations requires are:
50   - struct cache_head *alloc(void)
51   - This simply allocates appropriate memory and returns
52   - a pointer to the cache_detail embedded within the
53   - structure
54   - void cache_put(struct kref *)
55   - This is called when the last reference to an item is
56   - dropped. The pointer passed is to the 'ref' field
57   - in the cache_head. cache_put should release any
58   - references create by 'cache_init' and, if CACHE_VALID
59   - is set, any references created by cache_update.
60   - It should then release the memory allocated by
61   - 'alloc'.
62   - int match(struct cache_head *orig, struct cache_head *new)
63   - test if the keys in the two structures match. Return
64   - 1 if they do, 0 if they don't.
65   - void init(struct cache_head *orig, struct cache_head *new)
66   - Set the 'key' fields in 'new' from 'orig'. This may
67   - include taking references to shared objects.
68   - void update(struct cache_head *orig, struct cache_head *new)
69   - Set the 'content' fileds in 'new' from 'orig'.
70   - int cache_show(struct seq_file *m, struct cache_detail *cd,
71   - struct cache_head *h)
72   - Optional. Used to provide a /proc file that lists the
73   - contents of a cache. This should show one item,
74   - usually on just one line.
75   - int cache_request(struct cache_detail *cd, struct cache_head *h,
76   - char **bpp, int *blen)
77   - Format a request to be send to user-space for an item
78   - to be instantiated. *bpp is a buffer of size *blen.
79   - bpp should be moved forward over the encoded message,
80   - and *blen should be reduced to show how much free
81   - space remains. Return 0 on success or <0 if not
82   - enough room or other problem.
83   - int cache_parse(struct cache_detail *cd, char *buf, int len)
84   - A message from user space has arrived to fill out a
85   - cache entry. It is in 'buf' of length 'len'.
86   - cache_parse should parse this, find the item in the
87   - cache with sunrpc_cache_lookup, and update the item
88   - with sunrpc_cache_update.
89   -
90   -
91   -3/ A cache needs to be registered using cache_register(). This
92   - includes it on a list of caches that will be regularly
93   - cleaned to discard old data.
94   -
95   -Using a cache
96   --------------
97   -
98   -To find a value in a cache, call sunrpc_cache_lookup passing a pointer
99   -to the cache_head in a sample item with the 'key' fields filled in.
100   -This will be passed to ->match to identify the target entry. If no
101   -entry is found, a new entry will be create, added to the cache, and
102   -marked as not containing valid data.
103   -
104   -The item returned is typically passed to cache_check which will check
105   -if the data is valid, and may initiate an up-call to get fresh data.
106   -cache_check will return -ENOENT in the entry is negative or if an up
107   -call is needed but not possible, -EAGAIN if an upcall is pending,
108   -or 0 if the data is valid;
109   -
110   -cache_check can be passed a "struct cache_req *". This structure is
111   -typically embedded in the actual request and can be used to create a
112   -deferred copy of the request (struct cache_deferred_req). This is
113   -done when the found cache item is not uptodate, but the is reason to
114   -believe that userspace might provide information soon. When the cache
115   -item does become valid, the deferred copy of the request will be
116   -revisited (->revisit). It is expected that this method will
117   -reschedule the request for processing.
118   -
119   -The value returned by sunrpc_cache_lookup can also be passed to
120   -sunrpc_cache_update to set the content for the item. A second item is
121   -passed which should hold the content. If the item found by _lookup
122   -has valid data, then it is discarded and a new item is created. This
123   -saves any user of an item from worrying about content changing while
124   -it is being inspected. If the item found by _lookup does not contain
125   -valid data, then the content is copied across and CACHE_VALID is set.
126   -
127   -Populating a cache
128   -------------------
129   -
130   -Each cache has a name, and when the cache is registered, a directory
131   -with that name is created in /proc/net/rpc
132   -
133   -This directory contains a file called 'channel' which is a channel
134   -for communicating between kernel and user for populating the cache.
135   -This directory may later contain other files of interacting
136   -with the cache.
137   -
138   -The 'channel' works a bit like a datagram socket. Each 'write' is
139   -passed as a whole to the cache for parsing and interpretation.
140   -Each cache can treat the write requests differently, but it is
141   -expected that a message written will contain:
142   - - a key
143   - - an expiry time
144   - - a content.
145   -with the intention that an item in the cache with the give key
146   -should be create or updated to have the given content, and the
147   -expiry time should be set on that item.
148   -
149   -Reading from a channel is a bit more interesting. When a cache
150   -lookup fails, or when it succeeds but finds an entry that may soon
151   -expire, a request is lodged for that cache item to be updated by
152   -user-space. These requests appear in the channel file.
153   -
154   -Successive reads will return successive requests.
155   -If there are no more requests to return, read will return EOF, but a
156   -select or poll for read will block waiting for another request to be
157   -added.
158   -
159   -Thus a user-space helper is likely to:
160   - open the channel.
161   - select for readable
162   - read a request
163   - write a response
164   - loop.
165   -
166   -If it dies and needs to be restarted, any requests that have not been
167   -answered will still appear in the file and will be read by the new
168   -instance of the helper.
169   -
170   -Each cache should define a "cache_parse" method which takes a message
171   -written from user-space and processes it. It should return an error
172   -(which propagates back to the write syscall) or 0.
173   -
174   -Each cache should also define a "cache_request" method which
175   -takes a cache item and encodes a request into the buffer
176   -provided.
177   -
178   -Note: If a cache has no active readers on the channel, and has had not
179   -active readers for more than 60 seconds, further requests will not be
180   -added to the channel but instead all lookups that do not find a valid
181   -entry will fail. This is partly for backward compatibility: The
182   -previous nfs exports table was deemed to be authoritative and a
183   -failed lookup meant a definite 'no'.
184   -
185   -request/response format
186   ------------------------
187   -
188   -While each cache is free to use it's own format for requests
189   -and responses over channel, the following is recommended as
190   -appropriate and support routines are available to help:
191   -Each request or response record should be printable ASCII
192   -with precisely one newline character which should be at the end.
193   -Fields within the record should be separated by spaces, normally one.
194   -If spaces, newlines, or nul characters are needed in a field they
195   -must be quoted. Two mechanisms are available:
196   -1/ If a field begins '\x' then it must contain an even number of
197   - hex digits, and pairs of these digits provide the bytes in the
198   - field.
199   -2/ otherwise a \ in the field must be followed by 3 octal digits
200   - which give the code for a byte. Other characters are treated
201   - as them selves. At the very least, space, newline, nul, and
202   - '\' must be quoted in this way.
Documentation/sched-rt-group.txt
1   -
2   -
3   -Real-Time group scheduling.
4   -
5   -The problem space:
6   -
7   -In order to schedule multiple groups of realtime tasks each group must
8   -be assigned a fixed portion of the CPU time available. Without a minimum
9   -guarantee a realtime group can obviously fall short. A fuzzy upper limit
10   -is of no use since it cannot be relied upon. Which leaves us with just
11   -the single fixed portion.
12   -
13   -CPU time is divided by means of specifying how much time can be spent
14   -running in a given period. Say a frame fixed realtime renderer must
15   -deliver 25 frames a second, which yields a period of 0.04s. Now say
16   -it will also have to play some music and respond to input, leaving it
17   -with around 80% for the graphics. We can then give this group a runtime
18   -of 0.8 * 0.04s = 0.032s.
19   -
20   -This way the graphics group will have a 0.04s period with a 0.032s runtime
21   -limit.
22   -
23   -Now if the audio thread needs to refill the DMA buffer every 0.005s, but
24   -needs only about 3% CPU time to do so, it can do with a 0.03 * 0.005s
25   -= 0.00015s.
26   -
27   -
28   -The Interface:
29   -
30   -system wide:
31   -
32   -/proc/sys/kernel/sched_rt_period_ms
33   -/proc/sys/kernel/sched_rt_runtime_us
34   -
35   -CONFIG_FAIR_USER_SCHED
36   -
37   -/sys/kernel/uids/<uid>/cpu_rt_runtime_us
38   -
39   -or
40   -
41   -CONFIG_FAIR_CGROUP_SCHED
42   -
43   -/cgroup/<cgroup>/cpu.rt_runtime_us
44   -
45   -[ time is specified in us because the interface is s32; this gives an
46   - operating range of ~35m to 1us ]
47   -
48   -The period takes values in [ 1, INT_MAX ], runtime in [ -1, INT_MAX - 1 ].
49   -
50   -A runtime of -1 specifies runtime == period, ie. no limit.
51   -
52   -New groups get the period from /proc/sys/kernel/sched_rt_period_us and
53   -a runtime of 0.
54   -
55   -Settings are constrained to:
56   -
57   - \Sum_{i} runtime_{i} / global_period <= global_runtime / global_period
58   -
59   -in order to keep the configuration schedulable.
Documentation/scheduler/00-INDEX
... ... @@ -12,6 +12,8 @@
12 12 - information on scheduling domains.
13 13 sched-nice-design.txt
14 14 - How and why the scheduler's nice levels are implemented.
  15 +sched-rt-group.txt
  16 + - real-time group scheduling.
15 17 sched-stats.txt
16 18 - information on schedstats (Linux Scheduler Statistics).
Documentation/scheduler/sched-rt-group.txt
  1 +
  2 +
  3 +Real-Time group scheduling.
  4 +
  5 +The problem space:
  6 +
  7 +In order to schedule multiple groups of realtime tasks each group must
  8 +be assigned a fixed portion of the CPU time available. Without a minimum
  9 +guarantee a realtime group can obviously fall short. A fuzzy upper limit
  10 +is of no use since it cannot be relied upon. Which leaves us with just
  11 +the single fixed portion.
  12 +
  13 +CPU time is divided by means of specifying how much time can be spent
  14 +running in a given period. Say a frame fixed realtime renderer must
  15 +deliver 25 frames a second, which yields a period of 0.04s. Now say
  16 +it will also have to play some music and respond to input, leaving it
  17 +with around 80% for the graphics. We can then give this group a runtime
  18 +of 0.8 * 0.04s = 0.032s.
  19 +
  20 +This way the graphics group will have a 0.04s period with a 0.032s runtime
  21 +limit.
  22 +
  23 +Now if the audio thread needs to refill the DMA buffer every 0.005s, but
  24 +needs only about 3% CPU time to do so, it can do with a 0.03 * 0.005s
  25 += 0.00015s.
  26 +
  27 +
  28 +The Interface:
  29 +
  30 +system wide:
  31 +
  32 +/proc/sys/kernel/sched_rt_period_ms
  33 +/proc/sys/kernel/sched_rt_runtime_us
  34 +
  35 +CONFIG_FAIR_USER_SCHED
  36 +
  37 +/sys/kernel/uids/<uid>/cpu_rt_runtime_us
  38 +
  39 +or
  40 +
  41 +CONFIG_FAIR_CGROUP_SCHED
  42 +
  43 +/cgroup/<cgroup>/cpu.rt_runtime_us
  44 +
  45 +[ time is specified in us because the interface is s32; this gives an
  46 + operating range of ~35m to 1us ]
  47 +
  48 +The period takes values in [ 1, INT_MAX ], runtime in [ -1, INT_MAX - 1 ].
  49 +
  50 +A runtime of -1 specifies runtime == period, ie. no limit.
  51 +
  52 +New groups get the period from /proc/sys/kernel/sched_rt_period_us and
  53 +a runtime of 0.
  54 +
  55 +Settings are constrained to:
  56 +
  57 + \Sum_{i} runtime_{i} / global_period <= global_runtime / global_period
  58 +
  59 +in order to keep the configuration schedulable.
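
The runtime arithmetic and the schedulability constraint in the file above can be checked with a few lines of C. This is a userspace sketch only: the function names are invented for illustration, all times are in microseconds, and the special runtime value -1 is not handled. Because every term in the constraint is divided by the same global_period, the check reduces to comparing the sum of the group runtimes against the global runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Runtime (us) for a group that needs share_percent of each period. */
int64_t rt_runtime_us(int64_t period_us, int share_percent)
{
    assert(period_us > 0);
    assert(share_percent >= 0 && share_percent <= 100);
    return period_us * share_percent / 100;
}

/*
 * Returns 1 if sum(runtime_i) <= global_runtime, which is the documented
 * constraint with the common global_period divided out; 0 otherwise.
 */
int rt_config_schedulable(const int64_t *group_runtime_us, size_t n,
                          int64_t global_runtime_us)
{
    int64_t sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += group_runtime_us[i];
    return sum <= global_runtime_us;
}
```

For the renderer in the text, rt_runtime_us(40000, 80) gives 32000 us (0.032s), and for the audio thread rt_runtime_us(5000, 3) gives 150 us (0.00015s).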
Documentation/spinlocks.txt
... ... @@ -5,6 +5,28 @@
5 5 __SPIN_LOCK_UNLOCKED()/__RW_LOCK_UNLOCKED() as appropriate for static
6 6 initialization.
7 7  
  8 +Most of the time, you can simply turn:
  9 +
  10 + static spinlock_t xxx_lock = SPIN_LOCK_UNLOCKED;
  11 +
  12 +into:
  13 +
  14 + static DEFINE_SPINLOCK(xxx_lock);
  15 +
  16 +Static structure member variables go from:
  17 +
  18 + struct foo bar {
  19 + .lock = SPIN_LOCK_UNLOCKED;
  20 + };
  21 +
  22 +to:
  23 +
  24 + struct foo bar {
  25 + .lock = __SPIN_LOCK_UNLOCKED(bar.lock);
  26 + };
  27 +
  28 +Declaration of static rw_locks undergo a similar transformation.
  29 +
8 30 Dynamic initialization, when necessary, may be performed as
9 31 demonstrated below.
10 32  
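
The static-initialization conversion added in this hunk can be tried outside the kernel with stand-in definitions. Everything below is a userspace mock written for illustration: the spinlock_t layout, the two macros and struct foo are assumptions, not the definitions from include/linux/spinlock.h. The struct initializer is written with the "=" and the comma that a compilable designated initializer needs.

```c
#include <assert.h>
#include <string.h>

/* Mock lock type: records only a locked flag and the lock's name. */
typedef struct {
    int locked;
    const char *name;
} spinlock_t;

/* Stand-ins for the kernel macros; the argument names the lock. */
#define __SPIN_LOCK_UNLOCKED(lockname) { .locked = 0, .name = #lockname }
#define DEFINE_SPINLOCK(x) spinlock_t x = __SPIN_LOCK_UNLOCKED(x)

struct foo {
    spinlock_t lock;
};

/* Standalone lock: the new form replaces "= SPIN_LOCK_UNLOCKED". */
static DEFINE_SPINLOCK(xxx_lock);

/* Lock embedded in a statically initialized structure. */
static struct foo bar = {
    .lock = __SPIN_LOCK_UNLOCKED(bar.lock),
};

/* Small accessors so the mock objects can be inspected. */
const spinlock_t *get_xxx_lock(void) { return &xxx_lock; }
const spinlock_t *get_bar_lock(void) { return &bar.lock; }

int lock_is_named(const spinlock_t *l, const char *name)
{
    assert(l != NULL && name != NULL);
    return strcmp(l->name, name) == 0;
}
```

In this mock the macro argument is only turned into a name string, which shows why the new macros take the lock's own identifier as a parameter where SPIN_LOCK_UNLOCKED took none.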
... ... @@ -1744,10 +1744,10 @@
1744 1744 If you want your Linux box to mount its whole root file system (the
1745 1745 one containing the directory /) from some other computer over the
1746 1746 net via NFS (presumably because your box doesn't have a hard disk),
1747   - say Y. Read <file:Documentation/nfsroot.txt> for details. It is
1748   - likely that in this case, you also want to say Y to "Kernel level IP
1749   - autoconfiguration" so that your box can discover its network address
1750   - at boot time.
  1747 + say Y. Read <file:Documentation/filesystems/nfsroot.txt> for
  1748 + details. It is likely that in this case, you also want to say Y to
  1749 + "Kernel level IP autoconfiguration" so that your box can discover
  1750 + its network address at boot time.
1751 1751  
1752 1752 Most people say N here.
1753 1753  
include/linux/spinlock.h
... ... @@ -341,6 +341,9 @@
341 341 * atomic_dec_and_lock - lock on reaching reference count zero
342 342 * @atomic: the atomic counter
343 343 * @lock: the spinlock in question
  344 + *
  345 + * Decrements @atomic by 1. If the result is 0, returns true and locks
  346 + * @lock. Returns false for all other cases.
344 347 */
345 348 extern int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock);
346 349 #define atomic_dec_and_lock(atomic, lock) \
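
The behavior spelled out in the new kerneldoc comment can be modelled in userspace with C11 atomics. This is a sketch under stated assumptions, not the kernel implementation: struct sketch_lock, its acquire/release helpers and dec_and_lock_sketch are hypothetical names, and the lock is a trivial test-and-set flag. The counter is decremented without the lock while it cannot reach zero; only the decrement that might reach zero is done with the lock held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock. */
struct sketch_lock {
    atomic_flag held;
};

void sketch_lock_acquire(struct sketch_lock *l)
{
    while (atomic_flag_test_and_set(&l->held))
        ; /* spin until the flag was previously clear */
}

void sketch_lock_release(struct sketch_lock *l)
{
    atomic_flag_clear(&l->held);
}

/*
 * Decrements *counter by 1. Returns true with the lock held if the
 * result is 0; returns false with the lock not held in all other cases.
 */
bool dec_and_lock_sketch(atomic_int *counter, struct sketch_lock *lock)
{
    int old = atomic_load(counter);

    assert(old > 0);
    /* Fast path: the result cannot be 0, so no lock is needed. */
    while (old > 1) {
        if (atomic_compare_exchange_weak(counter, &old, old - 1))
            return false;
        /* On failure, old now holds the current value; retry. */
    }

    /* Slow path: the decrement may reach 0, so take the lock first. */
    sketch_lock_acquire(lock);
    if (atomic_fetch_sub(counter, 1) == 1)
        return true;
    sketch_lock_release(lock);
    return false;
}
```

A caller that gets true back owns the lock and is expected to release it, typically after unlinking the object whose reference count just dropped to zero.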
... ... @@ -160,7 +160,7 @@
160 160  
161 161 If unsure, say Y. Note that if you want to use DHCP, a DHCP server
162 162 must be operating on your network. Read
163   - <file:Documentation/nfsroot.txt> for details.
  163 + <file:Documentation/filesystems/nfsroot.txt> for details.
164 164  
165 165 config IP_PNP_BOOTP
166 166 bool "IP: BOOTP support"
... ... @@ -175,7 +175,7 @@
175 175 does BOOTP itself, providing all necessary information on the kernel
176 176 command line, you can say N here. If unsure, say Y. Note that if you
177 177 want to use BOOTP, a BOOTP server must be operating on your network.
178   - Read <file:Documentation/nfsroot.txt> for details.
  178 + Read <file:Documentation/filesystems/nfsroot.txt> for details.
179 179  
180 180 config IP_PNP_RARP
181 181 bool "IP: RARP support"
... ... @@ -187,8 +187,8 @@
187 187 discovered automatically at boot time using the RARP protocol (an
188 188 older protocol which is being obsoleted by BOOTP and DHCP), say Y
189 189 here. Note that if you want to use RARP, a RARP server must be
190   - operating on your network. Read <file:Documentation/nfsroot.txt> for
191   - details.
  190 + operating on your network. Read
  191 + <file:Documentation/filesystems/nfsroot.txt> for details.
192 192  
193 193 # not yet ready..
194 194 # bool ' IP: ARP support' CONFIG_IP_PNP_ARP
... ... @@ -1411,7 +1411,7 @@
1411 1411  
1412 1412 /*
1413 1413 * Decode any IP configuration options in the "ip=" or "nfsaddrs=" kernel
1414   - * command line parameter. See Documentation/nfsroot.txt.
  1414 + * command line parameter. See Documentation/filesystems/nfsroot.txt.
1415 1415 */
1416 1416 static int __init ic_proto_name(char *name)
1417 1417 {