fs/cramfs/README 6.1 KB
  Notes on Filesystem Layout
  --------------------------
  
  These notes describe what mkcramfs generates.  Kernel requirements are
  a bit looser, e.g. it doesn't care if the <file_data> items are
  swapped around (though it does care that directory entries (inodes) in
  a given directory are contiguous, as this is used by readdir).
  
  All data is currently in host-endian format; neither mkcramfs nor the
  kernel ever does swabbing.  (See section `Block Size' below.)
  
  <filesystem>:
  	<superblock>
  	<directory_structure>
  	<data>
  
  <superblock>: struct cramfs_super (see cramfs_fs.h).
  
  <directory_structure>:
  	For each file:
  		struct cramfs_inode (see cramfs_fs.h).
  		Filename.  Not generally null-terminated, but it is
  		 null-padded to a multiple of 4 bytes.
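
  As a small illustration of the padding rule above, the on-disk space
  taken by a filename can be computed by rounding its length up to a
  multiple of 4.  This is only a sketch; the helper name
  padded_namelen is made up for this example and does not appear in
  mkcramfs or the kernel.

```c
#include <stddef.h>

/* Hypothetical helper: round a filename length up to a multiple of 4. */
static size_t padded_namelen(size_t namelen)
{
	return (namelen + 3) & ~(size_t)3;
}
```

  With this rule a 5-byte name occupies 8 bytes on disk, the last 3 of
  which are NUL.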
  
  The order of inode traversal is described as "width-first" (not to be
  confused with breadth-first); i.e. like depth-first but listing all of
  a directory's entries before recursing down its subdirectories: the
  same order as `ls -AUR' (but without the /^\..*:$/ directory header
  lines); put another way, the same order as `find -type d -exec
  ls -AU1 {} \;'.
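
  The "width-first" order can be sketched over an in-memory tree as
  follows.  The struct node type and the width_first function are
  hypothetical and exist only for this illustration; mkcramfs uses its
  own data structures.

```c
#include <stddef.h>

/* Hypothetical in-memory tree node; not a cramfs structure. */
struct node {
	const char *name;
	const struct node *const *children;	/* NULL for non-directories */
	size_t nchildren;
};

/*
 * Append names to out[] in width-first order, starting at index n:
 * first every entry of dir, then each subdirectory in turn.
 * Returns the new count.  The caller must size out[] adequately.
 */
static size_t width_first(const struct node *dir, const char **out, size_t n)
{
	size_t i;

	for (i = 0; i < dir->nchildren; i++)
		out[n++] = dir->children[i]->name;
	for (i = 0; i < dir->nchildren; i++)
		if (dir->children[i]->nchildren)
			n = width_first(dir->children[i], out, n);
	return n;
}
```

  Note that a plain file listed after a subdirectory still appears
  before that subdirectory's contents, which is what distinguishes this
  order from ordinary depth-first traversal.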
  
  Beginning in 2.4.7, directory entries are sorted.  This optimization
  allows cramfs_lookup to return more quickly when a filename does not
  exist, speeds up user-space directory sorts, etc.
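
  To see why sorting helps a failed lookup, consider the following
  sketch of a linear scan that stops early.  It is not the kernel's
  cramfs_lookup; the function name lookup_sorted is invented here, and
  it assumes NUL-terminated names sorted in strcmp() order for
  simplicity (on-disk names are not generally NUL-terminated).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: search names[0..n-1], sorted in strcmp() order.
 * Returns the index of target, or -1 if it is absent.
 */
static long lookup_sorted(const char *const *names, size_t n,
			  const char *target)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int cmp = strcmp(names[i], target);

		if (cmp == 0)
			return (long)i;
		if (cmp > 0)
			break;	/* target would have appeared before here */
	}
	return -1;
}
```

  Without sorting, a miss would always cost a scan of the whole
  directory.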
  
  <data>:
  	One <file_data> for each file that's either a symlink or a
  	 regular file of non-zero st_size.
  
  <file_data>:
  	nblocks * <block_pointer>
  	 (where nblocks = (st_size - 1) / blksize + 1)
  	nblocks * <block>
  	padding to multiple of 4 bytes
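
  The nblocks formula above is a rounded-up division.  A minimal
  sketch in C (the function name is chosen for this example only):

```c
#include <stdint.h>

/*
 * Number of blocks for a file of st_size bytes.
 * Only meaningful for st_size > 0; empty files have no <file_data>.
 */
static uint32_t nblocks(uint32_t st_size, uint32_t blksize)
{
	return (st_size - 1) / blksize + 1;
}
```

  For example, with blksize = 4096 a 4096-byte file needs one block and
  a 4097-byte file needs two.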
  
  The i'th <block_pointer> for a file stores the byte offset of the
  *end* of the i'th <block> (i.e. one past the last byte, which is the
  same as the start of the (i+1)'th <block> if there is one).  The first
  <block> immediately follows the last <block_pointer> for the file.
  <block_pointer>s are each 32 bits long.
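
  Because each pointer records the end of its block, the start and
  compressed length of block i follow from two neighbouring pointers.
  The sketch below assumes the pointers have already been read into
  host order; struct extent, block_extent and the ptrs_off parameter
  are names made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical result type: where a compressed block lives in the image. */
struct extent {
	uint32_t start;	/* byte offset of the block's first byte */
	uint32_t len;	/* compressed length in bytes */
};

/*
 * ptrs:     the file's nblocks block pointers
 * ptrs_off: byte offset of the first block pointer within the image
 * i:        block index, 0 <= i < nblocks
 */
static struct extent block_extent(const uint32_t *ptrs, uint32_t nblocks,
				  uint32_t ptrs_off, uint32_t i)
{
	struct extent e;

	/* Block 0 starts right after the pointer array (4 bytes each). */
	e.start = i ? ptrs[i - 1] : ptrs_off + 4 * nblocks;
	e.len = ptrs[i] - e.start;
	return e;
}
```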
  
  The order of <file_data>'s is a depth-first descent of the directory
  tree, i.e. the same order as `find -size +0 \( -type f -o -type l \)
  -print'.
  
  
  <block>: The i'th <block> is the output of zlib's compress function
  applied to the i'th blksize-sized chunk of the input data.
  (For the last <block> of the file, the input may of course be smaller.)
  Each <block> may be a different size.  (See <block_pointer> above.)
  <block>s are merely byte-aligned, not generally u32-aligned.
  
  
  Holes
  -----
  
  This kernel supports cramfs holes (i.e. [efficient representation of]
  blocks in uncompressed data consisting entirely of NUL bytes), but by
  default mkcramfs doesn't test for & create holes, since cramfs in
  kernels up to at least 2.3.39 didn't support holes.  Run mkcramfs
  with -z if you want it to create files that can have holes in them.
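
  The hole test itself is simple: a block qualifies if every byte of
  its uncompressed data is NUL.  A sketch of such a check follows; the
  name is_all_nul is chosen for this example and is not necessarily
  what mkcramfs calls it.

```c
#include <stddef.h>

/* Return 1 if the len bytes at buf are all NUL, otherwise 0. */
static int is_all_nul(const unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (buf[i])
			return 0;
	return 1;
}
```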
  
  
  Tools
  -----
  
  The cramfs user-space tools, including mkcramfs and cramfsck, are
  located at <http://sourceforge.net/projects/cramfs/>.
  
  
  Future Development
  ==================
  
  Block Size
  ----------
  
  (Block size in cramfs refers to the size of input data that is
  compressed at a time.  It's intended to be somewhere around
  PAGE_SIZE for cramfs_readpage's convenience.)
  
  The superblock ought to indicate the block size that the fs was
  written for, since comments in <linux/pagemap.h> indicate that
  PAGE_SIZE may grow in future (if I interpret the comment
  correctly).
  Currently, mkcramfs #define's PAGE_SIZE as 4096 and uses that
  for blksize, whereas Linux-2.3.39 uses its own PAGE_SIZE (which
  can be as large as 32KB on arm).  This discrepancy is a bug,
  though it's not clear which should be changed.
  One option is to change mkcramfs to take its PAGE_SIZE from
  <asm/page.h>.  Personally I don't like this option, but it does
  require the least amount of change: just change `#define
  PAGE_SIZE (4096)' to `#include <asm/page.h>'.  The disadvantage
  is that the generated cramfs cannot always be shared between different
  kernels, not even necessarily kernels of the same architecture if
  PAGE_SIZE is subject to change between kernel versions
  (currently possible with arm and ia64).
  
  The remaining options try to make cramfs more sharable.
  
  One part of that is addressing endianness.  The two options here are
  `always use little-endian' (like ext2fs) or `writer chooses
  endianness; kernel adapts at runtime'.  Little-endian wins because of
  code simplicity and little CPU overhead even on big-endian machines.
  
  The cost of swabbing is changing the code to use the le32_to_cpu
  etc. macros as used by ext2fs.  We don't need to swab the compressed
  data, only the superblock, inodes and block pointers.
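
  For user-space code that has no le32_to_cpu, the same effect can be
  had by assembling each 32-bit field from its bytes, which works
  regardless of host endianness.  This is a sketch only; get_le32 is a
  name invented for this example.

```c
#include <stdint.h>

/* Read a little-endian 32-bit value from four bytes at p. */
static uint32_t get_le32(const unsigned char *p)
{
	return (uint32_t)p[0] |
	       (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 |
	       (uint32_t)p[3] << 24;
}
```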
  
  
  The other part of making cramfs more sharable is choosing a block
  size.  The options are:
  
    1. Always 4096 bytes.
  
    2. Writer chooses blocksize; kernel adapts but rejects blocksize >
       PAGE_SIZE.
  
    3. Writer chooses blocksize; kernel adapts even to blocksize >
       PAGE_SIZE.
  
  It's easy enough to change the kernel to use a smaller value than
  PAGE_SIZE: just make cramfs_readpage read multiple blocks.
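
  Under option 2, the mount-time check and the number of blocks that a
  readpage would need to decompress per page might look like the
  sketch below.  The function blocks_per_page is hypothetical, and the
  requirement that blksize divide the page size evenly is an
  assumption made here to keep the example simple.

```c
#include <stdint.h>

/*
 * Returns how many blocks cover one page, or 0 if an image with this
 * blksize would be rejected (zero, larger than the page, or, by
 * assumption in this sketch, not an exact divisor of the page size).
 */
static uint32_t blocks_per_page(uint32_t page_size, uint32_t blksize)
{
	if (blksize == 0 || blksize > page_size || page_size % blksize)
		return 0;
	return page_size / blksize;
}
```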

  The cost of option 1 is that kernels with a larger PAGE_SIZE
  value don't get as good compression as they can.
  
  The cost of option 2 relative to option 1 is that the code uses
  variables instead of #define'd constants.  The gain is that people
  with kernels having larger PAGE_SIZE can make use of that if
  they don't mind their cramfs being inaccessible to kernels with
  smaller PAGE_SIZE values.
  
  Option 3 is easy to implement if we don't mind being CPU-inefficient:
  e.g. get readpage to decompress to a buffer of size MAX_BLKSIZE (which
  must be no larger than 32KB) and discard what it doesn't need.
  Getting readpage to read into all the covered pages is harder.
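
  For the CPU-inefficient variant, readpage only needs to know which
  block holds the requested page and where in the decompressed buffer
  that page begins.  A sketch of that arithmetic, assuming blksize is
  a multiple of the page size (struct slice and page_to_slice are
  names made up for this example):

```c
#include <stdint.h>

/* Hypothetical result: which block to decompress, and where to copy from. */
struct slice {
	uint32_t block;		/* index of the block containing the page */
	uint32_t offset;	/* byte offset of the page within that block */
};

static struct slice page_to_slice(uint32_t page_index, uint32_t page_size,
				  uint32_t blksize)
{
	uint64_t pos = (uint64_t)page_index * page_size;
	struct slice s;

	s.block = (uint32_t)(pos / blksize);
	s.offset = (uint32_t)(pos % blksize);
	return s;
}
```

  Everything in the decompressed buffer outside
  [offset, offset + page_size) is then discarded, which is the
  inefficiency referred to above.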
  
  The main advantage of option 3 over options 1 and 2 is better
  compression.  The cost is greater complexity.  Probably not worth it,
  but I hope someone will disagree.  (If it is implemented, then I'll
  re-use that code in e2compr.)
  
  
  Another cost of options 2 and 3 over option 1 is making mkcramfs use
  a different block size, but that just means adding and parsing a -b
  option.
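
  Parsing such an option's argument could be as small as the following
  sketch.  The function parse_blksize is hypothetical, and the
  power-of-two requirement is an assumption of this example rather
  than anything decided above.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a block size given as a string.
 * Returns 0 and stores the value in *out on success, -1 on error.
 */
static int parse_blksize(const char *arg, unsigned long *out)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(arg, &end, 0);
	if (errno || end == arg || *end != '\0')
		return -1;	/* not a clean number */
	if (v == 0 || (v & (v - 1)))
		return -1;	/* assumed: must be a power of two */
	*out = v;
	return 0;
}
```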
  
  
  Inode Size
  ----------
  
  Given that cramfs will probably be used for CDs etc. as well as just
  silicon ROMs, it might make sense to expand the inode a little from
  its current 12 bytes.  Inodes other than the root inode are followed
  by filename, so the expansion doesn't even have to be a multiple of 4
  bytes.
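
  For reference, the 12 bytes are three 32-bit words of bit-fields.
  The sketch below mirrors the field widths as I understand them from
  cramfs_fs.h; treat the widths as an assumption and check the header
  before relying on them.  The struct name is changed to make clear
  this is not the kernel's definition, and bit-field layout is
  compiler-dependent.

```c
#include <stdint.h>

/* Illustrative copy of the inode layout; see cramfs_fs.h for the real one. */
struct cramfs_inode_sketch {
	uint32_t mode:16, uid:16;
	uint32_t size:24, gid:8;
	uint32_t namelen:6, offset:26;
};
```

  Any expansion would have to decide which of these fields to widen;
  the 24-bit size and 8-bit gid are the obvious candidates.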