Commit 4af4c52f34606bdaab6930a845550c6fb02078a4

Authored by Oleg Drokin
Committed by Linus Torvalds
1 parent 8d8c85117f

[PATCH] Missed error checking for intent's filp in open_namei().

It seems there is an error check missing in open_namei() for errors returned
through intent.open.file (from lookup_instantiate_filp()).

If a plain open is performed, then such a check is done inside
__path_lookup_intent_open(), called from path_lookup_open().  But when the
open is performed with the O_CREAT flag set, __path_lookup_intent_open() is
only called with LOOKUP_PARENT set, where no file opening can occur yet.

Later on, lookup_hash() is called, where the actual opening might take place
and intent.open.file may be filled in.  If it is filled with an error value
of some sort, then the kernel attempts to dereference this error value as an
address (with a corresponding oops) in nameidata_to_filp(), called from
filp_open().
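
To illustrate the failure mode described above, here is a minimal userspace sketch of the error-pointer idiom. The helpers are simplified stand-ins modeled on the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(); the MAX_ERRNO threshold, the *_sketch types and intent_to_file() are assumptions made for illustration and are not taken from fs/namei.c or from this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel helpers (assumption: a small negative
 * errno is stored in the pointer value itself, so it lands in the topmost
 * page of the address space and is never a valid object address). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical miniature versions of struct file and the open intent. */
struct file_sketch {
	int fd;
};

struct open_intent_sketch {
	struct file_sketch *file;	/* real pointer or encoded errno */
};

/*
 * Hypothetical consumer of the intent.  Without the IS_ERR() test, the
 * caller would go on to dereference an encoded errno as if it were an
 * address, which is the kind of crash the commit message describes.
 * Returns 0 and stores the file in *out, or returns a negative errno.
 */
static int intent_to_file(struct open_intent_sketch *it,
			  struct file_sketch **out)
{
	assert(it != NULL && out != NULL);

	if (IS_ERR(it->file))
		return (int)PTR_ERR(it->file);

	*out = it->file;
	return 0;
}
```

A caller would invoke intent_to_file() right after the step that may fill in the intent, and bail out on a negative return before touching the file pointer.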

While this is relatively simple to work around in the ->lookup() method by
just checking the lookup_instantiate_filp() return value and returning an
error as needed, this is not so easy in ->d_revalidate(), where we can only
return "yes, dentry is valid" or "no, dentry is invalid, perform full lookup
again", and just returning 0 on error would cause an extra lookup (with
potentially costly extra RPCs).
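
The asymmetry between the two methods can be sketched as follows. This is a rough userspace model, not the kernel's actual interfaces: lookup_sketch(), revalidate_sketch(), check_intent() and the *_sketch types are hypothetical names, and the error-pointer helpers are the same simplified stand-ins as in the kernel's err.h idiom. The point it tries to show is that a lookup-style method can hand an error back through its pointer return value, while a revalidate-style method has only a yes/no answer, so the error can travel only through the intent and has to be checked by the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* assumption: same convention as the kernel's */

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical miniature types. */
struct dentry_sketch {
	int unused;
};

struct nameidata_sketch {
	void *intent_file;	/* real file pointer or encoded errno */
};

/*
 * Lookup-style method: the pointer return value doubles as an error
 * channel, so a failed open (open_err < 0) can be reported directly.
 * NULL here means "no error, use the dentry that was passed in".
 */
static struct dentry_sketch *lookup_sketch(struct dentry_sketch *dentry,
					   long open_err)
{
	(void)dentry;
	if (open_err)
		return ERR_PTR(open_err);
	return NULL;
}

/*
 * Revalidate-style method: it may only answer 1 ("valid") or 0
 * ("invalid, redo the lookup").  Answering 0 on an open error would
 * trigger a second, possibly expensive lookup, so the error is left in
 * the intent and the dentry is still reported as valid.
 */
static int revalidate_sketch(struct dentry_sketch *dentry,
			     struct nameidata_sketch *nd, long open_err)
{
	(void)dentry;
	if (open_err)
		nd->intent_file = ERR_PTR(open_err);
	return 1;
}

/* Caller-side check that picks up an error left behind in the intent. */
static int check_intent(const struct nameidata_sketch *nd)
{
	return IS_ERR(nd->intent_file) ? (int)PTR_ERR(nd->intent_file) : 0;
}
```

In this model the caller has to run check_intent() after either path, since only the lookup-style path reports the failure through its own return value.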

So in short, I believe that there should be no difference in error handling
for opening a file and creating a file in open_namei() and propose this
simple patch as a solution.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Showing 1 changed file with 6 additions and 0 deletions

1 /* 1 /*
2 * linux/fs/namei.c 2 * linux/fs/namei.c
3 * 3 *
4 * Copyright (C) 1991, 1992 Linus Torvalds 4 * Copyright (C) 1991, 1992 Linus Torvalds
5 */ 5 */
6 6
7 /* 7 /*
8 * Some corrections by tytso. 8 * Some corrections by tytso.
9 */ 9 */
10 10
11 /* [Feb 1997 T. Schoebel-Theuer] Complete rewrite of the pathname 11 /* [Feb 1997 T. Schoebel-Theuer] Complete rewrite of the pathname
12 * lookup logic. 12 * lookup logic.
13 */ 13 */
14 /* [Feb-Apr 2000, AV] Rewrite to the new namespace architecture. 14 /* [Feb-Apr 2000, AV] Rewrite to the new namespace architecture.
15 */ 15 */
16 16
17 #include <linux/init.h> 17 #include <linux/init.h>
18 #include <linux/module.h> 18 #include <linux/module.h>
19 #include <linux/slab.h> 19 #include <linux/slab.h>
20 #include <linux/fs.h> 20 #include <linux/fs.h>
21 #include <linux/namei.h> 21 #include <linux/namei.h>
22 #include <linux/quotaops.h> 22 #include <linux/quotaops.h>
23 #include <linux/pagemap.h> 23 #include <linux/pagemap.h>
24 #include <linux/fsnotify.h> 24 #include <linux/fsnotify.h>
25 #include <linux/smp_lock.h> 25 #include <linux/smp_lock.h>
26 #include <linux/personality.h> 26 #include <linux/personality.h>
27 #include <linux/security.h> 27 #include <linux/security.h>
28 #include <linux/syscalls.h> 28 #include <linux/syscalls.h>
29 #include <linux/mount.h> 29 #include <linux/mount.h>
30 #include <linux/audit.h> 30 #include <linux/audit.h>
31 #include <linux/capability.h> 31 #include <linux/capability.h>
32 #include <linux/file.h> 32 #include <linux/file.h>
33 #include <linux/fcntl.h> 33 #include <linux/fcntl.h>
34 #include <linux/namei.h> 34 #include <linux/namei.h>
35 #include <asm/namei.h> 35 #include <asm/namei.h>
36 #include <asm/uaccess.h> 36 #include <asm/uaccess.h>
37 37
38 #define ACC_MODE(x) ("\000\004\002\006"[(x)&O_ACCMODE]) 38 #define ACC_MODE(x) ("\000\004\002\006"[(x)&O_ACCMODE])
39 39
40 /* [Feb-1997 T. Schoebel-Theuer] 40 /* [Feb-1997 T. Schoebel-Theuer]
41 * Fundamental changes in the pathname lookup mechanisms (namei) 41 * Fundamental changes in the pathname lookup mechanisms (namei)
42 * were necessary because of omirr. The reason is that omirr needs 42 * were necessary because of omirr. The reason is that omirr needs
43 * to know the _real_ pathname, not the user-supplied one, in case 43 * to know the _real_ pathname, not the user-supplied one, in case
44 * of symlinks (and also when transname replacements occur). 44 * of symlinks (and also when transname replacements occur).
45 * 45 *
46 * The new code replaces the old recursive symlink resolution with 46 * The new code replaces the old recursive symlink resolution with
47 * an iterative one (in case of non-nested symlink chains). It does 47 * an iterative one (in case of non-nested symlink chains). It does
48 * this with calls to <fs>_follow_link(). 48 * this with calls to <fs>_follow_link().
49 * As a side effect, dir_namei(), _namei() and follow_link() are now 49 * As a side effect, dir_namei(), _namei() and follow_link() are now
50 * replaced with a single function lookup_dentry() that can handle all 50 * replaced with a single function lookup_dentry() that can handle all
51 * the special cases of the former code. 51 * the special cases of the former code.
52 * 52 *
53 * With the new dcache, the pathname is stored at each inode, at least as 53 * With the new dcache, the pathname is stored at each inode, at least as
54 * long as the refcount of the inode is positive. As a side effect, the 54 * long as the refcount of the inode is positive. As a side effect, the
55 * size of the dcache depends on the inode cache and thus is dynamic. 55 * size of the dcache depends on the inode cache and thus is dynamic.
56 * 56 *
57 * [29-Apr-1998 C. Scott Ananian] Updated above description of symlink 57 * [29-Apr-1998 C. Scott Ananian] Updated above description of symlink
58 * resolution to correspond with current state of the code. 58 * resolution to correspond with current state of the code.
59 * 59 *
60 * Note that the symlink resolution is not *completely* iterative. 60 * Note that the symlink resolution is not *completely* iterative.
61 * There is still a significant amount of tail- and mid- recursion in 61 * There is still a significant amount of tail- and mid- recursion in
62 * the algorithm. Also, note that <fs>_readlink() is not used in 62 * the algorithm. Also, note that <fs>_readlink() is not used in
63 * lookup_dentry(): lookup_dentry() on the result of <fs>_readlink() 63 * lookup_dentry(): lookup_dentry() on the result of <fs>_readlink()
64 * may return different results than <fs>_follow_link(). Many virtual 64 * may return different results than <fs>_follow_link(). Many virtual
65 * filesystems (including /proc) exhibit this behavior. 65 * filesystems (including /proc) exhibit this behavior.
66 */ 66 */
67 67
68 /* [24-Feb-97 T. Schoebel-Theuer] Side effects caused by new implementation: 68 /* [24-Feb-97 T. Schoebel-Theuer] Side effects caused by new implementation:
69 * New symlink semantics: when open() is called with flags O_CREAT | O_EXCL 69 * New symlink semantics: when open() is called with flags O_CREAT | O_EXCL
70 * and the name already exists in form of a symlink, try to create the new 70 * and the name already exists in form of a symlink, try to create the new
71 * name indicated by the symlink. The old code always complained that the 71 * name indicated by the symlink. The old code always complained that the
72 * name already exists, due to not following the symlink even if its target 72 * name already exists, due to not following the symlink even if its target
73 * is nonexistent. The new semantics affects also mknod() and link() when 73 * is nonexistent. The new semantics affects also mknod() and link() when
74 * the name is a symlink pointing to a non-existant name. 74 * the name is a symlink pointing to a non-existant name.
75 * 75 *
76 * I don't know which semantics is the right one, since I have no access 76 * I don't know which semantics is the right one, since I have no access
77 * to standards. But I found by trial that HP-UX 9.0 has the full "new" 77 * to standards. But I found by trial that HP-UX 9.0 has the full "new"
78 * semantics implemented, while SunOS 4.1.1 and Solaris (SunOS 5.4) have the 78 * semantics implemented, while SunOS 4.1.1 and Solaris (SunOS 5.4) have the
79 * "old" one. Personally, I think the new semantics is much more logical. 79 * "old" one. Personally, I think the new semantics is much more logical.
80 * Note that "ln old new" where "new" is a symlink pointing to a non-existing 80 * Note that "ln old new" where "new" is a symlink pointing to a non-existing
81 * file does succeed in both HP-UX and SunOs, but not in Solaris 81 * file does succeed in both HP-UX and SunOs, but not in Solaris
82 * and in the old Linux semantics. 82 * and in the old Linux semantics.
83 */ 83 */
84 84
85 /* [16-Dec-97 Kevin Buhr] For security reasons, we change some symlink 85 /* [16-Dec-97 Kevin Buhr] For security reasons, we change some symlink
86 * semantics. See the comments in "open_namei" and "do_link" below. 86 * semantics. See the comments in "open_namei" and "do_link" below.
87 * 87 *
88 * [10-Sep-98 Alan Modra] Another symlink change. 88 * [10-Sep-98 Alan Modra] Another symlink change.
89 */ 89 */
90 90
91 /* [Feb-Apr 2000 AV] Complete rewrite. Rules for symlinks: 91 /* [Feb-Apr 2000 AV] Complete rewrite. Rules for symlinks:
92 * inside the path - always follow. 92 * inside the path - always follow.
93 * in the last component in creation/removal/renaming - never follow. 93 * in the last component in creation/removal/renaming - never follow.
94 * if LOOKUP_FOLLOW passed - follow. 94 * if LOOKUP_FOLLOW passed - follow.
95 * if the pathname has trailing slashes - follow. 95 * if the pathname has trailing slashes - follow.
96 * otherwise - don't follow. 96 * otherwise - don't follow.
97 * (applied in that order). 97 * (applied in that order).
98 * 98 *
99 * [Jun 2000 AV] Inconsistent behaviour of open() in case if flags==O_CREAT 99 * [Jun 2000 AV] Inconsistent behaviour of open() in case if flags==O_CREAT
100 * restored for 2.4. This is the last surviving part of old 4.2BSD bug. 100 * restored for 2.4. This is the last surviving part of old 4.2BSD bug.
101 * During the 2.4 we need to fix the userland stuff depending on it - 101 * During the 2.4 we need to fix the userland stuff depending on it -
102 * hopefully we will be able to get rid of that wart in 2.5. So far only 102 * hopefully we will be able to get rid of that wart in 2.5. So far only
103 * XEmacs seems to be relying on it... 103 * XEmacs seems to be relying on it...
104 */ 104 */
105 /* 105 /*
106 * [Sep 2001 AV] Single-semaphore locking scheme (kudos to David Holland) 106 * [Sep 2001 AV] Single-semaphore locking scheme (kudos to David Holland)
107 * implemented. Let's see if raised priority of ->s_vfs_rename_mutex gives 107 * implemented. Let's see if raised priority of ->s_vfs_rename_mutex gives
108 * any extra contention... 108 * any extra contention...
109 */ 109 */
110 110
111 /* In order to reduce some races, while at the same time doing additional 111 /* In order to reduce some races, while at the same time doing additional
112 * checking and hopefully speeding things up, we copy filenames to the 112 * checking and hopefully speeding things up, we copy filenames to the
113 * kernel data space before using them.. 113 * kernel data space before using them..
114 * 114 *
115 * POSIX.1 2.4: an empty pathname is invalid (ENOENT). 115 * POSIX.1 2.4: an empty pathname is invalid (ENOENT).
116 * PATH_MAX includes the nul terminator --RR. 116 * PATH_MAX includes the nul terminator --RR.
117 */ 117 */
118 static int do_getname(const char __user *filename, char *page) 118 static int do_getname(const char __user *filename, char *page)
119 { 119 {
120 int retval; 120 int retval;
121 unsigned long len = PATH_MAX; 121 unsigned long len = PATH_MAX;
122 122
123 if (!segment_eq(get_fs(), KERNEL_DS)) { 123 if (!segment_eq(get_fs(), KERNEL_DS)) {
124 if ((unsigned long) filename >= TASK_SIZE) 124 if ((unsigned long) filename >= TASK_SIZE)
125 return -EFAULT; 125 return -EFAULT;
126 if (TASK_SIZE - (unsigned long) filename < PATH_MAX) 126 if (TASK_SIZE - (unsigned long) filename < PATH_MAX)
127 len = TASK_SIZE - (unsigned long) filename; 127 len = TASK_SIZE - (unsigned long) filename;
128 } 128 }
129 129
130 retval = strncpy_from_user(page, filename, len); 130 retval = strncpy_from_user(page, filename, len);
131 if (retval > 0) { 131 if (retval > 0) {
132 if (retval < len) 132 if (retval < len)
133 return 0; 133 return 0;
134 return -ENAMETOOLONG; 134 return -ENAMETOOLONG;
135 } else if (!retval) 135 } else if (!retval)
136 retval = -ENOENT; 136 retval = -ENOENT;
137 return retval; 137 return retval;
138 } 138 }
139 139
140 char * getname(const char __user * filename) 140 char * getname(const char __user * filename)
141 { 141 {
142 char *tmp, *result; 142 char *tmp, *result;
143 143
144 result = ERR_PTR(-ENOMEM); 144 result = ERR_PTR(-ENOMEM);
145 tmp = __getname(); 145 tmp = __getname();
146 if (tmp) { 146 if (tmp) {
147 int retval = do_getname(filename, tmp); 147 int retval = do_getname(filename, tmp);
148 148
149 result = tmp; 149 result = tmp;
150 if (retval < 0) { 150 if (retval < 0) {
151 __putname(tmp); 151 __putname(tmp);
152 result = ERR_PTR(retval); 152 result = ERR_PTR(retval);
153 } 153 }
154 } 154 }
155 audit_getname(result); 155 audit_getname(result);
156 return result; 156 return result;
157 } 157 }
158 158
159 #ifdef CONFIG_AUDITSYSCALL 159 #ifdef CONFIG_AUDITSYSCALL
160 void putname(const char *name) 160 void putname(const char *name)
161 { 161 {
162 if (unlikely(current->audit_context)) 162 if (unlikely(current->audit_context))
163 audit_putname(name); 163 audit_putname(name);
164 else 164 else
165 __putname(name); 165 __putname(name);
166 } 166 }
167 EXPORT_SYMBOL(putname); 167 EXPORT_SYMBOL(putname);
168 #endif 168 #endif
169 169
170 170
171 /** 171 /**
172 * generic_permission - check for access rights on a Posix-like filesystem 172 * generic_permission - check for access rights on a Posix-like filesystem
173 * @inode: inode to check access rights for 173 * @inode: inode to check access rights for
174 * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC) 174 * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
175 * @check_acl: optional callback to check for Posix ACLs 175 * @check_acl: optional callback to check for Posix ACLs
176 * 176 *
177 * Used to check for read/write/execute permissions on a file. 177 * Used to check for read/write/execute permissions on a file.
178 * We use "fsuid" for this, letting us set arbitrary permissions 178 * We use "fsuid" for this, letting us set arbitrary permissions
179 * for filesystem access without changing the "normal" uids which 179 * for filesystem access without changing the "normal" uids which
180 * are used for other things.. 180 * are used for other things..
181 */ 181 */
182 int generic_permission(struct inode *inode, int mask, 182 int generic_permission(struct inode *inode, int mask,
183 int (*check_acl)(struct inode *inode, int mask)) 183 int (*check_acl)(struct inode *inode, int mask))
184 { 184 {
185 umode_t mode = inode->i_mode; 185 umode_t mode = inode->i_mode;
186 186
187 if (current->fsuid == inode->i_uid) 187 if (current->fsuid == inode->i_uid)
188 mode >>= 6; 188 mode >>= 6;
189 else { 189 else {
190 if (IS_POSIXACL(inode) && (mode & S_IRWXG) && check_acl) { 190 if (IS_POSIXACL(inode) && (mode & S_IRWXG) && check_acl) {
191 int error = check_acl(inode, mask); 191 int error = check_acl(inode, mask);
192 if (error == -EACCES) 192 if (error == -EACCES)
193 goto check_capabilities; 193 goto check_capabilities;
194 else if (error != -EAGAIN) 194 else if (error != -EAGAIN)
195 return error; 195 return error;
196 } 196 }
197 197
198 if (in_group_p(inode->i_gid)) 198 if (in_group_p(inode->i_gid))
199 mode >>= 3; 199 mode >>= 3;
200 } 200 }
201 201
202 /* 202 /*
203 * If the DACs are ok we don't need any capability check. 203 * If the DACs are ok we don't need any capability check.
204 */ 204 */
205 if (((mode & mask & (MAY_READ|MAY_WRITE|MAY_EXEC)) == mask)) 205 if (((mode & mask & (MAY_READ|MAY_WRITE|MAY_EXEC)) == mask))
206 return 0; 206 return 0;
207 207
208 check_capabilities: 208 check_capabilities:
209 /* 209 /*
210 * Read/write DACs are always overridable. 210 * Read/write DACs are always overridable.
211 * Executable DACs are overridable if at least one exec bit is set. 211 * Executable DACs are overridable if at least one exec bit is set.
212 */ 212 */
213 if (!(mask & MAY_EXEC) || 213 if (!(mask & MAY_EXEC) ||
214 (inode->i_mode & S_IXUGO) || S_ISDIR(inode->i_mode)) 214 (inode->i_mode & S_IXUGO) || S_ISDIR(inode->i_mode))
215 if (capable(CAP_DAC_OVERRIDE)) 215 if (capable(CAP_DAC_OVERRIDE))
216 return 0; 216 return 0;
217 217
218 /* 218 /*
219 * Searching includes executable on directories, else just read. 219 * Searching includes executable on directories, else just read.
220 */ 220 */
221 if (mask == MAY_READ || (S_ISDIR(inode->i_mode) && !(mask & MAY_WRITE))) 221 if (mask == MAY_READ || (S_ISDIR(inode->i_mode) && !(mask & MAY_WRITE)))
222 if (capable(CAP_DAC_READ_SEARCH)) 222 if (capable(CAP_DAC_READ_SEARCH))
223 return 0; 223 return 0;
224 224
225 return -EACCES; 225 return -EACCES;
226 } 226 }
227 227
228 int permission(struct inode *inode, int mask, struct nameidata *nd) 228 int permission(struct inode *inode, int mask, struct nameidata *nd)
229 { 229 {
230 int retval, submask; 230 int retval, submask;
231 231
232 if (mask & MAY_WRITE) { 232 if (mask & MAY_WRITE) {
233 umode_t mode = inode->i_mode; 233 umode_t mode = inode->i_mode;
234 234
235 /* 235 /*
236 * Nobody gets write access to a read-only fs. 236 * Nobody gets write access to a read-only fs.
237 */ 237 */
238 if (IS_RDONLY(inode) && 238 if (IS_RDONLY(inode) &&
239 (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode))) 239 (S_ISREG(mode) || S_ISDIR(mode) || S_ISLNK(mode)))
240 return -EROFS; 240 return -EROFS;
241 241
242 /* 242 /*
243 * Nobody gets write access to an immutable file. 243 * Nobody gets write access to an immutable file.
244 */ 244 */
245 if (IS_IMMUTABLE(inode)) 245 if (IS_IMMUTABLE(inode))
246 return -EACCES; 246 return -EACCES;
247 } 247 }
248 248
249 249
250 /* Ordinary permission routines do not understand MAY_APPEND. */ 250 /* Ordinary permission routines do not understand MAY_APPEND. */
251 submask = mask & ~MAY_APPEND; 251 submask = mask & ~MAY_APPEND;
252 if (inode->i_op && inode->i_op->permission) 252 if (inode->i_op && inode->i_op->permission)
253 retval = inode->i_op->permission(inode, submask, nd); 253 retval = inode->i_op->permission(inode, submask, nd);
254 else 254 else
255 retval = generic_permission(inode, submask, NULL); 255 retval = generic_permission(inode, submask, NULL);
256 if (retval) 256 if (retval)
257 return retval; 257 return retval;
258 258
259 return security_inode_permission(inode, mask, nd); 259 return security_inode_permission(inode, mask, nd);
260 } 260 }
261 261
262 /** 262 /**
263 * vfs_permission - check for access rights to a given path 263 * vfs_permission - check for access rights to a given path
264 * @nd: lookup result that describes the path 264 * @nd: lookup result that describes the path
265 * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC) 265 * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
266 * 266 *
267 * Used to check for read/write/execute permissions on a path. 267 * Used to check for read/write/execute permissions on a path.
268 * We use "fsuid" for this, letting us set arbitrary permissions 268 * We use "fsuid" for this, letting us set arbitrary permissions
269 * for filesystem access without changing the "normal" uids which 269 * for filesystem access without changing the "normal" uids which
270 * are used for other things. 270 * are used for other things.
271 */ 271 */
272 int vfs_permission(struct nameidata *nd, int mask) 272 int vfs_permission(struct nameidata *nd, int mask)
273 { 273 {
274 return permission(nd->dentry->d_inode, mask, nd); 274 return permission(nd->dentry->d_inode, mask, nd);
275 } 275 }
276 276
277 /** 277 /**
278 * file_permission - check for additional access rights to a given file 278 * file_permission - check for additional access rights to a given file
279 * @file: file to check access rights for 279 * @file: file to check access rights for
280 * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC) 280 * @mask: right to check for (%MAY_READ, %MAY_WRITE, %MAY_EXEC)
281 * 281 *
282 * Used to check for read/write/execute permissions on an already opened 282 * Used to check for read/write/execute permissions on an already opened
283 * file. 283 * file.
284 * 284 *
285 * Note: 285 * Note:
286 * Do not use this function in new code. All access checks should 286 * Do not use this function in new code. All access checks should
287 * be done using vfs_permission(). 287 * be done using vfs_permission().
288 */ 288 */
289 int file_permission(struct file *file, int mask) 289 int file_permission(struct file *file, int mask)
290 { 290 {
291 return permission(file->f_dentry->d_inode, mask, NULL); 291 return permission(file->f_dentry->d_inode, mask, NULL);
292 } 292 }
293 293
294 /* 294 /*
295 * get_write_access() gets write permission for a file. 295 * get_write_access() gets write permission for a file.
296 * put_write_access() releases this write permission. 296 * put_write_access() releases this write permission.
297 * This is used for regular files. 297 * This is used for regular files.
298 * We cannot support write (and maybe mmap read-write shared) accesses and 298 * We cannot support write (and maybe mmap read-write shared) accesses and
299 * MAP_DENYWRITE mmappings simultaneously. The i_writecount field of an inode 299 * MAP_DENYWRITE mmappings simultaneously. The i_writecount field of an inode
300 * can have the following values: 300 * can have the following values:
301 * 0: no writers, no VM_DENYWRITE mappings 301 * 0: no writers, no VM_DENYWRITE mappings
302 * < 0: (-i_writecount) vm_area_structs with VM_DENYWRITE set exist 302 * < 0: (-i_writecount) vm_area_structs with VM_DENYWRITE set exist
303 * > 0: (i_writecount) users are writing to the file. 303 * > 0: (i_writecount) users are writing to the file.
304 * 304 *
305 * Normally we operate on that counter with atomic_{inc,dec} and it's safe 305 * Normally we operate on that counter with atomic_{inc,dec} and it's safe
306 * except for the cases where we don't hold i_writecount yet. Then we need to 306 * except for the cases where we don't hold i_writecount yet. Then we need to
307 * use {get,deny}_write_access() - these functions check the sign and refuse 307 * use {get,deny}_write_access() - these functions check the sign and refuse
308 * to do the change if sign is wrong. Exclusion between them is provided by 308 * to do the change if sign is wrong. Exclusion between them is provided by
309 * the inode->i_lock spinlock. 309 * the inode->i_lock spinlock.
310 */ 310 */
311 311
312 int get_write_access(struct inode * inode) 312 int get_write_access(struct inode * inode)
313 { 313 {
314 spin_lock(&inode->i_lock); 314 spin_lock(&inode->i_lock);
315 if (atomic_read(&inode->i_writecount) < 0) { 315 if (atomic_read(&inode->i_writecount) < 0) {
316 spin_unlock(&inode->i_lock); 316 spin_unlock(&inode->i_lock);
317 return -ETXTBSY; 317 return -ETXTBSY;
318 } 318 }
319 atomic_inc(&inode->i_writecount); 319 atomic_inc(&inode->i_writecount);
320 spin_unlock(&inode->i_lock); 320 spin_unlock(&inode->i_lock);
321 321
322 return 0; 322 return 0;
323 } 323 }
324 324
325 int deny_write_access(struct file * file) 325 int deny_write_access(struct file * file)
326 { 326 {
327 struct inode *inode = file->f_dentry->d_inode; 327 struct inode *inode = file->f_dentry->d_inode;
328 328
329 spin_lock(&inode->i_lock); 329 spin_lock(&inode->i_lock);
330 if (atomic_read(&inode->i_writecount) > 0) { 330 if (atomic_read(&inode->i_writecount) > 0) {
331 spin_unlock(&inode->i_lock); 331 spin_unlock(&inode->i_lock);
332 return -ETXTBSY; 332 return -ETXTBSY;
333 } 333 }
334 atomic_dec(&inode->i_writecount); 334 atomic_dec(&inode->i_writecount);
335 spin_unlock(&inode->i_lock); 335 spin_unlock(&inode->i_lock);
336 336
337 return 0; 337 return 0;
338 } 338 }
339 339
340 void path_release(struct nameidata *nd) 340 void path_release(struct nameidata *nd)
341 { 341 {
342 dput(nd->dentry); 342 dput(nd->dentry);
343 mntput(nd->mnt); 343 mntput(nd->mnt);
344 } 344 }
345 345
346 /* 346 /*
347 * umount() mustn't call path_release()/mntput() as that would clear 347 * umount() mustn't call path_release()/mntput() as that would clear
348 * mnt_expiry_mark 348 * mnt_expiry_mark
349 */ 349 */
350 void path_release_on_umount(struct nameidata *nd) 350 void path_release_on_umount(struct nameidata *nd)
351 { 351 {
352 dput(nd->dentry); 352 dput(nd->dentry);
353 mntput_no_expire(nd->mnt); 353 mntput_no_expire(nd->mnt);
354 } 354 }
355 355
356 /** 356 /**
357 * release_open_intent - free up open intent resources 357 * release_open_intent - free up open intent resources
358 * @nd: pointer to nameidata 358 * @nd: pointer to nameidata
359 */ 359 */
360 void release_open_intent(struct nameidata *nd) 360 void release_open_intent(struct nameidata *nd)
361 { 361 {
362 if (nd->intent.open.file->f_dentry == NULL) 362 if (nd->intent.open.file->f_dentry == NULL)
363 put_filp(nd->intent.open.file); 363 put_filp(nd->intent.open.file);
364 else 364 else
365 fput(nd->intent.open.file); 365 fput(nd->intent.open.file);
366 } 366 }
367 367
368 /* 368 /*
369 * Internal lookup() using the new generic dcache. 369 * Internal lookup() using the new generic dcache.
370 * SMP-safe 370 * SMP-safe
371 */ 371 */
372 static struct dentry * cached_lookup(struct dentry * parent, struct qstr * name, struct nameidata *nd) 372 static struct dentry * cached_lookup(struct dentry * parent, struct qstr * name, struct nameidata *nd)
373 { 373 {
374 struct dentry * dentry = __d_lookup(parent, name); 374 struct dentry * dentry = __d_lookup(parent, name);
375 375
376 /* lockess __d_lookup may fail due to concurrent d_move() 376 /* lockess __d_lookup may fail due to concurrent d_move()
377 * in some unrelated directory, so try with d_lookup 377 * in some unrelated directory, so try with d_lookup
378 */ 378 */
379 if (!dentry) 379 if (!dentry)
380 dentry = d_lookup(parent, name); 380 dentry = d_lookup(parent, name);
381 381
382 if (dentry && dentry->d_op && dentry->d_op->d_revalidate) { 382 if (dentry && dentry->d_op && dentry->d_op->d_revalidate) {
383 if (!dentry->d_op->d_revalidate(dentry, nd) && !d_invalidate(dentry)) { 383 if (!dentry->d_op->d_revalidate(dentry, nd) && !d_invalidate(dentry)) {
384 dput(dentry); 384 dput(dentry);
385 dentry = NULL; 385 dentry = NULL;
386 } 386 }
387 } 387 }
388 return dentry; 388 return dentry;
389 } 389 }
390 390
391 /* 391 /*
392 * Short-cut version of permission(), for calling by 392 * Short-cut version of permission(), for calling by
393 * path_walk(), when dcache lock is held. Combines parts 393 * path_walk(), when dcache lock is held. Combines parts
394 * of permission() and generic_permission(), and tests ONLY for 394 * of permission() and generic_permission(), and tests ONLY for
395 * MAY_EXEC permission. 395 * MAY_EXEC permission.
396 * 396 *
397 * If appropriate, check DAC only. If not appropriate, or 397 * If appropriate, check DAC only. If not appropriate, or
398 * short-cut DAC fails, then call permission() to do more 398 * short-cut DAC fails, then call permission() to do more
399 * complete permission check. 399 * complete permission check.
400 */ 400 */
401 static int exec_permission_lite(struct inode *inode, 401 static int exec_permission_lite(struct inode *inode,
402 struct nameidata *nd) 402 struct nameidata *nd)
403 { 403 {
404 umode_t mode = inode->i_mode; 404 umode_t mode = inode->i_mode;
405 405
406 if (inode->i_op && inode->i_op->permission) 406 if (inode->i_op && inode->i_op->permission)
407 return -EAGAIN; 407 return -EAGAIN;
408 408
409 if (current->fsuid == inode->i_uid) 409 if (current->fsuid == inode->i_uid)
410 mode >>= 6; 410 mode >>= 6;
411 else if (in_group_p(inode->i_gid)) 411 else if (in_group_p(inode->i_gid))
412 mode >>= 3; 412 mode >>= 3;
413 413
414 if (mode & MAY_EXEC) 414 if (mode & MAY_EXEC)
415 goto ok; 415 goto ok;
416 416
417 if ((inode->i_mode & S_IXUGO) && capable(CAP_DAC_OVERRIDE)) 417 if ((inode->i_mode & S_IXUGO) && capable(CAP_DAC_OVERRIDE))
418 goto ok; 418 goto ok;
419 419
420 if (S_ISDIR(inode->i_mode) && capable(CAP_DAC_OVERRIDE)) 420 if (S_ISDIR(inode->i_mode) && capable(CAP_DAC_OVERRIDE))
421 goto ok; 421 goto ok;
422 422
423 if (S_ISDIR(inode->i_mode) && capable(CAP_DAC_READ_SEARCH)) 423 if (S_ISDIR(inode->i_mode) && capable(CAP_DAC_READ_SEARCH))
424 goto ok; 424 goto ok;
425 425
426 return -EACCES; 426 return -EACCES;
427 ok: 427 ok:
428 return security_inode_permission(inode, MAY_EXEC, nd); 428 return security_inode_permission(inode, MAY_EXEC, nd);
429 } 429 }
430 430
431 /* 431 /*
432 * This is called when everything else fails, and we actually have 432 * This is called when everything else fails, and we actually have
433 * to go to the low-level filesystem to find out what we should do.. 433 * to go to the low-level filesystem to find out what we should do..
434 * 434 *
435 * We get the directory semaphore, and after getting that we also 435 * We get the directory semaphore, and after getting that we also
436 * make sure that nobody added the entry to the dcache in the meantime.. 436 * make sure that nobody added the entry to the dcache in the meantime..
437 * SMP-safe 437 * SMP-safe
438 */ 438 */
439 static struct dentry * real_lookup(struct dentry * parent, struct qstr * name, struct nameidata *nd) 439 static struct dentry * real_lookup(struct dentry * parent, struct qstr * name, struct nameidata *nd)
440 { 440 {
441 struct dentry * result; 441 struct dentry * result;
442 struct inode *dir = parent->d_inode; 442 struct inode *dir = parent->d_inode;
443 443
444 mutex_lock(&dir->i_mutex); 444 mutex_lock(&dir->i_mutex);
445 /* 445 /*
446 * First re-do the cached lookup just in case it was created 446 * First re-do the cached lookup just in case it was created
447 * while we waited for the directory semaphore.. 447 * while we waited for the directory semaphore..
448 * 448 *
449 * FIXME! This could use version numbering or similar to 449 * FIXME! This could use version numbering or similar to
450 * avoid unnecessary cache lookups. 450 * avoid unnecessary cache lookups.
451 * 451 *
452 * The "dcache_lock" is purely to protect the RCU list walker 452 * The "dcache_lock" is purely to protect the RCU list walker
453 * from concurrent renames at this point (we mustn't get false 453 * from concurrent renames at this point (we mustn't get false
454 * negatives from the RCU list walk here, unlike the optimistic 454 * negatives from the RCU list walk here, unlike the optimistic
455 * fast walk). 455 * fast walk).
456 * 456 *
457 * so doing d_lookup() (with seqlock), instead of lockfree __d_lookup 457 * so doing d_lookup() (with seqlock), instead of lockfree __d_lookup
458 */ 458 */
459 result = d_lookup(parent, name); 459 result = d_lookup(parent, name);
460 if (!result) { 460 if (!result) {
461 struct dentry * dentry = d_alloc(parent, name); 461 struct dentry * dentry = d_alloc(parent, name);
462 result = ERR_PTR(-ENOMEM); 462 result = ERR_PTR(-ENOMEM);
463 if (dentry) { 463 if (dentry) {
464 result = dir->i_op->lookup(dir, dentry, nd); 464 result = dir->i_op->lookup(dir, dentry, nd);
465 if (result) 465 if (result)
466 dput(dentry); 466 dput(dentry);
467 else 467 else
468 result = dentry; 468 result = dentry;
469 } 469 }
470 mutex_unlock(&dir->i_mutex); 470 mutex_unlock(&dir->i_mutex);
471 return result; 471 return result;
472 } 472 }
473 473
474 /* 474 /*
475 * Uhhuh! Nasty case: the cache was re-populated while 475 * Uhhuh! Nasty case: the cache was re-populated while
476 * we waited on the semaphore. Need to revalidate. 476 * we waited on the semaphore. Need to revalidate.
477 */ 477 */
478 mutex_unlock(&dir->i_mutex); 478 mutex_unlock(&dir->i_mutex);
479 if (result->d_op && result->d_op->d_revalidate) { 479 if (result->d_op && result->d_op->d_revalidate) {
480 if (!result->d_op->d_revalidate(result, nd) && !d_invalidate(result)) { 480 if (!result->d_op->d_revalidate(result, nd) && !d_invalidate(result)) {
481 dput(result); 481 dput(result);
482 result = ERR_PTR(-ENOENT); 482 result = ERR_PTR(-ENOENT);
483 } 483 }
484 } 484 }
485 return result; 485 return result;
486 } 486 }
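
The lookup above reports failure by returning an errno value encoded in the pointer itself (`ERR_PTR(-ENOMEM)`, `ERR_PTR(-ENOENT)`), and callers such as do_lookup() must test the result with IS_ERR() before dereferencing it. The following is a minimal userspace sketch of that idiom, not the kernel's own macros: the `demo_` names, the 4095 cutoff and the lookup table are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumption: the top 4095 addresses are never valid object pointers. */
#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *demo_err_ptr(long error)
{
	assert(error < 0 && error >= -DEMO_MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an encoded pointer. */
static inline long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the reserved error range. */
static inline int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Hypothetical lookup: returns an entry or an encoded -ENOENT. */
static int demo_table[4] = { 10, 20, 30, 40 };

static int *demo_lookup(int idx)
{
	if (idx < 0 || idx >= 4)
		return demo_err_ptr(-ENOENT);
	return &demo_table[idx];
}
```

A caller that skips the `demo_is_err()` check and dereferences the result reads from an address near the top of the address space, which is the failure mode the commit message describes for `intent.open.file`.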
487 487
488 static int __emul_lookup_dentry(const char *, struct nameidata *); 488 static int __emul_lookup_dentry(const char *, struct nameidata *);
489 489
490 /* SMP-safe */ 490 /* SMP-safe */
491 static __always_inline int 491 static __always_inline int
492 walk_init_root(const char *name, struct nameidata *nd) 492 walk_init_root(const char *name, struct nameidata *nd)
493 { 493 {
494 read_lock(&current->fs->lock); 494 read_lock(&current->fs->lock);
495 if (current->fs->altroot && !(nd->flags & LOOKUP_NOALT)) { 495 if (current->fs->altroot && !(nd->flags & LOOKUP_NOALT)) {
496 nd->mnt = mntget(current->fs->altrootmnt); 496 nd->mnt = mntget(current->fs->altrootmnt);
497 nd->dentry = dget(current->fs->altroot); 497 nd->dentry = dget(current->fs->altroot);
498 read_unlock(&current->fs->lock); 498 read_unlock(&current->fs->lock);
499 if (__emul_lookup_dentry(name,nd)) 499 if (__emul_lookup_dentry(name,nd))
500 return 0; 500 return 0;
501 read_lock(&current->fs->lock); 501 read_lock(&current->fs->lock);
502 } 502 }
503 nd->mnt = mntget(current->fs->rootmnt); 503 nd->mnt = mntget(current->fs->rootmnt);
504 nd->dentry = dget(current->fs->root); 504 nd->dentry = dget(current->fs->root);
505 read_unlock(&current->fs->lock); 505 read_unlock(&current->fs->lock);
506 return 1; 506 return 1;
507 } 507 }
508 508
509 static __always_inline int __vfs_follow_link(struct nameidata *nd, const char *link) 509 static __always_inline int __vfs_follow_link(struct nameidata *nd, const char *link)
510 { 510 {
511 int res = 0; 511 int res = 0;
512 char *name; 512 char *name;
513 if (IS_ERR(link)) 513 if (IS_ERR(link))
514 goto fail; 514 goto fail;
515 515
516 if (*link == '/') { 516 if (*link == '/') {
517 path_release(nd); 517 path_release(nd);
518 if (!walk_init_root(link, nd)) 518 if (!walk_init_root(link, nd))
519 /* weird __emul_prefix() stuff did it */ 519 /* weird __emul_prefix() stuff did it */
520 goto out; 520 goto out;
521 } 521 }
522 res = link_path_walk(link, nd); 522 res = link_path_walk(link, nd);
523 out: 523 out:
524 if (nd->depth || res || nd->last_type!=LAST_NORM) 524 if (nd->depth || res || nd->last_type!=LAST_NORM)
525 return res; 525 return res;
526 /* 526 /*
527 * If it is an iterative symlinks resolution in open_namei() we 527 * If it is an iterative symlinks resolution in open_namei() we
528 * have to copy the last component. And all that crap because of 528 * have to copy the last component. And all that crap because of
529 * bloody create() on broken symlinks. Furrfu... 529 * bloody create() on broken symlinks. Furrfu...
530 */ 530 */
531 name = __getname(); 531 name = __getname();
532 if (unlikely(!name)) { 532 if (unlikely(!name)) {
533 path_release(nd); 533 path_release(nd);
534 return -ENOMEM; 534 return -ENOMEM;
535 } 535 }
536 strcpy(name, nd->last.name); 536 strcpy(name, nd->last.name);
537 nd->last.name = name; 537 nd->last.name = name;
538 return 0; 538 return 0;
539 fail: 539 fail:
540 path_release(nd); 540 path_release(nd);
541 return PTR_ERR(link); 541 return PTR_ERR(link);
542 } 542 }
543 543
544 struct path { 544 struct path {
545 struct vfsmount *mnt; 545 struct vfsmount *mnt;
546 struct dentry *dentry; 546 struct dentry *dentry;
547 }; 547 };
548 548
549 static __always_inline int __do_follow_link(struct path *path, struct nameidata *nd) 549 static __always_inline int __do_follow_link(struct path *path, struct nameidata *nd)
550 { 550 {
551 int error; 551 int error;
552 void *cookie; 552 void *cookie;
553 struct dentry *dentry = path->dentry; 553 struct dentry *dentry = path->dentry;
554 554
555 touch_atime(path->mnt, dentry); 555 touch_atime(path->mnt, dentry);
556 nd_set_link(nd, NULL); 556 nd_set_link(nd, NULL);
557 557
558 if (path->mnt == nd->mnt) 558 if (path->mnt == nd->mnt)
559 mntget(path->mnt); 559 mntget(path->mnt);
560 cookie = dentry->d_inode->i_op->follow_link(dentry, nd); 560 cookie = dentry->d_inode->i_op->follow_link(dentry, nd);
561 error = PTR_ERR(cookie); 561 error = PTR_ERR(cookie);
562 if (!IS_ERR(cookie)) { 562 if (!IS_ERR(cookie)) {
563 char *s = nd_get_link(nd); 563 char *s = nd_get_link(nd);
564 error = 0; 564 error = 0;
565 if (s) 565 if (s)
566 error = __vfs_follow_link(nd, s); 566 error = __vfs_follow_link(nd, s);
567 if (dentry->d_inode->i_op->put_link) 567 if (dentry->d_inode->i_op->put_link)
568 dentry->d_inode->i_op->put_link(dentry, nd, cookie); 568 dentry->d_inode->i_op->put_link(dentry, nd, cookie);
569 } 569 }
570 dput(dentry); 570 dput(dentry);
571 mntput(path->mnt); 571 mntput(path->mnt);
572 572
573 return error; 573 return error;
574 } 574 }
575 575
576 static inline void dput_path(struct path *path, struct nameidata *nd) 576 static inline void dput_path(struct path *path, struct nameidata *nd)
577 { 577 {
578 dput(path->dentry); 578 dput(path->dentry);
579 if (path->mnt != nd->mnt) 579 if (path->mnt != nd->mnt)
580 mntput(path->mnt); 580 mntput(path->mnt);
581 } 581 }
582 582
583 static inline void path_to_nameidata(struct path *path, struct nameidata *nd) 583 static inline void path_to_nameidata(struct path *path, struct nameidata *nd)
584 { 584 {
585 dput(nd->dentry); 585 dput(nd->dentry);
586 if (nd->mnt != path->mnt) 586 if (nd->mnt != path->mnt)
587 mntput(nd->mnt); 587 mntput(nd->mnt);
588 nd->mnt = path->mnt; 588 nd->mnt = path->mnt;
589 nd->dentry = path->dentry; 589 nd->dentry = path->dentry;
590 } 590 }
591 591
592 /* 592 /*
593 * This limits recursive symlink follows to 8, while 593 * This limits recursive symlink follows to 8, while
594 * limiting consecutive symlinks to 40. 594 * limiting consecutive symlinks to 40.
595 * 595 *
596 * Without that kind of total limit, nasty chains of consecutive 596 * Without that kind of total limit, nasty chains of consecutive
597 * symlinks can cause almost arbitrarily long lookups. 597 * symlinks can cause almost arbitrarily long lookups.
598 */ 598 */
599 static inline int do_follow_link(struct path *path, struct nameidata *nd) 599 static inline int do_follow_link(struct path *path, struct nameidata *nd)
600 { 600 {
601 int err = -ELOOP; 601 int err = -ELOOP;
602 if (current->link_count >= MAX_NESTED_LINKS) 602 if (current->link_count >= MAX_NESTED_LINKS)
603 goto loop; 603 goto loop;
604 if (current->total_link_count >= 40) 604 if (current->total_link_count >= 40)
605 goto loop; 605 goto loop;
606 BUG_ON(nd->depth >= MAX_NESTED_LINKS); 606 BUG_ON(nd->depth >= MAX_NESTED_LINKS);
607 cond_resched(); 607 cond_resched();
608 err = security_inode_follow_link(path->dentry, nd); 608 err = security_inode_follow_link(path->dentry, nd);
609 if (err) 609 if (err)
610 goto loop; 610 goto loop;
611 current->link_count++; 611 current->link_count++;
612 current->total_link_count++; 612 current->total_link_count++;
613 nd->depth++; 613 nd->depth++;
614 err = __do_follow_link(path, nd); 614 err = __do_follow_link(path, nd);
615 current->link_count--; 615 current->link_count--;
616 nd->depth--; 616 nd->depth--;
617 return err; 617 return err;
618 loop: 618 loop:
619 dput_path(path, nd); 619 dput_path(path, nd);
620 path_release(nd); 620 path_release(nd);
621 return err; 621 return err;
622 } 622 }
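
The two limits in do_follow_link() behave differently: the nesting counter is decremented when a link has been resolved, while the total counter only grows until path_walk() resets it. A small userspace sketch of just that bookkeeping follows, assuming the limits of 8 and 40 stated in the comment above; the `demo_` names and the recursion that stands in for real symlink resolution are hypothetical.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_MAX_NESTED 8	/* limit on recursive (nested) follows */
#define DEMO_MAX_TOTAL  40	/* limit on follows per whole lookup */

struct demo_task {
	int link_count;		/* current nesting depth */
	int total_link_count;	/* follows since the lookup started */
};

/*
 * Follow a symlink whose target contains another symlink, and so on,
 * `remaining` levels deep. Returns 0 or -ELOOP.
 */
static int demo_follow_nested(struct demo_task *t, int remaining)
{
	int err;

	if (t->link_count >= DEMO_MAX_NESTED)
		return -ELOOP;
	if (t->total_link_count >= DEMO_MAX_TOTAL)
		return -ELOOP;
	t->link_count++;
	t->total_link_count++;
	err = remaining > 1 ? demo_follow_nested(t, remaining - 1) : 0;
	t->link_count--;	/* nesting unwinds, the total does not */
	return err;
}

/* Follow n symlinks one after another at the same nesting level. */
static int demo_follow_consecutive(struct demo_task *t, int n)
{
	int i, err = 0;

	for (i = 0; i < n && !err; i++)
		err = demo_follow_nested(t, 1);
	return err;
}
```

With these limits, eight nested links succeed and a ninth fails, while forty consecutive links succeed and the forty-first fails.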
623 623
624 int follow_up(struct vfsmount **mnt, struct dentry **dentry) 624 int follow_up(struct vfsmount **mnt, struct dentry **dentry)
625 { 625 {
626 struct vfsmount *parent; 626 struct vfsmount *parent;
627 struct dentry *mountpoint; 627 struct dentry *mountpoint;
628 spin_lock(&vfsmount_lock); 628 spin_lock(&vfsmount_lock);
629 parent=(*mnt)->mnt_parent; 629 parent=(*mnt)->mnt_parent;
630 if (parent == *mnt) { 630 if (parent == *mnt) {
631 spin_unlock(&vfsmount_lock); 631 spin_unlock(&vfsmount_lock);
632 return 0; 632 return 0;
633 } 633 }
634 mntget(parent); 634 mntget(parent);
635 mountpoint=dget((*mnt)->mnt_mountpoint); 635 mountpoint=dget((*mnt)->mnt_mountpoint);
636 spin_unlock(&vfsmount_lock); 636 spin_unlock(&vfsmount_lock);
637 dput(*dentry); 637 dput(*dentry);
638 *dentry = mountpoint; 638 *dentry = mountpoint;
639 mntput(*mnt); 639 mntput(*mnt);
640 *mnt = parent; 640 *mnt = parent;
641 return 1; 641 return 1;
642 } 642 }
643 643
644 /* no need for dcache_lock, as serialization is taken care of in 644 /* no need for dcache_lock, as serialization is taken care of in
645 * namespace.c 645 * namespace.c
646 */ 646 */
647 static int __follow_mount(struct path *path) 647 static int __follow_mount(struct path *path)
648 { 648 {
649 int res = 0; 649 int res = 0;
650 while (d_mountpoint(path->dentry)) { 650 while (d_mountpoint(path->dentry)) {
651 struct vfsmount *mounted = lookup_mnt(path->mnt, path->dentry); 651 struct vfsmount *mounted = lookup_mnt(path->mnt, path->dentry);
652 if (!mounted) 652 if (!mounted)
653 break; 653 break;
654 dput(path->dentry); 654 dput(path->dentry);
655 if (res) 655 if (res)
656 mntput(path->mnt); 656 mntput(path->mnt);
657 path->mnt = mounted; 657 path->mnt = mounted;
658 path->dentry = dget(mounted->mnt_root); 658 path->dentry = dget(mounted->mnt_root);
659 res = 1; 659 res = 1;
660 } 660 }
661 return res; 661 return res;
662 } 662 }
663 663
664 static void follow_mount(struct vfsmount **mnt, struct dentry **dentry) 664 static void follow_mount(struct vfsmount **mnt, struct dentry **dentry)
665 { 665 {
666 while (d_mountpoint(*dentry)) { 666 while (d_mountpoint(*dentry)) {
667 struct vfsmount *mounted = lookup_mnt(*mnt, *dentry); 667 struct vfsmount *mounted = lookup_mnt(*mnt, *dentry);
668 if (!mounted) 668 if (!mounted)
669 break; 669 break;
670 dput(*dentry); 670 dput(*dentry);
671 mntput(*mnt); 671 mntput(*mnt);
672 *mnt = mounted; 672 *mnt = mounted;
673 *dentry = dget(mounted->mnt_root); 673 *dentry = dget(mounted->mnt_root);
674 } 674 }
675 } 675 }
676 676
677 /* no need for dcache_lock, as serialization is taken care of in 677 /* no need for dcache_lock, as serialization is taken care of in
678 * namespace.c 678 * namespace.c
679 */ 679 */
680 int follow_down(struct vfsmount **mnt, struct dentry **dentry) 680 int follow_down(struct vfsmount **mnt, struct dentry **dentry)
681 { 681 {
682 struct vfsmount *mounted; 682 struct vfsmount *mounted;
683 683
684 mounted = lookup_mnt(*mnt, *dentry); 684 mounted = lookup_mnt(*mnt, *dentry);
685 if (mounted) { 685 if (mounted) {
686 dput(*dentry); 686 dput(*dentry);
687 mntput(*mnt); 687 mntput(*mnt);
688 *mnt = mounted; 688 *mnt = mounted;
689 *dentry = dget(mounted->mnt_root); 689 *dentry = dget(mounted->mnt_root);
690 return 1; 690 return 1;
691 } 691 }
692 return 0; 692 return 0;
693 } 693 }
694 694
695 static __always_inline void follow_dotdot(struct nameidata *nd) 695 static __always_inline void follow_dotdot(struct nameidata *nd)
696 { 696 {
697 while(1) { 697 while(1) {
698 struct vfsmount *parent; 698 struct vfsmount *parent;
699 struct dentry *old = nd->dentry; 699 struct dentry *old = nd->dentry;
700 700
701 read_lock(&current->fs->lock); 701 read_lock(&current->fs->lock);
702 if (nd->dentry == current->fs->root && 702 if (nd->dentry == current->fs->root &&
703 nd->mnt == current->fs->rootmnt) { 703 nd->mnt == current->fs->rootmnt) {
704 read_unlock(&current->fs->lock); 704 read_unlock(&current->fs->lock);
705 break; 705 break;
706 } 706 }
707 read_unlock(&current->fs->lock); 707 read_unlock(&current->fs->lock);
708 spin_lock(&dcache_lock); 708 spin_lock(&dcache_lock);
709 if (nd->dentry != nd->mnt->mnt_root) { 709 if (nd->dentry != nd->mnt->mnt_root) {
710 nd->dentry = dget(nd->dentry->d_parent); 710 nd->dentry = dget(nd->dentry->d_parent);
711 spin_unlock(&dcache_lock); 711 spin_unlock(&dcache_lock);
712 dput(old); 712 dput(old);
713 break; 713 break;
714 } 714 }
715 spin_unlock(&dcache_lock); 715 spin_unlock(&dcache_lock);
716 spin_lock(&vfsmount_lock); 716 spin_lock(&vfsmount_lock);
717 parent = nd->mnt->mnt_parent; 717 parent = nd->mnt->mnt_parent;
718 if (parent == nd->mnt) { 718 if (parent == nd->mnt) {
719 spin_unlock(&vfsmount_lock); 719 spin_unlock(&vfsmount_lock);
720 break; 720 break;
721 } 721 }
722 mntget(parent); 722 mntget(parent);
723 nd->dentry = dget(nd->mnt->mnt_mountpoint); 723 nd->dentry = dget(nd->mnt->mnt_mountpoint);
724 spin_unlock(&vfsmount_lock); 724 spin_unlock(&vfsmount_lock);
725 dput(old); 725 dput(old);
726 mntput(nd->mnt); 726 mntput(nd->mnt);
727 nd->mnt = parent; 727 nd->mnt = parent;
728 } 728 }
729 follow_mount(&nd->mnt, &nd->dentry); 729 follow_mount(&nd->mnt, &nd->dentry);
730 } 730 }
731 731
732 /* 732 /*
733 * It's more convoluted than I'd like it to be, but... it's still fairly 733 * It's more convoluted than I'd like it to be, but... it's still fairly
734 * small and for now I'd prefer to have fast path as straight as possible. 734 * small and for now I'd prefer to have fast path as straight as possible.
735 * It _is_ time-critical. 735 * It _is_ time-critical.
736 */ 736 */
737 static int do_lookup(struct nameidata *nd, struct qstr *name, 737 static int do_lookup(struct nameidata *nd, struct qstr *name,
738 struct path *path) 738 struct path *path)
739 { 739 {
740 struct vfsmount *mnt = nd->mnt; 740 struct vfsmount *mnt = nd->mnt;
741 struct dentry *dentry = __d_lookup(nd->dentry, name); 741 struct dentry *dentry = __d_lookup(nd->dentry, name);
742 742
743 if (!dentry) 743 if (!dentry)
744 goto need_lookup; 744 goto need_lookup;
745 if (dentry->d_op && dentry->d_op->d_revalidate) 745 if (dentry->d_op && dentry->d_op->d_revalidate)
746 goto need_revalidate; 746 goto need_revalidate;
747 done: 747 done:
748 path->mnt = mnt; 748 path->mnt = mnt;
749 path->dentry = dentry; 749 path->dentry = dentry;
750 __follow_mount(path); 750 __follow_mount(path);
751 return 0; 751 return 0;
752 752
753 need_lookup: 753 need_lookup:
754 dentry = real_lookup(nd->dentry, name, nd); 754 dentry = real_lookup(nd->dentry, name, nd);
755 if (IS_ERR(dentry)) 755 if (IS_ERR(dentry))
756 goto fail; 756 goto fail;
757 goto done; 757 goto done;
758 758
759 need_revalidate: 759 need_revalidate:
760 if (dentry->d_op->d_revalidate(dentry, nd)) 760 if (dentry->d_op->d_revalidate(dentry, nd))
761 goto done; 761 goto done;
762 if (d_invalidate(dentry)) 762 if (d_invalidate(dentry))
763 goto done; 763 goto done;
764 dput(dentry); 764 dput(dentry);
765 goto need_lookup; 765 goto need_lookup;
766 766
767 fail: 767 fail:
768 return PTR_ERR(dentry); 768 return PTR_ERR(dentry);
769 } 769 }
770 770
771 /* 771 /*
772 * Name resolution. 772 * Name resolution.
773 * This is the basic name resolution function, turning a pathname into 773 * This is the basic name resolution function, turning a pathname into
774 * the final dentry. We expect 'base' to be positive and a directory. 774 * the final dentry. We expect 'base' to be positive and a directory.
775 * 775 *
776 * Returns 0 and nd will have valid dentry and mnt on success. 776 * Returns 0 and nd will have valid dentry and mnt on success.
777 * Returns error and drops reference to input namei data on failure. 777 * Returns error and drops reference to input namei data on failure.
778 */ 778 */
779 static fastcall int __link_path_walk(const char * name, struct nameidata *nd) 779 static fastcall int __link_path_walk(const char * name, struct nameidata *nd)
780 { 780 {
781 struct path next; 781 struct path next;
782 struct inode *inode; 782 struct inode *inode;
783 int err; 783 int err;
784 unsigned int lookup_flags = nd->flags; 784 unsigned int lookup_flags = nd->flags;
785 785
786 while (*name=='/') 786 while (*name=='/')
787 name++; 787 name++;
788 if (!*name) 788 if (!*name)
789 goto return_reval; 789 goto return_reval;
790 790
791 inode = nd->dentry->d_inode; 791 inode = nd->dentry->d_inode;
792 if (nd->depth) 792 if (nd->depth)
793 lookup_flags = LOOKUP_FOLLOW | (nd->flags & LOOKUP_CONTINUE); 793 lookup_flags = LOOKUP_FOLLOW | (nd->flags & LOOKUP_CONTINUE);
794 794
795 /* At this point we know we have a real path component. */ 795 /* At this point we know we have a real path component. */
796 for(;;) { 796 for(;;) {
797 unsigned long hash; 797 unsigned long hash;
798 struct qstr this; 798 struct qstr this;
799 unsigned int c; 799 unsigned int c;
800 800
801 nd->flags |= LOOKUP_CONTINUE; 801 nd->flags |= LOOKUP_CONTINUE;
802 err = exec_permission_lite(inode, nd); 802 err = exec_permission_lite(inode, nd);
803 if (err == -EAGAIN) 803 if (err == -EAGAIN)
804 err = vfs_permission(nd, MAY_EXEC); 804 err = vfs_permission(nd, MAY_EXEC);
805 if (err) 805 if (err)
806 break; 806 break;
807 807
808 this.name = name; 808 this.name = name;
809 c = *(const unsigned char *)name; 809 c = *(const unsigned char *)name;
810 810
811 hash = init_name_hash(); 811 hash = init_name_hash();
812 do { 812 do {
813 name++; 813 name++;
814 hash = partial_name_hash(c, hash); 814 hash = partial_name_hash(c, hash);
815 c = *(const unsigned char *)name; 815 c = *(const unsigned char *)name;
816 } while (c && (c != '/')); 816 } while (c && (c != '/'));
817 this.len = name - (const char *) this.name; 817 this.len = name - (const char *) this.name;
818 this.hash = end_name_hash(hash); 818 this.hash = end_name_hash(hash);
819 819
820 /* remove trailing slashes? */ 820 /* remove trailing slashes? */
821 if (!c) 821 if (!c)
822 goto last_component; 822 goto last_component;
823 while (*++name == '/'); 823 while (*++name == '/');
824 if (!*name) 824 if (!*name)
825 goto last_with_slashes; 825 goto last_with_slashes;
826 826
827 /* 827 /*
828 * "." and ".." are special - ".." especially so because it has 828 * "." and ".." are special - ".." especially so because it has
829 * to be able to know about the current root directory and 829 * to be able to know about the current root directory and
830 * parent relationships. 830 * parent relationships.
831 */ 831 */
832 if (this.name[0] == '.') switch (this.len) { 832 if (this.name[0] == '.') switch (this.len) {
833 default: 833 default:
834 break; 834 break;
835 case 2: 835 case 2:
836 if (this.name[1] != '.') 836 if (this.name[1] != '.')
837 break; 837 break;
838 follow_dotdot(nd); 838 follow_dotdot(nd);
839 inode = nd->dentry->d_inode; 839 inode = nd->dentry->d_inode;
840 /* fallthrough */ 840 /* fallthrough */
841 case 1: 841 case 1:
842 continue; 842 continue;
843 } 843 }
844 /* 844 /*
845 * See if the low-level filesystem might want 845 * See if the low-level filesystem might want
846 * to use its own hash.. 846 * to use its own hash..
847 */ 847 */
848 if (nd->dentry->d_op && nd->dentry->d_op->d_hash) { 848 if (nd->dentry->d_op && nd->dentry->d_op->d_hash) {
849 err = nd->dentry->d_op->d_hash(nd->dentry, &this); 849 err = nd->dentry->d_op->d_hash(nd->dentry, &this);
850 if (err < 0) 850 if (err < 0)
851 break; 851 break;
852 } 852 }
853 /* This does the actual lookups.. */ 853 /* This does the actual lookups.. */
854 err = do_lookup(nd, &this, &next); 854 err = do_lookup(nd, &this, &next);
855 if (err) 855 if (err)
856 break; 856 break;
857 857
858 err = -ENOENT; 858 err = -ENOENT;
859 inode = next.dentry->d_inode; 859 inode = next.dentry->d_inode;
860 if (!inode) 860 if (!inode)
861 goto out_dput; 861 goto out_dput;
862 err = -ENOTDIR; 862 err = -ENOTDIR;
863 if (!inode->i_op) 863 if (!inode->i_op)
864 goto out_dput; 864 goto out_dput;
865 865
866 if (inode->i_op->follow_link) { 866 if (inode->i_op->follow_link) {
867 err = do_follow_link(&next, nd); 867 err = do_follow_link(&next, nd);
868 if (err) 868 if (err)
869 goto return_err; 869 goto return_err;
870 err = -ENOENT; 870 err = -ENOENT;
871 inode = nd->dentry->d_inode; 871 inode = nd->dentry->d_inode;
872 if (!inode) 872 if (!inode)
873 break; 873 break;
874 err = -ENOTDIR; 874 err = -ENOTDIR;
875 if (!inode->i_op) 875 if (!inode->i_op)
876 break; 876 break;
877 } else 877 } else
878 path_to_nameidata(&next, nd); 878 path_to_nameidata(&next, nd);
879 err = -ENOTDIR; 879 err = -ENOTDIR;
880 if (!inode->i_op->lookup) 880 if (!inode->i_op->lookup)
881 break; 881 break;
882 continue; 882 continue;
883 /* here ends the main loop */ 883 /* here ends the main loop */
884 884
885 last_with_slashes: 885 last_with_slashes:
886 lookup_flags |= LOOKUP_FOLLOW | LOOKUP_DIRECTORY; 886 lookup_flags |= LOOKUP_FOLLOW | LOOKUP_DIRECTORY;
887 last_component: 887 last_component:
888 /* Clear LOOKUP_CONTINUE iff it was previously unset */ 888 /* Clear LOOKUP_CONTINUE iff it was previously unset */
889 nd->flags &= lookup_flags | ~LOOKUP_CONTINUE; 889 nd->flags &= lookup_flags | ~LOOKUP_CONTINUE;
890 if (lookup_flags & LOOKUP_PARENT) 890 if (lookup_flags & LOOKUP_PARENT)
891 goto lookup_parent; 891 goto lookup_parent;
892 if (this.name[0] == '.') switch (this.len) { 892 if (this.name[0] == '.') switch (this.len) {
893 default: 893 default:
894 break; 894 break;
895 case 2: 895 case 2:
896 if (this.name[1] != '.') 896 if (this.name[1] != '.')
897 break; 897 break;
898 follow_dotdot(nd); 898 follow_dotdot(nd);
899 inode = nd->dentry->d_inode; 899 inode = nd->dentry->d_inode;
900 /* fallthrough */ 900 /* fallthrough */
901 case 1: 901 case 1:
902 goto return_reval; 902 goto return_reval;
903 } 903 }
904 if (nd->dentry->d_op && nd->dentry->d_op->d_hash) { 904 if (nd->dentry->d_op && nd->dentry->d_op->d_hash) {
905 err = nd->dentry->d_op->d_hash(nd->dentry, &this); 905 err = nd->dentry->d_op->d_hash(nd->dentry, &this);
906 if (err < 0) 906 if (err < 0)
907 break; 907 break;
908 } 908 }
909 err = do_lookup(nd, &this, &next); 909 err = do_lookup(nd, &this, &next);
910 if (err) 910 if (err)
911 break; 911 break;
912 inode = next.dentry->d_inode; 912 inode = next.dentry->d_inode;
913 if ((lookup_flags & LOOKUP_FOLLOW) 913 if ((lookup_flags & LOOKUP_FOLLOW)
914 && inode && inode->i_op && inode->i_op->follow_link) { 914 && inode && inode->i_op && inode->i_op->follow_link) {
915 err = do_follow_link(&next, nd); 915 err = do_follow_link(&next, nd);
916 if (err) 916 if (err)
917 goto return_err; 917 goto return_err;
918 inode = nd->dentry->d_inode; 918 inode = nd->dentry->d_inode;
919 } else 919 } else
920 path_to_nameidata(&next, nd); 920 path_to_nameidata(&next, nd);
921 err = -ENOENT; 921 err = -ENOENT;
922 if (!inode) 922 if (!inode)
923 break; 923 break;
924 if (lookup_flags & LOOKUP_DIRECTORY) { 924 if (lookup_flags & LOOKUP_DIRECTORY) {
925 err = -ENOTDIR; 925 err = -ENOTDIR;
926 if (!inode->i_op || !inode->i_op->lookup) 926 if (!inode->i_op || !inode->i_op->lookup)
927 break; 927 break;
928 } 928 }
929 goto return_base; 929 goto return_base;
930 lookup_parent: 930 lookup_parent:
931 nd->last = this; 931 nd->last = this;
932 nd->last_type = LAST_NORM; 932 nd->last_type = LAST_NORM;
933 if (this.name[0] != '.') 933 if (this.name[0] != '.')
934 goto return_base; 934 goto return_base;
935 if (this.len == 1) 935 if (this.len == 1)
936 nd->last_type = LAST_DOT; 936 nd->last_type = LAST_DOT;
937 else if (this.len == 2 && this.name[1] == '.') 937 else if (this.len == 2 && this.name[1] == '.')
938 nd->last_type = LAST_DOTDOT; 938 nd->last_type = LAST_DOTDOT;
939 else 939 else
940 goto return_base; 940 goto return_base;
941 return_reval: 941 return_reval:
942 /* 942 /*
943 * We bypassed the ordinary revalidation routines. 943 * We bypassed the ordinary revalidation routines.
944 * We may need to check the cached dentry for staleness. 944 * We may need to check the cached dentry for staleness.
945 */ 945 */
946 if (nd->dentry && nd->dentry->d_sb && 946 if (nd->dentry && nd->dentry->d_sb &&
947 (nd->dentry->d_sb->s_type->fs_flags & FS_REVAL_DOT)) { 947 (nd->dentry->d_sb->s_type->fs_flags & FS_REVAL_DOT)) {
948 err = -ESTALE; 948 err = -ESTALE;
949 /* Note: we do not d_invalidate() */ 949 /* Note: we do not d_invalidate() */
950 if (!nd->dentry->d_op->d_revalidate(nd->dentry, nd)) 950 if (!nd->dentry->d_op->d_revalidate(nd->dentry, nd))
951 break; 951 break;
952 } 952 }
953 return_base: 953 return_base:
954 return 0; 954 return 0;
955 out_dput: 955 out_dput:
956 dput_path(&next, nd); 956 dput_path(&next, nd);
957 break; 957 break;
958 } 958 }
959 path_release(nd); 959 path_release(nd);
960 return_err: 960 return_err:
961 return err; 961 return err;
962 } 962 }
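
The core of the walk above is the scan that cuts the next component out of the pathname, hashes it byte by byte, skips any run of slashes, and treats "." and ".." specially. The sketch below isolates that scan in userspace. It is an illustration under assumptions: `demo_hash_step()` is a stand-in and not the kernel's partial_name_hash(), and all `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum demo_kind { DEMO_NORM, DEMO_DOT, DEMO_DOTDOT };

struct demo_comp {
	const char *name;	/* points into the original path */
	size_t len;
	unsigned long hash;
	enum demo_kind kind;
};

/* Stand-in hash step; NOT the kernel's partial_name_hash(). */
static unsigned long demo_hash_step(unsigned char c, unsigned long h)
{
	return h * 31 + c;
}

/*
 * Parse the next component of *pathp into *out and advance *pathp past
 * it and any slashes that follow. Returns 0 when no component is left.
 */
static int demo_next_component(const char **pathp, struct demo_comp *out)
{
	const char *name = *pathp;
	unsigned long hash = 0;
	unsigned char c;

	while (*name == '/')
		name++;
	if (!*name) {
		*pathp = name;
		return 0;
	}

	out->name = name;
	c = (unsigned char)*name;
	do {
		name++;
		hash = demo_hash_step(c, hash);
		c = (unsigned char)*name;
	} while (c && c != '/');
	out->len = (size_t)(name - out->name);
	out->hash = hash;

	/* "." and ".." never reach the filesystem's lookup method. */
	out->kind = DEMO_NORM;
	if (out->name[0] == '.') {
		if (out->len == 1)
			out->kind = DEMO_DOT;
		else if (out->len == 2 && out->name[1] == '.')
			out->kind = DEMO_DOTDOT;
	}

	while (*name == '/')
		name++;
	*pathp = name;
	return 1;
}

/* Count the components that would need a real lookup. */
static int demo_count_normal(const char *path)
{
	struct demo_comp c;
	int n = 0;

	while (demo_next_component(&path, &c))
		if (c.kind == DEMO_NORM)
			n++;
	return n;
}
```

Unlike the kernel function, the sketch does no permission checks, symlink following or mount crossing; it only shows how the string is consumed.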
963 963
964 /* 964 /*
965 * Wrapper to retry pathname resolution whenever the underlying 965 * Wrapper to retry pathname resolution whenever the underlying
966 * file system returns an ESTALE. 966 * file system returns an ESTALE.
967 * 967 *
968 * Retry the whole path once, forcing real lookup requests 968 * Retry the whole path once, forcing real lookup requests
969 * instead of relying on the dcache. 969 * instead of relying on the dcache.
970 */ 970 */
971 int fastcall link_path_walk(const char *name, struct nameidata *nd) 971 int fastcall link_path_walk(const char *name, struct nameidata *nd)
972 { 972 {
973 struct nameidata save = *nd; 973 struct nameidata save = *nd;
974 int result; 974 int result;
975 975
976 /* make sure the stuff we saved doesn't go away */ 976 /* make sure the stuff we saved doesn't go away */
977 dget(save.dentry); 977 dget(save.dentry);
978 mntget(save.mnt); 978 mntget(save.mnt);
979 979
980 result = __link_path_walk(name, nd); 980 result = __link_path_walk(name, nd);
981 if (result == -ESTALE) { 981 if (result == -ESTALE) {
982 *nd = save; 982 *nd = save;
983 dget(nd->dentry); 983 dget(nd->dentry);
984 mntget(nd->mnt); 984 mntget(nd->mnt);
985 nd->flags |= LOOKUP_REVAL; 985 nd->flags |= LOOKUP_REVAL;
986 result = __link_path_walk(name, nd); 986 result = __link_path_walk(name, nd);
987 } 987 }
988 988
989 dput(save.dentry); 989 dput(save.dentry);
990 mntput(save.mnt); 990 mntput(save.mnt);
991 991
992 return result; 992 return result;
993 } 993 }
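
The wrapper above follows a save, try, restore, retry-once pattern: it keeps a copy of the starting state, and on -ESTALE it restores that copy, sets a revalidation flag and runs the walk exactly one more time. A userspace sketch of the same control flow is shown here. The struct, the flag value and the walker callbacks are hypothetical, and the reference counting that the kernel does with dget()/mntget() is left out.

```c
#include <assert.h>
#include <errno.h>

#define DEMO_LOOKUP_REVAL 0x1u	/* hypothetical flag value */

struct demo_state {
	unsigned int flags;
	int position;		/* stands in for nd->dentry and nd->mnt */
};

/* A walker mutates the state; `calls` is demo-only bookkeeping. */
typedef int (*demo_walk_fn)(struct demo_state *st, int *calls);

static int demo_walk_with_retry(demo_walk_fn walk, struct demo_state *st,
				int *calls)
{
	struct demo_state save = *st;	/* starting point for a retry */
	int result = walk(st, calls);

	if (result == -ESTALE) {
		*st = save;			/* start over from the top */
		st->flags |= DEMO_LOOKUP_REVAL;	/* bypass cached results */
		result = walk(st, calls);	/* second and last attempt */
	}
	return result;
}

/* Hypothetical walker: stale unless told to revalidate. */
static int demo_stale_until_reval(struct demo_state *st, int *calls)
{
	(*calls)++;
	st->position++;
	return (st->flags & DEMO_LOOKUP_REVAL) ? 0 : -ESTALE;
}

/* Hypothetical walker that never recovers. */
static int demo_always_stale(struct demo_state *st, int *calls)
{
	(*calls)++;
	st->position++;
	return -ESTALE;
}
```

A walker that stays stale is called twice and its -ESTALE is then passed to the caller, so the retry cannot loop.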
994 994
995 int fastcall path_walk(const char * name, struct nameidata *nd) 995 int fastcall path_walk(const char * name, struct nameidata *nd)
996 { 996 {
997 current->total_link_count = 0; 997 current->total_link_count = 0;
998 return link_path_walk(name, nd); 998 return link_path_walk(name, nd);
999 } 999 }
1000 1000
1001 /* 1001 /*
1002 * SMP-safe: Returns 1 and nd will have valid dentry and mnt, if 1002 * SMP-safe: Returns 1 and nd will have valid dentry and mnt, if
1003 * everything is done. Returns 0 and drops input nd, if lookup failed. 1003 * everything is done. Returns 0 and drops input nd, if lookup failed.
1004 */ 1004 */
1005 static int __emul_lookup_dentry(const char *name, struct nameidata *nd) 1005 static int __emul_lookup_dentry(const char *name, struct nameidata *nd)
1006 { 1006 {
1007 if (path_walk(name, nd)) 1007 if (path_walk(name, nd))
1008 return 0; /* something went wrong... */ 1008 return 0; /* something went wrong... */
1009 1009
1010 if (!nd->dentry->d_inode || S_ISDIR(nd->dentry->d_inode->i_mode)) { 1010 if (!nd->dentry->d_inode || S_ISDIR(nd->dentry->d_inode->i_mode)) {
1011 struct dentry *old_dentry = nd->dentry; 1011 struct dentry *old_dentry = nd->dentry;
1012 struct vfsmount *old_mnt = nd->mnt; 1012 struct vfsmount *old_mnt = nd->mnt;
1013 struct qstr last = nd->last; 1013 struct qstr last = nd->last;
1014 int last_type = nd->last_type; 1014 int last_type = nd->last_type;
1015 /* 1015 /*
1016 * NAME was not found in alternate root or it's a directory. Try to find 1016 * NAME was not found in alternate root or it's a directory. Try to find
1017 * it in the normal root: 1017 * it in the normal root:
1018 */ 1018 */
1019 nd->last_type = LAST_ROOT; 1019 nd->last_type = LAST_ROOT;
1020 read_lock(&current->fs->lock); 1020 read_lock(&current->fs->lock);
1021 nd->mnt = mntget(current->fs->rootmnt); 1021 nd->mnt = mntget(current->fs->rootmnt);
1022 nd->dentry = dget(current->fs->root); 1022 nd->dentry = dget(current->fs->root);
1023 read_unlock(&current->fs->lock); 1023 read_unlock(&current->fs->lock);
1024 if (path_walk(name, nd) == 0) { 1024 if (path_walk(name, nd) == 0) {
1025 if (nd->dentry->d_inode) { 1025 if (nd->dentry->d_inode) {
1026 dput(old_dentry); 1026 dput(old_dentry);
1027 mntput(old_mnt); 1027 mntput(old_mnt);
1028 return 1; 1028 return 1;
1029 } 1029 }
1030 path_release(nd); 1030 path_release(nd);
1031 } 1031 }
1032 nd->dentry = old_dentry; 1032 nd->dentry = old_dentry;
1033 nd->mnt = old_mnt; 1033 nd->mnt = old_mnt;
1034 nd->last = last; 1034 nd->last = last;
1035 nd->last_type = last_type; 1035 nd->last_type = last_type;
1036 } 1036 }
1037 return 1; 1037 return 1;
1038 } 1038 }
1039 1039
1040 void set_fs_altroot(void) 1040 void set_fs_altroot(void)
1041 { 1041 {
1042 char *emul = __emul_prefix(); 1042 char *emul = __emul_prefix();
1043 struct nameidata nd; 1043 struct nameidata nd;
1044 struct vfsmount *mnt = NULL, *oldmnt; 1044 struct vfsmount *mnt = NULL, *oldmnt;
1045 struct dentry *dentry = NULL, *olddentry; 1045 struct dentry *dentry = NULL, *olddentry;
1046 int err; 1046 int err;
1047 1047
1048 if (!emul) 1048 if (!emul)
1049 goto set_it; 1049 goto set_it;
1050 err = path_lookup(emul, LOOKUP_FOLLOW|LOOKUP_DIRECTORY|LOOKUP_NOALT, &nd); 1050 err = path_lookup(emul, LOOKUP_FOLLOW|LOOKUP_DIRECTORY|LOOKUP_NOALT, &nd);
1051 if (!err) { 1051 if (!err) {
1052 mnt = nd.mnt; 1052 mnt = nd.mnt;
1053 dentry = nd.dentry; 1053 dentry = nd.dentry;
1054 } 1054 }
1055 set_it: 1055 set_it:
1056 write_lock(&current->fs->lock); 1056 write_lock(&current->fs->lock);
1057 oldmnt = current->fs->altrootmnt; 1057 oldmnt = current->fs->altrootmnt;
1058 olddentry = current->fs->altroot; 1058 olddentry = current->fs->altroot;
1059 current->fs->altrootmnt = mnt; 1059 current->fs->altrootmnt = mnt;
1060 current->fs->altroot = dentry; 1060 current->fs->altroot = dentry;
1061 write_unlock(&current->fs->lock); 1061 write_unlock(&current->fs->lock);
1062 if (olddentry) { 1062 if (olddentry) {
1063 dput(olddentry); 1063 dput(olddentry);
1064 mntput(oldmnt); 1064 mntput(oldmnt);
1065 } 1065 }
1066 } 1066 }
1067 1067
1068 /* Returns 0 and nd will be valid on success; Returns error, otherwise. */ 1068 /* Returns 0 and nd will be valid on success; Returns error, otherwise. */
1069 static int fastcall do_path_lookup(int dfd, const char *name, 1069 static int fastcall do_path_lookup(int dfd, const char *name,
1070 unsigned int flags, struct nameidata *nd) 1070 unsigned int flags, struct nameidata *nd)
1071 { 1071 {
1072 int retval = 0; 1072 int retval = 0;
1073 int fput_needed; 1073 int fput_needed;
1074 struct file *file; 1074 struct file *file;
1075 1075
1076 nd->last_type = LAST_ROOT; /* if there are only slashes... */ 1076 nd->last_type = LAST_ROOT; /* if there are only slashes... */
1077 nd->flags = flags; 1077 nd->flags = flags;
1078 nd->depth = 0; 1078 nd->depth = 0;
1079 1079
1080 read_lock(&current->fs->lock); 1080 read_lock(&current->fs->lock);
1081 if (*name=='/') { 1081 if (*name=='/') {
1082 if (current->fs->altroot && !(nd->flags & LOOKUP_NOALT)) { 1082 if (current->fs->altroot && !(nd->flags & LOOKUP_NOALT)) {
1083 nd->mnt = mntget(current->fs->altrootmnt); 1083 nd->mnt = mntget(current->fs->altrootmnt);
1084 nd->dentry = dget(current->fs->altroot); 1084 nd->dentry = dget(current->fs->altroot);
1085 read_unlock(&current->fs->lock); 1085 read_unlock(&current->fs->lock);
1086 if (__emul_lookup_dentry(name,nd)) 1086 if (__emul_lookup_dentry(name,nd))
1087 goto out; /* found in altroot */ 1087 goto out; /* found in altroot */
1088 read_lock(&current->fs->lock); 1088 read_lock(&current->fs->lock);
1089 } 1089 }
1090 nd->mnt = mntget(current->fs->rootmnt); 1090 nd->mnt = mntget(current->fs->rootmnt);
1091 nd->dentry = dget(current->fs->root); 1091 nd->dentry = dget(current->fs->root);
1092 } else if (dfd == AT_FDCWD) { 1092 } else if (dfd == AT_FDCWD) {
1093 nd->mnt = mntget(current->fs->pwdmnt); 1093 nd->mnt = mntget(current->fs->pwdmnt);
1094 nd->dentry = dget(current->fs->pwd); 1094 nd->dentry = dget(current->fs->pwd);
1095 } else { 1095 } else {
1096 struct dentry *dentry; 1096 struct dentry *dentry;
1097 1097
1098 file = fget_light(dfd, &fput_needed); 1098 file = fget_light(dfd, &fput_needed);
1099 retval = -EBADF; 1099 retval = -EBADF;
1100 if (!file) 1100 if (!file)
1101 goto unlock_fail; 1101 goto unlock_fail;
1102 1102
1103 dentry = file->f_dentry; 1103 dentry = file->f_dentry;
1104 1104
1105 retval = -ENOTDIR; 1105 retval = -ENOTDIR;
1106 if (!S_ISDIR(dentry->d_inode->i_mode)) 1106 if (!S_ISDIR(dentry->d_inode->i_mode))
1107 goto fput_unlock_fail; 1107 goto fput_unlock_fail;
1108 1108
1109 retval = file_permission(file, MAY_EXEC); 1109 retval = file_permission(file, MAY_EXEC);
1110 if (retval) 1110 if (retval)
1111 goto fput_unlock_fail; 1111 goto fput_unlock_fail;
1112 1112
1113 nd->mnt = mntget(file->f_vfsmnt); 1113 nd->mnt = mntget(file->f_vfsmnt);
1114 nd->dentry = dget(dentry); 1114 nd->dentry = dget(dentry);
1115 1115
1116 fput_light(file, fput_needed); 1116 fput_light(file, fput_needed);
1117 } 1117 }
1118 read_unlock(&current->fs->lock); 1118 read_unlock(&current->fs->lock);
1119 current->total_link_count = 0; 1119 current->total_link_count = 0;
1120 retval = link_path_walk(name, nd); 1120 retval = link_path_walk(name, nd);
1121 out: 1121 out:
1122 if (likely(retval == 0)) { 1122 if (likely(retval == 0)) {
1123 if (unlikely(current->audit_context && nd && nd->dentry && 1123 if (unlikely(current->audit_context && nd && nd->dentry &&
1124 nd->dentry->d_inode)) 1124 nd->dentry->d_inode))
1125 audit_inode(name, nd->dentry->d_inode, flags); 1125 audit_inode(name, nd->dentry->d_inode, flags);
1126 } 1126 }
1127 return retval; 1127 return retval;
1128 1128
1129 fput_unlock_fail: 1129 fput_unlock_fail:
1130 fput_light(file, fput_needed); 1130 fput_light(file, fput_needed);
1131 unlock_fail: 1131 unlock_fail:
1132 read_unlock(&current->fs->lock); 1132 read_unlock(&current->fs->lock);
1133 return retval; 1133 return retval;
1134 } 1134 }
1135 1135
1136 int fastcall path_lookup(const char *name, unsigned int flags, 1136 int fastcall path_lookup(const char *name, unsigned int flags,
1137 struct nameidata *nd) 1137 struct nameidata *nd)
1138 { 1138 {
1139 return do_path_lookup(AT_FDCWD, name, flags, nd); 1139 return do_path_lookup(AT_FDCWD, name, flags, nd);
1140 } 1140 }
1141 1141
1142 static int __path_lookup_intent_open(int dfd, const char *name, 1142 static int __path_lookup_intent_open(int dfd, const char *name,
1143 unsigned int lookup_flags, struct nameidata *nd, 1143 unsigned int lookup_flags, struct nameidata *nd,
1144 int open_flags, int create_mode) 1144 int open_flags, int create_mode)
1145 { 1145 {
1146 struct file *filp = get_empty_filp(); 1146 struct file *filp = get_empty_filp();
1147 int err; 1147 int err;
1148 1148
1149 if (filp == NULL) 1149 if (filp == NULL)
1150 return -ENFILE; 1150 return -ENFILE;
1151 nd->intent.open.file = filp; 1151 nd->intent.open.file = filp;
1152 nd->intent.open.flags = open_flags; 1152 nd->intent.open.flags = open_flags;
1153 nd->intent.open.create_mode = create_mode; 1153 nd->intent.open.create_mode = create_mode;
1154 err = do_path_lookup(dfd, name, lookup_flags|LOOKUP_OPEN, nd); 1154 err = do_path_lookup(dfd, name, lookup_flags|LOOKUP_OPEN, nd);
1155 if (IS_ERR(nd->intent.open.file)) { 1155 if (IS_ERR(nd->intent.open.file)) {
1156 if (err == 0) { 1156 if (err == 0) {
1157 err = PTR_ERR(nd->intent.open.file); 1157 err = PTR_ERR(nd->intent.open.file);
1158 path_release(nd); 1158 path_release(nd);
1159 } 1159 }
1160 } else if (err != 0) 1160 } else if (err != 0)
1161 release_open_intent(nd); 1161 release_open_intent(nd);
1162 return err; 1162 return err;
1163 } 1163 }
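The IS_ERR()/PTR_ERR() checks on nd->intent.open.file in the function above rely on the kernel convention of encoding a negative errno value in a pointer. Below is a minimal userspace sketch of that convention. The helper definitions are assumed to resemble the kernel's include/linux/err.h; struct intent and use_intent_file() are hypothetical names used only for illustration, not part of fs/namei.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel helpers (assumption: same idea as
 * include/linux/err.h, where the top 4095 addresses encode -errno). */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical placeholders for struct file and the open intent. */
struct file { int dummy; };
struct intent { struct file *file; };

/* Hypothetical consumer: it must test IS_ERR() before dereferencing,
 * otherwise an encoded error would be treated as a real address. */
static int use_intent_file(const struct intent *it)
{
	if (IS_ERR(it->file))
		return (int)PTR_ERR(it->file);
	return it->file->dummy;
}
```

Skipping the IS_ERR() test in use_intent_file() would dereference the encoded error value, which is the failure mode the commit message describes for the O_CREAT path.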
1164 1164
1165 /** 1165 /**
1166 * path_lookup_open - lookup a file path with open intent 1166 * path_lookup_open - lookup a file path with open intent
1167 * @dfd: the directory to use as base, or AT_FDCWD 1167 * @dfd: the directory to use as base, or AT_FDCWD
1168 * @name: pointer to file name 1168 * @name: pointer to file name
1169 * @lookup_flags: lookup intent flags 1169 * @lookup_flags: lookup intent flags
1170 * @nd: pointer to nameidata 1170 * @nd: pointer to nameidata
1171 * @open_flags: open intent flags 1171 * @open_flags: open intent flags
1172 */ 1172 */
1173 int path_lookup_open(int dfd, const char *name, unsigned int lookup_flags, 1173 int path_lookup_open(int dfd, const char *name, unsigned int lookup_flags,
1174 struct nameidata *nd, int open_flags) 1174 struct nameidata *nd, int open_flags)
1175 { 1175 {
1176 return __path_lookup_intent_open(dfd, name, lookup_flags, nd, 1176 return __path_lookup_intent_open(dfd, name, lookup_flags, nd,
1177 open_flags, 0); 1177 open_flags, 0);
1178 } 1178 }
1179 1179
1180 /** 1180 /**
1181 * path_lookup_create - lookup a file path with open + create intent 1181 * path_lookup_create - lookup a file path with open + create intent
1182 * @dfd: the directory to use as base, or AT_FDCWD 1182 * @dfd: the directory to use as base, or AT_FDCWD
1183 * @name: pointer to file name 1183 * @name: pointer to file name
1184 * @lookup_flags: lookup intent flags 1184 * @lookup_flags: lookup intent flags
1185 * @nd: pointer to nameidata 1185 * @nd: pointer to nameidata
1186 * @open_flags: open intent flags 1186 * @open_flags: open intent flags
1187 * @create_mode: create intent flags 1187 * @create_mode: create intent flags
1188 */ 1188 */
1189 static int path_lookup_create(int dfd, const char *name, 1189 static int path_lookup_create(int dfd, const char *name,
1190 unsigned int lookup_flags, struct nameidata *nd, 1190 unsigned int lookup_flags, struct nameidata *nd,
1191 int open_flags, int create_mode) 1191 int open_flags, int create_mode)
1192 { 1192 {
1193 return __path_lookup_intent_open(dfd, name, lookup_flags|LOOKUP_CREATE, 1193 return __path_lookup_intent_open(dfd, name, lookup_flags|LOOKUP_CREATE,
1194 nd, open_flags, create_mode); 1194 nd, open_flags, create_mode);
1195 } 1195 }
1196 1196
1197 int __user_path_lookup_open(const char __user *name, unsigned int lookup_flags, 1197 int __user_path_lookup_open(const char __user *name, unsigned int lookup_flags,
1198 struct nameidata *nd, int open_flags) 1198 struct nameidata *nd, int open_flags)
1199 { 1199 {
1200 char *tmp = getname(name); 1200 char *tmp = getname(name);
1201 int err = PTR_ERR(tmp); 1201 int err = PTR_ERR(tmp);
1202 1202
1203 if (!IS_ERR(tmp)) { 1203 if (!IS_ERR(tmp)) {
1204 err = __path_lookup_intent_open(AT_FDCWD, tmp, lookup_flags, nd, open_flags, 0); 1204 err = __path_lookup_intent_open(AT_FDCWD, tmp, lookup_flags, nd, open_flags, 0);
1205 putname(tmp); 1205 putname(tmp);
1206 } 1206 }
1207 return err; 1207 return err;
1208 } 1208 }
1209 1209
1210 /* 1210 /*
1211 * Restricted form of lookup. Doesn't follow links, single-component only, 1211 * Restricted form of lookup. Doesn't follow links, single-component only,
1212 * needs parent already locked. Doesn't follow mounts. 1212 * needs parent already locked. Doesn't follow mounts.
1213 * SMP-safe. 1213 * SMP-safe.
1214 */ 1214 */
1215 static struct dentry * __lookup_hash(struct qstr *name, struct dentry * base, struct nameidata *nd) 1215 static struct dentry * __lookup_hash(struct qstr *name, struct dentry * base, struct nameidata *nd)
1216 { 1216 {
1217 struct dentry * dentry; 1217 struct dentry * dentry;
1218 struct inode *inode; 1218 struct inode *inode;
1219 int err; 1219 int err;
1220 1220
1221 inode = base->d_inode; 1221 inode = base->d_inode;
1222 err = permission(inode, MAY_EXEC, nd); 1222 err = permission(inode, MAY_EXEC, nd);
1223 dentry = ERR_PTR(err); 1223 dentry = ERR_PTR(err);
1224 if (err) 1224 if (err)
1225 goto out; 1225 goto out;
1226 1226
1227 /* 1227 /*
1228 * See if the low-level filesystem might want 1228 * See if the low-level filesystem might want
1229 * to use its own hash.. 1229 * to use its own hash..
1230 */ 1230 */
1231 if (base->d_op && base->d_op->d_hash) { 1231 if (base->d_op && base->d_op->d_hash) {
1232 err = base->d_op->d_hash(base, name); 1232 err = base->d_op->d_hash(base, name);
1233 dentry = ERR_PTR(err); 1233 dentry = ERR_PTR(err);
1234 if (err < 0) 1234 if (err < 0)
1235 goto out; 1235 goto out;
1236 } 1236 }
1237 1237
1238 dentry = cached_lookup(base, name, nd); 1238 dentry = cached_lookup(base, name, nd);
1239 if (!dentry) { 1239 if (!dentry) {
1240 struct dentry *new = d_alloc(base, name); 1240 struct dentry *new = d_alloc(base, name);
1241 dentry = ERR_PTR(-ENOMEM); 1241 dentry = ERR_PTR(-ENOMEM);
1242 if (!new) 1242 if (!new)
1243 goto out; 1243 goto out;
1244 dentry = inode->i_op->lookup(inode, new, nd); 1244 dentry = inode->i_op->lookup(inode, new, nd);
1245 if (!dentry) 1245 if (!dentry)
1246 dentry = new; 1246 dentry = new;
1247 else 1247 else
1248 dput(new); 1248 dput(new);
1249 } 1249 }
1250 out: 1250 out:
1251 return dentry; 1251 return dentry;
1252 } 1252 }
1253 1253
1254 struct dentry * lookup_hash(struct nameidata *nd) 1254 struct dentry * lookup_hash(struct nameidata *nd)
1255 { 1255 {
1256 return __lookup_hash(&nd->last, nd->dentry, nd); 1256 return __lookup_hash(&nd->last, nd->dentry, nd);
1257 } 1257 }
1258 1258
1259 /* SMP-safe */ 1259 /* SMP-safe */
1260 struct dentry * lookup_one_len(const char * name, struct dentry * base, int len) 1260 struct dentry * lookup_one_len(const char * name, struct dentry * base, int len)
1261 { 1261 {
1262 unsigned long hash; 1262 unsigned long hash;
1263 struct qstr this; 1263 struct qstr this;
1264 unsigned int c; 1264 unsigned int c;
1265 1265
1266 this.name = name; 1266 this.name = name;
1267 this.len = len; 1267 this.len = len;
1268 if (!len) 1268 if (!len)
1269 goto access; 1269 goto access;
1270 1270
1271 hash = init_name_hash(); 1271 hash = init_name_hash();
1272 while (len--) { 1272 while (len--) {
1273 c = *(const unsigned char *)name++; 1273 c = *(const unsigned char *)name++;
1274 if (c == '/' || c == '\0') 1274 if (c == '/' || c == '\0')
1275 goto access; 1275 goto access;
1276 hash = partial_name_hash(c, hash); 1276 hash = partial_name_hash(c, hash);
1277 } 1277 }
1278 this.hash = end_name_hash(hash); 1278 this.hash = end_name_hash(hash);
1279 1279
1280 return __lookup_hash(&this, base, NULL); 1280 return __lookup_hash(&this, base, NULL);
1281 access: 1281 access:
1282 return ERR_PTR(-EACCES); 1282 return ERR_PTR(-EACCES);
1283 } 1283 }
1284 1284
1285 /* 1285 /*
1286 * namei() 1286 * namei()
1287 * 1287 *
1288 * is used by most simple commands to get the inode of a specified name. 1288 * is used by most simple commands to get the inode of a specified name.
1289 * Open, link etc use their own routines, but this is enough for things 1289 * Open, link etc use their own routines, but this is enough for things
1290 * like 'chmod' etc. 1290 * like 'chmod' etc.
1291 * 1291 *
1292 * namei exists in two versions: namei/lnamei. The only difference is 1292 * namei exists in two versions: namei/lnamei. The only difference is
1293 * that namei follows links, while lnamei does not. 1293 * that namei follows links, while lnamei does not.
1294 * SMP-safe 1294 * SMP-safe
1295 */ 1295 */
1296 int fastcall __user_walk_fd(int dfd, const char __user *name, unsigned flags, 1296 int fastcall __user_walk_fd(int dfd, const char __user *name, unsigned flags,
1297 struct nameidata *nd) 1297 struct nameidata *nd)
1298 { 1298 {
1299 char *tmp = getname(name); 1299 char *tmp = getname(name);
1300 int err = PTR_ERR(tmp); 1300 int err = PTR_ERR(tmp);
1301 1301
1302 if (!IS_ERR(tmp)) { 1302 if (!IS_ERR(tmp)) {
1303 err = do_path_lookup(dfd, tmp, flags, nd); 1303 err = do_path_lookup(dfd, tmp, flags, nd);
1304 putname(tmp); 1304 putname(tmp);
1305 } 1305 }
1306 return err; 1306 return err;
1307 } 1307 }
1308 1308
1309 int fastcall __user_walk(const char __user *name, unsigned flags, struct nameidata *nd) 1309 int fastcall __user_walk(const char __user *name, unsigned flags, struct nameidata *nd)
1310 { 1310 {
1311 return __user_walk_fd(AT_FDCWD, name, flags, nd); 1311 return __user_walk_fd(AT_FDCWD, name, flags, nd);
1312 } 1312 }
1313 1313
1314 /* 1314 /*
1315 * It's inline, so penalty for filesystems that don't use sticky bit is 1315 * It's inline, so penalty for filesystems that don't use sticky bit is
1316 * minimal. 1316 * minimal.
1317 */ 1317 */
1318 static inline int check_sticky(struct inode *dir, struct inode *inode) 1318 static inline int check_sticky(struct inode *dir, struct inode *inode)
1319 { 1319 {
1320 if (!(dir->i_mode & S_ISVTX)) 1320 if (!(dir->i_mode & S_ISVTX))
1321 return 0; 1321 return 0;
1322 if (inode->i_uid == current->fsuid) 1322 if (inode->i_uid == current->fsuid)
1323 return 0; 1323 return 0;
1324 if (dir->i_uid == current->fsuid) 1324 if (dir->i_uid == current->fsuid)
1325 return 0; 1325 return 0;
1326 return !capable(CAP_FOWNER); 1326 return !capable(CAP_FOWNER);
1327 } 1327 }
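The decision order in check_sticky() can be exercised outside the kernel. The following is a sketch under stated assumptions: struct node, STICKY_BIT and the has_fowner parameter are hypothetical stand-ins for struct inode, S_ISVTX and capable(CAP_FOWNER), and the caller's fsuid is passed in explicitly rather than read from current.

```c
#include <assert.h>

#define STICKY_BIT 01000u	/* assumed to match the usual S_ISVTX value */

/* Hypothetical stand-in for the fields of struct inode used here. */
struct node {
	unsigned int mode;
	unsigned int uid;
};

/* Returns nonzero when the sticky bit forbids the caller from removing
 * victim from dir; same order of tests as check_sticky(). */
static int sticky_denies(const struct node *dir, const struct node *victim,
			 unsigned int fsuid, int has_fowner)
{
	if (!(dir->mode & STICKY_BIT))
		return 0;		/* directory is not sticky */
	if (victim->uid == fsuid)
		return 0;		/* caller owns the victim */
	if (dir->uid == fsuid)
		return 0;		/* caller owns the directory */
	return !has_fowner;		/* otherwise only a privileged caller may proceed */
}
```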
1328 1328
1329 /* 1329 /*
1330 * Check whether we can remove a link victim from directory dir, check 1330 * Check whether we can remove a link victim from directory dir, check
1331 * whether the type of victim is right. 1331 * whether the type of victim is right.
1332 * 1. We can't do it if dir is read-only (done in permission()) 1332 * 1. We can't do it if dir is read-only (done in permission())
1333 * 2. We should have write and exec permissions on dir 1333 * 2. We should have write and exec permissions on dir
1334 * 3. We can't remove anything from append-only dir 1334 * 3. We can't remove anything from append-only dir
1335 * 4. We can't do anything with immutable dir (done in permission()) 1335 * 4. We can't do anything with immutable dir (done in permission())
1336 * 5. If the sticky bit on dir is set we should either 1336 * 5. If the sticky bit on dir is set we should either
1337 * a. be owner of dir, or 1337 * a. be owner of dir, or
1338 * b. be owner of victim, or 1338 * b. be owner of victim, or
1339 * c. have CAP_FOWNER capability 1339 * c. have CAP_FOWNER capability
1340 * 6. If the victim is append-only or immutable we can't do anything with 1340 * 6. If the victim is append-only or immutable we can't do anything with
1341 * links pointing to it. 1341 * links pointing to it.
1342 * 7. If we were asked to remove a directory and victim isn't one - ENOTDIR. 1342 * 7. If we were asked to remove a directory and victim isn't one - ENOTDIR.
1343 * 8. If we were asked to remove a non-directory and victim isn't one - EISDIR. 1343 * 8. If we were asked to remove a non-directory and victim isn't one - EISDIR.
1344 * 9. We can't remove a root or mountpoint. 1344 * 9. We can't remove a root or mountpoint.
1345 * 10. We don't allow removal of NFS sillyrenamed files; it's handled by 1345 * 10. We don't allow removal of NFS sillyrenamed files; it's handled by
1346 * nfs_async_unlink(). 1346 * nfs_async_unlink().
1347 */ 1347 */
1348 static int may_delete(struct inode *dir,struct dentry *victim,int isdir) 1348 static int may_delete(struct inode *dir,struct dentry *victim,int isdir)
1349 { 1349 {
1350 int error; 1350 int error;
1351 1351
1352 if (!victim->d_inode) 1352 if (!victim->d_inode)
1353 return -ENOENT; 1353 return -ENOENT;
1354 1354
1355 BUG_ON(victim->d_parent->d_inode != dir); 1355 BUG_ON(victim->d_parent->d_inode != dir);
1356 1356
1357 error = permission(dir,MAY_WRITE | MAY_EXEC, NULL); 1357 error = permission(dir,MAY_WRITE | MAY_EXEC, NULL);
1358 if (error) 1358 if (error)
1359 return error; 1359 return error;
1360 if (IS_APPEND(dir)) 1360 if (IS_APPEND(dir))
1361 return -EPERM; 1361 return -EPERM;
1362 if (check_sticky(dir, victim->d_inode)||IS_APPEND(victim->d_inode)|| 1362 if (check_sticky(dir, victim->d_inode)||IS_APPEND(victim->d_inode)||
1363 IS_IMMUTABLE(victim->d_inode)) 1363 IS_IMMUTABLE(victim->d_inode))
1364 return -EPERM; 1364 return -EPERM;
1365 if (isdir) { 1365 if (isdir) {
1366 if (!S_ISDIR(victim->d_inode->i_mode)) 1366 if (!S_ISDIR(victim->d_inode->i_mode))
1367 return -ENOTDIR; 1367 return -ENOTDIR;
1368 if (IS_ROOT(victim)) 1368 if (IS_ROOT(victim))
1369 return -EBUSY; 1369 return -EBUSY;
1370 } else if (S_ISDIR(victim->d_inode->i_mode)) 1370 } else if (S_ISDIR(victim->d_inode->i_mode))
1371 return -EISDIR; 1371 return -EISDIR;
1372 if (IS_DEADDIR(dir)) 1372 if (IS_DEADDIR(dir))
1373 return -ENOENT; 1373 return -ENOENT;
1374 if (victim->d_flags & DCACHE_NFSFS_RENAMED) 1374 if (victim->d_flags & DCACHE_NFSFS_RENAMED)
1375 return -EBUSY; 1375 return -EBUSY;
1376 return 0; 1376 return 0;
1377 } 1377 }
1378 1378
1379 /* Check whether we can create an object with dentry child in directory 1379 /* Check whether we can create an object with dentry child in directory
1380 * dir. 1380 * dir.
1381 * 1. We can't do it if child already exists (open has special treatment for 1381 * 1. We can't do it if child already exists (open has special treatment for
1382 * this case, but since we are inlined it's OK) 1382 * this case, but since we are inlined it's OK)
1383 * 2. We can't do it if dir is read-only (done in permission()) 1383 * 2. We can't do it if dir is read-only (done in permission())
1384 * 3. We should have write and exec permissions on dir 1384 * 3. We should have write and exec permissions on dir
1385 * 4. We can't do it if dir is immutable (done in permission()) 1385 * 4. We can't do it if dir is immutable (done in permission())
1386 */ 1386 */
1387 static inline int may_create(struct inode *dir, struct dentry *child, 1387 static inline int may_create(struct inode *dir, struct dentry *child,
1388 struct nameidata *nd) 1388 struct nameidata *nd)
1389 { 1389 {
1390 if (child->d_inode) 1390 if (child->d_inode)
1391 return -EEXIST; 1391 return -EEXIST;
1392 if (IS_DEADDIR(dir)) 1392 if (IS_DEADDIR(dir))
1393 return -ENOENT; 1393 return -ENOENT;
1394 return permission(dir,MAY_WRITE | MAY_EXEC, nd); 1394 return permission(dir,MAY_WRITE | MAY_EXEC, nd);
1395 } 1395 }
1396 1396
1397 /* 1397 /*
1398 * O_DIRECTORY translates into forcing a directory lookup. 1398 * O_DIRECTORY translates into forcing a directory lookup.
1399 */ 1399 */
1400 static inline int lookup_flags(unsigned int f) 1400 static inline int lookup_flags(unsigned int f)
1401 { 1401 {
1402 unsigned long retval = LOOKUP_FOLLOW; 1402 unsigned long retval = LOOKUP_FOLLOW;
1403 1403
1404 if (f & O_NOFOLLOW) 1404 if (f & O_NOFOLLOW)
1405 retval &= ~LOOKUP_FOLLOW; 1405 retval &= ~LOOKUP_FOLLOW;
1406 1406
1407 if (f & O_DIRECTORY) 1407 if (f & O_DIRECTORY)
1408 retval |= LOOKUP_DIRECTORY; 1408 retval |= LOOKUP_DIRECTORY;
1409 1409
1410 return retval; 1410 return retval;
1411 } 1411 }
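The translation done by lookup_flags() is small enough to check by hand. The sketch below mirrors it in userspace; the X_* and LK_* constants are illustrative values chosen for this example and are assumptions, not the kernel's or libc's actual O_* and LOOKUP_* definitions.

```c
#include <assert.h>

/* Illustrative flag values (assumptions for this sketch only). */
#define X_NOFOLLOW   0x1u	/* stands in for O_NOFOLLOW */
#define X_DIRECTORY  0x2u	/* stands in for O_DIRECTORY */

#define LK_FOLLOW    0x1u	/* stands in for LOOKUP_FOLLOW */
#define LK_DIRECTORY 0x2u	/* stands in for LOOKUP_DIRECTORY */

/* Follow symlinks by default; the no-follow flag clears that bit and the
 * directory flag forces a directory lookup. */
static unsigned int sketch_lookup_flags(unsigned int f)
{
	unsigned int retval = LK_FOLLOW;

	if (f & X_NOFOLLOW)
		retval &= ~LK_FOLLOW;

	if (f & X_DIRECTORY)
		retval |= LK_DIRECTORY;

	return retval;
}
```

The two open flags are independent: each toggles exactly one lookup bit, so all four combinations are distinct.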
1412 1412
1413 /* 1413 /*
1414 * p1 and p2 should be directories on the same fs. 1414 * p1 and p2 should be directories on the same fs.
1415 */ 1415 */
1416 struct dentry *lock_rename(struct dentry *p1, struct dentry *p2) 1416 struct dentry *lock_rename(struct dentry *p1, struct dentry *p2)
1417 { 1417 {
1418 struct dentry *p; 1418 struct dentry *p;
1419 1419
1420 if (p1 == p2) { 1420 if (p1 == p2) {
1421 mutex_lock(&p1->d_inode->i_mutex); 1421 mutex_lock(&p1->d_inode->i_mutex);
1422 return NULL; 1422 return NULL;
1423 } 1423 }
1424 1424
1425 mutex_lock(&p1->d_inode->i_sb->s_vfs_rename_mutex); 1425 mutex_lock(&p1->d_inode->i_sb->s_vfs_rename_mutex);
1426 1426
1427 for (p = p1; p->d_parent != p; p = p->d_parent) { 1427 for (p = p1; p->d_parent != p; p = p->d_parent) {
1428 if (p->d_parent == p2) { 1428 if (p->d_parent == p2) {
1429 mutex_lock(&p2->d_inode->i_mutex); 1429 mutex_lock(&p2->d_inode->i_mutex);
1430 mutex_lock(&p1->d_inode->i_mutex); 1430 mutex_lock(&p1->d_inode->i_mutex);
1431 return p; 1431 return p;
1432 } 1432 }
1433 } 1433 }
1434 1434
1435 for (p = p2; p->d_parent != p; p = p->d_parent) { 1435 for (p = p2; p->d_parent != p; p = p->d_parent) {
1436 if (p->d_parent == p1) { 1436 if (p->d_parent == p1) {
1437 mutex_lock(&p1->d_inode->i_mutex); 1437 mutex_lock(&p1->d_inode->i_mutex);
1438 mutex_lock(&p2->d_inode->i_mutex); 1438 mutex_lock(&p2->d_inode->i_mutex);
1439 return p; 1439 return p;
1440 } 1440 }
1441 } 1441 }
1442 1442
1443 mutex_lock(&p1->d_inode->i_mutex); 1443 mutex_lock(&p1->d_inode->i_mutex);
1444 mutex_lock(&p2->d_inode->i_mutex); 1444 mutex_lock(&p2->d_inode->i_mutex);
1445 return NULL; 1445 return NULL;
1446 } 1446 }
1447 1447
1448 void unlock_rename(struct dentry *p1, struct dentry *p2) 1448 void unlock_rename(struct dentry *p1, struct dentry *p2)
1449 { 1449 {
1450 mutex_unlock(&p1->d_inode->i_mutex); 1450 mutex_unlock(&p1->d_inode->i_mutex);
1451 if (p1 != p2) { 1451 if (p1 != p2) {
1452 mutex_unlock(&p2->d_inode->i_mutex); 1452 mutex_unlock(&p2->d_inode->i_mutex);
1453 mutex_unlock(&p1->d_inode->i_sb->s_vfs_rename_mutex); 1453 mutex_unlock(&p1->d_inode->i_sb->s_vfs_rename_mutex);
1454 } 1454 }
1455 } 1455 }
1456 1456
1457 int vfs_create(struct inode *dir, struct dentry *dentry, int mode, 1457 int vfs_create(struct inode *dir, struct dentry *dentry, int mode,
1458 struct nameidata *nd) 1458 struct nameidata *nd)
1459 { 1459 {
1460 int error = may_create(dir, dentry, nd); 1460 int error = may_create(dir, dentry, nd);
1461 1461
1462 if (error) 1462 if (error)
1463 return error; 1463 return error;
1464 1464
1465 if (!dir->i_op || !dir->i_op->create) 1465 if (!dir->i_op || !dir->i_op->create)
1466 return -EACCES; /* shouldn't it be ENOSYS? */ 1466 return -EACCES; /* shouldn't it be ENOSYS? */
1467 mode &= S_IALLUGO; 1467 mode &= S_IALLUGO;
1468 mode |= S_IFREG; 1468 mode |= S_IFREG;
1469 error = security_inode_create(dir, dentry, mode); 1469 error = security_inode_create(dir, dentry, mode);
1470 if (error) 1470 if (error)
1471 return error; 1471 return error;
1472 DQUOT_INIT(dir); 1472 DQUOT_INIT(dir);
1473 error = dir->i_op->create(dir, dentry, mode, nd); 1473 error = dir->i_op->create(dir, dentry, mode, nd);
1474 if (!error) 1474 if (!error)
1475 fsnotify_create(dir, dentry->d_name.name); 1475 fsnotify_create(dir, dentry->d_name.name);
1476 return error; 1476 return error;
1477 } 1477 }
1478 1478
1479 int may_open(struct nameidata *nd, int acc_mode, int flag) 1479 int may_open(struct nameidata *nd, int acc_mode, int flag)
1480 { 1480 {
1481 struct dentry *dentry = nd->dentry; 1481 struct dentry *dentry = nd->dentry;
1482 struct inode *inode = dentry->d_inode; 1482 struct inode *inode = dentry->d_inode;
1483 int error; 1483 int error;
1484 1484
1485 if (!inode) 1485 if (!inode)
1486 return -ENOENT; 1486 return -ENOENT;
1487 1487
1488 if (S_ISLNK(inode->i_mode)) 1488 if (S_ISLNK(inode->i_mode))
1489 return -ELOOP; 1489 return -ELOOP;
1490 1490
1491 if (S_ISDIR(inode->i_mode) && (flag & FMODE_WRITE)) 1491 if (S_ISDIR(inode->i_mode) && (flag & FMODE_WRITE))
1492 return -EISDIR; 1492 return -EISDIR;
1493 1493
1494 error = vfs_permission(nd, acc_mode); 1494 error = vfs_permission(nd, acc_mode);
1495 if (error) 1495 if (error)
1496 return error; 1496 return error;
1497 1497
1498 /* 1498 /*
1499 * FIFO's, sockets and device files are special: they don't 1499 * FIFO's, sockets and device files are special: they don't
1500 * actually live on the filesystem itself, and as such you 1500 * actually live on the filesystem itself, and as such you
1501 * can write to them even if the filesystem is read-only. 1501 * can write to them even if the filesystem is read-only.
1502 */ 1502 */
1503 if (S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) { 1503 if (S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) {
1504 flag &= ~O_TRUNC; 1504 flag &= ~O_TRUNC;
1505 } else if (S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode)) { 1505 } else if (S_ISBLK(inode->i_mode) || S_ISCHR(inode->i_mode)) {
1506 if (nd->mnt->mnt_flags & MNT_NODEV) 1506 if (nd->mnt->mnt_flags & MNT_NODEV)
1507 return -EACCES; 1507 return -EACCES;
1508 1508
1509 flag &= ~O_TRUNC; 1509 flag &= ~O_TRUNC;
1510 } else if (IS_RDONLY(inode) && (flag & FMODE_WRITE)) 1510 } else if (IS_RDONLY(inode) && (flag & FMODE_WRITE))
1511 return -EROFS; 1511 return -EROFS;
1512 /* 1512 /*
1513 * An append-only file must be opened in append mode for writing. 1513 * An append-only file must be opened in append mode for writing.
1514 */ 1514 */
1515 if (IS_APPEND(inode)) { 1515 if (IS_APPEND(inode)) {
1516 if ((flag & FMODE_WRITE) && !(flag & O_APPEND)) 1516 if ((flag & FMODE_WRITE) && !(flag & O_APPEND))
1517 return -EPERM; 1517 return -EPERM;
1518 if (flag & O_TRUNC) 1518 if (flag & O_TRUNC)
1519 return -EPERM; 1519 return -EPERM;
1520 } 1520 }
1521 1521
1522 /* O_NOATIME can only be set by the owner or superuser */ 1522 /* O_NOATIME can only be set by the owner or superuser */
1523 if (flag & O_NOATIME) 1523 if (flag & O_NOATIME)
1524 if (current->fsuid != inode->i_uid && !capable(CAP_FOWNER)) 1524 if (current->fsuid != inode->i_uid && !capable(CAP_FOWNER))
1525 return -EPERM; 1525 return -EPERM;
1526 1526
1527 /* 1527 /*
1528 * Ensure there are no outstanding leases on the file. 1528 * Ensure there are no outstanding leases on the file.
1529 */ 1529 */
1530 error = break_lease(inode, flag); 1530 error = break_lease(inode, flag);
1531 if (error) 1531 if (error)
1532 return error; 1532 return error;
1533 1533
1534 if (flag & O_TRUNC) { 1534 if (flag & O_TRUNC) {
1535 error = get_write_access(inode); 1535 error = get_write_access(inode);
1536 if (error) 1536 if (error)
1537 return error; 1537 return error;
1538 1538
1539 /* 1539 /*
1540 * Refuse to truncate files with mandatory locks held on them. 1540 * Refuse to truncate files with mandatory locks held on them.
1541 */ 1541 */
1542 error = locks_verify_locked(inode); 1542 error = locks_verify_locked(inode);
1543 if (!error) { 1543 if (!error) {
1544 DQUOT_INIT(inode); 1544 DQUOT_INIT(inode);
1545 1545
1546 error = do_truncate(dentry, 0, ATTR_MTIME|ATTR_CTIME, NULL); 1546 error = do_truncate(dentry, 0, ATTR_MTIME|ATTR_CTIME, NULL);
1547 } 1547 }
1548 put_write_access(inode); 1548 put_write_access(inode);
1549 if (error) 1549 if (error)
1550 return error; 1550 return error;
1551 } else 1551 } else
1552 if (flag & FMODE_WRITE) 1552 if (flag & FMODE_WRITE)
1553 DQUOT_INIT(inode); 1553 DQUOT_INIT(inode);
1554 1554
1555 return 0; 1555 return 0;
1556 } 1556 }
1557 1557
1558 /* 1558 /*
1559 * open_namei() 1559 * open_namei()
1560 * 1560 *
1561 * namei for open - this is in fact almost the whole open-routine. 1561 * namei for open - this is in fact almost the whole open-routine.
1562 * 1562 *
1563 * Note that the low bits of "flag" aren't the same as in the open 1563 * Note that the low bits of "flag" aren't the same as in the open
1564 * system call - they are 00 - no permissions needed 1564 * system call - they are 00 - no permissions needed
1565 * 01 - read permission needed 1565 * 01 - read permission needed
1566 * 10 - write permission needed 1566 * 10 - write permission needed
1567 * 11 - read/write permissions needed 1567 * 11 - read/write permissions needed
1568 * which is a lot more logical, and also allows the "no perm" needed 1568 * which is a lot more logical, and also allows the "no perm" needed
1569 * for symlinks (where the permissions are checked later). 1569 * for symlinks (where the permissions are checked later).
1570 * SMP-safe 1570 * SMP-safe
1571 */ 1571 */
1572 int open_namei(int dfd, const char *pathname, int flag, 1572 int open_namei(int dfd, const char *pathname, int flag,
1573 int mode, struct nameidata *nd) 1573 int mode, struct nameidata *nd)
1574 { 1574 {
1575 int acc_mode, error; 1575 int acc_mode, error;
1576 struct path path; 1576 struct path path;
1577 struct dentry *dir; 1577 struct dentry *dir;
1578 int count = 0; 1578 int count = 0;
1579 1579
1580 acc_mode = ACC_MODE(flag); 1580 acc_mode = ACC_MODE(flag);
1581 1581
1582 /* O_TRUNC implies we need access checks for write permissions */ 1582 /* O_TRUNC implies we need access checks for write permissions */
1583 if (flag & O_TRUNC) 1583 if (flag & O_TRUNC)
1584 acc_mode |= MAY_WRITE; 1584 acc_mode |= MAY_WRITE;
1585 1585
1586 /* Allow the LSM permission hook to distinguish append 1586 /* Allow the LSM permission hook to distinguish append
1587 access from general write access. */ 1587 access from general write access. */
1588 if (flag & O_APPEND) 1588 if (flag & O_APPEND)
1589 acc_mode |= MAY_APPEND; 1589 acc_mode |= MAY_APPEND;
1590 1590
1591 /* 1591 /*
1592 * The simplest case - just a plain lookup. 1592 * The simplest case - just a plain lookup.
1593 */ 1593 */
1594 if (!(flag & O_CREAT)) { 1594 if (!(flag & O_CREAT)) {
1595 error = path_lookup_open(dfd, pathname, lookup_flags(flag), 1595 error = path_lookup_open(dfd, pathname, lookup_flags(flag),
1596 nd, flag); 1596 nd, flag);
1597 if (error) 1597 if (error)
1598 return error; 1598 return error;
1599 goto ok; 1599 goto ok;
1600 } 1600 }
1601 1601
1602 /* 1602 /*
1603 * Create - we need to know the parent. 1603 * Create - we need to know the parent.
1604 */ 1604 */
1605 error = path_lookup_create(dfd,pathname,LOOKUP_PARENT,nd,flag,mode); 1605 error = path_lookup_create(dfd,pathname,LOOKUP_PARENT,nd,flag,mode);
1606 if (error) 1606 if (error)
1607 return error; 1607 return error;
1608 1608
1609 /* 1609 /*
1610 * We have the parent and last component. First of all, check 1610 * We have the parent and last component. First of all, check
1611 * that we are not asked to creat(2) an obvious directory - that 1611 * that we are not asked to creat(2) an obvious directory - that
1612 * will not do. 1612 * will not do.
1613 */ 1613 */
1614 error = -EISDIR; 1614 error = -EISDIR;
1615 if (nd->last_type != LAST_NORM || nd->last.name[nd->last.len]) 1615 if (nd->last_type != LAST_NORM || nd->last.name[nd->last.len])
1616 goto exit; 1616 goto exit;
1617 1617
1618 dir = nd->dentry; 1618 dir = nd->dentry;
1619 nd->flags &= ~LOOKUP_PARENT; 1619 nd->flags &= ~LOOKUP_PARENT;
1620 mutex_lock(&dir->d_inode->i_mutex); 1620 mutex_lock(&dir->d_inode->i_mutex);
1621 path.dentry = lookup_hash(nd); 1621 path.dentry = lookup_hash(nd);
1622 path.mnt = nd->mnt; 1622 path.mnt = nd->mnt;
1623 1623
1624 do_last: 1624 do_last:
1625 error = PTR_ERR(path.dentry); 1625 error = PTR_ERR(path.dentry);
1626 if (IS_ERR(path.dentry)) { 1626 if (IS_ERR(path.dentry)) {
1627 mutex_unlock(&dir->d_inode->i_mutex); 1627 mutex_unlock(&dir->d_inode->i_mutex);
1628 goto exit; 1628 goto exit;
1629 } 1629 }
1630 1630
1631 if (IS_ERR(nd->intent.open.file)) {
1632 mutex_unlock(&dir->d_inode->i_mutex);
1633 error = PTR_ERR(nd->intent.open.file);
1634 goto exit_dput;
1635 }
1636
1631 /* Negative dentry, just create the file */ 1637 /* Negative dentry, just create the file */
1632 if (!path.dentry->d_inode) { 1638 if (!path.dentry->d_inode) {
1633 if (!IS_POSIXACL(dir->d_inode)) 1639 if (!IS_POSIXACL(dir->d_inode))
1634 mode &= ~current->fs->umask; 1640 mode &= ~current->fs->umask;
1635 error = vfs_create(dir->d_inode, path.dentry, mode, nd); 1641 error = vfs_create(dir->d_inode, path.dentry, mode, nd);
1636 mutex_unlock(&dir->d_inode->i_mutex); 1642 mutex_unlock(&dir->d_inode->i_mutex);
1637 dput(nd->dentry); 1643 dput(nd->dentry);
1638 nd->dentry = path.dentry; 1644 nd->dentry = path.dentry;
1639 if (error) 1645 if (error)
1640 goto exit; 1646 goto exit;
1641 /* Don't check for write permission, don't truncate */ 1647 /* Don't check for write permission, don't truncate */
1642 acc_mode = 0; 1648 acc_mode = 0;
1643 flag &= ~O_TRUNC; 1649 flag &= ~O_TRUNC;
1644 goto ok; 1650 goto ok;
1645 } 1651 }
1646 1652
1647 /* 1653 /*
1648 * It already exists. 1654 * It already exists.
1649 */ 1655 */
1650 mutex_unlock(&dir->d_inode->i_mutex); 1656 mutex_unlock(&dir->d_inode->i_mutex);
1651 1657
1652 error = -EEXIST; 1658 error = -EEXIST;
1653 if (flag & O_EXCL) 1659 if (flag & O_EXCL)
1654 goto exit_dput; 1660 goto exit_dput;
1655 1661
1656 if (__follow_mount(&path)) { 1662 if (__follow_mount(&path)) {
1657 error = -ELOOP; 1663 error = -ELOOP;
1658 if (flag & O_NOFOLLOW) 1664 if (flag & O_NOFOLLOW)
1659 goto exit_dput; 1665 goto exit_dput;
1660 } 1666 }
1661 error = -ENOENT; 1667 error = -ENOENT;
1662 if (!path.dentry->d_inode) 1668 if (!path.dentry->d_inode)
1663 goto exit_dput; 1669 goto exit_dput;
1664 if (path.dentry->d_inode->i_op && path.dentry->d_inode->i_op->follow_link) 1670 if (path.dentry->d_inode->i_op && path.dentry->d_inode->i_op->follow_link)
1665 goto do_link; 1671 goto do_link;
1666 1672
1667 path_to_nameidata(&path, nd); 1673 path_to_nameidata(&path, nd);
1668 error = -EISDIR; 1674 error = -EISDIR;
1669 if (path.dentry->d_inode && S_ISDIR(path.dentry->d_inode->i_mode)) 1675 if (path.dentry->d_inode && S_ISDIR(path.dentry->d_inode->i_mode))
1670 goto exit; 1676 goto exit;
1671 ok: 1677 ok:
1672 error = may_open(nd, acc_mode, flag); 1678 error = may_open(nd, acc_mode, flag);
1673 if (error) 1679 if (error)
1674 goto exit; 1680 goto exit;
1675 return 0; 1681 return 0;
1676 1682
1677 exit_dput: 1683 exit_dput:
1678 dput_path(&path, nd); 1684 dput_path(&path, nd);
1679 exit: 1685 exit:
1680 if (!IS_ERR(nd->intent.open.file)) 1686 if (!IS_ERR(nd->intent.open.file))
1681 release_open_intent(nd); 1687 release_open_intent(nd);
1682 path_release(nd); 1688 path_release(nd);
1683 return error; 1689 return error;
1684 1690
1685 do_link: 1691 do_link:
1686 error = -ELOOP; 1692 error = -ELOOP;
1687 if (flag & O_NOFOLLOW) 1693 if (flag & O_NOFOLLOW)
1688 goto exit_dput; 1694 goto exit_dput;
1689 /* 1695 /*
1690 * This is subtle. Instead of calling do_follow_link() we do the 1696 * This is subtle. Instead of calling do_follow_link() we do the
1691 * thing by hand. The reason is that this way we have zero link_count 1697 * thing by hand. The reason is that this way we have zero link_count
1692 * and path_walk() (called from ->follow_link) honoring LOOKUP_PARENT. 1698 * and path_walk() (called from ->follow_link) honoring LOOKUP_PARENT.
1693 * After that we have the parent and last component, i.e. 1699 * After that we have the parent and last component, i.e.
1694 * we are in the same situation as after the first path_walk(). 1700 * we are in the same situation as after the first path_walk().
1695 * Well, almost - if the last component is normal we get its copy 1701 * Well, almost - if the last component is normal we get its copy
1696 * stored in nd->last.name and we will have to putname() it when we 1702 * stored in nd->last.name and we will have to putname() it when we
1697 * are done. Procfs-like symlinks just set LAST_BIND. 1703 * are done. Procfs-like symlinks just set LAST_BIND.
1698 */ 1704 */
1699 nd->flags |= LOOKUP_PARENT; 1705 nd->flags |= LOOKUP_PARENT;
1700 error = security_inode_follow_link(path.dentry, nd); 1706 error = security_inode_follow_link(path.dentry, nd);
1701 if (error) 1707 if (error)
1702 goto exit_dput; 1708 goto exit_dput;
1703 error = __do_follow_link(&path, nd); 1709 error = __do_follow_link(&path, nd);
1704 if (error) 1710 if (error)
1705 return error; 1711 return error;
1706 nd->flags &= ~LOOKUP_PARENT; 1712 nd->flags &= ~LOOKUP_PARENT;
1707 if (nd->last_type == LAST_BIND) 1713 if (nd->last_type == LAST_BIND)
1708 goto ok; 1714 goto ok;
1709 error = -EISDIR; 1715 error = -EISDIR;
1710 if (nd->last_type != LAST_NORM) 1716 if (nd->last_type != LAST_NORM)
1711 goto exit; 1717 goto exit;
1712 if (nd->last.name[nd->last.len]) { 1718 if (nd->last.name[nd->last.len]) {
1713 __putname(nd->last.name); 1719 __putname(nd->last.name);
1714 goto exit; 1720 goto exit;
1715 } 1721 }
1716 error = -ELOOP; 1722 error = -ELOOP;
1717 if (count++==32) { 1723 if (count++==32) {
1718 __putname(nd->last.name); 1724 __putname(nd->last.name);
1719 goto exit; 1725 goto exit;
1720 } 1726 }
1721 dir = nd->dentry; 1727 dir = nd->dentry;
1722 mutex_lock(&dir->d_inode->i_mutex); 1728 mutex_lock(&dir->d_inode->i_mutex);
1723 path.dentry = lookup_hash(nd); 1729 path.dentry = lookup_hash(nd);
1724 path.mnt = nd->mnt; 1730 path.mnt = nd->mnt;
1725 __putname(nd->last.name); 1731 __putname(nd->last.name);
1726 goto do_last; 1732 goto do_last;
1727 } 1733 }
1728 1734
1729 /** 1735 /**
1730 * lookup_create - lookup a dentry, creating it if it doesn't exist 1736 * lookup_create - lookup a dentry, creating it if it doesn't exist
1731 * @nd: nameidata info 1737 * @nd: nameidata info
1732 * @is_dir: directory flag 1738 * @is_dir: directory flag
1733 * 1739 *
1734 * Simple function to lookup and return a dentry and create it 1740 * Simple function to lookup and return a dentry and create it
1735 * if it doesn't exist. Is SMP-safe. 1741 * if it doesn't exist. Is SMP-safe.
1736 * 1742 *
1737 * Returns with nd->dentry->d_inode->i_mutex locked. 1743 * Returns with nd->dentry->d_inode->i_mutex locked.
1738 */ 1744 */
1739 struct dentry *lookup_create(struct nameidata *nd, int is_dir) 1745 struct dentry *lookup_create(struct nameidata *nd, int is_dir)
1740 { 1746 {
1741 struct dentry *dentry = ERR_PTR(-EEXIST); 1747 struct dentry *dentry = ERR_PTR(-EEXIST);
1742 1748
1743 mutex_lock(&nd->dentry->d_inode->i_mutex); 1749 mutex_lock(&nd->dentry->d_inode->i_mutex);
1744 /* 1750 /*
1745 * Yucky last component or no last component at all? 1751 * Yucky last component or no last component at all?
1746 * (foo/., foo/.., /////) 1752 * (foo/., foo/.., /////)
1747 */ 1753 */
1748 if (nd->last_type != LAST_NORM) 1754 if (nd->last_type != LAST_NORM)
1749 goto fail; 1755 goto fail;
1750 nd->flags &= ~LOOKUP_PARENT; 1756 nd->flags &= ~LOOKUP_PARENT;
1751 1757
1752 /* 1758 /*
1753 * Do the final lookup. 1759 * Do the final lookup.
1754 */ 1760 */
1755 dentry = lookup_hash(nd); 1761 dentry = lookup_hash(nd);
1756 if (IS_ERR(dentry)) 1762 if (IS_ERR(dentry))
1757 goto fail; 1763 goto fail;
1758 1764
1759 /* 1765 /*
1760 * Special case - lookup gave negative, but... we had foo/bar/ 1766 * Special case - lookup gave negative, but... we had foo/bar/
1761 * From the vfs_mknod() POV we just have a negative dentry - 1767 * From the vfs_mknod() POV we just have a negative dentry -
1762 * all is fine. Let's be bastards - you had / on the end, you've 1768 * all is fine. Let's be bastards - you had / on the end, you've
1763 * been asking for (non-existent) directory. -ENOENT for you. 1769 * been asking for (non-existent) directory. -ENOENT for you.
1764 */ 1770 */
1765 if (!is_dir && nd->last.name[nd->last.len] && !dentry->d_inode) 1771 if (!is_dir && nd->last.name[nd->last.len] && !dentry->d_inode)
1766 goto enoent; 1772 goto enoent;
1767 return dentry; 1773 return dentry;
1768 enoent: 1774 enoent:
1769 dput(dentry); 1775 dput(dentry);
1770 dentry = ERR_PTR(-ENOENT); 1776 dentry = ERR_PTR(-ENOENT);
1771 fail: 1777 fail:
1772 return dentry; 1778 return dentry;
1773 } 1779 }
1774 EXPORT_SYMBOL_GPL(lookup_create); 1780 EXPORT_SYMBOL_GPL(lookup_create);
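
lookup_create() above reports failure by returning an errno encoded in the pointer itself (ERR_PTR(-EEXIST), ERR_PTR(-ENOENT)), and its callers test the result with IS_ERR() before touching it. The commit message describes the oops that follows when such a value is dereferenced unchecked. The block below is a minimal userspace sketch of that idiom, not the kernel's implementation: the lowercase helpers err_ptr(), ptr_err() and is_err(), the MAX_ERRNO threshold of 4095, and lookup_demo() are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumption: errno values fit in the top 4095 addresses, which are never valid. */
#define MAX_ERRNO 4095

/* The trick only works if an integer can round-trip through a pointer. */
static_assert(sizeof(intptr_t) == sizeof(void *), "intptr_t must match pointer size");

/* Encode a negative errno as a pointer value. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the negative errno from an encoded pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the reserved error range. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical lookup: returns a usable pointer or an encoded -ENOENT. */
static int slot = 42;

static void *lookup_demo(int exists)
{
	return exists ? (void *)&slot : err_ptr(-ENOENT);
}
```

A caller would check is_err() on the result first and only then either dereference it or turn it back into an error code with ptr_err(), which is the same order sys_mknodat() and sys_mkdirat() follow after calling lookup_create().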
1775 1781
1776 int vfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev) 1782 int vfs_mknod(struct inode *dir, struct dentry *dentry, int mode, dev_t dev)
1777 { 1783 {
1778 int error = may_create(dir, dentry, NULL); 1784 int error = may_create(dir, dentry, NULL);
1779 1785
1780 if (error) 1786 if (error)
1781 return error; 1787 return error;
1782 1788
1783 if ((S_ISCHR(mode) || S_ISBLK(mode)) && !capable(CAP_MKNOD)) 1789 if ((S_ISCHR(mode) || S_ISBLK(mode)) && !capable(CAP_MKNOD))
1784 return -EPERM; 1790 return -EPERM;
1785 1791
1786 if (!dir->i_op || !dir->i_op->mknod) 1792 if (!dir->i_op || !dir->i_op->mknod)
1787 return -EPERM; 1793 return -EPERM;
1788 1794
1789 error = security_inode_mknod(dir, dentry, mode, dev); 1795 error = security_inode_mknod(dir, dentry, mode, dev);
1790 if (error) 1796 if (error)
1791 return error; 1797 return error;
1792 1798
1793 DQUOT_INIT(dir); 1799 DQUOT_INIT(dir);
1794 error = dir->i_op->mknod(dir, dentry, mode, dev); 1800 error = dir->i_op->mknod(dir, dentry, mode, dev);
1795 if (!error) 1801 if (!error)
1796 fsnotify_create(dir, dentry->d_name.name); 1802 fsnotify_create(dir, dentry->d_name.name);
1797 return error; 1803 return error;
1798 } 1804 }
1799 1805
1800 asmlinkage long sys_mknodat(int dfd, const char __user *filename, int mode, 1806 asmlinkage long sys_mknodat(int dfd, const char __user *filename, int mode,
1801 unsigned dev) 1807 unsigned dev)
1802 { 1808 {
1803 int error = 0; 1809 int error = 0;
1804 char * tmp; 1810 char * tmp;
1805 struct dentry * dentry; 1811 struct dentry * dentry;
1806 struct nameidata nd; 1812 struct nameidata nd;
1807 1813
1808 if (S_ISDIR(mode)) 1814 if (S_ISDIR(mode))
1809 return -EPERM; 1815 return -EPERM;
1810 tmp = getname(filename); 1816 tmp = getname(filename);
1811 if (IS_ERR(tmp)) 1817 if (IS_ERR(tmp))
1812 return PTR_ERR(tmp); 1818 return PTR_ERR(tmp);
1813 1819
1814 error = do_path_lookup(dfd, tmp, LOOKUP_PARENT, &nd); 1820 error = do_path_lookup(dfd, tmp, LOOKUP_PARENT, &nd);
1815 if (error) 1821 if (error)
1816 goto out; 1822 goto out;
1817 dentry = lookup_create(&nd, 0); 1823 dentry = lookup_create(&nd, 0);
1818 error = PTR_ERR(dentry); 1824 error = PTR_ERR(dentry);
1819 1825
1820 if (!IS_POSIXACL(nd.dentry->d_inode)) 1826 if (!IS_POSIXACL(nd.dentry->d_inode))
1821 mode &= ~current->fs->umask; 1827 mode &= ~current->fs->umask;
1822 if (!IS_ERR(dentry)) { 1828 if (!IS_ERR(dentry)) {
1823 switch (mode & S_IFMT) { 1829 switch (mode & S_IFMT) {
1824 case 0: case S_IFREG: 1830 case 0: case S_IFREG:
1825 error = vfs_create(nd.dentry->d_inode,dentry,mode,&nd); 1831 error = vfs_create(nd.dentry->d_inode,dentry,mode,&nd);
1826 break; 1832 break;
1827 case S_IFCHR: case S_IFBLK: 1833 case S_IFCHR: case S_IFBLK:
1828 error = vfs_mknod(nd.dentry->d_inode,dentry,mode, 1834 error = vfs_mknod(nd.dentry->d_inode,dentry,mode,
1829 new_decode_dev(dev)); 1835 new_decode_dev(dev));
1830 break; 1836 break;
1831 case S_IFIFO: case S_IFSOCK: 1837 case S_IFIFO: case S_IFSOCK:
1832 error = vfs_mknod(nd.dentry->d_inode,dentry,mode,0); 1838 error = vfs_mknod(nd.dentry->d_inode,dentry,mode,0);
1833 break; 1839 break;
1834 case S_IFDIR: 1840 case S_IFDIR:
1835 error = -EPERM; 1841 error = -EPERM;
1836 break; 1842 break;
1837 default: 1843 default:
1838 error = -EINVAL; 1844 error = -EINVAL;
1839 } 1845 }
1840 dput(dentry); 1846 dput(dentry);
1841 } 1847 }
1842 mutex_unlock(&nd.dentry->d_inode->i_mutex); 1848 mutex_unlock(&nd.dentry->d_inode->i_mutex);
1843 path_release(&nd); 1849 path_release(&nd);
1844 out: 1850 out:
1845 putname(tmp); 1851 putname(tmp);
1846 1852
1847 return error; 1853 return error;
1848 } 1854 }
1849 1855
1850 asmlinkage long sys_mknod(const char __user *filename, int mode, unsigned dev) 1856 asmlinkage long sys_mknod(const char __user *filename, int mode, unsigned dev)
1851 { 1857 {
1852 return sys_mknodat(AT_FDCWD, filename, mode, dev); 1858 return sys_mknodat(AT_FDCWD, filename, mode, dev);
1853 } 1859 }
1854 1860
1855 int vfs_mkdir(struct inode *dir, struct dentry *dentry, int mode) 1861 int vfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
1856 { 1862 {
1857 int error = may_create(dir, dentry, NULL); 1863 int error = may_create(dir, dentry, NULL);
1858 1864
1859 if (error) 1865 if (error)
1860 return error; 1866 return error;
1861 1867
1862 if (!dir->i_op || !dir->i_op->mkdir) 1868 if (!dir->i_op || !dir->i_op->mkdir)
1863 return -EPERM; 1869 return -EPERM;
1864 1870
1865 mode &= (S_IRWXUGO|S_ISVTX); 1871 mode &= (S_IRWXUGO|S_ISVTX);
1866 error = security_inode_mkdir(dir, dentry, mode); 1872 error = security_inode_mkdir(dir, dentry, mode);
1867 if (error) 1873 if (error)
1868 return error; 1874 return error;
1869 1875
1870 DQUOT_INIT(dir); 1876 DQUOT_INIT(dir);
1871 error = dir->i_op->mkdir(dir, dentry, mode); 1877 error = dir->i_op->mkdir(dir, dentry, mode);
1872 if (!error) 1878 if (!error)
1873 fsnotify_mkdir(dir, dentry->d_name.name); 1879 fsnotify_mkdir(dir, dentry->d_name.name);
1874 return error; 1880 return error;
1875 } 1881 }
1876 1882
1877 asmlinkage long sys_mkdirat(int dfd, const char __user *pathname, int mode) 1883 asmlinkage long sys_mkdirat(int dfd, const char __user *pathname, int mode)
1878 { 1884 {
1879 int error = 0; 1885 int error = 0;
1880 char * tmp; 1886 char * tmp;
1881 1887
1882 tmp = getname(pathname); 1888 tmp = getname(pathname);
1883 error = PTR_ERR(tmp); 1889 error = PTR_ERR(tmp);
1884 if (!IS_ERR(tmp)) { 1890 if (!IS_ERR(tmp)) {
1885 struct dentry *dentry; 1891 struct dentry *dentry;
1886 struct nameidata nd; 1892 struct nameidata nd;
1887 1893
1888 error = do_path_lookup(dfd, tmp, LOOKUP_PARENT, &nd); 1894 error = do_path_lookup(dfd, tmp, LOOKUP_PARENT, &nd);
1889 if (error) 1895 if (error)
1890 goto out; 1896 goto out;
1891 dentry = lookup_create(&nd, 1); 1897 dentry = lookup_create(&nd, 1);
1892 error = PTR_ERR(dentry); 1898 error = PTR_ERR(dentry);
1893 if (!IS_ERR(dentry)) { 1899 if (!IS_ERR(dentry)) {
1894 if (!IS_POSIXACL(nd.dentry->d_inode)) 1900 if (!IS_POSIXACL(nd.dentry->d_inode))
1895 mode &= ~current->fs->umask; 1901 mode &= ~current->fs->umask;
1896 error = vfs_mkdir(nd.dentry->d_inode, dentry, mode); 1902 error = vfs_mkdir(nd.dentry->d_inode, dentry, mode);
1897 dput(dentry); 1903 dput(dentry);
1898 } 1904 }
1899 mutex_unlock(&nd.dentry->d_inode->i_mutex); 1905 mutex_unlock(&nd.dentry->d_inode->i_mutex);
1900 path_release(&nd); 1906 path_release(&nd);
1901 out: 1907 out:
1902 putname(tmp); 1908 putname(tmp);
1903 } 1909 }
1904 1910
1905 return error; 1911 return error;
1906 } 1912 }
1907 1913
1908 asmlinkage long sys_mkdir(const char __user *pathname, int mode) 1914 asmlinkage long sys_mkdir(const char __user *pathname, int mode)
1909 { 1915 {
1910 return sys_mkdirat(AT_FDCWD, pathname, mode); 1916 return sys_mkdirat(AT_FDCWD, pathname, mode);
1911 } 1917 }
1912 1918
1913 /* 1919 /*
1914 * We try to drop the dentry early: we should have 1920 * We try to drop the dentry early: we should have
1915 * a usage count of 2 if we're the only user of this 1921 * a usage count of 2 if we're the only user of this
1916 * dentry, and if that is true (possibly after pruning 1922 * dentry, and if that is true (possibly after pruning
1917 * the dcache), then we drop the dentry now. 1923 * the dcache), then we drop the dentry now.
1918 * 1924 *
1919 * A low-level filesystem can, if it chooses, legally 1925 * A low-level filesystem can, if it chooses, legally
1920 * do a 1926 * do a
1921 * 1927 *
1922 * if (!d_unhashed(dentry)) 1928 * if (!d_unhashed(dentry))
1923 * return -EBUSY; 1929 * return -EBUSY;
1924 * 1930 *
1925 * if it cannot handle the case of removing a directory 1931 * if it cannot handle the case of removing a directory
1926 * that is still in use by something else. 1932 * that is still in use by something else.
1927 */ 1933 */
1928 void dentry_unhash(struct dentry *dentry) 1934 void dentry_unhash(struct dentry *dentry)
1929 { 1935 {
1930 dget(dentry); 1936 dget(dentry);
1931 if (atomic_read(&dentry->d_count)) 1937 if (atomic_read(&dentry->d_count))
1932 shrink_dcache_parent(dentry); 1938 shrink_dcache_parent(dentry);
1933 spin_lock(&dcache_lock); 1939 spin_lock(&dcache_lock);
1934 spin_lock(&dentry->d_lock); 1940 spin_lock(&dentry->d_lock);
1935 if (atomic_read(&dentry->d_count) == 2) 1941 if (atomic_read(&dentry->d_count) == 2)
1936 __d_drop(dentry); 1942 __d_drop(dentry);
1937 spin_unlock(&dentry->d_lock); 1943 spin_unlock(&dentry->d_lock);
1938 spin_unlock(&dcache_lock); 1944 spin_unlock(&dcache_lock);
1939 } 1945 }
1940 1946
1941 int vfs_rmdir(struct inode *dir, struct dentry *dentry) 1947 int vfs_rmdir(struct inode *dir, struct dentry *dentry)
1942 { 1948 {
1943 int error = may_delete(dir, dentry, 1); 1949 int error = may_delete(dir, dentry, 1);
1944 1950
1945 if (error) 1951 if (error)
1946 return error; 1952 return error;
1947 1953
1948 if (!dir->i_op || !dir->i_op->rmdir) 1954 if (!dir->i_op || !dir->i_op->rmdir)
1949 return -EPERM; 1955 return -EPERM;
1950 1956
1951 DQUOT_INIT(dir); 1957 DQUOT_INIT(dir);
1952 1958
1953 mutex_lock(&dentry->d_inode->i_mutex); 1959 mutex_lock(&dentry->d_inode->i_mutex);
1954 dentry_unhash(dentry); 1960 dentry_unhash(dentry);
1955 if (d_mountpoint(dentry)) 1961 if (d_mountpoint(dentry))
1956 error = -EBUSY; 1962 error = -EBUSY;
1957 else { 1963 else {
1958 error = security_inode_rmdir(dir, dentry); 1964 error = security_inode_rmdir(dir, dentry);
1959 if (!error) { 1965 if (!error) {
1960 error = dir->i_op->rmdir(dir, dentry); 1966 error = dir->i_op->rmdir(dir, dentry);
1961 if (!error) 1967 if (!error)
1962 dentry->d_inode->i_flags |= S_DEAD; 1968 dentry->d_inode->i_flags |= S_DEAD;
1963 } 1969 }
1964 } 1970 }
1965 mutex_unlock(&dentry->d_inode->i_mutex); 1971 mutex_unlock(&dentry->d_inode->i_mutex);
1966 if (!error) { 1972 if (!error) {
1967 d_delete(dentry); 1973 d_delete(dentry);
1968 } 1974 }
1969 dput(dentry); 1975 dput(dentry);
1970 1976
1971 return error; 1977 return error;
1972 } 1978 }
1973 1979
1974 static long do_rmdir(int dfd, const char __user *pathname) 1980 static long do_rmdir(int dfd, const char __user *pathname)
1975 { 1981 {
1976 int error = 0; 1982 int error = 0;
1977 char * name; 1983 char * name;
1978 struct dentry *dentry; 1984 struct dentry *dentry;
1979 struct nameidata nd; 1985 struct nameidata nd;
1980 1986
1981 name = getname(pathname); 1987 name = getname(pathname);
1982 if(IS_ERR(name)) 1988 if(IS_ERR(name))
1983 return PTR_ERR(name); 1989 return PTR_ERR(name);
1984 1990
1985 error = do_path_lookup(dfd, name, LOOKUP_PARENT, &nd); 1991 error = do_path_lookup(dfd, name, LOOKUP_PARENT, &nd);
1986 if (error) 1992 if (error)
1987 goto exit; 1993 goto exit;
1988 1994
1989 switch(nd.last_type) { 1995 switch(nd.last_type) {
1990 case LAST_DOTDOT: 1996 case LAST_DOTDOT:
1991 error = -ENOTEMPTY; 1997 error = -ENOTEMPTY;
1992 goto exit1; 1998 goto exit1;
1993 case LAST_DOT: 1999 case LAST_DOT:
1994 error = -EINVAL; 2000 error = -EINVAL;
1995 goto exit1; 2001 goto exit1;
1996 case LAST_ROOT: 2002 case LAST_ROOT:
1997 error = -EBUSY; 2003 error = -EBUSY;
1998 goto exit1; 2004 goto exit1;
1999 } 2005 }
2000 mutex_lock(&nd.dentry->d_inode->i_mutex); 2006 mutex_lock(&nd.dentry->d_inode->i_mutex);
2001 dentry = lookup_hash(&nd); 2007 dentry = lookup_hash(&nd);
2002 error = PTR_ERR(dentry); 2008 error = PTR_ERR(dentry);
2003 if (!IS_ERR(dentry)) { 2009 if (!IS_ERR(dentry)) {
2004 error = vfs_rmdir(nd.dentry->d_inode, dentry); 2010 error = vfs_rmdir(nd.dentry->d_inode, dentry);
2005 dput(dentry); 2011 dput(dentry);
2006 } 2012 }
2007 mutex_unlock(&nd.dentry->d_inode->i_mutex); 2013 mutex_unlock(&nd.dentry->d_inode->i_mutex);
2008 exit1: 2014 exit1:
2009 path_release(&nd); 2015 path_release(&nd);
2010 exit: 2016 exit:
2011 putname(name); 2017 putname(name);
2012 return error; 2018 return error;
2013 } 2019 }
2014 2020
2015 asmlinkage long sys_rmdir(const char __user *pathname) 2021 asmlinkage long sys_rmdir(const char __user *pathname)
2016 { 2022 {
2017 return do_rmdir(AT_FDCWD, pathname); 2023 return do_rmdir(AT_FDCWD, pathname);
2018 } 2024 }
2019 2025
2020 int vfs_unlink(struct inode *dir, struct dentry *dentry) 2026 int vfs_unlink(struct inode *dir, struct dentry *dentry)
2021 { 2027 {
2022 int error = may_delete(dir, dentry, 0); 2028 int error = may_delete(dir, dentry, 0);
2023 2029
2024 if (error) 2030 if (error)
2025 return error; 2031 return error;
2026 2032
2027 if (!dir->i_op || !dir->i_op->unlink) 2033 if (!dir->i_op || !dir->i_op->unlink)
2028 return -EPERM; 2034 return -EPERM;
2029 2035
2030 DQUOT_INIT(dir); 2036 DQUOT_INIT(dir);
2031 2037
2032 mutex_lock(&dentry->d_inode->i_mutex); 2038 mutex_lock(&dentry->d_inode->i_mutex);
2033 if (d_mountpoint(dentry)) 2039 if (d_mountpoint(dentry))
2034 error = -EBUSY; 2040 error = -EBUSY;
2035 else { 2041 else {
2036 error = security_inode_unlink(dir, dentry); 2042 error = security_inode_unlink(dir, dentry);
2037 if (!error) 2043 if (!error)
2038 error = dir->i_op->unlink(dir, dentry); 2044 error = dir->i_op->unlink(dir, dentry);
2039 } 2045 }
2040 mutex_unlock(&dentry->d_inode->i_mutex); 2046 mutex_unlock(&dentry->d_inode->i_mutex);
2041 2047
2042 /* We don't d_delete() NFS sillyrenamed files--they still exist. */ 2048 /* We don't d_delete() NFS sillyrenamed files--they still exist. */
2043 if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED)) { 2049 if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED)) {
2044 d_delete(dentry); 2050 d_delete(dentry);
2045 } 2051 }
2046 2052
2047 return error; 2053 return error;
2048 } 2054 }
2049 2055
2050 /* 2056 /*
2051 * Make sure that the actual truncation of the file will occur outside its 2057 * Make sure that the actual truncation of the file will occur outside its
2052 * directory's i_mutex. Truncate can take a long time if there is a lot of 2058 * directory's i_mutex. Truncate can take a long time if there is a lot of
2053 * writeout happening, and we don't want to prevent access to the directory 2059 * writeout happening, and we don't want to prevent access to the directory
2054 * while waiting on the I/O. 2060 * while waiting on the I/O.
2055 */ 2061 */
2056 static long do_unlinkat(int dfd, const char __user *pathname) 2062 static long do_unlinkat(int dfd, const char __user *pathname)
2057 { 2063 {
2058 int error = 0; 2064 int error = 0;
2059 char * name; 2065 char * name;
2060 struct dentry *dentry; 2066 struct dentry *dentry;
2061 struct nameidata nd; 2067 struct nameidata nd;
2062 struct inode *inode = NULL; 2068 struct inode *inode = NULL;
2063 2069
2064 name = getname(pathname); 2070 name = getname(pathname);
2065 if(IS_ERR(name)) 2071 if(IS_ERR(name))
2066 return PTR_ERR(name); 2072 return PTR_ERR(name);
2067 2073
2068 error = do_path_lookup(dfd, name, LOOKUP_PARENT, &nd); 2074 error = do_path_lookup(dfd, name, LOOKUP_PARENT, &nd);
2069 if (error) 2075 if (error)
2070 goto exit; 2076 goto exit;
2071 error = -EISDIR; 2077 error = -EISDIR;
2072 if (nd.last_type != LAST_NORM) 2078 if (nd.last_type != LAST_NORM)
2073 goto exit1; 2079 goto exit1;
2074 mutex_lock(&nd.dentry->d_inode->i_mutex); 2080 mutex_lock(&nd.dentry->d_inode->i_mutex);
2075 dentry = lookup_hash(&nd); 2081 dentry = lookup_hash(&nd);
2076 error = PTR_ERR(dentry); 2082 error = PTR_ERR(dentry);
2077 if (!IS_ERR(dentry)) { 2083 if (!IS_ERR(dentry)) {
2078 /* Why not before? Because we want correct error value */ 2084 /* Why not before? Because we want correct error value */
2079 if (nd.last.name[nd.last.len]) 2085 if (nd.last.name[nd.last.len])
2080 goto slashes; 2086 goto slashes;
2081 inode = dentry->d_inode; 2087 inode = dentry->d_inode;
2082 if (inode) 2088 if (inode)
2083 atomic_inc(&inode->i_count); 2089 atomic_inc(&inode->i_count);
2084 error = vfs_unlink(nd.dentry->d_inode, dentry); 2090 error = vfs_unlink(nd.dentry->d_inode, dentry);
2085 exit2: 2091 exit2:
2086 dput(dentry); 2092 dput(dentry);
2087 } 2093 }
2088 mutex_unlock(&nd.dentry->d_inode->i_mutex); 2094 mutex_unlock(&nd.dentry->d_inode->i_mutex);
2089 if (inode) 2095 if (inode)
2090 iput(inode); /* truncate the inode here */ 2096 iput(inode); /* truncate the inode here */
2091 exit1: 2097 exit1:
2092 path_release(&nd); 2098 path_release(&nd);
2093 exit: 2099 exit:
2094 putname(name); 2100 putname(name);
2095 return error; 2101 return error;
2096 2102
2097 slashes: 2103 slashes:
2098 error = !dentry->d_inode ? -ENOENT : 2104 error = !dentry->d_inode ? -ENOENT :
2099 S_ISDIR(dentry->d_inode->i_mode) ? -EISDIR : -ENOTDIR; 2105 S_ISDIR(dentry->d_inode->i_mode) ? -EISDIR : -ENOTDIR;
2100 goto exit2; 2106 goto exit2;
2101 } 2107 }
2102 2108
2103 asmlinkage long sys_unlinkat(int dfd, const char __user *pathname, int flag) 2109 asmlinkage long sys_unlinkat(int dfd, const char __user *pathname, int flag)
2104 { 2110 {
2105 if ((flag & ~AT_REMOVEDIR) != 0) 2111 if ((flag & ~AT_REMOVEDIR) != 0)
2106 return -EINVAL; 2112 return -EINVAL;
2107 2113
2108 if (flag & AT_REMOVEDIR) 2114 if (flag & AT_REMOVEDIR)
2109 return do_rmdir(dfd, pathname); 2115 return do_rmdir(dfd, pathname);
2110 2116
2111 return do_unlinkat(dfd, pathname); 2117 return do_unlinkat(dfd, pathname);
2112 } 2118 }
2113 2119
2114 asmlinkage long sys_unlink(const char __user *pathname) 2120 asmlinkage long sys_unlink(const char __user *pathname)
2115 { 2121 {
2116 return do_unlinkat(AT_FDCWD, pathname); 2122 return do_unlinkat(AT_FDCWD, pathname);
2117 } 2123 }
2118 2124
2119 int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname, int mode) 2125 int vfs_symlink(struct inode *dir, struct dentry *dentry, const char *oldname, int mode)
2120 { 2126 {
2121 int error = may_create(dir, dentry, NULL); 2127 int error = may_create(dir, dentry, NULL);
2122 2128
2123 if (error) 2129 if (error)
2124 return error; 2130 return error;
2125 2131
2126 if (!dir->i_op || !dir->i_op->symlink) 2132 if (!dir->i_op || !dir->i_op->symlink)
2127 return -EPERM; 2133 return -EPERM;
2128 2134
2129 error = security_inode_symlink(dir, dentry, oldname); 2135 error = security_inode_symlink(dir, dentry, oldname);
2130 if (error) 2136 if (error)
2131 return error; 2137 return error;
2132 2138
2133 DQUOT_INIT(dir); 2139 DQUOT_INIT(dir);
2134 error = dir->i_op->symlink(dir, dentry, oldname); 2140 error = dir->i_op->symlink(dir, dentry, oldname);
2135 if (!error) 2141 if (!error)
2136 fsnotify_create(dir, dentry->d_name.name); 2142 fsnotify_create(dir, dentry->d_name.name);
2137 return error; 2143 return error;
2138 } 2144 }
2139 2145
2140 asmlinkage long sys_symlinkat(const char __user *oldname, 2146 asmlinkage long sys_symlinkat(const char __user *oldname,
2141 int newdfd, const char __user *newname) 2147 int newdfd, const char __user *newname)
2142 { 2148 {
2143 int error = 0; 2149 int error = 0;
2144 char * from; 2150 char * from;
2145 char * to; 2151 char * to;
2146 2152
2147 from = getname(oldname); 2153 from = getname(oldname);
2148 if(IS_ERR(from)) 2154 if(IS_ERR(from))
2149 return PTR_ERR(from); 2155 return PTR_ERR(from);
2150 to = getname(newname); 2156 to = getname(newname);
2151 error = PTR_ERR(to); 2157 error = PTR_ERR(to);
2152 if (!IS_ERR(to)) { 2158 if (!IS_ERR(to)) {
2153 struct dentry *dentry; 2159 struct dentry *dentry;
2154 struct nameidata nd; 2160 struct nameidata nd;
2155 2161
2156 error = do_path_lookup(newdfd, to, LOOKUP_PARENT, &nd); 2162 error = do_path_lookup(newdfd, to, LOOKUP_PARENT, &nd);
2157 if (error) 2163 if (error)
2158 goto out; 2164 goto out;
2159 dentry = lookup_create(&nd, 0); 2165 dentry = lookup_create(&nd, 0);
2160 error = PTR_ERR(dentry); 2166 error = PTR_ERR(dentry);
2161 if (!IS_ERR(dentry)) { 2167 if (!IS_ERR(dentry)) {
2162 error = vfs_symlink(nd.dentry->d_inode, dentry, from, S_IALLUGO); 2168 error = vfs_symlink(nd.dentry->d_inode, dentry, from, S_IALLUGO);
2163 dput(dentry); 2169 dput(dentry);
2164 } 2170 }
2165 mutex_unlock(&nd.dentry->d_inode->i_mutex); 2171 mutex_unlock(&nd.dentry->d_inode->i_mutex);
2166 path_release(&nd); 2172 path_release(&nd);
2167 out: 2173 out:
2168 putname(to); 2174 putname(to);
2169 } 2175 }
2170 putname(from); 2176 putname(from);
2171 return error; 2177 return error;
2172 } 2178 }
2173 2179
2174 asmlinkage long sys_symlink(const char __user *oldname, const char __user *newname) 2180 asmlinkage long sys_symlink(const char __user *oldname, const char __user *newname)
2175 { 2181 {
2176 return sys_symlinkat(oldname, AT_FDCWD, newname); 2182 return sys_symlinkat(oldname, AT_FDCWD, newname);
2177 } 2183 }
2178 2184
2179 int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_dentry) 2185 int vfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *new_dentry)
2180 { 2186 {
2181 struct inode *inode = old_dentry->d_inode; 2187 struct inode *inode = old_dentry->d_inode;
2182 int error; 2188 int error;
2183 2189
2184 if (!inode) 2190 if (!inode)
2185 return -ENOENT; 2191 return -ENOENT;
2186 2192
2187 error = may_create(dir, new_dentry, NULL); 2193 error = may_create(dir, new_dentry, NULL);
2188 if (error) 2194 if (error)
2189 return error; 2195 return error;
2190 2196
2191 if (dir->i_sb != inode->i_sb) 2197 if (dir->i_sb != inode->i_sb)
2192 return -EXDEV; 2198 return -EXDEV;
2193 2199
2194 /* 2200 /*
2195 * A link to an append-only or immutable file cannot be created. 2201 * A link to an append-only or immutable file cannot be created.
2196 */ 2202 */
2197 if (IS_APPEND(inode) || IS_IMMUTABLE(inode)) 2203 if (IS_APPEND(inode) || IS_IMMUTABLE(inode))
2198 return -EPERM; 2204 return -EPERM;
2199 if (!dir->i_op || !dir->i_op->link) 2205 if (!dir->i_op || !dir->i_op->link)
2200 return -EPERM; 2206 return -EPERM;
2201 if (S_ISDIR(old_dentry->d_inode->i_mode)) 2207 if (S_ISDIR(old_dentry->d_inode->i_mode))
2202 return -EPERM; 2208 return -EPERM;
2203 2209
2204 error = security_inode_link(old_dentry, dir, new_dentry); 2210 error = security_inode_link(old_dentry, dir, new_dentry);
2205 if (error) 2211 if (error)
2206 return error; 2212 return error;
2207 2213
2208 mutex_lock(&old_dentry->d_inode->i_mutex); 2214 mutex_lock(&old_dentry->d_inode->i_mutex);
2209 DQUOT_INIT(dir); 2215 DQUOT_INIT(dir);
2210 error = dir->i_op->link(old_dentry, dir, new_dentry); 2216 error = dir->i_op->link(old_dentry, dir, new_dentry);
2211 mutex_unlock(&old_dentry->d_inode->i_mutex); 2217 mutex_unlock(&old_dentry->d_inode->i_mutex);
2212 if (!error) 2218 if (!error)
2213 fsnotify_create(dir, new_dentry->d_name.name); 2219 fsnotify_create(dir, new_dentry->d_name.name);
2214 return error; 2220 return error;
2215 } 2221 }
2216 2222
2217 /* 2223 /*
2218 * Hardlinks are often used in delicate situations. We avoid 2224 * Hardlinks are often used in delicate situations. We avoid
2219 * security-related surprises by not following symlinks on the 2225 * security-related surprises by not following symlinks on the
2220 * newname. --KAB 2226 * newname. --KAB
2221 * 2227 *
2222 * We don't follow them on the oldname either to be compatible 2228 * We don't follow them on the oldname either to be compatible
2223 * with linux 2.0, and to avoid hard-linking to directories 2229 * with linux 2.0, and to avoid hard-linking to directories
2224 * and other special files. --ADM 2230 * and other special files. --ADM
2225 */ 2231 */
2226 asmlinkage long sys_linkat(int olddfd, const char __user *oldname, 2232 asmlinkage long sys_linkat(int olddfd, const char __user *oldname,
2227 int newdfd, const char __user *newname, 2233 int newdfd, const char __user *newname,
2228 int flags) 2234 int flags)
2229 { 2235 {
2230 struct dentry *new_dentry; 2236 struct dentry *new_dentry;
2231 struct nameidata nd, old_nd; 2237 struct nameidata nd, old_nd;
2232 int error; 2238 int error;
2233 char * to; 2239 char * to;
2234 2240
2235 if (flags != 0) 2241 if (flags != 0)
2236 return -EINVAL; 2242 return -EINVAL;
2237 2243
2238 to = getname(newname); 2244 to = getname(newname);
2239 if (IS_ERR(to)) 2245 if (IS_ERR(to))
2240 return PTR_ERR(to); 2246 return PTR_ERR(to);
2241 2247
2242 error = __user_walk_fd(olddfd, oldname, 0, &old_nd); 2248 error = __user_walk_fd(olddfd, oldname, 0, &old_nd);
2243 if (error) 2249 if (error)
2244 goto exit; 2250 goto exit;
2245 error = do_path_lookup(newdfd, to, LOOKUP_PARENT, &nd); 2251 error = do_path_lookup(newdfd, to, LOOKUP_PARENT, &nd);
2246 if (error) 2252 if (error)
2247 goto out; 2253 goto out;
2248 error = -EXDEV; 2254 error = -EXDEV;
2249 if (old_nd.mnt != nd.mnt) 2255 if (old_nd.mnt != nd.mnt)
2250 goto out_release; 2256 goto out_release;
2251 new_dentry = lookup_create(&nd, 0); 2257 new_dentry = lookup_create(&nd, 0);
2252 error = PTR_ERR(new_dentry); 2258 error = PTR_ERR(new_dentry);
2253 if (!IS_ERR(new_dentry)) { 2259 if (!IS_ERR(new_dentry)) {
2254 error = vfs_link(old_nd.dentry, nd.dentry->d_inode, new_dentry); 2260 error = vfs_link(old_nd.dentry, nd.dentry->d_inode, new_dentry);
2255 dput(new_dentry); 2261 dput(new_dentry);
2256 } 2262 }
2257 mutex_unlock(&nd.dentry->d_inode->i_mutex); 2263 mutex_unlock(&nd.dentry->d_inode->i_mutex);
2258 out_release: 2264 out_release:
2259 path_release(&nd); 2265 path_release(&nd);
2260 out: 2266 out:
2261 path_release(&old_nd); 2267 path_release(&old_nd);
2262 exit: 2268 exit:
2263 putname(to); 2269 putname(to);
2264 2270
2265 return error; 2271 return error;
2266 } 2272 }
2267 2273
2268 asmlinkage long sys_link(const char __user *oldname, const char __user *newname) 2274 asmlinkage long sys_link(const char __user *oldname, const char __user *newname)
2269 { 2275 {
2270 return sys_linkat(AT_FDCWD, oldname, AT_FDCWD, newname, 0); 2276 return sys_linkat(AT_FDCWD, oldname, AT_FDCWD, newname, 0);
2271 } 2277 }
2272 2278
2273 /* 2279 /*
2274 * The worst of all namespace operations - renaming directory. "Perverted" 2280 * The worst of all namespace operations - renaming directory. "Perverted"
2275 * doesn't even start to describe it. Somebody in UCB had a heck of a trip... 2281 * doesn't even start to describe it. Somebody in UCB had a heck of a trip...
2276 * Problems: 2282 * Problems:
2277 * a) we can get into loop creation. Check is done in is_subdir(). 2283 * a) we can get into loop creation. Check is done in is_subdir().
2278 * b) race potential - two innocent renames can create a loop together. 2284 * b) race potential - two innocent renames can create a loop together.
2279 * That's where 4.4 screws up. Current fix: serialization on 2285 * That's where 4.4 screws up. Current fix: serialization on
2280 * sb->s_vfs_rename_mutex. We might be more accurate, but that's another 2286 * sb->s_vfs_rename_mutex. We might be more accurate, but that's another
2281 * story. 2287 * story.
2282 * c) we have to lock _three_ objects - parents and victim (if it exists). 2288 * c) we have to lock _three_ objects - parents and victim (if it exists).
2283 * And that - after we got ->i_mutex on parents (until then we don't know 2289 * And that - after we got ->i_mutex on parents (until then we don't know
2284 * whether the target exists). Solution: try to be smart with locking 2290 * whether the target exists). Solution: try to be smart with locking
2285 * order for inodes. We rely on the fact that tree topology may change 2291 * order for inodes. We rely on the fact that tree topology may change
2286 * only under ->s_vfs_rename_mutex _and_ that parent of the object we 2292 * only under ->s_vfs_rename_mutex _and_ that parent of the object we
2287 * move will be locked. Thus we can rank directories by the tree 2293 * move will be locked. Thus we can rank directories by the tree
2288 * (ancestors first) and rank all non-directories after them. 2294 * (ancestors first) and rank all non-directories after them.
2289 * That works since everybody except rename does "lock parent, lookup, 2295 * That works since everybody except rename does "lock parent, lookup,
2290 * lock child" and rename is under ->s_vfs_rename_mutex. 2296 * lock child" and rename is under ->s_vfs_rename_mutex.
2291 * HOWEVER, it relies on the assumption that any object with ->lookup() 2297 * HOWEVER, it relies on the assumption that any object with ->lookup()
2292 * has no more than 1 dentry. If "hybrid" objects will ever appear, 2298 * has no more than 1 dentry. If "hybrid" objects will ever appear,
2293 * we'd better make sure that there's no link(2) for them. 2299 * we'd better make sure that there's no link(2) for them.
2294 * d) some filesystems don't support opened-but-unlinked directories, 2300 * d) some filesystems don't support opened-but-unlinked directories,
2295 * either because of layout or because they are not ready to deal with 2301 * either because of layout or because they are not ready to deal with
2296 * all cases correctly. The latter will be fixed (taking this sort of 2302 * all cases correctly. The latter will be fixed (taking this sort of
2297 * stuff into VFS), but the former is not going away. Solution: the same 2303 * stuff into VFS), but the former is not going away. Solution: the same
2298 * trick as in rmdir(). 2304 * trick as in rmdir().
2299 * e) conversion from fhandle to dentry may come in the wrong moment - when 2305 * e) conversion from fhandle to dentry may come in the wrong moment - when
2300 * we are removing the target. Solution: we will have to grab ->i_mutex 2306 * we are removing the target. Solution: we will have to grab ->i_mutex
2301 * in the fhandle_to_dentry code. [FIXME - current nfsfh.c relies on 2307 * in the fhandle_to_dentry code. [FIXME - current nfsfh.c relies on
2302 * ->i_mutex on parents, which works but leads to some truly excessive 2308 * ->i_mutex on parents, which works but leads to some truly excessive
2303 * locking]. 2309 * locking].
2304 */ 2310 */
2305 static int vfs_rename_dir(struct inode *old_dir, struct dentry *old_dentry, 2311 static int vfs_rename_dir(struct inode *old_dir, struct dentry *old_dentry,
2306 struct inode *new_dir, struct dentry *new_dentry) 2312 struct inode *new_dir, struct dentry *new_dentry)
2307 { 2313 {
2308 int error = 0; 2314 int error = 0;
2309 struct inode *target; 2315 struct inode *target;
2310 2316
2311 /* 2317 /*
2312 * If we are going to change the parent - check write permissions, 2318 * If we are going to change the parent - check write permissions,
2313 * we'll need to flip '..'. 2319 * we'll need to flip '..'.
2314 */ 2320 */
2315 if (new_dir != old_dir) { 2321 if (new_dir != old_dir) {
2316 error = permission(old_dentry->d_inode, MAY_WRITE, NULL); 2322 error = permission(old_dentry->d_inode, MAY_WRITE, NULL);
2317 if (error) 2323 if (error)
2318 return error; 2324 return error;
2319 } 2325 }
2320 2326
2321 error = security_inode_rename(old_dir, old_dentry, new_dir, new_dentry); 2327 error = security_inode_rename(old_dir, old_dentry, new_dir, new_dentry);
2322 if (error) 2328 if (error)
2323 return error; 2329 return error;
2324 2330
2325 target = new_dentry->d_inode; 2331 target = new_dentry->d_inode;
2326 if (target) { 2332 if (target) {
2327 mutex_lock(&target->i_mutex); 2333 mutex_lock(&target->i_mutex);
2328 dentry_unhash(new_dentry); 2334 dentry_unhash(new_dentry);
2329 } 2335 }
2330 if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry)) 2336 if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry))
2331 error = -EBUSY; 2337 error = -EBUSY;
2332 else 2338 else
2333 error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry); 2339 error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry);
2334 if (target) { 2340 if (target) {
2335 if (!error) 2341 if (!error)
2336 target->i_flags |= S_DEAD; 2342 target->i_flags |= S_DEAD;
2337 mutex_unlock(&target->i_mutex); 2343 mutex_unlock(&target->i_mutex);
2338 if (d_unhashed(new_dentry)) 2344 if (d_unhashed(new_dentry))
2339 d_rehash(new_dentry); 2345 d_rehash(new_dentry);
2340 dput(new_dentry); 2346 dput(new_dentry);
2341 } 2347 }
2342 if (!error) 2348 if (!error)
2343 d_move(old_dentry,new_dentry); 2349 d_move(old_dentry,new_dentry);
2344 return error; 2350 return error;
2345 } 2351 }
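
The comment above vfs_rename_dir() ranks directories "ancestors first", and do_rename() later compares both dentries against the `trap` returned by lock_rename(). The following is a minimal userspace sketch of how such a trap could be found, assuming a toy tree in which the root is its own parent. `struct node`, `child_on_path()` and `find_trap()` are hypothetical names for illustration; this is not the kernel's lock_rename(), and it takes no locks.

```c
#include <assert.h>
#include <stddef.h>

/* Toy directory node; the root points to itself as parent (assumption). */
struct node {
	struct node *parent;
	const char *name;
};

/*
 * Walk up from 'desc'. If 'anc' is a proper ancestor of 'desc', return
 * the child of 'anc' that lies on the path down to 'desc', else NULL.
 */
static struct node *child_on_path(struct node *anc, struct node *desc)
{
	struct node *p;

	for (p = desc; p->parent != p; p = p->parent)
		if (p->parent == anc)
			return p;
	return NULL;
}

/*
 * Return the "trap" for two parent directories: the entry directly
 * below the ancestor directory that leads to the other directory.
 * NULL means the two are the same or unrelated by ancestry.
 */
static struct node *find_trap(struct node *d1, struct node *d2)
{
	struct node *t;

	assert(d1 != NULL && d2 != NULL);
	if (d1 == d2)
		return NULL;
	t = child_on_path(d1, d2);
	if (t)
		return t;
	return child_on_path(d2, d1);
}
```

In a tree root/a/b, find_trap(root, b) yields a. Renaming a into b would then be refused because the source equals the trap, which corresponds to the -EINVAL case in do_rename().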
2346 2352
2347 static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry, 2353 static int vfs_rename_other(struct inode *old_dir, struct dentry *old_dentry,
2348 struct inode *new_dir, struct dentry *new_dentry) 2354 struct inode *new_dir, struct dentry *new_dentry)
2349 { 2355 {
2350 struct inode *target; 2356 struct inode *target;
2351 int error; 2357 int error;
2352 2358
2353 error = security_inode_rename(old_dir, old_dentry, new_dir, new_dentry); 2359 error = security_inode_rename(old_dir, old_dentry, new_dir, new_dentry);
2354 if (error) 2360 if (error)
2355 return error; 2361 return error;
2356 2362
2357 dget(new_dentry); 2363 dget(new_dentry);
2358 target = new_dentry->d_inode; 2364 target = new_dentry->d_inode;
2359 if (target) 2365 if (target)
2360 mutex_lock(&target->i_mutex); 2366 mutex_lock(&target->i_mutex);
2361 if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry)) 2367 if (d_mountpoint(old_dentry)||d_mountpoint(new_dentry))
2362 error = -EBUSY; 2368 error = -EBUSY;
2363 else 2369 else
2364 error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry); 2370 error = old_dir->i_op->rename(old_dir, old_dentry, new_dir, new_dentry);
2365 if (!error) { 2371 if (!error) {
2366 /* The following d_move() should become unconditional */ 2372 /* The following d_move() should become unconditional */
2367 if (!(old_dir->i_sb->s_type->fs_flags & FS_ODD_RENAME)) 2373 if (!(old_dir->i_sb->s_type->fs_flags & FS_ODD_RENAME))
2368 d_move(old_dentry, new_dentry); 2374 d_move(old_dentry, new_dentry);
2369 } 2375 }
2370 if (target) 2376 if (target)
2371 mutex_unlock(&target->i_mutex); 2377 mutex_unlock(&target->i_mutex);
2372 dput(new_dentry); 2378 dput(new_dentry);
2373 return error; 2379 return error;
2374 } 2380 }
2375 2381
2376 int vfs_rename(struct inode *old_dir, struct dentry *old_dentry, 2382 int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
2377 struct inode *new_dir, struct dentry *new_dentry) 2383 struct inode *new_dir, struct dentry *new_dentry)
2378 { 2384 {
2379 int error; 2385 int error;
2380 int is_dir = S_ISDIR(old_dentry->d_inode->i_mode); 2386 int is_dir = S_ISDIR(old_dentry->d_inode->i_mode);
2381 const char *old_name; 2387 const char *old_name;
2382 2388
2383 if (old_dentry->d_inode == new_dentry->d_inode) 2389 if (old_dentry->d_inode == new_dentry->d_inode)
2384 return 0; 2390 return 0;
2385 2391
2386 error = may_delete(old_dir, old_dentry, is_dir); 2392 error = may_delete(old_dir, old_dentry, is_dir);
2387 if (error) 2393 if (error)
2388 return error; 2394 return error;
2389 2395
2390 if (!new_dentry->d_inode) 2396 if (!new_dentry->d_inode)
2391 error = may_create(new_dir, new_dentry, NULL); 2397 error = may_create(new_dir, new_dentry, NULL);
2392 else 2398 else
2393 error = may_delete(new_dir, new_dentry, is_dir); 2399 error = may_delete(new_dir, new_dentry, is_dir);
2394 if (error) 2400 if (error)
2395 return error; 2401 return error;
2396 2402
2397 if (!old_dir->i_op || !old_dir->i_op->rename) 2403 if (!old_dir->i_op || !old_dir->i_op->rename)
2398 return -EPERM; 2404 return -EPERM;
2399 2405
2400 DQUOT_INIT(old_dir); 2406 DQUOT_INIT(old_dir);
2401 DQUOT_INIT(new_dir); 2407 DQUOT_INIT(new_dir);
2402 2408
2403 old_name = fsnotify_oldname_init(old_dentry->d_name.name); 2409 old_name = fsnotify_oldname_init(old_dentry->d_name.name);
2404 2410
2405 if (is_dir) 2411 if (is_dir)
2406 error = vfs_rename_dir(old_dir,old_dentry,new_dir,new_dentry); 2412 error = vfs_rename_dir(old_dir,old_dentry,new_dir,new_dentry);
2407 else 2413 else
2408 error = vfs_rename_other(old_dir,old_dentry,new_dir,new_dentry); 2414 error = vfs_rename_other(old_dir,old_dentry,new_dir,new_dentry);
2409 if (!error) { 2415 if (!error) {
2410 const char *new_name = old_dentry->d_name.name; 2416 const char *new_name = old_dentry->d_name.name;
2411 fsnotify_move(old_dir, new_dir, old_name, new_name, is_dir, 2417 fsnotify_move(old_dir, new_dir, old_name, new_name, is_dir,
2412 new_dentry->d_inode, old_dentry->d_inode); 2418 new_dentry->d_inode, old_dentry->d_inode);
2413 } 2419 }
2414 fsnotify_oldname_free(old_name); 2420 fsnotify_oldname_free(old_name);
2415 2421
2416 return error; 2422 return error;
2417 } 2423 }
2418 2424
2419 static int do_rename(int olddfd, const char *oldname, 2425 static int do_rename(int olddfd, const char *oldname,
2420 int newdfd, const char *newname) 2426 int newdfd, const char *newname)
2421 { 2427 {
2422 int error = 0; 2428 int error = 0;
2423 struct dentry * old_dir, * new_dir; 2429 struct dentry * old_dir, * new_dir;
2424 struct dentry * old_dentry, *new_dentry; 2430 struct dentry * old_dentry, *new_dentry;
2425 struct dentry * trap; 2431 struct dentry * trap;
2426 struct nameidata oldnd, newnd; 2432 struct nameidata oldnd, newnd;
2427 2433
2428 error = do_path_lookup(olddfd, oldname, LOOKUP_PARENT, &oldnd); 2434 error = do_path_lookup(olddfd, oldname, LOOKUP_PARENT, &oldnd);
2429 if (error) 2435 if (error)
2430 goto exit; 2436 goto exit;
2431 2437
2432 error = do_path_lookup(newdfd, newname, LOOKUP_PARENT, &newnd); 2438 error = do_path_lookup(newdfd, newname, LOOKUP_PARENT, &newnd);
2433 if (error) 2439 if (error)
2434 goto exit1; 2440 goto exit1;
2435 2441
2436 error = -EXDEV; 2442 error = -EXDEV;
2437 if (oldnd.mnt != newnd.mnt) 2443 if (oldnd.mnt != newnd.mnt)
2438 goto exit2; 2444 goto exit2;
2439 2445
2440 old_dir = oldnd.dentry; 2446 old_dir = oldnd.dentry;
2441 error = -EBUSY; 2447 error = -EBUSY;
2442 if (oldnd.last_type != LAST_NORM) 2448 if (oldnd.last_type != LAST_NORM)
2443 goto exit2; 2449 goto exit2;
2444 2450
2445 new_dir = newnd.dentry; 2451 new_dir = newnd.dentry;
2446 if (newnd.last_type != LAST_NORM) 2452 if (newnd.last_type != LAST_NORM)
2447 goto exit2; 2453 goto exit2;
2448 2454
2449 trap = lock_rename(new_dir, old_dir); 2455 trap = lock_rename(new_dir, old_dir);
2450 2456
2451 old_dentry = lookup_hash(&oldnd); 2457 old_dentry = lookup_hash(&oldnd);
2452 error = PTR_ERR(old_dentry); 2458 error = PTR_ERR(old_dentry);
2453 if (IS_ERR(old_dentry)) 2459 if (IS_ERR(old_dentry))
2454 goto exit3; 2460 goto exit3;
2455 /* source must exist */ 2461 /* source must exist */
2456 error = -ENOENT; 2462 error = -ENOENT;
2457 if (!old_dentry->d_inode) 2463 if (!old_dentry->d_inode)
2458 goto exit4; 2464 goto exit4;
2459 /* unless the source is a directory trailing slashes give -ENOTDIR */ 2465 /* unless the source is a directory trailing slashes give -ENOTDIR */
2460 if (!S_ISDIR(old_dentry->d_inode->i_mode)) { 2466 if (!S_ISDIR(old_dentry->d_inode->i_mode)) {
2461 error = -ENOTDIR; 2467 error = -ENOTDIR;
2462 if (oldnd.last.name[oldnd.last.len]) 2468 if (oldnd.last.name[oldnd.last.len])
2463 goto exit4; 2469 goto exit4;
2464 if (newnd.last.name[newnd.last.len]) 2470 if (newnd.last.name[newnd.last.len])
2465 goto exit4; 2471 goto exit4;
2466 } 2472 }
2467 /* source should not be ancestor of target */ 2473 /* source should not be ancestor of target */
2468 error = -EINVAL; 2474 error = -EINVAL;
2469 if (old_dentry == trap) 2475 if (old_dentry == trap)
2470 goto exit4; 2476 goto exit4;
2471 new_dentry = lookup_hash(&newnd); 2477 new_dentry = lookup_hash(&newnd);
2472 error = PTR_ERR(new_dentry); 2478 error = PTR_ERR(new_dentry);
2473 if (IS_ERR(new_dentry)) 2479 if (IS_ERR(new_dentry))
2474 goto exit4; 2480 goto exit4;
2475 /* target should not be an ancestor of source */ 2481 /* target should not be an ancestor of source */
2476 error = -ENOTEMPTY; 2482 error = -ENOTEMPTY;
2477 if (new_dentry == trap) 2483 if (new_dentry == trap)
2478 goto exit5; 2484 goto exit5;
2479 2485
2480 error = vfs_rename(old_dir->d_inode, old_dentry, 2486 error = vfs_rename(old_dir->d_inode, old_dentry,
2481 new_dir->d_inode, new_dentry); 2487 new_dir->d_inode, new_dentry);
2482 exit5: 2488 exit5:
2483 dput(new_dentry); 2489 dput(new_dentry);
2484 exit4: 2490 exit4:
2485 dput(old_dentry); 2491 dput(old_dentry);
2486 exit3: 2492 exit3:
2487 unlock_rename(new_dir, old_dir); 2493 unlock_rename(new_dir, old_dir);
2488 exit2: 2494 exit2:
2489 path_release(&newnd); 2495 path_release(&newnd);
2490 exit1: 2496 exit1:
2491 path_release(&oldnd); 2497 path_release(&oldnd);
2492 exit: 2498 exit:
2493 return error; 2499 return error;
2494 } 2500 }
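
The trailing-slash test in do_rename() reads the byte just past the last path component (`last.name[last.len]`): a non-NUL byte there means something, in practice a slash, follows the name. A small userspace sketch of that idea, assuming a simplified last-component finder; `struct last_component`, `find_last()` and `has_trailing_slash()` are hypothetical helpers, not the kernel's path walker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Last path component: points into the original string, not NUL-terminated. */
struct last_component {
	const char *name;
	size_t len;
};

/* Locate the last component, ignoring any trailing slashes. */
static struct last_component find_last(const char *path)
{
	struct last_component last;
	size_t end = strlen(path);
	size_t start;

	while (end > 0 && path[end - 1] == '/')
		end--;
	start = end;
	while (start > 0 && path[start - 1] != '/')
		start--;
	last.name = path + start;
	last.len = end - start;
	return last;
}

/* Non-zero if anything follows the last component in the path string. */
static int has_trailing_slash(const char *path)
{
	struct last_component last;

	assert(path != NULL);
	last = find_last(path);
	return last.name[last.len] != '\0';
}
```

Because the component is a pointer plus a length into the caller's string, the check costs a single byte read and needs no copy of the name.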
2495 2501
2496 asmlinkage long sys_renameat(int olddfd, const char __user *oldname, 2502 asmlinkage long sys_renameat(int olddfd, const char __user *oldname,
2497 int newdfd, const char __user *newname) 2503 int newdfd, const char __user *newname)
2498 { 2504 {
2499 int error; 2505 int error;
2500 char * from; 2506 char * from;
2501 char * to; 2507 char * to;
2502 2508
2503 from = getname(oldname); 2509 from = getname(oldname);
2504 if(IS_ERR(from)) 2510 if(IS_ERR(from))
2505 return PTR_ERR(from); 2511 return PTR_ERR(from);
2506 to = getname(newname); 2512 to = getname(newname);
2507 error = PTR_ERR(to); 2513 error = PTR_ERR(to);
2508 if (!IS_ERR(to)) { 2514 if (!IS_ERR(to)) {
2509 error = do_rename(olddfd, from, newdfd, to); 2515 error = do_rename(olddfd, from, newdfd, to);
2510 putname(to); 2516 putname(to);
2511 } 2517 }
2512 putname(from); 2518 putname(from);
2513 return error; 2519 return error;
2514 } 2520 }
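
sys_renameat() and most functions in this file rely on the IS_ERR()/PTR_ERR() convention, in which a small negative errno is carried in a pointer value. The commit message describes what goes wrong when such a value is dereferenced without a check. Below is a userspace sketch of the convention; the lowercase names and the `SKETCH_MAX_ERRNO` cutoff are assumptions for illustration and are not the definitions from the kernel tree of this commit.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed cutoff: the top SKETCH_MAX_ERRNO addresses are treated as errors. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *err_ptr(long error)
{
	assert(error < 0 && error >= -SKETCH_MAX_ERRNO);
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer value falls in the reserved error range. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}
```

A caller that receives a pointer from such an API has to test is_err() before using it, which is exactly the kind of check the patch adds for intent.open.file.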
2515 2521
2516 asmlinkage long sys_rename(const char __user *oldname, const char __user *newname) 2522 asmlinkage long sys_rename(const char __user *oldname, const char __user *newname)
2517 { 2523 {
2518 return sys_renameat(AT_FDCWD, oldname, AT_FDCWD, newname); 2524 return sys_renameat(AT_FDCWD, oldname, AT_FDCWD, newname);
2519 } 2525 }
2520 2526
2521 int vfs_readlink(struct dentry *dentry, char __user *buffer, int buflen, const char *link) 2527 int vfs_readlink(struct dentry *dentry, char __user *buffer, int buflen, const char *link)
2522 { 2528 {
2523 int len; 2529 int len;
2524 2530
2525 len = PTR_ERR(link); 2531 len = PTR_ERR(link);
2526 if (IS_ERR(link)) 2532 if (IS_ERR(link))
2527 goto out; 2533 goto out;
2528 2534
2529 len = strlen(link); 2535 len = strlen(link);
2530 if (len > (unsigned) buflen) 2536 if (len > (unsigned) buflen)
2531 len = buflen; 2537 len = buflen;
2532 if (copy_to_user(buffer, link, len)) 2538 if (copy_to_user(buffer, link, len))
2533 len = -EFAULT; 2539 len = -EFAULT;
2534 out: 2540 out:
2535 return len; 2541 return len;
2536 } 2542 }
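
vfs_readlink() copies at most buflen bytes of the link target and returns the number of bytes copied, so the result is silently truncated and never NUL-terminated. A userspace sketch of the same length logic, assuming memcpy() in place of copy_to_user() and a non-negative buflen; `readlink_copy()` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy up to buflen bytes of 'link' into 'buffer' without adding a
 * terminating NUL. Returns the number of bytes copied.
 */
static int readlink_copy(char *buffer, int buflen, const char *link)
{
	size_t len;

	assert(buffer != NULL && link != NULL && buflen >= 0);
	len = strlen(link);
	if (len > (size_t)buflen)
		len = (size_t)buflen;
	memcpy(buffer, link, len);
	return (int)len;
}
```

Callers that want a C string have to append the terminator themselves, using the returned length.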
2537 2543
2538 /* 2544 /*
2539 * A helper for ->readlink(). This should be used *ONLY* for symlinks that 2545 * A helper for ->readlink(). This should be used *ONLY* for symlinks that
2540 * have ->follow_link() touching nd only in nd_set_link(). Using (or not 2546 * have ->follow_link() touching nd only in nd_set_link(). Using (or not
2541 * using) it for any given inode is up to filesystem. 2547 * using) it for any given inode is up to filesystem.
2542 */ 2548 */
2543 int generic_readlink(struct dentry *dentry, char __user *buffer, int buflen) 2549 int generic_readlink(struct dentry *dentry, char __user *buffer, int buflen)
2544 { 2550 {
2545 struct nameidata nd; 2551 struct nameidata nd;
2546 void *cookie; 2552 void *cookie;
2547 2553
2548 nd.depth = 0; 2554 nd.depth = 0;
2549 cookie = dentry->d_inode->i_op->follow_link(dentry, &nd); 2555 cookie = dentry->d_inode->i_op->follow_link(dentry, &nd);
2550 if (!IS_ERR(cookie)) { 2556 if (!IS_ERR(cookie)) {
2551 int res = vfs_readlink(dentry, buffer, buflen, nd_get_link(&nd)); 2557 int res = vfs_readlink(dentry, buffer, buflen, nd_get_link(&nd));
2552 if (dentry->d_inode->i_op->put_link) 2558 if (dentry->d_inode->i_op->put_link)
2553 dentry->d_inode->i_op->put_link(dentry, &nd, cookie); 2559 dentry->d_inode->i_op->put_link(dentry, &nd, cookie);
2554 cookie = ERR_PTR(res); 2560 cookie = ERR_PTR(res);
2555 } 2561 }
2556 return PTR_ERR(cookie); 2562 return PTR_ERR(cookie);
2557 } 2563 }
2558 2564
2559 int vfs_follow_link(struct nameidata *nd, const char *link) 2565 int vfs_follow_link(struct nameidata *nd, const char *link)
2560 { 2566 {
2561 return __vfs_follow_link(nd, link); 2567 return __vfs_follow_link(nd, link);
2562 } 2568 }
2563 2569
2564 /* get the link contents into pagecache */ 2570 /* get the link contents into pagecache */
2565 static char *page_getlink(struct dentry * dentry, struct page **ppage) 2571 static char *page_getlink(struct dentry * dentry, struct page **ppage)
2566 { 2572 {
2567 struct page * page; 2573 struct page * page;
2568 struct address_space *mapping = dentry->d_inode->i_mapping; 2574 struct address_space *mapping = dentry->d_inode->i_mapping;
2569 page = read_cache_page(mapping, 0, (filler_t *)mapping->a_ops->readpage, 2575 page = read_cache_page(mapping, 0, (filler_t *)mapping->a_ops->readpage,
2570 NULL); 2576 NULL);
2571 if (IS_ERR(page)) 2577 if (IS_ERR(page))
2572 goto sync_fail; 2578 goto sync_fail;
2573 wait_on_page_locked(page); 2579 wait_on_page_locked(page);
2574 if (!PageUptodate(page)) 2580 if (!PageUptodate(page))
2575 goto async_fail; 2581 goto async_fail;
2576 *ppage = page; 2582 *ppage = page;
2577 return kmap(page); 2583 return kmap(page);
2578 2584
2579 async_fail: 2585 async_fail:
2580 page_cache_release(page); 2586 page_cache_release(page);
2581 return ERR_PTR(-EIO); 2587 return ERR_PTR(-EIO);
2582 2588
2583 sync_fail: 2589 sync_fail:
2584 return (char*)page; 2590 return (char*)page;
2585 } 2591 }
2586 2592
2587 int page_readlink(struct dentry *dentry, char __user *buffer, int buflen) 2593 int page_readlink(struct dentry *dentry, char __user *buffer, int buflen)
2588 { 2594 {
2589 struct page *page = NULL; 2595 struct page *page = NULL;
2590 char *s = page_getlink(dentry, &page); 2596 char *s = page_getlink(dentry, &page);
2591 int res = vfs_readlink(dentry,buffer,buflen,s); 2597 int res = vfs_readlink(dentry,buffer,buflen,s);
2592 if (page) { 2598 if (page) {
2593 kunmap(page); 2599 kunmap(page);
2594 page_cache_release(page); 2600 page_cache_release(page);
2595 } 2601 }
2596 return res; 2602 return res;
2597 } 2603 }
2598 2604
2599 void *page_follow_link_light(struct dentry *dentry, struct nameidata *nd) 2605 void *page_follow_link_light(struct dentry *dentry, struct nameidata *nd)
2600 { 2606 {
2601 struct page *page = NULL; 2607 struct page *page = NULL;
2602 nd_set_link(nd, page_getlink(dentry, &page)); 2608 nd_set_link(nd, page_getlink(dentry, &page));
2603 return page; 2609 return page;
2604 } 2610 }
2605 2611
2606 void page_put_link(struct dentry *dentry, struct nameidata *nd, void *cookie) 2612 void page_put_link(struct dentry *dentry, struct nameidata *nd, void *cookie)
2607 { 2613 {
2608 struct page *page = cookie; 2614 struct page *page = cookie;
2609 2615
2610 if (page) { 2616 if (page) {
2611 kunmap(page); 2617 kunmap(page);
2612 page_cache_release(page); 2618 page_cache_release(page);
2613 } 2619 }
2614 } 2620 }
2615 2621
2616 int __page_symlink(struct inode *inode, const char *symname, int len, 2622 int __page_symlink(struct inode *inode, const char *symname, int len,
2617 gfp_t gfp_mask) 2623 gfp_t gfp_mask)
2618 { 2624 {
2619 struct address_space *mapping = inode->i_mapping; 2625 struct address_space *mapping = inode->i_mapping;
2620 struct page *page; 2626 struct page *page;
2621 int err = -ENOMEM; 2627 int err = -ENOMEM;
2622 char *kaddr; 2628 char *kaddr;
2623 2629
2624 page = find_or_create_page(mapping, 0, gfp_mask); 2630 page = find_or_create_page(mapping, 0, gfp_mask);
2625 if (!page) 2631 if (!page)
2626 goto fail; 2632 goto fail;
2627 err = mapping->a_ops->prepare_write(NULL, page, 0, len-1); 2633 err = mapping->a_ops->prepare_write(NULL, page, 0, len-1);
2628 if (err) 2634 if (err)
2629 goto fail_map; 2635 goto fail_map;
2630 kaddr = kmap_atomic(page, KM_USER0); 2636 kaddr = kmap_atomic(page, KM_USER0);
2631 memcpy(kaddr, symname, len-1); 2637 memcpy(kaddr, symname, len-1);
2632 kunmap_atomic(kaddr, KM_USER0); 2638 kunmap_atomic(kaddr, KM_USER0);
2633 mapping->a_ops->commit_write(NULL, page, 0, len-1); 2639 mapping->a_ops->commit_write(NULL, page, 0, len-1);
2634 /* 2640 /*
2635 * Notice that we are _not_ going to block here - end of page is 2641 * Notice that we are _not_ going to block here - end of page is
2636 * unmapped, so this will only try to map the rest of page, see 2642 * unmapped, so this will only try to map the rest of page, see
2637 * that it is unmapped (typically even will not look into inode - 2643 * that it is unmapped (typically even will not look into inode -
2638 * ->i_size will be enough for everything) and zero it out. 2644 * ->i_size will be enough for everything) and zero it out.
2639 * OTOH it's obviously correct and should make the page up-to-date. 2645 * OTOH it's obviously correct and should make the page up-to-date.
2640 */ 2646 */
2641 if (!PageUptodate(page)) { 2647 if (!PageUptodate(page)) {
2642 err = mapping->a_ops->readpage(NULL, page); 2648 err = mapping->a_ops->readpage(NULL, page);
2643 wait_on_page_locked(page); 2649 wait_on_page_locked(page);
2644 } else { 2650 } else {
2645 unlock_page(page); 2651 unlock_page(page);
2646 } 2652 }
2647 page_cache_release(page); 2653 page_cache_release(page);
2648 if (err < 0) 2654 if (err < 0)
2649 goto fail; 2655 goto fail;
2650 mark_inode_dirty(inode); 2656 mark_inode_dirty(inode);
2651 return 0; 2657 return 0;
2652 fail_map: 2658 fail_map:
2653 unlock_page(page); 2659 unlock_page(page);
2654 page_cache_release(page); 2660 page_cache_release(page);
2655 fail: 2661 fail:
2656 return err; 2662 return err;
2657 } 2663 }
2658 2664
2659 int page_symlink(struct inode *inode, const char *symname, int len) 2665 int page_symlink(struct inode *inode, const char *symname, int len)
2660 { 2666 {
2661 return __page_symlink(inode, symname, len, 2667 return __page_symlink(inode, symname, len,
2662 mapping_gfp_mask(inode->i_mapping)); 2668 mapping_gfp_mask(inode->i_mapping));
2663 } 2669 }
2664 2670
2665 struct inode_operations page_symlink_inode_operations = { 2671 struct inode_operations page_symlink_inode_operations = {
2666 .readlink = generic_readlink, 2672 .readlink = generic_readlink,
2667 .follow_link = page_follow_link_light, 2673 .follow_link = page_follow_link_light,
2668 .put_link = page_put_link, 2674 .put_link = page_put_link,
2669 }; 2675 };
2670 2676
2671 EXPORT_SYMBOL(__user_walk); 2677 EXPORT_SYMBOL(__user_walk);
2672 EXPORT_SYMBOL(__user_walk_fd); 2678 EXPORT_SYMBOL(__user_walk_fd);
2673 EXPORT_SYMBOL(follow_down); 2679 EXPORT_SYMBOL(follow_down);
2674 EXPORT_SYMBOL(follow_up); 2680 EXPORT_SYMBOL(follow_up);
2675 EXPORT_SYMBOL(get_write_access); /* binfmt_aout */ 2681 EXPORT_SYMBOL(get_write_access); /* binfmt_aout */
2676 EXPORT_SYMBOL(getname); 2682 EXPORT_SYMBOL(getname);
2677 EXPORT_SYMBOL(lock_rename); 2683 EXPORT_SYMBOL(lock_rename);
2678 EXPORT_SYMBOL(lookup_hash); 2684 EXPORT_SYMBOL(lookup_hash);
2679 EXPORT_SYMBOL(lookup_one_len); 2685 EXPORT_SYMBOL(lookup_one_len);
2680 EXPORT_SYMBOL(page_follow_link_light); 2686 EXPORT_SYMBOL(page_follow_link_light);
2681 EXPORT_SYMBOL(page_put_link); 2687 EXPORT_SYMBOL(page_put_link);
2682 EXPORT_SYMBOL(page_readlink); 2688 EXPORT_SYMBOL(page_readlink);
2683 EXPORT_SYMBOL(__page_symlink); 2689 EXPORT_SYMBOL(__page_symlink);
2684 EXPORT_SYMBOL(page_symlink); 2690 EXPORT_SYMBOL(page_symlink);
2685 EXPORT_SYMBOL(page_symlink_inode_operations); 2691 EXPORT_SYMBOL(page_symlink_inode_operations);
2686 EXPORT_SYMBOL(path_lookup); 2692 EXPORT_SYMBOL(path_lookup);
2687 EXPORT_SYMBOL(path_release); 2693 EXPORT_SYMBOL(path_release);
2688 EXPORT_SYMBOL(path_walk); 2694 EXPORT_SYMBOL(path_walk);
2689 EXPORT_SYMBOL(permission); 2695 EXPORT_SYMBOL(permission);
2690 EXPORT_SYMBOL(vfs_permission); 2696 EXPORT_SYMBOL(vfs_permission);
2691 EXPORT_SYMBOL(file_permission); 2697 EXPORT_SYMBOL(file_permission);
2692 EXPORT_SYMBOL(unlock_rename); 2698 EXPORT_SYMBOL(unlock_rename);
2693 EXPORT_SYMBOL(vfs_create); 2699 EXPORT_SYMBOL(vfs_create);
2694 EXPORT_SYMBOL(vfs_follow_link); 2700 EXPORT_SYMBOL(vfs_follow_link);
2695 EXPORT_SYMBOL(vfs_link); 2701 EXPORT_SYMBOL(vfs_link);
2696 EXPORT_SYMBOL(vfs_mkdir); 2702 EXPORT_SYMBOL(vfs_mkdir);
2697 EXPORT_SYMBOL(vfs_mknod); 2703 EXPORT_SYMBOL(vfs_mknod);
2698 EXPORT_SYMBOL(generic_permission); 2704 EXPORT_SYMBOL(generic_permission);
2699 EXPORT_SYMBOL(vfs_readlink); 2705 EXPORT_SYMBOL(vfs_readlink);
2700 EXPORT_SYMBOL(vfs_rename); 2706 EXPORT_SYMBOL(vfs_rename);
2701 EXPORT_SYMBOL(vfs_rmdir); 2707 EXPORT_SYMBOL(vfs_rmdir);
2702 EXPORT_SYMBOL(vfs_symlink); 2708 EXPORT_SYMBOL(vfs_symlink);
2703 EXPORT_SYMBOL(vfs_unlink); 2709 EXPORT_SYMBOL(vfs_unlink);
2704 EXPORT_SYMBOL(dentry_unhash); 2710 EXPORT_SYMBOL(dentry_unhash);
2705 EXPORT_SYMBOL(generic_readlink); 2711 EXPORT_SYMBOL(generic_readlink);
2706 2712