Commit 96391e2bae0f8882b6f44809202a68be66e91dce

Authored by Boaz Harrosh
1 parent 86093aaff5

exofs: Error recovery if object is missing from storage

If an object is referenced by a directory but does not
exist on a target, it is a very serious corruption that
can have one of two causes:
1. A power failure, with a very slim chance of it
  happening, because the directory update is always submitted
  well after object creation. But if a directory is written
  to one device and the object creation to another, it might
  theoretically happen.
2. Bugs that cause file corruption. This is the only way it
  ever happened to me, while developing. Crashes could also
  cause it, but they are more like case 1.

Either way the object does not exist, so its data is surely lost.
If there is a mix-up in the obj-id or data-map, then lost objects
can be salvaged by an off-line fsck. The only recoverable information
is the directory name. By letting it appear as a regular empty file,
with date==0 (1970 Jan 1st) and ownership by root, we enable recovery
of the only useful information, and also enable deletion or overwrite.
I can't see how this can hurt.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>

Showing 1 changed file with 11 additions and 1 deletion

@@ -903,8 +903,18 @@
 	ios->in_attr_len = ARRAY_SIZE(attrs);
 
 	ret = exofs_sbi_read(ios);
-	if (ret)
+	if (unlikely(ret)) {
+		EXOFS_ERR("object(0x%llx) corrupted, return empty file=>%d\n",
+			  _LLU(ios->obj.id), ret);
+		memset(inode, 0, sizeof(*inode));
+		inode->i_mode = 0040000 | (0777 & ~022);
+		/* If object is lost on target we might as well enable it's
+		 * delete.
+		 */
+		if ((ret == -ENOENT) || (ret == -EINVAL))
+			ret = 0;
 		goto out;
+	}
 
 	ret = extract_attr_from_ios(ios, &attrs[0]);
 	if (ret) {