11 Oct, 2016

2 commits

  • Pull more vfs updates from Al Viro:
    "->rename2() work from Miklos + current_time() from Deepa"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fs: Replace current_fs_time() with current_time()
    fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
    fs: Replace CURRENT_TIME with current_time() for inode timestamps
    fs: proc: Delete inode time initializations in proc_alloc_inode()
    vfs: Add current_time() api
    vfs: add note about i_op->rename changes to porting
    fs: rename "rename2" i_op to "rename"
    vfs: remove unused i_op->rename
    fs: make remaining filesystems use .rename2
    libfs: support RENAME_NOREPLACE in simple_rename()
    fs: support RENAME_NOREPLACE for local filesystems
    ncpfs: fix unused variable warning

    Linus Torvalds
     
  • Al Viro
     

28 Sep, 2016

1 commit

  • current_fs_time() uses struct super_block* as an argument.
    As per Linus's suggestion, this is changed to take struct
    inode* as a parameter instead. This is because the function
    is primarily meant for vfs inode timestamps.
    Also the function was renamed as per Arnd's suggestion.

    Change all calls to current_fs_time() to use the new
    current_time() function instead. current_fs_time() will be
    deleted.

    Signed-off-by: Deepa Dinamani
    Signed-off-by: Al Viro

    Deepa Dinamani
     

27 Sep, 2016

2 commits

  • Generated patch:

    sed -i "s/\.rename2\t/\.rename\t\t/" `git grep -wl rename2`
    sed -i "s/\brename2\b/rename/g" `git grep -wl rename2`

    Signed-off-by: Miklos Szeredi

    Miklos Szeredi
     
  • This is trivial to do:

    - add flags argument to foo_rename()
    - check if flags doesn't have any other than RENAME_NOREPLACE
    - assign foo_rename() to .rename2 instead of .rename

    Filesystems converted:

    affs, bfs, exofs, ext2, hfs, hfsplus, jffs2, jfs, logfs, minix, msdos,
    nilfs2, omfs, reiserfs, sysvfs, ubifs, udf, ufs, vfat.
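
    The three steps above can be sketched in user space roughly as follows.
    This is only an illustration: struct sketch_dentry, struct
    sketch_inode_operations and the SKETCH_* constants are hypothetical
    stand-ins for the kernel types and flags, and the handler body is a
    placeholder for whatever the filesystem's existing rename logic does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Local stand-ins for the kernel's rename flags (values follow the uapi
 * definitions; redefined here so the sketch builds on its own). */
#define SKETCH_RENAME_NOREPLACE (1u << 0)
#define SKETCH_RENAME_EXCHANGE  (1u << 1)

/* Hypothetical, heavily reduced directory entry. */
struct sketch_dentry {
    const char *name;
};

/* Step 1: the handler takes an extra flags argument. */
static int foo_rename(struct sketch_dentry *old_dentry,
                      struct sketch_dentry *new_dentry,
                      unsigned int flags)
{
    assert(old_dentry != NULL && new_dentry != NULL);

    /* Step 2: refuse any flag other than RENAME_NOREPLACE. */
    if (flags & ~SKETCH_RENAME_NOREPLACE)
        return -EINVAL;

    /* Placeholder for the pre-existing rename logic, which is assumed
     * to run unchanged from here on. */
    new_dentry->name = old_dentry->name;
    old_dentry->name = NULL;
    return 0;
}

/* Hypothetical, reduced operations table. */
struct sketch_inode_operations {
    int (*rename2)(struct sketch_dentry *, struct sketch_dentry *,
                   unsigned int);
};

/* Step 3: hook the handler up as .rename2 rather than .rename. */
static const struct sketch_inode_operations foo_dir_inode_operations = {
    .rename2 = foo_rename,
};
```

    Calling the handler with flags of 0 or SKETCH_RENAME_NOREPLACE succeeds,
    while any other bit (for example SKETCH_RENAME_EXCHANGE) yields -EINVAL.
    The sketch assumes the handler needs no extra logic of its own for
    RENAME_NOREPLACE, as the "trivial" wording suggests.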

    Signed-off-by: Miklos Szeredi
    Acked-by: Boaz Harrosh
    Acked-by: Richard Weinberger
    Acked-by: Bob Copeland
    Acked-by: Jan Kara
    Cc: Theodore Ts'o
    Cc: Jaegeuk Kim
    Cc: OGAWA Hirofumi
    Cc: Mikulas Patocka
    Cc: David Woodhouse
    Cc: Dave Kleikamp
    Cc: Ryusuke Konishi
    Cc: Christoph Hellwig

    Miklos Szeredi
     

22 Sep, 2016

1 commit

  • inode_change_ok() will be responsible for clearing capabilities and IMA
    extended attributes and as such will need a dentry. Give it as an argument
    to inode_change_ok() instead of an inode. Also rename inode_change_ok()
    to setattr_prepare() to better reflect that it also does some
    modifications in addition to checks.

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Jan Kara
     

19 Sep, 2016

1 commit


07 Sep, 2016

1 commit


27 Jul, 2016

2 commits

  • Pull block driver updates from Jens Axboe:
    "This branch also contains core changes. I've come to the conclusion
    that from 4.9 and forward, I'll be doing just a single branch. We
    often have dependencies between core and drivers, and it's hard to
    always split them up appropriately without pulling core into drivers
    when that happens.

    That said, this contains:

    - separate secure erase type for the core block layer, from
    Christoph.

    - set of discard fixes, from Christoph.

    - bio shrinking fixes from Christoph, as a follow-up to the
    op/flags change in the core branch.

    - map and append request fixes from Christoph.

    - NVMeF (NVMe over Fabrics) code from Christoph. This is pretty
    exciting!

    - nvme-loop fixes from Arnd.

    - removal of ->driverfs_dev from Dan, after providing a
    device_add_disk() helper.

    - bcache fixes from Bhaktipriya and Yijing.

    - cdrom subchannel read fix from Vchannaiah.

    - set of lightnvm updates from Wenwei, Matias, Johannes, and Javier.

    - set of drbd updates and fixes from Fabian, Lars, and Philipp.

    - mg_disk error path fix from Bart.

    - user notification for failed device add for loop, from Minfei.

    - NVMe in general:
    + NVMe delay quirk from Guilherme.
    + SR-IOV support and command retry limits from Keith.
    + fix for memory-less NUMA node from Masayoshi.
    + use UINT_MAX for discard sectors, from Minfei.
    + cancel IO fixes from Ming.
    + don't allocate unused major, from Neil.
    + error code fixup from Dan.
    + use constants for PSDT/FUSE from James.
    + variable init fix from Jay.
    + fabrics fixes from Ming, Sagi, and Wei.
    + various fixes"

    * 'for-4.8/drivers' of git://git.kernel.dk/linux-block: (115 commits)
    nvme/pci: Provide SR-IOV support
    nvme: initialize variable before logical OR'ing it
    block: unexport various bio mapping helpers
    scsi/osd: open code blk_make_request
    target: stop using blk_make_request
    block: simplify and export blk_rq_append_bio
    block: ensure bios return from blk_get_request are properly initialized
    virtio_blk: use blk_rq_map_kern
    memstick: don't allow REQ_TYPE_BLOCK_PC requests
    block: shrink bio size again
    block: simplify and cleanup bvec pool handling
    block: get rid of bio_rw and READA
    block: don't ignore -EOPNOTSUPP blkdev_issue_write_same
    block: introduce BLKDEV_DISCARD_ZERO to fix zeroout
    NVMe: don't allocate unused nvme_major
    nvme: avoid crashes when node 0 is memoryless node.
    nvme: Limit command retries
    loop: Make user notify for adding loop device failed
    nvme-loop: fix nvme-loop Kconfig dependencies
    nvmet: fix return value check in nvmet_subsys_alloc()
    ...

    Linus Torvalds
     
  • Pull core block updates from Jens Axboe:

    - the big change is the cleanup from Mike Christie, cleaning up our
    uses of command types and modified flags. This is what will throw
    some merge conflicts

    - regression fix for the above for btrfs, from Vincent

    - following up to the above, better packing of struct request from
    Christoph

    - a 2038 fix for blktrace from Arnd

    - a few trivial/spelling fixes from Bart Van Assche

    - a front merge check fix from Damien, which could cause issues on
    SMR drives

    - Atari partition fix from Gabriel

    - convert cfq to highres timers, since jiffies isn't granular enough
    for some devices these days. From Jan and Jeff

    - CFQ priority boost fix for idle classes, from me

    - cleanup series from Ming, improving our bio/bvec iteration

    - a direct issue fix for blk-mq from Omar

    - fix for plug merging not involving the IO scheduler, like we do for
    other types of merges. From Tahsin

    - expose DAX type internally and through sysfs. From Toshi and Yigal

    * 'for-4.8/core' of git://git.kernel.dk/linux-block: (76 commits)
    block: Fix front merge check
    block: do not merge requests without consulting with io scheduler
    block: Fix spelling in a source code comment
    block: expose QUEUE_FLAG_DAX in sysfs
    block: add QUEUE_FLAG_DAX for devices to advertise their DAX support
    Btrfs: fix comparison in __btrfs_map_block()
    block: atari: Return early for unsupported sector size
    Doc: block: Fix a typo in queue-sysfs.txt
    cfq-iosched: Charge at least 1 jiffie instead of 1 ns
    cfq-iosched: Fix regression in bonnie++ rewrite performance
    cfq-iosched: Convert slice_resid from u64 to s64
    block: Convert fifo_time from ulong to u64
    blktrace: avoid using timespec
    block/blk-cgroup.c: Declare local symbols static
    block/bio-integrity.c: Add #include "blk.h"
    block/partition-generic.c: Remove a set-but-not-used variable
    block: bio: kill BIO_MAX_SIZE
    cfq-iosched: temporarily boost queue priority for idle classes
    block: drbd: avoid to use BIO_MAX_SIZE
    block: bio: remove BIO_MAX_SECTORS
    ...

    Linus Torvalds
     

21 Jul, 2016

1 commit

  • These two are confusing leftover of the old world order, combining
    values of the REQ_OP_ and REQ_ namespaces. For callers that don't
    special case we mostly just replace bi_rw with bio_data_dir or
    op_is_write, except for the few cases where a switch over the REQ_OP_
    values makes more sense. Any check for READA is replaced with an
    explicit check for REQ_RAHEAD. Also remove the READA alias for
    REQ_RAHEAD.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Mike Christie
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

08 Jun, 2016

1 commit


19 May, 2016

3 commits

  • UDF/OSTA terminology is confusing. Partition Numbers (PNs) are arbitrary
    16-bit values, one for each physical partition in the volume. Partition
    Reference Numbers (PRNs) are indices into the Partition Map Table
    and do not necessarily equal the PN of the mapped partition.

    The current metadata code mistakenly uses the PN instead of the PRN when
    mapping metadata blocks to physical/sparable blocks. Windows-created
    UDF 2.5 discs for some reason use large, arbitrary PNs, resulting in
    mount failure and KASAN read warnings in udf_read_inode().

    For example, a NetBSD UDF 2.5 partition might look like this:

    PRN PN Type
    --- -- ----
      0  0 Sparable
      1  0 Metadata

    Since PRN == PN, we are fine.

    But Windows could give us:

    PRN   PN Type
    --- ---- ----
      0 8192 Sparable
      1 8192 Metadata

    So udf_read_inode() will start out by checking the partition length in
    sbi->s_partmaps[8192], which is obviously out of bounds.

    Fix this by creating a new field (s_phys_partition_ref) in struct
    udf_meta_data, referencing whatever physical or sparable map has the
    same partition number as the metadata partition.

    [JK: Add comment about s_phys_partition_ref, change its name]

    Signed-off-by: Alden Tondettar
    Signed-off-by: Jan Kara

    Alden Tondettar
     
  • Currently when udf_get_pblock_meta25() fails to map a block using the
    primary metadata file, it will attempt to load the mirror file entry by
    calling udf_find_metadata_inode_efe(). That function will return an ERR_PTR
    if it fails, but the return value is only checked against NULL. Test the
    return value using IS_ERR() and change it to NULL if needed.
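
    A user-space sketch of the pattern follows. The sketch_err_ptr() and
    sketch_is_err() helpers imitate the kernel's ERR_PTR() and IS_ERR()
    under the assumption that error codes occupy the top 4095 pointer
    values; find_mirror() and load_mirror() are hypothetical names used
    only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *sketch_err_ptr(long error)
{
    assert(error < 0 && error >= -SKETCH_MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* True if ptr carries an encoded errno rather than a real address. */
static inline int sketch_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)(-SKETCH_MAX_ERRNO);
}

struct sketch_inode {
    int ino;
};

static struct sketch_inode mirror_inode = { 42 };

/* Hypothetical lookup: reports failure as an error pointer, never NULL. */
static struct sketch_inode *find_mirror(int readable)
{
    if (!readable)
        return sketch_err_ptr(-EIO);
    return &mirror_inode;
}

/* Normalise the error pointer to NULL so later "if (!inode)" checks work. */
static struct sketch_inode *load_mirror(int readable)
{
    struct sketch_inode *inode = find_mirror(readable);

    if (sketch_is_err(inode))
        inode = NULL;
    return inode;
}
```

    A plain NULL test on the result of find_mirror(0) would pass, because
    the error pointer is non-NULL; load_mirror(0) returns NULL as callers
    expect.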

    Signed-off-by: Alden Tondettar
    Signed-off-by: Jan Kara

    Alden Tondettar
     
  • Currently, if a metadata partition map is missing its partition descriptor,
    then udf_get_pblock_meta25() will BUG() out the first time it is called.
    This is rather drastic for a corrupted filesystem, so just treat this case
    as an invalid mapping instead.

    Signed-off-by: Alden Tondettar
    Signed-off-by: Jan Kara

    Alden Tondettar
     

18 May, 2016

2 commits


17 May, 2016

1 commit


09 May, 2016

1 commit


03 May, 2016

1 commit


02 May, 2016

3 commits


28 Apr, 2016

1 commit


26 Apr, 2016

1 commit

  • Presently, a corrupted or malicious UDF filesystem containing a very large
    number (or cycle) of Logical Volume Integrity Descriptor extent
    indirections may trigger a stack overflow and kernel panic in
    udf_load_logicalvolint() on mount.

    Replace the unnecessary recursion in udf_load_logicalvolint() with
    simple iteration. Set an arbitrary limit of 1000 indirections (which would
    have almost certainly overflowed the stack without this fix), and treat
    such cases as if there were no LVID.
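
    The shape of the change can be sketched in user space as below. The
    extent chain is modelled as an array with "next" indices, which is an
    assumption made purely for illustration; struct sketch_extent,
    sketch_load_lvid() and the constant name are hypothetical, and the
    real function reads tagged blocks from disk instead.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_LVID_NESTING 1000

/* Hypothetical model of one integrity descriptor extent. */
struct sketch_extent {
    int has_lvid;   /* nonzero if a valid LVID was read from this extent */
    int next;       /* index of the next extent, or -1 for none */
};

/*
 * Follow the chain iteratively. Returns the index of the last extent
 * holding a valid LVID, or -1 if there is none or if the chain is longer
 * than the limit (which also covers cycles).
 */
static int sketch_load_lvid(const struct sketch_extent *extents,
                            size_t count, int start)
{
    int found = -1;
    int cur = start;
    int hops = 0;

    assert(extents != NULL || count == 0);

    while (cur >= 0 && (size_t)cur < count) {
        if (++hops > SKETCH_MAX_LVID_NESTING)
            return -1;          /* too many indirections: act as if no LVID */
        if (!extents[cur].has_lvid)
            break;
        found = cur;
        cur = extents[cur].next;
    }
    return found;
}
```

    Stack usage stays constant no matter how long the chain is, and a
    two-extent cycle ends after 1000 hops instead of recursing forever.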

    Signed-off-by: Alden Tondettar
    Signed-off-by: Jan Kara

    Alden Tondettar
     

25 Apr, 2016

1 commit

  • Commit 9293fcfbc1812a22ad5ce1b542eb90c1bbe01be1
    ("udf: Remove struct ustr as non-needed intermediate storage"),
    while getting rid of 'struct ustr', does not take any special care
    of 'dstring' fields and effectively uses the fixed field length instead
    of the actual string length, encoded in the last byte of the field.

    Also, commit 484a10f49387e4386bf2708532e75bf78ffea2cb
    ("udf: Merge linux specific translation into CS0 conversion function")
    introduced checking of the length of the string being converted,
    requiring proper alignment to the number of bytes constituting each
    character.

    The UDF volume identifier is represented as a 32-byte 'dstring'
    and needs to be converted from CS0 to UTF8 while mounting a UDF
    filesystem. The changes in the mentioned commits can in some cases
    lead to incorrect handling of the volume identifier:
    - if the actual string in 'dstring' is of maximal length and
      has no zero bytes separating it from the dstring encoded
      length in the last byte, that last byte may be included in the
      conversion, producing an incorrect resulting string;
    - if the identifier is encoded with 2-byte characters (compression
      code is 16), the length of 31 bytes (32 bytes of field length minus
      1 byte of compression code), taken as the string length, is reported
      as an incorrect (unaligned) length, and the conversion fails, which
      in turn leads to volume mounting failure.

    This patch introduces handling of 'dstring' encoded length field
    in udf_CS0toUTF8 function, that is used in all and only cases
    when 'dstring' fields are converted. Currently these cases are
    processing of Volume Identifier and Volume Set Identifier fields.
    The function is also renamed to udf_dstrCS0toUTF8 to distinctly
    indicate that it handles 'dstring' input.
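
    The length handling can be sketched as follows. This is a user-space
    illustration, not the kernel function: sketch_dstring_len() is a
    hypothetical helper, and it assumes that the length stored in the last
    byte of the field also counts the leading compression code byte.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the number of bytes actually used in a 'dstring' field
 * (compression code byte included), taken from the field's last byte.
 * Returns -EINVAL if the stored length cannot fit in the field.
 */
static int sketch_dstring_len(const uint8_t *field, int field_len)
{
    int s_len;

    assert(field_len >= 0);
    if (field_len == 0)
        return 0;
    assert(field != NULL);

    s_len = field[field_len - 1];
    if (s_len >= field_len)
        return -EINVAL;     /* would overlap the length byte itself */
    return s_len;
}
```

    For a 32-byte field holding fifteen 2-byte characters the stored length
    is 31, so the payload after the compression code is 30 bytes and stays
    aligned to the character size. The length byte itself is never passed
    on to the conversion.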

    Signed-off-by: Andrew Gabbasov
    Signed-off-by: Jan Kara

    Andrew Gabbasov
     

11 Apr, 2016

1 commit


05 Apr, 2016

1 commit

  • PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
    ago with the promise that one day it would be possible to implement page
    cache with bigger chunks than PAGE_SIZE.

    This promise never materialized. And it is unlikely it ever will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE. And it's a constant source of confusion on whether
    PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
    especially on the border between fs and mm.

    Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straight-forward:

    - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();

    This patch contains automated changes generated with coccinelle using
    script below. For some reason, coccinelle doesn't patch header files.
    I've called spatch for them manually.

    The only adjustment after coccinelle is a revert of the changes to the
    PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code that coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation will
    also be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

09 Feb, 2016

6 commits

  • The current implementation of the udf_translate_to_linux function does
    not support multi-byte characters at all: it counts bytes while
    calculating extension length, and when inserting the CRC inside the name
    it doesn't take into account inter-character boundaries and can break
    into the middle of a character.

    The most efficient way to properly support multi-byte characters is
    merging the translation operations directly into the conversion function.
    This can help to avoid extra passes along the string or parsing
    the multi-byte character back into unicode to find out its length.

    Signed-off-by: Andrew Gabbasov
    Signed-off-by: Jan Kara

    Andrew Gabbasov
     
  • Although 'struct ustr' tries to structure the data by combining
    the string and its length, it doesn't actually provide much benefit,
    since it saves only one parameter but introduces an extra copy
    of the whole buffer, serving as intermediate storage. It looks
    quite inefficient and is not actually needed.

    This commit gets rid of struct ustr by changing the parameters
    of some functions appropriately.

    Also, it removes the use of the 'dstring' type, since it doesn't make
    much sense either.

    While at it, add a 'const' qualifier to udf_get_filename
    to make the parameter sets consistent.

    Signed-off-by: Andrew Gabbasov
    Signed-off-by: Jan Kara

    Andrew Gabbasov
     
  • Code in udf_find_entry() and udf_readdir() used the same buffer for
    storing filename that was split among blocks and for the resulting
    filename in utf8. This worked because udf_get_filename() first
    internally copied the name into a different buffer and only then
    performed a conversion into the destination buffer. However we want to
    get rid of intermediate buffers so use separate buffer for converted
    name and name split between blocks so that we don't have the same source
    and destination buffer when converting split names.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • Actual name length restriction is 254 bytes, this is used in 'ustr'
    structure, and this is what fits into UDF File Ident structures.
    And in most cases the constant is used as UDF_NAME_LEN-2.
    So, it's better to just modify the constant to make it closer
    to reality.

    Also, in some cases it's useful to have a separate constant for
    the maximum length of file name field in CS0 encoding in UDF File
    Ident structures.

    Also, remove the unused UDF_PATH_LEN constant.

    Signed-off-by: Andrew Gabbasov
    Signed-off-by: Jan Kara

    Andrew Gabbasov
     
  • There is not much sense in having separate functions for UTF8 and
    NLS conversions, since UTF8 encoding is actually a special case
    of NLS.

    However, although UTF8 is also supported by general NLS framework,
    it would be good to have separate UTF8 character conversion functions
    (char2uni and uni2char) locally in UDF code, so that they could be
    used even if NLS support is not enabled in the kernel configuration.

    Signed-off-by: Andrew Gabbasov
    Signed-off-by: Jan Kara

    Andrew Gabbasov
     
  • Make the desired output length a parameter rather than have it
    hard-coded to UDF_NAME_LEN. Although all call sites still have
    this length the same, this parameterization will make the function
    more universal and also consistent with udf_get_filename.

    Signed-off-by: Andrew Gabbasov
    Signed-off-by: Jan Kara

    Andrew Gabbasov
     

24 Jan, 2016

1 commit

  • Pull final vfs updates from Al Viro:

    - The ->i_mutex wrappers (with small prereq in lustre)

    - a fix for too early freeing of symlink bodies on shmem (they need to
    be RCU-delayed) (-stable fodder)

    - followup to dedupe stuff merged this cycle

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    vfs: abort dedupe loop if fatal signals are pending
    make sure that freeing shmem fast symlinks is RCU-delayed
    wrappers for ->i_mutex access
    lustre: remove unused declaration

    Linus Torvalds
     

23 Jan, 2016

2 commits

  • There are many locations that do

    if (memory_was_allocated_by_vmalloc)
        vfree(ptr);
    else
        kfree(ptr);

    but kvfree() can handle both kmalloc()ed memory and vmalloc()ed memory
    using is_vmalloc_addr(). Unless callers have special reasons, we can
    replace this branch with kvfree(). Please check and reply if you find
    problems.

    Signed-off-by: Tetsuo Handa
    Acked-by: Michal Hocko
    Acked-by: Jan Kara
    Acked-by: Russell King
    Reviewed-by: Andreas Dilger
    Acked-by: "Rafael J. Wysocki"
    Acked-by: David Rientjes
    Cc: "Luck, Tony"
    Cc: Oleg Drokin
    Cc: Boris Petkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     
  • parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
    inode_foo(inode) being mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.

    Signed-off-by: Al Viro

    Al Viro
     

16 Jan, 2016

1 commit

  • Pull UDF fixes and quota cleanups from Jan Kara:
    "Several UDF fixes and some minor quota cleanups"

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    udf: Check output buffer length when converting name to CS0
    udf: Prevent buffer overrun with multi-byte characters
    quota: constify qtree_fmt_operations structures
    udf: avoid uninitialized variable use
    udf: Fix lost indirect extent block
    udf: Factor out code for creating indirect extent
    udf: limit the maximum number of indirect extents in a row
    udf: limit the maximum number of TD redirections
    fs: make quota/dquot.c explicitly non-modular
    fs: make quota/netlink.c explicitly non-modular

    Linus Torvalds
     

15 Jan, 2016

1 commit

  • Mark those kmem allocations that are known to be easily triggered from
    userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
    memcg. For the list, see below:

    - threadinfo
    - task_struct
    - task_delay_info
    - pid
    - cred
    - mm_struct
    - vm_area_struct and vm_region (nommu)
    - anon_vma and anon_vma_chain
    - signal_struct
    - sighand_struct
    - fs_struct
    - files_struct
    - fdtable and fdtable->full_fds_bits
    - dentry and external_name
    - inode for all filesystems. This is the most tedious part, because
    most filesystems overwrite the alloc_inode method.

    The list is far from complete, so feel free to add more objects.
    Nevertheless, it should be close to "account everything" approach and
    keep most workloads within bounds. Malevolent users will be able to
    breach the limit, but this was possible even with the former "account
    everything" approach (simply because it did not account everything in
    fact).

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Vladimir Davydov
    Acked-by: Johannes Weiner
    Acked-by: Michal Hocko
    Cc: Tejun Heo
    Cc: Greg Thelen
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     

04 Jan, 2016

1 commit

  • If a name contains at least some characters with Unicode values
    exceeding single byte, the CS0 output should have 2 bytes per character.
    And if other input characters have single byte Unicode values, then
    the single input byte is converted to 2 output bytes, and the length
    of output becomes larger than the length of input. And if the input
    name is long enough, the output length may exceed the allocated buffer
    length.

    All this means that conversion from UTF8 or NLS to CS0 requires
    checking of output length in order to stop when it exceeds the given
    output buffer size.

    [JK: Make code return -ENAMETOOLONG instead of silently truncating the
    name]
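
    The length check can be sketched in user space as below. This is a
    simplified illustration, not the kernel code: the input is assumed to
    be already decoded into 16-bit Unicode values, the required size is
    computed up front rather than during conversion, and sketch_to_cs0()
    is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode in[0..in_len) as CS0: one compression code byte (8 or 16)
 * followed by 1 or 2 bytes per character (big-endian when 2).
 * Returns the number of bytes written, or -ENAMETOOLONG if the result
 * does not fit into out_len bytes.
 */
static int sketch_to_cs0(uint8_t *out, int out_len,
                         const uint16_t *in, int in_len)
{
    int bytes_per_char = 1;
    int o = 0;

    assert(out != NULL && out_len >= 0);
    assert(in_len >= 0 && (in != NULL || in_len == 0));

    /* A single wide character forces 2 bytes for every character. */
    for (int i = 0; i < in_len; i++) {
        if (in[i] > 0xFF) {
            bytes_per_char = 2;
            break;
        }
    }

    /* Refuse instead of silently truncating the name. */
    if (1 + in_len * bytes_per_char > out_len)
        return -ENAMETOOLONG;

    out[o++] = (bytes_per_char == 1) ? 8 : 16;
    for (int i = 0; i < in_len; i++) {
        if (bytes_per_char == 2)
            out[o++] = (uint8_t)(in[i] >> 8);
        out[o++] = (uint8_t)(in[i] & 0xFF);
    }
    return o;
}
```

    A two-character name that fits in 3 bytes when both characters are
    single-byte needs 5 bytes as soon as one of them is wide, which is the
    growth the message describes.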

    CC: stable@vger.kernel.org
    Signed-off-by: Andrew Gabbasov
    Signed-off-by: Jan Kara

    Andrew Gabbasov