06 Jan, 2017

40 commits

  • commit 9eff1140a82db8c5520f76e51c21827b4af670b3 upstream.

    On reboot, systemd enables a shutdown watchdog and leaves the watchdog
    device open to ensure that the platform reboots even if the power-down
    process gets stuck.
    The iamt_wdt is an alarm-only watchdog and can't reboot the system, but
    the FW will generate an alarm event even if the reboot was completed in
    time, because the watchdog is not automatically disabled during the
    power cycle.
    So we should request that the watchdog be stopped on reboot to eliminate
    the spurious alarm from the FW.
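
    A minimal, illustrative sketch of one way to express "stop the watchdog
    on reboot" via the generic watchdog core; the example_ name is
    hypothetical and the actual mei_wdt change may be structured differently:

        #include <linux/watchdog.h>

        /* Hypothetical registration helper, illustrative only. */
        static int example_wdt_register(struct watchdog_device *wdd)
        {
                /*
                 * Ask the watchdog core to stop the device from its reboot
                 * notifier, so the FW does not raise a stale alarm after the
                 * platform has rebooted in time.
                 */
                watchdog_stop_on_reboot(wdd);
                return watchdog_register_device(wdd);
        }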

    Signed-off-by: Alexander Usyskin
    Signed-off-by: Tomas Winkler
    Reviewed-by: Guenter Roeck
    Signed-off-by: Guenter Roeck
    Signed-off-by: Greg Kroah-Hartman

    Alexander Usyskin
     
  • commit 4d1f0fb096aedea7bb5489af93498a82e467c480 upstream.

    The NMI handler doesn't call set_irq_regs(); that is done only by a
    normal IRQ. Thus get_irq_regs() returns either NULL or a stale register
    snapshot whose IP/SP point to the code interrupted by the IRQ which was
    itself interrupted by the NMI.
    NULL isn't a problem: in that case the watchdog calls dump_stack() and
    prints a full stack trace including the NMI. But if we're stuck in an
    IRQ handler, the NMI watchdog will print a stack trace without the IRQ
    part at all.

    This patch uses the register snapshot passed into the NMI handler as an
    argument: these registers point exactly to the instruction interrupted
    by the NMI.
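
    A minimal sketch of the idea; the example_ name is hypothetical, only
    the perf overflow-handler signature and show_regs() are real kernel
    APIs, and the real hardlockup detector is more involved:

        #include <linux/perf_event.h>
        #include <linux/sched.h>        /* show_regs() */

        /*
         * The 'regs' argument describes the instruction interrupted by the
         * NMI itself, which is what we want to print; get_irq_regs() would
         * return NULL or the registers of the interrupted IRQ context.
         */
        static void example_hardlockup_callback(struct perf_event *event,
                                                struct perf_sample_data *data,
                                                struct pt_regs *regs)
        {
                /* On a detected hard lockup, dump the stuck CPU's state
                 * using the NMI-time registers. */
                show_regs(regs);        /* not get_irq_regs() */
        }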

    Fixes: 55537871ef66 ("kernel/watchdog.c: perform all-CPU backtrace in case of hard lockup")
    Link: http://lkml.kernel.org/r/146771764784.86724.6006627197118544150.stgit@buzz
    Signed-off-by: Konstantin Khlebnikov
    Cc: Jiri Kosina
    Cc: Ulrich Obergfell
    Cc: Aaron Tomlin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Konstantin Khlebnikov
     
  • commit e3d240e9d505fc67f8f8735836df97a794bbd946 upstream.

    If maxBuf is not 0 but is smaller than the size of the SMB2 lock
    structure, we can end up with memory corruption.

    Signed-off-by: Pavel Shilovsky
    Signed-off-by: Greg Kroah-Hartman

    Pavel Shilovsky
     
  • commit b0a752b5ce76bd1d8b733a53757c3263511dcb69 upstream.

    Reviewed-by: Aurelien Aptel
    Acked-by: Sachin Prabhu
    Signed-off-by: Pavel Shilovsky
    Signed-off-by: Greg Kroah-Hartman

    Pavel Shilovsky
     
  • commit 96a988ffeb90dba33a71c3826086fe67c897a183 upstream.

    With the current code it is possible to lock a mutex twice when
    subsequent reconnects are triggered. On the 1st reconnect we
    reconnect sessions and tcons and then persistent file handles.
    If the 2nd reconnect happens during the reconnecting of persistent
    file handles then the following sequence of calls is observed:

    cifs_reopen_file -> SMB2_open -> small_smb2_init -> smb2_reconnect
    -> cifs_reopen_persistent_file_handles -> cifs_reopen_file (again!).

    So, we are trying to acquire the same cfile->fh_mutex twice which
    is wrong. Fix this by moving reconnecting of persistent handles to
    the delayed work (smb2_reconnect_server) and submitting this work
    every time we reconnect tcon in SMB2 commands handling codepath.

    This can also lead to corruption of the temporary file list in
    cifs_reopen_persistent_file_handles() because we can end up calling
    that function recursively.
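
    A minimal sketch of the deferral pattern described above, using only
    the generic workqueue API; the reconnect_work/reconnect_worker names
    and the 2*HZ delay are illustrative, not the actual cifs code:

        #include <linux/workqueue.h>

        /* Runs later, outside the tcon-reconnect call chain, so it does not
         * recurse into cfile->fh_mutex. */
        static void reconnect_worker(struct work_struct *work)
        {
                /* reopen persistent file handles here */
        }
        static DECLARE_DELAYED_WORK(reconnect_work, reconnect_worker);

        static void example_schedule_reconnect(void)
        {
                /* Submitted every time a tcon is reconnected in the SMB2
                 * command-handling path. */
                queue_delayed_work(system_long_wq, &reconnect_work, 2 * HZ);
        }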

    Signed-off-by: Pavel Shilovsky
    Signed-off-by: Greg Kroah-Hartman

    Pavel Shilovsky
     
  • commit 4772c79599564bd08ee6682715a7d3516f67433f upstream.

    Acked-by: Sachin Prabhu
    Signed-off-by: Pavel Shilovsky
    Signed-off-by: Greg Kroah-Hartman

    Pavel Shilovsky
     
  • commit 53e0e11efe9289535b060a51d4cf37c25e0d0f2b upstream.

    We can not unlock/lock cifs_tcp_ses_lock while walking through ses
    and tcon lists because it can corrupt list iterator pointers and
    a tcon structure can be released if we don't hold an extra reference.
    Fix it by moving a reconnect process to a separate delayed work
    and acquiring a reference to every tcon that needs to be reconnected.
    Also do not send an echo request on newly established connections.

    Signed-off-by: Pavel Shilovsky
    Signed-off-by: Greg Kroah-Hartman

    Pavel Shilovsky
     
  • commit 06deeec77a5a689cc94b21a8a91a76e42176685d upstream.

    smbencrypt() points a scatterlist at the stack, which breaks if
    CONFIG_VMAP_STACK=y.

    Fix it by switching to crypto_cipher_encrypt_one(). The new code
    should be considerably faster as an added benefit.

    This code is nearly identical to some code that Eric Biggers
    suggested.
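
    A minimal sketch of the scatterlist-free single-block pattern using the
    standard kernel crypto API; the example_ helper, the 8-byte DES key
    length and the omitted block-size handling are simplifications:

        #include <linux/crypto.h>
        #include <linux/err.h>

        /* Encrypt one 8-byte block without ever building a scatterlist,
         * so stack buffers are fine even with CONFIG_VMAP_STACK=y. */
        static int example_encrypt_block(const u8 *key, const u8 *in, u8 *out)
        {
                struct crypto_cipher *tfm;
                int rc;

                tfm = crypto_alloc_cipher("des", 0, 0);
                if (IS_ERR(tfm))
                        return PTR_ERR(tfm);

                rc = crypto_cipher_setkey(tfm, key, 8);
                if (!rc)
                        crypto_cipher_encrypt_one(tfm, out, in);

                crypto_free_cipher(tfm);
                return rc;
        }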

    Reported-by: Eric Biggers
    Signed-off-by: Andy Lutomirski
    Acked-by: Jeff Layton
    Signed-off-by: Steve French
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     
  • commit 2fc995a87f2efcd803438f07bfecd35cc3d90d32 upstream.

    When the ASoC Intel SST Medfield driver is probed without a codec/card
    assigned, it causes an Oops and freezes the kernel at suspend/resume:

    PM: Suspending system (freeze)
    Suspending console(s) (use no_console_suspend to debug)
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
    IP: [] sst_soc_prepare+0x19/0xa0 [snd_soc_sst_mfld_platform]
    Oops: 0000 [#1] PREEMPT SMP
    CPU: 0 PID: 1552 Comm: systemd-sleep Tainted: G W 4.9.0-rc6-1.g5f5c2ad-default #1
    Call Trace:
    [] dpm_prepare+0x209/0x460
    [] dpm_suspend_start+0x11/0x60
    [] suspend_devices_and_enter+0xb2/0x710
    [] pm_suspend+0x30e/0x390
    [] state_store+0x8a/0x90
    [] kobj_attr_store+0xf/0x20
    [] sysfs_kf_write+0x37/0x40
    [] kernfs_fop_write+0x11c/0x1b0
    [] __vfs_write+0x28/0x140
    [] ? apparmor_file_permission+0x18/0x20
    [] ? security_file_permission+0x3b/0xc0
    [] vfs_write+0xb5/0x1a0
    [] SyS_write+0x46/0xa0
    [] entry_SYSCALL_64_fastpath+0x1e/0xad

    Add proper NULL checks to the PM code of the mdfld driver.
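
    A minimal sketch of the shape of such a check; the struct and field
    names (sst_data, soc_card) are assumptions for illustration, not the
    exact driver code:

        #include <linux/device.h>
        #include <sound/soc.h>

        /* Illustrative driver-private data; field names are assumptions. */
        struct sst_data {
                struct snd_soc_card *soc_card;
                /* ... */
        };

        /* Bail out early from the PM callback when no card was ever bound. */
        static int sst_soc_prepare(struct device *dev)
        {
                struct sst_data *drv = dev_get_drvdata(dev);

                if (!drv->soc_card)     /* nothing to suspend/resume */
                        return 0;

                /* ... normal suspend preparation ... */
                return 0;
        }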

    Signed-off-by: Takashi Iwai
    Acked-by: Vinod Koul
    Signed-off-by: Mark Brown
    Signed-off-by: Greg Kroah-Hartman

    Takashi Iwai
     
  • commit 314c25c56c1ee5026cf99c570bdfe01847927acb upstream.

    In dm_sm_metadata_create() we temporarily change the dm_space_map
    operations from 'ops' (whose .destroy function deallocates the
    sm_metadata) to 'bootstrap_ops' (whose .destroy function doesn't).

    If dm_sm_metadata_create() fails in sm_ll_new_metadata() or
    sm_ll_extend(), it exits back to dm_tm_create_internal(), which calls
    dm_sm_destroy() with the intention of freeing the sm_metadata, but it
    doesn't (because the dm_space_map operations is still set to
    'bootstrap_ops').

    Fix this by setting the dm_space_map operations back to 'ops' if
    dm_sm_metadata_create() fails when it is set to 'bootstrap_ops'.

    Signed-off-by: Benjamin Marzinski
    Acked-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Benjamin Marzinski
     
  • commit 11e2968478edc07a75ee1efb45011b3033c621c2 upstream.

    Commit ecbfb9f118 ("dm raid: add raid level takeover support") moved the
    configure_discard_support() call from raid_ctr() to raid_preresume().

    Enabling/disabling discard _must_ happen during table load (through the
    .ctr hook). Fix this regression by moving the
    configure_discard_support() call back to raid_ctr().

    Fixes: ecbfb9f118 ("dm raid: add raid level takeover support")
    Signed-off-by: Heinz Mauelshagen
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Heinz Mauelshagen
     
  • commit d15bb3a6467e102e60d954aadda5fb19ce6fd8ec upstream.

    It is required to hold the queue lock when calling blk_run_queue_async()
    to avoid triggering a race between blk_run_queue_async() and
    blk_cleanup_queue().
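
    A minimal sketch of the calling convention on legacy (non-blk-mq)
    request queues; example_kick_queue() is a hypothetical wrapper:

        #include <linux/blkdev.h>

        static void example_kick_queue(struct request_queue *q)
        {
                unsigned long flags;

                /* q->queue_lock must be held across blk_run_queue_async()
                 * so it cannot race with blk_cleanup_queue(). */
                spin_lock_irqsave(q->queue_lock, flags);
                blk_run_queue_async(q);
                spin_unlock_irqrestore(q->queue_lock, flags);
        }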

    Signed-off-by: Bart Van Assche
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Bart Van Assche
     
  • commit 265e9098bac02bc5e36cda21fdbad34cb5b2f48d upstream.

    In crypt_set_key(), if a failure occurs while replacing the old key
    (e.g. tfm->setkey() fails), the key must not have the DM_CRYPT_KEY_VALID
    flag set. Otherwise, the crypto layer would have an invalid key that
    still has the DM_CRYPT_KEY_VALID flag set.
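
    A minimal sketch of the ordering described above, based only on the
    commit text; the crypt_config definition, the bit index and the
    crypt_setkey() name are illustrative stand-ins for the dm-crypt
    internals:

        #include <linux/bitops.h>

        struct crypt_config { unsigned long flags; /* ... */ };
        enum { DM_CRYPT_KEY_VALID };                    /* bit index */
        int crypt_setkey(struct crypt_config *cc);      /* the ->setkey() path */

        static int example_set_key(struct crypt_config *cc)
        {
                int r;

                /* Invalidate first: if setkey fails, the flag stays clear. */
                clear_bit(DM_CRYPT_KEY_VALID, &cc->flags);

                r = crypt_setkey(cc);
                if (!r)
                        set_bit(DM_CRYPT_KEY_VALID, &cc->flags);

                return r;
        }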

    Signed-off-by: Ondrej Kozina
    Reviewed-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Ondrej Kozina
     
  • commit bff7e067ee518f9ed7e1cbc63e4c9e01670d0b71 upstream.

    Fix to return error code -EINVAL instead of 0, as is done elsewhere in
    this function.

    Fixes: e80d1c805a3b ("dm: do not override error code returned from dm_get_device()")
    Signed-off-by: Wei Yongjun
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Wei Yongjun
     
  • commit 301fc3f5efb98633115bd887655b19f42c6dfaa8 upstream.

    When dm_table_set_type() is used by a target to establish a DM table's
    type (e.g. DM_TYPE_MQ_REQUEST_BASED in the case of DM multipath) the
    DM core must go on to verify that the devices in the table are
    compatible with the established type.

    Fixes: e83068a5 ("dm mpath: add optional "queue_mode" feature")
    Signed-off-by: Bart Van Assche
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Bart Van Assche
     
  • commit 6936c12cf809850180b24947271b8f068fdb15e9 upstream.

    An earlier DM multipath table could have been built on top of underlying
    devices that were all using blk-mq. In that case, if that active
    multipath table is replaced with an empty DM multipath table (one that
    reflects that all paths have failed), then it is important that the
    'all_blk_mq' state of the active table is transferred to the new empty
    DM table. Otherwise dm-rq.c:dm_old_prep_tio() will incorrectly clone a
    request that isn't needed by the DM multipath target when it is to issue
    IO to an underlying blk-mq device.

    Fixes: e83068a5 ("dm mpath: add optional "queue_mode" feature")
    Reported-by: Bart Van Assche
    Tested-by: Bart Van Assche
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Mike Snitzer
     
  • commit bc27c01b5c46d3bfec42c96537c7a3fae0bb2cc4 upstream.

    The meaning of the BLK_MQ_S_STOPPED flag is "do not call
    .queue_rq()". Hence modify blk_mq_make_request() such that requests
    are queued instead of issued if a queue has been stopped.

    Reported-by: Ming Lei
    Signed-off-by: Bart Van Assche
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Sagi Grimberg
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Bart Van Assche
     
  • commit dc39d06fcd7a4a82d72eae7b71e94e888b96d29e upstream.

    The OPP structure must not be used outside of the RCU-protected section.
    Cache the values to be used in separate variables instead.
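
    A minimal sketch of the caching pattern with the 4.9-era RCU-based OPP
    API; example_get_voltage() is a hypothetical helper:

        #include <linux/pm_opp.h>
        #include <linux/rcupdate.h>
        #include <linux/err.h>

        static int example_get_voltage(struct device *dev, unsigned long *freq,
                                       unsigned long *volt)
        {
                struct dev_pm_opp *opp;

                rcu_read_lock();
                opp = dev_pm_opp_find_freq_ceil(dev, freq);
                if (IS_ERR(opp)) {
                        rcu_read_unlock();
                        return PTR_ERR(opp);
                }
                /* Copy what we need out of the OPP while still under RCU;
                 * 'opp' itself must not be dereferenced after the unlock. */
                *volt = dev_pm_opp_get_voltage(opp);
                rcu_read_unlock();

                return 0;
        }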

    Signed-off-by: Viresh Kumar
    Reviewed-by: Stephen Boyd
    Tested-by: Dave Gerlach
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Greg Kroah-Hartman

    Viresh Kumar
     
  • commit 91291d9ad92faa65a56a9a19d658d8049b78d3d4 upstream.

    Joonyoung Shim reported an interesting problem on his ARM octa-core
    Odroid-XU3 platform. During system suspend, dev_pm_opp_put_regulator()
    was failing for a struct device for which dev_pm_opp_set_regulator()
    was called earlier.

    This happened because an earlier call to
    dev_pm_opp_of_cpumask_remove_table() function (from cpufreq-dt.c file)
    removed all the entries from opp_table->dev_list apart from the last CPU
    device in the cpumask of CPUs sharing the OPP.

    But both dev_pm_opp_set_regulator() and dev_pm_opp_put_regulator()
    routines get CPU device for the first CPU in the cpumask. And so the OPP
    core failed to find the OPP table for the struct device.

    This patch attempts to fix this problem by returning a pointer to the
    opp_table from dev_pm_opp_set_regulator() and using that as the
    parameter to dev_pm_opp_put_regulator(). This ensures that
    dev_pm_opp_put_regulator() doesn't fail to find the opp table.

    Note that a similar design problem also exists in the other
    dev_pm_opp_put_*() APIs, but those aren't currently used by anyone, so
    we don't need to update them for now.
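
    A usage sketch of the reworked pairing, assuming the post-patch
    prototypes described above; the device pointer and the "vdd-cpu" supply
    name are illustrative:

        #include <linux/pm_opp.h>
        #include <linux/err.h>

        static int example_opp_regulator(struct device *cpu_dev)
        {
                struct opp_table *tbl;

                tbl = dev_pm_opp_set_regulator(cpu_dev, "vdd-cpu");
                if (IS_ERR(tbl))
                        return PTR_ERR(tbl);

                /* ... add and use OPPs ... */

                /* Put back using the returned table, not a device-based
                 * lookup that may already have been torn down. */
                dev_pm_opp_put_regulator(tbl);
                return 0;
        }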

    Reported-by: Joonyoung Shim
    Signed-off-by: Stephen Boyd
    Signed-off-by: Viresh Kumar
    [ Viresh: Wrote commit log and tested on exynos 5250 ]
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Greg Kroah-Hartman

    Stephen Boyd
     
  • commit eaa496ffaaf19591fe471a36cef366146eeb9153 upstream.

    ep->mult is supposed to be set to the Isochronous and Interrupt
    Endpoint's multiplier value. This value is computed from different
    places depending on the link speed.

    If we're dealing with HighSpeed, then it's part of bits [12:11] of
    wMaxPacketSize. This case wasn't taken into consideration before.

    While at that, also make sure ep->mult defaults to one so drivers can
    use it unconditionally and never end up multiplying ep->maxpacket by
    zero.
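
    A minimal sketch of the high-speed case, assuming a standard endpoint
    descriptor; example_hs_mult() is a hypothetical helper:

        #include <linux/usb/ch9.h>
        #include <asm/byteorder.h>

        /* Bits [12:11] of wMaxPacketSize encode the number of additional
         * transactions per microframe, so the multiplier is that value + 1
         * and defaults to 1 when the bits are zero. */
        static unsigned int example_hs_mult(const struct usb_endpoint_descriptor *desc)
        {
                u16 maxp = le16_to_cpu(desc->wMaxPacketSize);

                return ((maxp >> 11) & 0x3) + 1;
        }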

    Cc:
    Signed-off-by: Felipe Balbi
    Signed-off-by: Greg Kroah-Hartman

    Felipe Balbi
     
  • commit a6de734bc002fe2027ccc074fbbd87d72957b7a4 upstream.

    Vlastimil Babka pointed out that commit 479f854a207c ("mm, page_alloc:
    defer debugging checks of pages allocated from the PCP") will allow the
    per-cpu list counter to be out of sync with the per-cpu list contents if
    a struct page is corrupted.

    The consequence is an infinite loop if the per-cpu lists get fully
    drained by free_pcppages_bulk because all the lists are empty but the
    count is positive. The infinite loop occurs here:

    do {
            batch_free++;
            if (++migratetype == MIGRATE_PCPTYPES)
                    migratetype = 0;
            list = &pcp->lists[migratetype];
    } while (list_empty(list));

    What the user sees is a bad page warning followed by a soft lockup with
    interrupts disabled in free_pcppages_bulk().

    This patch keeps the accounting in sync.

    Fixes: 479f854a207c ("mm, page_alloc: defer debugging checks of pages allocated from the PCP")
    Link: http://lkml.kernel.org/r/20161202112951.23346-2-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman
    Acked-by: Vlastimil Babka
    Acked-by: Michal Hocko
    Acked-by: Hillf Danton
    Cc: Christoph Lameter
    Cc: Johannes Weiner
    Cc: Jesper Dangaard Brouer
    Cc: Joonsoo Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Mel Gorman
     
  • commit 5f33a0803bbd781de916f5c7448cbbbbc763d911 upstream.

    Our system uses significantly more slab memory with memcg enabled on
    the latest kernel. With the 3.10 kernel, slab uses 2G of memory, while
    with the 4.6 kernel, 6G is used. The shrinker has a problem. Suppose we
    have two memcgs for one shrinker. In do_shrink_slab:

    1. Check cg1. nr_deferred = 0, assume total_scan = 700. The batch size
    is 1024, so no memory is freed. nr_deferred = 700.

    2. Check cg2. nr_deferred = 700. Assume freeable = 20, then
    total_scan = 10 or 40. Let's assume it's 10. No memory is freed.
    nr_deferred = 10.

    The deferred share of cg1 is lost in this case. kswapd will free no
    memory even if the above steps run again and again.

    The fix makes sure one memcg's deferred share isn't lost.

    Link: http://lkml.kernel.org/r/2414be961b5d25892060315fbb56bb19d81d0c07.1476227351.git.shli@fb.com
    Signed-off-by: Shaohua Li
    Cc: Johannes Weiner
    Cc: Michal Hocko
    Cc: Vladimir Davydov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Shaohua Li
     
  • commit e4fcf07cca6a3b6c4be00df16f08be894325eaa3 upstream.

    When removing a namespace we delete it from the subsystem namespaces
    list with list_del_init, which allows us to know whether it is enabled
    or not.

    The problem is that list_del_init reinitializes the list's next pointer
    and does not respect the RCU list traversal we do on the IO path to
    locate a namespace. Instead we need to use list_del_rcu, which is
    allowed to run concurrently with the _rcu list-traversal primitives
    (it keeps the list's next pointer intact) and guarantees forward
    progress for a concurrent nvmet_find_namespace.

    Since we can then no longer rely on ns->dev_link to know whether the
    namespace is enabled, add an 'enabled' indicator to nvmet_ns for that.
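
    A minimal sketch of the writer side described above; the field names
    follow the commit text, and the synchronize_rcu() placement is an
    illustrative simplification of what the target actually does:

        #include <linux/rculist.h>
        #include <linux/rcupdate.h>

        static void example_disable_ns(struct nvmet_ns *ns)
        {
                ns->enabled = false;            /* new explicit indicator */

                /* list_del_rcu() keeps ->next intact, so readers walking the
                 * namespaces list with list_for_each_entry_rcu() can still
                 * make forward progress. */
                list_del_rcu(&ns->dev_link);
                synchronize_rcu();              /* wait out current readers */
        }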

    Signed-off-by: Sagi Grimberg
    Signed-off-by: Solganik Alexander
    Signed-off-by: Greg Kroah-Hartman

    Solganik Alexander
     
  • commit b4a567e8114327518c09f5632339a5954ab975a3 upstream.

    ->queue_rq() should return one of the BLK_MQ_RQ_QUEUE_* constants, not
    an errno.
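
    A minimal sketch of the convention, using the 4.9-era return codes; the
    Lo_bound check stands in for whichever condition used to return -EIO:

        #include <linux/blk-mq.h>

        static int example_queue_rq(struct blk_mq_hw_ctx *hctx,
                                    const struct blk_mq_queue_data *bd)
        {
                struct loop_device *lo = hctx->queue->queuedata;

                if (lo->lo_state != Lo_bound)
                        return BLK_MQ_RQ_QUEUE_ERROR;   /* not -EIO */

                /* ... hand the request off to the worker ... */
                return BLK_MQ_RQ_QUEUE_OK;
        }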

    Fixes: f4aa4c7bbac6 ("block: loop: convert to per-device workqueue")
    Signed-off-by: Omar Sandoval
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Omar Sandoval
     
  • commit 8508e44ae98622f841f5ef29d0bf3d5db4e0c1cc upstream.

    We don't guarantee cp_addr is fixed by cp_version.
    This is to sync with f2fs-tools.

    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Jaegeuk Kim
     
  • commit e87f7329bbd6760c2acc4f1eb423362b08851a71 upstream.

    In the last ilen case, i was already incremented, resulting in an
    out-of-bounds access to the do_replace and blkaddr entries.
    Fix this by checking ilen first to exit the loop.

    Fixes: 2aa8fbb9693020 ("f2fs: refactor __exchange_data_block for speed up")
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Jaegeuk Kim
     
  • commit 05e6ea2685c964db1e675a24a4f4e2adc22d2388 upstream.

    The struct file_operations instance serving the f2fs/status debugfs file
    lacks an initialization of its ->owner.

    This means that the f2fs module can still get removed even while that
    file is open. Any further operation on the open file, release included,
    will then access unmapped memory.

    Indeed, Mike Marshall reported the following:

    BUG: unable to handle kernel paging request at ffffffffa0307430
    IP: [] full_proxy_release+0x24/0x90

    Call Trace:
    [] __fput+0xdf/0x1d0
    [] ____fput+0xe/0x10
    [] task_work_run+0x8e/0xc0
    [] do_exit+0x2ae/0xae0
    [] ? __audit_syscall_entry+0xae/0x100
    [] ? syscall_trace_enter+0x1ca/0x310
    [] do_group_exit+0x44/0xc0
    [] SyS_exit_group+0x14/0x20
    [] do_syscall_64+0x61/0x150
    [] entry_SYSCALL64_slow_path+0x25/0x25

    ---[ end trace f22ae883fa3ea6b8 ]---
    Fixing recursive fault but reboot is needed!

    Fix this by initializing the f2fs/status file_operations' ->owner with
    THIS_MODULE.

    This will allow debugfs to grab a reference to the f2fs module upon any
    open on that file, thus preventing it from getting removed.
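
    A minimal sketch of the general pattern with generic names; the handler
    functions shown are the standard seq_file helpers, not the exact f2fs
    code:

        #include <linux/fs.h>
        #include <linux/module.h>
        #include <linux/seq_file.h>

        static int stat_show(struct seq_file *s, void *v)
        {
                seq_puts(s, "example\n");
                return 0;
        }

        static int stat_open(struct inode *inode, struct file *file)
        {
                return single_open(file, stat_show, inode->i_private);
        }

        static const struct file_operations stat_fops = {
                .owner   = THIS_MODULE, /* lets the VFS pin the module while open */
                .open    = stat_open,
                .read    = seq_read,
                .llseek  = seq_lseek,
                .release = single_release,
        };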

    Fixes: 902829aa0b72 ("f2fs: move proc files to debugfs")
    Reported-by: Mike Marshall
    Reported-by: Martin Brandenburg
    Signed-off-by: Nicolai Stange
    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Nicolai Stange
     
  • commit 204706c7accfabb67b97eef9f9a28361b6201199 upstream.

    This reverts commit 1beba1b3a953107c3ff5448ab4e4297db4619c76.

    The percpu_counter doesn't provide atomicity on a single core and
    consumes more DRAM. That causes an fs_mark test failure due to ENOMEM.

    Signed-off-by: Jaegeuk Kim
    Signed-off-by: Greg Kroah-Hartman

    Jaegeuk Kim
     
  • commit 73b92a2a5e97d17cc4d5c4fe9d724d3273fb6fd2 upstream.

    Currently data journalling is incompatible with encryption: enabling both
    at the same time has never been supported by design, and would result in
    unpredictable behavior. However, users are not precluded from turning on
    both features simultaneously. This change programmatically replaces data
    journaling for encrypted regular files with ordered data journaling mode.

    Background:
    Journaling encrypted data has not been supported because it operates on
    buffer heads of the page in the page cache. Namely, when the commit
    happens, which could be up to five seconds after caching, the commit
    thread uses the buffer heads attached to the page to copy the contents of
    the page to the journal. With encryption, it would have been required to
    keep the bounce buffer with ciphertext for up to the aforementioned five
    seconds, since the page cache can only hold plaintext and could not be
    used for journaling. Alternatively, it would be required to setup the
    journal to initiate a callback at the commit time to perform deferred
    encryption - in this case, not only would the data have to be written
    twice, but it would also have to be encrypted twice. This level of
    complexity was not justified for a mode that in practice is very rarely
    used because of the overhead from the data journalling.

    Solution:
    If data=journal has been set as a mount option for a filesystem, or if
    journaling is enabled on a regular file, do not perform journaling if
    the file is also encrypted; instead, fall back to the data=ordered mode
    for the file.

    Rationale:
    The intent is to allow seamless and proper filesystem operation when
    journaling and encryption have both been enabled, and have these two
    conflicting features gracefully resolved by the filesystem.
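
    A minimal sketch of the decision only, not the exact ext4 helper;
    ext4_encrypted_inode() is the 4.9-era predicate from fs/ext4/ext4.h,
    and the journal_requested flag stands in for the mount-option and
    per-inode-flag checks:

        #include <linux/fs.h>

        static bool example_should_journal_data(struct inode *inode,
                                                bool journal_requested)
        {
                /* Encrypted regular files are forced to data=ordered even
                 * when data journaling was requested. */
                if (S_ISREG(inode->i_mode) && ext4_encrypted_inode(inode))
                        return false;

                return journal_requested;
        }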

    Fixes: 4461471107b7
    Signed-off-by: Sergey Karamov
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Sergey Karamov
     
  • commit 578620f451f836389424833f1454eeeb2ffc9e9f upstream.

    We should set the error code if kzalloc() fails.

    Fixes: 67cf5b09a46f ("ext4: add the basic function for inline data support")
    Signed-off-by: Dan Carpenter
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
     
  • commit 7e6e1ef48fc02f3ac5d0edecbb0c6087cd758d58 upstream.

    Don't load an inode with a negative size; this causes integer overflow
    problems in the VFS.

    [ Added EXT4_ERROR_INODE() to mark file system as corrupted. -TYT]

    Fixes: a48380f769df (ext4: rename i_dir_acl to i_size_high)
    Signed-off-by: Darrick J. Wong
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Darrick J. Wong
     
  • commit c48ae41bafe31e9a66d8be2ced4e42a6b57fa814 upstream.

    The commit "ext4: sanity check the block and cluster size at mount
    time" should prevent any problems, but in case the superblock is
    modified while the file system is mounted, add an extra safety check
    to make sure we won't overrun the allocated buffer.

    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Theodore Ts'o
     
  • commit 5aee0f8a3f42c94c5012f1673420aee96315925a upstream.

    Fix a large number of problems with how we handle mount options in the
    superblock. For one, if the string in the superblock is long enough
    that it is not null terminated, we could run off the end of the string
    and try to interpret superblocks fields as characters. It's unlikely
    this will cause a security problem, but it could result in an invalid
    parse. Also, parse_options is destructive to the string, so in some
    cases if there is a comma-separated string, it would be modified in
    the superblock. (Fortunately it only happens on file systems with a
    1k block size.)

    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Theodore Ts'o
     
  • commit cd6bb35bf7f6d7d922509bf50265383a0ceabe96 upstream.

    Centralize the checks for inodes_per_block and be more strict to make
    sure the inodes_per_block_group can't end up being zero.

    Signed-off-by: Theodore Ts'o
    Reviewed-by: Andreas Dilger
    Signed-off-by: Greg Kroah-Hartman

    Theodore Ts'o
     
  • commit 30a9d7afe70ed6bd9191d3000e2ef1a34fb58493 upstream.

    The number of 'counters' elements needed in 'struct sg' is
    super_block->s_blocksize_bits + 2. Presently we have 16 'counters'
    elements in the array. This is insufficient for block sizes >= 32k:
    with a 64k block size, for instance, s_blocksize_bits is 16 and 18
    elements are needed. In such cases the memcpy operation performed in
    ext4_mb_seq_groups_show() would cause stack memory corruption.

    Fixes: c9de560ded61f
    Signed-off-by: Chandan Rajendra
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Jan Kara
    Signed-off-by: Greg Kroah-Hartman

    Chandan Rajendra
     
  • commit 69e43e8cc971a79dd1ee5d4343d8e63f82725123 upstream.

    'border' variable is set to a value of 2 times the block size of the
    underlying filesystem. With 64k block size, the resulting value won't
    fit into a 16-bit variable. Hence this commit changes the data type of
    'border' to 'unsigned int'.

    Fixes: c9de560ded61f
    Signed-off-by: Chandan Rajendra
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Andreas Dilger
    Signed-off-by: Greg Kroah-Hartman

    Chandan Rajendra
     
  • commit 1566a48aaa10c6bb29b9a69dd8279f9a4fc41e35 upstream.

    If there is an error reported in mballoc via ext4_grp_locked_error(),
    the code is holding a spinlock, so ext4_commit_super() must not try to
    lock the buffer head, or else it will trigger a BUG:

    BUG: sleeping function called from invalid context at ./include/linux/buffer_head.h:358
    in_atomic(): 1, irqs_disabled(): 0, pid: 993, name: mount
    CPU: 0 PID: 993 Comm: mount Not tainted 4.9.0-rc1-clouder1 #62
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
    ffff880006423548 ffffffff81318c89 ffffffff819ecdd0 0000000000000166
    ffff880006423558 ffffffff810810b0 ffff880006423580 ffffffff81081153
    ffff880006e5a1a0 ffff88000690e400 0000000000000000 ffff8800064235c0
    Call Trace:
    [] dump_stack+0x67/0x9e
    [] ___might_sleep+0xf0/0x140
    [] __might_sleep+0x53/0xb0
    [] ext4_commit_super+0x19c/0x290
    [] __ext4_grp_locked_error+0x14a/0x230
    [] ? __might_sleep+0x53/0xb0
    [] ext4_mb_generate_buddy+0x1de/0x320

    Since ext4_grp_locked_error() calls ext4_commit_super with sync == 0
    (and it is the only caller which does so), avoid locking and unlocking
    the buffer in this case.

    This can result in races with ext4_commit_super() if there are other
    problems (which is what commit 4743f83990614 was trying to address),
    but a warning is better than a BUG.

    Fixes: 4743f83990614
    Reported-by: Nikolay Borisov
    Signed-off-by: Theodore Ts'o
    Reviewed-by: Jan Kara
    Signed-off-by: Greg Kroah-Hartman

    Theodore Ts'o
     
  • commit d128af17876d79b87edf048303f98b35f6a53dbc upstream.

    The AEAD givenc descriptor relies on moving the IV through the
    output FIFO and then back to the CTX2 for authentication. The
    SEQ FIFO STORE could be scheduled before the data can be
    read from OFIFO, especially since the SEQ FIFO LOAD needs
    to wait for the SEQ FIFO LOAD SKIP to finish first. The
    SKIP takes more time when the input is SG than when it's
    a contiguous buffer. If the SEQ FIFO LOAD is not scheduled
    before the STORE, the DECO will hang waiting for data
    to be available in the OFIFO so it can be transferred to C2.
    In order to overcome this, first force transfer of IV to C2
    by starting the "cryptlen" transfer first and then starting to
    store data from OFIFO to the output buffer.

    Fixes: 1acebad3d8db8 ("crypto: caam - faster aead implementation")
    Signed-off-by: Alex Porosanu
    Signed-off-by: Horia Geantă
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Alex Porosanu
     
  • commit 84d77d3f06e7e8dea057d10e8ec77ad71f721be3 upstream.

    It is a reasonable expectation that if an executable file is not
    readable, there will be no way for a user without special privileges to
    read the file. This is enforced in ptrace_attach, but if ptrace is
    already attached before exec there is no enforcement for read-only
    executables.

    As the only way to read such an mm is through access_process_vm,
    introduce a variant called ptrace_access_vm that will fail if the
    target process is not being ptraced by the current process, or if the
    current process did not have sufficient privileges to read the target
    process's mm when ptracing began.

    In the ptrace implementations, replace access_process_vm with
    ptrace_access_vm. There remain several ptrace sites that still use
    access_process_vm, as they are reading the target executable's
    instructions (for kernel consumption) or register stacks. As such it
    does not appear necessary to add a permission check to those calls.

    This bug has always existed in Linux.

    Fixes: v1.0
    Reported-by: Andy Lutomirski
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit 64b875f7ac8a5d60a4e191479299e931ee949b67 upstream.

    When the flag PT_PTRACE_CAP was added the PTRACE_TRACEME path was
    overlooked. This can result in incorrect behavior when an application
    like strace traces an exec of a setuid executable.

    Further, PT_PTRACE_CAP does not carry enough information for making good
    security decisions, as it does not report which user namespace the
    capability is in. This has already allowed one mistake through
    insufficient granularity.

    I found this issue when I was testing another corner case of exec and
    discovered that I could not get strace to set PT_PTRACE_CAP even when
    running strace as root with a full set of caps.

    This change fixes the above issue with strace, allowing a setuid
    executable to be straced as root without disabling setuid. More
    fundamentally, this change allows what is allowable at all times by
    using the correct information in its decision.

    Fixes: 4214e42f96d4 ("v2.4.9.11 -> v2.4.9.12")
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman