17 Jan, 2021

2 commits

  • commit f09ced4053bc0a2094a12b60b646114c966ef4c6 upstream.

    Fix a race when multiple sockets are simultaneously calling sendto()
    when the completion ring is shared in the SKB case. This is the case
    when you share the same netdev and queue id through the
    XDP_SHARED_UMEM bind flag. The problem is that multiple processes can
    be in xsk_generic_xmit() and call the backpressure mechanism in
    xskq_prod_reserve(xs->pool->cq). As this is a shared resource in this
    specific scenario, a race might occur since the rings are
    single-producer single-consumer.

    Fix this by moving the tx_completion_lock from the socket to the pool
    as the pool is shared between the sockets that share the completion
    ring. (The pool is not shared when this is not the case.) And then
    protect the accesses to xskq_prod_reserve() with this lock. The
    tx_completion_lock is renamed cq_lock to better reflect that it
    protects accesses to the potentially shared completion ring.
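
    As a minimal sketch (using the cq_lock field and the xskq_prod_reserve()
    producer call named above), the serialized producer path in
    xsk_generic_xmit() looks roughly like:

    /* Serialize producer access to the potentially shared completion ring */
    spin_lock_irqsave(&xs->pool->cq_lock, flags);
    ret = xskq_prod_reserve(xs->pool->cq);
    spin_unlock_irqrestore(&xs->pool->cq_lock, flags);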

    Fixes: 35fcde7f8deb ("xsk: support for Tx")
    Reported-by: Xuan Zhuo
    Signed-off-by: Magnus Karlsson
    Signed-off-by: Daniel Borkmann
    Acked-by: Björn Töpel
    Link: https://lore.kernel.org/bpf/20201218134525.13119-2-magnus.karlsson@gmail.com
    Signed-off-by: Greg Kroah-Hartman

    Magnus Karlsson
     
  • commit 2ca408d9c749c32288bc28725f9f12ba30299e8f upstream.

    Commit

    121b32a58a3a ("x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments")

    converted native x86-32 syscalls that take 64-bit arguments to use the
    compat handlers, to allow conversion to passing args via pt_regs.
    sys_fanotify_mark() was however missed, as it has a general compat
    handler. Add a config option that will use the syscall wrapper that
    takes the split args for native 32-bit.
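
    A hedged sketch of such a wrapper: the 64-bit mask arrives in two 32-bit
    registers that the wrapper reassembles. The argument names and half
    ordering here are illustrative (the real ordering is ABI-dependent);
    do_fanotify_mark() is the common helper the syscall is split around.

    SYSCALL32_DEFINE6(fanotify_mark,
                    int, fanotify_fd, unsigned int, flags,
                    unsigned int, mask_hi, unsigned int, mask_lo,
                    int, dfd, const char __user *, pathname)
    {
            /* half ordering is illustrative only */
            __u64 mask = ((__u64)mask_hi << 32) | mask_lo;

            return do_fanotify_mark(fanotify_fd, flags, mask, dfd, pathname);
    }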

    [ bp: Fix typo in Kconfig help text. ]

    Fixes: 121b32a58a3a ("x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments")
    Reported-by: Paweł Jasiak
    Signed-off-by: Brian Gerst
    Signed-off-by: Borislav Petkov
    Acked-by: Jan Kara
    Acked-by: Andy Lutomirski
    Link: https://lkml.kernel.org/r/20201130223059.101286-1-brgerst@gmail.com
    Signed-off-by: Greg Kroah-Hartman

    Brian Gerst
     

13 Jan, 2021

7 commits

  • commit b16671e8f493e3df40b1fb0dff4078f391c5099a upstream.

    When the large bucket feature was added, BCH_FEATURE_INCOMPAT_LARGE_BUCKET
    was introduced into the incompat feature set. It used bucket_size_hi
    (which was added at the tail of struct cache_sb_disk) to extend the
    existing 16-bit bucket_size in struct cache_sb_disk to 32 bits.

    This is not a good idea; there are two obvious problems:
    - Bucket size is always a power of 2, so if log2(bucket size) were
    stored in the existing bucket_size of struct cache_sb_disk, adding
    bucket_size_hi would be unnecessary.
    - Macro csum_set() assumes d[SB_JOURNAL_BUCKETS] is the last member in
    struct cache_sb_disk; since bucket_size_hi was added after d[],
    csum_set() calculates an unexpected super block checksum.

    To fix the above problems, this patch introduces a new incompat feature
    bit BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE. When this bit is set,
    bucket_size in struct cache_sb_disk stores the power-of-2 order of the
    bucket size. When the user specifies a bucket size larger than 32768
    sectors, BCH_FEATURE_INCOMPAT_LOG_LARGE_BUCKET_SIZE is set in the
    incompat feature set, and bucket_size stores log2(bucket size) rather
    than the real bucket size value.

    The obsoleted BCH_FEATURE_INCOMPAT_LARGE_BUCKET won't be used anymore;
    it is renamed to BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET and is still
    recognized by the kernel driver, but only for legacy compatibility. The
    previous bucket_size_hi is renamed to obso_bucket_size_hi in struct
    cache_sb_disk and is not used in bcache-tools anymore.

    For cache device created with BCH_FEATURE_INCOMPAT_LARGE_BUCKET feature,
    bcache-tools and kernel driver still recognize the feature string and
    display it as "obso_large_bucket".

    With this change, the unnecessary extension of the bcache on-disk super
    block is avoided, and csum_set() generates the expected checksum as
    well.
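
    The decoding side then looks roughly like this (a sketch following the
    kernel's bucket-size helper; the feature-test helper names correspond
    to the feature bits above):

    if (bch_has_feature_large_bucket(sb))
            bucket_size = 1 << sb->bucket_size;     /* stored as log2(size) */
    else if (bch_has_feature_obso_large_bucket(sb))
            bucket_size = sb->bucket_size |
                          ((u32)sb->obso_bucket_size_hi << 16);  /* legacy */
    else
            bucket_size = sb->bucket_size;          /* plain 16-bit value */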

    Fixes: ffa470327572 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
    Signed-off-by: Coly Li
    Cc: stable@vger.kernel.org # 5.9+
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Coly Li
     
  • commit 9ad9f45b3b91162b33abfe175ae75ab65718dbf5 upstream.

    'struct intel_svm' is shared by all devices bound to a given process,
    but records only a single pointer to a 'struct intel_iommu'. Consequently,
    cache invalidations may only be applied to a single DMAR unit, and are
    erroneously skipped for the other devices.

    In preparation for fixing this, rework the structures so that the iommu
    pointer resides in 'struct intel_svm_dev', allowing 'struct intel_svm'
    to track them in its device list.
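
    A sketch of the reworked structures (simplified; only the moved field
    is shown):

    struct intel_svm_dev {
            struct list_head list;        /* linked into intel_svm's list */
            struct intel_iommu *iommu;    /* moved here from struct intel_svm */
            /* ... per-device fields ... */
    };

    struct intel_svm {
            /* no single 'iommu' pointer anymore */
            struct list_head devs;        /* all bound intel_svm_dev's */
            /* ... */
    };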

    Fixes: 1c4f88b7f1f9 ("iommu/vt-d: Shared virtual address in scalable mode")
    Cc: Lu Baolu
    Cc: Jacob Pan
    Cc: Raj Ashok
    Cc: David Woodhouse
    Reported-by: Guo Kaijie
    Reported-by: Xin Zeng
    Signed-off-by: Guo Kaijie
    Signed-off-by: Xin Zeng
    Signed-off-by: Liu Yi L
    Tested-by: Guo Kaijie
    Cc: stable@vger.kernel.org # v5.0+
    Acked-by: Lu Baolu
    Link: https://lore.kernel.org/r/1609949037-25291-2-git-send-email-yi.l.liu@intel.com
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Liu Yi L
     
  • [ Upstream commit 52abca64fd9410ea6c9a3a74eab25663b403d7da ]

    blk_queue_enter() accepts BLK_MQ_REQ_PM requests independent of the runtime
    power management state. Now that SCSI domain validation no longer depends
    on this behavior, modify the behavior of blk_queue_enter() as follows:

    - Do not accept any requests while suspended.

    - Only process power management requests while suspending or resuming.

    Submitting BLK_MQ_REQ_PM requests to a device that is runtime suspended
    causes runtime-suspended devices not to resume as they should. The request
    which should cause a runtime resume instead gets issued directly, without
    resuming the device first. Of course the device can't handle it properly,
    the I/O fails, and the device remains suspended.

    The problem is fixed by checking that the queue's runtime-PM status isn't
    RPM_SUSPENDED before allowing a request to be issued, and queuing a
    runtime-resume request if it is. In particular, the inline
    blk_pm_request_resume() routine is renamed blk_pm_resume_queue() and the
    code is unified by merging the surrounding checks into the routine. If the
    queue isn't set up for runtime PM, or there currently is no restriction on
    allowed requests, the request is allowed. Likewise if the BLK_MQ_REQ_PM
    flag is set and the status isn't RPM_SUSPENDED. Otherwise a runtime resume
    is queued and the request is blocked until conditions are more suitable.
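
    A hedged sketch of the renamed helper's decision logic, simplified from
    the patch:

    static bool blk_pm_resume_queue(const bool pm, struct request_queue *q)
    {
            if (!q->dev || !blk_queue_pm_only(q))
                    return true;   /* no runtime-PM restriction in effect */
            if (pm && q->rpm_status != RPM_SUSPENDED)
                    return true;   /* BLK_MQ_REQ_PM allowed unless suspended */
            pm_request_resume(q->dev);
            return false;          /* block until the queue has resumed */
    }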

    [ bvanassche: modified commit message and removed Cc: stable because
    without the previous patches from this series this patch would break
    parallel SCSI domain validation + introduced queue_rpm_status() ]

    Link: https://lore.kernel.org/r/20201209052951.16136-9-bvanassche@acm.org
    Cc: Jens Axboe
    Cc: Christoph Hellwig
    Cc: Hannes Reinecke
    Cc: Can Guo
    Cc: Stanley Chu
    Cc: Ming Lei
    Cc: Rafael J. Wysocki
    Reported-and-tested-by: Martin Kepplinger
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Can Guo
    Signed-off-by: Alan Stern
    Signed-off-by: Bart Van Assche
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Alan Stern
     
  • [ Upstream commit a4d34da715e3cb7e0741fe603dcd511bed067e00 ]

    Remove flag RQF_PREEMPT and BLK_MQ_REQ_PREEMPT since these are no longer
    used by any kernel code.

    Link: https://lore.kernel.org/r/20201209052951.16136-8-bvanassche@acm.org
    Cc: Can Guo
    Cc: Stanley Chu
    Cc: Alan Stern
    Cc: Ming Lei
    Cc: Rafael J. Wysocki
    Cc: Martin Kepplinger
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Jens Axboe
    Reviewed-by: Can Guo
    Signed-off-by: Bart Van Assche
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Bart Van Assche
     
  • [ Upstream commit 87dbc209ea04645fd2351981f09eff5d23f8e2e9 ]

    Make <asm/local64.h> mandatory in include/asm-generic/Kbuild and
    remove all arch/*/include/asm/local64.h arch-specific files since they
    only #include <asm-generic/local64.h>.

    This fixes build errors on arch/c6x/ and arch/nios2/ for
    block/blk-iocost.c.

    Build-tested on 21 of 25 arches (tools problems on the others).

    Yes, we could even rename <asm-generic/local64.h> to <linux/local64.h>
    and change all #includes to use <linux/local64.h> instead.
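
    The mechanics are a one-line Kbuild change plus the file deletions:

    # include/asm-generic/Kbuild
    mandatory-y += local64.h    # generate asm/local64.h on every arch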

    Link: https://lkml.kernel.org/r/20201227024446.17018-1-rdunlap@infradead.org
    Signed-off-by: Randy Dunlap
    Suggested-by: Christoph Hellwig
    Reviewed-by: Masahiro Yamada
    Cc: Jens Axboe
    Cc: Ley Foon Tan
    Cc: Mark Salter
    Cc: Aurelien Jacquiot
    Cc: Peter Zijlstra
    Cc: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Randy Dunlap
     
  • [ Upstream commit 0854bcdcdec26aecdc92c303816f349ee1fba2bc ]

    Introduce the BLK_MQ_REQ_PM flag. This flag makes the request allocation
    functions set RQF_PM. This is the first step towards removing
    BLK_MQ_REQ_PREEMPT.
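
    The allocation-side mapping is essentially a one-liner (sketch of the
    check in blk_mq_rq_ctx_init()):

    /* translate the allocation flag into the request flag */
    if (data->flags & BLK_MQ_REQ_PM)
            rq->rq_flags |= RQF_PM;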

    Link: https://lore.kernel.org/r/20201209052951.16136-3-bvanassche@acm.org
    Cc: Alan Stern
    Cc: Stanley Chu
    Cc: Ming Lei
    Cc: Rafael J. Wysocki
    Cc: Can Guo
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Jens Axboe
    Reviewed-by: Can Guo
    Signed-off-by: Bart Van Assche
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Sasha Levin

    Bart Van Assche
     
  • [ Upstream commit bd1248f1ddbc48b0c30565fce897a3b6423313b8 ]

    Check Scell_log shift size in red_check_params() and modify all callers
    of red_check_params() to pass Scell_log.

    This prevents a shift out-of-bounds as detected by UBSAN:
    UBSAN: shift-out-of-bounds in ./include/net/red.h:252:22
    shift exponent 72 is too large for 32-bit type 'int'
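
    A hedged sketch of the strengthened check, following the shape of the
    upstream helper:

    static inline bool red_check_params(u32 qth_min, u32 qth_max, u8 Wlog,
                                        u8 Scell_log)
    {
            if (fls(qth_min) + Wlog > 32)
                    return false;
            if (fls(qth_max) + Wlog > 32)
                    return false;
            if (Scell_log >= 32)         /* new: 1u << Scell_log must fit */
                    return false;
            if (qth_max < qth_min)
                    return false;
            return true;
    }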

    Fixes: 8afa10cbe281 ("net_sched: red: Avoid illegal values")
    Signed-off-by: Randy Dunlap
    Reported-by: syzbot+97c5bd9cc81eca63d36e@syzkaller.appspotmail.com
    Cc: Nogah Frankel
    Cc: Jamal Hadi Salim
    Cc: Cong Wang
    Cc: Jiri Pirko
    Cc: netdev@vger.kernel.org
    Cc: "David S. Miller"
    Cc: Jakub Kicinski
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Randy Dunlap
     

09 Jan, 2021

5 commits

  • [ Upstream commit f7cfd871ae0c5008d94b6f66834e7845caa93c15 ]

    Recently syzbot reported[0] that there is a deadlock amongst the users
    of exec_update_mutex. The problematic lock ordering found by lockdep
    was:

    perf_event_open  (exec_update_mutex -> ovl_i_mutex)
    chown            (ovl_i_mutex       -> sb_writers)
    sendfile         (sb_writers        -> p->lock)
      by reading from a proc file and writing to overlayfs
    proc_pid_syscall (p->lock           -> exec_update_mutex)

    While looking at possible solutions it occurred to me that all of the
    users and possible users involved only want the state of the given
    process to remain the same. They are all readers. The only writer is
    exec.

    There is no reason for readers to block on each other. So fix
    this deadlock by transforming exec_update_mutex into a rw_semaphore
    named exec_update_lock that only exec takes for writing.
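
    The resulting locking discipline, as a sketch (exec_update_lock lives
    in signal_struct, per the patch):

    /* exec (the only writer) excludes everyone while the task mutates: */
    down_write(&task->signal->exec_update_lock);
    /* ... swap mm, credentials, etc. ... */
    up_write(&task->signal->exec_update_lock);

    /* readers (perf_event_open, proc_pid_syscall, ...) run in parallel: */
    if (down_read_interruptible(&task->signal->exec_update_lock))
            return -EINTR;
    /* ... observe a stable task ... */
    up_read(&task->signal->exec_update_lock);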

    Cc: Jann Horn
    Cc: Vasiliy Kulikov
    Cc: Al Viro
    Cc: Bernd Edlinger
    Cc: Oleg Nesterov
    Cc: Christopher Yeoh
    Cc: Cyrill Gorcunov
    Cc: Sargun Dhillon
    Cc: Christian Brauner
    Cc: Arnd Bergmann
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Arnaldo Carvalho de Melo
    Fixes: eea9673250db ("exec: Add exec_update_mutex to replace cred_guard_mutex")
    [0] https://lkml.kernel.org/r/00000000000063640c05ade8e3de@google.com
    Reported-by: syzbot+db9cdf3dd1f64252c6ef@syzkaller.appspotmail.com
    Link: https://lkml.kernel.org/r/87ft4mbqen.fsf@x220.int.ebiederm.org
    Signed-off-by: Eric W. Biederman
    Signed-off-by: Sasha Levin

    Eric W. Biederman
     
  • [ Upstream commit 31784cff7ee073b34d6eddabb95e3be2880a425c ]

    In preparation for converting exec_update_mutex to a rwsem so that
    multiple readers can execute in parallel and not deadlock, add
    down_read_interruptible. This is needed for perf_event_open to be
    converted (with no semantic changes) from working on a mutex to
    working on a rwsem.
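
    Usage follows the other *_interruptible primitives:

    int err = down_read_interruptible(&sem);
    if (err)            /* -EINTR: a signal arrived while waiting */
            return err;
    /* ... read-side critical section ... */
    up_read(&sem);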

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/87k0tybqfy.fsf@x220.int.ebiederm.org
    Signed-off-by: Sasha Levin

    Eric W. Biederman
     
  • [ Upstream commit 0f9368b5bf6db0c04afc5454b1be79022a681615 ]

    In preparation for converting exec_update_mutex to a rwsem so that
    multiple readers can execute in parallel and not deadlock, add
    down_read_killable_nested. This is needed so that kcmp_lock
    can be converted from working on mutexes to working on rw_semaphores.
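
    A sketch of the kcmp-style double read lock this enables (the function
    name is illustrative; the nested annotation keeps lockdep from flagging
    the second acquisition of the same lock class):

    static int kcmp_double_lock_sketch(struct rw_semaphore *l1,
                                       struct rw_semaphore *l2)
    {
            int err = down_read_killable(l1);
            if (err)
                    return err;
            err = down_read_killable_nested(l2, SINGLE_DEPTH_NESTING);
            if (err)
                    up_read(l1);
            return err;
    }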

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Peter Zijlstra (Intel)
    Link: https://lkml.kernel.org/r/87o8jabqh3.fsf@x220.int.ebiederm.org
    Signed-off-by: Sasha Levin

    Eric W. Biederman
     
  • [ Upstream commit 5a7a9e038b032137ae9c45d5429f18a2ffdf7d42 ]

    Use the ib_dma_* helpers to skip the DMA translation instead. This
    removes the last user of dma_virt_ops and keeps the weird layering
    violation inside the RDMA core instead of burdening the DMA mapping
    subsystems with it. This also means the software RDMA drivers no
    longer have to mess with DMA parameters that are not relevant to them
    at all, and that in the future we can use PCI P2P transfers even for
    software RDMA, as there is no fake first layer of DMA mapping that the
    P2P DMA support would have to deal with.

    Link: https://lore.kernel.org/r/20201106181941.1878556-8-hch@lst.de
    Signed-off-by: Christoph Hellwig
    Tested-by: Mike Marciniszyn
    Signed-off-by: Jason Gunthorpe
    Signed-off-by: Sasha Levin

    Christoph Hellwig
     
  • commit aa8c7db494d0a83ecae583aa193f1134ef25d506 upstream.

    Silly GCC doesn't always inline these trivial functions.

    Fixes the following warning:

    arch/x86/kernel/sys_ia32.o: warning: objtool: cp_stat64()+0xd8: call to new_encode_dev() with UACCESS enabled
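
    The fix is to promote the trivial kdev_t helpers so no call instruction
    is emitted inside the user_access_begin()/user_access_end() window,
    e.g. (sketch):

    static __always_inline u64 new_encode_dev(dev_t dev)
    {
            unsigned major = MAJOR(dev);
            unsigned minor = MINOR(dev);

            return (minor & 0xff) | (major << 8) | ((minor & ~0xff) << 12);
    }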

    Link: https://lkml.kernel.org/r/984353b44a4484d86ba9f73884b7306232e25e30.1608737428.git.jpoimboe@redhat.com
    Signed-off-by: Josh Poimboeuf
    Reported-by: Randy Dunlap
    Acked-by: Randy Dunlap [build-tested]
    Cc: Peter Zijlstra
    Cc: Greg Kroah-Hartman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Josh Poimboeuf
     

06 Jan, 2021

2 commits

  • commit a85cbe6159ffc973e5702f70a3bd5185f8f3c38d upstream.

    and include <linux/const.h> in UAPI headers instead of <linux/kernel.h>.

    The reason is to avoid an indirect <linux/sysinfo.h> include when using
    some network headers: <linux/netlink.h> or others -> <linux/kernel.h>
    -> <linux/sysinfo.h>.

    This indirect include causes, on MUSL, a redefinition of struct sysinfo
    when both <sys/sysinfo.h> and some of the UAPI headers are included:

    In file included from x86_64-buildroot-linux-musl/sysroot/usr/include/linux/kernel.h:5,
    from x86_64-buildroot-linux-musl/sysroot/usr/include/linux/netlink.h:5,
    from ../include/tst_netlink.h:14,
    from tst_crypto.c:13:
    x86_64-buildroot-linux-musl/sysroot/usr/include/linux/sysinfo.h:8:8: error: redefinition of `struct sysinfo'
    struct sysinfo {
    ^~~~~~~
    In file included from ../include/tst_safe_macros.h:15,
    from ../include/tst_test.h:93,
    from tst_crypto.c:11:
    x86_64-buildroot-linux-musl/sysroot/usr/include/sys/sysinfo.h:10:8: note: originally defined here

    Link: https://lkml.kernel.org/r/20201015190013.8901-1-petr.vorel@gmail.com
    Signed-off-by: Petr Vorel
    Suggested-by: Rich Felker
    Acked-by: Rich Felker
    Cc: Peter Korsgaard
    Cc: Baruch Siach
    Cc: Florian Weimer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Petr Vorel
     
  • commit dc2da7b45ffe954a0090f5d0310ed7b0b37d2bd2 upstream.

    VMware observed a performance regression during memmap init on their
    platform, and bisected to commit 73a6e474cb376 ("mm: memmap_init:
    iterate over memblock regions rather that check each PFN") causing it.

    Before the commit:

    [0.033176] Normal zone: 1445888 pages used for memmap
    [0.033176] Normal zone: 89391104 pages, LIFO batch:63
    [0.035851] ACPI: PM-Timer IO Port: 0x448

    With the commit:

    [0.026874] Normal zone: 1445888 pages used for memmap
    [0.026875] Normal zone: 89391104 pages, LIFO batch:63
    [2.028450] ACPI: PM-Timer IO Port: 0x448

    The root cause is that the current memmap defer init doesn't work as
    expected.

    Before, memmap_init_zone() was used to do the memmap init of one whole
    zone: initialize all low zones of one numa node, but defer the memmap
    init of the last zone in that numa node. However, since commit
    73a6e474cb376, memmap_init() is adapted to iterate over the memblock
    regions inside one zone, then call memmap_init_zone() to do the memmap
    init for each region.

    E.g., on VMware's system, the memory layout is as below; there are two
    memory regions in node 2. The current code will mistakenly initialize
    the whole 1st region [mem 0xab00000000-0xfcffffffff], then do the
    memmap defer to initialize only one memory section on the 2nd region
    [mem 0x10000000000-0x1033fffffff]. In fact, we only expect one memory
    section's memmap to be initialized. That's why so much more time is
    spent here.

    [ 0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff]
    [ 0.008842] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0xbfffffff]
    [ 0.008843] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x55ffffffff]
    [ 0.008844] ACPI: SRAT: Node 1 PXM 1 [mem 0x5600000000-0xaaffffffff]
    [ 0.008844] ACPI: SRAT: Node 2 PXM 2 [mem 0xab00000000-0xfcffffffff]
    [ 0.008845] ACPI: SRAT: Node 2 PXM 2 [mem 0x10000000000-0x1033fffffff]

    Now, let's add a parameter 'zone_end_pfn' to memmap_init_zone() to pass
    down the real zone end pfn, so that defer_init() can use it to judge
    whether deferred init should be applied zone-wide.
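
    A heavily simplified, hedged sketch of why the parameter matters (the
    real defer_init() also counts initialized sections and records
    first_deferred_pfn):

    static bool defer_init(int nid, unsigned long pfn,
                           unsigned long zone_end_pfn)
    {
            /*
             * Compare against the true zone end, so a zone spanning several
             * memblock regions is judged as a whole: only the node's last
             * zone (the one reaching pgdat_end_pfn) may be deferred.
             */
            if (zone_end_pfn < pgdat_end_pfn(NODE_DATA(nid)))
                    return false;
            /* (real code: defer everything past the first section) */
            return true;
    }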

    Link: https://lkml.kernel.org/r/20201223080811.16211-1-bhe@redhat.com
    Link: https://lkml.kernel.org/r/20201223080811.16211-2-bhe@redhat.com
    Fixes: commit 73a6e474cb376 ("mm: memmap_init: iterate over memblock regions rather that check each PFN")
    Signed-off-by: Baoquan He
    Reported-by: Rahul Gopakumar
    Reviewed-by: Mike Rapoport
    Cc: David Hildenbrand
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Baoquan He
     

30 Dec, 2020

19 commits

  • commit 5812b32e01c6d86ba7a84110702b46d8a8531fe9 upstream.

    Specify type alignment when declaring linker-section match-table entries
    to prevent gcc from increasing alignment and corrupting the various
    tables with padding (e.g. timers, irqchips, clocks, reserved memory).

    This is specifically needed on x86 where gcc (typically) aligns larger
    objects like struct of_device_id with static extent on 32-byte
    boundaries which at best prevents matching on anything but the first
    entry. Specifying alignment when declaring variables suppresses this
    optimisation.

    Here's a 64-bit example where all entries are corrupt as 16 bytes of
    padding has been inserted before the first entry:

    ffffffff8266b4b0 D __clk_of_table
    ffffffff8266b4c0 d __of_table_fixed_factor_clk
    ffffffff8266b5a0 d __of_table_fixed_clk
    ffffffff8266b680 d __clk_of_table_sentinel

    And here's a 32-bit example where the 8-byte-aligned table happens to be
    placed on a 32-byte boundary so that all but the first entry are corrupt
    due to the 28 bytes of padding inserted between entries:

    812b3ec0 D __irqchip_of_table
    812b3ec0 d __of_table_irqchip1
    812b3fa0 d __of_table_irqchip2
    812b4080 d __of_table_irqchip3
    812b4160 d irqchip_of_match_end

    Verified on x86 using gcc-9.3 and gcc-4.9 (which uses 64-byte
    alignment), and on arm using gcc-7.2.

    Note that there are no in-tree users of these tables on x86 currently
    (even if they are included in the image).
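
    The fix pins the alignment in the declaration macro itself; a sketch,
    simplified from the _OF_DECLARE() macro in include/linux/of.h:

    #define _OF_DECLARE(table, name, compat, fn, fn_type)              \
            static const struct of_device_id __of_table_##name         \
                    __used __section("__" #table "_of_table")          \
                    __aligned(__alignof__(struct of_device_id))        \
                    = { .compatible = compat, .data = fn }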

    Fixes: 54196ccbe0ba ("of: consolidate linker section OF match table declarations")
    Fixes: f6e916b82022 ("irqchip: add basic infrastructure")
    Cc: stable # 3.9
    Signed-off-by: Johan Hovold
    Link: https://lore.kernel.org/r/20201123102319.8090-2-johan@kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Johan Hovold
     
  • commit 3dc86ca6b4c8cfcba9da7996189d1b5a358a94fc upstream.

    This commit adds a counter of pending messages for each watch to
    struct xenbus_watch. It is used to skip the unnecessary pending-message
    lookup in 'unregister_xenbus_watch()'. It could also be used in the
    'will_handle' callback.

    This is part of XSA-349

    Cc: stable@vger.kernel.org
    Signed-off-by: SeongJae Park
    Reported-by: Michael Kurth
    Reported-by: Pawel Wieczorkiewicz
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross
    Signed-off-by: Greg Kroah-Hartman

    SeongJae Park
     
  • commit 2e85d32b1c865bec703ce0c962221a5e955c52c2 upstream.

    Some code does not directly create a 'xenbus_watch' object and call
    'register_xenbus_watch()', but uses 'xenbus_watch_path()' instead. This
    commit adds support for the 'will_handle' callback in
    'xenbus_watch_path()' and its wrapper, 'xenbus_watch_pathfmt()'.

    This is part of XSA-349

    Cc: stable@vger.kernel.org
    Signed-off-by: SeongJae Park
    Reported-by: Michael Kurth
    Reported-by: Pawel Wieczorkiewicz
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross
    Signed-off-by: Greg Kroah-Hartman

    SeongJae Park
     
  • commit fed1755b118147721f2c87b37b9d66e62c39b668 upstream.

    If the handling logic for watch events is slower than the event
    enqueue logic and the events can be created by the guests, the guests
    could trigger memory pressure by intensively inducing the events,
    because a huge number of pending events would be created, exhausting
    memory.

    Fortunately, some watch events could be ignored, depending on the
    handler callback. For example, if the callback is interested in only a
    single path, the watch doesn't want multiple pending events. Or, some
    watches could ignore events to the same path.

    To let such watches voluntarily help avoid the memory pressure
    situation, this commit introduces a new watch callback, 'will_handle'.
    If it is not NULL, it will be called for each new event just before
    enqueuing it. Then, if the callback returns false, the event will be
    discarded. No watch is using the callback for now, though.
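
    In the event-receive path this amounts to (sketch; the callback's
    arguments follow the xenbus watch callback convention):

    /* ask the watch before committing memory for a pending event */
    if (watch->will_handle && !watch->will_handle(watch, path, token))
            return;        /* watch declined: discard instead of enqueuing */
    /* otherwise allocate and enqueue the event as before */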

    This is part of XSA-349

    Cc: stable@vger.kernel.org
    Signed-off-by: SeongJae Park
    Reported-by: Michael Kurth
    Reported-by: Pawel Wieczorkiewicz
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross
    Signed-off-by: Greg Kroah-Hartman

    SeongJae Park
     
  • commit 0fb6ee8d0b5e90b72f870f76debc8bd31a742014 upstream.

    Use heap-allocated memory for the SPI transfer buffer. Using stack
    memory can corrupt stack memory when using DMA on some systems.

    This change moves the buffer from the stack of the trigger handler
    into the (heap-allocated) state struct. The size increase takes into
    account the alignment for the timestamp, which is 8 bytes.

    The 'data' buffer is split into 'tx_buf' and 'rx_buf', to make a
    clearer separation of which part of the buffer should be used for TX
    and RX.
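
    A sketch of the state-struct change (sizes illustrative):

    struct ad_sigma_delta {
            /* ... existing members ... */

            /*
             * DMA-safe transfer buffers; rx_buf leaves room for the
             * 8-byte-aligned timestamp appended by the trigger handler.
             */
            uint8_t tx_buf[4] ____cacheline_aligned;
            uint8_t rx_buf[16];
    };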

    Fixes: af3008485ea03 ("iio:adc: Add common code for ADI Sigma Delta devices")
    Signed-off-by: Lars-Peter Clausen
    Signed-off-by: Alexandru Ardelean
    Link: https://lore.kernel.org/r/20201124123807.19717-1-alexandru.ardelean@analog.com
    Cc:
    Signed-off-by: Jonathan Cameron
    Signed-off-by: Greg Kroah-Hartman

    Lars-Peter Clausen
     
  • commit fecc4559780d52d174ea05e3bf543669165389c3 upstream.

    fsnotify_parent() used to send two separate events to backends when a
    parent inode is watching children and the child inode is also watching.
    In an attempt to avoid duplicate events in fanotify, we unified the two
    backend callbacks to a single callback and handled the reporting of the
    two separate events for the relevant backends (inotify and dnotify).
    However the handling is buggy and can result in inotify and dnotify
    listeners receiving events of the type they never asked for or spurious
    events.

    The problem is the unified event callback with two inode marks (parent and
    child) is called when any of the parent and child inodes are watched and
    interested in the event, but the parent inode's mark that is interested
    in the event on the child is not necessarily the one we are currently
    reporting to (it could belong to a different group).

    So before reporting the parent or child event flavor to backend we need
    to check that the mark is really interested in that event flavor.

    The semantics of INODE and CHILD marks were hard to follow and made the
    logic more complicated than it should have been. Replace it with INODE
    and PARENT marks semantics to hopefully make the logic more clear.

    Thanks to Hugh Dickins for spotting a bug in the earlier version of this
    patch.

    Fixes: 497b0c5a7c06 ("fsnotify: send event to parent and child with single callback")
    CC: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20201202120713.702387-4-amir73il@gmail.com
    Reported-by: Hugh Dickins
    Signed-off-by: Amir Goldstein
    Signed-off-by: Jan Kara
    Signed-off-by: Greg Kroah-Hartman

    Amir Goldstein
     
  • commit 950cc0d2bef078e1f6459900ca4d4b2a2e0e3c37 upstream.

    The handle_inode_event() interface was added as (quoting comment):
    "a simple variant of handle_event() for groups that only have inode
    marks and don't have ignore mask".

    In other words, all backends except fanotify. The inotify backend
    also falls under this category, but because it required extra arguments
    it was left out of the initial pass of backends conversion to the
    simple interface.

    This results in code duplication between the generic helper
    fsnotify_handle_event() and the inotify_handle_event() callback
    which also happen to be buggy code.

    Generalize the handle_inode_event() arguments and add the check for
    FS_EXCL_UNLINK flag to the generic helper, so inotify backend could
    be converted to use the simple interface.

    Link: https://lore.kernel.org/r/20201202120713.702387-2-amir73il@gmail.com
    CC: stable@vger.kernel.org
    Fixes: b9a1b9772509 ("fsnotify: create method handle_inode_event() in fsnotify_operations")
    Signed-off-by: Amir Goldstein
    Signed-off-by: Jan Kara
    Signed-off-by: Greg Kroah-Hartman

    Amir Goldstein
     
  • commit 0f966cba95c78029f491b433ea95ff38f414a761 upstream.

    Add a per-transaction flag to indicate that the buffer
    must be cleared when the transaction is complete to
    prevent copies of sensitive data from being preserved
    in memory.
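
    From the sender's side the flag is set on the transaction data
    (sketch):

    struct binder_transaction_data tr = { 0 };

    /* ask the driver to zero the buffer when the transaction completes */
    tr.flags |= TF_CLEAR_BUF;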

    Signed-off-by: Todd Kjos
    Link: https://lore.kernel.org/r/20201120233743.3617529-1-tkjos@google.com
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Todd Kjos
     
  • commit 7482c5cb90e5a7f9e9e12dd154d405e0219656e3 upstream.

    The idea behind acpi_pm_set_bridge_wakeup() was to allow bridges to
    be reference counted for wakeup enabling, because they may be enabled
    to signal wakeup on behalf of their subordinate devices and that may
    happen multiple times in a row, whereas for the other devices
    it only makes sense to enable wakeup signaling once.

    However, this becomes problematic if the bridge itself is suspended,
    because it is treated as a "regular" device in that case and the
    reference counting doesn't work.

    For instance, suppose that there are two devices below a bridge and
    they both can signal wakeup. Every time one of them is suspended,
    wakeup signaling is enabled for the bridge, so when they both have
    been suspended, the bridge's wakeup reference counter value is 2.

    Say that the bridge is suspended subsequently and acpi_pci_wakeup()
    is called for it. Because the bridge can signal wakeup, that
    function will invoke acpi_pm_set_device_wakeup() to configure it
    and __acpi_pm_set_device_wakeup() will be called with the last
    argument equal to 1. This causes __acpi_device_wakeup_enable()
    invoked by it to omit the reference counting, because the reference
    counter of the target device (the bridge) is 2 at that time.

    Now say that the bridge resumes and one of the device below it
    resumes too, so the bridge's reference counter becomes 0 and
    wakeup signaling is disabled for it, but there is still the other
    suspended device which may need the bridge to signal wakeup on its
    behalf and that is not going to work.

    To address this scenario, use wakeup enable reference counting for
    all devices, not just for bridges, so drop the last argument from
    __acpi_device_wakeup_enable() and __acpi_pm_set_device_wakeup(),
    which causes acpi_pm_set_device_wakeup() and
    acpi_pm_set_bridge_wakeup() to become identical, so drop the latter
    and use the former instead of it everywhere.

    Fixes: 1ba51a7c1496 ("ACPI / PCI / PM: Rework acpi_pci_propagate_wakeup()")
    Signed-off-by: Rafael J. Wysocki
    Reviewed-by: Mika Westerberg
    Acked-by: Bjorn Helgaas
    Cc: 4.14+ # 4.14+
    Signed-off-by: Greg Kroah-Hartman

    Rafael J. Wysocki
     
  • [ Upstream commit 75f4d4544db9fa34e1f04174f27d9f8a387be37d ]

    The BIT() macro is not available for the UAPI headers. Moreover, it can
    be defined differently in user space headers. Thus, replace its usage
    with the _BITUL() macro, which is already used in other macro
    definitions in <linux/devlink.h>.
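
    The substitution pattern (the macro name here is illustrative, not the
    actual devlink define):

    #include <linux/const.h>

    /* before: #define EXAMPLE_FLAG  BIT(0)   -- BIT() is kernel-only */
    #define EXAMPLE_FLAG  _BITUL(0)           /* UAPI-safe equivalent */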

    Fixes: dc64cc7c6310 ("devlink: Add devlink reload limit option")
    Signed-off-by: Tobias Klauser
    Link: https://lore.kernel.org/r/20201215102531.16958-1-tklauser@distanz.ch
    Signed-off-by: Jakub Kicinski
    Signed-off-by: Sasha Levin

    Tobias Klauser
     
  • [ Upstream commit c6c75deda81344c3a95d1d1f606d5cee109e5d54 ]

    Commit 1fde6f21d90f ("proc: fix /proc/net/* after setns(2)") only forced
    revalidation of regular files under /proc/net/.

    However, /proc/net/ is unusual in the sense that /proc/net/foo handlers
    take the netns pointer from the parent directory, which is the old
    netns.

    Steps to reproduce:

    (void)open("/proc/net/sctp/snmp", O_RDONLY);
    unshare(CLONE_NEWNET);

    int fd = open("/proc/net/sctp/snmp", O_RDONLY);
    read(fd, &c, 1);

    The read will return wrong data, from the original netns.

    The patch forces a lookup on every directory under /proc/net.

    Link: https://lkml.kernel.org/r/20201205160916.GA109739@localhost.localdomain
    Fixes: 1da4d377f943 ("proc: revalidate misc dentries")
    Signed-off-by: Alexey Dobriyan
    Reported-by: "Rantala, Tommi T. (Nokia - FI/Espoo)"
    Cc: Al Viro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Alexey Dobriyan
     
  • [ Upstream commit 013339df116c2ee0d796dd8bfb8f293a2030c063 ]

    Since commit 369ea8242c0f ("mm/rmap: update to new mmu_notifier semantic
    v2"), the code to check the secondary MMU's page table access bit is
    broken for !(TTU_IGNORE_ACCESS) because the page is unmapped from the
    secondary MMU's page table before the check. More specifically for those
    secondary MMUs which unmap the memory in
    mmu_notifier_invalidate_range_start() like kvm.

    However memory reclaim is the only user of !(TTU_IGNORE_ACCESS) or the
    absence of TTU_IGNORE_ACCESS and it explicitly performs the page table
    access check before trying to unmap the page. So, at worst the reclaim
    will miss accesses in a very short window if we remove page table access
    check in unmapping code.

    There is an unintended consequence of !(TTU_IGNORE_ACCESS) for memcg
    reclaim. In memcg reclaim, page_referenced() only accounts accesses
    from the processes which are in the same memcg as the target page,
    but the unmapping code considers accesses from all processes,
    decreasing the effectiveness of memcg reclaim.

    The simplest solution is to always assume TTU_IGNORE_ACCESS in unmapping
    code.

    Link: https://lkml.kernel.org/r/20201104231928.1494083-1-shakeelb@google.com
    Fixes: 369ea8242c0f ("mm/rmap: update to new mmu_notifier semantic v2")
    Signed-off-by: Shakeel Butt
    Acked-by: Johannes Weiner
    Cc: Hugh Dickins
    Cc: Jerome Glisse
    Cc: Vlastimil Babka
    Cc: Michal Hocko
    Cc: Andrea Arcangeli
    Cc: Dan Williams
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Shakeel Butt
     
  • [ Upstream commit 57efa1fe5957694fa541c9062de0a127f0b9acb0 ]

    Since commit 70e806e4e645 ("mm: Do early cow for pinned pages during
    fork() for ptes") pages under a FOLL_PIN will not be write protected
    during COW for fork. This means that pages returned from
    pin_user_pages(FOLL_WRITE) should not become write protected while the pin
    is active.

    However, there is a small race where get_user_pages_fast(FOLL_PIN) can
    establish a FOLL_PIN at the same time copy_present_page() is write
    protecting it:

    CPU 0                                 CPU 1
    get_user_pages_fast()
     internal_get_user_pages_fast()
                                          copy_page_range()
                                            pte_alloc_map_lock()
                                              copy_present_page()
                                                atomic_read(has_pinned) == 0
                                                page_maybe_dma_pinned() == false
                                                atomic_set(has_pinned, 1);
     gup_pgd_range()
      gup_pte_range()
       pte_t pte = gup_get_pte(ptep)
       pte_access_permitted(pte)
       try_grab_compound_head()
                                                pte = pte_wrprotect(pte)
                                                set_pte_at();
                                                pte_unmap_unlock()
      // GUP now returns with a write protected page

    The first attempt to resolve this by using the write protect caused
    problems (and was missing a barrier); see commit f3c64eda3e50 ("mm:
    avoid early COW write protect games during fork()").

    Instead wrap copy_p4d_range() with the write side of a seqcount and check
    the read side around gup_pgd_range(). If there is a collision then
    get_user_pages_fast() fails and falls back to slow GUP.

    Slow GUP is safe against this race because copy_page_range() is only
    called while holding the exclusive side of the mmap_lock on the src
    mm_struct.
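
    In sketch form (field and function names from the patch; the real code
    additionally guards the read side with FOLL_PIN checks and argument
    lists are abbreviated):

    /* fork(), write side - runs under the exclusive mmap_lock of src_mm */
    raw_write_seqcount_begin(&src_mm->write_protect_seq);
    ret = copy_p4d_range(dst_vma, src_vma, dst_pgd, src_pgd, addr, next);
    raw_write_seqcount_end(&src_mm->write_protect_seq);

    /* GUP-fast, read side */
    seq = raw_read_seqcount(&current->mm->write_protect_seq);
    if (seq & 1)           /* fork is mid-copy: bail out to slow GUP */
            return 0;
    gup_pgd_range(addr, end, gup_flags, pages, &nr_pinned);
    if (read_seqcount_retry(&current->mm->write_protect_seq, seq))
            return 0;      /* raced with fork: retry via slow GUP */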

    [akpm@linux-foundation.org: coding style fixes]
    Link: https://lore.kernel.org/r/CAHk-=wi=iCnYCARbPGjkVJu9eyYeZ13N64tZYLdOB8CP5Q_PLw@mail.gmail.com

    Link: https://lkml.kernel.org/r/2-v4-908497cf359a+4782-gup_fork_jgg@nvidia.com
    Fixes: f3c64eda3e50 ("mm: avoid early COW write protect games during fork()")
    Signed-off-by: Jason Gunthorpe
    Suggested-by: Linus Torvalds
    Reviewed-by: John Hubbard
    Reviewed-by: Jan Kara
    Reviewed-by: Peter Xu
    Acked-by: "Ahmed S. Darwish" [seqcount_t parts]
    Cc: Andrea Arcangeli
    Cc: "Aneesh Kumar K.V"
    Cc: Christoph Hellwig
    Cc: Hugh Dickins
    Cc: Jann Horn
    Cc: Kirill Shutemov
    Cc: Kirill Tkhai
    Cc: Leon Romanovsky
    Cc: Michal Hocko
    Cc: Oleg Nesterov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Sasha Levin

    Jason Gunthorpe
     
  • [ Upstream commit 88149082bb8ef31b289673669e080ec6a00c2e59 ]

    If generic_drop_inode() returns true, it means iput_final() can evict
    this inode regardless of whether it is dirty or not. If we check
    I_DONTCACHE in generic_drop_inode(), any inode with this bit set will be
    evicted unconditionally. This is not the desired behavior because
    I_DONTCACHE only means the inode shouldn't be cached on the LRU list.
    As for whether we need to evict this inode, this is what
    generic_drop_inode() should do. This patch corrects the usage of
    I_DONTCACHE.

    This patch was proposed in [1].

    [1]: https://lore.kernel.org/linux-fsdevel/20200831003407.GE12096@dread.disaster.area/
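
    The corrected logic in iput_final(), sketched (lock handling omitted):

    drop = generic_drop_inode(inode);   /* no longer checks I_DONTCACHE */

    if (!drop && !(inode->i_state & I_DONTCACHE) &&
        (sb->s_flags & SB_ACTIVE)) {
            inode_add_lru(inode);       /* cache it on the LRU */
            return;
    }
    /* otherwise fall through to eviction */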

    Fixes: dae2f8ed7992 ("fs: Lift XFS_IDONTCACHE to the VFS layer")
    Signed-off-by: Hao Li
    Reviewed-by: Dave Chinner
    Reviewed-by: Ira Weiny
    Signed-off-by: Al Viro
    Signed-off-by: Sasha Levin

    Hao Li
     
  • [ Upstream commit e0da68994d16b46384cce7b86eb645f1ef7c51ef ]

    Fix incorrect type of max_entries in UVERBS_METHOD_QUERY_GID_TABLE -
    max_entries is of type size_t although it can take negative values.

    The following static check revealed it:

    drivers/infiniband/core/uverbs_std_types_device.c:338 ib_uverbs_handler_UVERBS_METHOD_QUERY_GID_TABLE() warn: 'max_entries' unsigned
    Signed-off-by: Avihai Horon
    Signed-off-by: Leon Romanovsky
    Signed-off-by: Jason Gunthorpe
    Signed-off-by: Sasha Levin

    Avihai Horon
     
  • [ Upstream commit d9a9280a0d0ae51dc1d4142138b99242b7ec8ac6 ]

    Building with W=2 prints a number of warnings for one function that
    has a pointer type mismatch:

    linux/seq_buf.h: In function 'seq_buf_init':
    linux/seq_buf.h:35:12: warning: pointer targets in assignment from 'unsigned char *' to 'char *' differ in signedness [-Wpointer-sign]

    Change the type in the function prototype according to the type in
    the structure.
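
    The corrected prototype, matching the char *buffer member of struct
    seq_buf:

    static inline void seq_buf_init(struct seq_buf *s, char *buf,
                                    unsigned int size)
    {
            s->buffer = buf;
            s->size = size;
            seq_buf_clear(s);
    }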

    Link: https://lkml.kernel.org/r/20201026161108.3707783-1-arnd@kernel.org

    Fixes: 9a7777935c34 ("tracing: Convert seq_buf fields to be like seq_file fields")
    Reviewed-by: Cezary Rojewski
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Steven Rostedt (VMware)
    Signed-off-by: Sasha Levin

    Arnd Bergmann
     
  • [ Upstream commit d5aa6b22e2258f05317313ecc02efbb988ed6d38 ]

    According to RFC 5666, the correct netid for an IPv6-addressed RDMA
    transport is "rdma6", which we've supported as a mount option since
    Linux 4.7. The problem is that when we try to load the module
    "xprtrdma6", it will fail, since there is no module alias of that
    name.
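
    The fix is essentially one line in the xprtrdma module (the alias
    string follows from the module name quoted above):

    MODULE_ALIAS("xprtrdma6");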

    Fixes: 181342c5ebe8 ("xprtrdma: Add rdma6 option to support NFS/RDMA IPv6")
    Signed-off-by: Trond Myklebust
    Signed-off-by: Sasha Levin

    Trond Myklebust
     
  • [ Upstream commit b3cc73d2bf14e7c6e0376fa9433e708349e9ddfc ]

    Document that the caller of v4l2_fwnode_endpoint_parse() must
    initialize the fields of struct v4l2_fwnode_endpoint (the vep
    argument).

    It used to be that the fields were zeroed by
    v4l2_fwnode_endpoint_parse() when the bus type was set to
    V4L2_MBUS_UNKNOWN, but with recent changes (see the Fixes: line below)
    that no longer makes sense.
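
    A caller now initializes vep itself, e.g.:

    struct v4l2_fwnode_endpoint vep = {
            /* request a bus type, or V4L2_MBUS_UNKNOWN to autodetect */
            .bus_type = V4L2_MBUS_CSI2_DPHY,
    };
    int err = v4l2_fwnode_endpoint_parse(ep_fwnode, &vep);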

    Fixes: bb4bba9232fc ("media: v4l2-fwnode: Make bus configuration a struct")
    Signed-off-by: Sakari Ailus
    Reviewed-by: Niklas Söderlund
    Reviewed-by: Laurent Pinchart
    Signed-off-by: Mauro Carvalho Chehab
    Signed-off-by: Sasha Levin

    Sakari Ailus
     
  • [ Upstream commit 69baf338fc16a4d55c78da8874ce3f06feb38c78 ]

    Return -EINVAL if invalid bus-type is detected while parsing endpoints.

    Fixes: 26c1126c9b56 ("media: v4l: fwnode: Use media bus type for bus parser selection")
    Signed-off-by: Lad Prabhakar
    Signed-off-by: Sakari Ailus
    Signed-off-by: Mauro Carvalho Chehab
    Signed-off-by: Sasha Levin

    Lad Prabhakar
     

26 Dec, 2020

3 commits

  • commit 92eb6c3060ebe3adf381fd9899451c5b047bb14d upstream.

    Commit 3f69cc60768b ("crypto: af_alg - Allow arbitrarily long algorithm
    names") made the kernel start accepting arbitrarily long algorithm names
    in sockaddr_alg. However, the actual length of the salg_name field
    stayed at the original 64 bytes.

    This is broken because the kernel can access indices >= 64 in salg_name,
    which is undefined behavior -- even though the memory that is accessed
    is still located within the sockaddr structure. It would only be
    defined behavior if the array were properly marked as arbitrary-length
    (either by making it a flexible array, which is the recommended way
    these days, or by making it an array of length 0 or 1).

    We can't simply change salg_name into a flexible array, since that would
    break source compatibility with userspace programs that embed
    sockaddr_alg into another struct, or (more commonly) declare a
    sockaddr_alg like 'struct sockaddr_alg sa = { .salg_name = "foo" };'.

    One solution would be to change salg_name into a flexible array only
    when '#ifdef __KERNEL__'. However, that would keep userspace without an
    easy way to actually use the longer algorithm names.

    Instead, add a new structure 'sockaddr_alg_new' that has the flexible
    array field, and expose it to both userspace and the kernel.
    Make the kernel use it correctly in alg_bind().
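
    The new structure as exposed in the UAPI header:

    struct sockaddr_alg_new {
            __u16   salg_family;
            __u8    salg_type[14];
            __u32   salg_feat;
            __u32   salg_mask;
            __u8    salg_name[];   /* flexible array: arbitrarily long name */
    };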

    This addresses the syzbot report
    "UBSAN: array-index-out-of-bounds in alg_bind"
    (https://syzkaller.appspot.com/bug?extid=92ead4eb8e26a26d465e).

    Reported-by: syzbot+92ead4eb8e26a26d465e@syzkaller.appspotmail.com
    Fixes: 3f69cc60768b ("crypto: af_alg - Allow arbitrarily long algorithm names")
    Cc: # v4.12+
    Signed-off-by: Eric Biggers
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit 159e1de201b6fca10bfec50405a3b53a561096a8 upstream.

    It's possible to create a duplicate filename in an encrypted directory
    by creating a file concurrently with adding the encryption key.

    Specifically, sys_open(O_CREAT) (or sys_mkdir(), sys_mknod(), or
    sys_symlink()) can lookup the target filename while the directory's
    encryption key hasn't been added yet, resulting in a negative no-key
    dentry. The VFS then calls ->create() (or ->mkdir(), ->mknod(), or
    ->symlink()) because the dentry is negative. Normally, ->create() would
    return -ENOKEY due to the directory's key being unavailable. However,
    if the key was added between the dentry lookup and ->create(), then the
    filesystem will go ahead and try to create the file.

    If the target filename happens to already exist as a normal name (not a
    no-key name), a duplicate filename may be added to the directory.

    In order to fix this, we need to fix the filesystems to prevent
    ->create(), ->mkdir(), ->mknod(), and ->symlink() on no-key names.
    (->rename() and ->link() need it too, but those are already handled
    correctly by fscrypt_prepare_rename() and fscrypt_prepare_link().)

    In preparation for this, add a helper function fscrypt_is_nokey_name()
    that filesystems can use to do this check. Use this helper function for
    the existing checks that fs/crypto/ does for rename and link.
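
    The helper itself is a one-line check of the dentry flag set at lookup
    time:

    static inline bool fscrypt_is_nokey_name(const struct dentry *dentry)
    {
            return dentry->d_flags & DCACHE_NOKEY_NAME;
    }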

    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20201118075609.120337-2-ebiggers@kernel.org
    Signed-off-by: Eric Biggers
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     
  • commit 3ceb6543e9cf6ed87cc1fbc6f23ca2db903564cd upstream.

    There isn't really any valid reason to use __FSCRYPT_MODE_MAX or
    FSCRYPT_POLICY_FLAGS_VALID in a userspace program. These constants are
    only meant to be used by the kernel internally, and they are defined in
    the UAPI header next to the mode numbers and flags only so that kernel
    developers don't forget to update them when adding new modes or flags.

    In https://lkml.kernel.org/r/20201005074133.1958633-2-satyat@google.com
    there was an example of someone wanting to use __FSCRYPT_MODE_MAX in a
    user program, and it was wrong because the program would have broken if
    __FSCRYPT_MODE_MAX were ever increased. So having this definition
    available is harmful. FSCRYPT_POLICY_FLAGS_VALID has the same problem.

    So, remove these definitions from the UAPI header. Replace
    FSCRYPT_POLICY_FLAGS_VALID with just listing the valid flags explicitly
    in the one kernel function that needs it. Move __FSCRYPT_MODE_MAX to
    fscrypt_private.h, remove the double underscores (which were only
    present to discourage use by userspace), and add a BUILD_BUG_ON() and
    comments to (hopefully) ensure it is kept in sync.

    Keep the old name FS_POLICY_FLAGS_VALID, since it's been around for
    longer and there's a greater chance that removing it would break source
    compatibility with some program. Indeed, mtd-utils is using it in
    an #ifdef, and removing it would introduce compiler warnings (about
    FS_POLICY_FLAGS_PAD_* being redefined) into the mtd-utils build.
    However, reduce its value to 0x07 so that it only includes the flags
    with old names (the ones present before Linux 5.4), and try to make it
    clear that it's now "frozen" and no new flags should be added to it.
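
    The frozen legacy mask therefore covers exactly the pre-5.4 flags:

    #define FS_POLICY_FLAGS_PAD_MASK        0x03
    #define FS_POLICY_FLAG_DIRECT_KEY       0x04
    #define FS_POLICY_FLAGS_VALID           0x07   /* frozen: do not extend */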

    Fixes: 2336d0deb2d4 ("fscrypt: use FSCRYPT_ prefix for uapi constants")
    Cc: # v5.4+
    Link: https://lore.kernel.org/r/20201024005132.495952-1-ebiggers@kernel.org
    Signed-off-by: Eric Biggers
    Signed-off-by: Greg Kroah-Hartman

    Eric Biggers
     

21 Dec, 2020

2 commits

  • commit 8010622c86ca5bb44bc98492f5968726fc7c7a21 upstream.

    UAS does not share usb-storage's pessimistic assumption that devices
    cannot deal with WRITE_SAME. However, a few devices supported by UAS
    are reported to not deal well with WRITE_SAME. Those need a quirk.

    Add it to the device that needs it.
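
    The shape of the fix is an unusual_uas.h entry carrying the new
    US_FL_NO_SAME flag (the VID/PID and strings below are placeholders,
    not the actual device):

    UNUSUAL_DEV(0x1234, 0x5678, 0x0000, 0x9999,
                "Example Vendor",
                "Example UAS Disk",
                USB_SC_DEVICE, USB_PR_DEVICE, NULL,
                US_FL_NO_SAME),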

    Reported-by: David C. Partridge
    Signed-off-by: Oliver Neukum
    Cc: stable
    Link: https://lore.kernel.org/r/20201209152639.9195-1-oneukum@suse.com
    Signed-off-by: Greg Kroah-Hartman

    Oliver Neukum
     
  • commit 0032ce0f85a269a006e91277be5fdbc05fad8426 upstream.

    ptrace_get_syscall_info() is potentially copying uninitialized stack
    memory to userspace, since the compiler may leave a 3-byte hole near the
    beginning of `info`. Fix it by adding a padding field to `struct
    ptrace_syscall_info`.
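
    A sketch of the fixed layout (trailing members abbreviated):

    struct ptrace_syscall_info {
            __u8 op;        /* PTRACE_SYSCALL_INFO_* */
            __u8 pad[3];    /* new: makes the implicit hole explicit */
            __u32 arch;
            __u64 instruction_pointer;
            __u64 stack_pointer;
            /* union of entry/exit/seccomp views follows */
    };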

    Fixes: 201766a20e30 ("ptrace: add PTRACE_GET_SYSCALL_INFO request")
    Suggested-by: Dan Carpenter
    Signed-off-by: Peilin Ye
    Reviewed-by: Dmitry V. Levin
    Cc: stable@vger.kernel.org
    Link: https://lore.kernel.org/r/20200801152044.230416-1-yepeilin.cs@gmail.com
    Signed-off-by: Christian Brauner
    Signed-off-by: Greg Kroah-Hartman

    Peilin Ye