09 Jan, 2015

40 commits

  • commit 9cc46516ddf497ea16e8d7cb986ae03a0f6b92f8 upstream.

    - Expose the knob to user space through a proc file /proc/<pid>/setgroups

    A value of "deny" means the setgroups system call is disabled in the
    current process's user namespace and can not be enabled in the
    future in this user namespace.

    A value of "allow" means the setgroups system call is enabled.

    - Descendant user namespaces inherit the value of setgroups from
    their parents.

    - A proc file is used (instead of a sysctl) as sysctls currently do
    not allow checking the permissions at open time.

    - Writing to the proc file is restricted to before the gid_map
    for the user namespace is set.

    This ensures that disabling setgroups at a user namespace
    level will never remove the ability to call setgroups
    from a process that already has that ability.

    A process may opt in to the setgroups disable for itself by
    creating, entering and configuring a user namespace or by calling
    setns on an existing user namespace with setgroups disabled.
    Processes without privileges already can not call setgroups so this
    is a noop. Processes with privilege become processes without
    privilege when entering a user namespace and as with any other path
    to dropping privilege they would not have the ability to call
    setgroups. So this remains within the bounds of what is possible
    without a knob to disable setgroups permanently in a user namespace.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
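
The rules listed in this message can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: `struct userns_model` and the three functions are hypothetical names, not kernel code, and the sketch assumes that the "before the gid_map is set" restriction is what blocks a late "deny", while "allow" is refused once the knob has been set to "deny".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the per-namespace setgroups knob (not kernel code). */
struct userns_model {
    bool setgroups_allowed; /* what the proc file would report */
    bool gid_map_set;       /* true once gid_map has been written */
};

/* A new namespace inherits the knob from its parent and has no gid_map yet. */
struct userns_model userns_create(const struct userns_model *parent)
{
    struct userns_model ns = { parent->setgroups_allowed, false };
    return ns;
}

/* Record that gid_map has been written for this namespace. */
void userns_set_gid_map(struct userns_model *ns)
{
    ns->gid_map_set = true;
}

/* Model of a write to the proc file: returns 0 if accepted, -1 if refused. */
int userns_write_setgroups(struct userns_model *ns, const char *value)
{
    assert(ns != NULL && value != NULL);

    if (strcmp(value, "deny") == 0) {
        /* Assumption: disabling is refused once gid_map is set, so a
         * process that can already call setgroups never loses it. */
        if (ns->gid_map_set)
            return -1;
        ns->setgroups_allowed = false;
        return 0;
    }
    if (strcmp(value, "allow") == 0) {
        /* The knob is one way: "deny" can never be undone. */
        return ns->setgroups_allowed ? 0 : -1;
    }
    return -1; /* any other value is rejected */
}
```

In this model a namespace created from a denied parent starts out denied, which mirrors the inheritance rule in the message.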
     
  • commit f0d62aec931e4ae3333c797d346dc4f188f454ba upstream.

    Generalize id_map_mutex so it can be used for more state of a user namespace.

    Reviewed-by: Andy Lutomirski
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit f95d7918bd1e724675de4940039f2865e5eec5fe upstream.

    If you did not create the user namespace and are allowed
    to write to uid_map or gid_map you should already have the necessary
    privilege in the parent user namespace to establish any mapping
    you want so this will not affect userspace in practice.

    Limiting unprivileged uid mapping establishment to the creator of the
    user namespace makes it easier to verify all credentials obtained with
    the uid mapping can be obtained without the uid mapping without
    privilege.

    Limiting unprivileged gid mapping establishment (which is temporarily
    absent) to the creator of the user namespace also ensures that the
    combination of uid and gid can already be obtained without privilege.

    This is part of the fix for CVE-2014-8989.

    Reviewed-by: Andy Lutomirski
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit 80dd00a23784b384ccea049bfb3f259d3f973b9d upstream.

    setresuid allows the euid to be set to any of uid, euid, suid, and
    fsuid. Therefore it is safe to allow an unprivileged user to map
    their euid and use CAP_SETUID privileged with exactly that uid,
    as no new credentials can be obtained.

    I can not find a combination of existing system calls that allows setting
    uid, euid, suid, and fsuid from the fsuid making the previous use
    of fsuid for allowing unprivileged mappings a bug.

    This is part of a fix for CVE-2014-8989.

    Reviewed-by: Andy Lutomirski
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
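
A minimal sketch of the resulting check, assuming an unprivileged mapping is a single extent of exactly one id. The struct and function names are hypothetical and the real kernel check involves more state; the point shown is only that the mapped parent uid is compared against the caller's euid rather than its fsuid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified description of one requested uid_map extent. */
struct uid_extent {
    uint32_t first;       /* first uid inside the new namespace */
    uint32_t lower_first; /* first uid in the parent namespace */
    uint32_t count;       /* number of consecutive uids mapped */
};

/* An unprivileged caller may only map its own euid, and only that one id. */
bool unpriv_uid_map_permitted(const struct uid_extent *ext, uint32_t caller_euid)
{
    assert(ext != NULL);
    return ext->count == 1 && ext->lower_first == caller_euid;
}
```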
     
  • commit be7c6dba2332cef0677fbabb606e279ae76652c3 upstream.

    As any gid mapping will allow, and must allow for backwards
    compatibility, dropping groups, don't allow any gid mappings to be
    established without CAP_SETGID in the parent user namespace.

    For a small class of applications this change breaks userspace
    and removes useful functionality. This small class of applications
    includes tools/testing/selftests/mount/unprivileged-remount-test.c

    Most of the removed functionality will be added back with the addition
    of a one way knob to disable setgroups. Once setgroups is disabled
    setting the gid_map becomes as safe as setting the uid_map.

    For more common applications that set the uid_map and the gid_map
    with privilege this change will have no effect.

    This is part of a fix for CVE-2014-8989.

    Reviewed-by: Andy Lutomirski
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit 273d2c67c3e179adb1e74f403d1e9a06e3f841b5 upstream.

    setgroups is unique in not needing a valid mapping before it can be called,
    in the case of setgroups(0, NULL) which drops all supplemental groups.

    The design of the user namespace assumes that CAP_SETGID can not actually
    be used until a gid mapping is established. Therefore add a helper function
    to see if the user namespace gid mapping has been established and call
    that function in the setgroups permission check.

    This is part of the fix for CVE-2014-8989, being able to drop groups
    without privilege using user namespaces.

    Reviewed-by: Andy Lutomirski
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit 0542f17bf2c1f2430d368f44c8fcf2f82ec9e53e upstream.

    The rule is simple. Don't allow anything that wouldn't be allowed
    without unprivileged mappings.

    It was previously overlooked that establishing gid mappings would
    allow dropping groups and potentially gaining permission to files and
    directories that had lesser permissions for a specific group than for
    all other users.

    This is the rule needed to fix CVE-2014-8989 and prevent any other
    security issues with new_idmap_permitted.

    The reason for this rule is that the unix permission model is old and
    there are programs out there somewhere that take advantage of every
    little corner of it. So allowing a uid or gid mapping to be
    established without privilege that would allow anything that would not
    be allowed without that mapping will result in expectations from some
    code somewhere being violated. Violated expectations about the
    behavior of the OS is a long way to say a security issue.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
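
To see why dropping groups can grant access, consider the classic owner/group/other check, where exactly one permission class applies. The sketch below is a simplified user-space model with hypothetical names (`struct file_meta`, `may_read`), not the kernel's implementation. With a file of mode 0604, a member of the file's group is denied read access, while the same user with an empty group list falls through to the "other" class and is allowed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical subset of file metadata used by the check. */
struct file_meta {
    uid_t uid;
    gid_t gid;
    mode_t mode;
};

static bool in_groups(gid_t gid, const gid_t *groups, size_t ngroups)
{
    for (size_t i = 0; i < ngroups; i++)
        if (groups[i] == gid)
            return true;
    return false;
}

/* Classic read check: owner class, else group class, else other class. */
bool may_read(const struct file_meta *f, uid_t fsuid, gid_t fsgid,
              const gid_t *groups, size_t ngroups)
{
    assert(f != NULL);
    if (fsuid == f->uid)
        return (f->mode & 0400) != 0;
    if (fsgid == f->gid || in_groups(f->gid, groups, ngroups))
        return (f->mode & 0040) != 0;
    return (f->mode & 0004) != 0;
}
```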
     
  • commit 7ff4d90b4c24a03666f296c3d4878cd39001e81e upstream.

    Today there are 3 instances of setgroups and due to an oversight their
    permission checking has diverged. Add a common function so that
    they may all share the same permission checking code.

    This corrects the current oversight in the current permission checks
    and adds a helper to avoid this in the future.

    A user namespace security fix will update this new helper, shortly.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit b2f5d4dc38e034eecb7987e513255265ff9aa1cf upstream.

    Forced unmount affects not just the mount namespace but the underlying
    superblock as well. Restrict forced unmount to the global root user
    for now. Otherwise it becomes possible for a user in a less privileged
    mount namespace to force the shutdown of a superblock of a filesystem
    in a more privileged mount namespace, allowing a DOS attack on root.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit 4a44a19b470a886997d6647a77bb3e38dcbfa8c5 upstream.

    - MNT_NODEV should be irrelevant except when reading back mount flags,
    no longer specify MNT_NODEV on remount.

    - Test MNT_NODEV on devpts where it is meaningful even for unprivileged mounts.

    - Add a test to verify that remount of a preexisting mount with the same flags
    is allowed and does not change those flags.

    - Clean up the definitions of MS_REC, MS_RELATIME, MS_STRICTATIME that are used
    when the code is built in an environment without them.

    - Correct the test error messages when tests fail. There were not 5 tests
    that tested MS_RELATIME.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit 3e1866410f11356a9fd869beb3e95983dc79c067 upstream.

    Now that remount is properly enforcing the rule that you can't remove
    nodev at least sandstorm.io is breaking when performing a remount.

    It turns out that there is an easy intuitive solution: implicitly
    add nodev on remount when nodev was implicitly added on mount.

    Tested-by: Cedric Bosdonnat
    Tested-by: Richard Weinberger
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
     
  • commit 9d367e5e7b05c71a8c1ac4e9b6e00ba45a79f2fc upstream.

    thermal_unregister_governors() and class_unregister() were being called in
    the wrong order.

    Fixes: 80a26a5c22b9 ("Thermal: build thermal governors into thermal_sys module")
    Signed-off-by: Luis Henriques
    Signed-off-by: Zhang Rui
    Signed-off-by: Greg Kroah-Hartman

    Luis Henriques
     
  • commit c297abfdf15b4480704d6b566ca5ca9438b12456 upstream.

    While reviewing the code of umount_tree I realized that when we append
    to a preexisting unmounted list we do not change pprev of the former
    first item in the list.

    Which means later in namespace_unlock hlist_del_init(&mnt->mnt_hash) on
    the former first item of the list will stomp unmounted.first leaving
    it set to some random mount point which we are likely to free soon.

    This isn't likely to hit, but if it does I don't know how anyone could
    track it down.

    [ This happened because we don't have all the same operations for
    hlist's as we do for normal doubly-linked lists. In particular,
    list_splice() is easy on our standard doubly-linked lists, while
    hlist_splice() doesn't exist and needs both start/end entries of the
    hlist. And commit 38129a13e6e7 incorrectly open-coded that missing
    hlist_splice().

    We should think about making these kinds of "mindless" conversions
    easier to get right by adding the missing hlist helpers - Linus ]

    Fixes: 38129a13e6e71f666e0468e99fdd932a687b4d7e switch mnt_hash to hlist
    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Eric W. Biederman
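
The pprev problem described above can be reproduced with a tiny stand-alone hlist. This is an illustrative sketch, not the kernel's list.h: the `hnode`/`hhead` types and the splice helper are hypothetical names. The helper moves one list to the front of another and repoints the former first item's pprev, which is the step the message says was missing.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal hlist-style node: pprev points at whatever pointer points at us. */
struct hnode {
    struct hnode *next;
    struct hnode **pprev;
};

struct hhead {
    struct hnode *first;
};

void hnode_add_head(struct hnode *n, struct hhead *h)
{
    n->next = h->first;
    if (h->first)
        h->first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

void hnode_del_init(struct hnode *n)
{
    if (!n->pprev)
        return; /* not on a list */
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->next = NULL;
    n->pprev = NULL;
}

/* Move all of 'from' to the front of 'to'; 'last' is the final node of 'from'. */
void hhead_splice_front(struct hhead *from, struct hnode *last, struct hhead *to)
{
    if (!from->first)
        return;
    assert(last != NULL && last->next == NULL);
    last->next = to->first;
    if (to->first)
        to->first->pprev = &last->next; /* without this, a later delete
                                           overwrites to->first */
    to->first = from->first;
    from->first->pprev = &to->first;
    from->first = NULL;
}
```

If the marked assignment is left out, deleting the former first item writes through its stale pprev and overwrites the destination head's `first` pointer, which is the corruption the message describes.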
     
  • commit 28a9bc68124c319b2b3dc861e80828a8865fd1ba upstream.

    When writing the code to allow per-station GTKs, I neglected to
    take into account the management frame keys (index 4 and 5) when
    freeing the station and only added code to free the first four
    data frame keys.

    Fix this by iterating the array of keys over the right length.

    Fixes: e31b82136d1a ("cfg80211/mac80211: allow per-station GTKs")
    Signed-off-by: Johannes Berg
    Signed-off-by: Greg Kroah-Hartman

    Johannes Berg
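
A small sketch of "iterating the array of keys over the right length": the loop bound is derived from the array itself instead of a hard-coded count of data frame keys. The struct layout and names are hypothetical; the sizes follow the message (four data frame keys plus management frame keys at index 4 and 5).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_DATA_KEYS 4 /* data frame keys, indices 0..3 */
#define NUM_MGMT_KEYS 2 /* management frame keys, indices 4 and 5 */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical per-station key table. */
struct sta_keys {
    void *gtk[NUM_DATA_KEYS + NUM_MGMT_KEYS];
};

/* Free every populated slot and return how many were released. */
size_t sta_free_keys(struct sta_keys *sta)
{
    size_t freed = 0;

    assert(sta != NULL);
    for (size_t i = 0; i < ARRAY_LEN(sta->gtk); i++) {
        if (sta->gtk[i]) {
            free(sta->gtk[i]);
            sta->gtk[i] = NULL;
            freed++;
        }
    }
    return freed;
}
```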
     
  • commit d025933e29872cb1fe19fc54d80e4dfa4ee5779c upstream.

    As multicast-frames can't be fragmented, "dot11MulticastReceivedFrameCount"
    stopped being incremented after the use-after-free fix. Furthermore, the
    RX-LED will be triggered by every multicast frame (which wouldn't happen
    before) which wouldn't allow the LED to rest at all.

    Fixes https://bugzilla.kernel.org/show_bug.cgi?id=89431 which also had the
    patch.

    Fixes: b8fff407a180 ("mac80211: fix use-after-free in defragmentation")
    Signed-off-by: Andreas Müller
    [rewrite commit message]
    Signed-off-by: Johannes Berg
    Signed-off-by: Greg Kroah-Hartman

    Andreas Müller
     
  • commit 7e6225a1604d0c6aa4140289bf5761868ffc9c83 upstream.

    Avoid a case where we would access uninitialized stack data if the AP
    advertises HT support without 40MHz channel support.

    Fixes: f3000e1b43f1 ("mac80211: fix broken use of VHT/20Mhz with some APs")
    Signed-off-by: Jes Sorensen
    Signed-off-by: Johannes Berg
    Signed-off-by: Greg Kroah-Hartman

    Jes Sorensen
     
  • commit 2967e031d4d737d9cc8252d878a17924d7b704f0 upstream.

    Instead of keeping track of all those special cases where
    VLAN interfaces have no bss_conf.chandef, just make sure
    they have the same as the AP interface they belong to.

    Among others, this fixes a crash getting a VLAN's channel
    from userspace since a NULL channel is returned as a good
    result (return value 0) for VLANs since the commit below.

    Fixes: c12bc4885f4b3 ("mac80211: return the vif's chandef in ieee80211_cfg_get_channel()")
    Signed-off-by: Felix Fietkau
    [rewrite commit log]
    Signed-off-by: Johannes Berg
    Signed-off-by: Greg Kroah-Hartman

    Felix Fietkau
     
  • commit b26bdde5bb27f3f900e25a95e33a0c476c8c2c48 upstream.

    When loading encrypted-keys module, if the last check of
    aes_get_sizes() in init_encrypted() fails, the driver just returns an
    error without unregistering its key type. This results in the stale
    entry in the list. In addition to memory leaks, this leads to a kernel
    crash when registering a new key type later.

    This patch fixes the problem by swapping the calls of aes_get_sizes()
    and register_key_type(), and releasing resources properly at the error
    paths.

    Bugzilla: https://bugzilla.opensuse.org/show_bug.cgi?id=908163
    Signed-off-by: Takashi Iwai
    Signed-off-by: Mimi Zohar
    Signed-off-by: Greg Kroah-Hartman

    Takashi Iwai
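
The ordering fix can be sketched as follows: run the check that may fail before registering anything globally visible, and unwind what was already set up on every error path. All names here are hypothetical stand-ins for the real init steps, and the failure is simulated with a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of what the module currently holds. */
struct module_state {
    bool hash_ready;
    bool type_registered;
};

static int alloc_hash(struct module_state *m)
{
    m->hash_ready = true;
    return 0;
}

static void release_hash(struct module_state *m)
{
    m->hash_ready = false;
}

/* Stand-in for the size check that may fail. */
static int check_sizes(bool sizes_ok)
{
    return sizes_ok ? 0 : -1;
}

static int register_type(struct module_state *m)
{
    m->type_registered = true;
    return 0;
}

int module_init_model(struct module_state *m, bool sizes_ok)
{
    int ret;

    assert(m != NULL);
    ret = alloc_hash(m);
    if (ret < 0)
        return ret;
    ret = check_sizes(sizes_ok); /* fallible check comes first */
    if (ret < 0)
        goto out;
    ret = register_type(m);      /* registration is the last step */
    if (ret < 0)
        goto out;
    return 0;
out:
    release_hash(m);             /* nothing stale is left behind */
    return ret;
}
```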
     
  • commit 25cdb9c86826f8d035d8aaa07fc36832e76bd8a0 upstream.

    I'm such a moron! The simple solution of saving the BSP patch
    for use on resume was too simple (and wrong!), hint:
    sizeof(struct microcode_intel).

    What needs to be done instead is to fish out the microcode patch
    we have stashed previously and apply that on the BSP in case the
    late loader hasn't been utilized.

    So do that instead.

    Signed-off-by: Borislav Petkov
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/20141208110820.GB20057@pd.tnic
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Borislav Petkov
     
  • commit fbae4ba8c4a387e306adc9c710e5c225cece7678 upstream.

    Normally, we do reapply microcode on resume. However, in the cases where
    that microcode comes from the early loader and the late loader hasn't
    been utilized yet, there's no easy way for us to go and apply the patch
    applied during boot by the early loader.

    Thus, reuse the patch stashed by the early loader for the BSP.

    Signed-off-by: Borislav Petkov
    Signed-off-by: Greg Kroah-Hartman

    Borislav Petkov
     
  • commit a18a0f6850d4b286a5ebf02cd5b22fe496b86349 upstream.

    Paravirtual guests are not expected to load microcode into processors
    and therefore it is not necessary to initialize microcode loading
    logic.

    In fact, under certain circumstances initializing this logic may cause
    the guest to crash. Specifically, 32-bit kernels use __pa_nodebug()
    macro which does not work in Xen (the code path that leads to this macro
    happens during resume when we call mc_bp_resume()->load_ucode_ap()
    ->check_loader_disabled_ap())

    Signed-off-by: Boris Ostrovsky
    Link: http://lkml.kernel.org/r/1417469264-31470-1-git-send-email-boris.ostrovsky@oracle.com
    Signed-off-by: Borislav Petkov
    Signed-off-by: Greg Kroah-Hartman

    Boris Ostrovsky
     
  • commit 47768626c6db42cd06ff077ba12dd2cb10ab818b upstream.

    apply_microcode_early() doesn't use mc_saved_data, kill it.

    Signed-off-by: Borislav Petkov
    Signed-off-by: Greg Kroah-Hartman

    Borislav Petkov
     
  • commit 2ef84b3bb97f03332f0c1edb4466b1750dcf97b5 upstream.

    Hand down the cpu number instead, otherwise lockdep screams when doing

    echo 1 > /sys/devices/system/cpu/microcode/reload.

    BUG: using smp_processor_id() in preemptible [00000000] code: amd64-microcode/2470
    caller is debug_smp_processor_id+0x12/0x20
    CPU: 1 PID: 2470 Comm: amd64-microcode Not tainted 3.18.0-rc6+ #26
    ...

    Signed-off-by: Borislav Petkov
    Link: http://lkml.kernel.org/r/1417428741-4501-1-git-send-email-bp@alien8.de
    Signed-off-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Borislav Petkov
     
  • commit 4e2024624e678f0ebb916e6192bd23c1f9fdf696 upstream.

    We didn't check length of rock ridge ER records before printing them.
    Thus corrupted isofs image can cause us to access and print some memory
    behind the buffer with obvious consequences.

    Reported-and-tested-by: Carl Henrik Lunde
    Signed-off-by: Jan Kara
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
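
A length check of this kind might look like the sketch below. The two-byte header is a hypothetical simplification, not the on-disk Rock Ridge layout; the idea is only that the claimed identifier length plus the header must not exceed the record length before any byte of the identifier is printed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified record header; identifier bytes would follow it. */
struct er_header {
    uint8_t len;    /* total record length in bytes, header included */
    uint8_t len_id; /* claimed length of the identifier */
};

/* True only if the claimed identifier lies entirely inside the record. */
bool er_id_fits(const struct er_header *h)
{
    assert(h != NULL);
    return sizeof(*h) + (size_t)h->len_id <= h->len;
}
```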
     
  • commit 3fb2f4237bb452eb4e98f6a5dbd5a445b4fed9d0 upstream.

    It turns out that there's a lurking ABI issue. GCC, when
    compiling this in a 32-bit program:

    struct user_desc desc = {
        .entry_number = idx,
        .base_addr = base,
        .limit = 0xfffff,
        .seg_32bit = 1,
        .contents = 0, /* Data, grow-up */
        .read_exec_only = 0,
        .limit_in_pages = 1,
        .seg_not_present = 0,
        .useable = 0,
    };

    will leave .lm uninitialized. This means that anything in the
    kernel that reads user_desc.lm for 32-bit tasks is unreliable.

    Revert the .lm check in set_thread_area(). The value never did
    anything in the first place.

    Fixes: 0e58af4e1d21 ("x86/tls: Disallow unusual TLS segments")
    Signed-off-by: Andy Lutomirski
    Acked-by: Thomas Gleixner
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/d7875b60e28c512f6a6fc0baf5714d58e7eaadbb.1418856405.git.luto@amacapital.net
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     
  • commit 7ddc6a2199f1da405a2fb68c40db8899b1a8cd87 upstream.

    These functions can be executed on the int3 stack, so kprobes
    are dangerous. Tracing is probably a bad idea, too.

    Fixes: b645af2d5905 ("x86_64, traps: Rework bad_iret")
    Signed-off-by: Andy Lutomirski
    Cc: Linus Torvalds
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/50e33d26adca60816f3ba968875801652507d0c4.1416870125.git.luto@amacapital.net
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     
  • commit ab1e85372168892387dd1ac171158fc8c3119be4 upstream.

    Commit a095b1c78a35 ("ARM: mvebu: sort DT nodes by address")
    missed placing the system-controller in the correct order.

    Fixes: a095b1c78a35 ("ARM: mvebu: sort DT nodes by address")
    Signed-off-by: Uwe Kleine-König
    Acked-by: Andrew Lunn
    Link: https://lkml.kernel.org/r/20141114204333.GS27002@pengutronix.de
    Signed-off-by: Jason Cooper
    Signed-off-by: Greg Kroah-Hartman

    Uwe Kleine-König
     
  • commit b4607572ef86b288a856b9df410ea593c5371dec upstream.

    Back when audio was enabled, the muxing of some MPP pins was causing
    problems. However, since commit fea038ed55ae ("ARM: mvebu: Add proper
    pin muxing on the Armada 370 DB board"), those problematic MPP pins
    have been assigned a proper muxing for the Ethernet interfaces. This
    proper muxing is now conflicting with the hog pins muxing that had
    been added as part of 249f3822509b ("ARM: mvebu: add audio support to
    Armada 370 DB").

    Therefore, this commit simply removes the hog pins muxing, which
    solves a warning at boot time due to the conflicting muxing
    requirements.

    Fixes: fea038ed55ae ("ARM: mvebu: Add proper pin muxing on the Armada 370 DB board")
    Cc: Ezequiel Garcia
    Signed-off-by: Thomas Petazzoni
    Acked-by: Andrew Lunn
    Link: https://lkml.kernel.org/r/1414512524-24466-5-git-send-email-thomas.petazzoni@free-electrons.com
    Signed-off-by: Jason Cooper
    Signed-off-by: Greg Kroah-Hartman

    Thomas Petazzoni
     
  • commit e55355453600a33bb5ca4f71f2d7214875f3b061 upstream.

    Enabling the hardware I/O coherency on Armada 370, Armada 375, Armada
    38x and Armada XP requires a certain number of conditions:

    - On Armada 370, the cache policy must be set to write-allocate.

    - On Armada 375, 38x and XP, the cache policy must be set to
    write-allocate, the pages must be mapped with the shareable
    attribute, and the SMP bit must be set

    Currently, on Armada XP, when CONFIG_SMP is enabled, those conditions
    are met. However, when Armada XP is used in a !CONFIG_SMP kernel, none
    of these conditions are met. With Armada 370, the situation is worse:
    since the processor is single core, regardless of whether CONFIG_SMP
    or !CONFIG_SMP is used, the cache policy will be set to write-back by
    the kernel and not write-allocate.

    Since solving this problem turns out to be quite complicated, and we
    don't want to let users with a mainline kernel known to have
    infrequent but existing data corruptions, this commit proposes to
    simply disable hardware I/O coherency in situations where it is known
    not to work.

    And basically, the is_smp() function of the kernel tells us whether it
    is OK to enable hardware I/O coherency or not, so this commit slightly
    refactors the coherency_type() function to return
    COHERENCY_FABRIC_TYPE_NONE when is_smp() is false, or the appropriate
    type of the coherency fabric in the other case.

    Thanks to this, the I/O coherency fabric will no longer be used at all
    in !CONFIG_SMP configurations. It will continue to be used in
    CONFIG_SMP configurations on Armada XP, Armada 375 and Armada 38x
    (which are multiple cores processors), but will no longer be used on
    Armada 370 (which is a single core processor).

    In the process, it simplifies the implementation of the
    coherency_type() function, and adds a missing call to of_node_put().

    Signed-off-by: Thomas Petazzoni
    Fixes: e60304f8cb7bb545e79fe62d9b9762460c254ec2 ("arm: mvebu: Add hardware I/O Coherency support")
    Acked-by: Gregory CLEMENT
    Link: https://lkml.kernel.org/r/1415871540-20302-3-git-send-email-thomas.petazzoni@free-electrons.com
    Signed-off-by: Jason Cooper
    Signed-off-by: Greg Kroah-Hartman

    Thomas Petazzoni
     
  • commit 30cdef97107370a7f63ab5d80fd2de30540750c8 upstream.

    The ll_add_cpu_to_smp_group(), ll_enable_coherency() and
    ll_disable_coherency() are used on Armada XP to control the coherency
    fabric. However, they make the assumption that the coherency fabric is
    always available, which is currently a correct assumption but will no
    longer be true with a followup commit that disables the usage of the
    coherency fabric when the conditions are not met to use it.

    Therefore, this commit modifies those functions so that they check the
    return value of ll_get_coherency_base(), and if the return value is 0,
    they simply return without configuring anything in the coherency
    fabric.

    The ll_get_coherency_base() function is also modified to properly
    return 0 when the function is called with the MMU disabled. In this
    case, it normally returns the physical address of the coherency
    fabric, but we now check if the virtual address is 0, and if that's
    the case, return a physical address of 0 to indicate that the coherency
    fabric is not enabled.

    Signed-off-by: Thomas Petazzoni
    Acked-by: Gregory CLEMENT
    Link: https://lkml.kernel.org/r/1415871540-20302-2-git-send-email-thomas.petazzoni@free-electrons.com
    Signed-off-by: Jason Cooper
    Signed-off-by: Greg Kroah-Hartman

    Thomas Petazzoni
     
  • commit e4a680099a6e97ecdbb81081cff9e4a489a4dc44 upstream.

    Commit d127e9c ("ARM: tegra: make tegra_resume can work with current and later
    chips") removed tegra_get_soc_id macro leaving used cpu register corrupted after
    branching to v7_invalidate_l1() and as result causing execution of unintended
    code on tegra20. Possibly it was expected that r6 would be SoC id func argument
    since common cpu reset handler is setting r6 before branching to tegra_resume(),
    but neither tegra20_lp1_reset() nor tegra30_lp1_reset() sets the r6
    register before jumping to resume function. Fix it by re-adding macro.

    Fixes: d127e9c (ARM: tegra: make tegra_resume can work with current and later chips)
    Reviewed-by: Felipe Balbi
    Signed-off-by: Dmitry Osipenko
    Signed-off-by: Thierry Reding
    Signed-off-by: Greg Kroah-Hartman

    Dmitry Osipenko
     
  • commit dc6057ecb39edb34b0461ca55382094410bd257a upstream.

    When creating a dumb buffer object using the DRM_IOCTL_MODE_CREATE_DUMB
    IOCTL, only the width, height, bpp and flags parameters are inputs. The
    caller is not guaranteed to zero out or set handle, pitch and size, so
    the driver must not treat these values as possible inputs.

    Fixes a bug where running the Weston compositor on Tegra DRM would cause
    an attempt to allocate a 3 GiB framebuffer.

    Fixes: de2ba664c30f ("gpu: host1x: drm: Add memory manager and fb")
    Signed-off-by: Thierry Reding
    Signed-off-by: Greg Kroah-Hartman

    Thierry Reding
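
The input/output split can be illustrated with a sketch that overwrites pitch and size from width, height and bpp alone, ignoring whatever the caller left in the output fields. `struct dumb_args` is a hypothetical stand-in for the ioctl argument, and the pitch alignment parameter is an assumption about what a driver would require.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the dumb-buffer creation arguments. */
struct dumb_args {
    uint32_t height, width, bpp, flags; /* inputs from the caller */
    uint32_t handle, pitch;             /* outputs, may arrive as garbage */
    uint64_t size;                      /* output, may arrive as garbage */
};

#define ROUND_UP_DIV(n, d) (((n) + (d) - 1) / (d))

/* Compute pitch and size purely from the inputs. */
void dumb_fill_outputs(struct dumb_args *args, uint32_t pitch_align)
{
    uint32_t min_pitch;

    assert(args != NULL && pitch_align != 0);
    min_pitch = ROUND_UP_DIV(args->width * args->bpp, 8);
    args->pitch = ROUND_UP_DIV(min_pitch, pitch_align) * pitch_align;
    args->size = (uint64_t)args->pitch * args->height;
}
```

For a 1920x1080 buffer at 32 bpp with a 64-byte pitch alignment this yields a pitch of 7680 bytes and a size of 8294400 bytes, regardless of the stale values passed in.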
     
  • commit 51c9fbb1b146f3336a93d398c439b6fbfe5ab489 upstream.

    Earlier implementation assumed last instruction is BPF_EXIT.
    Since this is no longer a restriction in eBPF, we remove this
    limitation.

    Per Alexei Starovoitov [1]:
    > classic BPF has a restriction that last insn is always BPF_RET.
    > eBPF doesn't have BPF_RET instruction and this restriction.
    > It has BPF_EXIT insn which can appear anywhere in the program
    > one or more times and it doesn't have to be last insn.

    [1] https://lkml.org/lkml/2014/11/27/2

    Fixes: e54bcde3d69d ("arm64: eBPF JIT compiler")
    Acked-by: Alexei Starovoitov
    Signed-off-by: Zi Shen Lim
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Zi Shen Lim
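
One way a JIT can handle an exit instruction that may appear anywhere is sketched below, under the assumption that the epilogue is emitted directly after the last instruction: an exit in the final position can fall through, while any earlier exit needs a branch to the epilogue. The enum and function are hypothetical and only model that decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction classification for the sketch. */
enum insn_kind {
    INSN_OTHER,
    INSN_EXIT
};

/*
 * An exit that is the last instruction falls through into the epilogue;
 * any other exit must branch to it.
 */
bool exit_needs_branch(const enum insn_kind *prog, size_t len, size_t i)
{
    assert(prog != NULL && i < len);
    return prog[i] == INSN_EXIT && i != len - 1;
}
```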
     
  • commit 7d57511d2dba03a8046c8b428dd9192a4bfc1e73 upstream.

    Commit a469abd0f868 (ARM: elf: add new hwcap for identifying atomic
    ldrd/strd instructions) introduces HWCAP_LPAE for 32-bit ARM
    applications. As LPAE is always present on arm64, report the
    corresponding compat HWCAP to user space.

    Signed-off-by: Catalin Marinas
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Catalin Marinas
     
  • commit 17181fb7a0c3a279196c0eeb2caba65a1519614b upstream.

    As long as struct thin_c is in the list, anyone can grab a reference of
    it. Consequently, we must wait for the reference count to drop to zero
    *after* we remove the structure from the list, not before.

    Signed-off-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Mikulas Patocka
     
  • commit 2c43fd26e46734430122b8d2ad3024bb532df3ef upstream.

    Discard bios and thin device deletion have the potential to release data
    blocks. If the thin-pool is in out-of-data-space mode, and blocks were
    released, transition the thin-pool back to full write mode.

    The correct time to do this is just after the thin-pool metadata commit.
    It cannot be done before the commit because the space maps will not
    allow immediate reuse of the data blocks in case there's a rollback
    following power failure.

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Joe Thornber
     
  • commit 45ec9bd0fd7abf8705e7cf12205ff69fe9d51181 upstream.

    When the pool was in PM_OUT_OF_SPACE mode its process_prepared_discard
    function pointer was incorrectly being set to
    process_prepared_discard_passdown rather than process_prepared_discard.

    This incorrect function pointer meant the discard was being passed down,
    but not affecting the mapping. As such any discard that was issued, in
    an attempt to reclaim blocks, would not successfully free data space.

    Reported-by: Eric Sandeen
    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Joe Thornber
     
  • commit c1c6156fe4d4577444b769d7edd5dd503e57bbc9 upstream.

    This function isn't right and it causes a static checker warning:

    drivers/md/dm-thin.c:3016 maybe_resize_data_dev()
    error: potentially using uninitialized 'sb_data_size'.

    It should set "*count" and return zero on success the same as the
    sm_metadata_get_nr_blocks() function does earlier.

    Fixes: 3241b1d3e0aa ('dm: add persistent data library')
    Signed-off-by: Dan Carpenter
    Acked-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
     
  • commit f824a2af3dfbbb766c02e19df21f985bceadf0ee upstream.

    We never bother caching a partial block that is at the back end of the
    origin device. No cell ever gets locked, but the calling code was
    assuming it was and trying to release it.

    Now the code only releases if the cell has been set to a non NULL
    value.

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Joe Thornber
     
  • commit 1e32134a5a404e80bfb47fad8a94e9bbfcbdacc5 upstream.

    If the incoming bio is a WRITE and completely covers a block then we
    don't bother to do any copying for a promotion operation. Once this is
    done the cache block and origin block will be different, so we need to
    set it to 'dirty'.

    Signed-off-by: Joe Thornber
    Signed-off-by: Mike Snitzer
    Signed-off-by: Greg Kroah-Hartman

    Joe Thornber
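
The condition in this message can be sketched as a predicate: a WRITE whose size equals the cache block size overwrites the whole block, so the copy from the origin is skipped and the freshly promoted block has to be marked dirty. `struct bio_model` and the constant name are hypothetical simplifications, with 512-byte sectors assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SECTOR_SHIFT 9 /* assumes 512-byte sectors */

/* Hypothetical, reduced view of an incoming bio. */
struct bio_model {
    bool is_write;
    uint32_t size_bytes;
};

/* True if the bio is a write covering one complete cache block. */
bool bio_writes_complete_block(const struct bio_model *bio,
                               uint32_t sectors_per_block)
{
    assert(bio != NULL);
    return bio->is_write &&
           bio->size_bytes == (sectors_per_block << MODEL_SECTOR_SHIFT);
}

/* After such a promotion the cache and origin differ, so the block is dirty. */
bool promoted_block_must_be_dirty(const struct bio_model *bio,
                                  uint32_t sectors_per_block)
{
    return bio_writes_complete_block(bio, sectors_per_block);
}
```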