30 Dec, 2020

1 commit

  • commit 0f966cba95c78029f491b433ea95ff38f414a761 upstream.

    Add a per-transaction flag to indicate that the buffer
    must be cleared when the transaction is complete to
    prevent copies of sensitive data from being preserved
    in memory.

    Signed-off-by: Todd Kjos
    Link: https://lore.kernel.org/r/20201120233743.3617529-1-tkjos@google.com
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Todd Kjos
     

18 Oct, 2020

1 commit

  • A previous commit changed the notification mode from true/false to an
    int, allowing notify-no, notify-yes, or signal-notify. This was
    backwards compatible in the sense that any existing true/false user
    would translate to either 0 (on notification sent) or 1, the latter
    which mapped to TWA_RESUME. TWA_SIGNAL was assigned a value of 2.

    Clean this up properly, and define a proper enum for the notification
    mode. Now we have:

    - TWA_NONE. This is 0, same as before the original change, meaning no
    notification requested.
    - TWA_RESUME. This is 1, same as before the original change, meaning
    that we use TIF_NOTIFY_RESUME.
    - TWA_SIGNAL. This uses TIF_SIGPENDING/JOBCTL_TASK_WORK for the
    notification.

    Clean up all the callers, switching their 0/1/false/true to using the
    appropriate TWA_* mode for notifications.

    Fixes: e91b48162332 ("task_work: teach task_work_add() to do signal_wake_up()")
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Jens Axboe

    Jens Axboe
     

10 Oct, 2020

1 commit

  • When releasing a thread todo list when tearing down
    a binder_proc, the following race was possible which
    could result in a use-after-free:

    1. Thread 1: enter binder_release_work from binder_thread_release
    2. Thread 2: binder_update_ref_for_handle() -> binder_dec_node_ilocked()
    3. Thread 2: dec nodeA --> 0 (will free node)
    4. Thread 1: ACQ inner_proc_lock
    5. Thread 2: block on inner_proc_lock
    6. Thread 1: dequeue work (BINDER_WORK_NODE, part of nodeA)
    7. Thread 1: REL inner_proc_lock
    8. Thread 2: ACQ inner_proc_lock
    9. Thread 2: todo list cleanup, but work was already dequeued
    10. Thread 2: free node
    11. Thread 2: REL inner_proc_lock
    12. Thread 1: deref w->type (UAF)

    The problem was that for a BINDER_WORK_NODE, the binder_work element
    must not be accessed after releasing the inner_proc_lock while
    processing the todo list elements since another thread might be
    handling a deref on the node containing the binder_work element
    leading to the node being freed.

    Signed-off-by: Todd Kjos
    Link: https://lore.kernel.org/r/20201009232455.4054810-1-tkjos@google.com
    Cc: # 4.14, 4.19, 5.4, 5.8
    Signed-off-by: Greg Kroah-Hartman

    Todd Kjos
     

05 Oct, 2020

1 commit


16 Sep, 2020

1 commit

  • The pointer n is being initialized with a value that is
    never read and it is being updated later with a new value. The
    initialization is redundant and can be removed.

    Acked-by: Todd Kjos
    Acked-by: Christian Brauner
    Signed-off-by: Colin Ian King
    Link: https://lore.kernel.org/r/20200910151221.751464-1-colin.king@canonical.com
    Signed-off-by: Greg Kroah-Hartman

    Colin Ian King
     

04 Sep, 2020

4 commits

  • The most common cause of the binder transaction buffer filling up is a
    client rapidly firing oneway transactions into a process, before it has
    a chance to handle them. Yet the root cause of this is often hard to
    debug, because either the system or the app will stop, and by that time
    binder debug information we dump in bugreports is no longer relevant.

    This change warns as soon as a process dips below 80% of its oneway
    space (less than 100kB available in the configuration), when any one
    process is responsible for either more than 50 transactions, or more
    than 50% of the oneway space.

    Signed-off-by: Martijn Coenen
    Acked-by: Todd Kjos
    Link: https://lore.kernel.org/r/20200821122544.1277051-1-maco@android.com
    Signed-off-by: Greg Kroah-Hartman

    Martijn Coenen
     
  • The sparse tool complains as follows:

    drivers/android/binderfs.c:66:32: warning:
    symbol 'binderfs_fs_parameters' was not declared. Should it be static?

    This variable is not used outside of binderfs.c, so this commit
    marks it static.

    Fixes: 095cf502b31e ("binderfs: port to new mount api")
    Reported-by: Hulk Robot
    Signed-off-by: Wei Yongjun
    Acked-by: Christian Brauner
    Link: https://lore.kernel.org/r/20200818112245.43891-1-weiyongjun1@huawei.com
    Signed-off-by: Greg Kroah-Hartman

    Wei Yongjun
     
  • The function name should is binder_alloc_new_buf()

    Signed-off-by: YangHui
    Reviewed-by: Martijn Coenen
    Link: https://lore.kernel.org/r/1597714444-3614-1-git-send-email-yanghui.def@gmail.com
    Signed-off-by: Greg Kroah-Hartman

    YangHui
     
  • While binder transactions with the same binder_proc as sender and recipient
    are forbidden, transactions with the same task_struct as sender and
    recipient are possible (even though currently there is a weird check in
    binder_transaction() that rejects them in the target==0 case).
    Therefore, task_struct identities can't be used to distinguish whether
    the caller is running in the context of the sender or the recipient.

    Since I see no easy way to make this WARN_ON() useful and correct, let's
    just remove it.

    Fixes: 44d8047f1d87 ("binder: use standard functions to allocate fds")
    Reported-by: syzbot+e113a0b970b7b3f394ba@syzkaller.appspotmail.com
    Acked-by: Christian Brauner
    Acked-by: Todd Kjos
    Signed-off-by: Jann Horn
    Link: https://lore.kernel.org/r/20200806165359.2381483-1-jannh@google.com
    Signed-off-by: Greg Kroah-Hartman

    Jann Horn
     

29 Jul, 2020

6 commits

  • C source files should have `//` as SPDX comment and not `/**/`. Fix this
    by running checkpatch on the file.

    Signed-off-by: Mrinal Pandey
    Link: https://lore.kernel.org/r/20200724131449.zvjutbemg3vqhrzh@mrinalpandey
    Signed-off-by: Greg Kroah-Hartman

    Mrinal Pandey
     
  • Add a blank line after variable declarations as suggested by checkpatch.

    Signed-off-by: Mrinal Pandey
    Link: https://lore.kernel.org/r/20200724131433.stf3ycooogawyzb3@mrinalpandey
    Signed-off-by: Greg Kroah-Hartman

    Mrinal Pandey
     
  • Remove braces for both if and else block as suggested by checkpatch.

    Signed-off-by: Mrinal Pandey
    Link: https://lore.kernel.org/r/20200724131403.dahfhdwa3wirzkxj@mrinalpandey
    Signed-off-by: Greg Kroah-Hartman

    Mrinal Pandey
     
  • Remove the unnecessary else branch after return statement as suggested by
    checkpatch.

    Signed-off-by: Mrinal Pandey
    Link: https://lore.kernel.org/r/20200724131348.haz4ocxcferdcsgn@mrinalpandey
    Signed-off-by: Greg Kroah-Hartman

    Mrinal Pandey
     
  • Add a blank line after variable declarations as suggested by checkpatch.

    Signed-off-by: Mrinal Pandey
    Link: https://lore.kernel.org/r/20200724131254.qxbvderrws36dzzq@mrinalpandey
    Signed-off-by: Greg Kroah-Hartman

    Mrinal Pandey
     
  • Binder is designed such that a binder_proc never has references to
    itself. If this rule is violated, memory corruption can occur when a
    process sends a transaction to itself; see e.g.
    .

    There is a remaining edgecase through which such a transaction-to-self
    can still occur from the context of a task with BINDER_SET_CONTEXT_MGR
    access:

    - task A opens /dev/binder twice, creating binder_proc instances P1
    and P2
    - P1 becomes context manager
    - P2 calls ACQUIRE on the magic handle 0, allocating index 0 in its
    handle table
    - P1 dies (by closing the /dev/binder fd and waiting a bit)
    - P2 becomes context manager
    - P2 calls ACQUIRE on the magic handle 0, allocating index 1 in its
    handle table
    [this triggers a warning: "binder: 1974:1974 tried to acquire
    reference to desc 0, got 1 instead"]
    - task B opens /dev/binder once, creating binder_proc instance P3
    - P3 calls P2 (via magic handle 0) with (void*)1 as argument (two-way
    transaction)
    - P2 receives the handle and uses it to call P3 (two-way transaction)
    - P3 calls P2 (via magic handle 0) (two-way transaction)
    - P2 calls P2 (via handle 1) (two-way transaction)

    And then, if P2 does *NOT* accept the incoming transaction work, but
    instead closes the binder fd, we get a crash.

    Solve it by preventing the context manager from using ACQUIRE on ref 0.
    There shouldn't be any legitimate reason for the context manager to do
    that.

    Additionally, print a warning if someone manages to find another way to
    trigger a transaction-to-self bug in the future.

    Cc: stable@vger.kernel.org
    Fixes: 457b9a6f09f0 ("Staging: android: add binder driver")
    Acked-by: Todd Kjos
    Signed-off-by: Jann Horn
    Reviewed-by: Martijn Coenen
    Link: https://lore.kernel.org/r/20200727120424.1627555-1-jannh@google.com
    Signed-off-by: Greg Kroah-Hartman

    Jann Horn
     

23 Jul, 2020

1 commit

  • syzbot is reporting that mmput() from shrinker function has a risk of
    deadlock [1], for delayed_uprobe_add() from update_ref_ctr() calls
    kzalloc(GFP_KERNEL) with delayed_uprobe_lock held, and
    uprobe_clear_state() from __mmput() also holds delayed_uprobe_lock.

    Commit a1b2289cef92ef0e ("android: binder: drop lru lock in isolate
    callback") replaced mmput() with mmput_async() in order to avoid sleeping
    with spinlock held. But this patch replaces mmput() with mmput_async() in
    order not to start __mmput() from shrinker context.

    [1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d45

    Reported-by: syzbot
    Reported-by: syzbot
    Signed-off-by: Tetsuo Handa
    Reviewed-by: Michal Hocko
    Acked-by: Todd Kjos
    Acked-by: Christian Brauner
    Cc: stable
    Link: https://lore.kernel.org/r/4ba9adb2-43f5-2de0-22de-f6075c1fab50@i-love.sakura.ne.jp
    Signed-off-by: Greg Kroah-Hartman

    Tetsuo Handa
     

23 Jun, 2020

1 commit

  • The binder driver makes the assumption proc->context pointer is invariant after
    initialization (as documented in the kerneldoc header for struct proc).
    However, in commit f0fe2c0f050d ("binder: prevent UAF for binderfs devices II")
    proc->context is set to NULL during binder_deferred_release().

    Another proc was in the middle of setting up a transaction to the dying
    process and crashed on a NULL pointer deref on "context" which is a local
    set to &proc->context:

    new_ref->data.desc = (node == context->binder_context_mgr_node) ? 0 : 1;

    Here's the stack:

    [ 5237.855435] Call trace:
    [ 5237.855441] binder_get_ref_for_node_olocked+0x100/0x2ec
    [ 5237.855446] binder_inc_ref_for_node+0x140/0x280
    [ 5237.855451] binder_translate_binder+0x1d0/0x388
    [ 5237.855456] binder_transaction+0x2228/0x3730
    [ 5237.855461] binder_thread_write+0x640/0x25bc
    [ 5237.855466] binder_ioctl_write_read+0xb0/0x464
    [ 5237.855471] binder_ioctl+0x30c/0x96c
    [ 5237.855477] do_vfs_ioctl+0x3e0/0x700
    [ 5237.855482] __arm64_sys_ioctl+0x78/0xa4
    [ 5237.855488] el0_svc_common+0xb4/0x194
    [ 5237.855493] el0_svc_handler+0x74/0x98
    [ 5237.855497] el0_svc+0x8/0xc

    The fix is to move the kfree of the binder_device to binder_free_proc()
    so the binder_device is freed when we know there are no references
    remaining on the binder_proc.

    Fixes: f0fe2c0f050d ("binder: prevent UAF for binderfs devices II")
    Acked-by: Christian Brauner
    Signed-off-by: Todd Kjos
    Cc: stable
    Link: https://lore.kernel.org/r/20200622200715.114382-1-tkjos@google.com
    Signed-off-by: Greg Kroah-Hartman

    Todd Kjos
     

14 Jun, 2020

1 commit

  • Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over
    '---help---'"), the number of '---help---' has been gradually
    decreasing, but there are still more than 2400 instances.

    This commit finishes the conversion. While I touched the lines,
    I also fixed the indentation.

    There are a variety of indentation styles found.

    a) 4 spaces + '---help---'
    b) 7 spaces + '---help---'
    c) 8 spaces + '---help---'
    d) 1 space + 1 tab + '---help---'
    e) 1 tab + '---help---' (correct indentation)
    f) 1 tab + 1 space + '---help---'
    g) 1 tab + 2 spaces + '---help---'

    In order to convert all of them to 1 tab + 'help', I ran the
    following commend:

    $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'

    Signed-off-by: Masahiro Yamada

    Masahiro Yamada
     

10 Jun, 2020

2 commits

  • Convert comments that reference old mmap_sem APIs to reference
    corresponding new mmap locking APIs instead.

    Signed-off-by: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Reviewed-by: Vlastimil Babka
    Reviewed-by: Davidlohr Bueso
    Reviewed-by: Daniel Jordan
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Jason Gunthorpe
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Laurent Dufour
    Cc: Liam Howlett
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Ying Han
    Link: http://lkml.kernel.org/r/20200520052908.204642-12-walken@google.com
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     
  • This change converts the existing mmap_sem rwsem calls to use the new mmap
    locking API instead.

    The change is generated using coccinelle with the following rule:

    // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .

    @@
    expression mm;
    @@
    (
    -init_rwsem
    +mmap_init_lock
    |
    -down_write
    +mmap_write_lock
    |
    -down_write_killable
    +mmap_write_lock_killable
    |
    -down_write_trylock
    +mmap_write_trylock
    |
    -up_write
    +mmap_write_unlock
    |
    -downgrade_write
    +mmap_write_downgrade
    |
    -down_read
    +mmap_read_lock
    |
    -down_read_killable
    +mmap_read_lock_killable
    |
    -down_read_trylock
    +mmap_read_trylock
    |
    -up_read
    +mmap_read_unlock
    )
    -(&mm->mmap_sem)
    +(mm)

    Signed-off-by: Michel Lespinasse
    Signed-off-by: Andrew Morton
    Reviewed-by: Daniel Jordan
    Reviewed-by: Laurent Dufour
    Reviewed-by: Vlastimil Babka
    Cc: Davidlohr Bueso
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Jason Gunthorpe
    Cc: Jerome Glisse
    Cc: John Hubbard
    Cc: Liam Howlett
    Cc: Matthew Wilcox
    Cc: Peter Zijlstra
    Cc: Ying Han
    Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     

23 Apr, 2020

2 commits

  • The pointer ctx is being initialized with a value that is never read
    and it is being updated later with a new value. The initialization
    is redundant and can be removed.

    Addresses-Coverity: ("Unused value")
    Signed-off-by: Colin Ian King
    Acked-by: Christian Brauner
    Link: https://lore.kernel.org/r/20200402105000.506296-1-colin.king@canonical.com
    Signed-off-by: Greg Kroah-Hartman

    Colin Ian King
     
  • Fix missing braces compilation warning in the ARM
    compiler environment:
    drivers/android/binderfs.c: In function 'binderfs_fill_super':
    drivers/android/binderfs.c:650:9: warning: missing braces around initializer [-Wmissing-braces]
    struct binderfs_device device_info = { 0 };
    drivers/android/binderfs.c:650:9: warning: (near initialization for ‘device_info.name’) [-Wmissing-braces]

    Acked-by: Christian Brauner
    Signed-off-by: Tang Bin
    Link: https://lore.kernel.org/r/20200411145151.5576-1-tangbin@cmss.chinamobile.com
    Signed-off-by: Greg Kroah-Hartman

    Tang Bin
     

23 Mar, 2020

1 commit


19 Mar, 2020

1 commit

  • When I first wrote binderfs the new mount api had not yet landed. Now
    that it has been around for a little while and a bunch of filesystems
    have already been ported we should do so too. When Al sent his
    mount-api-conversion pr he requested that binderfs (and a few others) be
    ported separately. It's time we port binderfs. We can make use of the
    new option parser, get nicer infrastructure and it will be easier if we
    ever add any new mount options.

    This survives testing with the binderfs selftests:

    for i in `seq 1 1000`; do ./binderfs_test; done

    including the new stress tests I sent out for review today:

    TAP version 13
    1..1
    # selftests: filesystems/binderfs: binderfs_test
    # [==========] Running 3 tests from 1 test cases.
    # [ RUN ] global.binderfs_stress
    # [ XFAIL! ] Tests are not run as root. Skipping privileged tests
    # [==========] Running 3 tests from 1 test cases.
    # [ RUN ] global.binderfs_stress
    # [ OK ] global.binderfs_stress
    # [ RUN ] global.binderfs_test_privileged
    # [ OK ] global.binderfs_test_privileged
    # [ RUN ] global.binderfs_test_unprivileged
    # # Allocated new binder device with major 243, minor 4, and name my-binder
    # # Detected binder version: 8
    # [==========] Running 3 tests from 1 test cases.
    # [ RUN ] global.binderfs_stress
    # [ OK ] global.binderfs_stress
    # [ RUN ] global.binderfs_test_privileged
    # [ OK ] global.binderfs_test_privileged
    # [ RUN ] global.binderfs_test_unprivileged
    # [ OK ] global.binderfs_test_unprivileged
    # [==========] 3 / 3 tests passed.
    # [ PASSED ]
    ok 1 selftests: filesystems/binderfs: binderfs_test

    Cc: Todd Kjos
    Signed-off-by: Christian Brauner
    Reviewed-by: Kees Cook
    Link: https://lore.kernel.org/r/20200313153427.141789-1-christian.brauner@ubuntu.com
    Signed-off-by: Greg Kroah-Hartman

    Christian Brauner
     

12 Mar, 2020

1 commit

  • Binderfs binder-control devices are cleaned up via binderfs_evict_inode
    too() which will use refcount_dec_and_test(). However, we missed to set
    the refcount for binderfs binder-control devices and so we underflowed
    when the binderfs instance got unmounted. Pretty obvious oversight and
    should have been part of the more general UAF fix. The good news is that
    having test cases (suprisingly) helps.

    Technically, we could detect that we're about to cleanup the
    binder-control dentry in binderfs_evict_inode() and then simply clean it
    up. But that makes the assumption that the binder driver itself will
    never make use of a binderfs binder-control device after the binderfs
    instance it belongs to has been unmounted and the superblock for it been
    destroyed. While it is unlikely to ever come to this let's be on the
    safe side. Performance-wise this also really doesn't matter since the
    binder-control device is only every really when creating the binderfs
    filesystem or creating additional binder devices. Both operations are
    pretty rare.

    Fixes: f0fe2c0f050d ("binder: prevent UAF for binderfs devices II")
    Link: https://lore.kernel.org/r/CA+G9fYusdfg7PMfC9Xce-xLT7NiyKSbgojpK35GOm=Pf9jXXrA@mail.gmail.com
    Reported-by: Naresh Kamboju
    Cc: stable@vger.kernel.org
    Signed-off-by: Christian Brauner
    Acked-by: Todd Kjos
    Link: https://lore.kernel.org/r/20200311105309.1742827-1-christian.brauner@ubuntu.com
    Signed-off-by: Greg Kroah-Hartman

    Christian Brauner
     

04 Mar, 2020

1 commit

  • This is a necessary follow up to the first fix I proposed and we merged
    in 2669b8b0c79 ("binder: prevent UAF for binderfs devices"). I have been
    overly optimistic that the simple fix I proposed would work. But alas,
    ihold() + iput() won't work since the inodes won't survive the
    destruction of the superblock.
    So all we get with my prior fix is a different race with a tinier
    race-window but it doesn't solve the issue. Fwiw, the problem lies with
    generic_shutdown_super(). It even has this cozy Al-style comment:

    if (!list_empty(&sb->s_inodes)) {
    printk("VFS: Busy inodes after unmount of %s. "
    "Self-destruct in 5 seconds. Have a nice day...\n",
    sb->s_id);
    }

    On binder_release(), binder_defer_work(proc, BINDER_DEFERRED_RELEASE) is
    called which punts the actual cleanup operation to a workqueue. At some
    point, binder_deferred_func() will be called which will end up calling
    binder_deferred_release() which will retrieve and cleanup the
    binder_context attach to this struct binder_proc.

    If we trace back where this binder_context is attached to binder_proc we
    see that it is set in binder_open() and is taken from the struct
    binder_device it is associated with. This obviously assumes that the
    struct binder_device that context is attached to is _never_ freed. While
    that might be true for devtmpfs binder devices it is most certainly
    wrong for binderfs binder devices.

    So, assume binder_open() is called on a binderfs binder devices. We now
    stash away the struct binder_context associated with that struct
    binder_devices:
    proc->context = &binder_dev->context;
    /* binderfs stashes devices in i_private */
    if (is_binderfs_device(nodp)) {
    binder_dev = nodp->i_private;
    info = nodp->i_sb->s_fs_info;
    binder_binderfs_dir_entry_proc = info->proc_log_dir;
    } else {
    .
    .
    .
    proc->context = &binder_dev->context;

    Now let's assume that the binderfs instance for that binder devices is
    shutdown via umount() and/or the mount namespace associated with it goes
    away. As long as there is still an fd open for that binderfs binder
    device things are fine. But let's assume we now close the last fd for
    that binderfs binder device. Now binder_release() is called and punts to
    the workqueue. Assume that the workqueue has quite a bit of stuff to do
    and doesn't get to cleaning up the struct binder_proc and the associated
    struct binder_context with it for that binderfs binder device right
    away. In the meantime, the VFS is killing the super block and is
    ultimately calling sb->evict_inode() which means it will call
    binderfs_evict_inode() which does:

    static void binderfs_evict_inode(struct inode *inode)
    {
    struct binder_device *device = inode->i_private;
    struct binderfs_info *info = BINDERFS_I(inode);

    clear_inode(inode);

    if (!S_ISCHR(inode->i_mode) || !device)
    return;

    mutex_lock(&binderfs_minors_mutex);
    --info->device_count;
    ida_free(&binderfs_minors, device->miscdev.minor);
    mutex_unlock(&binderfs_minors_mutex);

    kfree(device->context.name);
    kfree(device);
    }

    thereby freeing the struct binder_device including struct
    binder_context.

    Now the workqueue finally has time to get around to cleaning up struct
    binder_proc and is now trying to access the associate struct
    binder_context. Since it's already freed it will OOPs.

    Fix this by introducing a refounct on binder devices.

    This is an alternative fix to 51d8a7eca677 ("binder: prevent UAF read in
    print_binder_transaction_log_entry()").

    Fixes: 3ad20fe393b3 ("binder: implement binderfs")
    Fixes: 2669b8b0c798 ("binder: prevent UAF for binderfs devices")
    Fixes: 03e2e07e3814 ("binder: Make transaction_log available in binderfs")
    Related : 51d8a7eca677 ("binder: prevent UAF read in print_binder_transaction_log_entry()")
    Cc: stable@vger.kernel.org
    Signed-off-by: Christian Brauner
    Acked-by: Todd Kjos
    Link: https://lore.kernel.org/r/20200303164340.670054-1-christian.brauner@ubuntu.com
    Signed-off-by: Greg Kroah-Hartman

    Christian Brauner
     

03 Mar, 2020

1 commit

  • On binder_release(), binder_defer_work(proc, BINDER_DEFERRED_RELEASE) is
    called which punts the actual cleanup operation to a workqueue. At some
    point, binder_deferred_func() will be called which will end up calling
    binder_deferred_release() which will retrieve and cleanup the
    binder_context attach to this struct binder_proc.

    If we trace back where this binder_context is attached to binder_proc we
    see that it is set in binder_open() and is taken from the struct
    binder_device it is associated with. This obviously assumes that the
    struct binder_device that context is attached to is _never_ freed. While
    that might be true for devtmpfs binder devices it is most certainly
    wrong for binderfs binder devices.

    So, assume binder_open() is called on a binderfs binder devices. We now
    stash away the struct binder_context associated with that struct
    binder_devices:
    proc->context = &binder_dev->context;
    /* binderfs stashes devices in i_private */
    if (is_binderfs_device(nodp)) {
    binder_dev = nodp->i_private;
    info = nodp->i_sb->s_fs_info;
    binder_binderfs_dir_entry_proc = info->proc_log_dir;
    } else {
    .
    .
    .
    proc->context = &binder_dev->context;

    Now let's assume that the binderfs instance for that binder devices is
    shutdown via umount() and/or the mount namespace associated with it goes
    away. As long as there is still an fd open for that binderfs binder
    device things are fine. But let's assume we now close the last fd for
    that binderfs binder device. Now binder_release() is called and punts to
    the workqueue. Assume that the workqueue has quite a bit of stuff to do
    and doesn't get to cleaning up the struct binder_proc and the associated
    struct binder_context with it for that binderfs binder device right
    away. In the meantime, the VFS is killing the super block and is
    ultimately calling sb->evict_inode() which means it will call
    binderfs_evict_inode() which does:

    static void binderfs_evict_inode(struct inode *inode)
    {
    struct binder_device *device = inode->i_private;
    struct binderfs_info *info = BINDERFS_I(inode);

    clear_inode(inode);

    if (!S_ISCHR(inode->i_mode) || !device)
    return;

    mutex_lock(&binderfs_minors_mutex);
    --info->device_count;
    ida_free(&binderfs_minors, device->miscdev.minor);
    mutex_unlock(&binderfs_minors_mutex);

    kfree(device->context.name);
    kfree(device);
    }

    thereby freeing the struct binder_device including struct
    binder_context.

    Now the workqueue finally has time to get around to cleaning up struct
    binder_proc and is now trying to access the associate struct
    binder_context. Since it's already freed it will OOPs.

    Fix this by holding an additional reference to the inode that is only
    released once the workqueue is done cleaning up struct binder_proc. This
    is an easy alternative to introducing separate refcounting on struct
    binder_device which we can always do later if it becomes necessary.

    This is an alternative fix to 51d8a7eca677 ("binder: prevent UAF read in
    print_binder_transaction_log_entry()").

    Fixes: 3ad20fe393b3 ("binder: implement binderfs")
    Fixes: 03e2e07e3814 ("binder: Make transaction_log available in binderfs")
    Related : 51d8a7eca677 ("binder: prevent UAF read in print_binder_transaction_log_entry()")
    Cc: stable@vger.kernel.org
    Signed-off-by: Christian Brauner
    Acked-by: Todd Kjos
    Signed-off-by: Greg Kroah-Hartman

    Christian Brauner
     

30 Jan, 2020

1 commit

  • Pull io_uring updates from Jens Axboe:

    - Support for various new opcodes (fallocate, openat, close, statx,
    fadvise, madvise, openat2, non-vectored read/write, send/recv, and
    epoll_ctl)

    - Faster ring quiesce for fileset updates

    - Optimizations for overflow condition checking

    - Support for max-sized clamping

    - Support for probing what opcodes are supported

    - Support for io-wq backend sharing between "sibling" rings

    - Support for registering personalities

    - Lots of little fixes and improvements

    * tag 'for-5.6/io_uring-vfs-2020-01-29' of git://git.kernel.dk/linux-block: (64 commits)
    io_uring: add support for epoll_ctl(2)
    eventpoll: support non-blocking do_epoll_ctl() calls
    eventpoll: abstract out epoll_ctl() handler
    io_uring: fix linked command file table usage
    io_uring: support using a registered personality for commands
    io_uring: allow registering credentials
    io_uring: add io-wq workqueue sharing
    io-wq: allow grabbing existing io-wq
    io_uring/io-wq: don't use static creds/mm assignments
    io-wq: make the io_wq ref counted
    io_uring: fix refcounting with batched allocations at OOM
    io_uring: add comment for drain_next
    io_uring: don't attempt to copy iovec for READ/WRITE
    io_uring: honor IOSQE_ASYNC for linked reqs
    io_uring: prep req when do IOSQE_ASYNC
    io_uring: use labeled array init in io_op_defs
    io_uring: optimise sqe-to-req flags translation
    io_uring: remove REQ_F_IO_DRAINED
    io_uring: file switch work needs to get flushed on exit
    io_uring: hide uring_fd in ctx
    ...

    Linus Torvalds
     

22 Jan, 2020

1 commit

  • Since commit 43e23b6c0b01 ("debugfs: log errors when something goes wrong")
    debugfs logs attempts to create existing files.

    However binder attempts to create multiple debugfs files with
    the same name when a single PID has multiple contexts, this leads
    to log spamming during an Android boot (17 such messages during
    boot on my system).

    Fix this by checking if we already know the PID and only create
    the debugfs entry for the first context per PID.

    Do the same thing for binderfs for symmetry.

    Signed-off-by: Martin Fuzzey
    Acked-by: Todd Kjos
    Fixes: 43e23b6c0b01 ("debugfs: log errors when something goes wrong")
    Cc: stable
    Link: https://lore.kernel.org/r/1578671054-5982-1-git-send-email-martin.fuzzey@flowbird.group
    Signed-off-by: Greg Kroah-Hartman

    Martin Fuzzey
     

21 Jan, 2020

1 commit


14 Dec, 2019

1 commit

  • For BINDER_TYPE_PTR and BINDER_TYPE_FDA transactions, the
    num_valid local was calculated incorrectly causing the
    range check in binder_validate_ptr() to miss out-of-bounds
    offsets.

    Fixes: bde4a19fc04f ("binder: use userspace pointer as base of buffer space")
    Signed-off-by: Todd Kjos
    Cc: stable
    Link: https://lore.kernel.org/r/20191213202531.55010-1-tkjos@google.com
    Signed-off-by: Greg Kroah-Hartman

    Todd Kjos
     

02 Dec, 2019

1 commit

  • Pull removal of most of fs/compat_ioctl.c from Arnd Bergmann:
    "As part of the cleanup of some remaining y2038 issues, I came to
    fs/compat_ioctl.c, which still has a couple of commands that need
    support for time64_t.

    In completely unrelated work, I spent time on cleaning up parts of
    this file in the past, moving things out into drivers instead.

    After Al Viro reviewed an earlier version of this series and did a lot
    more of that cleanup, I decided to try to completely eliminate the
    rest of it and move it all into drivers.

    This series incorporates some of Al's work and many patches of my own,
    but in the end stops short of actually removing the last part, which
    is the scsi ioctl handlers. I have patches for those as well, but they
    need more testing or possibly a rewrite"

    * tag 'compat-ioctl-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (42 commits)
    scsi: sd: enable compat ioctls for sed-opal
    pktcdvd: add compat_ioctl handler
    compat_ioctl: move SG_GET_REQUEST_TABLE handling
    compat_ioctl: ppp: move simple commands into ppp_generic.c
    compat_ioctl: handle PPPIOCGIDLE for 64-bit time_t
    compat_ioctl: move PPPIOCSCOMPRESS to ppp_generic
    compat_ioctl: unify copy-in of ppp filters
    tty: handle compat PPP ioctls
    compat_ioctl: move SIOCOUTQ out of compat_ioctl.c
    compat_ioctl: handle SIOCOUTQNSD
    af_unix: add compat_ioctl support
    compat_ioctl: reimplement SG_IO handling
    compat_ioctl: move WDIOC handling into wdt drivers
    fs: compat_ioctl: move FITRIM emulation into file systems
    gfs2: add compat_ioctl support
    compat_ioctl: remove unused convert_in_user macro
    compat_ioctl: remove last RAID handling code
    compat_ioctl: remove /dev/raw ioctl translation
    compat_ioctl: remove PCI ioctl translation
    compat_ioctl: remove joystick ioctl translation
    ...

    Linus Torvalds
     

14 Nov, 2019

3 commits

  • The old loop wouldn't stop when reaching `start` if `start==NULL`, instead
    continuing backwards to index -1 and crashing.

    Luckily you need to be highly privileged to map things at NULL, so it's not
    a big problem.

    Fix it by adjusting the loop so that the loop variable is always in bounds.

    This patch is deliberately minimal to simplify backporting, but IMO this
    function could use a refactor. The jump labels in the second loop body are
    horrible (the error gotos should be jumping to free_range instead), and
    both loops would look nicer if they just iterated upwards through indices.
    And the up_read()+mmput() shouldn't be duplicated like that.

    Cc: stable@vger.kernel.org
    Fixes: 457b9a6f09f0 ("Staging: android: add binder driver")
    Signed-off-by: Jann Horn
    Acked-by: Christian Brauner
    Link: https://lore.kernel.org/r/20191018205631.248274-3-jannh@google.com
    Signed-off-by: Greg Kroah-Hartman

    Jann Horn
     
  • binder_alloc_mmap_handler() attempts to detect the use of ->mmap() on a
    binder_proc whose binder_alloc has already been initialized by checking
    whether alloc->buffer is non-zero.

    Before commit 880211667b20 ("binder: remove kernel vm_area for buffer
    space"), alloc->buffer was a kernel mapping address, which is always
    non-zero, but since that commit, it is a userspace mapping address.

    A sufficiently privileged user can map /dev/binder at NULL, tricking
    binder_alloc_mmap_handler() into assuming that the binder_proc has not been
    mapped yet. This leads to memory unsafety.
    Luckily, no context on Android has such privileges, and on a typical Linux
    desktop system, you need to be root to do that.

    Fix it by using the mapping size instead of the mapping address to
    distinguish the mapped case. A valid VMA can't have size zero.

    Fixes: 880211667b20 ("binder: remove kernel vm_area for buffer space")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jann Horn
    Acked-by: Christian Brauner
    Link: https://lore.kernel.org/r/20191018205631.248274-2-jannh@google.com
    Signed-off-by: Greg Kroah-Hartman

    Jann Horn
     
  • binder_alloc_print_pages() iterates over
    alloc->pages[0..alloc->buffer_size-1] under alloc->mutex.
    binder_alloc_mmap_handler() writes alloc->pages and alloc->buffer_size
    without holding that lock, and even writes them before the last bailout
    point.

    Unfortunately we can't take the alloc->mutex in the ->mmap() handler
    because mmap_sem can be taken while alloc->mutex is held.
    So instead, we have to locklessly check whether the binder_alloc has been
    fully initialized with binder_alloc_get_vma(), like in
    binder_alloc_new_buf_locked().

    Fixes: 8ef4665aa129 ("android: binder: Add page usage in binder stats")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jann Horn
    Acked-by: Christian Brauner
    Link: https://lore.kernel.org/r/20191018205631.248274-1-jannh@google.com
    Signed-off-by: Greg Kroah-Hartman

    Jann Horn
     

28 Oct, 2019

1 commit


23 Oct, 2019

1 commit

  • The .ioctl and .compat_ioctl file operations have the same prototype so
    they can both point to the same function, which works great almost all
    the time when all the commands are compatible.

    One exception is the s390 architecture, where a compat pointer is only
    31 bit wide, and converting it into a 64-bit pointer requires calling
    compat_ptr(). Most drivers here will never run in s390, but since we now
    have a generic helper for it, it's easy enough to use it consistently.

    I double-checked all these drivers to ensure that all ioctl arguments
    are used as pointers or are ignored, but are not interpreted as integer
    values.

    Acked-by: Jason Gunthorpe
    Acked-by: Daniel Vetter
    Acked-by: Mauro Carvalho Chehab
    Acked-by: Greg Kroah-Hartman
    Acked-by: David Sterba
    Acked-by: Darren Hart (VMware)
    Acked-by: Jonathan Cameron
    Acked-by: Bjorn Andersson
    Acked-by: Dan Williams
    Signed-off-by: Arnd Bergmann

    Arnd Bergmann
     

22 Oct, 2019

1 commit

  • vm_insert_page() does increment the page refcount, and just to be sure,
    I've confirmed it by printing page_count(page[0].page_ptr) before and after
    vm_insert_page(). It's 1 before, 2 afterwards, as expected.

    Fixes: a145dd411eb2 ("VM: add "vm_insert_page()" function")
    Signed-off-by: Jann Horn
    Acked-by: Christian Brauner
    Link: https://lore.kernel.org/r/20191018153946.128584-1-jannh@google.com
    Signed-off-by: Greg Kroah-Hartman

    Jann Horn
     

17 Oct, 2019

2 commits

  • SZ_1K has been defined in include/linux/sizes.h since v3.6. Get rid of the
    duplicate definition.

    Signed-off-by: Jann Horn
    Acked-by: Christian Brauner
    Link: https://lore.kernel.org/r/20191016150119.154756-2-jannh@google.com
    Signed-off-by: Greg Kroah-Hartman

    Jann Horn
     
  • binder_mmap() tries to prevent the creation of overly big binder mappings
    by silently truncating the size of the VMA to 4MiB. However, this violates
    the API contract of mmap(). If userspace attempts to create a large binder
    VMA, and later attempts to unmap that VMA, it will call munmap() on a range
    beyond the end of the VMA, which may have been allocated to another VMA in
    the meantime. This can lead to userspace memory corruption.

    The following sequence of calls leads to a segfault without this commit:

    int main(void) {
    int binder_fd = open("/dev/binder", O_RDWR);
    if (binder_fd == -1) err(1, "open binder");
    void *binder_mapping = mmap(NULL, 0x800000UL, PROT_READ, MAP_SHARED,
    binder_fd, 0);
    if (binder_mapping == MAP_FAILED) err(1, "mmap binder");
    void *data_mapping = mmap(NULL, 0x400000UL, PROT_READ|PROT_WRITE,
    MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    if (data_mapping == MAP_FAILED) err(1, "mmap data");
    munmap(binder_mapping, 0x800000UL);
    *(char*)data_mapping = 1;
    return 0;
    }

    Cc: stable@vger.kernel.org
    Signed-off-by: Jann Horn
    Acked-by: Todd Kjos
    Acked-by: Christian Brauner
    Link: https://lore.kernel.org/r/20191016150119.154756-1-jannh@google.com
    Signed-off-by: Greg Kroah-Hartman

    Jann Horn