21 May, 2016

1 commit

  • UUID library provides uuid_be type and uuid_be_to_bin() function. This
    substitutes open coded variant by generic library calls.

    Signed-off-by: Andy Shevchenko
    Reviewed-by: Matt Fleming
    Cc: Dmitry Kasatkin
    Cc: Mimi Zohar
    Cc: Rasmus Villemoes
    Cc: Arnd Bergmann
    Cc: "Theodore Ts'o"
    Cc: Al Viro
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Shevchenko
     

23 Mar, 2016

1 commit

  • This commit fixes the following security hole affecting systems where
    all of the following conditions are fulfilled:

    - The fs.suid_dumpable sysctl is set to 2.
    - The kernel.core_pattern sysctl's value starts with "/". (Systems
    where kernel.core_pattern starts with "|/" are not affected.)
    - Unprivileged user namespace creation is permitted. (This is
    true on Linux >=3.8, but some distributions disallow it by
    default using a distro patch.)

    Under these conditions, if a program executes under secure exec rules,
    causing it to run with the SUID_DUMP_ROOT flag, then unshares its user
    namespace, changes its root directory and crashes, the coredump will be
    written using fsuid=0 and a path derived from kernel.core_pattern - but
    this path is interpreted relative to the root directory of the process,
    allowing the attacker to control where a coredump will be written with
    root privileges.

    To fix the security issue, always interpret core_pattern for dumps that
    are written under SUID_DUMP_ROOT relative to the root directory of init.

    Signed-off-by: Jann Horn
    Acked-by: Kees Cook
    Cc: Al Viro
    Cc: "Eric W. Biederman"
    Cc: Andy Lutomirski
    Cc: Oleg Nesterov
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jann Horn
     

11 Dec, 2014

1 commit

  • There have been several times where I have had to rebuild a kernel to
    cause a panic when hitting a WARN() in the code in order to get a crash
    dump from a system. Sometimes this is easy to do, other times (such as
    in the case of a remote admin) it is not trivial to send new images to
    the user.

    A much easier method would be a switch to change the WARN() over to a
    panic. This makes debugging easier in that I can now test the actual
    image the WARN() was seen on and I do not have to engage in remote
    debugging.

    This patch adds a panic_on_warn kernel parameter and
    /proc/sys/kernel/panic_on_warn calls panic() in the
    warn_slowpath_common() path. The function will still print out the
    location of the warning.

    An example of the panic_on_warn output:

    The first line below is from the WARN_ON() to output the WARN_ON()'s
    location. After that the panic() output is displayed.

    WARNING: CPU: 30 PID: 11698 at /home/prarit/dummy_module/dummy-module.c:25 init_dummy+0x1f/0x30 [dummy_module]()
    Kernel panic - not syncing: panic_on_warn set ...

    CPU: 30 PID: 11698 Comm: insmod Tainted: G W OE 3.17.0+ #57
    Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013
    0000000000000000 000000008e3f87df ffff88080f093c38 ffffffff81665190
    0000000000000000 ffffffff818aea3d ffff88080f093cb8 ffffffff8165e2ec
    ffffffff00000008 ffff88080f093cc8 ffff88080f093c68 000000008e3f87df
    Call Trace:
    [] dump_stack+0x46/0x58
    [] panic+0xd0/0x204
    [] ? init_dummy+0x1f/0x30 [dummy_module]
    [] warn_slowpath_common+0xd0/0xd0
    [] ? dummy_greetings+0x40/0x40 [dummy_module]
    [] warn_slowpath_null+0x1a/0x20
    [] init_dummy+0x1f/0x30 [dummy_module]
    [] do_one_initcall+0xd4/0x210
    [] ? __vunmap+0xc2/0x110
    [] load_module+0x16a9/0x1b30
    [] ? store_uevent+0x70/0x70
    [] ? copy_module_from_fd.isra.44+0x129/0x180
    [] SyS_finit_module+0xa6/0xd0
    [] system_call_fastpath+0x12/0x17

    Successfully tested by me.

    hpa said: There is another very valid use for this: many operators would
    rather a machine shuts down than being potentially compromised either
    functionally or security-wise.

    Signed-off-by: Prarit Bhargava
    Cc: Jonathan Corbet
    Cc: Rusty Russell
    Cc: "H. Peter Anvin"
    Cc: Andi Kleen
    Cc: Masami Hiramatsu
    Acked-by: Yasuaki Ishimatsu
    Cc: Fabian Frederick
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Prarit Bhargava
     

08 Oct, 2014

1 commit

  • Pull dmaengine updates from Dan Williams:
    "Even though this has fixes marked for -stable, given the size and the
    needed conflict resolutions this is 3.18-rc1/merge-window material.

    These patches have been languishing in my tree for a long while. The
    fact that I do not have the time to do proper/prompt maintenance of
    this tree is a primary factor in the decision to step down as
    dmaengine maintainer. That and the fact that the bulk of drivers/dma/
    activity is going through Vinod these days.

    The net_dma removal has not been in -next. It has developed simple
    conflicts against mainline and net-next (for-3.18).

    Continuing thanks to Vinod for staying on top of drivers/dma/.

    Summary:

    1/ Step down as dmaengine maintainer see commit 08223d80df38
    "dmaengine maintainer update"

    2/ Removal of net_dma, as it has been marked 'broken' since 3.13
    (commit 77873803363c "net_dma: mark broken"), without reports of
    performance regression.

    3/ Miscellaneous fixes"

    * tag 'dmaengine-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine:
    net: make tcp_cleanup_rbuf private
    net_dma: revert 'copied_early'
    net_dma: simple removal
    dmaengine maintainer update
    dmatest: prevent memory leakage on error path in thread
    ioat: Use time_before_jiffies()
    dmaengine: fix xor sources continuation
    dma: mv_xor: Rename __mv_xor_slot_cleanup() to mv_xor_slot_cleanup()
    dma: mv_xor: Remove all callers of mv_xor_slot_cleanup()
    dma: mv_xor: Remove unneeded mv_xor_clean_completed_slots() call
    ioat: Use pci_enable_msix_exact() instead of pci_enable_msix()
    drivers: dma: Include appropriate header file in dca.c
    drivers: dma: Mark functions as static in dma_v3.c
    dma: mv_xor: Add DMA API error checks
    ioat/dca: Use dev_is_pci() to check whether it is pci device

    Linus Torvalds
     

28 Sep, 2014

1 commit

  • Per commit "77873803363c net_dma: mark broken" net_dma is no longer used
    and there is no plan to fix it.

    This is the mechanical removal of bits in CONFIG_NET_DMA ifdef guards.
    Reverting the remainder of the net_dma induced changes is deferred to
    subsequent patches.

    Marked for stable due to Roman's report of a memory leak in
    dma_pin_iovec_pages():

    https://lkml.org/lkml/2014/9/3/177

    Cc: Dave Jiang
    Cc: Vinod Koul
    Cc: David Whipple
    Cc: Alexander Duyck
    Cc:
    Reported-by: Roman Gushchin
    Acked-by: David S. Miller
    Signed-off-by: Dan Williams

    Dan Williams
     

02 Jul, 2014

1 commit

  • This can be used in virtual networking applications, and
    may have other uses as well. The option is disabled by
    default.

    A specific use case is setting up virtual routers, bridges, and
    hosts on a single OS without the use of network namespaces or
    virtual machines. With proper use of ip rules, routing tables,
    veth interface pairs and/or other virtual interfaces,
    and applications that can bind to interfaces and/or IP addresses,
    it is possibly to create one or more virtual routers with multiple
    hosts attached. The host interfaces can act as IPv6 systems,
    with radvd running on the ports in the virtual routers. With the
    option provided in this patch enabled, those hosts can now properly
    obtain IPv6 addresses from the radvd.

    Signed-off-by: Ben Greear
    Signed-off-by: David S. Miller

    Ben Greear
     

13 Nov, 2013

1 commit


26 Jun, 2013

1 commit


10 May, 2013

1 commit


28 Feb, 2013

1 commit


27 Feb, 2013

1 commit

  • Pull vfs pile (part one) from Al Viro:
    "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
    locking violations, etc.

    The most visible changes here are death of FS_REVAL_DOT (replaced with
    "has ->d_weak_revalidate()") and a new helper getting from struct file
    to inode. Some bits of preparation to xattr method interface changes.

    Misc patches by various people sent this cycle *and* ocfs2 fixes from
    several cycles ago that should've been upstream right then.

    PS: the next vfs pile will be xattr stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    saner proc_get_inode() calling conventions
    proc: avoid extra pde_put() in proc_fill_super()
    fs: change return values from -EACCES to -EPERM
    fs/exec.c: make bprm_mm_init() static
    ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
    ocfs2: fix possible use-after-free with AIO
    ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
    get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
    target: writev() on single-element vector is pointless
    export kernel_write(), convert open-coded instances
    fs: encode_fh: return FILEID_INVALID if invalid fid_type
    kill f_vfsmnt
    vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
    nfsd: handle vfs_getattr errors in acl protocol
    switch vfs_getattr() to struct path
    default SET_PERSONALITY() in linux/elf.h
    ceph: prepopulate inodes only when request is aborted
    d_hash_and_lookup(): export, switch open-coded instances
    9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
    9p: split dropping the acls from v9fs_set_create_acl()
    ...

    Linus Torvalds
     

26 Feb, 2013

1 commit


06 Feb, 2013

1 commit

  • TCP Appropriate Byte Count was added by me, but later disabled.
    There is no point in maintaining it since it is a potential source
    of bugs and Linux already implements other better window protection
    heuristics.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     

19 Nov, 2012

1 commit

  • The expressions tsk->nsproxy->pid_ns and task_active_pid_ns
    aka ns_of_pid(task_pid(tsk)) should have the same number of
    cache line misses with the practical difference that
    ns_of_pid(task_pid(tsk)) is released later in a processes life.

    Furthermore by using task_active_pid_ns it becomes trivial
    to write an unshare implementation for the the pid namespace.

    So I have used task_active_pid_ns everywhere I can.

    In fork since the pid has not yet been attached to the
    process I use ns_of_pid, to achieve the same effect.

    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

01 Aug, 2012

1 commit

  • Since per-BDI flusher threads were introduced in 2.6, the pdflush
    mechanism is not used any more. But the old interface exported through
    /proc/sys/vm/nr_pdflush_threads still exists and is obviously useless.

    For back-compatibility, printk warning information and return 2 to notify
    the users that the interface is removed.

    Signed-off-by: Wanpeng Li
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wanpeng Li
     

21 Dec, 2011

1 commit

  • binary_sysctl() calls sysctl_getname() which allocates from names_cache
    slab usin __getname()

    The matching function to free the name is __putname(), and not putname()
    which should be used only to match getname() allocations.

    This is because when auditing is enabled, putname() calls audit_putname
    *instead* (not in addition) to __putname(). Then, if a syscall is in
    progress, audit_putname does not release the name - instead, it expects
    the name to get released when the syscall completes, but that will happen
    only if audit_getname() was called previously, i.e. if the name was
    allocated with getname() rather than the naked __getname(). So,
    __getname() followed by putname() ends up leaking memory.

    Signed-off-by: Michel Lespinasse
    Acked-by: Al Viro
    Cc: Christoph Hellwig
    Cc: Eric Paris
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     

04 Oct, 2011

1 commit


13 Aug, 2011

1 commit

  • Use the move from Linux 2.6 to Linux 3.x as an excuse to kill the
    annoying subdirectories in the XFS source code. Besides the large
    amount of file rename the only changes are to the Makefile, a few
    files including headers with the subdirectory prefix, and the binary
    sysctl compat code that includes a header under fs/xfs/ from
    kernel/.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Alex Elder

    Christoph Hellwig
     

14 Mar, 2011

1 commit

  • new function: file_open_root(dentry, mnt, name, flags) opens the file
    vfs_path_lookup would arrive to.

    Note that name can be empty; in that case the usual requirement that
    dentry should be a directory is lifted.

    open-coded equivalents switched to it, may_open() got down exactly
    one caller and became static.

    Signed-off-by: Al Viro

    Al Viro
     

14 Jan, 2011

1 commit

  • * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits)
    Documentation/trace/events.txt: Remove obsolete sched_signal_send.
    writeback: fix global_dirty_limits comment runtime -> real-time
    ppc: fix comment typo singal -> signal
    drivers: fix comment typo diable -> disable.
    m68k: fix comment typo diable -> disable.
    wireless: comment typo fix diable -> disable.
    media: comment typo fix diable -> disable.
    remove doc for obsolete dynamic-printk kernel-parameter
    remove extraneous 'is' from Documentation/iostats.txt
    Fix spelling milisec -> ms in snd_ps3 module parameter description
    Fix spelling mistakes in comments
    Revert conflicting V4L changes
    i7core_edac: fix typos in comments
    mm/rmap.c: fix comment
    sound, ca0106: Fix assignment to 'channel'.
    hrtimer: fix a typo in comment
    init/Kconfig: fix typo
    anon_inodes: fix wrong function name in comment
    fix comment typos concerning "consistent"
    poll: fix a typo in comment
    ...

    Fix up trivial conflicts in:
    - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c)
    - fs/ext4/ext4.h

    Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.

    Linus Torvalds
     

10 Dec, 2010

1 commit

  • Originally adapted from Huang Ying's patch which moved the
    unknown_nmi_panic to the traps.c file. Because the old nmi
    watchdog was deleted before this change happened, the
    unknown_nmi_panic sysctl was lost. This re-adds it.

    Also, the nmi_watchdog sysctl was re-implemented and its
    documentation updated accordingly.

    Patch-inspired-by: Huang Ying
    Signed-off-by: Don Zickus
    Reviewed-by: Cyrill Gorcunov
    Acked-by: Yinghai Lu
    Cc: fweisbec@gmail.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Don Zickus
     

02 Nov, 2010

1 commit

  • "gadget", "through", "command", "maintain", "maintain", "controller", "address",
    "between", "initiali[zs]e", "instead", "function", "select", "already",
    "equal", "access", "management", "hierarchy", "registration", "interest",
    "relative", "memory", "offset", "already",

    Signed-off-by: Uwe Kleine-König
    Signed-off-by: Jiri Kosina

    Uwe Kleine-König
     

25 May, 2010

1 commit


08 May, 2010

1 commit

  • A while back there was a discussion regarding the rt_secret_interval timer.
    Given that we've had the ability to do emergency route cache rebuilds for awhile
    now, based on a statistical analysis of the various hash chain lengths in the
    cache, the use of the flush timer is somewhat redundant. This patch removes the
    rt_secret_interval sysctl, allowing us to rely solely on the statistical
    analysis mechanism to determine the need for route cache flushes.

    Signed-off-by: Neil Horman
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Neil Horman
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the followings.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and try to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    wildly available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build test were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable easily on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

04 Mar, 2010

1 commit


24 Dec, 2009

1 commit

  • When printing legacy sysctls print the warning message
    for each of them only once. This way there is a guarantee
    the syslog won't be flooded for any sane program.

    The original attempt at this made the tables non const and stored
    the flag inline.

    Linus suggested using a separate hash table for this, this is based on a
    code snippet from him.

    The hash implies this is not exact and can sometimes not print a
    new sysctl due to a hash collision, but in practice this should not
    be a problem

    I used a FNV32 hash over the binary string with a 32byte bitmap. This
    gives relatively little collisions when all the predefined binary sysctls
    are hashed:

    size 256
    bucket
    length number
    0: [25]
    1: [67]
    2: [88]
    3: [47]
    4: [22]
    5: [6]
    6: [1]

    The worst case is a single collision of 6 hash values.

    Signed-off-by: Andi Kleen

    Andi Kleen
     

17 Dec, 2009

1 commit

  • As predicted during code review, the sysctl(2) changes made systems with
    old glibc nearly unusable. About every command gives a:

    warning: process `ls' used the deprecated sysctl system call with 1.4

    warning in the log.

    I see this on a SUSE 10.0 system with glibc 2.3.5.

    Don't warn for this common case.

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     

12 Nov, 2009

3 commits


11 Nov, 2009

1 commit

  • To simply maintenance and to be able to remove all of the binary
    sysctl support from various subsystems I have rewritten the binary
    sysctl code as a compatibility wrapper around proc/sys.

    The code is built around a hard coded table based on the table
    in sysctl_check.c that lists all of our current binary sysctls
    and provides enough information to convert from the sysctl
    binary input into into ascii and back again. New in this
    patch is the realization that the only dynamic entries
    that need to be handled have ifname as the asscii string
    and ifindex as their ctl_name.

    When a sys_sysctl is called the code now looks in the
    translation table converting the binary name to the
    path under /proc where the value is to be found. Opens
    that file, and calls into a format conversion wrapper
    that calls fop->read and then fop->write as appropriate.

    Since in practice the practically no one uses or tests
    sys_sysctl rewritting the code to be beautiful is a little
    silly. The redeeming merit of this work is it allows us to
    rip out all of the binary sysctl syscall support from
    everywhere else in the tree. Allowing us to remove
    a lot of dead (after this patch) and barely maintained code.

    In addition it becomes much easier to optimize the sysctl
    implementation for being the backing store of /proc/sys,
    without having to worry about sys_sysctl.

    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     

06 Nov, 2009

4 commits

  • Now that all of the architectures use compat_sys_sysctl do_sysctl
    can become static.

    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     
  • This uses compat_alloc_userspace to remove the various
    hacks to allow do_sysctl to write to throuh oldlenp.

    The rest of our mature compat syscall helper facitilies
    are used as well to ensure we have a nice clean maintainable
    compat syscall that can be used on all architectures.

    The motiviation for a generic compat sysctl (besides the
    obvious hack removal) is to reduce the number of compat
    sysctl defintions out there so I can refactor the
    binary sysctl implementation.

    ppc already used the name compat_sys_sysctl so I remove the
    ppcs version here.

    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Acked-by: Arnd Bergmann
    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     
  • Read in the binary sysctl path once, instead of reread it
    from user space each time the code needs to access a path
    element.

    The deprecated sysctl warning is moved to do_sysctl so
    that the compat_sysctl entries syscalls will also warn.

    The return of -ENOSYS when !CONFIG_SYSCTL_SYSCALL is moved
    to binary_sysctl. Always leaving a do_sysctl available
    that handles !CONFIG_SYSCTL_SYSCALL and printing the
    deprecated sysctl warning allows for a single defitition
    of the sysctl syscall.

    Signed-off-by: Eric W. Biederman

    Eric W. Biederman
     
  • In preparation for more invasive cleanups separate the core
    binary sysctl logic into it's own file.

    Signed-off-by: Eric W. Biederman

    Eric W. Biederman