07 Jun, 2013

1 commit

  • I broke them in this commit:

    commit 1be374a0518a288147c6a7398792583200a67261
    Author: Andy Lutomirski
    Date: Wed May 22 14:07:44 2013 -0700

    net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg

    This patch adds __sys_sendmsg and __sys_sendmsg as common helpers that accept
    MSG_CMSG_COMPAT and blocks MSG_CMSG_COMPAT at the syscall entrypoints. It
    also reverts some unnecessary checks in sys_socketcall.

    Apparently I was suffering from underscore blindness the first time around.

    Signed-off-by: Andy Lutomirski
    Tested-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Andy Lutomirski
     

29 May, 2013

1 commit

  • To: linux-kernel@vger.kernel.org
    Cc: x86@kernel.org, trinity@vger.kernel.org, Andy Lutomirski , netdev@vger.kernel.org, "David S.
    Miller"
    Subject: [PATCH 5/5] net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg

    MSG_CMSG_COMPAT is (AFAIK) not intended to be part of the API --
    it's a hack that steals a bit to indicate to other networking code
    that a compat entry was used. So don't allow it from a non-compat
    syscall.

    This prevents an oops when running this code:

    int main()
    {
    int s;
    struct sockaddr_in addr;
    struct msghdr *hdr;

    char *highpage = mmap((void*)(TASK_SIZE_MAX - 4096), 4096,
    PROT_READ | PROT_WRITE,
    MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (highpage == MAP_FAILED)
    err(1, "mmap");

    s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (s == -1)
    err(1, "socket");

    addr.sin_family = AF_INET;
    addr.sin_port = htons(1);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (connect(s, (struct sockaddr*)&addr, sizeof(addr)) != 0)
    err(1, "connect");

    void *evil = highpage + 4096 - COMPAT_MSGHDR_SIZE;
    printf("Evil address is %p\n", evil);

    if (syscall(__NR_sendmmsg, s, evil, 1, MSG_CMSG_COMPAT) < 0)
    err(1, "sendmmsg");

    return 0;
    }

    Cc: David S. Miller
    Signed-off-by: Andy Lutomirski
    Signed-off-by: David S. Miller

    Andy Lutomirski
     

12 May, 2013

1 commit

  • Pull audit changes from Eric Paris:
    "Al used to send pull requests every couple of years but he told me to
    just start pushing them to you directly.

    Our touching outside of core audit code is pretty straight forward. A
    couple of interface changes which hit net/. A simple argument bug
    calling audit functions in namei.c and the removal of some assembly
    branch prediction code on ppc"

    * git://git.infradead.org/users/eparis/audit: (31 commits)
    audit: fix message spacing printing auid
    Revert "audit: move kaudit thread start from auditd registration to kaudit init"
    audit: vfs: fix audit_inode call in O_CREAT case of do_last
    audit: Make testing for a valid loginuid explicit.
    audit: fix event coverage of AUDIT_ANOM_LINK
    audit: use spin_lock in audit_receive_msg to process tty logging
    audit: do not needlessly take a lock in tty_audit_exit
    audit: do not needlessly take a spinlock in copy_signal
    audit: add an option to control logging of passwords with pam_tty_audit
    audit: use spin_lock_irqsave/restore in audit tty code
    helper for some session id stuff
    audit: use a consistent audit helper to log lsm information
    audit: push loginuid and sessionid processing down
    audit: stop pushing loginid, uid, sessionid as arguments
    audit: remove the old depricated kernel interface
    audit: make validity checking generic
    audit: allow checking the type of audit message in the user filter
    audit: fix build break when AUDIT_DEBUG == 2
    audit: remove duplicate export of audit_enabled
    Audit: do not print error when LSMs disabled
    ...

    Linus Torvalds
     

02 May, 2013

1 commit

  • Pull VFS updates from Al Viro,

    Misc cleanups all over the place, mainly wrt /proc interfaces (switch
    create_proc_entry to proc_create(), get rid of the deprecated
    create_proc_read_entry() in favor of using proc_create_data() and
    seq_file etc).

    7kloc removed.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
    don't bother with deferred freeing of fdtables
    proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
    proc: Make the PROC_I() and PDE() macros internal to procfs
    proc: Supply a function to remove a proc entry by PDE
    take cgroup_open() and cpuset_open() to fs/proc/base.c
    ppc: Clean up scanlog
    ppc: Clean up rtas_flash driver somewhat
    hostap: proc: Use remove_proc_subtree()
    drm: proc: Use remove_proc_subtree()
    drm: proc: Use minor->index to label things, not PDE->name
    drm: Constify drm_proc_list[]
    zoran: Don't print proc_dir_entry data in debug
    reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
    proc: Supply an accessor for getting the data from a PDE's parent
    airo: Use remove_proc_subtree()
    rtl8192u: Don't need to save device proc dir PDE
    rtl8187se: Use a dir under /proc/net/r8180/
    proc: Add proc_mkdir_data()
    proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
    proc: Move PDE_NET() to fs/proc/proc_net.c
    ...

    Linus Torvalds
     

30 Apr, 2013

1 commit


20 Apr, 2013

1 commit

  • Currently, ktime2ts is a small helper function that is only used in
    net/socket.c. Move this helper into the ktime API as a small inline
    function, so that i) it's maintained together with ktime routines,
    and ii) also other files can make use of it. The function is named
    ktime_to_timespec_cond() and placed into the generic part of ktime,
    since we internally make use of ktime_to_timespec(). ktime_to_timespec()
    itself does not check the ktime variable for zero, hence, we name
    this function ktime_to_timespec_cond() for only a conditional
    conversion, and adapt its users to it.

    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

15 Apr, 2013

1 commit

  • Currently, sock_tx_timestamp() always returns 0. The comment that
    describes the sock_tx_timestamp() function wrongly says that it
    returns an error when an invalid argument is passed (from commit
    20d4947353be, ``net: socket infrastructure for SO_TIMESTAMPING'').
    Make the function void, so that we can also remove all the unneeded
    if conditions that check for such a _non-existant_ error case in the
    output path.

    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

11 Apr, 2013

1 commit


27 Feb, 2013

1 commit

  • Pull vfs pile (part one) from Al Viro:
    "Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
    locking violations, etc.

    The most visible changes here are death of FS_REVAL_DOT (replaced with
    "has ->d_weak_revalidate()") and a new helper getting from struct file
    to inode. Some bits of preparation to xattr method interface changes.

    Misc patches by various people sent this cycle *and* ocfs2 fixes from
    several cycles ago that should've been upstream right then.

    PS: the next vfs pile will be xattr stuff."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
    saner proc_get_inode() calling conventions
    proc: avoid extra pde_put() in proc_fill_super()
    fs: change return values from -EACCES to -EPERM
    fs/exec.c: make bprm_mm_init() static
    ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
    ocfs2: fix possible use-after-free with AIO
    ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
    get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
    target: writev() on single-element vector is pointless
    export kernel_write(), convert open-coded instances
    fs: encode_fh: return FILEID_INVALID if invalid fid_type
    kill f_vfsmnt
    vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
    nfsd: handle vfs_getattr errors in acl protocol
    switch vfs_getattr() to struct path
    default SET_PERSONALITY() in linux/elf.h
    ceph: prepopulate inodes only when request is aborted
    d_hash_and_lookup(): export, switch open-coded instances
    9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
    9p: split dropping the acls from v9fs_set_create_acl()
    ...

    Linus Torvalds
     

26 Feb, 2013

1 commit


23 Feb, 2013

1 commit

  • Allocating a file structure in function get_empty_filp() might fail because
    of several reasons:
    - not enough memory for file structures
    - operation is not allowed
    - user is over its limit

    Currently the function returns NULL in all cases and we loose the exact
    reason of the error. All callers of get_empty_filp() assume that the function
    can fail with ENFILE only.

    Return error through pointer. Change all callers to preserve this error code.

    [AV: cleaned up a bit, carved the get_empty_filp() part out into a separate commit
    (things remaining here deal with alloc_file()), removed pipe(2) behaviour change]

    Signed-off-by: Anatol Pomozov
    Reviewed-by: "Theodore Ts'o"
    Signed-off-by: Al Viro

    Anatol Pomozov
     

12 Feb, 2013

1 commit


01 Feb, 2013

1 commit

  • The original suggestion to delete wanrouter started earlier
    with the mainline commit f0d1b3c2bcc5de8a17af5f2274f7fcde8292b5fc
    ("net/wanrouter: Deprecate and schedule for removal") in May 2012.

    More importantly, Dan Carpenter found[1] that the driver had a
    fundamental breakage introduced back in 2008, with commit
    7be6065b39c3 ("netdevice wanrouter: Convert directly reference of
    netdev->priv"). So we know with certainty that the code hasn't been
    used by anyone willing to at least take the effort to send an e-mail
    report of breakage for at least 4 years.

    This commit does a decouple of the wanrouter subsystem, by going
    after the Makefile/Kconfig and similar files, so that these mainline
    files that we are keeping do not have the big wanrouter file/driver
    deletion commit tied into their history.

    Once this commit is in place, we then can remove the obsolete cyclomx
    drivers and similar that have a dependency on CONFIG_WAN_ROUTER_DRIVERS.

    [1] http://www.spinics.net/lists/netdev/msg218670.html

    Originally-by: Joe Perches
    Cc: Dan Carpenter
    Cc: Arnaldo Carvalho de Melo
    Signed-off-by: Paul Gortmaker

    Paul Gortmaker
     

26 Oct, 2012

2 commits

  • The cgroup logic part of net_cls is very similar as the one in
    net_prio. Let's stream line the net_cls logic with the net_prio one.

    The net_prio update logic was changed by following commit (note there
    were some changes necessary later on)

    commit 406a3c638ce8b17d9704052c07955490f732c2b8
    Author: John Fastabend
    Date: Fri Jul 20 10:39:25 2012 +0000

    net: netprio_cgroup: rework update socket logic

    Instead of updating the sk_cgrp_prioidx struct field on every send
    this only updates the field when a task is moved via cgroup
    infrastructure.

    This allows sockets that may be used by a kernel worker thread
    to be managed. For example in the iscsi case today a user can
    put iscsid in a netprio cgroup and control traffic will be sent
    with the correct sk_cgrp_prioidx value set but as soon as data
    is sent the kernel worker thread isssues a send and sk_cgrp_prioidx
    is updated with the kernel worker threads value which is the
    default case.

    It seems more correct to only update the field when the user
    explicitly sets it via control group infrastructure. This allows
    the users to manage sockets that may be used with other threads.

    Since classid is now updated when the task is moved between the
    cgroups, we don't have to call sock_update_classid() from various
    places to ensure we always using the latest classid value.

    [v2: Use iterate_fd() instead of open coding]

    Signed-off-by: Daniel Wagner
    Cc: Li Zefan
    Cc: "David S. Miller"
    Cc: "Michael S. Tsirkin"
    Cc: Jamal Hadi Salim
    Cc: Joe Perches
    Cc: John Fastabend
    Cc: Neil Horman
    Cc: Stanislav Kinsbursky
    Cc: Tejun Heo
    Cc:
    Cc:
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Daniel Wagner
     
  • sock_update_classid() assumes that the update operation always are
    applied on the current task. sock_update_classid() needs to know on
    which tasks to work on in order to be able to migrate task between
    cgroups using the struct cgroup_subsys attach() callback.

    Signed-off-by: Daniel Wagner
    Cc: "David S. Miller"
    Cc: "Michael S. Tsirkin"
    Cc: Eric Dumazet
    Cc: Glauber Costa
    Cc: Joe Perches
    Cc: Neil Horman
    Cc: Stanislav Kinsbursky
    Cc: Tejun Heo
    Cc:
    Cc:
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Daniel Wagner
     

03 Oct, 2012

1 commit

  • Pull vfs update from Al Viro:

    - big one - consolidation of descriptor-related logics; almost all of
    that is moved to fs/file.c

    (BTW, I'm seriously tempted to rename the result to fd.c. As it is,
    we have a situation when file_table.c is about handling of struct
    file and file.c is about handling of descriptor tables; the reasons
    are historical - file_table.c used to be about a static array of
    struct file we used to have way back).

    A lot of stray ends got cleaned up and converted to saner primitives,
    disgusting mess in android/binder.c is still disgusting, but at least
    doesn't poke so much in descriptor table guts anymore. A bunch of
    relatively minor races got fixed in process, plus an ext4 struct file
    leak.

    - related thing - fget_light() partially unuglified; see fdget() in
    there (and yes, it generates the code as good as we used to have).

    - also related - bits of Cyrill's procfs stuff that got entangled into
    that work; _not_ all of it, just the initial move to fs/proc/fd.c and
    switch of fdinfo to seq_file.

    - Alex's fs/coredump.c spiltoff - the same story, had been easier to
    take that commit than mess with conflicts. The rest is a separate
    pile, this was just a mechanical code movement.

    - a few misc patches all over the place. Not all for this cycle,
    there'll be more (and quite a few currently sit in akpm's tree)."

    Fix up trivial conflicts in the android binder driver, and some fairly
    simple conflicts due to two different changes to the sock_alloc_file()
    interface ("take descriptor handling from sock_alloc_file() to callers"
    vs "net: Providing protocol type via system.sockprotoname xattr of
    /proc/PID/fd entries" adding a dentry name to the socket)

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
    MAX_LFS_FILESIZE should be a loff_t
    compat: fs: Generic compat_sys_sendfile implementation
    fs: push rcu_barrier() from deactivate_locked_super() to filesystems
    btrfs: reada_extent doesn't need kref for refcount
    coredump: move core dump functionality into its own file
    coredump: prevent double-free on an error path in core dumper
    usb/gadget: fix misannotations
    fcntl: fix misannotations
    ceph: don't abuse d_delete() on failure exits
    hypfs: ->d_parent is never NULL or negative
    vfs: delete surplus inode NULL check
    switch simple cases of fget_light to fdget
    new helpers: fdget()/fdput()
    switch o2hb_region_dev_write() to fget_light()
    proc_map_files_readdir(): don't bother with grabbing files
    make get_file() return its argument
    vhost_set_vring(): turn pollstart/pollstop into bool
    switch prctl_set_mm_exe_file() to fget_light()
    switch xfs_find_handle() to fget_light()
    switch xfs_swapext() to fget_light()
    ...

    Linus Torvalds
     

28 Sep, 2012

1 commit

  • It seems sk_init() has no value today and even does strange things :

    # grep . /proc/sys/net/core/?mem_*
    /proc/sys/net/core/rmem_default:212992
    /proc/sys/net/core/rmem_max:131071
    /proc/sys/net/core/wmem_default:212992
    /proc/sys/net/core/wmem_max:131071

    We can remove it completely.

    Signed-off-by: Eric Dumazet
    Reviewed-by: Shan Wei
    Signed-off-by: David S. Miller

    Eric Dumazet
     

27 Sep, 2012

2 commits


15 Sep, 2012

1 commit

  • Conflicts:
    net/netfilter/nfnetlink_log.c
    net/netfilter/xt_LOG.c

    Rather easy conflict resolution, the 'net' tree had bug fixes to make
    sure we checked if a socket is a time-wait one or not and elide the
    logging code if so.

    Whereas on the 'net-next' side we are calculating the UID and GID from
    the creds using different interfaces due to the user namespace changes
    from Eric Biederman.

    Signed-off-by: David S. Miller

    David S. Miller
     

06 Sep, 2012

1 commit

  • Commit 644595f89620 ("compat: Handle COMPAT_USE_64BIT_TIME in
    net/socket.c") introduced a bug where the helper functions to take
    either a 64-bit or compat time[spec|val] got the arguments in the wrong
    order, passing the kernel stack pointer off as a user pointer (and vice
    versa).

    Because of the user address range check, that in turn then causes an
    EFAULT due to the user pointer range checking failing for the kernel
    address. Incorrectly resuling in a failed system call for 32-bit
    processes with a 64-bit kernel.

    On odder architectures like HP-PA (with separate user/kernel address
    spaces), it can be used read kernel memory.

    Signed-off-by: Mikulas Patocka
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Mikulas Patocka
     

05 Sep, 2012

1 commit

  • lsof reports some of socket descriptors as "can't identify protocol" like:

    [yamato@localhost]/tmp% sudo lsof | grep dbus | grep iden
    dbus-daem 652 dbus 6u sock ... 17812 can't identify protocol
    dbus-daem 652 dbus 34u sock ... 24689 can't identify protocol
    dbus-daem 652 dbus 42u sock ... 24739 can't identify protocol
    dbus-daem 652 dbus 48u sock ... 22329 can't identify protocol
    ...

    lsof cannot resolve the protocol used in a socket because procfs
    doesn't provide the map between inode number on sockfs and protocol
    type of the socket.

    For improving the situation this patch adds an extended attribute named
    'system.sockprotoname' in which the protocol name for
    /proc/PID/fd/SOCKET is stored. So lsof can know the protocol for a
    given /proc/PID/fd/SOCKET with getxattr system call.

    A few weeks ago I submitted a patch for the same purpose. The patch
    was introduced /proc/net/sockfs which enumerates inodes and protocols
    of all sockets alive on a system. However, it was rejected because (1)
    a global lock was needed, and (2) the layout of struct socket was
    changed with the patch.

    This patch doesn't use any global lock; and doesn't change the layout
    of any structs.

    In this patch, a protocol name is stored to dentry->d_name of sockfs
    when new socket is associated with a file descriptor. Before this
    patch dentry->d_name was not used; it was just filled with empty
    string. lsof may use an extended attribute named
    'system.sockprotoname' to retrieve the value of dentry->d_name.

    It is nice if we can see the protocol name with ls -l
    /proc/PID/fd. However, "socket:[#INODE]", the name format returned
    from sockfs_dname() was already defined. To keep the compatibility
    between kernel and user land, the extended attribute is used to
    prepare the value of dentry->d_name.

    Signed-off-by: Masatake YAMATO
    Signed-off-by: David S. Miller

    Masatake YAMATO
     

16 Aug, 2012

1 commit

  • The implementation of dev_ifconf() for the compat ioctl interface uses
    an intermediate ifc structure allocated in userland for the duration of
    the syscall. Though, it fails to initialize the padding bytes inserted
    for alignment and that for leaks four bytes of kernel stack. Add an
    explicit memset(0) before filling the structure to avoid the info leak.

    Signed-off-by: Mathias Krause
    Signed-off-by: David S. Miller

    Mathias Krause
     

23 Jul, 2012

1 commit

  • Instead of updating the sk_cgrp_prioidx struct field on every send
    this only updates the field when a task is moved via cgroup
    infrastructure.

    This allows sockets that may be used by a kernel worker thread
    to be managed. For example in the iscsi case today a user can
    put iscsid in a netprio cgroup and control traffic will be sent
    with the correct sk_cgrp_prioidx value set but as soon as data
    is sent the kernel worker thread isssues a send and sk_cgrp_prioidx
    is updated with the kernel worker threads value which is the
    default case.

    It seems more correct to only update the field when the user
    explicitly sets it via control group infrastructure. This allows
    the users to manage sockets that may be used with other threads.

    Signed-off-by: John Fastabend
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    John Fastabend
     

21 Jul, 2012

1 commit

  • This patch fixes a crash
    tun_chr_close -> netdev_run_todo -> tun_free_netdev -> sk_release_kernel ->
    sock_release -> iput(SOCK_INODE(sock))
    introduced by commit 1ab5ecb90cb6a3df1476e052f76a6e8f6511cb3d

    The problem is that this socket is embedded in struct tun_struct, it has
    no inode, iput is called on invalid inode, which modifies invalid memory
    and optionally causes a crash.

    sock_release also decrements sockets_in_use, this causes a bug that
    "sockets: used" field in /proc/*/net/sockstat keeps on decreasing when
    creating and closing tun devices.

    This patch introduces a flag SOCK_EXTERNALLY_ALLOCATED that instructs
    sock_release to not free the inode and not decrement sockets_in_use,
    fixing both memory corruption and sockets_in_use underflow.

    It should be backported to 3.3 an 3.4 stabke.

    Signed-off-by: Mikulas Patocka
    Cc: stable@kernel.org
    Signed-off-by: David S. Miller

    Mikulas Patocka
     

23 May, 2012

1 commit


16 May, 2012

1 commit


15 May, 2012

1 commit

  • percpu_xxx funcs are duplicated with this_cpu_xxx funcs, so replace
    them for further code clean up.

    And in preempt safe scenario, __this_cpu_xxx funcs may has a bit
    better performance since __this_cpu_xxx has no redundant
    preempt_enable/preempt_disable on some architectures.

    Signed-off-by: Alex Shi
    Acked-by: Eric Dumazet
    Acked-by: David S. Miller
    Cc: Patrick McHardy
    Signed-off-by: Andrew Morton
    Signed-off-by: Tejun Heo

    Alex Shi
     

22 Apr, 2012

1 commit

  • iov of more than 8 entries are allocated in sendmsg()/recvmsg() through
    sock_kmalloc()

    As these allocations are temporary only and small enough, it makes sense
    to use plain kmalloc() and avoid sk_omem_alloc atomic overhead.

    Slightly changed fast path to be even faster.

    Signed-off-by: Eric Dumazet
    Cc: Mike Waychison
    Signed-off-by: David S. Miller

    Eric Dumazet
     

21 Apr, 2012

1 commit


16 Apr, 2012

1 commit


06 Apr, 2012

1 commit

  • commit 2f533844242 (tcp: allow splice() to build full TSO packets) added
    a regression for splice() calls using SPLICE_F_MORE.

    We need to call tcp_flush() at the end of the last page processed in
    tcp_sendpages(), or else transmits can be deferred and future sends
    stall.

    Add a new internal flag, MSG_SENDPAGE_NOTLAST, acting like MSG_MORE, but
    with different semantic.

    For all sendpage() providers, its a transparent change. Only
    sock_sendpage() and tcp_sendpages() can differentiate the two different
    flags provided by pipe_to_sendpage()

    Reported-by: Tom Herbert
    Cc: Nandita Dukkipati
    Cc: Neal Cardwell
    Cc: Tom Herbert
    Cc: Yuchung Cheng
    Cc: H.K. Jerry Chu
    Cc: Maciej Żenczykowski
    Cc: Mahesh Bandewar
    Cc: Ilpo Järvinen
    Signed-off-by: Eric Dumazet com>
    Signed-off-by: David S. Miller

    Eric Dumazet
     

30 Mar, 2012

1 commit

  • Pull x32 support for x86-64 from Ingo Molnar:
    "This tree introduces the X32 binary format and execution mode for x86:
    32-bit data space binaries using 64-bit instructions and 64-bit kernel
    syscalls.

    This allows applications whose working set fits into a 32 bits address
    space to make use of 64-bit instructions while using a 32-bit address
    space with shorter pointers, more compressed data structures, etc."

    Fix up trivial context conflicts in arch/x86/{Kconfig,vdso/vma.c}

    * 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (71 commits)
    x32: Fix alignment fail in struct compat_siginfo
    x32: Fix stupid ia32/x32 inversion in the siginfo format
    x32: Add ptrace for x32
    x32: Switch to a 64-bit clock_t
    x32: Provide separate is_ia32_task() and is_x32_task() predicates
    x86, mtrr: Use explicit sizing and padding for the 64-bit ioctls
    x86/x32: Fix the binutils auto-detect
    x32: Warn and disable rather than error if binutils too old
    x32: Only clear TIF_X32 flag once
    x32: Make sure TS_COMPAT is cleared for x32 tasks
    fs: Remove missed ->fds_bits from cessation use of fd_set structs internally
    fs: Fix close_on_exec pointer in alloc_fdtable
    x32: Drop non-__vdso weak symbols from the x32 VDSO
    x32: Fix coding style violations in the x32 VDSO code
    x32: Add x32 VDSO support
    x32: Allow x32 to be configured
    x32: If configured, add x32 system calls to system call tables
    x32: Handle process creation
    x32: Signal-related system calls
    x86: Add #ifdef CONFIG_COMPAT to
    ...

    Linus Torvalds
     

12 Mar, 2012

1 commit

  • The following 4 functions:
    move_addr_to_kernel
    move_addr_to_user
    verify_iovec
    verify_compat_iovec
    are always effectively called with a sockaddr_storage.

    Make this explicit by changing their signature.

    This removes a large number of casts from sockaddr_storage to sockaddr.

    Signed-off-by: Maciej Żenczykowski
    Signed-off-by: David S. Miller

    Maciej Żenczykowski
     

21 Feb, 2012

1 commit


13 Jan, 2012

1 commit

  • commit a9b3cd7f32 (rcu: convert uses of rcu_assign_pointer(x, NULL) to
    RCU_INIT_POINTER) did a lot of incorrect changes, since it did a
    complete conversion of rcu_assign_pointer(x, y) to RCU_INIT_POINTER(x,
    y).

    We miss needed barriers, even on x86, when y is not NULL.

    Signed-off-by: Eric Dumazet
    CC: Stephen Hemminger
    CC: Paul E. McKenney
    Signed-off-by: David S. Miller

    Eric Dumazet
     

07 Jan, 2012

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1958 commits)
    net: pack skb_shared_info more efficiently
    net_sched: red: split red_parms into parms and vars
    net_sched: sfq: extend limits
    cnic: Improve error recovery on bnx2x devices
    cnic: Re-init dev->stats_addr after chip reset
    net_sched: Bug in netem reordering
    bna: fix sparse warnings/errors
    bna: make ethtool_ops and strings const
    xgmac: cleanups
    net: make ethtool_ops const
    vmxnet3" make ethtool ops const
    xen-netback: make ops structs const
    virtio_net: Pass gfp flags when allocating rx buffers.
    ixgbe: FCoE: Add support for ndo_get_fcoe_hbainfo() call
    netdev: FCoE: Add new ndo_get_fcoe_hbainfo() call
    igb: reset PHY after recovering from PHY power down
    igb: add basic runtime PM support
    igb: Add support for byte queue limits.
    e1000: cleanup CE4100 MDIO registers access
    e1000: unmap ce4100_gbe_mdio_base_virt in e1000_remove
    ...

    Linus Torvalds
     

06 Jan, 2012

1 commit

  • We're doing some odd things there, which already messes up various users
    (see the net/socket.c code that this removes), and it was going to add
    yet more crud to the block layer because of the incorrect error code
    translation.

    ENOIOCTLCMD is not an error return that should be returned to user mode
    from the "ioctl()" system call, but it should *not* be translated as
    EINVAL ("Invalid argument"). It should be translated as ENOTTY
    ("Inappropriate ioctl for device").

    That EINVAL confusion has apparently so permeated some code that the
    block layer actually checks for it, which is sad. We continue to do so
    for now, but add a big comment about how wrong that is, and we should
    remove it entirely eventually. In the meantime, this tries to keep the
    changes localized to just the EINVAL -> ENOTTY fix, and removing code
    that makes it harder to do the right thing.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

05 Jan, 2012

1 commit

  • Define special location values for RX NFC that request the driver to
    select the actual rule location. This allows for implementation on
    devices that use hash-based filter lookup, whereas currently the API is
    more suited to devices with TCAM lookup or linear search.

    In ethtool_set_rxnfc() and the compat wrapper ethtool_ioctl(), copy
    the structure back to user-space after insertion so that the actual
    location is returned.

    Signed-off-by: Ben Hutchings
    Signed-off-by: David S. Miller

    Ben Hutchings
     

23 Nov, 2011

1 commit

  • This patch adds in the infrastructure code to create the network priority
    cgroup. The cgroup, in addition to the standard processes file creates two
    control files:

    1) prioidx - This is a read-only file that exports the index of this cgroup.
    This is a value that is both arbitrary and unique to a cgroup in this subsystem,
    and is used to index the per-device priority map

    2) priomap - This is a writeable file. On read it reports a table of 2-tuples
    where name is the name of a network interface and priority is
    indicates the priority assigned to frames egresessing on the named interface and
    originating from a pid in this cgroup

    This cgroup allows for skb priority to be set prior to a root qdisc getting
    selected. This is benenficial for DCB enabled systems, in that it allows for any
    application to use dcb configured priorities so without application modification

    Signed-off-by: Neil Horman
    Signed-off-by: John Fastabend
    CC: Robert Love
    CC: "David S. Miller"
    Signed-off-by: David S. Miller

    Neil Horman