11 Oct, 2016

6 commits

  • Pull networking fixes from David Miller:

    1) Netfilter list handling fix, from Linus.

    2) RXRPC/AFS bug fixes (oops on call to serviceless endpoints, build
    warnings, missing notifications, etc.) From David Howells.

    3) Kernel log message missing newlines, from Colin Ian King.

    4) Don't enter direct reclaim in netlink dumps. The idea is to try a
    high-order allocation first and fall back quickly to a 0-order
    allocation if the high-order one cannot be done cheaply and
    without reclaim. From Eric Dumazet.

    5) Fix firmware download errors in btusb Bluetooth driver, from Ethan
    Hsieh.

    6) Missing Kconfig deps for QCOM_EMAC, from Geert Uytterhoeven.

    7) Fix MDIO_XGENE dup Kconfig entry. From Laura Abbott.

    8) Constrain ipv6 rtr_solicits sysctl values properly, from Maciej
    Żenczykowski.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits)
    netfilter: Fix slab corruption.
    be2net: Enable VF link state setting for BE3
    be2net: Fix TX stats for TSO packets
    be2net: Update Copyright string in be_hw.h
    be2net: NCSI FW section should be properly updated with ethtool for BE3
    be2net: Provide an alternate way to read pf_num for BEx chips
    wan/fsl_ucc_hdlc: Fix size used in dma_free_coherent()
    net: macb: NULL out phydev after removing mdio bus
    xen-netback: make sure that hashes are not send to unaware frontends
    Fixing a bug in team driver due to incorrect 'unsigned int' to 'int' conversion
    MAINTAINERS: add myself as a maintainer of xen-netback
    ipv6 addrconf: disallow rtr_solicits < -1
    Bluetooth: btusb: Fix atheros firmware download error
    drivers: net: phy: Correct duplicate MDIO_XGENE entry
    ethernet: qualcomm: QCOM_EMAC should depend on HAS_DMA and HAS_IOMEM
    net: ethernet: mediatek: remove hwlro property in the device tree
    net: ethernet: mediatek: get hw lro capability by the chip id instead of by the dtsi
    net: ethernet: mediatek: get the chip id by ETHDMASYS registers
    net: bgmac: Fix errant feature flag check
    netlink: do not enter direct reclaim from netlink_dump()
    ...

    Linus Torvalds
     
  • Use the correct pattern for singly linked list insertion and
    deletion. We can also calculate the list head outside of the
    mutex.

    Fixes: e3b37f11e6e4 ("netfilter: replace list_head with single linked list")
    Signed-off-by: Linus Torvalds
    Reviewed-by: Aaron Conole
    Signed-off-by: David S. Miller

    net/netfilter/core.c | 108 ++++++++++++++++-----------------------------------
    1 file changed, 33 insertions(+), 75 deletions(-)
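The insertion and deletion pattern named in this message can be sketched in user space as follows. This is only an illustration under stated assumptions: `struct entry`, `list_insert()` and `list_remove()` are hypothetical names, the ordering by priority is assumed for the example, and the locking and RCU handling that net/netfilter/core.c needs are left out.

```c
#include <stddef.h>

/* Hypothetical node type; the kernel's hook entry structure differs. */
struct entry {
    int priority;
    struct entry *next;
};

/*
 * Insert e in priority order. Walking with a pointer to the link
 * (rather than a pointer to the node) means the head needs no
 * special case.
 */
static void list_insert(struct entry **head, struct entry *e)
{
    struct entry **pp = head;

    while (*pp && (*pp)->priority <= e->priority)
        pp = &(*pp)->next;
    e->next = *pp;
    *pp = e;
}

/* Unlink e if it is on the list; returns 1 if found, 0 otherwise. */
static int list_remove(struct entry **head, struct entry *e)
{
    struct entry **pp;

    for (pp = head; *pp; pp = &(*pp)->next) {
        if (*pp == e) {
            *pp = e->next;
            e->next = NULL;
            return 1;
        }
    }
    return 0;
}
```

Because both functions only ever write through `*pp`, removing the first node and removing a middle node are the same operation.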

    Linus Torvalds
     
  • Pull more vfs updates from Al Viro:
    "->rename2() work from Miklos + current_time() from Deepa"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    fs: Replace current_fs_time() with current_time()
    fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
    fs: Replace CURRENT_TIME with current_time() for inode timestamps
    fs: proc: Delete inode time initializations in proc_alloc_inode()
    vfs: Add current_time() api
    vfs: add note about i_op->rename changes to porting
    fs: rename "rename2" i_op to "rename"
    vfs: remove unused i_op->rename
    fs: make remaining filesystems use .rename2
    libfs: support RENAME_NOREPLACE in simple_rename()
    fs: support RENAME_NOREPLACE for local filesystems
    ncpfs: fix unused variable warning

    Linus Torvalds
     
  • Al Viro
     
  • Pull vfs xattr updates from Al Viro:
    "xattr stuff from Andreas

    This completes the switch to xattr_handler ->get()/->set() from
    ->getxattr/->setxattr/->removexattr"

    * 'work.xattr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    vfs: Remove {get,set,remove}xattr inode operations
    xattr: Stop calling {get,set,remove}xattr inode operations
    vfs: Check for the IOP_XATTR flag in listxattr
    xattr: Add __vfs_{get,set,remove}xattr helpers
    libfs: Use IOP_XATTR flag for empty directory handling
    vfs: Use IOP_XATTR flag for bad-inode handling
    vfs: Add IOP_XATTR inode operations flag
    vfs: Move xattr_resolve_name to the front of fs/xattr.c
    ecryptfs: Switch to generic xattr handlers
    sockfs: Get rid of getxattr iop
    sockfs: getxattr: Fail with -EOPNOTSUPP for invalid attribute names
    kernfs: Switch to generic xattr handlers
    hfs: Switch to generic xattr handlers
    jffs2: Remove jffs2_{get,set,remove}xattr macros
    xattr: Remove unnecessary NULL attribute name check

    Linus Torvalds
     
  • Pull Ceph updates from Ilya Dryomov:
    "The big ticket item here is support for rbd exclusive-lock feature,
    with maintenance operations offloaded to userspace (Douglas Fuller,
    Mike Christie and myself). Another block device bullet is a series
    fixing up layering error paths (myself).

    On the filesystem side, we've got patches that improve our handling of
    buffered vs dio write races (Neil Brown) and a few assorted fixes from
    Zheng. Also included a couple of random cleanups and a minor CRUSH
    update"

    * tag 'ceph-for-4.9-rc1' of git://github.com/ceph/ceph-client: (39 commits)
    crush: remove redundant local variable
    crush: don't normalize input of crush_ln iteratively
    libceph: ceph_build_auth() doesn't need ceph_auth_build_hello()
    libceph: use CEPH_AUTH_UNKNOWN in ceph_auth_build_hello()
    ceph: fix description for rsize and rasize mount options
    rbd: use kmalloc_array() in rbd_header_from_disk()
    ceph: use list_move instead of list_del/list_add
    ceph: handle CEPH_SESSION_REJECT message
    ceph: avoid accessing / when mounting a subpath
    ceph: fix mandatory flock check
    ceph: remove warning when ceph_releasepage() is called on dirty page
    ceph: ignore error from invalidate_inode_pages2_range() in direct write
    ceph: fix error handling of start_read()
    rbd: add rbd_obj_request_error() helper
    rbd: img_data requests don't own their page array
    rbd: don't call rbd_osd_req_format_read() for !img_data requests
    rbd: rework rbd_img_obj_exists_submit() error paths
    rbd: don't crash or leak on errors in rbd_img_obj_parent_read_full_callback()
    rbd: move bumping img_request refcount into rbd_obj_request_submit()
    rbd: mark the original request as done if stat request fails
    ...

    Linus Torvalds
     

10 Oct, 2016

1 commit

  • Pull main rdma updates from Doug Ledford:
    "This is the main pull request for the rdma stack this release. The
    code has been through 0day and I had it tagged for linux-next testing
    for a couple days.

    Summary:

    - updates to mlx5

    - updates to mlx4 (two conflicts, both minor and easily resolved)

    - updates to iw_cxgb4 (one conflict, not so obvious to resolve,
    proper resolution is to keep the code in cxgb4_main.c as it is in
    Linus' tree as attach_uld was refactored and moved into
    cxgb4_uld.c)

    - improvements to uAPI (moved vendor specific API elements to uAPI
    area)

    - add hns-roce driver and hns and hns-roce ACPI reset support

    - conversion of all rdma code away from deprecated
    create_singlethread_workqueue

    - security improvement: remove unsafe ib_get_dma_mr (breaks lustre in
    staging)"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (75 commits)
    staging/lustre: Disable InfiniBand support
    iw_cxgb4: add fast-path for small REG_MR operations
    cxgb4: advertise support for FR_NSMR_TPTE_WR
    IB/core: correctly handle rdma_rw_init_mrs() failure
    IB/srp: Fix infinite loop when FMR sg[0].offset != 0
    IB/srp: Remove an unused argument
    IB/core: Improve ib_map_mr_sg() documentation
    IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets
    IB/mthca: Move user vendor structures
    IB/nes: Move user vendor structures
    IB/ocrdma: Move user vendor structures
    IB/mlx4: Move user vendor structures
    IB/cxgb4: Move user vendor structures
    IB/cxgb3: Move user vendor structures
    IB/mlx5: Move and decouple user vendor structures
    IB/{core,hw}: Add constant for node_desc
    ipoib: Make ipoib_warn ratelimited
    IB/mlx4/alias_GUID: Remove deprecated create_singlethread_workqueue
    IB/ipoib_verbs: Remove deprecated create_singlethread_workqueue
    IB/ipoib: Remove deprecated create_singlethread_workqueue
    ...

    Linus Torvalds
     

08 Oct, 2016

7 commits

  • Johan Hedberg says:

    ====================
    pull request: bluetooth 2016-10-08

    Here are a couple of Bluetooth fixes for the 4.9 kernel:

    - Firmware download fix for Atheros controllers
    - Fixes to the content of LE scan response
    - New USB ID for a Marvell chipset

    Please let me know if there are any issues pulling. Thanks.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Merge updates from Andrew Morton:

    - fsnotify updates

    - ocfs2 updates

    - all of MM

    * emailed patches from Andrew Morton: (127 commits)
    console: don't prefer first registered if DT specifies stdout-path
    cred: simpler, 1D supplementary groups
    CREDITS: update Pavel's information, add GPG key, remove snail mail address
    mailmap: add Johan Hovold
    .gitattributes: set git diff driver for C source code files
    uprobes: remove function declarations from arch/{mips,s390}
    spelling.txt: "modeled" is spelt correctly
    nmi_backtrace: generate one-line reports for idle cpus
    arch/tile: adopt the new nmi_backtrace framework
    nmi_backtrace: do a local dump_stack() instead of a self-NMI
    nmi_backtrace: add more trigger_*_cpu_backtrace() methods
    min/max: remove sparse warnings when they're nested
    Documentation/filesystems/proc.txt: add more description for maps/smaps
    mm, proc: fix region lost in /proc/self/smaps
    proc: fix timerslack_ns CAP_SYS_NICE check when adjusting self
    proc: add LSM hook checks to /proc/<tid>/timerslack_ns
    proc: relax /proc/<tid>/timerslack_ns capability requirements
    meminfo: break apart a very long seq_printf with #ifdefs
    seq/proc: modify seq_put_decimal_[u]ll to take a const char *, not char
    proc: faster /proc/*/status
    ...

    Linus Torvalds
     
  • This disallows setting /proc/sys/net/ipv6/conf/*/router_solicitations
    to values below -1.

    -1 continues to mean an unlimited number of retransmits.

    Note: this depends on 'ipv6 addrconf: remove addrconf_sysctl_hop_limit()'

    Signed-off-by: Maciej Żenczykowski
    Signed-off-by: David S. Miller
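The constraint described above can be sketched as a standalone check. The helper names `rtr_solicits_validate()` and `may_send_solicit()` are hypothetical and only illustrate the semantics stated in the message (values below -1 rejected, -1 meaning unlimited); how the kernel wires this into its sysctl table is not shown here.

```c
#include <errno.h>

/* Reject values below -1; -1 and every non-negative value are accepted. */
static int rtr_solicits_validate(int value)
{
    if (value < -1)
        return -EINVAL;
    return 0;
}

/* Returns 1 if another solicitation may go out after 'sent' were sent. */
static int may_send_solicit(int rtr_solicits, int sent)
{
    if (rtr_solicits == -1)     /* -1: unlimited retransmits */
        return 1;
    return sent < rtr_solicits;
}
```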

    Maciej Żenczykowski
     
  • These inode operations are no longer used; remove them.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     
  • The current supplementary groups code can massively overallocate memory
    and is implemented so that each individual gid is accessed through a 2D
    array.

    If number of gids is
    Cc: Vasily Kulikov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • The cgroup core and the memory controller need to track socket ownership
    for different purposes, but the tracking sites being entirely different
    is kind of ugly.

    Be a better citizen and rename the memory controller callbacks to match
    the cgroup core callbacks, then move them to the same place.

    [akpm@linux-foundation.org: coding-style fixes]
    Link: http://lkml.kernel.org/r/20160914194846.11153-3-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner
    Acked-by: Tejun Heo
    Cc: "David S. Miller"
    Cc: Michal Hocko
    Cc: Vladimir Davydov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Pull VFS splice updates from Al Viro:
    "There's a bunch of branches this cycle, both mine and from other folks
    and I'd rather send pull requests separately.

    This one is the conversion of ->splice_read() to ITER_PIPE iov_iter
    (and introduction of such). Gets rid of a lot of code in fs/splice.c
    and elsewhere; there will be followups, but these are for the next
    cycle... Some pipe/splice-related cleanups from Miklos in the same
    branch as well"

    * 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    pipe: fix comment in pipe_buf_operations
    pipe: add pipe_buf_steal() helper
    pipe: add pipe_buf_confirm() helper
    pipe: add pipe_buf_release() helper
    pipe: add pipe_buf_get() helper
    relay: simplify relay_file_read()
    switch default_file_splice_read() to use of pipe-backed iov_iter
    switch generic_file_splice_read() to use of ->read_iter()
    new iov_iter flavour: pipe-backed
    fuse_dev_splice_read(): switch to add_to_pipe()
    skb_splice_bits(): get rid of callback
    new helper: add_to_pipe()
    splice: lift pipe_lock out of splice_to_pipe()
    splice: switch get_iovec_page_array() to iov_iter
    splice_to_pipe(): don't open-code wakeup_pipe_readers()
    consistent treatment of EFAULT on O_DIRECT read/write

    Linus Torvalds
     

07 Oct, 2016

6 commits

  • If we allow pseudo-filesystems created with mount_pseudo to have xattr
    handlers, we can replace sockfs_getxattr with a sockfs_xattr_get handler
    to use the xattr handler name parsing.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Al Viro

    Andreas Gruenbacher
     
  • The standard return value for unsupported attribute names is
    -EOPNOTSUPP, as opposed to undefined but supported attributes
    (-ENODATA).

    Also, fail for attribute names like "system.sockprotonameXXX" and
    simplify the code a bit.

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Al Viro
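The distinction between the two error codes, and the reason an exact name match matters, can be sketched like this. `sock_xattr_lookup()` is a hypothetical user-space helper, and returning the value length including the terminating NUL is an assumption made for the example.

```c
#include <errno.h>
#include <string.h>

#define SOCKPROTONAME "system.sockprotoname"

/*
 * Returns the value length (including the NUL) on success, or a
 * negative errno. proto_name is NULL when the attribute is supported
 * but has no value.
 */
static int sock_xattr_lookup(const char *name, const char *proto_name)
{
    /* Exact match: a prefix test would accept "system.sockprotonameXXX". */
    if (strcmp(name, SOCKPROTONAME) != 0)
        return -EOPNOTSUPP;     /* unsupported attribute name */
    if (!proto_name)
        return -ENODATA;        /* supported name, undefined value */
    return (int)strlen(proto_name) + 1;
}
```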

    Andreas Gruenbacher
     
  • …git/dhowells/linux-fs

    David Howells says:

    ====================
    rxrpc: Fixes

    This set of patches contains a bunch of fixes:

    (1) Fix an oops on incoming call to a local endpoint without a bound
    service.

    (2) Only ping for a lost reply in a client call (this is inapplicable to
    service calls).

    (3) Fix maybe uninitialised variable warnings in the ACK/ABORT sending
    function by splitting it.

    (4) Fix loss of PING RESPONSE ACKs due to them being subsumed by PING ACK
    generation.

    (5) OpenAFS improperly terminates calls it makes as a client under some
    circumstances by not fully hard-ACK'ing the last DATA packets. This
    is alleviated by a new call appearing on the same channel implicitly
    completing the previous call on that channel. Handle this implicit
    completion.

    (6) Properly handle expiry of service calls due to the aforementioned
    improper termination with no follow up call to implicitly complete it:

    (a) The call's background processor needs to be queued to complete the
    call, send an abort and notify the socket.

    (b) The call's background processor needs to notify the socket (or the
    kernel service) when it has completed the call.

    (c) A negative error code must thence be returned to the kernel
    service so that it knows the call died.

    (d) The AFS filesystem must detect the fatal error and end the call.

    (7) Must produce a DELAY ACK when the actual service operation takes a
    while to process and must cancel the ACK when the reply is ready.

    (8) Don't request an ACK on the last DATA packet of the Tx phase as this
    confuses OpenAFS.
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>

    David S. Miller
     
  • Since linux-3.15, netlink_dump() can use up to 16384 bytes skb
    allocations.

    Due to the ~320 bytes of struct skb_shared_info overhead, we end up
    using order-3 (on x86) page allocations, which might trigger direct
    reclaim and add stress.

    The intent was really to attempt a large allocation but immediately
    fall back to a smaller one (order-1 on x86) in case of memory stress.

    On recent kernels (linux-4.4), we can remove __GFP_DIRECT_RECLAIM to
    meet the goal. Old kernels would need to remove __GFP_WAIT.

    While we are at it, since we do an order-3 allocation, allow using
    all the allocated bytes instead of 16384 to reduce syscalls during
    large dumps.

    iproute2 already uses 32KB recvmsg() buffer sizes.

    Alexei provided an initial patch downsizing to SKB_WITH_OVERHEAD(16384).

    Fixes: 9063e21fb026 ("netlink: autosize skb lengthes")
    Signed-off-by: Eric Dumazet
    Reported-by: Alexei Starovoitov
    Cc: Greg Thelen
    Reviewed-by: Greg Rose
    Acked-by: Alexei Starovoitov
    Signed-off-by: David S. Miller
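The arithmetic behind the order-3 allocation can be checked with a small sketch. It assumes 4 KiB pages (as on x86) and takes the "~320 bytes" overhead from the message as an approximate figure; `order_for()` and `order_bytes()` are hypothetical helpers, not kernel functions.

```c
#include <stddef.h>

#define PAGE_SIZE_ASSUMED 4096u   /* assumption: 4 KiB pages */

/* Smallest order such that (PAGE_SIZE_ASSUMED << order) >= size. */
static unsigned int order_for(size_t size)
{
    unsigned int order = 0;
    size_t span = PAGE_SIZE_ASSUMED;

    while (span < size) {
        span <<= 1;
        order++;
    }
    return order;
}

/* Number of bytes actually backing an allocation of the given order. */
static size_t order_bytes(unsigned int order)
{
    return (size_t)PAGE_SIZE_ASSUMED << order;
}
```

A 16384-byte buffer alone fits in order 2, but 16384 plus roughly 320 bytes of overhead spills into order 3, which is 32768 bytes. That is why using all the allocated bytes, rather than capping at 16384, reduces the number of syscalls for large dumps.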

    Eric Dumazet
     
  • If a socket has the FANOUT sockopt set, a new prot_hook is registered
    as part of fanout_add(). When processing a NETDEV_UNREGISTER event in
    af_packet, __fanout_unlink() is called for all sockets, but the prot_hook
    that was registered as part of fanout_add() is not removed. Call
    fanout_release() on NETDEV_UNREGISTER, which removes the prot_hook and
    removes the fanout from the fanout_list.

    This fixes BUG_ON(!list_empty(&dev->ptype_specific)) in netdev_run_todo()

    Signed-off-by: Anoob Soman
    Signed-off-by: David S. Miller

    Anoob Soman
     
  • Pull namespace updates from Eric Biederman:
    "This set of changes is a number of smaller things that have been
    overlooked in other development cycles focused on more fundamental
    change. The devpts changes are small things that were a distraction
    until we managed to kill off DEVPTS_MULTIPLE_INSTANCES. There is a
    trivial regression fix to autofs for the unprivileged mount changes
    that went in last cycle. A pair of ioctls has been added by Andrey
    Vagin making it possible to discover the relationships between
    namespaces when referring to them through file descriptors.

    The big user visible change is starting to add simple resource limits
    to catch programs that misbehave. With namespaces in general and user
    namespaces in particular allowing users to use more kinds of
    resources, it has become important to have something to limit errant
    programs. Because the purpose of these limits is to catch errant
    programs the code needs to be inexpensive to use as it is always on, and
    the default limits need to be high enough that well behaved programs
    on well behaved systems don't encounter them.

    To this end, after some review I have implemented per user per user
    namespace limits, and use them to limit the number of namespaces. The
    limits being per user mean that one user can not exhaust the limits of
    another user. The limits being per user namespace allow contexts where
    the limit is 0 and security conscious folks can remove from their
    threat analysis the code used to manage namespaces (as they have
    historically done as it is root only). At the same time the limits being
    per user namespace allow other parts of the system to use namespaces.

    Namespaces are increasingly being used in application sandboxing
    scenarios so an all or nothing disable for the entire system for the
    security conscious folks makes increasing use of these sandboxes
    impossible.

    There is also added a limit on the maximum number of mounts present in
    a single mount namespace. It is nontrivial to guess what a reasonable
    system wide limit on the number of mount structures in the kernel would
    be, especially as it varies based on how a system is using
    containers. A limit on the number of mounts in a mount namespace
    however is much easier to understand and set. In most cases in
    practice only about 1000 mounts are used. Given that some autofs
    scenarios have the potential to be 30,000 to 50,000 mounts I have set
    the default limit for the number of mounts at 100,000 which is well
    above every known set of users but low enough that the mount hash
    tables don't degrade unreasonably.

    These limits are a start. I expect this establishes a pattern that
    other limits for resources that namespaces use will follow. There has
    been interest in making inotify event limits per user per user
    namespace as well as interest expressed in making details about what
    is going on in the kernel more visible"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (28 commits)
    autofs: Fix automounts by using current_real_cred()->uid
    mnt: Add a per mount namespace limit on the number of mounts
    netns: move {inc,dec}_net_namespaces into #ifdef
    nsfs: Simplify __ns_get_path
    tools/testing: add a test to check nsfs ioctl-s
    nsfs: add ioctl to get a parent namespace
    nsfs: add ioctl to get an owning user namespace for ns file descriptor
    kernel: add a helper to get an owning user namespace for a namespace
    devpts: Change the owner of /dev/pts/ptmx to the mounter of /dev/pts
    devpts: Remove sync_filesystems
    devpts: Make devpts_kill_sb safe if fsi is NULL
    devpts: Simplify devpts_mount by using mount_nodev
    devpts: Move the creation of /dev/pts/ptmx into fill_super
    devpts: Move parse_mount_options into fill_super
    userns: When the per user per user namespace limit is reached return ENOSPC
    userns; Document per user per user namespace limits.
    mntns: Add a limit on the number of mount namespaces.
    netns: Add a limit on the number of net namespaces
    cgroupns: Add a limit on the number of cgroup namespaces
    ipcns: Add a limit on the number of ipc namespaces
    ...

    Linus Torvalds
     

06 Oct, 2016

18 commits

  • Use eir_append_data to remove code duplication.

    Signed-off-by: Michał Narajowski
    Signed-off-by: Marcel Holtmann

    Michał Narajowski
     
  • Add appearance value to beginning of scan rsp data for
    default advertising instance if the value is not 0.

    Signed-off-by: Michał Narajowski
    Signed-off-by: Marcel Holtmann

    Michał Narajowski
     
  • Use the complete name if it fits. If it does not and a short name is
    set, use the short name if that fits. Otherwise use a prefix of the
    complete name as the shortened name.

    Signed-off-by: Michał Narajowski
    Signed-off-by: Marcel Holtmann
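The selection order described in this message can be sketched as below. This is a user-space illustration under stated assumptions: `pick_name()`, `enum name_kind` and the `room` parameter (bytes left for the name itself) are hypothetical, and the real EIR type codes and header bytes are not reproduced.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tags standing in for the complete/shortened name types. */
enum name_kind { NAME_NONE, NAME_COMPLETE, NAME_SHORT };

/*
 * Copy the chosen name into out (not NUL-terminated) and return its
 * length. out must have space for at least 'room' bytes.
 */
static size_t pick_name(const char *complete, const char *short_name,
                        size_t room, char *out, enum name_kind *kind)
{
    size_t clen = strlen(complete);
    size_t slen = strlen(short_name);

    *kind = NAME_NONE;
    if (room == 0 || (clen == 0 && slen == 0))
        return 0;

    /* 1. The complete name, if it fits. */
    if (clen > 0 && clen <= room) {
        memcpy(out, complete, clen);
        *kind = NAME_COMPLETE;
        return clen;
    }
    /* 2. Otherwise the configured short name, if it fits. */
    if (slen > 0 && slen <= room) {
        memcpy(out, short_name, slen);
        *kind = NAME_SHORT;
        return slen;
    }
    if (clen == 0)
        return 0;   /* only a short name exists and it does not fit */

    /* 3. Otherwise a prefix of the complete name, tagged as shortened. */
    memcpy(out, complete, room);
    *kind = NAME_SHORT;
    return room;
}
```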

    Michał Narajowski
     
  • Don't request an ACK on the last DATA packet of a call's Tx phase as for a
    client there will be a reply packet or some sort of ACK to shift phase. If
    the ACK is requested, OpenAFS sends a REQUESTED-ACK ACK with soft-ACKs in
    it and doesn't follow up with a hard-ACK.

    If we don't set the flag, OpenAFS will send a DELAY ACK that hard-ACKs the
    reply data, thereby allowing the call to terminate cleanly.

    Signed-off-by: David Howells

    David Howells
     
  • We need to generate a DELAY ACK from the service end of an operation if we
    start doing the actual operation work and it takes longer than expected.
    This will hard-ACK the request data and allow the client to release its
    resources.

    To make this work:

    (1) We have to set the ack timer and propose an ACK when the call moves to
    the RXRPC_CALL_SERVER_ACK_REQUEST state, and clear the pending ACK and
    cancel the timer when we start transmitting the reply (the first DATA
    packet of the reply implicitly ACKs the request phase).

    (2) It must be possible to set the timer when the caller is holding
    call->state_lock, so split the lock-getting part of the timer function
    out.

    (3) Add trace notes for the ACK we're requesting and the timer we clear.

    Signed-off-by: David Howells

    David Howells
     
  • In rxrpc_kernel_recv_data(), when we return the error number incurred by a
    failed call, we must negate it before returning it as it's stored as
    positive (that's what we have to pass back to userspace).

    Signed-off-by: David Howells
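The sign convention can be illustrated with a minimal sketch. `struct call_result` and `kernel_recv_result()` are hypothetical stand-ins, not the rxrpc structures. The only point shown is that an error stored as a positive number for userspace must be negated before it is handed to an in-kernel caller that expects negative errno values.

```c
#include <errno.h>

/* Hypothetical record of a finished call; error is stored positive. */
struct call_result {
    int failed;
    int error;      /* e.g. ECONNABORTED, as userspace would see it */
};

/* Kernel-style return: 0 on success, negative errno on failure. */
static int kernel_recv_result(const struct call_result *r)
{
    if (r->failed)
        return -r->error;
    return 0;
}
```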

    David Howells
     
  • The call's background processor work item needs to notify the socket when
    it completes a call so that recvmsg() or the AFS fs can deal with it.
    Without this, call expiry isn't handled.

    Signed-off-by: David Howells

    David Howells
     
  • When a call expires, it must be queued for the background processor to deal
    with; otherwise a service call that is improperly terminated will just sit
    there awaiting an ACK and won't expire.

    Signed-off-by: David Howells

    David Howells
     
  • OpenAFS doesn't always correctly terminate client calls that it makes -
    this includes calls the OpenAFS servers make to the cache manager service.
    It should end the client call with either:

    (1) An ACK that has firstPacket set to one greater than the seq number of
    the reply DATA packet with the LAST_PACKET flag set (thereby
    hard-ACK'ing all packets). nAcks should be 0 and acks[] should be
    empty (ie. no soft-ACKs).

    (2) An ACKALL packet.

    OpenAFS, though, may send an ACK packet with firstPacket set to the last
    seq number or less and soft-ACKs listed for all packets up to and including
    the last DATA packet.

    The transmitter, however, is obliged to keep the call live and the
    soft-ACK'd DATA packets around until they're hard-ACK'd as the receiver is
    permitted to drop any merely soft-ACK'd packet and request retransmission
    by sending an ACK packet with a NACK in it.

    Further, OpenAFS will also terminate a client call by beginning the next
    client call on the same connection channel. This implicitly completes the
    previous call.

    This patch handles implicit ACK of a call on a channel by the reception of
    the first packet of the next call on that channel.

    If another call doesn't come along to implicitly ACK a call, then we have
    to time the call out. There are some bugs there that will be addressed in
    subsequent patches.

    Signed-off-by: David Howells
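The difference between a terminating ACK and the soft-ACK-only ACK described in this message can be expressed as a predicate. The structure and function names here are hypothetical and carry only the two fields the message talks about (firstPacket and nAcks).

```c
#include <stdint.h>

/* Hypothetical summary of a received ACK packet. */
struct ack_summary {
    uint32_t first_packet;  /* firstPacket field */
    uint8_t  n_acks;        /* nAcks: number of soft-ACK slots */
};

/* Highest sequence number that this ACK hard-ACKs. */
static uint32_t hard_acked_upto(const struct ack_summary *ack)
{
    return ack->first_packet - 1;
}

/*
 * Returns 1 if the ACK cleanly completes a call whose last DATA
 * packet had sequence number last_seq: everything is hard-ACK'd
 * and no soft-ACKs are listed.
 */
static int ack_completes_call(const struct ack_summary *ack, uint32_t last_seq)
{
    return ack->first_packet == last_seq + 1 && ack->n_acks == 0;
}
```

An ACK with firstPacket at or below the last sequence number fails the predicate even if it soft-ACKs every packet, so the transmitter still has to keep the call and its packets around.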

    David Howells
     
  • Separate the output of PING ACKs from the output of other sorts of ACK so
    that if we receive a PING ACK and schedule transmission of a PING RESPONSE
    ACK, the response doesn't get cancelled by a PING ACK we happen to be
    scheduling transmission of at the same time.

    If a PING RESPONSE gets lost, the other side might just sit there waiting
    for it and refuse to proceed otherwise.

    Signed-off-by: David Howells

    David Howells
     
  • Split rxrpc_send_call_packet() to separate ACK generation (which is more
    complicated) from ABORT generation. This simplifies the code a bit and
    fixes the following warning:

    In file included from ../net/rxrpc/output.c:20:0:
    net/rxrpc/output.c: In function 'rxrpc_send_call_packet':
    net/rxrpc/ar-internal.h:1187:27: error: 'top' may be used uninitialized in this function [-Werror=maybe-uninitialized]
    net/rxrpc/output.c:103:24: note: 'top' was declared here
    net/rxrpc/output.c:225:25: error: 'hard_ack' may be used uninitialized in this function [-Werror=maybe-uninitialized]

    Reported-by: Arnd Bergmann
    Signed-off-by: David Howells

    David Howells
     
  • When a reply is deemed lost, we send a ping to find out whether the other
    end received all the request data packets we sent. This should be limited
    to client calls; we shouldn't do this on service calls.

    Signed-off-by: David Howells

    David Howells
     
  • If a call comes in to a local endpoint that isn't listening for any
    incoming calls at the moment, an oops will happen. We need to check that
    the local endpoint's service pointer isn't NULL before we dereference it.

    Signed-off-by: David Howells

    David Howells
     
  • Remove a duplicate const keyword.

    Signed-off-by: David Howells

    David Howells
     
  • struct rxrpc_local->service is marked __rcu - this means that accesses of
    it need to be managed using RCU wrappers. There are two such places in
    rxrpc_release_sock() where the value is checked and cleared. Fix this by
    using the appropriate wrappers.

    Signed-off-by: David Howells

    David Howells
     
  • Pablo Neira Ayuso says:

    ====================
    Netfilter fixes for net-next

    This is a pull request to address fallout from previous nf-next pull
    request, only fixes going on here:

    1) Address a potential null dereference in nf_unregister_net_hook()
    when nf_hook_entry_head becomes NULL, from Aaron Conole.

    2) Missing ifdef for CONFIG_NETFILTER_INGRESS, also from Aaron.

    3) Fix linking problems in xt_hashlimit in x86_32, from Pai.

    4) Fix permissions of nf_log sysctl from unprivileged netns, from
    Jann Horn.

    5) Fix possible divide by zero in nft_limit, from Liping Zhang.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Remove extra x1 variable, it's just a temporary placeholder that
    clutters the code unnecessarily.

    Reflects ceph.git commit 0d19408d91dd747340d70287b4ef9efd89e95c6b.

    Signed-off-by: Ilya Dryomov

    Ilya Dryomov
     
  • Use __builtin_clz() supported by GCC and Clang to figure out
    how many bits we should shift instead of shifting by a bit
    in a loop until the value gets normalized. Improves performance
    of this function by up to 3x in the worst-case scenario and overall
    straw2 performance by ~10%.

    Reflects ceph.git commit 110de33ca497d94fc4737e5154d3fe781fa84a0a.

    Signed-off-by: Ilya Dryomov
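The idea of replacing an iterative shift with a single count-leading-zeros step can be sketched generically. This is not the crush_ln() code: the two functions below are hypothetical and simply normalize a nonzero 32-bit value so that its top bit is set. It assumes GCC or Clang, where `__builtin_clz()` takes an `unsigned int`.

```c
#include <stdint.h>

/* Iterative version: shift one bit at a time until bit 31 is set. */
static unsigned int normalize_loop(uint32_t *x)
{
    unsigned int shift = 0;

    while (!(*x & 0x80000000u)) {   /* caller guarantees *x != 0 */
        *x <<= 1;
        shift++;
    }
    return shift;
}

/* Single-step version: the leading-zero count is the shift amount. */
static unsigned int normalize_clz(uint32_t *x)
{
    /* __builtin_clz() is undefined for 0, so *x must be nonzero. */
    unsigned int shift = (unsigned int)__builtin_clz((unsigned int)*x);

    *x <<= shift;
    return shift;
}
```

Both return the same shift count and leave the same value behind; the second avoids a data-dependent loop of up to 31 iterations.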

    Ilya Dryomov
     

04 Oct, 2016

2 commits

  • Resolve the merge conflict between Felix's/my and Toke's patches
    coming into the tree through net and mac80211-next respectively.
    Most of Felix's changes go away due to Toke's new infrastructure
    work. My patch now does "goto begin" (the label wasn't there
    before) instead of returning NULL, so flow control towards drivers
    is preserved better.

    Signed-off-by: Johannes Berg

    Johannes Berg
     
  • After I input the following nftables rule, a panic happened on my system:
    # nft add rule filter OUTPUT limit rate 0xf00000000 bytes/second

    divide error: 0000 [#1] SMP
    [ ... ]
    RIP: 0010:[] []
    nft_limit_pkt_bytes_eval+0x2e/0xa0 [nft_limit]
    Call Trace:
    [] nft_do_chain+0xfb/0x4e0 [nf_tables]
    [] ? nf_nat_setup_info+0x96/0x480 [nf_nat]
    [] ? ipt_do_table+0x327/0x610
    [] ? __nf_nat_alloc_null_binding+0x57/0x80 [nf_nat]
    [] nft_ipv4_output+0xaf/0xd0 [nf_tables_ipv4]
    [] nf_iterate+0x62/0x80
    [] nf_hook_slow+0x73/0xd0
    [] __ip_local_out+0xcd/0xe0
    [] ? ip_forward_options+0x1b0/0x1b0
    [] ip_local_out+0x1c/0x40

    This is because the divisor is 64-bit but we treat it as a 32-bit
    integer, so 0xf00000000 is truncated to zero, i.e. the divisor becomes 0.

    Signed-off-by: Liping Zhang
    Signed-off-by: Pablo Neira Ayuso
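The truncation is easy to reproduce outside the kernel. In this sketch `narrow_divisor()` mimics passing a 64-bit rate where a 32-bit divisor is expected, and `safe_div_u64()` is a hypothetical helper that keeps the divisor at full width and refuses to divide by zero. Neither is the nft_limit code.

```c
#include <stdint.h>

/* What happens when a 64-bit rate is squeezed into 32 bits. */
static uint32_t narrow_divisor(uint64_t rate)
{
    return (uint32_t)rate;      /* keeps only the low 32 bits */
}

/* Full-width division; returns -1 instead of dividing by zero. */
static int safe_div_u64(uint64_t dividend, uint64_t divisor, uint64_t *out)
{
    if (divisor == 0)
        return -1;
    *out = dividend / divisor;
    return 0;
}
```

The low 32 bits of 0xf00000000 are all zero, so the narrowed divisor is 0 even though the configured rate is large.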

    Liping Zhang