10 Jul, 2013

40 commits

  • Pull networking updates from David Miller:
    "This is a re-do of the net-next pull request for the current merge
    window. The only difference from the one I made the other day is that
    this has Eliezer's interface renames and the timeout handling changes
    made based upon your feedback, as well as a few bug fixes that have
    trickled in.

    Highlights:

    1) Low latency device polling, eliminating the cost of interrupt
    handling and context switches. Allows direct polling of a network
    device from socket operations, such as recvmsg() and poll().

    Currently ixgbe, mlx4, and bnx2x support this feature.

    Full high level description, performance numbers, and design in
    commit 0a4db187a999 ("Merge branch 'll_poll'")

    From Eliezer Tamir.

    2) With the routing cache removed, ip_check_mc_rcu() gets exercised
    more than ever before in the case where we have lots of multicast
    addresses. Use a hash table instead of a simple linked list, from
    Eric Dumazet.

    3) Add driver for Atheros QCA98xx 802.11ac wireless devices, from
    Bartosz Markowski, Janusz Dziedzic, Kalle Valo, Marek Kwaczynski,
    Marek Puzyniak, Michal Kazior, and Sujith Manoharan.

    4) Support reporting the TUN device persist flag to userspace, from
    Pavel Emelyanov.

    5) Allow controlling network device VF link state using netlink, from
    Rony Efraim.

    6) Support GRE tunneling in openvswitch, from Pravin B Shelar.

    7) Adjust SOCK_MIN_RCVBUF and SOCK_MIN_SNDBUF for modern times, from
    Daniel Borkmann and Eric Dumazet.

    8) Allow controlling of TCP quickack behavior on a per-route basis,
    from Cong Wang.

    9) Several bug fixes and improvements to vxlan from Stephen
    Hemminger, Pravin B Shelar, and Mike Rapoport. In particular,
    support receiving on multiple UDP ports.

    10) Major cleanups, particularly in the area of debugging and cookie
    lifetime handling, to the SCTP protocol code. From Daniel
    Borkmann.

    11) Allow packets to cross network namespaces when traversing tunnel
    devices. From Nicolas Dichtel.

    12) Allow monitoring netlink traffic via AF_PACKET sockets, in a
    manner akin to how we monitor real network traffic via ptype_all.
    From Daniel Borkmann.

    13) Several bug fixes and improvements for the new alx device driver,
    from Johannes Berg.

    14) Fix scalability issues in the netem packet scheduler's time queue,
    by using an rbtree. From Eric Dumazet.

    15) Several bug fixes in TCP loss recovery handling, from Yuchung
    Cheng.

    16) Add support for GSO segmentation of MPLS packets, from Simon
    Horman.

    17) Make network notifiers have a real data type for the opaque
    pointer that's passed into them. Use this to properly handle
    network device flag changes in arp_netdev_event(). From Jiri
    Pirko and Timo Teräs.

    18) Convert several drivers over to module_pci_driver(), from Peter
    Huewe.

    19) tcp_fixup_rcvbuf() can loop 500 times over loopback, just use a
    O(1) calculation instead. From Eric Dumazet.

    20) Support setting of explicit tunnel peer addresses in ipv6, just
    like ipv4. From Nicolas Dichtel.

    21) Protect x86 BPF JIT against spraying attacks, from Eric Dumazet.

    22) Prevent a single high rate flow from overrunning an individual cpu
    during RX packet processing via selective flow shedding. From
    Willem de Bruijn.

    23) Don't use spinlocks in TCP md5 signing fast paths, from Eric
    Dumazet.

    24) Don't just drop GSO packets which are above the TBF scheduler's
    burst limit, chop them up so they are in-bounds instead. Also
    from Eric Dumazet.

    25) VLAN offloads are missed when configured on top of a bridge, fix
    from Vlad Yasevich.

    26) Support IPV6 in ping sockets. From Lorenzo Colitti.

    27) Receive flow steering targets should be updated at poll() time
    too, from David Majnemer.

    28) Fix several corner case regressions in PMTU/redirect handling due
    to the routing cache removal, from Timo Teräs.

    29) We have to be mindful of ipv4 mapped ipv6 sockets in
    udp_v6_push_pending_frames(). From Hannes Frederic Sowa.

    30) Fix L2TP sequence number handling bugs, from James Chapman."

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1214 commits)
    drivers/net: caif: fix wrong rtnl_is_locked() usage
    drivers/net: enic: release rtnl_lock on error-path
    vhost-net: fix use-after-free in vhost_net_flush
    net: mv643xx_eth: do not use port number as platform device id
    net: sctp: confirm route during forward progress
    virtio_net: fix race in RX VQ processing
    virtio: support unlocked queue poll
    net/cadence/macb: fix bug/typo in extracting gem_irq_read_clear bit
    Documentation: Fix references to defunct linux-net@vger.kernel.org
    net/fs: change busy poll time accounting
    net: rename low latency sockets functions to busy poll
    bridge: fix some kernel warning in multicast timer
    sfc: Fix memory leak when discarding scattered packets
    sit: fix tunnel update via netlink
    dt:net:stmmac: Add dt specific phy reset callback support.
    dt:net:stmmac: Add support to dwmac version 3.610 and 3.710
    dt:net:stmmac: Allocate platform data only if its NULL.
    net:stmmac: fix memleak in the open method
    ipv6: rt6_check_neigh should successfully verify neigh if no NUD information are available
    net: ipv6: fix wrong ping_v6_sendmsg return value
    ...

    Linus Torvalds
     
  • Pull drm updates from Dave Airlie:
    "Okay this is the big one, I was stalled on the fbdev pull req as I
    stupidly let fbdev guys merge a patch I required to fix a warning with
    some patches I had, they ended up merging the patch from the wrong
    place, but the warning should be fixed. In future I'll just take the
    patch myself!

    Outside drm:

    There are some snd changes for the HDMI audio interactions on haswell,
    they've been acked for inclusion via my tree. This relies on the
    wound/wait tree from Ingo which is already merged.

    Major changes:

    AMD finally released the dynamic power management code for all their
    GPUs from r600->present day, this is great, off by default for now but
    also a huge amount of code, in fact it is most of this pull request.

    Since it landed there has been a lot of community testing and Alex has
    sent a lot of fixes for any bugs found so far. I suspect radeon might
    now be the biggest kernel driver ever :-P p.s. radeon.dpm=1 to enable
    dynamic power management for anyone.

    New drivers:

    Renesas r-car display unit.

    Other highlights:

    - core: GEM CMA prime support, use new w/w mutexes for TTM
    reservations, cursor hotspot, doc updates
    - dvo chips: chrontel 7010B support
    - i915: Haswell (fbc, ips, vecs, watermarks, audio powerwell),
    Valleyview (enabled by default, rc6), lots of pll reworking, 30bpp
    support (this time for sure)
    - nouveau: async buffer object deletion, context/register init
    updates, kernel vp2 engine support, GF117 support, GK110 accel
    support (with external nvidia ucode), context cleanups.
    - exynos: memory leak fixes, S3C64XX SoC series support, device
    tree updates, common clock framework support
    - qxl: cursor hotspot support, multi-monitor support, suspend/resume
    support
    - mgag200: hw cursor support, g200 mode limiting
    - shmobile: prime support
    - tegra: fixes mostly

    I've been banging on this quite a lot due to the size of it, and it
    seems to be okay on everything I've tested it on."

    * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (811 commits)
    drm/radeon/dpm: implement vblank_too_short callback for si
    drm/radeon/dpm: implement vblank_too_short callback for cayman
    drm/radeon/dpm: implement vblank_too_short callback for btc
    drm/radeon/dpm: implement vblank_too_short callback for evergreen
    drm/radeon/dpm: implement vblank_too_short callback for 7xx
    drm/radeon/dpm: add checks against vblank time
    drm/radeon/dpm: add helper to calculate vblank time
    drm/radeon: remove stray line in old pm code
    drm/radeon/dpm: fix display_gap programming on rv7xx
    drm/nvc0/gr: fix gpc firmware regression
    drm/nouveau: fix minor thinko causing bo moves to not be async on kepler
    drm/radeon/dpm: implement force performance level for TN
    drm/radeon/dpm: implement force performance level for ON/LN
    drm/radeon/dpm: implement force performance level for SI
    drm/radeon/dpm: implement force performance level for cayman
    drm/radeon/dpm: implement force performance levels for 7xx/eg/btc
    drm/radeon/dpm: add infrastructure to force performance levels
    drm/radeon: fix surface setup on r1xx
    drm/radeon: add support for 3d perf states on older asics
    drm/radeon: set default clocks for SI when DPM is disabled
    ...

    Linus Torvalds
     
  • Pull fbdev update from Jean-Christophe PLAGNIOL-VILLARD:
    "Various fbdev changes for 3.11
    - xilinxfb updates
    - Small cleanups and fixes to multiple drivers
    - OMAP display subsystem bug updates
    - imxfb dt support"

    * tag 'fbdev-for-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/plagnioj/linux-fbdev: (95 commits)
    video: imxfb: Add DT support
    video: i740fb: Make i740fb_init static
    fb: make fp_get_options name argument const
    video: mmp: fix graphics/video layer enable/mask swap issue
    video: mmp: fix memcpy wrong size for mmp_addr issue
    radeon: use pdev->pm_cap instead of pci_find_capability(..,PCI_CAP_ID_PM)
    aty128fb: use pdev->pm_cap instead of pci_find_capability(..,PCI_CAP_ID_PM)
    video: of_display_timing.h: Declare 'display_timing'
    fbdev: bfin-lq035q1-fb: Use dev_pm_ops
    fbmem: return -EFAULT on copy_to_user() failure
    OMAPDSS: DPI: Fix wrong pixel clock limit
    video: replace strict_strtoul() with kstrtoul()
    uvesafb: Correct/simplify warning message
    fb: fix atyfb unused data warnings
    fb: fix atyfb build warning
    video: imxfb: Make local symbols static
    video: udlfb: Make local symbol static
    video: udlfb: Use NULL instead of 0
    video: smscufx: Use NULL instead of 0
    video: remove unnecessary platform_set_drvdata()
    ...

    Linus Torvalds
     
  • Merge second patch-bomb from Andrew Morton:
    - misc fixes
    - audit stuff
    - fanotify/inotify/dnotify things
    - most of the rest of MM. The new cache shrinker code from Glauber and
    Dave Chinner probably isn't quite stabilized yet.
    - ptrace
    - ipc
    - partitions
    - reboot cleanups
    - add LZ4 decompressor, use it for kernel compression

    * emailed patches from Andrew Morton : (118 commits)
    lib/scatterlist: error handling in __sg_alloc_table()
    scsi_debug: fix do_device_access() with wrap around range
    crypto: talitos: use sg_pcopy_to_buffer()
    lib/scatterlist: introduce sg_pcopy_from_buffer() and sg_pcopy_to_buffer()
    lib/scatterlist: factor out sg_miter_get_next_page() from sg_miter_next()
    crypto: add lz4 Cryptographic API
    lib: add lz4 compressor module
    arm: add support for LZ4-compressed kernel
    lib: add support for LZ4-compressed kernel
    decompressor: add LZ4 decompressor module
    lib: add weak clz/ctz functions
    reboot: move arch/x86 reboot= handling to generic kernel
    reboot: arm: change reboot_mode to use enum reboot_mode
    reboot: arm: prepare reboot_mode for moving to generic kernel code
    reboot: arm: remove unused restart_mode fields from some arm subarchs
    reboot: unicore32: prepare reboot_mode for moving to generic kernel code
    reboot: x86: prepare reboot_mode for moving to generic kernel code
    reboot: checkpatch.pl the new kernel/reboot.c file
    reboot: move shutdown/reboot related functions to kernel/reboot.c
    reboot: remove -stable friendly PF_THREAD_BOUND define
    ...

    Linus Torvalds
     
  • rtnl_is_locked() doesn't check who holds this lock, it just tells that it's
    locked right now. If caif::ldisc_close really can be called under rtnl_lock
    then it should release net device in other context because there is no way
    to grab rtnl_lock without deadlock.

    This patch adds work which releases these devices. Also this patch fixes calling
    dev_close/unregister_netdevice without rtnl_lock from caif_ser_exit().

    Signed-off-by: Konstantin Khlebnikov
    Cc: Dmitry Tarnyagin
    Signed-off-by: David S. Miller

    Konstantin Khlebnikov
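
The distinction drawn in this commit message can be sketched in userspace C. This is a toy model under stated assumptions, not the caif code: `toy_lock`, `release_queue` and every function below are hypothetical names, and the owner field exists only so the example can show what an "is it locked?" query cannot answer. The second half models the fix described above: queue the device release and let a worker take the lock itself.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PENDING 8

/* Toy lock: "held" is all an is-locked query can see. */
struct toy_lock {
    bool held;
    int owner;   /* kept only to illustrate the missing information */
};

static void toy_lock_acquire(struct toy_lock *l, int task)
{
    assert(!l->held);   /* a real lock would block; the toy insists instead */
    l->held = true;
    l->owner = task;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
    l->owner = -1;
}

/* Answers "is somebody holding it?", which is not "am I holding it?". */
static bool toy_is_locked(const struct toy_lock *l)
{
    return l->held;
}

static bool toy_held_by(const struct toy_lock *l, int task)
{
    return l->held && l->owner == task;
}

/* Deferred release: remember the device now, release it from a worker. */
struct release_queue {
    int dev[MAX_PENDING];
    int n;
};

static bool queue_release(struct release_queue *q, int dev_id)
{
    if (q->n == MAX_PENDING)
        return false;
    q->dev[q->n++] = dev_id;
    return true;
}

/* Worker context: takes the lock itself, so the caller's state is irrelevant. */
static int run_release_work(struct toy_lock *l, struct release_queue *q,
                            int worker_task)
{
    int released = 0;

    toy_lock_acquire(l, worker_task);
    while (q->n > 0) {
        q->n--;          /* stand-in for closing and unregistering the device */
        released++;
    }
    toy_lock_release(l);
    return released;
}
```

Because the worker always acquires the lock on its own, the close path never has to guess whether its caller already holds it.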
     
  • enic_change_mtu_work() must call rtnl_unlock() on all exiting paths.

    Signed-off-by: Konstantin Khlebnikov
    Cc: Christian Benvenuti
    Cc: Roopa Prabhu
    Cc: Neel Patel
    Cc: Nishank Trivedi
    Signed-off-by: David S. Miller

    Konstantin Khlebnikov
     
  • vhost_net_ubuf_put_and_wait has a confusing name:
    it will actually also free its argument.
    Thus since commit 1280c27f8e29acf4af2da914e80ec27c3dbd5c01
    "vhost-net: flush outstanding DMAs on memory change"
    vhost_net_flush tries to use the argument after passing it
    to vhost_net_ubuf_put_and_wait, this results
    in use after free.
    To fix, don't free the argument in vhost_net_ubuf_put_and_wait,
    add a new API for callers that want to free ubufs.

    Acked-by: Asias He
    Acked-by: Jason Wang
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
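
The shape of this fix, splitting "wait" from "wait and free", can be illustrated with a small userspace sketch. The names `ubuf_ref`, `ubuf_put_and_wait`, `ubuf_put_wait_and_free` and `ubuf_flush` are hypothetical stand-ins, and the waiting itself is reduced to a comment; this is not the vhost-net implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted tracker for outstanding buffers. */
struct ubuf_ref {
    int refcount;
    int flushes;   /* counts completed flushes, for illustration only */
};

static struct ubuf_ref *ubuf_alloc(void)
{
    struct ubuf_ref *u = calloc(1, sizeof(*u));

    if (u)
        u->refcount = 1;
    return u;
}

/* Drops the caller's reference and waits; it no longer frees anything. */
static void ubuf_put_and_wait(struct ubuf_ref *u)
{
    assert(u->refcount > 0);
    u->refcount--;
    /* real code would sleep here until the count reaches zero */
}

/* Separate entry point for callers that are done with the object. */
static void ubuf_put_wait_and_free(struct ubuf_ref *u)
{
    ubuf_put_and_wait(u);
    free(u);
}

/* Flush keeps using the object afterwards, so it must not free it. */
static void ubuf_flush(struct ubuf_ref *u)
{
    ubuf_put_and_wait(u);
    u->refcount = 1;   /* re-arm the same object for further use */
    u->flushes++;
}
```

With the free moved into its own function, the flush path can safely touch the object after waiting, which is exactly the access that was a use-after-free before.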
     
  • …inux/kernel/git/ericvh/v9fs

    Pull 9p update from Eric Van Hensbergen:
    "Grab bag of little fixes and enhancements:
    - optional security enhancements
    - fix path coverage in MAINTAINERS
    - switch to using most used protocol and transport as default
    - clean up buffer dumps in trace code

    Held off on RDMA patches as they need to be cleaned up a bit, but will
    try to get them cleaned, checked, and pushed by mid-week"

    * tag 'for-linus-3.11-merge-window-part-1' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
    9p: Add rest of 9p files to MAINTAINERS entry
    9p: trace: use %*ph to dump buffer
    net/9p: Handle error in zero copy request correctly for 9p2000.u
    net/9p: Use virtio transpart as the default transport
    net/9p: Make 9P2000.L the default protocol for 9p file system

    Linus Torvalds
     
  • The port number is only local to the ethernet block, not global, so
    there can be two ethernet blocks both using the same port, like
    kirkwood with both using port 0.

    Fix this by using the array index offset for the allocated platform
    devices as the id.

    Signed-off-by: Jonas Gorski
    Signed-off-by: David S. Miller

    Jonas Gorski
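
A minimal sketch of why a block-local port number makes a poor global id. The struct and helpers are hypothetical and only model the id assignment, not the platform device API: two ethernet blocks that each expose port 0 collide when the port number is used, but not when a running array index is used.

```c
#include <assert.h>

/* One port as described by the hardware: the number is local to its block. */
struct port_node {
    int block;
    int port_number;
};

/* Old scheme: id = port number, which repeats across blocks. */
static void assign_ids_by_port(const struct port_node *ports, int n, int *ids)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        ids[i] = ports[i].port_number;
}

/* New scheme: id = index in the array of allocated devices, always unique. */
static void assign_ids_by_index(int n, int *ids)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        ids[i] = i;
}

/* Counts pairs of entries that ended up with the same id. */
static int count_id_collisions(const int *ids, int n)
{
    int collisions = 0;

    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (ids[i] == ids[j])
                collisions++;
    return collisions;
}
```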
     
  • This fix has been proposed originally by Vlad Yasevich. He says:

    When SCTP makes forward progress (receives a SACK that acks new chunks,
    renegs, or answers 0-window probes) or when HB-ACK arrives, mark
    the route as confirmed so we don't unnecessarily send NUD probes.

    Having a simple SCTP client/server that exchange data chunks every 1sec,
    without this patch ARP requests are sent periodically every 40-60sec.
    With this fix applied, an ARP request is only done once right at the
    "session" beginning. Also, when clearing the related ARP cache entry
    manually during the session, a new request is correctly done. I have
    only "backported" this to net-next and tested that it works, so full
    credit goes to Vlad.

    Signed-off-by: Vlad Yasevich
    Signed-off-by: Daniel Borkmann
    Acked-by: Neil Horman
    Signed-off-by: David S. Miller

    Daniel Borkmann
     
  • Michael S. Tsirkin says:

    ====================
    Jason Wang reported a race in RX VQ processing:
    virtqueue_enable_cb is called outside napi lock,
    violating virtio serialization rules.
    The race has been there from day 1, but it got especially nasty in 3.0
    when commit a5c262c5fd83ece01bd649fb08416c501d4c59d7
    "virtio_ring: support event idx feature"
    added more dependency on vq state.

    Please review, and consider for 3.11 and stable.

    Changes from v1:
    - Added Jason's Tested-by tag
    - minor coding style fix
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • virtio net called virtqueue_enable_cb on RX path after napi_complete, so
    with NAPI_STATE_SCHED clear - outside the implicit napi lock.
    This violates the requirement to synchronize virtqueue_enable_cb wrt
    virtqueue_add_buf. In particular, used event can move backwards,
    causing us to lose interrupts.
    In a debug build, this can trigger panic within START_USE.

    Jason Wang reports that he can trigger the races artificially,
    by adding udelay() in virtqueue_enable_cb() after virtio_mb().

    However, we must call napi_complete to clear NAPI_STATE_SCHED before
    polling the virtqueue for used buffers, otherwise napi_schedule_prep in
    a callback will fail, causing us to lose RX events.

    To fix, call virtqueue_enable_cb_prepare with NAPI_STATE_SCHED
    set (under napi lock), later call virtqueue_poll with
    NAPI_STATE_SCHED clear (outside the lock).

    Reported-by: Jason Wang
    Tested-by: Jason Wang
    Acked-by: Jason Wang
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
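
The two-step pattern described above (prepare under the lock, poll outside it) can be modelled in a few lines. This is a simplified sketch: `toy_vq` and the `toy_*` functions are hypothetical, memory barriers and the event index are left out, and the only point shown is that the lock-free step compares a snapshot against the device-written index without modifying any driver state.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy virtqueue: the device advances used_idx, the driver last_used_idx. */
struct toy_vq {
    unsigned short used_idx;
    unsigned short last_used_idx;
    bool cb_enabled;
};

/* Step 1, called while still serialized with other queue operations:
 * re-enable callbacks and snapshot how far the driver has consumed. */
static unsigned toy_enable_cb_prepare(struct toy_vq *vq)
{
    vq->cb_enabled = true;
    return vq->last_used_idx;
}

/* Step 2, safe without the lock: read-only comparison against the snapshot.
 * Returns true if buffers were used since the snapshot was taken. */
static bool toy_poll(const struct toy_vq *vq, unsigned snapshot)
{
    return (unsigned short)snapshot != vq->used_idx;
}

/* Driver-side consumption of one used buffer (serialized in real code). */
static void toy_consume(struct toy_vq *vq)
{
    assert(vq->last_used_idx != vq->used_idx);
    vq->last_used_idx++;
}
```

If the poll step reports pending buffers, the caller can reschedule processing instead of relying on an interrupt that may already have been missed.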
     
  • This adds a way to check ring empty state after enable_cb outside any
    locks. Will be used by virtio_net.

    Note: there's room for more optimization: caller is likely to have a
    memory barrier already, which means we might be able to get rid of a
    barrier here. Deferring this optimization until we do some
    benchmarking.

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
     
  • Signed-off-by: Jongsung Kim
    Acked-by: Nicolas Ferre
    Signed-off-by: David S. Miller

    Jongsung Kim
     
  • linux-net@vger.kernel.org was replaced by netdev@oss.sgi.com, which was
    in turn replaced by netdev@vger.kernel.org.

    Signed-off-by: Geert Uytterhoeven
    Signed-off-by: David S. Miller

    Geert Uytterhoeven
     
  • Pull Ceph updates from Sage Weil:
    "There is some follow-on RBD cleanup after the last window's code drop,
    a series from Yan fixing multi-mds behavior in cephfs, and then a
    sprinkling of bug fixes all around. Some warnings, sleeping while
    atomic, a null dereference, and cleanups"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (36 commits)
    libceph: fix invalid unsigned->signed conversion for timespec encoding
    libceph: call r_unsafe_callback when unsafe reply is received
    ceph: fix race between cap issue and revoke
    ceph: fix cap revoke race
    ceph: fix pending vmtruncate race
    ceph: avoid accessing invalid memory
    libceph: Fix NULL pointer dereference in auth client code
    ceph: Reconstruct the func ceph_reserve_caps.
    ceph: Free mdsc if alloc mdsc->mdsmap failed.
    ceph: remove sb_start/end_write in ceph_aio_write.
    ceph: avoid meaningless calling ceph_caps_revoking if sync_mode == WB_SYNC_ALL.
    ceph: fix sleeping function called from invalid context.
    ceph: move inode to proper flushing list when auth MDS changes
    rbd: fix a couple warnings
    ceph: clear migrate seq when MDS restarts
    ceph: check migrate seq before changing auth cap
    ceph: fix race between page writeback and truncate
    ceph: reset iov_len when discarding cap release messages
    ceph: fix cap release race
    libceph: fix truncate size calculation
    ...

    Linus Torvalds
     
  • Pull btrfs update from Chris Mason:
    "These are the usual mixture of bugs, cleanups and performance fixes.
    Miao has some really nice tuning of our crc code as well as our
    transaction commits.

    Josef is peeling off more and more problems related to early enospc,
    and has a number of important bug fixes in here too"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (81 commits)
    Btrfs: wait ordered range before doing direct io
    Btrfs: only do the tree_mod_log_free_eb if this is our last ref
    Btrfs: hold the tree mod lock in __tree_mod_log_rewind
    Btrfs: make backref walking code handle skinny metadata
    Btrfs: fix crash regarding to ulist_add_merge
    Btrfs: fix several potential problems in copy_nocow_pages_for_inode
    Btrfs: cleanup the code of copy_nocow_pages_for_inode()
    Btrfs: fix oops when recovering the file data by scrub function
    Btrfs: make the chunk allocator completely tree lockless
    Btrfs: cleanup orphaned root orphan item
    Btrfs: fix wrong mirror number tuning
    Btrfs: cleanup redundant code in btrfs_submit_direct()
    Btrfs: remove btrfs_sector_sum structure
    Btrfs: check if we can nocow if we don't have data space
    Btrfs: stop using try_to_writeback_inodes_sb_nr to flush delalloc
    Btrfs: use a percpu to keep track of possibly pinned bytes
    Btrfs: check for actual acls rather than just xattrs when caching no acl
    Btrfs: move btrfs_truncate_page to btrfs_cont_expand instead of btrfs_truncate
    Btrfs: optimize reada_for_balance
    Btrfs: optimize read_block_for_search
    ...

    Linus Torvalds
     
  • Suggested by Linus:
    Changed time accounting for busy-poll:
    - Make it microsecond based.
    - Use unsigned longs.
    - Revert back to use time_after instead of time_in_range.
    Reorder poll/select busy loop conditions:
    - Clear busy_flag after one time we can't busy-poll.
    - Only init busy_end if we actually are going to busy-poll.
    Added one more missing need_resched() test.

    Signed-off-by: Eliezer Tamir
    Signed-off-by: David S. Miller

    Eliezer Tamir
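
To illustrate the time accounting listed above, here is a small sketch of a wrap-safe "after" comparison on unsigned long microsecond counters, the same idea as the kernel's time_after(). The `toy_*` helpers are hypothetical, and the signed cast relies on the usual two's-complement behaviour of common compilers.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* True if a is later than b, even when the counter has wrapped in between.
 * Only valid while the two values are less than half the range apart. */
static bool toy_time_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

/* End of a busy-poll window: start plus a budget in microseconds.
 * Unsigned arithmetic wraps, which the comparison above tolerates. */
static unsigned long toy_busy_loop_end(unsigned long start_us,
                                       unsigned long budget_us)
{
    assert(budget_us <= (unsigned long)LONG_MAX);
    return start_us + budget_us;
}

/* The loop should stop once "now" has moved past the end of the window. */
static bool toy_busy_loop_timeout(unsigned long now_us, unsigned long end_us)
{
    return toy_time_after(now_us, end_us);
}
```

A plain `now > end` test would report a timeout immediately whenever the end of the window wraps past zero; the subtraction-based form does not.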
     
  • Pull xfs update from Ben Myers:
    "This includes several bugfixes, part of the work for project quotas
    and group quotas to be used together, performance improvements for
    inode creation/deletion, buffer readahead, and bulkstat,
    implementation of the inode change count, an inode create transaction,
    and the removal of a bunch of dead code.

    There are also some duplicate commits that you already have from the
    3.10-rc series.

    - part of the work to allow project quotas and group quotas to be
    used together
    - inode change count
    - inode create transaction
    - block queue plugging in buffer readahead and bulkstat
    - ordered log vector support
    - removal of dead code in and around xfs_sync_inode_grab,
    xfs_ialloc_get_rec, XFS_MOUNT_RETERR, XFS_ALLOCFREE_LOG_RES,
    XFS_DIROP_LOG_RES, xfs_chash, ctl_table, and
    xfs_growfs_data_private
    - don't keep silent if sunit/swidth can not be changed via mount
    - fix a leak of remote symlink blocks into the filesystem when xattrs
    are used on symlinks
    - fix for fiemap to return FIEMAP_EXTENT_UNKNOWN flag on delay extents
    - part of a fix for xfs_fsr
    - disable speculative preallocation with small files
    - performance improvements for inode creates and deletes"

    * tag 'for-linus-v3.11-rc1' of git://oss.sgi.com/xfs/xfs: (61 commits)
    xfs: Remove incore use of XFS_OQUOTA_ENFD and XFS_OQUOTA_CHKD
    xfs: Change xfs_dquot_acct to be a 2-dimensional array
    xfs: Code cleanup and removal of some typedef usage
    xfs: Replace macro XFS_DQ_TO_QIP with a function
    xfs: Replace macro XFS_DQUOT_TREE with a function
    xfs: Define a new function xfs_is_quota_inode()
    xfs: implement inode change count
    xfs: Use inode create transaction
    xfs: Inode create item recovery
    xfs: Inode create transaction reservations
    xfs: Inode create log items
    xfs: Introduce an ordered buffer item
    xfs: Introduce ordered log vector support
    xfs: xfs_ifree doesn't need to modify the inode buffer
    xfs: don't do IO when creating an new inode
    xfs: don't use speculative prealloc for small files
    xfs: plug directory buffer readahead
    xfs: add pluging for bulkstat readahead
    xfs: Remove dead function prototype xfs_sync_inode_grab()
    xfs: Remove the left function variable from xfs_ialloc_get_rec()
    ...

    Linus Torvalds
     
  • __kernel_time_t is a long, which cannot hold a U32_MAX on 32-bit
    architectures. Just drop this check as it has limited value.

    This fixes a crash like:

    [ 957.905812] kernel BUG at /srv/autobuild-ceph/gitbuilder.git/build/include/linux/ceph/decode.h:164!
    [ 957.914849] Internal error: Oops - BUG: 0 [#1] SMP ARM
    [ 957.919978] Modules linked in: rbd libceph libcrc32c ipmi_devintf ipmi_si ipmi_msghandler nfsd nfs_acl auth_rpcgss nfs fscache lockd sunrpc
    [ 957.932547] CPU: 1 Tainted: G W (3.9.0-ceph-19bb6a83-highbank #1)
    [ 957.939881] PC is at ceph_osdc_build_request+0x8c/0x4f8 [libceph]
    [ 957.945967] LR is at 0xec520904
    [ 957.949103] pc : [] lr : [] psr: 20000153
    [ 957.949103] sp : ec753df8 ip : 00000001 fp : ec53e100
    [ 957.960571] r10: ebef25c0 r9 : ec5fa400 r8 : ecbcc000
    [ 957.965788] r7 : 00000000 r6 : 00000000 r5 : ffffffff r4 : 00000020
    [ 957.972307] r3 : 51cc8143 r2 : ec520900 r1 : ec753e58 r0 : ec520908
    [ 957.978827] Flags: nzCv IRQs on FIQs off Mode SVC_32 ISA ARM Segment user
    [ 957.986039] Control: 10c5387d Table: 2c59c04a DAC: 00000015
    [ 957.991777] Process rbd (pid: 2138, stack limit = 0xec752238)
    [ 957.997514] Stack: (0xec753df8 to 0xec754000)
    [ 958.001864] 3de0: 00000001 00000001
    [ 958.010032] 3e00: 00000001 bf139744 ecbcc000 ec55a0a0 00000024 00000000 ebef25c0 fffffffe
    [ 958.018204] 3e20: ffffffff 00000000 00000000 00000001 ec5fa400 ebef25c0 ec53e100 bf166b68
    [ 958.026377] 3e40: 00000000 0000220f fffffffe ffffffff ec753e58 bf13ff24 51cc8143 05b25ed2
    [ 958.034548] 3e60: 00000001 00000000 00000000 bf1688d4 00000001 00000000 00000000 00000000
    [ 958.042720] 3e80: 00000001 00000060 ec5fa400 ed53d200 ed439600 ed439300 00000001 00000060
    [ 958.050888] 3ea0: ec5fa400 ed53d200 00000000 bf16a320 00000000 ec53e100 00000040 ec753eb8
    [ 958.059059] 3ec0: ec51df00 ed53d7c0 ed53d200 ed53d7c0 00000000 ed53d7c0 ec5fa400 bf16ed70
    [ 958.067230] 3ee0: 00000000 00000060 00000002 ed53d200 00000000 bf16acf4 ed53d7c0 ec752000
    [ 958.075402] 3f00: ed980e50 e954f5d8 00000000 00000060 ed53d240 ed53d258 ec753f80 c04f44a8
    [ 958.083574] 3f20: edb7910c ec664700 01ade920 c02e4c44 00000060 c016b3dc ec51de40 01adfb84
    [ 958.091745] 3f40: 00000060 ec752000 ec753f80 ec752000 00000060 c0108444 00000007 ec51de48
    [ 958.099914] 3f60: ed0eb8c0 00000000 00000000 ec51de40 01adfb84 00000001 00000060 c0108858
    [ 958.108085] 3f80: 00000000 00000000 51cc8143 00000060 01adfb84 00000007 00000004 c000dd68
    [ 958.116257] 3fa0: 00000000 c000dbc0 00000060 01adfb84 00000007 01adfb84 00000060 01adfb80
    [ 958.124429] 3fc0: 00000060 01adfb84 00000007 00000004 beded1a8 00000000 01adf2f0 01ade920
    [ 958.132599] 3fe0: 00000000 beded180 b6811324 b6811334 800f0010 00000007 2e7f5821 2e7f5c21
    [ 958.140815] [] (ceph_osdc_build_request+0x8c/0x4f8 [libceph]) from [] (rbd_osd_req_format_write+0x50/0x7c [rbd])
    [ 958.152739] [] (rbd_osd_req_format_write+0x50/0x7c [rbd]) from [] (rbd_dev_header_watch_sync+0xe0/0x204 [rbd])
    [ 958.164486] [] (rbd_dev_header_watch_sync+0xe0/0x204 [rbd]) from [] (rbd_dev_image_probe+0x23c/0x850 [rbd])
    [ 958.175967] [] (rbd_dev_image_probe+0x23c/0x850 [rbd]) from [] (rbd_add+0x3c0/0x918 [rbd])
    [ 958.185975] [] (rbd_add+0x3c0/0x918 [rbd]) from [] (bus_attr_store+0x20/0x2c)
    [ 958.194850] [] (bus_attr_store+0x20/0x2c) from [] (sysfs_write_file+0x168/0x198)
    [ 958.203984] [] (sysfs_write_file+0x168/0x198) from [] (vfs_write+0x9c/0x170)
    [ 958.212768] [] (vfs_write+0x9c/0x170) from [] (sys_write+0x3c/0x70)
    [ 958.220768] [] (sys_write+0x3c/0x70) from [] (ret_fast_syscall+0x0/0x30)
    [ 958.229199] Code: e59d1058 e5913000 e3530000 ba000114 (e7f001f2)

    CC: stable@vger.kernel.org # 3.4+
    Signed-off-by: Josh Durgin
    Reviewed-by: Sage Weil

    Josh Durgin
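
The conversion problem can be reproduced in userspace by letting int32_t stand in for a 32-bit __kernel_time_t. This sketch assumes the dropped check compared tv_sec against U32_MAX converted to the signed time type; the function names are hypothetical, and the narrowing cast behaves as shown on common two's-complement compilers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 32-bit time type: UINT32_MAX does not fit and converts to -1,
 * so "tv_sec > limit" is true for every non-negative timestamp. */
static bool check_fires_32(int32_t tv_sec)
{
    assert(tv_sec >= 0);
    return tv_sec > (int32_t)UINT32_MAX;
}

/* 64-bit time type: the limit is representable and the check is sane. */
static bool check_fires_64(int64_t tv_sec)
{
    assert(tv_sec >= 0);
    return tv_sec > (int64_t)UINT32_MAX;
}
```

For example, the value 0x51cc8143 seen in r3 of the trace above trips the 32-bit variant but not the 64-bit one.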
     
  • Pull NFS client updates from Trond Myklebust:
    "Feature highlights include:
    - Add basic client support for NFSv4.2
    - Add basic client support for Labeled NFS (selinux for NFSv4.2)
    - Fix the use of credentials in NFSv4.1 stateful operations, and add
    support for NFSv4.1 state protection.

    Bugfix highlights:
    - Fix another NFSv4 open state recovery race
    - Fix an NFSv4.1 back channel session regression
    - Various rpc_pipefs races
    - Fix another issue with NFSv3 auth negotiation

    Please note that Labeled NFS does require some additional support from
    the security subsystem. The relevant changesets have all been
    reviewed and acked by James Morris."

    * tag 'nfs-for-3.11-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (54 commits)
    NFS: Set NFS_CS_MIGRATION for NFSv4 mounts
    NFSv4.1 Refactor nfs4_init_session and nfs4_init_channel_attrs
    nfs: have NFSv3 try server-specified auth flavors in turn
    nfs: have nfs_mount fake up a auth_flavs list when the server didn't provide it
    nfs: move server_authlist into nfs_try_mount_request
    nfs: refactor "need_mount" code out of nfs_try_mount
    SUNRPC: PipeFS MOUNT notification optimization for dying clients
    SUNRPC: split client creation routine into setup and registration
    SUNRPC: fix races on PipeFS UMOUNT notifications
    SUNRPC: fix races on PipeFS MOUNT notifications
    NFSv4.1 use pnfs_device maxcount for the objectlayout gdia_maxcount
    NFSv4.1 use pnfs_device maxcount for the blocklayout gdia_maxcount
    NFSv4.1 Fix gdia_maxcount calculation to fit in ca_maxresponsesize
    NFS: Improve legacy idmapping fallback
    NFSv4.1 end back channel session draining
    NFS: Apply v4.1 capabilities to v4.2
    NFSv4.1: Clean up layout segment comparison helper names
    NFSv4.1: layout segment comparison helpers should take 'const' parameters
    NFSv4: Move the DNS resolver into the NFSv4 module
    rpc_pipefs: only set rpc_dentry_ops if d_op isn't already set
    ...

    Linus Torvalds
     
  • Pull ext3 fix and quota cleanup from Jan Kara:
    "A fix of ext3 error reporting from fsync and a quota cleanup"

    * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
    quota: Convert use of typedef ctl_table to struct ctl_table
    ext3: Fix fsync error handling after filesystem abort.

    Linus Torvalds
     
  • Pull third set of VFS updates from Al Viro:
    "Misc stuff all over the place. There will be one more pile in a
    couple of days"

    This is an "evil merge" that also uses the new d_count helper in
    fs/configfs/dir.c, missed by commit 84d08fa888e7 ("helper for reading
    ->d_count")

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
    ncpfs: fix error return code in ncp_parse_options()
    locks: move file_lock_list to a set of percpu hlist_heads and convert file_lock_lock to an lglock
    seq_file: add seq_list_*_percpu helpers
    f2fs: fix readdir incorrectness
    mode_t whack-a-mole...
    lustre: kill the pointless wrapper
    helper for reading ->d_count

    Linus Torvalds
     
  • I was reviewing code which I suspected might allocate a zero size SG
    table. That will cause memory corruption. Also we can't return before
    doing the memset or we could end up using uninitialized memory in the
    cleanup path.

    Signed-off-by: Dan Carpenter
    Cc: Akinobu Mita
    Cc: Imre Deak
    Cc: Tejun Heo
    Cc: Daniel Vetter
    Cc: Maxim Levitsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Carpenter
     
  • do_device_access() is a function that abstracts copying SG list from/to
    ramdisk storage (fake_storep).

    It must deal with the ranges exceeding actual fake_storep size, because
    such ranges are valid if virtual_gb is set greater than zero, and they
    should be treated as if fake_storep were repeatedly mirrored up to virtual
    size.

    Unfortunately, it can't deal with a range that wraps around the end of
    fake_storep. A wrap-around range is copied by two
    sg_copy_{from,to}_buffer() calls, but sg_copy_{from,to}_buffer() can't
    start copying in the middle of an SG list, so the second call can't
    copy correctly.

    This fixes it by using sg_pcopy_{from,to}_buffer(), which can copy from/to
    the middle of an SG list.

    This also simplifies the assignment of sdb->resid in
    fill_from_dev_buffer(). Because fill_from_dev_buffer() is now called only
    once per command execution cycle, there is no need to take care of
    decreasing sdb->resid when fill_from_dev_buffer() is called more than once.
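
    The wrap-around case can be illustrated with a small userspace model.
    This is only a sketch: write_wrapping() and pcopy_from_sg() are
    hypothetical names, the SG list is modelled as a list of bytes objects,
    and the request is assumed to be no larger than the store. The point is
    that the second copy must start `first` bytes into the SG list, which is
    what the skip argument provides.

```python
def pcopy_from_sg(sgl, dst, dst_off, nbytes, skip):
    """Copy nbytes from the segment list into dst[dst_off:],
    after skipping `skip` bytes of the list. Returns bytes copied."""
    flat = b"".join(sgl)  # flattened for brevity; the kernel walks pages
    chunk = flat[skip:skip + nbytes]
    dst[dst_off:dst_off + len(chunk)] = chunk
    return len(chunk)


def write_wrapping(store, start, sgl, nbytes):
    """Write nbytes from sgl into a store that is treated as mirrored,
    so a range running past the end continues at offset 0.
    Assumes nbytes <= len(store)."""
    start %= len(store)
    first = min(nbytes, len(store) - start)
    # First part: from `start` up to the end of the store.
    copied = pcopy_from_sg(sgl, store, start, first, skip=0)
    rest = nbytes - first
    if rest:
        # Second part: wraps to offset 0 and must resume `first` bytes
        # into the segment list, not at its beginning.
        copied += pcopy_from_sg(sgl, store, 0, rest, skip=first)
    return copied
```

    With an 8-byte store, writing 4 bytes at offset 6 from the segments
    b"AB" and b"CD" puts "AB" at the end of the store and "CD" at the start.
    Without the skip, the second copy would write "AB" again.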

    Signed-off-by: Akinobu Mita
    Cc: "David S. Miller"
    Cc: "James E.J. Bottomley"
    Cc: Douglas Gilbert
    Cc: Herbert Xu
    Cc: Horia Geanta
    Cc: Imre Deak
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Use sg_pcopy_to_buffer(), which is better than the function previously
    used because it doesn't do kmap/kunmap for skipped pages.

    Signed-off-by: Akinobu Mita
    Cc: "David S. Miller"
    Cc: "James E.J. Bottomley"
    Cc: Douglas Gilbert
    Cc: Herbert Xu
    Cc: Horia Geanta
    Cc: Imre Deak
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • The only difference between sg_pcopy_{from,to}_buffer() and
    sg_copy_{from,to}_buffer() is an additional argument that specifies the
    number of bytes to skip in the SG list before copying.
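
    A minimal userspace model of the skip semantics, for illustration only.
    The SG list is represented as a list of bytes objects and the function
    signatures are simplified assumptions; they do not mirror the kernel
    prototypes.

```python
def sg_pcopy_to_buffer(sgl, nbytes, skip):
    """Return up to nbytes taken from the segment list,
    starting `skip` bytes into the list."""
    out = bytearray()
    for seg in sgl:
        if skip >= len(seg):
            # Whole segment lies inside the skipped region: do not touch it.
            skip -= len(seg)
            continue
        out += seg[skip:skip + (nbytes - len(out))]
        skip = 0
        if len(out) == nbytes:
            break
    return bytes(out)


def sg_copy_to_buffer(sgl, nbytes):
    """The non-skipping variant is the same copy with skip == 0."""
    return sg_pcopy_to_buffer(sgl, nbytes, skip=0)
```

    For the segments b"abcd", b"efgh", b"ijkl", copying 5 bytes with a skip
    of 6 yields b"ghijk", while the plain copy yields b"abcde".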

    Signed-off-by: Akinobu Mita
    Cc: "David S. Miller"
    Cc: "James E.J. Bottomley"
    Cc: Douglas Gilbert
    Cc: Herbert Xu
    Cc: Horia Geanta
    Cc: Imre Deak
    Acked-by: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • This patchset introduces sg_pcopy_from_buffer() and sg_pcopy_to_buffer(),
    which copy data between a linear buffer and an SG list.

    The only difference between sg_pcopy_{from,to}_buffer() and
    sg_copy_{from,to}_buffer() is an additional argument that specifies the
    number of bytes to skip in the SG list before copying.

    The main reason for introducing these functions is to fix a problem in
    the scsi_debug module. There is also a local function in the
    crypto/talitos module that can be replaced by sg_pcopy_to_buffer().

    This patch:

    sg_miter_get_next_page() advances the page iterator to the next page
    if necessary, and will be used to implement the variants of
    sg_copy_{from,to}_buffer() later.

    Signed-off-by: Akinobu Mita
    Acked-by: Tejun Heo
    Cc: Tejun Heo
    Cc: Imre Deak
    Cc: Herbert Xu
    Cc: "David S. Miller"
    Cc: "James E.J. Bottomley"
    Cc: Douglas Gilbert
    Cc: Horia Geanta
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Add support for the lz4 and lz4hc compression algorithms using the
    lib/lz4/* codebase.

    [akpm@linux-foundation.org: fix warnings]
    Signed-off-by: Chanho Min
    Cc: "Darrick J. Wong"
    Cc: Bob Pearson
    Cc: Richard Weinberger
    Cc: Herbert Xu
    Cc: Yann Collet
    Cc: Kyungsik Lee
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chanho Min
     
  • This patchset adds support for LZ4 compression and for using it through
    the crypto API.

    As shown below, the compressed data is slightly larger than with LZO, but
    compression is faster when unaligned memory access is enabled. lz4
    compression and decompression can also be used through the crypto API,
    which will be useful for other potential users of lz4 compression.

    lz4 Compression Benchmark:
    Compiler: ARM gcc 4.6.4
    ARMv7, 1 GHz based board
    Kernel: linux 3.4
    Uncompressed data Size: 101 MB
            Compressed Size   Compression Speed
    LZO     72.1MB            32.1MB/s, 33.0MB/s(UA)
    LZ4     75.1MB            30.4MB/s, 35.9MB/s(UA)
    LZ4HC   59.8MB             2.4MB/s,  2.5MB/s(UA)
    - UA: Unaligned memory Access support
    - Latest patch set for LZO applied

    This patch:

    Add support for LZ4 compression in the Linux Kernel. LZ4 Compression APIs
    for kernel are based on LZ4 implementation by Yann Collet and were changed
    for kernel coding style.

    LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
    LZ4 source repository : http://code.google.com/p/lz4/
    svn revision : r90

    Two APIs are added:

    lz4_compress() supports basic lz4 compression, whereas lz4hc_compress()
    supports high compression: CPU performance is lower but the compression
    ratio is higher. Both require pre-allocated working memory of the
    defined size, and the destination buffer must be allocated with the size
    given by lz4_compressbound.
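
    As a rough illustration of how the destination buffer would be sized,
    the sketch below assumes the bound follows the worst-case formula used
    by the upstream LZ4 library (input size + input size / 255 + 16). The
    commit message does not state the formula, so treat it as an
    assumption; alloc_dst() is a hypothetical helper.

```python
def lz4_compressbound(isize):
    """Assumed worst-case compressed size for `isize` input bytes
    (upstream LZ4 formula; not confirmed by the commit message)."""
    return isize + isize // 255 + 16


def alloc_dst(src):
    """Allocate a destination buffer large enough for any input of this size."""
    return bytearray(lz4_compressbound(len(src)))
```

    Under that assumption, a 4096-byte input needs a 4128-byte destination
    buffer, so incompressible data still fits.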

    [akpm@linux-foundation.org: make lz4_compresshcctx() static]
    Signed-off-by: Chanho Min
    Cc: "Darrick J. Wong"
    Cc: Bob Pearson
    Cc: Richard Weinberger
    Cc: Herbert Xu
    Cc: Yann Collet
    Cc: Kyungsik Lee
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chanho Min
     
  • Integrate the LZ4 decompression code into the arm pre-boot code.

    Signed-off-by: Kyungsik Lee
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Russell King
    Cc: Borislav Petkov
    Cc: Florian Fainelli
    Cc: Yann Collet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kyungsik Lee
     
  • Add support for extracting LZ4-compressed kernel images, as well as
    LZ4-compressed ramdisk images in the kernel boot process.

    Signed-off-by: Kyungsik Lee
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Russell King
    Cc: Borislav Petkov
    Cc: Florian Fainelli
    Cc: Yann Collet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kyungsik Lee
     
  • Add support for LZ4 decompression in the Linux Kernel. LZ4 Decompression
    APIs for kernel are based on LZ4 implementation by Yann Collet.

    Benchmark Results(PATCH v3)
    Compiler: Linaro ARM gcc 4.6.2

    1. ARMv7, 1.5GHz based board
    Kernel: linux 3.4
    Uncompressed Kernel Size: 14MB
            Compressed Size   Decompression Speed
    LZO     6.7MB             20.1MB/s, 25.2MB/s(UA)
    LZ4     7.3MB             29.1MB/s, 45.6MB/s(UA)

    2. ARMv7, 1.7GHz based board
    Kernel: linux 3.7
    Uncompressed Kernel Size: 14MB
            Compressed Size   Decompression Speed
    LZO     6.0MB             34.1MB/s, 52.2MB/s(UA)
    LZ4     6.5MB             86.7MB/s
    - UA: Unaligned memory Access support
    - Latest patch set for LZO applied

    This patch set is for adding support for LZ4-compressed Kernel. LZ4 is a
    very fast lossless compression algorithm and it also features an extremely
    fast decoder [1].

    However, we already have five decompressors, which raises the question
    of where we stop adding new ones. This issue has been discussed and a
    conclusion was reached [2].

    Russell King said that we should have:

    - one decompressor which is the fastest
    - one decompressor for the highest compression ratio
    - one popular decompressor (eg conventional gzip)

    If we have a replacement one for one of these, then it should do exactly
    that: replace it.

    The benchmark shows an 8% increase in image size versus a 66% increase
    in decompression speed compared to LZO (which has been known as the
    fastest decompressor in the kernel). Therefore the "fast but may not be
    small" compression title has clearly been taken by LZ4 [3].
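
    The quoted percentages can be checked against the tables above. The
    sketch below assumes they come from the second board (1.7GHz, linux 3.7),
    comparing LZ4 with the LZO figure measured with unaligned access. That
    pairing is an inference from the numbers, not something the message
    states.

```python
def pct_increase(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Board 2: compressed image size in MB, decompression speed in MB/s.
size_increase = pct_increase(6.5, 6.0)     # LZ4 vs LZO image size
speed_increase = pct_increase(86.7, 52.2)  # LZ4 vs LZO (UA) speed

print(f"size +{size_increase:.1f}%, speed +{speed_increase:.1f}%")
# prints "size +8.3%, speed +66.1%"
```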

    [1] http://code.google.com/p/lz4/
    [2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157
    [3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347

    LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html
    LZ4 source repository: http://code.google.com/p/lz4/

    Signed-off-by: Kyungsik Lee
    Signed-off-by: Yann Collet
    Cc: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Russell King
    Cc: Borislav Petkov
    Cc: Florian Fainelli
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kyungsik Lee
     
  • Some architectures need __c[lt]z[sd]i2() for __builtin_c[lt]z[ll], and
    their absence causes a build failure. They can be implemented using
    fls()/__ffs() and overridden by linking arch-specific versions, which
    may not be implemented yet.
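
    A sketch of how the 32-bit count-leading-zeros and count-trailing-zeros
    helpers reduce to fls() and __ffs(). The Python functions below are
    userspace models with hypothetical names (ffs0() stands in for __ffs());
    as with the compiler builtins, the result for a zero argument should be
    treated as undefined.

```python
def fls(x):
    """1-based index of the most significant set bit; 0 when x == 0."""
    return x.bit_length()


def ffs0(x):
    """Models __ffs(): 0-based index of the least significant set bit.
    Undefined for x == 0."""
    return (x & -x).bit_length() - 1


def clzsi2(x):
    """Leading zero bits in a 32-bit value."""
    return 32 - fls(x & 0xFFFFFFFF)


def ctzsi2(x):
    """Trailing zero bits in a 32-bit value."""
    return ffs0(x & 0xFFFFFFFF)
```

    For example, clzsi2(1) is 31 and ctzsi2(8) is 3.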

    This is required by "lib: add lz4 compressor module".

    Reference: https://lkml.org/lkml/2013/4/18/603

    Signed-off-by: Chanho Min
    Reported-by: Geert Uytterhoeven
    Cc: "Darrick J. Wong"
    Cc: Bob Pearson
    Cc: Richard Weinberger
    Cc: Herbert Xu
    Cc: Yann Collet
    Cc: Kyungsik Lee
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chanho Min
     
  • Merge together the unicore32, arm, and x86 reboot= command line
    parameter handling.
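
    A simplified model of what shared reboot= parsing might look like once
    the mode is an enum. The enum members and the letter-to-mode mapping
    below are assumptions made for illustration; the commit message does
    not list them. Options that select a reboot type or a CPU are not
    modelled here.

```python
from enum import Enum


class RebootMode(Enum):
    COLD = 0
    WARM = 1
    HARD = 2
    SOFT = 3
    GPIO = 4


# Assumed mapping from the first letter of an option to a mode.
_MODE_LETTERS = {
    "c": RebootMode.COLD,
    "w": RebootMode.WARM,
    "h": RebootMode.HARD,
    "s": RebootMode.SOFT,
    "g": RebootMode.GPIO,
}


def parse_reboot(arg, default=RebootMode.COLD):
    """Parse a comma-separated reboot= value; the last recognised mode
    option wins, and unrecognised options are ignored."""
    mode = default
    for opt in arg.split(","):
        if opt and opt[0] in _MODE_LETTERS:
            mode = _MODE_LETTERS[opt[0]]
    return mode
```

    With this sketch, parse_reboot("warm") returns RebootMode.WARM and an
    empty string leaves the default in place.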

    Signed-off-by: Robin Holt
    Cc: H. Peter Anvin
    Cc: Russell King
    Cc: Guan Xuetao
    Cc: Russ Anderson
    Cc: Robin Holt
    Acked-by: Ingo Molnar
    Acked-by: Guan Xuetao
    Acked-by: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robin Holt
     
  • Preparing to move the parsing of reboot= to generic kernel code forces
    the change in reboot_mode handling to use the enum.

    [akpm@linux-foundation.org: fix arch/arm/mach-socfpga/socfpga.c]
    Signed-off-by: Robin Holt
    Cc: Russell King
    Cc: Russ Anderson
    Cc: Robin Holt
    Cc: H. Peter Anvin
    Cc: Guan Xuetao
    Acked-by: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robin Holt
     
  • Prepare for moving the parsing of reboot= to the generic kernel code
    by making reboot_mode into a more generic form.

    Signed-off-by: Robin Holt
    Cc: Russell King
    Cc: Russ Anderson
    Cc: Robin Holt
    Cc: H. Peter Anvin
    Cc: Guan Xuetao
    Acked-by: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robin Holt
     
  • These restart_mode fields are not used at all. Remove them to make
    moving the reboot= cmdline options to the general kernel easier.

    Signed-off-by: Robin Holt
    Cc: Russell King
    Cc: Russ Anderson
    Cc: Robin Holt
    Cc: H. Peter Anvin
    Cc: Guan Xuetao
    Acked-by: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robin Holt
     
  • Prepare for moving the parsing of reboot= to the generic kernel code
    by making reboot_mode into a more generic form.

    Signed-off-by: Robin Holt
    Cc: Guan Xuetao
    Cc: Russ Anderson
    Cc: Robin Holt
    Cc: Russell King
    Cc: H. Peter Anvin
    Acked-by: Guan Xuetao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robin Holt
     
  • Prepare for moving the parsing of reboot= to the generic kernel code
    by making reboot_mode into a more generic form.

    Signed-off-by: Robin Holt
    Cc: H. Peter Anvin
    Cc: Miguel Boton
    Cc: Russ Anderson
    Cc: Robin Holt
    Cc: Russell King
    Cc: Guan Xuetao
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robin Holt