22 Apr, 2016

2 commits

  • - make modeset hw state checker atomic aware (Maarten)
    - close races in gpu stuck detection/seqno reading (Chris)
    - tons&tons of small improvements from Chris Wilson all over the gem code
    - more dsi/bxt work from Ramalingam&Jani
    - macro polish from Joonas
    - guc fw loading fixes (Arun&Dave)
    - vmap notifier (acked by Andrew) + i915 support by Chris Wilson
    - create bottom half for execlist irq processing (Chris Wilson)
    - vlv/chv pll cleanup (Ville)
    - rework DP detection, especially sink detection (Shubhangi Shrivastava)
    - make color manager support fully atomic (Maarten)
    - avoid livelock on chv in execlist irq handler (Chris)

    * tag 'drm-intel-next-2016-04-11' of git://anongit.freedesktop.org/drm-intel: (82 commits)
    drm/i915: Update DRIVER_DATE to 20160411
    drm/i915: Avoid allocating a vmap arena for a single page
    drm,i915: Introduce drm_malloc_gfp()
    drm/i915/shrinker: Restrict vmap purge to objects with vmaps
    drm/i915: Refactor duplicate object vmap functions
    drm/i915: Consolidate common error handling in intel_pin_and_map_ringbuffer_obj
    drm/i915/dmabuf: Tighten struct_mutex for unmap_dma_buf
    drm/i915: implement WaClearTdlStateAckDirtyBits
    drm/i915/bxt: Reversed polarity of PORT_PLL_REF_SEL bit
    drm/i915: Rename hw state checker to hw state verifier.
    drm/i915: Move modeset state verifier calls.
    drm/i915: Make modeset state verifier take crtc as argument.
    drm/i915: Replace manual barrier() with READ_ONCE() in HWS accessor
    drm/i915: Use simplest form for flushing the single cacheline in the HWS
    drm/i915: Harden detection of missed interrupts
    drm/i915: Separate out the seqno-barrier from engine->get_seqno
    drm/i915: Remove forcewake dance from seqno/irq barrier on legacy gen6+
    drm/i915: Fixup the free space logic in ring_prepare
    drm/i915: Simplify check for idleness in hangcheck
    drm/i915: Apply a mb between emitting the request and hangcheck
    ...

    Dave Airlie
     
  • Backmerge 4.6-rc3 for i915.

    Linux 4.6-rc3

    Dave Airlie
     

12 Apr, 2016

1 commit

  • Linux 4.6-rc3

    Backmerge requested by Chris Wilson to make his patches apply cleanly.
    Tiny conflict in vmalloc.c with the (properly acked and all) patch in
    drm-intel-next:

    commit 4da56b99d99e5a7df2b7f11e87bfea935f909732
    Author: Chris Wilson
    Date: Mon Apr 4 14:46:42 2016 +0100

    mm/vmap: Add a notifier for when we run out of vmap address space

    and Linus' tree.

    Signed-off-by: Daniel Vetter

    Daniel Vetter
     

10 Apr, 2016

2 commits

  • Pull networking fixes from David Miller:

    1) Stale SKB data pointer access across pskb_may_pull() calls in L2TP,
    from Haishuang Yan.

    2) Fix multicast frame handling in mac80211 AP code, from Felix
    Fietkau.

    3) mac80211 station hashtable insert errors not handled properly, fix
    from Johannes Berg.

    4) Fix TX descriptor count limit handling in e1000, from Alexander
    Duyck.

    5) Revert a buggy netdev refcount fix in netpoll, from Bjorn Helgaas.

    6) Must assign rtnl_link_ops of the device before registering it, fix
    in ip6_tunnel from Thadeu Lima de Souza Cascardo.

    7) Memory leak fix in tc action net exit, from WANG Cong.

    8) Add missing AF_KCM entries to name tables, from Dexuan Cui.

    9) Fix regression in GRE handling of csums wrt. FOU, from Alexander
    Duyck.

    10) Fix memory allocation alignment and congestion map corruption in
    RDS, from Shamir Rabinovitch.

    11) Fix default qdisc regression in tuntap driver, from Jason Wang.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits)
    bridge, netem: mark mailing lists as moderated
    tuntap: restore default qdisc
    mpls: find_outdev: check for err ptr in addition to NULL check
    ipv6: Count in extension headers in skb->network_header
    RDS: fix congestion map corruption for PAGE_SIZE > 4k
    RDS: memory allocated must be align to 8
    GRE: Disable segmentation offloads w/ CSUM and we are encapsulated via FOU
    net: add the AF_KCM entries to family name tables
    MAINTAINERS: intel-wired-lan list is moderated
    lib/test_bpf: Add additional BPF_ADD tests
    lib/test_bpf: Add test to check for result of 32-bit add that overflows
    lib/test_bpf: Add tests for unsigned BPF_JGT
    lib/test_bpf: Fix JMP_JSET tests
    VSOCK: Detach QP check should filter out non matching QPs.
    stmmac: fix adjust link call in case of a switch is attached
    af_packet: tone down the Tx-ring unsupported spew.
    net_sched: fix a memory leak in tc action
    samples/bpf: Enable powerpc support
    samples/bpf: Use llc in PATH, rather than a hardcoded value
    samples/bpf: Fix build breakage with map_perf_test_user.c
    ...

    Linus Torvalds
     
  • Pull IOMMU fixes from Joerg Roedel:

    - compile-time fixes (warnings and failures)

    - a bug in iommu core code which could cause the group->domain pointer
    to be falsely cleared

    - fix in scatterlist handling of the ARM common DMA-API code

    - stall detection fix for the Rockchip IOMMU driver

    * tag 'iommu-fixes-v4.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
    iommu/vt-d: Silence an uninitialized variable warning
    iommu/rockchip: Fix "is stall active" check
    iommu: Don't overwrite domain pointer when there is no default_domain
    iommu/dma: Restore scatterlist offsets correctly
    iommu: provide of_xlate pointer unconditionally

    Linus Torvalds
     

08 Apr, 2016

2 commits

  • Pull ext4 bugfixes from Ted Ts'o:
    "These changes contain a fix for overlayfs interacting with some
    (badly behaved) dentry code in various file systems. These have been
    reviewed by Al and the respective file system maintainers and are going
    through the ext4 tree for convenience.

    This also has a few ext4 encryption bug fixes that were discovered in
    Android testing (yes, we will need to get these sync'ed up with the
    fs/crypto code; I'll take care of that). It also has some bug fixes
    and a change to ignore the legacy quota options to allow for xfstests
    regression testing of ext4's internal quota feature and to be more
    consistent with how xfs handles this case"

    * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: ignore quota mount options if the quota feature is enabled
    ext4 crypto: fix some error handling
    ext4: avoid calling dquot_get_next_id() if quota is not enabled
    ext4: retry block allocation for failed DIO and DAX writes
    ext4: add lockdep annotations for i_data_sem
    ext4: allow readdir()'s of large empty directories to be interrupted
    btrfs: fix crash/invalid memory access on fsync when using overlayfs
    ext4 crypto: use dget_parent() in ext4_d_revalidate()
    ext4: use file_dentry()
    ext4: use dget_parent() in ext4_file_open()
    nfs: use file_dentry()
    fs: add file_dentry()
    ext4 crypto: don't let data integrity writebacks fail with ENOMEM
    ext4: check if in-inode xattr is corrupted in ext4_expand_extra_isize_ea()

    Linus Torvalds
     
  • This patch fixes an issue I found in which we were dropping frames if we
    had enabled checksums on GRE headers that were encapsulated by either FOU
    or GUE. Without this patch I was barely able to get 1 Gb/s of throughput.
    With this patch applied I am now at least getting around 6 Gb/s.

    The issue is due to the fact that with FOU or GUE applied we do not provide
    a transport offset pointing to the GRE header, nor do we offload it in
    software as the GRE header is completely skipped by GSO and treated like a
    VXLAN or GENEVE type header. As such we need to prevent the stack from
    generating it and also prevent GRE from generating it via any interface we
    create.

    Fixes: c3483384ee511 ("gro: Allow tunnel stacking in the case of FOU/GUE")
    Signed-off-by: Alexander Duyck
    Signed-off-by: David S. Miller

    Alexander Duyck
     

06 Apr, 2016

2 commits

  • * tag 'topic/drm-misc-2016-04-01' of git://anongit.freedesktop.org/drm-intel:
    drm: Add new DCS commands in the enum list
    drm: Make uapi headers C89 pendantic compliant
    drm/atomic: export drm_atomic_helper_wait_for_fences()
    drm: Untangle __KERNEL__ guards
    drm: Move DRM_MODE_OBJECT_* to uapi headers
    drm: align #include directives with libdrm in uapi headers
    drm: Make drm.h uapi header safe for C++
    vgacon: dummy implementation for vgacon_text_force
    drm/sysfs: Nuke TV/DVI property files
    drm/ttm: Remove TTM_HAS_AGP
    drm: bridge/dw-hdmi: Remove pre_enable/post_disable dummy funcs
    Revert "drm: Don't pass negative delta to ktime_sub_ns()"
    drm/atmel: Fixup drm_connector_/unplug/unregister/_all
    drm: Rename drm_connector_unplug_all() to drm_connector_unregister_all()
    drm: bridge: Make (pre/post) enable/disable callbacks optional

    Dave Airlie
     
  • Pull KVM fixes from Paolo Bonzini:
    "Miscellaneous bugfixes.

    The ARM and s390 fixes are for new regressions from the merge window,
    others are usual stable material"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    compiler-gcc: disable -ftracer for __noclone functions
    kvm: x86: make lapic hrtimer pinned
    s390/mm/kvm: fix mis-merge in gmap handling
    kvm: set page dirty only if page has been writable
    KVM: x86: reduce default value of halt_poll_ns parameter
    KVM: Hyper-V: do not do hypercall userspace exits if SynIC is disabled
    KVM: x86: Inject pending interrupt even if pending nmi exist
    arm64: KVM: Register CPU notifiers when the kernel runs at HYP
    arm64: kvm: 4.6-rc1: Fix VTCR_EL2 VS setting

    Linus Torvalds
     

05 Apr, 2016

7 commits

  • -ftracer can duplicate asm blocks causing compilation to fail in
    noclone functions. For example, KVM declares a global variable
    in an asm like

    asm("2: ... \n
    .pushsection data \n
    .global vmx_return \n
    vmx_return: .long 2b");

    and -ftracer causes a double declaration.

    Cc: Andrew Morton
    Cc: Michal Marek
    Cc: stable@vger.kernel.org
    Cc: kvm@vger.kernel.org
    Reported-by: Linda Walsh
    Signed-off-by: Paolo Bonzini

    Paolo Bonzini
     
  • iommu drivers that support the standard DT bindings use a of_xlate
    callback pointer, but that is only part of struct iommu_ops when
    CONFIG_OF_IOMMU is enabled, leading to build errors in randconfig
    builds when that is not provided:

    drivers/iommu/mtk_iommu.c:497:2: error: unknown field 'of_xlate' specified in initializer
    .of_xlate = mtk_iommu_of_xlate,
    ^
    drivers/iommu/mtk_iommu.c:497:14: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
    .of_xlate = mtk_iommu_of_xlate,
    ^~~~~~~~~~~~~~~~~~
    drivers/iommu/mtk_iommu.c:497:14: note: (near initialization for 'mtk_iommu_ops.domain_get_attr')

    We can work around it by adding more #ifdefs in each driver, but
    it seems nicer to just allow setting the pointer even if it is
    unused. This makes the driver code look nicer, and it gives better
    compile-time coverage when test building on other architectures.
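
    A minimal userspace sketch of the idea, not the kernel code: the
    callback field is declared unconditionally, the driver initializes it
    without an #ifdef, and only the core decides whether to call it. The
    names ops_sketch, core_xlate() and SKETCH_OF_ENABLED are hypothetical
    stand-ins for struct iommu_ops, the core caller and CONFIG_OF_IOMMU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a CONFIG_ option; 0 means "disabled". */
#ifndef SKETCH_OF_ENABLED
#define SKETCH_OF_ENABLED 0
#endif

struct ops_sketch {
    int (*map)(int addr);
    int (*of_xlate)(int spec);   /* declared unconditionally */
};

static int demo_map(int addr) { return addr + 1; }
static int demo_of_xlate(int spec) { return spec * 2; }

/* The driver initializes the field without any #ifdef of its own. */
static const struct ops_sketch demo_ops = {
    .map = demo_map,
    .of_xlate = demo_of_xlate,
};

/* Only the core decides whether the callback is ever invoked. */
static int core_xlate(const struct ops_sketch *ops, int spec)
{
    assert(ops != NULL);
    if (!SKETCH_OF_ENABLED || ops->of_xlate == NULL)
        return -1;               /* feature compiled out or not provided */
    return ops->of_xlate(spec);
}
```

    With the option disabled the initializer still compiles, which is the
    compile-time coverage the message refers to.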

    Signed-off-by: Arnd Bergmann
    Fixes: 0df4fabe208d ("iommu/mediatek: Add mt8173 IOMMU driver")
    Reviewed-by: Robin Murphy
    Signed-off-by: Joerg Roedel

    Arnd Bergmann
     
  • vmaps are temporary kernel mappings that may be of long duration.
    Reusing a vmap on an object is preferable for a driver as the cost of
    setting up the vmap can otherwise dominate the operation on the object.
    However, the vmap address space is rather limited on 32bit systems and
    so we add a notification for vmap pressure in order for the driver to
    release any cached vmappings.

    The interface is styled after the oom-notifier where the callees are
    passed a pointer to an unsigned long counter for them to indicate if they
    have freed any space.
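
    As an illustration only, the oom-notifier style described above can be
    sketched in plain userspace C. All names here (purge_fn,
    register_purge_callback(), notify_pressure(), driver_purge()) are
    hypothetical and are not the kernel interface; the real code uses a
    blocking notifier chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: each callee adds the number of mappings it
 * released to *freed, in the style of the OOM notifier. */
typedef void (*purge_fn)(unsigned long *freed);

#define MAX_CALLBACKS 8

static purge_fn callbacks[MAX_CALLBACKS];
static size_t nr_callbacks;

/* Register a callback; returns 0 on success, -1 if the table is full. */
static int register_purge_callback(purge_fn fn)
{
    assert(fn != NULL);
    if (nr_callbacks == MAX_CALLBACKS)
        return -1;
    callbacks[nr_callbacks++] = fn;
    return 0;
}

/* Called when address space runs out: ask every callee to release its
 * cached mappings and return the total, so the caller knows whether a
 * retry is worthwhile. */
static unsigned long notify_pressure(void)
{
    unsigned long freed = 0;

    for (size_t i = 0; i < nr_callbacks; i++)
        callbacks[i](&freed);
    return freed;
}

/* Toy driver holding three cached mappings. */
static unsigned long cached_mappings = 3;

static void driver_purge(unsigned long *freed)
{
    *freed += cached_mappings;
    cached_mappings = 0;
}
```

    A non-zero result from notify_pressure() tells the allocator that
    retrying may succeed; zero means no callee had anything to give back.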

    v2: Guard the blocking notifier call with gfpflags_allow_blocking()
    v3: Correct typo in forward declaration and move to head of file

    Signed-off-by: Chris Wilson
    Cc: Andrew Morton
    Cc: David Rientjes
    Cc: Roman Peniaev
    Cc: Mel Gorman
    Cc: linux-mm@kvack.org
    Cc: linux-kernel@vger.kernel.org
    Acked-by: Andrew Morton # for inclusion via DRM
    Cc: Joonas Lahtinen
    Cc: Tvrtko Ursulin
    Link: http://patchwork.freedesktop.org/patch/msgid/1459777603-23618-3-git-send-email-chris@chris-wilson.co.uk
    Reviewed-by: Joonas Lahtinen

    Chris Wilson
     
  • Merge PAGE_CACHE_SIZE removal patches from Kirill Shutemov:
    "PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
    ago with promise that one day it will be possible to implement page
    cache with bigger chunks than PAGE_SIZE.

    This promise never materialized. And unlikely will.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The first patch with most changes has been done with coccinelle. The
    second is manual fixups on top.

    The third patch removes macros definition"

    [ I was planning to apply this just before rc2, but then I spaced out,
    so here it is right _after_ rc2 instead.

    As Kirill suggested as a possibility, I could have decided to only
    merge the first two patches, and leave the old interfaces for
    compatibility, but I'd rather get it all done and any out-of-tree
    modules and patches can trivially do the conversion while still also
    working with older kernels, so there is little reason to try to
    maintain the redundant legacy model. - Linus ]

    * PAGE_CACHE_SIZE-removal:
    mm: drop PAGE_CACHE_* and page_cache_{get,release} definition
    mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage
    mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros

    Linus Torvalds
     
  • All users gone. We can remove these macros.

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • Mostly direct substitution with occasional adjustment or removing
    outdated comments.

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
    ago with promise that one day it will be possible to implement page
    cache with bigger chunks than PAGE_SIZE.

    This promise never materialized. And unlikely will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE. And it's a constant source of confusion on whether
    PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
    especially on the border between fs and mm.

    Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straight-forward:

    - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();
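
    To see why these substitutions are behavior-preserving, here is a small
    userspace sketch. It assumes 4 KiB pages (PAGE_SHIFT of 12) and
    re-creates the old aliases locally; the helper names are made up for
    the illustration.

```c
#include <assert.h>

#define PAGE_SHIFT       12                      /* assumed: 4 KiB pages */
#define PAGE_SIZE        (1UL << PAGE_SHIFT)
#define PAGE_CACHE_SHIFT PAGE_SHIFT              /* the old alias */
#define PAGE_CACHE_SIZE  (1UL << PAGE_CACHE_SHIFT)

/* Before: the shift distance is always zero, so this is a no-op. */
static unsigned long index_before(unsigned long idx)
{
    return idx << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
}

/* After: the expression is used as-is. */
static unsigned long index_after(unsigned long idx)
{
    return idx;
}

/* Before and after for a size computation. */
static unsigned long bytes_before(unsigned long nr) { return nr * PAGE_CACHE_SIZE; }
static unsigned long bytes_after(unsigned long nr)  { return nr * PAGE_SIZE; }

/* Both spellings must agree for any input. */
static void check_equivalence(unsigned long v)
{
    assert(index_before(v) == index_after(v));
    assert(bytes_before(v) == bytes_after(v));
}
```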

    This patch contains automated changes generated with coccinelle using
    script below. For some reason, coccinelle doesn't patch header files.
    I've called spatch for them manually.

    The only adjustment after coccinelle is revert of changes to
    PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code where coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation also
    will be addressed with the separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

03 Apr, 2016

2 commits

  • Pull core kernel fixes from Ingo Molnar:
    "This contains the nohz/atomic cleanup/fix for the fetch_or() ugliness
    you noted during the original nohz pull request, plus there's also
    misc fixes:

    - fix liblockdep build bug
    - fix uapi header build bug
    - print more lockdep hash collision info to help debug recent reports
    of hash collisions
    - update MAINTAINERS email address"

    * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    MAINTAINERS: Update my email address
    locking/lockdep: Print chain_key collision information
    uapi/linux/stddef.h: Provide __always_inline to userspace headers
    tools/lib/lockdep: Fix unsupported 'basename -s' in run_tests.sh
    locking/atomic, sched: Unexport fetch_or()
    timers/nohz: Convert tick dependency mask to atomic_t
    locking/atomic: Introduce atomic_fetch_or()

    Linus Torvalds
     
  • Pull configfs fix from Christoph Hellwig:
    "A trivial fix to the recently introduced binary attribute helper
    macros"

    * tag 'configfs-for-linus-2' of git://git.infradead.org/users/hch/configfs:
    configfs: fix CONFIGFS_BIN_ATTR_[RW]O definitions

    Linus Torvalds
     

02 Apr, 2016

5 commits

  • Pull networking fixes from David Miller:

    1) Missing device reference in IPSEC input path results in crashes
    during device unregistration. From Subash Abhinov Kasiviswanathan.

    2) Per-queue ISR register writes not being done properly in macb
    driver, from Cyrille Pitchen.

    3) Stats accounting bugs in bcmgenet, from Patri Gynther.

    4) Lightweight tunnel's TTL and TOS were swapped in netlink dumps, from
    Quentin Armitage.

    5) SXGBE driver has off-by-one in probe error paths, from Rasmus
    Villemoes.

    6) Fix race in save/swap/delete options in netfilter ipset, from
    Vishwanath Pai.

    7) Ageing time of bridge not set properly when not operating over a
    switchdev device. Fix from Haishuang Yan.

    8) Fix GRO regression wrt nested FOU/GUE based tunnels, from Alexander
    Duyck.

    9) IPV6 UDP code bumps wrong stats, from Eric Dumazet.

    10) FEC driver should only access registers that actually exist on the
    given chipset, fix from Fabio Estevam.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
    net: mvneta: fix changing MTU when using per-cpu processing
    stmmac: fix MDIO settings
    Revert "stmmac: Fix 'eth0: No PHY found' regression"
    stmmac: fix TX normal DESC
    net: mvneta: use cache_line_size() to get cacheline size
    net: mvpp2: use cache_line_size() to get cacheline size
    net: mvpp2: fix maybe-uninitialized warning
    tun, bpf: fix suspicious RCU usage in tun_{attach, detach}_filter
    net: usb: cdc_ncm: adding Telit LE910 V2 mobile broadband card
    rtnl: fix msg size calculation in if_nlmsg_size()
    fec: Do not access unexisting register in Coldfire
    net: mvneta: replace MVNETA_CPU_D_CACHE_LINE_SIZE with L1_CACHE_BYTES
    net: mvpp2: replace MVPP2_CPU_D_CACHE_LINE_SIZE with L1_CACHE_BYTES
    net: dsa: mv88e6xxx: Clear the PDOWN bit on setup
    net: dsa: mv88e6xxx: Introduce _mv88e6xxx_phy_page_{read, write}
    bpf: make padding in bpf_tunnel_key explicit
    ipv6: udp: fix UDP_MIB_IGNOREDMULTI updates
    bnxt_en: Fix ethtool -a reporting.
    bnxt_en: Fix typo in bnxt_hwrm_set_pause_common().
    bnxt_en: Implement proper firmware message padding.
    ...

    Linus Torvalds
     
  • The return value of pmd_trans_huge_lock() is a pointer, not a boolean
    value, so use NULL instead of false as the return value.
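
    A tiny sketch of the pattern, with a made-up lock type and function
    name rather than the real pmd_trans_huge_lock() signature:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct lock_sketch { int held; };

/* Returns the lock that was taken, or NULL when there is nothing to lock. */
static struct lock_sketch *trans_huge_lock_sketch(struct lock_sketch *l,
                                                  bool is_huge)
{
    assert(l != NULL);
    if (!is_huge)
        return NULL;   /* a pointer return type wants NULL, not false */
    l->held = 1;
    return l;
}
```

    Returning false would also compile, since it converts to a null
    pointer, but NULL states the intent and keeps static checkers quiet.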

    Signed-off-by: Chen Gang
    Acked-by: Kirill A. Shutemov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chen Gang
     
    The phy_bus_name field was originally added to manipulate the
    driver name, but lately it was only used to manage the
    fixed-link case and to make some decisions at run time.
    This patch uses is_pseudo_fixed_link instead and removes
    the phy_bus_name variable, which is no longer necessary.

    The driver can manage the mdio registration by using phy-handle,
    dwmac-mdio and its own parameters, e.g. snps,phy-addr.
    This patch takes care of all these possible configurations
    and fixes the mdio registration when there is a real
    transceiver or a switch (which needs to be managed by using
    fixed-link).

    Signed-off-by: Giuseppe Cavallaro
    Reviewed-by: Andreas Färber
    Tested-by: Frank Schäfer
    Cc: Gabriel Fernandez
    Cc: Dinh Nguyen
    Cc: David S. Miller
    Cc: Phil Reid
    Signed-off-by: David S. Miller

    Giuseppe CAVALLARO
     
    This reverts commit 88f8b1bb41c6208f81b6a480244533ded7b59493
    due to problems on GeekBox and Banana Pi M1 boards when
    connected to a real transceiver instead of a switch via
    fixed-link.

    Signed-off-by: Giuseppe Cavallaro
    Cc: Gabriel Fernandez
    Cc: Andreas Färber
    Cc: Frank Schäfer
    Cc: Dinh Nguyen
    Cc: David S. Miller
    Signed-off-by: David S. Miller

    Giuseppe CAVALLARO
     
  • Sasha Levin reported a suspicious rcu_dereference_protected() warning
    found while fuzzing with trinity that is similar to this one:

    [ 52.765684] net/core/filter.c:2262 suspicious rcu_dereference_protected() usage!
    [ 52.765688] other info that might help us debug this:
    [ 52.765695] rcu_scheduler_active = 1, debug_locks = 1
    [ 52.765701] 1 lock held by a.out/1525:
    [ 52.765704] #0: (rtnl_mutex){+.+.+.}, at: [] rtnl_lock+0x17/0x20
    [ 52.765721] stack backtrace:
    [ 52.765728] CPU: 1 PID: 1525 Comm: a.out Not tainted 4.5.0+ #264
    [...]
    [ 52.765768] Call Trace:
    [ 52.765775] [] dump_stack+0x85/0xc8
    [ 52.765784] [] lockdep_rcu_suspicious+0xd5/0x110
    [ 52.765792] [] sk_detach_filter+0x82/0x90
    [ 52.765801] [] tun_detach_filter+0x35/0x90 [tun]
    [ 52.765810] [] __tun_chr_ioctl+0x354/0x1130 [tun]
    [ 52.765818] [] ? selinux_file_ioctl+0x130/0x210
    [ 52.765827] [] tun_chr_ioctl+0x13/0x20 [tun]
    [ 52.765834] [] do_vfs_ioctl+0x96/0x690
    [ 52.765843] [] ? security_file_ioctl+0x43/0x60
    [ 52.765850] [] SyS_ioctl+0x79/0x90
    [ 52.765858] [] do_syscall_64+0x62/0x140
    [ 52.765866] [] entry_SYSCALL64_slow_path+0x25/0x25

    Same can be triggered with PROVE_RCU (+ PROVE_RCU_REPEATEDLY) enabled
    from tun_attach_filter() when user space calls ioctl(tun_fd, TUN{ATTACH,
    DETACH}FILTER, ...) for adding/removing a BPF filter on tap devices.

    Since the fix in f91ff5b9ff52 ("net: sk_{detach|attach}_filter() rcu
    fixes") sk_attach_filter()/sk_detach_filter() now dereference the
    filter with rcu_dereference_protected(), checking whether the socket
    lock is held in the control path.

    Since its introduction in 994051625981 ("tun: socket filter support"),
    tap filters are managed under RTNL lock from __tun_chr_ioctl(). Thus the
    sock_owned_by_user(sk) doesn't apply in this specific case and therefore
    triggers the false positive.

    Extend the BPF API with __sk_attach_filter()/__sk_detach_filter() pair
    that is used by tap filters and pass in lockdep_rtnl_is_held() for the
    rcu_dereference_protected() checks instead.

    Reported-by: Sasha Levin
    Signed-off-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

30 Mar, 2016

1 commit

  • This allows us to ditch a ton of ugly #ifdefs from a bunch of drm modeset
    drivers.

    v2: Make the dummy function actually return a sane value, spotted by
    Ville.

    v3: Because the patch is still in limbo there's no more drivers to
    convert, noticed by Emil.

    v4: Rebase once more, because hooray. I'll just go ahead and apply this
    one later on to drm-misc.

    Cc: Emil Velikov
    Cc: Ville Syrjälä
    Cc: Andrew Morton
    Cc: Greg Kroah-Hartman
    Reviewed-by: Emil Velikov
    Reviewed-by: Alex Deucher
    Signed-off-by: Daniel Vetter

    Daniel Vetter
     

29 Mar, 2016

4 commits

  • This patch functionally reverts:

    5fd7a09cfb8c ("atomic: Export fetch_or()")

    During the merge Linus observed that the generic version of fetch_or()
    was messy:

    " This makes the ugly "fetch_or()" macro that the scheduler used
    internally a new generic helper, and does a bad job at it. "

    e23604edac2a Merge branch 'timers-nohz-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

    Now that we have introduced atomic_fetch_or(), fetch_or() is only used
    by the scheduler in order to deal with thread_info flags, whose type
    can vary across architectures.

    Let's confine fetch_or() back to the scheduler so that we encourage
    future users to use the more robust and well typed atomic_t version
    instead.

    While at it, fetch_or() gets robustified, pasting improvements from a
    previous patch by Ingo Molnar that avoids needless expression
    re-evaluations in the loop.

    Reported-by: Linus Torvalds
    Signed-off-by: Frederic Weisbecker
    Cc: Andrew Morton
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1458830281-4255-4-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
    The tick dependency mask was initially unsigned long because this is the
    type on which clear_bit() operates and fetch_or() accepts it.

    But now that we have atomic_fetch_or(), we can instead use
    atomic_andnot() to clear the bit. This consolidates the type of our
    tick dependency mask, reduces its size in structures and benefits from
    possible architecture optimizations on atomic_t operations.
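
    The same set/clear pattern can be sketched in userspace with C11
    atomics. This is an analogy, not the kernel code: C11 has no and-not
    operation, so the clear is written as atomic_fetch_and() with the
    complement, and the bit names and helpers below are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical dependency bits; the names are illustrative only. */
enum { DEP_POSIX_TIMER = 1 << 0, DEP_PERF_EVENTS = 1 << 1 };

static atomic_int dep_mask;

/* Set a bit and report whether the mask was empty beforehand, i.e.
 * whether this caller is the one that has to react to the change. */
static bool dep_set(int bit)
{
    assert(bit != 0);
    int prev = atomic_fetch_or(&dep_mask, bit);
    return prev == 0;
}

/* Clear a bit by ANDing with its complement. */
static void dep_clear(int bit)
{
    atomic_fetch_and(&dep_mask, ~bit);
}
```

    The value returned by the fetch-or is what makes the "was the mask
    previously empty?" test race-free: set and test happen in one step.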

    Suggested-by: Linus Torvalds
    Signed-off-by: Frederic Weisbecker
    Cc: Andrew Morton
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1458830281-4255-3-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
    This is meant to replace the type generic fetch_or(), which brings a lot
    of issues such as macro induced block variable aliasing and sloppy types.
    Not to mention fetch_or() doesn't refer to any namespace, adding even
    more confusion.

    So let's provide an atomic_t version. Current and next users of fetch_or()
    are thus encouraged to use atomic_t.

    Signed-off-by: Frederic Weisbecker
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1458830281-4255-2-git-send-email-fweisbec@gmail.com
    Signed-off-by: Ingo Molnar

    Frederic Weisbecker
     
  • Update the definition of memcpy_from_pmem() to return 0 or a negative
    error code. Implement x86/arch_memcpy_from_pmem() with memcpy_mcsafe().

    Cc: Borislav Petkov
    Cc: Tony Luck
    Cc: Thomas Gleixner
    Cc: Andy Lutomirski
    Cc: Peter Zijlstra
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Acked-by: Ingo Molnar
    Reviewed-by: Ross Zwisler
    Signed-off-by: Dan Williams

    Dan Williams
     

28 Mar, 2016

1 commit

  • This fix adds a new reference counter (ref_netlink) for the struct ip_set.
    The other reference counter (ref) can be swapped out by ip_set_swap and we
    need a separate counter to keep track of references for netlink events
    like dump. Using the same ref counter for dump causes a race condition
    which can be demonstrated by the following script:

    ipset create hash_ip1 hash:ip family inet hashsize 1024 maxelem 500000 \
    counters
    ipset create hash_ip2 hash:ip family inet hashsize 300000 maxelem 500000 \
    counters
    ipset create hash_ip3 hash:ip family inet hashsize 1024 maxelem 500000 \
    counters

    ipset save &

    ipset swap hash_ip3 hash_ip2
    ipset destroy hash_ip3 /* will crash the machine */

    Swap will exchange the values of ref so destroy will see ref = 0 instead of
    ref = 1. With this fix in place swap will not succeed because ipset save
    still has ref_netlink on the set (ip_set_swap doesn't swap ref_netlink).

    Both delete and swap will error out if ref_netlink != 0 on the set.
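
    A simplified model of the two counters, assuming nothing about the
    real struct ip_set beyond what is described above. The type and
    function names are invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct set_sketch {
    unsigned int ref;          /* ordinary references; exchanged by swap */
    unsigned int ref_netlink;  /* in-flight netlink dumps; never swapped */
};

/* Swap refuses to run while either set is being dumped. */
static bool set_swap(struct set_sketch *a, struct set_sketch *b)
{
    assert(a != NULL && b != NULL);
    if (a->ref_netlink || b->ref_netlink)
        return false;

    unsigned int tmp = a->ref;
    a->ref = b->ref;
    b->ref = tmp;
    return true;
}

/* Destroy is only allowed once both counters have dropped to zero. */
static bool set_destroy(const struct set_sketch *s)
{
    assert(s != NULL);
    return s->ref == 0 && s->ref_netlink == 0;
}
```

    While a dump holds ref_netlink on a set, both operations fail, so the
    set cannot be torn down underneath the dump.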

    Note: The changes to *_head functions are because previously we would
    increment ref whenever we called these functions; we don't do that
    anymore.

    Reviewed-by: Joshua Hunt
    Signed-off-by: Vishwanath Pai
    Signed-off-by: Jozsef Kadlecsik
    Signed-off-by: Pablo Neira Ayuso

    Vishwanath Pai
     

27 Mar, 2016

3 commits

  • Pull Ceph updates from Sage Weil:
    "There is quite a bit here, including some overdue refactoring and
    cleanup on the mon_client and osd_client code from Ilya, scattered
    writeback support for CephFS and a pile of bug fixes from Zheng, and a
    few random cleanups and fixes from others"

    [ I already decided not to pull this because of it having been rebased
    recently, but ended up changing my mind after all. Next time I'll
    really hold people to it. Oh well. - Linus ]

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (34 commits)
    libceph: use KMEM_CACHE macro
    ceph: use kmem_cache_zalloc
    rbd: use KMEM_CACHE macro
    ceph: use lookup request to revalidate dentry
    ceph: kill ceph_get_dentry_parent_inode()
    ceph: fix security xattr deadlock
    ceph: don't request vxattrs from MDS
    ceph: fix mounting same fs multiple times
    ceph: remove unnecessary NULL check
    ceph: avoid updating directory inode's i_size accidentally
    ceph: fix race during filling readdir cache
    libceph: use sizeof_footer() more
    ceph: kill ceph_empty_snapc
    ceph: fix a wrong comparison
    ceph: replace CURRENT_TIME by current_fs_time()
    ceph: scattered page writeback
    libceph: add helper that duplicates last extent operation
    libceph: enable large, variable-sized OSD requests
    libceph: osdc->req_mempool should be backed by a slab pool
    libceph: make r_request msg_size calculation clearer
    ...

    Linus Torvalds
     
  • This series fixes bugs in nfs and ext4 due to 4bacc9c9234c ("overlayfs:
    Make f_path always point to the overlay and f_inode to the underlay").

    Regular files opened on overlayfs will result in the file being opened on
    the underlying filesystem, while f_path points to the overlayfs
    mount/dentry.

    This confuses filesystems which get the dentry from struct file and assume
    it's theirs.

    Add a new helper, file_dentry() [*], to get the filesystem's own dentry
    from the file. This checks file->f_path.dentry->d_flags against
    DCACHE_OP_REAL, and returns file->f_path.dentry if DCACHE_OP_REAL is not
    set (this is the common, non-overlayfs case).

    In the uncommon case it will call into overlayfs's ->d_real() to get the
    underlying dentry, matching file_inode(file).

    The reason we need to check against the inode is that if the file is copied
    up while being open, d_real() would return the upper dentry, while the open
    file comes from the lower dentry.
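
    The lookup described above can be modelled in userspace roughly as
    follows. The structs are cut-down mock-ups, DCACHE_OP_REAL has an
    arbitrary value here, and file_dentry_sketch() and overlay_d_real()
    are illustrative names rather than the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

#define DCACHE_OP_REAL 0x1u   /* illustrative value, not the kernel's */

struct inode { int ino; };

struct dentry {
    unsigned int d_flags;
    struct inode *d_inode;
    /* Overlay-only hook: return the underlying dentry matching inode. */
    struct dentry *(*d_real)(struct dentry *, struct inode *);
    struct dentry *lower, *upper;   /* toy overlay layers */
};

struct file {
    struct dentry *f_dentry;   /* stands in for f_path.dentry */
    struct inode *f_inode;     /* stands in for file_inode(file) */
};

/* Toy d_real: pick the layer whose inode matches the open file. */
static struct dentry *overlay_d_real(struct dentry *d, struct inode *inode)
{
    if (d->upper && d->upper->d_inode == inode)
        return d->upper;
    return d->lower;
}

static struct dentry *file_dentry_sketch(const struct file *file)
{
    struct dentry *d = file->f_dentry;

    assert(d != NULL);
    if (d->d_flags & DCACHE_OP_REAL)
        return d->d_real(d, file->f_inode);
    return d;                  /* common, non-overlayfs case */
}
```

    Passing the inode matters in the copy-up case: a file opened on the
    lower layer keeps resolving to the lower dentry even after an upper
    dentry appears.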

    [*] If possible, it's better simply to use file_inode() instead.

    Signed-off-by: Miklos Szeredi
    Signed-off-by: Theodore Ts'o
    Tested-by: Goldwyn Rodrigues
    Reviewed-by: Trond Myklebust
    Cc: # v4.2
    Cc: David Howells
    Cc: Al Viro
    Cc: Daniel Axtens

    Miklos Szeredi
     
  • Pull NTB bug fixes from Jon Mason:
    "NTB bug fixes for a tasklet spinning forever, link errors,
    translation window setup, NULL ptr dereference, and ntb-perf errors.

    Also, a modification to the driver API that makes _addr functions
    optional"

    * tag 'ntb-4.6' of git://github.com/jonmason/ntb:
    NTB: Remove _addr functions from ntb_hw_amd
    NTB: Make _addr functions optional in the API
    NTB: Fix incorrect clean up routine in ntb_perf
    NTB: Fix incorrect return check in ntb_perf
    ntb: fix possible NULL dereference
    ntb: add missing setup of translation window
    ntb: stop link work when we do not have memory
    ntb: stop tasklet from spinning forever during shutdown.
    ntb: perf test: fix address space confusion

    Linus Torvalds
     

26 Mar, 2016

8 commits

  • Implement the stack depot and provide CONFIG_STACKDEPOT. Stack depot
    will allow KASAN to store allocation/deallocation stack traces for memory
    chunks. The stack traces are stored in a hash table and referenced by
    handles which reside in the kasan_alloc_meta and kasan_free_meta
    structures in the allocated memory chunks.

    IRQ stack traces are cut below the IRQ entry point to avoid unnecessary
    duplication.

    Right now stackdepot support is only enabled in the SLAB allocator. Once
    KASAN features in SLAB are on par with those in SLUB we can switch SLUB
    to stackdepot as well, thus removing the dependency on SLUB stack
    bookkeeping, which wastes a lot of memory.

    This patch is based on the "mm: kasan: stack depots" patch originally
    prepared by Dmitry Chernenkov.

    Joonsoo has said that he plans to reuse the stackdepot code for the
    mm/page_owner.c debugging facility.

    [akpm@linux-foundation.org: s/depot_stack_handle/depot_stack_handle_t]
    [aryabinin@virtuozzo.com: comment style fixes]
    Signed-off-by: Alexander Potapenko
    Signed-off-by: Andrey Ryabinin
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Andrey Konovalov
    Cc: Dmitry Vyukov
    Cc: Steven Rostedt
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
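
The idea of storing each distinct stack trace once in a hash table and handing out a small handle can be sketched in userspace as follows. This is a minimal model under stated assumptions, not the kernel's stack depot: the names depot_handle, depot_save() and depot_fetch(), the fixed table sizes, and the FNV-1a hash are all choices made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEPOT_BUCKETS     64
#define DEPOT_MAX_RECORDS 128
#define DEPOT_MAX_FRAMES  16

typedef uint32_t depot_handle;   /* 0 means "no trace stored" */

struct depot_record {
    uint32_t hash;
    uint32_t nr_frames;
    uintptr_t frames[DEPOT_MAX_FRAMES];
    depot_handle next;           /* next record in the same bucket */
};

static struct depot_record records[DEPOT_MAX_RECORDS + 1]; /* slot 0 unused */
static depot_handle buckets[DEPOT_BUCKETS];
static depot_handle nr_records;

/* FNV-1a over the frame addresses (an illustrative choice of hash). */
static uint32_t hash_frames(const uintptr_t *frames, uint32_t n)
{
    uint32_t h = 2166136261u;

    for (uint32_t i = 0; i < n; i++) {
        uintptr_t v = frames[i];

        for (size_t b = 0; b < sizeof(v); b++) {
            h ^= (uint32_t)(v & 0xff);
            h *= 16777619u;
            v >>= 8;
        }
    }
    return h;
}

/* Return the handle of an identical stored trace, or store a new one. */
static depot_handle depot_save(const uintptr_t *frames, uint32_t n)
{
    if (n == 0 || n > DEPOT_MAX_FRAMES)
        return 0;

    uint32_t hash = hash_frames(frames, n);
    depot_handle *bucket = &buckets[hash % DEPOT_BUCKETS];

    for (depot_handle h = *bucket; h != 0; h = records[h].next) {
        const struct depot_record *r = &records[h];

        if (r->hash == hash && r->nr_frames == n &&
            memcmp(r->frames, frames, n * sizeof(*frames)) == 0)
            return h;            /* duplicate trace: reuse the handle */
    }

    if (nr_records == DEPOT_MAX_RECORDS)
        return 0;                /* depot full */

    depot_handle h = ++nr_records;
    assert(h <= DEPOT_MAX_RECORDS);
    records[h].hash = hash;
    records[h].nr_frames = n;
    memcpy(records[h].frames, frames, n * sizeof(*frames));
    records[h].next = *bucket;   /* push onto the bucket chain */
    *bucket = h;
    return h;
}

/* Look a trace up by handle; NULL for an invalid handle. */
static const struct depot_record *depot_fetch(depot_handle h)
{
    return (h != 0 && h <= nr_records) ? &records[h] : NULL;
}
```

Per-object metadata then only needs to hold a 32-bit handle instead of a full trace, and identical traces cost no extra storage.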
     
  • KASAN needs to know whether the allocation happens in an IRQ handler.
    This lets us strip everything below the IRQ entry point to reduce the
    number of unique stack traces that need to be stored.

    Move the definition of __irq_entry to <linux/interrupt.h> so that the
    users don't need to pull in <linux/ftrace.h>. Also introduce the
    __softirq_entry macro which is similar to __irq_entry, but puts the
    corresponding functions into the .softirqentry.text section.

    Signed-off-by: Alexander Potapenko
    Acked-by: Steven Rostedt
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Andrey Konovalov
    Cc: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
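
Why dedicated text sections help can be shown with a small userspace sketch: once IRQ and softirq entry functions live in known address ranges, a trace can be cut at the first frame that falls inside one of them. The names text_range and trim_below_irq_entry() are hypothetical, and the sketch assumes frames[0] is the innermost frame.

```c
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, end) standing in for a text section. */
struct text_range {
    uintptr_t start;
    uintptr_t end;
};

static int in_range(const struct text_range *r, uintptr_t addr)
{
    return addr >= r->start && addr < r->end;
}

/*
 * Return the number of frames to keep. The entry-point frame itself is
 * kept; everything after it (the interrupted context) is dropped.
 */
static size_t trim_below_irq_entry(const uintptr_t *frames, size_t n,
                                   const struct text_range *irqentry,
                                   const struct text_range *softirqentry)
{
    for (size_t i = 0; i < n; i++) {
        if (in_range(irqentry, frames[i]) ||
            in_range(softirqentry, frames[i]))
            return i + 1;
    }
    return n;   /* not in interrupt context: keep the whole trace */
}
```

Because the interrupted context varies from one interrupt to the next, dropping it makes traces taken in the same handler compare equal and deduplicate.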
     
  • Add GFP flags to KASAN hooks for future patches to use.

    This patch is based on the "mm: kasan: unified support for SLUB and SLAB
    allocators" patch originally prepared by Dmitry Chernenkov.

    Signed-off-by: Alexander Potapenko
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Andrey Konovalov
    Cc: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Cc: Steven Rostedt
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
     
  • Add KASAN hooks to the SLAB allocator.

    This patch is based on the "mm: kasan: unified support for SLUB and SLAB
    allocators" patch originally prepared by Dmitry Chernenkov.

    Signed-off-by: Alexander Potapenko
    Cc: Christoph Lameter
    Cc: Pekka Enberg
    Cc: David Rientjes
    Cc: Joonsoo Kim
    Cc: Andrey Konovalov
    Cc: Dmitry Vyukov
    Cc: Andrey Ryabinin
    Cc: Steven Rostedt
    Cc: Konstantin Serebryany
    Cc: Dmitry Chernenkov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexander Potapenko
     
  • A leftover from commit c32b3cbe0d06 ("oom, PM: make OOM detection in the
    freezer path raceless").

    Signed-off-by: Tetsuo Handa
    Acked-by: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     
  • "oom, oom_reaper: disable oom_reaper for oom_kill_allocating_task" tried
    to protect oom_reaper_list using MMF_OOM_KILLED flag. But we can do it
    by simply checking tsk->oom_reaper_list != NULL.

    Signed-off-by: Tetsuo Handa
    Signed-off-by: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tetsuo Handa
     
  • Entries are only added/removed from oom_reaper_list at head so we can
    use a singly linked list and hence save a word in task_struct.

    Signed-off-by: Vladimir Davydov
    Signed-off-by: Michal Hocko
    Cc: Tetsuo Handa
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
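
The space argument can be illustrated with a userspace sketch. When a list is only ever touched at its head, one "next" pointer per node is enough, whereas a doubly linked node carries two pointers. The type and function names below are hypothetical and no locking is modelled.

```c
#include <stddef.h>

/* Doubly linked node, as a generic list head would embed. */
struct dlink {
    struct dlink *next;
    struct dlink *prev;
};

/* Singly linked node: sufficient when adds and removes happen at the head. */
struct slink {
    struct slink *next;
};

struct task_model {
    int pid;
    struct slink reaper_link;    /* one word instead of two */
};

static struct slink *reaper_head;

/* Add a task at the head of the list. */
static void reaper_push(struct task_model *tsk)
{
    tsk->reaper_link.next = reaper_head;
    reaper_head = &tsk->reaper_link;
}

/* Remove and return the task at the head, or NULL if the list is empty. */
static struct task_model *reaper_pop(void)
{
    struct slink *link = reaper_head;

    if (!link)
        return NULL;
    reaper_head = link->next;
    link->next = NULL;
    /* Recover the containing task from its embedded link. */
    return (struct task_model *)((char *)link -
                                 offsetof(struct task_model, reaper_link));
}
```

The trade-off is that removal from the middle of the list is no longer constant time, which does not matter for a list that is only modified at the head.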
     
  • Tetsuo has reported that oom_kill_allocating_task=1 will cause
    oom_reaper_list corruption because oom_kill_process doesn't follow
    standard OOM exclusion (i.e. it ignores TIF_MEMDIE) and allows the same
    task to be enqueued multiple times, e.g. by sacrificing the same child
    multiple times.

    This patch fixes the issue by introducing a new MMF_OOM_KILLED mm flag
    which is set in oom_kill_process atomically and oom reaper is disabled
    if the flag was already set.

    Signed-off-by: Michal Hocko
    Reported-by: Tetsuo Handa
    Cc: David Rientjes
    Cc: Mel Gorman
    Cc: Oleg Nesterov
    Cc: Hugh Dickins
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
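
The "set atomically, act only if it was not already set" pattern from this commit message can be sketched with C11 atomics. This is a userspace model, not the kernel code: mm_model, test_and_set_flag() and oom_kill_model() are hypothetical names, and the bit index is a placeholder.

```c
#include <stdatomic.h>

#define MMF_OOM_KILLED_BIT 0    /* placeholder bit index */

struct mm_model {
    atomic_ulong flags;
};

/* Atomically set a bit and report whether it was already set. */
static int test_and_set_flag(struct mm_model *mm, unsigned int bit)
{
    unsigned long mask = 1UL << bit;

    return (atomic_fetch_or(&mm->flags, mask) & mask) != 0;
}

/*
 * Model of the kill path: only the first caller for a given mm gets to
 * hand it to the reaper. Returns 1 if the reaper was woken, 0 otherwise.
 */
static int oom_kill_model(struct mm_model *mm, int *reaper_wakeups)
{
    if (test_and_set_flag(mm, MMF_OOM_KILLED_BIT))
        return 0;               /* already killed once: skip the reaper */

    (*reaper_wakeups)++;
    return 1;
}
```

Because the read and the write happen in one atomic operation, two concurrent callers cannot both observe the flag as clear, so the same mm is never queued twice.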