26 Nov, 2018

2 commits

  • syzbot was able to trigger the WARN in cttimeout_default_get() by
    passing UDPLITE as l4protocol. Alias UDPLITE to UDP, as both use the
    same timeout values.

    Furthermore, also fetch GRE timeouts. GRE is a bit more complicated,
    as it can still be a module and its netns_proto_gre struct layout isn't
    visible outside of the gre module. The timeouts can't be moved around:
    conntrack sysctl unregister appears to assume that net_generic() returns
    nf_proto_net, so moving them causes a crash. Expose the layout of
    netns_proto_gre instead.

    A follow-up nf-next patch could make the gre tracker built-in as well
    if needed; it's not that large.

    Last, make the WARN() mention the missing protocol value in case
    anything else is missing.

    Reported-by: syzbot+2fae8fa157dd92618cae@syzkaller.appspotmail.com
    Fixes: 8866df9264a3 ("netfilter: nfnetlink_cttimeout: pass default timeout policy to obj_to_nlattr")
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
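
The aliasing described above can be pictured as a lookup that lets UDPLITE fall through to the UDP table. This is a minimal user-space sketch, not the kernel code: the function name, the table names, and the timeout values are hypothetical, and only the protocol numbers are the standard IANA ones.

```c
#include <assert.h>
#include <stddef.h>

/* IANA protocol numbers (same values as the IPPROTO_* constants). */
enum { PROTO_TCP = 6, PROTO_UDP = 17, PROTO_GRE = 47, PROTO_UDPLITE = 136 };

/* Hypothetical default timeouts in seconds; values are illustrative only. */
static const unsigned int tcp_timeouts[] = { 120 };
static const unsigned int udp_timeouts[] = { 30, 120 };
static const unsigned int gre_timeouts[] = { 30, 180 };

/*
 * Return the default timeout table for a layer-4 protocol, or NULL when
 * the protocol has no table. UDPLITE shares the UDP table because both
 * use the same timeout values.
 */
static const unsigned int *default_timeouts(int l4proto)
{
	assert(l4proto >= 0 && l4proto <= 255); /* protocol field is 8 bits */

	switch (l4proto) {
	case PROTO_TCP:
		return tcp_timeouts;
	case PROTO_UDPLITE: /* alias of UDP */
	case PROTO_UDP:
		return udp_timeouts;
	case PROTO_GRE:
		return gre_timeouts;
	default:
		return NULL; /* caller can report the missing protocol number */
	}
}
```

A caller that gets NULL back is the place where a warning naming the unknown protocol value would be emitted.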
     
  • ip_vs_dst_event is supposed to clean up all dst used in ipvs'
    destinations when a net dev is going down. But it works only
    when the dst's dev is the same as the dev from the event.

    Now with the same priority but late registration,
    ip_vs_dst_notifier is always called later than ipv6_dev_notf
    where the dst's dev is set to lo for NETDEV_DOWN event.

    As the dst's dev lo is not the same as the dev from the event
    in ip_vs_dst_event, ip_vs_dst_notifier doesn't actually work.
    These dst then have to wait for dest_trash_timer to clean
    them up, which causes some non-permanent kernel warnings:

    unregister_netdevice: waiting for br0 to become free. Usage count = 3

    To fix it, call ip_vs_dst_notifier earlier than ipv6_dev_notf
    by increasing its priority to ADDRCONF_NOTIFY_PRIORITY + 5.

    Note that for ipv4 route fib_netdev_notifier doesn't set dst's
    dev to lo in NETDEV_DOWN event, so this fix is only needed when
    IP_VS_IPV6 is defined.

    Fixes: 7a4f0761fce3 ("IPVS: init and cleanup restructuring")
    Reported-by: Li Shuang
    Signed-off-by: Xin Long
    Acked-by: Julian Anastasov
    Acked-by: Simon Horman
    Signed-off-by: Pablo Neira Ayuso

    Xin Long
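
The ordering effect can be modelled with a small priority-sorted chain: higher priority runs first, and among equal priorities the earlier registration stays ahead. This is a user-space sketch under assumptions; BASE_PRIORITY stands in for ADDRCONF_NOTIFY_PRIORITY (its value here is assumed), and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NOTIFIERS 8
#define BASE_PRIORITY 0 /* stand-in for ADDRCONF_NOTIFY_PRIORITY; value assumed */

struct notifier {
	const char *name;
	int priority;
};

struct chain {
	struct notifier *slots[MAX_NOTIFIERS];
	size_t count;
};

/*
 * Insert so that higher priority is called first. Entries with equal
 * priority keep registration order, so a late registration runs later.
 */
static void chain_register(struct chain *c, struct notifier *n)
{
	size_t pos = c->count;

	assert(c->count < MAX_NOTIFIERS);
	/* Shift strictly lower-priority entries back by one slot. */
	while (pos > 0 && c->slots[pos - 1]->priority < n->priority) {
		c->slots[pos] = c->slots[pos - 1];
		pos--;
	}
	c->slots[pos] = n;
	c->count++;
}

/* Call position of a notifier in the chain, or -1 if it is not registered. */
static int chain_position(const struct chain *c, const struct notifier *n)
{
	for (size_t i = 0; i < c->count; i++)
		if (c->slots[i] == n)
			return (int)i;
	return -1;
}
```

Registering a second notifier at BASE_PRIORITY after the first leaves it in position 1; registering it at BASE_PRIORITY + 5 moves it to position 0, which is the effect the fix relies on.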
     

18 Nov, 2018

1 commit


13 Nov, 2018

2 commits

  • nft_compat ops do not have static storage duration, unlike all other
    expressions.

    When nf_tables_expr_destroy() returns, expr->ops might have been
    free'd already, so we need to store next address before calling
    expression destructor.

    For same reason, we can't deref match pointer after nft_xt_put().

    This can be easily reproduced by adding msleep() before
    nft_match_destroy() returns.

    Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
    Reported-by: Pablo Neira Ayuso
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
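
The general pattern behind this fix is to read everything needed from an object before calling a destructor that may free it. The sketch below shows it on a plain singly linked list in user space; the types and function names are hypothetical and only illustrate the ordering, not the nf_tables code.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
	struct node *next;
	int id;
};

/* Destructor that frees the node, so the pointer is dead once it returns. */
static void node_destroy(struct node *n)
{
	free(n);
}

/* Build a list with ids 0..count-1; returns NULL for count 0. */
static struct node *list_build(int count)
{
	struct node *head = NULL;

	assert(count >= 0);
	for (int i = count - 1; i >= 0; i--) {
		struct node *n = malloc(sizeof(*n));

		if (!n)
			abort(); /* keep the sketch simple: no partial cleanup */
		n->id = i;
		n->next = head;
		head = n;
	}
	return head;
}

/*
 * Destroy every node and return how many were destroyed. The successor
 * is fetched before the destructor runs, because n->next must not be
 * read after n has been freed.
 */
static int list_destroy_all(struct node *head)
{
	int destroyed = 0;
	struct node *n = head;

	while (n) {
		struct node *next = n->next; /* read before n is freed */

		node_destroy(n);
		n = next;
		destroyed++;
	}
	return destroyed;
}
```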
     
  • xt_rateest_net_exit() was added to check whether rules are flushed
    successfully. However, the ->net_exit() callback is called earlier than
    the ->destroy() callback, so ->net_exit() can't check that.

    test commands:
    %ip netns add vm1
    %ip netns exec vm1 iptables -t mangle -I PREROUTING -p udp \
    --dport 1111 -j RATEEST --rateest-name ap \
    --rateest-interval 250ms --rateest-ewma 0.5s
    %ip netns del vm1

    splat looks like:
    [ 668.813518] WARNING: CPU: 0 PID: 87 at net/netfilter/xt_RATEEST.c:210 xt_rateest_net_exit+0x210/0x340 [xt_RATEEST]
    [ 668.813518] Modules linked in: xt_RATEEST xt_tcpudp iptable_mangle bpfilter ip_tables x_tables
    [ 668.813518] CPU: 0 PID: 87 Comm: kworker/u4:2 Not tainted 4.19.0-rc7+ #21
    [ 668.813518] Workqueue: netns cleanup_net
    [ 668.813518] RIP: 0010:xt_rateest_net_exit+0x210/0x340 [xt_RATEEST]
    [ 668.813518] Code: 00 48 8b 85 30 ff ff ff 4c 8b 23 80 38 00 0f 85 24 01 00 00 48 8b 85 30 ff ff ff 4d 85 e4 4c 89 a5 58 ff ff ff c6 00 f8 74 b2 0b 48 83 c3 08 4c 39 f3 75 b0 48 b8 00 00 00 00 00 fc ff df 49
    [ 668.813518] RSP: 0018:ffff8801156c73f8 EFLAGS: 00010282
    [ 668.813518] RAX: ffffed0022ad8e85 RBX: ffff880118928e98 RCX: 5db8012a00000000
    [ 668.813518] RDX: ffff8801156c7428 RSI: 00000000cb1d185f RDI: ffff880115663b74
    [ 668.813518] RBP: ffff8801156c74d0 R08: ffff8801156633c0 R09: 1ffff100236440be
    [ 668.813518] R10: 0000000000000001 R11: ffffed002367d852 R12: ffff880115142b08
    [ 668.813518] R13: 1ffff10022ad8e81 R14: ffff880118928ea8 R15: dffffc0000000000
    [ 668.813518] FS: 0000000000000000(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000
    [ 668.813518] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 668.813518] CR2: 0000563aa69f4f28 CR3: 0000000105a16000 CR4: 00000000001006f0
    [ 668.813518] Call Trace:
    [ 668.813518] ? unregister_netdevice_many+0xe0/0xe0
    [ 668.813518] ? xt_rateest_net_init+0x2c0/0x2c0 [xt_RATEEST]
    [ 668.813518] ? default_device_exit+0x1ca/0x270
    [ 668.813518] ? remove_proc_entry+0x1cd/0x390
    [ 668.813518] ? dev_change_net_namespace+0xd00/0xd00
    [ 668.813518] ? __init_waitqueue_head+0x130/0x130
    [ 668.813518] ops_exit_list.isra.10+0x94/0x140
    [ 668.813518] cleanup_net+0x45b/0x900
    [ 668.813518] ? net_drop_ns+0x110/0x110
    [ 668.813518] ? swapgs_restore_regs_and_return_to_usermode+0x3c/0x80
    [ 668.813518] ? save_trace+0x300/0x300
    [ 668.813518] ? lock_acquire+0x196/0x470
    [ 668.813518] ? lock_acquire+0x196/0x470
    [ 668.813518] ? process_one_work+0xb60/0x1de0
    [ 668.813518] ? _raw_spin_unlock_irq+0x29/0x40
    [ 668.813518] ? _raw_spin_unlock_irq+0x29/0x40
    [ 668.813518] ? __lock_acquire+0x4500/0x4500
    [ 668.813518] ? __lock_is_held+0xb4/0x140
    [ 668.813518] process_one_work+0xc13/0x1de0
    [ 668.813518] ? pwq_dec_nr_in_flight+0x3c0/0x3c0
    [ 668.813518] ? set_load_weight+0x270/0x270
    [ ... ]

    Fixes: 3427b2ab63fa ("netfilter: make xt_rateest hash table per net")
    Signed-off-by: Taehee Yoo
    Signed-off-by: Pablo Neira Ayuso

    Taehee Yoo
     

12 Nov, 2018

20 commits

  • It's possible to set both HANDLE and POSITION when replacing a rule.
    In this case, the rule at POSITION gets replaced using the
    userspace-provided handle. Rule handles are supposed to be generated
    by the kernel only.

    Duplicate handles should be harmless, but it is better to disable this
    "feature" by only checking for the POSITION attribute on insert operations.

    Fixes: 5e94846686d0 ("netfilter: nf_tables: add insert operation")
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • Start flood ping for each cpu while loading/flushing rulesets to make
    sure we do not access already-free'd rules from nf_tables evaluation loop.

    Also add this to TARGETS so 'make run_tests' in selftest dir runs it
    automatically.

    This would have caught the bug fixed in previous change
    ("netfilter: nf_tables: do not skip inactive chains during generation update")
    sooner.

    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
     
  • There is no synchronization between packet path and the configuration plane.

    The packet path uses two arrays with rules, one contains the current (active)
    generation. The other either contains the last (obsolete) generation or
    the future one.

    Consider:
    cpu1                    cpu2
                            nft_do_chain(c);
    delete c
    net->gen++;
                            genbit = !!net->gen;
                            rules = c->rg[genbit];

    cpu1 ignores c when updating if c is not active anymore in the new
    generation.

    On cpu2, we now use rules from wrong generation, as c->rg[old]
    contains the rules matching 'c' whereas c->rg[new] was not updated and
    can even point to rules that have been free'd already, causing a crash.

    To fix this, make sure that the 'current' and 'next' generations are
    identical for chains that are going away, so that c->rg[new] will just
    use the matching rules even if genbit was incremented already.

    Fixes: 0cbc06b3faba7 ("netfilter: nf_tables: remove synchronize_rcu in commit phase")
    Signed-off-by: Florian Westphal
    Signed-off-by: Pablo Neira Ayuso

    Florian Westphal
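
The idea of the fix can be sketched with a two-slot rule array and a one-bit generation cursor. This is a simplified user-space model: the cursor handling, the struct layout, and the function names are assumptions made for illustration and do not mirror the kernel data structures.

```c
#include <assert.h>
#include <stddef.h>

struct chain {
	const int *rg[2]; /* rule arrays, indexed by generation bit */
};

/*
 * Prepare a chain that is being deleted for the generation switch:
 * point the next-generation slot at the current rules, so a reader that
 * samples the cursor after the switch still finds valid rules.
 */
static void chain_prepare_removal(struct chain *c, unsigned int cursor)
{
	assert(cursor <= 1);
	c->rg[cursor ^ 1] = c->rg[cursor];
}

/* What a packet-path reader would dereference for a given cursor value. */
static const int *chain_rules(const struct chain *c, unsigned int cursor)
{
	assert(cursor <= 1);
	return c->rg[cursor];
}
```

With both slots pointing at the same rules, it no longer matters whether the packet path reads the cursor before or after the configuration side flips it.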
     
  • When list->count is 0, the list is deleted by GC. But list->count
    never reaches 0, because the initial count value is 1 and it is increased
    when a node is inserted. So the initial value of list->count should be 0.

    Originally, GC always found a zero-count list by deleting a node and
    decreasing the count. However, a list may be left empty because node
    insertion may fail, e.g. due to an allocation problem. To solve this, the
    GC routine also finds zero-count lists without deleting a node.

    Fixes: cb2b36f5a97d ("netfilter: nf_conncount: Switch to plain list")
    Signed-off-by: Taehee Yoo
    Signed-off-by: Pablo Neira Ayuso

    Taehee Yoo
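
A minimal counting model of the behaviour described above, assuming a list that tracks only its element count. All names are hypothetical; the point is that a count starting at 0 lets GC reclaim a list whose only insertion attempt failed.

```c
#include <assert.h>
#include <stdbool.h>

struct conn_list {
	unsigned int count;
};

/* The count starts at 0, so an untouched list is already reclaimable. */
static void conn_list_init(struct conn_list *l)
{
	l->count = 0;
}

/* Insertion attempt; the count only grows when the allocation succeeded. */
static bool conn_list_add(struct conn_list *l, bool alloc_ok)
{
	if (!alloc_ok)
		return false;
	l->count++;
	return true;
}

/* Remove one node from a non-empty list. */
static void conn_list_del(struct conn_list *l)
{
	assert(l->count > 0);
	l->count--;
}

/* GC check: an empty list is reclaimable even if GC itself removed nothing. */
static bool conn_list_reclaimable(const struct conn_list *l)
{
	return l->count == 0;
}
```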
     
  • nf_conncount_tuple is an element of nft_connlimit and is deleted by
    conn_free(). Elements can be deleted by both the GC routine and the data
    path functions (nf_conncount_lookup, nf_conncount_add), which call
    conn_free() to free elements. But conn_free() only protects lists, not
    each element, so list_del corruption could occur.

    conn_free() doesn't check whether an element is already deleted. In
    order to protect elements, a dead flag is added. If an element is deleted,
    the dead flag is set. Only conn_free() can delete elements, so the
    list lock and the dead flag together are enough to protect them.

    test commands:
    %nft add table ip filter
    %nft add chain ip filter input { type filter hook input priority 0\; }
    %nft add rule filter input meter test { ip id ct count over 2 } counter

    splat looks like:
    [ 1779.495778] list_del corruption, ffff8800b6e12008->prev is LIST_POISON2 (dead000000000200)
    [ 1779.505453] ------------[ cut here ]------------
    [ 1779.506260] kernel BUG at lib/list_debug.c:50!
    [ 1779.515831] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
    [ 1779.516772] CPU: 0 PID: 33 Comm: kworker/0:2 Not tainted 4.19.0-rc6+ #22
    [ 1779.516772] Workqueue: events_power_efficient nft_rhash_gc [nf_tables_set]
    [ 1779.516772] RIP: 0010:__list_del_entry_valid+0xd8/0x150
    [ 1779.516772] Code: 39 48 83 c4 08 b8 01 00 00 00 5b 5d c3 48 89 ea 48 c7 c7 00 c3 5b 98 e8 0f dc 40 ff 0f 0b 48 c7 c7 60 c3 5b 98 e8 01 dc 40 ff 0b 48 c7 c7 c0 c3 5b 98 e8 f3 db 40 ff 0f 0b 48 c7 c7 20 c4 5b
    [ 1779.516772] RSP: 0018:ffff880119127420 EFLAGS: 00010286
    [ 1779.516772] RAX: 000000000000004e RBX: dead000000000200 RCX: 0000000000000000
    [ 1779.516772] RDX: 000000000000004e RSI: 0000000000000008 RDI: ffffed0023224e7a
    [ 1779.516772] RBP: ffff88011934bc10 R08: ffffed002367cea9 R09: ffffed002367cea9
    [ 1779.516772] R10: 0000000000000001 R11: ffffed002367cea8 R12: ffff8800b6e12008
    [ 1779.516772] R13: ffff8800b6e12010 R14: ffff88011934bc20 R15: ffff8800b6e12008
    [ 1779.516772] FS: 0000000000000000(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000
    [ 1779.516772] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 1779.516772] CR2: 00007fc876534010 CR3: 000000010da16000 CR4: 00000000001006f0
    [ 1779.516772] Call Trace:
    [ 1779.516772] conn_free+0x9f/0x2b0 [nf_conncount]
    [ 1779.516772] ? nf_ct_tmpl_alloc+0x2a0/0x2a0 [nf_conntrack]
    [ 1779.516772] ? nf_conncount_add+0x520/0x520 [nf_conncount]
    [ 1779.516772] ? do_raw_spin_trylock+0x1a0/0x1a0
    [ 1779.516772] ? do_raw_spin_trylock+0x10/0x1a0
    [ 1779.516772] find_or_evict+0xe5/0x150 [nf_conncount]
    [ 1779.516772] nf_conncount_gc_list+0x162/0x360 [nf_conncount]
    [ 1779.516772] ? nf_conncount_lookup+0xee0/0xee0 [nf_conncount]
    [ 1779.516772] ? _raw_spin_unlock_irqrestore+0x45/0x50
    [ 1779.516772] ? trace_hardirqs_off+0x6b/0x220
    [ 1779.516772] ? trace_hardirqs_on_caller+0x220/0x220
    [ 1779.516772] nft_rhash_gc+0x16b/0x540 [nf_tables_set]
    [ ... ]

    Fixes: 5c789e131cbb ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
    Signed-off-by: Taehee Yoo
    Signed-off-by: Pablo Neira Ayuso

    Taehee Yoo
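
The dead-flag guard can be sketched on a circular doubly linked list. This is a single-threaded user-space model with hypothetical names; in the kernel the check and the unlink would run under the list lock, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct conn {
	struct conn *prev, *next;
	bool dead;
};

struct conn_list {
	struct conn head; /* sentinel of a circular doubly linked list */
	unsigned int count;
};

static void list_init(struct conn_list *l)
{
	l->head.prev = l->head.next = &l->head;
	l->head.dead = false;
	l->count = 0;
}

/* Add an element right after the sentinel. */
static void list_add(struct conn_list *l, struct conn *c)
{
	c->dead = false;
	c->next = l->head.next;
	c->prev = &l->head;
	l->head.next->prev = c;
	l->head.next = c;
	l->count++;
}

/*
 * Unlink an element at most once. A second caller sees the dead flag and
 * backs off instead of unlinking an already removed element.
 */
static bool conn_free(struct conn_list *l, struct conn *c)
{
	if (c->dead)
		return false;
	assert(l->count > 0);
	c->dead = true;
	c->prev->next = c->next;
	c->next->prev = c->prev;
	c->prev = c->next = NULL;
	l->count--;
	return true;
}
```

The second conn_free() call on the same element returns false without touching the list, which is the situation that previously led to the list_del corruption.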
     
  • conn_free() takes the lock with spin_lock() and is called by both
    nf_conncount_lookup() and nf_conncount_gc_list(). nf_conncount_lookup()
    is called from bottom-half context and nf_conncount_gc_list() from
    process context, so the spin_lock() call is not safe. Hence
    conn_free() should use spin_lock_bh() instead of spin_lock().

    test commands:
    %nft add table ip filter
    %nft add chain ip filter input { type filter hook input priority 0\; }
    %nft add rule filter input meter test { ip saddr ct count over 2 } \
    counter

    splat looks like:
    [ 461.996507] ================================
    [ 461.998999] WARNING: inconsistent lock state
    [ 461.998999] 4.19.0-rc6+ #22 Not tainted
    [ 461.998999] --------------------------------
    [ 461.998999] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
    [ 461.998999] kworker/0:2/134 [HC0[0]:SC0[0]:HE1:SE1] takes:
    [ 461.998999] 00000000a71a559a (&(&list->list_lock)->rlock){+.?.}, at: conn_free+0x69/0x2b0 [nf_conncount]
    [ 461.998999] {IN-SOFTIRQ-W} state was registered at:
    [ 461.998999] _raw_spin_lock+0x30/0x70
    [ 461.998999] nf_conncount_add+0x28a/0x520 [nf_conncount]
    [ 461.998999] nft_connlimit_eval+0x401/0x580 [nft_connlimit]
    [ 461.998999] nft_dynset_eval+0x32b/0x590 [nf_tables]
    [ 461.998999] nft_do_chain+0x497/0x1430 [nf_tables]
    [ 461.998999] nft_do_chain_ipv4+0x255/0x330 [nf_tables]
    [ 461.998999] nf_hook_slow+0xb1/0x160
    [ ... ]
    [ 461.998999] other info that might help us debug this:
    [ 461.998999] Possible unsafe locking scenario:
    [ 461.998999]
    [ 461.998999] CPU0
    [ 461.998999] ----
    [ 461.998999] lock(&(&list->list_lock)->rlock);
    [ 461.998999]
    [ 461.998999] lock(&(&list->list_lock)->rlock);
    [ 461.998999]
    [ 461.998999] *** DEADLOCK ***
    [ 461.998999]
    [ ... ]

    Fixes: 5c789e131cbb ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
    Signed-off-by: Taehee Yoo
    Signed-off-by: Pablo Neira Ayuso

    Taehee Yoo
     
  • Linus Torvalds
     
  • Pull networking fixes from David Miller:
    "One last pull request before heading to Vancouver for LPC, here we have:

    1) Don't forget to free VSI contexts during ice driver unload, from
    Victor Raj.

    2) Don't forget napi delete calls during device remove in ice driver,
    from Dave Ertman.

    3) Don't request VLAN tag insertion of ibmvnic device when SKB
    doesn't have VLAN tags at all.

    4) IPV4 frag handling code has to accommodate the situation where two
    threads try to insert the same fragment into the hash table at the
    same time. From Eric Dumazet.

    5) Relatedly, don't flow separate on protocol ports for fragmented
    frames, also from Eric Dumazet.

    6) Memory leaks in qed driver, from Denis Bolotin.

    7) Correct valid MTU range in smsc95xx driver, from Stefan Wahren.

    8) Validate cls_flower nested policies properly, from Jakub Kicinski.

    9) Clearing of stats counters in mv88e6xxx driver doesn't retain
    important bits in the G1_STATS_OP register causing the chip to
    hang. Fix from Andrew Lunn"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
    act_mirred: clear skb->tstamp on redirect
    net: dsa: mv88e6xxx: Fix clearing of stats counters
    tipc: fix link re-establish failure
    net: sched: cls_flower: validate nested enc_opts_policy to avoid warning
    net: mvneta: correct typo
    flow_dissector: do not dissect l4 ports for fragments
    net: qualcomm: rmnet: Fix incorrect assignment of real_dev
    net: aquantia: allow rx checksum offload configuration
    net: aquantia: invalid checksumm offload implementation
    net: aquantia: fixed enable unicast on 32 macvlan
    net: aquantia: fix potential IOMMU fault after driver unbind
    net: aquantia: synchronized flow control between mac/phy
    net: smsc95xx: Fix MTU range
    net: stmmac: Fix RX packet size > 8191
    qed: Fix potential memory corruption
    qed: Fix SPQ entries not returned to pool in error flows
    qed: Fix blocking/unlimited SPQ entries leak
    qed: Fix memory/entry leak in qed_init_sp_request()
    inet: frags: better deal with smp races
    net: hns3: bugfix for not checking return value
    ...

    Linus Torvalds
     
  • …masahiroy/linux-kbuild

    Pull Kbuild fixes from Masahiro Yamada:

    - fix build errors in binrpm-pkg and bindeb-pkg targets

    - fix false positive matches in merge_config.sh

    - fix build version mismatch in deb-pkg target

    - fix dtbs_install handling in (bin)deb-pkg target

    - revert a commit that allows setlocalversion to write to source tree

    * tag 'kbuild-fixes-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
    builddeb: Fix inclusion of dtbs in debian package
    Revert "scripts/setlocalversion: git: Make -dirty check more robust"
    kbuild: deb-pkg: fix too low build version number
    kconfig: merge_config: avoid false positive matches from comment lines
    kbuild: deb-pkg: fix bindeb-pkg breakage when O= is used
    kbuild: rpm-pkg: fix binrpm-pkg breakage when O= is used

    Linus Torvalds
     
  • Pull btrfs fixes from David Sterba:
    "Several fixes to recent release (4.19, fixes tagged for stable) and
    other fixes"

    * tag 'for-4.20-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
    Btrfs: fix missing delayed iputs on unmount
    Btrfs: fix data corruption due to cloning of eof block
    Btrfs: fix infinite loop on inode eviction after deduplication of eof block
    Btrfs: fix deadlock on tree root leaf when finding free extent
    btrfs: avoid link error with CONFIG_NO_AUTO_INLINE
    btrfs: tree-checker: Fix misleading group system information
    Btrfs: fix missing data checksums after a ranged fsync (msync)
    btrfs: fix pinned underflow after transaction aborted
    Btrfs: fix cur_offset in the error case for nocow

    Linus Torvalds
     
  • Pull ext4 fixes from Ted Ts'o:
    "A large number of ext4 bug fixes, mostly buffer and memory leaks on
    error return cleanup paths"

    * tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
    ext4: missing !bh check in ext4_xattr_inode_write()
    ext4: fix buffer leak in __ext4_read_dirblock() on error path
    ext4: fix buffer leak in ext4_expand_extra_isize_ea() on error path
    ext4: fix buffer leak in ext4_xattr_move_to_block() on error path
    ext4: release bs.bh before re-using in ext4_xattr_block_find()
    ext4: fix buffer leak in ext4_xattr_get_block() on error path
    ext4: fix possible leak of s_journal_flag_rwsem in error path
    ext4: fix possible leak of sbi->s_group_desc_leak in error path
    ext4: remove unneeded brelse call in ext4_xattr_inode_update_ref()
    ext4: avoid possible double brelse() in add_new_gdb() on error path
    ext4: avoid buffer leak in ext4_orphan_add() after prior errors
    ext4: avoid buffer leak on shutdown in ext4_mark_iloc_dirty()
    ext4: fix possible inode leak in the retry loop of ext4_resize_fs()
    ext4: fix missing cleanup if ext4_alloc_flex_bg_array() fails while resizing
    ext4: add missing brelse() update_backups()'s error path
    ext4: add missing brelse() add_new_gdb_meta_bg()'s error path
    ext4: add missing brelse() in set_flexbg_block_bitmap()'s error path
    ext4: avoid potential extra brelse in setup_new_flex_group_blocks()

    Linus Torvalds
     
  • Pull x86 fixes from Thomas Gleixner:
    "A set of x86 fixes:

    - Cure the LDT remapping to user space on 5 level paging which ended
    up in the KASLR space

    - Remove LDT mapping before freeing the LDT pages

    - Make NFIT MCE handling more robust

    - Unbreak the VSMP build by removing the dependency on paravirt ops

    - Support broken PIT emulation on Microsoft hyperV

    - Don't trace vmware_sched_clock() to avoid tracer recursion

    - Remove -pipe from KBUILD CFLAGS which breaks clang and is also
    slower on GCC

    - Trivial coding style and typo fixes"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/cpu/vmware: Do not trace vmware_sched_clock()
    x86/vsmp: Remove dependency on pv_irq_ops
    x86/ldt: Remove unused variable in map_ldt_struct()
    x86/ldt: Unmap PTEs for the slot before freeing LDT pages
    x86/mm: Move LDT remap out of KASLR region on 5-level paging
    acpi/nfit, x86/mce: Validate a MCE's address before using it
    acpi/nfit, x86/mce: Handle only uncorrectable machine checks
    x86/build: Remove -pipe from KBUILD_CFLAGS
    x86/hyper-v: Fix indentation in hv_do_fast_hypercall16()
    Documentation/x86: Fix typo in zero-page.txt
    x86/hyper-v: Enable PIT shutdown quirk
    clockevents/drivers/i8253: Add support for PIT shutdown quirk

    Linus Torvalds
     
  • Pull perf fixes from Thomas Gleixner:
    "A bunch of perf tooling fixes:

    - Make the Intel PT SQL viewer more robust

    - Make the Intel PT debug log more useful

    - Support weak groups in perf record so it's behaving the same way as
    perf stat

    - Display the LBR stats in callchain entries properly in perf top

    - Handle different PMU names with common prefix properly in perf
    stat

    - Start syscall augmenting in perf trace. Preparation for
    architecture independent eBPF instrumentation of syscalls.

    - Fix build breakage in JVMTI perf lib

    - Fix arm64 tools build failure wrt smp_load_{acquire,release}"

    * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf tools: Do not zero sample_id_all for group members
    perf tools: Fix undefined symbol scnprintf in libperf-jvmti.so
    perf beauty: Use SRCARCH, ARCH=x86_64 must map to "x86" to find the headers
    perf intel-pt: Add MTC and CYC timestamps to debug log
    perf intel-pt: Add more event information to debug log
    perf scripts python: exported-sql-viewer.py: Fix table find when table re-ordered
    perf scripts python: exported-sql-viewer.py: Add help window
    perf scripts python: exported-sql-viewer.py: Add Selected branches report
    perf scripts python: exported-sql-viewer.py: Fall back to /usr/local/lib/libxed.so
    perf top: Display the LBR stats in callchain entry
    perf stat: Handle different PMU names with common prefix
    perf record: Support weak groups
    perf evlist: Move perf_evsel__reset_weak_group into evlist
    perf augmented_syscalls: Start collecting pathnames in the BPF program
    perf trace: Fix setting of augmented payload when using eBPF + raw_syscalls
    perf trace: When augmenting raw_syscalls plug raw_syscalls:sys_exit too
    perf examples bpf: Start augmenting raw_syscalls:sys_{start,exit}
    tools headers barrier: Fix arm64 tools build failure wrt smp_load_{acquire,release}

    Linus Torvalds
     
  • Pull timer fix from Thomas Gleixner:
    "Just the removal of a redundant call into the sched deadline overrun
    check"

    * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    posix-cpu-timers: Remove useless call to check_dl_overrun()

    Linus Torvalds
     
  • Pull scheduler fixes from Thomas Gleixner:
    "Two small scheduler fixes:

    - Take hotplug lock in sched_init_smp(). Technically not really
    required, but lockdep will complain otherwise.

    - Trivial comment fix in sched/fair"

    * 'sched/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    sched/fair: Fix a comment in task_numa_fault()
    sched/core: Take the hotplug lock in sched_init_smp()

    Linus Torvalds
     
  • Pull locking build fix from Thomas Gleixner:
    "A single fix for a build fail with CONFIG_PROFILE_ALL_BRANCHES=y in
    the qspinlock code"

    * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/qspinlock: Fix compile error

    Linus Torvalds
     
  • Pull core fixes from Thomas Gleixner:
    "A couple of fixlets for the core:

    - Kernel doc function documentation fixes

    - Missing prototypes for weak watchdog functions"

    * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    resource/docs: Complete kernel-doc style function documentation
    watchdog/core: Add missing prototypes for weak functions
    resource/docs: Fix new kernel-doc warnings

    Linus Torvalds
     
  • If sch_fq is used at ingress, skbs that might have been
    timestamped by net_timestamp_set() if a packet capture
    is requesting timestamps could be delayed by arbitrary
    amount of time, since sch_fq time base is MONOTONIC.

    Fix this problem by moving code from sch_netem.c to act_mirred.c.

    Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC")
    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • The mv88e6161 would sometimes fail to probe with a timeout waiting for
    the switch to complete an operation. This operation is supposed to
    clear the statistics counters. However, due to a read/modify/write,
    without the needed mask, the operation actually carried out was more
    random, with invalid parameters, resulting in the switch not
    responding. We need to preserve the histogram mode bits, so apply a
    mask to keep them.

    Reported-by: Chris Healy
    Fixes: 40cff8fca9e3 ("net: dsa: mv88e6xxx: Fix stats histogram mode")
    Signed-off-by: Andrew Lunn
    Signed-off-by: David S. Miller

    Andrew Lunn
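
The masked read/modify/write can be sketched as follows. The bit positions are hypothetical, since the text does not give the real register layout; only the technique (keep the histogram mode bits, replace everything else with the command) is taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register layout; the real bit positions are not given here. */
#define STATS_OP_BUSY      0x8000u
#define STATS_OP_FLUSH_ALL 0x1000u
#define STATS_OP_HIST_MASK 0x0c00u /* histogram mode bits to preserve */

/*
 * Build the value to write for a "flush all counters" command: keep only
 * the histogram mode bits of the current register value, then add the
 * command and busy bits. Stale bits from the read are dropped.
 */
static uint16_t stats_flush_value(uint16_t current)
{
	uint16_t val = current & STATS_OP_HIST_MASK;

	val |= STATS_OP_BUSY | STATS_OP_FLUSH_ALL;
	/* Nothing outside the known fields may leak into the command. */
	assert((val & ~(STATS_OP_HIST_MASK | STATS_OP_BUSY | STATS_OP_FLUSH_ALL)) == 0);
	return val;
}
```

Without the mask, whatever operation and parameter bits were left in the register from the read would be written back together with the new command.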
     
  • When a link failure is detected locally, the link is reset, the flag
    link->in_session is set to false, and a RESET_MSG with the 'stopping'
    bit set is sent to the peer.

    The purpose of this bit is to inform the peer that this endpoint is
    just going down, and that the peer should handle the reception of this
    particular RESET message as a local failure. This forces the peer to
    accept another RESET or ACTIVATE message from this endpoint before it
    can re-establish the link. This again is necessary to ensure that
    link session numbers are properly exchanged before the link comes up
    again.

    If a failure is detected locally at the peer endpoint at the same time,
    the peer will do the same, which is also correct behavior.

    However, when receiving such messages, the endpoints will not
    distinguish between 'stopping' RESETs and ordinary ones when it comes
    to updating session numbers. Both endpoints will copy the received
    session number and set their 'in_session' flags to true at the
    reception, while they are still expecting another RESET from the
    peer before they can go ahead and re-establish. This is contradictory,
    since, after applying the validation check referred to below, the
    'in_session' flag will cause rejection of all such messages, and the
    link will never come up again.

    We now fix this by not only handling received RESET/STOPPING messages
    as a local failure, but also by omitting to set a new session number
    and the 'in_session' flag in such cases.

    Fixes: 7ea817f4e832 ("tipc: check session number before accepting link protocol messages")
    Signed-off-by: Jon Maloy
    Signed-off-by: David S. Miller

    Jon Maloy
     

11 Nov, 2018

12 commits

  • Commit 37c8a5fafa3b ("kbuild: consolidate Devicetree dtb build rules")
    moved the location of 'dtbs_install' target which caused dtbs to not be
    installed when building debian package with 'bindeb-pkg' target. Update
    the builddeb script to use the same logic that determines if there's a
    'dtbs_install' target, which is the presence of the arch dts directory. Also,
    use CONFIG_OF_EARLY_FLATTREE instead of CONFIG_OF as that's a better
    indication of whether we are building dtbs.

    This commit will also have the side effect of installing dtbs on any
    arch that has dts files. Previously, it was dependent on whether the
    arch defined 'dtbs_install'.

    Fixes: 37c8a5fafa3b ("kbuild: consolidate Devicetree dtb build rules")
    Reported-by: Nuno Gonçalves
    Signed-off-by: Rob Herring
    Signed-off-by: Masahiro Yamada

    Rob Herring
     
  • This reverts commit 6147b1cf19651c7de297e69108b141fb30aa2349.

    The reverted patch results in attempted write access to the source
    repository, even if that repository is mounted read-only.

    Output from "strace git status -uno --porcelain":

    getcwd("/tmp/linux-test", 129) = 16
    open("/tmp/linux-test/.git/index.lock", O_RDWR|O_CREAT|O_EXCL|O_CLOEXEC, 0666) =
    -1 EROFS (Read-only file system)

    While git appears to be able to handle this situation, a monitored
    build environment (such as the one used for Chrome OS kernel builds)
    may detect it and bail out with an access violation error. On top of
    that, the attempted write access suggests that git _will_ write to the
    file even if a build output directory is specified. Users may have the
    reasonable expectation that the source repository remains untouched in
    that situation.

    Fixes: 6147b1cf19651 ("scripts/setlocalversion: git: Make -dirty check more robust")
    Cc: Genki Sky
    Signed-off-by: Guenter Roeck
    Reviewed-by: Brian Norris
    Signed-off-by: Masahiro Yamada

    Guenter Roeck
     
  • Since commit b41d920acff8 ("kbuild: deb-pkg: split generating packaging
    and build"), the build version of the kernel contained in a deb package
    is too low by 1.

    Prior to the bad commit, the kernel was built first, then the number
    in .version file was read out, and written into the debian control file.

    Now, the debian control file is created before the kernel is actually
    compiled, which is causing the version number mismatch.

    Let the mkdebian script pass KBUILD_BUILD_VERSION=${revision} to require
    the build system to use the specified version number.

    Fixes: b41d920acff8 ("kbuild: deb-pkg: split generating packaging and build")
    Reported-by: Doug Smythies
    Signed-off-by: Masahiro Yamada
    Tested-by: Doug Smythies

    Masahiro Yamada
     
  • The current SED_CONFIG_EXP could match comment lines in config
    fragment files, especially when CONFIG_PREFIX_ is empty. For example,
    Buildroot uses empty prefixing; starting symbols with BR2_ is just
    convention.

    Make the sed expression more robust against false positives from
    comment lines. The new sed expression matches only valid patterns.

    Signed-off-by: Masahiro Yamada
    Reviewed-by: Petr Vorel
    Reviewed-by: Arnout Vandecappelle (Essensium/Mind)

    Masahiro Yamada
     
  • Pull tty/serial fixes from Greg KH:
    "Here are some small tty fixes for 4.20-rc2

    One of these missed the original 4.19-final release, I missed that I
    hadn't done a pull request for it as it was in linux-next and my
    branch for a long time, that's my fault.

    The others are small, fixing some reported issues and finally fixing
    the termios mess for alpha so that glibc has a chance to implement
    some missing functionality that has been pending for many years now.

    All of these have been in linux-next with no reported issues"

    * tag 'tty-4.20-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
    serial: sh-sci: Fix could not remove dev_attr_rx_fifo_timeout
    arch/alpha, termios: implement BOTHER, IBSHIFT and termios2
    termios, tty/tty_baudrate.c: fix buffer overrun
    vt: fix broken display when running aptitude
    serial: sh-sci: Fix receive on SCIFA/SCIFB variants with DMA

    Linus Torvalds
     
  • Pull drm fixes from Dave Airlie:
    "drm: i915, amdgpu, sun4i, exynos and etnaviv fixes:

    - amdgpu has some display fixes, KFD ioctl fixes and a Vega20 bios
    interaction fix.

    - sun4i has some NULL checks added

    - i915 has a 32-bit system fix, LPE audio oops, and HDMI2.0 clock
    fixes.

    - Exynos has 3 regression fixes (one frame counter, fbdev missing,
    dsi->panel check)

    - Etnaviv has a single fencing fix for GPU recovery"

    * tag 'drm-fixes-2018-11-11' of git://anongit.freedesktop.org/drm/drm: (39 commits)
    drm/amd/amdgpu/dm: Fix dm_dp_create_fake_mst_encoder()
    drm/amd/display: Drop reusing drm connector for MST
    drm/amd/display: Cleanup MST non-atomic code workaround
    drm/amd/powerplay: always use fast UCLK switching when UCLK DPM enabled
    drm/amd/powerplay: set a default fclk/gfxclk ratio
    drm/amdgpu/display/dce11: only enable FBC when selected
    drm/amdgpu/display/dm: handle FBC dc feature parameter
    drm/amdgpu/display/dc: add FBC to dc_config
    drm/amdgpu: add DC feature mask module parameter
    drm/amdgpu/display: check if fbc is available in set_static_screen_control (v2)
    drm/amdgpu/vega20: add CLK base offset
    drm/amd/display: Stop leaking planes
    drm/amd/display: Fix misleading buffer information
    Revert "drm/amd/display: set backlight level limit to 1"
    drm/amd: Update atom_smu_info_v3_3 structure
    drm/i915: Fix ilk+ watermarks when disabling pipes
    drm/sun4i: tcon: prevent tcon->panel dereference if NULL
    drm/sun4i: tcon: fix check of tcon->panel null pointer
    drm/i915: Don't oops during modeset shutdown after lpe audio deinit
    drm/i915: Mark pin flags as u64
    ...

    Linus Torvalds
     
  • Pull namespace fixes from Eric Biederman:
    "I believe all of these are simple obviously correct bug fixes. These
    fall into two groups:

    - Fixing the implementation of MNT_LOCKED, which prevents lesser
    privileged users from seeing under mounts created by more
    privileged users.

    - Fixing the extended uid and group mapping in user namespaces.

    As well as ensuring the code looks correct I have spot tested these
    changes as well and in my testing the fixes are working"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
    mount: Prevent MNT_DETACH from disconnecting locked mounts
    mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts
    mount: Retest MNT_LOCKED in do_umount
    userns: also map extents in the reverse map to kernel IDs

    Linus Torvalds
     
  • Pull clk fixes from Stephen Boyd:
    "A small set of fixes for clk drivers.

    One to fix a DT refcount imbalance, two to mark some Amlogic clks as
    critical, and one final one that fixes a clk name for the Qualcomm
    driver merged this cycle"

    * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
    clk: qcom: gcc: Fix board clock node name
    clk: meson: axg: mark fdiv2 and fdiv3 as critical
    clk: meson-gxbb: set fclk_div3 as CLK_IS_CRITICAL
    clk: fixed-factor: fix of_node_get-put imbalance

    Linus Torvalds
     
  • Fixes for 4.20:
    - DC MST fixes
    - DC FBC fix
    - Vega20 updates to support the latest vbios
    - KFD type fixes for ioctl headers

    Signed-off-by: Dave Airlie
    From: Alex Deucher
    Link: https://patchwork.freedesktop.org/patch/msgid/20181108035551.2904-1-alexander.deucher@amd.com

    Dave Airlie
     
  • - sun4i: tcon->panel NULL deref protections (Giulio)

    Cc: Giulio Benetti
    Signed-off-by: Dave Airlie
    From: Sean Paul
    Link: https://patchwork.freedesktop.org/patch/msgid/20181107205051.GA27823@art_vandelay

    Dave Airlie
     
  • Bugzilla #108282 fixed: Avoid graphics corruption on 32-bit systems for Mesa 18.2.x
    Avoid OOPS on LPE audio deinit. Remove two unused W/As.
    Fix to correct HDMI 2.0 audio clock modes to spec.

    Signed-off-by: Dave Airlie
    From: Joonas Lahtinen
    Link: https://patchwork.freedesktop.org/patch/msgid/20181108134508.GA28466@jlahtine-desk.ger.corp.intel.com

    Dave Airlie
     
  • TCA_FLOWER_KEY_ENC_OPTS and TCA_FLOWER_KEY_ENC_OPTS_MASK can
    currently only contain further nested attributes, which are parsed by
    hand, so the policy is never actually used, resulting in a W=1
    build warning:

    net/sched/cls_flower.c:492:1: warning: ‘enc_opts_policy’ defined but not used [-Wunused-const-variable=]
    enc_opts_policy[TCA_FLOWER_KEY_ENC_OPTS_MAX + 1] = {

    Add the validation anyway to avoid potential bugs when other
    attributes are added and to make the attribute structure slightly
    clearer. Validation will also set extack to point to the bad
    attribute on error.

    Fixes: 0a6e77784f49 ("net/sched: allow flower to match tunnel options")
    Signed-off-by: Jakub Kicinski
    Acked-by: Simon Horman
    Acked-by: Jiri Pirko
    Signed-off-by: David S. Miller

    Jakub Kicinski
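
The idea in the commit above, checking nested attributes against a policy table and reporting the offending attribute, can be sketched in a few lines. This is a simplified Python model, not the kernel's netlink API: the attribute numbers, the `validate_nested` helper, and the payload representation (a list for a nested attribute, an int for a scalar) are all hypothetical.

```python
# Hypothetical attribute kinds and type numbers, for illustration only.
NESTED, U32 = "nested", "u32"
ENC_OPTS_GENEVE = 1
ENC_OPTS_MAX = 1

# Policy table: the kind each known attribute type must have.
enc_opts_policy = {ENC_OPTS_GENEVE: NESTED}


def kind_of(payload):
    """Classify a payload in this toy model."""
    return NESTED if isinstance(payload, list) else U32


def validate_nested(attrs, policy, max_type):
    """Check each (type, payload) pair against the policy.

    Returns (True, None) on success, or (False, bad_attr) for the first
    attribute whose kind disagrees with the policy. Types above max_type
    are skipped; that choice is a simplification made for this sketch.
    """
    for attr in attrs:
        attr_type, payload = attr
        if attr_type > max_type:
            continue
        if policy.get(attr_type) != kind_of(payload):
            return False, attr
    return True, None


good = [(ENC_OPTS_GENEVE, [(1, 0x0102)])]
bad = [(ENC_OPTS_GENEVE, 42)]  # scalar where a nested attribute is expected

print(validate_nested(good, enc_opts_policy, ENC_OPTS_MAX))
print(validate_nested(bad, enc_opts_policy, ENC_OPTS_MAX))
```

Returning the offending attribute mirrors the point made in the commit message: validation gives the caller something specific to report on error, rather than a bare failure.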
     

10 Nov, 2018

3 commits

  • Pull xen fixes from Juergen Gross:
    "Several fixes, mostly for rather recent regressions when running under
    Xen"

    * tag 'for-linus-4.20a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen: remove size limit of privcmd-buf mapping interface
    xen: fix xen_qlock_wait()
    x86/xen: fix pv boot
    xen-blkfront: fix kernel panic with negotiate_mq error path
    xen/grant-table: Fix incorrect gnttab_dma_free_pages() pr_debug message
    CONFIG_XEN_PV breaks xen_create_contiguous_region on ARM

    Linus Torvalds
     
  • Pull arm64 fixes from Catalin Marinas:

    - Fix occasional page fault during boot due to memblock resizing before
    the linear map is up.

    - Define NET_IP_ALIGN to 0 to improve the DMA performance on some
    platforms.

    - lib/raid6 test build fix.

    - .mailmap update for Punit Agrawal

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: memblock: don't permit memblock resizing until linear mapping is up
    arm64: mm: define NET_IP_ALIGN to 0
    lib/raid6: Fix arm64 test build
    mailmap: Update email for Punit Agrawal

    Linus Torvalds
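
The NET_IP_ALIGN item in the pull request above comes down to simple offset arithmetic. The sketch below assumes a 14-byte Ethernet header and a receive buffer that starts on an aligned address; the `layout` helper is hypothetical. A padding of 2 places the IP header on a 4-byte boundary but leaves the start of the frame, which is where the device writes via DMA, misaligned. A padding of 0 does the opposite.

```python
ETH_HLEN = 14  # length of an Ethernet header in bytes


def layout(net_ip_align, buf_start=0):
    """Return (frame_offset, ip_header_offset) for a given padding.

    Assumption: buf_start is the aligned start of the receive buffer and
    the frame is written net_ip_align bytes into it.
    """
    frame_start = buf_start + net_ip_align
    ip_header = frame_start + ETH_HLEN
    return frame_start, ip_header


for pad in (2, 0):
    frame, ip = layout(pad)
    print(
        f"NET_IP_ALIGN={pad}: frame at +{frame} "
        f"(4-byte aligned: {frame % 4 == 0}), "
        f"IP header at +{ip} (4-byte aligned: {ip % 4 == 0})"
    )
```

Which trade-off is better depends on the platform: a padding of 0 favors hardware where misaligned DMA is costly and unaligned CPU loads are cheap, which is consistent with the DMA performance motivation stated in the pull request.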
     
  • Pull i2c updates from Wolfram Sang:
    "I2C has one bugfix (qcom-geni driver), one arch enablement (i2c-omap
    driver, no code change), and a new driver (nvidia-gpu) this time"

    * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
    usb: typec: ucsi: add support for Cypress CCGx
    i2c: nvidia-gpu: make pm_ops static
    i2c: add i2c bus driver for NVIDIA GPU
    i2c: qcom-geni: Fix runtime PM mismatch with child devices
    MAINTAINERS: Add entry for i2c-omap driver
    i2c: omap: Enable for ARCH_K3
    dt-bindings: i2c: omap: Add new compatible for AM654 SoCs

    Linus Torvalds