28 Apr, 2014

1 commit

  • A race exists between module loading and enabling of function tracer.

    CPU 1                                   CPU 2
    -----                                   -----
    load_module()
     module->state = MODULE_STATE_COMING

                                            register_ftrace_function()
                                             mutex_lock(&ftrace_lock);
                                             ftrace_startup()
                                              update_ftrace_function();
                                               ftrace_arch_code_modify_prepare()
                                                set_all_module_text_rw();

                                               ftrace_arch_code_modify_post_process()
                                                set_all_module_text_ro();

                                            [ here all module text is set to RO,
                                              including the module that is
                                              loading!! ]

     blocking_notifier_call_chain(MODULE_STATE_COMING);
      ftrace_init_module()

       [ tries to modify code, but it's RO, and fails!
         ftrace_bug() is called]

    When this race happens, ftrace_bug() will produce a nasty warning and
    all of the function tracing features will be disabled until reboot.

    The simple solution is to treat module load the same way the core
    kernel is treated at boot: hardcode the ftrace function modification
    that converts calls to mcount into nops. This is done in init/main.c,
    and there's no reason it could not be done in load_module(). This gives
    better control of the changes and doesn't tie the state of the
    module to its notifiers as much. Ftrace is special, and it needs to be
    treated as such.

    This works because ftrace_module_init() is called while the module
    is still in MODULE_STATE_UNFORMED, a state that is ignored by the
    set_all_module_text_ro() call.
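
    The ordering argument above can be illustrated with a small user-space
    model. This is only a sketch: the struct fields and the two
    load_with_*() helpers are hypothetical, and the "race" is simulated by
    calling the tracer-side function at the worst possible moment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's module states. */
enum module_state {
    MODULE_STATE_UNFORMED,
    MODULE_STATE_COMING,
    MODULE_STATE_LIVE,
};

/* Hypothetical, heavily reduced view of a module. */
struct module {
    enum module_state state;
    bool text_ro;        /* true once the module text is read-only */
    bool mcount_nopped;  /* true once mcount calls were turned into nops */
};

/* Models set_all_module_text_ro(): unformed modules are skipped. */
static void set_all_module_text_ro(struct module *mods, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (mods[i].state == MODULE_STATE_UNFORMED)
            continue;
        mods[i].text_ro = true;
    }
}

/* Models the ftrace init step: patching only succeeds on writable text. */
static bool ftrace_module_init(struct module *mod)
{
    if (mod->text_ro)
        return false;    /* the kernel would report this via ftrace_bug() */
    mod->mcount_nopped = true;
    return true;
}

/* Old ordering: the module is already COMING when ftrace patches it. */
static bool load_with_notifier(void)
{
    struct module mod = { MODULE_STATE_UNFORMED, false, false };

    mod.state = MODULE_STATE_COMING;
    set_all_module_text_ro(&mod, 1);   /* tracer enabled on another CPU */
    return ftrace_module_init(&mod);
}

/* New ordering: ftrace patches the module while it is still UNFORMED. */
static bool load_with_early_init(void)
{
    struct module mod = { MODULE_STATE_UNFORMED, false, false };
    bool ok;

    set_all_module_text_ro(&mod, 1);   /* tracer enabled: module skipped */
    ok = ftrace_module_init(&mod);
    mod.state = MODULE_STATE_COMING;
    return ok;
}
```

    In this model load_with_notifier() fails and load_with_early_init()
    succeeds, which mirrors the reasoning in the commit message.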

    Link: http://lkml.kernel.org/r/1395637826-3312-1-git-send-email-indou.takao@jp.fujitsu.com

    Reported-by: Takao Indoh
    Acked-by: Rusty Russell
    Cc: stable@vger.kernel.org # 2.6.38+
    Signed-off-by: Steven Rostedt

    Steven Rostedt (Red Hat)
     

19 Apr, 2014

8 commits

  • Pull more networking fixes from David Miller:

    1) Fix mlx4_en_netpoll implementation, it needs to schedule a NAPI
    context, not synchronize it. From Chris Mason.

    2) IPv4 flow input interface should never be zero, it should be
    LOOPBACK_IFINDEX instead. From Cong Wang and Julian Anastasov.

    3) Properly configure MAC to PHY connection in mvneta devices, from
    Thomas Petazzoni.

    4) sys_recv should use SYSCALL_DEFINE. From Jan Glauber.

    5) Tunnel driver ioctls do not use the correct namespace, fix from
    Nicolas Dichtel.

    6) Fix memory leak on seccomp filter attach, from Kees Cook.

    7) Fix lockdep warning for nested vlans, from Ding Tianhong.

    8) Crashes can happen in SCTP due to how the auth_enable value is
    managed, fix from Vlad Yasevich.

    9) Wireless fixes from John W Linville and co.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (45 commits)
    net: sctp: cache auth_enable per endpoint
    tg3: update rx_jumbo_pending ring param only when jumbo frames are enabled
    vlan: Fix lockdep warning when vlan dev handle notification
    seccomp: fix memory leak on filter attach
    isdn: icn: buffer overflow in icn_command()
    ip6_tunnel: use the right netns in ioctl handler
    sit: use the right netns in ioctl handler
    ip_tunnel: use the right netns in ioctl handler
    net: use SYSCALL_DEFINEx for sys_recv
    net: mdio-gpio: Add support for separate MDI and MDO gpio pins
    net: mdio-gpio: Add support for active low gpio pins
    net: mdio-gpio: Use devm_ functions where possible
    ipv4, route: pass 0 instead of LOOPBACK_IFINDEX to fib_validate_source()
    ipv4, fib: pass LOOPBACK_IFINDEX instead of 0 to flowi4_iif
    mlx4_en: don't use napi_synchronize inside mlx4_en_netpoll
    net: mvneta: properly configure the MAC PHY connection in all situations
    net: phy: add minimal support for QSGMII PHY
    sfc:On MCDI timeout, issue an FLR (and mark MCDI to fail-fast)
    mwifiex: fix hung task on command timeout
    mwifiex: process event before command response
    ...

    Linus Torvalds
     
  • Pull char/misc driver fixes from Greg KH:
    "Here are a few driver fixes for char/misc drivers that resolve
    reported issues.

    All have been in linux-next successfully for a few days"

    * tag 'char-misc-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
    Drivers: hv: vmbus: Negotiate version 3.0 when running on ws2012r2 hosts
    Tools: hv: Handle the case when the target file exists correctly
    vme_tsi148: Utilize to_pci_dev() macro
    vme_tsi148: Fix PCI address mapping assumption
    vme_tsi148: Fix typo in tsi148_slave_get()
    w1: avoid recursive device_add
    w1: fix netlink refcnt leak on error path
    misc: Grammar s/addition/additional/
    drivers: mcb: fix memory leak in chameleon_parse_cells() error path
    mei: ignore client writing state during cb completion
    mei: me: do not load the driver if the FW doesn't support MEI interface
    GenWQE: Increase driver version number
    GenWQE: Fix multithreading problems
    GenWQE: Ensure rc is not returning an uninitialized value
    GenWQE: Add wmb before DDCB is started
    GenWQE: Enable access to VPD flash area

    Linus Torvalds
     
  • Pull driver core fixes from Greg KH:
    "Here are some driver core fixes for 3.15-rc2. Also in here are some
    documentation updates, as well as an API removal that had to wait for
    after -rc1 due to the cleanups coming into you from multiple developer
    trees (this one and the PPC tree.)

    All have been in linux next successfully"

    * tag 'driver-core-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
    drivers/base/dd.c incorrect pr_debug() parameters
    Documentation: Update stable address in Chinese and Japanese translations
    topology: Fix compilation warning when not in SMP
    Chinese: add translation of io_ordering.txt
    stable_kernel_rules: spelling/word usage
    sysfs, driver-core: remove unused {sysfs|device}_schedule_callback_owner()
    kernfs: protect lazy kernfs_iattrs allocation with mutex
    fs: Don't return 0 from get_anon_bdev

    Linus Torvalds
     
  • Merge misc fixes from Andrew Morton:
    "13 fixes"

    * emailed patches from Andrew Morton:
    thp: close race between split and zap huge pages
    mm: fix new kernel-doc warning in filemap.c
    mm: fix CONFIG_DEBUG_VM_RB description
    mm: use paravirt friendly ops for NUMA hinting ptes
    mips: export flush_icache_range
    mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages()
    wait: explain the shadowing and type inconsistencies
    Shiraz has moved
    Documentation/vm/numa_memory_policy.txt: fix wrong document in numa_memory_policy.txt
    powerpc/mm: fix ".__node_distance" undefined
    kernel/watchdog.c:touch_softlockup_watchdog(): use raw_cpu_write()
    init/Kconfig: move the trusted keyring config option to general setup
    vmscan: reclaim_clean_pages_from_list() must use mod_zone_page_state()

    Linus Torvalds
     
  • Stick in a comment before someone else tries to fix the sparse warning
    this generates.

    Suggested-by: Andrew Morton
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/n/tip-o2ro6f3vkxklni0bc8f7m68s@git.kernel.org
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • shiraz.hashim@st.com email-id doesn't exist anymore as he has left the
    company. Replace ST's id with shiraz.linux.kernel@gmail.com.

    It also updates .mailmap file to fix address for 'git shortlog'.

    Signed-off-by: Viresh Kumar
    Cc: Shiraz Hashim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Viresh Kumar
     
  • Pull infiniband/rdma updates from Roland Dreier:

    - mostly cxgb4 fixes unblocked by the merge of some prerequisites via
    the net tree

    - drop deprecated MSI-X API use.

    - a couple other miscellaneous things.

    * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
    RDMA/cxgb4: Fix over-dereference when terminating
    RDMA/cxgb4: Use uninitialized_var()
    RDMA/cxgb4: Add missing debug stats
    RDMA/cxgb4: Initialize reserved fields in a FW work request
    RDMA/cxgb4: Use pr_warn_ratelimited
    RDMA/cxgb4: Max fastreg depth depends on DSGL support
    RDMA/cxgb4: SQ flush fix
    RDMA/cxgb4: rmb() after reading valid gen bit
    RDMA/cxgb4: Endpoint timeout fixes
    RDMA/cxgb4: Use the BAR2/WC path for kernel QPs and T5 devices
    IB/mlx5: Add block multicast loopback support
    IB/mthca: Use pci_enable_msix_exact() instead of pci_enable_msix()
    IB/qib: Use pci_enable_msix_range() instead of pci_enable_msix()

    Linus Torvalds
     
  • Pull devicetree fixes from Rob Herring:
    - fix error handling in of_update_property
    - fix section mismatch warnings in __reserved_mem_check_root
    - add empty of_find_node_by_path for !OF builds
    - add various missing binding documentation

    * tag 'dt-fixes-for-3.15' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
    of: add empty of_find_node_by_path() for !OF
    of: Clean up of_update_property
    DT: add vendor prefix for EBV Elektronik
    of: Fix the section mismatch warnings.
    of: Add vendor prefix for Digi International Inc.
    DT: I2C: Add trivial bindings used by kirkwood boards
    DT: Vendor: Add prefixes used by Kirkwood devices
    DT: bindings: add missing Marvell Kirkwood SoC documentation
    dt-bindings: add vendor-prefix for Newhaven Display
    of: add vendor prefix for I2SE GmbH
    of: add vendor prefix for ISEE 2007 S.L.

    Linus Torvalds
     

18 Apr, 2014

4 commits

  • Add an empty version of of_find_node_by_path().
    This fixes following build error for asoc tree:
    sound/soc/fsl/fsl_ssi.c: In function 'fsl_ssi_probe':
    sound/soc/fsl/fsl_ssi.c:1471:2: error: implicit declaration of function 'of_find_node_by_path' [-Werror=implicit-function-declaration]
    sprop = of_get_property(of_find_node_by_path("/"), "compatible", NULL);
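
    The "empty version" pattern can be sketched as below. This is an
    illustration under assumptions, not the patch itself: struct
    device_node is left opaque, CONFIG_OF is assumed to be undefined, and
    root_node_present() is a hypothetical caller.

```c
#include <assert.h>
#include <stddef.h>

struct device_node;   /* opaque here; only pointers are used */

#ifdef CONFIG_OF
/* With device tree support the real lookup is provided elsewhere. */
struct device_node *of_find_node_by_path(const char *path);
#else
/* Empty version: without device tree support there is no node to find. */
static inline struct device_node *of_find_node_by_path(const char *path)
{
    (void)path;
    return NULL;
}
#endif

/* Hypothetical caller that now compiles with or without CONFIG_OF. */
static int root_node_present(void)
{
    return of_find_node_by_path("/") != NULL;
}
```

    Callers such as the fsl_ssi probe shown above then build in both
    configurations and simply see no node when OF is disabled.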

    Reported-by: Stephen Rothwell
    Signed-off-by: Alexander Shiyan
    Signed-off-by: Rob Herring

    Alexander Shiyan
     
  • Merge ipmi fixes from Corey Minyard:
    "Things collected since last kernel release.

    Some of these are pretty important. The first three are bug fixes.
    The next two are to hopefully make everyone happy about allowing
    ACPI to be on all the time and not have IPMI have an effect on the
    system when not in use. The last is a little cleanup"

    * emailed patches from Corey Minyard:
    ipmi: boolify some things
    ipmi: Turn off all activity on an idle ipmi interface
    ipmi: Turn off default probing of interfaces
    ipmi: Reset the KCS timeout when starting error recovery
    ipmi: Fix a race restarting the timer
    Char: ipmi_bt_sm, fix infinite loop

    Linus Torvalds
     
  • Convert some ints to bools.

    Signed-off-by: Corey Minyard
    Signed-off-by: Linus Torvalds

    Corey Minyard
     
  • The IPMI driver would wake up periodically looking for events and
    watchdog pretimeouts. If there is nothing waiting for these events,
    it's really kind of pointless to be checking for them. So modify the
    driver so the message handler can pass down if it needs the lower layer
    to be waiting for these. Modify the system interface lower layer to
    turn off all timer and thread activity if the upper layer doesn't need
    anything and it is not currently handling messages. And modify the
    message handler to not restart the timer if its timer is not needed.

    The timers and kthread will still be enabled if:
    - the SI interface is handling a message.
    - a user has enabled watching for events.
    - the IPMI watchdog timer is in use (since it uses pretimeouts).
    - the message handler is waiting on a remote response.
    - a user has registered to receive commands.

    This mostly affects interfaces without interrupts. Interfaces with
    interrupts already don't use CPU in the system interface when the
    interface is idle.
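
    The conditions listed above amount to a single predicate. The sketch
    below is hypothetical: the struct and field names are invented for
    illustration and do not correspond to the driver's actual data
    structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what currently needs the lower layer awake. */
struct ipmi_activity {
    bool si_handling_message;       /* SI interface is handling a message */
    bool user_watching_events;      /* a user enabled watching for events */
    bool watchdog_in_use;           /* IPMI watchdog relies on pretimeouts */
    bool waiting_remote_response;   /* message handler awaits a response */
    bool user_registered_commands;  /* a user registered to receive commands */
};

/* Timers and the kthread stay enabled only if at least one reason holds. */
static bool ipmi_needs_polling(const struct ipmi_activity *a)
{
    return a->si_handling_message ||
           a->user_watching_events ||
           a->watchdog_in_use ||
           a->waiting_remote_response ||
           a->user_registered_commands;
}
```

    An all-false struct yields false, meaning an idle interface can stop
    its timer and thread activity entirely.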

    Signed-off-by: Corey Minyard
    Signed-off-by: Linus Torvalds

    Corey Minyard
     

17 Apr, 2014

6 commits

  • Pull x86 fixes from Ingo Molnar:
    "Various fixes:

    - reboot regression fix
    - build message spam fix
    - GPU quirk fix
    - 'make kvmconfig' fix

    plus the wire-up of the renameat2() system call on i386"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86: Remove the PCI reboot method from the default chain
    x86/build: Supress "Nothing to be done for ..." messages
    x86/gpu: Fix sign extension issue in Intel graphics stolen memory quirks
    x86/platform: Fix "make O=dir kvmconfig"
    i386: Wire up the renameat2() syscall

    Linus Torvalds
     
  • Only ws2012r2 hosts support the ability to reconnect to the host on VMBUS. This functionality
    is needed by kexec in Linux. To use this functionality we need to negotiate version 3.0 of the
    VMBUS protocol.

    Signed-off-by: K. Y. Srinivasan
    Cc: [3.9+]
    Signed-off-by: Greg Kroah-Hartman

    K. Y. Srinivasan
     
  • This is for a system with fixed assignments of input and output pins
    (various variants of Kontron COMe).

    Signed-off-by: Guenter Roeck
    Signed-off-by: David S. Miller

    Guenter Roeck
     
  • Some systems using mdio-gpio may use active-low gpio pins
    (eg with inverters or FETs connected to all or some of the
    gpio pins).
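
    Active-low handling boils down to inverting the logical value before
    it reaches the pin. A minimal sketch, assuming a hypothetical
    per-line flag (the names below are not taken from the driver):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of one GPIO line. */
struct gpio_line {
    bool active_low;   /* true if an inverter or FET flips the signal */
};

/* Translate a logical value (0 or non-zero) into the level to drive. */
static int gpio_physical_level(const struct gpio_line *line, int logical)
{
    return (logical != 0) ^ line->active_low;
}
```

    For an active-low line, logical 1 is driven as 0 and logical 0 as 1;
    an active-high line passes the value through unchanged.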

    Signed-off-by: Guenter Roeck
    Signed-off-by: David S. Miller

    Guenter Roeck
     
  • All device_schedule_callback_owner() users are converted to use
    device_remove_file_self(). Remove now unused
    {sysfs|device}_schedule_callback_owner().

    Signed-off-by: Tejun Heo
    Signed-off-by: Greg Kroah-Hartman

    Tejun Heo
     
  • This commit adds the necessary definitions for the PHY layer to
    recognize "qsgmii" as a valid PHY interface. A QSGMII interface, as
    defined at
    http://en.wikipedia.org/wiki/Media_Independent_Interface#Quad_Serial_Gigabit_Media_Independent_Interface,
    "is a method of combining four SGMII lines into a 5Gbit/s
    interface. QSGMII, like SGMII, uses LVDS signalling for the TX and RX
    data and a single LVDS clock signal. QSGMII uses significantly fewer
    signal lines than four SGMII busses."

    This type of MAC PHY connection might require special handling on
    the MAC driver side, so it should be possible to express this type of
    MAC PHY connection, for example in the Device Tree.

    Signed-off-by: Thomas Petazzoni
    Cc: devicetree@vger.kernel.org
    Reviewed-by: Florian Fainelli
    Signed-off-by: David S. Miller

    Thomas Petazzoni
     

16 Apr, 2014

2 commits

  • Steve reported a reboot hang and bisected it back to this commit:

    a4f1987e4c54 x86, reboot: Add EFI and CF9 reboot methods into the default list

    He heroically tested all reboot methods and found the following:

    reboot=t # triple fault ok
    reboot=k # keyboard ctrl FAIL
    reboot=b # BIOS ok
    reboot=a # ACPI FAIL
    reboot=e # EFI FAIL [system has no EFI]
    reboot=p # PCI 0xcf9 FAIL

    And I think it's pretty obvious that we should only try PCI 0xcf9 as a
    last resort - if at all.

    The other observation is that (on this box) we should never try
    the PCI reboot method, but close with either the 'triple fault'
    or the 'BIOS' (terminal!) reboot methods.

    Thirdly, CF9_COND is a total misnomer - it should be something like
    CF9_SAFE or CF9_CAREFUL, and 'CF9' should be 'CF9_FORCE' ...

    So this patch fixes the worst problems:

    - it orders the actual reboot logic to follow the reboot ordering
    pattern - it was in a pretty random order before for no good
    reason.

    - it fixes the CF9 misnomers and uses BOOT_CF9_FORCE and
    BOOT_CF9_SAFE flags to make the code more obvious.

    - it tries the BIOS reboot method before the PCI reboot method.
    (Since 'BIOS' is a terminal reboot method resulting in a hang
    if it does not work, this is essentially equivalent to removing
    the PCI reboot method from the default reboot chain.)

    - just for the miraculous possibility of terminal (resulting
    in hang) reboot methods of triple fault or BIOS returning
    without having done their job, there's an ordering between
    them as well.

    Reported-and-bisected-and-tested-by: Steven Rostedt
    Cc: Li Aubrey
    Cc: Linus Torvalds
    Cc: Matthew Garrett
    Link: http://lkml.kernel.org/r/20140404064120.GB11877@gmail.com
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Pull networking fixes from David Miller:

    1) Fix BPF filter validation of netlink attribute accesses, from
    Mathias Krause.

    2) Netfilter conntrack generation seqcount not initialized properly,
    from Andrey Vagin.

    3) Fix comparison mask computation on big-endian in nft_cmp_fast(),
    from Patrick McHardy.

    4) Properly limit MTU over ipv6, from Eric Dumazet.

    5) Fix seccomp system call argument population on 32-bit, from Daniel
    Borkmann.

    6) skb_network_protocol() should not use hard-coded ETH_HLEN, instead
    skb->mac_len needs to be used. From Vlad Yasevich.

    7) We have several cases of using socket based communications to
    implement a tunnel. For example, some tunnels are encapsulations
    over UDP so we use an internal kernel UDP socket to do the
    transmits.

    These tunnels should behave just like other software devices and
    pass the packets on down to the next layer.

    Most importantly we want the top-level socket (eg TCP) that created
    the traffic to be charged for the SKB memory.

    However, once you get into the IP output path, we have code that
    assumed that whatever was attached to skb->sk is an IP socket.

    To keep the top-level socket being charged for the SKB memory,
    whilst satisfying the needs of the IP output path, we now pass in an
    explicit 'sk' argument.

    From Eric Dumazet.

    8) ping_init_sock() leaks group info, from Xiaoming Wang.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (33 commits)
    cxgb4: use the correct max size for firmware flash
    qlcnic: Fix MSI-X initialization code
    ip6_gre: don't allow to remove the fb_tunnel_dev
    ipv4: add a sock pointer to dst->output() path.
    ipv4: add a sock pointer to ip_queue_xmit()
    driver/net: cosa driver uses udelay incorrectly
    at86rf230: fix __at86rf230_read_subreg function
    at86rf230: remove check if AVDD settled
    net: cadence: Add architecture dependencies
    net: Start with correct mac_len in skb_network_protocol
    Revert "net: sctp: Fix a_rwnd/rwnd management to reflect real state of the receiver's buffer"
    cxgb4: Save the correct mac addr for hw-loopback connections in the L2T
    net: filter: seccomp: fix wrong decoding of BPF_S_ANC_SECCOMP_LD_W
    seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
    qlcnic: Do not disable SR-IOV when VFs are assigned to VMs
    qlcnic: Fix QLogic application/driver interface for virtual NIC configuration
    qlcnic: Fix PVID configuration on eSwitch port.
    qlcnic: Fix max ring count calculation
    qlcnic: Fix to send INIT_NIC_FUNC as first mailbox.
    qlcnic: Fix panic due to uninitialzed delayed_work struct in use.
    ...

    Linus Torvalds
     

15 Apr, 2014

2 commits

  • Pablo Neira Ayuso says:

    ====================
    Netfilter fixes for net

    The following patchset contains three Netfilter fixes for your net tree,
    they are:

    * Fix missing generation sequence initialization which results in a splat
    if lockdep is enabled, it was introduced in the recent works to improve
    nf_conntrack scalability, from Andrey Vagin.

    * Don't flush the GRE keymap list in nf_conntrack when the pptp helper is
    disabled otherwise this crashes due to a double release, from Andrey
    Vagin.

    * Fix nf_tables cmp fast in big endian, from Patrick McHardy.
    ====================

    Signed-off-by: David S. Miller

    David S. Miller
     
  • While reviewing seccomp code, we found that BPF_S_ANC_SECCOMP_LD_W has
    been wrongly decoded by commit a8fc927780 ("sk-filter: Add ability to
    get socket filter program (v2)") into the opcode BPF_LD|BPF_B|BPF_ABS
    although it should have been decoded as BPF_LD|BPF_W|BPF_ABS.

    In practice, this should not have much side-effect though, as such
    conversion is/was being done through prctl(2) PR_SET_SECCOMP. Reverse
    operation PR_GET_SECCOMP will only return the current seccomp mode, but
    not the filter itself. Since the transition to the new BPF infrastructure,
    it's also not used anymore, so we can simply remove this as it's
    unreachable.
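
    The difference between the two opcodes is only the size field. The
    sketch below redefines the classic BPF encoding constants locally so
    it builds without kernel headers; bpf_load_width() is a hypothetical
    helper written for this illustration.

```c
#include <assert.h>

/* Classic BPF encoding values, redefined locally for a standalone build. */
#define BPF_LD    0x00          /* instruction class: load */
#define BPF_W     0x00          /* size: 32-bit word */
#define BPF_H     0x08          /* size: 16-bit half word */
#define BPF_B     0x10          /* size: byte */
#define BPF_ABS   0x20          /* mode: absolute offset */
#define BPF_SIZE(code) ((code) & 0x18)

/* Hypothetical helper: number of bytes a load opcode reads. */
static unsigned int bpf_load_width(unsigned int code)
{
    switch (BPF_SIZE(code)) {
    case BPF_W:
        return 4;
    case BPF_H:
        return 2;
    case BPF_B:
        return 1;
    default:
        return 0;   /* not a size this sketch knows about */
    }
}
```

    With these values BPF_LD|BPF_W|BPF_ABS is 0x20 and BPF_LD|BPF_B|BPF_ABS
    is 0x30, so the wrong decoding describes a one-byte load where a
    four-byte load was meant.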

    Fixes: a8fc927780 ("sk-filter: Add ability to get socket filter program (v2)")
    Signed-off-by: Daniel Borkmann
    Signed-off-by: Alexei Starovoitov
    Cc: Pavel Emelyanov
    Signed-off-by: David S. Miller

    Daniel Borkmann
     

14 Apr, 2014

1 commit

  • Pull slab changes from Pekka Enberg:
    "The biggest change is byte-sized freelist indices which reduces slab
    freelist memory usage:

    https://lkml.org/lkml/2013/12/2/64"
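
    A rough way to see the saving is to compare freelist sizes for the
    two index widths. This sketch assumes the earlier index was an
    unsigned int; the type and function names are hypothetical, and the
    limit the kernel actually enforces on objects per slab may differ
    from the simple bound used here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

typedef unsigned int  wide_idx_t;   /* assumed earlier index type */
typedef unsigned char byte_idx_t;   /* byte-sized index */

/* Memory taken by a freelist holding one index per object. */
static size_t freelist_bytes(size_t nr_objs, size_t idx_size)
{
    return nr_objs * idx_size;
}

/* A byte can only distinguish UCHAR_MAX + 1 different object indices. */
static int fits_byte_index(size_t nr_objs)
{
    return nr_objs <= (size_t)UCHAR_MAX + 1;
}
```

    The second helper hints at why the series also restricts the number
    of objects in a slab: a narrower index can address fewer objects.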

    * 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux:
    mm: slab/slub: use page->list consistently instead of page->lru
    mm/slab.c: cleanup outdated comments and unify variables naming
    slab: fix wrongly used macro
    slub: fix high order page allocation problem with __GFP_NOFAIL
    slab: Make allocations with GFP_ZERO slightly more efficient
    slab: make more slab management structure off the slab
    slab: introduce byte sized index for the freelist of a slab
    slab: restrict the number of objects in a slab
    slab: introduce helper functions to get/set free object
    slab: factor out calculate nr objects in cache_estimate

    Linus Torvalds
     

13 Apr, 2014

6 commits

  • Pull yet more networking updates from David Miller:

    1) Various fixes to the new Redpine Signals wireless driver, from
    Fariya Fatima.

    2) L2TP PPP connect code takes PMTU from the wrong socket, fix from
    Dmitry Petukhov.

    3) UFO and TSO packets differ in whether they include the protocol
    header in gso_size, account for that in skb_gso_transport_seglen().
    From Florian Westphal.

    4) If VLAN untagging fails, we double free the SKB in the bridging
    output path. From Toshiaki Makita.

    5) Several call sites of sk->sk_data_ready() were referencing an SKB
    just added to the socket receive queue in order to calculate the
    second argument via skb->len. This is dangerous because the moment
    the skb is added to the receive queue it can be consumed in another
    context and freed up.

    It turns out also that none of the sk->sk_data_ready()
    implementations even care about this second argument.

    So just kill it off and thus fix all these use-after-free bugs as a
    side effect.

    6) Fix inverted test in tcp_v6_send_response(), from Lorenzo Colitti.

    7) pktgen needs to do locking properly for LLTX devices, from Daniel
    Borkmann.

    8) xen-netfront driver initializes TX array entries in RX loop :-) From
    Vincenzo Maffione.

    9) After refactoring, some tunnel drivers allow a tunnel to be
    configured on top of itself. Fix from Nicolas Dichtel.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (46 commits)
    vti: don't allow to add the same tunnel twice
    gre: don't allow to add the same tunnel twice
    drivers: net: xen-netfront: fix array initialization bug
    pktgen: be friendly to LLTX devices
    r8152: check RTL8152_UNPLUG
    net: sun4i-emac: add promiscuous support
    net/apne: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO
    net: ipv6: Fix oif in TCP SYN+ACK route lookup.
    drivers: net: cpsw: enable interrupts after napi enable and clearing previous interrupts
    drivers: net: cpsw: discard all packets received when interface is down
    net: Fix use after free by removing length arg from sk_data_ready callbacks.
    Drivers: net: hyperv: Address UDP checksum issues
    Drivers: net: hyperv: Negotiate suitable ndis version for offload support
    Drivers: net: hyperv: Allocate memory for all possible per-pecket information
    bridge: Fix double free and memory leak around br_allowed_ingress
    bonding: Remove debug_fs files when module init fails
    i40evf: program RSS LUT correctly
    i40evf: remove open-coded skb_cow_head
    ixgb: remove open-coded skb_cow_head
    igbvf: remove open-coded skb_cow_head
    ...

    Linus Torvalds
     
  • Pull llvm patches from Behan Webster:
    "These are some initial updates to support compiling the kernel with
    clang.

    These patches have been through the proper reviews to the best of my
    ability, and have been soaking in linux-next for a few weeks. These
    patches by themselves still do not completely allow clang to be used
    with the kernel code, but lay the foundation for other patches which
    are still under review.

    Several of the other LLVMLinux patches have already been added via
    maintainer trees"

    * tag 'llvmlinux-for-v3.15' of git://git.linuxfoundation.org/llvmlinux/kernel:
    x86: LLVMLinux: Fix "incomplete type const struct x86cpu_device_id"
    x86 kbuild: LLVMLinux: More cc-options added for clang
    x86, acpi: LLVMLinux: Remove nested functions from Thinkpad ACPI
    LLVMLinux: Add support for clang to compiler.h and new compiler-clang.h
    LLVMLinux: Remove warning about returning an uninitialized variable
    kbuild: LLVMLinux: Fix LINUX_COMPILER definition script for compilation with clang
    Documentation: LLVMLinux: Update Documentation/dontdiff
    kbuild: LLVMLinux: Adapt warnings for compilation with clang
    kbuild: LLVMLinux: Add Kbuild support for building kernel with Clang

    Linus Torvalds
     
  • Pull PCIe non-transparent bridge fixes and features from Jon Mason:
    "NTB driver bug fixes to address issues in list traversal, skb leak in
    ntb_netdev, a typo, and a leak of msix entries in the error path.
    Cleanups of the event handling logic, as well as an overall style
    cleanup. Finally, the driver was converted to use the new
    pci_enable_msix_range logic (and the refactoring to go along with it)"

    * tag 'ntb-3.15' of git://github.com/jonmason/ntb:
    ntb: Use pci_enable_msix_range() instead of pci_enable_msix()
    ntb: Split ntb_setup_msix() into separate BWD/SNB routines
    ntb: Use pci_msix_vec_count() to obtain number of MSI-Xs
    NTB: Code Style Clean-up
    NTB: client event cleanup
    ntb: Fix leakage of ntb_device::msix_entries[] array
    NTB: Fix typo in setting one translation register
    ntb_netdev: Fix skb free issue in open
    ntb_netdev: Fix list_for_each_entry exit issue

    Linus Torvalds
     
  • Pull vfs updates from Al Viro:
    "The first vfs pile, with deep apologies for being very late in this
    window.

    Assorted cleanups and fixes, plus a large preparatory part of iov_iter
    work. There's a lot more of that, but it'll probably go into the next
    merge window - it *does* shape up nicely, removes a lot of
    boilerplate, gets rid of locking inconsistencies between aio_write and
    splice_write and I hope to get Kent's direct-io rewrite merged into
    the same queue, but some of the stuff after this point is having
    (mostly trivial) conflicts with the things already merged into
    mainline and with some I want more testing.

    This one passes LTP and xfstests without regressions, in addition to
    usual beating. BTW, readahead02 in ltp syscalls testsuite has started
    giving failures since "mm/readahead.c: fix readahead failure for
    memoryless NUMA nodes and limit readahead pages" - might be a false
    positive, might be a real regression..."

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    missing bits of "splice: fix racy pipe->buffers uses"
    cifs: fix the race in cifs_writev()
    ceph_sync_{,direct_}write: fix an oops on ceph_osdc_new_request() failure
    kill generic_file_buffered_write()
    ocfs2_file_aio_write(): switch to generic_perform_write()
    ceph_aio_write(): switch to generic_perform_write()
    xfs_file_buffered_aio_write(): switch to generic_perform_write()
    export generic_perform_write(), start getting rid of generic_file_buffer_write()
    generic_file_direct_write(): get rid of ppos argument
    btrfs_file_aio_write(): get rid of ppos
    kill the 5th argument of generic_file_buffered_write()
    kill the 4th argument of __generic_file_aio_write()
    lustre: don't open-code kernel_recvmsg()
    ocfs2: don't open-code kernel_recvmsg()
    drbd: don't open-code kernel_recvmsg()
    constify blk_rq_map_user_iov() and friends
    lustre: switch to kernel_sendmsg()
    ocfs2: don't open-code kernel_sendmsg()
    take iov_iter stuff to mm/iov_iter.c
    process_vm_access: tidy up a bit
    ...

    Linus Torvalds
     
  • Pull more tracing updates from Steven Rostedt:
    "This includes the final patch to clean up and fix the issue with the
    design of tracepoints and how a user could register a tracepoint and
    have that tracepoint not be activated but no error was shown.

    The design was for an out-of-tree module but broke in-tree users. The
    clean up was to remove the saving of the hash table of tracepoint
    names such that they can be enabled before they exist (enabling a
    module tracepoint before that module is loaded). This added more
    complexity than needed. The clean up was to remove that code and just
    enable tracepoints that exist or fail if they do not.

    This removed a lot of code as well as the complexity that it brought.
    As a side effect, instead of registering a tracepoint by its name, the
    tracepoint needs to be registered with the tracepoint descriptor.
    This removes having to duplicate the tracepoint names that are
    enabled.

    The second patch was added that simplified the way modules were
    searched for.

    This cleanup required changes that were in the 3.15 queue as well as
    some changes that were added late in the 3.14-rc cycle. This final
    change waited till the two were merged in upstream and then the change
    was added and full tests were run. Unfortunately, the test found some
    errors, but after it was already submitted to the for-next branch and
    not to be rebased. Sparse errors were detected by Fengguang Wu's bot
    tests, and my internal tests discovered that the anonymous union
    initialization triggered a bug in older gcc compilers. Luckily, there
    was a bugzilla for the gcc bug which gave a work around to the
    problem. The third and fourth patch handled the sparse error and the
    gcc bug respectively.

    A final patch was tagged along to add missing documentation to the
    README file"

    * tag 'trace-3.15-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
    tracing: Add missing function triggers dump and cpudump to README
    tracing: Fix anonymous unions in struct ftrace_event_call
    tracepoint: Fix sparse warnings in tracepoint.c
    tracepoint: Simplify tracepoint module search
    tracepoint: Use struct pointer instead of name hash for reg/unreg tracepoints

    Linus Torvalds
     
  • Pull audit updates from Eric Paris.

    * git://git.infradead.org/users/eparis/audit: (28 commits)
    AUDIT: make audit_is_compat depend on CONFIG_AUDIT_COMPAT_GENERIC
    audit: renumber AUDIT_FEATURE_CHANGE into the 1300 range
    audit: do not cast audit_rule_data pointers pointlesly
    AUDIT: Allow login in non-init namespaces
    audit: define audit_is_compat in kernel internal header
    kernel: Use RCU_INIT_POINTER(x, NULL) in audit.c
    sched: declare pid_alive as inline
    audit: use uapi/linux/audit.h for AUDIT_ARCH declarations
    syscall_get_arch: remove useless function arguments
    audit: remove stray newline from audit_log_execve_info() audit_panic() call
    audit: remove stray newlines from audit_log_lost messages
    audit: include subject in login records
    audit: remove superfluous new- prefix in AUDIT_LOGIN messages
    audit: allow user processes to log from another PID namespace
    audit: anchor all pid references in the initial pid namespace
    audit: convert PPIDs to the inital PID namespace.
    pid: get pid_t ppid of task in init_pid_ns
    audit: rename the misleading audit_get_context() to audit_take_context()
    audit: Add generic compat syscall support
    audit: Add CONFIG_HAVE_ARCH_AUDITSYSCALL
    ...

    Linus Torvalds
     

12 Apr, 2014

3 commits

  • Pull NVMe driver updates from Matthew Wilcox:
    "Various updates to the NVMe driver. The most user-visible change is
    that drive hotplugging now works and CPU hotplug while an NVMe drive
    is installed should also work better"

    * git://git.infradead.org/users/willy/linux-nvme:
    NVMe: Retry failed commands with non-fatal errors
    NVMe: Add getgeo to block ops
    NVMe: Start-stop nvme_thread during device add-remove.
    NVMe: Make I/O timeout a module parameter
    NVMe: CPU hot plug notification
    NVMe: per-cpu io queues
    NVMe: Replace DEFINE_PCI_DEVICE_TABLE
    NVMe: Fix divide-by-zero in nvme_trans_io_get_num_cmds
    NVMe: IOCTL path RCU protect queue access
    NVMe: RCU protected access to io queues
    NVMe: Initialize device reference count earlier
    NVMe: Add CONFIG_PM_SLEEP to suspend/resume functions

    Linus Torvalds
     
  • Pull more ACPI and power management fixes and updates from Rafael Wysocki:
    "This is PM and ACPI material that has emerged over the last two weeks
    and one fix for a CPU hotplug regression introduced by the recent CPU
    hotplug notifiers registration series.

    Included are intel_idle and turbostat updates from Len Brown (these
    have been in linux-next for quite some time), a new cpufreq driver for
    powernv (that might spend some more time in linux-next, but BenH was
    asking me so nicely to push it for 3.15 that I couldn't resist), some
    cpufreq fixes and cleanups (including fixes for some silly breakage in
    a couple of cpufreq drivers introduced during the 3.14 cycle),
    assorted ACPI cleanups, wakeup framework documentation fixes, a new
    sysfs attribute for cpuidle and a new command line argument for power
    domains diagnostics.

    Specifics:

    - Fix for a recently introduced CPU hotplug regression in ARM KVM
    from Ming Lei.

    - Fixes for breakage in the at32ap, loongson2_cpufreq, and unicore32
    cpufreq drivers introduced during the 3.14 cycle (-stable material)
    from Chen Gang and Viresh Kumar.

    - New powernv cpufreq driver from Vaidyanathan Srinivasan, with bits
    from Gautham R Shenoy and Srivatsa S Bhat.

    - Exynos cpufreq driver fix preventing it from being included into
    multiplatform builds that aren't supported by it from Sachin Kamat.

    - cpufreq cleanups related to the usage of the driver_data field in
    struct cpufreq_frequency_table from Viresh Kumar.

    - cpufreq ppc driver cleanup from Sachin Kamat.

    - Intel BayTrail support for intel_idle and ACPI idle from Len Brown.

    - Intel CPU model 54 (Atom N2000 series) support for intel_idle from
    Jan Kiszka.

    - intel_idle fix for Intel Ivy Town residency targets from Len Brown.

    - turbostat updates (Intel Broadwell support and output cleanups)
    from Len Brown.

    - New cpuidle sysfs attribute for exporting C-states' target
    residency information to user space from Daniel Lezcano.

    - New kernel command line argument to prevent power domains enabled
    by the bootloader from being turned off even if they are not in use
    (for diagnostics purposes) from Tushar Behera.

    - Fixes for wakeup sysfs attributes documentation from Geert
    Uytterhoeven.

    - New ACPI video blacklist entry for ThinkPad Helix from Stephen
    Chandler Paul.

    - Assorted ACPI cleanups and a Kconfig help update from Jonghwan
    Choi, Zhihui Zhang, Hanjun Guo"

    * tag 'pm+acpi-3.15-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (28 commits)
    ACPI: Update the ACPI spec information in Kconfig
    arm, kvm: fix double lock on cpu_add_remove_lock
    cpuidle: sysfs: Export target residency information
    cpufreq: ppc: Remove duplicate inclusion of fsl_soc.h
    cpufreq: create another field .flags in cpufreq_frequency_table
    cpufreq: use kzalloc() to allocate memory for cpufreq_frequency_table
    cpufreq: don't print value of .driver_data from core
    cpufreq: ia64: don't set .driver_data to index
    cpufreq: powernv: Select CPUFreq related Kconfig options for powernv
    cpufreq: powernv: Use cpufreq_frequency_table.driver_data to store pstate ids
    cpufreq: powernv: cpufreq driver for powernv platform
    cpufreq: at32ap: don't declare local variable as static
    cpufreq: loongson2_cpufreq: don't declare local variable as static
    cpufreq: unicore32: fix typo issue for 'clk'
    cpufreq: exynos: Disable on multiplatform build
    PM / wakeup: Correct presence vs. emptiness of wakeup_* attributes
    PM / domains: Add pd_ignore_unused to keep power domains enabled
    ACPI / dock: Drop dock_device_ids[] table
    ACPI / video: Favor native backlight interface for ThinkPad Helix
    ACPI / thermal: Fix wrong variable usage in debug statement
    ...

    Linus Torvalds
     
  • Several spots in the kernel perform a sequence like:

    skb_queue_tail(&sk->sk_receive_queue, skb);
    sk->sk_data_ready(sk, skb->len);

    But the moment we place the SKB onto the socket receive queue, it
    can be consumed and freed. So this skb->len access is potentially
    a read of freed memory.

    Furthermore, skb->len can be modified by the consumer, so the value
    may not be accurate.

    And finally, no implementation of this callback actually uses
    the length argument. And since nobody cared about its
    value, lots of call sites pass arbitrary values such as '0' and
    even '1'.

    So just remove the length argument from the callback, that way there
    is no confusion whatsoever and all of these use-after-free cases get
    fixed as a side effect.
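
    The hazard and the fix can be modeled in userspace. This is a minimal
    sketch, not kernel code: struct buf, struct endpoint, enqueue_tail(),
    dequeue(), count_ready() and deliver() are hypothetical stand-ins for
    the SKB, the socket and its callback. The point is that the producer
    never touches the buffer after queueing it, and the callback takes no
    length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an SKB: a buffer on a singly linked queue. */
struct buf {
    struct buf *next;
    size_t len;
};

/* Hypothetical stand-in for a socket with a receive queue and a callback. */
struct endpoint {
    struct buf *head, *tail;
    int ready_calls;
    void (*data_ready)(struct endpoint *ep); /* note: no length argument */
};

static void enqueue_tail(struct endpoint *ep, struct buf *b)
{
    b->next = NULL;
    if (ep->tail)
        ep->tail->next = b;
    else
        ep->head = b;
    ep->tail = b;
}

/* Consumer side: removes the first buffer; the caller may free it. */
static struct buf *dequeue(struct endpoint *ep)
{
    struct buf *b = ep->head;

    if (b) {
        ep->head = b->next;
        if (!ep->head)
            ep->tail = NULL;
    }
    return b;
}

static void count_ready(struct endpoint *ep)
{
    ep->ready_calls++;
}

/*
 * Producer side: once enqueue_tail() returns, a consumer may already have
 * dequeued and freed b, so b must not be dereferenced again. Because the
 * callback takes no length, there is nothing left to read from b here.
 */
static void deliver(struct endpoint *ep, struct buf *b)
{
    assert(ep->data_ready != NULL);
    enqueue_tail(ep, b);
    ep->data_ready(ep);
}
```

    A consumer that needs the length can read it from the buffer it
    dequeues, which it owns at that point.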

    Based upon a patch by Eric Dumazet and his suggestion to audit this
    issue tree-wide.

    Signed-off-by: David S. Miller

    David S. Miller
     

11 Apr, 2014

7 commits

  • 'struct page' has two list_head fields: 'lru' and 'list'. Conveniently,
    they are unioned together. This means that code can use them
    interchangeably, which gets horribly confusing, as with this nugget
    from slab.c:

    > list_del(&page->lru);
    > if (page->active == cachep->num)
    > list_add(&page->list, &n->slabs_full);

    This patch makes the slab and slub code use page->lru universally instead
    of mixing ->list and ->lru.

    So, the new rule is: page->lru is what you use if you want to keep
    your page on a list. Don't like the fact that it's not called ->list?
    Too bad.
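
    Why the mixed usage works at all can be shown with a small userspace
    model. This is a sketch under stated assumptions: struct list_node,
    struct fake_page and the node_*() helpers are hypothetical and only
    mimic the shape of the kernel's list_head and struct page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct list_head. */
struct list_node {
    struct list_node *next, *prev;
};

/* Two names for the same storage, as with page->lru and page->list. */
struct fake_page {
    union {
        struct list_node lru;
        struct list_node list;
    };
    unsigned int active;
};

static void node_init(struct list_node *h)
{
    h->next = h;
    h->prev = h;
}

/* Insert n right after head. */
static void node_add(struct list_node *n, struct list_node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink n and leave it pointing at itself. */
static void node_del(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n;
    n->prev = n;
}

static int node_empty(const struct list_node *h)
{
    return h->next == h;
}
```

    Because both members share one address, adding through one name and
    deleting through the other operates on the same node, which is
    correct but hard to read. Using a single name removes the ambiguity
    without changing behavior.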

    Signed-off-by: Dave Hansen
    Acked-by: Christoph Lameter
    Acked-by: David Rientjes
    Cc: Pekka Enberg
    Signed-off-by: Andrew Morton
    Signed-off-by: Pekka Enberg

    Dave Hansen
     
  • Add support for the block multicast loopback QP creation flag along
    with the proper firmware API for that.

    Signed-off-by: Eli Cohen
    Signed-off-by: Or Gerlitz
    Signed-off-by: Roland Dreier

    Eli Cohen
     
  • On systems with CONFIG_COMPAT we introduced the new requirement that
    audit_classify_compat_syscall() exists. This wasn't true for everything
    (apparently not for "tilegx", which I know less than nothing about.)

    Instead of wrapping the preprocessor optimization with CONFIG_COMPAT we
    should have used the new CONFIG_AUDIT_COMPAT_GENERIC. This patch uses
    that config option to make sure only arches which intend to implement
    this have the requirement.
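
    The idea of gating on a dedicated option rather than a broad one can
    be sketched in plain C. Everything named here is a hypothetical
    stand-in (DEMO_COMPAT_GENERIC, DEMO_ARCH_64BIT, demo_is_compat()),
    not the kernel's actual definitions: when the option is absent the
    check collapses to a constant, so nothing has to supply a compat
    classifier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical switch standing in for a Kconfig option. Remove this
 * define to get the fallback branch below.
 */
#define DEMO_COMPAT_GENERIC 1

/* Hypothetical flag bit marking a 64-bit architecture value. */
#define DEMO_ARCH_64BIT 0x80000000u

#ifdef DEMO_COMPAT_GENERIC
/* Option enabled: a value without the 64-bit flag counts as compat. */
static bool demo_is_compat(uint32_t arch)
{
    return !(arch & DEMO_ARCH_64BIT);
}
#else
/* Option disabled: never compat, so no compat helper is required. */
static bool demo_is_compat(uint32_t arch)
{
    (void)arch;
    return false;
}
#endif
```

    With the fallback returning a constant, a compiler can typically drop
    any branch guarded by it, so no reference to a compat-only helper
    needs to survive on architectures that do not provide one.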

    This works fine for tilegx according to Chris Metcalf.
    Signed-off-by: Eric Paris

    Chris Metcalf
     
  • For commands returned with failed status, queue these for resubmission
    and continue retrying them until success or for a limited amount of
    time. The final timeout was arbitrarily chosen so requests can't be
    retried indefinitely.

    Since these are requeued on the nvmeq that submitted the command, the
    callbacks have to take an nvmeq instead of an nvme_dev as a parameter
    so that we can use the locked queue to append the iod to retry later.

    The nvme_iod conveniently can be used to track how long we've been trying
    to successfully complete an iod request. The nvme_iod also provides the
    nvme prp dma mappings, so I had to move a few things around so we can
    keep those mappings.
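
    The retry decision described above can be sketched as follows. This
    is a minimal model, not the driver's code: the enums, struct
    request_track, RETRY_WINDOW_TICKS and handle_completion() are
    hypothetical names, and the window length is an arbitrary value
    picked for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum cmd_status { STATUS_OK, STATUS_RETRYABLE, STATUS_FATAL };
enum cmd_action { ACTION_COMPLETE, ACTION_REQUEUE, ACTION_FAIL };

/* Hypothetical, arbitrary limit on how long a request may be retried. */
#define RETRY_WINDOW_TICKS 60UL

/* Per-request bookkeeping, playing the role the text gives to the iod. */
struct request_track {
    unsigned long start_time; /* tick of the first submission */
    unsigned int retries;     /* how many times it was requeued */
};

/* Unsigned subtraction keeps the comparison valid across tick wraparound. */
static bool within_retry_window(const struct request_track *rt,
                                unsigned long now)
{
    return (now - rt->start_time) < RETRY_WINDOW_TICKS;
}

/*
 * Decide what to do with a completed command: finish it on success,
 * requeue it on a non-fatal failure inside the window, fail it otherwise.
 */
static enum cmd_action handle_completion(struct request_track *rt,
                                         enum cmd_status status,
                                         unsigned long now)
{
    if (status == STATUS_OK)
        return ACTION_COMPLETE;
    if (status == STATUS_RETRYABLE && within_retry_window(rt, now)) {
        rt->retries++;
        return ACTION_REQUEUE;
    }
    return ACTION_FAIL;
}
```

    Measuring from the first submission rather than from the latest retry
    is what bounds the total time, so a request cannot be requeued
    indefinitely.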

    Signed-off-by: Keith Busch
    [fixed checkpatch issue with long line]
    Signed-off-by: Matthew Wilcox

    Keith Busch
     
  • Increase the default timeout to 30 seconds to match SCSI.

    Signed-off-by: Keith Busch
    [use byte instead of ushort]
    Signed-off-by: Matthew Wilcox

    Keith Busch
     
  • Registers with the hot cpu notifier to rebalance io queues and
    potentially allocate additional ones.

    Signed-off-by: Keith Busch
    Signed-off-by: Matthew Wilcox

    Keith Busch
     
  • The device's IO queues are associated with CPUs, so we can use a per-cpu
    variable to map a qid to a cpu. This provides a convenient way
    to optimally assign queues to multiple cpus when the device supports
    fewer queues than the host has cpus. The previous implementation may
    have assigned these poorly in these situations. This patch addresses
    this by sharing queues among cpus that are "close" together and should
    have a lower lock contention penalty.
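
    One simple way to share queues among neighbouring cpus is sketched
    below. It is an illustration only and assumes that adjacent cpu
    numbers are "close", which real topologies do not guarantee;
    cpu_to_queue() and fill_queue_map() are hypothetical helpers, and the
    returned index is zero-based rather than a device qid.

```c
#include <assert.h>

/*
 * Map a cpu to a queue index by splitting the cpu range into nr_queues
 * contiguous blocks of near-equal size. If there are at least as many
 * queues as cpus, every cpu gets its own queue.
 */
static unsigned int cpu_to_queue(unsigned int cpu, unsigned int nr_cpus,
                                 unsigned int nr_queues)
{
    assert(nr_cpus > 0 && nr_queues > 0 && cpu < nr_cpus);
    if (nr_queues > nr_cpus)
        nr_queues = nr_cpus;
    /* Widen before multiplying to avoid overflow on large counts. */
    return (unsigned int)(((unsigned long long)cpu * nr_queues) / nr_cpus);
}

/* Fill a per-cpu table; map must have room for nr_cpus entries. */
static void fill_queue_map(unsigned int *map, unsigned int nr_cpus,
                           unsigned int nr_queues)
{
    for (unsigned int cpu = 0; cpu < nr_cpus; cpu++)
        map[cpu] = cpu_to_queue(cpu, nr_cpus, nr_queues);
}
```

    With 8 cpus and 2 queues, cpus 0-3 share the first queue and cpus 4-7
    the second. A topology-aware version would group by shared core or
    socket instead of by cpu number.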

    Signed-off-by: Keith Busch
    Signed-off-by: Matthew Wilcox

    Keith Busch