29 Jan, 2014

3 commits

  • Signed-off-by: David S. Miller

    David S. Miller
     
  • The pci.o is built for SPARC64_PCI -- which is bool, and hence
    this code is either present or absent. It will never be modular,
    so using module_init as an alias for __initcall can be somewhat
    misleading.

    Fix this up now, so that we can relocate module_init from
    init.h into module.h in the future. If we don't do this, we'd
    have to add module.h to obviously non-modular code, and that
    would be a worse thing.

    Note that direct use of __initcall is discouraged, vs. one
    of the priority categorized subgroups. As __initcall gets
    mapped onto device_initcall, our use of device_initcall
    directly in this change means that the runtime impact is
    zero -- it will remain at level 6 in initcall ordering.

    Signed-off-by: Paul Gortmaker
    Signed-off-by: David S. Miller

    Paul Gortmaker
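
    The alias chain described in this commit can be sketched in userspace C.
    This is only an illustration: the kernel macros place function pointers
    into linker sections at build time, whereas the stand-ins below (the
    register_initcall() helper, the initcalls table and both pci_init_*
    functions are hypothetical) record them at run time. The sketch shows
    that module_init(), for built-in code, and device_initcall() end up at
    the same level 6.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins only; the real macros live in the kernel headers. */
typedef int (*initcall_t)(void);

struct initcall_entry {
    int level;
    initcall_t fn;
};

#define MAX_INITCALLS 16
static struct initcall_entry initcalls[MAX_INITCALLS];
static size_t nr_initcalls;

/* Hypothetical helper: remember which level a function was registered at. */
void register_initcall(int level, initcall_t fn)
{
    assert(nr_initcalls < MAX_INITCALLS);
    initcalls[nr_initcalls].level = level;
    initcalls[nr_initcalls].fn = fn;
    nr_initcalls++;
}

/* The alias chain from the commit message: for built-in code,
 * module_init -> __initcall -> device_initcall (level 6). */
#define device_initcall(fn) register_initcall(6, (fn))
#define __initcall(fn)      device_initcall(fn)
#define module_init(fn)     __initcall(fn)

/* Look up the level a function was registered at, or -1 if unknown. */
int initcall_level(initcall_t fn)
{
    for (size_t i = 0; i < nr_initcalls; i++)
        if (initcalls[i].fn == fn)
            return initcalls[i].level;
    return -1;
}

/* Hypothetical init functions, one per registration style. */
int pci_init_old_style(void) { return 0; }
int pci_init_new_style(void) { return 0; }
```

    Registering pci_init_old_style with module_init() and pci_init_new_style
    with device_initcall() yields level 6 for both, which is why the change
    has no effect on initcall ordering.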
     
  • None of these files are actually using any __init type directives
    and hence don't need to include <linux/init.h>. Most are just a
    left over from __devinit and __cpuinit removal, or simply due to
    code getting copied from one driver to the next.

    Signed-off-by: Paul Gortmaker
    Signed-off-by: David S. Miller

    Paul Gortmaker
     

26 Jan, 2014

1 commit

  • Pull networking updates from David Miller:

    1) BPF debugger and asm tool by Daniel Borkmann.

    2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann.

    3) Correct reciprocal_divide and update users, from Hannes Frederic
    Sowa and Daniel Borkmann.

    4) Currently we only have a "set" operation for the hw timestamp socket
    ioctl, add a "get" operation to match. From Ben Hutchings.

    5) Add better trace events for debugging driver datapath problems, also
    from Ben Hutchings.

    6) Implement auto corking in TCP, from Eric Dumazet. Basically, if we
    have a small send and a previous packet is already in the qdisc or
    device queue, defer until TX completion or we get more data.

    7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko.

    8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel
    Borkmann.

    9) Share IP header compression code between Bluetooth and IEEE802154
    layers, from Jukka Rissanen.

    10) Fix ipv6 router reachability probing, from Jiri Benc.

    11) Allow packets to be captured on macvtap devices, from Vlad Yasevich.

    12) Support tunneling in GRO layer, from Jerry Chu.

    13) Allow bonding to be configured fully using netlink, from Scott
    Feldman.

    14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can
    already get the TCI. From Atzm Watanabe.

    15) New "Heavy Hitter" qdisc, from Terry Lam.

    16) Significantly improve the IPSEC support in pktgen, from Fan Du.

    17) Allow ipv4 tunnels to cache routes, just like sockets. From Tom
    Herbert.

    18) Add Proportional Integral Enhanced packet scheduler, from Vijay
    Subramanian.

    19) Allow openvswitch to use mmap'd netlink, from Thomas Graf.

    20) Key TCP metrics blobs also by source address, not just destination
    address. From Christoph Paasch.

    21) Support 10G in generic phylib. From Andy Fleming.

    22) Try to short-circuit GRO flow compares using device provided RX
    hash, if provided. From Tom Herbert.

    The wireless and netfilter folks have been busy little bees too.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits)
    net/cxgb4: Fix referencing freed adapter
    ipv6: reallocate addrconf router for ipv6 address when lo device up
    fib_frontend: fix possible NULL pointer dereference
    rtnetlink: remove IFLA_BOND_SLAVE definition
    rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info
    qlcnic: update version to 5.3.55
    qlcnic: Enhance logic to calculate msix vectors.
    qlcnic: Refactor interrupt coalescing code for all adapters.
    qlcnic: Update poll controller code path
    qlcnic: Interrupt code cleanup
    qlcnic: Enhance Tx timeout debugging.
    qlcnic: Use bool for rx_mac_learn.
    bonding: fix u64 division
    rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC
    sfc: Use the correct maximum TX DMA ring size for SFC9100
    Add Shradha Shah as the sfc driver maintainer.
    net/vxlan: Share RX skb de-marking and checksum checks with ovs
    tulip: cleanup by using ARRAY_SIZE()
    ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called
    net/cxgb4: Don't retrieve stats during recovery
    ...

    Linus Torvalds
     

25 Jan, 2014

1 commit

  • Pull input subsystem updates from Dmitry Torokhov:
    "Just a swath of driver fixes and cleanups, no new drivers this time
    (although ALPS now supports one of the newer protocols, more to come)"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (57 commits)
    Input: wacom - add support for DTU-1031
    Input: wacom - fix wacom->shared guards for dual input devices
    Input: edt_ft5x06 - use devm_* functions where appropriate
    Input: hyperv-keyboard - pass through 0xE1 prefix
    Input: logips2pp - fix spelling s/reciver/receiver/
    Input: delete non-required instances of include
    Input: twl4030-keypad - convert to using managed resources
    Input: twl6040-vibra - remove unneeded check for CONFIG_OF
    Input: twl4030-keypad - add device tree support
    Input: twl6040-vibra - add missing of_node_put
    Input: twl4030-vibra - add missing of_node_put
    Input: i8042 - cleanup SERIO_I8042 dependencies
    Input: i8042 - select ARCH_MIGHT_HAVE_PC_SERIO on x86
    Input: i8042 - select ARCH_MIGHT_HAVE_PC_SERIO on unicore32
    Input: i8042 - select ARCH_MIGHT_HAVE_PC_SERIO on sparc
    Input: i8042 - select ARCH_MIGHT_HAVE_PC_SERIO for SH_CAYMAN
    Input: i8042 - select ARCH_MIGHT_HAVE_PC_SERIO on powerpc
    Input: i8042 - select ARCH_MIGHT_HAVE_PC_SERIO on mips
    Input: i8042 - select ARCH_MIGHT_HAVE_PC_SERIO on IA64
    Input: i8042 - select ARCH_MIGHT_HAVE_PC_SERIO on ARM/Footbridge
    ...

    Linus Torvalds
     

24 Jan, 2014

1 commit

  • Remove an outdated reference to "most personal computers" having only one
    CPU, and change the use of "singleprocessor" and "single processor" in
    CONFIG_SMP's documentation to "uniprocessor" across all arches where that
    documentation is present.

    Signed-off-by: Robert Graffham
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Robert Graffham
     

23 Jan, 2014

1 commit

  • Pull PCI updates from Bjorn Helgaas:
    "PCI changes for the v3.14 merge window:

    Resource management
    - Change pci_bus_region addresses to dma_addr_t (Bjorn Helgaas)
    - Support 64-bit AGP BARs (Bjorn Helgaas, Yinghai Lu)
    - Add pci_bus_address() to get bus address of a BAR (Bjorn Helgaas)
    - Use pci_resource_start() for CPU address of AGP BARs (Bjorn Helgaas)
    - Enforce bus address limits in resource allocation (Yinghai Lu)
    - Allocate 64-bit BARs above 4G when possible (Yinghai Lu)
    - Convert pcibios_resource_to_bus() to take pci_bus, not pci_dev (Yinghai Lu)

    PCI device hotplug
    - Major rescan/remove locking update (Rafael J. Wysocki)
    - Make ioapic builtin only (not modular) (Yinghai Lu)
    - Fix release/free issues (Yinghai Lu)
    - Clean up pciehp (Bjorn Helgaas)
    - Announce pciehp slot info during enumeration (Bjorn Helgaas)

    MSI
    - Add pci_msi_vec_count(), pci_msix_vec_count() (Alexander Gordeev)
    - Add pci_enable_msi_range(), pci_enable_msix_range() (Alexander Gordeev)
    - Deprecate "tri-state" interfaces: fail/success/fail+info (Alexander Gordeev)
    - Export MSI mode using attributes, not kobjects (Greg Kroah-Hartman)
    - Drop "irq" param from *_restore_msi_irqs() (DuanZhenzhong)

    SR-IOV
    - Clear NumVFs when disabling SR-IOV in sriov_init() (ethan.zhao)

    Virtualization
    - Add support for save/restore of extended capabilities (Alex Williamson)
    - Add Virtual Channel to save/restore support (Alex Williamson)
    - Never treat a VF as a multifunction device (Alex Williamson)
    - Add pci_try_reset_function(), et al (Alex Williamson)

    AER
    - Ignore non-PCIe error sources (Betty Dall)
    - Support ACPI HEST error sources for domains other than 0 (Betty Dall)
    - Consolidate HEST error source parsers (Bjorn Helgaas)
    - Add a TLP header print helper (Borislav Petkov)

    Freescale i.MX6
    - Remove unnecessary code (Fabio Estevam)
    - Make reset-gpio optional (Marek Vasut)
    - Report "link up" only after link training completes (Marek Vasut)
    - Start link in Gen1 before negotiating for Gen2 mode (Marek Vasut)
    - Fix PCIe startup code (Richard Zhu)

    Marvell MVEBU
    - Remove duplicate of_clk_get_by_name() call (Andrew Lunn)
    - Drop writes to bridge Secondary Status register (Jason Gunthorpe)
    - Obey bridge PCI_COMMAND_MEM and PCI_COMMAND_IO bits (Jason Gunthorpe)
    - Support a bridge with no IO port window (Jason Gunthorpe)
    - Use max_t() instead of max(resource_size_t,) (Jingoo Han)
    - Remove redundant of_match_ptr (Sachin Kamat)
    - Call pci_ioremap_io() at startup instead of dynamically (Thomas Petazzoni)

    NVIDIA Tegra
    - Disable Gen2 for Tegra20 and Tegra30 (Eric Brower)

    Renesas R-Car
    - Add runtime PM support (Valentine Barshak)
    - Fix rcar_pci_probe() return value check (Wei Yongjun)

    Synopsys DesignWare
    - Fix crash in dw_msi_teardown_irq() (Bjørn Erik Nilsen)
    - Remove redundant call to pci_write_config_word() (Bjørn Erik Nilsen)
    - Fix missing MSI IRQs (Harro Haan)
    - Add dw_pcie prefix before cfg_read/write (Pratyush Anand)
    - Fix I/O transfers by using CPU (not realio) address (Pratyush Anand)
    - Whitespace cleanup (Jingoo Han)

    EISA
    - Call put_device() if device_register() fails (Levente Kurusa)
    - Revert EISA initialization breakage (Bjorn Helgaas)

    Miscellaneous
    - Remove unused code, including PCIe 3.0 interfaces (Stephen Hemminger)
    - Prevent bus conflicts while checking for bridge apertures (Bjorn Helgaas)
    - Stop clearing bridge Secondary Status when setting up I/O aperture (Bjorn Helgaas)
    - Use dev_is_pci() to identify PCI devices (Yijing Wang)
    - Deprecate DEFINE_PCI_DEVICE_TABLE (Joe Perches)
    - Update documentation 00-INDEX (Erik Ekman)"

    * tag 'pci-v3.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (119 commits)
    Revert "EISA: Initialize device before its resources"
    Revert "EISA: Log device resources in dmesg"
    vfio-pci: Use pci "try" reset interface
    PCI: Check parent kobject in pci_destroy_dev()
    xen/pcifront: Use global PCI rescan-remove locking
    powerpc/eeh: Use global PCI rescan-remove locking
    PCI: Fix pci_check_and_unmask_intx() comment typos
    PCI: Add pci_try_reset_function(), pci_try_reset_slot(), pci_try_reset_bus()
    MPT / PCI: Use pci_stop_and_remove_bus_device_locked()
    platform / x86: Use global PCI rescan-remove locking
    PCI: hotplug: Use global PCI rescan-remove locking
    pcmcia: Use global PCI rescan-remove locking
    ACPI / hotplug / PCI: Use global PCI rescan-remove locking
    ACPI / PCI: Use global PCI rescan-remove locking in PCI root hotplug
    PCI: Add global pci_lock_rescan_remove()
    PCI: Cleanup pci.h whitespace
    PCI: Reorder so actual code comes before stubs
    PCI/AER: Support ACPI HEST AER error sources for PCI domains other than 0
    ACPICA: Add helper macros to extract bus/segment numbers from HEST table.
    PCI: Make local functions static
    ...

    Linus Torvalds
     

22 Jan, 2014

1 commit

  • [sfr@canb.auug.org.au: fix powerpc build]
    Signed-off-by: Tang Chen
    Reviewed-by: Zhang Yanfei
    Cc: "H. Peter Anvin"
    Cc: "Rafael J . Wysocki"
    Cc: Chen Tang
    Cc: Gong Chen
    Cc: Ingo Molnar
    Cc: Jiang Liu
    Cc: Johannes Weiner
    Cc: Lai Jiangshan
    Cc: Larry Woodman
    Cc: Len Brown
    Cc: Liu Jiang
    Cc: Mel Gorman
    Cc: Michal Nazarewicz
    Cc: Minchan Kim
    Cc: Prarit Bhargava
    Cc: Rik van Riel
    Cc: Taku Izumi
    Cc: Tejun Heo
    Cc: Thomas Gleixner
    Cc: Thomas Renninger
    Cc: Toshi Kani
    Cc: Vasilis Liaskovitis
    Cc: Wanpeng Li
    Cc: Wen Congyang
    Cc: Yasuaki Ishimatsu
    Cc: Yinghai Lu
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tang Chen
     

21 Jan, 2014

1 commit

  • Pull core locking changes from Ingo Molnar:
    - futex performance increases: larger hashes, smarter wakeups
    - mutex debugging improvements
    - lots of SMP ordering documentation updates
    - introduce the smp_load_acquire(), smp_store_release() primitives.
    (There are WIP patches that make use of them - not yet merged)
    - lockdep micro-optimizations
    - lockdep improvement: better cover IRQ contexts
    - liblockdep at last. We'll continue to monitor how useful this is

    * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
    futexes: Fix futex_hashsize initialization
    arch: Re-sort some Kbuild files to hopefully help avoid some conflicts
    futexes: Avoid taking the hb->lock if there's nothing to wake up
    futexes: Document multiprocessor ordering guarantees
    futexes: Increase hash table size for better performance
    futexes: Clean up various details
    arch: Introduce smp_load_acquire(), smp_store_release()
    arch: Clean up asm/barrier.h implementations using asm-generic/barrier.h
    arch: Move smp_mb__{before,after}_atomic_{inc,dec}.h into asm/atomic.h
    locking/doc: Rename LOCK/UNLOCK to ACQUIRE/RELEASE
    mutexes: Give more informative mutex warning in the !lock->owner case
    powerpc: Full barrier for smp_mb__after_unlock_lock()
    rcu: Apply smp_mb__after_unlock_lock() to preserve grace periods
    Documentation/memory-barriers.txt: Downgrade UNLOCK+BLOCK
    locking: Add an smp_mb__after_unlock_lock() for UNLOCK+BLOCK barrier
    Documentation/memory-barriers.txt: Document ACCESS_ONCE()
    Documentation/memory-barriers.txt: Prohibit speculative writes
    Documentation/memory-barriers.txt: Add long atomic examples to memory-barriers.txt
    Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt
    Revert "smp/cpumask: Make CONFIG_CPUMASK_OFFSTACK=y usable without debug dependency"
    ...

    Linus Torvalds
     

19 Jan, 2014

1 commit

  • For user space packet capturing libraries such as libpcap, there's
    currently only one way to check which BPF extensions are supported
    by the kernel, that is, commit aa1113d9f85d ("net: filter: return
    -EINVAL if BPF_S_ANC* operation is not supported"). For querying all
    extensions at once this might be rather inconvenient.

    Therefore, this patch introduces a new option which can be used as
    an argument for getsockopt(), and allows one to obtain information
    about which BPF extensions are supported by the current kernel.

    As David Miller suggests, we do not need to define any bits right
    now and status quo can just return 0 in order to state that this
    version supports SKF_AD_PROTOCOL up to SKF_AD_PAY_OFFSET. Later
    additions to BPF extensions need to add their bits to the
    bpf_tell_extensions() function, as documented in the comment.

    Signed-off-by: Michal Sekletar
    Cc: David Miller
    Reviewed-by: Daniel Borkmann
    Signed-off-by: David S. Miller

    Michal Sekletar
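
    From userspace, the query described above might look like the sketch
    below. The commit text does not name the option; SO_BPF_EXTENSIONS and
    the fallback value 48 are assumptions taken from the generic socket
    headers (some architectures use a different number), and
    query_bpf_extensions() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Assumption: value used by the generic socket header; only used as a
 * fallback when the libc headers do not provide the constant. */
#ifndef SO_BPF_EXTENSIONS
#define SO_BPF_EXTENSIONS 48
#endif

/* Hypothetical helper: returns 0 and stores the kernel's answer in *ext,
 * or returns -1 if the socket or the option is unavailable. */
int query_bpf_extensions(int *ext)
{
    socklen_t len = sizeof(*ext);
    int fd, rc;

    assert(ext != NULL);

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    rc = getsockopt(fd, SOL_SOCKET, SO_BPF_EXTENSIONS, ext, &len);
    close(fd);

    return rc == 0 ? 0 : -1;
}
```

    A capture library could call this once at startup and fall back to
    probing individual extensions when it returns -1, for example on a
    kernel that predates the option.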
     

18 Jan, 2014

1 commit


16 Jan, 2014

1 commit

  • At first Jakub Zawadzki noticed that some divisions by reciprocal_divide
    were not correct. (off by one in some cases)
    http://www.wireshark.org/~darkjames/reciprocal-buggy.c

    He could also show this with BPF:
    http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c

    The reciprocal divide in the Linux kernel is not generic enough,
    so let's remove its use in BPF; it is not worth the pain with
    current CPUs.

    Signed-off-by: Eric Dumazet
    Reported-by: Jakub Zawadzki
    Cc: Mircea Gherzan
    Cc: Daniel Borkmann
    Cc: Hannes Frederic Sowa
    Cc: Matt Evans
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: David S. Miller
    Signed-off-by: David S. Miller

    Eric Dumazet
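
    The off-by-one can be reproduced with a small userspace sketch of a
    multiply-and-shift reciprocal. The two functions below are an assumed
    reconstruction of the scheme (a 32-bit rounded-up reciprocal followed by
    a 64-bit multiply and shift), not a copy of the kernel source, and the
    old_ prefix is only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed form of the reciprocal: ceil(2^32 / k), truncated to 32 bits. */
uint32_t old_reciprocal_value(uint32_t k)
{
    uint64_t val;

    assert(k != 0);
    val = (1ULL << 32) + (k - 1);
    return (uint32_t)(val / k);
}

/* Approximates a / k as (a * r) >> 32, where r is the reciprocal of k. */
uint32_t old_reciprocal_divide(uint32_t a, uint32_t r)
{
    return (uint32_t)(((uint64_t)a * r) >> 32);
}
```

    With k = 3 the reciprocal is 1431655766, and for a = 0x80000000 the
    multiply-and-shift yields 715827883, while the exact quotient
    0x80000000 / 3 is 715827882. The rounding error in the reciprocal only
    becomes visible for large dividends, which is why such bugs are easy to
    miss in testing.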
     

15 Jan, 2014

1 commit


13 Jan, 2014

1 commit


12 Jan, 2014

2 commits

  • A number of situations currently require the heavyweight smp_mb(),
    even though there is no need to order prior stores against later
    loads. Many architectures have much cheaper ways to handle these
    situations, but the Linux kernel currently has no portable way
    to make use of them.

    This commit therefore supplies smp_load_acquire() and
    smp_store_release() to remedy this situation. The new
    smp_load_acquire() primitive orders the specified load against
    any subsequent reads or writes, while the new smp_store_release()
    primitive orders the specified store against any prior reads or
    writes. These primitives allow array-based circular FIFOs to be
    implemented without an smp_mb(), and also allow a theoretical
    hole in rcu_assign_pointer() to be closed at no additional
    expense on most architectures.

    In addition, the RCU experience transitioning from explicit
    smp_read_barrier_depends() and smp_wmb() to rcu_dereference()
    and rcu_assign_pointer(), respectively, resulted in substantial
    improvements in readability. It therefore seems likely that
    replacing other explicit barriers with smp_load_acquire() and
    smp_store_release() will provide similar benefits. It appears
    that roughly half of the explicit barriers in core kernel code
    might be so replaced.

    [Changelog by PaulMck]

    Reviewed-by: "Paul E. McKenney"
    Signed-off-by: Peter Zijlstra
    Acked-by: Will Deacon
    Cc: Benjamin Herrenschmidt
    Cc: Frederic Weisbecker
    Cc: Mathieu Desnoyers
    Cc: Michael Ellerman
    Cc: Michael Neuling
    Cc: Russell King
    Cc: Geert Uytterhoeven
    Cc: Heiko Carstens
    Cc: Linus Torvalds
    Cc: Martin Schwidefsky
    Cc: Victor Kaplansky
    Cc: Tony Luck
    Cc: Oleg Nesterov
    Link: http://lkml.kernel.org/r/20131213150640.908486364@infradead.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
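
    The array-based circular FIFO mentioned above can be sketched in
    userspace with C11 atomics, whose memory_order_acquire loads and
    memory_order_release stores play a role analogous to smp_load_acquire()
    and smp_store_release(). This is an analogy rather than kernel code:
    struct ring, ring_push() and ring_pop() are hypothetical names, and the
    sketch assumes exactly one producer and one consumer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u /* must be a power of two so index wraparound works */

struct ring {
    int slots[RING_SIZE];
    atomic_uint head; /* next slot to fill; written only by the producer */
    atomic_uint tail; /* next slot to drain; written only by the consumer */
};

/* Producer side: returns false when the ring is full. */
bool ring_push(struct ring *r, int v)
{
    unsigned int head, tail;

    assert(r != NULL);
    head = atomic_load_explicit(&r->head, memory_order_relaxed);
    /* Acquire: the consumer's reads of a slot finish before we reuse it. */
    tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SIZE)
        return false;

    r->slots[head % RING_SIZE] = v;
    /* Release: the slot contents are visible before the new head is. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty. */
bool ring_pop(struct ring *r, int *v)
{
    unsigned int head, tail;

    assert(r != NULL && v != NULL);
    tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    /* Acquire: pairs with the producer's release store to head. */
    head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return false;

    *v = r->slots[tail % RING_SIZE];
    /* Release: our read of the slot completes before the slot is freed. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

    Neither path needs a full barrier: each side only has to order its slot
    access against its own index update, which is the store-release and
    load-acquire pairing the commit describes.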
     
  • We're going to be adding a few new barrier primitives, and in order to
    avoid endless duplication, make more aggressive use of
    asm-generic/barrier.h.

    Change the asm-generic/barrier.h such that it allows partial barrier
    definitions and fills out the rest with defaults.

    There are a few architectures (m32r, m68k) that could probably
    do away with their barrier.h file entirely but are kept for now due to
    their unconventional nop() implementation.

    Suggested-by: Geert Uytterhoeven
    Reviewed-by: "Paul E. McKenney"
    Reviewed-by: Mathieu Desnoyers
    Signed-off-by: Peter Zijlstra
    Cc: Michael Ellerman
    Cc: Michael Neuling
    Cc: Russell King
    Cc: Heiko Carstens
    Cc: Linus Torvalds
    Cc: Martin Schwidefsky
    Cc: Victor Kaplansky
    Cc: Tony Luck
    Cc: Oleg Nesterov
    Cc: Benjamin Herrenschmidt
    Cc: Frederic Weisbecker
    Link: http://lkml.kernel.org/r/20131213150640.846368594@infradead.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

11 Jan, 2014

1 commit

  • * pci/resource:
    PCI: Allocate 64-bit BARs above 4G when possible
    PCI: Enforce bus address limits in resource allocation
    PCI: Split out bridge window override of minimum allocation address
    agp/ati: Use PCI_COMMAND instead of hard-coded 4
    agp/intel: Use CPU physical address, not bus address, for ioremap()
    agp/intel: Use pci_bus_address() to get GTTADR bus address
    agp/intel: Use pci_bus_address() to get MMADR bus address
    agp/intel: Support 64-bit GMADR
    agp/intel: Rename gtt_bus_addr to gtt_phys_addr
    drm/i915: Rename gtt_bus_addr to gtt_phys_addr
    agp: Use pci_resource_start() to get CPU physical address for BAR
    agp: Support 64-bit APBASE
    PCI: Add pci_bus_address() to get bus address of a BAR
    PCI: Convert pcibios_resource_to_bus() to take a pci_bus, not a pci_dev
    PCI: Change pci_bus_region addresses to dma_addr_t

    Bjorn Helgaas
     

07 Jan, 2014

1 commit


05 Jan, 2014

4 commits

  • Pull sparc bugfixes from David Miller:

    1) Missing include can lead to build failure, from Kirill Tkhai.

    2) Use dev_is_pci() where applicable, from Yijing Wang.

    3) Enable irqs after we enable preemption in cpu startup path, from
    Kirill Tkhai.

    4) Revert a __copy_{to,from}_user_inatomic change that broke
    iov_iter_copy_from_user_atomic() and thus several tests in xfstests
    and LTP. From Dave Kleikamp.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
    Revert "sparc64: Fix __copy_{to,from}_user_inatomic defines."
    sparc64: smp_callin: Enable irqs after preemption is disabled
    sparc/PCI: Use dev_is_pci() to identify PCI devices
    sparc64: Fix build regression

    Linus Torvalds
     
  • This reverts commit 145e1c0023585e0e8f6df22316308ec61c5066b2.

    This commit broke the behavior of __copy_from_user_inatomic when
    it is only partially successful. Instead of returning the number
    of bytes not copied, it now returns 1. This translates to the
    wrong value being returned by iov_iter_copy_from_user_atomic.

    xfstests generic/246 and LTP writev01 both fail on btrfs and nfs
    because of this.

    Signed-off-by: Dave Kleikamp
    Cc: Hugh Dickins
    Cc: David S. Miller
    Cc: sparclinux@vger.kernel.org
    Signed-off-by: David S. Miller

    Dave Kleikamp
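
    Why a return value of 1 breaks callers can be shown with a userspace
    sketch. All names below are hypothetical stand-ins: mock_copy_inatomic()
    mimics a copy that faults after `ok` bytes and returns the number of
    bytes not copied, the _broken variant collapses any shortfall to 1, and
    bytes_copied() is the arithmetic a caller in the style of
    iov_iter_copy_from_user_atomic would perform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: copies at most `ok` bytes, then "faults".
 * Returns the number of bytes NOT copied (the expected contract). */
size_t mock_copy_inatomic(void *dst, const void *src, size_t len, size_t ok)
{
    size_t done = len < ok ? len : ok;

    assert(dst != NULL && src != NULL);
    memcpy(dst, src, done);
    return len - done;
}

/* Broken variant: reports any partial failure as 1. */
size_t mock_copy_inatomic_broken(void *dst, const void *src, size_t len,
                                 size_t ok)
{
    return mock_copy_inatomic(dst, src, len, ok) ? 1 : 0;
}

/* What a caller computes from the primitive's return value. */
size_t bytes_copied(size_t len, size_t left)
{
    return len - left;
}
```

    For a 16-byte request that faults after 10 bytes, the correct contract
    lets the caller compute 10 bytes copied, while the broken variant makes
    it compute 15, so the caller believes more data arrived than actually
    did.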
     
  • Most other architectures use the order suggested below,
    so let's do the same to better fit the generic idle loop scheme.

    Signed-off-by: Kirill Tkhai
    Signed-off-by: David S. Miller

    Kirill Tkhai
     
  • Use dev_is_pci() instead of checking bus type directly.

    Signed-off-by: Yijing Wang
    Signed-off-by: Bjorn Helgaas
    Signed-off-by: David S. Miller

    Yijing Wang
     

03 Jan, 2014

1 commit


22 Dec, 2013

1 commit

  • These interfaces:

    pcibios_resource_to_bus(struct pci_dev *dev, *bus_region, *resource)
    pcibios_bus_to_resource(struct pci_dev *dev, *resource, *bus_region)

    took a pci_dev, but they really depend only on the pci_bus. And we want to
    use them in resource allocation paths where we have the bus but not a
    device, so this patch converts them to take the pci_bus instead of the
    pci_dev:

    pcibios_resource_to_bus(struct pci_bus *bus, *bus_region, *resource)
    pcibios_bus_to_resource(struct pci_bus *bus, *resource, *bus_region)

    In fact, with standard PCI-PCI bridges, they only depend on the host
    bridge, because that's the only place address translation occurs, but
    we aren't going that far yet.

    [bhelgaas: changelog]
    Signed-off-by: Yinghai Lu
    Signed-off-by: Bjorn Helgaas

    Yinghai Lu
     

19 Dec, 2013

1 commit

  • There are a few subtle races, between change_protection_range (used by
    mprotect and change_prot_numa) on one side, and NUMA page migration and
    compaction on the other side.

    The basic race is that there is a time window between when the PTE gets
    made non-present (PROT_NONE or NUMA), and the TLB is flushed.

    During that time, a CPU may continue writing to the page.

    This is fine most of the time, however compaction or the NUMA migration
    code may come in, and migrate the page away.

    When that happens, the CPU may continue writing, through the cached
    translation, to what is no longer the current memory location of the
    process.

    This only affects x86, which has a somewhat optimistic pte_accessible.
    All other architectures appear to be safe, and will either always flush,
    or flush whenever there is a valid mapping, even with no permissions
    (SPARC).

    The basic race looks like this:

    CPU A                    CPU B                       CPU C

                                                         load TLB entry
    make entry PTE/PMD_NUMA
                             fault on entry
                                                         read/write old page
                             start migrating page
                             change PTE/PMD to new page
                                                         read/write old page [*]
    flush TLB
                                                         reload TLB from new entry
                                                         read/write new page
                                                         lose data

    [*] the old page may belong to a new user at this point!

    The obvious fix is to flush remote TLB entries, by making sure that
    pte_accessible is aware of the fact that PROT_NONE and PROT_NUMA memory may
    still be accessible if there is a TLB flush pending for the mm.

    This should fix both NUMA migration and compaction.

    [mgorman@suse.de: fix build]
    Signed-off-by: Rik van Riel
    Signed-off-by: Mel Gorman
    Cc: Alex Thorlton
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rik van Riel
     

18 Dec, 2013

1 commit


12 Dec, 2013

1 commit


04 Dec, 2013

1 commit

  • This patch fixes a build error which was introduced by commit

    812cb83a56a908729c453a7db3fb2c262119bc9d (Implement HAVE_CONTEXT_TRACKING).

    [*]https://lkml.org/lkml/2013/11/23/103

    Signed-off-by: Kirill Tkhai
    CC: David Miller
    Signed-off-by: David S. Miller

    Kirill Tkhai
     

20 Nov, 2013

2 commits

  • Pull sparc fixes from David Miller:
    "Two merge window fallout build fixes"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
    sparc64: merge fix
    sparc64: fix build regession

    Linus Torvalds
     
  • Pull irq cleanups from Ingo Molnar:
    "This is a multi-arch cleanup series from Thomas Gleixner, which we
    kept to near the end of the merge window, to not interfere with
    architecture updates.

    This series (motivated by the -rt kernel) unifies more aspects of IRQ
    handling and generalizes PREEMPT_ACTIVE"

    * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    preempt: Make PREEMPT_ACTIVE generic
    sparc: Use preempt_schedule_irq
    ia64: Use preempt_schedule_irq
    m32r: Use preempt_schedule_irq
    hardirq: Make hardirq bits generic
    m68k: Simplify low level interrupt handling code
    genirq: Prevent spurious detection for unconditionally polled interrupts

    Linus Torvalds
     

19 Nov, 2013

2 commits

  • After merging the final tree, today's linux-next build (sparc64 defconfig)
    failed like this:

    arch/sparc/mm/init_64.c: In function 'pte_alloc_one':
    arch/sparc/mm/init_64.c:2568:9: error: unused variable 'pte' [-Werror=unused-variable]

    Caused by the merge between commit 37b3a8ff3e08 ("sparc64: Move from 4MB
    to 8MB huge pages") and commit 1ae9ae5f7df7 ("sparc: handle
    pgtable_page_ctor() fail") (I had the following merge fix in linux-next,
    but it didn't seem to propagate upstream - may have forgotten to point it
    out :-().

    Signed-off-by: Stephen Rothwell
    Acked-by: Kirill A. Shutemov
    Signed-off-by: David S. Miller

    Stephen Rothwell
     
  • Commit ea1e7ed33708 triggers build regression on sparc64.

    include/linux/mm.h:1391:2: error: implicit declaration of function 'pgtable_cache_init' [-Werror=implicit-function-declaration]
    arch/sparc/include/asm/pgtable_64.h:978:13: error: conflicting types for 'pgtable_cache_init' [-Werror]

    It happens due to a header include loop:

    -> -> ->
    -> ->

    Let's drop include from asm/tlbflush_64.h.
    Build tested with allmodconfig.

    Signed-off-by: Kirill A. Shutemov
    Reported-by: Geert Uytterhoeven
    Signed-off-by: David S. Miller

    Kirill A. Shutemov
     

16 Nov, 2013

2 commits

  • Pull trivial tree updates from Jiri Kosina:
    "Usual earth-shaking, news-breaking, rocket science pile from
    trivial.git"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
    doc: usb: Fix typo in Documentation/usb/gadget_configs.txt
    doc: add missing files to timers/00-INDEX
    timekeeping: Fix some trivial typos in comments
    mm: Fix some trivial typos in comments
    irq: Fix some trivial typos in comments
    NUMA: fix typos in Kconfig help text
    mm: update 00-INDEX
    doc: Documentation/DMA-attributes.txt fix typo
    DRM: comment: `halve' -> `half'
    Docs: Kconfig: `devlopers' -> `developers'
    doc: typo on word accounting in kprobes.c in mutliple architectures
    treewide: fix "usefull" typo
    treewide: fix "distingush" typo
    mm/Kconfig: Grammar s/an/a/
    kexec: Typo s/the/then/
    Documentation/kvm: Update cpuid documentation for steal time and pv eoi
    treewide: Fix common typo in "identify"
    __page_to_pfn: Fix typo in comment
    Correct some typos for word frequency
    clk: fixed-factor: Fix a trivial typo
    ...

    Linus Torvalds
     
  • Pull Kconfig cleanups from Mark Salter:
    "Remove some unused config options from C6X and clean up PC_PARPORT
    dependencies. The latter was discussed here:

    https://lkml.org/lkml/2013/10/8/12"

    * tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
    c6x: remove unused COMMON_CLKDEV Kconfig parameter
    Kconfig cleanup (PARPORT_PC dependencies)
    x86: select ARCH_MIGHT_HAVE_PC_PARPORT
    unicore32: select ARCH_MIGHT_HAVE_PC_PARPORT
    sparc: select ARCH_MIGHT_HAVE_PC_PARPORT
    sh: select ARCH_MIGHT_HAVE_PC_PARPORT
    powerpc: select ARCH_MIGHT_HAVE_PC_PARPORT
    parisc: select ARCH_MIGHT_HAVE_PC_PARPORT
    mips: select ARCH_MIGHT_HAVE_PC_PARPORT
    microblaze: select ARCH_MIGHT_HAVE_PC_PARPORT
    m68k: select ARCH_MIGHT_HAVE_PC_PARPORT
    ia64: select ARCH_MIGHT_HAVE_PC_PARPORT
    arm: select ARCH_MIGHT_HAVE_PC_PARPORT
    alpha: select ARCH_MIGHT_HAVE_PC_PARPORT
    c6x: remove unused parameter in Kconfig

    Linus Torvalds
     

15 Nov, 2013

6 commits

  • Pull sparc update from David Miller:

    1) Implement support for up to 47-bit physical addresses on sparc64.

    2) Support HAVE_CONTEXT_TRACKING on sparc64, from Kirill Tkhai.

    3) Fix Simba bridge window calculations, from Kjetil Oftedal.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next:
    sparc64: Implement HAVE_CONTEXT_TRACKING
    sparc64: Add self-IPI support for smp_send_reschedule()
    sparc: PCI: Fix incorrect address calculation of PCI Bridge windows on Simba-bridges
    sparc64: Encode huge PMDs using PTE encoding.
    sparc64: Move to 64-bit PGDs and PMDs.
    sparc64: Move from 4MB to 8MB huge pages.
    sparc64: Make PAGE_OFFSET variable.
    sparc64: Fix inconsistent max-physical-address defines.
    sparc64: Document the shift counts used to validate linear kernel addresses.
    sparc64: Define PAGE_OFFSET in terms of physical address bits.
    sparc64: Use PAGE_OFFSET instead of a magic constant.
    sparc64: Clean up 64-bit mmap exclusion defines.

    Linus Torvalds
     
  • We've switched over every architecture that supports SMP to it, so
    remove the now-useless config variable.

    Signed-off-by: Christoph Hellwig
    Cc: Jan Kara
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Signed-off-by: Kirill A. Shutemov
    Acked-by: David S. Miller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • Currently mm->pmd_huge_pte is protected by the page table lock. That will
    not work with the split lock: we need a per-pmd pmd_huge_pte for proper
    access serialization.

    For now, let's just introduce a wrapper to access mm->pmd_huge_pte.

    Signed-off-by: Kirill A. Shutemov
    Tested-by: Alex Thorlton
    Cc: Alex Thorlton
    Cc: Ingo Molnar
    Cc: Naoya Horiguchi
    Cc: "Eric W . Biederman"
    Cc: "Paul E . McKenney"
    Cc: Al Viro
    Cc: Andi Kleen
    Cc: Andrea Arcangeli
    Cc: Dave Hansen
    Cc: Dave Jones
    Cc: David Howells
    Cc: Frederic Weisbecker
    Cc: Johannes Weiner
    Cc: Kees Cook
    Cc: Mel Gorman
    Cc: Michael Kerrisk
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Rik van Riel
    Cc: Robin Holt
    Cc: Sedat Dilek
    Cc: Srikar Dronamraju
    Cc: Thomas Gleixner
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • Mark the places where the system is in user mode or in kernel mode.
    This is needed for a full dynticks (tickless) system, that is, it is
    a dependency of CONFIG_NO_HZ_FULL.

    Signed-off-by: Kirill Tkhai
    CC: David Miller
    Signed-off-by: David S. Miller

    Kirill Tkhai
     
  • CONFIG_NO_HZ_FULL requires that smp_send_reschedule() be able to
    target the calling CPU. Currently, this is used only in the
    inc_nr_running() scheduler primitive.

    Nobody calls smp_send_reschedule() from preemptible context
    (furthermore, it looks like it will be safe if anybody uses it
    another way in the future). But anyway I add a WARN_ON() here
    so that we come back to this if anything changes.

    Signed-off-by: Kirill Tkhai
    CC: David Miller
    Signed-off-by: David S. Miller

    Kirill Tkhai