14 Apr, 2010

6 commits

  • This is a partial revert of 4cd8b5e2a159 "lguest: use KVM hypercalls";
    we revert to using (just as questionable but more reliable) int $15 for
    hypercalls. I didn't revert the register mapping, so we still use the
    same calling convention as kvm.

    KVM in more recent incarnations stopped injecting a fault when a guest
    tried to use the VMCALL instruction from ring 1, so lguest under kvm
    fails to make hypercalls. It was nice to share code with our KVM
    cousins, but this was overreach.

    Signed-off-by: Rusty Russell
    Cc: Matias Zabaljauregui
    Cc: Avi Kivity

    Rusty Russell
     
  • It's only used by cmpxchg8b_emu (see db677ffa5f5a for the gory
    details), and fixing that to be paravirt aware would be more work than
    simply ignoring it (and AFAICT only help lguest). This makes lguest
    work on machines which have cmpxchg8b, for kernels compiled for older
    processors.

    (We can't emulate it properly: the popf which expects to restore interrupts
    does not trap).

    Signed-off-by: Rusty Russell
    Cc: Jeremy Fitzhardinge
    Cc: virtualization@lists.osdl.org

    Rusty Russell
     
  • * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
    PM / Hibernate: user.c, fix SNAPSHOT_SET_SWAP_AREA handling

    Linus Torvalds
     
  • * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
    NFSv4: fix delegated locking
    NFS: Ensure that the WRITE and COMMIT RPC calls are always uninterruptible
    NFS: Fix a race with the new commit code
    NFS: Ensure that writeback_single_inode() calls write_inode() when syncing
    NFS: Fix the mode calculation in nfs_find_open_context
    NFSv4: Fall back to ordinary lookup if nfs4_atomic_open() returns EISDIR

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
    sparc64: Add some more commentary to __raw_local_irq_save()
    sparc64: Fix memory leak in pci_register_iommu_region().
    sparc64: Add kmemleak annotation to sun4v_build_virq()
    sparc64: Support kmemleak.
    sparc64: Add function graph tracer support.
    sparc64: Give a stack frame to the ftrace call sites.
    sparc64: Use a separate counter for timer interrupts and NMI checks, like x86.
    sparc64: Remove profiling from some low-level bits.
    sparc64: Kill unnecessary static on local var in ftrace_call_replace().
    sparc64: Kill CONFIG_STACK_DEBUG code.
    sparc64: Add HAVE_FUNCTION_TRACE_MCOUNT_TEST and tidy up.
    sparc64: Adjust __raw_local_irq_save() to cooperate in NMIs.
    sparc64: Use kstack_valid() in die_if_kernel().

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (25 commits)
    smc91c92_cs: define multicast_table as unsigned char
    can: avoids a false warning
    e1000e: stop cleaning when we reach tx_ring->next_to_use
    igb: restrict WoL for 82576 ET2 Quad Port Server Adapter
    virtio_net: missing sg_init_table
    Revert "tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb"
    iwlwifi: need check for valid qos packet before free
    tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb
    udp: fix for unicast RX path optimization
    myri10ge: fix rx_pause in myri10ge_set_pauseparam
    net: corrected documentation for hardware time stamping
    stmmac: use resource_size()
    x.25 attempts to negotiate invalid throughput
    x25: Patch to fix bug 15678 - x25 accesses fields beyond end of packet.
    bridge: Fix IGMP3 report parsing
    cnic: Fix crash during bnx2x MTU change.
    qlcnic: fix set mac addr
    r6040: fix r6040_multicast_list
    vhost-net: fix vq_memory_access_ok error checking
    ath9k: fix double calls to ath_radio_enable
    ...

    Linus Torvalds
     

13 Apr, 2010

34 commits

  • smc91c92_cs:
    * define multicast_table as unsigned char
    * remove unnecessary "#ifndef final_version"

    Signed-off-by: Ken Kawasaki
    Signed-off-by: David S. Miller

    Ken Kawasaki
     
  • At this point optlen == sizeof(sfilter) but some compilers are dumb.

    Reported-by: Németh Márton
    Acked-by: Oliver Hartkopp
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Tx ring buffers after tx_ring->next_to_use are volatile and could
    change, possibly causing a crash. Stop cleaning when we hit
    tx_ring->next_to_use.

    Signed-off-by: Terry Loftin
    Acked-by: Bruce Allan
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Terry Loftin
     
  • Restrict Wake-on-LAN to first port on 82576 ET2 quad port NICs, as it is
    only supported there.

    Signed-off-by: Stefan Assmann
    Acked-by: Alexander Duyck
    Signed-off-by: Jeff Kirsher
    Signed-off-by: David S. Miller

    Stefan Assmann
     
  • Suggested by Peter Zijlstra

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Conflicts:
    lib/Kconfig.debug

    David S. Miller
     
  • Found by kmemleak.

    If request_resource() fails, we leak the struct resource we
    allocated to represent the IOMMU mapping area.

    This actually happens on sun4v machines because the IOMEM area is only
    reported sans the IOMMU region, unlike all previous systems. I'll
    need to fix that at some point, but for now fix the leak.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • The only reference we store to this memory is in the form of a
    physical address, so kmemleak can't see it.

    Add a kmemleak_not_leak() annotation.

    It's probably useful to be able to look at a dump of these things
    either via debugfs or similar, and thus we could at some point store
    them in some kind of table and therefore get rid of this annotation.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Only missing thing was an _sdata marker in vmlinux.lds.S

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Signed-off-by: David S. Miller

    David S. Miller
     
  • It's the only way we'll be able to implement the function
    graph tracer properly.

    A positive is that we no longer have to worry about the
    linker over-optimizing the tail call, since we don't
    use a tail call any more.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • This keeps us from having to call kstat_irqs_cpu(), which is a
    profiled function, from the NMI handler.

    Instead we use a currently empty slot in cpu_data.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • These include the timer implementation, perf events support, and the
    performance counter register (pcr) programming layer.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Signed-off-by: David S. Miller

    David S. Miller
     
  • The generic stack tracer does this job just as well.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Check function_trace_stop at ftrace_caller

    Toss mcount_call and dummy call of ftrace_stub, unnecessary.

    Document problems we'll have if the final kernel image link
    ever turns on relaxation.

    Properly size 'ftrace_call' so it looks right when inspecting
    instructions under gdb et al.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • If we are in an NMI then doing a plain raw_local_irq_disable() will
    write PIL_NORMAL_MAX into %pil, which is lower than PIL_NMI, and thus
    we'll re-enable NMIs and recurse.

    Doing a simple:

        %pil = %pil | PIL_NORMAL_MAX

    does what we want: if we're already at PIL_NMI (15) we leave it at
    that setting, else we set it to PIL_NORMAL_MAX (14).

    This should get the function tracer working on sparc64.

    Signed-off-by: David S. Miller

    David S. Miller
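
    As a quick check of the OR trick described in this entry, here is a
    minimal C sketch. The two function names are hypothetical; the values
    14 and 15 come from the message. It assumes %pil only ever holds 0,
    PIL_NORMAL_MAX, or PIL_NMI when interrupts are being disabled; for any
    other starting value (1, say) the OR would yield 15 rather than 14.

```c
/* Values quoted in the commit message. */
#define PIL_NORMAL_MAX 14u
#define PIL_NMI        15u

/* Old behaviour (hypothetical model): a plain disable overwrites %pil,
 * so an NMI-level value of 15 is lowered to 14. */
unsigned int pil_plain_disable(unsigned int pil)
{
    (void)pil;                  /* previous level is ignored */
    return PIL_NORMAL_MAX;
}

/* New behaviour (hypothetical model): OR the mask in, so a %pil that is
 * already at PIL_NMI stays there, and 0 is raised to PIL_NORMAL_MAX. */
unsigned int pil_or_disable(unsigned int pil)
{
    return pil | PIL_NORMAL_MAX;
}
```

    Starting from PIL_NMI, pil_plain_disable() drops the level below the
    NMI level, while pil_or_disable() leaves it unchanged.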
     
  • This gets rid of a local function (is_kernel_stack()) which tries to
    do the same thing, yet poorly in that it doesn't handle IRQ stacks
    properly.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Add the missing sg_init_table() call for sg_set_buf() in virtio_net;
    the omission was introduced by the defer skb patch.

    Reported-by: Thomas Müller
    Tested-by: Thomas Müller
    Signed-off-by: Shirley Ma
    Signed-off-by: David S. Miller

    Shirley Ma
     
  • Linus Torvalds
     
  • * anonvma:
    anonvma: when setting up page->mapping, we need to pick the _oldest_ anonvma
    anon_vma: clone the anon_vma chain in the right order
    vma_adjust: fix the copying of anon_vma chains
    Simplify and comment on anon_vma re-use for anon_vma_prepare()

    Linus Torvalds
     
  • * master.kernel.org:/home/rmk/linux-2.6-arm: (21 commits)
    ARM: Fix ioremap_cached()/ioremap_wc() for SMP platforms
    ARM: 6043/1: AT91 slow-clock resume: Don't wait for a disabled PLL to lock
    ARM: 6031/1: fix Thumb-2 decompressor
    ARM: 6029/1: ep93xx: gpio.c: local functions should be static
    ARM: 6028/1: ARM: add MAINTAINERS for U300
    ARM: 6024/1: bcmring: fix missing down on semaphore in dma.c
    MXC: mach_armadillo5x0: Add USB Host support.
    ARM mach-mx3: duplicated include
    ARM mach-mx3: duplicated include
    imx31: add watchdog device on litekit board.
    imx3: Add watchdog platform device support
    MXC: mach-mx31_3ds: add support for freescale mc13783 power management device.
    MXC: mach-mx31_3ds: Add SPI1 device support.
    MXC: mach-mx31_3ds: Add support for on board NAND Flash.
    MXC: mach-mx31_3ds: Update variable names over recent mach name modification.
    imx31: fix parent clock for rtc
    i.MX51: remove NFC AXI static mapping
    i.MX51: determine silicon revision dynamically
    i.MX51: map TZIC dynamically
    i.MX51: Use correct clock for gpt
    ...

    Linus Torvalds
     
  • * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
    Btrfs: make sure the chunk allocator doesn't create zero length chunks
    Btrfs: fix data enospc check overflow

    Linus Torvalds
     
  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
    quota: Fix possible dq_flags corruption
    quota: Hide warnings about writes to the filesystem before quota was turned on
    ext3: symlink must be handled via filesystem specific operation
    ext2: symlink must be handled via filesystem specific operation

    Linus Torvalds
     
  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
    udf: add specific ->setattr callback
    udf: potential integer overflow

    Linus Torvalds
     
  • * 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (36 commits)
    MIPS: Calculate proper ebase value for 64-bit kernels
    MIPS: Alchemy: DB1200: Remove custom wait implementation
    MIPS: Big Sur: Make defconfig more useful.
    MIPS: Fix __vmalloc() etc. on MIPS for non-GPL modules
    MIPS: Sibyte: Fix M3 TLB exception handler workaround.
    MIPS: BCM63xx: Fix build failure in board_bcm963xx.c
    MIPS: uasm: Add OR instruction.
    MIPS: Sibyte: Apply M3 workaround only on affected chip types and versions.
    MIPS: BCM63xx: Initialize gpio_out_low & out_high to current value at boot.
    MIPS: BCM63xx: Register SSB SPROM fallback in board's first stage callback
    MIPS: BCM63xx: Fix typo in cpu-feature-overrides file.
    MIPS: BCM63xx: Add support for second uart.
    MIPS: BCM63xx: Fix double gpio registration.
    MIPS: BCM63xx: Add DWVS0 board
    MIPS: BCM63xx: Add the RTA1025W-16 BCM6348-based board to supported boards.
    MIPS: BCM63xx: Fix BCM6338 and BCM6345 gpio count
    MIPS: libgcc.h: Checkpatch cleanup
    MIPS: Loongson-2F: Flush the branch target history in BTB and RAS
    MIPS: Move signal trampolines off of the stack.
    MIPS: Preliminary VDSO
    ...

    Linus Torvalds
     
  • * 'for-2.6.34' of git://linux-nfs.org/~bfields/linux:
    svcrdma: RDMA support not yet compatible with RPC6

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
    nilfs2: fix typo "numer" -> "number" in alloc.c
    nilfs2: Remove an uninitialization warning in nilfs_btree_propagate_v()
    nilfs2: fix a wrong type conversion in nilfs_ioctl()

    Linus Torvalds
     
  • Otherwise we might be mapping in a page in a new mapping, but that page
    (through the swapcache) would later be mapped into an old mapping too.
    The page->mapping must be the case that works for everybody, not just
    the mapping that happened to page it in first.

    Here's the scenario:

    - page gets allocated/mapped by process A. Let's call the anon_vma we
      associate the page with 'A' to keep it easy to track.

    - Process A forks, creating process B. The anon_vma in B is 'B', and has
      a chain that looks like 'B' -> 'A'. Everything is fine.

    - Swapping happens. The page (with mapping pointing to 'A') gets swapped
      out (perhaps not to disk - it's enough to assume that it's just not
      mapped any more, and lives entirely in the swap-cache)

    - Process B pages it in, which goes like this:

        do_swap_page ->
          page = lookup_swap_cache(entry);
          ...
          set_pte_at(mm, address, page_table, pte);
          page_add_anon_rmap(page, vma, address);

      And think about what happens here!

      In particular, what happens is that this will now be the "first"
      mapping of that page, so page_add_anon_rmap() used to do

        if (first)
            __page_set_anon_rmap(page, vma, address);

      and notice what anon_vma it will use? It will use the anon_vma for
      process B!

      What happens then? Trivial: process 'A' also pages it in (nothing
      happens, it's not the first mapping), and then process 'B' execve's
      or exits or unmaps, making anon_vma B go away.

    End result: process A has a page that points to anon_vma B, but
    anon_vma B does not exist any more. This can go on forever. Forget
    about RCU grace periods, forget about locking, forget anything like
    that. The bug is simply that page->mapping points to an anon_vma
    that was correct at one point, but was _not_ the one that was shared
    by all users of that possible mapping.

    Changing it to always use the deepest anon_vma in the anonvma chain gets
    us to the safest model.

    This can be improved in certain cases: if we know the page is private to
    just this particular mapping (for example, it's a new page, or it is the
    only swapcache entry), we could pick the top (most specific) anon_vma.

    But that's a future optimization. Make it _work_ reliably first.

    Reviewed-by: Rik van Riel
    Acked-by: Johannes Weiner
    Tested-by: Borislav Petkov [ "What do you know, I think you fixed it!" ]
    Signed-off-by: Linus Torvalds

    Linus Torvalds
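
    The choice described in this entry can be modelled in a few lines of C.
    This is only a sketch under assumptions, not the kernel code: a chain
    is a string of one-letter anon_vma names with the newest first, so B's
    chain 'B' -> 'A' is "BA", and both helper names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical model: chain[0] is the newest anon_vma in the chain and
 * chain[len - 1] is the oldest one. */

/* Old choice: use the anon_vma of whichever process maps the page first. */
char pick_first(const char *chain, size_t len)
{
    return len ? chain[0] : '\0';
}

/* New choice: always use the oldest anon_vma in the chain. */
char pick_oldest(const char *chain, size_t len)
{
    return len ? chain[len - 1] : '\0';
}
```

    With B's chain "BA" and A's chain "A", pick_first() gives 'B' when B
    pages the page in first, which A does not have in its chain, while
    pick_oldest() gives 'A' for both processes.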
     
  • We want to walk the chain in reverse order when cloning it, so that the
    order of the result chain will be the same as the order in the source
    chain. When we add entries to the chain, they go at the head of the
    chain, so we want to add the source head last.

    Reviewed-by: Rik van Riel
    Acked-by: Johannes Weiner
    Tested-by: Borislav Petkov [ "No, it still oopses" ]
    Signed-off-by: Linus Torvalds

    Linus Torvalds
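
    The ordering argument in this entry can be tried out with a small C
    sketch. It assumes a toy array-backed chain with integer entries; the
    struct and function names are hypothetical and unrelated to the real
    anon_vma code. Entries are always inserted at the head, as the message
    describes.

```c
#include <stddef.h>

#define CHAIN_MAX 8

/* Hypothetical stand-in for a chain: items[0] is the head. */
struct chain {
    int items[CHAIN_MAX];
    size_t len;
};

/* Insert at the head, shifting existing entries towards the tail. */
void chain_add_head(struct chain *c, int item)
{
    if (c->len >= CHAIN_MAX)
        return;                         /* toy model: silently drop */
    for (size_t i = c->len; i > 0; i--)
        c->items[i] = c->items[i - 1];
    c->items[0] = item;
    c->len++;
}

/* Walk head to tail: every head insert flips the order, so the copy
 * ends up reversed. */
void clone_forward(struct chain *dst, const struct chain *src)
{
    dst->len = 0;
    for (size_t i = 0; i < src->len; i++)
        chain_add_head(dst, src->items[i]);
}

/* Walk tail to head: the source head is added last, so the copy has
 * the same order as the source. */
void clone_reverse(struct chain *dst, const struct chain *src)
{
    dst->len = 0;
    for (size_t i = src->len; i > 0; i--)
        chain_add_head(dst, src->items[i - 1]);
}
```

    Cloning a chain holding 1, 2, 3 with clone_forward() produces 3, 2, 1,
    whereas clone_reverse() reproduces 1, 2, 3.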
     
  • When we move the boundaries between two vma's due to things like
    mprotect, we need to make sure that the anon_vma of the pages that got
    moved from one vma to another gets properly copied around. And that was
    not always the case, in this rather hard-to-follow code sequence.

    Clarify the code, and fix it so that it copies the anon_vma from the
    right source.

    Reviewed-by: Rik van Riel
    Acked-by: Johannes Weiner
    Tested-by: Borislav Petkov [ "Yeah, not so much this one either" ]
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • This changes the anon_vma reuse case to require that we only reuse
    simple anon_vma's - ie the case when the vma only has a single anon_vma
    associated with it.

    This means that a reuse of an anon_vma from an adjacent vma will always
    guarantee that both vma's are associated not only with the same
    anon_vma, they will also have the same anon_vma chain (of just a single
    entry in this case).

    And since anon_vma re-use was the only case where the same anon_vma
    might be associated with different chains of anon_vma's, we now have the
    case that every vma that shares the same anon_vma will always also have
    the same chain. That makes it much easier to think about merging vma's
    that share the same anon_vma's: you can always just drop the other
    anon_vma chain in anon_vma_merge() since you know that they are always
    identical.

    This also splits up the function to validate the anon_vma re-use, and
    adds a lot of commentary about the possible races.

    Reviewed-by: Rik van Riel
    Acked-by: Johannes Weiner
    Tested-by: Borislav Petkov [ "That didn't fix it" ]
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • dq_flags are modified non-atomically in do_set_dqblk via __set_bit calls and
    atomically for example in mark_dquot_dirty or clear_dquot_dirty. Hence a
    change done by an atomic operation can be overwritten by a change done by a
    non-atomic one. Fix the problem by using atomic bitops even in do_set_dqblk.

    Signed-off-by: Andrew Perepechko
    Signed-off-by: Jan Kara

    Andrew Perepechko
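
    The lost update described in this entry can be replayed step by step in
    userspace C11. This is a sketch only: it uses <stdatomic.h> instead of
    the kernel bitops, runs the bad interleaving on a single thread so the
    outcome is deterministic, and the flag and function names are
    hypothetical.

```c
#include <stdatomic.h>

#define FLAG_DIRTY  (1UL << 0)  /* hypothetical bit set by the atomic path */
#define FLAG_LIMITS (1UL << 1)  /* hypothetical bit set by the other path */

/* Replays the race: a split load/modify/store straddles an atomic
 * update, and the final store writes back a stale value. */
unsigned long lost_update_demo(void)
{
    atomic_ulong flags = 0;
    unsigned long stale = atomic_load(&flags);  /* non-atomic RMW: load */
    atomic_fetch_or(&flags, FLAG_DIRTY);        /* atomic setter runs here */
    atomic_store(&flags, stale | FLAG_LIMITS);  /* non-atomic RMW: store */
    return atomic_load(&flags);                 /* FLAG_DIRTY is gone */
}

/* Both paths use an atomic read-modify-write, so there is no window in
 * which a stale value can be written back. */
unsigned long atomic_update_demo(void)
{
    atomic_ulong flags = 0;
    atomic_fetch_or(&flags, FLAG_DIRTY);
    atomic_fetch_or(&flags, FLAG_LIMITS);
    return atomic_load(&flags);
}
```

    lost_update_demo() ends with only FLAG_LIMITS set, while
    atomic_update_demo() keeps both bits.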
     
  • On a root filesystem, writes before quota is turned on happen regularly,
    and there's no way around it because of writes to syslog, /etc/mtab, and
    similar. So the warning is rather pointless for ordinary users. It's still
    useful during development, so we just hide the warning behind the
    __DQUOT_PARANOIA config option.

    Signed-off-by: Jan Kara

    Jan Kara