30 Sep, 2010

1 commit


29 Sep, 2010

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
    tcp: Fix >4GB writes on 64-bit.
    net/9p: Mount only matching virtio channels
    de2104x: fix ethtool
    tproxy: check for transparent flag in ip_route_newports
    ipv6: add IPv6 to neighbour table overflow warning
    tcp: fix TSO FACK loss marking in tcp_mark_head_lost
    3c59x: fix regression from patch "Add ethtool WOL support"
    ipv6: add a missing unregister_pernet_subsys call
    s390: use free_netdev(netdev) instead of kfree()
    sgiseeq: use free_netdev(netdev) instead of kfree()
    rionet: use free_netdev(netdev) instead of kfree()
    ibm_newemac: use free_netdev(netdev) instead of kfree()
    smsc911x: Add MODULE_ALIAS()
    net: reset skb queue mapping when rx'ing over tunnel
    br2684: fix scheduling while atomic
    de2104x: fix TP link detection
    de2104x: fix power management
    de2104x: disable autonegotiation on broken hardware
    net: fix a lockdep splat
    e1000e: 82579 do not gate auto config of PHY by hardware during nominal use
    ...

    Linus Torvalds
     

28 Sep, 2010

3 commits

  • Fixes kernel bugzilla #16603

    tcp_sendmsg() truncates iov_len to an 'int', which causes a 4GB write to
    write zero bytes, for example.

    There is also the problem higher up of how verify_iovec() works. It
    wants to prevent the total length from looking like an error return
    value.

    However it does this using 'int', but syscalls return 'long' (and
    thus signed 64-bit on 64-bit machines). So it could trigger
    false-positives on 64-bit as written. So fix it to use 'long'.

    Reported-by: Olaf Bonorden
    Reported-by: Daniel Büse
    Reported-by: Andrew Morton
    Signed-off-by: David S. Miller

    David S. Miller
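
The truncation described in this commit can be reproduced in plain userspace C. The following is a minimal sketch, not the kernel code: the helper names are hypothetical, `uint32_t` stands in for the 32-bit 'int' field, and `int64_t` stands in for 'long' on a 64-bit machine.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: what survives when a 64-bit length is squeezed
 * into a 32-bit field. Only the low 32 bits are kept. */
static uint32_t truncate_to_32(uint64_t iov_len)
{
	return (uint32_t)iov_len;
}

/* Hypothetical model of the total-length sanity check: the total must be
 * representable in the signed type used for the return value. */
static int total_fits_int(uint64_t total)
{
	return total <= (uint64_t)INT_MAX;
}

static int total_fits_long64(uint64_t total)
{
	return total <= (uint64_t)INT64_MAX;
}

/* Small usage example for the helpers above. */
void truncation_example(void)
{
	uint64_t four_gb = 1ULL << 32;
	uint64_t three_gb = 3ULL << 30;

	assert(truncate_to_32(four_gb) == 0);     /* a 4GB write looks empty */
	assert(!total_fits_int(three_gb));        /* rejected when checked as 'int' */
	assert(total_fits_long64(three_gb));      /* accepted when checked as 64-bit */
}
```

The second pair of helpers illustrates why a check written against 'int' can reject totals that a 64-bit return value represents without any ambiguity.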
     
  • as done in ip_route_connect()

    Signed-off-by: Ulrich Weber
    Signed-off-by: David S. Miller

    Ulrich Weber
     
  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86/amd-iommu: Fix rounding-bug in __unmap_single
    x86/amd-iommu: Work around S3 BIOS bug
    x86/amd-iommu: Set iommu configuration flags in enable-loop
    x86, setup: Fix earlyprintk=serial,0x3f8,115200
    x86, setup: Fix earlyprintk=serial,ttyS0,115200

    Linus Torvalds
     

27 Sep, 2010

2 commits

  • Clean up a missing exit path in the ipv6 module init routines. In
    addrconf_init we call ipv6_addr_label_init which calls register_pernet_subsys
    for the ipv6_addr_label_ops structure. But if module loading fails, or if the
    ipv6 module is removed, there is no corresponding unregister_pernet_subsys call,
    which leaves a now-bogus address on the pernet_list, leading to oopses in
    subsequent registrations. This patch cleans up both the failed load path and
    the unload path. Tested by myself with good results.

    Signed-off-by: Neil Horman

    include/net/addrconf.h | 1 +
    net/ipv6/addrconf.c | 11 ++++++++---
    net/ipv6/addrlabel.c | 5 +++++
    3 files changed, 14 insertions(+), 3 deletions(-)
    Signed-off-by: David S. Miller

    Neil Horman
     
  • Reset queue mapping when an skb is reentering the stack via a tunnel.
    On second pass, the queue mapping from the original device is no
    longer valid.

    Signed-off-by: Tom Herbert
    Acked-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Tom Herbert
     

24 Sep, 2010

1 commit


23 Sep, 2010

4 commits

  • This patch adds a workaround for an IOMMU BIOS problem to
    the AMD IOMMU driver. The result of the bug is that the
    IOMMU does not execute commands anymore when the system
    comes out of the S3 state resulting in system failure. The
    bug in the BIOS is that it does not restore certain hardware
    specific registers correctly. This workaround reads out the
    contents of these registers at boot time and restores them
    on resume from S3. The workaround is limited to the specific
    IOMMU chipset where this problem occurs.

    Cc: stable@kernel.org
    Signed-off-by: Joerg Roedel

    Joerg Roedel
     
  • This fixes the regression caused by the commit 6fee48cd330c68
    ("dma-mapping: arm: use generic pci_set_dma_mask and
    pci_set_consistent_dma_mask").

    ARM needs to clip the dma coherent mask for dmabounce devices. This
    restores the old trick.

    Note that strictly speaking, the DMA API doesn't allow architectures to do
    this, but I'm not sure it's worth adding a new API to set the dma mask
    that allows architectures to clip it.

    Reported-by: Krzysztof Halasa
    Signed-off-by: FUJITA Tomonori
    Acked-by: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    FUJITA Tomonori
     
  • Add a missing inline keyword for static function in linux/dmaengine.h to
    avoid duplicate symbol definitions.

    Signed-off-by: Mathieu Lacage
    Signed-off-by: Dan Williams

    Mathieu Lacage
     
  • This patch reduces namespace pollution by moving the "struct net" declaration
    out of the userspace-facing portion of linux/netlink.h. It has no impact on
    the kernel.

    (This came up because we have several C++ applications which use "net" as a
    namespace name.)

    Signed-off-by: Ollie Wild
    Signed-off-by: David S. Miller

    Ollie Wild
     

22 Sep, 2010

1 commit

  • The lock structs are currently protected by the BKL, but are accessed by
    code in fs/locks.c and misc file system and DLM code. These stubs will
    allow all users to switch to the new interface before the implementation
    is changed to a spinlock.

    Acked-by: Arnd Bergmann
    Signed-off-by: Sage Weil
    Signed-off-by: Linus Torvalds

    Sage Weil
     

21 Sep, 2010

1 commit

    The family parameter of xfrm_state_find is used to find a state matching a
    certain policy. This value is set to the template's family
    (encap_family) right before xfrm_state_find is called.
    The family parameter is however also used to construct a temporary state
    in xfrm_state_find itself which is wrong for inter-family scenarios
    because it produces a selector for the wrong family. Since this selector
    is included in the xfrm_user_acquire structure, user space programs
    misinterpret IPv6 addresses as IPv4 and vice versa.
    This patch splits up the original init_tempsel function into one part that
    initializes the selector and another that initializes the props and id of
    the temporary state, to allow for differing IP address families within the
    state.

    Signed-off-by: Thomas Egerer
    Signed-off-by: Steffen Klassert
    Signed-off-by: David S. Miller

    Thomas Egerer
     

20 Sep, 2010

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (21 commits)
    dca: disable dca on IOAT ver.3.0 multiple-IOH platforms
    netpoll: Disable IRQ around RCU dereference in netpoll_rx
    sctp: Do not reset the packet during sctp_packet_config().
    net/llc: storing negative error codes in unsigned short
    MAINTAINERS: move atlx discussions to netdev
    drivers/net/cxgb3/cxgb3_main.c: prevent reading uninitialized stack memory
    drivers/net/eql.c: prevent reading uninitialized stack memory
    drivers/net/usb/hso.c: prevent reading uninitialized memory
    xfrm: dont assume rcu_read_lock in xfrm_output_one()
    r8169: Handle rxfifo errors on 8168 chips
    3c59x: Remove atomic context inside vortex_{set|get}_wol
    tcp: Prevent overzealous packetization by SWS logic.
    net: RPS needs to depend upon USE_GENERIC_SMP_HELPERS
    phylib: fix PAL state machine restart on resume
    net: use rcu_barrier() in rollback_registered_many
    bonding: correctly process non-linear skbs
    ipv4: enable getsockopt() for IP_NODEFRAG
    ipv4: force_igmp_version ignored when a IGMPv3 query received
    ppp: potential NULL dereference in ppp_mp_explode()
    net/llc: make opt unsigned in llc_ui_setsockopt()
    ...

    Linus Torvalds
     

18 Sep, 2010

1 commit

  • We cannot use rcu_dereference_bh safely in netpoll_rx as we may
    be called with IRQs disabled. We could however simply disable
    IRQs as that too causes BH to be disabled and is safe in either
    case.

    Thanks to John Linville for discovering this bug and providing
    a patch.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     

17 Sep, 2010

2 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
    workqueue: add documentation

    Linus Torvalds
     
  • * 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
    drm/radeon/kms: only warn on mipmap size checks in r600 cs checker (v2)
    drm/radeon/kms: force legacy pll algo for RV620 LVDS
    drm: fix race between driver loading and userspace open.
    drm: Use a nondestructive mode for output detect when polling (v2)
    drm/radeon/kms: fix the colorbuffer CS checker for r300-r500
    drm/radeon/kms: increase lockup detection interval to 10 sec for r100-r500
    drm/radeon/kms/evergreen: fix backend setup
    drm: Use a nondestructive mode for output detect when polling
    drm/radeon: add some missing copyright headers
    drm: Only decouple the old_fb from the crtc is we call mode_set*
    drm/radeon/kms: don't enable underscan with interlaced modes
    drm/radeon/kms: add connector table for Mac x800
    drm/radeon/kms: fix regression in RMX code (v2)
    drm: Fix regression in disable polling e58f637

    Linus Torvalds
     

16 Sep, 2010

1 commit

  • If peer uses tiny MSS (say, 75 bytes) and similarly tiny advertised
    window, the SWS logic will packetize to half the MSS unnecessarily.

    This causes problems with some embedded devices.

    However for large MSS devices we do want to half-MSS packetize
    otherwise we never get enough packets into the pipe for things
    like fast retransmit and recovery to work.

    Be careful also to handle the case where MSS > window, otherwise
    we'll never send until the probe timer.

    Reported-by: ツ Leandro Melo de Sales
    Signed-off-by: David S. Miller

    Alexey Kuznetsov
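
The three cases in this commit message (tiny window, large window, MSS larger than the window) can be sketched as a single bounding function. This is an illustrative model only, not the kernel implementation: the function name and the 512-byte cutoff between "tiny" and "large" windows are assumptions made for the example.

```c
#include <assert.h>

/* Assumption for illustration: windows below this size count as "tiny". */
#define TINY_WINDOW_THRESHOLD 512u

/* Hypothetical helper: bound a segment size by the peer's largest
 * advertised window.
 *  - large windows: limit segments to half the window, so that enough
 *    packets are in flight for fast retransmit and recovery
 *  - tiny windows: do not split below the window itself
 *  - MSS > window: shrink to the window so that data is still sent */
static unsigned int bound_segment_size(unsigned int max_window,
				       unsigned int mss)
{
	unsigned int limit;

	if (max_window >= TINY_WINDOW_THRESHOLD)
		limit = max_window / 2;
	else
		limit = max_window;

	/* A limit of zero means nothing is known yet; leave the MSS alone. */
	if (limit != 0 && mss > limit)
		return limit;
	return mss;
}

/* Small usage example for the helper above. */
void sws_example(void)
{
	assert(bound_segment_size(75, 75) == 75);      /* tiny MSS and window */
	assert(bound_segment_size(2000, 1460) == 1000); /* halved */
	assert(bound_segment_size(50, 75) == 50);      /* MSS > window */
}
```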
     

15 Sep, 2010

3 commits

  • * ssh://master.kernel.org/home/hpa/tree/sec:
    x86-64, compat: Retruncate rax after ia32 syscall entry tracing
    x86-64, compat: Test %rax for the syscall number, not %eax
    compat: Make compat_alloc_user_space() incorporate the access_ok()

    Linus Torvalds
     
  • * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
    SUNRPC: Fix the NFSv4 and RPCSEC_GSS Kconfig dependencies
    statfs() gives ESTALE error
    NFS: Fix a typo in nfs_sockaddr_match_ipaddr6
    sunrpc: increase MAX_HASHTABLE_BITS to 14
    gss:spkm3 miss returning error to caller when import security context
    gss:krb5 miss returning error to caller when import security context
    Remove incorrect do_vfs_lock message
    SUNRPC: cleanup state-machine ordering
    SUNRPC: Fix a race in rpc_info_open
    SUNRPC: Fix race corrupting rpc upcall
    Fix null dereference in call_allocate

    Linus Torvalds
     
  • compat_alloc_user_space() expects the caller to independently call
    access_ok() to verify the returned area. A missing call could
    introduce problems on some architectures.

    This patch incorporates the access_ok() check into
    compat_alloc_user_space() and also adds a sanity check on the length.
    The existing compat_alloc_user_space() implementations are renamed
    arch_compat_alloc_user_space() and are used as part of the
    implementation of the new global function.

    This patch assumes NULL will cause __get_user()/__put_user() to either
    fail or access userspace on all architectures. This should be
    followed by checking the return value of compat_alloc_user_space()
    for NULL in the callers, at which time the access_ok() in the callers
    can also be removed.

    Reported-by: Ben Hawkes
    Signed-off-by: H. Peter Anvin
    Acked-by: Benjamin Herrenschmidt
    Acked-by: Chris Metcalf
    Acked-by: David S. Miller
    Acked-by: Ingo Molnar
    Acked-by: Thomas Gleixner
    Acked-by: Tony Luck
    Cc: Andrew Morton
    Cc: Arnd Bergmann
    Cc: Fenghua Yu
    Cc: H. Peter Anvin
    Cc: Heiko Carstens
    Cc: Helge Deller
    Cc: James Bottomley
    Cc: Kyle McMartin
    Cc: Martin Schwidefsky
    Cc: Paul Mackerras
    Cc: Ralf Baechle
    Cc:

    H. Peter Anvin
     

14 Sep, 2010

3 commits

  • v2: Julien Cristau pointed out that @nondestructive results in
    double-negatives and confusion when trying to interpret the parameter,
    so use @force instead. Much easier to type as well. ;-)

    And fix the miscompilation of vmgfx reported by Sedat Dilek.

    Signed-off-by: Chris Wilson
    Cc: stable@kernel.org
    Signed-off-by: Dave Airlie

    Chris Wilson
     
  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
    dquot: do full inode dirty in allocating space

    Linus Torvalds
     
  • * 'next-spi' of git://git.secretlab.ca/git/linux-2.6:
    spi/pl022: move probe call to subsys_initcall()
    powerpc/5200: mpc52xx_uart.c: Add of_node_put to avoid memory leak
    spi/pl022: fix APB pclk power regression on U300
    spi/spi_s3c64xx: Warn if PIO transfers time out
    spi/s3c64xx: Fix incorrect reuse of 'val' local variable.
    spi/s3c64xx: Fix compilation warning
    spi/dw_spi: clean the cs_control code
    spi/dw_spi: Allow interrupt sharing
    spi/spi_s3c64xx: Increase dead reckoning time in wait_for_xfer()
    spi/spi_s3c64xx: Move to subsys_initcall()
    spi: free children in spi_unregister_master, not siblings
    gpiolib: Add 'struct gpio_chip' forward declaration for !GPIOLIB case
    of: Fix missing includes - ll_temac
    spi/spi_s3c64xx: Staticise non-exported functions
    spi/spi_s3c64xx: Make probe more robust against missing board config

    Linus Torvalds
     

13 Sep, 2010

3 commits

  • Destructive load-detection is very expensive and due to failings
    elsewhere can trigger system wide stalls of up to 600ms. A simple
    first step to correcting this is not to invoke such an expensive
    and destructive load-detection operation automatically.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=29536
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=16265
    Reported-by: Bruno Prémont
    Tested-by: Sitsofe Wheeler
    Signed-off-by: Chris Wilson
    Cc: stable@kernel.org
    Signed-off-by: Dave Airlie

    Chris Wilson
     
  • Update copyright notice and add Documentation/workqueue.txt.

    Randy Dunlap, Dave Chinner: misc fixes.

    Signed-off-by: Tejun Heo
    Reviewed-By: Florian Mickler
    Cc: Ingo Molnar
    Cc: Christoph Lameter
    Cc: Randy Dunlap
    Cc: Dave Chinner

    Tejun Heo
     
  • There is a race between rpc_info_open and rpc_release_client()
    in that nothing stops a process from opening the file after
    the clnt->cl_kref goes to zero.

    Fix this by using atomic_inc_not_zero()...

    Reported-by: J. Bruce Fields
    Signed-off-by: Trond Myklebust
    Cc: stable@kernel.org

    Trond Myklebust
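
The idea behind this fix, taking a reference only while the count is still non-zero, can be sketched in userspace with C11 atomics. This is a minimal model under stated assumptions: `ref_get_unless_zero` is a hypothetical name, and a plain `atomic_int` stands in for the kernel's reference count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper: increment *ref only if it is currently non-zero.
 * Returns true if a reference was taken, false if the object is already
 * on its way to being released. */
static bool ref_get_unless_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value
		 * and the loop re-checks it against zero. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Small usage example for the helper above. */
void refcount_example(void)
{
	atomic_int live = 1;
	atomic_int dead = 0;

	assert(ref_get_unless_zero(&live));   /* 1 -> 2 */
	assert(atomic_load(&live) == 2);
	assert(!ref_get_unless_zero(&dead));  /* stays at 0 */
	assert(atomic_load(&dead) == 0);
}
```

An opener that gets `false` back must treat the object as gone instead of using it, which closes the window between the count reaching zero and the release completing.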
     

11 Sep, 2010

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (28 commits)
    ipheth: remove incorrect devtype to WWAN
    MAINTAINERS: Add CAIF
    sctp: fix test for end of loop
    KS8851: Correct RX packet allocation
    udp: add rehash on connect()
    net: blackhole route should always be recalculated
    ipv4: Suppress lockdep-RCU false positive in FIB trie (3)
    niu: Fix kernel buffer overflow for ETHTOOL_GRXCLSRLALL
    ipvs: fix active FTP
    gro: Re-fix different skb headrooms
    via-velocity: Turn scatter-gather support back off.
    ipv4: Fix reverse path filtering with multipath routing.
    UNIX: Do not loop forever at unix_autobind().
    PATCH: b44 Handle RX FIFO overflow better (simplified)
    irda: off by one
    3c59x: Fix deadlock in vortex_error()
    netfilter: discard overlapping IPv6 fragment
    ipv6: discard overlapping fragment
    net: fix tx queue selection for bridged devices implementing select_queue
    bonding: Fix jiffies overflow problems (again)
    ...

    Fix up trivial conflicts due to the same cgroup API thinko fix going
    through both Andrew and the networking tree. However, there were small
    differences between the two, with Andrew's version generally being the
    nicer one, and the one I merged first. So pick that one.

    Conflicts in: include/linux/cgroup.h and kernel/cgroup.c

    Linus Torvalds
     

10 Sep, 2010

11 commits

  • * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
    block: Range check cpu in blk_cpu_to_group
    scatterlist: prevent invalid free when alloc fails
    writeback: Fix lost wake-up shutting down writeback thread
    writeback: do not lose wakeup events when forking bdi threads
    cciss: fix reporting of max queue depth since init
    block: switch s390 tape_block and mg_disk to elevator_change()
    block: add function call to switch the IO scheduler from a driver
    fs/bio-integrity.c: return -ENOMEM on kmalloc failure
    bio-integrity.c: remove dependency on __GFP_NOFAIL
    BLOCK: fix bio.bi_rw handling
    block: put dev->kobj in blk_register_queue fail path
    cciss: handle allocation failure
    cfq-iosched: Documentation help for new tunables
    cfq-iosched: blktrace print per slice sector stats
    cfq-iosched: Implement tunable group_idle
    cfq-iosched: Do group share accounting in IOPS when slice_idle=0
    cfq-iosched: Do not idle if slice_idle=0
    cciss: disable doorbell reset on reset_devices
    blkio: Fix return code for mkdir calls

    Linus Torvalds
     
  • David S. Miller
     
  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
    libata-sff: Reenable Port Multiplier after libata-sff remodeling.
    libata: skip EH autopsy and recovery during suspend
    ahci: AHCI and RAID mode SATA patch for Intel Patsburg DeviceIDs
    ata_piix: IDE Mode SATA patch for Intel Patsburg DeviceIDs
    libata,pata_via: revert ata_wait_idle() removal from ata_sff/via_tf_load()
    ahci: fix hang on failed softreset
    pata_artop: Fix device ID parity check

    Linus Torvalds
     
    Keep track of the link on which the current request is in progress.
    This allows support of links behind a port multiplier.

    Not all of libata-sff is PMP compliant. The code for native BMDMA
    controllers does not take PMP into account.

    Tested on Marvell 7042 and Sil7526.

    Signed-off-by: Gwendal Grignou
    Signed-off-by: Jeff Garzik

    Gwendal Grignou
     
  • For some mysterious reason, certain hardware reacts badly to usual EH
    actions while the system is going for suspend. As the devices won't
    be needed until the system is resumed, ask EH to skip usual autopsy
    and recovery and proceed directly to suspend.

    Signed-off-by: Tejun Heo
    Tested-by: Stephan Diestelhorst
    Cc: stable@kernel.org
    Signed-off-by: Jeff Garzik

    Tejun Heo
     
  • …low and kswapd is awake

    Ordinarily watermark checks are based on the vmstat NR_FREE_PAGES as it is
    cheaper than scanning a number of lists. To avoid synchronization
    overhead, counter deltas are maintained on a per-cpu basis and drained
    both periodically and when the delta is above a threshold. On large CPU
    systems, the difference between the estimated and real value of
    NR_FREE_PAGES can be very high. If NR_FREE_PAGES is much higher than
    the number of pages really free in the buddy allocator, the VM can allocate pages below the min
    watermark, at worst reducing the real number of pages to zero. Even if
    the OOM killer kills some victim for freeing memory, it may not free
    memory if the exit path requires a new page resulting in livelock.

    This patch introduces a zone_page_state_snapshot() function (courtesy of
    Christoph) that takes a slightly more accurate view of an arbitrary vmstat
    counter. It is used to read NR_FREE_PAGES while kswapd is awake to avoid
    the watermark being accidentally broken. The estimate is not perfect and
    may result in cache line bounces but is expected to be lighter than the
    IPI calls necessary to continually drain the per-cpu counters while kswapd
    is awake.

    Signed-off-by: Christoph Lameter <cl@linux.com>
    Signed-off-by: Mel Gorman <mel@csn.ul.ie>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Christoph Lameter
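
The difference between the cheap counter read and the snapshot read can be modelled in a few lines. The sketch below is a single-threaded userspace illustration, not the kernel function: the struct, the function names, and the fixed CPU count are assumptions for the example.

```c
#include <assert.h>

/* Assumption for illustration: a fixed, small number of CPUs. */
#define MODEL_NR_CPUS 4

/* Hypothetical model of a vmstat-style counter. */
struct counter_model {
	long global;                    /* folded value, cheap to read */
	int cpu_delta[MODEL_NR_CPUS];   /* per-CPU deltas not yet folded in */
};

/* Cheap read: ignores the pending per-CPU deltas. */
static unsigned long counter_read_fast(const struct counter_model *c)
{
	return c->global < 0 ? 0 : (unsigned long)c->global;
}

/* Snapshot read: adds the pending deltas for a more accurate estimate,
 * clamping at zero because the sum can transiently go negative. */
static unsigned long counter_read_snapshot(const struct counter_model *c)
{
	long x = c->global;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		x += c->cpu_delta[cpu];
	if (x < 0)
		x = 0;
	return (unsigned long)x;
}

/* Small usage example for the helpers above. */
void snapshot_example(void)
{
	struct counter_model free_pages = {
		.global = 100,
		.cpu_delta = { -30, -30, -20, -15 },
	};

	assert(counter_read_fast(&free_pages) == 100);   /* overestimate */
	assert(counter_read_snapshot(&free_pages) == 5); /* closer to reality */
}
```

With more CPUs, more deltas can be pending at once, so the gap between the two reads grows; that is the drift the commit describes on large systems.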
     
  • Tests with recent firmware on Intel X25-M 80GB and OCZ Vertex 60GB SSDs
    show a shift since I last tested in December: in part because of firmware
    updates, in part because of the necessary move from barriers to awaiting
    completion at the block layer. While discard at swapon still shows as
    slightly beneficial on both, discarding a 1MB swap cluster when allocating
    is now disadvantageous: adds 25% overhead on Intel, adds 230% on OCZ (YMMV).

    Surrender: discard as presently implemented is more hindrance than help
    for swap; but might prove useful on other devices, or with improvements.
    So continue to do the discard at swapon, but make discard while swapping
    conditional on a SWAP_FLAG_DISCARD to sys_swapon() (which has been using
    only the lower 16 bits of int flags).

    We can add a --discard or -d to swapon(8), and a "discard" to swap in
    /etc/fstab: matching the mount option for btrfs, ext4, fat, gfs2, nilfs2.

    Signed-off-by: Hugh Dickins
    Cc: Christoph Hellwig
    Cc: Nigel Cunningham
    Cc: Tejun Heo
    Cc: Jens Axboe
    Cc: James Bottomley
    Cc: "Martin K. Petersen"
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
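
Because the existing swapon flags occupy only the lower 16 bits, a discard flag can live in a higher bit without disturbing them. The sketch below shows that layout in userspace C. The `EX_` constants and helper names are assumptions defined locally for illustration and are not taken from a kernel header.

```c
#include <assert.h>

/* Assumed layout for illustration only (not copied from a header). */
#define EX_SWAP_FLAG_PREFER    0x8000u   /* a priority is encoded */
#define EX_SWAP_FLAG_PRIO_MASK 0x7fffu   /* priority bits */
#define EX_SWAP_FLAG_DISCARD   0x10000u  /* first bit above the low 16 */

/* Hypothetical helper: build a flags word from a priority (negative
 * means "no explicit priority") and a discard request. */
static unsigned int make_swapon_flags(int priority, int discard)
{
	unsigned int flags = 0;

	if (priority >= 0)
		flags |= EX_SWAP_FLAG_PREFER |
			 ((unsigned int)priority & EX_SWAP_FLAG_PRIO_MASK);
	if (discard)
		flags |= EX_SWAP_FLAG_DISCARD;
	return flags;
}

/* Hypothetical helper: test whether discard-while-swapping was requested. */
static int wants_discard(unsigned int flags)
{
	return (flags & EX_SWAP_FLAG_DISCARD) != 0;
}

/* Small usage example for the helpers above. */
void swapon_flags_example(void)
{
	assert(make_swapon_flags(5, 0) == 0x8005u);
	assert(make_swapon_flags(5, 1) == 0x18005u);
	assert(wants_discard(make_swapon_flags(5, 1)));
	assert(!wants_discard(make_swapon_flags(5, 0)));
}
```

Callers that never set the new bit keep the behaviour they had before, since the low 16 bits are interpreted exactly as they were.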
     
  • Please revert 2.6.36-rc commit d2997b1042ec150616c1963b5e5e919ffd0b0ebf
    "hibernation: freeze swap at hibernation". It complicated matters by
    adding a second swap allocation path, just for hibernation; without in any
    way fixing the issue that it was intended to address - page reclaim after
    fixing the hibernation image might free swap from a page already imaged as
    swapcache, letting its swap be reallocated to store a different page of
    the image: resulting in data corruption if the imaged page were freed as
    clean then swapped back in. Pages freed to si->swap_map were still in
    danger of being reallocated by the alternative allocation path.

    I guess it inadvertently fixed slow SSD swap allocation for hibernation,
    as reported by Nigel Cunningham: by missing out the discards that occur on
    the usual swap allocation path; but that was unintentional, and needs a
    separate fix.

    Signed-off-by: Hugh Dickins
    Cc: KAMEZAWA Hiroyuki
    Cc: KOSAKI Motohiro
    Cc: "Rafael J. Wysocki"
    Cc: Ondrej Zary
    Cc: Andrea Gelmini
    Cc: Balbir Singh
    Cc: Andrea Arcangeli
    Cc: Nigel Cunningham
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • There's been some recent confusion about error checking GPIO numbers.
    Briefly, it should be handled mostly during setup, when gpio_request() is
    called, and NEVER by expecting gpio_is_valid() to report more than
    never-usable GPIO numbers.

    [akpm@linux-foundation.org: terminate unterminated comment]
    Signed-off-by: David Brownell
    Cc: "Eric Miao"
    Cc: "Ryan Mallon"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Brownell
     
  • Replace the arbitrary software-reset call from the device-probe
    method, because:

    - It is defective. To work correctly, it should be two byte writes,
    not a single word write. As it stands, it does nothing.

    - Some devices with sx150x expanders installed have their NRESET pins
    ganged on the same line, so resetting one causes the others to reset -
    not a nice thing to do arbitrarily!

    - The probe, usually taking place at boot, implies a recent hard-reset,
    so a software reset at this point is just a waste of energy anyway.

    Therefore, make it optional, defaulting to off, as this will match the
    common case of probing at powerup and also matches the current broken
    no-op behavior.

    Signed-off-by: Gregory Bean
    Reviewed-by: Jean Delvare
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gregory Bean
     
  • The pte_same check is reliable only if the swap entry remains pinned (by
    the page lock on swapcache). We've also to ensure the swapcache isn't
    removed before we take the lock as try_to_free_swap won't care about the
    page pin.

    One of the possible impacts of this patch is that a KSM-shared page can
    point to the anon_vma of another process, which could exit before the page
    is freed.

    This can leave a page with a pointer to a recycled anon_vma object, or
    worse, a pointer to something that is no longer an anon_vma.

    [riel@redhat.com: changelog help]
    Signed-off-by: Andrea Arcangeli
    Acked-by: Hugh Dickins
    Reviewed-by: Rik van Riel
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli