11 Aug, 2010

40 commits

  • If signalfd is used to consume a signal generated by a POSIX interval
    timer or POSIX message queue, the ssi_int field does not reflect the data
    (sigevent->sigev_value) supplied to timer_create(2) or mq_notify(3). (The
    ssi_ptr field, however, is filled in.)

    This behavior differs from signalfd's treatment of sigqueue-generated
    signals -- see the default case in signalfd_copyinfo. It also gives
    results that differ from the case when a signal is handled conventionally
    via a sigaction-registered handler.

    So, set signalfd_siginfo->ssi_int in the remaining cases (__SI_TIMER,
    __SI_MESGQ) where ssi_ptr is set.

    akpm: a non-back-compatible change. Merge into -stable to minimise the
    number of kernels which are in the field and which miss this feature.

    Signed-off-by: Nathan Lynch
    Acked-by: Davide Libenzi
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nathan Lynch
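
    A minimal user-space sketch of reading a signal's integer payload
    through signalfd. It uses sigqueue(), the case the message above says
    already filled in ssi_int, as the reference behaviour; on a kernel with
    this change, a timer created with timer_create(2) and the same
    sigev_value would be expected to report the same ssi_int. The helper
    name signalfd_roundtrip() is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Queue SIGRTMIN to ourselves with an integer payload and read it back
 * through a signalfd. Returns ssi_int, or -1 on any failure.
 */
static int signalfd_roundtrip(int payload)
{
    sigset_t mask;
    struct signalfd_siginfo ssi;
    union sigval value;
    int sfd;
    int result = -1;

    sigemptyset(&mask);
    sigaddset(&mask, SIGRTMIN);

    /* Block the signal so it stays pending until the signalfd is read. */
    if (sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
        return -1;

    sfd = signalfd(-1, &mask, 0);
    if (sfd == -1)
        return -1;

    value.sival_int = payload;
    if (sigqueue(getpid(), SIGRTMIN, value) == 0 &&
        read(sfd, &ssi, sizeof(ssi)) == (ssize_t)sizeof(ssi))
        result = ssi.ssi_int;   /* the field this commit is about */

    close(sfd);
    return result;
}
```

    Calling signalfd_roundtrip(42) from a single-threaded program should
    return 42 on Linux.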
     
  • exit_ptrace() takes tasklist_lock unconditionally. We need this lock to
    avoid the race with ptrace_traceme(); it acts as a barrier.

    Change its caller, forget_original_parent(), to call exit_ptrace() under
    tasklist_lock. Change exit_ptrace() to drop and reacquire this lock if
    needed.

    This allows us to add the fastpath list_empty(ptraced) check. In the
    likely no-tracees case exit_ptrace() just returns and we avoid the lock()
    + unlock() sequence.

    "Zhang, Yanmin" suggested adding this check, and he reports that this
    change gives about an 11% improvement in some tests.

    Suggested-and-tested-by: "Zhang, Yanmin"
    Signed-off-by: Oleg Nesterov
    Acked-by: Roland McGrath
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
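
    The locking pattern described above can be sketched as a
    single-threaded model: the callee is entered with the lock held,
    returns at once if there is nothing to do, and otherwise drops and
    reacquires the lock around the expensive part. All names here
    (lock_model, detach_tracees) are hypothetical stand-ins, not kernel
    code.

```c
#include <assert.h>

/* Toy lock that only counts how often it is taken. */
struct lock_model {
    int held;
    int acquisitions;
};

static void model_lock(struct lock_model *l)
{
    assert(!l->held);
    l->held = 1;
    l->acquisitions++;
}

static void model_unlock(struct lock_model *l)
{
    assert(l->held);
    l->held = 0;
}

/*
 * Must be called with the lock held; returns with the lock held.
 * Returns the number of tracees that were detached.
 */
static int detach_tracees(struct lock_model *l, int *nr_tracees)
{
    int detached;

    assert(l->held);

    /* Fast path: nothing is traced, so no extra lock traffic. */
    if (*nr_tracees == 0)
        return 0;

    /* Unlink everything while the lock is still held. */
    detached = *nr_tracees;
    *nr_tracees = 0;

    /* Slow path: drop the lock for the expensive cleanup, then retake it. */
    model_unlock(l);
    /* ... release the detached tracees here ... */
    model_lock(l);

    return detached;
}
```

    In the model, the no-tracees case costs no additional lock
    acquisition, which is the saving the message refers to.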
     
  • We have zone_to_nid(). This patch converts all existing users of
    zone->zone_pgdat->node_id to use it.

    Signed-off-by: KOSAKI Motohiro
    Acked-by: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Nishimura Daisuke
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • mem_cgroup_soft_limit_reclaim() takes zone, nid and zid arguments, but
    nid and zid can be calculated from zone. So remove them.

    Signed-off-by: KOSAKI Motohiro
    Acked-by: KAMEZAWA Hiroyuki
    Acked-by: Mel Gorman
    Cc: Balbir Singh
    Cc: Nishimura Daisuke
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
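
    Why nid and zid are redundant once the zone is known can be shown with
    simplified stand-in structures. The *_model types below are
    assumptions made for illustration and are much smaller than the
    kernel's real struct zone and struct pglist_data.

```c
#include <assert.h>

#define MAX_NR_ZONES_MODEL 4

struct pglist_data_model;

struct zone_model {
    struct pglist_data_model *zone_pgdat;   /* back-pointer to the node */
};

struct pglist_data_model {
    struct zone_model node_zones[MAX_NR_ZONES_MODEL];
    int node_id;
};

/* Node id: follow the back-pointer from the zone to its node. */
static int zone_to_nid_model(const struct zone_model *zone)
{
    return zone->zone_pgdat->node_id;
}

/* Zone index: position of the zone inside its node's zone array. */
static int zone_idx_model(const struct zone_model *zone)
{
    return (int)(zone - zone->zone_pgdat->node_zones);
}

/* Helper to build a consistent node for the example. */
static void init_node_model(struct pglist_data_model *pgdat, int nid)
{
    int i;

    pgdat->node_id = nid;
    for (i = 0; i < MAX_NR_ZONES_MODEL; i++)
        pgdat->node_zones[i].zone_pgdat = pgdat;
}
```

    Given only a pointer to node_zones[1] of node 2, the two helpers
    recover nid 2 and zid 1, so a caller need not pass them separately.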
     
  • Currently mem_cgroup_shrink_node_zone() calls shrink_zone() directly.
    Thus it doesn't need to initialize sc.nodemask, because shrink_zone()
    doesn't use it at all.

    Signed-off-by: KOSAKI Motohiro
    Acked-by: KAMEZAWA Hiroyuki
    Acked-by: Mel Gorman
    Cc: Balbir Singh
    Cc: Nishimura Daisuke
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • sc.nr_reclaimed and sc.nr_scanned have already been initialized a few
    lines above by the "struct scan_control sc = {}" statement.

    So this patch removes the unnecessary code.

    Signed-off-by: KOSAKI Motohiro
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Balbir Singh
    Cc: Mel Gorman
    Cc: Nishimura Daisuke
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • Currently, mem_cgroup_shrink_node_zone() initializes sc.nr_to_reclaim to
    0. This means shrink_zone() scans only 32 pages and returns immediately,
    even if it doesn't reclaim any pages.

    This patch fixes it.

    Signed-off-by: KOSAKI Motohiro
    Acked-by: KAMEZAWA Hiroyuki
    Acked-by: Mel Gorman
    Cc: Balbir Singh
    Cc: Nishimura Daisuke
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
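
    A simplified model of why a zero reclaim target ends the scan after a
    single batch. The loop below is an assumption-laden sketch, not the
    kernel's shrink_zone(): it scans in batches of 32 pages and stops once
    nr_reclaimed reaches nr_to_reclaim, a test that is trivially true when
    the target is 0. The reclaim_per_batch parameter is hypothetical.

```c
#include <assert.h>

#define BATCH_PAGES 32UL   /* batch size taken from the message above */

struct scan_control_model {
    unsigned long nr_to_reclaim;
    unsigned long nr_scanned;
    unsigned long nr_reclaimed;
};

/*
 * Scan nr_pages in batches. reclaim_per_batch says how many pages each
 * full or partial batch manages to free in this toy model.
 */
static void shrink_model(struct scan_control_model *sc,
                         unsigned long nr_pages,
                         unsigned long reclaim_per_batch)
{
    while (nr_pages > 0) {
        unsigned long batch =
            nr_pages < BATCH_PAGES ? nr_pages : BATCH_PAGES;

        nr_pages -= batch;
        sc->nr_scanned += batch;
        sc->nr_reclaimed += reclaim_per_batch;

        /* With nr_to_reclaim == 0 this is true after the first batch. */
        if (sc->nr_reclaimed >= sc->nr_to_reclaim)
            break;
    }
}
```

    With a target of 0 the model scans 32 pages and stops having
    reclaimed nothing; with a nonzero target it keeps scanning until the
    target is met or the pages run out.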
     
  • Currently, memory cgroup increments the css (cgroup subsys state)
    reference count for each charged page, and the reference is held until
    the page is uncharged. This has three bad effects:

    1. Because css_get()/css_put() call atomic_inc()/atomic_dec(), calling
    them heavily on large SMP systems will not scale well.
    2. Because css's refcnt can never be in a "ready-to-release" state,
    cgroup's notify_on_release handler can't work with memcg.
    3. css's refcnt is an atomic_t, so it is no wider than 32 bits, which
    may be too small.

    This has been a problem since the 1st merge of memcg.

    This is an attempt to remove css's per-page refcnt. Even without the
    refcnt, pre_destroy() does enough synchronization:
    - check res->usage == 0.
    - check no pages on LRU.

    This patch removes css's refcnt per page. Even after this patch, at
    first glance it seems css_get() is still called in try_charge().

    But the logic is:

    - If the memcg of mm->owner is the cached one, consume_stock() will work.
    On success, return immediately.
    - If consume_stock() returns false, css_get() is called and we go to the
    slow path, which may block. At the end of the slow path, css_put() is
    called and we restart from the beginning if necessary.

    So, in the fast path, we don't call css_get() and can avoid accessing
    the shared counter. This patch makes the most likely case fast.

    Here is the result of a multi-threaded page fault benchmark.

    [Before]
    25.32% multi-fault-all [kernel.kallsyms] [k] clear_page_c
    9.30% multi-fault-all [kernel.kallsyms] [k] _raw_spin_lock_irqsave
    8.02% multi-fault-all [kernel.kallsyms] [k] try_get_mem_cgroup_from_mm

    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Daisuke Nishimura
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
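
    The fast-path/slow-path split described above can be modelled in a few
    lines. This is a single-threaded sketch under stated assumptions: the
    *_model types, the refill size CHARGE_BATCH and the refcnt_ops counter
    are all hypothetical, and the real code keeps the stock per CPU and
    uses atomic operations.

```c
#include <assert.h>
#include <stdbool.h>

#define CHARGE_BATCH 32U   /* assumed refill size, for illustration only */

struct memcg_model {
    int css_refcnt;            /* stands in for the shared css refcount */
    unsigned int refcnt_ops;   /* how often the shared counter was touched */
    unsigned long usage;       /* pages charged against the group */
};

struct stock_model {
    struct memcg_model *cached;   /* group the local stock belongs to */
    unsigned int nr_pages;        /* pre-charged pages still available */
};

/* Fast path: take one page from the local stock if it matches. */
static bool consume_stock(struct stock_model *stock, struct memcg_model *memcg)
{
    if (stock->cached == memcg && stock->nr_pages > 0) {
        stock->nr_pages--;
        return true;
    }
    return false;
}

static void charge_one_page(struct stock_model *stock,
                            struct memcg_model *memcg)
{
    if (consume_stock(stock, memcg))
        return;                     /* shared counter not touched */

    /* Slow path: pin the group only while we may block. */
    memcg->css_refcnt++;            /* models css_get() */
    memcg->refcnt_ops++;

    memcg->usage += CHARGE_BATCH;   /* charge a whole batch at once */
    stock->cached = memcg;
    stock->nr_pages = CHARGE_BATCH - 1;

    memcg->css_refcnt--;            /* models css_put() */
    memcg->refcnt_ops++;
}
```

    In this model, charging 64 pages touches the shared counter only four
    times (two get/put pairs) instead of once per page.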
     
  • When the OOM killer scans tasks and is called from memcg's context, it
    checks whether each task is under the memcg.

    But, as Oleg pointed out, a thread group leader may have a NULL ->mm,
    and task_in_mem_cgroup() may make the wrong decision. We have to use
    find_lock_task_mm() in memcg as the generic OOM killer does.

    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Oleg Nesterov
    Cc: Daisuke Nishimura
    Cc: Balbir Singh
    Reviewed-by: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
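
    The problem with looking only at the group leader can be illustrated
    with a toy thread group. The types and the helper find_task_mm_model()
    are hypothetical; the kernel's find_lock_task_mm() also takes the task
    lock, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

struct mm_model {
    int id;
};

struct task_model {
    struct mm_model *mm;   /* NULL once the thread has dropped its mm */
};

/*
 * Return the first non-NULL mm among the threads of a group, or NULL if
 * every thread has already released it. threads[0] is the group leader.
 */
static struct mm_model *find_task_mm_model(struct task_model *threads,
                                           size_t nr_threads)
{
    size_t i;

    for (i = 0; i < nr_threads; i++) {
        if (threads[i].mm != NULL)
            return threads[i].mm;
    }
    return NULL;
}
```

    If the leader has exited but another thread still owns the mm,
    checking threads[0].mm alone would wrongly treat the group as having
    no mm, while the scan above still finds it.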
     
  • mem_cgroup_charge_common() is always called with @mem = NULL, so the
    parameter is meaningless. This patch removes it.

    Signed-off-by: Daisuke Nishimura
    Cc: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daisuke Nishimura
     
  • - try_get_mem_cgroup_from_mm() calls rcu_read_lock/unlock by itself, so we
    don't have to call them in task_in_mem_cgroup().
    - *mz is not used in __mem_cgroup_uncharge_common().
    - we don't have to call lookup_page_cgroup() in mem_cgroup_end_migration()
    after we've cleared PCG_MIGRATION of @oldpage.
    - remove empty comment.
    - remove redundant empty line in mem_cgroup_cache_charge().

    Signed-off-by: Daisuke Nishimura
    Acked-by: Balbir Singh
    Acked-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daisuke Nishimura
     
  • Currently, to check whether a memcg is under task-account-moving, we do
    css_tryget() against mc.to and mc.from. But this just complicates
    things. This patch makes the check easier.

    This patch adds a spinlock to move_charge_struct and uses it to guard
    modification of mc.to and mc.from. With this, we don't have to think
    about complicated races around this non-critical path.

    [balbir@linux.vnet.ibm.com: don't crash on a null memcg being passed]
    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Balbir Singh
    Cc: Daisuke Nishimura
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     
  • mem_cgroup_try_charge() has a big loop in it and is hard to read. Most
    of its code is for the slow path. This patch moves code out of the loop
    and makes it clear what is done.

    Summary:
    - factor out a function to detect whether a memcg is under account move.
    - factor out a function to wait for the end of moving task acct.
    - factor out the main loop's slow path as a function and make it clear
    why we retry or quit by return code.
    - add fatal_signal_pending() check for bypassing charge loops.

    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Daisuke Nishimura
    Cc: Balbir Singh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     
  • It's been 11 months since we changed swap_map[] to indicate
    SWAP_HAS_CACHE. Since then, memcg's swap accounting has been very stable
    and it seems it can be maintained.

    So, I'd like to remove EXPERIMENTAL from the config.

    Acked-by: Balbir Singh
    Acked-by: Daisuke Nishimura
    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     
  • The cgroup device whitelist code gets confused when trying to grant
    permission to a disk partition that is not currently open. Part of
    blkdev_open() includes __blkdev_get() on the whole disk.

    Basically, the only ways to reliably allow a cgroup access to a partition
    on a block device when using the whitelist are to 1) also give it access
    to the whole block device or 2) make sure the partition is already open in
    a different context.

    The patch avoids the cgroup check for the whole disk case when opening a
    partition.

    Addresses https://bugzilla.redhat.com/show_bug.cgi?id=589662

    Signed-off-by: Chris Wright
    Acked-by: Serge E. Hallyn
    Tested-by: Serge E. Hallyn
    Reported-by: Vivek Goyal
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: "Daniel P. Berrange"
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Wright
     
  • The original code didn't leave enough space for a NUL terminator. These
    strings are copied with strcpy() into fixed-length buffers in
    cgroup_root_from_opts().

    Signed-off-by: Dan Carpenter
    Acked-by: Serge E. Hallyn
    Reviewed-by: KAMEZAWA Hiroyuki
    Cc: Paul Menage
    Cc: Li Zefan
    Cc: Ben Blum
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Carpenter
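
    The off-by-one described above is easy to reproduce with any
    fixed-length buffer: strcpy() writes strlen(src) + 1 bytes, so a
    buffer of N bytes holds at most N - 1 characters. The sketch below
    uses a hypothetical buffer size and helper name; it rejects names that
    would not fit rather than showing how the kernel patch itself was
    written.

```c
#include <assert.h>
#include <string.h>

#define NAME_BUF_LEN 64   /* hypothetical buffer size */

struct root_model {
    char name[NAME_BUF_LEN];
};

/*
 * Copy name into the fixed-length buffer. Returns 0 on success, or -1
 * if the name plus its terminating NUL byte would not fit.
 */
static int set_root_name(struct root_model *root, const char *name)
{
    /* ">=" rather than ">": one byte is reserved for the NUL. */
    if (strlen(name) >= NAME_BUF_LEN)
        return -1;

    strcpy(root->name, name);   /* safe: length was checked above */
    return 0;
}
```

    A 63-character name is accepted and a 64-character name is rejected,
    because the latter would need 65 bytes including the terminator.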
     
  • Fix typos & grammar.
    Use CPU instead of cpu in text.

    Signed-off-by: Randy Dunlap
    Acked-by: Steffen Klassert
    Cc: Herbert Xu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • exception.txt has been removed from the Documentation directory, so
    update the index file accordingly.

    Signed-off-by: Huang Shijie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Huang Shijie
     
  • $ rm -rf build
    $ mkdir build
    $ cp .config build
    $ make O=build htmldocs
    ...
    xmlto: linux-2.6/build/Documentation/DocBook/media.xml
    does not validate (status 3)
    xmlto: Fix document syntax or use --skip-validation option
    linux-2.6/build/Documentation/DocBook/media.xml:4:
    warning: failed to load external entity
    "linux-2.6/build/Documentation/DocBook/media-entities.tmpl"

    We need the xmldoclinks built for any document types built from the
    XML sources.

    Signed-off-by: Ben Hutchings
    Acked-by: Andy Whitcroft
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ben Hutchings
     
  • Commit 1d794e3b353b ("Staging: wavelan: delete the driver") removed the
    source, so remove the documentation as well.

    Signed-off-by: Joe Perches
    Cc: Jean Tourrilhes
    Acked-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joe Perches
     
  • Fix mtd/nand_base.c kernel-doc warnings and typos.

    Warning(drivers/mtd/nand/nand_base.c:893): No description found for parameter 'mtd'
    Warning(drivers/mtd/nand/nand_base.c:893): No description found for parameter 'ofs'
    Warning(drivers/mtd/nand/nand_base.c:893): No description found for parameter 'len'
    Warning(drivers/mtd/nand/nand_base.c:893): No description found for parameter 'invert'
    Warning(drivers/mtd/nand/nand_base.c:930): No description found for parameter 'mtd'
    Warning(drivers/mtd/nand/nand_base.c:930): No description found for parameter 'ofs'
    Warning(drivers/mtd/nand/nand_base.c:930): No description found for parameter 'len'
    Warning(drivers/mtd/nand/nand_base.c:987): No description found for parameter 'mtd'
    Warning(drivers/mtd/nand/nand_base.c:987): No description found for parameter 'ofs'
    Warning(drivers/mtd/nand/nand_base.c:987): No description found for parameter 'len'
    Warning(drivers/mtd/nand/nand_base.c:2087): No description found for parameter 'len'

    Signed-off-by: Randy Dunlap
    Cc: David Woodhouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Fix (delete) empty kernel-doc lines/warnings:
    Warning(drivers/message/fusion/mptbase.c:6916): bad line:
    Warning(drivers/message/fusion/mptbase.c:7060): bad line:

    Signed-off-by: Randy Dunlap
    Cc: Eric Moore
    Cc: James Bottomley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Signed-off-by: Changli Gao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Changli Gao
     
  • Cc: Kulikov Vasiliy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • If iga_init() fails, the code releases resources and then continues to
    use them. It seems there should be a 'return' after releasing the
    resources.

    Signed-off-by: Kulikov Vasiliy
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kulikov Vasiliy
     
  • When we set up the VMA flags for mmap and end up using the fallback mmap
    functionality, we set vma->vm_flags |= VM_IO. However, we neglect to
    propagate the flag to vma->vm_page_prot.

    This bug was found when the Linux kernel was running under Xen. In that
    scenario, any page that has the VM_IO flag set MUST be MMIO/VRAM-backed
    memory, _not_ System RAM. That is what fbmem.c does: sets VM_IO,
    ioremaps the region - everything is peachy.

    Well, not exactly. The vm_page_prot does not get the relevant PTE flags
    set (_PAGE_IOMAP), which under Xen is a death knell for pages that
    reference real physical devices but don't have that flag set.

    This patch fixes this.

    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: Daniel De Graaf
    Tested-by: Eamon Walsh
    Cc: Florian Tobias Schandinat
    Cc: Dave Airlie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel De Graaf
     
  • Since "s3c-fb: Automatically calculate pixel clock when none is given",
    there's no need to calculate the pixel clock manually anymore, so remove
    these lines and add the correct refresh rate where appropriate.

    Signed-off-by: Maurus Cuelenaere
    Cc: Pawel Osciak
    Cc: Marek Szyprowski
    Cc: Kyungmin Park
    Cc: InKi Dae
    Cc: Ben Dooks
    Cc: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Maurus Cuelenaere
     
  • Add a simple algorithm which calculates the pixel clock based on the video
    mode parameters. This is only done when no pixel clock is supplied
    through the platform data.

    This allows drivers to omit the pixel clock data and thus share the
    algorithm used for calculating it.

    Signed-off-by: Maurus Cuelenaere
    Cc: Pawel Osciak
    Cc: Marek Szyprowski
    Cc: Kyungmin Park
    Cc: InKi Dae
    Cc: Ben Dooks
    Cc: Russell King
    Tested-by: Donghwa Lee
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Maurus Cuelenaere
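
    One plausible form of such a calculation, assuming the usual
    framebuffer convention that the pixel clock is stored as a period in
    picoseconds: multiply the total horizontal and vertical timings
    (visible area plus margins plus sync length) by the refresh rate to
    get the pixel frequency, then invert it. The struct and function names
    are hypothetical and this is not claimed to be the driver's exact
    code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of video mode parameters. */
struct videomode_model {
    uint32_t xres, yres;
    uint32_t left_margin, right_margin, hsync_len;
    uint32_t upper_margin, lower_margin, vsync_len;
    uint32_t refresh;   /* frames per second */
};

/*
 * Return the pixel clock period in picoseconds, or 0 if the mode is
 * degenerate (any total of zero would mean dividing by zero).
 */
static uint32_t calc_pixclock_ps(const struct videomode_model *m)
{
    uint64_t htotal = (uint64_t)m->left_margin + m->hsync_len +
                      m->right_margin + m->xres;
    uint64_t vtotal = (uint64_t)m->upper_margin + m->vsync_len +
                      m->lower_margin + m->yres;
    uint64_t pixel_hz = htotal * vtotal * m->refresh;

    if (pixel_hz == 0)
        return 0;

    /* 10^12 picoseconds per second divided by pixels per second. */
    return (uint32_t)(1000000000000ULL / pixel_hz);
}
```

    For example, a mode with 1000 total pixels per line, 500 total lines
    and a 50 Hz refresh needs 25,000,000 pixels per second, which gives a
    period of 40000 ps.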
     
  • S5PV210 SoCs allow enabling/disabling DMA channels per window. For a
    window to display data from framebuffer memory, its channel has to be
    enabled.

    Signed-off-by: Pawel Osciak
    Signed-off-by: Marek Szyprowski
    Signed-off-by: Kyungmin Park
    Cc: InKi Dae
    Cc: Ben Dooks
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pawel Osciak
     
  • This patch fixes the following section mismatch errors:

    WARNING: vmlinux.o(.data+0x20b40): Section mismatch in reference from the variable s3c_fb_driver_ids to the (unknown reference) .devinit.data:(unknown)
    The variable s3c_fb_driver_ids references
    the (unknown reference) __devinitdata (unknown)
    If the reference is valid then annotate the
    variable with __init* or __refdata (see linux/init.h) or name the variable:
    *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

    WARNING: vmlinux.o(.data+0x20b58): Section mismatch in reference from the variable s3c_fb_driver_ids to the (unknown reference) .devinit.data:(unknown)
    The variable s3c_fb_driver_ids references
    the (unknown reference) __devinitdata (unknown)
    If the reference is valid then annotate the
    variable with __init* or __refdata (see linux/init.h) or name the variable:
    *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

    WARNING: vmlinux.o(.data+0x20b70): Section mismatch in reference from the variable s3c_fb_driver_ids to the (unknown reference) .devinit.data:(unknown)
    The variable s3c_fb_driver_ids references
    the (unknown reference) __devinitdata (unknown)
    If the reference is valid then annotate the
    variable with __init* or __refdata (see linux/init.h) or name the variable:
    *driver, *_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console,

    Signed-off-by: Marek Szyprowski
    Signed-off-by: Kyungmin Park
    Cc: InKi Dae
    Cc: Ben Dooks
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Marek Szyprowski
     
  • Newer hardware (S3C6410, S5P) has the ability to block updates from
    shadow registers during reconfiguration. Add protect calls for set_par
    and clear protection when resetting.

    Signed-off-by: Pawel Osciak
    Signed-off-by: Kyungmin Park
    Cc: InKi Dae
    Cc: Ben Dooks
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pawel Osciak
     
  • S3C64xx and S5P OSD registers for OSD size and alpha are as follows:
    VIDOSDC: win 0 - size, win 1-4: alpha
    VIDOSDD: win 1-2 - size; not present for windows 0, 3 and 4

    Signed-off-by: Pawel Osciak
    Signed-off-by: Kyungmin Park
    Cc: InKi Dae
    Cc: Ben Dooks
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pawel Osciak
     
  • S5PV210 allows per-window locking of register value updates from shadow
    registers.

    Signed-off-by: Pawel Osciak
    Signed-off-by: Kyungmin Park
    Cc: InKi Dae
    Cc: Ben Dooks
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pawel Osciak
     
  • Signed-off-by: Pawel Osciak
    Signed-off-by: Kyungmin Park
    Cc: InKi Dae
    Cc: Ben Dooks
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pawel Osciak
     
  • Add VSYNC interrupt support and an ioctl that allows waiting for it.
    Interrupts are turned on only when needed.

    Signed-off-by: Pawel Osciak
    Signed-off-by: Kyungmin Park
    Cc: InKi Dae
    Cc: Ben Dooks
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pawel Osciak
     
  • Supports all bpp modes.

    The PRTCON register is used to disable in-hardware updates of registers
    that store start and end addresses of framebuffer memory. This prevents
    display corruption when we fail to update them atomically before VSYNC.
    With this feature there is no need to wait for a VSYNC interrupt before
    each such update.

    Signed-off-by: Pawel Osciak
    Signed-off-by: Kyungmin Park
    Cc: InKi Dae
    Cc: Ben Dooks
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pawel Osciak
     
  • Add framebuffer device name initialization calls for S3C2443, S3C64xx and
    S5P machines.

    Signed-off-by: Pawel Osciak
    Signed-off-by: Kyungmin Park
    Cc: InKi Dae
    Cc: Ben Dooks
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pawel Osciak
     
  • S5PC100 and S5PV210 framebuffer devices differ slightly in terms of
    available registers and their driver data structures have to be separate.
    Those differences include dissimilar ways to control shadow register
    updates.

    Signed-off-by: Pawel Osciak
    Signed-off-by: Kyungmin Park
    Cc: InKi Dae
    Cc: Ben Dooks
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pawel Osciak
     
  • The FRAMESEL1 bitfield starts at bit 13, not bit 14.

    Signed-off-by: Pawel Osciak
    Signed-off-by: Kyungmin Park
    Acked-by: Ben Dooks
    Cc: InKi Dae
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pawel Osciak
     
  • The following problems were found in the above situation:

    sfb->windows[win] was being assigned only at the end of
    s3c_fb_probe_win. If probe_win returned early, this resulted in a NULL
    being passed to s3c_fb_release_win and in a memory leak.

    dma_free_writecombine does not allow its third argument to be NULL.

    fb_dealloc_cmap does not check whether its argument is NULL.

    Signed-off-by: Pawel Osciak
    Signed-off-by: Kyungmin Park
    Cc: InKi Dae
    Cc: Ben Dooks
    Cc: Marek Szyprowski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pawel Osciak
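
    The shape of such a fix can be sketched in user space: record the
    window in the device table as soon as it is allocated, so that an
    early return from the probe still leaves something for the release
    path to free, and make the release path tolerate members that were
    never set up. Every name below is a hypothetical stand-in, with
    malloc()/free() in place of the driver's allocators.

```c
#include <assert.h>
#include <stdlib.h>

struct win_model {
    void *fb_mem;   /* stands in for the DMA framebuffer memory */
    void *cmap;     /* stands in for the colour map */
};

struct dev_model {
    struct win_model *windows[2];
};

/* Safe to call for a slot that was never probed or only partly set up. */
static void release_win(struct dev_model *dev, int idx)
{
    struct win_model *win = dev->windows[idx];

    if (win == NULL)
        return;
    if (win->cmap != NULL)
        free(win->cmap);
    if (win->fb_mem != NULL)    /* the real free routine rejects NULL */
        free(win->fb_mem);
    free(win);
    dev->windows[idx] = NULL;
}

/* fail_early simulates an error halfway through the probe. */
static int probe_win(struct dev_model *dev, int idx, int fail_early)
{
    struct win_model *win = calloc(1, sizeof(*win));

    if (win == NULL)
        return -1;

    /* Record the window first so error paths can find and free it. */
    dev->windows[idx] = win;

    win->fb_mem = malloc(64);
    if (win->fb_mem == NULL || fail_early)
        return -1;              /* caller is expected to call release_win */

    win->cmap = malloc(16);
    if (win->cmap == NULL)
        return -1;

    return 0;
}
```

    After a failed probe_win(), calling release_win() on the same slot
    frees the partial allocation, and calling it on an untouched slot is a
    no-op.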