28 Jun, 2011

1 commit

  • commit 21a3c96 uses node_start/end_pfn(nid) for detection start/end
    of nodes. But, it's not defined in linux/mmzone.h but defined in
    /arch/???/include/mmzone.h which is included only under
    CONFIG_NEED_MULTIPLE_NODES=y.

    Then, we see
    mm/page_cgroup.c: In function 'page_cgroup_init':
    mm/page_cgroup.c:308: error: implicit declaration of function 'node_start_pfn'
    mm/page_cgroup.c:309: error: implicit declaration of function 'node_end_pfn'

    So, fixiing page_cgroup.c is an idea...

    But node_start_pfn()/node_end_pfn() is a very generic macro and
    should be implemented in the same manner for all archs.
    (m32r has different implementation...)

    This patch removes definitions of node_start/end_pfn() in each archs
    and defines a unified one in linux/mmzone.h. It's not under
    CONFIG_NEED_MULTIPLE_NODES, now.

    A result of macro expansion is here (mm/page_cgroup.c)

    for !NUMA
    start_pfn = ((&contig_page_data)->node_start_pfn);
    end_pfn = ({ pg_data_t *__pgdat = (&contig_page_data); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});

    for NUMA (x86-64)
    start_pfn = ((node_data[nid])->node_start_pfn);
    end_pfn = ({ pg_data_t *__pgdat = (node_data[nid]); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});

    Changelog:
    - fixed to avoid using "nid" twice in node_end_pfn() macro.

    Reported-and-acked-by: Randy Dunlap
    Reported-and-tested-by: Ingo Molnar
    Acked-by: Mel Gorman
    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     

29 May, 2011

2 commits

  • * setns:
    ns: Wire up the setns system call

    Done as a merge to make it easier to fix up conflicts in arm due to
    addition of sendmmsg system call

    Linus Torvalds
     
  • 32bit and 64bit on x86 are tested and working. The rest I have looked
    at closely and I can't find any problems.

    setns is an easy system call to wire up. It just takes two ints so I
    don't expect any weird architecture porting problems.

    While doing this I have noticed that we have some architectures that are
    very slow to get new system calls. cris seems to be the slowest where
    the last system calls wired up were preadv and pwritev. avr32 is weird
    in that recvmmsg was wired up but never declared in unistd.h. frv is
    behind with perf_event_open being the last syscall wired up. On h8300
    the last system call wired up was epoll_wait. On m32r the last system
    call wired up was fallocate. mn10300 has recvmmsg as the last system
    call wired up. The rest seem to at least have syncfs wired up which was
    new in the 2.6.39.

    v2: Most of the architecture support added by Daniel Lezcano
    v3: ported to v2.6.36-rc4 by: Eric W. Biederman
    v4: Moved wiring up of the system call to another patch
    v5: ported to v2.6.39-rc6
    v6: rebased onto parisc-next and net-next to avoid syscall conflicts.
    v7: ported to Linus's latest post 2.6.39 tree.

    >  arch/blackfin/include/asm/unistd.h     |    3 ++-
    >  arch/blackfin/mach-common/entry.S      |    1 +
    Acked-by: Mike Frysinger

    Oh - ia64 wiring looks good.
    Acked-by: Tony Luck

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

27 May, 2011

1 commit

  • By the previous style change, CONFIG_GENERIC_FIND_NEXT_BIT,
    CONFIG_GENERIC_FIND_BIT_LE, and CONFIG_GENERIC_FIND_LAST_BIT are not used
    to test for existence of find bitops anymore.

    Signed-off-by: Akinobu Mita
    Acked-by: Greg Ungerer
    Cc: Arnd Bergmann
    Cc: Russell King
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     

25 May, 2011

4 commits

  • This constant hasn't been used since before the git era (2.6.12) and thus
    can be dropped.

    Signed-off-by: Stephen Boyd
    Cc: Russell King
    Cc: Richard Weinberger
    Cc: Hirokazu Takata
    Cc: Kyle McMartin
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Boyd
     
  • Fold all the mmu_gather rework patches into one for submission

    Signed-off-by: Peter Zijlstra
    Reported-by: Hugh Dickins
    Cc: Benjamin Herrenschmidt
    Cc: David Miller
    Cc: Martin Schwidefsky
    Cc: Russell King
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: Tony Luck
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: KOSAKI Motohiro
    Cc: Nick Piggin
    Cc: Namhyung Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Architectures that implement their own show_mem() function did not pass
    the filter argument to show_free_areas() to appropriately avoid emitting
    the state of nodes that are disallowed in the current context. This patch
    now passes the filter argument to show_free_areas() so those nodes are now
    avoided.

    This patch also removes the show_free_areas() wrapper around
    __show_free_areas() and converts existing callers to pass an empty filter.

    ia64 emits additional information for each node, so skip_free_areas_zone()
    must be made global to filter disallowed nodes and it is converted to use
    a nid argument rather than a zone for this use case.

    Signed-off-by: David Rientjes
    Cc: Russell King
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Kyle McMartin
    Cc: Helge Deller
    Cc: James Bottomley
    Cc: "David S. Miller"
    Cc: Guan Xuetao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • * 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
    percpu: Unify input section names
    percpu: Avoid extra NOP in percpu_cmpxchg16b_double
    percpu: Cast away printk format warning
    percpu: Always align percpu output section to PAGE_SIZE

    Fix up fairly trivial conflict in arch/x86/include/asm/percpu.h as per Tejun

    Linus Torvalds
     

24 May, 2011

1 commit


23 May, 2011

1 commit


22 May, 2011

1 commit


20 May, 2011

1 commit

  • A new utility function (core_kernel_data()) is used to determine if a
    passed in address is part of core kernel data or not. It may or may not
    return true for RO data, but this utility must work for RW data.

    Thus both _sdata and _edata must be defined and continuous,
    without .init sections that may later be freed and replaced by
    volatile memory (memory that can be freed).

    This utility function is used to determine if data is safe from
    ever being freed. Thus it should return true for all RW global
    data that is not in a module or has been allocated, or false
    otherwise.

    Also change core_kernel_data() back to the more precise _sdata condition
    and document the function.

    Signed-off-by: Steven Rostedt
    Acked-by: Ralf Baechle
    Acked-by: Hirokazu Takata
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: Geert Uytterhoeven
    Cc: Roman Zippel
    Cc: linux-m68k@lists.linux-m68k.org
    Cc: Kyle McMartin
    Cc: Helge Deller
    Cc: JamesE.J.Bottomley
    Link: http://lkml.kernel.org/r/1305855298.1465.19.camel@gandalf.stny.rr.com
    Signed-off-by: Ingo Molnar
    ----
    arch/alpha/kernel/vmlinux.lds.S | 1 +
    arch/m32r/kernel/vmlinux.lds.S | 1 +
    arch/m68k/kernel/vmlinux-std.lds | 2 ++
    arch/m68k/kernel/vmlinux-sun3.lds | 1 +
    arch/mips/kernel/vmlinux.lds.S | 1 +
    arch/parisc/kernel/vmlinux.lds.S | 3 +++
    kernel/extable.c | 12 +++++++++++-
    7 files changed, 20 insertions(+), 1 deletion(-)

    Steven Rostedt
     

12 May, 2011

1 commit


21 Apr, 2011

1 commit

  • When a DISCONTIGMEM memory range is brought online as a NUMA node, it
    also needs to have its bet set in N_NORMAL_MEMORY. This is necessary for
    generic kernel code that utilizes N_NORMAL_MEMORY as a subset of N_ONLINE
    for memory savings.

    These types of hacks can hopefully be removed once DISCONTIGMEM is either
    removed or abstracted away from CONFIG_NUMA.

    Fixes a panic in the slub code which only initializes structures for
    N_NORMAL_MEMORY to save memory:

    Backtrace:
    [] add_partial+0x28/0x98
    [] __slab_free+0x1d0/0x1d8
    [] kmem_cache_free+0xc4/0x128
    [] ida_get_new_above+0x21c/0x2c0
    [] sysfs_new_dirent+0xd0/0x238
    [] create_dir+0x5c/0x168
    [] sysfs_create_dir+0x98/0x128
    [] kobject_add_internal+0x114/0x258
    [] kobject_add_varg+0x7c/0xa0
    [] kobject_add+0x50/0x90
    [] kobject_create_and_add+0x54/0xc8
    [] cgroup_init+0x138/0x1f0
    [] start_kernel+0x5a0/0x840
    [] start_parisc+0xa4/0xb8
    [] packet_ioctl+0x16c/0x208
    [] ip_mroute_setsockopt+0x260/0xf20

    Signed-off-by: David Rientjes
    Cc: stable@kernel.org
    Signed-off-by: James Bottomley

    David Rientjes
     

16 Apr, 2011

7 commits

  • Cc: stable@kernel.org
    Signed-off-by: James Bottomley

    James Bottomley
     
  • Cc: stable@kernel.org
    Signed-off-by: James Bottomley

    James Bottomley
     
  • Cc: stable@kernel.org
    Signed-off-by: James Bottomley

    James Bottomley
     
  • Cc: stable@kernel.org
    Signed-off-by: James Bottomley

    James Bottomley
     
  • According to Appendix F, the TLB is the primary arbiter of speculation.
    Thus, if a page has a TLB entry, it may be speculatively read into the
    cache. On linux, this can cause us incoherencies because if we're about
    to do a disk read, we call get_user_pages() to do the flush/invalidate
    in user space, but we still potentially have the user TLB entries, and
    the cache could speculate the lines back into userspace (thus causing
    stale data to be used). This is fixed by purging the TLB entries before
    we flush through the tmpalias space. Now, the only way the line could
    be re-speculated is if the user actually tries to touch it (which is not
    allowed).

    Signed-off-by: James Bottomley

    James Bottomley
     
  • Currently parisc has the whole kernel marked as RWX, meaning any
    kernel page at all is eligible to be executed. This can cause a
    theoretical problem on systems with combined I/D TLB because the act
    of referencing a page causes a TLB insertion with an executable bit.
    This TLB entry may be used by the CPU as the basis for speculating the
    page into the I-Cache. If this speculated page is subsequently used
    for a user process, there is the possibility we will get a stale
    I-cache line picked up as the binary executes.

    As a point of good practise, only mark actual kernel text pages as
    executable. The same has to be done for init_text pages, but they're
    converted to data pages (and the I-Cache flushed) when the init memory
    is released.

    Signed-off-by: James Bottomley

    James Bottomley
     
  • Fix style of flush_user_dcache_range_asm procedure declaration in
    arch/parisc/kernel/pacache.s to be consistent with other assembly
    procedures.

    Signed-off-by: Meelis Roos
    Signed-off-by: James Bottomley

    Meelis Roos
     

14 Apr, 2011

1 commit

  • For future rework of try_to_wake_up() we'd like to push part of that
    function onto the CPU the task is actually going to run on.

    In order to do so we need a generic callback from the existing scheduler IPI.

    This patch introduces such a generic callback: scheduler_ipi() and
    implements it as a NOP.

    BenH notes: PowerPC might use this IPI on offline CPUs under rare conditions!

    Acked-by: Russell King
    Acked-by: Martin Schwidefsky
    Acked-by: Chris Metcalf
    Acked-by: Jesper Nilsson
    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: Ralf Baechle
    Reviewed-by: Frank Rowand
    Cc: Mike Galbraith
    Cc: Nick Piggin
    Cc: Linus Torvalds
    Cc: Andrew Morton
    Signed-off-by: Ingo Molnar
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20110405152728.744338123@chello.nl

    Peter Zijlstra
     

31 Mar, 2011

1 commit


30 Mar, 2011

1 commit


29 Mar, 2011

3 commits


25 Mar, 2011

2 commits

  • Commit ddd588b5dd55 ("oom: suppress nodes that are not allowed from
    meminfo on oom kill") moved lib/show_mem.o out of lib/lib.a, which
    resulted in build warnings on all architectures that implement their own
    versions of show_mem():

    lib/lib.a(show_mem.o): In function `show_mem':
    show_mem.c:(.text+0x1f4): multiple definition of `show_mem'
    arch/sparc/mm/built-in.o:(.text+0xd70): first defined here

    The fix is to remove __show_mem() and add its argument to show_mem() in
    all implementations to prevent this breakage.

    Architectures that implement their own show_mem() actually don't do
    anything with the argument yet, but they could be made to filter nodes
    that aren't allowed in the current context in the future just like the
    generic implementation.

    Reported-by: Stephen Rothwell
    Reported-by: James Bottomley
    Suggested-by: Andrew Morton
    Signed-off-by: David Rientjes
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • Percpu allocator honors alignment request upto PAGE_SIZE and both the
    percpu addresses in the percpu address space and the translated kernel
    addresses should be aligned accordingly. The calculation of the
    former depends on the alignment of percpu output section in the kernel
    image.

    The linker script macros PERCPU_VADDR() and PERCPU() are used to
    define this output section and the latter takes @align parameter.
    Several architectures are using @align smaller than PAGE_SIZE breaking
    percpu memory alignment.

    This patch removes @align parameter from PERCPU(), renames it to
    PERCPU_SECTION() and makes it always align to PAGE_SIZE. While at it,
    add PCPU_SETUP_BUG_ON() checks such that alignment problems are
    reliably detected and remove percpu alignment comment recently added
    in workqueue.c as the condition would trigger BUG way before reaching
    there.

    For um, this patch raises the alignment of percpu area. As the area
    is in .init, there shouldn't be any noticeable difference.

    This problem was discovered by David Howells while debugging boot
    failure on mn10300.

    Signed-off-by: Tejun Heo
    Acked-by: Mike Frysinger
    Cc: uclinux-dist-devel@blackfin.uclinux.org
    Cc: David Howells
    Cc: Jeff Dike
    Cc: user-mode-linux-devel@lists.sourceforge.net

    Tejun Heo
     

24 Mar, 2011

5 commits

  • There is no user now.

    Signed-off-by: FUJITA Tomonori
    Cc: David Miller
    Cc: Ralf Baechle
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    FUJITA Tomonori
     
  • minix bit operations are only used by minix filesystem and useless by
    other modules. Because byte order of inode and block bitmaps is different
    on each architecture like below:

    m68k:
    big-endian 16bit indexed bitmaps

    h8300, microblaze, s390, sparc, m68knommu:
    big-endian 32 or 64bit indexed bitmaps

    m32r, mips, sh, xtensa:
    big-endian 32 or 64bit indexed bitmaps for big-endian mode
    little-endian bitmaps for little-endian mode

    Others:
    little-endian bitmaps

    In order to move minix bit operations from asm/bitops.h to architecture
    independent code in minix filesystem, this provides two config options.

    CONFIG_MINIX_FS_BIG_ENDIAN_16BIT_INDEXED is only selected by m68k.
    CONFIG_MINIX_FS_NATIVE_ENDIAN is selected by the architectures which use
    native byte order bitmaps (h8300, microblaze, s390, sparc, m68knommu,
    m32r, mips, sh, xtensa). The architectures which always use little-endian
    bitmaps do not select these options.

    Finally, we can remove minix bit operations from asm/bitops.h for all
    architectures.

    Signed-off-by: Akinobu Mita
    Acked-by: Arnd Bergmann
    Acked-by: Greg Ungerer
    Cc: Geert Uytterhoeven
    Cc: Roman Zippel
    Cc: Andreas Schwab
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Yoshinori Sato
    Cc: Michal Simek
    Cc: "David S. Miller"
    Cc: Hirokazu Takata
    Acked-by: Ralf Baechle
    Acked-by: Paul Mundt
    Cc: Chris Zankel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • As the result of conversions, there are no users of ext2 non-atomic bit
    operations except for ext2 filesystem itself. Now we can put them into
    architecture independent code in ext2 filesystem, and remove from
    asm/bitops.h for all architectures.

    Signed-off-by: Akinobu Mita
    Cc: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • Introduce little-endian bit operations to the big-endian architectures
    which do not have native little-endian bit operations and the
    little-endian architectures. (alpha, avr32, blackfin, cris, frv, h8300,
    ia64, m32r, mips, mn10300, parisc, sh, sparc, tile, x86, xtensa)

    These architectures can just include generic implementation
    (asm-generic/bitops/le.h).

    Signed-off-by: Akinobu Mita
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Mikael Starvik
    Cc: David Howells
    Cc: Yoshinori Sato
    Cc: "Luck, Tony"
    Cc: Ralf Baechle
    Cc: Kyle McMartin
    Cc: Matthew Wilcox
    Cc: Grant Grundler
    Cc: Paul Mundt
    Cc: Kazumoto Kojima
    Cc: Hirokazu Takata
    Cc: "David S. Miller"
    Cc: Chris Zankel
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Acked-by: Hans-Christian Egtvedt
    Acked-by: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • This introduces CONFIG_GENERIC_FIND_BIT_LE to tell whether to use generic
    implementation of find_*_bit_le() in lib/find_next_bit.c or not.

    For now we select CONFIG_GENERIC_FIND_BIT_LE for all architectures which
    enable CONFIG_GENERIC_FIND_NEXT_BIT.

    But m68knommu wants to define own faster find_next_zero_bit_le() and
    continues using generic find_next_{,zero_}bit().
    (CONFIG_GENERIC_FIND_NEXT_BIT and !CONFIG_GENERIC_FIND_BIT_LE)

    Signed-off-by: Akinobu Mita
    Cc: Greg Ungerer
    Cc: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     

23 Mar, 2011

1 commit

  • All architectures can use the common dma_addr_t typedef now. We can
    remove the arch specific dma_addr_t.

    Signed-off-by: FUJITA Tomonori
    Acked-by: Arnd Bergmann
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc: Ivan Kokshaysky
    Cc: Richard Henderson
    Cc: Matt Turner
    Cc: "Luck, Tony"
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: Chris Metcalf
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    FUJITA Tomonori
     

22 Mar, 2011

1 commit


18 Mar, 2011

1 commit

  • Make __get_user_pages return -EHWPOISON for HWPOISON page only if
    FOLL_HWPOISON is specified. With this patch, the interested callers
    can distinguish HWPOISON pages from general FAULT pages, while other
    callers will still get -EFAULT for all these pages, so the user space
    interface need not to be changed.

    This feature is needed by KVM, where UCR MCE should be relayed to
    guest for HWPOISON page, while instruction emulation and MMIO will be
    tried for general FAULT page.

    The idea comes from Andrew Morton.

    Signed-off-by: Huang Ying
    Cc: Andrew Morton
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Huang Ying
     

17 Mar, 2011

1 commit

  • * 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (76 commits)
    pch_uart: reference clock on CM-iTC
    pch_phub: add new device ML7213
    n_gsm: fix UIH control byte : P bit should be 0
    n_gsm: add a documentation
    serial: msm_serial_hs: Add MSM high speed UART driver
    tty_audit: fix tty_audit_add_data live lock on audit disabled
    tty: move cd1865.h to drivers/staging/tty/
    Staging: tty: fix build with epca.c driver
    pcmcia: synclink_cs: fix prototype for mgslpc_ioctl()
    Staging: generic_serial: fix double locking bug
    nozomi: don't use flush_scheduled_work()
    tty/serial: Relax the device_type restriction from of_serial
    MAINTAINERS: Update HVC file patterns
    tty: phase out of ioctl file pointer for tty3270 as well
    tty: forgot to remove ipwireless from drivers/char/pcmcia/Makefile
    pch_uart: Fix DMA channel miss-setting issue.
    pch_uart: fix exclusive access issue
    pch_uart: fix auto flow control miss-setting issue
    pch_uart: fix uart clock setting issue
    pch_uart : Use dev_xxx not pr_xxx
    ...

    Fix up trivial conflicts in drivers/misc/pch_phub.c (same patch applied
    twice, then changes to the same area in one branch)

    Linus Torvalds
     

16 Mar, 2011

2 commits