01 Jul, 2011

3 commits

  • The perf_event overflow handler does not receive any caller-derived
    argument, so many callers need to resort to looking up the perf_event
    in their local data structure. This is ugly and doesn't scale if a
    single callback services many perf_events.

    Fix by adding a context parameter to perf_event_create_kernel_counter()
    (and derived hardware breakpoints APIs) and storing it in the perf_event.
    The field can be accessed from the callback as event->overflow_handler_context.
    All callers are updated.

    Signed-off-by: Avi Kivity
    Signed-off-by: Peter Zijlstra
    Link: http://lkml.kernel.org/r/1309362157-6596-2-git-send-email-avi@redhat.com
    Signed-off-by: Ingo Molnar

    Avi Kivity
     
  • Add a NODE level to the generic cache events which is used to measure
    local vs remote memory accesses. Like all other cache events, an
    ACCESS is HIT+MISS, if there is no way to distinguish between reads
    and writes do reads only etc..

    The below needs filling out for !x86 (which I filled out with
    unsupported events).

    I'm fairly sure ARM can leave it like that since it doesn't strike me as
    an architecture that even has NUMA support. SH might have something since
    it does appear to have some NUMA bits.

    Sparc64, PowerPC and MIPS certainly want a good look there since they
    clearly are NUMA capable.

    Signed-off-by: Peter Zijlstra
    Cc: David Miller
    Cc: Anton Blanchard
    Cc: David Daney
    Cc: Deng-Cheng Zhu
    Cc: Paul Mundt
    Cc: Will Deacon
    Cc: Robert Richter
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/1303508226.4865.8.camel@laptop
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The nmi parameter indicated if we could do wakeups from the current
    context, if not, we would set some state and self-IPI and let the
    resulting interrupt do the wakeup.

    For the various event classes:

    - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
    the PMI-tail (ARM etc.)
    - tracepoint: nmi=0; since tracepoint could be from NMI context.
    - software: nmi=[0,1]; some, like the schedule thing cannot
    perform wakeups, and hence need 0.

    As one can see, there is very little nmi=1 usage, and the down-side of
    not using it is that on some platforms some software events can have a
    jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).

    The up-side however is that we can remove the nmi parameter and save a
    bunch of conditionals in fast paths.

    Signed-off-by: Peter Zijlstra
    Cc: Michael Cree
    Cc: Will Deacon
    Cc: Deng-Cheng Zhu
    Cc: Anton Blanchard
    Cc: Eric B Munson
    Cc: Heiko Carstens
    Cc: Paul Mundt
    Cc: David S. Miller
    Cc: Frederic Weisbecker
    Cc: Jason Wessel
    Cc: Don Zickus
    Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

28 Jun, 2011

1 commit

  • commit 21a3c96 uses node_start/end_pfn(nid) for detection start/end
    of nodes. But, it's not defined in linux/mmzone.h but defined in
    /arch/???/include/mmzone.h which is included only under
    CONFIG_NEED_MULTIPLE_NODES=y.

    Then, we see
    mm/page_cgroup.c: In function 'page_cgroup_init':
    mm/page_cgroup.c:308: error: implicit declaration of function 'node_start_pfn'
    mm/page_cgroup.c:309: error: implicit declaration of function 'node_end_pfn'

    So, fixiing page_cgroup.c is an idea...

    But node_start_pfn()/node_end_pfn() is a very generic macro and
    should be implemented in the same manner for all archs.
    (m32r has different implementation...)

    This patch removes definitions of node_start/end_pfn() in each archs
    and defines a unified one in linux/mmzone.h. It's not under
    CONFIG_NEED_MULTIPLE_NODES, now.

    A result of macro expansion is here (mm/page_cgroup.c)

    for !NUMA
    start_pfn = ((&contig_page_data)->node_start_pfn);
    end_pfn = ({ pg_data_t *__pgdat = (&contig_page_data); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});

    for NUMA (x86-64)
    start_pfn = ((node_data[nid])->node_start_pfn);
    end_pfn = ({ pg_data_t *__pgdat = (node_data[nid]); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});

    Changelog:
    - fixed to avoid using "nid" twice in node_end_pfn() macro.

    Reported-and-acked-by: Randy Dunlap
    Reported-and-tested-by: Ingo Molnar
    Acked-by: Mel Gorman
    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     

17 Jun, 2011

1 commit


16 Jun, 2011

3 commits


14 Jun, 2011

1 commit


08 Jun, 2011

1 commit

  • gUSA special cases r15 for part of its login/out sequence, meaning that
    any parameters need to be explicitly prohibited from accidentally being
    assigned that particular register, and the compiler ultimately needs to
    use a temporary instead.

    Certain configurations have begun generating code paths that do indeed
    get allocated r15, resulting in immediate corruption of the exchanged
    value. This was observed in (amongst others) exit_mm() code generation
    where the xchg_u32 call was immediately corrupting a structure address.

    As this is a general gUSA restriction, the rest of the users likewise
    need to be updated to ensure sensible constraints.

    References: https://bugzilla.stlinux.com/show_bug.cgi?id=11229
    Signed-off-by: Srinivas Kandagatla
    Reviewed-by: Stuart Menefy
    Signed-off-by: Paul Mundt

    Srinivas KANDAGATLA
     

06 Jun, 2011

2 commits

  • SH-2A is unable to combine the kernel and libgcc objects due to
    fundamental disagreements over FDPIC settings. As the kernel already
    contains all of the libgcc bits broken out, there's not much need to
    bother with the linking anymore, as everything can already be derived
    from the lib dir.

    This simply plugs in the necessary bits to ensure that everything is
    built uniformly, enabling us to wean the compressed build off of explicit
    libgcc linking.

    Reported-by: Phil Edworthy
    Signed-off-by: Paul Mundt

    Paul Mundt
     
  • This patch fixes a icache/dcache address-array start address while
    dumping its entires in debugfs. Perviously the code was attempting to
    remember the address in static variable, which is no more required
    for debugfs, as the function can be executed in one pass.

    Without this patch the start address ends up in wrong place and the
    /sys/kernel/debug/sh/icache or dcache debugfs contents may not be correct.

    Signed-off-by: Srinivas Kandagatla
    Cc: Stuart Menefy
    Signed-off-by: Paul Mundt

    Srinivas KANDAGATLA
     

31 May, 2011

5 commits


29 May, 2011

2 commits

  • * setns:
    ns: Wire up the setns system call

    Done as a merge to make it easier to fix up conflicts in arm due to
    addition of sendmmsg system call

    Linus Torvalds
     
  • 32bit and 64bit on x86 are tested and working. The rest I have looked
    at closely and I can't find any problems.

    setns is an easy system call to wire up. It just takes two ints so I
    don't expect any weird architecture porting problems.

    While doing this I have noticed that we have some architectures that are
    very slow to get new system calls. cris seems to be the slowest where
    the last system calls wired up were preadv and pwritev. avr32 is weird
    in that recvmmsg was wired up but never declared in unistd.h. frv is
    behind with perf_event_open being the last syscall wired up. On h8300
    the last system call wired up was epoll_wait. On m32r the last system
    call wired up was fallocate. mn10300 has recvmmsg as the last system
    call wired up. The rest seem to at least have syncfs wired up which was
    new in the 2.6.39.

    v2: Most of the architecture support added by Daniel Lezcano
    v3: ported to v2.6.36-rc4 by: Eric W. Biederman
    v4: Moved wiring up of the system call to another patch
    v5: ported to v2.6.39-rc6
    v6: rebased onto parisc-next and net-next to avoid syscall conflicts.
    v7: ported to Linus's latest post 2.6.39 tree.

    >  arch/blackfin/include/asm/unistd.h     |    3 ++-
    >  arch/blackfin/mach-common/entry.S      |    1 +
    Acked-by: Mike Frysinger

    Oh - ia64 wiring looks good.
    Acked-by: Tony Luck

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Linus Torvalds

    Eric W. Biederman
     

28 May, 2011

1 commit


27 May, 2011

4 commits

  • By the previous style change, CONFIG_GENERIC_FIND_NEXT_BIT,
    CONFIG_GENERIC_FIND_BIT_LE, and CONFIG_GENERIC_FIND_LAST_BIT are not used
    to test for existence of find bitops anymore.

    Signed-off-by: Akinobu Mita
    Acked-by: Greg Ungerer
    Cc: Arnd Bergmann
    Cc: Russell King
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     
  • The Blackfin arch, like the x86 arch, needs to adjust the PC manually
    after a breakpoint is hit as normally this is handled by the remote gdb.
    However, rather than starting another arch ifdef mess, create a common
    GDB_ADJUSTS_BREAK_OFFSET define for any arch to opt-in via their kgdb.h.

    Signed-off-by: Mike Frysinger
    Cc: Oleg Nesterov
    Cc: Jason Wessel
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Acked-by: Paul Mundt
    Acked-by: Dongdong Deng
    Cc: Sergei Shtylyov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Frysinger
     
  • Signed-off-by: Mike Frysinger
    Cc: Oleg Nesterov
    Cc: Jason Wessel
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Paul Mundt
    Cc: Sergei Shtylyov
    Cc: Dongdong Deng
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Frysinger
     
  • The ns_cgroup is an annoying cgroup at the namespace / cgroup frontier and
    leads to some problems:

    * cgroup creation is out-of-control
    * cgroup name can conflict when pids are looping
    * it is not possible to have a single process handling a lot of
    namespaces without falling in a exponential creation time
    * we may want to create a namespace without creating a cgroup

    The ns_cgroup was replaced by a compatibility flag 'clone_children',
    where a newly created cgroup will copy the parent cgroup values.
    The userspace has to manually create a cgroup and add a task to
    the 'tasks' file.

    This patch removes the ns_cgroup as suggested in the following thread:

    https://lists.linux-foundation.org/pipermail/containers/2009-June/018616.html

    The 'cgroup_clone' function is removed because it is no longer used.

    This is a userspace-visible change. Commit 45531757b45c ("cgroup: notify
    ns_cgroup deprecated") (merged into 2.6.27) caused the kernel to emit a
    printk warning users that the feature is planned for removal. Since that
    time we have heard from XXX users who were affected by this.

    Signed-off-by: Daniel Lezcano
    Signed-off-by: Serge E. Hallyn
    Cc: Eric W. Biederman
    Cc: Jamal Hadi Salim
    Reviewed-by: Li Zefan
    Acked-by: Paul Menage
    Acked-by: Matt Helsley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Lezcano
     

25 May, 2011

6 commits

  • Most arches define CONFIG_DEBUG_STACK_USAGE exactly the same way. Move it
    to lib/Kconfig.debug so each arch doesn't have to define it. This
    obviously makes the option generic, but that's fine because the config is
    already used in generic code.

    It's not obvious to me that sysrq-P actually does anything caution by
    keeping the most inclusive wording.

    Signed-off-by: Stephen Boyd
    Cc: Chris Metcalf
    Acked-by: David S. Miller
    Acked-by: Richard Weinberger
    Acked-by: Mike Frysinger
    Cc: Russell King
    Cc: Hirokazu Takata
    Acked-by: Ralf Baechle
    Cc: Paul Mackerras
    Acked-by: Benjamin Herrenschmidt
    Cc: Chen Liqin
    Cc: Lennox Wu
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Boyd
     
  • Fold all the mmu_gather rework patches into one for submission

    Signed-off-by: Peter Zijlstra
    Reported-by: Hugh Dickins
    Cc: Benjamin Herrenschmidt
    Cc: David Miller
    Cc: Martin Schwidefsky
    Cc: Russell King
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: Tony Luck
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: KOSAKI Motohiro
    Cc: Nick Piggin
    Cc: Namhyung Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Fix up the sh mmu_gather code to conform to the new API.

    Signed-off-by: Peter Zijlstra
    Acked-by: Paul Mundt
    Cc: Benjamin Herrenschmidt
    Cc: David Miller
    Cc: Martin Schwidefsky
    Cc: Russell King
    Cc: Jeff Dike
    Cc: Richard Weinberger
    Cc: Tony Luck
    Cc: KAMEZAWA Hiroyuki
    Cc: Hugh Dickins
    Cc: Mel Gorman
    Cc: KOSAKI Motohiro
    Cc: Nick Piggin
    Cc: Namhyung Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • This makes it possible to leave DMA slave IDs in the platform data
    at default 0 value without hitting DMA channel allocation error paths.

    Signed-off-by: Guennadi Liakhovetski
    Signed-off-by: Paul Mundt

    Guennadi Liakhovetski
     
  • All architectures supporting hibernation define
    arch_prepare_suspend() as an empty function, so remove it.

    Signed-off-by: Rafael J. Wysocki

    Rafael J. Wysocki
     
  • * 'for-2.6.40' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
    percpu: Unify input section names
    percpu: Avoid extra NOP in percpu_cmpxchg16b_double
    percpu: Cast away printk format warning
    percpu: Always align percpu output section to PAGE_SIZE

    Fix up fairly trivial conflict in arch/x86/include/asm/percpu.h as per Tejun

    Linus Torvalds
     

24 May, 2011

4 commits


23 May, 2011

6 commits