10 Jul, 2020

1 commit

  • 'perf kmem' has an input file option but current an output file option
    fails:

    $ sudo perf kmem record -o /tmp/p.data sleep 1  
     Error: unknown switch `o'

    Usage: perf kmem [] {record|stat}

       -f, --force           don't complain, do it
       -i, --input    input file name
       -l, --line      show n lines
       -s, --sort
                             sort by keys: ptr, callsite, bytes, hit, pingpong, frag, page, order, mig>
       -v, --verbose         be more verbose (show symbol address, etc)
           --alloc           show per-allocation statistics
           --caller          show per-callsite statistics
           --live            Show live page stat
           --page            Analyze page allocator
           --raw-ip          show raw ip instead of symbol
           --slab            Analyze slab allocator
           --time      Time span of interest (start,stop)

    'perf sched' is similar in implementation and avoids the problem by
    passing additional arguments to 'perf record'.

    This change makes 'perf kmem' parse command line options consistently
    with 'perf sched', although neither actually list that -o is a supported
    option.

    Signed-off-by: Ian Rogers
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lore.kernel.org/lkml/20200708183919.4141023-1-irogers@google.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Ian Rogers
     

06 May, 2020

3 commits


12 Nov, 2019

1 commit


16 Oct, 2019

1 commit

  • The memory @orig_flags is allocated by strdup(), it is freed on the
    normal path, but leak to free on the error path.

    Fix this by adding free(orig_flags) on the error path.

    Fixes: 0e11115644b3 ("perf kmem: Print gfp flags in human readable string")
    Signed-off-by: Yunfeng Ye
    Cc: Alexander Shishkin
    Cc: Feilong Lin
    Cc: Hu Shiyuan
    Cc: Jiri Olsa
    Cc: Mark Rutland
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lore.kernel.org/lkml/f9e9f458-96f3-4a97-a1d5-9feec2420e07@huawei.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Yunfeng Ye
     

21 Sep, 2019

1 commit

  • This patch is to return error code of perf_new_session function on
    failure instead of NULL.

    Test Results:

    Before Fix:

    $ perf c2c report -input
    failed to open nput: No such file or directory

    $ echo $?
    0
    $

    After Fix:

    $ perf c2c report -input
    failed to open nput: No such file or directory

    $ echo $?
    254
    $

    Committer notes:

    Fix 'perf tests topology' case, where we use that TEST_ASSERT_VAL(...,
    session), i.e. we need to pass zero in case of failure, which was the
    case before when NULL was returned by perf_session__new() for failure,
    but now we need to negate the result of IS_ERR(session) to respect that
    TEST_ASSERT_VAL) expectation of zero meaning failure.

    Reported-by: Nageswara R Sastry
    Signed-off-by: Mamatha Inamdar
    Tested-by: Arnaldo Carvalho de Melo
    Tested-by: Nageswara R Sastry
    Acked-by: Ravi Bangoria
    Reviewed-by: Jiri Olsa
    Reviewed-by: Mukesh Ojha
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Greg Kroah-Hartman
    Cc: Jeremie Galarneau
    Cc: Kate Stewart
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Shawn Landden
    Cc: Song Liu
    Cc: Thomas Gleixner
    Cc: Tzvetomir Stoyanov
    Link: http://lore.kernel.org/lkml/20190822071223.17892.45782.stgit@localhost.localdomain
    Signed-off-by: Arnaldo Carvalho de Melo

    Mamatha Inamdar
     

01 Sep, 2019

3 commits

  • So that we can reduce the header dependency tree further, in the process
    noticed that lots of places were getting even things like build-id
    routines and 'struct perf_tool' definition indirectly, so fix all those
    too.

    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-ti0btma9ow5ndrytyoqdk62j@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Libtraceevent APIs for printing various trace events information are
    complicated, there are complex extra parameters. To control the way
    event information is printed, the user should call a set of functions in
    a specific sequence.

    These APIs are reimplemented to provide a more simple interface for
    printing event information.

    Removed APIs:

    tep_print_event_task()
    tep_print_event_time()
    tep_print_event_data()
    tep_event_info()
    tep_is_latency_format()
    tep_set_latency_format()
    tep_data_latency_format()
    tep_set_print_raw()

    A new API for printing event information is introduced:
    void tep_print_event(struct tep_handle *tep, struct trace_seq *s,
    struct tep_record *record, const char *fmt, ...);
    where "fmt" is a printf-like format string, followed by the event
    fields to be printed. Supported fields:
    TEP_PRINT_PID, "%d" - event PID
    TEP_PRINT_CPU, "%d" - event CPU
    TEP_PRINT_COMM, "%s" - event command string
    TEP_PRINT_NAME, "%s" - event name
    TEP_PRINT_LATENCY, "%s" - event latency
    TEP_PRINT_TIME, %d - event time stamp. A divisor and precision
    can be specified as part of this format string:
    "%precision.divisord". Example:
    "%3.1000d" - divide the time by 1000 and print the first 3 digits
    before the dot. Thus, the time stamp "123456000" will be printed as
    "123.456"
    TEP_PRINT_INFO, "%s" - event information.
    TEP_PRINT_INFO_RAW, "%s" - event information, in raw format.

    Example:
    tep_print_event(tep, s, record, "%16s-%-5d [%03d] %s %6.1000d %s %s",
    TEP_PRINT_COMM, TEP_PRINT_PID, TEP_PRINT_CPU,
    TEP_PRINT_LATENCY, TEP_PRINT_TIME, TEP_PRINT_NAME, TEP_PRINT_INFO);
    Output:
    ls-11314 [005] d.h. 185207.366383 function __wake_up

    Signed-off-by: Tzvetomir Stoyanov
    Cc: Andrew Morton
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: linux-trace-devel@vger.kernel.org
    Cc: Patrick McLean
    Link: http://lore.kernel.org/linux-trace-devel/20190801074959.22023-2-tz.stoyanov@gmail.com
    Link: http://lore.kernel.org/lkml/20190805204355.041132030@goodmis.org
    Signed-off-by: Steven Rostedt (VMware)
    Signed-off-by: Arnaldo Carvalho de Melo

    Tzvetomir Stoyanov
     
  • All we need there is a forward declaration for 'union perf_event', so
    remove it from there and add missing header directives in places using
    things from this indirect include.

    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-7ftk0ztstqub1tirjj8o8xbl@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

30 Jul, 2019

1 commit

  • Rename struct perf_evsel to struct evsel, so we don't have a name clash
    when we add struct perf_evsel in libperf.

    Committer notes:

    Added fixes for arm64, provided by Jiri.

    Signed-off-by: Jiri Olsa
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Michael Petlan
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

09 Jul, 2019

1 commit


26 Jun, 2019

2 commits

  • We got the sane_ctype.h headers from git and kept using it so far, but
    since that code originally came from the kernel sources to the git
    sources, perhaps its better to just use the one in the kernel, so that
    we can leverage tools/perf/check_headers.sh to be notified when our copy
    gets out of sync, i.e. when fixes or goodies are added to the code we've
    copied.

    This will help with things like tools/lib/string.c where we want to have
    more things in common with the kernel, such as strim(), skip_spaces(),
    etc so as to go on removing the things that we have in tools/perf/util/
    and instead using the code in the kernel, indirectly and removing things
    like EXPORT_SYMBOL(), etc, getting notified when fixes and improvements
    are made to the original code.

    Hopefully this also should help with reducing the difference of code
    hosted in tools/ to the one in the kernel proper.

    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-7k9868l713wqtgo01xxygn12@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     
  • Those are not in that file in the git repo, lets move it from there so
    that we get that sane ctype code fully isolated to allow getting it in
    sync either with the git sources or better with the kernel sources
    (include/linux/ctype.h + lib/ctype.h), that way we can use
    check_headers.h to get notified when changes are made in the original
    code so that we can cherry-pick.

    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-ioh5sghn3943j0rxg6lb2dgs@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

02 Apr, 2019

1 commit

  • The member "pevent" of the struct tep_event is renamed to "tep". This
    makes the struct consistent with the chosen naming convention:

    tep (trace event parser), instead of the old pevent.

    Signed-off-by: Tzvetomir Stoyanov
    Cc: Andrew Morton
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: http://lore.kernel.org/linux-trace-devel/20190401132111.13727-3-tstoyanov@vmware.com
    Link: http://lkml.kernel.org/r/20190401164344.627724996@goodmis.org
    Signed-off-by: Steven Rostedt (VMware)
    Signed-off-by: Arnaldo Carvalho de Melo

    Tzvetomir Stoyanov
     

23 Feb, 2019

1 commit

  • Add a 'path' member to 'struct perf_data'. It will keep the configured
    path for the data (const char *). The path in struct perf_data_file is
    now dynamically allocated (duped) from it.

    This scheme is useful/used in following patches where struct
    perf_data::path holds the 'configure' directory path and struct
    perf_data_file::path holds the allocated path for specific files.

    Also it actually makes the code little simpler.

    Signed-off-by: Jiri Olsa
    Cc: Adrian Hunter
    Cc: Alexander Shishkin
    Cc: Alexey Budankov
    Cc: Andi Kleen
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Stephane Eranian
    Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
    [ Fixup data-convert-bt.c missing conversion ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

06 Feb, 2019

1 commit

  • Lots of places get the map.h file indirectly, and since we're going to
    remove it from machine.h, then those need to include it directly, do it
    now, before we remove that dep.

    Cc: Adrian Hunter
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Link: https://lkml.kernel.org/n/tip-ob8jehdjda8h5jsrv9dqj9tf@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

22 Jan, 2019

1 commit

  • An automatic const char[] variable gets initialized at runtime, just
    like any other automatic variable. For long strings, that uses a lot of
    stack and wastes time building the string; e.g. for the "No %s
    allocation events..." case one has:

    444516: 48 b8 4e 6f 20 25 73 20 61 6c movabs $0x6c61207325206f4e,%rax # "No %s al"
    ...
    444674: 48 89 45 80 mov %rax,-0x80(%rbp)
    444678: 48 b8 6c 6f 63 61 74 69 6f 6e movabs $0x6e6f697461636f6c,%rax # "location"
    444682: 48 89 45 88 mov %rax,-0x78(%rbp)
    444686: 48 b8 20 65 76 65 6e 74 73 20 movabs $0x2073746e65766520,%rax # " events "
    444690: 66 44 89 55 c4 mov %r10w,-0x3c(%rbp)
    444695: 48 89 45 90 mov %rax,-0x70(%rbp)
    444699: 48 b8 66 6f 75 6e 64 2e 20 20 movabs $0x20202e646e756f66,%rax

    Make them all static so that the compiler just references objects in .rodata.

    Committer testing:

    Ok, using dwarves's codiff tool:

    $ codiff --functions /tmp/perf.before ~/bin/perf
    builtin-sched.c:
    cmd_sched | -48
    1 function changed, 48 bytes removed, diff: -48

    builtin-report.c:
    cmd_report | -32
    1 function changed, 32 bytes removed, diff: -32

    builtin-kmem.c:
    cmd_kmem | -64
    build_alloc_func_list | -50
    2 functions changed, 114 bytes removed, diff: -114

    builtin-c2c.c:
    perf_c2c__report | -390
    1 function changed, 390 bytes removed, diff: -390

    ui/browsers/header.c:
    tui__header_window | -104
    1 function changed, 104 bytes removed, diff: -104

    /home/acme/bin/perf:
    9 functions changed, 688 bytes removed, diff: -688

    Signed-off-by: Rasmus Villemoes
    Acked-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20181102230624.20064-1-linux@rasmusvillemoes.dk
    Signed-off-by: Arnaldo Carvalho de Melo

    Rasmus Villemoes
     

14 Aug, 2018

3 commits

  • In order to make libtraceevent into a proper library, variables, data
    structures and functions require a unique prefix to prevent name space
    conflicts. That prefix will be "tep_" and not "pevent_". This changes
    pevent_get_page_size API and enum pevent_flag to enum tep_flag

    Signed-off-by: Tzvetomir Stoyanov (VMware)
    Cc: Andrew Morton
    Cc: Peter Zijlstra
    Cc: Yordan Karadzhov (VMware)
    Cc: linux-trace-devel@vger.kernel.org
    Link: http://lkml.kernel.org/r/20180808180701.623942406@goodmis.org
    Signed-off-by: Steven Rostedt
    Signed-off-by: Arnaldo Carvalho de Melo

    Tzvetomir Stoyanov (VMware)
     
  • In order to make libtraceevent into a proper library, variables, data
    structures and functions require a unique prefix to prevent name space
    conflicts. That prefix will be "tep_" and not "pevent_". This changes
    APIs: pevent_alloc, pevent_free, pevent_event_info and pevent_func_resolver_t

    Signed-off-by: Tzvetomir Stoyanov (VMware)
    Cc: Andrew Morton
    Cc: Peter Zijlstra
    Cc: Yordan Karadzhov (VMware)
    Cc: linux-trace-devel@vger.kernel.org
    Link: http://lkml.kernel.org/r/20180808180700.152609945@goodmis.org
    Signed-off-by: Steven Rostedt
    Signed-off-by: Arnaldo Carvalho de Melo

    Tzvetomir Stoyanov (VMware)
     
  • In order to make libtraceevent into a proper library, variables, data
    structures and functions require a unique prefix to prevent name space
    conflicts. That prefix will be "tep_" and not "pevent_". This changes
    the 'struct pevent_record' to 'struct tep_record'.

    Signed-off-by: Tzvetomir Stoyanov (VMware)
    Cc: Andrew Morton
    Cc: Peter Zijlstra
    Cc: Yordan Karadzhov (VMware)
    Cc: linux-trace-devel@vger.kernel.org
    Link: http://lkml.kernel.org/r/20180808180659.866021298@goodmis.org
    Signed-off-by: Steven Rostedt
    Signed-off-by: Arnaldo Carvalho de Melo

    Tzvetomir Stoyanov (VMware)
     

30 Apr, 2018

1 commit


16 Nov, 2017

2 commits

  • As the page free path makes no distinction between cache hot and cold
    pages, there is no real useful ordering of pages in the free list that
    allocation requests can take advantage of. Juding from the users of
    __GFP_COLD, it is likely that a number of them are the result of copying
    other sites instead of actually measuring the impact. Remove the
    __GFP_COLD parameter which simplifies a number of paths in the page
    allocator.

    This is potentially controversial but bear in mind that the size of the
    per-cpu pagelists versus modern cache sizes means that the whole per-cpu
    list can often fit in the L3 cache. Hence, there is only a potential
    benefit for microbenchmarks that alloc/free pages in a tight loop. It's
    even worse when THP is taken into account which has little or no chance
    of getting a cache-hot page as the per-cpu list is bypassed and the
    zeroing of multiple pages will thrash the cache anyway.

    The truncate microbenchmarks are not shown as this patch affects the
    allocation path and not the free path. A page fault microbenchmark was
    tested but it showed no sigificant difference which is not surprising
    given that the __GFP_COLD branches are a miniscule percentage of the
    fault path.

    Link: http://lkml.kernel.org/r/20171018075952.10627-9-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman
    Acked-by: Vlastimil Babka
    Cc: Andi Kleen
    Cc: Dave Chinner
    Cc: Dave Hansen
    Cc: Jan Kara
    Cc: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • Now that kmemcheck is gone, we don't need the NOTRACK flags.

    Link: http://lkml.kernel.org/r/20171007030159.22241-5-alexander.levin@verizon.com
    Signed-off-by: Sasha Levin
    Cc: Alexander Potapenko
    Cc: Eric W. Biederman
    Cc: Michal Hocko
    Cc: Pekka Enberg
    Cc: Steven Rostedt
    Cc: Tim Hansen
    Cc: Vegard Nossum
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Levin, Alexander (Sasha Levin)
     

07 Nov, 2017

1 commit

  • Conflicts:
    tools/perf/arch/arm/annotate/instructions.c
    tools/perf/arch/arm64/annotate/instructions.c
    tools/perf/arch/powerpc/annotate/instructions.c
    tools/perf/arch/s390/annotate/instructions.c
    tools/perf/arch/x86/tests/intel-cqm.c
    tools/perf/ui/tui/progress.c
    tools/perf/util/zlib.c

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information it it.
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information,

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier to be applied to
    a file was done in a spreadsheet of side by side results from of the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few 1000 files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging was:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

31 Oct, 2017

2 commits

  • Add struct perf_data_file to represent a single file within a perf_data
    struct.

    Signed-off-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Changbin Du
    Cc: David Ahern
    Cc: Jin Yao
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-c3f9p4xzykr845ktqcek6p4t@git.kernel.org
    [ Fixup recent changes in 'perf script --per-event-dump' ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     
  • Rename struct perf_data_file to perf_data, because we will add the
    possibility to have multiple files under perf.data, so the 'perf_data'
    name fits better.

    Signed-off-by: Jiri Olsa
    Cc: Andi Kleen
    Cc: Changbin Du
    Cc: David Ahern
    Cc: Jin Yao
    Cc: Namhyung Kim
    Cc: Peter Zijlstra
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-39wn4d77phel3dgkzo3lyan0@git.kernel.org
    [ Fixup recent changes in 'perf script --per-event-dump' ]
    Signed-off-by: Arnaldo Carvalho de Melo

    Jiri Olsa
     

24 Oct, 2017

1 commit

  • If the string passed in '--time' is invalid, we must do some cleanup
    before leaving. As in the other error handling paths of this function.

    Signed-off-by: Christophe JAILLET
    Cc: Alexander Shishkin
    Cc: Peter Zijlstra
    Cc: kernel-janitors@vger.kernel.org
    Fixes: 2a865bd8dddd ("perf kmem: Add option to specify time window of interest")
    Link: http://lkml.kernel.org/r/20170916060936.28199-1-christophe.jaillet@wanadoo.fr
    Signed-off-by: Arnaldo Carvalho de Melo

    Christophe JAILLET
     

14 Sep, 2017

1 commit

  • GFP_TEMPORARY was introduced by commit e12ba74d8ff3 ("Group short-lived
    and reclaimable kernel allocations") along with __GFP_RECLAIMABLE. It's
    primary motivation was to allow users to tell that an allocation is
    short lived and so the allocator can try to place such allocations close
    together and prevent long term fragmentation. As much as this sounds
    like a reasonable semantic it becomes much less clear when to use the
    highlevel GFP_TEMPORARY allocation flag. How long is temporary? Can the
    context holding that memory sleep? Can it take locks? It seems there is
    no good answer for those questions.

    The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
    __GFP_RECLAIMABLE which in itself is tricky because basically none of
    the existing caller provide a way to reclaim the allocated memory. So
    this is rather misleading and hard to evaluate for any benefits.

    I have checked some random users and none of them has added the flag
    with a specific justification. I suspect most of them just copied from
    other existing users and others just thought it might be a good idea to
    use without any measuring. This suggests that GFP_TEMPORARY just
    motivates for cargo cult usage without any reasoning.

    I believe that our gfp flags are quite complex already and especially
    those with highlevel semantic should be clearly defined to prevent from
    confusion and abuse. Therefore I propose dropping GFP_TEMPORARY and
    replace all existing users to simply use GFP_KERNEL. Please note that
    SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
    so they will be placed properly for memory fragmentation prevention.

    I can see reasons we might want some gfp flag to reflect shorterm
    allocations but I propose starting from a clear semantic definition and
    only then add users with proper justification.

    This was been brought up before LSF this year by Matthew [1] and it
    turned out that GFP_TEMPORARY really doesn't have a clear semantic. It
    seems to be a heuristic without any measured advantage for most (if not
    all) its current users. The follow up discussion has revealed that
    opinions on what might be temporary allocation differ a lot between
    developers. So rather than trying to tweak existing users into a
    semantic which they haven't expected I propose to simply remove the flag
    and start from scratch if we really need a semantic for short term
    allocations.

    [1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org

    [akpm@linux-foundation.org: fix typo]
    [akpm@linux-foundation.org: coding-style fixes]
    [sfr@canb.auug.org.au: drm/i915: fix up]
    Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
    Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Signed-off-by: Stephen Rothwell
    Acked-by: Mel Gorman
    Acked-by: Vlastimil Babka
    Cc: Matthew Wilcox
    Cc: Neil Brown
    Cc: "Theodore Ts'o"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

13 Jul, 2017

1 commit

  • __GFP_REPEAT was designed to allow retry-but-eventually-fail semantic to
    the page allocator. This has been true but only for allocations
    requests larger than PAGE_ALLOC_COSTLY_ORDER. It has been always
    ignored for smaller sizes. This is a bit unfortunate because there is
    no way to express the same semantic for those requests and they are
    considered too important to fail so they might end up looping in the
    page allocator for ever, similarly to GFP_NOFAIL requests.

    Now that the whole tree has been cleaned up and accidental or misled
    usage of __GFP_REPEAT flag has been removed for !costly requests we can
    give the original flag a better name and more importantly a more useful
    semantic. Let's rename it to __GFP_RETRY_MAYFAIL which tells the user
    that the allocator would try really hard but there is no promise of a
    success. This will work independent of the order and overrides the
    default allocator behavior. Page allocator users have several levels of
    guarantee vs. cost options (take GFP_KERNEL as an example)

    - GFP_KERNEL & ~__GFP_RECLAIM - optimistic allocation without _any_
    attempt to free memory at all. The most light weight mode which even
    doesn't kick the background reclaim. Should be used carefully because
    it might deplete the memory and the next user might hit the more
    aggressive reclaim

    - GFP_KERNEL & ~__GFP_DIRECT_RECLAIM (or GFP_NOWAIT)- optimistic
    allocation without any attempt to free memory from the current
    context but can wake kswapd to reclaim memory if the zone is below
    the low watermark. Can be used from either atomic contexts or when
    the request is a performance optimization and there is another
    fallback for a slow path.

    - (GFP_KERNEL|__GFP_HIGH) & ~__GFP_DIRECT_RECLAIM (aka GFP_ATOMIC) -
    non sleeping allocation with an expensive fallback so it can access
    some portion of memory reserves. Usually used from interrupt/bh
    context with an expensive slow path fallback.

    - GFP_KERNEL - both background and direct reclaim are allowed and the
    _default_ page allocator behavior is used. That means that !costly
    allocation requests are basically nofail but there is no guarantee of
    that behavior so failures have to be checked properly by callers
    (e.g. OOM killer victim is allowed to fail currently).

    - GFP_KERNEL | __GFP_NORETRY - overrides the default allocator behavior
    and all allocation requests fail early rather than cause disruptive
    reclaim (one round of reclaim in this implementation). The OOM killer
    is not invoked.

    - GFP_KERNEL | __GFP_RETRY_MAYFAIL - overrides the default allocator
    behavior and all allocation requests try really hard. The request
    will fail if the reclaim cannot make any progress. The OOM killer
    won't be triggered.

    - GFP_KERNEL | __GFP_NOFAIL - overrides the default allocator behavior
    and all allocation requests will loop endlessly until they succeed.
    This might be really dangerous especially for larger orders.

    Existing users of __GFP_REPEAT are changed to __GFP_RETRY_MAYFAIL
    because they already had their semantic. No new users are added.
    __alloc_pages_slowpath is changed to bail out for __GFP_RETRY_MAYFAIL if
    there is no progress and we have already passed the OOM point.

    This means that all the reclaim opportunities have been exhausted except
    the most disruptive one (the OOM killer) and a user defined fallback
    behavior is more sensible than keep retrying in the page allocator.

    [akpm@linux-foundation.org: fix arch/sparc/kernel/mdesc.c]
    [mhocko@suse.com: semantic fix]
    Link: http://lkml.kernel.org/r/20170626123847.GM11534@dhcp22.suse.cz
    [mhocko@kernel.org: address other thing spotted by Vlastimil]
    Link: http://lkml.kernel.org/r/20170626124233.GN11534@dhcp22.suse.cz
    Link: http://lkml.kernel.org/r/20170623085345.11304-3-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: Alex Belits
    Cc: Chris Wilson
    Cc: Christoph Hellwig
    Cc: Darrick J. Wong
    Cc: David Daney
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Cc: NeilBrown
    Cc: Ralf Baechle
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

27 Jun, 2017

1 commit


20 Apr, 2017

4 commits


27 Mar, 2017

1 commit

  • We got it from the git sources but never used it for anything, with the
    place where this would be somehow used remaining:

    static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
    {
    prefix = NULL;
    if (p->option & RUN_SETUP)
    prefix = NULL; /* setup_perf_directory(); */

    Ditch it.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-uw5swz05vol0qpr32c5lpvus@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

14 Mar, 2017

1 commit

  • Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
    by the kernel when fork, clone, setns or unshare are invoked. And update
    perf-record documentation with the new option to record namespace
    events.

    Committer notes:

    Combined it with a later patch to allow printing it via 'perf report -D'
    and be able to test the feature introduced in this patch. Had to move
    here also perf_ns__name(), that was introduced in another later patch.

    Also used PRIu64 and PRIx64 to fix the build in some enfironments wrt:

    util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
    ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
    ^
    Testing it:

    # perf record --namespaces -a
    ^C[ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
    #
    # perf report -D

    3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
    [0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
    4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]

    0x1151e0 [0x30]: event: 9
    .
    . ... raw event: size 48 bytes
    . 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h....
    . 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c....
    . 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................

    NAMESPACES events: 1

    #

    Signed-off-by: Hari Bathini
    Acked-by: Jiri Olsa
    Tested-by: Arnaldo Carvalho de Melo
    Cc: Alexander Shishkin
    Cc: Alexei Starovoitov
    Cc: Ananth N Mavinakayanahalli
    Cc: Aravinda Prasad
    Cc: Brendan Gregg
    Cc: Daniel Borkmann
    Cc: Eric Biederman
    Cc: Peter Zijlstra
    Cc: Sargun Dhillon
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com
    Signed-off-by: Arnaldo Carvalho de Melo

    Hari Bathini
     

14 Feb, 2017

1 commit

  • As it is an array, so will always evaluate to 'true', as reported by
    clang:

    builtin-sched.c:2070:19: error: address of array 'sym->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
    if (sym && sym->name) {
    ~~ ~~~~~^~~~
    1 warning generated.

    So just ditch all those useless checks.

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-ydpm927col06paixb775jjx5@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo
     

27 Jan, 2017

1 commit

  • Previously these were being ignored, sometimes silently.

    Stop doing that, emitting debug messages and handling the errors.

    Testing it:

    $ cat ~/.perfconfig
    cat: /home/acme/.perfconfig: No such file or directory
    $ perf stat -e cycles usleep 1

    Performance counter stats for 'usleep 1':

    938,996 cycles:u

    0.003813731 seconds time elapsed

    $ perf top --stdio
    Error:
    You may not have permission to collect system-wide stats.

    Consider tweaking /proc/sys/kernel/perf_event_paranoid,

    [ perf record: Captured and wrote 0.019 MB perf.data (7 samples) ]
    [acme@jouet linux]$ perf report --stdio
    # To display the perf.data header info, please use --header/--header-only options.
    # Overhead Command Shared Object Symbol
    # ........ ....... ................. .........................
    71.77% usleep libc-2.24.so [.] _dl_addr
    27.07% usleep ld-2.24.so [.] _dl_next_ld_env_entry
    1.13% usleep [kernel.kallsyms] [k] page_fault
    $
    $ touch ~/.perfconfig
    $ ls -la ~/.perfconfig
    -rw-rw-r--. 1 acme acme 0 Jan 27 12:14 /home/acme/.perfconfig
    $
    $ perf stat -e instructions usleep 1

    Performance counter stats for 'usleep 1':

    244,610 instructions:u

    0.000805383 seconds time elapsed

    $
    [root@jouet ~]# chown acme.acme ~/.perfconfig
    [root@jouet ~]# perf stat -e cycles usleep 1
    Warning: File /root/.perfconfig not owned by current user or root, ignoring it.

    Performance counter stats for 'usleep 1':

    937,615 cycles

    0.000836931 seconds time elapsed
    #

    Cc: Adrian Hunter
    Cc: David Ahern
    Cc: Jiri Olsa
    Cc: Namhyung Kim
    Cc: Wang Nan
    Link: http://lkml.kernel.org/n/tip-j2rq96so6xdqlr8p8rd6a3jx@git.kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo

    Arnaldo Carvalho de Melo