16 Dec, 2009

26 commits

  • Register per node hstate sysfs attributes only for nodes with memory.
    Global replacement of "all online nodes" with "all nodes with memory" in
    mm/hugetlb.c. Suggested by David Rientjes.

    A subsequent patch will handle adding/removing of per node hstate sysfs
    attributes when nodes transition to/from memoryless state via memory
    hotplug.

    NOTE: this patch has not been tested with memoryless nodes.

    Signed-off-by: Lee Schermerhorn
    Reviewed-by: Andi Kleen
    Cc: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Randy Dunlap
    Cc: Nishanth Aravamudan
    Acked-by: David Rientjes
    Cc: Adam Litke
    Cc: Andy Whitcroft
    Cc: Eric Whitney
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     
  • Update the kernel huge tlb documentation to describe the numa memory
    policy based huge page management. Additionally, the patch includes a fair
    amount of rework to improve consistency, eliminate duplication and set the
    context for documenting the memory policy interaction.

    Signed-off-by: Lee Schermerhorn
    Acked-by: David Rientjes
    Acked-by: Mel Gorman
    Reviewed-by: Andi Kleen
    Cc: KAMEZAWA Hiroyuki
    Cc: Lee Schermerhorn
    Cc: Randy Dunlap
    Cc: Nishanth Aravamudan
    Cc: Adam Litke
    Cc: Andy Whitcroft
    Cc: Eric Whitney
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     
  • Add the per huge page size control/query attributes to the per node
    sysdevs:

    /sys/devices/system/node/node/hugepages/hugepages-/
    nr_hugepages - r/w
    free_hugepages - r/o
    surplus_hugepages - r/o

    The patch attempts to re-use/share as much of the existing global hstate
    attribute initialization and handling, and the "nodes_allowed" constraint
    processing as possible.

    Calling set_max_huge_pages() with no node indicates a change to global
    hstate parameters. In this case, any non-default task mempolicy will be
    used to generate the nodes_allowed mask. A valid node id indicates an
    update to that node's hstate parameters, and the count argument specifies
    the target count for the specified node. From this info, we compute the
    target global count for the hstate and construct a nodes_allowed node mask
    containing only the specified node.
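
    The arithmetic behind the per-node case can be sketched in user-space C.
    This is a minimal sketch, not the kernel code: struct pool_counts,
    global_target(), MAX_NODES and NO_NODE are hypothetical names chosen for
    illustration.

```c
#include <assert.h>

#define MAX_NODES 4    /* illustrative; the example machine below has 4 nodes */
#define NO_NODE   (-1) /* stands in for "no node id specified" */

/* Hypothetical stand-in for the per-hstate page counters. */
struct pool_counts {
	unsigned long total;               /* persistent huge pages, all nodes */
	unsigned long per_node[MAX_NODES]; /* persistent huge pages per node */
};

/*
 * Translate a request into a global target count.
 * nid == NO_NODE: count already is the global target.
 * otherwise:      count is the target for node nid only, so the global
 *                 target is the current total with that node's share
 *                 replaced by count.
 */
static unsigned long global_target(const struct pool_counts *p, int nid,
				   unsigned long count)
{
	if (nid == NO_NODE)
		return count;
	assert(nid >= 0 && nid < MAX_NODES);
	return p->total - p->per_node[nid] + count;
}
```

    With 8 pages on each of 4 nodes, asking for 16 pages on node 2 gives a
    global target of 32 - 8 + 16 = 40, the same total used in the mempolicy
    example later in this log.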

    Setting the node specific nr_hugepages via the per node attribute
    effectively ignores any task mempolicy or cpuset constraints.

    With this patch:

    (me):ls /sys/devices/system/node/node0/hugepages/hugepages-2048kB
    ./ ../ free_hugepages nr_hugepages surplus_hugepages

    Starting from:
    Node 0 HugePages_Total: 0
    Node 0 HugePages_Free: 0
    Node 0 HugePages_Surp: 0
    Node 1 HugePages_Total: 0
    Node 1 HugePages_Free: 0
    Node 1 HugePages_Surp: 0
    Node 2 HugePages_Total: 0
    Node 2 HugePages_Free: 0
    Node 2 HugePages_Surp: 0
    Node 3 HugePages_Total: 0
    Node 3 HugePages_Free: 0
    Node 3 HugePages_Surp: 0
    vm.nr_hugepages = 0

    Allocate 16 persistent huge pages on node 2:
    (me):echo 16 >/sys/devices/system/node/node2/hugepages/hugepages-2048kB/nr_hugepages

    [Note that this is equivalent to:
    numactl -m 2 hugeadm --pool-pages-min 2M:+16
    ]

    Yields:
    Node 0 HugePages_Total: 0
    Node 0 HugePages_Free: 0
    Node 0 HugePages_Surp: 0
    Node 1 HugePages_Total: 0
    Node 1 HugePages_Free: 0
    Node 1 HugePages_Surp: 0
    Node 2 HugePages_Total: 16
    Node 2 HugePages_Free: 16
    Node 2 HugePages_Surp: 0
    Node 3 HugePages_Total: 0
    Node 3 HugePages_Free: 0
    Node 3 HugePages_Surp: 0
    vm.nr_hugepages = 16

    Global controls work as expected--reduce pool to 8 persistent huge pages:
    (me):echo 8 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

    Node 0 HugePages_Total: 0
    Node 0 HugePages_Free: 0
    Node 0 HugePages_Surp: 0
    Node 1 HugePages_Total: 0
    Node 1 HugePages_Free: 0
    Node 1 HugePages_Surp: 0
    Node 2 HugePages_Total: 8
    Node 2 HugePages_Free: 8
    Node 2 HugePages_Surp: 0
    Node 3 HugePages_Total: 0
    Node 3 HugePages_Free: 0
    Node 3 HugePages_Surp: 0

    Signed-off-by: Lee Schermerhorn
    Acked-by: Mel Gorman
    Reviewed-by: Andi Kleen
    Cc: KAMEZAWA Hiroyuki
    Cc: Randy Dunlap
    Cc: Nishanth Aravamudan
    Cc: David Rientjes
    Cc: Adam Litke
    Cc: Andy Whitcroft
    Cc: Eric Whitney
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     
  • Move definition of NUMA_NO_NODE from ia64 and x86_64 arch specific headers
    to generic header 'linux/numa.h' for use in generic code. NUMA_NO_NODE
    replaces bare '-1' where it's used in this series to indicate "no node id
    specified". Ultimately, it can be used to replace the -1 elsewhere where
    it is used similarly.

    Signed-off-by: Lee Schermerhorn
    Acked-by: David Rientjes
    Acked-by: Mel Gorman
    Reviewed-by: Andi Kleen
    Cc: KAMEZAWA Hiroyuki
    Cc: Randy Dunlap
    Cc: Nishanth Aravamudan
    Cc: Adam Litke
    Cc: Andy Whitcroft
    Cc: Eric Whitney
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     
  • This patch derives a "nodes_allowed" node mask from the numa mempolicy of
    the task modifying the number of persistent huge pages to control the
    allocation, freeing and adjusting of surplus huge pages when the pool page
    count is modified via the new sysctl or sysfs attribute
    "nr_hugepages_mempolicy". The nodes_allowed mask is derived as follows:

    * For "default" [NULL] task mempolicy, a NULL nodemask_t pointer
    is produced. This will cause the hugetlb subsystem to use
    node_online_map as the "nodes_allowed". This preserves the
    behavior before this patch.
    * For "preferred" mempolicy, including explicit local allocation,
    a nodemask with the single preferred node will be produced.
    "local" policy will NOT track any internode migrations of the
    task adjusting nr_hugepages.
    * For "bind" and "interleave" policy, the mempolicy's nodemask
    will be used.
    * Other than to inform the construction of the nodes_allowed node
    mask, the actual mempolicy mode is ignored. That is, all modes
    behave like interleave over the resulting nodes_allowed mask
    with no "fallback".
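
    The derivation rules above can be sketched as a small user-space
    function. This is an illustration under stated assumptions, not the
    kernel implementation: the nodemask typedef (one bit per node in a
    64-bit word), enum policy_mode, struct task_policy and
    nodes_allowed_from_policy() are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t nodemask; /* one bit per node; illustrative, not nodemask_t */

enum policy_mode {
	POLICY_DEFAULT,
	POLICY_PREFERRED,
	POLICY_BIND,
	POLICY_INTERLEAVE
};

struct task_policy {
	enum policy_mode mode;
	int preferred_node; /* POLICY_PREFERRED only; -1 means "local" */
	nodemask nodes;     /* POLICY_BIND and POLICY_INTERLEAVE only */
};

/*
 * Fill *allowed from the task policy. Returns false for the default
 * policy, meaning "no mask: fall back to all online nodes".
 * local_node is the node the task runs on at the time of the call; it is
 * sampled once, so later migrations of the task are not tracked.
 */
static bool nodes_allowed_from_policy(const struct task_policy *pol,
				      int local_node, nodemask *allowed)
{
	int nid;

	switch (pol->mode) {
	case POLICY_PREFERRED:
		nid = pol->preferred_node < 0 ? local_node : pol->preferred_node;
		assert(nid >= 0 && nid < 64);
		*allowed = (nodemask)1 << nid;
		return true;
	case POLICY_BIND:
	case POLICY_INTERLEAVE:
		*allowed = pol->nodes;
		return true;
	case POLICY_DEFAULT:
	default:
		return false;
	}
}
```

    Only the mask leaves this function. The mode itself is dropped, which
    matches the last rule above: every mode then behaves like interleave
    over the resulting mask.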

    See the updated documentation [next patch] for more information
    about the implications of this patch.

    Examples:

    Starting with:

    Node 0 HugePages_Total: 0
    Node 1 HugePages_Total: 0
    Node 2 HugePages_Total: 0
    Node 3 HugePages_Total: 0

    Default behavior [with or without this patch] balances persistent
    hugepage allocation across nodes [with sufficient contiguous memory]:

    sysctl vm.nr_hugepages[_mempolicy]=32

    yields:

    Node 0 HugePages_Total: 8
    Node 1 HugePages_Total: 8
    Node 2 HugePages_Total: 8
    Node 3 HugePages_Total: 8

    Of course, we only have nr_hugepages_mempolicy with the patch,
    but with default mempolicy, nr_hugepages_mempolicy behaves the
    same as nr_hugepages.

    Applying mempolicy--e.g., with numactl [using '-m' a.k.a.
    '--membind' because it allows multiple nodes to be specified
    and it's easy to type]--we can allocate huge pages on
    individual nodes or sets of nodes. So, starting from the
    condition above, with 8 huge pages per node, add 8 more to
    node 2 using:

    numactl -m 2 sysctl vm.nr_hugepages_mempolicy=40

    This yields:

    Node 0 HugePages_Total: 8
    Node 1 HugePages_Total: 8
    Node 2 HugePages_Total: 16
    Node 3 HugePages_Total: 8

    The incremental 8 huge pages were restricted to node 2 by the
    specified mempolicy.

    Similarly, we can use mempolicy to free persistent huge pages
    from specified nodes:

    numactl -m 0,1 sysctl vm.nr_hugepages_mempolicy=32

    yields:

    Node 0 HugePages_Total: 4
    Node 1 HugePages_Total: 4
    Node 2 HugePages_Total: 16
    Node 3 HugePages_Total: 8

    The 8 huge pages freed were balanced over nodes 0 and 1.

    [rientjes@google.com: accommodate reworked NODEMASK_ALLOC]
    Signed-off-by: David Rientjes
    Signed-off-by: Lee Schermerhorn
    Acked-by: Mel Gorman
    Reviewed-by: Andi Kleen
    Cc: KAMEZAWA Hiroyuki
    Cc: Randy Dunlap
    Cc: Nishanth Aravamudan
    Cc: Adam Litke
    Cc: Andy Whitcroft
    Cc: Eric Whitney
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     
  • Factor init_nodemask_of_node() out of the nodemask_of_node() macro.

    This will be used to populate the huge pages "nodes_allowed" nodemask for
    a single node when basing nodes_allowed on a preferred/local mempolicy or
    when a persistent huge page pool page count is modified via a per node
    sysfs attribute.
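
    What a single-node mask initializer does can be sketched as follows. This
    is a user-space approximation: struct node_mask, init_mask_of_node() and
    mask_test_node() are hypothetical names, and the 128-node limit is an
    arbitrary choice that forces the mask to span more than one word.

```c
#include <assert.h>
#include <string.h>

#define MAX_NODES     128 /* arbitrary, large enough to need several words */
#define BITS_PER_WORD (8 * sizeof(unsigned long))
#define MASK_WORDS    ((MAX_NODES + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct node_mask {
	unsigned long bits[MASK_WORDS];
};

/* Clear the whole mask, then set the single bit for node. */
static void init_mask_of_node(struct node_mask *mask, int node)
{
	assert(node >= 0 && node < MAX_NODES);
	memset(mask, 0, sizeof(*mask));
	mask->bits[node / BITS_PER_WORD] |= 1UL << (node % BITS_PER_WORD);
}

/* Return 1 if node is set in mask, 0 otherwise. */
static int mask_test_node(const struct node_mask *mask, int node)
{
	assert(node >= 0 && node < MAX_NODES);
	return (int)((mask->bits[node / BITS_PER_WORD] >>
		      (node % BITS_PER_WORD)) & 1UL);
}
```

    Because the helper works on a caller-supplied mask, the caller decides
    where the mask lives, which matters when the mask is too large to place
    on the stack.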

    Signed-off-by: Lee Schermerhorn
    Acked-by: Mel Gorman
    Reviewed-by: Andi Kleen
    Cc: KAMEZAWA Hiroyuki
    Cc: Randy Dunlap
    Cc: Nishanth Aravamudan
    Acked-by: David Rientjes
    Cc: Adam Litke
    Cc: Andy Whitcroft
    Cc: Eric Whitney
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     
  • In preparation for constraining huge page allocation and freeing by the
    controlling task's numa mempolicy, add a "nodes_allowed" nodemask pointer
    to the allocate, free and surplus adjustment functions. For now, pass
    NULL to indicate default behavior--i.e., use node_online_map. A
    subsequent patch will derive a non-default mask from the controlling
    task's numa mempolicy.

    Note that this method of updating the global hstate nr_hugepages under the
    constraint of a nodemask simplifies keeping the global state
    consistent--especially the number of persistent and surplus pages relative
    to reservations and overcommit limits. There are undoubtedly other ways
    to do this, but this works for both interfaces: mempolicy and per node
    attributes.

    [rientjes@google.com: fix HIGHMEM compile error]
    Signed-off-by: Lee Schermerhorn
    Reviewed-by: Mel Gorman
    Acked-by: David Rientjes
    Reviewed-by: Andi Kleen
    Cc: KAMEZAWA Hiroyuki
    Cc: Randy Dunlap
    Cc: Nishanth Aravamudan
    Cc: Andi Kleen
    Cc: Adam Litke
    Cc: Andy Whitcroft
    Cc: Eric Whitney
    Cc: Christoph Lameter
    Signed-off-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     
  • Modify the hstate_next_node* functions to allow them to be called to
    obtain the "start_nid". Then, whereas prior to this patch we
    unconditionally called hstate_next_node_to_{alloc|free}(), whether or not
    we successfully allocated/freed a huge page on the node, now we only call
    these functions on failure to alloc/free to advance to next allowed node.

    Factor out the next_node_allowed() function to handle wrap at end of
    node_online_map. In this version, the allowed nodes include all of the
    online nodes.
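
    The wrap-around walk and the "advance only on failure" loop described
    above can be sketched in user-space C. This is a simplified model, not
    the kernel functions: the 64-bit mask, struct pool, the try_alloc
    callback and the two example callbacks are assumptions made for
    illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 64

/* Next set bit after nid in allowed, wrapping around to the lowest bits. */
static int next_node_allowed(int nid, uint64_t allowed)
{
	assert(allowed != 0 && nid >= 0 && nid < MAX_NODES);
	for (int step = 1; step <= MAX_NODES; step++) {
		int candidate = (nid + step) % MAX_NODES;

		if (allowed & ((uint64_t)1 << candidate))
			return candidate;
	}
	return nid; /* not reached while allowed != 0 */
}

struct pool {
	uint64_t allowed; /* nodes that may be used */
	int next_nid;     /* saved cursor; must be a node in allowed */
};

/* Return the node to try now and move the saved cursor past it. */
static int next_node_to_alloc(struct pool *p)
{
	int nid = p->next_nid;

	p->next_nid = next_node_allowed(nid, p->allowed);
	return nid;
}

/*
 * Try each allowed node once, starting at the saved cursor. try_alloc is
 * a hypothetical callback that reports whether a page could be allocated
 * on nid. Returns the node used, or -1 if every node failed.
 */
static int alloc_one(struct pool *p, bool (*try_alloc)(int nid))
{
	int start_nid = next_node_to_alloc(p);
	int nid = start_nid;

	do {
		if (try_alloc(nid))
			return nid;
		nid = next_node_to_alloc(p); /* advance only after a failure */
	} while (nid != start_nid);
	return -1;
}

/* Example callbacks: node 1 has no memory left; nothing can be allocated. */
static bool fail_on_node1(int nid) { return nid != 1; }
static bool always_fail(int nid) { (void)nid; return false; }
```

    Starting from node 0 with nodes 0-3 allowed, repeated calls with
    fail_on_node1 use nodes 0, 2 and 3 in turn: node 1 is skipped only when
    the attempt on it fails, and the cursor keeps the round-robin order for
    the next call.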

    Signed-off-by: Lee Schermerhorn
    Reviewed-by: Mel Gorman
    Acked-by: David Rientjes
    Reviewed-by: Andi Kleen
    Cc: KAMEZAWA Hiroyuki
    Cc: Randy Dunlap
    Cc: Nishanth Aravamudan
    Cc: Andi Kleen
    Cc: Adam Litke
    Cc: Andy Whitcroft
    Cc: Eric Whitney
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     
  • This is a series of patches to provide control over the location of the
    allocation and freeing of persistent huge pages on a NUMA platform.
    Please consider for merging into mmotm.

    This series uses two mechanisms to constrain the nodes from which
    persistent huge pages are allocated: 1) the task NUMA mempolicy of the
    task modifying a new sysctl "nr_hugepages_mempolicy", based on a
    suggestion by Mel Gorman; and 2) a subset of the hugepages hstate sysfs
    attributes have been added [in V4] to each node system device under:

    /sys/devices/system/node/node[0-9]*/hugepages

    The per node attributes allow direct assignment of a huge page count on a
    specific node, regardless of the task's mempolicy or cpuset constraints.

    This patch:

    NODEMASK_ALLOC(x, m) assumes x is a type of struct, which is unnecessary.
    It's perfectly reasonable to use this macro to allocate a nodemask_t,
    which is anonymous, either dynamically or on the stack depending on
    NODES_SHIFT.
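
    The idea of a macro that takes a complete type name and picks stack or
    heap storage at compile time can be sketched in user-space C. This is a
    sketch only: MASK_ALLOC, MASK_FREE, SKETCH_NODES_SHIFT, the threshold of
    8 and sketch_nodemask_t are illustrative stand-ins, and malloc() takes
    the place of the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_NODES_SHIFT 10 /* illustrative: 1024 possible nodes */
#define SKETCH_MAX_NODES   (1 << SKETCH_NODES_SHIFT)
#define BITS_PER_WORD      (8 * sizeof(unsigned long))

/* An anonymous struct behind a typedef, like the type named in the text. */
typedef struct {
	unsigned long bits[SKETCH_MAX_NODES / BITS_PER_WORD];
} sketch_nodemask_t;

/*
 * The first argument is a complete type name, so typedefs work as well as
 * "struct foo". Large masks go on the heap, small ones on the stack.
 */
#if SKETCH_NODES_SHIFT > 8
#define MASK_ALLOC(type, name) type *name = malloc(sizeof(*name))
#define MASK_FREE(name)        free(name)
#else
#define MASK_ALLOC(type, name) type _##name, *name = &_##name
#define MASK_FREE(name)        do { } while (0)
#endif

/* Allocate a mask, set one bit, read it back, and release the mask. */
static int set_and_test(int node)
{
	int bit;
	MASK_ALLOC(sketch_nodemask_t, mask);

	assert(node >= 0 && node < SKETCH_MAX_NODES);
	if (!mask)
		return -1; /* heap allocation failed */
	memset(mask, 0, sizeof(*mask));
	mask->bits[node / BITS_PER_WORD] |= 1UL << (node % BITS_PER_WORD);
	bit = (int)((mask->bits[node / BITS_PER_WORD] >>
		     (node % BITS_PER_WORD)) & 1UL);
	MASK_FREE(mask);
	return bit;
}
```

    A macro that prepended "struct" to its first argument could not be used
    with sketch_nodemask_t here, which is the limitation the patch removes.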

    Signed-off-by: David Rientjes
    Signed-off-by: Lee Schermerhorn
    Acked-by: KAMEZAWA Hiroyuki
    Cc: Mel Gorman
    Cc: Randy Dunlap
    Cc: Nishanth Aravamudan
    Cc: Andi Kleen
    Cc: David Rientjes
    Cc: Adam Litke
    Cc: Andy Whitcroft
    Cc: Eric Whitney
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • Christoph pointed out that inc_zone_page_state(NR_ISOLATED) should be
    placed right after isolate_page().

    This patch does it.

    Reviewed-by: Christoph Lameter
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • Signed-off-by: Wu Fengguang
    Cc: Andi Kleen
    Cc: Avi Kivity
    Cc: Greg Kroah-Hartman
    Cc: Johannes Berg
    Cc: Marcelo Tosatti
    Cc: Mark Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • Also rename "len" to "sz". No behavior change.

    Signed-off-by: Wu Fengguang
    Cc: Andi Kleen
    Cc: Avi Kivity
    Cc: Greg Kroah-Hartman
    Cc: Johannes Berg
    Cc: Marcelo Tosatti
    Cc: Mark Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • Also convert more size_inside_page() users.

    Signed-off-by: Wu Fengguang
    Cc: Andi Kleen
    Cc: Avi Kivity
    Cc: Greg Kroah-Hartman
    Cc: Johannes Berg
    Cc: Marcelo Tosatti
    Cc: Mark Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • No behaviour change.

    [akpm@linux-foundation.org: cleanuplets]
    [akpm@linux-foundation.org: remove unused `ret']
    Signed-off-by: Wu Fengguang
    Acked-by: Andi Kleen
    Cc: Marcelo Tosatti
    Cc: Greg Kroah-Hartman
    Cc: Mark Brown
    Cc: Johannes Berg
    Cc: Avi Kivity
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • Introduce size_inside_page() to replace duplicate /dev/mem code.

    Also apply it to /dev/kmem, whose alignment logic was buggy.
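
    A helper of this kind typically answers one question: how many of the
    requested bytes fit before the next page boundary? The following is a
    minimal user-space sketch of that calculation, not the kernel function;
    bytes_inside_page() and the 4096-byte SKETCH_PAGE_SIZE are assumptions.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size, must be a power of two */

/*
 * Number of bytes that can be processed starting at start without
 * crossing a page boundary, capped at size.
 */
static unsigned long bytes_inside_page(unsigned long start, unsigned long size)
{
	unsigned long to_page_end;

	assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0);
	to_page_end = SKETCH_PAGE_SIZE - (start & (SKETCH_PAGE_SIZE - 1));
	return size < to_page_end ? size : to_page_end;
}
```

    A read or write loop can call such a helper once per iteration to get
    the chunk size, so the alignment logic lives in one place instead of
    being repeated in every caller.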

    Signed-off-by: Wu Fengguang
    Acked-by: Andi Kleen
    Cc: Marcelo Tosatti
    Cc: Greg Kroah-Hartman
    Cc: Mark Brown
    Cc: Johannes Berg
    Cc: Avi Kivity
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • The len test in write_kmem() is always true, so can be reduced.

    Signed-off-by: Wu Fengguang
    Acked-by: Andi Kleen
    Cc: Marcelo Tosatti
    Cc: Greg Kroah-Hartman
    Cc: Mark Brown
    Cc: Johannes Berg
    Cc: Avi Kivity
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • On ia64, the following test program exits abnormally because the glibc
    thread library calls abort().

    ========================================================
    (gdb) bt
    #0 0xa000000000010620 in __kernel_syscall_via_break ()
    #1 0x20000000003208e0 in raise () from /lib/libc.so.6.1
    #2 0x2000000000324090 in abort () from /lib/libc.so.6.1
    #3 0x200000000027c3e0 in __deallocate_stack () from /lib/libpthread.so.0
    #4 0x200000000027f7c0 in start_thread () from /lib/libpthread.so.0
    #5 0x200000000047ef60 in __clone2 () from /lib/libc.so.6.1
    ========================================================

    The cause is that glibc calls munmap() at thread exit to free the stack
    and assumes that the call never fails. However, munmap() often has to
    split a vma, and when the process already has many mappings the split
    fails with -ENOMEM.

    That makes no sense, because unmapping a stack never increases the final
    map count. The limit is exceeded only temporarily, and a temporary
    internal excess should not produce ENOMEM.

    This patch stops returning ENOMEM in that case.
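
    One way to express "only a lasting increase should count against the
    limit" is to look at the net change in the number of mappings rather
    than at the transient state during a split. The message does not show
    the actual change, so the following is a model under stated assumptions:
    struct range, net_map_change() and check_unmap() are hypothetical, and
    the unmapped range is assumed to lie inside a single mapping.

```c
#include <assert.h>
#include <errno.h>

struct range {
	unsigned long start, end; /* half-open interval [start, end) */
};

/* Net change in the mapping count when u, inside vma, is unmapped. */
static int net_map_change(struct range vma, struct range u)
{
	assert(vma.start <= u.start && u.start < u.end && u.end <= vma.end);
	if (u.start == vma.start && u.end == vma.end)
		return -1; /* the whole mapping disappears */
	if (u.start > vma.start && u.end < vma.end)
		return 1;  /* a hole in the middle leaves two pieces */
	return 0;          /* trimmed at one end: same count afterwards */
}

/* Refuse only when the unmap would leave the count above the limit. */
static int check_unmap(int map_count, int max_map_count,
		       struct range vma, struct range u)
{
	if (net_map_change(vma, u) > 0 && map_count >= max_map_count)
		return -ENOMEM;
	return 0;
}
```

    In this model, trimming one end of a mapping while the process sits
    exactly at the limit succeeds, even though the split briefly creates one
    extra mapping; punching a hole in the middle is still refused.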

    test_max_mapcount.c
    ==================================================================
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <pthread.h>

    #define THREAD_NUM 30000
    #define MAL_SIZE (8*1024*1024)

    void *wait_thread(void *args)
    {
        void *addr;

        addr = malloc(MAL_SIZE);
        sleep(10);

        return NULL;
    }

    void *wait_thread2(void *args)
    {
        sleep(60);

        return NULL;
    }

    int main(int argc, char *argv[])
    {
        int i;
        pthread_t thread[THREAD_NUM], th;
        int ret, count = 0;
        pthread_attr_t attr;

        ret = pthread_attr_init(&attr);
        if(ret) {
            perror("pthread_attr_init");
        }

        ret = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
        if(ret) {
            perror("pthread_attr_setdetachstate");
        }

        for (i = 0; i < THREAD_NUM; i++) {
            ret = pthread_create(&th, &attr, wait_thread, NULL);
            if(ret) {
                fprintf(stderr, "[%d] ", count);
                perror("pthread_create");
            } else {
                printf("[%d] create OK.\n", count);
            }
            count++;

            ret = pthread_create(&thread[i], &attr, wait_thread2, NULL);
            if(ret) {
                fprintf(stderr, "[%d] ", count);
                perror("pthread_create");
            } else {
                printf("[%d] create OK.\n", count);
            }
            count++;
        }

        sleep(3600);
        return 0;
    }
    ==================================================================

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Hugh Dickins
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • On a system with a large amount of memory (256GB), invoking page-types can
    take quite a long time, which is unreasonable considering the user only
    wants a description of the flags:

    # time ./page-types -d 0x10
    0x0000000000000010 ____D_____________________________ dirty

    real 0m34.285s
    user 0m1.966s
    sys 0m32.313s

    This is because we still walk the entire address range.

    Exiting early seems like a reasonable solution:

    # time ./page-types -d 0x10
    0x0000000000000010 ____D_____________________________ dirty

    real 0m0.007s
    user 0m0.001s
    sys 0m0.005s

    Signed-off-by: Alex Chiang
    Cc: Andi Kleen
    Cc: Haicheng Li
    Acked-by: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Chiang
     
  • Align the output when page-types -h is invoked.

    Signed-off-by: Alex Chiang
    Acked-by: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Chiang
     
  • Teach page-types to describe page flags directly from the command line.

    Why is this useful? For instance, if you're using memory hotplug and see
    this in /var/log/messages:

    kernel: removing from LRU failed 3836dd0/1/1e00000000000010

    It would be nice to decode those page flags without staring at the source.

    Example usage and output:

    # Documentation/vm/page-types -d 0x10
    0x0000000000000010 ____D_____________________________ dirty

    # Documentation/vm/page-types -d anon
    0x0000000000001000 ____________a_____________________ anonymous

    # Documentation/vm/page-types -d anon,0x10
    0x0000000000001010 ____D_______a_____________________ dirty,anonymous
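
    The decoding shown in these examples can be sketched in a few lines of
    C. This is an illustration, not the page-types source: the table holds
    only the two flags visible above, and both the 34-character width and
    the rule that a symbol's column equals its bit number are assumptions
    read off those examples. describe_flags() and struct flag_name are
    hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FLAG_CHARS 34 /* assumed width of the symbol string */

struct flag_name {
	int bit;          /* bit position in the flags word */
	char symbol;      /* one-letter symbol in the symbol string */
	const char *name; /* long name in the comma-separated list */
};

/* Only the two flags that appear in the examples; the tool knows more. */
static const struct flag_name known_flags[] = {
	{ 4,  'D', "dirty" },
	{ 12, 'a', "anonymous" },
};

/*
 * Write the symbol string (FLAG_CHARS + 1 bytes) and the comma-separated
 * name list (at most names_len bytes, including the terminator).
 */
static void describe_flags(uint64_t flags, char *symbols, char *names,
			   size_t names_len)
{
	size_t i;

	assert(names_len > 0);
	memset(symbols, '_', FLAG_CHARS);
	symbols[FLAG_CHARS] = '\0';
	names[0] = '\0';

	for (i = 0; i < sizeof(known_flags) / sizeof(known_flags[0]); i++) {
		const struct flag_name *f = &known_flags[i];

		if (!(flags & ((uint64_t)1 << f->bit)))
			continue;
		symbols[f->bit] = f->symbol;
		if (names[0] != '\0')
			strncat(names, ",", names_len - strlen(names) - 1);
		strncat(names, f->name, names_len - strlen(names) - 1);
	}
}
```

    No page needs to be scanned to produce this output, which is why
    describing flags from the command line can be close to instantaneous.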

    [achiang@hp.com: documentation]
    Signed-off-by: Alex Chiang
    Signed-off-by: Wu Fengguang
    Cc: Andi Kleen
    Cc: Haicheng Li
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alex Chiang
     
  • If not signed, testing of the read() return value in this function
    will not work.

    Signed-off-by: Roel Kluin
    Cc: Wu Fengguang
    Cc: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roel Kluin
     
  • Signed-off-by: Tommi Rantala
    Cc: Randy Dunlap
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Tommi Rantala
     
  • The oom killer header, including information such as the allocation order
    and gfp mask, current's cpuset and memory controller, call trace, and VM
    state information is currently only shown when the oom killer has selected
    a task to kill.

    This information is omitted, however, when the oom killer panics either
    because of panic_on_oom sysctl settings or when no killable task was
    found. It is still relevant to know crucial pieces of information such as
    the allocation order and VM state when diagnosing such issues, especially
    at boot.

    This patch displays the oom killer header whenever it panics so that bug
    reports can include pertinent information to debug the issue, if possible.

    Signed-off-by: David Rientjes
    Reviewed-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     
  • Sam was fine with handing over kbuild maintainership to me. The git
    trees are already in linux-next, a merge request will follow shortly.

    Acked-by: Sam Ravnborg
    Signed-off-by: Michal Marek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Marek
     
  • A specially-crafted Hierarchical File System (HFS) filesystem could cause
    a buffer overflow to occur in a process's kernel stack during a memcpy()
    call within the hfs_bnode_read() function (at fs/hfs/bnode.c:24). The
    attacker can provide the source buffer and length, and the destination
    buffer is a local variable of a fixed length. This local variable (passed
    as "&entry" from fs/hfs/dir.c:112 and allocated on line 60) is stored in
    the stack frame of hfs_bnode_read()'s caller, which is hfs_readdir().
    Because the hfs_readdir() function executes upon any attempt to read a
    directory on the filesystem, it gets called whenever a user attempts to
    inspect any filesystem contents.
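
    The message describes the overflow but not the remedy. A common defence
    against this class of bug is to reject any on-disk length larger than
    the fixed-size destination before copying; whether the patch does
    exactly this is an assumption. In the sketch below, struct dir_entry,
    its 64-byte size and read_entry() are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical fixed-size record living in the caller's stack frame. */
struct dir_entry {
	unsigned char raw[64];
};

/*
 * Copy len bytes of untrusted on-disk data into dst, refusing lengths
 * that would overflow the destination.
 */
static int read_entry(struct dir_entry *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	if (len > sizeof(*dst))
		return -EIO; /* corrupt or hostile length field */
	memcpy(dst, src, len);
	return 0;
}
```

    The check has to use the size of the destination, not anything read from
    the filesystem image, because the attacker controls both the source
    buffer and its claimed length.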

    [amwang@redhat.com: modify this patch and fix coding style problems]
    Signed-off-by: WANG Cong
    Cc: Eugene Teo
    Cc: Roman Zippel
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Alexey Dobriyan
    Cc: Dave Anderson
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Amerigo Wang
     
  • commit d8e180dcd5bbbab9cd3ff2e779efcf70692ef541 "bsdacct: switch
    credentials for writing to the accounting file" introduced credential
    switching during final acct data collecting. However, the uid/gid pair
    continued to be collected from current, which by then carried the
    credentials of whoever created the acct file, not of the exiting task.

    Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14676

    Signed-off-by: Alexey Dobriyan
    Reported-by: Juho K. Juopperi
    Acked-by: Serge Hallyn
    Acked-by: David Howells
    Reviewed-by: Michal Schmidt
    Cc: James Morris
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

15 Dec, 2009

14 commits

  • * 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
    i2c-core: i2c bus should support PM entries in struct dev_pm_ops
    i2c: Get rid of I2C_CLIENT_MODULE_PARM
    i2c: Drop I2C_CLIENT_INSMOD_2 to 8
    i2c: Drop I2C_CLIENT_INSMOD_1
    i2c: Get rid of struct i2c_client_address_data
    i2c: Drop the kind parameter from detect callbacks

    Linus Torvalds
     
  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6:
    udf: Avoid IO in udf_clear_inode
    udf: Try harder when looking for VAT inode
    udf: Fix compilation with UDFFS_DEBUG enabled

    Linus Torvalds
     
  • It is not very good to do IO in udf_clear_inode. First, VFS does not really
    expect inode to become dirty there and thus we have to write it ourselves,
    second, memory reclaim gets blocked waiting for IO when it does not really
    expect it, third, the IO pattern (e.g. on umount) resulting from writes in
    udf_clear_inode is bad and it slows down writing a lot.

    The reason why UDF needed to do IO in udf_clear_inode is that UDF standard
    mandates extent length to exactly match inode size. But when we allocate
    extents to a file or directory, we don't really know what exactly the final
    file size will be and thus temporarily set it to block boundary and later
    truncate it to exact length in udf_clear_inode. Now, this is changed to
    truncate to final file size in udf_release_file for regular files. For
    directories and symlinks, we do the truncation at the moment when we learn
    what the final file size will be.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • Some disks do not contain VAT inode in the last recorded block as required
    by the standard but a few blocks earlier (or the number of recorded blocks
    is wrong). So look for the VAT inode a bit before the end of the media.

    Signed-off-by: Jan Kara

    Jan Kara
     
  • Signed-off-by: Jan Kara

    Jan Kara
     
  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, mce: Clean up thermal init by introducing intel_thermal_supported()
    x86, mce: Thermal monitoring depends on APIC being enabled
    x86: Gart: fix breakage due to IOMMU initialization cleanup
    x86: Move swiotlb initialization before dma32_free_bootmem
    x86: Fix build warning in arch/x86/mm/mmio-mod.c
    x86: Remove usedac in feature-removal-schedule.txt
    x86: Fix duplicated UV BAU interrupt vector
    nvram: Fix write beyond end condition; prove to gcc copy is safe
    mm: Adjust do_pages_stat() so gcc can see copy_from_user() is safe
    x86: Limit the number of processor bootup messages
    x86: Remove enabling x2apic message for every CPU
    doc: Add documentation for bootloader_{type,version}
    x86, msr: Add support for non-contiguous cpumasks
    x86: Use find_e820() instead of hard coded trampoline address
    x86, AMD: Fix stale cpuid4_info shared_map data in shared_cpu_map cpumasks

    Trivial percpu-naming-introduced conflicts in arch/x86/kernel/cpu/intel_cacheinfo.c

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6:
    pcmcia: CodingStyle fixes
    pcmcia: remove unused IRQ_FIRST_SHARED

    Linus Torvalds
     
  • Struct dev_pm_ops is not configured in the current i2c bus type, so i2c
    drivers that depend only on the suspend/resume entries in struct
    dev_pm_ops are not informed of PM suspend and resume events by the i2c
    framework.

    Signed-off-by: Sonic Zhang
    Signed-off-by: Jean Delvare

    sonic zhang
     
  • There is no user left of I2C_CLIENT_MODULE_PARM, so we can finally
    get rid of this ugly macro.

    Signed-off-by: Jean Delvare
    Tested-by: Wolfram Sang

    Jean Delvare
     
  • These macros simply declare an enum, so drivers might as well declare
    it themselves. This puts an end to the arbitrary limit of 8 chip types
    per i2c driver.

    Signed-off-by: Jean Delvare
    Tested-by: Wolfram Sang

    Jean Delvare
     
  • This macro simply declares an enum, so drivers might as well declare
    it themselves.

    Signed-off-by: Jean Delvare
    Tested-by: Wolfram Sang

    Jean Delvare
     
  • Struct i2c_client_address_data only contains one field at this point,
    which makes its usefulness questionable. Get rid of it and pass simple
    address lists around instead.

    Signed-off-by: Jean Delvare
    Tested-by: Wolfram Sang

    Jean Delvare
     
  • The "kind" parameter always has value -1, and nobody is using it any
    longer, so we can remove it.

    Signed-off-by: Jean Delvare
    Tested-by: Wolfram Sang

    Jean Delvare
     
  • * 'next-spi' of git://git.secretlab.ca/git/linux-2.6: (23 commits)
    spi: fix probe/remove section markings
    Add OMAP spi100k driver
    spi-imx: don't access struct device directly but use dev_get_platdata
    spi-imx: Add mx25 support
    spi-imx: use positive logic to distinguish cpu variants
    spi-imx: correct check for platform_get_irq failing
    ARM: NUC900: Add spi driver support for nuc900
    spi: SuperH MSIOF SPI Master driver V2
    spi: fix spidev compilation failure when VERBOSE is defined
    spi/au1550_spi: fix setupxfer not to override cfg with zeros
    spi/mpc8xxx: don't use __exit_p to wrap plat_mpc8xxx_spi_remove
    spi/i.MX: fix broken error handling for gpio_request
    spi/i.mx: drain MXC SPI transfer buffer when probing device
    MAINTAINERS: add SPI co-maintainer.
    spi/xilinx_spi: fix incorrect casting
    spi/mpc52xx-spi: minor cleanups
    xilinx_spi: add a platform driver using the xilinx_spi common module.
    xilinx_spi: add support for the DS570 IP.
    xilinx_spi: Switch to iomem functions and support little endian.
    xilinx_spi: Split into of driver and generic part.
    ...

    Linus Torvalds