29 Apr, 2008
40 commits
-
Implement a cgroup to track and enforce open and mknod restrictions on device
files. A device cgroup associates a device access whitelist with each cgroup.
A whitelist entry has 4 fields. 'type' is a (all), c (char), or b (block).
'all' means it applies to all types and all major and minor numbers. Major
and minor are either an integer or * for all. Access is a composition of r
(read), w (write), and m (mknod).The root device cgroup starts with rwm to 'all'. A child devcg gets a copy of
the parent. Admins can then remove devices from the whitelist or add new
entries. A child cgroup can never receive a device access which is denied its
parent. However when a device access is removed from a parent it will not
also be removed from the child(ren).An entry is added using devices.allow, and removed using
devices.deny. For instanceecho 'c 1:3 mr' > /cgroups/1/devices.allow
allows cgroup 1 to read and mknod the device usually known as
/dev/null. Doingecho a > /cgroups/1/devices.deny
will remove the default 'a *:* mrw' entry.
CAP_SYS_ADMIN is needed to change permissions or move another task to a new
cgroup. A cgroup may not be granted more permissions than the cgroup's parent
has. Any task can move itself between cgroups. This won't be sufficient, but
we can decide the best way to adequately restrict movement later.[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix may-be-used-uninitialized warning]
Signed-off-by: Serge E. Hallyn
Acked-by: James Morris
Looks-good-to: Pavel Emelyanov
Cc: Daniel Hokka Zakrisson
Cc: Li Zefan
Cc: Paul Menage
Cc: Balbir Singh
Cc: KAMEZAWA Hiroyuki
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Trigger callback can be used to receive a kick-up from the user space. The
string written is ignored.The cftype->private is used for multiplexing events.
Signed-off-by: Pavel Emelyanov
Acked-by: Paul Menage
Acked-by: KAMEZAWA Hiroyuki
Cc: Balbir Singh
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
There is a race between create_proc_entry() and the assignment of file ops.
proc_create() is invented to fix it.Signed-off-by: Li Zefan
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
It is called by cgroup_init() and cgroup_init_early() only, which are
annotated with __init.Signed-off-by: Li Zefan
Cc: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This removes some filesystem boilerplate from the CFS cgroup subsystem.
Signed-off-by: Paul Menage
Acked-by: Peter Zijlstra
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
These patches add cgroups read_s64 and write_s64 control file methods (the
signed equivalent of read_u64/write_u64) and use them to implement the
cpu.rt_runtime_us control file in the CFS cgroup subsystem.This patch:
These are the signed equivalents of the read_u64/write_u64 methods
Signed-off-by: Paul Menage
Acked-by: Peter Zijlstra
Cc: Ingo Molnar
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The cgroup debug subsystem isn't generally useful for users. It should
default to "n".Signed-off-by: Paul Menage
Cc: "Li Zefan"
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Pavel Emelyanov
Cc: KAMEZAWA Hiroyuki
Cc: "YAMAMOTO Takashi"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
The "releasable" control file provided by the cgroup framework exports the
state of a per-cgroup flag that's related to the notify-on-release feature.
This isn't really generally useful, unless you're trying to debug this
particular feature of cgroups.This patch moves the "releasable" file to the cgroup_debug subsystem.
Signed-off-by: Paul Menage
Cc: "Li Zefan"
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Pavel Emelyanov
Cc: KAMEZAWA Hiroyuki
Cc: "YAMAMOTO Takashi"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This function isn't needed - a NULL pointer in the cftype read function will
result in the same EINVAL response to userspace.Signed-off-by: Paul Menage
Cc: "Li Zefan"
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Pavel Emelyanov
Cc: KAMEZAWA Hiroyuki
Cc: "YAMAMOTO Takashi"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove the seq_file boilerplate used to construct the memcontrol stats map,
and instead use the new map representation for cgroup control filesSigned-off-by: Paul Menage
Cc: "Li Zefan"
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Pavel Emelyanov
Cc: KAMEZAWA Hiroyuki
Cc: "YAMAMOTO Takashi"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Adds a new type of supported control file representation, a map from strings
to u64 values.Each map entry is printed as a line in a similar format to /proc/vmstat, i.e.
"$key $value\n"Signed-off-by: Paul Menage
Cc: "Li Zefan"
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Pavel Emelyanov
Cc: KAMEZAWA Hiroyuki
Cc: "YAMAMOTO Takashi"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Many of the cpusets control files are simple integer values, which don't
require the overhead of memory allocations for reads and writes.Move the handlers for these control files into cpuset_read_u64() and
cpuset_write_u64().[akpm@linux-foundation.org: ad dmissing `break']
Signed-off-by: Paul Menage
Cc: "Li Zefan"
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Pavel Emelyanov
Cc: KAMEZAWA Hiroyuki
Cc: "YAMAMOTO Takashi"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This removes the need for people to remember to pass the -n flag to echo when
writing values to cgroup control files.Signed-off-by: Paul Menage
Cc: "Li Zefan"
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Pavel Emelyanov
Cc: KAMEZAWA Hiroyuki
Cc: "YAMAMOTO Takashi"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Update the memory controller to use read_u64 for its limit/usage/failcnt
control files, calling the new res_counter_read_u64() function.Signed-off-by: Paul Menage
Cc: "Li Zefan"
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Pavel Emelyanov
Cc: KAMEZAWA Hiroyuki
Cc: "YAMAMOTO Takashi"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Adds a function for returning the value of a resource counter member, in a
form suitable for use in a cgroup read_u64 control file method.Signed-off-by: Paul Menage
Cc: "Li Zefan"
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Pavel Emelyanov
Cc: KAMEZAWA Hiroyuki
Cc: "YAMAMOTO Takashi"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Several people have justifiably complained that the "_uint" suffix is
inappropriate for functions that handle u64 values, so this patch just renames
all these functions and their users to have the suffic _u64.[peterz@infradead.org: build fix]
Signed-off-by: Paul Menage
Cc: "Li Zefan"
Cc: Balbir Singh
Cc: Paul Jackson
Cc: Pavel Emelyanov
Cc: KAMEZAWA Hiroyuki
Cc: "YAMAMOTO Takashi"
Signed-off-by: Peter Zijlstra
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Every file should include the headers containing the externs its global
functions (in this case for ns_cgroup_clone()).Signed-off-by: Adrian Bunk
Acked-by: Serge Hallyn
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fix a code warning: symbol 'p' shadows an earlier one
This is a reincarnation of Harvey Harrison's patch:
cpuset: sparse warnings in cpuset.cIndependently, Cliff Wickman moved the affected code,
from kernel/cpuset.c to kernel/cgroup.c, in his patch:
cpusets: update_cpumask revisionSigned-off-by: Paul Jackson
Cc: Harvey Harrison
Cc: Cliff Wickman
Acked-by: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make the needlessly global cgroup_enable_task_cg_lists() static.
Signed-off-by: Adrian Bunk
Acked-by: David Rientjes
Cc: Paul Menage
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
This adds support for OLPC XO hardware. Open Firmware on XOs don't contain
the VSA, so it is necessary to emulate the PCI BARs in the kernel. This also
adds functionality for running EC commands, and a CONFIG_OLPC.A number of OLPC drivers depend upon CONFIG_OLPC.
olpc_ec_timeout is a hack to work around Embedded Controller bugs.
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: geode_has_vsa build fix]
[akpm@linux-foundation.org: olpc_register_battery_callback doesn't exist]
Signed-off-by: Andres Salomon
Acked-by: Ingo Molnar
Cc: Thomas Gleixner
Cc: Andi Kleen
Cc: Jordan Crouse
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make sure crypt_stat->flags is protected with a lock in ecryptfs_open().
Signed-off-by: Michael Halcrow
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Make eCryptfs key module subsystem respect namespaces.
Since I will be removing the netlink interface in a future patch, I just made
changes to the netlink.c code so that it will not break the build. With my
recent patches, the kernel module currently defaults to the device handle
interface rather than the netlink interface.[akpm@linux-foundation.org: export free_user_ns()]
Signed-off-by: Michael Halcrow
Acked-by: Serge Hallyn
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Update the versioning information. Make the message types generic. Add an
outgoing message queue to the daemon struct. Make the functions to parse
and write the packet lengths available to the rest of the module. Add
functions to create and destroy the daemon structs. Clean up some of the
comments and make the code a little more consistent with itself.[akpm@linux-foundation.org: printk fixes]
Signed-off-by: Michael Halcrow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
A regular device file was my real preference from the get-go, but I went with
netlink at the time because I thought it would be less complex for managing
send queues (i.e., just do a unicast and move on). It turns out that we do
not really get that much complexity reduction with netlink, and netlink is
more heavyweight than a device handle.In addition, the netlink interface to eCryptfs has been broken since 2.6.24.
I am assuming this is a bug in how eCryptfs uses netlink, since the other
in-kernel users of netlink do not seem to be having any problems. I have had
one report of a user successfully using eCryptfs with netlink on 2.6.24, but
for my own systems, when starting the userspace daemon, the initial helo
message sent to the eCryptfs kernel module results in an oops right off the
bat. I spent some time looking at it, but I have not yet found the cause.
The netlink interface breaking gave me the motivation to just finish my patch
to migrate to a regular device handle. If I cannot find out soon why the
netlink interface in eCryptfs broke, I am likely to just send a patch to
disable it in 2.6.24 and 2.6.25. I would like the device handle to be the
preferred means of communicating with the userspace daemon from 2.6.26 on
forward.This patch:
Functions to facilitate reading and writing to the eCryptfs miscellaneous
device handle. This will replace the netlink interface as the preferred
mechanism for communicating with the userspace eCryptfs daemon.Each user has his own daemon, which registers itself by opening the eCryptfs
device handle. Only one daemon per euid may be registered at any given time.
The eCryptfs module sends a message to a daemon by adding its message to the
daemon's outgoing message queue. The daemon reads the device handle to get
the oldest message off the queue.Incoming messages from the userspace daemon are immediately handled. If the
message is a response, then the corresponding process that is blocked waiting
for the response is awakened.Signed-off-by: Michael Halcrow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Callers of notify_change() need to hold i_mutex.
Signed-off-by: Miklos Szeredi
Cc: Michael Halcrow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
__FUNCTION__ is gcc-specific, use __func__
Signed-off-by: Harvey Harrison
Cc: Michael Halcrow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove the no longer used ecryptfs_header_cache_0.
Signed-off-by: Adrian Bunk
Cc: Michael Halcrow
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Introduced between 2.6.25-rc2 and -rc3
drivers/block/xen-blkfront.c:139:5: warning: symbol 'blkif_getgeo' was not declared. Should it be static?Signed-off-by: Harvey Harrison
Cc: Jeremy Fitzhardinge
Cc: Jens Axboe
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
A command that causes a line feed while a background color is active,
such asperl -e 'print "x" x 60, "\e[44m", "x" x 40, "\e[0m\n"'
and
perl -e 'print "x" x 40, "\e[44m\n", "x" x 40, "\e[0m\n"'causes the line that was started as a result of the line feed to be completely
filled with the currently active background color instead of the default
color.When scrolling, part of the current screen is memcpy'd/memmove'd to the new
region, and the new line(s) that will appear as a result are cleared using
memset. However, the lines are cleared with vc->vc_video_erase_char, causing
them to be colored with the currently active background color. This is
different from X11 terminal emulators which always paint the new lines with
the default background color (e.g. `xterm -bg black`).The clear operation (\e[1J and \e[2J) also use vc_video_erase_char, so a new
vc->vc_scrl_erase_char is introduced with contains the erase character used
for scrolling, which is built from vc->vc_def_color instead of vc->vc_color.Signed-off-by: Jan Engelhardt
Cc: "Antonino A. Daplas"
Cc: "H. Peter Anvin"
Cc: Alan Cox
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Instead of using the malloc() and free() wrappers needed by the
lib/inflate.c code for allocations, simply use kmalloc() and kfree() in the
initramfs code. This is needed for a further lib/inflate.c-related cleanup
patch that will remove the malloc() and free() functions.Take that opportunity to remove the useless kmalloc() return value
cast.Based on work done by Matt Mackall.
Signed-off-by: Thomas Petazzoni
Signed-off-by: Matt Mackall
Cc: Jan Engelhardt
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Due to the rcupreempt.h WARN_ON trigged, I got 2G syslog file. For some
serious complaining of kernel, we need repeat the warnings, so here I isolate
the ratelimit part of printk.c to a standalone file.Signed-off-by: Dave Young
Acked-by: Paul E. McKenney
Tested-by: Paul E. McKenney
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
We need to check ->s_dirt before calling write_super(). It became the cause
of an unneeded write.This bug was noticed by Sudhanshu Saxena.
Signed-off-by: OGAWA Hirofumi
Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Add missing consts to xattr function arguments.
Signed-off-by: David Howells
Cc: Andreas Gruenbacher
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Remove lives_below_in_same_fs() since is_subdir() from fs/dcache.c is
providing the same functionality.Signed-off-by: Jan Blunck
Acked-by: Miklos Szeredi
Cc: Al Viro
Cc: Christoph Hellwig
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
fs/inode.c: use hlist_for_each_entry() in find_inode() and find_inode_fast()
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Matthias Kaehlcke
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Many inodes have no pagecache, so we can avoid lots of lock-takings.
Signed-off-by: Jan Kara
Cc: Fengguang Wu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Fix longstanding lock inversion in drop_pagecache_sb by dropping inode_lock
before calling __invalidate_mapping_pages(). We just have to make sure inode
won't go away from under us by keeping reference to it and putting the
reference only after we have safely resumed the scan of the inode list. A bit
tricky but not too bad...Signed-off-by: Jan Kara
Cc: Fengguang Wu
Cc: David Chinner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
iommu_is_span_boundary in lib/iommu-helper.c was exported for PARISC IOMMUs
(commit 3715863aa142c4f4c5208f5f3e5e9bac06006d2f). SWIOTLB can use it instead
of the homegrown function.Signed-off-by: FUJITA Tomonori
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: H. Peter Anvin
Cc: Tony Luck
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
There's a pointlessly braced block of code in there. Remove the braces and
save a tabstop.Cc: Andi Kleen
Cc: FUJITA Tomonori
Cc: Jan Beulich
Cc: Tony Luck
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds -
Before requesting firmware, printk a message saying what we're requesting. This
makes it easier to see what's going on, and provides an explanation for the
huge silent delay that one would otherwise get after accidentally building
ipw2200 as a non-module.Cc: Greg KH
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds