03 Sep, 2008

1 commit

  • During the use of a dev_cgroup, we should guarantee the corresponding
    cgroup won't be deleted (i.e. via rmdir). This can be done through
    css_get(&dev_cgroup->css), but here we can just get and use the dev_cgroup
    under rcu_read_lock.

    Also remove the NULL check on dev_cgroup; it can't be NULL since a task
    always belongs to a cgroup.

    Signed-off-by: Li Zefan
    Acked-by: Serge Hallyn
    Cc: Paul Menage
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     

26 Jul, 2008

3 commits

  • - clean up set_majmin()
    - use simple_strtoul() to parse major/minor

    [akpm@linux-foundation.org: fix simple_strtoul() usage]
    [kosaki.motohiro@jp.fujitsu.com: fix warnings]
    Signed-off-by: Li Zefan
    Acked-by: Serge Hallyn
    Cc: Serge Hallyn
    Cc: Paul Menage
    Cc: Pavel Emelyanov
    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
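
A minimal userspace sketch of parsing a "major:minor" pair, for illustration only. The commit uses the kernel's simple_strtoul(); this sketch substitutes the C library's strtoul(). The names parse_majmin, parse_dev and MAJMIN_ALL are hypothetical and are not taken from the kernel source.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical sentinel meaning "any major/minor" (the '*' wildcard). */
#define MAJMIN_ALL UINT_MAX

/*
 * Parse one major or minor field starting at *s.
 * Accepts '*' or an unsigned decimal number and advances *s past the field.
 * Returns 0 on success, -1 on malformed or out-of-range input.
 */
static int parse_majmin(const char **s, unsigned int *out)
{
    char *end;
    unsigned long val;

    assert(s != NULL && *s != NULL && out != NULL);

    if (**s == '*') {
        *out = MAJMIN_ALL;
        (*s)++;
        return 0;
    }
    /* strtoul() would also accept a sign or leading blanks; reject them. */
    if (**s < '0' || **s > '9')
        return -1;

    errno = 0;
    val = strtoul(*s, &end, 10);
    /* UINT_MAX itself is reserved for the wildcard sentinel. */
    if (errno == ERANGE || val >= UINT_MAX)
        return -1;

    *out = (unsigned int)val;
    *s = end;
    return 0;
}

/* Parse a complete "major:minor" string into two unsigned values. */
static int parse_dev(const char *s, unsigned int *major, unsigned int *minor)
{
    if (parse_majmin(&s, major) != 0 || *s != ':')
        return -1;
    s++;
    if (parse_majmin(&s, minor) != 0 || *s != '\0')
        return -1;
    return 0;
}
```

With these assumptions, parse_dev("1:3", &major, &minor) yields major 1 and minor 3, and parse_dev("*:*", ...) yields the wildcard sentinel for both fields.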
     
  • Currently this list is protected with a simple spinlock, even when it is
    only being read. This is OK, but can be better.

    Actually I want it to be better very much, since after replacing the
    OpenVZ device permissions engine with the cgroup-based one I noticed that
    we set 12 default device permissions for each newly created container (for
    /dev/null, full, terminals, etc. devices), and people sometimes have up to
    20 perms more, so traversing the ~30-40 elements list under a spinlock
    doesn't seem very good.

    Here's the RCU protection for white-list - dev_whitelist_item-s are added
    and removed under the devcg->lock, but are looked up in permissions
    checking under the rcu_read_lock.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Cc: Balbir Singh
    Cc: Paul Menage
    Cc: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • This patch converts devcgroup_access_write() from a raw file handler
    into a handler for the cgroup write_string() method. This allows some
    boilerplate copying/locking/checking to be removed and simplifies the
    cleanup path, since these functions are performed by the cgroups
    framework before calling the handler.

    Signed-off-by: Paul Menage
    Cc: Paul Jackson
    Cc: Pavel Emelyanov
    Cc: Balbir Singh
    Acked-by: Serge Hallyn
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Menage
     

14 Jul, 2008

2 commits

  • # cat devices.list
    c 1:3 r
    # echo 'c 1:3 w' > sub/devices.allow
    # cat sub/devices.list
    c 1:3 w

    As illustrated, the parent group has no write permission to /dev/null, so
    its child should not be allowed to add this write permission.

    Signed-off-by: Li Zefan
    Acked-by: Serge Hallyn
    Cc: Serge Hallyn
    Cc: Paul Menage
    Cc: Pavel Emelyanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
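
The rule described above (a child may only add a permission that its parent already holds) can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the kernel's code: struct wl_item, parent_allows, the ACC_* bits and DEV_ANY are hypothetical names, and the parent's whitelist is modelled as a plain array.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical access bits and wildcard value, for illustration only. */
#define ACC_READ  1u
#define ACC_WRITE 2u
#define ACC_MKNOD 4u
#define DEV_ANY   UINT_MAX   /* stands in for '*' */

struct wl_item {
    char type;            /* 'a' (all), 'b' (block) or 'c' (char) */
    unsigned int major;
    unsigned int minor;
    unsigned int access;  /* bitmask of ACC_* */
};

/* Return 1 if some entry of the parent's list covers req, otherwise 0. */
static int parent_allows(const struct wl_item *parent, size_t n,
                         const struct wl_item *req)
{
    size_t i;

    assert(req != NULL && (parent != NULL || n == 0));

    for (i = 0; i < n; i++) {
        const struct wl_item *p = &parent[i];

        /* An 'a' entry matches every device; others must match exactly
         * or through a wildcard major/minor. */
        if (p->type != 'a') {
            if (p->type != req->type)
                continue;
            if (p->major != DEV_ANY && p->major != req->major)
                continue;
            if (p->minor != DEV_ANY && p->minor != req->minor)
                continue;
        }
        /* Every requested access bit must be present in the parent entry. */
        if (req->access & ~p->access)
            continue;
        return 1;
    }
    return 0;
}
```

In the scenario shown in the commit message, the parent holds only 'c 1:3 r', so a request for 'c 1:3 w' finds no covering entry and would be refused.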
     
  • # echo "b $((0x7fffffff)):$((0x80000000)) rwm" > devices.allow
    # cat devices.list
    b 214748364:-21474836 rwm

    Though a major/minor number of 0x80000000 is meaningless, we
    should not cast it to a negative value.

    Signed-off-by: Li Zefan
    Acked-by: Serge Hallyn
    Cc: Serge Hallyn
    Cc: Paul Menage
    Cc: Pavel Emelyanov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
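
A small userspace sketch of the idea: keep the number unsigned from storage through formatting, so a value such as 0x80000000 never prints with a minus sign. The helper name majmin_to_str is hypothetical and snprintf() stands in for whatever formatting the kernel code uses.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a major or minor number as unsigned decimal into buf.
 * Returns 0 on success, -1 if buf is too small to hold the whole number.
 */
static int majmin_to_str(char *buf, size_t len, unsigned int v)
{
    int n;

    assert(buf != NULL && len > 0);

    /* "%u" prints the value as unsigned; "%d" on the same bits would
     * typically show a negative number once the top bit is set. */
    n = snprintf(buf, len, "%u", v);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The return value also reports when the buffer is too short, so a caller can detect a number that would otherwise be silently cut off.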
     

05 Jul, 2008

1 commit

  • # cat /devcg/devices.list
    a *:* rwm
    # echo a > devices.allow
    # cat /devcg/devices.list
    a *:* rwm
    a 0:0 rwm

    This is odd and maybe confusing. With this patch, writing 'a' to
    devices.allow will add 'a *:* rwm' to the whitelist.

    Also a few fixes and updates to the document.

    Signed-off-by: Li Zefan
    Cc: Pavel Emelyanov
    Cc: Serge E. Hallyn
    Cc: Paul Menage
    Cc: Balbir Singh
    Cc: James Morris
    Cc: Chris Wright
    Cc: Stephen Smalley
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     

07 Jun, 2008

3 commits

  • Suppose you added a 'c foo:bar r' permission to some cgroup and then (a
    bit later) 'c foo:bar w' for it. After this you'll see the

    c foo:bar r
    c foo:bar w

    lines in a devices.list file.

    Another example - consider you added 10 'c foo:bar r' permissions to some
    cgroup (e.g. by mistake). After this you'll see 10 c foo:bar r lines in
    a list file.

    This is weird. This situation also has one more annoying consequence.
    Having many items in a white list makes permissions checking slower, since
    it has to walk a longer list.

    The proposal is to merge permissions for items that correspond to the
    same device.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
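
The merge proposed above can be sketched as follows. This is a userspace illustration, not the kernel implementation: the whitelist is a fixed-size array rather than a linked list, and struct whitelist, whitelist_add, WL_MAX and the ACC_* bits are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical access bits and capacity, for illustration only. */
#define ACC_READ  1u
#define ACC_WRITE 2u
#define ACC_MKNOD 4u
#define WL_MAX    64

struct wl_item {
    char type;                          /* 'a', 'b' or 'c' */
    unsigned int major, minor, access;  /* access is a bitmask of ACC_* */
};

struct whitelist {
    struct wl_item items[WL_MAX];
    size_t count;
};

/*
 * Add an item to the list. If an entry for the same device (type, major,
 * minor) already exists, OR the new access bits into it instead of
 * appending a duplicate. Returns 0 on success, -1 if the list is full.
 */
static int whitelist_add(struct whitelist *wl, const struct wl_item *item)
{
    size_t i;

    assert(wl != NULL && item != NULL);

    for (i = 0; i < wl->count; i++) {
        struct wl_item *cur = &wl->items[i];

        if (cur->type == item->type &&
            cur->major == item->major &&
            cur->minor == item->minor) {
            cur->access |= item->access;
            return 0;
        }
    }
    if (wl->count == WL_MAX)
        return -1;
    wl->items[wl->count++] = *item;
    return 0;
}
```

Under this scheme, adding the same read permission ten times and then a write permission for the same device leaves a single entry with both bits set, so the list walked during permission checks stays short.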
     
  • Two functions that need to get a device_cgroup from a task
    (devcgroup_inode_permission and devcgroup_inode_mknod) do it in a strange
    way:

    They get a css_set from task, then a subsys_state from css_set, then a
    cgroup from the state and then a subsys_state again from the cgroup.
    Besides, the devices_subsys_id is read from memory, whilst there's an
    enum-ed constant for it.

    Optimize this part a bit:
    1. Get the subsys_state from the task and be done - no 2 extra
    dereferences,
    2. Use the devices_subsys_id constant, not the value from memory
    (i.e. one less dereference).

    Found while preparing 2.6.26 OpenVZ port.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Acked-by: Paul Menage
    Cc: Balbir Singh
    Cc: James Morris
    Cc: Chris Wright
    Cc: Stephen Smalley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
     
  • This just factors the container_of out of cgroup_to_devcgroup into a
    separate function.

    This new css_to_devcgroup will be used in the 2nd patch.

    Signed-off-by: Pavel Emelyanov
    Acked-by: Serge Hallyn
    Cc: Paul Menage
    Cc: Balbir Singh
    Cc: James Morris
    Cc: Chris Wright
    Cc: Stephen Smalley
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Pavel Emelyanov
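
For readers unfamiliar with the container_of idiom, here is a userspace sketch of what a helper like css_to_devcgroup does: recover the enclosing structure from a pointer to one of its members. The struct layouts below are simplified stand-ins (assumptions), not the kernel's definitions, and the macro is a plain offsetof-based version rather than the kernel's type-checked one.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified container_of: step back from a member to its container. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for the generic per-subsystem state embedded in each cgroup. */
struct css_stub {
    int refcnt;
};

/* Stand-in for the device cgroup, which embeds the generic state. */
struct dev_cgroup_stub {
    int nitems;             /* placed first so css has a non-zero offset */
    struct css_stub css;
};

/* Map the embedded state back to the device cgroup that contains it. */
static struct dev_cgroup_stub *css_to_devcgroup(struct css_stub *s)
{
    assert(s != NULL);
    return container_of(s, struct dev_cgroup_stub, css);
}
```

Having this as its own function lets any caller that already holds the embedded state reach the device cgroup directly, without going through the cgroup first.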
     

29 Apr, 2008

2 commits

  • Introduce a read_seq() helper in cftype, which uses seq_file to print out
    lists. Use it in the devices cgroup. Also split devices.allow into two
    files, so now devices.deny and devices.allow are the ones to use to manipulate
    the whitelist, while devices.list outputs the cgroup's current whitelist.

    Signed-off-by: Serge E. Hallyn
    Acked-by: Paul Menage
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn
     
  • Implement a cgroup to track and enforce open and mknod restrictions on device
    files. A device cgroup associates a device access whitelist with each cgroup.
    A whitelist entry has 4 fields. 'type' is a (all), c (char), or b (block).
    'all' means it applies to all types and all major and minor numbers. Major
    and minor are either an integer or * for all. Access is a composition of r
    (read), w (write), and m (mknod).

    The root device cgroup starts with rwm to 'all'. A child devcg gets a copy of
    the parent. Admins can then remove devices from the whitelist or add new
    entries. A child cgroup can never receive a device access which is denied its
    parent. However when a device access is removed from a parent it will not
    also be removed from the child(ren).

    An entry is added using devices.allow, and removed using
    devices.deny. For instance

    echo 'c 1:3 mr' > /cgroups/1/devices.allow

    allows cgroup 1 to read and mknod the device usually known as
    /dev/null. Doing

    echo a > /cgroups/1/devices.deny

    will remove the default 'a *:* mrw' entry.

    CAP_SYS_ADMIN is needed to change permissions or move another task to a new
    cgroup. A cgroup may not be granted more permissions than the cgroup's parent
    has. Any task can move itself between cgroups. This won't be sufficient, but
    we can decide the best way to adequately restrict movement later.

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: fix may-be-used-uninitialized warning]
    Signed-off-by: Serge E. Hallyn
    Acked-by: James Morris
    Looks-good-to: Pavel Emelyanov
    Cc: Daniel Hokka Zakrisson
    Cc: Li Zefan
    Cc: Paul Menage
    Cc: Balbir Singh
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Serge E. Hallyn