04 Jan, 2012

1 commit


01 Nov, 2011

1 commit


20 Apr, 2011

2 commits


24 Mar, 2011

1 commit


26 Oct, 2010

1 commit

  • Currently, rw_verify_area() checks whether f_pos is negative and, if it
    is, returns -EINVAL.

    But some special files, such as /dev/(k)mem and /proc/<pid>/mem, have
    negative offsets, so we can't access those files (devices) via
    read/write at all.

    So introduce FMODE_UNSIGNED_OFFSET to allow negative file offsets.

    Signed-off-by: Wu Fengguang
    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Al Viro
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Al Viro

    KAMEZAWA Hiroyuki
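
The check this commit describes can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: `struct model_file`, `MODEL_FMODE_UNSIGNED_OFFSET` and `model_verify_pos()` are hypothetical names, not kernel code. Only the behaviour (a negative f_pos yields -EINVAL unless the file has opted in to unsigned offsets) comes from the commit message.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's FMODE_UNSIGNED_OFFSET bit. */
#define MODEL_FMODE_UNSIGNED_OFFSET 0x1u

/* Hypothetical, minimal model of an open file. */
struct model_file {
	unsigned int mode;	/* mode flags set at open time */
	int64_t pos;		/* current file offset (loff_t in the kernel) */
};

/*
 * Return 0 if the offset is acceptable, -EINVAL otherwise.
 * A negative offset is accepted only for files that declared their
 * offsets to be unsigned (memory-like devices, for example).
 */
int model_verify_pos(const struct model_file *f)
{
	if (f->pos < 0 && !(f->mode & MODEL_FMODE_UNSIGNED_OFFSET))
		return -EINVAL;
	return 0;
}
```

In this model a memory-like device would set the flag once at open time, and every later read or write at a "negative" offset would then pass the check.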
     

23 Oct, 2010

1 commit

  • * 'llseek' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl:
    vfs: make no_llseek the default
    vfs: don't use BKL in default_llseek
    llseek: automatically add .llseek fop
    libfs: use generic_file_llseek for simple_attr
    mac80211: disallow seeks in minstrel debug code
    lirc: make chardev nonseekable
    viotape: use noop_llseek
    raw: use explicit llseek file operations
    ibmasmfs: use generic_file_llseek
    spufs: use llseek in all file operations
    arm/omap: use generic_file_llseek in iommu_debug
    lkdtm: use generic_file_llseek in debugfs
    net/wireless: use generic_file_llseek in debugfs
    drm: use noop_llseek

    Linus Torvalds
     

15 Oct, 2010

1 commit

  • All file_operations should get a .llseek operation so we can make
    nonseekable_open the default for future file operations without a
    .llseek pointer.

    The three cases that we can automatically detect are no_llseek, seq_lseek
    and default_llseek. For cases where we can automatically prove that
    the file offset is always ignored, we use noop_llseek, which maintains
    the current behavior of not returning an error from a seek.

    New drivers should normally not use noop_llseek but instead use no_llseek
    and call nonseekable_open at open time. Existing drivers can be converted
    to do the same when the maintainer knows for certain that no user code
    relies on calling seek on the device file.

    The generated code is often incorrectly indented and right now contains
    comments that clarify for each added line why a specific variant was
    chosen. In the version that gets submitted upstream, the comments will
    be gone and I will manually fix the indentation, because there does not
    seem to be a way to do that using coccinelle.

    Some amount of new code is currently sitting in linux-next that should get
    the same modifications, which I will do at the end of the merge window.

    Many thanks to Julia Lawall for helping me learn to write a semantic
    patch that does all this.

    ===== begin semantic patch =====
    // This adds an llseek= method to all file operations,
    // as a preparation for making no_llseek the default.
    //
    // The rules are
    // - use no_llseek explicitly if we do nonseekable_open
    // - use seq_lseek for sequential files
    // - use default_llseek if we know we access f_pos
    // - use noop_llseek if we know we don't access f_pos,
    // but we still want to allow users to call lseek
    //
    @ open1 exists @
    identifier nested_open;
    @@
    nested_open(...)
    {

    }

    @ open exists@
    identifier open_f;
    identifier i, f;
    identifier open1.nested_open;
    @@
    int open_f(struct inode *i, struct file *f)
    {

    }

    @ read disable optional_qualifier exists @
    identifier read_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    expression E;
    identifier func;
    @@
    ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
    {

    }

    @ read_no_fpos disable optional_qualifier exists @
    identifier read_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    @@
    ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
    {
    ... when != off
    }

    @ write @
    identifier write_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    expression E;
    identifier func;
    @@
    ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
    {

    }

    @ write_no_fpos @
    identifier write_f;
    identifier f, p, s, off;
    type ssize_t, size_t, loff_t;
    @@
    ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
    {
    ... when != off
    }

    @ fops0 @
    identifier fops;
    @@
    struct file_operations fops = {
    ...
    };

    @ has_llseek depends on fops0 @
    identifier fops0.fops;
    identifier llseek_f;
    @@
    struct file_operations fops = {
    ...
    .llseek = llseek_f,
    ...
    };

    @ has_read depends on fops0 @
    identifier fops0.fops;
    identifier read_f;
    @@
    struct file_operations fops = {
    ...
    .read = read_f,
    ...
    };

    @ has_write depends on fops0 @
    identifier fops0.fops;
    identifier write_f;
    @@
    struct file_operations fops = {
    ...
    .write = write_f,
    ...
    };

    @ has_open depends on fops0 @
    identifier fops0.fops;
    identifier open_f;
    @@
    struct file_operations fops = {
    ...
    .open = open_f,
    ...
    };

    // use no_llseek if we call nonseekable_open
    ////////////////////////////////////////////
    @ nonseekable1 depends on !has_llseek && has_open @
    identifier fops0.fops;
    identifier nso ~= "nonseekable_open";
    @@
    struct file_operations fops = {
    ... .open = nso, ...
    +.llseek = no_llseek, /* nonseekable */
    };

    @ nonseekable2 depends on !has_llseek @
    identifier fops0.fops;
    identifier open.open_f;
    @@
    struct file_operations fops = {
    ... .open = open_f, ...
    +.llseek = no_llseek, /* open uses nonseekable */
    };

    // use seq_lseek for sequential files
    /////////////////////////////////////
    @ seq depends on !has_llseek @
    identifier fops0.fops;
    identifier sr ~= "seq_read";
    @@
    struct file_operations fops = {
    ... .read = sr, ...
    +.llseek = seq_lseek, /* we have seq_read */
    };

    // use default_llseek if there is a readdir
    ///////////////////////////////////////////
    @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier readdir_e;
    @@
    // any other fop is used that changes pos
    struct file_operations fops = {
    ... .readdir = readdir_e, ...
    +.llseek = default_llseek, /* readdir is present */
    };

    // use default_llseek if at least one of read/write touches f_pos
    /////////////////////////////////////////////////////////////////
    @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read.read_f;
    @@
    // read fops use offset
    struct file_operations fops = {
    ... .read = read_f, ...
    +.llseek = default_llseek, /* read accesses f_pos */
    };

    @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier write.write_f;
    @@
    // write fops use offset
    struct file_operations fops = {
    ... .write = write_f, ...
    + .llseek = default_llseek, /* write accesses f_pos */
    };

    // Use noop_llseek if neither read nor write accesses f_pos
    ///////////////////////////////////////////////////////////

    @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read_no_fpos.read_f;
    identifier write_no_fpos.write_f;
    @@
    // write fops use offset
    struct file_operations fops = {
    ...
    .write = write_f,
    .read = read_f,
    ...
    +.llseek = noop_llseek, /* read and write both use no f_pos */
    };

    @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier write_no_fpos.write_f;
    @@
    struct file_operations fops = {
    ... .write = write_f, ...
    +.llseek = noop_llseek, /* write uses no f_pos */
    };

    @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    identifier read_no_fpos.read_f;
    @@
    struct file_operations fops = {
    ... .read = read_f, ...
    +.llseek = noop_llseek, /* read uses no f_pos */
    };

    @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
    identifier fops0.fops;
    @@
    struct file_operations fops = {
    ...
    +.llseek = noop_llseek, /* no read or write fn */
    };
    ===== End semantic patch =====

    Signed-off-by: Arnd Bergmann
    Cc: Julia Lawall
    Cc: Christoph Hellwig

    Arnd Bergmann
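
The selection rules listed at the top of the semantic patch can be summarised as a small decision function. This is a user-space sketch, not part of the patch: `struct fops_traits`, `enum llseek_choice` and `pick_llseek()` are hypothetical names, and the strict priority order used here is a simplification of how the individual Coccinelle rules depend on each other.

```c
#include <stdbool.h>

/* Which .llseek value the rules would add (hypothetical enum). */
enum llseek_choice {
	KEEP_EXISTING,		/* fops already has .llseek, leave it alone */
	USE_NO_LLSEEK,		/* open path calls nonseekable_open */
	USE_SEQ_LSEEK,		/* .read is seq_read */
	USE_DEFAULT_LLSEEK,	/* readdir present, or read/write touch f_pos */
	USE_NOOP_LLSEEK		/* f_pos provably ignored, seeks still allowed */
};

/* Facts the semantic patch tries to establish about one file_operations. */
struct fops_traits {
	bool has_llseek;
	bool open_is_nonseekable;
	bool read_is_seq_read;
	bool has_readdir;
	bool read_uses_pos;
	bool write_uses_pos;
};

enum llseek_choice pick_llseek(const struct fops_traits *t)
{
	if (t->has_llseek)
		return KEEP_EXISTING;
	if (t->open_is_nonseekable)
		return USE_NO_LLSEEK;
	if (t->read_is_seq_read)
		return USE_SEQ_LSEEK;
	if (t->has_readdir || t->read_uses_pos || t->write_uses_pos)
		return USE_DEFAULT_LLSEEK;
	return USE_NOOP_LLSEEK;
}
```

A file_operations with no read or write method at all falls through to the last case, which matches the final rule of the patch ("no read or write fn").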
     

22 Sep, 2010

1 commit


07 Aug, 2010

1 commit

  • Make /dev/console get initialised before any initialisation routine that
    invokes modprobe because if modprobe fails, it's going to want to open
    /dev/console, presumably to write an error message to.

    The problem with that is that if the /dev/console driver is not yet
    initialised, the chardev handler will call request_module() to invoke
    modprobe, which will fail, because we never compile /dev/console as a
    module.

    This will lead to a modprobe loop, showing the following in the kernel
    log:

    request_module: runaway loop modprobe char-major-5-1
    request_module: runaway loop modprobe char-major-5-1
    request_module: runaway loop modprobe char-major-5-1
    request_module: runaway loop modprobe char-major-5-1
    request_module: runaway loop modprobe char-major-5-1

    This can happen, for example, when the built in md5 module can't find
    the built in cryptomgr module (because the latter fails to initialise).
    The md5 module comes before the call to tty_init(), presumably because
    'crypto' comes before 'drivers' alphabetically.

    Fix this by calling tty_init() from chrdev_init().

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
     

07 Apr, 2010

3 commits

  • Hide uncached_access() when pgprot_noncached is not #defined. This prevents
    the following warning:

    CC drivers/char/mem.o
    drivers/char/mem.c:229: warning: 'uncached_access' defined but not used

    Repairs d7d4d849b4e3acc405ec222884936800ffb26d48 ("drivers/char/mem.c:
    cleanups").

    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • commit dcefafb6 ("/dev/mem: dont allow seek to last page") inadvertently
    disabled rewinding on /dev/mem.

    This broke x86info for example.

    Signed-off-by: Eric Dumazet
    Acked-by: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     
  • I hit this when we had a bug in IDR for a few days. Basically sysfs would
    fail to create new inodes since it uses an IDR and therefore class_create
    would fail.

    While we are unlikely to see this fail we may as well handle it instead of
    oopsing.

    Signed-off-by: Anton Blanchard
    Reviewed-by: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Anton Blanchard
     

13 Mar, 2010

2 commits

  • - fix switch statement layout

    - fix whitespace stuff

    - fix comment layout

    - remove unneeded inlining

    - use __weak

    - remove trailing whitespace

    - move uncached_access() inside `#ifndef __HAVE_PHYS_MEM_ACCESS_PROT' - it
    is otherwise unused.

    Cc: KAMEZAWA Hiroyuki
    Cc: OGAWA Hirofumi
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • So as to return a uniform error -EOVERFLOW instead of a random one:

    # kmem-seek 0xfffffffffffffff0
    seek /dev/kmem: Device or resource busy
    # kmem-seek 0xfffffffffffffff1
    seek /dev/kmem: Block device required

    Suggested by OGAWA Hirofumi.

    Cc: OGAWA Hirofumi
    Reviewed-by: KAMEZAWA Hiroyuki
    Signed-off-by: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
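
The "random" errors in the transcript are what one would expect if the requested offset is handed back as the syscall result: read as a signed 64-bit value, 0xfffffffffffffff0 is -16 and 0xfffffffffffffff1 is -15, and on Linux errno 16 is EBUSY and 15 is ENOTBLK. The sketch below models a seek that rejects such offsets uniformly. It is a user-space illustration with hypothetical names (`model_memory_lseek()`); the 4 KiB threshold is an assumption borrowed from the "dont allow seek to last page" change mentioned elsewhere in this log.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>	/* SEEK_SET, SEEK_CUR */

/*
 * Hypothetical model of a seek on a memory device.
 * On success stores the new offset in *pos and returns 0;
 * on failure leaves *pos untouched and returns a negative errno.
 */
int model_memory_lseek(uint64_t *pos, int64_t offset, int whence)
{
	uint64_t target;

	switch (whence) {
	case SEEK_SET:
		target = (uint64_t)offset;
		break;
	case SEEK_CUR:
		target = *pos + (uint64_t)offset;	/* wraps modulo 2^64 */
		break;
	default:
		return -EINVAL;
	}

	/* Offsets in the last 4 KiB would be mistaken for -errno values. */
	if (target >= ~UINT64_C(0xFFF))
		return -EOVERFLOW;

	*pos = target;
	return 0;
}
```

Returning a status code separately from the offset avoids, in this model, the ambiguity that the commit works around in the real syscall interface.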
     

03 Feb, 2010

2 commits

  • write_kmem() used to assume that vwrite() always returns the full buffer
    length. However, vwrite() can now return 0 to indicate a memory hole,
    which creates a bug: "buf" is not advanced accordingly.

    Fix it by simply ignoring the return value, and hence the memory hole.

    Signed-off-by: Wu Fengguang
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Christoph Lameter
    Cc: Ingo Molnar
    Cc: Tejun Heo
    Cc: Nick Piggin
    Cc: KAMEZAWA Hiroyuki
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     
  • Otherwise vmalloc_to_page() will BUG().

    This also makes the kmem read/write implementation aligned with mem(4):
    "References to nonexistent locations cause errors to be returned." Here we
    return -ENXIO (inspired by Hugh) if no bytes have been transferred to/from
    user space, otherwise return partial read/write results.

    Signed-off-by: KAMEZAWA Hiroyuki
    Signed-off-by: Wu Fengguang
    Cc: Greg Kroah-Hartman
    Cc: Hugh Dickins
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
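
The return convention described here (-ENXIO when nothing was transferred, otherwise a partial result) can be sketched in user space as follows. The sparse "address space" is a hypothetical array of tiny pages in which a NULL entry stands for a nonexistent location; `model_read()` and `MODEL_PAGE` are illustrative names, not kernel symbols.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

#define MODEL_PAGE 4u	/* tiny "page" size to keep the example small */

/*
 * Copy up to count bytes starting at pos from a sparse set of pages.
 * pages[i] == NULL marks a hole. Stops at the first hole.
 */
ssize_t model_read(const char *const *pages, size_t npages,
		   size_t pos, char *buf, size_t count)
{
	size_t done = 0;

	if (count == 0)
		return 0;

	while (done < count) {
		size_t page = (pos + done) / MODEL_PAGE;
		size_t off = (pos + done) % MODEL_PAGE;
		size_t chunk = MODEL_PAGE - off;

		if (page >= npages || pages[page] == NULL)
			break;		/* nonexistent location */
		if (chunk > count - done)
			chunk = count - done;
		memcpy(buf + done, pages[page] + off, chunk);
		done += chunk;
	}

	/* An error only if nothing was transferred; otherwise a short read. */
	return done ? (ssize_t)done : -ENXIO;
}
```

With pages "abcd", a hole, and "ijkl", a read of 8 bytes from offset 2 would return the 2 bytes before the hole, while a read that starts inside the hole would return -ENXIO.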
     

16 Dec, 2009

6 commits


10 Dec, 2009

2 commits

  • While Linux provided an O_SYNC flag basically since day 1, it took until
    Linux 2.4.0-test12pre2 to actually get it implemented for filesystems,
    since that day we had generic_osync_around with only minor changes and the
    great "For now, when the user asks for O_SYNC, we'll actually give
    O_DSYNC" comment. This patch intends to actually give us real O_SYNC
    semantics in addition to the O_DSYNC semantics. After Jan's O_SYNC
    patches which are required before this patch it's actually surprisingly
    simple, we just need to figure out when to set the datasync flag to
    vfs_fsync_range and when not.

    This patch renames the existing O_SYNC flag to O_DSYNC while keeping its
    numerical value to keep binary compatibility, and adds a new real O_SYNC
    flag. To guarantee backwards compatibility it is defined as expanding to
    both the O_DSYNC and the new additional binary flag (__O_SYNC) to make
    sure we are backwards-compatible when compiled against the new headers.

    This also means that all places that don't care about the differences can
    just check O_DSYNC and get the right behaviour for O_SYNC, too - only
    places that actually care need to check __O_SYNC in addition. Drivers and
    network filesystems have been updated in a fail safe way to always do the
    full sync magic if O_DSYNC is set. The few places setting O_SYNC for
    lower layers are kept that way for now to stay failsafe.

    We enforce that O_DSYNC is set when __O_SYNC is set early in the open path
    to make sure we always get these sane options.

    Note that parisc really screwed up their headers as they already define a
    O_DSYNC that has always been a no-op. We try to repair it by using it for
    the new O_DSYNC and redefining O_SYNC to send both the traditional
    O_SYNC numerical value _and_ the O_DSYNC one.

    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Grant Grundler
    Cc: "David S. Miller"
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Cc: Al Viro
    Cc: Andreas Dilger
    Acked-by: Trond Myklebust
    Acked-by: Kyle McMartin
    Acked-by: Ulrich Drepper
    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Jan Kara

    Christoph Hellwig
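
The flag layout described in this commit can be illustrated with a few lines of C. The bit values below are made up for the example (real values differ per architecture), and the `MODEL_` macros and helper functions are hypothetical. What the sketch takes from the commit message is the relationship between the flags: O_SYNC expands to both bits, a plain O_DSYNC test catches both modes, and __O_SYNC is never allowed without O_DSYNC.

```c
#include <stdbool.h>

/* Hypothetical bit values for illustration only. */
#define MODEL_O_DSYNC	0x1000u		/* the bit formerly named O_SYNC */
#define MODEL___O_SYNC	0x100000u	/* new, additional bit */
#define MODEL_O_SYNC	(MODEL___O_SYNC | MODEL_O_DSYNC)

/* Any kind of synchronous write requested? True for O_DSYNC and O_SYNC. */
bool wants_sync(unsigned int flags)
{
	return (flags & MODEL_O_DSYNC) != 0;
}

/* Is a data-only sync sufficient, i.e. O_DSYNC without full O_SYNC? */
bool datasync_only(unsigned int flags)
{
	return wants_sync(flags) && !(flags & MODEL___O_SYNC);
}

/* Model of the open-path rule: __O_SYNC always implies O_DSYNC. */
unsigned int normalize_flags(unsigned int flags)
{
	if (flags & MODEL___O_SYNC)
		flags |= MODEL_O_DSYNC;
	return flags;
}
```

Code that does not care about the difference would only call something like `wants_sync()`; only code that must distinguish the two modes needs the extra `__O_SYNC` test.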
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (42 commits)
    tree-wide: fix misspelling of "definition" in comments
    reiserfs: fix misspelling of "journaled"
    doc: Fix a typo in slub.txt.
    inotify: remove superfluous return code check
    hdlc: spelling fix in find_pvc() comment
    doc: fix regulator docs cut-and-pasteism
    mtd: Fix comment in Kconfig
    doc: Fix IRQ chip docs
    tree-wide: fix assorted typos all over the place
    drivers/ata/libata-sff.c: comment spelling fixes
    fix typos/grammos in Documentation/edac.txt
    sysctl: add missing comments
    fs/debugfs/inode.c: fix comment typos
    sgivwfb: Make use of ARRAY_SIZE.
    sky2: fix sky2_link_down copy/paste comment error
    tree-wide: fix typos "couter" -> "counter"
    tree-wide: fix typos "offest" -> "offset"
    fix kerneldoc for set_irq_msi()
    spidev: fix double "of of" in comment
    comment typo fix: sybsystem -> subsystem
    ...

    Linus Torvalds
     

04 Dec, 2009

1 commit

  • That is "success", "unknown", "through", "performance", "[re|un]mapping",
    "access", "default", "reasonable", "[con]currently", "temperature",
    "channel", "[un]used", "application", "example", "hierarchy", "therefore",
    "[over|under]flow", "contiguous", "threshold", "enough" and others.

    Signed-off-by: André Goddard Rosa
    Signed-off-by: Jiri Kosina

    André Goddard Rosa
     

14 Oct, 2009

1 commit

  • The generic open callback for the mem class devices is "protected" by
    the bkl.

    Let's look at the data manipulated inside memory_open:

    - inode and file: safe
    - the devlist: safe because it is constant
    - the memdev classes inside this array are safe too (constant)

    After we find out which memdev file operation we need to use, we call
    its open callback. Depending on the targeted memdev, we call either
    open_port() that doesn't manipulate any racy data (just a capable()
    check), or we call nothing.

    So it's safe to remove the big kernel lock there.

    Signed-off-by: Frederic Weisbecker
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Frederic Weisbecker
     

28 Sep, 2009

1 commit


24 Sep, 2009

1 commit


20 Sep, 2009

1 commit

  • This allows subsystems to provide devtmpfs with non-default permissions
    for the device node. Instead of the default mode of 0600, null, zero,
    random, urandom, full, tty, ptmx now have a mode of 0666, which allows
    non-privileged processes to access standard device nodes in case no
    other userspace process applies the expected permissions.

    This also fixes a wrong assignment in pktcdvd and a checkpatch.pl complaint.

    Signed-off-by: Kay Sievers
    Signed-off-by: Greg Kroah-Hartman

    Kay Sievers
     

16 Sep, 2009

2 commits

  • When I build and boot -next on fedora 10, I can not login anymore.
    When I input the user name and password, the system does not output
    any message and requires user to input the user name and password
    again and again.

    I find the patch which caused this problem with "GIT BISECT" command.
    And the patch is
    commit 7c4b7daa1878972ed0137c95f23569124bd6e2b1
    "mem_class: use minor as index instead of searching the array".

    Though I don't know the real reason why user could not login, I
    confirmed the patch I made as following could resolve the problem on
    fedora 10.

    Signed-off-by: Jin Dongming
    Acked-by: Kay Sievers
    Signed-off-by: Greg Kroah-Hartman

    Jin Dongming
     
  • Declare the device list with the minor numbers as the index, which saves us from
    searching for a matching list entry. Remove old devfs permissions declaration.

    Signed-off-by: Kay Sievers
    Signed-off-by: Greg Kroah-Hartman

    Kay Sievers
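
Indexing the device list by minor number can be sketched with C designated initialisers. The structure and function names below are hypothetical and the table is reduced to a few entries. The minors shown (1 for mem, 3 for null, 5 for zero, 7 for full) are the conventional ones, and the 0666 modes follow the devtmpfs permissions commit above. The real table carries more fields than this sketch.

```c
#include <stddef.h>

/* Hypothetical, reduced device description. */
struct model_memdev {
	const char *name;
	unsigned int mode;	/* node permissions, 0 means "use the default" */
};

/* The minor number is the array index; gaps stay zero-initialised. */
static const struct model_memdev model_devlist[] = {
	[1] = { "mem",  0 },
	[3] = { "null", 0666 },
	[5] = { "zero", 0666 },
	[7] = { "full", 0666 },
};

#define MODEL_NDEVS (sizeof(model_devlist) / sizeof(model_devlist[0]))

/* Constant-time lookup; NULL for out-of-range or unused minors. */
const struct model_memdev *model_lookup(unsigned int minor)
{
	if (minor >= MODEL_NDEVS || model_devlist[minor].name == NULL)
		return NULL;
	return &model_devlist[minor];
}
```

Compared with a linear search, the lookup has to guard explicitly against both out-of-range minors and unused slots inside the table.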
     

11 Sep, 2009

1 commit


19 Jun, 2009

1 commit


10 Jun, 2009

1 commit

  • This helps with bad latencies for large reads from /dev/zero, but might
    conceivably break some application that "knows" that a read of /dev/zero
    cannot return early. So do this early in the merge window to give us
    maximal test coverage, even if the patch is totally trivial.

    Obviously, no well-behaved application should ever depend on the read
    being uninterruptible, but hey, bugs happen.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

05 Jun, 2009

1 commit

  • While running 20 parallel instances of dd as follows:

    #!/bin/bash
    for i in `seq 1 20`; do
    dd if=/dev/zero of=/export/hda3/dd_$i bs=1073741824 count=1 &
    done
    wait

    on a 16G machine, we noticed that rather than just killing the processes,
    the entire kernel went down. Stracing dd reveals that it first does an
    mmap2, which makes 1GB worth of zero page mappings. Then it performs a
    read on those pages from /dev/zero, and finally it performs a write.

    The machine died during the reads. Looking at the code, it was noticed
    that /dev/zero's read operation had been changed by
    557ed1fa2620dc119adb86b34c614e152a629a80 ("remove ZERO_PAGE") from giving
    zero page mappings to actually zeroing the page.

    The zeroing of the pages causes physical pages to be allocated to the
    process. But, when the process exhausts all the memory that it can, the
    kernel cannot kill it, as it is still in the kernel mode allocating more
    memory. Consequently, the kernel eventually crashes.

    To fix this, I propose that when a fatal signal is pending during
    /dev/zero read operation, we simply return and let the user process die.

    Signed-off-by: Salman Qazi
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    [ Modified error return and comment trivially. - Linus]
    Signed-off-by: Linus Torvalds

    Salman Qazi
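
The proposed fix amounts to polling for a pending fatal signal between chunks of work and returning early. A user-space sketch of that loop shape follows. All names here (`model_read_zero()`, `model_pending_after()`, `MODEL_CHUNK`) are hypothetical, the predicate stands in for the kernel's signal check, and -EINTR is used as the error value purely for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

#define MODEL_CHUNK 4096u

/*
 * Zero-fill buf in chunks, asking fatal_pending(ctx) before each chunk.
 * Returns the number of bytes zeroed, or -EINTR if interrupted before
 * anything was written.
 */
ssize_t model_read_zero(char *buf, size_t count,
			bool (*fatal_pending)(void *), void *ctx)
{
	size_t written = 0;

	if (count == 0)
		return 0;

	while (written < count) {
		size_t chunk = count - written;

		if (chunk > MODEL_CHUNK)
			chunk = MODEL_CHUNK;
		if (fatal_pending(ctx))
			break;		/* let the caller die promptly */
		memset(buf + written, 0, chunk);
		written += chunk;
	}

	return written ? (ssize_t)written : -EINTR;
}

/* Example predicate: reports "pending" once *ctx calls have been allowed. */
bool model_pending_after(void *ctx)
{
	int *remaining = ctx;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

Because the check runs once per chunk, the time a doomed process can spend inside the loop is bounded by the cost of one chunk rather than by the full request size.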
     

10 Apr, 2009

1 commit

  • /dev/mem mmap code was doing memtype reserve/free for a while now.
    Recently we added memtype tracking in remap_pfn_range, and /dev/mem mmap
    uses it indirectly. So, we don't need separate tracking in /dev/mem code
    any more. That means another ~100 lines of code removed :-).

    Signed-off-by: Suresh Siddha
    Signed-off-by: Venkatesh Pallipadi
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     

07 Jan, 2009

1 commit

  • Sparse outputs the following warnings:

    mm/vmalloc.c:1436:6: warning: symbol 'vread' was not declared. Should it be static?
    mm/vmalloc.c:1474:6: warning: symbol 'vwrite' was not declared. Should it be static?

    However, they are used by /dev/kmem, so they cannot be static. Fixed here.

    Signed-off-by: KOSAKI Motohiro
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     

17 Oct, 2008

1 commit


25 Jul, 2008

1 commit

  • Use generic_access_phys as the access_process_vm access function for
    /dev/mem mappings. This makes it possible to debug the X server.

    [akpm@linux-foundation.org: repair all the architectures which broke]
    Signed-off-by: Rik van Riel
    Cc: Benjamin Herrensmidt
    Cc: Dave Airlie
    Cc: Hugh Dickins
    Cc: Paul Mackerras
    Cc: Arnd Bergmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rik van Riel