05 Jan, 2012

4 commits

  • bitmap size sanity checks should be done *before* allocating ->s_root;
    there, their cleanup on failure would be correct. As it is, we do iput()
    on root inode, but leak the root dentry...

    Signed-off-by: Al Viro
    Acked-by: Josh Boyer
    Signed-off-by: Linus Torvalds

    Al Viro
     
  • This is the temporary simple fix for 3.2, we need more changes in this
    area.

    1. do_signal_stop() assumes that the running untraced thread in the
    stopped thread group is not possible. This was our goal but it is
    not yet achieved: a stopped-but-resumed tracee can clone the running
    thread which can initiate another group-stop.

    Remove WARN_ON_ONCE(!current->ptrace).

    2. A new thread always starts with ->jobctl = 0. If it is auto-attached
    and this group is stopped, __ptrace_unlink() sets JOBCTL_STOP_PENDING
    but the JOBCTL_STOP_SIGMASK part is zero; this triggers WARN_ON(!signr)
    in do_jobctl_trap() if another debugger attaches.

    Change __ptrace_unlink() to set the artificial SIGSTOP for report.

    Alternatively we could change ptrace_init_task() to copy signr from
    current, but this means we can copy it for no reason and hide the
    possible similar problems.

    Acked-by: Tejun Heo
    Cc: [3.1]
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
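
A rough userspace sketch of point 2 in the entry above, not the kernel code. The constant values and the three helper names are assumptions made for illustration only. It shows why a pending stop with an empty signal-number field would trip a check like WARN_ON(!signr), and how recording an artificial SIGSTOP avoids that.

```c
#include <signal.h>
#include <stdbool.h>

/* Illustrative constants modelled on the jobctl layout (assumed values). */
#define SKETCH_STOP_SIGMASK 0xffffUL      /* low bits hold the stop signal number */
#define SKETCH_STOP_PENDING (1UL << 17)   /* a group stop is pending */

/* Problem case: marks the stop pending but leaves the signal number at zero. */
static unsigned long unlink_without_signr(unsigned long jobctl)
{
    return jobctl | SKETCH_STOP_PENDING;
}

/* Fixed case: if no signal number is recorded, report an artificial SIGSTOP. */
static unsigned long unlink_with_signr(unsigned long jobctl)
{
    jobctl |= SKETCH_STOP_PENDING;
    if (!(jobctl & SKETCH_STOP_SIGMASK))
        jobctl |= SIGSTOP;
    return jobctl;
}

/* Condition behind the warning: a pending stop with no signal recorded. */
static bool would_warn(unsigned long jobctl)
{
    return (jobctl & SKETCH_STOP_PENDING) && !(jobctl & SKETCH_STOP_SIGMASK);
}
```

A new thread starts with jobctl = 0, so would_warn(unlink_without_signr(0)) holds while would_warn(unlink_with_signr(0)) does not.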
     
  • Test-case:

    #include <assert.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/ptrace.h>
    #include <sys/wait.h>

    int main(void)
    {
        int pid, status;

        pid = fork();
        if (!pid) {
            for (;;) {
                if (!fork())
                    return 0;
                if (waitpid(-1, &status, 0) < 0) {
                    printf("ERR!! wait: %m\n");
                    return 0;
                }
            }
        }

        assert(ptrace(PTRACE_ATTACH, pid, 0,0) == 0);
        assert(waitpid(-1, NULL, 0) == pid);

        assert(ptrace(PTRACE_SETOPTIONS, pid, 0,
                      PTRACE_O_TRACEFORK) == 0);

        do {
            ptrace(PTRACE_CONT, pid, 0, 0);
            pid = waitpid(-1, NULL, 0);
        } while (pid > 0);

        return 1;
    }

    It fails because ->real_parent sees its child in EXIT_DEAD state
    while the tracer is going to change the state back to EXIT_ZOMBIE
    in wait_task_zombie().

    The offending commit is 823b018e which moved the EXIT_DEAD check,
    but in fact we should not blame it. The original code was not
    correct as well because it didn't take ptrace_reparented() into
    account and because we can't really trust ->ptrace.

    This patch adds the additional check to close this particular
    race but it doesn't solve the whole problem. We simply can't
    rely on ->ptrace in this case, it can be cleared if the tracer
    is multithreaded by the exiting ->parent.

    I think we should kill EXIT_DEAD altogether, we should always
    remove the soon-to-be-reaped child from ->children or at least
    we should never do the DEAD->ZOMBIE transition. But this is too
    complex for 3.2.

    Reported-and-tested-by: Denys Vlasenko
    Tested-by: Lukasz Michalik
    Acked-by: Tejun Heo
    Cc: [3.0+]
    Signed-off-by: Oleg Nesterov
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     
  • * git://git.samba.org/sfrench/cifs-2.6:
    [CIFS] default ntlmv2 for cifs mount delayed to 3.3
    cifs: fix bad buffer length check in coalesce_t2

    Linus Torvalds
     

04 Jan, 2012

6 commits

  • This reverts commit 93b2ec0128c431148b216b8f7337c1a52131ef03.

    The call to "schedule_work()" in rtc_initialize_alarm() happens too
    early, and can cause oopses at bootup

    Neil Brown explains why we do it:

    "If you set an alarm in the future, then shutdown and boot again after
    that time, then you will end up with a timer_queue node which is in
    the past.

    When this happens the queue gets stuck. That entry-in-the-past won't
    get removed until an interrupt happens and an interrupt won't happen
    because the RTC only triggers an interrupt when the alarm is "now".

    So you'll find that e.g. "hwclock" will always tell you that
    'select' timed out.

    So we force the interrupt work to happen at the start just in case."

    and has a patch that converts it to do things in-process rather than with
    the worker thread, but right now it's too late to play around with this,
    so we just revert the patch that caused problems for now.

    Reported-by: Sander Eikelenboom
    Requested-by: Konrad Rzeszutek Wilk
    Requested-by: John Stultz
    Cc: Neil Brown
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Turned out the ntlmv2 (default security authentication)
    upgrade was harder to test than expected, and we ran
    out of time to test against Apple and a few other servers
    that we wanted to. Delay upgrade of default security
    from ntlm to ntlmv2 (on mount) to 3.3. Still works
    fine to specify it explicitly via "sec=ntlmv2" so this
    should be fine.

    Acked-by: Jeff Layton
    Signed-off-by: Steve French

    Steve French
     
  • The current check looks to see if the RFC1002 length is larger than
    CIFSMaxBufSize, and fails if it is. The buffer is actually larger than
    that by MAX_CIFS_HDR_SIZE.

    This bug has been around for a long time, but the fact that we used to
    cap the client's MaxBufferSize at the same level as the server tended
    to paper over it. Commit c974befa changed that however and caused this
    bug to bite in more cases.

    Reported-and-Tested-by: Konstantinos Skarlatos
    Tested-by: Shirish Pargaonkar
    Signed-off-by: Jeff Layton
    Signed-off-by: Steve French

    Jeff Layton
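
The length check described in the entry above can be sketched as follows. This is a minimal illustration and not the cifs code: the two size constants are hypothetical placeholders for CIFSMaxBufSize and MAX_CIFS_HDR_SIZE, the helper names are invented, and the exact comparison in the real patch may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sizes; the real values come from the cifs module. */
#define SKETCH_MAX_BUF_SIZE 16384u  /* stands in for CIFSMaxBufSize */
#define SKETCH_HDR_SIZE       256u  /* stands in for MAX_CIFS_HDR_SIZE */

/* Old check: rejects anything longer than the payload limit alone. */
static bool fits_old(size_t rfc1002_len)
{
    return rfc1002_len <= SKETCH_MAX_BUF_SIZE;
}

/* Corrected check: the buffer also has room for the header. */
static bool fits_new(size_t rfc1002_len)
{
    return rfc1002_len <= SKETCH_MAX_BUF_SIZE + SKETCH_HDR_SIZE;
}
```

With these placeholder values, a length of SKETCH_MAX_BUF_SIZE + 100 is rejected by fits_old() even though the buffer could hold it, while fits_new() accepts it.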
     
  • This reverts commit c0afabd3d553c521e003779c127143ffde55a16f.

    It causes failures on Toshiba laptops - instead of disabling the alarm,
    it actually seems to enable it on the affected laptops, resulting in
    (for example) the laptop powering on automatically five minutes after
    shutdown.

    There's a patch for it that appears to work for at least some people,
    but it's too late to play around with this, so revert for now and try
    again in the next merge window.

    See for example

    http://bugs.debian.org/652869

    Reported-and-bisected-by: Andreas Friedrich (Toshiba Tecra)
    Reported-by: Antonio-M. Corbi Bellot (Toshiba Portege R500)
    Reported-by: Marco Santos (Toshiba Portege Z830)
    Reported-by: Christophe Vu-Brugier (Toshiba Portege R830)
    Cc: Jonathan Nieder
    Requested-by: John Stultz
    Cc: stable@kernel.org # for the versions that applied this
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • vfork parent uninterruptibly and unkillably waits for its child to
    exec/exit. This wait is of unbounded length. Ignore such waits
    in the hung_task detector.

    Signed-off-by: Mandeep Singh Baines
    Reported-by: Sasha Levin
    LKML-Reference:
    Cc: Linus Torvalds
    Cc: Ingo Molnar
    Cc: Peter Zijlstra
    Cc: Andrew Morton
    Cc: John Kacur
    Cc: stable@kernel.org
    Signed-off-by: Linus Torvalds

    Mandeep Singh Baines
     
  • Commit 1e39f384bb01 ("evm: fix build problems") makes the stub version
    of security_old_inode_init_security() return 0 when CONFIG_SECURITY is
    not set.

    But that makes callers such as reiserfs_security_init() assume that
    security_old_inode_init_security() has set name, value, and len
    arguments properly - but security_old_inode_init_security() left them
    uninitialized which then results in interesting failures.

    Revert security_old_inode_init_security() to the old behavior of
    returning EOPNOTSUPP since both callers (reiserfs and ocfs2) handle this
    just fine.

    [ Also fixed the S_PRIVATE(inode) case of the actual non-stub
    security_old_inode_init_security() function to return EOPNOTSUPP
    for the same reason, as pointed out by Mimi Zohar.

    It got incorrectly changed to match the new function in commit
    fb88c2b6cbb1: "evm: fix security/security_old_init_security return
    code". - Linus ]

    Reported-by: Jorge Bastos
    Acked-by: James Morris
    Acked-by: Mimi Zohar
    Signed-off-by: Jan Kara
    Signed-off-by: Linus Torvalds

    Jan Kara
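
The failure mode in the entry above comes down to a stub that reports success without filling in its out-parameters. A minimal userspace sketch of the corrected contract, with invented function names that only mirror the shape of the kernel interfaces:

```c
#include <errno.h>
#include <stddef.h>

/* Stub for a compiled-out feature: it never touches the out-parameters,
 * so it must not return 0 (which would claim they are valid). */
static int stub_init_security(const char **name, void **value, size_t *len)
{
    (void)name;
    (void)value;
    (void)len;
    return -EOPNOTSUPP;
}

/* Caller: reads the outputs only on success and treats -EOPNOTSUPP
 * as "no security label", which is not an error. */
static int caller_init(size_t *label_len)
{
    const char *name;
    void *value;
    size_t len;
    int err = stub_init_security(&name, &value, &len);

    if (err == -EOPNOTSUPP) {
        *label_len = 0;
        return 0;
    }
    if (err)
        return err;
    *label_len = len;
    return 0;
}
```

If the stub returned 0 instead, caller_init() would copy an uninitialized len, which is the kind of "interesting failure" the commit message describes.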
     

03 Jan, 2012

2 commits


02 Jan, 2012

1 commit


01 Jan, 2012

3 commits

  • * 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
    ASoC: wm8776: add missing break in sample size switch

    Linus Torvalds
     
  • The current gspca core code has a regression where it no longer properly
    falls back to lower alt settings when there is not enough bandwidth.

    This causes many iso based usb-1 cameras to not work when plugged into a
    usb2 hub or a sandybridge chipset motherboard!

    This patch fixes this.

    Signed-off-by: Hans de Goede
    Signed-off-by: Mauro Carvalho Chehab
    Signed-off-by: Linus Torvalds

    Mauro Carvalho Chehab
     
  • It was found (by Sasha) that if you use a futex located in the gate
    area we get stuck in an uninterruptible infinite loop, much like the
    ZERO_PAGE issue.

    While looking at this problem, PeterZ realized you'll get into similar
    trouble when hitting any install_special_pages() mapping. And are there
    still drivers setting up their own special mmaps without page->mapping,
    and without special VM or pte flags to make get_user_pages fail?

    In most cases, if page->mapping is NULL, we do not need to retry at all:
    Linus points out that even /proc/sys/vm/drop_caches poses no problem,
    because it ends up using remove_mapping(), which takes care not to
    interfere when the page reference count is raised.

    But there is still one case which does need a retry: if memory pressure
    called shmem_writepage in between get_user_pages_fast dropping page
    table lock and our acquiring page lock, then the page gets switched from
    filecache to swapcache (and ->mapping set to NULL) whatever the refcount.
    Fault it back in to get the page->mapping needed for key->shared.inode.

    Reported-by: Sasha Levin
    Signed-off-by: Hugh Dickins
    Cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

31 Dec, 2011

11 commits


30 Dec, 2011

11 commits

  • * 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
    iommu: Initialize domain->handler in iommu_domain_alloc()

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
    packet: fix possible dev refcnt leak when bind fail
    netem: dont call vfree() under spinlock and BH disabled
    netfilter: ctnetlink: fix scheduling while atomic if helper is autoloaded
    netfilter: ctnetlink: fix return value of ctnetlink_get_expect()

    Linus Torvalds
     
  • * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf/x86: Fix raw_spin_unlock_irqrestore() usage
    oprofile, arm/sh: Fix oprofile_arch_exit() linkage issue

    Linus Torvalds
     
  • * 'for-linus' of git://oss.sgi.com/xfs/xfs:
    xfs: log all dirty inodes in xfs_fs_sync_fs
    xfs: log the inode in ->write_inode calls for kupdate

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.dk/linux-block:
    block: fix blk_queue_end_tag()
    block: re-use existing 'reading' variable instead of checking direction again
    block, cfq: fix empty queue crash caused by request merge

    Linus Torvalds
     
  • If a huge page is enqueued under the protection of hugetlb_lock, then the
    operation is atomic and safe.

    Signed-off-by: Hillf Danton
    Reviewed-by: Michal Hocko
    Acked-by: KAMEZAWA Hiroyuki
    Cc: [2.6.37+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hillf Danton
     
  • Commit 2a95ea6c0d129b4 ("procfs: do not overflow get_{idle,iowait}_time
    for nohz") did not take into account that on some architectures jiffies
    and cputime use different units.

    This causes get_idle_time() to return numbers in the wrong units, making
    the idle time fields in /proc/stat wrong.

    Instead of converting the usec value returned by
    get_cpu_{idle,iowait}_time_us to units of jiffies, use the new function
    usecs_to_cputime64 to convert it to the correct unit of cputime64_t.

    Signed-off-by: Andreas Schwab
    Acked-by: Michal Hocko
    Cc: Arnd Bergmann
    Cc: "Artem S. Tashkinov"
    Cc: Dave Jones
    Cc: Alexey Dobriyan
    Cc: Thomas Gleixner
    Cc: "Luck, Tony"
    Cc: Benjamin Herrenschmidt
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andreas Schwab
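
To illustrate the unit mismatch in the entry above: the sketch below assumes a tick rate of 100 jiffies per second and a hypothetical architecture whose cputime unit is one microsecond. Both constants and both helper names are assumptions for illustration; they are not the kernel's definitions.

```c
#include <stdint.h>

#define SKETCH_HZ              100u      /* assumed: jiffies per second */
#define SKETCH_CPUTIME_PER_SEC 1000000u  /* assumed: cputime units per second */

/* Conversion to jiffies: correct only where cputime is counted in jiffies. */
static uint64_t sketch_usecs_to_jiffies(uint64_t usecs)
{
    return usecs * SKETCH_HZ / 1000000u;
}

/* Conversion to the architecture's own cputime unit. */
static uint64_t sketch_usecs_to_cputime64(uint64_t usecs)
{
    return usecs * SKETCH_CPUTIME_PER_SEC / 1000000u;
}
```

Under these assumptions two seconds of idle time is 200 jiffies but 2000000 cputime units, so storing the jiffies value in a cputime field understates it by a factor of 10000.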
     
  • Commit 8aacc9f550 ("mm/mempolicy.c: fix pgoff in mbind vma merge") is a
    slightly incorrect fix.

    Why? Consider the following case.

    1. map 4 pages of a file at offset 0

    [0123]

    2. map 2 pages just after the first mapping of the same file but with
    page offset 2

    [0123][23]

    3. mbind() 2 pages from the first mapping at offset 2.
    mbind_range() should treat the new vma as

    [0123][23]
    |23|
    mbind vma

    but it does

    [0123][23]
    |01|
    mbind vma

    Oops. It then does a wrong vma merge and split ([01][0123] or similar).

    This patch fixes it.

    [testcase]
    test result - before the patch

    case4: 126: test failed. expect '2,4', actual '2,2,2'
    case5: passed
    case6: passed
    case7: passed
    case8: passed
    case_n: 246: test failed. expect '4,2', actual '1,4'

    ------------[ cut here ]------------
    kernel BUG at mm/filemap.c:135!
    invalid opcode: 0000 [#4] SMP DEBUG_PAGEALLOC

    (snip long BUG_ON messages)

    test result - after the patch

    case4: passed
    case5: passed
    case6: passed
    case7: passed
    case8: passed
    case_n: passed

    source: mbind_vma_test.c
    ============================================================
    #include <numaif.h>
    #include <numa.h>
    #include <sys/mman.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <stdlib.h>
    #include <string.h>

    static unsigned long pagesize;
    void* mmap_addr;
    struct bitmask *nmask;
    char buf[1024];
    FILE *file;
    char retbuf[10240] = "";
    int mapped_fd;

    char *rubysrc = "ruby -e '\
    pid = %d; \
    vstart = 0x%llx; \
    vend = 0x%llx; \
    s = `pmap -q #{pid}`; \
    rary = []; \
    s.each_line {|line|; \
    ary=line.split(\" \"); \
    addr = ary[0].to_i(16); \
    if(vstart <= addr && addr < vend) then \
    rary.push(ary[1].to_i()/4); \
    end; \
    }; \
    print rary.join(\",\"); \
    '";

    void init(void)
    {
        void* addr;
        char buf[128];

        nmask = numa_allocate_nodemask();
        numa_bitmask_setbit(nmask, 0);

        pagesize = getpagesize();

        sprintf(buf, "%s", "mbind_vma_XXXXXX");
        mapped_fd = mkstemp(buf);
        if (mapped_fd == -1)
            perror("mkstemp "), exit(1);
        unlink(buf);

        if (lseek(mapped_fd, pagesize*8, SEEK_SET) < 0)
            perror("lseek "), exit(1);
        if (write(mapped_fd, "\0", 1) < 0)
            perror("write "), exit(1);

        addr = mmap(NULL, pagesize*8, PROT_NONE,
                    MAP_SHARED, mapped_fd, 0);
        if (addr == MAP_FAILED)
            perror("mmap "), exit(1);

        if (mprotect(addr+pagesize, pagesize*6, PROT_READ|PROT_WRITE) < 0)
            perror("mprotect "), exit(1);

        mmap_addr = addr + pagesize;

        /* make page populate */
        memset(mmap_addr, 0, pagesize*6);
    }

    void fin(void)
    {
        void* addr = mmap_addr - pagesize;
        munmap(addr, pagesize*8);

        memset(buf, 0, sizeof(buf));
        memset(retbuf, 0, sizeof(retbuf));
    }

    void mem_bind(int index, int len)
    {
        int err;

        err = mbind(mmap_addr+pagesize*index, pagesize*len,
                    MPOL_BIND, nmask->maskp, nmask->size, 0);
        if (err)
            perror("mbind "), exit(err);
    }

    void mem_interleave(int index, int len)
    {
        int err;

        err = mbind(mmap_addr+pagesize*index, pagesize*len,
                    MPOL_INTERLEAVE, nmask->maskp, nmask->size, 0);
        if (err)
            perror("mbind "), exit(err);
    }

    void mem_unbind(int index, int len)
    {
        int err;

        err = mbind(mmap_addr+pagesize*index, pagesize*len,
                    MPOL_DEFAULT, NULL, 0, 0);
        if (err)
            perror("mbind "), exit(err);
    }

    void Assert(char *expected, char *value, char *name, int line)
    {
        if (strcmp(expected, value) == 0) {
            fprintf(stderr, "%s: passed\n", name);
            return;
        }
        else {
            fprintf(stderr, "%s: %d: test failed. expect '%s', actual '%s'\n",
                    name, line,
                    expected, value);
            // exit(1);
        }
    }

    /*
    AAAA
    PPPPPPNNNNNN
    might become
    PPNNNNNNNNNN
    case 4 below
    */
    void case4(void)
    {
        init();
        sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

        mem_bind(0, 4);
        mem_unbind(2, 2);

        file = popen(buf, "r");
        fread(retbuf, sizeof(retbuf), 1, file);
        Assert("2,4", retbuf, "case4", __LINE__);

        fin();
    }

    /*
    AAAA
    PPPPPPNNNNNN
    might become
    PPPPPPPPPPNN
    case 5 below
    */
    void case5(void)
    {
        init();
        sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

        mem_bind(0, 2);
        mem_bind(2, 2);

        file = popen(buf, "r");
        fread(retbuf, sizeof(retbuf), 1, file);
        Assert("4,2", retbuf, "case5", __LINE__);

        fin();
    }

    /*
    AAAA
    PPPPNNNNXXXX
    might become
    PPPPPPPPPPPP 6
    */
    void case6(void)
    {
        init();
        sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

        mem_bind(0, 2);
        mem_bind(4, 2);
        mem_bind(2, 2);

        file = popen(buf, "r");
        fread(retbuf, sizeof(retbuf), 1, file);
        Assert("6", retbuf, "case6", __LINE__);

        fin();
    }

    /*
    AAAA
    PPPPNNNNXXXX
    might become
    PPPPPPPPXXXX 7
    */
    void case7(void)
    {
        init();
        sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

        mem_bind(0, 2);
        mem_interleave(4, 2);
        mem_bind(2, 2);

        file = popen(buf, "r");
        fread(retbuf, sizeof(retbuf), 1, file);
        Assert("4,2", retbuf, "case7", __LINE__);

        fin();
    }

    /*
    AAAA
    PPPPNNNNXXXX
    might become
    PPPPNNNNNNNN 8
    */
    void case8(void)
    {
        init();
        sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

        mem_bind(0, 2);
        mem_interleave(4, 2);
        mem_interleave(2, 2);

        file = popen(buf, "r");
        fread(retbuf, sizeof(retbuf), 1, file);
        Assert("2,4", retbuf, "case8", __LINE__);

        fin();
    }

    void case_n(void)
    {
        init();
        sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

        /* make redundant mappings [0][1234][34][7] */
        mmap(mmap_addr + pagesize*4, pagesize*2, PROT_READ|PROT_WRITE,
             MAP_FIXED|MAP_SHARED, mapped_fd, pagesize*3);

        /* Expect to do nothing. */
        mem_unbind(2, 2);

        file = popen(buf, "r");
        fread(retbuf, sizeof(retbuf), 1, file);
        Assert("4,2", retbuf, "case_n", __LINE__);

        fin();
    }

    int main(int argc, char** argv)
    {
        case4();
        case5();
        case6();
        case7();
        case8();
        case_n();

        return 0;
    }
    =============================================================

    Signed-off-by: KOSAKI Motohiro
    Acked-by: Johannes Weiner
    Cc: Minchan Kim
    Cc: Caspar Zhang
    Cc: KOSAKI Motohiro
    Cc: Christoph Lameter
    Cc: Hugh Dickins
    Cc: Mel Gorman
    Cc: Lee Schermerhorn
    Cc: [3.1.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
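
The [0123][23] example in the entry above can be worked through with a small sketch. The struct, the helper name, and the 4 KiB page size are assumptions for illustration; this is not the mbind_range() code, only the offset arithmetic it has to get right.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* assumed 4 KiB pages */

/* Minimal stand-in for a vma: address range plus file offset in pages. */
struct sketch_vma {
    uintptr_t vm_start;
    uintptr_t vm_end;
    unsigned long vm_pgoff;
};

/* File page offset of the sub-range of vma that begins at addr. */
static unsigned long sketch_pgoff_at(const struct sketch_vma *vma, uintptr_t addr)
{
    return vma->vm_pgoff + ((addr - vma->vm_start) >> SKETCH_PAGE_SHIFT);
}
```

For a first mapping of 4 pages at file offset 0, a policy range starting 2 pages in has page offset 2 (the "|23|" picture). Evaluating the offset at the start of the mapping instead yields 0 (the "|01|" picture), which is what leads to the wrong merge.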
     
  • The new iso bandwidth calculation code accidentally has broken support
    for bulk mode cameras. This has broken the following drivers:
    finepix, jeilinj, ovfx2, ov534, ov534_9, se401, sq905, sq905c, sq930x,
    stv0680, vicam.

    This patch fixes this. Fix tested with: se401, sq905, sq905c, stv0680 & vicam
    cams.

    Signed-off-by: Hans de Goede
    Signed-off-by: Mauro Carvalho Chehab
    Signed-off-by: Linus Torvalds

    Hans de Goede
     
  • Fixing wrong register offset which is used to retrieve the number of buttons
    attached to the hardware.

    Signed-off-by: Tai-hwa Liang
    Signed-off-by: Dmitry Torokhov

    Tai-hwa Liang
     
  • Ceph attempts to use the dcache to satisfy negative lookups and readdir
    when the entire directory contents are in cache. Disable this behavior
    until lingering bugs in this code are shaken out; we'll re-enable these
    hooks once things are fully stable.

    Signed-off-by: Sage Weil

    Sage Weil
     

29 Dec, 2011

1 commit

  • Commit 5e081591 "block: warn if tag is greater than real_max_depth"
    cleaned up blk_queue_end_tag() to warn when the tag is truly invalid
    (greater than real_max_depth). However, it changed behavior in the tag <
    max_depth case to not end the request, leading to triggering of
    BUG_ON(blk_queued_rq(rq)) in the request completion path:

    http://marc.info/?l=linux-kernel&m=132204370518629&w=2

    In order to allow blk_queue_resize_tags() to shrink the tag space
    blk_queue_end_tag() must always complete tags with a value less than
    real_max_depth regardless of the current max_depth. The comment about
    "handling the shrink case" seems to be what prompted changes in this
    space, so remove it and BUG on all invalid tags (made even simpler by
    Matthew's suggestion to use an unsigned compare).

    Signed-off-by: Dan Williams
    Cc: Tao Ma
    Cc: Matthew Wilcox
    Reported-by: Meelis Roos
    Reported-by: Ed Nadolski
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    Dan Williams
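
The unsigned-compare suggestion mentioned in the entry above can be sketched in isolation. The helper name is hypothetical and this is not the block layer code; it only shows why a single comparison covers both invalid cases.

```c
#include <stdbool.h>

/* One unsigned comparison rejects both negative tags and tags at or
 * past the limit: a negative int converts to a very large unsigned value. */
static bool sketch_tag_valid(int tag, unsigned int real_max_depth)
{
    return (unsigned int)tag < real_max_depth;
}
```

With a depth of 32, tags 0 through 31 are accepted, while 32 and -1 are both rejected without a separate "tag < 0" test.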
     

28 Dec, 2011

1 commit