31 Dec, 2011

1 commit


30 Dec, 2011

17 commits

  • * 'iommu/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
    iommu: Initialize domain->handler in iommu_domain_alloc()

    Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
    packet: fix possible dev refcnt leak when bind fail
    netem: dont call vfree() under spinlock and BH disabled
    netfilter: ctnetlink: fix scheduling while atomic if helper is autoloaded
    netfilter: ctnetlink: fix return value of ctnetlink_get_expect()

    Linus Torvalds
     
  • * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    perf/x86: Fix raw_spin_unlock_irqrestore() usage
    oprofile, arm/sh: Fix oprofile_arch_exit() linkage issue

    Linus Torvalds
     
  • * 'for-linus' of git://oss.sgi.com/xfs/xfs:
    xfs: log all dirty inodes in xfs_fs_sync_fs
    xfs: log the inode in ->write_inode calls for kupdate

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.dk/linux-block:
    block: fix blk_queue_end_tag()
    block: re-use existing 'reading' variable instead of checking direction again
    block, cfq: fix empty queue crash caused by request merge

    Linus Torvalds
     
  • If a huge page is enqueued under the protection of hugetlb_lock, then the
    operation is atomic and safe.

    Signed-off-by: Hillf Danton
    Reviewed-by: Michal Hocko
    Acked-by: KAMEZAWA Hiroyuki
    Cc: [2.6.37+]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hillf Danton
     
  • Commit 2a95ea6c0d129b4 ("procfs: do not overflow get_{idle,iowait}_time
    for nohz") did not take into account that one some architectures jiffies
    and cputime use different units.

    This causes get_idle_time() to return numbers in the wrong units, making
    the idle time fields in /proc/stat wrong.

    Instead of converting the usec value returned by
    get_cpu_{idle,iowait}_time_us to units of jiffies, use the new function
    usecs_to_cputime64 to convert it to the correct unit of cputime64_t.

    Signed-off-by: Andreas Schwab
    Acked-by: Michal Hocko
    Cc: Arnd Bergmann
    Cc: "Artem S. Tashkinov"
    Cc: Dave Jones
    Cc: Alexey Dobriyan
    Cc: Thomas Gleixner
    Cc: "Luck, Tony"
    Cc: Benjamin Herrenschmidt
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andreas Schwab
     
  • commit 8aacc9f550 ("mm/mempolicy.c: fix pgoff in mbind vma merge") is the
    slightly incorrect fix.

    Why? Think following case.

    1. map 4 pages of a file at offset 0

    [0123]

    2. map 2 pages just after the first mapping of the same file but with
    page offset 2

    [0123][23]

    3. mbind() 2 pages from the first mapping at offset 2.
    mbind_range() should treat new vma is,

    [0123][23]
    |23|
    mbind vma

    but it does

    [0123][23]
    |01|
    mbind vma

    Oops. then, it makes wrong vma merge and splitting ([01][0123] or similar).

    This patch fixes it.

    [testcase]
    test result - before the patch

    case4: 126: test failed. expect '2,4', actual '2,2,2'
    case5: passed
    case6: passed
    case7: passed
    case8: passed
    case_n: 246: test failed. expect '4,2', actual '1,4'

    ------------[ cut here ]------------
    kernel BUG at mm/filemap.c:135!
    invalid opcode: 0000 [#4] SMP DEBUG_PAGEALLOC

    (snip long bug on messages)

    test result - after the patch

    case4: passed
    case5: passed
    case6: passed
    case7: passed
    case8: passed
    case_n: passed

    source: mbind_vma_test.c
    ============================================================
    #include
    #include
    #include
    #include
    #include
    #include
    #include

    static unsigned long pagesize;
    void* mmap_addr;
    struct bitmask *nmask;
    char buf[1024];
    FILE *file;
    char retbuf[10240] = "";
    int mapped_fd;

    char *rubysrc = "ruby -e '\
    pid = %d; \
    vstart = 0x%llx; \
    vend = 0x%llx; \
    s = `pmap -q #{pid}`; \
    rary = []; \
    s.each_line {|line|; \
    ary=line.split(\" \"); \
    addr = ary[0].to_i(16); \
    if(vstart < vend) then \
    rary.push(ary[1].to_i()/4); \
    end; \
    }; \
    print rary.join(\",\"); \
    '";

    void init(void)
    {
    void* addr;
    char buf[128];

    nmask = numa_allocate_nodemask();
    numa_bitmask_setbit(nmask, 0);

    pagesize = getpagesize();

    sprintf(buf, "%s", "mbind_vma_XXXXXX");
    mapped_fd = mkstemp(buf);
    if (mapped_fd == -1)
    perror("mkstemp "), exit(1);
    unlink(buf);

    if (lseek(mapped_fd, pagesize*8, SEEK_SET) < 0)
    perror("lseek "), exit(1);
    if (write(mapped_fd, "\0", 1) < 0)
    perror("write "), exit(1);

    addr = mmap(NULL, pagesize*8, PROT_NONE,
    MAP_SHARED, mapped_fd, 0);
    if (addr == MAP_FAILED)
    perror("mmap "), exit(1);

    if (mprotect(addr+pagesize, pagesize*6, PROT_READ|PROT_WRITE) < 0)
    perror("mprotect "), exit(1);

    mmap_addr = addr + pagesize;

    /* make page populate */
    memset(mmap_addr, 0, pagesize*6);
    }

    void fin(void)
    {
    void* addr = mmap_addr - pagesize;
    munmap(addr, pagesize*8);

    memset(buf, 0, sizeof(buf));
    memset(retbuf, 0, sizeof(retbuf));
    }

    void mem_bind(int index, int len)
    {
    int err;

    err = mbind(mmap_addr+pagesize*index, pagesize*len,
    MPOL_BIND, nmask->maskp, nmask->size, 0);
    if (err)
    perror("mbind "), exit(err);
    }

    void mem_interleave(int index, int len)
    {
    int err;

    err = mbind(mmap_addr+pagesize*index, pagesize*len,
    MPOL_INTERLEAVE, nmask->maskp, nmask->size, 0);
    if (err)
    perror("mbind "), exit(err);
    }

    void mem_unbind(int index, int len)
    {
    int err;

    err = mbind(mmap_addr+pagesize*index, pagesize*len,
    MPOL_DEFAULT, NULL, 0, 0);
    if (err)
    perror("mbind "), exit(err);
    }

    void Assert(char *expected, char *value, char *name, int line)
    {
    if (strcmp(expected, value) == 0) {
    fprintf(stderr, "%s: passed\n", name);
    return;
    }
    else {
    fprintf(stderr, "%s: %d: test failed. expect '%s', actual '%s'\n",
    name, line,
    expected, value);
    // exit(1);
    }
    }

    /*
    AAAA
    PPPPPPNNNNNN
    might become
    PPNNNNNNNNNN
    case 4 below
    */
    void case4(void)
    {
    init();
    sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

    mem_bind(0, 4);
    mem_unbind(2, 2);

    file = popen(buf, "r");
    fread(retbuf, sizeof(retbuf), 1, file);
    Assert("2,4", retbuf, "case4", __LINE__);

    fin();
    }

    /*
    AAAA
    PPPPPPNNNNNN
    might become
    PPPPPPPPPPNN
    case 5 below
    */
    void case5(void)
    {
    init();
    sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

    mem_bind(0, 2);
    mem_bind(2, 2);

    file = popen(buf, "r");
    fread(retbuf, sizeof(retbuf), 1, file);
    Assert("4,2", retbuf, "case5", __LINE__);

    fin();
    }

    /*
    AAAA
    PPPPNNNNXXXX
    might become
    PPPPPPPPPPPP 6
    */
    void case6(void)
    {
    init();
    sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

    mem_bind(0, 2);
    mem_bind(4, 2);
    mem_bind(2, 2);

    file = popen(buf, "r");
    fread(retbuf, sizeof(retbuf), 1, file);
    Assert("6", retbuf, "case6", __LINE__);

    fin();
    }

    /*
    AAAA
    PPPPNNNNXXXX
    might become
    PPPPPPPPXXXX 7
    */
    void case7(void)
    {
    init();
    sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

    mem_bind(0, 2);
    mem_interleave(4, 2);
    mem_bind(2, 2);

    file = popen(buf, "r");
    fread(retbuf, sizeof(retbuf), 1, file);
    Assert("4,2", retbuf, "case7", __LINE__);

    fin();
    }

    /*
    AAAA
    PPPPNNNNXXXX
    might become
    PPPPNNNNNNNN 8
    */
    void case8(void)
    {
    init();
    sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

    mem_bind(0, 2);
    mem_interleave(4, 2);
    mem_interleave(2, 2);

    file = popen(buf, "r");
    fread(retbuf, sizeof(retbuf), 1, file);
    Assert("2,4", retbuf, "case8", __LINE__);

    fin();
    }

    void case_n(void)
    {
    init();
    sprintf(buf, rubysrc, getpid(), mmap_addr, mmap_addr+pagesize*6);

    /* make redundunt mappings [0][1234][34][7] */
    mmap(mmap_addr + pagesize*4, pagesize*2, PROT_READ|PROT_WRITE,
    MAP_FIXED|MAP_SHARED, mapped_fd, pagesize*3);

    /* Expect to do nothing. */
    mem_unbind(2, 2);

    file = popen(buf, "r");
    fread(retbuf, sizeof(retbuf), 1, file);
    Assert("4,2", retbuf, "case_n", __LINE__);

    fin();
    }

    int main(int argc, char** argv)
    {
    case4();
    case5();
    case6();
    case7();
    case8();
    case_n();

    return 0;
    }
    =============================================================

    Signed-off-by: KOSAKI Motohiro
    Acked-by: Johannes Weiner
    Cc: Minchan Kim
    Cc: Caspar Zhang
    Cc: KOSAKI Motohiro
    Cc: Christoph Lameter
    Cc: Hugh Dickins
    Cc: Mel Gorman
    Cc: Lee Schermerhorn
    Cc: [3.1.x]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • The new iso bandwidth calculation code accidentally has broken support
    for bulk mode cameras. This has broken the following drivers:
    finepix, jeilinj, ovfx2, ov534, ov534_9, se401, sq905, sq905c, sq930x,
    stv0680, vicam.

    Thix patch fixes this. Fix tested with: se401, sq905, sq905c, stv0680 & vicam
    cams.

    Signed-off-by: Hans de Goede
    Signed-off-by: Mauro Carvalho Chehab
    Signed-off-by: Linus Torvalds

    Hans de Goede
     
  • In some of the rt6_bind_neighbour() call sites, it hasn't hooked
    up the rt->dst.dev pointer yet, so we'd deref a NULL pointer when
    obtaining dev->ifindex for the neighbour hash function computation.

    Just pass the netdevice explicitly in to fix this problem.

    Reported-by: Bjarke Istrup Pedersen
    Signed-off-by: David S. Miller

    David S. Miller
     
  • Michael S. Tsirkin also noticed that we could run the refill work
    multiple CPUs: if we kick off a refill on one CPU and then on another,
    they would both manipulate the queue at the same time (they use
    napi_disable to avoid racing against the receive handler itself).

    Tejun points out that this is what the WQ_NON_REENTRANT flag is for,
    and that there is a convenient system kthread we can use.

    Signed-off-by: Rusty Russell
    Signed-off-by: David S. Miller

    Rusty Russell
     
  • Michael S. Tsirkin noticed that we could run the refill work after
    ndo_close, which can re-enable napi - we don't disable it until
    virtnet_remove. This is clearly wrong, so move the workqueue control
    to ndo_open and ndo_stop (aka. virtnet_open and virtnet_close).

    One subtle point: virtnet_probe() could simply fail if it couldn't
    allocate a receive buffer, but that's less polite in virtnet_open() so
    we schedule a refill as we do in the normal receive path if we run out
    of memory.

    Signed-off-by: Rusty Russell
    Signed-off-by: David S. Miller

    Rusty Russell
     
  • I missed this while adding ipv6 support to inet_peer.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • At the moment VFs can only operate in Eth mode.
    In addition we don't want the VF to attempt link sensing,
    so we block this option as well.

    Signed-off-by: Yevgeny Petrilin
    Signed-off-by: David S. Miller

    Yevgeny Petrilin
     
  • Signed-off-by: Yevgeny Petrilin
    Signed-off-by: David S. Miller

    Yevgeny Petrilin
     
  • Provide child qdisc backlog (byte count) information so that "tc -s
    qdisc" can report it to user.

    qdisc netem 30: root refcnt 18 limit 1000 delay 20.0ms 10.0ms
    Sent 948517 bytes 898 pkt (dropped 0, overlimits 0 requeues 1)
    rate 175056bit 16pps backlog 114b 1p requeues 1
    qdisc tbf 40: parent 30: rate 256000bit burst 20Kb/8 mpu 0b lat 0us
    Sent 948517 bytes 898 pkt (dropped 15, overlimits 611 requeues 0)
    backlog 18168b 12p requeues 0

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • warning: (NETFILTER_XT_MATCH_NFACCT) selects NETFILTER_NETLINK_ACCT which has
    unmet direct dependencies (NET && INET && NETFILTER && NETFILTER_ADVANCED)

    and then

    ERROR: "nfnetlink_subsys_unregister" [net/netfilter/nfnetlink_acct.ko] undefined!
    ERROR: "nfnetlink_subsys_register" [net/netfilter/nfnetlink_acct.ko] undefined!

    Reported-by: Randy Dunlap
    Signed-off-by: Pablo Neira Ayuso
    Acked-by: Randy Dunlap
    Signed-off-by: David S. Miller

    Pablo Neira Ayuso
     

29 Dec, 2011

10 commits

  • Commit 5e081591 "block: warn if tag is greater than real_max_depth"
    cleaned up blk_queue_end_tag() to warn when the tag is truly invalid
    (greater than real_max_depth). However, it changed behavior in the tag <
    max_depth case to not end the request. Leading to triggering of
    BUG_ON(blk_queued_rq(rq)) in the request completion path:

    http://marc.info/?l=linux-kernel&m=132204370518629&w=2

    In order to allow blk_queue_resize_tags() to shrink the tag space
    blk_queue_end_tag() must always complete tags with a value less than
    real_max_depth regardless of the current max_depth. The comment about
    "handling the shrink case" seems to be what prompted changes in this
    space, so remove it and BUG on all invalid tags (made even simpler by
    Matthew's suggestion to use an unsigned compare).

    Signed-off-by: Dan Williams
    Cc: Tao Ma
    Cc: Matthew Wilcox
    Reported-by: Meelis Roos
    Reported-by: Ed Nadolski
    Cc: Tejun Heo
    Signed-off-by: Andrew Morton
    Signed-off-by: Jens Axboe

    Dan Williams
     
  • It just obscures that the netdevice pointer and the expires value are
    implemented in the dst_entry sub-object of the ipv6 route.

    And it makes grepping for dst_entry member uses much harder too.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Also, create and use an rt6_bind_neighbour() in net/ipv6/route.c to
    consolidate some common logic.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • In order to perform a proper universal hash on a vector of integers,
    we have to use different universal hashes on each vector element.

    Which means we need 4 different hash randoms for ipv6.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • Check setsockopt arguments to avoid overflows and return -EINVAL for
    too large arguments.

    Signed-off-by: Xi Wang
    Signed-off-by: David S. Miller

    Xi Wang
     
  • Commit be639ac6 ("NET: AX.25: Check ioctl arguments to avoid overflows
    further down the road") rejects very large arguments, but doesn't
    completely fix overflows on 64-bit systems. Consider the AX25_T2 case.

    int opt;
    ...
    if (opt < 1 || opt > ULONG_MAX / HZ) {
    res = -EINVAL;
    break;
    }
    ax25->t2 = opt * HZ;

    The 32-bit multiplication opt * HZ would overflow before being assigned
    to 64-bit ax25->t2. This patch changes "opt" to unsigned long.

    Signed-off-by: Xi Wang
    Cc: Ralf Baechle
    Signed-off-by: David S. Miller

    Xi Wang
     
  • When testing L2TP support, I discovered that the l2tp module is not autoloaded
    as are other netlink interfaces. There is because of lack of hook in genetlink to call
    request_module and load the module.

    Signed-off-by: Stephen Hemminger
    Signed-off-by: David S. Miller

    Stephen Hemminger
     
  • The route we have here is for the address being added to the interface,
    ie. for input packet processing.

    Therefore using that route to determine whether an output nexthop gateway
    is known and resolved doesn't make any sense.

    So, simply remove this test, it never triggered anyways.

    Signed-off-by: David S. Miller
    Acked-By: Neil Horman

    David Miller
     
  • David S. Miller
     
  • On extensive NFS boots on a mx6qsabrelite board it was noted that "FEC: MDIO read timeout" were occuring,
    which caused failure on loading the FEC driver.

    The original FEC_MII_TIMEOUT was set to 1 ms, which is too low when passed to the usecs_to_jiffies macro.

    On ARM one jiffy is 10ms, so use a timeout of 30ms, which corresponds to 3 jiffies.

    After running extensive NFS boots, the MDIO timeouts do not occur anymore with this change.

    Signed-off-by: Rogerio Pimentel
    Signed-off-by: Fabio Estevam
    Signed-off-by: David S. Miller

    Rogerio Pimentel
     

28 Dec, 2011

12 commits