28 Jun, 2010

2 commits

  • This patch updates the percpu allocator so that it can serve a limited
    amount of allocations before slab comes online. This is primarily to
    allow slab to depend on a working percpu allocator.

    Two parameters, PERCPU_DYNAMIC_EARLY_SIZE and PERCPU_DYNAMIC_EARLY_SLOTS,
    determine how much memory space and how many allocation map slots are
    reserved. If this reserved area is exhausted, WARN_ON_ONCE() will trigger
    and allocation will fail until slab comes online.

    The following changes are made to implement early alloc.

    * pcpu_mem_alloc() now checks slab_is_available()

    * Chunks are allocated using pcpu_mem_alloc()

    * Init paths make sure ai->dyn_size is at least as large as
    PERCPU_DYNAMIC_EARLY_SIZE.

    * Initial alloc maps are allocated in __initdata and copied to
    kmalloc'd areas once slab is online.

    Signed-off-by: Tejun Heo
    Cc: Christoph Lameter

    Tejun Heo
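
    The early-alloc behaviour described in this commit can be sketched in
    userspace C. This is only an illustrative model, not the kernel code:
    EARLY_POOL_SIZE, early_pool, slab_online and sketch_alloc() are
    hypothetical names, malloc() stands in for the slab-backed path, and
    alignment and allocation-map slots are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical size for this sketch; not the kernel's value. */
#define EARLY_POOL_SIZE 256

static unsigned char early_pool[EARLY_POOL_SIZE]; /* reserved up front */
static size_t early_used;                         /* bytes handed out so far */
static bool slab_online;   /* stand-in for slab_is_available() */
static bool warned_once;   /* stand-in for WARN_ON_ONCE() state */

/*
 * Serve requests from the reserved pool while "slab" is offline and from
 * malloc() once it is online. When the pool is exhausted, warn once and
 * fail until slab_online becomes true.
 */
void *sketch_alloc(size_t size)
{
    if (slab_online)
        return malloc(size);

    if (size > EARLY_POOL_SIZE - early_used) {
        if (!warned_once) {
            fprintf(stderr, "early pool exhausted\n");
            warned_once = true;
        }
        return NULL;
    }

    void *p = &early_pool[early_used];
    early_used += size;
    assert(early_used <= EARLY_POOL_SIZE);
    return p;
}
```

    In this model, requests that fit in the pool succeed before slab_online
    is set, an oversized request fails with a single warning, and the same
    request succeeds once slab_online is true.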
     
  • In pcpu_build_alloc_info() and pcpu_embed_first_chunk(), @dyn_size was
    ssize_t: -1 meant auto-size, 0 forced 0, and positive meant minimum
    size. There's no use case for forcing 0 and the upcoming early alloc
    support always requires non-zero dynamic size. Make @dyn_size always
    mean minimum dyn_size.

    While at it, make pcpu_build_alloc_info() static, since it has no
    external callers, as suggested by David Rientjes.

    Signed-off-by: Tejun Heo
    Cc: David Rientjes

    Tejun Heo
     

18 Jun, 2010

1 commit

  • per_cpu_ptr_to_phys() determines whether the passed in @addr belongs
    to the first_chunk or not by just matching the address against the
    address range of the base unit (unit0, used by cpu0). When an address
    from another cpu is passed in, it always determines that the
    address doesn't belong to the first chunk even when it does. This
    makes the function return a bogus physical address, which may lead to
    a crash.

    This problem was discovered by Cliff Wickman while investigating a
    crash during kdump on an SGI UV system.

    Signed-off-by: Tejun Heo
    Reported-by: Cliff Wickman
    Tested-by: Cliff Wickman
    Cc: stable@kernel.org

    Tejun Heo
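
    The difference between matching only unit0 and matching every unit can
    be illustrated with a small userspace sketch. The layout, sizes and
    function names below are hypothetical and only model the range check
    described above; they are not the kernel's data structures or its
    actual fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flawed check from the description: only unit0's range is considered. */
bool in_unit0_only(const uintptr_t *unit_start, size_t unit_size,
                   uintptr_t addr)
{
    assert(unit_start != NULL);
    return addr >= unit_start[0] && addr < unit_start[0] + unit_size;
}

/* Intended check: the address may fall into any CPU's unit. */
bool in_any_unit(const uintptr_t *unit_start, size_t nr_units,
                 size_t unit_size, uintptr_t addr)
{
    assert(unit_start != NULL);
    for (size_t cpu = 0; cpu < nr_units; cpu++) {
        if (addr >= unit_start[cpu] && addr < unit_start[cpu] + unit_size)
            return true;
    }
    return false;
}
```

    With hypothetical unit starts of 0x10000, 0x20000, 0x30000 and 0x40000
    and a unit size of 0x1000, the address 0x30010 is rejected by
    in_unit0_only() but accepted by in_any_unit().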
     

17 Jun, 2010

1 commit


12 Jun, 2010

27 commits


11 Jun, 2010

9 commits

  • When we use remap_file_pages() to remap a file, remap_file_pages() always
    returns an error. This is because btrfs didn't set VM_CAN_NONLINEAR for
    the vma.

    Signed-off-by: Miao Xie
    Signed-off-by: Chris Mason

    Miao Xie
     
  • refs can be used with uninitialized data if btrfs_lookup_extent_info()
    fails on the first pass through the loop. In the original code, if that
    happens then check_path_shared() probably returns 1. This patch
    changes it to always return 1 in that case, for safety.

    Signed-off-by: Dan Carpenter
    Signed-off-by: Chris Mason

    Dan Carpenter
     
  • Seems that when btrfs_fallocate was converted to use the new ENOSPC stuff we
    dropped passing the mode to the function that actually does the preallocation.
    This breaks anybody who wants to use FALLOC_FL_KEEP_SIZE. Thanks,

    Signed-off-by: Josef Bacik
    Signed-off-by: Chris Mason

    Josef Bacik
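
    From userspace, the effect of FALLOC_FL_KEEP_SIZE can be checked with
    a short sketch like the one below: space is preallocated, but the
    reported file size should stay unchanged. The helper name and the
    scratch path under /tmp are assumptions made for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Preallocate `len` bytes with FALLOC_FL_KEEP_SIZE on a fresh temporary
 * file and report the file size afterwards. Returns -1 if any step fails,
 * for example when the file system does not support fallocate().
 */
long size_after_keep_size_prealloc(off_t len)
{
    char path[] = "/tmp/keep_size_XXXXXX"; /* hypothetical scratch location */
    struct stat st;
    long result = -1;

    assert(len > 0);

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;

    if (fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, len) == 0 &&
        fstat(fd, &st) == 0)
        result = (long)st.st_size;

    close(fd);
    unlink(path);
    return result;
}
```

    On a file system that honours the flag, the helper returns 0 for a
    newly created file, since preallocation must not change st_size.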
     
  • We cannot use a loop device which has been connected to a file in btrfs.

    The steps to reproduce are as follows:
    # dd if=/dev/zero of=vdev0 bs=1M count=1024
    # losetup /dev/loop0 vdev0
    # mkfs.btrfs /dev/loop0
    ...
    failed to zero device start -5

    The reason is that btrfs implements neither ->write_begin nor ->write of
    the VFS API, so we fix it by setting ->write to do_sync_write().

    Signed-off-by: Miao Xie
    Signed-off-by: Chris Mason

    Miao Xie
     
  • Fix a race at the end of NAPI complete processing: the driver should
    call __napi_complete() first, before re-enabling interrupts.

    Signed-off-by: Figo.zhang

    Signed-off-by: David S. Miller

    Figo.zhang
     
  • This patch corrects a bug in the delay of pktgen.
    It makes sure the inter-packet interval is accurate.

    Signed-off-by: Daniel Turull
    Signed-off-by: Robert Olsson
    Signed-off-by: David S. Miller

    Daniel Turull
     
  • gen_kill_estimator() / gen_new_estimator() are not always called with
    RTNL held.

    net/netfilter/xt_RATEEST.c is one user of this API that does not hold
    RTNL, so random corruption can occur between "tc" and "iptables".

    Add a new fine grained lock instead of trying to use RTNL in netfilter.

    Signed-off-by: Eric Dumazet
    Signed-off-by: David S. Miller

    Eric Dumazet
     
  • Currently, the accelerated receive path for VLANs will
    drop packets if the real device is an inactive slave and
    the packet is not one of the special pkts tested for in
    skb_bond_should_drop(). This behavior is different from
    the non-accelerated path and for pkts over a bonded vlan.

    For example,

    vlanx -> bond0 -> ethx

    will be dropped in the vlan path and not delivered to any
    packet handlers at all. However,

    bond0 -> vlanx -> ethx

    and

    bond0 -> ethx

    will be delivered to handlers that match the exact dev,
    because the VLAN path checks the real_dev, which is not a
    slave, and netif_receive_skb() doesn't drop frames but only
    delivers them to exact matches.

    This patch adds an sk_buff flag which is used for tagging
    skbs that would previously have been dropped and allows the
    skb to continue to netif_receive_skb(). There we add
    logic to check for the deliver_no_wcard flag and, if it
    is set, only deliver to handlers that match exactly. This
    makes both paths above consistent and gives pkt handlers
    a way to identify skbs that come from inactive slaves.
    Without this patch, in some configurations skbs will be
    delivered to handlers with exact matches and in others
    be dropped outright in the vlan path.

    I have tested the following 4 configurations in failover modes
    and load balancing modes.

    # bond0 -> ethx

    # vlanx -> bond0 -> ethx

    # bond0 -> vlanx -> ethx

    # bond0 -> ethx
               |
      vlanx -> --

    Signed-off-by: John Fastabend
    Signed-off-by: David S. Miller

    John Fastabend
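
    The delivery rule described in this commit, where a tagged skb reaches
    only exact-match handlers while an untagged skb also reaches wildcard
    handlers, can be modelled in a few lines. The struct layouts, the
    WILDCARD marker and count_deliveries() are hypothetical simplifications
    and not the kernel's packet_type handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WILDCARD (-1)   /* hypothetical marker: handler bound to no device */

struct handler {
    int dev;            /* device id the handler is bound to, or WILDCARD */
};

struct packet {
    int dev;                /* device the packet arrived on */
    bool deliver_no_wcard;  /* set for packets from inactive slaves */
};

/* Count the handlers that would receive pkt under the rule above. */
size_t count_deliveries(const struct handler *h, size_t n,
                        const struct packet *pkt)
{
    size_t delivered = 0;

    assert(h != NULL && pkt != NULL);
    for (size_t i = 0; i < n; i++) {
        bool exact = h[i].dev == pkt->dev;
        bool wildcard = h[i].dev == WILDCARD;

        /* Wildcard handlers are skipped when the flag is set. */
        if (exact || (wildcard && !pkt->deliver_no_wcard))
            delivered++;
    }
    return delivered;
}
```

    With one wildcard handler and one handler bound to the receiving
    device, an untagged packet is counted twice and a tagged packet once.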
     
  • If we have enough memory to allocate a new cap release message, do so, so
    that we can send a partial release message immediately. This keeps us from
    making the MDS wait when the cap release it needs is in a partially full
    release message.

    If we fail because of ENOMEM, oh well, they'll just have to wait a bit
    longer.

    Signed-off-by: Sage Weil

    Sage Weil
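
    One reading of this commit is an opportunistic allocation: the partial
    message is sent only if a replacement can be allocated, and otherwise
    it simply stays queued. The sketch below models that reading in
    userspace C. struct release_msg, the allocator callback and
    take_partial_for_send() are hypothetical and are not taken from the
    ceph client code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct release_msg {
    int used;   /* cap releases queued in this message so far */
};

typedef void *(*alloc_fn)(size_t);

/* Allocator that always fails, to simulate ENOMEM in the sketch. */
void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}

/*
 * If a replacement message can be allocated, hand back the partially
 * filled message for immediate sending and install the empty replacement
 * in *cur. Otherwise leave *cur in place and return NULL, so the partial
 * message keeps waiting.
 */
struct release_msg *take_partial_for_send(struct release_msg **cur,
                                          alloc_fn alloc)
{
    assert(cur != NULL && *cur != NULL);

    if ((*cur)->used == 0)
        return NULL;                /* nothing to send */

    struct release_msg *fresh = alloc(sizeof(*fresh));
    if (fresh == NULL)
        return NULL;                /* allocation failed: keep waiting */

    fresh->used = 0;
    struct release_msg *partial = *cur;
    *cur = fresh;
    return partial;
}
```

    Passing failing_alloc leaves the current message untouched, while
    passing malloc returns the partial message and leaves an empty one in
    its place.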