20 Sep, 2010

1 commit

  • This patch adds AEAD support into the cryptd framework. Having AEAD
    support in cryptd enables crypto drivers that use the AEAD
    interface type (such as the patch for AEAD based RFC4106 AES-GCM
    implementation using Intel New Instructions) to leverage cryptd for
    asynchronous processing.

    Signed-off-by: Adrian Hoban
    Signed-off-by: Tadeusz Struk
    Signed-off-by: Gabriele Paoloni
    Signed-off-by: Aidan O'Mahony
    Signed-off-by: Herbert Xu

    Adrian Hoban
     

17 Feb, 2010

1 commit

  • Add __percpu sparse annotations to places which didn't make it in one
    of the previous patches. All converions are trivial.

    These annotations are to make sparse consider percpu variables to be
    in a different address space and warn if accessed without going
    through percpu accessors. This patch doesn't affect normal builds.

    Signed-off-by: Tejun Heo
    Acked-by: Borislav Petkov
    Cc: Dan Williams
    Cc: Huang Ying
    Cc: Len Brown
    Cc: Neil Brown

    Tejun Heo
     

15 Dec, 2009

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
    m68k: rename global variable vmalloc_end to m68k_vmalloc_end
    percpu: add missing per_cpu_ptr_to_phys() definition for UP
    percpu: Fix kdump failure if booted with percpu_alloc=page
    percpu: make misc percpu symbols unique
    percpu: make percpu symbols in ia64 unique
    percpu: make percpu symbols in powerpc unique
    percpu: make percpu symbols in x86 unique
    percpu: make percpu symbols in xen unique
    percpu: make percpu symbols in cpufreq unique
    percpu: make percpu symbols in oprofile unique
    percpu: make percpu symbols in tracer unique
    percpu: make percpu symbols under kernel/ and mm/ unique
    percpu: remove some sparse warnings
    percpu: make alloc_percpu() handle array types
    vmalloc: fix use of non-existent percpu variable in put_cpu_var()
    this_cpu: Use this_cpu_xx in trace_functions_graph.c
    this_cpu: Use this_cpu_xx for ftrace
    this_cpu: Use this_cpu_xx in nmi handling
    this_cpu: Use this_cpu operations in RCU
    this_cpu: Use this_cpu ops for VM statistics
    ...

    Fix up trivial (famous last words) global per-cpu naming conflicts in
    arch/x86/kvm/svm.c
    mm/slab.c

    Linus Torvalds
     

19 Oct, 2009

1 commit

  • PCLMULQDQ is used to accelerate the most time-consuming part of GHASH,
    carry-less multiplication. More information about PCLMULQDQ can be
    found at:

    http://software.intel.com/en-us/articles/carry-less-multiplication-and-its-usage-for-computing-the-gcm-mode/

    Because PCLMULQDQ changes XMM state, its usage must be enclosed with
    kernel_fpu_begin/end, which can be used only in process context, the
    acceleration is implemented as crypto_ahash. That is, request in soft
    IRQ context will be defered to the cryptd kernel thread.

    Signed-off-by: Huang Ying
    Signed-off-by: Herbert Xu

    Huang Ying
     

03 Oct, 2009

1 commit


06 Aug, 2009

1 commit


22 Jul, 2009

1 commit


15 Jul, 2009

1 commit


14 Jul, 2009

4 commits


02 Jun, 2009

1 commit

  • Use crypto_alloc_base() instead of crypto_alloc_ablkcipher() to
    allocate underlying tfm in cryptd_alloc_ablkcipher. Because
    crypto_alloc_ablkcipher() prefer GENIV encapsulated crypto instead of
    raw one, while cryptd_alloc_ablkcipher needed the raw one.

    Signed-off-by: Huang Ying
    Signed-off-by: Herbert Xu

    Huang Ying
     

19 Feb, 2009

1 commit

  • Original cryptd thread implementation has scalability issue, this
    patch solve the issue with a per-CPU thread implementation.

    struct cryptd_queue is defined to be a per-CPU queue, which holds one
    struct cryptd_cpu_queue for each CPU. In struct cryptd_cpu_queue, a
    struct crypto_queue holds all requests for the CPU, a struct
    work_struct is used to run all requests for the CPU.

    Testing based on dm-crypt on an Intel Core 2 E6400 (two cores) machine
    shows 19.2% performance gain. The testing script is as follow:

    -------------------- script begin ---------------------------
    #!/bin/sh

    dmc_create()
    {
    # Create a crypt device using dmsetup
    dmsetup create $2 --table "0 `blockdev --getsize $1` crypt cbc(aes-asm)?cryptd?plain:plain babebabebabebabebabebabebabebabe 0 $1 0"
    }

    dmsetup remove crypt0
    dmsetup remove crypt1

    dd if=/dev/zero of=/dev/ram0 bs=1M count=4 >& /dev/null
    dd if=/dev/zero of=/dev/ram1 bs=1M count=4 >& /dev/null

    dmc_create /dev/ram0 crypt0
    dmc_create /dev/ram1 crypt1

    cat >tr.sh <& /dev/null &
    dd if=/dev/dm-1 of=/dev/null >& /dev/null &
    done
    wait
    EOF

    for n in $(seq 10); do
    /usr/bin/time sh tr.sh
    done
    rm tr.sh
    -------------------- script end ---------------------------

    The separator of dm-crypt parameter is changed from "-" to "?", because
    "-" is used in some cipher driver name too, and cryptds need to specify
    cipher driver name instead of cipher name.

    The test result on an Intel Core2 E6400 (two cores) is as follow:

    without patch:
    -----------------wo begin --------------------------
    0.04user 0.38system 0:00.39elapsed 107%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6566minor)pagefaults 0swaps
    0.07user 0.35system 0:00.35elapsed 121%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6567minor)pagefaults 0swaps
    0.06user 0.34system 0:00.30elapsed 135%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6562minor)pagefaults 0swaps
    0.05user 0.37system 0:00.36elapsed 119%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6607minor)pagefaults 0swaps
    0.06user 0.36system 0:00.35elapsed 120%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6562minor)pagefaults 0swaps
    0.05user 0.37system 0:00.31elapsed 136%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6594minor)pagefaults 0swaps
    0.04user 0.34system 0:00.30elapsed 126%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6597minor)pagefaults 0swaps
    0.06user 0.32system 0:00.31elapsed 125%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6571minor)pagefaults 0swaps
    0.06user 0.34system 0:00.31elapsed 134%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6581minor)pagefaults 0swaps
    0.05user 0.38system 0:00.31elapsed 138%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6600minor)pagefaults 0swaps
    -----------------wo end --------------------------

    with patch:
    ------------------w begin --------------------------
    0.02user 0.31system 0:00.24elapsed 141%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6554minor)pagefaults 0swaps
    0.05user 0.34system 0:00.31elapsed 127%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6606minor)pagefaults 0swaps
    0.07user 0.33system 0:00.26elapsed 155%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6559minor)pagefaults 0swaps
    0.07user 0.32system 0:00.26elapsed 151%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6562minor)pagefaults 0swaps
    0.05user 0.34system 0:00.26elapsed 150%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6603minor)pagefaults 0swaps
    0.03user 0.36system 0:00.31elapsed 124%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6562minor)pagefaults 0swaps
    0.04user 0.35system 0:00.26elapsed 147%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6586minor)pagefaults 0swaps
    0.03user 0.37system 0:00.27elapsed 146%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6562minor)pagefaults 0swaps
    0.04user 0.36system 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6594minor)pagefaults 0swaps
    0.04user 0.35system 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (0major+6557minor)pagefaults 0swaps
    ------------------w end --------------------------

    The middle value of elapsed time is:
    wo cryptwq: 0.31
    w cryptwq: 0.26

    The performance gain is about (0.31-0.26)/0.26 = 0.192.

    Signed-off-by: Huang Ying
    Signed-off-by: Herbert Xu

    Huang Ying
     

18 Feb, 2009

1 commit


10 Jul, 2008

3 commits


01 May, 2008

1 commit


08 Feb, 2008

1 commit


11 Jan, 2008

2 commits

  • If the underlying algorithm specifies a specific geniv algorithm then
    we should use it for the cryptd version as well.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • Up until now we have ablkcipher algorithms have been identified as
    type BLKCIPHER with the ASYNC bit set. This is suboptimal because
    ablkcipher refers to two things. On the one hand it refers to the
    top-level ablkcipher interface with requests. On the other hand it
    refers to and algorithm type underneath.

    As it is you cannot request a synchronous block cipher algorithm
    with the ablkcipher interface on top. This is a problem because
    we want to be able to eventually phase out the blkcipher top-level
    interface.

    This patch fixes this by making ABLKCIPHER its own type, just as
    we have distinct types for HASH and DIGEST. The type it associated
    with the algorithm implementation only.

    Which top-level interface is used for synchronous block ciphers is
    then determined by the mask that's used. If it's a specific mask
    then the old blkcipher interface is given, otherwise we go with the
    new ablkcipher interface.

    Signed-off-by: Herbert Xu

    Herbert Xu
     

11 Oct, 2007

1 commit


31 May, 2007

1 commit


02 May, 2007

1 commit