22 Sep, 2016

8 commits

  • The current internal buffer is far too large for the crypto core, so
    shrink it. This also lets the buffer fit into the space reserved for
    the export/import buffers.

    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Tero Kristo
     
  • Now that the driver has been converted to use scatterlists for data
    handling, add proper implementation for the export/import stubs also.

    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Tero Kristo
     
  • Currently, the internal buffer has been used for data transmission. Change
    this so that scatterlists are used instead, and change the driver to
    actually use the previously introduced helper functions for scatterlist
    preparation.

    This patch also removes the old buffer handling code which is no longer
    needed.

    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Tero Kristo
     
  • Until now the threshold value was hardcoded in the driver. Having a define
    for it makes it easier to configure.

    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Tero Kristo
     
  • Currently omap-sham uses a huge internal buffer for caching data, and
    pushes it out to the DMA in large chunks. This, unfortunately,
    doesn't work too well with the export/import functionality required
    for ahash algorithms, and must be changed towards a more
    scatterlist-centric approach.

    This patch adds support functions for (mostly) scatterlist based data
    handling. omap_sham_prepare_request() prepares a scatterlist for DMA
    transfer to SHA crypto accelerator. This requires checking the data /
    offset / length alignment of the data, splitting the data to SHA block
    size granularity, and adding any remaining data back to the buffer.
    With this patch, the code doesn't actually go live yet; the support code
    will be put to proper use by additional patches that modify the
    SHA driver functionality itself.

    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Tero Kristo
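
The block-size splitting described in this entry can be sketched in plain C. This is only an illustration under simplifying assumptions: `plan_split()` and `struct split_plan` are hypothetical names, not taken from the driver, and the alignment checks mentioned in the commit message are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result of splitting a request; not the driver's own struct. */
struct split_plan {
    size_t xmit_len;  /* bytes handed to the accelerator, a multiple of the block size */
    size_t carry_len; /* leftover bytes put back into the buffer */
};

/*
 * Split (buffered + incoming) bytes into a part that is a whole number of
 * SHA blocks and a remainder that stays buffered for the next update.
 */
static struct split_plan plan_split(size_t buffered, size_t incoming,
                                    size_t block_size)
{
    struct split_plan p;
    size_t total = buffered + incoming;

    assert(block_size > 0);

    p.carry_len = total % block_size;
    p.xmit_len = total - p.carry_len;
    return p;
}
```

With 10 bytes already buffered, 150 new bytes and a 64-byte block, this sends 128 bytes and carries 32 over.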
     
  • The current usage of sgl will be deprecated, and will be replaced by an
    array required by the sg based driver implementation. Rename the existing
    variable as sgl_tmp so that it can be removed from the driver easily later.

    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Tero Kristo
     
  • OMAP HW generally expects data for DMA to be on a word boundary, so make
    the SHA driver inform the crypto framework of the same preference.

    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Tero Kristo
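
A word-boundary preference is commonly expressed as a bit mask over the low address bits. The sketch below assumes 32-bit words, so the mask is 3; `WORD_ALIGNMASK` and both helper names are made up for illustration and are not the driver's identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed mask for 32-bit word alignment: the low two address bits must be zero. */
#define WORD_ALIGNMASK ((uintptr_t)0x3)

/* Return nonzero when addr sits on a word boundary. */
static int is_word_aligned(uintptr_t addr)
{
    return (addr & WORD_ALIGNMASK) == 0;
}

/* Round addr up to the next word boundary (addr itself if already aligned). */
static uintptr_t word_align_up(uintptr_t addr)
{
    assert(addr <= UINTPTR_MAX - WORD_ALIGNMASK); /* avoid wrap-around */
    return (addr + WORD_ALIGNMASK) & ~WORD_ALIGNMASK;
}
```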
     
  • Initially the export/import stubs just return -ENOTSUPP to indicate that
    they don't really do anything yet. Some sort of implementation is required
    for the driver to at least probe.

    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Tero Kristo
     

13 Sep, 2016

3 commits

  • If software fallback is used on an older hardware accelerator setup
    (OMAP2/OMAP3), the first block of data must be purged from the buffer.
    The first block contains the pre-generated ipad value required by the HW,
    but the software fallback algorithm generates its own, causing wrong
    results.

    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Tero Kristo
     
  • If we have processed any data with the hardware accelerator (digcnt > 0),
    we must complete the entire hash by using it. This is because the current
    hash value can't be imported to the software fallback algorithm. Otherwise
    we end up with wrong hash results.

    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Tero Kristo
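
The rule in this entry, combined with the small-request fallback described elsewhere in this log, can be sketched as a single predicate. `FALLBACK_THRESHOLD` and its value are hypothetical; the log does not state the threshold the driver uses.

```c
#include <stddef.h>

/* Hypothetical cut-off in bytes; the real value is not given in this log. */
#define FALLBACK_THRESHOLD 256

/*
 * Decide whether a request may go to the software fallback.
 * digcnt is the number of bytes already digested by the hardware.
 */
static int use_sw_fallback(size_t req_len, size_t digcnt)
{
    /* Once the hardware holds partial state, the hash must finish there. */
    if (digcnt > 0)
        return 0;

    /* Otherwise only small requests are worth handling in software. */
    return req_len < FALLBACK_THRESHOLD;
}
```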
     
  • Some of the call paths of OMAP SHA driver can avoid executing the next
    step of the crypto queue under tasklet; instead, execute the next step
    directly via function call. This avoids a costly round-trip via the
    scheduler giving a slight performance boost.

    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Tero Kristo
     

01 Jul, 2016

1 commit


24 Jun, 2016

4 commits

  • Add software fallback support for small crypto requests. In these cases,
    it is undesirable to use DMA, as setting it up is itself a rather heavy
    operation. This gives about 40% extra performance in the IPsec use case.

    Signed-off-by: Bin Liu
    [t-kristo@ti.com: dropped the extra traces, updated some comments
    on the code]
    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Bin Liu
     
  • The extra call to dmaengine_terminate_all is not needed, as the DMA
    is not running at this point. This improves performance slightly.

    Signed-off-by: Lokesh Vutla
    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Lokesh Vutla
     
  • Change crypto queue size from 1 to 10 for omap SHA driver. This should
    allow clients to enqueue requests more effectively to avoid serializing
    whole crypto sequences, giving extra performance.

    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Tero Kristo
     
  • Calling the runtime PM API for every block causes a serious performance
    hit for crypto operations that are done on a long buffer. As crypto is
    performed on a page boundary, encrypting large buffers can cause a series
    of crypto operations divided by page. The runtime PM API is also called
    that many times.

    Convert the driver to use runtime_pm autosuspend instead, with a default
    timeout value of 1 second. This results in up to ~50% speedup.

    Signed-off-by: Tero Kristo
    Signed-off-by: Herbert Xu

    Tero Kristo
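
To see why an autosuspend delay helps, here is a toy userspace model rather than kernel code. It treats each request as instantaneous and counts how often the device would have to be resumed for a given idle delay. `count_resumes()` is a hypothetical helper and the timestamps in the note below are invented.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count device resumes for requests arriving at t_ms[0..n-1] (milliseconds,
 * ascending). The device suspends once it has been idle for more than
 * delay_ms; delay_ms == 0 models suspending right after every request.
 */
static unsigned count_resumes(const unsigned long *t_ms, size_t n,
                              unsigned long delay_ms)
{
    unsigned resumes = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        /* Timestamps must be sorted in ascending order. */
        assert(i == 0 || t_ms[i] >= t_ms[i - 1]);

        /* The first request, or one arriving after the idle delay, resumes. */
        if (i == 0 || t_ms[i] - t_ms[i - 1] > delay_ms)
            resumes++;
    }
    return resumes;
}
```

For requests at 0, 5, 10, 15 and 2000 ms, the model resumes the device five times with no delay and twice with a 1000 ms delay.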
     

19 May, 2016

1 commit


03 May, 2016

1 commit


17 Aug, 2015

1 commit


18 May, 2015

1 commit


15 May, 2015

1 commit


03 Apr, 2015

1 commit

  • kmap_atomic() gives only the page address of the input page.
    The driver must add the offset of the scatterlist within the page
    to the returned page address.
    The omap-sham driver does not add the offset and operates directly
    on the return value of kmap_atomic(), which causes the following
    error when running the crypto tests:

    00000000: d9 a1 1b 7c aa 90 3b aa 11 ab cb 25 00 b8 ac bf
    [ 2.338169] 00000010: c1 39 cd ff 48 d0 a8 e2 2b fa 33 a1
    [ 2.344008] alg: hash: Chunking test 1 failed for omap-sha256

    Fix this by adding the scatterlist offset to vaddr.

    Signed-off-by: Lokesh Vutla
    Signed-off-by: Herbert Xu

    Vutla, Lokesh
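
The bug class described here can be shown with a userspace stand-in. `struct sketch_sg` and `sketch_sg_data()` are hypothetical and only mimic the idea of a mapped page plus an in-page offset; they are not the kernel's scatterlist API.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size for the sketch */

/* Hypothetical stand-in for one scatterlist entry. */
struct sketch_sg {
    unsigned char *page_base; /* what mapping the page returns: start of the page */
    size_t offset;            /* where this entry's data begins within the page */
    size_t length;            /* number of data bytes in this entry */
};

/* Return a pointer to the entry's data: page start plus in-page offset. */
static const unsigned char *sketch_sg_data(const struct sketch_sg *sg)
{
    assert(sg->offset + sg->length <= SKETCH_PAGE_SIZE);
    return sg->page_base + sg->offset;
}
```

Using `page_base` alone would read from the start of the page, which is the wrong data whenever the offset is nonzero.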
     

01 Apr, 2015

1 commit

  • omap_sham_handle_queue() can be called as part of the done_task tasklet.
    In that case it runs in atomic context, and any calls to pm functions
    must not sleep.

    But there is a call to pm_runtime_get_sync() (which can sleep) in
    omap_sham_handle_queue(), because of which the following appears:
    " [ 116.169969] BUG: scheduling while atomic: kworker/0:2/2676/0x00000100"

    Add pm_runtime_irq_safe() to avoid this.

    Signed-off-by: Lokesh Vutla
    Signed-off-by: Herbert Xu

    Vutla, Lokesh
     

20 Oct, 2014

1 commit


14 Oct, 2014

1 commit

  • Replaced the use of a Variable Length Array In Struct (VLAIS) with a C99
    compliant equivalent. This patch allocates the appropriate amount of memory
    as a char array via the SHASH_DESC_ON_STACK macro.

    The new code can be compiled with both gcc and clang.

    Signed-off-by: Behan Webster
    Reviewed-by: Mark Charlebois
    Reviewed-by: Jan-Simon Möller
    Acked-by: Herbert Xu

    Behan Webster
     

10 Mar, 2014

2 commits


20 Dec, 2013

1 commit

  • The command "tcrypt sec=1 mode=403" gives the following error in polling
    mode:
    root@am335x-evm:/# insmod tcrypt.ko sec=1 mode=403
    [...]

    [ 346.982754] test 15 ( 4096 byte blocks, 1024 bytes per update, 4 updates): 4352 opers/sec, 17825792 bytes/sec
    [ 347.992661] test 16 ( 4096 byte blocks, 4096 bytes per update, 1 updates): 7095 opers/sec, 29061120 bytes/sec
    [ 349.002667] test 17 ( 8192 byte blocks, 16 bytes per update, 512 updates):
    [ 349.010882] Unable to handle kernel NULL pointer dereference at virtual address 00000000
    [ 349.020037] pgd = ddeac000
    [ 349.022884] [00000000] *pgd=9dcb4831, *pte=00000000, *ppte=00000000
    [ 349.029816] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
    [ 349.035482] Modules linked in: tcrypt(+)
    [ 349.039617] CPU: 0 PID: 1473 Comm: insmod Not tainted 3.12.4-01566-g6279006-dirty #38
    [ 349.047832] task: dda91540 ti: ddcd2000 task.ti: ddcd2000
    [ 349.053517] PC is at omap_sham_xmit_dma+0x6c/0x238
    [ 349.058544] LR is at omap_sham_xmit_dma+0x38/0x238
    [ 349.063570] pc : [] lr : [] psr: 20000013
    [ 349.063570] sp : ddcd3c78 ip : 00000000 fp : 9d8980b8
    [ 349.075610] r10: 00000000 r9 : 00000000 r8 : 00000000
    [ 349.081090] r7 : 00001000 r6 : dd898000 r5 : 00000040 r4 : ddb10550
    [ 349.087935] r3 : 00000004 r2 : 00000010 r1 : 53100080 r0 : 00000000
    [ 349.094783] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
    [ 349.102268] Control: 10c5387d Table: 9deac019 DAC: 00000015
    [ 349.108294] Process insmod (pid: 1473, stack limit = 0xddcd2248)

    [...]

    This is because polling_mode is not enabled for ctx without FLAGS_FINUP.

    For polling mode the bufcnt is set to 0 unconditionally, but it should be
    set to 0 only for a final update or when total is not zero (this condition
    is similar to what is done in the DMA case). Because of this, wrong hashes
    are produced.

    Fix these issues.

    Signed-off-by: Lokesh Vutla
    Signed-off-by: Herbert Xu

    Lokesh Vutla
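
The corrected bufcnt condition from this entry can be written as a small pure function. The name `next_bufcnt()` and its parameters are illustrative, not the driver's.

```c
#include <stddef.h>

/*
 * Return the buffer count to keep after a transfer in polling mode.
 * The buffer is cleared only for a final update or when more data
 * (total != 0) is still pending; otherwise the buffered bytes are kept.
 */
static size_t next_bufcnt(size_t bufcnt, int final, size_t total)
{
    if (final || total != 0)
        return 0;
    return bufcnt;
}
```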
     

05 Dec, 2013

1 commit

  • In omap_sham_probe() and omap_sham_remove(), 'dd->dma_lch'
    is released without checking to see if it was successfully
    requested or not. This is a bug and was identified and
    reported by Dan Carpenter here:

    http://www.spinics.net/lists/devicetree/msg11023.html

    Add code to only release 'dd->dma_lch' when it is not NULL
    (that is, when it was successfully requested).

    Reported-by: Dan Carpenter
    CC: Joel Fernandes
    Signed-off-by: Mark A. Greer
    Signed-off-by: Herbert Xu

    Mark A. Greer
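
The guard described above is sketched below with stand-in types. `struct sketch_dev`, its `releases` counter and `sketch_release_channel()` are hypothetical; the real driver would hand the channel back to the DMA layer at the marked spot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device state; only dma_lch mirrors a name from the log. */
struct sketch_dev {
    void *dma_lch; /* NULL when the channel was never successfully requested */
    int releases;  /* counts actual releases, for illustration only */
};

/* Release the channel only if one was actually obtained. */
static void sketch_release_channel(struct sketch_dev *dd)
{
    assert(dd != NULL);

    if (dd->dma_lch) {
        /* The real release call would go here. */
        dd->dma_lch = NULL;
        dd->releases++;
    }
}
```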
     

24 Nov, 2013

1 commit

  • Pull crypto update from Herbert Xu:
    - Made x86 ablk_helper generic for ARM
    - Phase out chainiv in favour of eseqiv (affects IPsec)
    - Fixed aes-cbc IV corruption on s390
    - Added constant-time crypto_memneq which replaces memcmp
    - Fixed aes-ctr in omap-aes
    - Added OMAP3 ROM RNG support
    - Add PRNG support for MSM SoC's
    - Add and use Job Ring API in caam
    - Misc fixes

    [ NOTE! This pull request was sent within the merge window, but Herbert
    has some questionable email sending setup that makes him public enemy
    #1 as far as gmail is concerned. So most of his emails seem to be
    trapped by gmail as spam, resulting in me not seeing them. - Linus ]

    * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (49 commits)
    crypto: s390 - Fix aes-cbc IV corruption
    crypto: omap-aes - Fix CTR mode counter length
    crypto: omap-sham - Add missing modalias
    padata: make the sequence counter an atomic_t
    crypto: caam - Modify the interface layers to use JR API's
    crypto: caam - Add API's to allocate/free Job Rings
    crypto: caam - Add Platform driver for Job Ring
    hwrng: msm - Add PRNG support for MSM SoC's
    ARM: DT: msm: Add Qualcomm's PRNG driver binding document
    crypto: skcipher - Use eseqiv even on UP machines
    crypto: talitos - Simplify key parsing
    crypto: picoxcell - Simplify and harden key parsing
    crypto: ixp4xx - Simplify and harden key parsing
    crypto: authencesn - Simplify key parsing
    crypto: authenc - Export key parsing helper function
    crypto: mv_cesa: remove deprecated IRQF_DISABLED
    hwrng: OMAP3 ROM Random Number Generator support
    crypto: sha256_ssse3 - also test for BMI2
    crypto: mv_cesa - Remove redundant of_match_ptr
    crypto: sahara - Remove redundant of_match_ptr
    ...

    Linus Torvalds
     

30 Oct, 2013

1 commit


24 Oct, 2013

1 commit

  • Replace some instances of of_irq_map_one()/irq_create_of_mapping() and
    of_irq_to_resource() by the simpler equivalent irq_of_parse_and_map().

    Signed-off-by: Thierry Reding
    Acked-by: Rob Herring
    [grant.likely: resolved conflicts with core code renames]
    Signed-off-by: Grant Likely

    Thierry Reding
     

21 Aug, 2013

2 commits

  • Each cycle of SHA512 operates on 32 data words, whereas
    SHA256 operates on 16 data words. The DMA channel configuration
    needs to take this into account, so update it accordingly.

    Signed-off-by: Lokesh Vutla
    Signed-off-by: Herbert Xu

    Lokesh Vutla
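
The 16 versus 32 word figures follow from the block sizes: SHA-256 uses 64-byte blocks and SHA-512 uses 128-byte blocks, and the sketch assumes 32-bit data words. `words_per_block()` is a hypothetical helper showing the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

enum {
    SHA256_BLOCK_BYTES = 64,  /* SHA-256 block size */
    SHA512_BLOCK_BYTES = 128  /* SHA-512 block size */
};

/* Number of 32-bit data words that make up one hash block. */
static unsigned words_per_block(unsigned block_bytes)
{
    assert(block_bytes % sizeof(uint32_t) == 0);
    return (unsigned)(block_bytes / sizeof(uint32_t));
}
```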
     
  • For writing the input buffer into the DATA_IN register, the current
    driver has the following state machine:
    -> if input buffer < 9 : use fallback driver
    -> else if input buffer < block size : Copy input buffer into data_in regs
    -> else use dma transfer.

    If requesting the DMA channels fails for some reason, or the channel
    numbers are not provided in DT or platform data, probe also fails.
    Instead of bailing out of the driver, use CPU polling mode.
    In this mode the processor polls the INPUT_READY bit and writes data
    into the data_in regs when the bit equals 1. This is repeated until
    the whole message has been written.

    Now the state machine looks like:
    -> if input buffer < 9 : use fallback driver
    -> else if input buffer < block size : Copy input buffer into data_in regs
    -> else if dma enabled: use dma transfer
    -> else use cpu polling mode.

    Signed-off-by: Lokesh Vutla
    Signed-off-by: Herbert Xu

    Lokesh Vutla
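
The updated state machine from this entry maps onto a short decision function. The enum and function names are invented for the sketch; the thresholds (fewer than 9 bytes, less than one block) come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the four transfer paths named in the log. */
enum xfer_path {
    PATH_FALLBACK,  /* software fallback driver */
    PATH_COPY_REGS, /* copy input straight into the data_in registers */
    PATH_DMA,       /* DMA transfer */
    PATH_POLLING    /* CPU polls INPUT_READY and writes data_in */
};

/* Pick the transfer path for an input of len bytes. */
static enum xfer_path choose_path(size_t len, size_t block_size,
                                  int dma_enabled)
{
    assert(block_size > 0);

    if (len < 9)
        return PATH_FALLBACK;
    if (len < block_size)
        return PATH_COPY_REGS;
    return dma_enabled ? PATH_DMA : PATH_POLLING;
}
```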
     

01 Aug, 2013

4 commits

  • Use devm_kzalloc() to make cleanup paths simpler.

    Signed-off-by: Lokesh Vutla
    Signed-off-by: Herbert Xu

    Lokesh Vutla
     
  • Use devm_request_irq() rather than request_irq(), and remove the
    free_irq() calls from the probe error path and the remove handler.

    Signed-off-by: Lokesh Vutla
    Signed-off-by: Herbert Xu

    Lokesh Vutla
     
  • Add support for the OMAP5 version of the SHAM module
    that is present on OMAP5 and AM43xx SoCs.

    This module is very similar to the OMAP4 version of the SHAM module,
    and adds the SHA384 and SHA512 hardware-accelerated hash functions to it.
    To handle the larger digest size of SHA512, a few SHA512_DIGEST_i
    registers (i=1-16; the first 8 are duplicated from the SHA_DIGEST_i
    registers) are added at the end of the register set.
    Add these register offsets and the module info to pdata.

    Signed-off-by: Lokesh Vutla
    Signed-off-by: Herbert Xu

    Lokesh Vutla
     
  • Add support for SHA384 and SHA512 in addition to the MD5, SHA1, SHA224
    and SHA256 that the omap sha module already supports.

    In order to add the support:
    - Remove hard-coded register offsets and pass offsets from pdata
    - Update flag offsets so that they can be used for SHA256 and SHA512
    - Add the algo info.

    Signed-off-by: Lokesh Vutla
    Signed-off-by: Herbert Xu

    Lokesh Vutla
     

24 May, 2013

1 commit


10 Mar, 2013

1 commit