07 Sep, 2013

1 commit


02 Sep, 2013

10 commits

  • Each call to the co-processor, with the exception of the last call, needs
    to send data that is a multiple of the block size. As a consequence, any
    remaining data is kept in the internal NX context.

    This patch fixes a bug in the driver that causes it to save incorrect
    data into the context when data is bigger than the block size.

    Reviewed-by: Joy Latten
    Signed-off-by: Marcelo Cerri
    Signed-off-by: Herbert Xu

    Marcelo Cerri
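
The block-multiple rule described above can be sketched as a small length split. This is only an illustration, assuming a 16-byte block; `BLOCK_SIZE`, `struct split` and `split_for_update()` are hypothetical names and are not taken from the NX driver.

```c
#include <stddef.h>

#define BLOCK_SIZE 16u /* assumed block size, for illustration only */

/* Hypothetical result of splitting one request. */
struct split {
    size_t to_process; /* sent to the co-processor now, multiple of BLOCK_SIZE */
    size_t leftover;   /* kept in the context for the next call, < BLOCK_SIZE */
};

/* buffered: bytes already held in the context; incoming: new bytes. */
struct split split_for_update(size_t buffered, size_t incoming)
{
    size_t total = buffered + incoming;
    struct split s;

    s.leftover = total % BLOCK_SIZE;
    s.to_process = total - s.leftover;
    return s;
}
```

With 5 buffered bytes and 40 incoming bytes, 32 bytes would be sent and 13 held back for the next call.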
     
  • The NX GCM implementation doesn't support zero length messages and the
    current implementation has two flaws:

    - When the input data length is zero, it ignores the associated data.
    - Even when both lengths are zero, it uses the Crypto API to encrypt a
    zeroed block using ctr(aes) and because of this it allocates a new
    transformation and sets the key for this new tfm. Both operations are
    intended to be used only in user context, while the cryptographic
    operations can be called in both user and softirq contexts.

    This patch replaces the nested Crypto API use and adds two special
    cases:

    - When input data and associated data lengths are zero: it uses NX ECB
    mode to emulate the encryption of a zeroed block using ctr(aes).
    - When input data is zero and associated data is available: it uses NX
    GMAC mode to calculate the associated data MAC.

    Reviewed-by: Joy Latten
    Signed-off-by: Marcelo Cerri
    Signed-off-by: Herbert Xu

    Marcelo Cerri
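
Why a single ECB encryption can stand in for ctr(aes) on a zeroed block follows from the standard GCM definitions. The notation below (J_0 for the pre-counter block, H for the hash subkey, t for the tag length) is the usual GCM notation and is an assumption here, not something spelled out in the commit message.

```latex
% Empty plaintext and empty associated data: GHASH only sees the
% all-zero length block, so its output S is zero.
\begin{align*}
S &= \mathrm{GHASH}_H\bigl(0^{128}\bigr) = \bigl(0^{128} \oplus 0^{128}\bigr) \cdot H = 0^{128} \\
\mathrm{CTR}_K\bigl(J_0,\, 0^{128}\bigr) &= E_K(J_0) \oplus 0^{128} = E_K(J_0) \\
T &= \mathrm{MSB}_t\bigl(E_K(J_0) \oplus S\bigr) = \mathrm{MSB}_t\bigl(E_K(J_0)\bigr)
\end{align*}
```

Under these definitions, encrypting the counter block once in ECB mode yields the same value as running CTR mode over one zeroed block.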
     
  • The NX XCBC implementation doesn't support zero length messages and
    because of that NX is currently returning a hard-coded hash for zero
    length messages. However this approach is incorrect since the hash value
    also depends on which key is used.

    This patch removes the hard-coded hash and replaces it with an
    implementation based on RFC 3566 using ECB.

    Reviewed-by: Joy Latten
    Signed-off-by: Marcelo Cerri
    Signed-off-by: Herbert Xu

    Marcelo Cerri
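
The key dependence mentioned above is visible when the RFC 3566 steps are written out for an empty message. This is a sketch from the RFC's definitions, before any truncation of the result; the symbol names are chosen here for illustration.

```latex
% Derived keys and the single padded block for the empty message.
\begin{align*}
K_1 &= E_K\bigl(\mathtt{0x0101\ldots01}\bigr), \qquad
K_3 = E_K\bigl(\mathtt{0x0303\ldots03}\bigr) \\
E[0] &= 0^{128}, \qquad
M_{\mathrm{pad}} = \mathtt{0x80} \,\|\, 0^{120} \\
\mathrm{MAC} &= E_{K_1}\bigl(M_{\mathrm{pad}} \oplus E[0] \oplus K_3\bigr)
             = E_{K_1}\bigl(M_{\mathrm{pad}} \oplus K_3\bigr)
\end{align*}
```

Each step is a single-block AES encryption, which is why ECB is enough, and both K_1 and K_3 change with K, so no fixed constant can be correct for every key.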
     
  • This patch updates the NX driver to perform several hyper calls when necessary
    so that the length limits of scatter/gather lists are respected.

    Reviewed-by: Marcelo Cerri
    Signed-off-by: Joy Latten
    Signed-off-by: Fionnuala Gunter
    Signed-off-by: Herbert Xu

    Fionnuala Gunter
     
  • This patch updates the NX driver to perform several hyper calls when necessary
    so that the length limits of scatter/gather lists are respected.

    Reviewed-by: Joy Latten
    Reviewed-by: Marcelo Cerri
    Signed-off-by: Fionnuala Gunter
    Signed-off-by: Herbert Xu

    Fionnuala Gunter
     
  • This patch updates the nx-aes-gcm implementation to perform several
    hyper calls if needed in order to always respect the length limits for
    scatter/gather lists.

    Two different limits are considered:

    - "ibm,max-sg-len": maximum number of bytes of each scatter/gather
    list.

    - "ibm,max-sync-cop":
    - The total number of bytes that a scatter/gather list can hold.
    - The maximum number of elements that a scatter/gather list can have.

    Reviewed-by: Joy Latten
    Signed-off-by: Marcelo Cerri
    Signed-off-by: Herbert Xu

    Marcelo Cerri
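
The idea of splitting one request into several calls can be sketched as follows. The structure, the field names and the assumption that every list element covers a fixed number of bytes are all hypothetical; in the driver the limits come from the device tree properties named above.

```c
#include <stddef.h>

/* Hypothetical limits, standing in for the device tree values. */
struct limits {
    size_t max_sg_bytes;    /* bytes one scatter/gather list may describe */
    size_t max_sg_entries;  /* elements one scatter/gather list may have */
    size_t bytes_per_entry; /* assumed bytes covered by one element */
};

/* Largest number of bytes that fits in a single call. */
size_t chunk_len(const struct limits *lim, size_t remaining)
{
    size_t cap = lim->max_sg_entries * lim->bytes_per_entry;

    if (cap > lim->max_sg_bytes)
        cap = lim->max_sg_bytes;
    return remaining < cap ? remaining : cap;
}

/* Number of calls needed to cover `total` bytes (0 if limits are unusable). */
size_t calls_needed(const struct limits *lim, size_t total)
{
    size_t calls = 0;

    while (total > 0) {
        size_t n = chunk_len(lim, total);

        if (n == 0)
            return 0;  /* degenerate limits: nothing can be sent */
        total -= n;    /* a real driver would issue the call here */
        calls++;
    }
    return calls;
}
```

The sketch ignores the separate requirement that every call except the last carries a multiple of the block size; a real implementation would round each non-final chunk down accordingly.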
     
  • This patch updates the nx-aes-ctr implementation to perform several
    hyper calls if needed in order to always respect the length limits for
    scatter/gather lists.

    Two different limits are considered:

    - "ibm,max-sg-len": maximum number of bytes of each scatter/gather
    list.

    - "ibm,max-sync-cop":
    - The total number of bytes that a scatter/gather list can hold.
    - The maximum number of elements that a scatter/gather list can have.

    Reviewed-by: Joy Latten
    Signed-off-by: Marcelo Cerri
    Signed-off-by: Herbert Xu

    Marcelo Cerri
     
  • This patch updates the nx-aes-cbc implementation to perform several
    hyper calls if needed in order to always respect the length limits for
    scatter/gather lists.

    Two different limits are considered:

    - "ibm,max-sg-len": maximum number of bytes of each scatter/gather
    list.

    - "ibm,max-sync-cop":
    - The total number of bytes that a scatter/gather list can hold.
    - The maximum number of elements that a scatter/gather list can have.

    Reviewed-by: Joy Latten
    Signed-off-by: Marcelo Cerri
    Signed-off-by: Herbert Xu

    Marcelo Cerri
     
  • This patch updates the nx-aes-ecb implementation to perform several
    hyper calls if needed in order to always respect the length limits for
    scatter/gather lists.

    Two different limits are considered:

    - "ibm,max-sg-len": maximum number of bytes of each scatter/gather
    list.

    - "ibm,max-sync-cop":
    - The total number of bytes that a scatter/gather list can hold.
    - The maximum number of elements that a scatter/gather list can have.

    Reviewed-by: Joy Latten
    Signed-off-by: Marcelo Cerri
    Signed-off-by: Herbert Xu

    Marcelo Cerri
     
  • This patch includes one more parameter to nx_build_sg_lists() to skip
    the given number of bytes from beginning of each sg list.

    This is needed in order to implement the fixes for the AES modes to make
    them able to process larger chunks of data.

    Reviewed-by: Joy Latten
    Signed-off-by: Marcelo Cerri
    Signed-off-by: Herbert Xu

    Marcelo Cerri
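
Skipping a number of bytes from the start of a list can be pictured with the sketch below. It does not reproduce nx_build_sg_lists(); `struct byte_seg` and `seg_skip()` are hypothetical and only show the bookkeeping involved.

```c
#include <stddef.h>

/* Hypothetical list element: only its length matters here. */
struct byte_seg {
    size_t len;
};

/*
 * Skip `skip` bytes from the beginning of the list. Returns the index of
 * the first element that still holds unskipped data and stores the offset
 * inside that element in *inner. Returns n when everything is skipped.
 */
size_t seg_skip(const struct byte_seg *segs, size_t n, size_t skip,
                size_t *inner)
{
    size_t i = 0;

    while (i < n && skip >= segs[i].len) {
        skip -= segs[i].len;
        i++;
    }
    *inner = (i < n) ? skip : 0;
    return i;
}
```

A caller processing a large request in pieces would pass the number of bytes already handled as `skip` on each subsequent call.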
     

21 Aug, 2013

20 commits

  • Each cycle of SHA512 operates on 32 data words, whereas
    SHA256 operates on 16 data words. The DMA channel configuration
    needs to take this into account, so update it accordingly.

    Signed-off-by: Lokesh Vutla
    Signed-off-by: Herbert Xu

    Lokesh Vutla
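
The two word counts line up with the hash block sizes if the data words are 32 bits wide, which is an assumption made for this sketch. The names below are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    SHA256_BLOCK_BYTES = 64,  /* SHA-256 block size in bytes */
    SHA512_BLOCK_BYTES = 128  /* SHA-512 block size in bytes */
};

/* Number of 32-bit data words in one hash block. */
size_t words_per_block(size_t block_bytes)
{
    return block_bytes / sizeof(uint32_t);
}
```

Any per-cycle transfer size derived from the block size would therefore double when switching from SHA256 to SHA512.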
     
  • For writing the input buffer into the DATA_IN registers, the current
    driver has the following state machine:
    -> if input buffer < 9 : use fallback driver
    -> else if input buffer < block size : Copy input buffer into data_in regs
    -> else use dma transfer.

    In cases where requesting DMA channels fails for some reason,
    or channel numbers are not provided in DT or platform data, probe
    also fails. Instead of bailing out of the driver, use CPU polling mode.
    In this mode the processor polls the INPUT_READY bit and writes data into
    the data_in regs when it equals 1. This is repeated until the whole
    message has been written.

    Now the state machine looks like:
    -> if input buffer < 9 : use fallback driver
    -> else if input buffer < block size : Copy input buffer into data_in regs
    -> else if dma enabled: use dma transfer
    -> else use cpu polling mode.

    Signed-off-by: Lokesh Vutla
    Signed-off-by: Herbert Xu

    Lokesh Vutla
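
The updated state machine can be written as a small selection function. The enum and function names are hypothetical; only the thresholds and the order of the checks come from the description above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the four ways of feeding the input buffer. */
enum input_path {
    PATH_FALLBACK,  /* use the fallback driver */
    PATH_COPY_REGS, /* copy the buffer into the data_in regs */
    PATH_DMA,       /* use a DMA transfer */
    PATH_CPU_POLL   /* poll INPUT_READY and write data_in by CPU */
};

enum input_path pick_input_path(size_t len, size_t block_size,
                                bool dma_enabled)
{
    if (len < 9)
        return PATH_FALLBACK;
    if (len < block_size)
        return PATH_COPY_REGS;
    return dma_enabled ? PATH_DMA : PATH_CPU_POLL;
}
```

The only difference from the earlier state machine is the last line, where a missing DMA channel selects polling instead of making the probe fail.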
     
  • The bug here is that:

    while (eng_busy & (!icq_empty) & dma_busy)

    is never true because it's using bitwise instead of logical ANDs. The
    other bitwise AND conditions work as intended but I changed them as well
    for consistency.

    Signed-off-by: Dan Carpenter
    Signed-off-by: Herbert Xu

    Dan Carpenter
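
A minimal sketch of why the bitwise form fails: `!icq_empty` is always 0 or 1, so the whole expression can only be non-zero when bit 0 is set in every operand. The bit positions below are assumptions for illustration; the actual masks are not given in the commit message.

```c
#include <stdbool.h>

/* Hypothetical masked status values, each flag in its own bit. */
#define ENG_BUSY_BIT 0x1u
#define DMA_BUSY_BIT 0x4u

/* Original form: bitwise AND across values that live in different bits. */
bool busy_bitwise(unsigned eng_busy, unsigned icq_empty, unsigned dma_busy)
{
    return (eng_busy & (!icq_empty) & dma_busy) != 0;
}

/* Fixed form: logical AND only asks whether each value is non-zero. */
bool busy_logical(unsigned eng_busy, unsigned icq_empty, unsigned dma_busy)
{
    return eng_busy && !icq_empty && dma_busy;
}
```

With `eng_busy = ENG_BUSY_BIT`, `icq_empty = 0` and `dma_busy = DMA_BUSY_BIT`, the bitwise version reports not busy while the logical version reports busy.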
     
  • There is a typo here. "dev->hw_link[]" is an array, not a pointer, so
    the check is nonsense. We should be checking recently allocated
    "dev->hw_link[0]" instead.

    Signed-off-by: Dan Carpenter
    Signed-off-by: Herbert Xu

    Dan Carpenter
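
The difference between checking the array and checking its first element can be sketched like this. The structure layout and `MAX_HW_LINKS` are hypothetical; only the name `hw_link` comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HW_LINKS 2 /* hypothetical array size */

struct dev_state {
    void *hw_link[MAX_HW_LINKS]; /* array embedded in the structure */
};

/*
 * A test such as `!dev->hw_link` would look at the address of the embedded
 * array, which is never NULL for a valid `dev`, so an allocation failure
 * would go unnoticed. The element that was just assigned must be tested.
 */
bool first_link_missing(const struct dev_state *dev)
{
    return dev->hw_link[0] == NULL;
}
```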
     
  • For the AM437x SoC, ARCH_OMAP2 and ARCH_OMAP3 are not enabled in the
    defconfig. We do the same as the SHA driver and add a dependency on
    ARCH_OMAP2PLUS so that the AES driver is selectable on AM437x SoC builds.

    Signed-off-by: Joel Fernandes
    Signed-off-by: Herbert Xu

    Joel Fernandes
     
  • Keeps request_irq exit/error code paths simpler.

    Suggested-by: Lokesh Vutla
    Signed-off-by: Joel Fernandes
    Signed-off-by: Herbert Xu

    Joel Fernandes
     
  • Use devm_kzalloc instead of kzalloc. With this change, there is no need to
    call kfree in error/exit paths.

    Signed-off-by: Joel Fernandes
    Signed-off-by: Herbert Xu

    Joel Fernandes
     
  • For cases where the offset or length of any page of the input SG is not
    aligned to AES_BLOCK_SIZE, we copy all the pages from the input SG list into
    a contiguous buffer and prepare a single-element SG list for this buffer,
    with its length set to the total number of bytes to crypt.

    This is required for cases such as an SG list of 16 bytes total size that
    contains 16 pages, each holding 1 byte. DMA using the direct buffers of such
    instances is not possible.

    For this purpose, we first detect the unaligned case and accordingly
    allocate enough pages to satisfy the request and prepare the SG lists.
    We then copy data into the buffer, and copy data out of it on completion.

    Signed-off-by: Joel Fernandes
    Signed-off-by: Herbert Xu

    Joel Fernandes
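
The detection step can be sketched as a walk over the list that checks every offset and length. `struct page_seg` and `sg_is_aligned()` are hypothetical stand-ins for the kernel's scatterlist handling; AES_BLOCK_SIZE is taken to be 16 bytes.

```c
#include <stdbool.h>
#include <stddef.h>

#define AES_BLOCK_SIZE 16

/* Hypothetical view of one SG element. */
struct page_seg {
    size_t offset; /* offset of the data inside its page */
    size_t length; /* number of bytes in this element */
};

/* True when every element can be handed to DMA as it is. */
bool sg_is_aligned(const struct page_seg *segs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (segs[i].offset % AES_BLOCK_SIZE != 0 ||
            segs[i].length % AES_BLOCK_SIZE != 0)
            return false;
    }
    return true;
}
```

When this returns false, the data would be gathered into one contiguous buffer, as in the example of sixteen 1-byte elements described above.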
     
  • In cases where requesting DMA channels fails for some reason, or channel
    numbers are not provided in DT or platform data, we switch to PIO-only mode,
    also checking whether the platform provides IRQ numbers and interrupt
    register offsets in DT and platform data. All DMA-only paths are avoided in
    this mode.

    Signed-off-by: Joel Fernandes
    Signed-off-by: Herbert Xu

    Joel Fernandes
     
  • We initialize the scatter gather walk lists needed for PIO mode and avoid all
    DMA paths such as mapping/unmapping buffers by checking for the pio_only flag.

    Signed-off-by: Joel Fernandes
    Signed-off-by: Herbert Xu

    Joel Fernandes
     
  • We add an IRQ handler that implements a state machine for PIO mode, and data
    structures for walking the scatter-gather list. The IRQ handler is called
    repeatedly, either when data is available to read or when the next data can
    be sent for processing. This continues until the entire in/out SG lists have
    been walked. Once the SG list has been completely walked, the IRQ handler
    schedules the done_task tasklet.

    Also add a macro that is used throughout the IRQ code for the common
    pattern of calculating how much of an SG list has been walked. This improves
    code readability and avoids checkpatch errors.

    Signed-off-by: Joel Fernandes
    Signed-off-by: Herbert Xu

    Joel Fernandes
     
  • Add IRQ information to pdata and helper macros. These are required
    for PIO-mode support.

    Signed-off-by: Joel Fernandes
    Signed-off-by: Herbert Xu

    Joel Fernandes
     
  • Intermediate buffers were allocated, mapped and used for DMA. These are no
    longer required, as previous commits in the series use the SGs from the
    crypto layer directly. Along with them, remove the logic for copying SGs,
    which is no longer used, and all the associated variables in
    omap_aes_device.

    Signed-off-by: Joel Fernandes
    Signed-off-by: Herbert Xu

    Joel Fernandes
     
  • Earlier functions that did a similar sync are replaced by the dma_sync_sg_*
    functions, which can operate on an entire SG list.

    Signed-off-by: Joel Fernandes
    Signed-off-by: Herbert Xu

    Joel Fernandes
     
  • In early versions of this driver, assumptions were made, such as the DMA
    layer requiring contiguous buffers. Due to this, new buffers were allocated,
    mapped and used for DMA. These assumptions are no longer true, and DMAEngine
    scatter-gather DMA doesn't have such requirements. We simplify the DMA
    operations by directly using the scatter-gather buffers provided by the
    crypto layer instead of creating our own.

    A lot of logic that handled DMA'ing only X bytes of the total, or as much as
    fitted into a third-party buffer, is removed as it is no longer required.

    Also, a performance improvement of at least ~20% is seen when encrypting a
    buffer size of 8K (1800 ops/sec vs 1400 ops/sec). The improvement will be
    higher for much larger blocks, though such benchmarking is left as an
    exercise for the reader. DMA usage is also much simpler and more coherent
    with the rest of the code.

    Signed-off-by: Joel Fernandes
    Signed-off-by: Herbert Xu

    Joel Fernandes
     
  • The crypto layer only passes nbytes, but the number of SG elements is needed
    to map or unmap the SGs in one go using the dma_map* API, and it also has to
    be passed to the dmaengine prep function.

    We call the function added to scatterwalk for this purpose in
    omap_aes_handle_queue to populate the values, which are used later.

    Signed-off-by: Joel Fernandes
    Signed-off-by: Herbert Xu

    Joel Fernandes
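
Deriving an element count from a byte count can be sketched as below. This is not the scatterwalk helper the commit refers to; `struct sg_elem` and `sg_count_for_bytes()` are hypothetical and only illustrate the calculation.

```c
#include <stddef.h>

/* Hypothetical SG element: only its length matters here. */
struct sg_elem {
    size_t length;
};

/*
 * Number of leading elements needed to cover `nbytes`,
 * or -1 when the list is too short to hold that many bytes.
 */
int sg_count_for_bytes(const struct sg_elem *sg, size_t n, size_t nbytes)
{
    size_t covered = 0;
    int count = 0;

    for (size_t i = 0; i < n && covered < nbytes; i++) {
        covered += sg[i].length;
        count++;
    }
    return covered >= nbytes ? count : -1;
}
```

The resulting count is the kind of value that mapping and prep calls expect alongside the list itself.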
     
  • When DEBUG is enabled, these macros can be used to print variables in integer
    and hex format, and to clearly display which registers, offsets and values
    are being read or written, including the names of the offsets and their
    values.

    Using statement expression macros in read path as,
    Suggested-by: Joe Perches

    Signed-off-by: Joel Fernandes
    Signed-off-by: Herbert Xu

    Joel Fernandes
     
  • This patch fixes a bug in the nx-aes-gcm implementation.
    Corrected the code so that the authtag is always verified after
    decrypting and not just when there is associated data included.
    Also, corrected the code to retrieve the input authtag from src
    instead of dst.

    Reviewed-by: Fionnuala Gunter
    Reviewed-by: Marcelo Cerri
    Signed-off-by: Joy Latten
    Signed-off-by: Herbert Xu

    jmlatten@linux.vnet.ibm.com
     
  • This patch adds an option to the Kconfig file for
    SEC which enables the user to see the debug messages
    that are printed inside the SEC driver.

    Signed-off-by: Alex Porosanu
    Signed-off-by: Herbert Xu

    Alex Porosanu
     
  • CAAM driver contains one macro (xstr) used for printing
    the line location in a file where a memdump is done. This patch
    replaces the xstr macro with the already existing __stringify
    macro that performs the same function.

    Signed-off-by: Alex Porosanu
    Signed-off-by: Herbert Xu

    Alex Porosanu
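
Stringifying a line number needs two macro levels so that the argument is expanded before it is turned into a string. The sketch below uses its own macro names rather than the kernel's, and `ANSWER` is a hypothetical constant standing in for something like __LINE__.

```c
#include <string.h>

/* Inner level: turns its argument into a string exactly as written. */
#define STRINGIFY_1(x) #x
/* Outer level: lets the argument expand first, then stringifies it. */
#define STRINGIFY(x) STRINGIFY_1(x)

#define ANSWER 42 /* hypothetical macro used as the argument */

/* Returns "42": ANSWER is expanded before stringification. */
const char *expanded(void)
{
    return STRINGIFY(ANSWER);
}

/* Returns "ANSWER": the single-level macro sees the unexpanded name. */
const char *unexpanded(void)
{
    return STRINGIFY_1(ANSWER);
}
```

A private macro such as xstr typically has the same two-level shape, which is why an existing shared macro can replace it.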
     

14 Aug, 2013

3 commits

  • The NX driver uses the transformation context to store several fields
    containing data related to the state of the operations in progress.
    Since a single tfm can be used by different kernel threads at the same
    time, we need to protect the data stored into the context.

    This patch makes use of spin locks to protect the data where a race
    condition can happen.

    Reviewed-by: Fionnuala Gunter
    Reviewed-by: Joy Latten
    Signed-off-by: Marcelo Cerri
    Signed-off-by: Herbert Xu

    Marcelo Cerri
     
  • These local symbols are used only in this file.
    Fix the following sparse warnings:

    drivers/crypto/amcc/crypto4xx_alg.c:35:6: warning: symbol 'set_dynamic_sa_command_0' was not declared. Should it be static?
    drivers/crypto/amcc/crypto4xx_alg.c:55:6: warning: symbol 'set_dynamic_sa_command_1' was not declared. Should it be static?

    Signed-off-by: Jingoo Han
    Signed-off-by: Herbert Xu

    Jingoo Han
     
  • This local symbol is used only in this file.
    Fix the following sparse warnings:

    drivers/crypto/sahara.c:420:6: warning: symbol 'sahara_watchdog' was not declared. Should it be static?

    Signed-off-by: Jingoo Han
    Signed-off-by: Herbert Xu

    Jingoo Han
     

09 Aug, 2013

3 commits

  • This patch fixes a bug that is triggered when cts(cbc(aes)) is used with
    nx-crypto driver on input larger than 32 bytes.

    The chaining value from co-processor was not being saved. This value is
    needed because it is used as the IV by cts(cbc(aes)).

    Signed-off-by: Fionnuala Gunter
    Reviewed-by: Marcelo Cerri
    Signed-off-by: Herbert Xu

    Fionnuala Gunter
     
  • The co-processor has several limits regarding the length of
    scatter/gather lists and the total number of bytes in them. These limits
    are available in the device tree, as follows:

    - "ibm,max-sg-len": maximum number of bytes of each scatter/gather
    list.

    - "ibm,max-sync-cop": used for synchronous operations, it is an array
    of structures that contains information regarding the limits that
    must be considered for each mode and operation. The most important
    limits in it are:
    - The total number of bytes that a scatter/gather list can hold.
    - The maximum number of elements that a scatter/gather list can
    have.

    This patch updates the NX driver to perform several hyper calls if
    needed in order to always respect the length limits for scatter/gather
    lists.

    Reviewed-by: Fionnuala Gunter
    Reviewed-by: Joel Schopp
    Reviewed-by: Joy Latten
    Signed-off-by: Marcelo Cerri
    Signed-off-by: Herbert Xu

    Marcelo Cerri
     
  • The co-processor receives data to be hashed through scatter/gather lists
    pointing to physical addresses. When vmalloc'ed data is given, the
    driver must calculate the physical address of each page of the data.

    However the current version of it just calculates the physical address
    once and keeps incrementing it even when a page boundary is crossed.
    This patch fixes this behaviour.

    Reviewed-by: Fionnuala Gunter
    Reviewed-by: Joel Schopp
    Reviewed-by: Joy Latten
    Signed-off-by: Marcelo Cerri
    Signed-off-by: Herbert Xu

    Marcelo Cerri
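
The page-boundary problem can be sketched by splitting a virtual range so that no piece crosses a page. The 4096-byte page size and the function name are assumptions; the address translation itself is left out and only marked by a comment.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u /* assumed page size, for illustration only */

/*
 * Split [vaddr, vaddr + len) into pieces that never cross a page boundary.
 * Piece lengths are stored in chunks[]; the number of pieces is returned.
 * Stops early if max_chunks is reached.
 */
size_t split_at_pages(uintptr_t vaddr, size_t len, size_t *chunks,
                      size_t max_chunks)
{
    size_t n = 0;

    while (len > 0 && n < max_chunks) {
        size_t in_page = PAGE_SZ - (size_t)(vaddr % PAGE_SZ);
        size_t take = len < in_page ? len : in_page;

        /* A driver would translate vaddr to a physical address here,
         * once per piece, instead of incrementing an earlier result. */
        chunks[n++] = take;
        vaddr += take;
        len -= take;
    }
    return n;
}
```

Because pages that are adjacent in a vmalloc'ed range need not be adjacent physically, each piece needs its own translation.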
     

01 Aug, 2013

3 commits