03 Aug, 2011

1 commit

  • When loading aes via the module alias, a padlock module failing to
    load due to missing hardware is not particularly notable. With
    v2.6.27-rc1~1107^2~14 (crypto: padlock - Make module loading quieter
    when hardware isn't available, 2008-07-03), the padlock-aes module
    suppresses the relevant messages when the "quiet" flag is in use; but
    it is better to suppress this particular message completely, since the
    administrator can already distinguish such errors by the absence of a
    message indicating whether initialization failed or succeeded.

    This avoids occasional messages in syslog of the form

    padlock_aes: VIA PadLock not detected.

    Signed-off-by: Jonathan Nieder
    Signed-off-by: Herbert Xu

    Jonathan Nieder
     

14 Jan, 2011

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (46 commits)
    hwrng: via_rng - Fix memory scribbling on some CPUs
    crypto: padlock - Move padlock.h into include/crypto
    hwrng: via_rng - Fix asm constraints
    crypto: n2 - use __devexit not __exit in n2_unregister_algs
    crypto: mark crypto workqueues CPU_INTENSIVE
    crypto: mv_cesa - dont return PTR_ERR() of wrong pointer
    crypto: ripemd - Set module author and update email address
    crypto: omap-sham - backlog handling fix
    crypto: gf128mul - Remove experimental tag
    crypto: af_alg - fix af_alg memory_allocated data type
    crypto: aesni-intel - Fixed build with binutils 2.16
    crypto: af_alg - Make sure sk_security is initialized on accept()ed sockets
    net: Add missing lockdep class names for af_alg
    include: Install linux/if_alg.h for user-space crypto API
    crypto: omap-aes - checkpatch --file warning fixes
    crypto: omap-aes - initialize aes module once per request
    crypto: omap-aes - unnecessary code removed
    crypto: omap-aes - error handling implementation improved
    crypto: omap-aes - redundant locking is removed
    crypto: omap-aes - DMA initialization fixes for OMAP off mode
    ...

    Linus Torvalds
     

05 Nov, 2010

1 commit

  • On certain VIA chipsets AES-CBC requires the input/output to be
    a multiple of 64 bytes. We had a workaround for this, but it was
    buggy: it sent the whole input for processing, when it was meant
    to send only an initial number of blocks chosen so that the rest
    is a multiple of 64 bytes.

    As expected this causes memory corruption whenever the workaround
    kicks in.

    Reported-by: Phil Sutter
    Signed-off-by: Herbert Xu

    Herbert Xu
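
    The intended split can be modelled in a tiny stand-alone sketch
    (hypothetical names; the real driver does this on its walk buffers):

```c
#define AES_BLOCK_SIZE 16
/* the affected chipsets want the bulk of the data in 64-byte units */
#define CHUNK_BLOCKS (64 / AES_BLOCK_SIZE)

/* Split `count` AES blocks into a leading remainder and a tail that is
 * a whole multiple of 64 bytes; per the fix, only the leading part is
 * processed separately, never the whole input. */
static void split_request(unsigned int count,
                          unsigned int *initial, unsigned int *rest)
{
    *initial = count % CHUNK_BLOCKS;
    *rest = count - *initial;
}
```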
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    The percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the
    following script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following:

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. if only gfp is used,
    gfp.h; if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to place the new include so that its order conforms
    to its surroundings. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree, or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition, and for others adding it to an
    implementation .h or embedding .c file was more appropriate. This
    step added inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of the
    specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

15 Dec, 2009

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (34 commits)
    m68k: rename global variable vmalloc_end to m68k_vmalloc_end
    percpu: add missing per_cpu_ptr_to_phys() definition for UP
    percpu: Fix kdump failure if booted with percpu_alloc=page
    percpu: make misc percpu symbols unique
    percpu: make percpu symbols in ia64 unique
    percpu: make percpu symbols in powerpc unique
    percpu: make percpu symbols in x86 unique
    percpu: make percpu symbols in xen unique
    percpu: make percpu symbols in cpufreq unique
    percpu: make percpu symbols in oprofile unique
    percpu: make percpu symbols in tracer unique
    percpu: make percpu symbols under kernel/ and mm/ unique
    percpu: remove some sparse warnings
    percpu: make alloc_percpu() handle array types
    vmalloc: fix use of non-existent percpu variable in put_cpu_var()
    this_cpu: Use this_cpu_xx in trace_functions_graph.c
    this_cpu: Use this_cpu_xx for ftrace
    this_cpu: Use this_cpu_xx in nmi handling
    this_cpu: Use this_cpu operations in RCU
    this_cpu: Use this_cpu ops for VM statistics
    ...

    Fix up trivial (famous last words) global per-cpu naming conflicts in
    arch/x86/kvm/svm.c
    mm/slab.c

    Linus Torvalds
     

29 Oct, 2009

1 commit

  • This patch updates misc percpu related symbols such that percpu
    symbols are unique and don't clash with local symbols. This serves
    two purposes: decreasing the possibility of global percpu symbol
    collisions and allowing the per_cpu__ prefix to be dropped from
    percpu symbols.

    * drivers/crypto/padlock-aes.c: s/last_cword/paes_last_cword/

    * drivers/lguest/x86/core.c: s/last_cpu/lg_last_cpu/

    * drivers/s390/net/netiucv.c: rename the variable used in a macro to
    avoid clashing with percpu symbol

    * arch/mn10300/kernel/kprobes.c: replace current_ prefix with cur_ for
    static variables. Please note that percpu symbol current_kprobe
    can't be changed as it's used by generic code.

    Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
    which cause name clashes" patch.

    Signed-off-by: Tejun Heo
    Reviewed-by: Christoph Lameter
    Cc: Rusty Russell
    Cc: Herbert Xu
    Cc: Chuck Ebbert
    Cc: David Howells
    Cc: Koichi Yasutake
    Cc: Ananth N Mavinakayanahalli
    Cc: Anil S Keshavamurthy
    Cc: David S. Miller
    Cc: Masami Hiramatsu
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: linux390@de.ibm.com

    Tejun Heo
     

26 Feb, 2009

1 commit

  • With the mandatory algorithm testing at registration, we have
    now created a deadlock with algorithms requiring fallbacks.
    This can happen if the module containing the algorithm requiring the
    fallback is loaded before the fallback module itself. The system will
    then try to test the new algorithm, find that it needs to load a
    fallback, and then try to load that.

    As both algorithms share the same module alias, it can attempt
    to load the original algorithm again and block indefinitely.

    As algorithms requiring fallbacks are a special case, we can fix
    this by giving them a different module alias than the rest. Then
    it's just a matter of using the right aliases according to what
    algorithms we're trying to find.

    Signed-off-by: Herbert Xu

    Herbert Xu
     

25 Dec, 2008

1 commit

  • Resetting the control word is quite expensive. Fortunately this
    isn't an issue for the common operations such as CBC and ECB as
    the whole operation is done through a single call. However, modes
    such as LRW and XTS have to call padlock over and over again for
    one operation which really hurts if each call resets the control
    word.

    This patch uses an idea by Sebastian Siewior to store the last
    control word used on a CPU and only reset the control word if
    that changes.

    Note that any task switch automatically resets the control word
    so we only need to be accurate with regard to the stored control
    word when no task switches occur.

    Signed-off-by: Herbert Xu

    Herbert Xu
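
    The caching scheme amounts to a compare-before-reload, sketched here
    for a single CPU with illustrative names (the driver keeps one such
    slot per CPU):

```c
struct cword { unsigned int value; };          /* simplified stand-in */

/* last control word loaded into the hardware (per CPU in the driver) */
static const struct cword *last_cword;

/* Return nonzero only when the expensive control-word reset is really
 * needed, i.e. when the key/mode changed since the last call. */
static int cword_needs_reload(const struct cword *cword)
{
    if (cword == last_cword)
        return 0;
    last_cword = cword;
    return 1;
}
```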
     

13 Aug, 2008

1 commit

  • Wolfgang Walter reported this oops on his VIA C3 using padlock for
    AES-encryption:

    ##################################################################

    BUG: unable to handle kernel NULL pointer dereference at 000001f0
    IP: [] __switch_to+0x30/0x117
    *pde = 00000000
    Oops: 0002 [#1] PREEMPT
    Modules linked in:

    Pid: 2071, comm: sleep Not tainted (2.6.26 #11)
    EIP: 0060:[] EFLAGS: 00010002 CPU: 0
    EIP is at __switch_to+0x30/0x117
    EAX: 00000000 EBX: c0493300 ECX: dc48dd00 EDX: c0493300
    ESI: dc48dd00 EDI: c0493530 EBP: c04cff8c ESP: c04cff7c
    DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
    Process sleep (pid: 2071, ti=c04ce000 task=dc48dd00 task.ti=d2fe6000)
    Stack: dc48df30 c0493300 00000000 00000000 d2fe7f44 c03b5b43 c04cffc8 00000046
    c0131856 0000005a dc472d3c c0493300 c0493470 d983ae00 00002696 00000000
    c0239f54 00000000 c04c4000 c04cffd8 c01025fe c04f3740 00049800 c04cffe0
    Call Trace:
    [] ? schedule+0x285/0x2ff
    [] ? pm_qos_requirement+0x3c/0x53
    [] ? acpi_processor_idle+0x0/0x434
    [] ? cpu_idle+0x73/0x7f
    [] ? rest_init+0x61/0x63
    =======================

    Wolfgang also found out that adding kernel_fpu_begin() and kernel_fpu_end()
    around the padlock instructions fixes the oops.

    Suresh wrote:

    Although these padlock instructions don't use or touch SSE registers,
    they behave similarly to other SSE instructions. For example, they
    might cause DNA faults when cr0.ts is set. While this is a spurious
    DNA trap, it might cause oopses with the recent fpu code changes.

    This is the code sequence that is probably causing this problem:

    a) new app is getting exec'd and it is somewhere in between
    start_thread() and flush_old_exec() in the load_xyz_binary()

    b) At point "a", task's fpu state (like TS_USEDFPU, used_math() etc) is
    cleared.

    c) Now we get an interrupt/softirq which starts using these encrypt/decrypt
    routines in the network stack. This generates a math fault (as
    cr0.ts is '1') which sets TS_USEDFPU and restores the math that is
    in the task's xstate.

    d) Return to exec code path, which does start_thread() which does
    free_thread_xstate() and sets xstate pointer to NULL while
    the TS_USEDFPU is still set.

    e) At the next context switch from the new exec'd task to another task,
    we have a scenario where TS_USEDFPU is set but the xstate pointer is
    null. This can cause an oops during unlazy_fpu() in __switch_to().

    Now:

    1) This should happen with or without preemption. Viro also encountered
    a similar problem without CONFIG_PREEMPT.

    2) kernel_fpu_begin() and kernel_fpu_end() will fix this problem, because
    kernel_fpu_begin() will manually do a clts() and won't run in to the
    situation of setting TS_USEDFPU in step "c" above.

    3) This was working before the fpu changes, because it's a spurious
    math fault which doesn't corrupt any fpu/sse registers and the task's
    math state was always in an allocated state.

    Without the recent lazy fpu allocation changes, while we don't see an
    oops, there is a possible race still present in older kernels (for
    example, while the kernel is using kernel_fpu_begin() in some optimized
    clear/copy page routine and an interrupt/softirq happens which uses
    these padlock instructions, generating a DNA fault).

    This is the failing scenario that existed even before the lazy fpu allocation
    changes:

    0. CPU's TS flag is set

    1. kernel using FPU in some optimized copy routine and while doing
    kernel_fpu_begin() takes an interrupt just before doing clts()

    2. Takes an interrupt and ipsec uses padlock instruction. And we
    take a DNA fault as TS flag is still set.

    3. We handle the DNA fault and set TS_USEDFPU and clear cr0.ts

    4. We complete the padlock routine

    5. Go back to step-1, which resumes clts() in kernel_fpu_begin(), finishes
    the optimized copy routine and does kernel_fpu_end(). At this point,
    we have cr0.ts again set to '1' but the task's TS_USEDFPU is still
    set and not cleared.

    6. Now the kernel resumes its user operation. And at the next context
    switch, the kernel sees it has to do an FP save as TS_USEDFPU is still
    set and then will do an unlazy_fpu() in __switch_to(). unlazy_fpu()
    will take a DNA fault, as cr0.ts is '1' and now, because we are
    in __switch_to(), math_state_restore() will get confused and will
    restore the next task's FP state and will save it in prev tasks's FP state.
    Remember, in __switch_to() we are already on the stack of the next task
    but take a DNA fault for the prev task.

    This causes the fpu leakage.

    Fix the padlock instruction usage by wrapping it in the new routines
    irq_ts_save()/irq_ts_restore(), which clear/restore cr0.ts manually
    in the interrupt context. This will not generate a spurious DNA fault
    in the context of the interrupt, which fixes the oops encountered and
    the possible FPU leakage issue.

    Reported-and-bisected-by: Wolfgang Walter
    Signed-off-by: Suresh Siddha
    Signed-off-by: Herbert Xu

    Suresh Siddha
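
    The save/clear/restore pattern can be mimicked in user space with a
    stand-in variable for CR0.TS (the real irq_ts_save() reads CR0 and
    executes clts(); everything here is a model):

```c
/* Stand-in for the CR0.TS bit; in the kernel this is hardware state. */
static int cr0_ts = 1;

/* Model of irq_ts_save(): remember TS and clear it so the padlock
 * instructions raise no spurious DNA fault in interrupt context. */
static int irq_ts_save(void)
{
    int saved = cr0_ts;
    cr0_ts = 0;
    return saved;
}

/* Model of irq_ts_restore(): put TS back so the interrupted task's
 * lazy-FPU bookkeeping is left exactly as it was. */
static void irq_ts_restore(int saved)
{
    cr0_ts = saved;
}
```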
     

11 Jan, 2008

3 commits

  • Currently we reset the key for each segment fed to the xcrypt instructions.
    This patch optimises this for CBC and ECB so that we only do this once for
    each encrypt/decrypt operation.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • These three defines are used in all AES related hardware.

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     
  • The previous patch fixed spurious read faults from occurring by copying
    the data if we happen to have a single block at the end of a page. It
    appears that gcc cannot guarantee 16-byte alignment in the kernel with
    __attribute__. The following report from Torben Viets shows a buffer
    that's only 8-byte aligned:

    > general protection fault: 0000 [#1]
    > Modules linked in: xt_TCPMSS xt_tcpmss iptable_mangle ipt_MASQUERADE
    > xt_tcpudp xt_mark xt_state iptable_nat nf_nat nf_conntrack_ipv4
    > iptable_filter ip_tables x_tables pppoe pppox af_packet ppp_generic slhc
    > aes_i586
    > CPU: 0
    > EIP: 0060:[] Not tainted VLI
    > EFLAGS: 00010292 (2.6.23.12 #7)
    > EIP is at aes_crypt_copy+0x28/0x40
    > eax: f7639ff0 ebx: f6c24050 ecx: 00000001 edx: f6c24030
    > esi: f7e89dc8 edi: f7639ff0 ebp: 00010000 esp: f7e89dc8

    Since the hardware must have 16-byte alignment, the following patch fixes
    this by open coding the alignment adjustment.

    Signed-off-by: Herbert Xu

    Herbert Xu
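
    Open coding the adjustment amounts to over-allocating and rounding
    the pointer up by hand, roughly (function name hypothetical):

```c
#include <stdint.h>

#define AES_ALIGN 16

/* Round `p` up to the next 16-byte boundary; the caller must have
 * over-allocated the buffer by AES_ALIGN - 1 bytes to leave room. */
static unsigned char *aes_align_ptr(unsigned char *p)
{
    return (unsigned char *)(((uintptr_t)p + AES_ALIGN - 1) &
                             ~(uintptr_t)(AES_ALIGN - 1));
}
```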
     

11 Oct, 2007

1 commit

  • Loading the crypto algorithm by the alias instead of by module directly
    has the advantage that all possible implementations of this algorithm
    are loaded automatically and the crypto API can choose the best one
    depending on its priority.

    Additionally it ensures that the generic implementation as well as the
    HW driver (if available) is loaded in case the HW driver needs the
    generic version as fallback in corner cases.

    Signed-off-by: Sebastian Siewior
    Signed-off-by: Herbert Xu

    Sebastian Siewior
     

21 Sep, 2006

7 commits

  • This patch removes obsolete block operations of the simple cipher type
    from drivers. These were preserved so that existing users can make a
    smooth transition. Now that the transition is complete, they are no
    longer needed.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • This patch adds block cipher algorithms for cbc(aes) and ecb(aes) for
    the PadLock device. Once all users of the old cipher type have been
    converted, the old cbc/ecb PadLock operations will be removed.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • Now that the tfm is passed directly to setkey instead of the ctx, we no
    longer need to pass the &tfm->crt_flags pointer.

    This patch also gets rid of a few unnecessary checks on the key length
    for ciphers as the cipher layer guarantees that the key length is within
    the bounds specified by the algorithm.

    Rather than testing dia_setkey every time, this patch does it only once
    during crypto_alloc_tfm. The redundant check from crypto_digest_setkey
    is also removed.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • Compile a helper module padlock.ko that will try
    to autoload all configured padlock algorithms.

    This also provides backward compatibility with
    the ancient times before padlock.ko was renamed
    to padlock-aes.ko

    Signed-off-by: Michal Ludvig
    Signed-off-by: Herbert Xu

    Michal Ludvig
     
  • PADLOCK_CRA_PRIORITY is shared between padlock-aes and padlock-sha
    so it should be in the header.

    On the other hand "struct cword" is only used in padlock-aes.c
    so it's unnecessary to have it in padlock.h

    Signed-off-by: Michal Ludvig
    Signed-off-by: Herbert Xu

    Michal Ludvig
     
  • Whenever we rename modules we should add an alias to ensure that existing
    users can still locate the new module.

    This patch also gets rid of the now unused module function prototypes from
    padlock.h.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • Merge padlock-generic.c into padlock-aes.c and compile
    AES as a standalone module. We won't make a monolithic
    padlock.ko with all supported algorithms, instead we'll
    compile each driver into its own module.

    Signed-off-by: Michal Ludvig
    Signed-off-by: Herbert Xu

    Michal Ludvig
     

26 Jun, 2006

2 commits

  • i386 assembly has more compact instructions for accessing 7-bit offsets.
    So by moving the large members to the end of the structure we can save
    quite a bit of code size. This patch shaves about 10% or 300 bytes off
    the padlock-aes file.

    Signed-off-by: Herbert Xu

    Herbert Xu
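
    The layout idea can be illustrated with a hypothetical context
    structure: members within the first 127 bytes are reachable with a
    one-byte signed displacement on i386, so the small, hot fields go
    first and the large key arrays last.

```c
#include <stddef.h>

/* Hypothetical layout in the spirit of the patch: small, frequently
 * accessed members first, large expanded-key arrays at the end. */
struct aes_ctx_sketch {
    unsigned int key_length;   /* hot and small: short addressing */
    unsigned int cword;        /* simplified stand-in for the control word */
    unsigned int E[60];        /* expanded encryption key (AES-256 size) */
    unsigned int D[60];        /* expanded decryption key */
};
```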
     
  • Up until now algorithms have been happy to get a context pointer since
    they know everything that's in the tfm already (e.g., alignment, block
    size).

    However, once we have parameterised algorithms, such information will
    be specific to each tfm. So the algorithm API needs to be changed to
    pass the tfm structure instead of the context pointer.

    This patch is basically a text substitution. The only tricky bit is
    the assembly routines that need to get the context pointer offset
    through asm-offsets.h.

    Signed-off-by: Herbert Xu

    Herbert Xu
     

21 Mar, 2006

1 commit

  • Since tfm contexts can contain arbitrary types we should provide at least
    natural alignment (__attribute__ ((__aligned__))) for them. In particular,
    this is needed on the Xscale which is a 32-bit architecture with a u64 type
    that requires 64-bit alignment. This problem was reported by Ronen Shitrit.

    The crypto_tfm structure's size was 44 bytes on 32-bit architectures and
    80 bytes on 64-bit architectures. So adding this requirement only means
    that we have to add an extra 4 bytes on 32-bit architectures.

    On i386 the natural alignment is 16 bytes which also benefits the VIA
    Padlock as it no longer has to manually align its context structure to
    128 bits.

    Signed-off-by: Herbert Xu

    Herbert Xu
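
    The attribute in question looks like this; with no argument it
    requests the largest alignment the target ever uses for data (16
    bytes on i386, which already covers PadLock's 128-bit requirement):

```c
/* A minimal type carrying natural (maximal) alignment. */
struct ctx_storage {
    unsigned char data[4];
} __attribute__((__aligned__));
```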
     

10 Jan, 2006

2 commits

  • As the Crypto API now allows multiple implementations to be registered
    for the same algorithm, we no longer have to play tricks with Kconfig
    to select the right AES implementation.

    This patch sets the driver name and priority for all the AES
    implementations and removes the Kconfig conditions on the C implementation
    for AES.

    Signed-off-by: Herbert Xu

    Herbert Xu
     
  • A lot of crypto code needs to read/write 32-bit or 64-bit words in a
    specific byte order. Many places open code this by reading/writing one
    byte at a time. This patch converts all the applicable usages over
    to use the standard byte order macros.

    This is based on a previous patch by Denis Vlasenko.

    Signed-off-by: Herbert Xu

    Herbert Xu
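
    The open-coded style being replaced looks like this in a portable
    user-space sketch; in the kernel the same load becomes a single
    le32_to_cpu()-style helper (function name here hypothetical):

```c
#include <stdint.h>

/* Byte-at-a-time little-endian load - the pattern the patch replaces
 * with the standard byte order macros. */
static uint32_t le32_load_open_coded(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```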
     

07 Jul, 2005

4 commits

  • When the Padlock does CBC encryption, the memory pointed to by EAX is
    not updated at all. Instead, it updates the value of EAX by pointing
    it to the last block in the output. Therefore to maintain the correct
    semantics we need to copy the IV.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
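
    Preserving CBC semantics therefore needs an explicit copy of the last
    ciphertext block into the IV buffer, along these lines (names
    hypothetical):

```c
#include <string.h>

#define AES_BLOCK_SIZE 16

/* After the hardware CBC pass, the caller's IV buffer is stale; copy
 * the last ciphertext block there so chained calls stay correct. */
static void cbc_copy_iv(unsigned char *iv, const unsigned char *output,
                        unsigned long nblocks)
{
    memcpy(iv, output + (nblocks - 1) * AES_BLOCK_SIZE, AES_BLOCK_SIZE);
}
```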
     
  • This patch ensures that cit_iv is aligned according to cra_alignmask
    by allocating it as part of the tfm structure. As a side effect the
    crypto layer will also guarantee that the tfm ctx area has enough space
    to be aligned by cra_alignmask. This allows us to remove the extra
    space reservation from the Padlock driver.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • By operating on multiple blocks at once, we expect to extract more
    performance out of the VIA Padlock.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu
     
  • Most of the work done in aes_padlock can be done in aes_set_key. This
    means that we only have to do it once when the key changes rather
    than every time we perform an encryption or decryption.

    This patch also sets cra_alignmask to let the upper layer ensure
    that the buffers fed to us are aligned correctly.

    Signed-off-by: Herbert Xu
    Signed-off-by: David S. Miller

    Herbert Xu