08 Jun, 2016

1 commit

  • People complained about ARCH_HWEIGHT_CFLAGS and how it throws a wrench
    into experimentation with kcov, LTO, etc.

    Add asm versions for __sw_hweight{32,64}() and do explicit saving and
    restoring of clobbered registers. This gets rid of the special calling
    convention. We get to call those functions on !X86_FEATURE_POPCNT CPUs.

    We still need to hardcode POPCNT and register operands, as some old gas
    versions we support do not know about POPCNT.

    Btw, remove redundant REX prefix from 32-bit POPCNT because alternatives
    can do padding now.
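
    The resulting dispatch looks roughly like the sketch below; the real
    code lives in arch/x86/include/asm/arch_hweight.h, and the opcode bytes
    and register constraints here are illustrative (32-bit flavor):

    /* popcnt %eax, %eax -- raw bytes so old gas without the mnemonic
     * can still assemble it */
    #define POPCNT32 ".byte 0xf3,0x0f,0xb8,0xc0"

    static __always_inline unsigned int __arch_hweight32(unsigned int w)
    {
            unsigned int res;

            /*
             * On !X86_FEATURE_POPCNT CPUs this stays a plain call to the
             * asm __sw_hweight32, which saves/restores the registers it
             * clobbers itself, so callers need no special calling
             * convention.
             */
            asm (ALTERNATIVE("call __sw_hweight32", POPCNT32,
                             X86_FEATURE_POPCNT)
                 : "=a" (res)
                 : "a" (w));

            return res;
    }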

    Suggested-by: H. Peter Anvin
    Signed-off-by: Borislav Petkov
    Acked-by: Peter Zijlstra (Intel)
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1464605787-20603-1-git-send-email-bp@alien8.de
    Signed-off-by: Ingo Molnar

    Borislav Petkov

14 Sep, 2014

1 commit

  • It used to be an ad-hoc hack defined by the x86 version of
    <asm/bitops.h> that enabled a couple of library routines to know whether
    an integer multiply is faster than repeated shifts and additions.

    This just makes it use the real Kconfig system instead, and makes x86
    (which was the only architecture that did this) select the option.

    NOTE! Even for x86, this really is kind of wrong. If we cared, we would
    probably not enable this for builds optimized for Netburst (P4), where
    shifts-and-adds are generally faster than multiplies. This patch does
    *not* change that kind of logic, though; it is purely a syntactic change
    with no code changes.

    This was triggered by the fact that we have other places that really
    want to know "do I want to expand multiplications by constants by hand
    or not", particularly the hash generation code.

    Signed-off-by: Linus Torvalds

    Linus Torvalds

08 Mar, 2012

1 commit


07 Apr, 2010

2 commits

  • Add support for the hardware version of the Hamming weight function,
    popcnt, present in CPUs which advertise it under CPUID, Function
    0x0000_0001_ECX[23]. On CPUs which don't support it, we fall back to the
    default lib/hweight.c sw versions.

    A synthetic benchmark comparing popcnt with __sw_hweight64 showed almost
    a 3x speedup on an F10h machine.
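
    For context, the same feature bit can be checked from userspace; a
    minimal sketch using GCC's <cpuid.h> (not kernel code):

    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
            unsigned int eax, ebx, ecx, edx;

            /* CPUID Function 0x0000_0001: POPCNT support is ECX bit 23. */
            if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && (ecx & (1u << 23)))
                    puts("popcnt available: hardware hweight will be used");
            else
                    puts("no popcnt: falling back to lib/hweight.c");
            return 0;
    }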

    Signed-off-by: Borislav Petkov
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Borislav Petkov
  • Rename the existing runtime hweight() implementations to
    __arch_hweight(), rename the compile-time versions to __const_hweight()
    and then have hweight() pick between them.
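
    The selection boils down to a __builtin_constant_p() test; roughly (the
    real definitions live in include/asm-generic/bitops/const_hweight.h):

    /* Constant arguments fold at compile time; everything else hits the
     * runtime (possibly popcnt-backed) implementation. */
    #define hweight32(w) \
            (__builtin_constant_p(w) ? \
             __const_hweight32(w) : __arch_hweight32(w))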

    Suggested-by: H. Peter Anvin
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Acked-by: H. Peter Anvin
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Peter Zijlstra

28 Dec, 2009

1 commit

  • Optimize hweight32 by using the same technique as in hweight64.

    The proof of this technique can be found in the commit log for
    f9b4192923fa6e38331e88214b1fe5fc21583fcc ("bitops: hweight()
    speedup").

    The userspace benchmark on x86_32 showed a 20% speedup with
    bitmap_weight(), which uses hweight32 to count bits for each
    unsigned long on 32-bit architectures.

    /* Userspace benchmark built against kernel-style bitmap helpers:
     * DECLARE_BITMAP() and bitmap_weight() as in the kernel headers. */
    int main(void)
    {
    #define SZ (1024 * 1024 * 512)

            static DECLARE_BITMAP(bitmap, SZ) = {
                    [0 ... 100] = 1,
            };

            return bitmap_weight(bitmap, SZ);
    }
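
    For reference, the multiply-based technique in question, shown as a
    plain C sketch of the hweight32 variant used when a fast multiplier is
    available:

    /* SWAR population count: sum bit pairs, then nibbles, then bytes, and
     * gather the per-byte counts into the top byte with one multiply. */
    unsigned int hweight32(unsigned int w)
    {
            w -= (w >> 1) & 0x55555555;
            w  = (w & 0x33333333) + ((w >> 2) & 0x33333333);
            w  = (w + (w >> 4)) & 0x0f0f0f0f;
            return (w * 0x01010101) >> 24;
    }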

    Signed-off-by: Akinobu Mita
    Signed-off-by: Andrew Morton
    Cc: Linus Torvalds
    LKML-Reference:
    [ only x86 sets ARCH_HAS_FAST_MULTIPLIER so we do this via the x86 tree]
    Signed-off-by: Ingo Molnar

    Akinobu Mita

20 Oct, 2007

1 commit

  • Remove asm/bitops.h includes.

    Including asm/bitops.h directly may cause compile errors. Don't include
    it; include linux/bitops.h instead. The next patch will deny including
    the asm header directly.
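
    In practice the fix is a one-line change per file:

    /* Before: may break the build on some configs. */
    #include <asm/bitops.h>

    /* After: go through the generic wrapper, which pulls in the arch bits. */
    #include <linux/bitops.h>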

    Cc: Adrian Bunk
    Signed-off-by: Jiri Slaby
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jiri Slaby