The code is based on a precomputed initial seed table instead of re-seeding the whole state from scratch every time. On my x86 test machine the new code seems to be ~60% faster.
Some further testing and tuning may be needed.
Credits to @1yura.
This reverts commit e552d93a50a25e902dc6b44d29f174fd9a8671bb.
Using GCC version 5.4.0 20160609, the code was 3 times slower (probably due to missing inlining and other optimizations). The binary was also >15 kB bigger.
sources list created via
make clean ; make CC=gcc LDFLAGS="-Wl,--gc-sections" CFLAGS="-O -ffunction-sections"
readelf -a pixiewps | grep '\.c' | awk '{print "./" $8 " \\"}' > tfm_used.txt
and some manual cleanups.
we can disable the highly optimized mul/sqr operations: this costs about
30% in speed but saves a lot in binary size.
only build the files necessary by including an explicit list of filenames
rather than doing a wildcard over tfm/*.c.
compiling with tinycc, we get:
fp_montgomery_reduce.c:510: error: invalid clobber register '%rax'
disabling asm pulls in a couple new files, adding them too.
Added casts to u32 for 'rcons' and 'Td4s', which are of type uint8_t*. Without an explicit cast, their elements are promoted to int (not to unsigned int) before being shifted, due to the integer promotion rules of the C language.
This caused the "left shift of * by 24 places cannot be represented in type 'int'" error when the code was built with GCC's -fsanitize=undefined.
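
A minimal sketch of the promotion issue described above. The table and the helper name (demo_table, top_byte) are hypothetical stand-ins for illustration, not the actual 'rcons' or 'Td4s' code; only the u32 alias follows the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;   /* alias named in the commit message */

/* Hypothetical byte table standing in for 'Td4s' or 'rcons'. */
static const uint8_t demo_table[4] = { 0x52, 0x09, 0x6a, 0xd5 };

/*
 * Place table[idx] in the top byte of a 32-bit word.
 *
 * Without the cast, table[idx] is promoted to (signed) int. For any
 * byte >= 0x80, shifting that int left by 24 cannot be represented in
 * int, which is undefined behaviour and is what UBSan reports.
 * Casting to u32 first makes the shift well defined.
 */
static u32 top_byte(const uint8_t *table, unsigned idx)
{
    assert(table != NULL);
    return (u32)table[idx] << 24;
}
```

With demo_table[3] == 0xd5, top_byte(demo_table, 3) yields 0xd5000000, whereas the uncast expression demo_table[3] << 24 is the pattern the sanitizer flags.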
The code is from an old version of wpa_supplicant/hostapd.
The seeds were always printed, even when there was no need to bruteforce
the state of the PRNG:
[*] Seed N1: 0 (01/01/70 00:00:00 UTC)
[*] Seed ES1: 0 (01/01/70 00:00:00 UTC)
[*] Seed ES2: 0 (01/01/70 00:00:00 UTC)
Correct:
[*] Seed N1: -
[*] Seed ES1: -
[*] Seed ES2: -
Introduced in (6082da8).
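
The intended behaviour could be sketched roughly as follows. This is an illustration only: format_seed and its 'bruteforced' flag are hypothetical names, not taken from the pixiewps sources, and the strftime format is an assumption chosen to match the sample output above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: writes "-" when the PRNG state was not
 * bruteforced, otherwise "<seed> (<date> UTC)".
 */
static void format_seed(char *out, size_t len, int bruteforced, time_t seed)
{
    char date[32];
    struct tm *ts;

    assert(out != NULL && len > 0);

    if (!bruteforced) {
        /* No seed was recovered, so there is nothing meaningful to show. */
        snprintf(out, len, "-");
        return;
    }

    ts = gmtime(&seed);
    /* Assumed date layout; fall back to the bare number on failure. */
    if (ts == NULL ||
        strftime(date, sizeof date, "%m/%d/%y %H:%M:%S", ts) == 0) {
        snprintf(out, len, "%lld", (long long)seed);
        return;
    }
    snprintf(out, len, "%lld (%s UTC)", (long long)seed, date);
}
```

A caller would fill a buffer with format_seed(buf, sizeof buf, found, seed) and then print it with something like printf("[*] Seed N1: %s\n", buf).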