Discussion:
Spiped 1.4.x segfaults on i386
Denis Krienbühl
2014-10-06 13:36:41 UTC
Permalink
Hi,

I'm running spiped 1.3.1 across our infrastructure to connect various
services. Today I tried to upgrade to 1.4.1, but I had to roll back
because both 1.4.0 and 1.4.1 crash on our i386 servers as soon as I
try to send a packet.

This is what I get in the syslog:

Oct 6 12:51:05 server kernel: traps: spiped[6505] general protection
ip:8052fe2 sp:bfc97240 error:0 in spiped[8048000+10000]

Rolling back to 1.3.1 solves the problem. Both 1.4.0 and 1.4.1 work
fine on our armv6l and amd64 servers, so this seems to be a regression
introduced in 1.4.0.

I have little experience with debugging C, so just let me know what I
need to provide to help fix this bug.

Regards,

Denis
Colin Percival
2014-10-06 16:48:38 UTC
Permalink
Post by Denis Krienbühl
I'm running spiped 1.3.1 across our infrastructure to connect various
services. Today I tried to upgrade to 1.4.1, but I had to roll back
because both 1.4.0 and 1.4.1 crash on our i386 servers as soon as I
try to send a packet.
What hardware is this exactly? The major change in spiped 1.4.x is the
use of AESNI, which will depend on the particular CPU it's running on.

Also, what OS version?

--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
Denis Krienbühl
2014-10-06 19:08:34 UTC
Permalink
Post by Colin Percival
What hardware is this exactly? The major change in spiped 1.4.x is the
use of AESNI, which will depend on the particular CPU it's running on.
That would be an Intel Xeon E5-2680 v2. I attached the output of
/proc/cpuinfo.

The OS is Ubuntu 12.04 with all the latest updates, on a Linode VPS,
which I believe runs on Xen.
Colin Percival
2014-10-06 20:03:44 UTC
Permalink
Post by Denis Krienbühl
Post by Colin Percival
What hardware is this exactly? The major change in spiped 1.4.x is the
use of AESNI, which will depend on the particular CPU it's running on.
That would be an Intel Xeon E5-2680 v2. I attached the output of
/proc/cpuinfo.
OK, looks like the CPU supports AESNI, so the problem is probably in there
somewhere... the only question is where. Can you please extract the 1.4.1
source code and run

# make CFLAGS="-O2 -g"

and then run ./spiped/spiped or ./spipe/spipe? This will include debugging
information.

Then after it dies,

# addr2line -e ./spiped/spiped ADDR
or
# addr2line -e ./spipe/spipe ADDR

where ADDR is the ip:XXXXXXX value from your syslog. This should let me
know where the fault is being triggered.

Thanks,
--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
Denis Krienbühl
2014-10-07 07:08:16 UTC
Permalink
I followed your steps and got the following results after running
addr2line:

/opt/spiped/1.4.1/spiped/../libcperciva/crypto/crypto_aes_aesni.c:52
Colin Percival
2014-10-07 19:41:23 UTC
Permalink
Post by Denis Krienbühl
I followed your steps and got the following results after running
addr2line:
/opt/spiped/1.4.1/spiped/../libcperciva/crypto/crypto_aes_aesni.c:52
Hmm, interesting! OK, next step:

1. Build again with `make CFLAGS="-O0 -g"`.
2. Run the utility and watch it crash.
3. Run `gdb ./spiped/spiped spiped.core` and at the prompt "p key" and
"p rkeys".

If gdb complains that spiped.core doesn't exist you'll need to enable
core dumps -- I'm not sure if Ubuntu has them turned on by default.

I *think* I know what the problem is here, but seeing the value of those
two pointers when the crash occurs should confirm it.

--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
Denis Krienbühl
2014-10-08 08:06:40 UTC
Permalink
I did that and got the following results:

(gdb) p key
$1 = (const uint8_t *) 0xbfa8a85c ""

(gdb) p rkeys
$2 = (__m128i *) 0x8a8b7c8

The complete output is in the attachment.
Colin Percival
2014-10-09 01:23:37 UTC
Permalink
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Post by Denis Krienbühl
(gdb) p key
$1 = (const uint8_t *) 0xbfa8a85c ""
(gdb) p rkeys
$2 = (__m128i *) 0x8a8b7c8
Thanks, that's exactly what I was hoping to see. The problem is that the
SSE instructions used require the AES round keys to be stored aligned to
16-byte boundaries, and the malloc on your system is providing unaligned
allocations.

Can you try the attached patch? In a clean source tree,
# patch < rkeys-align.patch
# make all
and then you should find that it works again.

- --
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
Denis Krienbühl
2014-10-09 06:47:51 UTC
Permalink
Yep, this indeed fixes the problem. Thank you for your helpful and fast
response.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (FreeBSD)
iEYEARECAAYFAlQ145kACgkQOM7KaQxqam4z8ACdHZ9lOrEoKKrAm4G2ucfM3XbJ
bLQAni51EQ3YfuDHbfkwbbcGNrlzj0Qt
=ripw
-----END PGP SIGNATURE-----
+ rkeys-align.patch
1k (text/plain)
+ rkeys-align.patch.sig
1k (application/pgp-signature)