Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmake test failures on s390x Fedora 33 and 34 #296

Closed
mdomsch opened this issue Oct 30, 2020 · 9 comments
Closed

cmake test failures on s390x Fedora 33 and 34 #296

mdomsch opened this issue Oct 30, 2020 · 9 comments
Assignees
Labels

Comments

@mdomsch
Copy link
Contributor

mdomsch commented Oct 30, 2020

Direwolf 1.6 tag, building on s390x architecture on Fedora 33 and rawhide (will be F34) fails 'cmake test' with a buffer overflow detected. Build succeeds on s390x on Fedora 32, and on all other architectures of F32-34. Test #7 (enctest) has the failure.

Fedora rawhide (future F34) https://koji.fedoraproject.org/koji/buildinfo?buildID=1636157
Fedora 33 https://koji.fedoraproject.org/koji/buildinfo?buildID=1636158
Fedora 32 https://koji.fedoraproject.org/koji/buildinfo?buildID=1636159

While it's possible to ExcludeArch: s390x, and one could argue that no one sane would try running direwolf on a mainframe, it's often the case that these failures are latent on other architectures when detected on one.

From the build.log files of the rawhide build:

Executing(%check): /bin/sh -e /var/tmp/rpm-tmp.wGau2O

@sharkcz
Copy link

sharkcz commented Oct 30, 2020

@mdomsch, please ping me on IRC, if you want to look into it yourself. The same applies to upstream devels, I can provide access to a s390x machine. Just by the name enctest I would suspect some endianness bug ...

@wb2osz
Copy link
Owner

wb2osz commented Oct 30, 2020

The only place where endianness should matter is in serializing integers with the AGW network protocol. There is already code in there to handle this. It's the same idea as the htonl() and similar functions except it has the opposite network byte order. It would be interesting to know why it fails. As you said it might be exposing some latent issue.

@mdomsch
Copy link
Contributor Author

mdomsch commented Nov 2, 2020

Reproduced on an s390x VM provided by @sharkcz.


Starting program: /var/lib/mock/fedora-rawhide-s390x-mdomsch/root/builddir/build/BUILD/direwolf-1.6/s390x-redhat-linux-gnu/test/enctest
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.31-4.fc32.s390x
*** buffer overflow detected ***: terminated

Program received signal SIGABRT, Aborted.
0x000003fffdc4a426 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x000003fffdc4a426 in raise () from /lib64/libc.so.6
#1  0x000003fffdc2b00e in abort () from /lib64/libc.so.6
#2  0x000003fffdc910c0 in __libc_message () from /lib64/libc.so.6
#3  0x000003fffdd34094 in __fortify_fail () from /lib64/libc.so.6
#4  0x000003fffdd32798 in __chk_fail () from /lib64/libc.so.6
#5  0x000003fffdc87282 in _IO_str_chk_overflow () from /lib64/libc.so.6
#6  0x000003fffdc7c0f4 in __vfprintf_internal () from /lib64/libc.so.6
#7  0x000003fffdc87342 in __vsprintf_internal () from /lib64/libc.so.6
#8  0x000003fffdd3229a in __sprintf_chk@@GLIBC_2.4 () from /lib64/libc.so.6
#9  0x000002aa0000198c in sprintf (__fmt=0x2aa00007acc "%03d%s%c", __s=0x3ffffffecae "07126.47\357\070") at /builddir/build/BUILD/direwolf-1.6/src/latlong.c:145
#10 longitude_to_str (slong=0x3ffffffecae "07126.47\357\070", ambiguity=0, dlong=71.44116666666666) at /builddir/build/BUILD/direwolf-1.6/src/latlong.c:181
#11 set_norm_position (presult=0x3ffffffeca5, ambiguity=0, dlong=-71.44116666666666, dlat=42.576833333333333, symbol=38 '&', symtab=68 'D')
    at /builddir/build/BUILD/direwolf-1.6/src/encode_aprs.c:93
#12 encode_position (result_size=100, presult=0x3ffffffeca4 "!4234.61ND07126.47\357\070", comment=0x0, offset=0, tone=0, freq=0, speed=0, course=-999999, dir=0x0, gain=0, height=0, power=0,
    symbol=38 '&', symtab=68 'D', alt_ft=-999999, ambiguity=0, lon=-71.44116666666666, lat=42.576833333333333, compressed=0, messaging=0)
    at /builddir/build/BUILD/direwolf-1.6/src/encode_aprs.c:560
#13 main (argc=<optimized out>, argv=<optimized out>) at /builddir/build/BUILD/direwolf-1.6/src/encode_aprs.c:843

@mdomsch
Copy link
Contributor Author

mdomsch commented Nov 3, 2020

        snprintf (smin, sizeof(smin), "%05.2f", dmin);
        /* Due to roundoff, 59.9999 could come out as "60.00" */
        if (smin[0] == '6') {
          smin[0] = '0';
          ideg++;
        }

        sprintf (slong, "%03d%s%c", ideg, smin, hemi);

slong is 9 characters (from position_t). smin is 8 characters (from this function's stack). This format string can thus be 3 + 8 + 1 + null terminator = 13 characters long, trying to fit in a 9-char buffer.

Shouldn't the formatting of smin be %02.2f instead of %05.2f ? You can't have more than 60 minutes. That saves 3 characters, which would then fit (except for the null terminator).

I tried changing this to %02.2f, and it still fails.

@mdomsch
Copy link
Contributor Author

mdomsch commented Nov 3, 2020

Are the lat and lon fields in position_t supposed to be null-terminated strings, or an array of char with no null terminator? If the latter, then doing the sprintf into a temporary (stack) buffer of say size 16, then memcpy from the buffer into each of lat and lon just the length of lat and long, does succeed and lets all the tests pass. Not pretty, but it works.

diff --git a/src/latlong.c b/src/latlong.c
index b3eadcc..1de9c5a 100644
--- a/src/latlong.c
+++ b/src/latlong.c
@@ -78,6 +78,7 @@ void latitude_to_str (double dlat, int ambiguity, char *slat)
        int ideg;       /* whole number of degrees. */
        double dmin;    /* Minutes after removing degrees. */
        char smin[8];   /* Minutes in format mm.mm */
+       char tempbuf[16];

        if (dlat < -90.) {
          text_color_set(DW_COLOR_ERROR);
@@ -101,14 +102,17 @@ void latitude_to_str (double dlat, int ambiguity, char *slat)
        ideg = (int)dlat;
        dmin = (dlat - ideg) * 60.;

-       snprintf (smin, sizeof(smin), "%05.2f", dmin);
+       snprintf (smin, sizeof(smin), "%02.2f", dmin);
        /* Due to roundoff, 59.9999 could come out as "60.00" */
        if (smin[0] == '6') {
          smin[0] = '0';
          ideg++;
        }

-       sprintf (slat, "%02d%s%c", ideg, smin, hemi);
+       text_color_set(DW_COLOR_DEBUG);
+       dw_printf("latitude_to_str ideg=%d smin=%s hemi=%c\n", ideg, smin, hemi);
+       snprintf(tempbuf, sizeof(tempbuf), "%02d%s%c", ideg, smin, hemi);
+       memcpy(slat, tempbuf, 8);

        if (ambiguity >= 1) {
          slat[6] = ' ';
@@ -148,6 +152,7 @@ void longitude_to_str (double dlong, int ambiguity, char *slong)
        int ideg;       /* whole number of degrees. */
        double dmin;    /* Minutes after removing degrees. */
        char smin[8];   /* Minutes in format mm.mm */
+       char tempbuf[16];

        if (dlong < -180.) {
          text_color_set(DW_COLOR_ERROR);
@@ -171,14 +176,17 @@ void longitude_to_str (double dlong, int ambiguity, char *slong)
        ideg = (int)dlong;
        dmin = (dlong - ideg) * 60.;

-       snprintf (smin, sizeof(smin), "%05.2f", dmin);
+       snprintf (smin, sizeof(smin), "%02.2f", dmin);
        /* Due to roundoff, 59.9999 could come out as "60.00" */
        if (smin[0] == '6') {
          smin[0] = '0';
          ideg++;
        }

-       sprintf (slong, "%03d%s%c", ideg, smin, hemi);
+       text_color_set(DW_COLOR_DEBUG);
+       dw_printf ("longitude_to_str ideg=%d smin=%s hemi=%c\n", ideg, smin, hemi);
+       sprintf (tempbuf, "%03d%s%c", ideg, smin, hemi);
+       memcpy(slong, tempbuf, 9);
 /*
  * The spec says position ambiguity in latitude also
  * applies to longitude automatically.


@mdomsch
Copy link
Contributor Author

mdomsch commented Nov 4, 2020

While that worked on s390x and armv7hl, the lltest executable fails all tests on all the other architectures including i686 and x86_64. It clearly needed the null terminators. So, back to the drawing board.


 5/20 Test  #6: lltest ...........................***Failed    0.01 sec
Error 1.1: Did not expect "4515.00N��
�$V"
Error 1.2: Did not expect "4515.00S��
�$V"
Error 1.3: Did not expect "4559.99N��
�$V"
Error 1.4: Did not expect "4600.00N��
�$V"
Error 1.5: Did not expect "4559.9 N��
�$V"
Error 1.6: Did not expect "4559.  N��
�$V"
Error 1.7: Did not expect "455 .  N��
�$V"
Error 1.8: Did not expect "45  .  N��
�$V"
Error 2.1: Did not expect "04515.00E�
�$V"
Error 2.2: Did not expect "04515.00W�
�$V"
Error 2.3: Did not expect "04559.99E�
�$V"
Error 2.4: Did not expect "04600.00E�
�$V"
Error 2.5: Did not expect "04559.9 E�
�$V"
Error 2.6: Did not expect "04559.  E�
�$V"
Error 2.7: Did not expect "0455 .  E�
�$V"
Error 2.8: Did not expect "045  .  E�
�$V"

@wb2osz wb2osz self-assigned this Nov 5, 2020
@wb2osz wb2osz added the bug label Nov 5, 2020
@wb2osz
Copy link
Owner

wb2osz commented Nov 5, 2020

Fixed in dev branch. 4a1aa2b
Is this good enough or do you want it merged to master branch?

@mdomsch
Copy link
Contributor Author

mdomsch commented Nov 6, 2020 via email

@wb2osz
Copy link
Owner

wb2osz commented Nov 7, 2020

Fixed.

@wb2osz wb2osz closed this as completed Nov 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants