Need method to disable CPU flag detection at build time #266

Closed
matt-domsch-sp opened this issue Apr 20, 2020 · 3 comments

@matt-domsch-sp

In this commit, CPU flag detection was added to the build process.

commit a1c16a6
Author: Davide Gerhard <rainbow@irh.it>
Date: Thu Aug 15 08:30:42 2019 +0200

cmake: new build tool

     - cpu flags are auto-discovered in the default build and it works
    on gcc/clang/msvc on x86/x86_64/arm; you can force cpu flags with
    -DFORCE_SSE=1 for example (see CMakeLists.txt on root)

When building on/for the target system, this is great. However when building in a Linux distribution environment such as Fedora, this introduces build-system dependencies on the runtime system. Specifically, a build system may support CPU instructions that are not supported on the runtime CPUs of every target.

When packaging for Fedora, I must patch out the line that invokes the FindCPUflags method. I would prefer to have a build-time define option to disable this. This allows the distribution to continue to choose the broad set of CPUs that it targets by setting the CFLAGS itself at build time.
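
A minimal sketch of what such an opt-out could look like follows. It is only an illustration: the option name `DETECT_CPU_FLAGS` is hypothetical, and the `find_package(CPUflags ...)` call is an assumption about how the module is invoked, not something taken from the project's CMakeLists.txt.

```cmake
# Hypothetical opt-out switch; the option name is illustrative only.
option(DETECT_CPU_FLAGS "Probe the build host for SIMD support" ON)

if(DETECT_CPU_FLAGS)
  # Assumed invocation of cmake/modules/FindCPUflags.cmake; this requires
  # that directory to be listed in CMAKE_MODULE_PATH.
  find_package(CPUflags REQUIRED)
else()
  # No probing: only the flags supplied by the packager are used.
  message(STATUS "CPU flag detection disabled; using externally supplied CFLAGS")
endif()
```

With something like this in place, a packager could configure with `cmake -DDETECT_CPU_FLAGS=OFF ..` and let the distribution's CFLAGS decide which instruction sets are used.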

@wb2osz
Owner

wb2osz commented Apr 21, 2020

What a coincidence! I was working on this today after discovering that the application built on a not terribly old computer died when run on an older computer in the "shack." My specific case was for Windows but the same would apply to Linux.

Added to cmake/modules/FindCPUflags.cmake

+# direwolf versions thru 1.5 were only pre-built for 32 bit Windows targets.
+# Much research and experimentation revealed that the SSE instructions made a big
+# difference in runtime speed but SSE2 and later were not significantly better
+# for this application. I decided to build with only the SSE instructions making
+# the Pentium 3 the minimum requirement. SSE2 would require at least a Pentium 4
+# and offered no significant performance advantage.
+# These are ancient history - from the previous Century - but old computers, generally
+# considered useless for anything else, often end up in the ham shack.
+#
+# When cmake was first used for direwolf, the default target became 64 bit and the
+# SSE2, SSE3, SSE4.1, and SSE4.2 instructions were automatically enabled based on the
+# build machine capabilities. This was fine until I tried running the application
+# on a computer much older than where it was built. It did not have the SSE4 instructions.
+# Just how much benefit do these new instructions provide for this application?
+#
+# Times to run atest with Track 1 of the TNC test CD:
+#
+# direwolf 1.5 - 32 bit target - gcc 6.3.0
+#
+# 60.4 sec. Pentium 3 with SSE
+#
+# direwolf 1.6 - 32 bit target - gcc 7.4.0
+#
+# 81.0 sec. with no SIMD instructions enabled.
+# 54.4 sec. with SSE
+# 52.0 sec. with SSE2
+# 52.4 sec. with SSE2, SSE3
+# 52.3 sec. with SSE2, SSE3, SSE4.1, SSE4.2
+#
+# That's what I found several years ago with a much older compiler.
+# The original SSE helped a lot but later made little difference.
+#
+# I don't even know if the last one makes sense. A later Pentium 4
+# (Prescott) had SSE3. but were there any 32 bit processors that
+# supported the SSE4 instructions?
+#
+# direwolf 1.6 - 64 bit target - gcc 7.4.0
+#
+# 34.8 sec. with no SIMD instructions enabled.
+# 34.8 sec. with SSE
+# 34.8 sec. with SSE2
+# 34.2 sec. with SSE2, SSE3
+# 33.5 sec. with SSE2, SSE3, SSE4.1, SSE4.2
+#
+# Why are the first three the same? Maybe a 64-bit target implies
+# that the SSE (and probably SSE2) instructions are available.
+#
+# That's interesting. Building for a 64 bit target makes it run more than
+# 1.5x faster on the same hardware. direwolf 1.6 will be available for both
+# 64 and 32 bit targets. I don't know if the i686 in the compiler name means
+# that a minimum of a Pentium Pro would be required.
+#
+# Here we force it use plain old SSE without hacking up the following
+# automatic selection logic. Some ambitious person might want to reorder the
+# tests so a later instruction set could be specified on the cmake command line.
+# I have more pressing matters right now.
+#
+set(FORCE_SSE 1)
+

I will push this to GitHub after testing it on a very old computer.
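
The patch above hard-codes `set(FORCE_SSE 1)`. One possible variant keeps plain SSE as the default but lets the cmake command line override it. This is a sketch under assumptions, not the change that was eventually committed: only `FORCE_SSE` appears in the thread, and whether a value of 0 falls through to the automatic detection depends on how the rest of FindCPUflags.cmake tests the variable.

```cmake
# Default to plain SSE only when the user has not already set FORCE_SSE,
# e.g. with -DFORCE_SSE=0 or -DFORCE_SSE=1 on the cmake command line.
if(NOT DEFINED FORCE_SSE)
  set(FORCE_SSE 1)
endif()

# Report the effective setting so it shows up in the configure log.
message(STATUS "FORCE_SSE = ${FORCE_SSE}")
```

Guarding the assignment with `if(NOT DEFINED ...)` keeps the conservative default for ordinary builds while leaving the choice to anyone who passes the variable explicitly.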

@matt-domsch-sp
Author

Fedora (and similar distributions that use RPM as the packaging system) uses a set of gcc compiler options that are optimized for a wide range of processors, with separate i686 and x86_64 compiler options. These appear to be:

i686: -m32 -march=i686 -mtune=generic -msse2 -mfpmath=sse (effectively Pentium 4 / Pentium M and newer, since SSE2 is required)

x86_64: -m64 -mtune=generic (as recommended by GCC for processor-agnostic builds). All 64-bit capable chips support MMX, SSE, and SSE2, so these are included already.

For my part, I think I can just use cmake -DENABLE_GENERIC=1 to disable these tests, and then it uses the build system-provided options above, plus the few optimization options specified in the top-level CMakeLists.txt (-ffast-math -ftree-vectorize and the warning switches). That's good enough for me.

@wb2osz
Owner

wb2osz commented Apr 22, 2020

Fixed in f293186.

Default behavior is now like the Fedora standard for maximum compatibility.
Packagers won't need to do anything special.
