Reconsider the /3 sample rate default for ARM #234

Closed
arodland opened this issue Dec 2, 2019 · 2 comments

@arodland
Contributor

arodland commented Dec 2, 2019

I would hazard a guess that most ARM devices running Direwolf today aren't Raspberry Pi 1s. Most newer systems easily support the full-rate demodulator. Getting full sample rate on these devices requires a fair amount of digging through the documentation to figure out the modem syntax. In order to make Direwolf show its best performance, I think it may be time to default to /1 on all devices.

For those who are running low-power CPUs that can't handle the full-rate decode, maybe we can handle that a little bit more gracefully — for example, by updating the message printed on -EPIPE to mention the decimate setting, or maybe even re-initing the demodulator on the fly after a few overruns.
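As a rough illustration of the first idea (a more helpful overrun message), here is a minimal Python sketch. Everything in it is hypothetical: `overrun_hint`, its parameters, and the threshold of 3 are made up for illustration and are not taken from the Direwolf source, which is written in C.

```python
def overrun_hint(overrun_count, decimation, threshold=3):
    """Return a hint string once enough overruns have been seen, else None.

    Hypothetical helper: overrun_count is the number of audio input
    overruns (-EPIPE) seen so far, decimation is the current /n setting.
    """
    if overrun_count < threshold:
        return None  # a stray overrun or two is not worth a warning
    if decimation >= 3:
        return None  # already decimating; this hint would not help
    return (
        f"Audio input overrun ({overrun_count} so far). "
        f"The CPU may be too slow for /{decimation}. "
        "Try /3 after MODEM in the config file, or -D3 on the command line."
    )


# Example: the third overrun at full sample rate triggers the hint.
for count in range(1, 4):
    message = overrun_hint(count, decimation=1)
    if message:
        print(message)
```

The re-init-on-the-fly variant would need the same counter, but would change the decimation itself instead of only printing a suggestion.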

@dranch
Collaborator

dranch commented Aug 3, 2020

Since you can easily override this configuration default, have you seen a marked decode improvement when removing /3? Btw, Direwolf is running on a lot of Raspberry Pi Zero, Zero W, and similar boards.

@wb2osz
Owner

wb2osz commented Dec 31, 2020

I'm more of a scientist than a philosopher so I like to run experiments to answer questions.

========== RPi 3 ========================

First let's see what impact this has. I'm running on an RPi 3 here.
"-P+" means use multiple slicers.
"-D1" is the decimation ratio for the audio samples in.

$ atest -P+ -D1 01_Track_1.wav

While this is running, it is taking 100% of one core. No surprise there.
We are compute bound, not held back by the real-time audio arriving.

top - 18:01:19 up 17 days, 7:24, 4 users, load average: 0.80, 0.28, 0.10
Tasks: 137 total, 2 running, 135 sleeping, 0 stopped, 0 zombie
%Cpu(s): 25.0 us, 0.3 sy, 0.0 ni, 74.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 936.7 total, 49.9 free, 109.8 used, 777.1 buff/cache
MiB Swap: 100.0 total, 95.0 free, 5.0 used. 750.6 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6799 pi 20 0 7324 3196 2648 R 100.0 0.3 1:36.64 atest
6821 pi 20 0 10188 2844 2484 R 0.7 0.3 0:00.15 top

The RPi 3 is plenty fast. Decoding in real time would take only about 13% of one core (the run finishes at 7.8 x realtime, as shown at the end).


… at the end

1011 from 01_Track_1.wav
1011 packets decoded in 199.722 seconds. 7.8 x realtime

So an RPi 3 is plenty fast.

Now with "-D3".

$ atest -P+ -D3 01_Track_1.wav


1007 from 01_Track_1.wav
1007 packets decoded in 69.470 seconds. 22.3 x realtime

Roughly 3 times faster, which I suppose you might expect if the audio sample rate is 1/3.
Decode performance has degraded by only 4 packets out of 1011, which is about 0.4%.
Pretty insignificant.
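
For anyone who wants to check the arithmetic, here is a small Python sketch using only the numbers from the two RPi 3 runs above. The function names are mine, for illustration only.

```python
# Figures copied from the two atest runs on the RPi 3 above.
runs = {
    "-D1": {"packets": 1011, "seconds": 199.722},
    "-D3": {"packets": 1007, "seconds": 69.470},
}


def speedup(slow_seconds, fast_seconds):
    """How many times faster the second run finished than the first."""
    return slow_seconds / fast_seconds


def decode_loss_percent(full_rate_packets, decimated_packets):
    """Packets lost by decimating, as a percentage of the full-rate count."""
    return 100.0 * (full_rate_packets - decimated_packets) / full_rate_packets


d1, d3 = runs["-D1"], runs["-D3"]
print(f"speedup: {speedup(d1['seconds'], d3['seconds']):.2f}x")
print(f"decode loss: {decode_loss_percent(d1['packets'], d3['packets']):.2f}%")
```

The speedup comes out a little under 3, and the loss a little under 0.4%.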

=============== RPi 1A ===========================

I'm still running an RPi 1A (not even the + variation) for my digipeater/IGate.
Others might want to use an RPi Zero, which is probably comparable.

top - 18:23:27 up 31 days, 22:19, 4 users, load average: 2.61, 2.88, 2.72
Tasks: 109 total, 1 running, 108 sleeping, 0 stopped, 0 zombie
%Cpu(s): 23.8 us, 3.6 sy, 0.0 ni, 71.5 id, 0.4 wa, 0.0 hi, 0.7 si, 0.0 st
MiB Mem : 432.2 total, 151.0 free, 56.4 used, 224.8 buff/cache
MiB Swap: 100.0 total, 44.8 free, 55.2 used. 301.6 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
835 pi 20 0 116876 3904 3164 S 20.2 0.9 9785:27 direwolf
16804 pi 20 0 10172 2960 2484 R 2.6 0.7 0:02.73 top

Direwolf is running at about 20% CPU, which I would consider comfortable.

Let's shut down the digipeater for a while to run some timing tests.

$ atest -P+ -D1 01_Track_1.wav

1011 from 01_Track_1.wav
1011 packets decoded in 959.914 seconds. 1.6 x realtime

It's faster than real-time but I would worry about it keeping up if anything else is going on.

$ atest -P+ -D3 01_Track_1.wav


1007 from 01_Track_1.wav
1007 packets decoded in 270.298 seconds. 5.7 x realtime
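
To turn the "x realtime" figures into something comparable with what top shows, here is a rough Python sketch. It assumes decode cost scales linearly with audio duration and ignores everything else Direwolf does (audio I/O, networking, digipeating), so treat the results as estimates only. The function name is mine.

```python
# Realtime factors as printed by atest in the four runs above.
realtime_factor = {
    ("RPi 3", "-D1"): 7.8,
    ("RPi 3", "-D3"): 22.3,
    ("RPi 1A", "-D1"): 1.6,
    ("RPi 1A", "-D3"): 5.7,
}


def core_share_percent(factor):
    """Estimated share of one core needed to keep up with live audio.

    A run that decodes N times faster than real time would need
    roughly 1/N of a core when fed audio at normal speed.
    """
    return 100.0 / factor


for (board, option), factor in realtime_factor.items():
    print(f"{board} {option}: ~{core_share_percent(factor):.1f}% of one core")
```

On the RPi 1A this gives roughly 62.5% of the core for -D1 against about 17.5% for -D3. The latter is in the same range as the 20% that top reports for the running digipeater, assuming it was using the ARM default of /3.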

============ Summary =================================

So, there is the dilemma.
I want it to work on almost any ARM, even the low-end models, without the user doing anything special.
Based on what we see in the forum, too many people can't RTFM.

That's why the decimation factor of 3 is in there for ARM.

People with faster CPUs can add the "-D1" command line option (or /1 after MODEM in the config file) if they really care about the 0.4% improvement.

@wb2osz wb2osz closed this as completed Dec 31, 2020