^M carriage return in logs, dti row #317
Comments
Your first packet that has the ^M in it is not valid (see the destination callsign with a "?" in it). Are you using Direwolf's FIX_BITS feature? If so, turn it off. Even without FIX_BITS, there is a non-zero chance that a packet will pass the CRC check yet still be broken. Maybe Direwolf could improve its output sanitization, but it will never be perfect. I would recommend having your application sanitize its input as well. Next up, if you read page 20 on APRS comments at http://www.aprs.org/doc/APRS101.PDF , it seems like carriage returns ARE allowed (not explicitly forbidden).
I am sanitizing it in my app that watches the log and drops the rows into the DB in real time, but I'm trying to bulk import over 8 million rows into the database, and sanitizing is a huge overhead on that end. While a carriage return in a string is allowed in APRS, a bare one in an unquoted field is not allowed by the CSV spec. If these were raw logs that'd be one thing, but you're outputting a standardized format and it should be compliant. Otherwise this bespoke Direwolf CSV format you invented is not really useful, since just about any tool that reads it will error out. https://tools.ietf.org/html/rfc4180 I can accept losing a random packet that can't be parsed once in a blue moon, but as a station operator I'm responsible for my station, and I believe having complete and accurate logs is important. I would much rather be assured that the logs I'm scraping won't keep coming up with new ways to deviate from standards over time.
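For anyone facing the same bulk import, here is a minimal sketch of a one-pass cleanup that could run before the import. It is not part of Direwolf. It assumes each log record ends with LF or CRLF, so a CR sitting directly before the LF is treated as part of the line ending (which means a comment that genuinely ends in CR cannot be told apart from a CRLF ending). The names `PLACEHOLDER`, `strip_terminator`, and `sanitize_stream` are made up for illustration, and the placeholder text is an arbitrary choice.

```python
import io

# Hypothetical visible escape for a stray CR: a backslash followed by "r".
PLACEHOLDER = "\\r"


def strip_terminator(line: str) -> str:
    """Drop one trailing LF, then one trailing CR (assumed line ending)."""
    body = line[:-1] if line.endswith("\n") else line
    if body.endswith("\r"):
        body = body[:-1]
    return body


def sanitize_stream(src, dst) -> int:
    """Copy records from src to dst, replacing stray CRs inside a record.

    Returns the number of records that were changed.
    src must split lines on LF only, e.g. open(path, newline="\n").
    """
    changed = 0
    for line in src:
        body = strip_terminator(line)
        if "\r" in body:
            body = body.replace("\r", PLACEHOLDER)
            changed += 1
        dst.write(body + "\n")
    return changed
```

Opening the input with `open(path, newline="\n")` matters here: with the default newline handling Python would split the record at the stray CR before the function ever sees it.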
Interesting that CSV doesn't allow bare carriage returns. I suppose if Direwolf is to strictly follow the RFC, this would need to be changed. Not sure what WB2OSZ would want to do here; to be technically accurate, the ^M should stay, as that's potentially what was received.
I'm fine with them being escaped. It just needed quotes put around the field and Postgres would take it as it is. I would rather have more accurate data; then it can be displayed escaped or as intended in the final output. But for data handling, the log file should conform to CSV standards, not the incoming protocol's standards. I would also love to have a raw log file of unprocessed packets, faults and all, in addition to the CSV. With the existing CSV file we are already losing information in the conversion from a packet into the series of objects Direwolf produces when it processes it, so I see no loss at all in linting the CSV output so it's valid, encoded properly, and consumable by anything that can parse CSV files. This would also make it trivial to transform the data into other formats as needed, such as JSON, YAML, or XML. But if it won't even pass its own standard, there is a rough road ahead for anyone wanting to transform it into a new format.
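To illustrate the point about quoting, here is a small Python sketch showing that a field containing a carriage return survives a CSV round trip once it is wrapped in quotes, which is what RFC 4180 permits. This is only a demonstration with Python's standard `csv` module, not Direwolf's logging code; the helper names and the sample callsign and comment are made up.

```python
import csv
import io


def write_row(fields):
    """Serialize one row; the csv module quotes fields that contain CR or LF."""
    buf = io.StringIO(newline="")
    csv.writer(buf).writerow(fields)
    return buf.getvalue()


def read_rows(text):
    """Parse CSV text; newline="" lets the reader see embedded CRs untouched."""
    return list(csv.reader(io.StringIO(text, newline="")))


# Made-up sample: a comment field with an embedded carriage return.
line = write_row(["N0CALL", "hello\rworld"])
rows = read_rows(line)
```

Here `line` comes out with only the second field quoted, and `rows` contains the original two fields with the CR intact, so nothing received over the air is lost.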
I'm trying to import this data into a database for viewing, and I keep running into this issue and having to fix these lines by hand. The ^M is seen as a single character in a text editor; I can delete it and replace it with a literal ^M, and it's fine afterward.
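Rather than hunting for the affected rows by hand, a short script can list them. The sketch below is an assumption-laden helper, not a Direwolf tool: it assumes records end with LF or CRLF and reports the 1-based numbers of records that still contain a CR once the line ending is set aside. The function name `find_cr_rows` is hypothetical.

```python
def find_cr_rows(data: bytes) -> list[int]:
    """Return 1-based record numbers that contain a stray carriage return."""
    hits = []
    for number, record in enumerate(data.split(b"\n"), start=1):
        # A CR directly before the LF is assumed to be part of a CRLF ending.
        if record.endswith(b"\r"):
            record = record[:-1]
        if b"\r" in record:
            hits.append(number)
    return hits
```

Reading the file in binary mode (`open(path, "rb").read()`) keeps Python from translating or splitting on the CR before the check runs.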