2020-07-12T21:16Z; bktei> Note: This file is now retired since ~bklog~
has replaced ~bkgpslog~.
 
** DONE Add job control for short buffer length
   CLOSED: [2020-07-02 Thu 16:04]
2020-07-02T14:56Z; bktei> File write operations were bundled into a
magicWriteBuffer() function that is called and then detached from the
script shell (job control), but the detached job is not tracked by the
main script. A problem may arise if two instances of
magicWriteBuffer() attempt to write to the same tar file
simultaneously. Two instances may exist if the buffer length is low
(ex: 1 second); the default buffer length of 60 seconds should reduce
the probability of a collision, but it should be possible for the main
script to record the process ID of a magicWriteBuffer() job as soon as
it detaches (via ~$!~ as described [[https://bashitout.com/2013/05/18/Ampersands-on-the-command-line.html][here]]) and then check that the
process is still alive.
 
2020-07-02T15:23Z; bktei> I found that the Bash ~wait~ built-in can be
used to delay processing until a specified job completes. The ~wait~
command will pause script execution until all backgrounded processes
have completed.
2020-07-02T16:03Z; bktei> Added ~wait~.
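A minimal sketch of the tracking idea above, not the actual ~bkgpslog~ code. It assumes a hypothetical ~writeBuffer~ stand-in for magicWriteBuffer() and a temporary file in place of the tar archive. The PID from ~$!~ is recorded after each detach, and ~wait~ is called on it before the next write starts, so two writers should not overlap even with a very short buffer length:

```bash
#!/usr/bin/env bash
# Sketch only: writeBuffer is a hypothetical stand-in for magicWriteBuffer().
outFile="$(mktemp)"     # stands in for the output tar file
pidWrite=""             # PID of the most recent detached write job

writeBuffer() {
    sleep 0.1                         # simulate a slow write
    printf '%s\n' "$1" >> "$outFile"
}

for round in 1 2 3; do
    # Block until the previous detached write (if any) has finished.
    if [ -n "$pidWrite" ]; then
        wait "$pidWrite"
    fi
    writeBuffer "round $round" &      # detach the write job
    pidWrite=$!                       # remember its process ID
done

wait    # no arguments: wait for all remaining background jobs
cat "$outFile"
```

Waiting on one specific PID keeps the main loop free to do other work between rounds; a bare ~wait~ is only needed once, before the script exits.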
 
** DONE Rewrite tar initialization function
   CLOSED: [2020-07-02 Thu 17:23]
2020-07-02T17:23Z; bktei> Simplified the tar initialization function
so that the ~VERSION~ file is used both to test whether the tar file
can be appended to and to mark when a new session starts.
 
** DONE Consolidate tar checking/creation into function
   CLOSED: [2020-07-02 Thu 18:33]
2020-07-02T18:33Z; bktei> Simplified how the output tar file's
existence is checked and how its validity as a tar file is verified.
This was done using a new function ~checkMakeTar~.
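A rough sketch of what such a check-or-create function could look like. This is a guess at the behavior, not the original ~checkMakeTar~; it assumes GNU ~tar~, and the return-code convention (0 for an existing valid archive, 1 when a new one had to be created) is my own assumption:

```bash
#!/usr/bin/env bash
# Sketch only: a guess at checkMakeTar's behavior. Assumes GNU tar.
checkMakeTar() {
    local pathTar="$1"
    # An existing file that tar can list is treated as a valid archive.
    if [ -f "$pathTar" ] && tar --list --file="$pathTar" > /dev/null 2>&1; then
        return 0
    fi
    # Otherwise create an empty archive (this overwrites an invalid file).
    tar --create --file="$pathTar" --files-from=/dev/null
    return 1
}

dirDemo="$(mktemp -d)"
if checkMakeTar "$dirDemo/out.tar"; then first="existing"; else first="created"; fi
if checkMakeTar "$dirDemo/out.tar"; then second="existing"; else second="created"; fi
echo "$first $second"
```

Returning a distinct code when the archive is newly created would let the caller decide whether a fresh ~VERSION~ file needs to be appended.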
 
** DONE Add VERSION if output tar deleted between writes
   CLOSED: [2020-07-02 Thu 20:22]
2020-07-02T20:21Z; bktei> Added bkgpslog-specific function
magicWriteVersion() to be called whenever a new time-stamped ~VERSION~
file needs to be generated and appended to the output tar file
 
** DONE Rewrite buffer loop to reduce lag between gpspipe runs
   CLOSED: [2020-07-03 Fri 20:57]
2020-07-03T17:10Z; bktei> As is, there is still a 5-6 second lag
between when ~gpspipe~ times out at the end of a buffer round and when
~gpspipe~ is called by the subsequent buffer round. I believe this can
be reduced by moving variable manipulations inside the
asynchronously-executed magicWriteBuffer() function. Ideally, the
while loop should look like:
 
while (( SECONDS < SCRIPT_TTL )); do
    gpspipe -r > "$DIR_TMP"/buffer.nmea
 
2020-07-03T20:56Z; bktei> I simplified it further to something like

while (( SECONDS < SCRIPT_TTL )); do

Raspberry Pi Zero W shows approximately 71 ms of drift per buffer round
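For reference, a self-contained sketch of the loop shape discussed in this entry. It is not the actual script: the ~sh -c~ one-liner stands in for ~gpspipe -r~, coreutils ~timeout~ is my assumption for how a round is ended, and ~writeBuffer~ is a hypothetical stand-in for magicWriteBuffer(). The point is that the only work between two collection runs is launching a background job:

```bash
#!/usr/bin/env bash
# Sketch only: the collector and writeBuffer are stand-ins, values are assumed.
DIR_TMP="$(mktemp -d)"
SCRIPT_TTL=2     # script lifespan in seconds
BUFFER_TTL=1     # length of one buffer round in seconds
SECONDS=0        # restart Bash's elapsed-time counter
rounds=0

# Hypothetical stand-in for magicWriteBuffer(): process one finished buffer.
writeBuffer() {
    wc -l < "$1" >> "$DIR_TMP/processed.log"
    rm -f "$1"
}

while (( SECONDS < SCRIPT_TTL )); do
    buffer="$DIR_TMP/buffer_$rounds.nmea"
    # Stand-in for `gpspipe -r`; timeout ends the round after BUFFER_TTL seconds.
    timeout "$BUFFER_TTL" sh -c 'while :; do date +%s; sleep 0.2; done' > "$buffer" || true
    writeBuffer "$buffer" &      # process asynchronously; next round starts at once
    rounds=$(( rounds + 1 ))
done

wait    # let the last background write finish
echo "rounds completed: $rounds"
```

Each round writes to its own buffer file so that a background job still processing the previous round is never overwritten.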
 
** DONE Feature: Recipient watch folder
   CLOSED: [2020-07-12 Sun 21:08]
2020-07-03T21:28Z; bktei> This feature would scan the contents of a
specified directory at the start of every buffer round in order to
determine encryption (~age~) recipients. This would allow a device to
dynamically encrypt location data in response to automated changes
made by other tools. For example, if such a directory were
synchronized via Syncthing and changes to it were managed by a trusted
remote server, then that server could respond to human requests to
secure location data.
 
Two specific privacy subfeatures come to mind:

1. Parallel encryption: Given a set of ~n~ public keys, encrypt data
   with a single ~age~ command with options causing all ~n~ pubkeys to
   be recipients. In order to decrypt the data, any individual private
   key could be used. No coordination between key owners would be
   required.

2. Sequential encryption: Given a set of ~n~ public keys, encrypt data
   with ~n~ sequential ~age~ commands all piped in series, with each
   ~age~ command utilizing only one of the ~n~ public keys. To decrypt
   the data, all ~n~ private keys would be required. Since
   coordination is required, it is less convenient than parallel
   encryption.
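The two modes could be sketched as follows. The function names and the ~AGE_CMD~ variable are hypothetical; the only ~age~ feature relied on is the ~-r~ recipient flag, which may be repeated in a single call. Both functions read plaintext on stdin and write ciphertext to stdout:

```bash
#!/usr/bin/env bash
# Sketch only: function names and AGE_CMD are hypothetical.
AGE_CMD="${AGE_CMD:-age}"    # can be overridden to test without age installed

# Parallel: one age call, every key is a recipient; any one private key decrypts.
encryptParallel() {
    local args=() key
    for key in "$@"; do
        args+=(-r "$key")
    done
    "$AGE_CMD" "${args[@]}"
}

# Sequential: one age call per key, piped in series; all private keys are needed.
encryptSequential() {
    if [ "$#" -eq 0 ]; then
        cat                   # no keys left: pass data through unchanged
        return
    fi
    local key="$1"
    shift
    "$AGE_CMD" -r "$key" | encryptSequential "$@"
}

# Usage (keys are placeholders):
#   encryptParallel   age1aaa age1bbb < buffer.nmea > buffer.nmea.age
#   encryptSequential age1aaa age1bbb < buffer.nmea > buffer.nmea.age
```

In the sequential case the first key given forms the innermost layer and the last key the outermost one.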
 
In either case, a directory would be useful for holding configuration
files specifying which feature, or combination of features, to apply
at the start of every buffer round.

I don't yet know how to program the rules, although I think it'd be
easier to simply add an option providing ~bkgpslog~ with a directory
to watch. When examining the directory, check for files with the
appropriate file extension (ex: .pubkey) and then read the first line
of each into the script's pubKey array.
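A small sketch of that directory scan, under the assumption that each ~.pubkey~ file holds one key on its first line. The function name ~readPubKeys~ and the demo key strings are made up for illustration:

```bash
#!/usr/bin/env bash
# Sketch only: readPubKeys and the demo key strings are hypothetical.
pubKey=()

readPubKeys() {
    local dir="$1" file line
    pubKey=()
    for file in "$dir"/*.pubkey; do
        [ -f "$file" ] || continue            # glob matched nothing
        line=""
        IFS= read -r line < "$file" || true   # first line only; tolerate a missing newline
        if [ -n "$line" ]; then
            pubKey+=("$line")
        fi
    done
}

dirWatch="$(mktemp -d)"
printf 'age1exampleA\nsecond line is ignored\n' > "$dirWatch/alice.pubkey"
printf 'age1exampleB' > "$dirWatch/bob.pubkey"
printf 'not a key\n' > "$dirWatch/notes.txt"

readPubKeys "$dirWatch"
printf '%s\n' "${pubKey[@]}"
```

The array is rebuilt from scratch on every call, so a key file removed from the directory drops out of the recipient list on the next buffer round.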
 
2020-07-12T21:08Z; bktei> ~-R~ watch directory option added in ~bkgpslog~ ver
 
** DONE Feature: Simplify option to reduce output size
   CLOSED: [2020-07-12 Sun 21:15]

~gpsbabel~ [[https://www.gpsbabel.org/htmldoc-development/filter_simplify.html][features]] a ~simplify~ option to trim data points from GPS
data. There are several methods for prioritizing which points to keep
and which to trim, although the following seems useful given some
sample data I've recorded in a test run of ninfacyzga-01:
 
gpsbabel -i nmea -f all.nmea -x simplify,error=10,relative -o gpx \
    -F all-simp-rel-10.gpx
 
An error level of "10" with the "relative" option seems to retain all
desirable features of the GPS data while reducing the number of points
along straightaways. File size is reduced by a factor of
about 11. Noise from local stay-in-place drift isn't removed; a
relative error of about 1000 is required to remove stay-in-place drift
noise, but this also trims all but 100m-size features of the recorded
path. A relative error of 1000 reduces file size by a factor of
 
 67M relerror-0.001.kml
 66M relerror-0.01.kml
797K relerror-100.kml
152K relerror-1000.kml
 
2020-07-12T21:13Z; bktei> Instead of programming data simplification
in ~bkgpslog~, the data simplification step should be performed via
~bklog~'s ~-p~ option which specifies a processing command string to
be ~eval~'d before data is compressed, encrypted, and written to
disk. In other words, handling the simplification of data beyond
allowing for a general command string specified by ~-p~ is outside the
scope of ~bkgpslog~ or its successor ~bklog~.
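To illustrate the idea of an ~eval~'d processing string, here is a minimal sketch. It is not how ~bklog~ implements ~-p~; the names ~processBuffer~ and ~cmdProcess~ are hypothetical. The command string is run as a filter between stdin and stdout, and an empty string passes the data through untouched:

```bash
#!/usr/bin/env bash
# Sketch only: processBuffer and cmdProcess are hypothetical names.
processBuffer() {
    local cmdProcess="$1"
    if [ -n "$cmdProcess" ]; then
        eval "$cmdProcess"    # runs with the function's stdin and stdout
    else
        cat                   # no processing requested: pass data through
    fi
}

printf 'point 1\npoint 2\npoint 3\n' | processBuffer 'head -n 2'
```

Since ~eval~ executes whatever string it is given, the string should only ever come from the person running the script.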
 
** DONE Feature: Generalize bkgpslog to bklog function
   CLOSED: [2020-07-12 Sun 21:11]
2020-07-05T02:42Z; bktei> Transform ~bkgpslog~ into a modular
component called ~bklog~ such that it processes a stdout stream of any
external command, not just ~gpspipe -r~. This would permit reuse of
the ~bkgpslog~ code for logging not just GPS data but things like
pressure, temperature, system statistics, etc.
2020-07-05T16:35Z; bktei>
: bklog -r age1asdf -o log.tar # encrypt/compress stdin to log.tar
: bklog -x -f log.tar -i age.key -O /tmp  # extract and decrypt

Making ~bklog~ follow the [[https://en.wikipedia.org/wiki/Unix_philosophy][Unix philosophy]] means that it shouldn't care
what kind of text is fed to it.
 
*** ~bklog~ Design vs. Unix Philosophy
**** Pubkey dir watching
The feature of periodically checking a directory for changes in the
pubkeys it contains should be justified by its usefulness; if the
complexity cannot be justified then the feature should be removed.
**** Defaults vs options
Many options can cause the tool to become complex in unjustifiable
ways. Currently I am adding options because I want the ability to
modify the script's behavior without having to modify the source code
on the machine on which the code is running. I should consider
removing features at some point and having the program force defaults
on the user. For example, allowing the specification of a temporary
directory, while useful for me, is probably not useful for most people
who don't know or care about the difference between ~/tmp~ and
**** Script time to live (TTL)
I initially implemented a script time-to-live feature because I was
unsure of my ability to write a script that could run for long periods
of time without causing runaway memory usage. I still think it's
a good idea to offer a script TTL option to the user but I think the
default should be to simply run forever.
 
2020-07-12T21:11Z; bktei> ~bklog~ script created and tested as of
 
** DONE TODO: Evaluate ~rsyslog~ as stand-in for this work
   CLOSED: [2020-07-12 Sun 21:09]
2020-07-05T02:57Z; bktei> I searched for "debian iot logging" ("iot"
as in "Internet of Things", the current buzzword for small low-power
computers being used to provide microservices for owners in their own
home) and came across several search results mentioning ~syslog~ and
 
https://www.thissmarthouse.net/consolidating-iot-logs-into-mysql-using-rsyslog/
https://rsyslog.readthedocs.io/en/latest/tutorials/tls.html
https://serverfault.com/questions/20840/how-would-you-send-syslog-securely-over-the-public-internet
https://www.rsyslog.com/
 
My impression is that ~rsyslog~ is a complex software package designed
to offer many features, some of which possibly might satisfy my

However, as stated in the repository README, the objective of the
~ninfacyzga-01~ project is "Observing facts of the new". This means
that the goal is not only to record location data but any data that
can be captured by a sensor. This means the capture of the following
environmental phenomena is within the scope of this device:
 
*** Sounds (microphone)
*** Temperature (thermocouple)
*** Air Pressure (barometer)
*** Acceleration Vector (accelerometer / gyroscope)
*** Magnetic Field Vector (magnetometer)
 
This brings up the issue of respecting the privacy of others in shared
spaces through which ~ninfacyzga-01~ may pass. ~ninfacyzga-01~
should encrypt data it records according to rules set by its
 
One permissive rule could be that if ~ninfacyzga-01~ detects that a
person (let's call her Alice) enters a room, it should add Alice's
encryption public key to the list of recipients against which it
encrypts data without Alice having to know how ~ninfacyzga-01~ is
programmed (she might have a ~calkuptcana~ agent on her person that
broadcasts her privacy preferences). Meanwhile, ~ninfacyzga-01~ may
publish its observations to a repository that Alice and other members
of the shared communal space have access to (ex: a read-only shared
directory on a local network WiFi). Alice could download all the files
in the shared repository but she would only be able to decrypt files
generated when she was physically near enough to ~ninfacyzga-01~ for
it to detect that her presence was within some spatial boundary.
 
A more restrictive rule could resemble the permissive rule in that
~ninfacyzga-01~ uses Alice's encryption public key only when she is
physically nearby, except that it encrypts logged files against
public keys in a sequential manner. This would mean that all people
who were near ~ninfacyzga-01~ would have to pass around each log file
to each other so that they could decrypt the content.
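The "pass the file around" step amounts to removing the encryption layers in the reverse of the order in which they were applied. A sketch, assuming each person holds an ~age~ identity file and using the ~age~ flags ~-d~ (decrypt) and ~-i~ (identity file); the function name and ~AGE_CMD~ variable are hypothetical:

```bash
#!/usr/bin/env bash
# Sketch only: decryptSequential and AGE_CMD are hypothetical names.
AGE_CMD="${AGE_CMD:-age}"    # can be overridden to test without age installed

# Identity files are listed in the order their keys were used to encrypt;
# the layers are then removed in reverse (outermost first).
decryptSequential() {
    if [ "$#" -eq 0 ]; then
        cat                  # no identities left: pass data through unchanged
        return
    fi
    local idFile="$1"
    shift
    # Layers added later are removed first, then this identity's layer.
    decryptSequential "$@" | "$AGE_CMD" -d -i "$idFile"
}

# Usage (file names are placeholders):
#   decryptSequential alice.key bob.key < chunk.age > chunk.nmea
```

In practice no single person would hold every identity file, so each person would run only one ~age -d~ step and hand the result on to the next.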
 
That said, according to [[https://www.rsyslog.com/doc/master/tutorials/database.html][this ~rsyslog~ page]], ~rsyslog~ is more a data
wrangling system for collecting data from disparate sources of
different types and outputting data to text files on disk than a
system committed to the server-client model of database storage. So, I
think converting ~bkgpslog~ into a ~bklog~ script that appends
encrypted and compressed data to a tar file for later extraction
(possibly the same script with future features) would be best.
 
2020-07-12T21:10Z; bktei> rsyslog is outside the scope of what
~bkgpslog~ does (record location observations). A different tool
should be used to retrieve and synchronize data. The dumb storage
method of "tar files in a syncthing folder" works for now.
 
** TODO: Place persistent recip. updates in asynchronous coproc
2020-07-06T19:37Z; bktei> In order to update the recipient list, the
magicParseRecipientDir() function needs to be run each buffer period
in order to scan for changes in the recipient list. However, such a
scan takes time; if the magicGatherWriteBuffer() function must pause
until magicParseRecipientDir() completes, then a significant pause
between buffer sessions may occur, causing detectable gaps in location
data between buffer rounds.
 
I looked for ways in which I might start magicParseRecipientDir()
asynchronously immediately before running the data collection command
and then collect its output at the start of the next buffer round. One
way using the ~coproc~ Bash built-in is described [[https://stackoverflow.com/a/20018504/10850071][here]]. I'd have to
make the asynchronous function output the recipient list to stdout
which would then be ~read~ into the ~recPubKeysValid~ array in the
main loop. However, for now, I'm putting the magicParseRecipientDir()
call as-is in the main loop and accepting the delay.
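A sketch of the "start the scan, collect later" pattern. Note that it uses process substitution with a dynamically allocated file descriptor (Bash 4.1 or later) rather than ~coproc~, which is a different mechanism from the one in the linked answer; I picked it only because it keeps the example short. ~parseRecipientDir~ is a hypothetical stand-in for magicParseRecipientDir() that prints one key per line, and the ~sleep~ stands in for the data collection command:

```bash
#!/usr/bin/env bash
# Sketch only: parseRecipientDir and the demo keys are hypothetical.
dirRecip="$(mktemp -d)"
printf 'age1exampleA\n' > "$dirRecip/a.pubkey"
printf 'age1exampleB\n' > "$dirRecip/b.pubkey"

# Stand-in for magicParseRecipientDir(): print one key per line on stdout.
parseRecipientDir() {
    local file
    for file in "$1"/*.pubkey; do
        [ -f "$file" ] && head -n 1 "$file"
    done
    return 0
}

# Start the scan in the background; its output waits in the pipe behind fdRecip.
exec {fdRecip}< <(parseRecipientDir "$dirRecip")

sleep 0.2    # stand-in for the data collection command of this buffer round

# Start of the next round: collect the scan results, then close the descriptor.
mapfile -t recPubKeysValid <&"$fdRecip"
exec {fdRecip}<&-

printf '%s\n' "${recPubKeysValid[@]}"
```

The recipient list used for a given round is therefore one round old, which seems an acceptable trade for not pausing data collection.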
 
** Initialize environment
**** Save timeStart (YYYYmmddTHHMMSS±zz)
**** Define Debugging functions
**** Define Argument Processing function
**** Define Main function
*** Process Arguments
*** Set output encryption and compression option strings
*** Check that critical apps and dirs are available, display missing ones.
*** Set lifespans of script and buffer
*** Init temp working dir ~DIR_TMP~
Make temporary dir in tmpfs dir: ~/dev/shm/$(nonce)..bkgpslog/~ (~DIR_TMP~)
*** Initialize ~tar~ archive
**** Write ~bkgpslog~ version to ~$DIR_TMP/VERSION~
**** Create empty ~tar~ archive in ~DIR_OUT~ at ~PATHOUT_TAR~
Set output file name to:
: PATHOUT_TAR="$DIR_OUT/YYYYmmdd..hostname_location.gz.age.tar"
Usage: ~iso8601Period $timeStart $timeEnd~
**** Append ~VERSION~ file to ~PATHOUT_TAR~
Append ~$DIR_TMP/VERSION~ to ~PATHOUT_TAR~ via ~tar --append~
*** Read/Write Loop (Record gps data until script lifespan ends)
**** Determine output file paths
**** Define GPS conversion commands
**** Fill Bash variable buffer from ~gpspipe~
**** Process bufferBash, save secured chunk set to ~DIR_TMP~
**** Append each secured chunk to ~PATHOUT_TAR~
: tar --append --directory=DIR_TMP --file=PATHOUT_TAR $(basename PATHOUT_{NMEA,GPX,KML} )
**** Remove secured chunk from ~DIR_TMP~
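The archive-related steps of this outline can be tried in isolation with the sketch below. It assumes GNU ~tar~; the directories, version string, and chunk name are placeholders, and the chunk is plain text rather than compressed and encrypted data:

```bash
#!/usr/bin/env bash
# Sketch only: paths, version string, and chunk name are placeholders. Assumes GNU tar.
DIR_TMP="$(mktemp -d)"
DIR_OUT="$(mktemp -d)"
PATHOUT_TAR="$DIR_OUT/demo.tar"

# Initialize: create an empty archive, then append a VERSION file.
tar --create --file="$PATHOUT_TAR" --files-from=/dev/null
printf 'version=demo\n' > "$DIR_TMP/VERSION"
tar --append --directory="$DIR_TMP" --file="$PATHOUT_TAR" VERSION

# One loop round: write a chunk, append it, then remove it from DIR_TMP.
chunk="chunk_0001.nmea"
printf 'placeholder for a secured chunk\n' > "$DIR_TMP/$chunk"
tar --append --directory="$DIR_TMP" --file="$PATHOUT_TAR" "$chunk"
rm -f "$DIR_TMP/$chunk"

# Show what the archive now contains.
tar --list --file="$PATHOUT_TAR"
```

Using ~--directory~ keeps the temporary directory path out of the member names stored in the archive.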