Documents
Here is how to get to your Findings documents on disk:
- In Findings, choose the menu Findings > Preferences
- In the preferences, choose the ‘Library’ tab
- Press the button ‘Reveal in Finder’ below the text field labeled ‘Documents:’
- This will select a folder called ‘Default’ in the Finder
Note that if you use Dropbox sync, this folder will be stored on Dropbox.
The ‘Default’ directory contains your actual documents, which are organized in the subdirectories ‘Experiments’ and ‘Protocols’. Each document is a file package that contains your data, using the open-source key-value store implementation PARStore (more on this below). The attachments are also stored inside the file package corresponding to the experiment or protocol, inside a subdirectory ‘attachments’. Each experiment, protocol and attachment is uniquely identified using a universally unique identifier (aka UUID), which is a randomly generated string of digits and letters representing 16 bytes in hexadecimal, for instance: 387700B5-8B21-463D-B898-5068FE4A5327
Derived Data
Here is how to get to the derived data for your Findings installation on your Mac, which contains various files that are used to display your documents, but are not part of your actual documents:
- In Findings, choose the menu Findings > Preferences
- In the preferences, choose the ‘Library’ tab
- Press the arrow button in the ‘Derived Data’ box
- This will select a folder called ‘Findings’ in the Finder
The ‘Persistent’ directory contains some non-critical, basic metadata about your library.
The ‘Derived’ directory contains non-critical derived information about your library that is used to display your experiments and protocols, and to search them.
The ‘Local’ directory contains your actual documents, if they are stored only on your machine. If you use sync, your documents will be stored in Dropbox (see above).
More about PARStore
Each of your documents is a file package that contains your data, using the open-source PARStore. In short, PARStore is a key-value store, where the keys and values are stored in a bunch of SQLite databases. The source code can be found on GitHub, in the repository cparnot/PARStore (an open source Dropbox-resistant key-value store for local storage, written in Objective-C for Mac OS X and iOS). Now for the longer version…
To allow multiple apps, users or devices to write at the same time to the same PARStore, the implementation uses a simple concept which I call ‘file ownership’. This means that each individual device/app is allowed to read and write the files that it created, but it can only read (and not modify) other files. This is done by contract, not by using r/w filesystem settings. Because a given file can only ever be modified by one device, it is very easy to avoid corrupting or overwriting content when the files go through a file syncing system like Dropbox. More on this topic here: https://vimeo.com/90417635 and NSConf: Rethinking Syncing - The Guinea Pig in the Cocoa Mine. The first half of the video explains the concept, and the second half gives examples, one of which is basically PARStore.
To allow the implementation of file ownership in the case of PARStore, the first time an app launches on a device, it creates a unique identifier (UUID) and stores it inside the app settings (preference file or, better, ‘~/Library/Application Support’). This UUID is what I will call the device identifier. For instance, Findings will use a different device identifier on 2 different Macs. But note that 2 different apps on the same device (for example Findings and a hypothetical app ‘Results’) would also each use a different device identifier. So maybe you can think of the ‘device’ identifier as a ‘participant’ identifier. Anyway. The device identifier is then used in all the PARStore files that a particular instance of the app will manipulate on a particular device. In the video linked above, I talk of device ‘A’ and device ‘B’, and those names could be used as device identifiers, but there is the risk they would not be unique enough. So instead I use UUIDs, such as ‘0C6EEE94-7291-4097-A894-910D90AB6190’. These can be generated using the uuidgen tool on the command-line (man uuidgen!).
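The "create once, then reuse" behaviour of the device identifier can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used by Findings or PARStore: the function name and the file name `device_identifier.txt` are hypothetical, and the settings directory is whatever location the app chooses.

```python
import uuid
from pathlib import Path


def load_or_create_device_id(settings_dir: Path) -> str:
    """Return the stored device identifier, creating it on first launch."""
    # Hypothetical file name, chosen for this sketch only.
    id_file = settings_dir / "device_identifier.txt"
    if id_file.exists():
        # Later launches: always reuse the identifier created the first time.
        return id_file.read_text().strip()
    settings_dir.mkdir(parents=True, exist_ok=True)
    # uuid4() is random; upper-cased to match the examples in the text.
    device_id = str(uuid.uuid4()).upper()
    id_file.write_text(device_id)
    return device_id
```

Calling the function twice with the same directory returns the same identifier, while a different app (with a different settings directory) gets its own.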
With the concept of device identifier in place, here is how PARStore files are organized and where these identifiers are used. Each experiment or protocol is a different PARStore. A PARStore is not a single file, it’s a file package directory that contains 2 subdirectories:
- directory ‘devices’ contains SQLite databases, each inside its own subdirectory named with the ‘device identifier’; an individual device can read all the databases, but it can only modify its own; this is how file ownership is enforced; the same app on a given device will always use the same device identifier across multiple launches
- directory ‘blobs’ can contain arbitrary files and directories and is used for larger assets that don’t really fit in a database; in the case of Findings, there is a subdirectory ‘attachments’, which is used to store attachments for the experiment/protocol; each attachment is stored under its original name inside yet another subdirectory named with a UUID corresponding to its location inside the document (see more below: the UUID is the paragraph identifier); the ‘blobs’ directory does not enforce file ownership and it’s up to each device to do the right thing
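To explore a file package along these lines, here is a minimal read-only Python sketch. It only assumes the layout described above (‘devices/[device identifier]/…’ and ‘blobs/attachments/[paragraph UUID]/[original name]’); the function names are mine, and the database file names inside each device directory are not specified here, so the sketch simply lists whatever files it finds.

```python
from pathlib import Path


def _files_by_subdirectory(parent: Path) -> dict:
    """Map each subdirectory name of `parent` to the sorted file names inside it."""
    result = {}
    if parent.is_dir():
        for sub in sorted(parent.iterdir()):
            if sub.is_dir():
                result[sub.name] = sorted(p.name for p in sub.iterdir() if p.is_file())
    return result


def device_files(store: Path) -> dict:
    """Device identifier -> files in that device's own subdirectory."""
    return _files_by_subdirectory(store / "devices")


def attachment_files(store: Path) -> dict:
    """Paragraph UUID -> attachment file names (Findings-specific layout)."""
    return _files_by_subdirectory(store / "blobs" / "attachments")
```

Both functions only read the package, which is the safe thing to do from outside the app since file ownership is enforced by contract only.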
The SQLite databases are then very simple, with just one table, and 4 columns:
- key: a string
- value: a data blob corresponding to a property list stored in binary format (for Python, see here: Reading Binary Plist files with Python - Stack Overflow)
- timestamp: 64-bit integer corresponding to the number of milliseconds since the reference date January 1, 2001, at midnight (00:00) GMT
- parentTimestamp: timestamp of the previous value for that key, from which it was modified to the new value (this can be null); this column can be used to improve merging and provide some kind of history tree for a given key/value pair
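As an illustration of the ‘value’ and ‘timestamp’ columns, the following Python sketch decodes a binary property list with the standard `plistlib` module and converts a timestamp to a date. It assumes that the unit is milliseconds, as described above, and that the reference date is midnight UTC on January 1, 2001; the function names are hypothetical.

```python
import plistlib
from datetime import datetime, timedelta, timezone

# Assumption: reference date is 2001-01-01 00:00:00 UTC.
REFERENCE_DATE = datetime(2001, 1, 1, tzinfo=timezone.utc)


def timestamp_to_datetime(timestamp: int) -> datetime:
    """Convert a timestamp (assumed to be milliseconds since the reference date)."""
    return REFERENCE_DATE + timedelta(milliseconds=timestamp)


def decode_value(blob: bytes):
    """Decode a property list blob into plain Python objects."""
    # plistlib.loads detects the binary format automatically.
    return plistlib.loads(blob)
```

If the dates you get look implausible for your documents, the unit assumption is the first thing to check.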
To get the final key-value pairs, you simply read all the databases and collect the most recent key/value pairs for each key, based on timestamp. With PARStore, you can use any key you’d like. PARStore could be used for other apps beyond Findings.
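The merge described above ("most recent value wins, per key") can be sketched in Python as follows. The table name is deliberately a required parameter, and the column names default to the ones listed above: both are assumptions that should be checked against the actual schema of your databases (for instance with `.schema` in the sqlite3 command-line tool), since the on-disk names may differ.

```python
import plistlib
import sqlite3
from pathlib import Path


def read_rows(db_path, table, key_col="key", value_col="value", ts_col="timestamp"):
    """Return (key, blob, timestamp) tuples from one device database, read-only."""
    # Table and column names are assumptions; check them against the real schema.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        query = f'SELECT "{key_col}", "{value_col}", "{ts_col}" FROM "{table}"'
        return con.execute(query).fetchall()
    finally:
        con.close()


def merge_latest(rows):
    """Keep, for each key, the value with the highest timestamp across all rows."""
    latest = {}
    for key, blob, timestamp in rows:
        if key not in latest or timestamp > latest[key][0]:
            latest[key] = (timestamp, blob)
    # Decode the winning blobs from property lists into Python objects.
    return {key: plistlib.loads(blob) for key, (_, blob) in latest.items()}
```

In practice you would collect the rows from every database under ‘devices’ into one list and pass that list to `merge_latest`. The databases are opened in read-only mode, so the sketch never writes to files owned by another device.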
In the case of Findings, each experiment or protocol (more generally a ‘document’) is a separate PARStore. The key/value pairs used by Findings are:
- uuid: the uuid of the document, string
- author: author name, string
- title: document title, string
- rating: float
- done: boolean
- aim: string
- summary: string
- projectTitle: string
- projectUUID: string
- categories: list of protocol subfields, array of strings
- layout: list of paragraph UUIDs, array of strings
- doneParagraphs: list of paragraph UUIDs, array of strings
- [paragraph UUID]: each paragraph has a different identifier, used as its key, and the content is stored as a dictionary/hash, in particular with the ‘content’ key (this UUID is also used for the attachment identification within the blobs directory); note that there might be paragraphs that are not in the experiment any more and have been deleted, in which case the UUID won’t be in the layout array
The list is not meant to be exhaustive, and more keys may be added in the future as well (which would be backward-compatible).
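Once the key/value pairs of a document have been merged into a single dictionary, the paragraphs can be put back in order using the ‘layout’ key. The Python sketch below assumes such a dictionary (here called `store`, with made-up example data) and assumes that each paragraph is stored under its own UUID as a dictionary with a ‘content’ entry, as described above; the function names are hypothetical.

```python
import uuid


def ordered_paragraphs(store: dict) -> list:
    """Return (paragraph UUID, content) pairs in the order given by 'layout'."""
    result = []
    for paragraph_id in store.get("layout", []):
        paragraph = store.get(paragraph_id)
        if isinstance(paragraph, dict):
            result.append((paragraph_id, paragraph.get("content")))
    return result


def _looks_like_uuid(key: str) -> bool:
    try:
        uuid.UUID(key)
        return True
    except ValueError:
        return False


def deleted_paragraph_ids(store: dict) -> list:
    """UUID-shaped keys that are no longer listed in 'layout'."""
    layout = set(store.get("layout", []))
    return sorted(k for k in store if _looks_like_uuid(k) and k not in layout)


# Made-up example data, for illustration only.
P1 = "11111111-1111-4111-8111-111111111111"
P2 = "22222222-2222-4222-8222-222222222222"
P3 = "33333333-3333-4333-8333-333333333333"
store = {
    "title": "Example",
    "layout": [P2, P1],
    P1: {"content": "first written"},
    P2: {"content": "second written"},
    P3: {"content": "deleted paragraph"},
}
```

With this example data, `ordered_paragraphs(store)` lists P2 before P1, and `deleted_paragraph_ids(store)` returns only P3.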