
MVP for PDG with SQL

Published 2019-12-12T03:22:00.002Z by Physics Derivation Graph

Currently I have a Docker container that runs Flask on Ubuntu to present a web interface with forms for entering information.
A Python script on the backend converts LaTeX strings to PNG using dvipng, and Graphviz generates the static graph PNG.

The other major component of the backend is an sqlite3 database that holds the data when the container is offline. I don't have experience with SQL, so I need a plan to get to the minimum viable product.

The purpose of the sqlite3 file is to store the multiple tables offline. I could use a Python pickle file, but that would be specific to Python; the SQLite approach seems more portable and generic.

The only actions I need are:
SQL tables <--> Python data structures <--> graph structure <--> graph viz, website generation, UI web

On startup, read data into Python from sqlite.
After that, every time there is a change to the structure in Python, write to sqlite.
This approach is not elegant compared to "write only diff" or "write at end of session", but it keeps the SQLite file from drifting out of sync with the structure held in Python.
This approach doesn't scale for large databases or multiple users, but those aren't problems I need to solve right now (I'm intentionally incurring technical debt).
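
A minimal sketch of the read-on-startup, write-on-every-change cycle, using Python's built-in sqlite3 module. The table name `expressions`, its two columns, and the helper names `load_expressions` and `save_expressions` are placeholders made up for this illustration, not the actual PDG schema. The whole table is rewritten on each change, which is the inelegant-but-simple approach described above.

```python
import sqlite3


def load_expressions(conn):
    """Read the whole (hypothetical) table into a dict; done once at startup."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS expressions "
        "(expr_id INTEGER PRIMARY KEY, latex TEXT)"
    )
    rows = conn.execute("SELECT expr_id, latex FROM expressions")
    return {expr_id: latex for expr_id, latex in rows}


def save_expressions(conn, expressions):
    """Overwrite the table with the current dict; done after every change."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM expressions")
        conn.executemany(
            "INSERT INTO expressions (expr_id, latex) VALUES (?, ?)",
            list(expressions.items()),
        )


# ":memory:" keeps the sketch self-contained; a file path would be used in practice.
conn = sqlite3.connect(":memory:")
expressions = load_expressions(conn)   # startup: SQL -> Python
expressions[59285924] = "a = b"        # a change made through the web UI
save_expressions(conn, expressions)    # every change: Python -> SQL
```

Because the delete and the inserts share one transaction, a failure partway through leaves the previous contents of the table in place.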

If I'm using SQL to store data structures from Python, I'll need to enumerate the table schemas. See
which shows the tables

Reviewing the options described on
I don't know which is applicable.

Motives for SQLite use:

SQLite options

From the perspective of file management, having one file feels cleaner than a file per derivation. 

5 tables in 1 SQLite file

One option is to implement 5 table schemas:
I suspect this layout of tables is suboptimal: having the "derivation name" repeat in a column is an indicator that the table count should be 2+3*D (where "D" is the number of derivations) rather than 5, to eliminate the duplication. The 2+3*D design is also apparent in the "dict of derivations" structure described below. My motive for using 5 is that with 2+3*D, the table names are not static.
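
To make the trade-off concrete, here is a sketch of one shared table in which the derivation name repeats on every row. The table name `steps`, the column names, the derivation names, and the inference rule strings are all invented for this example; they are not the actual five schemas.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE steps (derivation_name TEXT, step_id INTEGER, inf_rule TEXT)"
)
conn.executemany(
    "INSERT INTO steps VALUES (?, ?, ?)",
    [
        ("derivation A", 1, "declare initial expression"),
        ("derivation A", 2, "add X to both sides"),
        ("derivation B", 1, "declare initial expression"),
    ],
)

# The table name is static; the derivation is chosen with a bound parameter.
rows_for_a = conn.execute(
    "SELECT step_id, inf_rule FROM steps "
    "WHERE derivation_name = ? ORDER BY step_id",
    ("derivation A",),
).fetchall()
```

The cost is the repeated `derivation_name` value on every row; the benefit is that every query can be written before any derivation exists.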

2+3*D tables in 1 SQLite file

Two tables are independent of derivations:
And three tables are needed per derivation. The problem with this is that the table names aren't known in advance.
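
The table-name problem shows up directly in Python's sqlite3 module: `?` placeholders bind values, not identifiers, so a per-derivation table name has to be formatted into the SQL string. A sketch of what that might look like, assuming a made-up naming convention (`steps_<derivation>`) and a hypothetical helper `steps_table_name`:

```python
import re
import sqlite3


def steps_table_name(derivation_name):
    """Build a per-derivation table name from letters, digits and underscores only."""
    slug = re.sub(r"[^0-9a-zA-Z]+", "_", derivation_name).strip("_").lower()
    if not slug:
        raise ValueError("derivation name has no usable characters")
    return "steps_" + slug


conn = sqlite3.connect(":memory:")
table = steps_table_name("Euler equation proof")

# The identifier is formatted into the statement; only the values use "?".
conn.execute(f'CREATE TABLE "{table}" (step_id INTEGER, inf_rule TEXT)')
conn.execute(f'INSERT INTO "{table}" VALUES (?, ?)', (1, "declare initial expression"))

# Which tables exist has to be discovered at run time.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
```

Note that two different derivation names can collapse to the same table name under this convention, which is one more thing the 5-table layout avoids.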

2 tables in 1 SQLite file; 3 tables in D SQLite files

Same as previous option, except instead of a single SQLite file, the derivations are in separate files.

SQLite to Python

These tables in SQL are equivalently stored in Python as three data structures:

where each <step> has the structure
{'inf rule': 'this inf rule',
 'input': [{'expr local ID': 942, 'expr ID': 59285924}],
 'output': [{'expr local ID': 218, 'expr ID': 954849}]}
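
As a sketch of how one such step could be flattened into rows for an SQL table, assuming a row layout of (inf rule, role, expr local ID, expr ID) and a helper named `step_to_rows`, both of which are made up for this illustration:

```python
def step_to_rows(step):
    """Flatten one step dict into (inf rule, role, local ID, expr ID) tuples."""
    rows = []
    for role in ("input", "output"):
        for expr in step[role]:
            rows.append(
                (step["inf rule"], role, expr["expr local ID"], expr["expr ID"])
            )
    return rows


step = {
    "inf rule": "this inf rule",
    "input": [{"expr local ID": 942, "expr ID": 59285924}],
    "output": [{"expr local ID": 218, "expr ID": 954849}],
}
rows = step_to_rows(step)
```

Each tuple could then be passed to `executemany` with an `INSERT` statement, one row per input or output expression.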