
Recommendation: Read the user documentation and FAQ first. This page assumes familiarity with the jargon used in the Physics Derivation Graph.

This page provides background context for design decisions and implementation choices associated with the Physics Derivation Graph (PDG). Contributions to the project are welcome; see CONTRIBUTING.md on how to get started. The Physics Derivation Graph is covered by the Creative Commons Attribution 4.0 International License, so if you don't like a choice that was made you are welcome to fork the Physics Derivation Graph project.

Design Principles Documentation for the Physics Derivation Graph

This page enumerates design principles and goals for the Physics Derivation Graph (PDG).
The list is unordered.


Design Choices for the Physics Derivation Graph


 

Goal

Restating the goal from the front page of https://allofphysics.com/,

"Write down all known mathematical physics in a way that can be both read by humans and checked by a computer algebra system."

This page describes the current status and historical evolution of design decisions critical to the Physics Derivation Graph.

 


 

Decisions are not made independently; one choice informs others. Below is a visual overview of the relations among various decisions.

[Figure: graph of the relations among decisions, with a legend for shapes and colors and a rule of connectivity.]

 


 

Decision: What constitutes "all known mathematical physics"?

For a list of topics, see https://en.wikipedia.org/wiki/Branches_of_physics.

Notationally, mathematical physics includes Dirac notation, calculus, differential equations, algebra, and trigonometry. (There is https://en.wikipedia.org/wiki/List_of_common_physics_notations but its coverage is not comprehensive.)
See also Supported mathematical features.

Geometry is certainly part of mathematical physics but the developer of this project hasn't figured out how to incorporate geometry.
See the full list of topics not in scope.

 


 

Decision: What constitutes "spanning" for all known mathematical physics?

Merely having derivations in each domain (e.g., quantum, classical, relativistic) does not suffice.

Suppose we have a derivation in the domain of quantum mechanics and a derivation in the domain of classical mechanics. Suppose that there was a variable (e.g., length) common to the two derivations.

Suppose we have a derivation in the domain of quantum mechanics and a derivation in the domain of classical mechanics. Suppose that there was an expression common to the two derivations.

Is there a single derivation that involves expressions that are quantum and expressions that are classical?

See also spanning the topics and assumptions of Physics, dichotomy of assumptions, and finding major edges of the Physics Derivation Graph.

 


 

Decision: What does "human-readable" mean?

Reasonable for a human to understand without use of specialized knowledge.

Raw Latex (like \int_0^{\infty}) is not understandable to everyone, whereas the rendered form \( \int_0^{\infty} \) is. Similarly, raw Content MathML is not human-readable.

 


 

Decision: What does "checkable by CAS" mean?

In a derivation the mathematical steps can be verified as correct using symbolic mathematical software. (See which CAS for examples of applicable Computer Algebra Systems.)

Once a CAS is introduced, there are multiple aspects that can be checked:

Checking the math distinguishes a derivation from merely writing down symbols and expressions.

Why this matters: Reliability of machine-verified logic, Reproducibility, and Accessibility (no leaps of logic).
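As a minimal sketch of what verifying one step with a CAS could look like, the following uses SymPy to check a "multiply both sides by a factor" step. The function name step_multiplies_both_sides and the example expressions are hypothetical, chosen for illustration; they are not taken from the Physics Derivation Graph code.

```python
import sympy


def step_multiplies_both_sides(
    before: sympy.Eq, after: sympy.Eq, factor: sympy.Expr
) -> bool:
    """Return True if `after` is `before` with both sides multiplied by `factor`."""
    # Each side of `before`, multiplied by the factor, should match `after`.
    lhs_ok = sympy.simplify(before.lhs * factor - after.lhs) == 0
    rhs_ok = sympy.simplify(before.rhs * factor - after.rhs) == 0
    return lhs_ok and rhs_ok


# Hypothetical example: period T and frequency f, both assumed positive.
T, f = sympy.symbols("T f", positive=True)
before = sympy.Eq(T, 1 / f)
valid_after = sympy.Eq(T * f, 1)    # correct result of multiplying by f
invalid_after = sympy.Eq(T * f, f)  # right side was not multiplied correctly

valid_result = step_multiplies_both_sides(before, valid_after, f)
invalid_result = step_multiplies_both_sides(before, invalid_after, f)
print(valid_result, invalid_result)
```

The check compares each side separately, so a step that is only correct on one side is rejected.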

Decision: a CAS is not sufficient verification

A CAS is not sufficient because it may report \( 1 = x/x \) as true even though the equality does not hold when \( x = 0 \), where \( x/x \) is undefined. That is why a proof assistant is necessary. (See which proof assistant?.)
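As an illustration of how a proof assistant makes the hidden condition explicit, here is a short Lean 4 sketch. It assumes Mathlib is available and is not taken from the Physics Derivation Graph code base.

```lean
import Mathlib

-- The identity x / x = 1 is provable only with the hypothesis x ≠ 0.
example (x : ℝ) (hx : x ≠ 0) : x / x = 1 := div_self hx

-- Mathlib defines division by zero to return zero,
-- so at x = 0 the expression x / x is 0 rather than 1.
example : (0 : ℝ) / 0 = 0 := div_zero 0
```

The first example cannot be stated without supplying hx, which is the side condition a CAS may silently drop.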

 


 

Decision: Which Computer Algebra System(s) to use?

current status: using SymPy.

https://github.com/allofphysicsgraph/task-tracker/issues/117

See comparison of CAS like Mathematica, MathCad, Sage, Maple, SymPy.

 


 

Decision: Which proof assistant(s) to use?

current status: using Lean 4.

https://github.com/allofphysicsgraph/task-tracker/issues/106

See comparison of Rocq, Isabelle, Lean

 


 

Decision: How to represent expressions?

current status: using Latex.

See comparison of Latex, Content MathML, Presentation ML

 


 

Decision: How to render graphs visually?

current status: using d3js and graphviz.

https://github.com/allofphysicsgraph/task-tracker/issues/97

d3js, graphviz, networkx

 


 

Decision: Which Property Graph database?

current status: using Neo4j.

https://github.com/allofphysicsgraph/task-tracker/issues/43

See comparison of Neo4j

 


 

Decision: Whether to cache results

current status: Not caching generated information.

Caching could lower latency for users of the website. However, to eliminate the risk of incorrect caching, validation and checking are done at query time.
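The risk can be sketched with the Python standard library alone. In the sketch below, the dictionary named database is a hypothetical stand-in for the graph database and the function names are illustrative. A cached lookup keeps returning the old answer after the underlying data changes, while a lookup done at query time does not.

```python
from functools import lru_cache

# Hypothetical stand-in for the database: expression ID -> validity flag.
database = {"expr_1": True}


@lru_cache(maxsize=None)
def is_valid_cached(expression_id: str) -> bool:
    """Look up validity once, then serve the remembered answer."""
    return database[expression_id]


def is_valid_at_query_time(expression_id: str) -> bool:
    """Look up validity on every call."""
    return database[expression_id]


is_valid_cached("expr_1")       # first call stores True in the cache
database["expr_1"] = False      # the underlying data changes

stale = is_valid_cached("expr_1")        # still the remembered value
fresh = is_valid_at_query_time("expr_1")  # reflects the change
print(stale, fresh)
```

Avoiding the stale answer with a cache would require invalidating entries on every write, which is the bookkeeping that checking at query time sidesteps.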

 


 

Historical Decision: Upgrade path from JSON/SQL to Neo4j

The repo ui_v7 used JSON/SQL and ui_v8 used Neo4j. The ui_v7 repo had a working implementation of Google authentication. I had trouble getting Google authentication working in ui_v8, and I didn't want to have to refactor all the static content from ui_v7, so I decided to create a new repo, "allofphysics.com" (now renamed to https://github.com/allofphysicsgraph/combined_v7_JSON_and_v8_neo4j).

Although getting Neo4j into ui_v7 was doable, the "mash together two repos" approach ended up being a bad decision from a troubleshooting and cleanliness-of-design perspective.

 


 

Decision: Naming of repo

I decided to name the repo "allofphysics.com" to make it clear which repo hosted the website on the Internet. Also, the alternative repo name "a mashup of v7 and v8" wasn't a good name, though it would have been more descriptive.

In retrospect, that was a bad naming convention because I later reverted to using ui_v7 for the website.

The "allofphysics.com" repo has been renamed to https://github.com/allofphysicsgraph/combined_v7_JSON_and_v8_neo4j.

As of 2026-02-06, ui_v8 is in use.

 


 

Decision: Page scope

current status: using page-per-decision.

Rendered HTML pages should have a single scope. (As opposed to a single-page website.)

 


 

Decision: How to display Latex on webpages?

current status: using MathJax.

 


 

Decision: Which VPS provider to use?

current status: using Hetzner.

Example options: DigitalOcean, AWS, Oracle, Azure

https://github.com/allofphysicsgraph/task-tracker/issues/56

https://physicsderivationgraph.blogspot.com/2026/01/vps-price-comparison-september-2024.html
https://physicsderivationgraph.blogspot.com/2026/01/vps-price-comparison-january-2026.html

 


 

Decision: Separate some pages into a separate Python file?

current status: Flask routes are in a single file (pdg_app.py).

Options: have different categories of routes in separate .py scripts, or have all routes in a single .py script.

Having all the routes in a single file (e.g., pdg_app.py) results in a huge file with thousands of lines.
To make this more manageable, Flask provides a way to split routes across files (e.g., pdg_other_routes.py) using blueprints.

As an example of how this would be enacted, suppose the file pdg_app.py contains

from pdg_other_routes import other_routes_bp

# assumes web_app is the Flask application object created earlier in pdg_app.py
web_app.register_blueprint(other_routes_bp)

and pdg_other_routes.py contains

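from flask import Blueprint, render_template

# routes registered on this blueprint get the endpoint prefix "other_routes."
other_routes_bp = Blueprint("other_routes", __name__)

@other_routes_bp.route("/api_via_js")
def to_api_via_js() -> str:
    return render_template("js_with_api/api_js.html")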

The cost of this separation is that Flask namespaces the endpoints to ensure there are no name collisions between different files. Calls to url_for in Jinja2 templates (and in the Python code) must then use the format

other_routes.function_name

That would require working out which url_for calls point to routes in pdg_other_routes.py and which point to routes in pdg_app.py.

Therefore, having all routes in pdg_app.py is easier.
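The endpoint namespacing described above can be seen in a small self-contained sketch. To keep it runnable as one script, the blueprint and the app are defined in the same file, the index route is hypothetical, and the routes return placeholder strings rather than rendering templates.

```python
from flask import Blueprint, Flask, url_for
from werkzeug.routing import BuildError

other_routes_bp = Blueprint("other_routes", __name__)


@other_routes_bp.route("/api_via_js")
def to_api_via_js() -> str:
    return "placeholder for the rendered template"


web_app = Flask(__name__)
web_app.register_blueprint(other_routes_bp)


@web_app.route("/")
def index() -> str:
    return "placeholder index page"


# url_for needs a request context; test_request_context supplies one.
with web_app.test_request_context():
    # A route defined directly on the app uses the bare function name.
    index_url = url_for("index")
    # A route defined on the blueprint needs the "other_routes." prefix.
    blueprint_url = url_for("other_routes.to_api_via_js")
    # The bare function name no longer resolves for the blueprint route.
    try:
        url_for("to_api_via_js")
        bare_name_resolves = True
    except BuildError:
        bare_name_resolves = False

print(index_url, blueprint_url, bare_name_resolves)
```

Every existing url_for call that targets a moved route would need the prefix added, which is the migration cost weighed in this decision.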

 


 

Decision: Which languages to use?

current status: using Python, HTML, Latex.

Languages and tools that I'm comfortable with and that are widely used: Python, HTML, Docker.

 


 

Decision: Interface

current status: primarily using web UI. Exploring use of API.

Plan to support API.

Why a web UI? Why not a GUI or a command line?

Accessibility

 


 

Decision: Not hiring contractors to enact features

current status: I am not hiring a designer for the web front-end because that would be premature at this point. I'm still in the exploration prototyping phase to figure out what the front end will need to be able to do.

 


 

Decision: Which data structure to use in the Physics Derivation Graph

current status: the Physics Derivation Graph is stored in a Neo4j Property Graph database.

(There are intermediate data structures in pdg_app.py that interface with the Jinja2 HTML pages.)
It is not clear whether the web UI could use Neo4j directly.

The many alternative data storage options (e.g., SQLite, Redis, Python Pickle, CSV, GraphML) offer trade-offs, a few of which have been explored.

 


 

\(\rm\LaTeX\) representation of expressions

There are multiple choices of how to represent a mathematical expression. The choices feature trade-offs between conciseness, ability to express the range of notations necessary for Physics, semantic meaning, and ability to use the expression in a computer algebra system (CAS). See the comparison of syntax. \(\rm\LaTeX\) was selected primarily because of its common use in Physics, display of complex math, conciseness, and expressiveness. The use of \(\rm\LaTeX\) means other tasks like parsing symbols and resolving ambiguity are harder.

 


 

Decision: Which objects should be represented?

There are a few obvious objects that need to be accounted for, like derivation, steps, inference rule, feed, and expression.

Beyond those there are objects that could be either a node in the graph or a property of a node. For example, should (LHS, relation, RHS) be separate nodes or properties of an expression? A framing that motivates the choice is whether a user may want to query LHS separately from the expression. The trade-off is that additional nodes better support custom queries but then incur more queries to extract information relevant for typical workflows.

Another framing to motivate the node-or-property decision is that nodes can have properties but properties cannot have edges. For example, if LHS is a property of expression, then a symbol-as-node has to be related to the expression rather than LHS.

:expression (LHS) -> HAS_SYMBOL -> :symbol (x)

versus

:expression -> HAS_SIDE -> LHS -> HAS_SYMBOL -> :symbol (x)

If "symbol" is a node, then is a relation a symbol? Should the relation be a property of the expression-as-node, or should the schema be

:expression -> HAS_RELATION -> :relation (=)
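The node-or-property trade-off above can be sketched in plain Python, using dictionaries and tuples as a hypothetical stand-in for a property graph. This is not the Neo4j schema used by the project. The node IDs, the "side" label, and the helper function targets are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (source node ID, edge type, target node ID)

# Option A: LHS and RHS are properties of the expression node,
# so symbol nodes can only be attached to the expression as a whole.
nodes_a: Dict[str, dict] = {
    "expr_1": {"label": "expression", "lhs": "x", "relation": "=", "rhs": "y + 1"},
    "sym_x": {"label": "symbol", "name": "x"},
    "sym_y": {"label": "symbol", "name": "y"},
}
edges_a: List[Edge] = [
    ("expr_1", "HAS_SYMBOL", "sym_x"),
    ("expr_1", "HAS_SYMBOL", "sym_y"),
]

# Option B: each side is its own node, so symbols attach to a side.
nodes_b: Dict[str, dict] = {
    "expr_1": {"label": "expression", "relation": "="},
    "lhs_1": {"label": "side", "name": "LHS"},
    "rhs_1": {"label": "side", "name": "RHS"},
    "sym_x": {"label": "symbol", "name": "x"},
    "sym_y": {"label": "symbol", "name": "y"},
}
edges_b: List[Edge] = [
    ("expr_1", "HAS_SIDE", "lhs_1"),
    ("expr_1", "HAS_SIDE", "rhs_1"),
    ("lhs_1", "HAS_SYMBOL", "sym_x"),
    ("rhs_1", "HAS_SYMBOL", "sym_y"),
]


def targets(edges: List[Edge], source: str, edge_type: str) -> List[str]:
    """Follow one hop of the given edge type from the source node."""
    return [t for s, e, t in edges if s == source and e == edge_type]


# Option A: one hop, but the graph cannot say which side a symbol is on.
symbols_in_expression_a = targets(edges_a, "expr_1", "HAS_SYMBOL")

# Option B: two hops, and the result is specific to the left side.
symbols_on_lhs_b = [
    symbol
    for side in targets(edges_b, "expr_1", "HAS_SIDE")
    if nodes_b[side]["name"] == "LHS"
    for symbol in targets(edges_b, side, "HAS_SYMBOL")
]
print(symbols_in_expression_a, symbols_on_lhs_b)
```

Option B answers the per-side question directly, at the cost of an extra hop for the common case of listing every symbol in an expression.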

 


 

Decision: Supported Mathematical features

Many but not all symbols are supported. Here are some categories of supported symbols. (Symbols exist on the left side or right side of an expression and can also appear in a feed.)

Operators. These act on symbols (listed above) and are part of either the left side or right side of an expression. Operators can also appear in feeds.

Relations. These evaluate to True for an expression. Relations do not appear in a feed.

To check the above documentation against the code, see https://github.com/allofphysicsgraph/ui_v8_website_flask_neo4j/blob/gh-pages/webserver_for_pdg/library/list_of_valid.py

 


 

Decision: Outside of Current Scope

Although the Physics Derivation Graph is intended to be comprehensive across domains, there are aspects of Physics not within the current scope of the project:

These aspects could be included if the data structure and workflow were adapted to an expanded scope.