Design
The design of the system is based on the following principles:
Focus on auto-generated API: adding a Rust project to Sphinx should take minimal effort from the user.
Minimal configuration: the user should not have to write a lot of configuration to get started, and the default configuration should be good enough for most projects.
Good-looking documentation: the generated documentation should be easy to read and navigate. To this end, the layout of https://docs.rs/ is used as a reference, as the “gold standard” for Rust documentation.
Support both reStructuredText and Markdown docstrings: the user should be able to write their API documentation in either reStructuredText or MyST Markdown.
Integrate well with the Sphinx caching system: the system should be able to take advantage of the Sphinx caching system to avoid re-generating the API documentation when it is not necessary.
Integrate well with the Sphinx warning system: the system should be able to generate warnings when reading / resolving the API documentation, with warnings pointing to the source code location of the issue.
Have a clear separation of concerns: the system should have a clear separation between the analysis of the Rust project, and the generation of the Sphinx documentation. This separation is realised by the use of a high-level representation of the API, which is independent of the Sphinx documentation generation.
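To make the last principle concrete, here is a minimal sketch of what a serializable high-level representation could look like. The `Module` and `Function` records and their fields are hypothetical, chosen only for illustration; the real representation is defined in Rust and serialized with serde. The point is only that one side writes the representation and the other side reads it back without sharing anything else.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class Function:
    # Hypothetical record for one Rust function.
    path: list[str]  # e.g. ["my_crate", "utils", "add"]
    docstring: str = ""


@dataclass
class Module:
    # Hypothetical record for one Rust module and the functions it declares.
    path: list[str]
    docstring: str = ""
    functions: list[Function] = field(default_factory=list)


def dump(module: Module) -> str:
    """Serialize the representation to JSON text (the analysis side)."""
    return json.dumps(asdict(module), indent=2)


def load(text: str) -> Module:
    """Rebuild the representation from JSON text (the documentation side)."""
    raw = json.loads(text)
    functions = [Function(**f) for f in raw.pop("functions")]
    return Module(functions=functions, **raw)


module = Module(
    path=["my_crate", "utils"],
    docstring="Utility helpers.",
    functions=[Function(path=["my_crate", "utils", "add"], docstring="Add two numbers.")],
)
restored = load(dump(module))
```

Because the documentation side only ever sees the serialized form, it does not need to know how the analysis was performed.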
The core components of the system are:
`crates/analyzer`: a Rust package that analyzes the source code of a Rust project (using `syn`) and generates a “high-level representation” of the API, which is serializable/de-serializable to disk (using `serde`).
`crates/py_binding`: a Rust package that provides a Python binding to the analyzer, using `pyo3`.
`python/sphinx_rust`: a Python package that provides a Sphinx extension to generate documentation from the high-level representation of the API, bundling `crates/py_binding` to integrate the analyzer within Sphinx.
The build steps within Sphinx are:
- At the start of the Sphinx build (`builder-inited` event), `sphinx_rust` calls the analyzer to generate the high-level representation of the API, and writes it to disk, within the Sphinx build directory.
- Also at the start of the Sphinx build, after the analysis, `sphinx_rust` uses the high-level representation to generate folders and files within the Sphinx source directory that provide the outline of the API documentation: for example, a folder per crate and a file per Rust object (module, struct, enum, trait, function, etc.).
- During the Sphinx read-phase, when a “directive” is encountered that requires the API documentation, `sphinx_rust` reads the high-level representation from disk, and generates the docutils AST for the Rust object that the directive specifies. It also stores relevant information in the Sphinx environment, to be used during the resolve-phase.
- Also during the Sphinx read-phase, when a “cross-reference role” is encountered that requires the API documentation, `sphinx_rust` generates a `pending_xref` node that will be resolved during the resolve-phase.
- During the Sphinx resolve-phase, when a `pending_xref` node is encountered, `sphinx_rust` uses the information stored in the Sphinx environment to resolve the cross-reference, or generates a warning if the cross-reference cannot be resolved (or the resolution has multiple matches).
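Two of the steps above, generating the outline and resolving a cross-reference, can be sketched in plain Python. This is an illustration only, not the extension's actual code: the `API` mapping, the `api/<crate>/` folder layout, the `.txt` stub files, and the suffix-matching rule in `resolve` are all assumptions made for the example.

```python
from __future__ import annotations

import tempfile
from pathlib import Path

# Hypothetical stand-in for the high-level representation:
# fully qualified Rust path -> kind of object.
API = {
    "my_crate": "crate",
    "my_crate::Point": "struct",
    "my_crate::math::add": "function",
    "my_crate::utils::add": "function",
}


def write_outline(api: dict[str, str], source_dir: Path) -> list[Path]:
    """Write one stub file per Rust object, grouped in a folder per crate."""
    written = []
    for full_path, kind in sorted(api.items()):
        crate = full_path.split("::")[0]
        folder = source_dir / "api" / crate  # hypothetical layout
        folder.mkdir(parents=True, exist_ok=True)
        stub = folder / (full_path.replace("::", ".") + ".txt")
        stub.write_text(f"{kind} {full_path}\n", encoding="utf-8")
        written.append(stub)
    return written


def resolve(target: str, api: dict[str, str]) -> tuple[str | None, str | None]:
    """Return (resolved_path, warning); exactly one of the two is None."""
    # Assumed rule: a target matches a full path or a "::"-separated suffix of one.
    matches = sorted(p for p in api if p == target or p.endswith("::" + target))
    if len(matches) == 1:
        return matches[0], None
    if not matches:
        return None, f"target not found: {target}"
    return None, f"multiple matches for {target}: {', '.join(matches)}"


with tempfile.TemporaryDirectory() as tmp:
    outline = [p.relative_to(tmp).as_posix() for p in write_outline(API, Path(tmp))]

point = resolve("Point", API)      # unique suffix match
ambiguous = resolve("add", API)    # two functions share this name
missing = resolve("Missing", API)  # no such object
```

In the real extension the warning would go through the Sphinx logging system with a source location attached; here it is returned as a plain string to keep the sketch free of dependencies.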
Technical note on the analyzer
One annoying technical limitation of the current analyzer,
compared to what rustdoc does, is that it does not “expand” macros, so items that only exist after macro expansion are not visible to it.
rustdoc integrates directly into the compiler to generate its high-level representation of the API,
which is produced after the macros have been expanded: https://rustc-dev-guide.rust-lang.org/rustdoc.html.
See the source code here: https://github.com/rust-lang/rust/blob/25087e02b34775520856b6cc15e50f5ed0c35f50/src/librustdoc/lib.rs#L773.
This is difficult to hook into from a third-party crate, as it requires modifying the compiler itself. You can view a debug dump of this intermediate representation like so:
```bash
rustup run nightly cargo rustc -- -Z unpretty=hir-tree
```
but without actually building the compiler, it’s hard to get this information in a usable form (e.g. a serialized JSON).
Perhaps this can be achieved at a later date, and at least the separation of concerns in the design allows for this to be swapped out.
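As a rough sketch of what “swapped out” could mean in practice, the documentation side can depend only on an interface that returns the representation. All names below (`Analyzer`, `SourceAnalyzer`, `CompilerAnalyzer`, `document`) are hypothetical, and both backends return hard-coded data; the compiler-backed one does not exist and is shown only to illustrate the shape of the design.

```python
from __future__ import annotations

from typing import Protocol


class Analyzer(Protocol):
    """Anything that turns a crate directory into a list of item paths."""

    def analyze(self, crate_dir: str) -> list[str]: ...


class SourceAnalyzer:
    """Stand-in for a source-level analysis, where macros stay unexpanded."""

    def analyze(self, crate_dir: str) -> list[str]:
        return ["my_crate::Point"]


class CompilerAnalyzer:
    """Stand-in for a possible compiler-backed analysis, after macro expansion."""

    def analyze(self, crate_dir: str) -> list[str]:
        # Also reports an item that only exists once macros are expanded.
        return ["my_crate::Point", "my_crate::generated_fn"]


def document(analyzer: Analyzer, crate_dir: str) -> list[str]:
    """The documentation side sees only the representation, not the backend."""
    return sorted(analyzer.analyze(crate_dir))


from_source = document(SourceAnalyzer(), "path/to/crate")
from_compiler = document(CompilerAnalyzer(), "path/to/crate")
```

Neither call site changes when the backend does, which is the property the design relies on.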