Architecture and design philosophy#

This page explains why femorph-solver is shaped the way it is — the design choices that aren’t obvious from reading any one source file. It’s aimed at contributors landing changes in the codebase, and at sophisticated users debugging unexpected results who need to know which layer to suspect.

If you’re reading this to find your way around the codebase for the first time, read it top-to-bottom; if you’re chasing a specific question, the section headings are deliberately granular so you can jump straight to the relevant decision.

Design principles#

Three principles steer every layer. They are listed in priority order; when they conflict, the one higher in this list wins.

  1. Accuracy first. The first acceptance test for any element kernel is “does this match the closed-form / NAFEMS reference to N-digit precision?”, before anything about cross-solver parity. Speed and ergonomics are negotiated only after the numbers are known to be right.

  2. Lean on existing libraries. pyvista, scipy.sparse, ARPACK, Pardiso, MUMPS, and threadpoolctl are mature, well-tested, and have a much larger contributor community than femorph-solver ever will. We write code where the libraries don’t already do the job — element kernels, neutral assembly, foreign-deck readers — and we hand off everywhere else.

  3. No vendor lock-in in the data path. The public API speaks topology-first names (HEX8, BEAM2, …) and a UnitSystem stamp. Vendor catalogue spellings (SOLID185, BEAM188, PSHELL, …) appear only at the foreign-deck import boundary. That keeps the kernel layer free of “if-MAPDL else if-NASTRAN” branches.
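The import-boundary translation can be sketched as a one-way mapping table. The table contents and function name below are illustrative, not the real femorph-solver importer code:

```python
# Hypothetical sketch of the foreign-deck boundary: vendor catalogue
# spellings are translated to topology-first names exactly once, on import.
# Nothing downstream of this table ever sees a vendor name.
VENDOR_TO_TOPOLOGY = {
    # MAPDL spellings
    "SOLID185": "HEX8",
    "BEAM188": "BEAM2",
    # NASTRAN spellings
    "CHEXA": "HEX8",
    "PSHELL": "QUAD4",   # a property card implying a shell topology
}

def translate_element(vendor_name: str) -> str:
    """Map a vendor element spelling to its neutral topology name."""
    try:
        return VENDOR_TO_TOPOLOGY[vendor_name]
    except KeyError:
        raise ValueError(f"unsupported vendor element: {vendor_name}") from None
```

Because the mapping is applied at the boundary and then discarded, the kernel layer never needs an “if-MAPDL else if-NASTRAN” branch.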

Layer cake#

Top-down, the layers are:


The femorph-solver layer cake. Data flows top-down: user code calls into femorph_solver.Model, which dispatches to element kernels for K_e / M_e, the sparse assembly aggregates them into K / M / F, the backend tier solves the linear / eigen / time-integration problem, and the result-object layer wraps the answer for the user.#

For a text-only rendering of the same stack (useful when diff-ing the source RST):

+----------------------------------------------------------+
|  User code (pyvista grid, materials, BCs, loads)         |
+----------------------------------------------------------+
|  femorph_solver.Model                                    |  ← public API
|    .from_grid / .assign / .fix / .apply_force            |
|    .solve / .solve_modal / .solve_harmonic / ...         |
+----------------------------------------------------------+
|  Element kernels (per topology: HEX8, BEAM2, ...)        |
|    K_e, M_e, strain_batch, stress_batch                  |
+----------------------------------------------------------+
|  Sparse assembly (scipy.sparse.csr_array)                |
|    K, M, F                                               |
+----------------------------------------------------------+
|  Linear / eigen / time-integration backends              |
|    Pardiso, CHOLMOD, MUMPS, UMFPACK, SuperLU,            |
|    ARPACK, LOBPCG, PRIMME, dense LAPACK,                 |
|    Newmark-β, modal superposition                        |
+----------------------------------------------------------+
|  Result objects (lazy IO over pyvista_zstd .pv)          |
|    StaticResult, ModalResult, TransientResult, ...       |
+----------------------------------------------------------+

Crossing one layer always goes through the layer immediately above or below it. An element kernel never calls a linear-solver backend directly; the assembly layer handles that. A solver backend never calls an element kernel directly; it only sees the assembled K and M.

Foreign-deck importers (from_bdf, from_inp, from_cdb, from_fem) sit outside this stack. They translate vendor input grammar into the neutral Model API and disappear; the solver never knows which reader produced the model.

Why pyvista as the model container#

Three pieces of state are essential to a finite-element model: geometry, connectivity, and per-element / per-node attribute arrays. pyvista.UnstructuredGrid (which wraps vtkUnstructuredGrid) already provides all three:

  • points, cells, and cell types as compact arrays — the same representation VTK has used for thirty years.

  • point_data / cell_data / field_data — labelled, typed arrays attached at the right level (per-node / per-element / per-mesh).

  • fast IO via the existing VTK / pyvista format ecosystem.

Building a custom node / element table on top of NumPy or C++ would duplicate every piece of that, give us a worse story for downstream visualisation, and reinvent file IO that already works. The trade-off is that we accept some pyvista-specific conventions (field-data dictionaries are 1-element arrays, cell connectivity is flat, …); but those conventions are documented and stable.

The Model class augments the grid with three dictionaries — element type, material, real constant — keyed by ANSYS-style integer ids. That’s the minimum bookkeeping the kernel layer needs to know which formula to apply to which cell.

The single deviation from “the grid is the source of truth” is the incremental-build staging dicts (_staged_nodes, _staged_elements). Adding nodes one at a time via the legacy N / E builders is too slow if every call rebuilds the grid, so the staging dicts buffer additions until the next grid-property read or solve, at which point they are flushed into the grid in one pass. After a flush the dicts are empty and the grid is the truth again.
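The buffer-then-flush pattern can be sketched in a few lines. The class and attribute names below mirror the description but are illustrative, not the real Model internals:

```python
import numpy as np

# Minimal sketch of the staged-build pattern: node additions are buffered
# in a dict and flushed into the contiguous points array in one pass on the
# next property read. Names here are hypothetical, not the real Model class.
class StagedNodes:
    def __init__(self):
        self._points = np.empty((0, 3))
        self._staged_nodes = {}          # node id -> (x, y, z)

    def add_node(self, nid, xyz):
        self._staged_nodes[nid] = xyz    # O(1); no grid rebuild per call

    @property
    def points(self):
        if self._staged_nodes:           # flush once, on first read
            new = np.array(
                [self._staged_nodes[k] for k in sorted(self._staged_nodes)]
            )
            self._points = np.vstack([self._points, new])
            self._staged_nodes.clear()   # the grid is the truth again
        return self._points
```

Adding 10,000 nodes costs 10,000 dict inserts and one array build, instead of 10,000 array rebuilds.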

Why scipy.sparse for K and M#

scipy.sparse.csr_array is the library standard for sparse linear algebra in Python; everything in the solver-backend tier (scikit-sparse / pypardiso / MUMPS / pyamg / ARPACK / LOBPCG / PRIMME) accepts it natively. Building a custom sparse type would require us to implement (or marshal) every matrix-vector, matrix-matrix, and factorise call those backends already do.

The element-kernel layer assembles into CSR via the standard COO triplets pattern (per-element i, j, v arrays concatenated and converted to CSR once). Free-DOF reduction applies a contiguous mask via fancy indexing. Both operations are O(nnz) and well-optimised in scipy.
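The triplet pattern and the free-DOF mask can both be shown on a toy chain of two unit-stiffness “spring” elements (a stand-in for real element kernels, not femorph-solver code):

```python
import numpy as np
from scipy.sparse import csr_array

# COO-triplet assembly: each element scatters its 2x2 block [[1,-1],[-1,1]]
# to its global DOFs; duplicate (i, j) entries are summed on CSR conversion.
conn = [(0, 1), (1, 2)]          # two elements sharing DOF 1
i, j, v = [], [], []
for a, b in conn:
    i += [a, a, b, b]
    j += [a, b, a, b]
    v += [1.0, -1.0, -1.0, 1.0]

K = csr_array((v, (i, j)), shape=(3, 3))   # one conversion, O(nnz)

# Free-DOF reduction: fix DOF 0, keep the rest via fancy indexing.
free = np.array([1, 2])
K_ff = K[free][:, free]
```

The shared DOF 1 picks up contributions from both elements, so K[1, 1] is 2.0 — exactly the summed-duplicates behaviour the assembly relies on.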

Where scipy.sparse is genuinely insufficient — e.g. block-CSR for the per-element 3-by-3 sub-matrices on a HEX8 — the kernel tier keeps a denser local representation and only converts to the global CSR at scatter time.

Why pluggable solver backends#

A linear-elastic FE problem on a 1 M-DOF mesh runs anywhere from “effortless” to “out of memory” depending on the linear-solver backend. Pinning to one backend forces a worst-of-both-worlds compromise: either we depend on a heavy MKL / Pardiso install on every machine, or we ship only the slowest fallback (SuperLU).

The fix is a thin registry pattern in femorph_solver.solvers.linear and femorph_solver.solvers.eigen. Each backend is one file with an available() probe and a factor / solve / decompose entry point. linear_solver="auto" walks the registry in priority order (Pardiso → CHOLMOD → UMFPACK → SuperLU) and picks the first one whose available() returns True. The user never has to know which one ran unless they care.

This also means adding a backend is a single new file plus a registry entry; nothing in the core has to change.
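The registry walk behind linear_solver="auto" amounts to a priority list of probe functions. The probes and list below are a hedged sketch; the real registry lives in femorph_solver.solvers.linear:

```python
# Sketch of the registry pattern: each backend contributes an available()
# probe; "auto" picks the first backend whose probe succeeds. Only two
# entries are shown here for brevity.
def _pardiso_available():
    try:
        import pypardiso  # noqa: F401
        return True
    except ImportError:
        return False

def _superlu_available():
    try:
        from scipy.sparse.linalg import splu  # noqa: F401
        return True
    except ImportError:
        return False

# Priority order: fastest first, universal fallback last.
REGISTRY = [
    ("pardiso", _pardiso_available),
    ("superlu", _superlu_available),
]

def pick_backend(name="auto"):
    if name != "auto":
        return name                      # explicit choice wins
    for backend, available in REGISTRY:
        if available():
            return backend
    raise RuntimeError("no linear-solver backend available")
```

Adding a backend means writing one probe plus one factor / solve entry point and appending a registry tuple; pick_backend never changes.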

C++ extension (_core) — what’s in scope, what isn’t#

The compiled extension exists for one reason: per-element Ke / Me / strain-batch loops are tight enough numerically that Python-level dispatch dominates. Moving the inner loops to C++ keeps the assembly path linear in element count without re-implementing a sparse-matrix library.

What lives in _core:

  • Per-element shape-function / Jacobian / B-matrix evaluation for the topologies where it pays off — HEX8, HEX20, TET10 — the hot path on larger meshes.

  • Per-element Ke / Me block computation as a row of an output buffer the Python side aggregates.

  • Strain-batch evaluation at the element nodes, used by femorph_solver.recover.compute_nodal_strain().

What stays in Python:

  • Element registry, Model class, BC / load bookkeeping, foreign-deck readers — all the surface a user ever touches.

  • Sparse assembly (the CSR matrix is built in Python; the C++ side only fills per-element block buffers).

  • Solver backend dispatch, including the available() probes.

  • Result classes and IO.

The boundary is enforced by keeping the C++ side strictly data-in, data-out: every callable takes plain numpy arrays and returns plain numpy arrays. No Python objects, no function pointers, no callback invariants. That keeps the binding surface (currently nanobind) thin and lets us swap implementations without touching Python code.
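The shape of that contract can be illustrated with a stand-in batch kernel. The 1D two-node bar formula below is a textbook toy standing in for the real HEX8 kernel; the function name is hypothetical:

```python
import numpy as np

# Illustration of the "plain arrays in, plain arrays out" contract: the
# kernel takes numpy arrays, fills a per-element block buffer, and returns
# it. No Python objects cross the boundary in either direction.
def bar_ke_batch(lengths: np.ndarray, EA: float) -> np.ndarray:
    """Return an (n_elem, 2, 2) buffer of element stiffness blocks."""
    base = np.array([[1.0, -1.0], [-1.0, 1.0]])
    return (EA / lengths)[:, None, None] * base   # broadcast per element

# The Python side owns aggregation; the kernel only fills the buffer.
ke = bar_ke_batch(np.array([1.0, 2.0]), EA=4.0)
```

Because the interface is arrays-only, the same signature could be served by a pure-numpy fallback or a compiled implementation interchangeably.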

Result classes — why disk-backed?#

For small problems (under ~50 k DOFs) keeping the result in memory is fine and the disk-backed result class would be overhead. For the realistic problems the project targets (cyclic-symmetry rotors, plate-with-hole verification, anything past 1 M DOFs), a single mode shape can be 100 MB; every mode in memory is gigabytes.

The disk-backed result classes (ModalResult and friends) sidestep that by storing each per-mode / per-step / per-frequency array as a separately-readable point-data field on a zstd-compressed pyvista file. Loading a single mode reads that mode’s array and nothing else. pyvista_zstd’s selected_point_arrays gates the read at the format level — sibling arrays are never touched.

The same model also makes “save the result and inspect it later” trivial: the file is a self-contained pyvista mesh; users (and external visualisers) can open it directly.
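The selective-read idea can be demonstrated with NumPy’s .npz archives, which also load member arrays on demand — an analogue for illustration only, not the actual pyvista_zstd .pv format:

```python
import io
import numpy as np

# Store each "mode" as a separately named archive member, then read back
# exactly one. np.load opens the archive without reading any array data;
# a member is decompressed only when it is indexed.
buf = io.BytesIO()
modes = {f"mode_{k}": np.full(1000, float(k)) for k in range(1, 4)}
np.savez(buf, **modes)
buf.seek(0)

archive = np.load(buf)      # opens the archive; no array data read yet
phi_2 = archive["mode_2"]   # only this member is read into memory
```

The disk-backed result classes apply the same pattern at mesh scale: one named array per mode, read only when asked for.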

Where to look when something’s wrong#

  • “Wrong number on a single element.” → start in src/femorph_solver/elements/<topology>.py. Each element page in Element kernels has the formulas it implements; the kernel file is the literal implementation.

  • “Wrong number on the global K / M.” → assembly is in src/femorph_solver/model.py (_assemble) and the C++ extension’s assemble_* entry points. Always reproduce with a 1- or 2-element mesh first; assembly bugs typically manifest only at element boundaries.

  • “Right K / M, wrong displacement / spectrum.” → look at the linear / eigen backend that linear_solver="auto" picked. Report() lists which backends are available; force a specific one (linear_solver="superlu") to bisect against the fastest path.

  • “Right answer in memory, wrong answer when read back from .pv.” → result IO is in src/femorph_solver/result/ and the write_* functions in each subclass. Result.metadata shows what the writer actually stamped on the file.

  • “Reader produces the wrong mesh from a vendor deck.” → look in src/femorph_solver/interop/<vendor>/. The reader is the thinnest sane wrapper around the vendor-grammar parser; the bug is usually in the parser’s output, not in our translation.

Each of these layers is independently regression-tested under tests/ so a fix isolated to one layer should not require changes in another.

Testing strategy#

Three test pillars, each with its own success criterion:

  • Analytical / closed-form regression (tests/analytical/, tests/validation/). Every assertion traces to a published formula. This is the first-principles pillar — it doesn’t triangulate against another solver, it asks “do we match what the textbook says”.

  • Verification manuals (tests/cross_solver/, tests/interop/mapdl/). Re-author published verification-manual problems (NAFEMS LE / AVM, MAPDL VM, NASTRAN VG) natively and assert published targets. Per-VM coverage is tracked in the Vendor verification-manual coverage matrix.

  • Cross-solver round-trip (tests/interop/). Import the same problem from a NASTRAN .bdf, an Abaqus .inp, and a MAPDL .cdb; assert the spectrum / displacement field agrees with the closed-form to the same tolerance as a hand-built mesh. Cross-solver parity is a consequence of correctness, not the proof of it.
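The analytical pillar’s style of assertion can be shown end-to-end on a tiny axial bar, where the FE tip displacement must match the closed form u = PL / (EA). The assembly below is a self-contained toy, not the femorph-solver API:

```python
import numpy as np

# Analytical-regression sketch: assemble a 4-element axial bar, fix one
# end, load the tip, and assert against the textbook closed form.
E, A, L, P = 200e9, 1e-4, 2.0, 1000.0
n_elem = 4
le = L / n_elem
k = E * A / le                      # stiffness of one axial element

# Global stiffness for a chain of equal elements.
n = n_elem + 1
K = np.zeros((n, n))
for e in range(n_elem):
    K[e:e + 2, e:e + 2] += k * np.array([[1.0, -1.0], [-1.0, 1.0]])

# Fix node 0, load the tip, solve the reduced system.
F = np.zeros(n); F[-1] = P
u = np.zeros(n)
u[1:] = np.linalg.solve(K[1:, 1:], F[1:])

closed_form = P * L / (E * A)
assert np.isclose(u[-1], closed_form)   # the assertion traces to a formula
```

Every test in tests/analytical/ follows this shape: the expected value is computed from a published formula, never copied from another solver’s output.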

Coverage is monitored on every PR via the diff-cover step in tests.yml; the headline targets are ≥ 90 % coverage across the package and 100 % on the result hierarchy.

Provenance and IP discipline#

Everything in femorph-solver is independently developed from public FEM literature. The Provenance inventory page tracks every non-trivial numerical kernel back to a published reference (Bathe, Zienkiewicz, Hughes, Cook, Rao, Timoshenko, …). Nothing is reverse-engineered or reproduced from confidential vendor material; SOURCES.md at the repo root is the authoritative sourcing policy.

This isn’t optional — see Verification and the fair-use citation pattern. PRs that introduce a new kernel without a citation block are bounced.

See also#