# Validation framework
femorph-solver ships a declarative validation framework that couples canonical FEA problems to their published reference values — closed-form textbook solutions, handbook tables, and peer-reviewed journal results — and runs mesh-convergence studies against them.
This page documents the framework. Runnable examples that
exercise it live under tests/analytical/; a future
“verification” gallery section will render the convergence
reports inline on the docs site (tracked as TA-10h in the
TODO).
## Why a framework
femorph-solver has three separate pillars of evidence:
- **Unit tests** — prove that individual functions do what their docstrings claim.
- **Regression tests** (`tests/analytical/`) — prove that the whole pipeline produces the right physics on canonical problems.
- **Validation framework** (this page) — proves that the pipeline converges to a published reference value at the theoretical rate.
The first two catch regressions. The third one gives users verifiable tolerance claims: “on this mesh, at this refinement, the expected relative error vs the published closed form is \(< \epsilon\), and the fitted convergence rate is \(p = 2/d\) as predicted by conforming-FE theory (Strang & Fix 1973).” Those are the claims you’d cite in an academic paper or a V&V report.
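The \(p = 2/d\) rate follows from combining two scalings: an \(O(h^2)\) method halves its error by 4× when \(h\) halves, while the DOF count grows by \(2^d\). A quick sanity check in plain Python (no framework dependencies, all numbers illustrative):

```python
import math

# Uniform refinement in d = 3 dimensions: halving h multiplies n_dof by 2**d
# while an O(h^2) error shrinks by a factor of 4.
d = 3
h_ratio = 2.0                 # coarse h / fine h
dof_ratio = h_ratio ** d      # n_dof grows 8x
err_ratio = h_ratio ** 2      # |e| shrinks 4x for an O(h^2) method

# Fitted log-log slope in |e| ~ n_dof^(-p)
p = math.log(err_ratio) / math.log(dof_ratio)
print(p)  # 2/3
```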
## Architecture
Three dataclasses and one ABC compose the framework:
- **`PublishedValue`**: a single reference quantity — value, unit, source citation, formula, acceptance tolerance. Immutable. The `source` string must uniquely identify the publication so a reviewer can find it without further lookup.
- **`BenchmarkProblem` (ABC)**: a declarative problem statement. Subclasses declare a tuple of `PublishedValue`, implement `build_model()` (construct a `Model` ready to solve), and implement `extract()` (pull a named quantity from the solve output). The base class provides `validate()`, which composes `build_model` → `solve`/`modal_solve` → `extract` and returns one `ValidationResult` per published quantity.
- **`ValidationResult`**: one validation attempt at a specific mesh refinement. Carries the published value, the computed value, the mesh parameters, the DOF count, and a `passed` predicate that compares the relative error against the `PublishedValue`’s tolerance.
- **`ConvergenceStudy`**: sweeps a `BenchmarkProblem` across a list of mesh refinements, captures a `ValidationResult` at each, and fits a log-log convergence rate \(|e| \sim n_\text{dof}^{-p}\) over the two finest refinements. Returns one `ConvergenceRecord` per published quantity.
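A minimal self-contained sketch of how the first two value types might fit together. Field names beyond those used elsewhere on this page (`name`, `value`, `unit`, `source`, `formula`, `n_dof`, `computed`, `rel_error`, `passed`) are assumptions, not the real API:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PublishedValue:
    """Stand-in for the framework's immutable reference dataclass."""
    name: str
    value: float
    unit: str
    source: str       # must uniquely identify the publication
    formula: str
    tolerance: float  # acceptance bound on |relative error|

@dataclass
class ValidationResult:
    """Stand-in: one validation attempt at a specific refinement."""
    published: PublishedValue
    computed: float
    n_dof: int

    @property
    def rel_error(self) -> float:
        return (self.computed - self.published.value) / self.published.value

    @property
    def passed(self) -> bool:
        return abs(self.rel_error) <= self.published.tolerance

# Illustrative numbers only:
pv = PublishedValue("tip_deflection", 1.0e-3, "m",
                    "Timoshenko 1955 §5.4", "P*L^3/(3*E*I)", 0.06)
r = ValidationResult(published=pv, computed=1.04e-3, n_dof=12_000)
print(r.rel_error, r.passed)  # ~0.04, True
```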
## Example
```python
from femorph_solver.validation import ConvergenceStudy
from femorph_solver.validation.problems import CantileverEulerBernoulli

study = ConvergenceStudy(
    problem=CantileverEulerBernoulli(),
    refinements=[
        {"nx": 20, "ny": 3, "nz": 3},
        {"nx": 40, "ny": 3, "nz": 3},
        {"nx": 80, "ny": 3, "nz": 3},
    ],
)
records = study.run()

for rec in records:
    pv = rec.results[0].published
    print(f"\n{pv.name} — {pv.source}")
    print(f"  formula:   {pv.formula}")
    print(f"  published: {pv.value:.6g} {pv.unit}")
    for r in rec.results:
        print(
            f"  n_dof={r.n_dof:>6}  computed={r.computed:+.6g}"
            f"  rel_err={r.rel_error * 100:+6.2f}%"
            f"  {'pass' if r.passed else 'FAIL'}"
        )
    if rec.convergence_rate is not None:
        print(f"  fitted rate: |err| ∝ n_dof^(-{rec.convergence_rate:.2f})")
```
The output quantifies two things the user can cite:

- **Point-wise accuracy**: computed vs published at each mesh.
- **Convergence rate**: empirical slope of the error-vs-DOFs log-log plot. For a conforming 3D FE the theoretical rate is \(p = 2/d = 2/3\); observed rates significantly below this signal an element, integration-rule, or formulation issue.
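The rate fit itself can be sketched in a few lines, assuming it is the ordinary two-point log-log slope over the two finest refinements described above (`fitted_rate` is an illustrative helper, not the framework's API):

```python
import math

def fitted_rate(n_dof, rel_err):
    """Log-log slope p in |e| ~ n_dof^(-p), from the two finest meshes."""
    (n1, e1), (n2, e2) = sorted(zip(n_dof, rel_err))[-2:]
    return -math.log(abs(e2) / abs(e1)) / math.log(n2 / n1)

# Synthetic errors following an exact p = 2/3 law:
n = [1_000, 8_000, 64_000]
e = [x ** (-2 / 3) for x in n]
print(fitted_rate(n, e))  # ~0.667
```

A fitted slope well under 2/3 on real data is the signal referred to above: the element formulation or integration rule is not delivering the theoretical order.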
## Reporters
femorph_solver.validation._report.write_report() produces
both JSON (machine-consumable, append-only, feeds the TA-2
estimator) and Markdown (drops into docs / PR descriptions /
release notes). Both formats carry the full
PublishedValue citation so the report is
self-describing: a reader with only the output file can
reproduce the check.
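A hypothetical sketch of the kind of self-describing payload `write_report()` emits. Every key here is an assumption except the requirements stated on this page (full `PublishedValue` citation, per-refinement `wall_s` and `peak_rss_mb`); the numbers are made up:

```python
import json

report = {
    "problem": "CantileverEulerBernoulli",
    "quantity": "tip_deflection",
    "published": {
        "value": 1.0e-3, "unit": "m",
        "source": "Timoshenko 1955 §5.4",
        "formula": "P*L^3/(3*E*I)",
    },
    "refinements": [
        {"n_dof": 4_000, "computed": 1.08e-3, "rel_error": 0.08,
         "wall_s": 0.9, "peak_rss_mb": 180.0},
        {"n_dof": 30_000, "computed": 1.02e-3, "rel_error": 0.02,
         "wall_s": 6.1, "peak_rss_mb": 410.0},
    ],
}
print(json.dumps(report, indent=2))
```

The point of carrying the citation in-band is exactly what the paragraph above states: the output file alone is enough to reproduce the check.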
The JSON schema also carries per-refinement wall_s and
peak_rss_mb captured inside validate() — so validation
runs can feed the femorph_solver.estimators training
loader the same way benchmark runs do:
```python
from femorph_solver.estimators import (
    Estimator,
    load_training_rows,
    load_validation_rows,
)

rows = (
    load_training_rows()
    + load_validation_rows(host_signature="workstation-a")
)
estimator = Estimator.fit(rows)
estimate = estimator.predict(problem_spec, host_spec)
```
This lets every canonical verification study contribute time / memory observations to the estimator fit alongside the benchmark sweeps — useful early on when a machine has only a handful of benchmark runs.
## Problem catalogue
The catalogue lives under
femorph_solver.validation.problems. Current entries:
| Problem | Source | Verifies | Status |
|---|---|---|---|
| | Hughes 2000 §2.7 | 3D Hooke’s law + Poisson contraction + σ=Eε stress recovery | Machine precision (rel err ~1e-13) |
| | Irons & Razzaque 1972 | Uniform strain recovered exactly on distorted patch | Machine precision (abs err ~1e-19) |
| | Timoshenko 1955 §5.4 | Tip deflection + root bending stress \(\sigma = P L c / I\) | Converges; ≲ 6 % deflection at nx=40, ~20 % root stress |
| | Rao 2017 §8.5 Table 8.1 | First bending frequency | Converges; ≲ 3 % at nx=40 |
| | Timoshenko & Woinowsky-Krieger 1959 §63 | SS plate fundamental \(f_{1,1}\) | |
| | Timoshenko & Woinowsky-Krieger 1959 §30-§31 | SS plate centre deflection vs Navier double-sine series | Converges; ≲ 15 % at 40×40×2 mesh |
Planned additions (tracked as TA-10h-* in
TODO-parity-perf.md):
- Cylindrical shell modes (Soedel 2004 §5).
- Kirchhoff plate on a proper shell kernel (paired with MITC4 SHELL181).
## Extending the catalogue
Adding a new benchmark takes three steps:

1. **Identify the published reference.** The source must be public — a textbook, paper, handbook, or open technical report.
2. **Write a `BenchmarkProblem` subclass** under `src/femorph_solver/validation/_problems/`. Declare the `PublishedValue` tuple with the formula in a `__doc__` block the docs can auto-reference; implement `build_model` using the native `Model` API; implement `extract` to pull the quantity from the solve output.
3. **Re-export the class** from `src/femorph_solver/validation/problems.py`, and add a regression test under `tests/analytical/` that pins the recommended mesh’s tolerance.
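A structural sketch of step 2. The real subclass would derive from the framework's `BenchmarkProblem` and build a `Model` with the native API; everything below (the `AxialRod` problem, its numbers, and the local `PublishedValue` stand-in) is purely illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PublishedValue:  # stand-in for femorph_solver.validation.PublishedValue
    name: str
    value: float
    unit: str
    source: str
    formula: str
    tolerance: float

class AxialRod:  # stand-in for a BenchmarkProblem subclass
    """Rod under end load: delta = P*L/(E*A) (hypothetical catalogue entry)."""

    published = (
        PublishedValue(
            name="end_displacement",
            value=1.0e3 * 2.0 / (200e9 * 1e-4),  # P*L / (E*A), SI units
            unit="m",
            source="standard mechanics-of-materials axial-bar solution",
            formula="delta = P*L/(E*A)",
            tolerance=0.01,
        ),
    )

    def build_model(self):
        ...  # construct a Model with the native API (omitted in this sketch)

    def extract(self, name, output):
        ...  # pull the named quantity from the solve output

print(AxialRod.published[0].value)  # 1e-4 (metres)
```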
The framework is deliberately small — ~250 lines including the reporters — so the review burden for new entries stays on the physics rather than plumbing.