Validation framework
====================

femorph-solver ships a **declarative validation framework** that couples canonical FEA problems to their **published** reference values — closed-form textbook solutions, handbook tables, and peer-reviewed journal results — and runs mesh-convergence studies against them.

This page documents the framework. Runnable examples that exercise it live under ``tests/analytical/``; a future "verification" gallery section will render the convergence reports inline on the docs site (tracked as ``TA-10h`` in the TODO).

Why a framework
---------------

femorph-solver has three separate pillars of evidence:

* **Unit tests** — prove that individual functions do what their docstrings claim.
* **Regression tests** (``tests/analytical/``) — prove that the whole pipeline produces the right physics on canonical problems.
* **Validation framework** (this page) — proves that the pipeline *converges to a published reference value* at the theoretical rate.

The first two catch regressions. The third one gives users **verifiable tolerance claims**: "on this mesh, at this refinement, the expected relative error vs the published closed form is :math:`< \epsilon`, and the fitted convergence rate is :math:`p = 2/d` as predicted by conforming-FE theory (Strang & Fix 1973)." Those are the claims you'd cite in an academic paper or a V&V report.

Architecture
------------

Three dataclasses and one ABC compose the framework:

:class:`~femorph_solver.validation.PublishedValue`
    A single reference quantity — value, unit, source citation, formula, acceptance tolerance. Immutable. The ``source`` string must uniquely identify the publication so a reviewer can find it without further lookup.

:class:`~femorph_solver.validation.BenchmarkProblem` (ABC)
    A declarative problem statement.
    Subclasses declare a tuple of :class:`PublishedValue` instances, implement :meth:`~BenchmarkProblem.build_model` (construct a :class:`~femorph_solver.Model` ready to solve), and implement :meth:`~BenchmarkProblem.extract` (pull a named quantity from the solve output). The base class provides :meth:`~BenchmarkProblem.validate`, which composes ``build_model`` → ``solve`` / ``modal_solve`` → ``extract`` and returns one :class:`ValidationResult` per published quantity.

:class:`~femorph_solver.validation.ValidationResult`
    One validation attempt at a specific mesh refinement. Carries the published value, the computed value, the mesh parameters, the DOF count, and a ``passed`` predicate that compares the relative error against the :class:`PublishedValue`'s tolerance.

:class:`~femorph_solver.validation.ConvergenceStudy`
    Sweeps a :class:`BenchmarkProblem` across a list of mesh refinements, captures a :class:`ValidationResult` at each, and fits a log-log convergence rate :math:`|e| \sim n_\text{dof}^{-p}` over the two finest refinements. Returns one :class:`ConvergenceRecord` per published quantity.

Example
-------
.. code-block:: python

    from femorph_solver.validation import ConvergenceStudy
    from femorph_solver.validation.problems import CantileverEulerBernoulli

    study = ConvergenceStudy(
        problem=CantileverEulerBernoulli(),
        refinements=[
            {"nx": 20, "ny": 3, "nz": 3},
            {"nx": 40, "ny": 3, "nz": 3},
            {"nx": 80, "ny": 3, "nz": 3},
        ],
    )
    records = study.run()

    for rec in records:
        pv = rec.results[0].published
        print(f"\n{pv.name} — {pv.source}")
        print(f"  formula:   {pv.formula}")
        print(f"  published: {pv.value:.6g} {pv.unit}")
        for r in rec.results:
            print(
                f"  n_dof={r.n_dof:>6}  computed={r.computed:+.6g}"
                f"  rel_err={r.rel_error * 100:+6.2f}%"
                f"  {'pass' if r.passed else 'FAIL'}"
            )
        if rec.convergence_rate is not None:
            print(f"  fitted rate: |err| ∝ n_dof^(-{rec.convergence_rate:.2f})")

The output quantifies two things the user can cite:

* **Point-wise accuracy**: computed vs published at each mesh.
* **Convergence rate**: the empirical slope of the error-vs-DOFs log-log plot. For a conforming 3D FE the theoretical rate is :math:`p = 2/d = 2/3`; observed rates significantly below this signal an element, integration-rule, or formulation issue.

Reporters
---------

:func:`femorph_solver.validation._report.write_report` produces both JSON (machine-consumable, append-only, feeds the TA-2 estimator) and Markdown (drops into docs / PR descriptions / release notes). Both formats carry the full :class:`PublishedValue` citation, so the report is self-describing: a reader with only the output file can reproduce the check.

The JSON schema also carries per-refinement ``wall_s`` and ``peak_rss_mb`` captured inside ``validate()``, so validation runs can feed the :mod:`femorph_solver.estimators` training loader the same way benchmark runs do:
.. code-block:: python

    from femorph_solver.estimators import (
        Estimator,
        load_training_rows,
        load_validation_rows,
    )

    rows = (
        load_training_rows()
        + load_validation_rows(host_signature="workstation-a")
    )
    estimator = Estimator.fit(rows)
    estimate = estimator.predict(problem_spec, host_spec)

This lets every canonical verification study contribute time / memory observations to the estimator fit alongside the benchmark sweeps — useful early on, when a machine has only a handful of benchmark runs.

Problem catalogue
-----------------

The catalogue lives under :mod:`femorph_solver.validation.problems`. Current entries:

.. list-table::
   :widths: 28 22 40 10
   :header-rows: 1

   * - Problem
     - Source
     - Verifies
     - Status
   * - :class:`~femorph_solver.validation.problems.SingleHexUniaxial`
     - Hughes 2000 §2.7
     - 3D Hooke's law + Poisson contraction + σ=Eε stress recovery
     - Machine precision (rel err ~1e-13)
   * - :class:`~femorph_solver.validation.problems.IronsPatchTest`
     - Irons & Razzaque 1972
     - Uniform strain recovered exactly on a distorted patch
     - Machine precision (abs err ~1e-19)
   * - :class:`~femorph_solver.validation.problems.CantileverEulerBernoulli`
     - Timoshenko 1955 §5.4
     - Tip deflection + root bending stress :math:`\sigma = P L c / I`
     - Converges; ≲ 6 % deflection at nx=40, ~20 % root stress
   * - :class:`~femorph_solver.validation.problems.CantileverNaturalFrequency`
     - Rao 2017 §8.5 Table 8.1
     - First bending frequency
     - Converges; ≲ 3 % at nx=40
   * - :class:`~femorph_solver.validation.problems.KirchhoffSSPlateModes`
     - Timoshenko & Woinowsky-Krieger 1959 §63
     - SS plate fundamental :math:`f_{1,1}`
     - ``xfail`` — needs shell kernel or thicker geometry
   * - :class:`~femorph_solver.validation.problems.SSPlateStatic`
     - Timoshenko & Woinowsky-Krieger 1959 §30–§31
     - SS plate centre deflection vs the Navier double-sine series
     - Converges; ≲ 15 % at a 40×40×2 mesh

Planned additions (tracked as ``TA-10h-*`` in ``TODO-parity-perf.md``):

* Cylindrical shell modes (Soedel 2004 §5).
* Kirchhoff plate on a proper shell kernel (paired with MITC4 SHELL181).

Extending the catalogue
-----------------------

Adding a new benchmark is three steps:

1. Identify the published reference. The source must be **public** — a textbook, paper, handbook, or open technical report.
2. Write a ``BenchmarkProblem`` subclass under ``src/femorph_solver/validation/_problems/``. Declare the :class:`PublishedValue` tuple with the formula in a ``__doc__`` block the docs can auto-reference; implement ``build_model`` using the native :class:`~femorph_solver.Model` API; implement ``extract`` to pull the quantity from the solve output.
3. Re-export the class from ``src/femorph_solver/validation/problems.py``. Add a regression test under ``tests/analytical/`` that pins the recommended mesh's tolerance.

The framework is deliberately small — roughly 250 lines including the reporters — so the review burden for new entries stays on the *physics* rather than the plumbing.
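The subclass shape from step 2 can be sketched with stand-in definitions. This is a sketch only: the real :class:`PublishedValue` and :class:`BenchmarkProblem` live in :mod:`femorph_solver.validation`, and the bar problem below, the ``rel_tol`` field name, and every number in it are illustrative placeholders rather than values from the codebase.

.. code-block:: python

    from dataclasses import dataclass


    @dataclass(frozen=True)
    class PublishedValue:  # stand-in mirroring the fields described above
        name: str
        value: float
        unit: str
        source: str
        formula: str
        rel_tol: float  # acceptance tolerance on relative error (field name guessed)


    class BarUniaxialElongation:  # hypothetical benchmark illustrating step 2
        """Axial elongation of a bar under an end load: delta = P L / (E A)."""

        published = (
            PublishedValue(
                name="tip_elongation",
                value=1.0e3 * 2.0 / (200e9 * 1e-4),  # P L / (E A), placeholder numbers
                unit="m",
                source="Textbook YYYY §x.y (placeholder, not a real citation)",
                formula="delta = P * L / (E * A)",
                rel_tol=1e-6,
            ),
        )

        def build_model(self):
            # A real subclass constructs and returns a femorph_solver.Model
            # here; elided in this sketch.
            raise NotImplementedError

        def extract(self, result, name):
            # A real subclass pulls the named quantity out of the solve
            # output; elided in this sketch.
            raise NotImplementedError

The stand-ins exist only to make the shape concrete; a real entry imports the base classes, builds an actual mesh, and cites a public reference exactly as the table above does.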