Verification-manual spec
========================

The contract for adding a Verification Manual (VM) example to
``femorph-solver``.  Internal-developer scope.  Replaces the older
"Nastran is run through the cross-solver harness, MAPDL is run through
per-VM files" split — every VM problem now lands as a row in one
registry and runs through one parametrised harness, regardless of which
vendor's manual it came from.

If you are about to author a VM, this page is the only one you must
read end to end.  Companion documents:

* :doc:`provenance` — every published reference cited from
  ``femorph-solver`` source.
* ``mpdl/docs/verification-methodology.md`` — the philosophy of
  resolving discrepancies (branches a/b/c/d).  Quoted in `What "passing"
  means`_.

.. contents:: Page contents
    :local:
    :depth: 2

Goal
----

100 % of *in-scope* published Verification Manual problems, across
every solver we read.  "In scope" is exhaustively defined in `Scope`_
below — the short version is **linear-elastic structural** (static,
modal, harmonic, cyclic-symmetry; the element-formulation tests; the
NAFEMS LE / FV / Test-5 / 13 / 21 family).

Every benchmark must:

#. Round-trip through our reader from the *vendor's native deck format*
   (``.bdf`` / ``.dat`` / ``.inp`` / ``.cdb``) — never re-built in
   Python in a way that bypasses the reader.  The reader is the
   primary thing under test.
#. Match an analytical or NAFEMS reference within a tolerance pinned
   *just above* the empirically-measured FE error — equal-or-better
   than the source vendor's published number.
#. Carry a written record of any discrepancy resolution
   (`What "passing" means`_).

Coverage targets are tracked on `#601`_ (umbrella project) with detail
trackers `#345`_ (MSC Nastran VG) and `#511`_ (Ansys MAPDL VM).

.. _#601: https://github.com/femorph/solver/issues/601
.. _#345: https://github.com/femorph/solver/issues/345
.. _#511: https://github.com/femorph/solver/issues/511

Layout
------

A VM problem is **two artefacts** (the vendor deck, one file per format, plus one registry row) and (sometimes) one Python file.

::

    tests/interop/<vendor>/fixtures/<stem>.<ext>     # vendor deck (one per format)
    tests/cross_solver/_verification_registry.py    # one row per stem
    tests/validation/<vendor>_vm/test_<stem>.py     # OPTIONAL — only while the
                                                    #   reader for <ext> is pending

Vendor / extension matrix
~~~~~~~~~~~~~~~~~~~~~~~~~

.. list-table::
    :header-rows: 1
    :widths: 18 14 30 38

    * - Vendor
      - Extension
      - Fixture root
      - Reader
    * - MSC Nastran / NX Nastran / Autodesk Nastran
      - ``.bdf``
      - ``tests/interop/nastran/fixtures/``
      - :func:`femorph_solver.interop.nastran.from_bdf`
    * - Abaqus / SIMULIA
      - ``.inp``
      - ``tests/interop/abaqus/fixtures/``
      - :func:`femorph_solver.interop.abaqus.from_inp`
    * - Ansys MAPDL — input deck
      - ``.dat``
      - ``tests/interop/mapdl/fixtures/``
      - :func:`femorph_solver.interop.mapdl.from_dat` (Phase 1 landed
        in #522; per-card coverage is grown incrementally — when a
        VM hits a card the parser doesn't yet handle, fall back to
        `Reader-pending fallback`_)
    * - Ansys MAPDL (``CDWRITE`` text archive)
      - ``.cdb``
      - ``tests/interop/mapdl/fixtures/``
      - :func:`femorph_solver.interop.mapdl.from_cdb`

A row's ``formats`` field declares which extensions to walk.  When
multiple extensions are paired (e.g. ``("bdf", "inp")``), the harness
also asserts the readers agree with each other — see
`Cross-format agreement`_.

Stem-naming convention
----------------------

The stem is the **on-disk filename without extension** and the
registry's primary key.  Conventions by source:

* ``vm_msc_vg_<section>_<descriptor>`` — MSC Nastran 2024.1 VG
  (e.g. ``vm_msc_vg_2_5_cantilever_statics``,
  ``vm_msc_vg_6_fv12_free_square_plate``).
* ``vm<N>[_<formulation>]`` — Ansys MAPDL VM (vendor's official
  numbering, e.g. ``vm1``, ``vm5_plane182_enhanced``,
  ``vm5_plane183``).
* ``vm_nafems_<id>`` — NAFEMS standalone benchmark
  (e.g. ``vm_nafems_le1``).
* ``vm_<descriptor>`` — internal canonical problems with a clear
  closed form, no specific manual citation
  (e.g. ``vm_cantilever_eb``, ``vm_pinched_ring``).

The descriptor disambiguates problems that share a number but exercise
different load cases or element formulations (e.g.
``vm_msc_vg_1_1_macneal_harder_axial``,
``vm_msc_vg_1_1_macneal_harder_yshear``,
``vm_msc_vg_1_1_macneal_harder_zshear``).  See `Multi-formulation
rows`_.

.. note::

    Stems are **case-sensitive** and must match the on-disk filename
    exactly.  The harness fails loudly on a missing fixture, because a
    silent skip would let the matrix rot.

Sharing one fixture across multiple rows (``fixture_stem``)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When a single deck carries several independent assertions — e.g. a
uniaxial-tension cube reads UX on the x-face *and* UY on the y-face
*and* UZ on the z-face — give each assertion its own registry stem
and point all of them at one fixture file via ``fixture_stem``:

.. code-block:: python

    REGISTRY["vm_single_hex_uniaxial_axial"] = StaticSpec(
        coord_axis=0, coord_value=1.0,
        dof_index=0, expected=5.0e-6, tolerance=1.0e-6,
        fixture_stem="vm_single_hex_uniaxial",   # all three rows
    )                                            # share one fixture
    REGISTRY["vm_single_hex_uniaxial_poisson_y"] = StaticSpec(
        coord_axis=1, coord_value=1.0,
        dof_index=1, expected=-1.5e-6, tolerance=1.0e-6,
        fixture_stem="vm_single_hex_uniaxial",
    )

``fixture_stem`` defaults to ``None``; the harness then uses the
registry key as the on-disk fixture stem (the standard
one-row-one-fixture pattern).
This is the right knob for splitting "one deck → N closed-form
checks" into N independent registry rows that the harness reports
separately.
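
The fallback can be sketched as below.  This is an illustration under
stated assumptions, not the harness code: ``resolve_fixture`` and its
``fixture_root`` argument are hypothetical names introduced here.

.. code-block:: python

    from pathlib import Path
    from typing import Optional


    def resolve_fixture(
        key: str,
        fixture_stem: Optional[str],
        ext: str,
        fixture_root: Path,
    ) -> Path:
        """Return the deck path for one registry row (hypothetical helper)."""
        # Fall back to the registry key when no shared fixture is named.
        stem = fixture_stem if fixture_stem is not None else key
        path = fixture_root / f"{stem}.{ext}"
        # Fail loudly rather than skip, so a missing deck cannot hide.
        if not path.is_file():
            raise FileNotFoundError(f"fixture for row {key!r} not found: {path}")
        return path

Rows that share a deck resolve to the same path, while each keeps its
own registry key and therefore its own line in the harness report.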

Fixture authoring
-----------------

Vendor decks fall into two categories.  The provenance handling
differs:

Re-author from problem statement (default)
    For corpora whose decks Hexagon / Dassault / NAFEMS publish only
    inside the PDF (or only behind a vendor login), the fixture is
    **re-authored from scratch** off the public *problem statement*:
    geometry, materials, BCs, loads come from the published narrative;
    deck text is never lifted.  Header comment cites the source
    chapter and PDF page.  This is the convention for every
    ``.bdf``-based fixture in the corpus.

Vendor-shipped (with provenance preserved)
    For corpora that ship the input deck openly under public terms
    (Ansys publishes every ``vm{N}.dat`` on the help portal in plain
    text; ``.inp`` mirrors are commonly redistributed under fair use),
    the fixture is the vendor file with its ``/COM`` / ``** ANSYS
    MEDIA REL.`` provenance line preserved.  Do not strip vendor
    headers; do not edit the deck.

If a vendor's deck is available from a primary source, prefer the
vendor-shipped path: it lets the reader exercise the deck text the
real-world user will hand it.  Otherwise re-author.

Either way:

* The deck is **immutable** in the test repo.  If it doesn't run, the
  fix is on the *reader* or *kernel* side, not the deck.  This rule
  is enforced — it codifies the
  ``doc/source/verification/vendor_matrix.rst`` "do not modify input
  files" callout.
* If the deck was authored as a workaround for a reader bug
  (the ``vm_pinched_ring`` and ``vm_msc_vg_2_5_cantilever_statics``
  decks before #573), it is *non-canonical*; opening a coordinated
  reader-fix-plus-deck-canonicalisation PR is the right answer
  (#509 was the prior reference case).

Vendor decks that contain licence-restricted geometry, problem
statements, or commentary (Hexagon's MSC VG copy-restricted notes,
SIMULIA's licensed examples) are **not** vendored under this corpus.
Re-author.

Spec types
----------

The registry holds dataclasses keyed by stem.  Adding a new VM is one
``dataclass(frozen=True)`` instance plus its fixture pair.  The four
spec types share the same selector primitives (`Selectors`_) and the
same status fields (`xfail / xfail_agreement`_).

ModalSpec
~~~~~~~~~

Asserts a sequence of natural frequencies from
``Model.solve_modal``.  Use for any benchmark whose published target
is one or more eigenfrequencies.

.. code-block:: python

    REGISTRY = {
        "vm_cantilever_modes": ModalSpec(
            n_modes=8,
            modes=(40.769, 40.769, 255.495, 255.495),
            tolerance=(5.0e-3,) * 4,
            rbm_threshold_hz=0.0,
            notes="Cantilever EB beam — degenerate yz pairs.",
        ),
    }

Fields:

* ``n_modes`` — number of modes the eigensolver is asked for
  (must include enough headroom for any rigid-body residuals).
* ``modes`` — reference frequencies (Hz), elastic-only, ascending.
* ``tolerance`` — per-mode relative tolerance.
* ``rbm_threshold_hz`` — frequencies below this are dropped before
  matching.  Use ``1.0`` for free-free structures (filters the 6 RBM
  residuals); use ``0.0`` when no RBM is expected.
* ``sigma`` — eigensolver shift.  Use a small negative value
  (``-1.0``) when ``K`` is positive *semi-* definite (free
  structures, partial constraints) so cholmod can factor
  ``(K + |σ|·M)``.
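
How ``rbm_threshold_hz`` and the per-mode ``tolerance`` interact can be
sketched as follows.  This is a minimal illustration, not the harness
implementation: ``match_modes`` is a hypothetical name, and the inputs
are plain Python sequences rather than solver results.

.. code-block:: python

    def match_modes(computed_hz, reference_hz, tolerance, rbm_threshold_hz=0.0):
        """Compare computed frequencies with references (illustrative only)."""
        # Drop rigid-body residuals that fall below the threshold.
        elastic = sorted(f for f in computed_hz if f >= rbm_threshold_hz)
        if len(elastic) < len(reference_hz):
            raise AssertionError("too few elastic modes; n_modes needs more headroom")
        errors = []
        for freq, ref, tol in zip(elastic, reference_hz, tolerance):
            rel_err = abs(freq - ref) / abs(ref)
            if rel_err > tol:
                raise AssertionError(
                    f"mode at {freq} Hz is off by {rel_err:.3e} (tolerance {tol:.3e})"
                )
            errors.append(rel_err)
        return errors

For a free-free structure, the six near-zero residuals are discarded
first, which is why ``n_modes`` has to exceed ``len(modes)`` by at
least that many.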

StaticSpec
~~~~~~~~~~

Asserts one displacement DOF — at a single node, or as a face-mean
across every node matching a coordinate filter.  This is the workhorse
spec for every "tip deflection equals analytical closed form" row.

Single-node form:

.. code-block:: python

    REGISTRY["vm_cantilever_eb"] = StaticSpec(
        node_id=2,
        dof_index=1,        # UY
        expected=3.2e-3,    # F L³ / (3 E I)
        tolerance=1.0e-3,
        notes="Hermite-cubic CBAR is EB-exact at the tip.",
    )

Face-mean form (3D-meshed cantilever benchmarks where the tip face
rotates and the centroidal value is the mean):

.. code-block:: python

    REGISTRY["vm_msc_vg_1_1_macneal_harder_axial"] = StaticSpec(
        coord_axis=0,           # filter on points[:, 0]
        coord_value=6.0,        # x = L
        coord_tol=1.0e-6,
        dof_index=0,            # UX
        expected=3.0e-5,        # F L / (E A)
        tolerance=1.0e-3,
        notes="MacNeal-Harder cantilever, axial.  Tip-face mean UX.",
    )

The single-node and face-mean forms are mutually exclusive; supplying
both (or neither) raises ``ValueError``.
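
One way to enforce that rule is a ``__post_init__`` check on the
frozen dataclass.  The sketch below is a guess at the shape, not the
actual ``StaticSpec``: ``SelectorSketch`` is a hypothetical stand-in
that carries only the selector fields, and the ``coord_tol`` default
is an assumption.

.. code-block:: python

    from dataclasses import dataclass
    from typing import Optional


    @dataclass(frozen=True)
    class SelectorSketch:
        """Hypothetical stand-in for the selector fields of a spec."""

        node_id: Optional[int] = None
        coord_axis: Optional[int] = None
        coord_value: Optional[float] = None
        coord_tol: float = 1.0e-6  # assumed default, not taken from the source

        def __post_init__(self) -> None:
            by_node = self.node_id is not None
            by_face = self.coord_axis is not None or self.coord_value is not None
            # Exactly one of the two selector forms must be supplied.
            if by_node == by_face:
                raise ValueError(
                    "supply either node_id or coord_axis/coord_value, not both or neither"
                )
            # The face-mean form needs both halves of the coordinate filter.
            if by_face and (self.coord_axis is None or self.coord_value is None):
                raise ValueError("coord_axis and coord_value must be given together")

Validating at construction time means a malformed row fails when the
registry module is imported, before any deck is read.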

StressSpec
~~~~~~~~~~

Asserts a Voigt stress component (0=σ_xx, 1=σ_yy, 2=σ_zz, 3=σ_xy,
4=σ_yz, 5=σ_xz) at a node or as a face-mean.  Recovers nodal stress
via :func:`compute_nodal_stress(model, displacement)
<femorph_solver.result._stress_recovery.compute_nodal_stress>`
internally; covers element-level stress benchmarks (Kirsch K_t at
the plate-with-hole hole top, NAFEMS LE1 σ_yy at point D, MAPDL VM2
bending stress, MAPDL VM5 PLANE σ_xx).

Single-node form:

.. code-block:: python

    REGISTRY["vm_nafems_le1"] = StressSpec(
        node_id=1,
        component=1,           # σ_yy
        expected=92.7e6,       # Pa (NAFEMS R0015 §2.1)
        tolerance=8.0e-2,      # NAFEMS-specified 8 %
    )

Face-mean form: same ``coord_axis`` / ``coord_value`` /
``coord_tol`` triple as ``StaticSpec``.

ReactionSpec (planned — see `Roadmap`_)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Asserts a reaction force at a constrained DOF.  Required for VMs that
target reaction values (Ansys VM1's ``REAC_1`` / ``REAC_2`` is the
canonical case).  Same selector primitives as ``StaticSpec``.

Selectors
---------

A spec picks the row(s) of the displacement / stress array it
asserts on:

* ``node_id`` — single 1-based node ID resolved at runtime via the
  ``ansys_node_num`` point-data array.
* ``coord_axis`` + ``coord_value`` (+ ``coord_tol``) — every node
  whose ``points[:, coord_axis]`` is within ``coord_tol`` of
  ``coord_value``.  The harness averages the requested DOF across
  the matching nodes.

These two are mutually exclusive on any one spec instance.  Future
selectors (multi-axis filter, radial projection) should land as a new
mutually-exclusive bundle, not by overloading existing fields.
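
The coordinate filter amounts to a masked mean.  The sketch below uses
plain Python sequences to stay dependency-free; the harness presumably
works on arrays instead, and ``face_mean`` is a hypothetical name
introduced here.

.. code-block:: python

    def face_mean(points, values, coord_axis, coord_value, coord_tol=1.0e-6):
        """Average ``values`` over nodes on the plane ``coord == coord_value``."""
        picked = [
            value
            for point, value in zip(points, values)
            if abs(point[coord_axis] - coord_value) <= coord_tol
        ]
        # An empty selection is an authoring error, not a zero result.
        if not picked:
            raise ValueError("coordinate filter matched no nodes")
        return sum(picked) / len(picked)


    # Four tip-face nodes at x = 6 plus one root node at x = 0.
    points = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (6.0, 1.0, 0.0),
              (6.0, 0.0, 1.0), (6.0, 1.0, 1.0)]
    ux = [0.0, 2.9e-5, 3.1e-5, 3.0e-5, 3.0e-5]
    tip_mean_ux = face_mean(points, ux, coord_axis=0, coord_value=6.0)

Raising on an empty match, rather than returning zero, is a design
choice in this sketch: a typo in ``coord_value`` then surfaces as an
error and cannot be mistaken for a legitimate zero displacement.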

Multi-formulation rows
----------------------

A VM problem that intentionally cycles element formulations becomes
**N rows** in the registry, one per formulation.  Convention is
``<problem>_<formulation>``:

::

    vm5_plane182_enhanced     # PLANE182 KEYOPT(1)=2 — Wilson Q6 EAS
    vm5_plane183              # PLANE183 — 8-node serendipity

Each row carries its own assertion (the same target reference, but the
solver answer differs by formulation).  The matrix grows linearly and
stays declarative.  Same pattern is in use today for
``vm_msc_vg_1_1_macneal_harder_{axial,yshear,zshear}`` (load case
cycle) and ``vm_msc_vg_1_8_twisted_beam_{inplane,outplane}``.

If a VM cycles **both** load case and formulation, fan out to the
Cartesian product (e.g. ``vm9_hex8_axial``, ``vm9_hex20_axial``,
``vm9_hex8_yshear`` …).
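
The fan-out is a plain Cartesian product over the two axes.  In the
sketch below the formulation and load-case tuples are illustrative
values read off the example stems above; they are not a statement of
what the real ``vm9`` rows contain.

.. code-block:: python

    from itertools import product

    formulations = ("hex8", "hex20")   # illustrative
    load_cases = ("axial", "yshear")   # illustrative

    # One registry stem per (load case, formulation) combination.
    stems = [
        f"vm9_{formulation}_{case}"
        for case, formulation in product(load_cases, formulations)
    ]

Each generated stem still needs its own spec instance in the registry;
the product only enumerates the keys.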

Cross-format agreement
----------------------

When a row's ``formats`` lists more than one extension, the harness
auto-fires a **pairwise agreement** test for every (fmt_i, fmt_j) pair
in addition to the per-format round-trip:

::

    formats=("bdf", "inp")
        ⇒ 2 round-trip cases  +  1 BDF↔INP agreement case

    formats=("bdf", "inp", "dat")
        ⇒ 3 round-trip cases  +  3 pairwise agreements

The agreement check uses ``agreement_rtol`` / ``agreement_atol`` from
the spec.  Tolerances should be tighter than the published-vendor
tolerance — it tests *reader* parity, not solver accuracy.
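
The case counts above follow from taking every unordered pair of
formats.  A minimal sketch, assuming a hypothetical ``expand_cases``
helper; the round-trip ID shape mirrors the harness IDs quoted in the
PR checklist, while the agreement ID shape is invented for
illustration:

.. code-block:: python

    from itertools import combinations


    def expand_cases(stem, formats):
        """List the test cases one registry row fans out to (illustrative)."""
        # One round-trip case per declared format.
        round_trips = [f"{stem}-{fmt}" for fmt in formats]
        # One agreement case per unordered pair of formats.
        agreements = [f"{stem}-{a}-vs-{b}" for a, b in combinations(formats, 2)]
        return round_trips, agreements

With ``n`` formats this gives ``n`` round-trip cases and
``n * (n - 1) / 2`` agreement cases, and an empty ``formats`` tuple
produces no cases at all.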

.. _vm-spec-reader-pending:

Reader-pending fallback
-----------------------

When the reader for a vendor format can't yet handle a deck
(``from_dat`` on a card its parser doesn't cover yet; new readers
landing in the future), the registry row still exists but its
``formats`` is set to ``()`` so the harness skips it.  In its place, an
**OPTIONAL** per-VM file under
``tests/validation/<vendor>_vm/test_<stem>.py`` builds the equivalent
model via the vendor's interop shim (e.g. the ``APDL`` shim for
MAPDL VMs) and **delegates the assertion to the registry's helpers**:

.. code-block:: python

    from tests.cross_solver._verification_registry import REGISTRY
    from tests.cross_solver.test_verification_round_trip import _assert_static

    def test_vm5_plane182_mid_length_stress():
        spec = REGISTRY["vm5_plane182_enhanced"]
        # _build_vm5_plane182_enhanced is the file-local factory that
        # builds the model through the APDL shim.
        m = _build_vm5_plane182_enhanced()
        _assert_static(m, spec)                 # same code path the harness uses

This keeps the build-path tests aligned with the registry's
invariants.  When the reader lands, flip the row's ``formats`` to the
new extension, drop the ``_build_*`` factory, and the harness picks
the row up automatically — no change to the assertion or the
reference.  The ``test_<stem>.py`` file is then deleted.

This pattern is **opt-in for one PR per VM**: a row appears in the
registry on day one (``formats=()``), the build-path test sits next
to it, and they retire together when the reader lands.  Don't author
any new build-path tests *outside* this pattern.

xfail / xfail_agreement
-----------------------

Two independent fields:

* ``xfail`` — marks the round-trip cases (one per format) XFAIL with
  a reason string.  ``strict=False`` so accidental fixes don't
  surprise-fail the suite.  Use when the row hits a kernel /
  interop gap that opens a child issue.
* ``xfail_agreement`` — marks the cross-format agreement test XFAIL
  *independently* from the round-trip.  Use when the readers
  legitimately disagree (e.g. PBAR ``I12`` carried by one but not
  the other).

The two are deliberately separate.  A row whose round-trip fails the
analytical reference can still have both readers producing the *same*
wrong answer — the agreement should remain a real assertion so a
future reader change introducing a divergence trips loud.  Conversely,
a row whose round-trip is fine can have a documented BDF vs INP /
``.dat`` vs ``.cdb`` discrepancy that we want tracked but not blocking
the row.

Use the ``xfail`` field eagerly: a row is **never deleted** because it
fails.  Mark it XFAIL, link the gap-blocker child issue in the
``notes``, and the row stays on the matrix as a tracked TODO.
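
The independence of the two fields can be sketched as a mapping from
case to XFAIL reason.  ``case_marks`` is a hypothetical helper, not
part of the harness.  In a pytest-based harness each non-``None``
reason would typically be turned into
``pytest.mark.xfail(reason=..., strict=False)``.

.. code-block:: python

    from itertools import combinations


    def case_marks(formats, xfail=None, xfail_agreement=None):
        """Map each case to its XFAIL reason, or None for a real assertion."""
        marks = {}
        # ``xfail`` touches only the per-format round-trip cases.
        for fmt in formats:
            marks[("round_trip", fmt)] = xfail
        # ``xfail_agreement`` touches only the pairwise agreement cases.
        for a, b in combinations(formats, 2):
            marks[("agreement", a, b)] = xfail_agreement
        return marks


    marks = case_marks(("bdf", "inp"), xfail="kernel gap, see child issue")

Here both round-trip cases carry the reason string while the BDF/INP
agreement case stays a hard assertion.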

Tolerances
----------

The pinning rule is **just above the empirically-measured FE error**.
Three layers:

#. Authority: an analytical / NAFEMS reference value when one exists;
   otherwise the published-vendor reference table.  Vendors don't
   always agree among themselves and they're not always right —
   prefer the analytical / NAFEMS value when both are available.
#. Headroom for floating-point platform variation: typically 0.1–0.3 %
   above the measured value, never more.
#. Tolerances must **never** exceed the published-vendor tolerance.
   Equal-or-better than the vendor is the bar; if our tolerance has
   to widen past the vendor's stated band to pass, we have a kernel
   bug, not a tolerance bug.

When a row's tolerance widens, the PR description must record what
moved — element refinement, kernel change, reader fix.
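
The pinning rule can be written down as a small check.  The sketch
assumes the headroom is *added* to the measured relative error (so
0.1 to 0.3 % means 1.0e-3 to 3.0e-3 in relative-error units); that
reading, and the ``pin_tolerance`` name, are assumptions made here.

.. code-block:: python

    def pin_tolerance(measured_rel_error, vendor_tolerance, headroom=2.0e-3):
        """Pin a relative tolerance just above the measured FE error (sketch)."""
        # Headroom covers platform variation only; cap it at 0.3 %.
        if not 0.0 <= headroom <= 3.0e-3:
            raise ValueError("headroom must lie between 0 and 0.3 %")
        tolerance = measured_rel_error + headroom
        # Widening past the vendor band points at a kernel bug instead.
        if tolerance > vendor_tolerance:
            raise ValueError("pinned tolerance would exceed the vendor band")
        return tolerance

A row measured at 1.2 % error against a 2 % vendor band would be
pinned at about 1.4 % with the default headroom, while a row measured
at 1.9 % would be rejected.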

What "passing" means
--------------------

A row "passes" when:

#. The harness's BDF / INP / DAT / CDB round-trip cases are all green
   (or have ``xfail`` set with a linked child issue).
#. Every cross-format agreement case is green (or has
   ``xfail_agreement`` set with a linked rationale).
#. The PR cites the closed-form / NAFEMS / vendor reference and notes
   the empirically-measured FE error.

If the row fails, the methodology branches in
``mpdl/docs/verification-methodology.md`` apply — in order:

(a) Deck mis-author.  Re-read the published statement; verify
    geometry, material, loading, BC.
(b) Reader gap.  Inspect what the reader produced (model.grid,
    real_constants, materials).  Open a ``[interop]`` child issue if
    the reader bug is real.
(c) Tolerance mis-set.  Compare the measured FE error against the
    published-vendor band.  Tighten or widen as appropriate, *never*
    past the vendor band.
(d) Kernel / analysis-feature gap.  Open a ``[kernel]`` child issue
    with the ``verify-blocked`` label.  Mark the row ``xfail``;
    *do not delete it*.

Branches (a) and (b) almost always fix the row in one PR.  Branch (c)
is a one-liner.  Branch (d), and branch (b) when the reader bug is
real, are the only paths that create a child issue; under (d) the row
stays on the matrix, marked ``xfail``, until the gap lands.

Status conventions
------------------

In a tracker (#345 / #511 / #322 / #601):

* ☑ — row has a registry entry with ``formats != ()`` and the harness
  is green.  PR ref noted in the row's `Notes` column.
* ☐ — row has no registry entry yet.
* ☐⊘ — row has a registry entry but is blocked on a child issue.
  ``xfail`` is set; ``notes`` link the blocker.
* ⊘ — row is out of scope (`Scope`_).  These are **not** TODOs; they
  exist on the tracker for completeness.

The umbrella tracker (#601) keeps the live aggregate count.

Scope
-----

In scope (these belong on the matrix, even if a row is currently
``xfail``-blocked):

* Linear static (``SOL 101`` / ``*STATIC`` / ``ANTYPE,STATIC``).
* Modal / normal-modes
  (``SOL 103`` / ``*FREQUENCY`` / ``ANTYPE,MODAL``).
* Cyclic-symmetry static + modal (deck-side parity check; the solver
  already supports cyclic).
* Harmonic / modal-frequency-response
  (``Model.solve_harmonic``).
* Element-formulation tests (patch test, MacNeal-Harder set, twisted
  beam, curved beam, Scordelis-Lo roof, hemisphere, …).
* Pre-stress / pre-strain / acceleration loading
  (``*GRAV``, ``TREF``, ``BFUNIF,TEMP``).

Out of scope (rows are ⊘ — not on this corpus):

* Nonlinear material — plasticity, hyperelastic, creep.
* Geometric nonlinear / large deflection / stress-stiffening.
* Linear and nonlinear buckling.
* Contact / gap / friction / slide-line.
* Transient direct + transient modal-superposition dynamics.
* Random / response-spectrum dynamics.
* Rotor-dynamics complex-eigenvalue.
* Optimisation (``SOL 200`` / topology / shape).
* Explicit dynamics (``SOL 700``, LS-DYNA blast / Taylor).
* Thermal / thermo-mechanical / acoustic / FSI / EM.
* Aeroelasticity / flutter / divergence.
* Fracture mechanics (J-integral, K-factor, crack-tip).
* Superelement / Craig-Bampton reduction.
* DMIG direct-matrix-input decks.
* Format / IO checkpointing benchmarks (deck format, not engineering).

These map to TA roadmap items beyond the current solver scope.
Adding any one of them re-opens the relevant row family on the
appropriate detail tracker.

PR checklist
------------

When opening a VM PR, the body must include:

* The published reference (chapter, page, DOI, or URL — never just
  the vendor's own reference table without an upstream).
* The closed-form value and how it was derived (formula or
  derivation reference).
* The empirically-measured FE error from a local run, against the
  published value.
* Links to: the registry stem name, the fixture path(s), any
  ``[kernel]`` / ``[interop]`` child issue this row depends on, the
  detail-tracker row being flipped (if any).
* A ``Test plan`` block citing the parametrised harness IDs that
  cover this row (e.g.
  ``test_verification_round_trip[vm5_plane182_enhanced-dat]``).

When the PR merges:

* Update the detail tracker (#345 / #511) row's ☐ → ☑ with a PR ref.
* Update the live count line on #601.
* If the row exposes new vendor-cross-references, add them to
  ``doc/source/verification/vendor_matrix.rst``.
* If the row demonstrates a new analytical / methodology pattern,
  add a runnable example under ``examples/verification/``.

Roadmap
-------

The spec types and harness branches, in landing order:

* ☑ ``ModalSpec`` — landed.
* ☑ ``StaticSpec`` (single-node) — landed.
* ☑ ``StaticSpec`` (face-mean / coord-axis selector) — landed.
* ☑ Split ``xfail`` from ``xfail_agreement`` — landed.
* ☑ ``StressSpec`` — landed in #626 (used by ``vm_nafems_le1`` σ_yy
  at point D and ``vm_plate_with_hole`` Kirsch K_t = 3 σ_xx).
* ☑ ``fixture_stem`` — landed in #635, lets multiple registry rows
  share one fixture (used by ``vm_single_hex_uniaxial_{axial,
  poisson_y, poisson_z}``).
* ☑ ``from_dat`` reader Phase 1 (#513) — landed in #522, flips MAPDL
  VM rows from reader-pending to reader-driven on a per-card basis
  as the parser grows.
* ☐ ``ReactionSpec`` — needed for Ansys VM1-class rows.
* ☐ Multi-axis coord filter selector — needed for shell-of-revolution
  benchmarks where the tip face is selected by ``x ≈ 0`` *and*
  ``y > r_inner``.
* ☐ Radial / cylindrical projection selector — needed for thick-wall
  cylinder over multiple θ.

Each of these is one of: a small dataclass + harness branch
(``ReactionSpec``, the new selectors), or a reader PR (further
``from_dat`` cards).  Plan to land them lazily — at the moment the
first row that needs each one shows up.

References
----------

* :doc:`provenance` — citations for every non-trivial numerical
  algorithm in the source.
* `MSC Nastran 2024.1 Verification Guide
  <https://nexus.hexagon.com/documentationcenter/en-US/bundle/MSC_Nastran_2024.1_Verification_Guide/resource/MSC_Nastran_2024.1_Verification_Guide.pdf>`_
  — primary corpus for the BDF / INP track.
* `Ansys MAPDL Verification Manual
  <https://ansyshelp.ansys.com/public/account/secured?returnurl=/Views/Secured/corp/v251/en/ans_vm/>`_
  — primary corpus for the ``.dat`` track (#511).
* MacNeal, R. H. and Harder, R. L., 1985.  *A proposed standard set of
  problems to test finite element accuracy.*  Finite Elements in
  Analysis and Design, 1, 3–20.
* NAFEMS, *Standard Benchmark Tests for Linear Elastic Analysis*,
  R0015 (1990).
* NAFEMS, *Selected Benchmarks for Free Vibration*,
  R0016 (1990).