.. _benchmark:

Benchmarking femorph-solver on your machine
===========================================

``femorph_solver.benchmark.run_benchmark()`` drives a standard sweep of modal
solves and writes both machine-readable (JSON) and human-readable (HTML)
outputs. The intended uses:

* **Report back to maintainers.** When you file an issue about wall-time or
  memory, attach the JSON from a ``basic``-level run — it captures the full
  host signature + per-row measurements.
* **Training data for the estimator.** The JSON schema is stable
  (append-only); the TA-2 time/memory estimator loads every historical file
  it finds to refit its model across hardware.
* **Quick "is my install set up right?" check.** The ``fast`` flag runs one
  16×16×2 plate and verifies the whole pipeline works in under 10 s — useful
  after installing or upgrading MKL / CHOLMOD.

.. contents::
   :local:
   :depth: 2

Effort levels
-------------

Users pick one of three presets, each scoped to a time budget. The runner
trims its size ladder as rows accumulate, so a slow box sees fewer rows
rather than overshooting the wall-clock budget.

.. list-table::
   :widths: 12 14 14 14 46
   :header-rows: 1

   * - Level
     - Sizes
     - Mode counts
     - Budget
     - When to use
   * - ``basic``
     - 3 (16²-64²)
     - 10
     - ≤ 15 min
     - Issue triage; CI smoke after install. One-row ``--fast`` variant lands
       in ~5 s.
   * - ``full``
     - 5 (16²-96²)
     - 10, 20
     - ≤ 1 h
     - Per-host scorecard. Shows the backend ordering that ``auto`` would
       pick on your machine, plus how eigsh scales with mode count.
   * - ``exhaustive``
     - 8 (16²-192²)
     - 10, 20, 50
     - multi-hour
     - Overnight training-data pass for the TA-2 estimator. Includes OOC rows
       at sizes ≥ 500 k DOFs so the trade-off scatter is complete.

Each level enumerates a full matrix of ``(size, n_modes, linear_solver,
eigen_solver)``; unavailable backends (no MKL, no CHOLMOD, ...) are skipped
cleanly instead of failing the sweep.

Running from Python
-------------------

.. code-block:: python

   from femorph_solver.benchmark import run_benchmark, BenchmarkLevel

   result = run_benchmark(
       BenchmarkLevel.BASIC,
       out_dir="./bench-out/",
   )
   print(result.json_path)  # → bench-out/femorph-benchmark-basic-.json
   print(result.html_path)  # → bench-out/femorph-benchmark-basic-.html
   print(f"{len(result.rows)} rows in {result.total_wall_s:.1f} s")

Running from the command line
-----------------------------

.. code-block:: bash

   python -m femorph_solver.benchmark --level basic --out ./bench-out/

   # --fast trims to one row + 60 s cap (smoke / CI).
   python -m femorph_solver.benchmark --level basic --out . --fast

A ``femorph-solver benchmark`` console-script wrapper will land alongside
TA-7 — same interface.

Output format
-------------

**JSON** (``femorph-benchmark--.json``) — primary machine-readable output.
Top-level keys:

.. code-block:: text

   schema_version   "1.0" — bumped additively when fields change
   level            "basic" | "full" | "exhaustive"
   description      one-liner rendered at the top of the HTML
   started_at       ISO-8601 timestamp
   finished_at      ISO-8601 timestamp
   total_wall_s     aggregate sweep duration
   budget_s         preset's wall budget
   budget_reached   bool — did the runner bail early?
   host_report      the ``femorph_solver.Report()`` string
   preset           full preset config echoed back for reproducibility
   rows             list of per-row measurements (see below)

Each row has:

.. code-block:: text

   spec           dict of (nx, ny, nz, n_modes, linear_solver,
                  eigen_solver, ooc, timeout_s)
   ok             True / False — did the subprocess exit cleanly?
   wall_s         total wall-time including assembly + BC reduce
   eig_s          eigsh-only wall-time
   assemble_s     _bc_reduce_and_release wall-time
   peak_rss_mb    ru_maxrss in MB (monotonic within the subprocess)
   n_dof          free DOF count
   frequencies    first up-to-50 modal frequencies (Hz)
   error          None on success; one-line error on failure

**HTML** (``femorph-benchmark--.html``) — styled standalone report.
Includes a summary box, a full row table with fail highlighting, and the host
Report embedded verbatim for copy-paste into issues.

Schema guarantees
-----------------

The JSON schema is **append-only** across releases:

- Every field documented in the table above will still be present and carry
  its current semantics in future versions.
- New fields may be added; older loaders should ``.get()`` them with a
  sensible default.
- ``schema_version`` bumps only when a field's semantics change (which has
  never happened yet).

This stability is what lets the TA-2 estimator load every historical
``femorph-benchmark-*.json`` that users contribute without care-and-feeding
for version drift.

Relation to the perf/ benches
-----------------------------

:mod:`femorph_solver.benchmark` is the **user-facing** benchmark. For
developer-facing tooling see :file:`perf/bench_pipeline.py` (size-sweep
microbench with per-stage timing) and :file:`perf/bench_ooc_vs_incore.py`
(live MAPDL OOC-vs-in-core head-to-head). Both of those emit their own
JSON + markdown, but they aren't part of the stable schema — the benchmark
module is.
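A minimal sketch of a loader that honours the append-only guarantee above.
``load_benchmark`` and the ``notes`` field are illustrative, not part of the
library: documented fields are read directly, while anything newer is fetched
with ``.get()`` and a default.

.. code-block:: python

   import json

   def load_benchmark(path):
       """Forward-compatible loader for one benchmark JSON file (sketch)."""
       with open(path) as f:
           data = json.load(f)
       # A semantic change would bump schema_version; flag it loudly.
       if not data["schema_version"].startswith("1."):
           raise ValueError(f"unknown schema_version: {data['schema_version']!r}")
       rows = data["rows"]
       # Failed rows carry a one-line error per the row schema above.
       failed = [r for r in rows if not r["ok"]]
       # "notes" is a hypothetical future field -- default if absent.
       notes = data.get("notes", "")
       return rows, failed, notes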
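As a worked example of consuming the per-row data, a ``full`` run's rows can
be folded into a small table of eigsh time versus mode count. This sketch
uses only the documented row fields (``spec``, ``ok``, ``eig_s``); the helper
name is illustrative.

.. code-block:: python

   import json
   from collections import defaultdict

   def eig_s_by_size(path):
       """Group successful rows by mesh size, mapping n_modes -> eig_s."""
       with open(path) as f:
           rows = json.load(f)["rows"]
       table = defaultdict(dict)
       for row in rows:
           if not row["ok"]:
               continue  # failed rows carry no timing worth comparing
           spec = row["spec"]
           key = (spec["nx"], spec["ny"], spec["nz"])
           table[key][spec["n_modes"]] = row["eig_s"]
       return dict(table)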