.. _estimating:

Estimating wall time + memory before you solve
==============================================

``Model.estimate_solve()`` returns predicted ``(wall_s, peak_rss_mb)`` for a
modal or static solve **without** running it. Useful when you want to:

- Check a job fits in your SLURM time / memory budget before submitting.
- Pick between in-core and OOC up front.
- Compare how a problem scales with ``n_dof`` without the measurement overhead.

.. contents::
   :local:
   :depth: 2

How it works
------------

The estimator fits a simple log-log power law per ``(host_signature,
linear_solver)`` bucket: on 3D SPD meshes under sparse-Cholesky ordering,
factor fill and solve cost both scale as :math:`n^{4/3}` (George 1973), so
``log(wall_s)`` is nearly linear in ``log(n_dof)`` for any fixed host +
backend.

Training data comes from the TA-6 benchmark module
(:mod:`femorph_solver.benchmark`): every ``femorph-benchmark-*.json`` file in
the current working directory is loaded, broken into training rows, and
grouped by ``(host_signature, linear_solver)``. Buckets with ≥ 2 non-OOC rows
get their own coefficient fit; everything else falls back to a shared
cross-host fit. If no training data is available at all, the estimator uses a
shape-of-universe prior tuned from the repo's own Intel Core i9-14900KF
benchmark sweeps, with a ``p95 = 2 × p50`` confidence band to signal that you
are looking at an uncalibrated prediction.

Usage
-----

.. code-block:: python

   import numpy as np, pyvista as pv

   import femorph_solver

   # Build your model as usual...
   m = femorph_solver.Model.from_grid(grid)
   m.et(1, "SOLID185")
   m.mp("EX", 1, 2.0e11)
   m.mp("PRXY", 1, 0.3)
   m.mp("DENS", 1, 7850)
   # ...
   # apply BCs

   est = m.estimate_solve(n_modes=10)
   print(est)
   # Estimate(wall_s=p50 3.47 / p95 4.16,
   #          peak_rss_mb=p50 892 / p95 1070,
   #          bucket='Intel(R) Core(TM) i9-14900KF|94|Linux|x86_64|mkl_direct',
   #          n_training=8)

   if est.wall_s_p95 > 60:
       print("This is a long-running solve — consider OOC or a bigger host.")

Training on your own host
-------------------------

Run the TA-6 benchmark once to seed training data:

.. code-block:: bash

   python -m femorph_solver.benchmark --level basic --out ./

Subsequent ``estimate_solve`` calls then read the
``femorph-benchmark-basic-*.json`` file and fit per-host coefficients for the
backends the ``basic`` level exercised (``auto`` + ``arpack``). Run the
``full`` level to cover more backends:

.. code-block:: bash

   python -m femorph_solver.benchmark --level full --out ./

Extending the feature vector
----------------------------

:class:`~femorph_solver.estimators.HostSpec` carries an ``extras`` dict of
arbitrary key/value strings from the TA-6 benchmark's host report. Future
retraining passes can add features (memory bandwidth benchmark, BLAS thread
count, disk type for OOC) by extending that dict and bumping the estimator's
feature-extraction step; historical JSON files stay loadable because the
loader reads every field via ``.get()`` with defaults.

Limitations
-----------

- **Per-host fits need data.** On a first run with no training JSON, the
  estimator falls back to the repo's prior, which was tuned on a 14900KF +
  Intel MKL. Expect wide ``p95`` bands (2× ``p50``) until you've run the
  benchmark at least once on your hardware.
- **Extrapolation beyond training range is unreliable.** If your training
  rows top out at 200 k DOFs and you ask for 1 M, the power-law prediction
  might still be in the ballpark, but the confidence band doesn't widen
  enough to reflect the added risk. Run the benchmark at a size close to
  your target problem for a tight estimate.
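  For intuition: the estimate is a straight-line fit in log-log space, so
  extrapolation error grows with distance from the training sizes. A
  standalone sketch with made-up timings (not the library's actual fitting
  code):

  .. code-block:: python

     import numpy as np

     # Hypothetical training rows: (n_dof, wall_s), topping out at 200 k DOFs.
     n_dof = np.array([25_000, 50_000, 100_000, 200_000])
     wall_s = np.array([0.50, 1.26, 3.17, 8.00])

     # Fit log(wall_s) = a + b*log(n_dof); b sits near 4/3 for 3D
     # sparse-Cholesky solves (George 1973).
     b, a = np.polyfit(np.log(n_dof), np.log(wall_s), 1)

     # Predicting 1 M DOFs is a 5x extrapolation past the largest training
     # row; any bias in the fitted slope compounds out here.
     predicted_wall_s = np.exp(a + b * np.log(1_000_000))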
- **Only ``n_dof`` is currently used as a feature.** Future revisions will
  add ``nnz``, mesh aspect ratio, and mode count as features; the schema is
  already append-only, so expanded training data flows through without a
  loader change.

References
----------

- George, A. "Nested dissection of a regular finite element mesh." SIAM J.
  Numer. Anal. 10 (1973), pp. 345-363. Origin of the :math:`n^{4/3}`
  factor-fill bound for structured 3D grids, and the power-law basis for the
  log-log fit.
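The append-only loading mentioned above (every field read via ``.get()`` with
a default) can be sketched as follows; the field names here are illustrative,
not the actual benchmark schema:

.. code-block:: python

   import json

   # One hypothetical row, as an older benchmark JSON might store it
   # (field names illustrative, not the real femorph-benchmark schema).
   old_row = json.loads(
       '{"host_signature": "hostA|8|Linux|x86_64", '
       '"linear_solver": "mkl_direct", "n_dof": 120000, "wall_s": 3.4}'
   )

   def to_training_row(raw):
       # Read every field with .get() and a default, so rows written
       # before a feature existed still load after the schema grows.
       return {
           "host_signature": raw.get("host_signature", "unknown"),
           "linear_solver": raw.get("linear_solver", "auto"),
           "n_dof": raw.get("n_dof", 0),
           "wall_s": raw.get("wall_s"),
           "nnz": raw.get("nnz"),            # newer feature: None in old rows
           "extras": raw.get("extras", {}),  # host-report key/value strings
       }

   row = to_training_row(old_row)

Because missing keys degrade to defaults instead of raising ``KeyError``, old
and new JSON files can be mixed in one training pass.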