# Source code for femorph_solver.solvers.linear._mumps

"""MUMPS multifrontal sparse direct solver via ``python-mumps``.

MUMPS is the go-to direct solver when Pardiso is unavailable, or when
redistribution licensing or out-of-core factorisation matters:

* **Licence** — CeCILL-C (LGPL-like).  Unlike MKL Pardiso, MUMPS can be
  redistributed inside a commercial product without the MKL EULA.
* **Out-of-core** — mature OOC mode via ``factor(ooc=True)`` / MUMPS's
  ``ICNTL(22)``; spills the factor to disk so RAM stops being the
  ceiling on single-node solves.  Pardiso has an OOC mode too but it is
  less battle-tested.
* **Distributed memory** — MUMPS is the on-ramp to MPI-distributed
  solves across nodes, which neither Pardiso nor SciPy's SuperLU
  offers.

On a single shared-memory node at ptr.cdb scale (≤1 M DOFs, SPD) MUMPS
is usually 1.5–3× slower than Pardiso and uses ~10–30 % more factor
memory.  Keep it available, but ``auto`` still prefers Pardiso > CHOLMOD
> MUMPS on SPD workloads.
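When MUMPS *is* the right tool (for example, for its out-of-core mode),
the class below is used like any other solver in this package.  A hedged
sketch: the diagonal stand-in matrix and its size are illustrative only,
and the import is guarded so the snippet is inert when the package or
libmumps is missing.

```python
import numpy as np
import scipy.sparse as sp

try:
    from femorph_solver.solvers.linear._mumps import MUMPSSolver
    usable = MUMPSSolver.available()
except ImportError:  # package or libmumps not installed
    usable = False

if usable:
    # Stand-in SPD system; real matrices come from the assembler.
    n = 1_000
    A = sp.diags(np.full(n, 2.0), format="csr")
    b = np.ones(n)
    solver = MUMPSSolver(A, assume_spd=True, ooc=True)  # spill factor to disk
    x = solver.solve(b)
    assert np.allclose(A @ x, b)
```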

Implementation notes for the python-mumps binding (``pip install
python-mumps``, import ``mumps``):

* The ``Context`` constructor takes no ``sym=`` argument; symmetry is
  declared on :meth:`Context.set_matrix` as ``symmetric=True``, which
  internally maps to MUMPS's ``sym=2`` (general-symmetric LDLᵀ).
  python-mumps 0.0.6 does not expose MUMPS's ``sym=1`` strictly-SPD
  mode, so neither does this wrapper; ``sym=2`` still factors SPD
  input correctly, just without the small memory win of skipping
  pivoting.
* The Context docstring warns "only complex numbers supported", but the
  underlying code dispatches on the input dtype (``dmumps`` for
  float64, ``zmumps`` for complex128).  Float64 works fine.
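A minimal sketch of that calling convention, assuming a tiny SPD system
and ``factor()``'s default arguments; the availability probe mirrors
``_HAVE`` below so the snippet degrades to a no-op when the binding is
absent:

```python
import importlib.util

import numpy as np
import scipy.sparse as sp

have_mumps = importlib.util.find_spec("mumps") is not None

if have_mumps:
    import mumps

    # Tiny SPD system.  symmetric=True is declared on set_matrix (not on
    # the Context constructor) and maps to MUMPS's sym=2 LDL^T path.
    A = sp.coo_matrix(np.array([[4.0, 1.0], [1.0, 3.0]]))
    b = np.array([1.0, 2.0])

    ctx = mumps.Context()
    ctx.set_matrix(A, symmetric=True)
    ctx.factor()  # analysis + numerical factorisation; float64 -> dmumps
    x = ctx.solve(b)
    assert np.allclose(A @ x, b)
```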

References
----------
- Amestoy, P. R., Duff, I. S., L'Excellent, J.-Y., & Koster, J.
  "A Fully Asynchronous Multifrontal Solver Using Distributed
  Dynamic Scheduling."  SIAM J. Matrix Anal. Appl. 23 (1), 15-41
  (2001).  [the MUMPS multifrontal + distributed-scheduling
  algorithm]
- Amestoy, P. R., Guermouche, A., L'Excellent, J.-Y., & Pralet, S.
  "Hybrid scheduling for the parallel solution of linear systems."
  Parallel Computing 32 (2), 136-156 (2006).  [hybrid
  static/dynamic scheduler shipped in MUMPS 4.x+]
- Agullo, E., Guermouche, A., & L'Excellent, J.-Y.  "Reducing the
  I/O volume in sparse out-of-core multifrontal methods."  SIAM J.
  Scientific Computing 31 (6), 4774-4794 (2010).  [the OOC path
  exposed here via ``ICNTL(22)``]
"""

from __future__ import annotations

import importlib.util
from typing import ClassVar

import numpy as np
import scipy.sparse as sp

from ._base import LinearSolverBase

#: Cheap availability probe — does not load the native MUMPS library.
_HAVE: bool = importlib.util.find_spec("mumps") is not None


class MUMPSSolver(LinearSolverBase):
    """MUMPS sparse direct solver via python-mumps (multifrontal)."""

    name: ClassVar[str] = "mumps"
    kind: ClassVar[str] = "direct"
    spd_only: ClassVar[bool] = False
    install_hint: ClassVar[str] = (
        "`pip install python-mumps` (requires libmumps — e.g. apt install "
        "libmumps-scotch-dev, brew install mumps, or "
        "conda install -c conda-forge mumps)"
    )

    def __init__(
        self,
        A: sp.spmatrix | sp.sparray,
        *,
        assume_spd: bool = False,
        mixed_precision: bool | None = False,
        ooc: bool = False,
        ordering: str = "auto",
    ) -> None:
        """``ooc=True`` enables MUMPS out-of-core factorisation — the factor
        spills to ``$MUMPS_TMPDIR`` (or ``/tmp``) so peak RAM no longer
        includes the full L factor.  Trades wallclock for memory; useful
        when the factor would not otherwise fit.

        ``ordering`` selects MUMPS's fill-reducing reordering: ``"auto"``,
        ``"amd"``, ``"amf"``, ``"scotch"``, ``"pord"``, ``"metis"``, or
        ``"qamd"`` (availability depends on how libmumps was built).
        """
        self._ooc = ooc
        self._ordering = ordering
        super().__init__(A, assume_spd=assume_spd, mixed_precision=mixed_precision)

    @staticmethod
    def available() -> bool:
        return _HAVE

    def _factor(self, A):
        import mumps

        # python-mumps wants a COO sparse array; set_matrix(symmetric=True)
        # drops the strictly-lower triangle internally, so we don't need
        # to pre-triu the matrix ourselves.  When the assembler already
        # handed us upper-triangular storage (``femorph_triu``) we pass it
        # straight through with ``symmetric=False`` — otherwise MUMPS
        # would truncate it twice.
        A_coo = sp.coo_matrix(A).astype(np.float64, copy=False)
        symmetric_flag = self._assume_spd and not getattr(A, "femorph_triu", False)

        self._A = A_coo
        self._ctx = mumps.Context()
        self._ctx.set_matrix(self._A, symmetric=symmetric_flag)
        self._ctx.factor(ordering=self._ordering, ooc=self._ooc)

    def solve(self, b: np.ndarray) -> np.ndarray:
        return self._ctx.solve(np.ascontiguousarray(b, dtype=np.float64))

    def __del__(self) -> None:
        # Context holds libmumps state; release it deterministically so
        # tests that churn through many factors don't accumulate native
        # allocations.  ``Context.__exit__`` runs the ``job=-2`` cleanup
        # but isn't invoked on plain GC.
        ctx = getattr(self, "_ctx", None)
        if ctx is not None and getattr(ctx, "mumps_instance", None) is not None:
            try:
                ctx.mumps_instance.job = -2
                ctx.mumps_instance.call()
            except Exception:
                pass
            ctx.mumps_instance = None
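A sketch of the churn pattern ``__del__`` defends against: many
short-lived factorisations in one process, each released
deterministically.  The sizes are illustrative, and the guarded import
keeps the snippet inert when the package or libmumps is unavailable.

```python
import gc

import numpy as np
import scipy.sparse as sp

try:
    from femorph_solver.solvers.linear._mumps import MUMPSSolver
    usable = MUMPSSolver.available()
except ImportError:  # package or libmumps not installed
    usable = False

if usable:
    n = 200
    b = np.ones(n)
    for k in range(50):
        # Fresh factor each iteration; dropping `solver` runs __del__,
        # which issues the MUMPS job=-2 cleanup instead of waiting on GC.
        A = sp.diags(np.full(n, 2.0 + k), format="csr")
        solver = MUMPSSolver(A, assume_spd=True)
        x = solver.solve(b)
        assert np.allclose(A @ x, b)
        del solver
    gc.collect()
```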