Global assembly
===============

Per-element :math:`\mathbf{K}_e` / :math:`\mathbf{M}_e` /
:math:`\mathbf{f}_e` blocks (built by the kernel routines —
see :doc:`../elements/index`) are scattered into the global
sparse matrices that the linear / eigen solvers see.  This
chapter describes the assembly algorithm, the storage format,
and how femorph-solver pre-computes a single CSR pattern and
re-uses it across stiffness, mass, and damping passes.

Element-to-global DOF map
-------------------------

Each MAPDL node :math:`n` has a fixed bank of degrees of
freedom — three translations
:math:`(u_x, u_y, u_z)` for solid kernels, plus three
rotations :math:`(\theta_x, \theta_y, \theta_z)` for beam /
shell kernels — laid out contiguously in the global vector
:math:`\mathbf{U}` according to the layout returned by
:meth:`Model.dof_map() <femorph_solver.Model.dof_map>`.  Per-
element DOF indices follow from that mapping plus the
element's local node order:

.. math::

    \mathrm{idx}(e, i, k) =
    \mathrm{base}(\mathrm{node}_e[i]) + k,
    \qquad i \in [0, n_\mathrm{nodes,e}), \;
    k \in [0, n_\mathrm{dof/node}),

where :math:`\mathrm{node}_e[i]` is the :math:`i`-th element
node and :math:`\mathrm{base}(n)` is the first global-DOF
index for node :math:`n`.  ``Model._dof_layout()`` caches the
:math:`(\mathrm{base}, \text{labels})` mapping for every
element-touching node so the per-cell scatter is a vectorised
``displacement[dofs]`` gather.

Sparse CSR scatter
------------------

The assembled :math:`\mathbf{K}` and :math:`\mathbf{M}` are
sparse symmetric :math:`(N, N)` matrices, with :math:`N`
the global free-DOF count.  femorph-solver stores them in
SciPy's **Compressed Sparse Row (CSR)** format
(:class:`scipy.sparse.csr_matrix`) — each non-zero is a triple
:math:`(\mathrm{row}, \mathrm{col}, \mathrm{val})` packed into
three flat arrays plus a per-row offset:

.. math::

    \mathbf{K}_{ij} = \mathrm{data}[k]
    \quad \text{where}\quad
    \mathrm{indices}[k] = j,
    \quad
    \mathrm{indptr}[i] \le k < \mathrm{indptr}[i + 1].

CSR is the right choice for the LinearOperator API used by
ARPACK / Pardiso / CHOLMOD because its row-wise traversal maps
directly onto :math:`\mathbf{K}\, \mathbf{v}` matrix-vector
products and onto symbolic factorisation graph algorithms
(Saad 2003 §3.4; Davis 2006 §2.7).

The pattern-once-fill-many split
--------------------------------

Element kernels assemble a fresh dense :math:`(n_e, n_e)` per
element, but the **non-zero pattern** of :math:`\mathbf{K}` is
fixed by the mesh + element-type registry — it doesn't depend
on the material or numerical values.  femorph-solver exploits
this in two phases:

1. **Symbolic pass.** A single graph traversal computes the
   per-row column indices that any element in the model
   touches.  Each row's column array is then sorted and
   de-duplicated.  Result: ``indptr`` and ``indices`` are
   built once and re-used for every assembly.

2. **Numerical pass.** Per-element ``ke`` /  ``me`` /
   ``ke_bbar`` etc. are run, and the resulting dense block is
   scattered into the pre-allocated ``data`` array via a
   vectorised ``np.add.at`` / fused index gather.  No
   per-element ``coo_matrix`` allocation, no merging of
   duplicate (row, col) pairs at the end.

The split brings two practical advantages: (a) the
stiffness assembly time is dominated by the kernel BLAS calls
rather than Python overhead, and (b) the same allocated
``indptr`` / ``indices`` arrays back :math:`\mathbf{K}` and
:math:`\mathbf{M}` so the eigen solver's
:math:`(\mathbf{K} - \sigma \mathbf{M})` shifted matrix can
re-use both buffers without reallocation.

Symmetry
--------

For linear-elastic isotropic material every kernel produces a
symmetric :math:`\mathbf{K}_e` (the bilinear form
:math:`\mathbf{B}^{\!\top}\, \mathbf{C}\, \mathbf{B}` is
symmetric since :math:`\mathbf{C}` is) and a symmetric
positive-definite :math:`\mathbf{M}_e`.  The assembled
:math:`\mathbf{K}` and :math:`\mathbf{M}` inherit symmetry, so
femorph-solver:

* Stores **only the upper triangle** when calling Cholesky
  backends (CHOLMOD / Pardiso) — the lower-triangle pattern
  is never materialised.
* Uses :func:`scipy.sparse.linalg.eigsh` (real-symmetric
  Lanczos) for modal analysis, with the cyclic-symmetry
  branch using the real-2n augmentation
  (Grimes, Lewis, Simon 1994) to keep the eigenproblem
  symmetric even after the complex-Hermitian harmonic
  decomposition.

What femorph-solver does **not** assemble
-----------------------------------------

* **Damping** :math:`\mathbf{C}` — none of the current solver
  paths (static + modal + cyclic-modal) use damping.  The
  TA-roadmap harmonic / transient solvers will assemble a
  Rayleigh damping matrix
  :math:`\mathbf{C} = \alpha \mathbf{M} + \beta \mathbf{K}`
  on demand.
* **Pre-stress / geometric stiffness** :math:`\mathbf{K}_g`
  — buckling and prestressed-modal analyses need a separate
  assembly pass that uses the static-solve stress field.
  Out of scope for the linear-elastic corpus.

How femorph-solver runs an assembly
-----------------------------------

:meth:`Model.stiffness_matrix() <femorph_solver.Model.stiffness_matrix>`
and :meth:`Model.mass_matrix() <femorph_solver.Model.mass_matrix>`
are the public entry points; both call
:func:`femorph_solver._assembly.assemble_csr` internally.  The
returned matrix is a :class:`scipy.sparse.csr_matrix` with
shape :math:`(N, N)` indexed by ``Model.dof_map()`` rows.

References
----------

* Saad, Y. (2003) *Iterative Methods for Sparse Linear
  Systems*, 2nd ed., SIAM, §3.3 (sparse storage formats),
  §3.4 (CSR / CSC arithmetic).
* Davis, T. A. (2006) *Direct Methods for Sparse Linear
  Systems*, SIAM, §2.7 (CSR storage), §4 (symbolic
  factorisation graph algorithms).
* Cook, R. D., Malkus, D. S., Plesha, M. E., Witt, R. J.
  (2002) *Concepts and Applications of Finite Element
  Analysis*, 4th ed., Wiley, §2.10 (assembly), §3.4 (sparse
  banded storage).
* Bathe, K.-J. (2014) *Finite Element Procedures*, 2nd ed.,
  §3.4 (assembly), §8 (skyline / sparse storage variants).
* Zienkiewicz, O. C. and Taylor, R. L. (2013) *The Finite
  Element Method: Its Basis and Fundamentals*, 7th ed., §1.4
  (element-to-global scatter algorithm).