Global assembly#

Per-element \(\mathbf{K}_e\) / \(\mathbf{M}_e\) / \(\mathbf{f}_e\) blocks (built by the kernel routines — see Element kernels) are scattered into the global sparse matrices that the linear / eigen solvers see. This chapter describes the assembly algorithm, the storage format, and how femorph-solver pre-computes a single CSR pattern and re-uses it across stiffness, mass, and damping passes.

Element-to-global DOF map#

Each MAPDL node \(n\) has a fixed bank of degrees of freedom — three translations \((u_x, u_y, u_z)\) for solid kernels, plus three rotations \((\theta_x, \theta_y, \theta_z)\) for beam / shell kernels — laid out contiguously in the global vector \(\mathbf{U}\) according to the layout returned by Model.dof_map(). Per- element DOF indices follow from that mapping plus the element’s local node order:

\[\mathrm{idx}(e, i, k) = \mathrm{base}(\mathrm{node}_e[i]) + k, \qquad i \in [0, n_\mathrm{nodes,e}), \; k \in [0, n_\mathrm{dof/node}),\]

where \(\mathrm{node}_e[i]\) is the \(i\)-th element node and \(\mathrm{base}(n)\) is the first global-DOF index for node \(n\). Model._dof_layout() caches the \((\mathrm{base}, \text{labels})\) mapping for every element-touching node so the per-cell scatter is a vectorised displacement[dofs] gather.

Sparse CSR scatter#

The assembled \(\mathbf{K}\) and \(\mathbf{M}\) are sparse symmetric \((N, N)\) matrices, with \(N\) the global free-DOF count. femorph-solver stores them in SciPy’s Compressed Sparse Row (CSR) format (scipy.sparse.csr_matrix) — each non-zero is a triple \((\mathrm{row}, \mathrm{col}, \mathrm{val})\) packed into three flat arrays plus a per-row offset:

\[\mathbf{K}_{ij} = \mathrm{data}[k] \quad \text{where}\quad \mathrm{indices}[k] = j, \quad \mathrm{indptr}[i] \le k < \mathrm{indptr}[i + 1].\]

CSR is the right choice for the LinearOperator API used by ARPACK / Pardiso / CHOLMOD because its row-wise traversal maps directly onto \(\mathbf{K}\, \mathbf{v}\) matrix-vector products and onto symbolic factorisation graph algorithms (Saad 2003 §3.4; Davis 2006 §2.7).

The pattern-once-fill-many split#

Element kernels assemble a fresh dense \((n_e, n_e)\) per element, but the non-zero pattern of \(\mathbf{K}\) is fixed by the mesh + element-type registry — it doesn’t depend on the material or numerical values. femorph-solver exploits this in two phases:

Symbolic pass. A single graph traversal computes the per-row column indices that any element in the model touches. Each row’s column array is then sorted and de-duplicated. Result: indptr and indices are built once and re-used for every assembly.
Numerical pass. Per-element ke / me / ke_bbar etc. are run, and the resulting dense block is scattered into the pre-allocated data array via a vectorised np.add.at / fused index gather. No per-element coo_matrix allocation, no merging of duplicate (row, col) pairs at the end.

The split brings two practical advantages: (a) the stiffness assembly time is dominated by the kernel BLAS calls rather than Python overhead, and (b) the same allocated indptr / indices arrays back \(\mathbf{K}\) and \(\mathbf{M}\) so the eigen solver’s \((\mathbf{K} - \sigma \mathbf{M})\) shifted matrix can re-use both buffers without reallocation.

Symmetry#

For linear-elastic isotropic material every kernel produces a symmetric \(\mathbf{K}_e\) (the bilinear form \(\mathbf{B}^{\!\top}\, \mathbf{C}\, \mathbf{B}\) is symmetric since \(\mathbf{C}\) is) and a symmetric positive-definite \(\mathbf{M}_e\). The assembled \(\mathbf{K}\) and \(\mathbf{M}\) inherit symmetry, so femorph-solver:

Stores only the upper triangle when calling Cholesky backends (CHOLMOD / Pardiso) — the lower-triangle pattern is never materialised.
Uses scipy.sparse.linalg.eigsh() (real-symmetric Lanczos) for modal analysis, with the cyclic-symmetry branch using the real-2n augmentation (Grimes, Lewis, Simon 1994) to keep the eigenproblem symmetric even after the complex-Hermitian harmonic decomposition.

What femorph-solver does not assemble#

Damping \(\mathbf{C}\) — none of the current solver paths (static + modal + cyclic-modal) use damping. The TA-roadmap harmonic / transient solvers will assemble a Rayleigh damping matrix \(\mathbf{C} = \alpha \mathbf{M} + \beta \mathbf{K}\) on demand.
Pre-stress / geometric stiffness \(\mathbf{K}_g\) — buckling and prestressed-modal analyses need a separate assembly pass that uses the static-solve stress field. Out of scope for the linear-elastic corpus.

How femorph-solver runs an assembly#

Model.stiffness_matrix() and Model.mass_matrix() are the public entry points; both call femorph_solver._assembly.assemble_csr() internally. The returned matrix is a scipy.sparse.csr_matrix with shape \((N, N)\) indexed by Model.dof_map() rows.

References#

Saad, Y. (2003) Iterative Methods for Sparse Linear Systems, 2nd ed., SIAM, §3.3 (sparse storage formats), §3.4 (CSR / CSC arithmetic).
Davis, T. A. (2006) Direct Methods for Sparse Linear Systems, SIAM, §2.7 (CSR storage), §4 (symbolic factorisation graph algorithms).
Cook, R. D., Malkus, D. S., Plesha, M. E., Witt, R. J. (2002) Concepts and Applications of Finite Element Analysis, 4th ed., Wiley, §2.10 (assembly), §3.4 (sparse banded storage).
Bathe, K.-J. (2014) Finite Element Procedures, 2nd ed., §3.4 (assembly), §8 (skyline / sparse storage variants).
Zienkiewicz, O. C. and Taylor, R. L. (2013) The Finite Element Method: Its Basis and Fundamentals, 7th ed., §1.4 (element-to-global scatter algorithm).