Global assembly#

Per-element \(\mathbf{K}_e\) / \(\mathbf{M}_e\) / \(\mathbf{f}_e\) blocks (built by the kernel routines — see Element kernels) are scattered into the global sparse matrices that the linear / eigen solvers see. This chapter describes the assembly algorithm, the storage format, and how femorph-solver pre-computes a single CSR pattern and re-uses it across stiffness, mass, and damping passes.

Element-to-global DOF map#

Each MAPDL node \(n\) has a fixed bank of degrees of freedom — three translations \((u_x, u_y, u_z)\) for solid kernels, plus three rotations \((\theta_x, \theta_y, \theta_z)\) for beam / shell kernels — laid out contiguously in the global vector \(\mathbf{U}\) according to the layout returned by Model.dof_map(). Per- element DOF indices follow from that mapping plus the element’s local node order:

\[\mathrm{idx}(e, i, k) = \mathrm{base}(\mathrm{node}_e[i]) + k, \qquad i \in [0, n_\mathrm{nodes,e}), \; k \in [0, n_\mathrm{dof/node}),\]

where \(\mathrm{node}_e[i]\) is the \(i\)-th element node and \(\mathrm{base}(n)\) is the first global-DOF index for node \(n\). Model._dof_layout() caches the \((\mathrm{base}, \text{labels})\) mapping for every element-touching node so the per-cell scatter is a vectorised displacement[dofs] gather.

Sparse CSR scatter#

The assembled \(\mathbf{K}\) and \(\mathbf{M}\) are sparse symmetric \((N, N)\) matrices, with \(N\) the global free-DOF count. femorph-solver stores them in SciPy’s Compressed Sparse Row (CSR) format (scipy.sparse.csr_matrix) — each non-zero is a triple \((\mathrm{row}, \mathrm{col}, \mathrm{val})\) packed into three flat arrays plus a per-row offset:

\[\mathbf{K}_{ij} = \mathrm{data}[k] \quad \text{where}\quad \mathrm{indices}[k] = j, \quad \mathrm{indptr}[i] \le k < \mathrm{indptr}[i + 1].\]

CSR is the right choice for the LinearOperator API used by ARPACK / Pardiso / CHOLMOD because its row-wise traversal maps directly onto \(\mathbf{K}\, \mathbf{v}\) matrix-vector products and onto symbolic factorisation graph algorithms (Saad 2003 §3.4; Davis 2006 §2.7).

The pattern-once-fill-many split#

Element kernels assemble a fresh dense \((n_e, n_e)\) per element, but the non-zero pattern of \(\mathbf{K}\) is fixed by the mesh + element-type registry — it doesn’t depend on the material or numerical values. femorph-solver exploits this in two phases:

  1. Symbolic pass. A single graph traversal computes the per-row column indices that any element in the model touches. Each row’s column array is then sorted and de-duplicated. Result: indptr and indices are built once and re-used for every assembly.

  2. Numerical pass. Per-element ke / me / ke_bbar etc. are run, and the resulting dense block is scattered into the pre-allocated data array via a vectorised np.add.at / fused index gather. No per-element coo_matrix allocation, no merging of duplicate (row, col) pairs at the end.

The split brings two practical advantages: (a) the stiffness assembly time is dominated by the kernel BLAS calls rather than Python overhead, and (b) the same allocated indptr / indices arrays back \(\mathbf{K}\) and \(\mathbf{M}\) so the eigen solver’s \((\mathbf{K} - \sigma \mathbf{M})\) shifted matrix can re-use both buffers without reallocation.

Symmetry#

For linear-elastic isotropic material every kernel produces a symmetric \(\mathbf{K}_e\) (the bilinear form \(\mathbf{B}^{\!\top}\, \mathbf{C}\, \mathbf{B}\) is symmetric since \(\mathbf{C}\) is) and a symmetric positive-definite \(\mathbf{M}_e\). The assembled \(\mathbf{K}\) and \(\mathbf{M}\) inherit symmetry, so femorph-solver:

  • Stores only the upper triangle when calling Cholesky backends (CHOLMOD / Pardiso) — the lower-triangle pattern is never materialised.

  • Uses scipy.sparse.linalg.eigsh() (real-symmetric Lanczos) for modal analysis, with the cyclic-symmetry branch using the real-2n augmentation (Grimes, Lewis, Simon 1994) to keep the eigenproblem symmetric even after the complex-Hermitian harmonic decomposition.

What femorph-solver does not assemble#

  • Damping \(\mathbf{C}\) — none of the current solver paths (static + modal + cyclic-modal) use damping. The TA-roadmap harmonic / transient solvers will assemble a Rayleigh damping matrix \(\mathbf{C} = \alpha \mathbf{M} + \beta \mathbf{K}\) on demand.

  • Pre-stress / geometric stiffness \(\mathbf{K}_g\) — buckling and prestressed-modal analyses need a separate assembly pass that uses the static-solve stress field. Out of scope for the linear-elastic corpus.

How femorph-solver runs an assembly#

Model.stiffness_matrix() and Model.mass_matrix() are the public entry points; both call femorph_solver._assembly.assemble_csr() internally. The returned matrix is a scipy.sparse.csr_matrix with shape \((N, N)\) indexed by Model.dof_map() rows.

References#

  • Saad, Y. (2003) Iterative Methods for Sparse Linear Systems, 2nd ed., SIAM, §3.3 (sparse storage formats), §3.4 (CSR / CSC arithmetic).

  • Davis, T. A. (2006) Direct Methods for Sparse Linear Systems, SIAM, §2.7 (CSR storage), §4 (symbolic factorisation graph algorithms).

  • Cook, R. D., Malkus, D. S., Plesha, M. E., Witt, R. J. (2002) Concepts and Applications of Finite Element Analysis, 4th ed., Wiley, §2.10 (assembly), §3.4 (sparse banded storage).

  • Bathe, K.-J. (2014) Finite Element Procedures, 2nd ed., §3.4 (assembly), §8 (skyline / sparse storage variants).

  • Zienkiewicz, O. C. and Taylor, R. L. (2013) The Finite Element Method: Its Basis and Fundamentals, 7th ed., §1.4 (element-to-global scatter algorithm).