Big Chemical Encyclopedia



Parallel Fock Matrix Formation with Replicated Data

3 Parallel Fock Matrix Formation with Replicated Data [Pg.135]

Processes request tasks (atom quartets) by calling the function get_quartet, which has been implemented in both a dynamic and a static version. The dynamic work distribution uses a manager-worker model with a manager process dedicated to distributing tasks to the other processes, whereas the static version employs a round-robin distribution of tasks. When the number of processes is small, the static scheme achieves the best parallel performance because the dynamic scheme, when run on p processes, uses only p - 1 processes for computation. As the number of processes increases, however, the parallel performance for the dynamic task distribution surpasses that of the static scheme, whose efficiency is reduced by load imbalance. With the entire Fock and density matrix available to every process, no communication is required during the computation of the Fock matrix other than the fetching of tasks in the dynamic scheme. After all ABCD tasks have been processed, a global summation is required to add the contributions to the Fock matrix from all processes and send the result to every process. [Pg.135]
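The trade-off described above can be illustrated with a small serial simulation. This is only a sketch under stated assumptions: the task costs are made-up numbers, communication time is ignored, and the function names static_makespan, dynamic_makespan, and scheme_speedup are hypothetical helpers, not part of the program discussed in the text.

```python
import heapq

def static_makespan(costs, p):
    """Round-robin: task i goes to process i % p, and all p processes compute."""
    loads = [0.0] * p
    for i, cost in enumerate(costs):
        loads[i % p] += cost
    return max(loads)

def dynamic_makespan(costs, p):
    """Manager-worker: one process only hands out tasks, so p - 1 compute.
    Each task goes to whichever worker becomes idle first."""
    if p < 2:
        raise ValueError("need one manager and at least one worker")
    finish = [0.0] * (p - 1)  # finish time of each worker, kept as a min-heap
    heapq.heapify(finish)
    for cost in costs:
        earliest = heapq.heappop(finish)
        heapq.heappush(finish, earliest + cost)
    return max(finish)

def scheme_speedup(costs, p, scheme):
    """Speedup over running every task on a single process."""
    return sum(costs) / scheme(costs, p)

# Hypothetical task costs (arbitrary time units), not measured data.
uniform = [1.0] * 12
uneven = [8.0, 1.0, 1.0, 1.0] * 3

for name, costs in (("uniform", uniform), ("uneven", uneven)):
    s = scheme_speedup(costs, 4, static_makespan)
    d = scheme_speedup(costs, 4, dynamic_makespan)
    print(f"{name}: static speedup {s:.2f}, dynamic speedup {d:.2f}")
```

With equal task costs the static scheme wins because all four processes compute, whereas the dynamic scheme has only three workers. With uneven costs the round-robin assignment happens to place every expensive task on the same process, and the dynamic scheme comes out ahead despite its idle manager.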

For M ∈ shells on atom A
  For N ∈ shells on atom B
    For R ∈ shells on atom C
      For S ∈ shells on atom D
        Compute (MN|RS)
[Pg.136]

Global summation of Fock matrix contributions from all processes

FIGURE 8.2 [Pg.136]

Outline of a parallel algorithm for Fock matrix formation using replicated Fock and density matrices. A, B, C, and D represent atoms; M, N, R, and S denote shells of basis functions. The full integral permutational symmetry is utilized. Each process computes the integrals and the associated Fock matrix elements for a subset of the atom quartets, and processes request work (in the form of atom quartets) by calling the function get_quartet. Communication is required only for the final summation of the contributions to F, or, when dynamic task distribution is used, in get_quartet. [Pg.136]
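The outline in the figure can be mimicked by a serial Python simulation in which each "process" is just a loop iteration. This is a rough sketch, not the book's implementation: atom_quartets, get_quartet_static, local_fock, and global_sum are hypothetical names, each shell is treated as a single matrix index, the integral is a placeholder callable, and only one Coulomb-like update per shell quartet is shown. Exchange terms and the extra shell-level restrictions needed when atoms coincide are omitted. In a real message-passing code the final step would typically be an all-reduce operation.

```python
def atom_quartets(n_atoms):
    """Yield one representative (A, B, C, D) per permutational class:
    A >= B, C >= D, and pair (A, B) not earlier than pair (C, D)."""
    pairs = [(a, b) for a in range(n_atoms) for b in range(a + 1)]
    for i, (a, b) in enumerate(pairs):
        for c, d in pairs[: i + 1]:
            yield (a, b, c, d)

def get_quartet_static(rank, n_procs, n_atoms):
    """Assumed static version: process `rank` takes every n_procs-th quartet."""
    for i, quartet in enumerate(atom_quartets(n_atoms)):
        if i % n_procs == rank:
            yield quartet

def local_fock(rank, n_procs, shells_on_atom, n_shells, density, integral):
    """Partial F built by one process from its own atom quartets.
    The full density matrix is available locally (replicated data)."""
    F = [[0.0] * n_shells for _ in range(n_shells)]
    n_atoms = len(shells_on_atom)
    for A, B, C, D in get_quartet_static(rank, n_procs, n_atoms):
        for M in shells_on_atom[A]:
            for N in shells_on_atom[B]:
                for R in shells_on_atom[C]:
                    for S in shells_on_atom[D]:
                        # Placeholder Coulomb-like update only.
                        F[M][N] += integral(M, N, R, S) * density[R][S]
    return F

def global_sum(partials):
    """Stand-in for the global summation: element-wise sum of all partial
    matrices. Every process would receive a copy of the result."""
    n = len(partials[0])
    return [[sum(P[i][j] for P in partials) for j in range(n)] for i in range(n)]

# Toy input: 3 atoms carrying 2, 1, and 2 shells (hypothetical numbers).
shells_on_atom = [[0, 1], [2], [3, 4]]
n_shells = 5
density = [[1.0] * n_shells for _ in range(n_shells)]

def unit_integral(M, N, R, S):
    return 1.0  # placeholder value, not a real two-electron integral

def fock_on(n_procs):
    partials = [
        local_fock(r, n_procs, shells_on_atom, n_shells, density, unit_integral)
        for r in range(n_procs)
    ]
    return global_sum(partials)

# The summed matrix does not depend on how the quartets were distributed.
print(fock_on(1) == fock_on(3))
```

The point of the sketch is that no data are exchanged inside the quartet loop. Each process only reads its local copy of the density and writes to its local copy of F, and the partial results are combined once at the end.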


Speedups for parallel Fock matrix formation for distributed and replicated data algorithms when running two compute threads per node. Speedups were obtained on a Linux cluster for the uracil dimer with the aug-cc-pVTZ basis set and were computed relative to single-node timings with one compute thread, using measured wall times. [Pg.145]
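The way such speedups are computed from wall times can be sketched as follows. The function names and the timings are hypothetical and are not data from the figure. The sketch assumes, as the caption suggests, that the reference run uses one compute thread on one node, so that the ideal speedup on n nodes with two threads each would be 2n.

```python
def relative_speedup(t_ref, t_par):
    """Speedup of a parallel run relative to the reference wall time."""
    return t_ref / t_par

def parallel_efficiency(t_ref, t_par, n_nodes, threads_per_node=2):
    """Speedup divided by the number of compute threads in the parallel run."""
    return relative_speedup(t_ref, t_par) / (n_nodes * threads_per_node)

# Hypothetical wall times in seconds, for illustration only.
t_single_thread = 1200.0   # one node, one compute thread
t_parallel = 25.0          # 32 nodes, two compute threads per node

print(relative_speedup(t_single_thread, t_parallel))
print(parallel_efficiency(t_single_thread, t_parallel, n_nodes=32))
```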





© 2024 chempedia.info