Big Chemical Encyclopedia


Data distribution, parallel algorithms

The four-index transformation is a good test case for parallel algorithm development of electronic structure calculations, because it has O(N⁵) operations, a low computation-to-data-transfer ratio, and is a compact piece of code. Distributed-memory algorithms were presented for a number of standard QC methods by Whiteside and co-workers, with special emphasis on the integral transformation. Details of their implementation on a 32-processor Intel hypercube were provided. [Pg.253]
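For orientation, the sketch below factors the four-index (AO-to-MO) transformation into four quarter-transformations, each an O(N⁵) contraction; this is the standard device that keeps the total cost at O(N⁵) rather than O(N⁸). All names (n, ao_ints, C) are illustrative, and the code is a serial toy, not the distributed implementation discussed above.

```python
import numpy as np

n = 8                                  # number of basis functions (toy size)
rng = np.random.default_rng(0)
ao_ints = rng.random((n, n, n, n))     # AO integrals (pq|rs), illustrative data
C = rng.random((n, n))                 # MO coefficient matrix, illustrative data

# Four successive quarter-transformations; each einsum is an O(N^5)
# contraction over one AO index, so the total work stays at O(N^5).
mo_ints = np.einsum('pi,pqrs->iqrs', C, ao_ints)
mo_ints = np.einsum('qj,iqrs->ijrs', C, mo_ints)
mo_ints = np.einsum('rk,ijrs->ijks', C, mo_ints)
mo_ints = np.einsum('sl,ijks->ijkl', C, mo_ints)
```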

Outline of the algorithm for parallel matrix-vector multiplication Ab = c with a block-distributed matrix A, discussed in the text. The number of processes is designated p, and this_proc is the process ID. The employed data distribution is shown in Figure 6.11; bⱼ and cᵢ represent blocks of length n/√p of the b and c vectors, respectively, and Aᵢⱼ is the block of A stored by... [Pg.112]
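The sketch below simulates this scheme serially, assuming a √p × √p process grid in which process (i, j) owns the block Aᵢⱼ and the vector block bⱼ; the loops stand in for the p processes, and the accumulation into c stands in for the row-wise reduction. All names are illustrative.

```python
import numpy as np

n, p = 12, 9                            # matrix dimension and process count
q = int(np.sqrt(p))                     # process grid is q x q
nb = n // q                             # block length n / sqrt(p)

rng = np.random.default_rng(1)
A = rng.random((n, n))
b = rng.random(n)

c = np.zeros(n)
for i in range(q):                      # the (i, j) loops stand in for the grid
    for j in range(q):
        A_ij = A[i*nb:(i+1)*nb, j*nb:(j+1)*nb]   # block owned by process (i, j)
        b_j = b[j*nb:(j+1)*nb]                   # vector block held locally
        c[i*nb:(i+1)*nb] += A_ij @ b_j           # partial c_i; in MPI these
                                                 # partials are summed by a
                                                 # row-wise reduction

assert np.allclose(c, A @ b)            # sanity check against the serial product
```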

Resort to a parallel algorithm oriented toward general-purpose Multiple Instruction Multiple Data (MIMD) architectures with distributed main memory but with a shared disk. [Pg.169]

An important consideration in the parallelization of quantum chemistry algorithms for distributed-memory computers is the data distribution. The simplest approach is to replicate all the data on all the nodes. Considering, for example, a parallel direct HF computation, this means that each node must store the Fock matrix, the density matrix, the eigenvectors, and a variety of other matrices depending on the implementation. Thus, the storage requirement on each node becomes O(n²), where n is the number of basis functions, and for the large basis sets that can be handled in a reasonable amount of time on a massively parallel computer, this storage requirement may become prohibitive. [Pg.1993]
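A rough back-of-the-envelope sketch of this per-node storage, with an assumed count of six simultaneously held n × n matrices (the exact count depends on the implementation), might look as follows.

```python
n = 10_000          # basis functions
p = 512             # nodes
n_matrices = 6      # assumed number of O(n^2) matrices held at once (illustrative)
bytes_per_elem = 8  # double precision

replicated = n_matrices * n * n * bytes_per_elem   # every node stores everything
distributed = replicated / p                       # ideal even split across nodes

print(f"replicated : {replicated / 2**30:.1f} GiB per node")    # ~4.5 GiB
print(f"distributed: {distributed / 2**30:.3f} GiB per node")   # ~0.009 GiB
```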

Because the fluctuations of interest were, by definition, rare, it was usually necessary to continue the data acquisition process for several weeks in order to build up acceptably smooth distributions. For this reason, the analysis algorithm was designed to enable trajectories to several termination squares (not just one) to be sought in parallel: an 8 × 8 matrix of 64 adjacent termination squares, each centered on a different (qⱼ, q̇ⱼ), was scanned. [Pg.491]

To date the most efficient parallel SCF algorithms have been based on replication within each processor of several O(N²) matrices, limiting the maximum calculation size and forcing an unacceptably low ratio of processors to memory. This restriction led to increased activity in the development of distributed-data schemes, some of which we now consider. [Pg.255]

A. P. Rendell, M. F. Guest, and R. A. Kendall, J. Comput. Chem., 14, 1429 (1993). Distributed Data Parallel Coupled Cluster Algorithm: Application to the 2-Hydroxypyridine/2-Pyridone Tautomerism. [Pg.307]

M. Schütz and R. Lindh, An integral direct, distributed-data, parallel MP2 algorithm, Theor. Chim. Acta, 95 (1997), 13-34. [Pg.273]

From Eq. 6.10 it follows that the dimension n must grow at the same rate as p to maintain a constant efficiency as the number of processes increases. If n increases at the same rate as p, however, the memory requirement per process (n²/p + 2n) will increase with the number of processes. Thus, a k-fold increase in p, with a concomitant increase in n to keep the efficiency constant, will lead to a k-fold increase in the memory required per process, creating a potential memory bottleneck. Measured performance data for a parallel matrix-vector multiplication algorithm using a row-distributed matrix are presented in Section 5.3.2. [Pg.109]
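A short numerical sketch of this effect, with illustrative values for n and p and memory counted in matrix elements (words): growing n at the same rate as p makes the per-process requirement n²/p + 2n grow essentially linearly with p.

```python
def words_per_process(n, p):
    # row-distributed matrix (n^2/p elements) plus the b and c vectors (2n)
    return n**2 / p + 2 * n

n0, p0 = 4_000, 16                     # illustrative starting point
for k in (1, 2, 4, 8):                 # k-fold increase in the process count
    n, p = k * n0, k * p0              # n grows at the same rate as p
    print(f"p = {p:4d}, n = {n:6d}: {words_per_process(n, p):12,.0f} words/process")
```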

Speedups for parallel Fock matrix formation for distributed and replicated data algorithms when running two compute threads per node. Speedups were obtained on a Linux cluster for the uracil dimer with the aug-cc-pVTZ basis set and were computed relative to single-node timings with one compute thread, using measured wall times. [Pg.145]

For a detailed discussion of Hartree-Fock theory, see, for instance, Szabo and Ostlund. Many parallel self-consistent field implementations have been presented in the literature; for a review of some of the early work in this field, see Harrison and Shepard. Several massively parallel, distributed-data self-consistent field algorithms have been implemented. For example, Colvin et al. ... [Pg.145]

