Big Chemical Encyclopedia


Communication overhead

A novel approach to protein conformation is the entropy-sampling Monte Carlo method (ESMC), which is described in detail in another contribution to this volume. The method provides a complete thermodynamic description of protein models, but it is computationally quite expensive. However, because of the underlying data-parallel structure of ESMC algorithms, computations could be done on massively parallel computers essentially without the communication overhead typical of the majority of other simulation techniques. This technique will undoubtedly be applied to numerous systems in the near future. [Pg.233]

The network performance characteristics for a parallel computer may greatly influence the performance that can be obtained with a parallel application. The latency and bandwidth are among the most important performance characteristics because their values determine the communication overhead for a parallel program. Let us consider how to determine these parameters and how to use them in performance modeling. To model the communication time required for a parallel program, one first needs a model for the time required to send a message between two processes. For most purposes, this time can... [Pg.71]
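The latency/inverse-bandwidth model described here can be written out in a few lines; the default parameter values below are illustrative placeholders, not figures from the text.

```python
def comm_time(n_bytes, alpha=1e-6, beta=1e-9):
    """Model the time to send one message between two processes:
    latency alpha (seconds) plus message length times the inverse
    bandwidth beta (seconds per byte). The defaults (1 us latency,
    1 GB/s bandwidth) are illustrative, not measured values."""
    return alpha + beta * n_bytes
```

For short messages the latency term dominates, so a program that sends many small messages pays far more overhead than one that aggregates them into fewer large messages.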

Speedup curves illustrating commonly encountered performance patterns: (a) ideal; (b) superlinear; speedup with performance degradation due to, e.g., communication overhead or load imbalance: (c) logarithmic communication overhead, (d) linear communication overhead, (e) incompletely parallelized program (serial fraction of 0.025). See text for details. [Pg.79]
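The serial fraction of 0.025 quoted for the incompletely parallelized curve caps the attainable speedup at 1/0.025 = 40, by Amdahl's law; a minimal sketch:

```python
def amdahl_speedup(p, serial_fraction=0.025):
    """Amdahl's law: speedup on p processes when a fixed fraction
    of the work (serial_fraction) cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)
```

Even with a million processes the speedup stays just under 40, which is why such curves flatten out rather than tracking the ideal line.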

We have used expressions involving the latency, α, and the inverse bandwidth, β, to model the communication time. An alternative model, the Hockney model, is sometimes used for the communication time in a parallel algorithm. The Hockney model expresses the time required to send a message between two processes in terms of the parameters r∞ and n1/2, which represent the asymptotic bandwidth and the message length for which half of the asymptotic bandwidth is attained, respectively. Metrics other than the speedup and efficiency are also used in parallel computing. One such metric is the Karp-Flatt metric, also referred to as the experimentally determined serial fraction. This metric is intended to be used in addition to the speedup and efficiency, and it is easily computed. The Karp-Flatt metric can provide information on parallel performance characteristics that cannot be obtained from the speedup and efficiency, for instance, whether degrading parallel performance is caused by incomplete parallelization or by other factors such as load imbalance and communication overhead. ... [Pg.90]
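Both models fit in a few lines. The sketch below uses Hockney's standard parameters r∞ and n1/2, and the usual Karp-Flatt formula e = (1/ψ − 1/p)/(1 − 1/p) for measured speedup ψ on p processes; the function and argument names are ours.

```python
def hockney_time(n, r_inf, n_half):
    """Hockney model: time to send a message of length n, given the
    asymptotic bandwidth r_inf and the half-performance message
    length n_half (at n = n_half, half of r_inf is attained)."""
    return (n + n_half) / r_inf

def karp_flatt(speedup, p):
    """Experimentally determined serial fraction from a measured
    speedup on p processes."""
    return (1.0 / speedup - 1.0 / p) / (1.0 - 1.0 / p)
```

If the measured speedup follows Amdahl's law with serial fraction f, the Karp-Flatt metric returns exactly f; a value that grows with p instead points to load imbalance or communication overhead.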

Let us try to modify the matrix-vector multiplication algorithm from section 6.4.1 to improve the scalability. The poor scalability was a result of the relatively large communication overhead incurred by using a row distribution for the matrix A. When A is distributed by rows, all elements of the b or c vector must visit (or be stored by) each process during the computation: if b and c are replicated, no data exchange is required for b, but an all-to-all broadcast is required to replicate c at the end of the computation; if both vectors are distributed, no communication is required for c, but all elements of b must visit all processes during the execution. [Pg.109]
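The replicated-vector case can be illustrated with a toy serial simulation of the row-distributed product, where the final concatenation stands in for the all-to-all broadcast that replicates c; this is an illustration, not the book's implementation.

```python
def matvec_row_distributed(A, b, p):
    """Toy simulation of c = A*b with the rows of A split over p
    'processes'. b is replicated, so each process computes its block
    of c with no communication; gathering the blocks at the end
    models the all-to-all broadcast that replicates c."""
    n = len(A)
    bounds = [n * r // p for r in range(p + 1)]  # row range per process
    c = []
    for r in range(p):
        local_rows = A[bounds[r]:bounds[r + 1]]  # this process's rows
        c.extend(sum(a_ij * b_j for a_ij, b_j in zip(row, b))
                 for row in local_rows)
    return c  # concatenation = replication of c
```

The communication volume of the final gather is the full length-n vector c, which is exactly the overhead the modified algorithm tries to reduce.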

The communication overhead for this algorithm is the cost of performing a broadcast and a reduce operation that both involve n/√p elements and... [Pg.110]

Let us consider a performance model for algorithm (c). We first note that, for a manager-worker model in which one process is dedicated to distributing tasks to the others, the maximum efficiency that can be attained with p processes is bounded by [(p — l)/p] x 100%. Other factors that may contribute to lowering the efficiency are load imbalance on the worker processes and communication overhead. [Pg.128]
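The stated bound follows directly from the fact that only p − 1 of the p processes do useful work; a one-line sketch:

```python
def max_efficiency(p):
    """Upper bound on parallel efficiency for a manager-worker model
    in which one of the p processes only distributes tasks and does
    no computation: at best (p - 1)/p of the machine is working."""
    return (p - 1) / p
```

With p = 2 the bound is already 50%, so a dedicated manager is only sensible when p is reasonably large.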

The execution time includes various synchronization and communication overheads, which means that in fact only part of the whole computational work can be performed in parallel. The overheads are usually machine dependent and are harder to estimate, so theoretical analysis frequently ignores them and tends to give overly optimistic estimates for the power of various parallel algorithms. [Pg.197]

The particle positions and velocities from cells situated on the boundaries between processor domains are copied to the neighboring processor (see Figure 26.15). Thus the number of particles located in the boundary cells determines the communication overhead. [Pg.744]
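A minimal sketch of this bookkeeping for a one-dimensional decomposition; the cell layout, data structures, and names are our illustration, not the chapter's code.

```python
def boundary_particle_count(cells, domain_of, rank):
    """Count the particles on `rank` that sit in boundary cells,
    i.e., cells adjacent to a cell owned by another processor; their
    positions and velocities must be copied to a neighbor each step.
    `cells` maps cell index -> particle count, `domain_of` maps cell
    index -> owning rank; in this 1-D toy decomposition the
    neighbors of cell i are cells i - 1 and i + 1."""
    total = 0
    for i, count in cells.items():
        if domain_of.get(i) != rank:
            continue  # cell belongs to another processor
        if any(domain_of.get(j) not in (None, rank) for j in (i - 1, i + 1)):
            total += count  # boundary cell: must be communicated
    return total
```

The returned count is proportional to the per-step communication volume, which is why clustering particles away from domain boundaries lowers the overhead.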

For this case, the communication overhead t_strips, which is proportional to the area of the interface between processor domains, is constant and equal to 1. Let us assume that the box is partitioned additionally into an n × n mesh of identical sub-boxes in the x, y-plane. The communication cost t_box for n > 1 is proportional to the area of the walls (only half of them) of a single sub-box and is equal to [2L_z/(P/n²)](1/n) + 1/n². For sufficiently long boxes with L_z ≫ 1, the ratio of the two overheads t_box/t_strips is greater than 1. This means that the communication overhead is lower, and consequently the calculation-to-communication ratio higher, for the first method... [Pg.744]

Find the list L_i of the K nearest neighbors j of each particle i within the r_clust radius and sort the list in ascending order according to the distance between particles i and j. Thus L_i(k) = j, where k is the position of particle j in the list. This procedure can be performed in parallel along with the computation of forces in the FPM code. To reduce the communication overhead, we use the parallel clustering algorithm off-line after the simulation. [Pg.751]
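A hypothetical serial sketch of the neighbor-list construction just described (the parallel, in-force-loop version is not reproduced here; the parameter names echo the text's K and r_clust):

```python
import math

def neighbor_list(points, i, k, r_clust):
    """Build L_i: the indices of up to k nearest neighbors j of
    particle i that lie within radius r_clust, sorted in ascending
    order of distance, so L_i[m - 1] is the particle at position m."""
    dists = []
    for j, q in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], q)  # Euclidean distance
        if d <= r_clust:
            dists.append((d, j))
    dists.sort()                      # ascending order by distance
    return [j for _, j in dists[:k]]  # keep at most k neighbors
```

Because each list depends only on local coordinates, building it alongside the force computation adds no extra communication, consistent with the off-line clustering strategy mentioned above.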

However, all the above research studies have focused on secret-key distribution issues, while the specific and unique challenge in biometric authentication, i.e., how to merge the payload with the biometric information while preserving the statistical uniqueness of the biometric information, has not been addressed. Furthermore, all the above-mentioned biometric-based key-exchange schemes need critical time synchronization because they need to record biometric information simultaneously at different positions on the same human body, which incurs considerable extra communication overhead in extremely resource-constrained wearable body sensor networks. In addition, a given biometric feature may not be unique, and accidental faulty measurements (e.g., due to hardware or software failures) may cause a traditional biometric-based security system to malfunction. [Pg.175]

Sensor information collected by the T1 nodes can be transmitted automatically or on request to T2 nodes. For current motion-capture applications, sensor data is transmitted as frequently as possible. A typical packet is 17 bytes containing a node ID, accelerometer, gyroscope, and compass data. While the I²C network can provide bus speeds up to 400 kHz, packet sizes and communication overhead limit effective transmission rates. Lewis estimated the impact of increasing the number of T1 nodes on effective transmission rates, and his results are shown in Fig. 27.15 [10]. [Pg.640]
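A rough back-of-the-envelope estimate of how the 17-byte packet and the shared 400 kHz bus bound the per-node packet rate; the one acknowledge bit per byte is an assumption about the bus framing, and start/stop and addressing overhead are ignored, so real rates would be lower.

```python
def effective_rate_hz(bus_hz=400_000, payload_bytes=17,
                      overhead_bits_per_byte=1, nodes=1):
    """Crude upper bound on packets per second per node on a shared
    bus clocked at bus_hz: each byte costs 8 data bits plus an
    assumed 1 acknowledge bit, and the bus is divided evenly among
    `nodes` transmitters. Framing overhead is ignored."""
    bits_per_packet = payload_bytes * (8 + overhead_bits_per_byte)
    return bus_hz / (bits_per_packet * nodes)
```

Under these assumptions a single node tops out near 2,600 packets per second, and the bound shrinks in proportion to the number of T1 nodes sharing the bus, which is the trend Lewis's measurements illustrate.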

KB compressed). A large portion of this communication overhead is due to the message format imposed by our net-centric messaging protocol. In future work it will be important to significantly reduce the network overhead of sensors and analyzers since we expect them to continually publish data throughout the network. [Pg.136]

FIGURE 10 The effect of differing stencils on the communication overhead f_c. For the stencils in panels (a)-(c), f_c is proportional to 1/√n, where n is the number of grid points per processor. As the stencil size increases (d), f_c decreases until, when the stencil covers the full domain of the problem (e), f_c is proportional to 1/n. This corresponds to a long-range force problem. [Pg.87]

For a fixed-size problem, the parallel efficiency drops dramatically with an increasing number of cores because the communication overheads far exceed the actual computation, an observation valid for both the LJ liquid and the rhodopsin protein problems. The parallel efficiency drops to below 10%, and the communication time contributes more than 80% of the total run time. Typically, one would resort to fixed-size scaling only when the problem is too large to fit on one core. It is the scaled-size problems that present an interesting aspect for large-scale computations. [Pg.299]
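The quoted figures can be checked against the standard definitions; the timing numbers below are illustrative stand-ins chosen to echo the "below 10%" efficiency and "more than 80%" communication share in the text, not measured data.

```python
def parallel_efficiency(t_serial, t_parallel, p):
    """Efficiency = speedup / p = t_serial / (p * t_parallel)."""
    return t_serial / (p * t_parallel)

def comm_fraction(t_comm, t_total):
    """Fraction of the total run time spent communicating."""
    return t_comm / t_total
```

For example, a run that took 100 s on one core and 2 s on 1000 cores has an efficiency of only 5%, the fixed-size regime described above.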

