APPENDIX A Descriptions of data structures

Extensive use is made in the code of a set of scalar integers held in module YOMMP. These assist the description of the data partitioning across processors and are based on the strategy of first partitioning in the north-south direction (NPRGPNS) and then further subpartitioning east-west, according to the value of NPRGPEW. The subdivision in each dimension is called a SET, leading to:
Similarly, for the wave-space partitioning:
The NPROC PEs are logically distributed in a 2-dimensional processor grid NPRGPNS * NPRGPEW. Due to the transposition method, most communication takes place along a column or a row in the logical grid. Each processor set of rows and columns can communicate independently of other sets. Each processor has two logical set coordinates, MYSETA and MYSETB, and a logical processor id, MYPROC. A number of data structures describe which other PEs belong to the A-set and B-set of a PE. These are used to control communication in the A and B directions. Global communication is defined by the arrays
The example in Fig. A.2 shows NPRGPNS = NPRTRW = 3, NPRGPEW = NPRTRV = 1, for processor 2 (PE2). Most variables are uniquely associated with each PE, but some, like NPROCL(:), are linked to the latitudes on the globe. Grid points and Fourier latitudes are distributed among PEs from north to south. The total number of (Gaussian) latitudes is called NDGLG. The array NPROCL(1: NDGLG) contains the (logical) number of the PE responsible for the Fourier transform calculations of each latitude; for example, PE2 is responsible for latitudes 12 to 22. Since FFT calculations require all grid points on a latitude, complete latitudes are distributed among processors. The variable NDGLL is the number of Fourier latitudes on the PE ( = 11). This information is also available for all NPRTRW processors in the array NULTPP(1: NPRTRW). Thus, NDGLL = NULTPP(2). An array NPTRLS(1: NPRTRW) contains integer pointers to the first latitude (in global numbering) within the Fourier latitude range of each PE. For PE2 the value is 12. It is assumed that a latitude range from north to south is associated with one PE. The NULTPP(:) and NPTRLS(:) information from other PEs is used in the transposition routines to and from Fourier space.

The right-hand side of the diagram describes variables and data structures associated with grid point calculations. Due to the spectral transform method, there are only vertical dependencies for grid point calculations in IFS, with the exception of the semi-Lagrangian calculations, described in section 3.7. This makes it possible to split latitudes in any convenient way to achieve load balance, i.e. equal amounts of grid point calculation cost on each PE. If the logical variable LSPLIT = TRUE, each PE is assigned the same number of grid points (plus or minus 1). This achieves a fairly good load balance on partitions of up to several hundred PEs.
The choice LSPLIT = FALSE is convenient for debugging and for cases where (new) code portions have not yet been generalized to cope with split latitudes. The array LSPLITLAT(1: NDGLG) records whether a latitude is split between two of the NPRGPNS PEs. The number of PEs used for the east-west grid splitting (NPRGPEW) does not influence any of the variables in Fig. A.2.
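The Fourier-latitude bookkeeping described for Fig. A.2 can be illustrated with a small Python sketch. It is not the IFS setup code: it assumes NDGLG = 32 (the figure's value is not stated in the text) and assumes that complete latitudes are dealt out in consecutive blocks of near-equal size, with any surplus going to the northernmost PEs. The function name `distribute_latitudes` is hypothetical. Under these assumptions the sketch reproduces the values quoted for PE2.

```python
def distribute_latitudes(ndglg, nprtrw):
    """Split latitudes 1..ndglg into nprtrw consecutive, near-equal blocks."""
    base, extra = divmod(ndglg, nprtrw)
    # Assumed rule: the first `extra` PEs receive one additional latitude.
    nultpp = [base + (1 if pe < extra else 0) for pe in range(nprtrw)]

    # NPTRLS: global number of the first Fourier latitude on each PE (1-based).
    nptrls = []
    first = 1
    for count in nultpp:
        nptrls.append(first)
        first += count

    # NPROCL: for every latitude, the logical number of the PE that owns it.
    nprocl = []
    for pe, count in enumerate(nultpp, start=1):
        nprocl.extend([pe] * count)
    return nultpp, nptrls, nprocl


NDGLG = 32   # assumption, not given in the text
NPRTRW = 3
NULTPP, NPTRLS, NPROCL = distribute_latitudes(NDGLG, NPRTRW)

my_pe = 2                                  # view from PE2
NDGLL = NULTPP[my_pe - 1]                  # 11 Fourier latitudes
first_lat = NPTRLS[my_pe - 1]              # 12
last_lat = first_lat + NDGLL - 1           # 22
print(NDGLL, first_lat, last_lat)
```

Python lists are 0-based, so `NPROCL[j - 1]` corresponds to NPROCL(j) in the text.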
PE2 has both the first and last latitude split. The number of latitudes from which PE2 has grid columns is NDGENL ( = 12). The consecutive latitude range goes from MYFRSTACTLAT ( = 11) to MYLSTACTLAT ( = 22). This latitude range information for all NPRGPNS PEs is required for the transpositions to and from grid point space. It is defined in NFRSTLAT(1: NPRGPNS) and NLSTLAT(1: NPRGPNS), which hold, respectively, the first and last grid point latitude on each PE. As special cases, we have MYFRSTACTLAT = NFRSTLAT(2) and MYLSTACTLAT = NLSTLAT(2). The offset to the first grid point latitude of the PE is frequently used and stored in NFRSTLOFF ( = MYFRSTACTLAT-1). The array MYLATS(1: NDGENL) links the local grid point latitude numbering to the actual latitude number on the globe. In the example, MYLATS(1) = 11.

The arrays NSTA(NDGLG+NPRGPNS-1, NPRGPEW) and NONL(NDGLG+NPRGPNS-1, NPRGPEW) describe how grid columns are distributed among processors. A latitude is split into a number of consecutive strips, the number depending on NPRGPEW and LSPLIT. Each latitude strip is uniquely determined by the starting point measured from Greenwich (NSTA) and the number of points on that latitude belonging to the processor (NONL). Split latitudes complicate the picture because a physical latitude (latitude 11 in the example) is shared between PEs, so latitude 11 has to be represented twice in the NSTA and NONL data structures. This explains the over-dimensioning of the two arrays. The variables in Fig. A.3 are used to control the proper handling of NSTA and NONL array references on different processor sets, as indicated in Fig. A.4.
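The effect of LSPLIT = TRUE on the latitude ranges and on the NSTA/NONL strips can be sketched as follows. This is an illustration under stated assumptions, not the IFS algorithm: it assumes a regular grid of 32 latitudes with 64 points each (the text does not give the grid), NPRGPEW = 1, 1-based start points, and that surplus points go to the northernmost PEs. The function `split_grid` is a hypothetical helper. With these assumptions PE2 ends up with latitudes 11 to 22, both end latitudes split, as in the example above.

```python
def split_grid(nlon_per_lat, nprgpns):
    """Give each PE a consecutive run of grid points, equal to within one.

    nlon_per_lat[j] is the number of points on latitude j+1 (north to south).
    Returns NFRSTLAT, NLSTLAT, LSPLITLAT and, per PE, a list of
    (latitude, NSTA, NONL) strips with 1-based latitude and start point.
    """
    total = sum(nlon_per_lat)
    base, extra = divmod(total, nprgpns)
    strips = [[] for _ in range(nprgpns)]
    lat, used = 0, 0  # current latitude index, points already assigned on it
    for pe in range(nprgpns):
        remaining = base + (1 if pe < extra else 0)
        while remaining > 0:
            take = min(remaining, nlon_per_lat[lat] - used)
            strips[pe].append((lat + 1, used + 1, take))
            used += take
            remaining -= take
            if used == nlon_per_lat[lat]:   # latitude exhausted, move south
                lat, used = lat + 1, 0
    nfrstlat = [s[0][0] for s in strips]
    nlstlat = [s[-1][0] for s in strips]
    lsplitlat = [False] * len(nlon_per_lat)
    for s in strips:
        for latitude, nsta, nonl in s:
            if nonl < nlon_per_lat[latitude - 1]:
                lsplitlat[latitude - 1] = True
    return nfrstlat, nlstlat, lsplitlat, strips


NDGLG, NPRGPNS = 32, 3                       # NDGLG = 32 is an assumption
NFRSTLAT, NLSTLAT, LSPLITLAT, strips = split_grid([64] * NDGLG, NPRGPNS)

pe2 = strips[1]                              # view from PE2
NDGENL = len(pe2)                            # latitudes with columns on PE2
MYLATS = [latitude for latitude, _, _ in pe2]
NFRSTLOFF = NFRSTLAT[1] - 1
n_strips = sum(len(s) for s in strips)       # rows needed in NSTA / NONL
print(NFRSTLAT, NLSTLAT, NDGENL, n_strips)
```

Each split adds one extra strip, so with NPRGPNS PEs there are at most NPRGPNS-1 extra rows; in this sketch `n_strips` comes out as NDGLG+NPRGPNS-1, which is the first dimension of NSTA and NONL.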
When a high level of parallelism is required, it is necessary to distribute vertical levels among PEs for Fourier and Legendre transform calculations. NPRTRV is the number of PEs among which the vertical levels are distributed. In the diagram example, 19 levels are distributed among 3 PEs. The levels are always distributed as equally as possible in consecutive blocks. In Fig. A.5, PE2 is responsible for the 6 levels from 8 to 13. The variables NFLEVL ( = 6) and MYLEVS(1: NFLEVL) hold this information. The array NUMLL(1: NPRTRV) contains the number of levels each of the NPRTRV PEs is responsible for. As a special case, NFLEVL = NUMLL(2). The integer pointer array NPTRLL(1: NPRTRV) points to the actual first level on each processor, so NPTRLL(2) = 8. To simplify code design, we define NPTRLL(NPRTRV+1) = NFLEVG + 1. A help array NBSETLEV(1: NFLEVG) records which PE-set is responsible for the calculation of each vertical level, e.g. NBSETLEV(10) = 2. These help arrays are required to control the vertical transposition, and in cases where global vertical gathering of data is needed.

The surface pressure and other surface fields taking part in the spectral transform are typically assigned to the last PE in the NPRTRV set. The variable NPSP is 1 on this set and 0 on all other sets. An additional help array NPSURF(1: NPRTRV) contains the NPSP value for each of the NPRTRV processor sets. It is necessary to have this information in order to perform vertical transpositions.
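The level distribution for the 19-level, 3-PE example can be reproduced with a short Python sketch. It assumes that when the levels do not divide evenly the surplus goes to the first sets, which is consistent with PE2 starting at level 8, and it places the surface fields on the last set as the text describes for the typical case. The function name `distribute_levels` is hypothetical.

```python
def distribute_levels(nflevg, nprtrv):
    """Distribute levels 1..nflevg in consecutive, near-equal blocks."""
    base, extra = divmod(nflevg, nprtrv)
    # Assumed rule: the first `extra` sets receive one additional level.
    numll = [base + (1 if s < extra else 0) for s in range(nprtrv)]

    # NPTRLL has NPRTRV+1 entries; the last one is NFLEVG+1 by construction.
    nptrll = [1]
    for count in numll:
        nptrll.append(nptrll[-1] + count)

    # NBSETLEV: owning set (1-based) for every level.
    nbsetlev = []
    for s, count in enumerate(numll, start=1):
        nbsetlev.extend([s] * count)

    # NPSURF: surface fields on the last set only (the typical case).
    npsurf = [0] * (nprtrv - 1) + [1]
    return numll, nptrll, nbsetlev, npsurf


NFLEVG, NPRTRV = 19, 3
NUMLL, NPTRLL, NBSETLEV, NPSURF = distribute_levels(NFLEVG, NPRTRV)

my_set = 2                                        # view from PE2
NFLEVL = NUMLL[my_set - 1]
MYLEVS = list(range(NPTRLL[my_set - 1], NPTRLL[my_set]))
NPSP = NPSURF[my_set - 1]
print(NUMLL, NPTRLL, MYLEVS, NPSP)
```

The extra entry NPTRLL(NPRTRV+1) is what lets the level range of any set be written as NPTRLL(s) to NPTRLL(s+1)-1 without a special case for the last set.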
The shaded areas in Fig. A.6 show the so-called halo regions for PE2 for a simple configuration using 3 PEs. The halo region contains the grid columns required by PE2 from neighbouring PEs in order to calculate semi-Lagrangian departure points for all grid columns owned by this PE. Fig. A.7 shows the semi-Lagrangian core and halo regions in the most general configuration where the PEs (here PE11) have a 2-dimensional irregular shape with many neighbour PEs. For the data structures associated with halo communication, see the descriptions of SLCSET (Subsection 3.7.2) and SLRSET (Subsection 3.7.3).

The spectral wave numbers are distributed in a round-robin fashion among the NPRTRW PEs so as to create a good load balance. In Fig. A.8, using a T21 resolution, the waves are distributed among 3 PEs. The contents of the variables reflect the view seen from PE2. NUMP ( = 7) is the number of spectral waves treated by PE2. The array MYMS(1: NUMP) contains the list of the actual wave numbers on PE2. The transposition routines use the array NUMPP(1: NPRTRW) containing the number of wave numbers on each of the NPRTRW PEs. The number of real spectral coefficients on the PE is NSPEC ( = 68) and, likewise, the number of complex spectral coefficients NSPEC2 ( = 168). The NSPEC values will usually vary among PEs. The maximum number of spectral coefficients over the NPRTRW PEs is called NSPEC2MX ( = 170) and is required for the dimensioning of arrays.
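The wave distribution for the T21, 3-PE example can be sketched in Python. The dealing order used here is an assumption: wave numbers are handed out back and forth across the PEs (a serpentine variant of round-robin), and each zonal wave m is assumed to carry NSMAX-m+1 coefficients stored as two values each. The function `deal_waves` is hypothetical and the actual IFS setup routine may order things differently; the assumptions were chosen because they reproduce the NUMP, NSPEC2 and NSPEC2MX values quoted above.

```python
def deal_waves(nsmax, nprtrw):
    """Deal zonal wave numbers 0..nsmax to PEs in serpentine order."""
    waves = [[] for _ in range(nprtrw)]
    for m in range(nsmax + 1):
        block, pos = divmod(m, nprtrw)
        # Even blocks run PE1 -> PEn, odd blocks run PEn -> PE1 (assumed).
        pe = pos if block % 2 == 0 else nprtrw - 1 - pos
        waves[pe].append(m)
    return waves


NSMAX, NPRTRW = 21, 3
waves = deal_waves(NSMAX, NPRTRW)

NUMPP = [len(w) for w in waves]             # waves per PE
# Stored values per PE: (NSMAX - m + 1) coefficients per wave, 2 values each.
nvalues = [2 * sum(NSMAX - m + 1 for m in w) for w in waves]

MYMS = waves[1]                             # view from PE2
NUMP = NUMPP[1]
print(MYMS, NUMPP, nvalues, max(nvalues))
```

Pairing long waves (small m) with short waves (large m) on the same PE is what evens out the work: in this sketch the per-PE totals differ by only two values.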
NPROCM(0: NSMAX) defines which PE set is responsible for the calculation of each spectral wave. The NSPEC2 spectral coefficients are stored as a 1-dimensional data structure. NASM0(0: NSMAX) has the starting position of the NUMP waves on this PE. Only the NUMP entries in the range 0: NSMAX that correspond to waves on this PE are assigned meaningful positive values. NASM0(:) is widely used throughout the code when calculations for specific zonal wave numbers are required.

The semi-implicit spectral calculations have only vertical dependencies, so spectral coefficient columns can be distributed without constraints among the NPRTRV PEs (see Fig. A.9). To achieve good load balance, zonal waves are usually cut in the middle (see wave number 4 in the figure). For some configurations, there are dependencies among the total waves (n) within a zonal wave number and for these cases, splitting in the middle of a wave number is not possible. This restricts the load-balanced parallelism to one half of the spectral truncation.
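How NASM0 indexes the local 1-dimensional spectral structure can be illustrated with a minimal Python sketch. Several things here are assumptions rather than facts from the text: the list of waves held by PE2, the storage of each zonal wave m as NSMAX-m+1 coefficients with two values each, 1-based start positions, and the use of 0 to mark waves that are not held locally. The function `build_nasm0` is hypothetical.

```python
NSMAX = 21
MYMS = [1, 4, 7, 10, 13, 16, 19]   # assumed wave list for PE2 (NUMP = 7)


def build_nasm0(nsmax, myms):
    """Return (NASM0, NSPEC2) for waves stored back to back in MYMS order."""
    nasm0 = [0] * (nsmax + 1)      # 0 marks a wave not held on this PE
    pos = 1                        # 1-based position in the local structure
    for m in myms:
        nasm0[m] = pos
        pos += 2 * (nsmax - m + 1)  # values occupied by wave m (assumed)
    return nasm0, pos - 1


NASM0, NSPEC2 = build_nasm0(NSMAX, MYMS)

# Locate the values of zonal wave 4 inside a local spectral array.
spec = list(range(1, NSPEC2 + 1))           # dummy data, value == position
m = 4
start = NASM0[m] - 1                        # convert to 0-based indexing
wave4 = spec[start:start + 2 * (NSMAX - m + 1)]
print(NASM0[m], len(wave4), NSPEC2)
```

Because NASM0 is indexed by the global wave number, code that loops over MYMS can jump straight to the right part of the local array without searching.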
In almost all parts of IFS, it is sufficient to have a subset of the spectral coefficients, namely the subset this PE is responsible for (see Fig. A.8). However, a global view is required when initial data are read, when post-processed spectral fields are gathered, and when spectral cost function contributions are accumulated. The global spectral data structure (see Fig. A.10) is designed so that local parts from each PE (in processor order) within an A-set are stored next to each other. To avoid memory waste, the data are stored in a one-dimensional structure. An index to the spectral zonal waves in the global data structure is defined by an array NALLMS(1: NSMAX+1). The array ND1M0G(0: NSMAX) points to the first spectral coefficient for each spectral wave within the one-dimensional structure. Finally, the array NPOSSP(1: NPRTRW) records where the spectral coefficients from each A-set start within the global one-dimensional structure.
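The relationship between NALLMS, ND1M0G and NPOSSP can be sketched in Python for a T21 truncation on 3 PEs. The per-set wave lists below are an assumed distribution, not values taken from the figures, and the sketch again assumes that zonal wave m occupies 2*(NSMAX-m+1) values and that positions are 1-based. The function `build_global_index` is hypothetical.

```python
NSMAX = 21
# Assumed wave numbers held by each of the NPRTRW = 3 sets, in processor order.
waves_by_set = [
    [0, 5, 6, 11, 12, 17, 18],
    [1, 4, 7, 10, 13, 16, 19],
    [2, 3, 8, 9, 14, 15, 20, 21],
]


def build_global_index(nsmax, waves_by_set):
    """Lay the local parts of every set end to end and index the result."""
    nallms = []                     # wave numbers in global storage order
    nd1m0g = [0] * (nsmax + 1)      # start of each wave, indexed by m
    npossp = []                     # start of each set's contribution
    pos = 1
    for waves in waves_by_set:
        npossp.append(pos)
        for m in waves:
            nallms.append(m)
            nd1m0g[m] = pos
            pos += 2 * (nsmax - m + 1)
    return nallms, nd1m0g, npossp, pos - 1


NALLMS, ND1M0G, NPOSSP, total = build_global_index(NSMAX, waves_by_set)

# Offset of wave 4 inside the part contributed by set 2.
local_offset = ND1M0G[4] - NPOSSP[1]
print(NPOSSP, ND1M0G[4], local_offset, total)
```

Storing each set's part contiguously means a gather only needs NPOSSP to place incoming data, while ND1M0G gives direct access to any wave once the global structure is assembled.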