
Observations and diagnostic tools for data assimilation:
October 1998

By Heikki Järvinen





 



2. The observation screening

The ECMWF 3D/4D-Var data assimilation system uses an incremental minimization scheme to reduce the computational cost. The variational data assimilation starts with the first (high-resolution) trajectory run. During this run the model counterparts of all the observations are calculated through the non-linear observation operators. As soon as these background departures are available, the screening can be performed. Options for 3D- and 4D-screening are available. In 3D-screening the screening time window extends over the whole assimilation time window (currently six hours), whereas in 4D-screening the assimilation time window is partitioned into one-hour time slots, and the screening decisions in each slot are taken independently of the other time slots.
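The partitioning used in 4D-screening can be illustrated with a minimal sketch. The function names, the use of minutes as the time unit and the treatment of slot boundaries (half-open slots, with the very end of the window assigned to the last slot) are assumptions made for illustration; they are not taken from the ECMWF code.

```python
def slot_index(minutes_from_window_start, window_minutes=360, slot_minutes=60):
    """Return the index of the time slot that contains an observation.

    Assumption: slots are half-open intervals [start, end), and an
    observation exactly at the end of the window goes to the last slot.
    """
    if not 0 <= minutes_from_window_start <= window_minutes:
        raise ValueError("observation lies outside the assimilation window")
    n_slots = window_minutes // slot_minutes
    return min(int(minutes_from_window_start // slot_minutes), n_slots - 1)


def group_by_slot(obs_times):
    """Group observation times (minutes from window start) by time slot."""
    slots = {}
    for t in obs_times:
        slots.setdefault(slot_index(t), []).append(t)
    return slots


# Five observation times in a six-hour window, grouped into one-hour slots.
groups = group_by_slot([5, 59, 60, 185, 360])
print(groups)
```

In 4D-screening each group would then be screened on its own, whereas 3D-screening would treat all five observations as belonging to a single window.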

2.1 Screening of conventional observations

2.1 (a) Preliminary checks of observations

The observation screening begins with a preliminary check of the completeness of the reports. For instance, the observation and background errors should not be missing, as otherwise the background quality control cannot be performed. Also the reporting practice for synop and temp mass observations (surface pressure and geopotential height) is checked.

Next the observations are scanned through for blacklisting. The blacklist formally consists of two parts. First, the selection of variables for assimilation is done using the data selection part of the blacklist file. This controls which observation types, variables, vertical ranges etc. will be selected for the assimilation. Some more complicated decisions are also made through the data selection file. For instance, an orographic rejection limit is applied when the observation lies too deep inside the model orography. This part of the blacklist also provides a handy tool for experimentation. Second, a monthly monitoring blacklist is applied to discard the stations that have recently been reporting in an excessively noisy or biased manner compared with the ECMWF background field.

2.1 (b) Background quality control

The background quality control is performed for all the variables that are intended to be used in the assimilation. The procedure is as follows. The variance of the background departure can be estimated as the sum of the observation and background error variances, $\sigma_o^2 + \sigma_b^2$, assuming that the observation and the background errors are uncorrelated. After normalizing with $\sigma_b$, the estimate of the variance of the normalized departure is given by $1 + \sigma_o^2/\sigma_b^2$. In the background quality control, the square of the normalized background departure is considered suspect when it exceeds its expected variance by more than a predefined multiple. For wind observations, the background quality control is performed simultaneously for both wind components. There is also a background quality control for the observed wind direction. For the scatt winds, a test for high wind speeds and cold SST (possible sea ice) is applied. An example of the background quality control rejections is given in Fig. 1. It shows that the background quality control effectively cuts off the tails of the distribution of observation minus background departures.


Figure 1. An example of a histogram of background departures for airep temperature observations. Variational and background quality control rejections are denoted by filled and outlined columns, respectively.
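The background check described above can be sketched for a single scalar observation. Here `sigma_o` and `sigma_b` stand for the observation and background error standard deviations, so that the expected variance of the departure normalized by `sigma_b` is 1 + (sigma_o / sigma_b)^2. The function name and the value of the multiple used in the example are hypothetical; the operational limits are not given in these notes.

```python
def background_qc_suspect(departure, sigma_o, sigma_b, multiple):
    """Return True if a background departure is considered suspect.

    departure : observation minus background
    sigma_o   : observation error standard deviation
    sigma_b   : background error standard deviation
    multiple  : predefined multiple of the expected variance (assumed value)
    """
    normalized_sq = (departure / sigma_b) ** 2
    expected_variance = 1.0 + (sigma_o / sigma_b) ** 2
    return normalized_sq > multiple * expected_variance


# Illustration with sigma_o = sigma_b = 1 K and a hypothetical multiple of 4:
# the expected variance is 2, so the squared departure must exceed 8.
print(background_qc_suspect(3.0, 1.0, 1.0, 4.0))  # 9 > 8
print(background_qc_suspect(2.0, 1.0, 1.0, 4.0))  # 4 < 8
```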

2.1 (c) Vertical consistency of multi-level reports

The multi-level reports are checked for vertical consistency, and duplicated levels are removed from the reports. The vertical consistency check is applied in such a way that if four consecutive layers are found to be of suspicious quality, then these layers are rejected, and in the case of geopotential observations all the layers above these four are rejected as well.
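A minimal sketch of this rule follows. It assumes that the layers are listed from the bottom upwards, that a per-layer "suspicious" flag has already been set by the earlier checks, and that a run longer than four layers is rejected in full; the function name and these conventions are illustrative only.

```python
def vertical_consistency_reject(suspect, is_geopotential=False, run_length=4):
    """Return per-layer rejection flags for one multi-level report.

    suspect : list of booleans, bottom layer first; True marks a layer
              of suspicious quality.
    """
    n = len(suspect)
    reject = [False] * n
    run = 0
    for i, flag in enumerate(suspect):
        run = run + 1 if flag else 0
        if run >= run_length:
            # Reject every layer in the current suspicious run.
            for j in range(i - run + 1, i + 1):
                reject[j] = True
            if is_geopotential:
                # For geopotential, all layers above the run go as well.
                for j in range(i + 1, n):
                    reject[j] = True
                break
    return reject


flags = [False, True, True, True, True, False, False]
print(vertical_consistency_reject(flags))
print(vertical_consistency_reject(flags, is_geopotential=True))
```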

2.1 (d) Removal of duplicated reports

The removal of duplicated reports is performed by searching for pairs of co-located reports of the same observation type and then checking the content of these reports. It may happen, for instance, that an airep report is duplicated with only a slightly different station identifier, while the observed variables in the two reports are exactly the same or partially duplicated. The pair-wise checking of duplicates results in the rejection of some or all of the content of one of the reports.

2.1 (e) Redundancy check

The redundancy check of the reports, together with the level selection of multi-level reports, is performed next for the active reports that are co-located and originate from the same station. For land synop and paob reports, the report closest to the centre of the screening time window with most active data is retained, whereas the other reports from that station are considered redundant and are therefore rejected from the assimilation. For ship synop and dribu observations the redundancy check is done in a slightly modified fashion. These observations are considered potentially redundant if the moving platforms are within a circle with a radius of one degree of latitude. In this case, too, only the report closest to the centre of the screening time window with most active data is retained. All the data from the multi-level temp and pilot reports from the same station are considered at the same time in the redundancy check. The principle is to retain the best-quality data at the significant levels (i.e. the turning points of the sounding) and closest to the centre of the screening time window. Any one datum will, however, be retained in only one of the reports. A wind observation from a sounding station, for instance, may therefore be retained in either a temp or a pilot report, depending on which one happens to be of better quality. A synop mass observation, if made at the same time and at the same station as the temp report, is redundant if there are any temp geopotential height observations no more than 50 hPa above the synop mass observation.
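The selection for land synop reports can be sketched as below. The text does not say how closeness in time and the amount of active data are weighed against each other, so this sketch assumes that closeness to the centre of the window is the primary criterion and that the number of active data breaks ties. The field names (`station`, `time`, `n_active`) are hypothetical.

```python
def select_non_redundant(reports, window_centre):
    """Keep one report per station; the others are treated as redundant.

    Assumed ordering: smallest time distance from the window centre first,
    then the largest number of active data.
    """
    best = {}
    for rep in reports:
        key = (abs(rep["time"] - window_centre), -rep["n_active"])
        current = best.get(rep["station"])
        if current is None or key < current[0]:
            best[rep["station"]] = (key, rep)
    return [rep for _, rep in best.values()]


# Times are minutes from the start of a six-hour window (centre at 180).
reports = [
    {"station": "A", "time": 0, "n_active": 4},
    {"station": "A", "time": 170, "n_active": 4},
    {"station": "A", "time": 180, "n_active": 3},
    {"station": "B", "time": 120, "n_active": 3},
    {"station": "B", "time": 240, "n_active": 5},
]
retained = select_non_redundant(reports, window_centre=180)
print(retained)
```

For station A the report at the centre of the window is kept; for station B both reports are equally far from the centre, so the one with more active data is kept.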

2.1 (f) Thinning

Finally, a horizontal thinning is performed for the airep and TOVS reports. Horizontal thinning means that a predefined minimum horizontal distance is enforced between nearby reports from the same platform. For airep reports the minimum distance between reports is currently about 125 km. The thinning of the airep data is performed for one airliner at a time. Reports from different airliners may, however, be very close to each other. In this removal of redundant reports the best-quality data are retained, as the preceding quality control is taken into account. In the vertical, the thinning is performed for layers around the standard pressure levels, thus allowing more reports for ascending and descending flight paths. Thinning of TOVS reports is done in two stages. First a minimum distance of about 70 km is enforced, and thereafter a repeated scan is performed to achieve the final separation of roughly 250 km between reports from one platform. The thinning algorithm is the same as that used for aireps, but for TOVS reports a different preference order is applied: a sea sounding is preferred over a land one, a clear sounding is preferred over a cloudy one and, finally, observation times close to the centre of the screening time window are preferred. Fig. 2 gives an example of the overall usage of TOVS reports. There is also an option for further thinning of SSM/I and satob observations within the IFS.


Figure 2. The usage of TOVS reports in the assimilation over the north-eastern Atlantic. Filled rings mark reports that contain one or more channels used in the assimilation, whereas empty rings denote rejected reports. Most of the rejections are due to the horizontal thinning and far fewer are due to quality reasons. Note that both edges of the swath are rejected.
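The thinning algorithm itself is not spelled out in these notes. One simple way to enforce a minimum separation with a preference order is a greedy scan, sketched below under that assumption: reports are visited in order of preference and a report is kept only if it is far enough from every report already kept. The great-circle distance helper, the field names and the sample reports are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (haversine formula)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def tovs_preference(rep, window_centre=180):
    """Sort key: sea before land, clear before cloudy, then time closeness."""
    return (not rep["sea"], not rep["clear"], abs(rep["time"] - window_centre))


def thin(reports, min_distance_km, preference_key):
    """Greedy thinning: keep a report only if far enough from all kept ones."""
    kept = []
    for rep in sorted(reports, key=preference_key):
        if all(
            great_circle_km(rep["lat"], rep["lon"], k["lat"], k["lon"]) >= min_distance_km
            for k in kept
        ):
            kept.append(rep)
    return kept


reports = [
    {"id": "A", "lat": 50.0, "lon": -20.0, "sea": False, "clear": True, "time": 180},
    {"id": "B", "lat": 50.5, "lon": -20.0, "sea": True, "clear": False, "time": 180},
    {"id": "C", "lat": 55.0, "lon": -20.0, "sea": True, "clear": True, "time": 180},
]
# Two stages, as for TOVS: about 70 km first, then roughly 250 km.
stage1 = thin(reports, 70.0, tovs_preference)
stage2 = thin(stage1, 250.0, tovs_preference)
kept_ids = [r["id"] for r in stage2]
print(kept_ids)
```

In this example report A is dropped because it lies about 56 km from the preferred sea sounding B, while B and C are roughly 500 km apart and both survive.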

The effect of observation screening on synop surface pressure observations is summarized in Fig. 3 in the case of 3D-Var and 4D-Var, demonstrating the potential of 4D-Var in using observations from frequently reporting stations.


Figure 3. The effect of the observation screening on synop surface pressure observations. Column height gives the number of observations available, while the shaded part displays those actually used in the assimilation. (a) 4D-screening for 4D-Var, and (b) 3D-screening for 3D/4D-Var.

2.2 Screening of satellite radiances

The TOVS radiances (currently 120 km resolution) are preprocessed in a dedicated module which performs several functions to allow the assimilation of TOVS radiances in 4D-Var (the NESDIS retrievals are not used in 4D-Var but only monitored with the background profiles). This module is called advar and it is called for each TOVS observation with the model background temperature, specific humidity and ozone profiles and surface parameters interpolated to the location of the observations. For each analysis cycle there are typically 20,000 TOVS observations in total, for a dual polar orbiter system. In the screening run, advar is called twice.

2.2 (a) Input

The fast radiative transfer model for TOVS radiances requires an input profile from 1000 to 0.1 hPa. For the current 31-level model the background profiles are only available up to 10 hPa, so the temperature profile has to be extrapolated up to 0.1 hPa, using the NESDIS retrievals up to 1 hPa and then a simple extrapolation based on model atmospheres above this level. Climatological mean profiles are assumed for water vapour and ozone. For the next version of the ECMWF forecast model, which has levels in the stratosphere, this extrapolation will no longer be necessary. Once the full profile from 1000 to 0.1 hPa is defined and checked, the radiative transfer model is called to compute the background radiances from the background profiles.

2.2 (b) Quality control

Several quality checks are applied to the measured and background radiances. The gross checks applied are:
(i)   Check that the background profile is within realistic limits (e.g. temperature in range 150 to 350 K, specific humidity positive and not supersaturated, ozone within climatological extremes).
(ii)   Check that the measured and background brightness temperatures are present for all required channels and within the range 150 to 350 K.

A series of more critical tests are then applied:
(i)   Gross background check (i.e. measured radiance departures from the background are less than 20 K).
(ii)   The background temperature, specific humidity and ozone profiles are checked to make sure they are close to or within the range encompassed by the diverse 32 (or 35 for ozone) profile dataset for which the radiative transfer model is valid.
(iii)   A fine background check where the square of the radiance departures are flagged if they are greater than .
(iv)   A check for cloud contamination of the HIRS channels is included by checking that the radiance departure for HIRS channel 10 lies inside the range -4 to +8 K.
(v)   Radiances at the two extreme edge positions of the swath are flagged at present and not used in 4D-Var.
(vi)   Checks are also made that the bias correction coefficients, satellite id, and scan position are all valid before proceeding.
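Some of the numerical limits listed above (the 150 to 350 K range, the 20 K gross background check and the HIRS channel 10 cloud test) can be put together in a minimal sketch. The function name, the return labels, the sign convention for the departure (observed minus background) and the inclusive treatment of the range limits are assumptions for illustration; the other checks in the list are not covered.

```python
def radiance_checks(obs_tb, bg_tb, hirs10_index=None):
    """Apply a subset of the radiance quality checks to one sounding.

    obs_tb, bg_tb : brightness temperatures in K per channel (None = missing)
    hirs10_index  : position of HIRS channel 10 in the lists, if present
    Returns a label naming the first failed check, or "ok".
    """
    for obs, bg in zip(obs_tb, bg_tb):
        if obs is None or bg is None:
            return "missing"
        if not (150.0 <= obs <= 350.0 and 150.0 <= bg <= 350.0):
            return "out_of_range"
        if abs(obs - bg) >= 20.0:
            # Gross background check: departures must be less than 20 K.
            return "gross_departure"
    if hirs10_index is not None:
        departure = obs_tb[hirs10_index] - bg_tb[hirs10_index]
        if not (-4.0 <= departure <= 8.0):
            return "cloud"
    return "ok"


print(radiance_checks([250.0, 260.0], [251.0, 259.0], hirs10_index=1))
print(radiance_checks([250.0, 250.0], [251.0, 259.0], hirs10_index=1))
```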

2.2 (c) Retrieval

The main task of advar is to perform a 1D-Var retrieval of temperature, water vapour and ozone profiles. Each radiance profile is classified by NESDIS as clear, partly cloudy or cloudy, and different TOVS channels and observation errors are used for each type. The background error covariances are also specified in a file and, for temperature, are close to the global mean background errors assumed in 4D-Var. For specific humidity the background errors assumed in 1D-Var follow the same formulation as in 4D-Var, and the correlations are the same as in 4D-Var.

The minimisation of the cost function is performed using the method of Newtonian iteration and up to 5 iterations are allowed before the minimisation fails. If the cost function of the observed radiance in any of the channels exceeds a predefined threshold then the set of radiances is indicated as inconsistent. The output of 1D-Var includes background and retrieved temperature, water vapour and ozone profiles together with several retrieved surface parameters also included in the 1D-Var control vector.
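The structure of such an iteration can be shown with a scalar toy problem. The sketch below minimises J(x) = (x - x_b)^2 / B + (y - H(x))^2 / R with a Newton-type update and gives up after five iterations. It is an illustration only: the real retrieval works on profile vectors with a radiative transfer model, and the convergence test, tolerance and function names here are assumptions.

```python
def one_d_var_scalar(y, x_b, b_var, r_var, h, h_prime, max_iter=5, tol=1e-6):
    """Newton-type iteration for a scalar variational retrieval.

    y       : observation
    x_b     : background value
    b_var   : background error variance B
    r_var   : observation error variance R
    h       : observation operator H(x)
    h_prime : derivative of H with respect to x
    Returns (estimate, iterations used, converged flag).
    """
    x = x_b
    for iteration in range(1, max_iter + 1):
        jac = h_prime(x)
        gain = b_var * jac / (jac * b_var * jac + r_var)
        x_new = x_b + gain * (y - h(x) - jac * (x_b - x))
        if abs(x_new - x) < tol:
            return x_new, iteration, True
        x = x_new
    # No convergence within the allowed iterations: the minimisation fails.
    return x, max_iter, False


# Linear observation operator H(x) = 2x, for which the answer is known:
# x_a = x_b + 0.4 * (y - 2 * x_b) = 1.8 for the values below.
x_a, n_iter, converged = one_d_var_scalar(
    y=4.0, x_b=1.0, b_var=1.0, r_var=1.0,
    h=lambda x: 2.0 * x, h_prime=lambda x: 2.0,
)
print(x_a, n_iter, converged)
```

For a linear operator the first update already gives the solution and the second iteration only confirms convergence; a non-linear operator would need more of the five allowed iterations.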

A final check on the stability of the retrieved profile is provided in the code but not implemented as the profiles are not used in 4D-Var.

2.2 (d) SSM/I radiances

SSM/I radiances are also screened in a similar module which performs a similar set of functions to advar retrieving total column water vapour, surface wind speed and cloud liquid water path. At the time of writing the SSM/I radiances are used operationally only in a passive mode enabling a full scale performance monitoring.

2.2 (e) Scatterometer processing

A horizontal thinning is performed for the ERS scatterometer reports with respect to the particular measurement geometry of the instrument. The backscatter data are acquired within individual cells related to a 450 km wide grid with a mesh of 25 km in the across and along track directions. 19 measurement nodes are thus defined across the scatterometer's swath, while 19 rows are also considered in the along track direction to gather the data in squares of 19 by 19 points. The thinning is then achieved by keeping only every fourth point within these squares. The data are thus used at a resolution of 100 km instead of the original 25 km sampling distance.
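The arithmetic of this thinning can be checked with a short sketch. It assumes that "every fourth point" applies in both the across-track and along-track directions and that the selection starts at the first node; neither detail is stated in the text.

```python
NODES = 19        # measurement nodes across the swath (and rows per square)
SPACING_KM = 25   # original sampling distance
STEP = 4          # keep every fourth point


def thinned_indices(n=NODES, step=STEP, start=0):
    """Indices kept along one direction (the start offset is an assumption)."""
    return list(range(start, n, step))


kept = [(row, node) for row in thinned_indices() for node in thinned_indices()]

swath_width_km = (NODES - 1) * SPACING_KM    # 18 intervals of 25 km
effective_resolution_km = STEP * SPACING_KM  # spacing between kept points

print(thinned_indices())
print(len(kept), "of", NODES * NODES, "points kept per square")
print(swath_width_km, effective_resolution_km)
```

Under these assumptions 25 of the 361 points in each square are retained, at a spacing of 100 km.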

Apart from the thinning, the other observation-dependent decisions in the screening of the scatt data come essentially from two tests: a sea-ice contamination test based on the model sea surface temperature analysis, with a minimum threshold of 273 K, and a high-wind rejection test with an upper wind speed limit of 25 m/s applied to the higher of the scatt and background winds.
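These two tests amount to a pair of threshold comparisons, sketched below. The function name and the behaviour exactly at the thresholds are assumptions.

```python
def scatt_passes_local_checks(sst_k, obs_speed, bg_speed,
                              sst_min_k=273.0, max_speed=25.0):
    """Return True if a scatt observation passes both local tests.

    sst_k     : model sea surface temperature at the observation (K)
    obs_speed : scatterometer wind speed (m/s)
    bg_speed  : background wind speed (m/s)
    """
    if sst_k < sst_min_k:
        return False  # possible sea-ice contamination
    if max(obs_speed, bg_speed) > max_speed:
        return False  # high-wind rejection on the higher of the two speeds
    return True


print(scatt_passes_local_checks(280.0, 12.0, 10.0))
print(scatt_passes_local_checks(272.0, 12.0, 10.0))
print(scatt_passes_local_checks(280.0, 12.0, 27.0))
```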

An extra quality control is applied to the wind retrieval residual, the so-called "normalized distance to the cone". This quantity is tested as a global average over the six hours of the analysis cycle for each of the 19 measurement nodes across the swath. All the data are rejected in bulk if an excessive value (more than 1.3 times the expected average) is found for any node, provided that the number of data taken into account for that node is judged significant (more than 500). The first check, performed locally, aims at avoiding geophysical effects not explained by the transfer function (cmod4), for example rain or sea-state effects in the vicinity of deep lows. The global quality control on the distance to the cone, in contrast, makes it possible to detect technical anomalies that are not reported in real time by ESA and are likely to affect the measurements in a correlated way and at larger scales. Such anomalies typically occur in the case of orbital manoeuvres.
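The bulk rejection rule can be sketched as follows. The data layout (a mapping from node number to the list of normalized distances collected over the cycle) and the use of a single expected average for all nodes are assumptions made to keep the example short.

```python
def reject_cycle(node_distances, expected_average,
                 ratio_limit=1.3, min_count=500):
    """Return True if all scatt data of the cycle should be rejected.

    node_distances   : dict mapping node number to a list of normalized
                       distances to the cone for that node
    expected_average : expected mean distance (assumed equal for all nodes)
    """
    for values in node_distances.values():
        if len(values) > min_count:
            mean = sum(values) / len(values)
            if mean > ratio_limit * expected_average:
                return True
    return False


normal = {0: [1.0] * 600, 1: [1.1] * 600}
anomalous = {0: [1.0] * 600, 1: [1.4] * 600}
too_few = {0: [1.0] * 600, 1: [2.0] * 400}

print(reject_cycle(normal, expected_average=1.0))
print(reject_cycle(anomalous, expected_average=1.0))
print(reject_cycle(too_few, expected_average=1.0))
```

The third case is not rejected because the node with the large average has too few data for the average to be judged significant.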

2.3 A summary of the current use of observations

A summary of the current status of use of observations in the 4D-Var data assimilation is given in Table 1 below.

Table 1. A summary of the current use of observations in the 4D-Var data assimilation at the ECMWF. stands for surface pressure, 2 m for relative humidity at 2 m level, and for brightness temperature, respectively.

| Observation type | Variables used | Remarks |
|---|---|---|
| synop | | and used only over sea, in the tropics also over low terrain (< 150 m). Orographic rejection limit 6 hPa for , 100 hPa for and 800 m for ps |
| airep | | Not used in full resolution. Used only below 50 hPa |
| satob | | Selected areas and levels |
| dribu | | Orographic rejection limit 800 m for |
| temp | | Used at significant levels. only below 300 hPa. 10 m and used over land only in tropics over low terrain (< 150 m). Orographic rejection limit 10 hPa for and , 100 hPa for , 6 hPa for and -4 hPa for |
| pilot | | Used at significant levels. 10 m and used over land only in tropics over low terrain (< 150 m). Orographic rejection limit 10 hPa for and |
| satem | | Selected channels and areas. NESDIS retrievals are not used any more |
| paob | | Used south of 19°S. Orographic rejection limit 800 m for |
| scatt | | Not used in full resolution. Used if SST warmer than 273 K and if both observed and background wind less than 25 m/s |


2.4 Compression of the CMA-file

After the observation screening roughly 15% of all the observed data are active, and the compressed observation array for the minimization run contains only those data. This large compression rate is mainly driven by the number of TOVS data: after the screening only 10-20% of the TOVS reports are left, whereas for the conventional observations the figure is around 40%. As a part of the compression, the observations are re-sorted among the processors for the minimization job in order to achieve a better load balance on the parallel computer.

2.5 A massively parallel computing environment

The migration of operational codes at the ECMWF in 1996 to support a massively parallel computing environment set a requirement for reproducibility. The observation screening should result in exactly the same selection of observations when different numbers of processors are used for the computations. In the observation screening there are two basic types of decisions to be made. Independent decisions, on one hand, are those where no information about any other observations or decisions is needed. In a parallel computing environment these decisions can be made on different processors fully in parallel. For dependent decisions, on the other hand, a global view of the observations is needed, which implies that some communication between the processors is required. The observation array is, however, far too large to be copied to each individual processor. Therefore, the observation screening at the ECMWF is implemented in such a way that only the minimum necessary information about the reports is communicated globally, in order to provide the global view of the observations needed for the dependent decisions.

The global view of the observations is provided in the form of a global "time-location" array for selected observation types. This array contains compact information about the reports that are still active at this stage. For instance, the observation time, location and station identifier as well as the owner processor of each report are included. The time-location array is composed locally on each processor, then collected, merged and redistributed to every processor. After the redistribution the array is sorted locally on each processor according to the unique sequence number. Every processor thus has exactly the same information to start with, and the dependent decisions can be performed in a reproducible manner independently of the computer configuration.
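The reproducibility argument can be illustrated with a small single-process simulation. The sketch distributes the same reports over two and over three simulated processors, merges the local time-location arrays and sorts the result by sequence number. The round-robin distribution, the field names and the sample reports are hypothetical; no real message passing is involved.

```python
def distribute(reports, n_procs):
    """Assign reports to simulated processors (round-robin, an assumption)."""
    local_arrays = [[] for _ in range(n_procs)]
    for i, rep in enumerate(reports):
        owner = i % n_procs
        local_arrays[owner].append({**rep, "owner": owner})
    return local_arrays


def build_global_view(local_arrays):
    """Merge the local arrays and sort by the unique sequence number."""
    merged = [entry for local in local_arrays for entry in local]
    return sorted(merged, key=lambda entry: entry["seqno"])


def without_owner(view):
    """Drop the owner field, which legitimately depends on the configuration."""
    return [{k: v for k, v in e.items() if k != "owner"} for e in view]


reports = [
    {"seqno": 3, "time": 175, "lat": 60.2, "lon": 24.9, "station": "ST03"},
    {"seqno": 1, "time": 30, "lat": 51.4, "lon": -0.9, "station": "ST01"},
    {"seqno": 4, "time": 300, "lat": 48.8, "lon": 2.3, "station": "ST04"},
    {"seqno": 2, "time": 90, "lat": 59.9, "lon": 10.7, "station": "ST02"},
]

view_2 = build_global_view(distribute(reports, 2))
view_3 = build_global_view(distribute(reports, 3))
print([e["seqno"] for e in view_2])
print(without_owner(view_2) == without_owner(view_3))
```

Apart from the owner field, the sorted array is identical for both configurations, which is what allows the dependent decisions to be taken identically everywhere.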

The time-location array is sufficient for all the dependent decisions except the redundancy checking of the multi-level temp and pilot reports. This is a special case in the sense that information on every observed variable at every level is needed, which means that the whole multi-level report has to be communicated. The alternative would be to force the observation clusters of the multi-level reports always onto one processor without splitting them. In that case the codes responsible for creating the observation arrays for the assimilation would have to ensure the geographical integrity of the observation arrays distributed to the processors. This is, however, not possible in all cases, and the observation screening has to be able to cope with this. Currently, it is coded in such a way that only a limited number of multi-level temp and pilot reports, selected on the basis of the time-location array, are communicated between the appropriate processors as copies of these common stations.









 

07.06.2002   © ECMWF