3. The modelling of errors
To represent the fact that there is some uncertainty in
the background, in the observations and in the analysis, we will assume some
model of the errors between these vectors and their true counterparts. The
correct way to do this is to assume some probability density function,
or pdf, for each kind of error. There is a sophisticated and rigorous mathematical
theory of probabilities to which the reader may refer. For more practically
minded readers, we present a simplified (and mathematically loose) explanation of
pdfs in the paragraph below, using the example of background errors.
3.1 Using pdfs to represent uncertainty
Given a background field $\mathbf{x}_b$ just before
doing an analysis, there is one and only one vector of errors that separates
it from the true state $\mathbf{x}_t$:

$$\varepsilon_b = \mathbf{x}_b - \mathbf{x}_t$$

If we were able to repeat each analysis experiment a large
number of times, under exactly the same conditions, but with different realizations
of errors generated by unknown causes, the error vector $\varepsilon_b$ would be different each time. We can calculate statistics
such as averages, variances and histograms of frequencies of $\varepsilon_b$. In the limit of a very large number of realizations,
we expect the statistics to converge to values which depend only on the
physical processes responsible for the errors, not on any particular realization
of these errors. When we do another analysis under the same conditions,
we do not expect to know what the error $\varepsilon_b$ will be, but at least we will know its
statistics. The best information about the distribution of $\varepsilon_b$ is given by the limit of the histogram when the classes
are infinitely small, which is a scalar function of integral 1 called the
probability density function of $\varepsilon_b$. From this function one can derive all statistics, including the
average (or expectation) $\overline{\varepsilon_b}$ and the variances [1]. A popular model
of scalar pdf is the Gaussian function, which can be generalized to a multivariate
pdf.
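The convergence argument above can be illustrated numerically. The following is a minimal sketch, assuming (purely for illustration) a scalar error drawn from a Gaussian pdf with a hypothetical bias and standard deviation. In a real system the true state is unknown, so errors cannot be sampled this way. The function name `empirical_error_stats` is not from these notes.

```python
import numpy as np


def empirical_error_stats(errors, n_bins=50):
    """Return mean, variance and a normalized histogram of scalar error samples."""
    errors = np.asarray(errors, dtype=float)
    mean = errors.mean()
    variance = errors.var()  # mean squared departure from the sample mean
    # density=True rescales the bin counts so that the histogram integrates to 1,
    # which makes it a discrete approximation of the pdf.
    density, edges = np.histogram(errors, bins=n_bins, density=True)
    return mean, variance, density, edges


rng = np.random.default_rng(0)
true_bias, true_std = 0.5, 2.0  # hypothetical values, chosen for illustration

# The statistics settle down as the number of realizations grows.
for n in (100, 10_000, 1_000_000):
    sample = rng.normal(true_bias, true_std, size=n)
    mean, variance, density, edges = empirical_error_stats(sample)
    print(n, mean, variance)
```

With a small sample the mean and variance fluctuate noticeably from one draw to the next; with a large one they approach the values set by the underlying pdf, independently of the particular realization.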
3.2 Error variables
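The errors in the background and in the observations [2] are modelled as follows:

• background errors: $\varepsilon_b = \mathbf{x}_b - \mathbf{x}_t$, of average $\overline{\varepsilon_b}$ and covariances
$\mathbf{B} = \overline{(\varepsilon_b - \overline{\varepsilon_b})(\varepsilon_b - \overline{\varepsilon_b})^{\mathrm{T}}}$. They are the estimation errors of the background
state, i.e. the difference between the background state vector and
its true value. They do not include discretization errors.

• observation errors: $\varepsilon_o = \mathbf{y} - H(\mathbf{x}_t)$, of average $\overline{\varepsilon_o}$ and covariances
$\mathbf{R} = \overline{(\varepsilon_o - \overline{\varepsilon_o})(\varepsilon_o - \overline{\varepsilon_o})^{\mathrm{T}}}$. They contain errors in the observation process
(instrumental errors, because the reported value is not a perfect
image of reality), errors in the design of the operator $H$, and representativeness errors, i.e. discretization
errors which prevent $\mathbf{x}_t$ from being
a perfect image of the true state [3].

• analysis errors: $\varepsilon_a = \mathbf{x}_a - \mathbf{x}_t$, of average $\overline{\varepsilon_a}$. A measure $\|\varepsilon_a - \overline{\varepsilon_a}\|$
of these errors is given by the trace of the analysis
error covariance matrix $\mathbf{A}$,

$$\mathrm{Tr}(\mathbf{A}) = \overline{\|\varepsilon_a - \overline{\varepsilon_a}\|^2}.$$

They are the estimation errors
of the analysis state, which is what we want to minimize.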
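The definitions of bias, covariance matrix and trace can be checked on a sample of error vectors. This is a sketch under stated assumptions: the analysis errors below are synthetic, drawn from a hypothetical three-variable Gaussian distribution, and the names `error_statistics`, `eps_a` and `true_cov` are introduced only for this example.

```python
import numpy as np


def error_statistics(errors):
    """Bias and covariance of error vectors stored one per row, shape (n_samples, n)."""
    errors = np.asarray(errors, dtype=float)
    bias = errors.mean(axis=0)                 # average of the errors
    departures = errors - bias                 # errors minus their average
    covariance = departures.T @ departures / errors.shape[0]
    return bias, covariance


rng = np.random.default_rng(1)
# Hypothetical bias and covariance of the synthetic analysis errors.
true_bias = np.array([0.1, 0.0, -0.2])
true_cov = np.array([[1.0, 0.5, 0.0],
                     [0.5, 2.0, 0.3],
                     [0.0, 0.3, 0.5]])
eps_a = rng.multivariate_normal(mean=true_bias, cov=true_cov, size=50_000)

bias, A = error_statistics(eps_a)

# The trace of A equals the mean squared norm of the unbiased errors.
mean_sq_norm = np.mean(np.sum((eps_a - bias) ** 2, axis=1))
print(np.trace(A), mean_sq_norm)
```

The two printed numbers agree to rounding error, which is the sample version of the trace identity given for the analysis errors.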
The averages of errors are called biases and they
are the sign of a systematic problem in the assimilating system: a model
drift, or a bias in the observations, or a systematic error in the way they
are used.
It is important to understand the algebraic nature of the
statistics. Biases are vectors of the same kind as the model state or observation
vectors, so their interpretation is straightforward. Linear transforms that
are applied to model state or observation vectors (such as spectral transforms)
can be applied to bias vectors.
3.3 Using error covariances
Error covariances are more subtle and we will illustrate
this with the background errors (all remarks apply to observation errors
too). In a scalar system, the background error covariance is simply the
variance, i.e. the mean-square (or quadratic) average of departures from
the mean (its square root is the r.m.s. departure):

$$B = \mathrm{var}(\varepsilon_b) = \overline{(\varepsilon_b - \overline{\varepsilon_b})^2}$$

In a multidimensional system, the covariances are a square symmetric matrix.
If the model state vector has dimension $n$, then the covariances are an $n \times n$ matrix.
The diagonal of the matrix contains the variances [4] for each variable of the model; the off-diagonal terms
are cross-covariances between each pair of variables of the model. The matrix
is positive [5]. Unless some variances are zero, which happens only
in the rather special case where one believes some features are perfect
in the background, the error covariance matrix is positive definite. For
instance, if the model state is three-dimensional, and the background errors
(minus their average) are denoted $(e_1, e_2, e_3)$, then

$$\mathbf{B} = \begin{pmatrix}
\mathrm{var}(e_1) & \mathrm{cov}(e_1, e_2) & \mathrm{cov}(e_1, e_3) \\
\mathrm{cov}(e_1, e_2) & \mathrm{var}(e_2) & \mathrm{cov}(e_2, e_3) \\
\mathrm{cov}(e_1, e_3) & \mathrm{cov}(e_2, e_3) & \mathrm{var}(e_3)
\end{pmatrix}$$

The off-diagonal terms can be transformed into error correlations (if the
corresponding variances are non-zero):

$$\rho(e_i, e_j) = \frac{\mathrm{cov}(e_i, e_j)}{\sqrt{\mathrm{var}(e_i)\,\mathrm{var}(e_j)}}$$

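As an illustration of the conversion from covariances to correlations, here is a minimal sketch. The matrix values are made up for the example and the function name `covariance_to_correlation` is an assumption of this sketch, not something defined in these notes.

```python
import numpy as np


def covariance_to_correlation(B):
    """Divide each covariance by the product of the two standard deviations."""
    B = np.asarray(B, dtype=float)
    variances = np.diag(B)
    if np.any(variances <= 0):
        # Correlations are only defined where the variances are non-zero.
        raise ValueError("correlations are undefined where a variance is zero")
    std = np.sqrt(variances)
    return B / np.outer(std, std)


# Hypothetical 3x3 background error covariance matrix.
B = np.array([[4.0, 1.0, 0.0],
              [1.0, 1.0, -0.3],
              [0.0, -0.3, 0.25]])
C = covariance_to_correlation(B)
print(C)
```

The result has ones on the diagonal; here the correlation between the first two variables is 0.5 and between the last two is -0.6.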
Finally, linear transformations of the model state vector
can only be applied to covariances as full matrix transforms. In particular,
it is not possible to directly transform the fields of variances or standard
deviations. If one defines a linear transformation by a matrix $\mathbf{L}$ (i.e.
a matrix whose rows are the coordinates of the new basis vectors in terms
of the old ones, so that the new coordinates of the transform of $\mathbf{x}$ are $\mathbf{L}\mathbf{x}$), then the covariance matrix in terms
of the new variables is $\mathbf{L}\mathbf{B}\mathbf{L}^{\mathrm{T}}$.
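The difference between transforming a bias vector and transforming covariances can be seen on a two-variable example. This is a sketch with made-up numbers: the transform below (sum and difference of the two variables) and all values are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
import numpy as np

# Hypothetical background error statistics for two model variables.
bias = np.array([0.5, -0.2])
B = np.array([[4.0, 1.0],
              [1.0, 1.0]])

# Hypothetical linear transform: new variables are the sum and the difference.
L = np.array([[1.0, 1.0],
              [1.0, -1.0]])

# A bias is a vector of the same kind as the state, so it transforms like the state.
bias_new = L @ bias

# Covariances need the full matrix transform L B L^T.
B_new = L @ B @ L.T
print(B_new)  # [[7. 3.] [3. 3.]]

# Transforming the variances alone ignores the cross-covariances and gives
# the wrong answer unless B happens to be diagonal.
naive_variances = (L ** 2) @ np.diag(B)
print(np.diag(B_new), naive_variances)  # [7. 3.] versus [5. 5.]
```

The variance of the sum is 4 + 1 + 2×1 = 7 and that of the difference is 4 + 1 − 2×1 = 3; the cross-covariance term is exactly what the variance-only calculation loses.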
3.4 Estimating statistics in practice
The error statistics (biases and covariances) are functions
of the physical processes governing the meteorological situation and the
observing network. They also depend on our a priori knowledge of
the errors. Error variances in particular reflect our uncertainty in features
of the background or the observations. In general, the only way to estimate
statistics is to assume that they are stationary over a period of time and
uniform over a domain [6], so that one can
take a number of error realizations and compute empirical statistics. This
is in a sense a climatology of errors. Another empirical way to specify
error statistics is to take them to be a fraction of the climatological
statistics of the fields themselves.
When setting up an assimilation system in practice, such
approximations are unavoidable because it is very difficult to gather accurate
data to calibrate statistics: estimation errors cannot be observed directly.
Some useful information on the average values of the statistics can be gathered
from diagnostics of an existing data assimilation system using the observational
method (see its description below) and the NMC method (use of forecast
differences as surrogates to short-range forecast errors). More detailed,
flow-dependent forecast error covariances can be estimated directly from
a Kalman filter (described below), although this algorithm raises other
problems. Finally, meteorological common sense can be used to specify error
statistics, to the extent that they reflect our a priori knowledge of the
physical processes responsible for the errors [7].
ref: Hollingsworth et al. 1986; Parrish and Derber 1992
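The idea of using forecast differences as surrogates for short-range forecast errors can be sketched as follows. This is only an illustration, not the operational procedure of any centre. It assumes pairs of forecasts valid at the same time but started at different times, and it pools the cases as if the statistics were stationary. The forecasts here are synthetic, the function and array names are hypothetical, and the factor `scale` is a tuning assumption that the text above does not specify.

```python
import numpy as np


def covariance_from_forecast_differences(fc_long, fc_short, scale=1.0):
    """Estimate a covariance matrix from forecast pairs, arrays of shape (n_cases, n)."""
    diffs = np.asarray(fc_long, dtype=float) - np.asarray(fc_short, dtype=float)
    diffs = diffs - diffs.mean(axis=0)  # remove the mean difference
    # Pool all cases together: this is where stationarity is assumed.
    return scale * (diffs.T @ diffs) / diffs.shape[0]


rng = np.random.default_rng(2)
n_cases, n = 2000, 4

# Synthetic "truth" and two forecasts of it with independent errors;
# the longer-range forecast is given the larger error (an assumption).
truth = rng.normal(size=(n_cases, n))
fc_short = truth + rng.normal(scale=0.5, size=(n_cases, n))
fc_long = truth + rng.normal(scale=1.0, size=(n_cases, n))

B_est = covariance_from_forecast_differences(fc_long, fc_short)
print(np.diag(B_est))
```

The estimate is symmetric and positive by construction. How well it resembles the actual short-range forecast error covariances depends on assumptions that this toy example does not test.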
[1] Mathematically speaking, a pdf may not have
an average or variances, but in the usual geophysical problems all pdfs
do, and we will assume this throughout this presentation.
[2] One could model forecast errors and balance
properties in a similar way, although this is outside the scope of this
discussion. See the section on the Kalman filter.
[3] An example is sharp temperature inversions
in the vertical. They can be fairly well observed using a radiosonde, but
it is impossible to represent them precisely with the current vertical resolution
of atmospheric models. On the other hand, temperature soundings obtained
from satellite cannot themselves observe sharp inversions.
[4] The square roots of variances are called standard
deviations, or standard errors.
[5] This does not mean that all the matrix elements
are positive; the definition of a positive definite matrix is given in
Appendix A.
The positiveness can be proven by remarking that the eigenvalues of the
matrix are the variances in the direction of the eigenvectors, and thus
are non-negative.
[6] This is called an assumption of ergodicity.
[7] For example, forecast errors in
a tropical meteorological assimilation should clearly be increased in the vicinity
of reported tropical cyclones, and observation operators
for satellite radiances have larger errors in cloudy areas.