2.6 Variational quality control
The variational quality control, VarQC, has been described by Andersson and Järvinen (1999). It is a quality control mechanism incorporated within the variational analysis itself. A modification of the observation cost function to take into account the non-Gaussian nature of gross errors has the effect of reducing the analysis weight given to data with large departures from the current iterand (or preliminary analysis). Data are not irrevocably rejected, but can regain influence on the analysis during later iterations if supported by surrounding data. VarQC is a type of buddy check, in that it rejects those data that have not been fitted by the preliminary analysis, often because they conflict with surrounding data.
2.6.1 Description of the method
The method is based on Bayesian formalism. First, an a priori estimate of the probability of gross error, $P(G)_i$, is assigned to each datum, based on study of historical data. Then, at each iteration of the variational scheme, an a posteriori estimate of the probability of gross error, $P(G)_f$, is calculated (Ingleby and Lorenc, 1993), given the current value of the iterand (the preliminary analysis). VarQC modifies the gradient (of the observation cost function with respect to the observed quantity) by the factor $1 - P(G)_f$ (the QC-weight), which means that data which are almost certainly wrong ($P(G)_f \approx 1$) are given near-zero weight in the analysis. Data whose $P(G)_f$ exceeds a preset threshold are considered `rejected' and are flagged accordingly, for the purpose of diagnostics, feedback statistics, etc.
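The normal definition of a cost function is
$$J_o = -\ln p \tag{2.10}$$
where $p$ is the probability density function. Instead of the normal assumption of Gaussian statistics, we assume that the error distribution can be modelled as a sum of two parts: a Gaussian, representing correct data, and a flat distribution, representing data with gross errors. We write:
$$p_i = N_i\,[1 - P(G_i)] + F_i\,P(G_i) \tag{2.11}$$
where subscript $i$ refers to observation number $i$ and $P(G_i)$ is its a priori probability of gross error. $N_i$ and $F_i$ are the Gaussian and the flat distributions, respectively:
$$N_i = \frac{1}{\sqrt{2\pi}\,\sigma_o} \exp\left[-\frac{1}{2}\left(\frac{y_i - H\mathbf{x}}{\sigma_o}\right)^2\right] \tag{2.12}$$
$$F_i = \frac{1}{L_i} = \frac{1}{2\,l_i\,\sigma_o} \tag{2.13}$$
The flat distribution is defined over an interval $L_i$, which in Eq. (2.13) has been written as a multiple of the observation error standard deviation $\sigma_o$. Substituting Eqs. (2.11) to (2.13) into Eq. (2.10), we obtain, after rearranging the terms, an expression for the QC-modified cost function $J_o^{QC}$ and its gradient $\nabla J_o^{QC}$, in terms of the normal cost function $J_o^{N}$:
$$J_o^{N} = \frac{1}{2}\left(\frac{y_i - H\mathbf{x}}{\sigma_o}\right)^2 \tag{2.14}$$
$$J_o^{QC} = -\ln\left[\frac{\gamma_i + \exp(-J_o^{N})}{\gamma_i + 1}\right] \tag{2.15}$$
$$\nabla J_o^{QC} = \nabla J_o^{N}\left[1 - \frac{\gamma_i}{\gamma_i + \exp(-J_o^{N})}\right] \tag{2.16}$$
where
$$\gamma_i = \frac{P(G_i)\,\sqrt{2\pi}}{[1 - P(G_i)]\,2\,l_i} \tag{2.17}$$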
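The relations of this subsection can be sketched numerically. The following Python fragment is an illustration only, not the IFS code: it assumes the Gaussian-plus-flat form of Andersson and Järvinen (1999), in which $\gamma = P\sqrt{2\pi} / [(1 - P)\,2l]$, the QC-modified cost is $-\ln[(\gamma + e^{-J^N})/(\gamma + 1)]$ and the QC-weight is $1 - \gamma/(\gamma + e^{-J^N})$. All function and argument names are hypothetical, and the values of the prior probability and flat-distribution width used below are arbitrary.

```python
import math


def gamma(p_gross, l_width):
    """Relative weight of the flat (gross-error) part of the error model.

    p_gross : a priori probability of gross error, assumed 0 < p_gross < 1
    l_width : half-width of the flat distribution in units of sigma_o
    """
    return p_gross * math.sqrt(2.0 * math.pi) / ((1.0 - p_gross) * 2.0 * l_width)


def jo_normal(departure, sigma_o):
    """Ordinary (Gaussian) cost of one observation with the given departure."""
    return 0.5 * (departure / sigma_o) ** 2


def jo_qc(jo_n, g):
    """QC-modified cost, normalised so that it is zero when jo_n is zero."""
    return -math.log((g + math.exp(-jo_n)) / (g + 1.0))


def posterior_gross_error(jo_n, g):
    """A posteriori probability of gross error given the current departure."""
    return g / (g + math.exp(-jo_n))


def qc_weight(jo_n, g):
    """Factor multiplying the gradient of the ordinary cost function."""
    return 1.0 - posterior_gross_error(jo_n, g)


if __name__ == "__main__":
    g = gamma(p_gross=0.01, l_width=5.0)  # arbitrary illustrative values
    for dep in (0.0, 1.0, 3.0, 5.0):
        jn = jo_normal(dep, sigma_o=1.0)
        print(dep, jo_qc(jn, g), qc_weight(jn, g))
```

With these illustrative settings the weight stays close to one for departures of about one standard deviation and falls towards zero for departures of several standard deviations, which is the behaviour described in the text.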
2.6.2 Implementation
The a priori information, i.e. $P(G_i)$ and $l_i$, is set during the screening, in the routine DEPART, and stored in the NCMFGC1 and NCMFGC2 words of the ODB. Default values are set in DEFRUN and can be modified through the namelist namjo. VarQC can be switched on or off for each observation type and variable individually using LVARQC, or it can be switched off altogether by setting the global switch LVARQCG=.false. Since as good a `preliminary analysis' as possible is needed before VarQC starts, it is necessary to perform part of the minimization without VarQC and then switch it on. This is controlled by NITERQC in yomcosjo, which is set to 40 by default. Printing of VarQC results is done by the routine PRTQC.
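The effect of delaying VarQC until part of the minimization has been done can be illustrated with a toy problem. The sketch below is not the IFS minimization: it assumes a single scalar state fitted to a handful of direct observations by plain gradient descent, with no background term. The argument `niterqc` merely mirrors the role of NITERQC; the function name, step size, first guess and observation values are all hypothetical.

```python
import math


def analyse(obs, sigma_o, gamma, n_iter=100, niterqc=40, step=0.1):
    """Fit a scalar x to the observations by gradient descent.

    For iterations before niterqc every datum has weight 1 (no VarQC).
    From iteration niterqc onward each datum's gradient contribution is
    multiplied by the QC-weight 1 - gamma / (gamma + exp(-jo_n)).
    """
    x = 0.0  # hypothetical first guess
    weights = [1.0] * len(obs)
    for it in range(n_iter):
        use_qc = it >= niterqc
        grad = 0.0
        for k, y in enumerate(obs):
            jo_n = 0.5 * ((y - x) / sigma_o) ** 2
            if use_qc:
                w = 1.0 - gamma / (gamma + math.exp(-jo_n))
            else:
                w = 1.0
            weights[k] = w
            # Gradient of the (possibly QC-modified) cost with respect to x.
            grad += -w * (y - x) / sigma_o ** 2
        x -= step * grad
    return x, weights


if __name__ == "__main__":
    observations = [1.0, 1.1, 0.9, 1.05, 8.0]  # last value is a gross error
    x_qc, w_qc = analyse(observations, sigma_o=1.0, gamma=0.0025)
    x_plain, _ = analyse(observations, sigma_o=1.0, gamma=0.0025, niterqc=100)
    print("with VarQC:", x_qc, w_qc)
    print("without VarQC:", x_plain)
```

In this toy setting the run without VarQC converges to the plain mean of all five values, whereas the run that switches VarQC on after 40 iterations ends close to the mean of the four mutually consistent values, with a near-zero weight on the outlier.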
JOCOST computes $J_o^{QC}$ according to Eq. (2.15) and the QC-weight, which is the factor within brackets in Eq. (2.16).
2.6.3 Correlated data
The quality control of radiosonde height data (if used) is more complex because of the correlation of observation error (see JOPDF). This is one of the reasons why we changed to using temperature data instead, from cy18r6. VarQC for correlated data is no longer supported.