PAPER I

Radi Ajjaji

Abstract

This paper discusses a problem observed since the very beginning in the 4d-var assimilation system at Météo-France. Precipitations over the Sahara desert are extremely overestimated in forecasts having as initial state a multi-incremental 4d-var analysis (Veersé and Thépaut, 1998). This problem is observed frequently during the summer period (June, July, and August). Several investigations had been done to diagnose and fix this problem which constitutes a serious source of weakness in ARPEGE assimilation system used at Météo-France since June 2000. As a first try to solve the problem, it has been decided to apply an incremental semi-external digital filter initialization at the end of the assimilation process to further filter model fields, especially humidity, at full resolution and consistently with the physics. This weakened slightly the problem but didn't solve it definitively.

The first ingredient suspected was the simplified physics introduced in the last inner loop, then the multi-truncation incremental approach. The different results showed that the overestimated rainfall is not caused by these two ingredients. A last experiment showed that the abnormal humidity increments associated to the large amounts of precipitations in the subsequent forecast over Sahara are caused by a strange feature of the tangent-linear dynamics, causing an abnormal nonlinear evolution of the increments during the minimization process.

1. Introduction

Four-dimensional variational assimilation (4d-var) has been operational at Météo-France since June 20th, 2000. The basic characteristics of this system are similar to those of the ECMWF 4d-var, with some specific components including the multi-incremental technique, a weak constraint based on digital filtering, the use of another set of specific simplified physical parameterizations based on Météo-France full physics package, and later an additional incremental semi-external filtering.

At the date of writing, minimization was based on three successive outer and inner loops. The first, second and third inner-loop minimizations are performed at spectral truncations T42, T63 and T95 respectively: This is cheaper than a pure incremental technique performed at resolution T95 (Veersé and Thépaut, 1998). The model resolution in the outer loops (and in forecasts) is T199, with a stretching factor of 3.5 and 31 vertical levels.

Noise is controlled inside the minimization process through a weak constraint based on a incremental digital filtering (Gauthier and Thépaut, 2000). A semi-external digital filtering is added to the system.

The last minimization (at truncation T95) is performed using diabatic tangent-linear and adjoint models containing simplified (regular) physical parameterizations : linear vertical diffusion, gravity wave drag, large-scale precipitations, radiation and deep convection (Janiskova et al., 1999).

This system configuration has been validated on several periods of parallel runs and the results were compared to the operational 3d-var system. However, on the Sahara area, the scores were bad, especially those concerning humidity field. This leads to a great overestimation of precipitation parameter over this zone.

Our investigations concerns the system described hereabove. The experiments concern a period of 10 days of variational assimilation from June 1st to June 10th, 2000. During this period, which corresponds to the beginning of operational 4d-var assimilation at Météo-France, frequent abnormal cases of precipitation overestimation over Sahara were observed.

In the second section, a brief description of the problem will be presented, followed by a description of the first solution tried to weaken it : the so-called semi-external digital-filter initialization. The third section will be dedicated to the investigations made to show the impact of the simplified physics introduced in the last inner loop. The fourth section will discuss the impact of a multi-truncation against a mono-truncation incremental approach. The problem will then be diagnosed in section 5 by studying some conceptual deficiencies in the derivation of TL/AD dynamics which may cause, among others, a bad behaviour of humidity increments.

2. Precipitation overestimation by ARPEGE forecast

Specific humidity, which is one of the 4d-var control variable, shows large positive analysis increments in the lowest model levels, over some subtropical land areas : Saudi Arabia, Sahara, Mexico, southern United States, ... (Fig. 1). Theses areas are characterized by high temperatures and dry conditions. Moistening could be as high as 10 g/Kg, and then it causes more and more humid subsequent forecast.

./R_Ajjaji_I_Fig1.gif

Figure 1 : 24-hours convective precipitation given by an ARPEGE forecast based on a 4d-var analysis.
Precipitations localized at [20°N, 5°E] are completely fictitious.

The main observational data used by humidity analysis are TEMP specific humidity, SYNOP 2 m relative humidity and TOVS radiances in channels HIRS-10, 11 and 12. Several other TOVS channels also have a slight influence on humidity. There may also be a weak influence on the humidity analysis from surface pressure data and radiosonde geopotential data through, at least, virtual temperature effects.

As a first solution suggested and applied to cure this wrong behaviour quickly and easily, a "semi-external" digital filtering (based on finalization) was applied. It is added to the system with almost no extra cost, as it uses model integrations which have to be performed anyway during 4d-var : observation screening, computation of the last model trajectory at the end of the minimization. As it is shown on Fig. 2, this solution has no great effect on our problem.

In an assimilation system like the operational one at Météo-France, multiple ingredients could be at the origin of a bad humidity analysis : simplified physics, multiple-truncation incremental approach, incremental approach itself, something wrong in the conception of TL/AD models, bad use of some observation types, use of incoherent background specific humidity structure functions. They were examined afterwards.

./R_Ajjaji_I_Fig2.gif

Figure 2 : 24-h convective precipitations amounts averaged over Sahara [15°N, 35°N]×[20°W, 40°E] for operational context [VAR], operational context with semi-external DFI [DDD], adiabatic context [ADB]. Forecasts based on 3d-var are taken as references.

3. Impact of simplified physics on humidity analysis over Sahara

A set of simplified and regular physical parameterizations as well as its tangent-linear and adjoint versions has been implemented in the ARPEGE data assimilation system. The main objectives are not only to be realistic enough, but mainly to be simple and regular for the efficiency of minimization in 4d-var. The package contains a simplified computation of radiative fluxes, vertical turbulent diffusion, orographic gravity waves, deep convection and stratiform precipitation fluxes (Janiskova et al., 1999).

In the operational version, this physics package is switched on only during the last inner loop, at resolution T95. At this truncation, the resolution is considered to be sufficiently high so that an inclusion of some physics becomes necessary. But this could have a bad impact on the humidity increments computed during the two first adiabatic inner loops. To investigate this aspect, two experiments were performed.

3.1 Applying simplified physics in all inner loops

In principle, this should produce more consistent increments, the impact of simplified physics affecting the increments since the beginning of the minimization, at truncation T42. But no significant effect on humidity increments can be observed, and the forecast scores keep unchanged. Such a study was already done by (Janiskova et al., 1998) where more investigations were carried out and extended to all other fields. Here the study focuses on humidity and temperature fields.

The shape of the cost-function decrease confirms the results obtained by (Thépaut et al., 1999). The introduction of simplified physics in all inner loops causes a less convergence ratio at the same number of iterations.

A single-observation experiment (one individual observation of specific humidity at 850 hPa is introduced somewhere over Hoggar mountains) shows a little difference, for humidity and temperature increments, between the experimental and operational configurations. Humidity increments look more noisy at resolutions T42 and T63, which reflects the impact of physics.

So, as a conclusion, we can say definitely that applying simplified physics only in the third inner loop is consistent and not responsible for humidity troubles.

3.2 No physics in the assimilation system : adiabatic inner loops

The aim of this experiment is to quantify the impact of simplified physics on the humidity increments. This is the only experiment that will confirm or not an eventual responsibility of simplified physics.

The scores and all the diagnostics presented on Figs. 3a-b show that the problem of big positive humidity increments keeps unchanged in the adiabatic assimilation experiment. One can also notice the quasi-neutral effect of the simplified physics on the forecast quality over this Sahara region. This is consistent with (Thépaut et al., 1999) where this neutral effect was also shown for all the tropical and subtropical regions.


Figure 3a : Specific humidity averaged in space on Sahara [15°N, 35°N]×[20°W, 40°E] in a 10-days adiabatic 4d-var suite	Figure 3b : Same as Fig. 3a, but for the operational context

4. Impact of minimization truncation on the humidity-increment field

Two assimilation configurations are studied : mono-truncation incremental and non incremental experiments are compared to the operational multiple-truncation incremental one. The simplified physics is still introduced in the last inner loop, in all cases.

4.1 Mono-truncation incremental case

The minimization truncation is the same (T95) during all the three inner loops; 25 iterations are allowed for each inner-loop minimization. This configuration is then rather similar to a simple incremental configuration with 75 iterations. But it is not exactly the same because simplified physics is not switched on since the beginning.

The results are shown on Figs. 4a-b. One can notice that a little improvement is obtained : the humidity-increments mean profile shows smaller values near the surface, i.e. along the layers between 1000 hPa and 700 hPa. But the increments still not realistic compared to those resulting from 3d-var assimilation, and the forecast scores are still very bad.


Figure 4a : Relative-humidity zonal mean profiles averaged over a 10 days 4d-var suite in a mono-truncation context. The corresponding 3d-var experiment is taken as reference.	Figure 4b : Same as Fig. 4a, but in the operational (multi-truncation) context

On the single-observation experiments (Fig. 5a) the isolines for specific-humidity increments, during the two first inner loops, seem sparser and with a greater magnitude compared to the operational case (differences are shown in Fig. 5b). The second and third inner loops show no significant difference compared to the first inner loop. This could be explained by the fact that the convergence is quickly reached during the first inner loop.

4.2 Impact of incrementality

The main practical problem to be solved for an operational implementation of 4d-var is the reduction of the great computational resources needed. One important approach to achieve this goal is the incremental formulation (Courtier et al., 1994). This formulation allows the minimization to be performed at lower resolution and with a simplified linear model whereas the atmospheric state remains transported by the nonlinear direct diabatic model.

The direct nonlinear model used at high resolution in the so-called outer loops is the T199 ARPEGE one. An outer loop consists mainly in calculating observation departures at high resolution in an environment of realistic physics and dynamics. This information is then used as a forcing ingredient during the inner loops to keep the integrations involved along the minimizations close to the atmospheric "reality".

./R_Ajjaji_I_Fig5a.gif

Figure 5a : Increments obtained in a single-observation experiment for specific and relative humidity during the three 4d-var minimization updates. The same truncation is used for all the inner loops.

./R_Ajjaji_I_Fig5b.gif

Figure 5b : Difference between the increments described in Fig. 5a and those obtained in an operational context.

The tangent-linear and adjoint models used during the minimizations are in fact the direct consequences of the incremental technique. They are introduced to approximate a real, more complex and heavier, calculation of the cost function and its gradient. The former becomes quadratic and thus the convergence of the algorithms used for minimization is guaranteed.

As a consequence, non-incremental experiments consist, in fact, in using the same resolution for inner and outer loops, and especially in minimizing the complete cost function of 4d-var, which requires the adjoint of the full nonlinear model. This is, of course, very difficult to achieve.

To simulate a non-incremental experiment in the context of our studies, we just forced the same horizontal resolution T95 both in outer and inner loops. The tangent-linear model and its adjoint are kept during minimizations. The forecasts are done at truncation T95.

The results show non significant changes. The phenomena remain non sensitive to this handling. But this doesn't mean that incremental technique is not responsible of the anomaly. It is perhaps indirectly involved in it, at least by the use of TL/AD approximations of the forecast model and observation operators. This will be discussed in the next section.

5. TL/AD formulation aberration

Thanks to François Bouttier (personal communication), sensitivity experiments, executed without specific humidity in the cost function, revealed the presence of a non-zero gradient with respect to this variable. So it turns out that there is something annoying in the TL/AD formulation of the dynamics. The specific-humidity increments are caused by the linearization of three terms in the dynamics:

./R_Ajjaji_I_Eq1.gif

Indices "v" and "d" stand for "vapour" and "dry" respectively. R is the air gas-constant, R_v the vapour one, R_d the dry air one. Cp, Cp _v, and Cp_d are the specific heats at constant pressure for air, water vapour and dry air respectively.

Experiments done by F. Bouttier also revealed that more than 90% of the humidity increments are caused by the adjoint of the third equation hereabove.

This fact happens, for example, in Sahara region because, apparently, there is a combination of high temperatures and large horizontal gradients of RT on the model surfaces. One must keep in mind that this is not due to an incorrect adjoint formulation : this is a feature of the dynamical equations. It seems that a solution of the problem could be the introduction of some extra-terms into the TL/AD models, to prevent large local derivatives. The problem is probably not limited to the humidity field.

As a first confirmation of this investigation, an experiment consisting in forcing R_v by R _d along the 4d-var minimization is performed. There should be a noticeable impact on increments since :

./R_Ajjaji_I_Eq2.gif

As it can be seen on the figures below, the problem disappears completely. But putting "R_v=R_d" is not the solution of the problem, it is just a mean to confirm the importance of local derivatives in the TL/AD part. More investigations are still be done to find a scientific reasonable cure.

From Figs. 6, 7 and 8, one can notice that, when compared to 3d-var, the experiment with R_v forced to R_d is the better one, indicating that the formulation of the TL/AD of dynamics still needs some modifications to work correctly.

This abnormal feature was present within the previous ARPEGE/IFS 3d-var assimilation. The conclusion at that time was that there was a problem with the specification of humidity background-errors in hot and dry conditions (Rabier et al. 1998). But this was significantly amplified by the change to 4d-var, from the very beginning, and some other problems appeared more recently, such as the April 24th, 2001, case (Figs. 9a-b). At 18 h UTC, the 4d-var assimilation introduced a strong drying over a large area, from Spain to Germany, not justified at all by observations. Six hours later, the reverse correction was introduced, going back to reasonable fields.

./R_Ajjaji_I_Fig6.gif

Figure 6 : Charts of convective precipitations (00 + 24 h) over Sahara, for the period between 01/06/2000 and 10/06/2000. Right column: operational suite. Left column: 3d-var reference. Middle column: results when forcing R_v to R_d in 4d-var.

./R_Ajjaji_I_Fig7.gif

Figure 7 : Zonal means of specific humidity (right), relative humidity (middle) and temperature (left). From top to bottom : operational 4d-var against 3d-var, adiabatic 4d-var against 3d-var, mono-truncation incremental 4d-var against 3d-var, experiment "R _v=R_d" against 3d-var.

./R_Ajjaji_I_Fig8.gif

Figure 8 : The same as Fig. 7, but for zonal means averaged on standard levels over a 10-days assimilation experiment


Figure 9a : 850 hPa specific humidity; first guess, 18 UTC	Figure 9b : Specific-humidity increments at 850 hPa.

On the April 24th, 2001, case, a detailed investigation of the impact of observations led to the following remarks :

No TEMP is available, within 300 km of the maximum increment, along the 6 hours assimilation window (missing observation at that time);

The anomaly is suppressed by adding an artificial TEMP observation (derived from the forecast) (Fig. 10)

Suppressing AIREPs and/or SYNOPs suppresses the anomaly.

2 AIREPs around Baléares are likely to be at the origin of most of the problem;

The problem on q is replaced by a problem on T, if suppressing the temperature observation in these 2 AIREPs.

The impact of simplified physics, multi-truncation, ... is neutral.

Applying the "R_v=R_d" trick suppresses the anomaly entirely.

Sensitivity experiments, executed with no specific humidity in the cost function, reveal the presence of a non-zero gradient with respect to the specific humidity variable (Fig. 11)

Some other tests were performed, such as : "R _v=R_d " trick, "R_v=R_d" trick but not in the observation operators, impact of horizontal diffusion, etc ... The evolution of the problem along all the steps of 4d-var was carefully examined. This led to the following diagnostic :

The multivariate assimilation tries to fit observations by adjusting q first, even with very strong increments, especially when there are very few observations of humidity to constraint it. This happens since the very first integrations of the TL and AD models, as shown on Fig. 11, at the lowest resolution, and increases regularly along the minimization steps.

./R_Ajjaji_I_Fig10.gif

Figure 10 : Difference between the operational and experimental cases for tangent-linear evolution of T42 analysis increments along the assimilation window (6 hours). The experimental case consists in a simulated TEMP observation introduced at the vicinity of the anomaly. The anomaly disappearance gives evidence about the importance of humidity observations.

./R_Ajjaji_I_Fig11.gif

Figure 11 : Cost-function gradient with respect to specific humidity, computed at the beginning of the assimilation window at the end of each iteration during the first (T42) inner loop. Specific humidity is not included in the cost function.

These multivariate aspects of 4d-var could be illustrated in a simple context as follows.

If we suppose that the nonlinear direct prognostic model equations read:

./R_Ajjaji_I_Eq3.gif

Then, the associated tangent-linear model in the vicinity of a given model state reads :

./R_Ajjaji_I_Eq4.gif

Index "b" indicates basic-state fields, in the vicinity of which tangent-linear calculations are done. The primed variables indicate the perturbations transported by the tangent-linear model. U, V, u, v indicate wind variables, T temperature, q specific humidity, P and p pressure.

If we consider only the important terms related to R _v and R_d, these equations could then read:

./R_Ajjaji_I_Eq5.gif

The adjoint equation for specific humidity could then read:

./R_Ajjaji_I_Eq6.gif

In this adjoint equation, it is clear that specific humidity ( q) will be modified to fit wind and temperature observations, unless R_v and R_d are kept identical. The suggested trick of setting artificially R_v to R_d in the TL and AD models was very efficient in reducing problems, as illustrated by the above figures. But that is not a long-term solution, since not scientifically based.

Setting "R_v=R_d" everywhere may be not necessary and even detrimental, since :

useless for observation operators,

dangerous in equations for T, because of the balance with ( Cp_v-Cp_d),

q is perhaps not the only control variable concerned.

6. Conclusion and discussion

The incremental approach proposed by Courtier et al. (1994) reduces the computational cost of the original 4d-var algorithm. An approximation of the solution is found iteratively by solving several quadratic cost functions, approximations of the original one obtained by linearization of the model and the observation operators at lower spectral truncations. The solutions at low resolution are used as a correction to update the initial state at high resolution. The constant term of the linearization represents the model equivalent at observation points and is always computed at full resolution. In this way nonlinearities are taken into account indirectly.

At low resolution, the tangent-linear version of the model is used to transport the errors (instantaneous increments) along the assimilation window. This linearized version of the model must take into account not only the dynamic, but the physics as well.

However, physical processes such as condensation, convection, vertical diffusion, gravity wave drag, etc ... , are highly nonlinear and often discontinuous with threshold effects, which may cause serious convergence problems in the minimization. As a consequence, in the incremental 4d-var framework, the errors (increments) are transported by a tangent-linear model including a linearization of a regular simplified physics package.

The operational 4d-var at Météo-France uses a multiple-truncation incremental approach consisting in doing three minimizations at different truncations (T42, T63 and T95, the forecast model being run at resolution T199 with stretching).

Some problems of unrealistic precipitation over Sahara, which were already present in the 3d-var ARPEGE version, are amplified with the 4d-var implementation. This paper tried to understand this problem in the framework of incremental 4d-var.

Several 4d-var assimilation suites were run over a period of 10 days (June 2000) corresponding to overestimated precipitations in the operations. An individual remarkable case (April 24th, 2001) in a temperate area (Spain) is also studied. The main results may be synthesized as follows:

When using a tangent-linear model with and without (adiabatic mode) linearized simplified physics, the problem remains the same. This does not mean, however, that there is nothing to do (perhaps to tune) in the simplified physics.

Applying or not semi-external digital filtering has the same neutral effect.

When performing a multiple-truncation incremental 4d-var in which the three resolutions are the same and equal to the higher one (T95), 24-h cumulated precipitations over Sahara area are reduced by a non negligible factor.

Sensitivity experiments, executed with no specific humidity in the cost function, revealed the presence of a non-zero gradient with respect to specific humidity.

As a first conclusion, we can suppose that the multi-incremental approach has a relatively bad impact over tropical areas. In the concept of this approach, a lot of combined factors are present and could explain this fact :

The hypothesis saying that large scales force the shorter ones could be not valid at the vicinity of the tropics.

There is a large difference between the resolution of the direct full model, used to compute the model equivalent observations, and the minimization resolutions.

Some physical processes (related to humidity) are not taken into account in the tangent linear model. Theses processes, if present, could compensate some bad features of the multivariate implicit aspect of 4d-var.

There exist situations where the 4d-var minimization finds it easier to fit height (or wind) data through a change of specific humidity than through a change of temperature, even if specific humidity change has to be substantial in magnitude to achieve this goal.

To avoid this bad 4d-var behaviour, several solutions may be suggested :

Enriching observation information by more and more humidity data. This will constraint the implicit adjustment done during minimizations to be more consistent. The humidity increments could then remain close to the reality.

Background structure-functions must be tuned so that more reliability still given to the guess information, at least over poor regions in term of observations.

Using more complete regular physics for TL/AD models. This will compensate certainly the bad features of multivariate implicit aspect.

Specific humidity increments could be constrained to remain realistic by imposing analytic or statistic relationships between errors for humidity and for the other variables.

Using artificial compensating extra terms to weaken the gradients calculated with respect to specific humidity, during the minimization process.

Taking into account second order terms neglected in the TL/AD formulation. Theses terms could be non negligible in a context when TL/AD are used on longer assimilation windows (Tanguay et al., 1996), and especially when the atmospheric situation is highly baroclinic.

References.

Courtier, P., J.-N. Thépaut, and A. Hollingsworth, 1994 : A strategy for operational implementation of 4D-Var, using an incremental approach. Quart. J. Roy. Meteor. Soc., 120, 1367-1387.

Gauthier, P. and J.-N. Thépaut, 2001 : Impact of the digital filter as a weak constraint in the preoperational 4D-Var assimilation system of Météo-France. Mon. Wea. Rev., 129, 2089-2102.

Janiskova, M., J.-N. Thépaut and J.-F. Geleyn, 1999 : Simplified and regular physical parameterisations for incremental four-dimensional variational assimilation. Mon. Wea. Rev., 127 , 26-45.

Jarvinen, H. , J.-N. Thépaut, P. Courtier, 1996 : Quasi-continuous variational data assimilation. Quart. J. Roy. Meteor. Soc. , 122, 515-534.

Tanguay, M., P. Bartello and P. Gauthier, 1996 : Four-dimensional data assimilation with a wide range of scales. Tellus , 47A, 974-997.

Thépaut, J.N., P. Caille, V. Cassé, J.-F. Geleyn, P. Hautie, M. Janiskova, P. Moll, J. Pailleux, F. Taillefer, F. Veersé, 1997 : 4D-Var developments at Météo-France. In Proceedings of ECMWF workshop on non-linear aspects of data assimilation . ECMWF, Reading, 9-11 September 1996, 469-491.

Rabier, F., E. Andersson, J. Haseler, P. Unden, P. Courtier, G. Kelly, D. Vasiljevic, C. Brankovic, C. Cardinali, C. Gaffard, A. Hollingsworth, C. Jakob, P. Janssen, E. Klinker, A. Lanzinger, M. Miller, A. Simmons, B. Strauss, J.-N. Thépaut, P. Viterbo, 1998: The ECMWF implementation of three-dimensional variational assimilation (3D-Var). III: Experimental results. Quart. J. Roy. Meteor. Soc., 124, 1831-1860.

Veersé, F. and J.-N. Thépaut, 1998 : Multiple-truncation incremental approach for four-dimensional variational data assimilation. Quart. J. Roy. Meteor. Soc., 124, 1889-1908.