The degree of freedom for signal assessment of measurement networks for joint chemical state and emission analysis

The Degree of Freedom for Signal (DFS) is generalized and applied to estimate the potential observability of observation networks for augmented model state and parameter estimation. The control of predictive geophysical model systems by measurements depends on a sufficient observational basis. Control parameters may include prognostic state variables, mostly the initial values, and insufficiently known model parameters to which the simulation is sensitive. For chemistry-transport models, emission rates are at least as important as initial values for controlling the model evolution. An extended optimisation parameter set must be matched by an observation network that allows the entire optimisation task to be controlled. In this paper, we introduce a DFS-based approach that addresses the observability of both emission rates and initial values. By applying a Kalman smoother, a quantitative method for assessing the efficiency of observation configurations is developed, based on the singular value decomposition. For practical reasons, an ensemble-based version is derived for covariance modelling. The observability analysis tool can be generalized to additional model parameters.


Introduction
Air quality and climate change are influenced by the fluxes of greenhouse gases, reactive gas emissions and aerosols. The temporal evolution of chemistry in the atmosphere is usually modelled by atmospheric chemistry transport models. Poorly known initial values, sources and sinks are a serious problem for the quality of simulations, which can be addressed by data assimilation and inverse modelling. Parameter misspecifications in a model can only be identified within the data assimilation intervals of space-time methods if the simulation is sufficiently sensitive and the error-related observability of the measurement network is given. This poses the observability problem. Otherwise, the forecast degrades beyond the observation-controlled period.
In practice, data assimilation problems are typically solved in circumstances where the number of observations is markedly lower than the number of model degrees of freedom (Daley 1991). Consequently, when aiming to improve the quality of the analysis through observation configurations, several aspects can be considered. These include (i) optimizing the observation network, subject to given constraints, (ii) evaluating the value of individual observations or observation types for the analyses, and (iii) quantifying the degree to which the analysis can be influenced by the observations, which is related to the sensitivity.
The observation network optimization problem (i) has traditionally been addressed by Observation System Simulation Experiments (OSSEs, e.g. Daley 1991). The advanced concept of targeted observations was popularized with the FASTEX campaign (e.g. Szunyogh et al. 1999; Langland et al. 1999). Theoretical studies are presented, for example, by Bishop et al. (1999) and Berliner et al. (1998), or more recently by Bellsky et al. (2014) for highly nonlinear dynamics, and by Wu et al. (2016) for the optimal locations of observations for time-varying systems within a finite-time interval.
The benefit assessment (ii) of individual observations, types of measurements or remote sensing data within a given network seeks to rank the information sources by their value for the analysis achieved. This problem has been investigated by Cardinali et al. (2004), Cardinali (2009) and a sequence of related papers, and by Liu and Kalnay (2008), the latter without use of an adjoint model. A related approach was described by Baker and Daley (2000), who exploited sensitivities to observations to identify the spatial extent of their impact.
Finally, the need to quantify the information content provided by the observations (iii) can be satisfied by suitable and calculable measures, such as entropy reduction or DFS. The concept of DFS has been applied to satellite retrieval problems, typically of lower dimension than data assimilation problems (see for example Eyre 1990; Rodgers 2000; Rabier et al. 2002; Fourrié et al. 2003; Fisher 2003; Martynenko et al. 2010). However, these studies focused on the classical data assimilation problem with initial values or prognostic state variables as the only parameters to be optimized. Yet for chemistry transport or greenhouse gas models, which depend strongly on the emissions in the troposphere, the optimization of the initial state is no longer the only issue. Rather, the optimization of emissions plays an equally important role. In order to obtain a better analysis from combining the model with observations, efforts towards joint optimization, adding the emission rates to the concentrations, have been made (Elbern et al. 2000, 2007; Bocquet 2012; Bocquet and Sakov 2013; Miyazaki et al. 2012; Tang et al. 2011, 2013; Winiarek et al. 2014). Yet the inability to observe and estimate surface emission fluxes directly is a major roadblock, hampering progress in the predictive skill of climate and atmospheric chemistry models. Therefore, the capacity to distinguish between the degrees of freedom for signal of emission rates and of concentrations is crucial for assessing the value of a measurement network. These assessment results depend not only on the observation network and its deployment with respect to emission sources, but also on the assimilation window length and the meteorological transport conditions.
In this context, a by now classical task is greenhouse gas inversion, aiming at the estimation of carbon dioxide, methane and nitrous oxide sources, from which a rich literature has emerged (e.g. Peter et al. 2005). The sensitivity of the model evolution with respect to the model errors, with the observation network as its detector, is a key quantity to be analysed. Several methodologies have been proposed to account for model errors in both variational and ensemble data assimilation (e.g. Bellsky et al. 2014; Gillijns and De Moor 2007; Li et al. 2009; Smith et al. 2013; Tremolet 2007; Daescu 2004; Zupanski 2007). Navon (1997) outlined the perceptibility and stability in optimal parameter estimation in meteorology and oceanography. Elbern et al. (2000, 2007) imposed a strong constraint through a given diurnal profile shape of the emission rates, such that their amplitudes, in addition to the initial values, are the only parameters to be optimized by 4D-variational inversion. A general framework to optimize a set of parameters controlling the 4D-Var data assimilation system was introduced by Cioaca and Sandu (2014) and applied to shallow model state and other parameters in a related paper (Cioaca and Sandu 2014).
Singular value decomposition (SVD) is a well-known tool for identifying the priorities of observations by detecting the fastest growing uncertainties in meteorological models (e.g. Lorenz 1965; Buizza and Palmer 1995; Khattatov et al. 1999; Johnson 2003; Liao et al. 2006; Daescu 2008; Abida and Bocquet 2009; Kang and Xu 2012; Singh et al. 2012, 2013; Sandu et al. 2013). However, singular vector analysis and other methods for atmospheric chemistry with emissions are different, since emissions play a role in forecast accuracy that is as important as that of the initial values. Goris and Elbern (2013) used the singular vector decomposition to determine the sensitivity of the chemical composition to emissions and initial values for a variety of typical chemical scenarios and integration lengths. This methodology has been generalized for the 3-dimensional EURAD-IM (European Air pollution Dispersion-Inverse Model) and applied to a field campaign with airship-borne measurements by Goris and Elbern (2015).
In this paper we introduce the practical implementation of this approach to identify the impact of observation networks on controlling transport-diffusion models, with the Kalman smoother as the appropriate data assimilation method. The focus is placed on presenting an approach to identify the spatially resolved potential and limits of measurement networks to optimize chemical states and emission rates jointly, by comparing their relative sensitivities. In section 2, we describe an atmospheric transport model extended with emission rates by establishing the dynamic model for emission rates in a novel way. In section 3, based on the Kalman smoother, we derive the theoretical approach to determine the degree of freedom for signal for both initial values and emission rates. In section 4, we develop the ensemble approach to evaluate the degree of freedom for signal of the model. In section 5, we present the approach to identify the sensitivity of observations by determining the directions of maximum perturbation growth with respect to the initial perturbation. In section 6, we extend a 3D advection-diffusion equation with the dynamic model of the emission rate and give several elementary experiments to verify the approaches. In section 7, we summarize the main contributions of this paper and discuss possible extensions.

Atmospheric inverse modelling extended by emission rates
We usually describe the concentration change rate by the following prognostic atmospheric transport model
$$\frac{\mathrm{d}c}{\mathrm{d}t} = \mathcal{A}(c) + e(t), \tag{1}$$
where $\mathcal{A}$ is a nonlinear model operator, and $c(t)$ and $e(t)$ are the state vectors of chemical constituents and emission rates at time $t$, respectively.
The prior estimate of the state vector of concentrations $c(t)$ is given and denoted by $c_b(t)$, termed the background state.
The prior estimate of the emission rates, usually taken from emission inventories, is denoted by $e_b(t)$.
Let $A$ be the tangent linear operator of the nonlinear operator $\mathcal{A}$, $\delta c(t_0) = c(t_0) - c_b(t_0)$ and $\delta e(t) = e(t) - e_b(t)$. The linear evolution of the perturbation of $c(t)$ follows the tangent linear model
$$\frac{\mathrm{d}\,\delta c}{\mathrm{d}t} = A\,\delta c + \delta e(t). \tag{2}$$
By discretizing the tangent linear model in space, it is straightforward to obtain the linear solution of (2), discretized in space and continuous in time, as
$$\delta c(t) = M(t, t_0)\,\delta c(t_0) + \int_{t_0}^{t} M(t, s)\,\delta e(s)\,\mathrm{d}s, \tag{3}$$
where $M(\cdot,\cdot)$ is the resolvent obtained from the spatial discretization of $A$. Without loss of generality, we assume $\delta c(t) \in \mathbb{R}^n$ and $\delta e(t) \in \mathbb{R}^n$, where $n$ is the dimension of the partial phase space of concentrations and emission rates. Obviously, $M(\cdot,\cdot) \in \mathbb{R}^{n\times n}$.
In addition, let $y(t)$ be the observation vector of $c(t)$ and define
$$\delta y(t) = y(t) - \mathcal{H}(t)\big(c_b(t)\big), \tag{4}$$
where $\delta y(t) \in \mathbb{R}^{m(t)}$ and $m(t)$ is the dimension of the phase space of observation configurations at time $t$. $\mathcal{H}(t)$ is a nonlinear forward observation operator mapping the model space to the observation space. Then, by linearizing the nonlinear operator $\mathcal{H}$ as $H$, we present the observation system as
$$\delta y(t) = H(t)\,\delta c(t) + \nu(t), \tag{5}$$
where the Gaussian observation error $\nu(t)$ has zero mean and covariance $R(t) \in \mathbb{R}^{m(t)\times m(t)}$.
The Kalman smoother is a recursive estimator that provides the best linear unbiased estimates (BLUE) of the unknown variables together with error estimates, using a sequence of observations (e.g. Gelb 1974). Compared with 4D-Var approaches, Kalman smoothers not only provide the best linear unbiased estimate of the state vector from a series of observations over time, but also update the forecast error covariances of that estimate. In this paper our problems will be treated by the Kalman smoother from a theoretical viewpoint.
If the initial state of the concentrations is the only parameter to be optimized, it is feasible to apply the Kalman filter and smoother to the tangent linear model (2) with observations (5) within the time interval $[t_0, t_N]$. However, as mentioned before, in most cases the exact values of the emission rates are poorly known. It has been shown by Elbern et al. (2007) that the diurnal profiles of emission rates are better known and hence can be considered as constraints, such that the amplitudes of the diurnal emission cycle can be taken as optimization parameters. Thus, we first reformulate the background evolution of the emission rates from time $s$ to time $t$ as
$$e_b(t) = M_e(t, s)\, e_b(s),$$
where $e_b(\cdot)$ is an $n$-dimensional vector whose $i$-th element is denoted by $e_b^i(\cdot)$, and $M_e(t, s)$ is the scaling diagonal matrix defined as
$$M_e(t, s) = \mathrm{diag}\!\left(\frac{e_b^1(t)}{e_b^1(s)}, \dots, \frac{e_b^n(t)}{e_b^n(s)}\right).$$
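The role of the scaling matrix $M_e(t, s)$ can be illustrated with a few lines of NumPy. This is only a sketch under the assumption that the diagonal entries are the ratios $e_b^i(t)/e_b^i(s)$ of background emissions; the function name `emission_scaling` and the numerical values are hypothetical and not taken from the paper.

```python
import numpy as np

def emission_scaling(e_b_t, e_b_s):
    """Diagonal matrix M_e(t, s) with assumed entries e_b^i(t) / e_b^i(s)."""
    e_b_t = np.asarray(e_b_t, dtype=float)
    e_b_s = np.asarray(e_b_s, dtype=float)
    if np.any(e_b_s == 0.0):
        raise ValueError("background emission at time s must be nonzero")
    return np.diag(e_b_t / e_b_s)

# Hypothetical background emissions at three times r < s < t for n = 3 grid points.
e_b_r = np.array([1.0, 2.0, 4.0])
e_b_s = np.array([2.0, 2.0, 1.0])
e_b_t = np.array([3.0, 1.0, 2.0])

M_ts = emission_scaling(e_b_t, e_b_s)
M_sr = emission_scaling(e_b_s, e_b_r)
M_tr = emission_scaling(e_b_t, e_b_r)
```

With this choice the matrices propagate the background profile, $M_e(t,s)\,e_b(s) = e_b(t)$, and they compose like a resolvent, $M_e(t,s)\,M_e(s,r) = M_e(t,r)$.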
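In this work we establish the dynamic model of the emission rates subject to the constraint
$$\delta e(t) = M_e(t, s)\,\delta e(s). \tag{8}$$
Inserting this constraint into the linear solution of (2) yields the homogeneous extended model
$$\begin{pmatrix} \delta c(t) \\ \delta e(t) \end{pmatrix} = \begin{pmatrix} M(t, t_0) & \int_{t_0}^{t} M(t, s)\, M_e(s, t_0)\,\mathrm{d}s \\ 0_{n\times n} & M_e(t, t_0) \end{pmatrix} \begin{pmatrix} \delta c(t_0) \\ \delta e(t_0) \end{pmatrix}. \tag{10}$$
Typically, there are no direct observations of emissions, apart from the flux tower observations used for carbon dioxide, which are not considered here. Therefore, we can reformulate the observation mapping as
$$\delta y(t) = \begin{pmatrix} H(t) & 0_{m(t)\times n} \end{pmatrix} \begin{pmatrix} \delta c(t) \\ \delta e(t) \end{pmatrix} + \nu(t),$$
where $0_{n\times n}$ is an $n \times n$ matrix with zero elements, and $0_{m(t)\times n}$ is defined accordingly.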
It is now clear that both concentrations and emission rates are included in the state vector of the homogeneous model (10). This allows us to apply the Kalman smoother in a fixed time interval $[t_0, t_N]$ in order to optimize both parameters. In addition, a more general case of the transport model extended by emissions is shown in Appendix A.
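To make the augmented state concrete, the following NumPy sketch assembles a one-step transition matrix and an observation operator for the stacked vector $(\delta c, \delta e)$. It is not the paper's discretization: it assumes, purely for illustration, an explicit forcing $\delta c_{k+1} = M\,\delta c_k + \Delta t\,\delta e_k$ and $\delta e_{k+1} = M_e\,\delta e_k$, and the names `extended_step` and `extended_observation` are hypothetical.

```python
import numpy as np

def extended_step(M, M_e, dt):
    """One-step transition of the stacked state (dc, de).

    Assumed illustrative discretization:
        dc_{k+1} = M dc_k + dt * de_k
        de_{k+1} = M_e de_k
    """
    n = M.shape[0]
    top = np.hstack([M, dt * np.eye(n)])
    bottom = np.hstack([np.zeros((n, n)), M_e])
    return np.vstack([top, bottom])

def extended_observation(H):
    """Observation operator that sees concentrations only: (H, 0)."""
    m, n = H.shape
    return np.hstack([H, np.zeros((m, n))])

# Toy example with n = 2 grid points and a single observed concentration.
M = np.array([[0.9, 0.1], [0.0, 0.9]])
M_e = np.diag([1.1, 0.8])
H = np.array([[1.0, 0.0]])

A_ext = extended_step(M, M_e, dt=0.5)
H_ext = extended_observation(H)

x0 = np.array([1.0, 2.0, 0.4, 0.2])   # (dc_1, dc_2, de_1, de_2)
x1 = A_ext @ x0
y1 = H_ext @ x1
```

The zero block in `H_ext` expresses that emissions are never observed directly; they influence the observations only through the coupling block in `A_ext`.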
The degree of freedom for signal of concentrations and emissions
In this section we introduce the theoretical approach to determine the DFS of concentrations and emissions, resting on the extended model in Section 2. This approach allows us to determine the efficiency of observations in optimizing each variable, based on the Kalman smoother within a finite-time interval.
For convenience, we generalize the atmospheric transport model (10) by the following discrete-time linear system on the time interval $[t_0, \dots, t_N]$:
$$x(t_{k+1}) = M(t_{k+1}, t_k)\, x(t_k) + \varepsilon(t_k), \qquad y(t_k) = H(t_k)\, x(t_k) + \nu(t_k), \tag{12}$$
where $x(\cdot) \in \mathbb{R}^n$ is the state variable and $y(t_k) \in \mathbb{R}^{m(t_k)}$ is the observation vector at time $t_k$. The Gaussian model error $\varepsilon(t_k)$ and observation error $\nu(t_k)$, $k = 1, \dots, N$, have zero means. The model error covariance matrix is denoted by $Q(t_k)$, while the observation error covariance matrix is denoted by $R(t_k)$.
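We denote the BLUE of $x(t_i)$ based on $\{y(t_0), \dots, y(t_k)\}$ by $\hat{x}(t_i|t_k)$ and the corresponding error covariance matrix by $P(t_i|t_k)$. It is known that the inverse of the analysis error covariance matrix at initial time, $P^{-1}(t_0|t_N)$, of a fixed-interval Kalman smoother is the optimal Hessian of the underlying cost function of 4D-Var (Li and Navon, 2001). Thus, we have
$$P^{-1}(t_0|t_N) = P^{-1}(t_0|t_{-1}) + \sum_{k=0}^{N} M^\top(t_k, t_0)\, H^\top(t_k)\, R^{-1}(t_k)\, H(t_k)\, M(t_k, t_0). \tag{14}$$
It is clear that (14) comprises the information on the initial condition, model evolution, observation configurations and errors over the entire time interval $[t_0, t_N]$. At the same time, it is independent of any specific data and state vector, apart from the reference model evolution $M(\cdot,\cdot)$ needed for the linearization, as well as the observation operator $H(\cdot)$. Indeed, if we define
$$G = \begin{pmatrix} H(t_0)\, M(t_0, t_0) \\ \vdots \\ H(t_N)\, M(t_N, t_0) \end{pmatrix}, \qquad R = \mathrm{diag}\big(R(t_0), \dots, R(t_N)\big),$$
we can rewrite (14) as
$$P^{-1}(t_0|t_N) = P^{-1}(t_0|t_{-1}) + G^\top R^{-1} G, \tag{16}$$
where $G^\top R^{-1} G$ is the observability Gramian with respect to $R^{-1}$ in control theory (Brockett, 1994). It represents the observation capacity of the observation network with respect to the model.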
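As an illustration of how the stacked operator $G$ and the Hessian in (16) might be assembled, consider the following NumPy sketch. It assumes that $G$ stacks the blocks $H(t_k) M(t_k, t_0)$ and that (16) reads $P^{-1}(t_0|t_N) = P^{-1}(t_0|t_{-1}) + G^\top R^{-1} G$; the function name `stacked_observation_matrix` and the toy matrices are hypothetical.

```python
import numpy as np

def stacked_observation_matrix(M_steps, H_steps):
    """Stack H(t_k) M(t_k, t_0) for k = 0..N.

    M_steps[k] is the one-step propagator M(t_{k+1}, t_k);
    H_steps[k] is the observation operator at t_k.
    """
    n = H_steps[0].shape[1]
    resolvent = np.eye(n)            # M(t_0, t_0)
    blocks = []
    for k, H in enumerate(H_steps):
        blocks.append(H @ resolvent)
        if k < len(M_steps):
            resolvent = M_steps[k] @ resolvent   # M(t_{k+1}, t_0)
    return np.vstack(blocks)

# Toy system: two states, first state observed at three times.
M = np.array([[1.0, 1.0], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
G = stacked_observation_matrix([M, M], [H, H, H])

B = np.eye(2)               # background covariance P(t_0|t_-1)
R = 0.5 * np.eye(3)         # stacked observation error covariance

gramian = G.T @ np.linalg.inv(R) @ G
hessian = np.linalg.inv(B) + gramian       # assumed form of Eq. (16)
P_analysis = np.linalg.inv(hessian)        # P(t_0|t_N)
```

Note that neither `G` nor `hessian` depends on any observed values, which is what makes the quantity usable before the assimilation is run.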
Though (16) meets the demand to represent the estimate covariance by all information available before starting the data assimilation procedure, it cannot be applied directly to evaluate the potential improvement of the estimate by the Kalman smoother, because the inverse of a covariance matrix lacks a clear statistical significance. We seek a matrix that allows a direct and normalized comparison between the sensitivities to initial values and to emission rates. To this end, we consider the matrix $\tilde{P}$ of the following form:
$$\tilde{P} = P^{-\frac12}(t_0|t_{-1}) \big( P(t_0|t_{-1}) - P(t_0|t_N) \big) P^{-\frac12}(t_0|t_{-1}) = I - P^{-\frac12}(t_0|t_{-1})\, P(t_0|t_N)\, P^{-\frac12}(t_0|t_{-1}), \tag{17}$$
where $I$ is the identity matrix and $P^{-\frac12}(t_0|t_{-1})$ is the inverse of the symmetric square root of $P(t_0|t_{-1})$. The matrix $\tilde{P}$ is a normalized form of the difference between the background forecast error covariance matrix $P(t_0|t_{-1})$ and the analysis error covariance matrix $P(t_0|t_N)$, as inferred from the Kalman smoother. It shows how much the observation network improves the estimation of the model states and is the fundamental matrix for studying the DFS of models (Fisher 2003; Rodgers 2000; Singh et al. 2013).
Since $P(t_0|t_N)$ is unknown prior to the data assimilation procedure, we use (16) to rewrite $\tilde{P}$ as
$$\tilde{P} = I - \Big( I + P^{\frac12}(t_0|t_{-1})\, G^\top R^{-1} G\, P^{\frac12}(t_0|t_{-1}) \Big)^{-1}. \tag{18}$$
It is worth noting that $I + P^{\frac12}(t_0|t_{-1})\, G^\top R^{-1} G\, P^{\frac12}(t_0|t_{-1})$ in (18) is always invertible, even if the observability Gramian $G^\top G$ is not of full rank. Thus, $\tilde{P}$ is well defined for all models with invertible initial covariance and observation systems with invertible error covariances within the assimilation window $t_0$ to $t_N$. Then, we apply the singular value decomposition to simplify (18),
$$P^{\frac12}(t_0|t_{-1})\, G^\top R^{-\frac12} = V S U^\top,$$
where $V$ and $U$ are unitary matrices consisting of the left and right singular vectors, respectively, while $S$ is the rectangular diagonal matrix consisting of the singular values.
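Then, (18) can be simplified as
$$\tilde{P} = I - \big( I + V S S^\top V^\top \big)^{-1} = V \Big( I - \big( I + S S^\top \big)^{-1} \Big) V^\top = \sum_{i=1}^{r} \frac{s_i^2}{1 + s_i^2}\, v_i v_i^\top, \tag{21}$$
where $r$ is the rank of (18) and $v_i$ is the $i$-th left singular vector in $V$ related to the singular value $s_i$, which is the $i$-th element on the diagonal of $S$.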
It is clear that the sum of the diagonal entries of $\tilde{P}$ can be used to evaluate the total improvement of the model states. Thus, the nuclear norm is appropriately taken as the metric, which is defined as
$$\|A\|_1 = \mathrm{tr}\Big( \sqrt{A^\top A} \Big),$$
where $A$ is any matrix and $\mathrm{tr}(\cdot)$ denotes the trace of a matrix.
From (21), we obtain
$$\|\tilde{P}\|_1 = \mathrm{tr}(\tilde{P}) = \sum_{i=1}^{r} \frac{s_i^2}{1 + s_i^2}.$$
This is well known as the degree of freedom for signal (DFS) of the model (e.g. Rodgers 2000).
It is obvious that $\|\tilde{P}\|_1 < \|I\|_1 = n$. Here $n$ can be considered as the total relative improvement if the system is completely observed. Thus, if we consider the ratio
$$\tilde{p} = \frac{\|\tilde{P}\|_1}{n} \in [0, 1), \tag{24}$$
the percentage of the total improvement of the model is obtained, which is called the relative degree of freedom for signal.
In order to gain deeper insight into the capacity of the observation network to improve the estimation of all model states, we consider the corresponding value on the diagonal of $\tilde{P}$ as the contribution to the degree of freedom for signal. Denoting the $j$-th element on the diagonal of $\tilde{P}$ by $\tilde{P}_j$, it follows from (21) that the contribution of the $j$-th element of $x(t_0)$ to the degree of freedom for signal can be expressed as
$$\tilde{P}_j = \sum_{i=1}^{r} \frac{s_i^2}{1 + s_i^2}\, v_{ij}^2,$$
where $v_{ij}$ is the $j$-th element of $v_i$.
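The SVD route to the DFS can be checked numerically against the direct definition of the normalized improvement matrix. The sketch below assumes the decomposition is applied to $P^{1/2}(t_0|t_{-1})\, G^\top R^{-1/2}$ and that the per-state contributions are $\sum_i s_i^2/(1+s_i^2)\, v_{ij}^2$; the helper names `sym_sqrt` and `dfs_from_svd` and the toy matrices are hypothetical.

```python
import numpy as np

def sym_sqrt(A):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, Q = np.linalg.eigh(A)
    return (Q * np.sqrt(w)) @ Q.T

def dfs_from_svd(B, G, R):
    """DFS and per-state contributions from the SVD of B^{1/2} G^T R^{-1/2}."""
    B_half = sym_sqrt(B)
    R_inv_half = np.linalg.inv(sym_sqrt(R))
    V, s, _ = np.linalg.svd(B_half @ G.T @ R_inv_half, full_matrices=False)
    weights = s**2 / (1.0 + s**2)
    contributions = (V**2) @ weights      # diagonal of the improvement matrix
    return weights.sum(), contributions

# Toy background covariance, stacked observation operator and error covariance.
B = np.array([[2.0, 0.5], [0.5, 1.0]])
G = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
R = 0.5 * np.eye(3)

dfs, contributions = dfs_from_svd(B, G, R)

# Direct evaluation: I - B^{-1/2} P_a B^{-1/2} with P_a^{-1} = B^{-1} + G^T R^{-1} G.
P_a = np.linalg.inv(np.linalg.inv(B) + G.T @ np.linalg.inv(R) @ G)
B_inv_half = np.linalg.inv(sym_sqrt(B))
P_tilde = np.eye(2) - B_inv_half @ P_a @ B_inv_half
```

Both routes give the same diagonal and trace; the SVD form has the practical advantage that the analysis covariance never has to be formed explicitly.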
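Besides, we can see that Eqn. (21) can be written in block form as
$$\tilde{P} = \begin{pmatrix} \tilde{P}^{c} & \tilde{P}^{ce} \\ \tilde{P}^{ec} & \tilde{P}^{e} \end{pmatrix} = \sum_{i=1}^{r} \frac{s_i^2}{1 + s_i^2} \begin{pmatrix} v_i^c \\ v_i^e \end{pmatrix} \begin{pmatrix} v_i^c \\ v_i^e \end{pmatrix}^{\!\top},$$
where $v_i$ is partitioned into $v_i^c$ and $v_i^e$ conformably with $c(t_0)$ and $e(t_0)$. Further, the degrees of freedom for signal of the $j$-th elements of $c(t_0)$ and $e(t_0)$ are given by
$$\tilde{P}^c_j = \sum_{i=1}^{r} \frac{s_i^2}{1 + s_i^2}\, (v^c_{ij})^2, \qquad \tilde{P}^e_j = \sum_{i=1}^{r} \frac{s_i^2}{1 + s_i^2}\, (v^e_{ij})^2,$$
where $v^c_{ij}$ and $v^e_{ij}$ are the $j$-th elements of $v^c_i$ and $v^e_i$, respectively. Moreover, the degrees of freedom for signal of the concentrations, $\|\tilde{P}^c\|_1$, and of the emission rates, $\|\tilde{P}^e\|_1$, are calculated by
$$\|\tilde{P}^c\|_1 = \sum_{j=1}^{n} \tilde{P}^c_j, \qquad \|\tilde{P}^e\|_1 = \sum_{j=1}^{n} \tilde{P}^e_j.$$

If there is no prior correlation between the initial concentrations and the emission rates, that is $P^{ce}(t_0|t_{-1}) = 0_{n\times n}$, the corresponding relative degrees of freedom for signal of concentrations and emission rates, $\tilde{p}^c$ and $\tilde{p}^e$, are defined analogously to (24). From (24), it is obvious that $\tilde{p}^c \in [0, 1)$ and $\tilde{p}^e \in [0, 1)$ can be considered as the percentages of the relative improvements of concentrations and emission rates, respectively. However, an efficient observation network may bring both of them close to 1, which indicates that the normalization of $\tilde{P}$ is with respect to the extended covariance matrix only, rather than specific to the state $c$ or the emission rates $e$. The relative degree of freedom for signal therefore cannot serve our objective of distinguishing the observability of concentrations and emission rates. However, by observing the block form of $\tilde{P}$, we have
$$\|\tilde{P}\|_1 = \|\tilde{P}^c\|_1 + \|\tilde{P}^e\|_1.$$
Thus, in order to compare the improvements of the concentrations and emission rates, we define the relative ratios of the degree of freedom for signal for concentrations and emission rates as
$$p^c = \frac{\|\tilde{P}^c\|_1}{\|\tilde{P}\|_1}, \qquad p^e = \frac{\|\tilde{P}^e\|_1}{\|\tilde{P}\|_1}.$$

If the degree or relative degree of freedom for signal of the observation network and assimilation window is almost zero, an improvement cannot be expected. In contrast, $\{\tilde{P}^c_j\}_{j=1}^{n}$ and $\{\tilde{P}^e_j\}_{j=1}^{n}$, which show the improvement of each parameter $j$ of the concentrations and emission rates respectively, help us determine which parameters can be optimized by the existing observation configurations. Furthermore, comparing $p^c$ with $p^e$, we can conclude that the estimate of the one with the larger relative ratio of the degree of freedom for signal can be improved more efficiently by the existing observation configurations than the other.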
In other words, if $p^c > p^e$, the existing observation configurations are more efficient for the initial values of the concentrations.
Conversely, if $p^c < p^e$, the observation configurations improve the estimate of the emission rates more. According to $p^c$ and $p^e$, the "weights" between the concentrations and emission rates can be identified quantitatively. In a data assimilation context, where observations are in a weighted relation to the background, the BLUE favors those parameters with higher observation efficiency.
The special case that $p^e$ is very close to zero implies that the observation network is nearly "blind" for emission rate optimization.
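The comparison of the two "weights" can be sketched in a few lines. The example below assumes that the relative ratios are the shares of the concentration and emission blocks in the total DFS, that is the sums of the respective diagonal entries of the normalized improvement matrix divided by its trace; this reading, the function name `relative_ratios` and the numbers are assumptions made for illustration only.

```python
import numpy as np

def relative_ratios(p_tilde_diag, n):
    """Split the DFS contributions of a stacked state (c, e) into two shares.

    p_tilde_diag holds the 2n diagonal entries of the improvement matrix,
    concentrations first, emission rates second (assumed ordering).
    """
    d = np.asarray(p_tilde_diag, dtype=float)
    total = d.sum()
    if total <= 0.0:
        raise ValueError("total DFS is zero: the network gives no improvement")
    return d[:n].sum() / total, d[n:].sum() / total

# Hypothetical contributions for n = 2: emissions only weakly constrained.
p_c, p_e = relative_ratios([0.6, 0.3, 0.05, 0.05], n=2)

# A network that is 'blind' for emissions: all emission contributions vanish.
p_c_blind, p_e_blind = relative_ratios([0.5, 0.4, 0.0, 0.0], n=2)
```

By construction the two shares sum to one, so a small emission share directly signals that the network mainly constrains the initial values.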
The ensemble approach to determine the DFS
The ensemble Kalman smoother (EnKS), a Monte Carlo implementation derived from the Kalman smoother, is suitable for problems with a large number of control variables and is a frequently applied tool in the field of data assimilation (Evensen, 2009). In this section we introduce the ensemble-based version of the approach in Section 3.
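For the discrete-time system (12), we denote the ensemble samples of $x(t_i|t_j)$ by $x^k(t_i|t_j)$, $k = 1, \dots, q$, where $q$ is the number of ensemble members. Correspondingly, the ensemble mean of $x(t_i|t_j)$ is given by
$$\bar{x}(t_i|t_j) = \frac{1}{q}\, X(t_i|t_j)\, 1_{q\times 1},$$
where $X(t_i|t_j) = \big( x^1(t_i|t_j), \dots, x^q(t_i|t_j) \big)$ is the $n \times q$ ensemble matrix and $1_{i\times j}$ is an $i \times j$ matrix of which each element is equal to 1.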
We calculate the ensemble forecast and analysis covariances as where 1 q×q is the related perturbation matrix. We define the ensemble observation configurations in the entire assimilation window as Further, the ensemble mean and the forecast error covariance matrix of the ensemble observation configurations are given by Similarly, we denote the ensemble covariance between the initial states and the forecast observations by Furthermore, defining the ensemble observations as we assume ν(t

It is shown by Evensen (2009) that the ensemble forecast and analysis covariances have the same form as the covariances in the standard Kalman filter. However, in practical applications the ensemble size $q$ is significantly smaller than the dimension $n$ of the model. As a result, the initial ensemble covariance $P(t_0|t_{-1})$ is not invertible. In this case, the pseudo-inverse is a widely used alternative to the inverse of a matrix, owing to its best-fit property and uniqueness. We denote the pseudo-inverse of a matrix $A$ by $A^\dagger$. Then, for the initial ensemble covariance, we apply the singular value decomposition to where $V_0 \in \mathbb{R}^{n\times n}$ and $U_0 \in \mathbb{R}^{q\times q}$ consist of the left and right singular vectors respectively, and $S_0 \in \mathbb{R}^{n\times q}$ is a rectangular diagonal matrix with singular values $\{s_{0i} \mid s_{0i} \ge 0\}_{i=1}^{q}$ on its diagonal. Thus, where ), $r_0$ is the rank of $S_0$. Hence, we find a pseudo-inverse where $\hat{S}_0^\dagger$ is the pseudo-inverse of $\hat{S}_0$ with the diagonal $(1/s_{01}, \dots, 1/s_{0r_0}, 0_{1\times(n-r_0)})$.

Analogous to (17), we define $\tilde{P}$ as Likewise, corresponding to (13), we present the observation system in the entire time interval as where y = (y (t 0 ), ) and $G$ as the observation configuration for $x(t_0)$. Then, for the analysis error covariance matrix, we obtain Further, analogous to (21), we obtain Let $\sum_{i=1}^{N} m(t_i) = m$ be the number of observations available within the assimilation window. To proceed with (49), we again apply the singular value decomposition to the following matrix where , respectively. $S \in \mathbb{R}^{n\times m}$ consists of the singular values on its diagonal.
We denote the rank of (50) by $r$. Then, we rewrite $\tilde{P}$ as where $r$ is the rank of $\tilde{P}$ and $v_i$ is the $i$-th left singular vector in $V$ related to the singular value $s_i$, which is the $i$-th element on the diagonal of $S$.
We observe that (51) and (21) have a similar form, and their final results are equivalent. However, the ensemble version possesses the clear advantage that the calculation of $P^f_{xy}$ does not require the explicit form of $G$. This allows us to code it line by line, which makes our approach computationally more efficient.
Analogous to the standard case, we define the ensemble degree of freedom for signal (EnDFS) as $\|\tilde{P}\|_1$ and consider each element on the diagonal of $\tilde{P}$ as the contribution to the EnDFS of the corresponding model state.
Since $P(t_0|t_{-1})$ is typically not of full rank, where $I_{r_0}$ is the diagonal matrix with the diagonal $(1_{1\times r_0}, 0_{1\times(n-r_0)})$. It is clear from (48) that $P^{\dagger\frac12}(t_0|t_{-1})\, P(t_0|t_N)\, P^{\dagger\frac12}(t_0|t_{-1})$ is still a nonnegative definite matrix. Thus, the ensemble relative degree of freedom for signal (EnRDFS) is defined by In order to distinguish the improvements for concentrations and emission rates, the ensemble relative ratios of DFS remain

If we further consider the nonlinear dynamic model, we can renew the definition of the forecast observation configurations as such that it follows the nonlinear model, where $\mathcal{G}$ is again a combined nonlinear model-observation operator.
Correspondingly, the ensemble mean $\bar{y}^f$ and $P^f_{xy}$ can be calculated based on (56) with the nonlinear $\mathcal{G}$. Thus, the above ensemble-based approach is also available for nonlinear models.
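A minimal ensemble sketch may help to see why the explicit form of $G$ is not needed. It assumes the improvement matrix $P^{\dagger 1/2} P^f_{xy} (P^f_{yy} + R)^{-1} P^f_{yx} P^{\dagger 1/2}$ with sample covariances normalized by $q - 1$; this normalization, the function name `ensemble_dfs` and the random toy data are assumptions, and the forecast observations are generated by a linear map here only so that the result can be verified.

```python
import numpy as np

def ensemble_dfs(x_ens, y_ens, R):
    """EnDFS from initial-state members x_ens (n x q) and their forecast
    observations y_ens (m x q); no explicit G is required."""
    q = x_ens.shape[1]
    X = (x_ens - x_ens.mean(axis=1, keepdims=True)) / np.sqrt(q - 1)
    Y = (y_ens - y_ens.mean(axis=1, keepdims=True)) / np.sqrt(q - 1)
    P_b = X @ X.T                               # rank-deficient when q < n
    w, Q = np.linalg.eigh(P_b)
    keep = w > 1e-10 * w.max()                  # drop numerically zero modes
    P_b_pinv_half = (Q[:, keep] / np.sqrt(w[keep])) @ Q[:, keep].T
    P_xy = X @ Y.T
    P_yy = Y @ Y.T
    gain_part = np.linalg.solve(P_yy + R, P_xy.T)
    P_tilde = P_b_pinv_half @ P_xy @ gain_part @ P_b_pinv_half
    return np.trace(P_tilde), np.diag(P_tilde)

rng = np.random.default_rng(1)
n, q, m = 6, 4, 2
G_true = rng.standard_normal((m, n))     # used only to generate toy forecasts
x_ens = rng.standard_normal((n, q))
y_ens = G_true @ x_ens                   # in practice: run the model per member
R = 0.5 * np.eye(m)

endfs, contributions = ensemble_dfs(x_ens, y_ens, R)
```

In a real application `y_ens` would come from propagating each member through the (possibly nonlinear) model and observation operator, which is why the cost grows with the ensemble size rather than with the state dimension.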

Sensitivity of observation networks
The above discussion about DFS aims to evaluate a predefined measurement network on its potential to analyze initial values and emission rates simultaneously.In Appendix B, independent of any concrete data assimilation method, we use the singular vector approach to identify the sensitive directions of observation networks to initial values and emission rates and show the association between the efficiency and sensitivity of observation networks.
From Appendix B, we can see that the singular value $s_k$ shows the amplification of the impact of the initial state on the observation configurations during the entire time interval. The associated singular vector in the state space, $v_k$, is the direction of the $k$-th largest growth of the perturbation of observations evolved from the initial perturbation. With the special choice $W_0 = P^{-1}(t_0|t_{-1})$ and $W = R^{-1}$, we compare the sensitivity analysis with the analysis in Section 3. With this choice, $v_k$ is also the $k$-th direction which maximizes the relative improvement of the estimates based on the Kalman smoother. This indicates that the states contributing more to the DFS are the same as the states that are more sensitive to the observation network. Besides, the leading singular value $s_1$ is related to the operator norm of $\tilde{P}$ as
$$\|\tilde{P}\| = \frac{s_1^2}{1 + s_1^2}.$$

In the advection-diffusion model considered here, $\delta c$ and $\delta e$ are the perturbations of the concentration and the emission rate of a species, respectively. For vertical diffusion, $K(z)$ is a differentiable function of height $z$.
For the velocity $v_x = v_y = 0.5$ and the time step $\Delta t = 0.5$, the numerical solution is based on the symmetric operator splitting technique (Yanenko, 1971) with the following operator sequence where $T_x$ and $T_y$ are transport operators in the horizontal directions $x$ and $y$, and $D_z$ is the diffusion operator in the vertical direction $z$.
The parameters of emission and deposition rates are included in $A$. The Lax-Wendroff algorithm is chosen as the discretization method for horizontal advection with $\Delta x = \Delta y = 1$. The vertical diffusion is discretized with $\Delta z = 1$ by the Crank-Nicolson scheme, with the Thomas algorithm (Higham, 2002) as solver. The number of grid points is $N_g = 1125$.
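For readers unfamiliar with the advection scheme, a one-dimensional Lax-Wendroff step can be written as follows. This is a generic textbook sketch, not the paper's implementation: it treats a single horizontal direction, assumes periodic boundaries for simplicity, and the function name `lax_wendroff_step` is hypothetical.

```python
import numpy as np

def lax_wendroff_step(c, v, dt, dx):
    """One Lax-Wendroff step for dc/dt + v dc/dx = 0 (periodic boundaries)."""
    nu = v * dt / dx                      # Courant number, stable for |nu| <= 1
    c_plus = np.roll(c, -1)               # c_{i+1}
    c_minus = np.roll(c, 1)               # c_{i-1}
    return (c
            - 0.5 * nu * (c_plus - c_minus)
            + 0.5 * nu**2 * (c_plus - 2.0 * c + c_minus))

# Gaussian pulse advected with the velocity, time step and grid size of the text.
x = np.arange(32, dtype=float)
c0 = np.exp(-0.5 * ((x - 10.0) / 2.0) ** 2)
c1 = lax_wendroff_step(c0, v=0.5, dt=0.5, dx=1.0)
```

With $v = 0.5$, $\Delta t = 0.5$ and $\Delta x = 1$ the Courant number is 0.25, well inside the stability limit of the scheme.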
With the same temporal and spatial discretization as for the concentration, the background knowledge of the emission rate is given by $e_b(t_n, i, j, l)$, where $n = 1, \dots, N$. We establish the discrete dynamic model of the emission rate according to (8) where For expository reasons we assume that $\delta d$ is constant over time and that the single fixed observation configuration is time-invariant. This implies that the observation operator mapping the state space to the observation space is a $1 \times 2N_g$ time-invariant matrix.
In our simulations, we produce $q = 500$ samples (the ensemble size) for the initial concentration and the emission rate, respectively, from independent pseudo-random numbers and make the states correlated by the moving average technique. Tests have shown that the computational cost of our approach increases linearly with the ensemble size. In the following, we present three different tests, aiming to demonstrate the roles of variable winds, emissions, and vertical diffusion.
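One simple way to obtain spatially correlated members from independent random numbers is a moving average over neighbouring grid points. The sketch below is one-dimensional and only illustrates the idea; the window length, the unit-variance scaling and the function name `correlated_ensemble` are assumptions and not the settings used in the experiments.

```python
import numpy as np

def correlated_ensemble(n, q, window, seed=0):
    """n x q ensemble whose rows are correlated via a moving average.

    Each member is white Gaussian noise smoothed with a boxcar kernel of
    length `window`, scaled so that every entry keeps unit variance.
    """
    rng = np.random.default_rng(seed)
    white = rng.standard_normal((n + window - 1, q))
    kernel = np.ones(window) / np.sqrt(window)
    smooth = lambda col: np.convolve(col, kernel, mode="valid")
    return np.apply_along_axis(smooth, 0, white)

ens = correlated_ensemble(n=50, q=500, window=5)
corr_near = np.corrcoef(ens[0], ens[1])[0, 1]    # neighbouring grid points
corr_far = np.corrcoef(ens[0], ens[10])[0, 1]    # points farther apart than the window
```

Grid points closer than the window length share noise samples and are therefore correlated, while points farther apart remain statistically independent.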

Advection test:
The following part demonstrates the application of the DFS analysis tool by basic examples, designed to show the expected elementary outcomes of the following situations, which exhibit the effects of assimilation window length in relation to emission location: these include (i) an assimilation window is too short to capture emission impacts at the observation site, (ii) an extended assimilation window with balanced signal of impacts of concentrations and emissions at the observation site, and (iii) a further increased assimilation window features a declining impact of initial values and growing emission impact.
The first elementary advection test (Fig. 1 to Fig. 7) identifies the sensitivities of parameters subject to different wind directions and data assimilation windows (DAW) through the DFS. Focusing on the advection effects, we apply the model with a weak diffusion process ($K(z) = 0.5e^{-z^2}$).
In Fig. 1 to 3 we assume southwesterly winds, and the assigned data assimilation windows are $10\Delta t$, $35\Delta t$ and $48\Delta t$, respectively. The computation times are approximately 8.1 s, 28.5 s and 39.4 s in our tests with these three assimilation windows, which confirms that the computational cost increases nearly linearly with the length of the data assimilation window. The contributions of the initial states to the EnDFS are shown in the left panels of Fig. 1 to 3. We find that the horizontal area at the lowest layer ($z = 0$) in which the concentration estimates are likely to be improved grows with the extension of the data assimilation window. This is because more and more grid points of the concentration become correlated with longer data assimilation windows.
The right panels of Fig. 1 to 3 show the EnDFS of the emission rate at each grid point with $z = 0$. From Fig. 1, we observe that the EnDFS of the emission rate is smaller than that of the initial value in the influenced area. This indicates that the observations cannot detect the emission rate within a $10\Delta t$ data assimilation window. Thus, in this case, the initial values alone can be optimized. The right panels of Fig. 2 and Fig. 3 show that the emission rate plays an increasingly important role in the impact of the observations. In these two cases, we consider both the concentration and the emission rate as parameters to be optimized.
The quantitative balance between the concentration and the emission rate is provided in Table 1. As a counterexample, Fig. 5 to 7 show the EnDFS of the concentration and the emission rate under the same assumptions as Fig. 1 to 3, respectively, except that a northeasterly wind is assumed. Clearly, with the northeasterly wind, whatever the duration of the assimilation window, the emission is neither detectable nor improvable by that particular observation configuration. This hypothesis is confirmed by our method. The quantitative balances for the related figures are given in Table 1, where the sensitivity to emission rate optimisation remains equally low and is induced by numerical noise.

Emission signal test:
The purpose of the emission signal test (Fig. 8 and Fig. 9) is to assess the impact of observation configurations on emission rates that evolve with different diurnal profiles. We make the same assumptions as in Fig. 3, except that the wind speed in Fig. 8 and Fig. 9 is increased such that the profiles of the emission rate are better detectable by the observation. The only distinction between the situations in Fig. 8 and Fig. 9 is the pronounced diurnal cycle in the background profile of the emission rate during the assimilation window of $48\Delta t$, schematically simulating a source induced by rush hour traffic. Since the profiles of the emission rates are correlated with the emitted amount of that species during the data assimilation window, Table 1 clearly shows that a distinct variation of the emission rate during the data assimilation window acts to level $p^c$ and $p^e$, and thus helps to improve the estimates of the source.
Diffusion test:
The diffusion test (Fig. 10 to Fig. 12) aims to test the approach by comparing the EnDFS of the concentration and the emission rate at the layer $z = 0$ for a weak and a strong diffusion process. We assume that the observation configuration at each time step is located at (12, 10, 4) in the diffusion test, with $K(z) = 0.5e^{-z^2}$ in Fig. 10 and $K(z) = 0.5e^{-z^2} + 1$ in Fig. 11. Besides, Fig. 10 and Fig. 12 preserve the same assumptions as Fig. 3.
Fig. 3 and Fig. 10 show that the observation location strongly influences the distribution of the concentration. Table 2 shows that, with the same diffusion coefficient, the ensemble degree of freedom for signal of the concentration in the lowest layer is larger for Fig. 3 than for Fig. 10. Moreover, Table 1 shows that an observation configuration at the top layer is not efficient for the emission rate under such weak diffusion within the $48\Delta t$ data assimilation window. Comparing Fig. 10 with Fig. 11, we find that the EnDFS of both concentration and emission rate increase with the stronger diffusion process. The increasing impact of the observation configuration with stronger diffusion is also confirmed by the EnDFS and the ensemble relative ratios of DFS of the concentration and emission rate for Fig. 10 and Fig. 11 in Table 2. The balances between the concentration and emission rate for Fig. 10 and Fig. 11 are shown in Table 1. The significant difference in the 'weight' of the emission rate in Table 1 implies that the observation configuration cannot detect the emission at the lowest layer under the weak diffusion of Fig. 10, whereas under the stronger diffusion of Fig. 11 both the concentration and the emission rate should be considered as optimized parameters with the corresponding 'weights'.
Table 1. Ensemble relative ratios of the initial value and emission rate at the lowest layer.
Table 2. The ensemble degrees of freedom for signal of the initial value and emission rate at the lowest layer.
This study quantifies the capacity of a given measurement network to influence the initial trace gas state and the emission rates of transport-diffusion models forced by emissions. The indicators adopt the degrees of freedom for signal concept. Resting on a Kalman smoother, the contribution to the degree of freedom for signal is derived, via singular value decomposition, as the criterion for evaluating the potential improvement of the extended state vector. With its statistical interpretation, it can be applied to determine in advance which parameters can be optimized by the data assimilation procedure.
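The role of the singular value decomposition can be illustrated with a small linear-Gaussian sketch. It assumes an extended state $(c, e)$ with block-diagonal background covariance `B`, a generic linear operator `G` mapping the extended state to all observations in the window, and observation error covariance `R`; all dimensions and matrices are made up. The normalized improvement matrix $B^{-1/2}(B - A)B^{-1/2}$ is built from the singular values of $R^{-1/2} G B^{1/2}$, and its diagonal blocks are read as the DFS contributions of $c$ and $e$. The ratios `p_c` and `p_e` are one plausible reading of the relative ratios, not necessarily the paper's definition.

```python
import numpy as np

def sym_sqrt(S):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(w)) @ V.T

rng = np.random.default_rng(0)
n_c, n_e, m = 4, 2, 3  # toy sizes: initial values, emission factors, observations

def random_spd(n):
    L = rng.standard_normal((n, n))
    return L @ L.T + n * np.eye(n)

# Assumed block-diagonal background covariance of the extended state (c, e).
B = np.zeros((n_c + n_e, n_c + n_e))
B[:n_c, :n_c] = random_spd(n_c)
B[n_c:, n_c:] = random_spd(n_e)
G = rng.standard_normal((m, n_c + n_e))  # hypothetical model-plus-observation operator
R = 0.5 * np.eye(m)

B_sqrt = sym_sqrt(B)
Z = np.linalg.inv(sym_sqrt(R)) @ G @ B_sqrt  # R^{-1/2} G B^{1/2}
U, s, Vt = np.linalg.svd(Z, full_matrices=False)

gain = s**2 / (1.0 + s**2)      # each eigenvalue lies in [0, 1)
P_tilde = (Vt.T * gain) @ Vt    # normalized improvement matrix
dfs = gain.sum()
dfs_c = np.trace(P_tilde[:n_c, :n_c])  # contribution of the initial values
dfs_e = np.trace(P_tilde[n_c:, n_c:])  # contribution of the emissions
p_c, p_e = dfs_c / dfs, dfs_e / dfs

# Cross-check against the analysis covariance of the linear-Gaussian update.
A = B - B @ G.T @ np.linalg.inv(G @ B @ G.T + R) @ G @ B
B_inv_sqrt = np.linalg.inv(B_sqrt)
P_direct = B_inv_sqrt @ (B - A) @ B_inv_sqrt
print(f"DFS={dfs:.3f}, DFS_c={dfs_c:.3f}, DFS_e={dfs_e:.3f}")
```

Because every eigenvalue $\sigma_i^2 / (1 + \sigma_i^2)$ is below one, the total DFS in this sketch cannot exceed the number of observations, and the split into `dfs_c` and `dfs_e` sums to the total.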
The degree of freedom for signal and a number of derived metrics provide quantitative measures of the extent to which the parameters can be optimized. Due to its normalization, it is uniformly applicable for any prior initial values with invertible background covariances. Further, the proposed ensemble based relative improvement covariance, based on the EnKS, provides a computationally feasible way to assess the degree of freedom for signal.
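An ensemble variant can be sketched along the same lines. The example below replaces the background covariance by the sample covariance of an ensemble and evaluates the DFS from the singular values of the scaled, observed ensemble perturbations, so that no state-sized covariance matrix has to be formed. The function name `ensemble_dfs`, the diagonal observation error and the random test data are assumptions for illustration; the paper's EnKS-based formulation may differ in detail.

```python
import numpy as np

def ensemble_dfs(X, G, r_std):
    """DFS estimate from an ensemble X (n_state x n_members).

    G is a linear operator mapping the state to the observations and r_std
    holds the observation error standard deviations (diagonal R assumed).
    """
    n_members = X.shape[1]
    X_pert = X - X.mean(axis=1, keepdims=True)
    # R^{-1/2} G X' / sqrt(N - 1): only an (n_obs x n_members) matrix is needed.
    Z = (G @ X_pert) / (r_std[:, None] * np.sqrt(n_members - 1))
    s = np.linalg.svd(Z, compute_uv=False)
    return float(np.sum(s**2 / (1.0 + s**2)))

rng = np.random.default_rng(1)
n_state, n_obs, n_members = 50, 8, 20
X = rng.standard_normal((n_state, n_members))  # toy ensemble
G = rng.standard_normal((n_obs, n_state))      # hypothetical observation mapping
r_std = np.full(n_obs, 0.7)

en_dfs = ensemble_dfs(X, G, r_std)

# Cross-check with the explicit trace formula using the sample covariance.
B_ens = np.cov(X)
S = G @ B_ens @ G.T
dfs_check = np.trace(S @ np.linalg.inv(S + np.diag(r_std**2)))
print(f"ensemble DFS = {en_dfs:.3f}")
```

The estimate is bounded by both the number of observations and the rank of the ensemble, which is one reason why small ensembles can underestimate the DFS of a dense network.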
The sensitivity of observational networks was formulated by seeking the directions in which the perturbation ratio between initial states and observation configurations grows fastest over the entire time interval. An elementary advection-diffusion example illustrated the significance of the relative improvement covariances and their various metrics in different situations, and compared them with the results of the sensitivity analysis.
In the future, it is planned to apply the efficiency analysis to a real atmospheric transport model to solve practical problems, and also to nonlinear reactive chemistry-transport-diffusion models, as far as the tangent linear assumption remains valid, exactly as in atmospheric chemistry data assimilation problems. We expect this to give deeper insight into the sensitivity analysis for wider applications. For example, in order to evaluate the impact of observations at certain locations, the local projection operator introduced by Buizza and Montani (1999) can be applied to the approaches presented in Section 4 and Section 5.
Eq. (21) enables us to distinguish the DFS contributions of the different optimization parameters, that is, emission rates and initial values. Without loss of generality, we divide (21) into the following block matrix according to the dimensions of c and e which implies the upper boundedness of P.
It provides a way to approximate and target the sensitive parameters or areas with the metric of the leading singular vectors weighted by the corresponding singular values. Moreover, due to the homogeneity of the atmospheric transport model state vector extended with emissions, the above sensitivity analysis can easily be applied by dividing the singular vectors into block form according to the dimensions of the initial state and the emissions. The corresponding blocks of the different singular vectors indicate the different sensitive directions of the initial state and the emissions and allow for this relative quantification. Correspondingly, we can approximate and target the parameters sensitive to the existing observation networks for both initial values and emission rates.
6 Example
We consider a linear advection-diffusion model with Dirichlet horizontal boundary condition and Neumann boundary condition in the vertical direction on the domain [0, 14] × [0, 14] × [0, 4],

Fig. 4 shows in its upper row the singular values underlying the results of Fig. 1 to 3. We approximate the sensitivities of the initial concentrations by the first five leading singular vectors, weighted by the associated singular values in the nuclear norm, and show the results in the three lower-row panels of Fig. 4. The sensitive area can clearly be targeted well with only a few singular vectors, although the sensitivity analysis cannot provide quantitative results with the clear statistical significance of the degree of freedom for signal of the model. The areas of influence on the measurement site, depending on wind direction and assimilation window length, are clearly visualized and correspond to expectations.
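The targeting of sensitive areas by a few leading singular vectors can be mimicked in a one-dimensional toy setting. The sketch assumes pure advection by one grid cell per time step, a single observation site, identity background and observation error covariances, and a hypothetical attenuation factor of 0.8 per step that stands in for diffusive loss of signal. The weighting by singular values divided by their sum is one possible interpretation of weighting in the nuclear norm, not a confirmed reproduction of the paper's metric.

```python
import numpy as np

n_cells, obs_cell, n_steps, n_lead = 20, 12, 8, 5

# Row k maps the initial state to the observation at time step k: with
# advection by one cell per step, the site sees the initial value k cells
# upstream, damped by an assumed factor 0.8**k.
G = np.zeros((n_steps, n_cells))
for k in range(n_steps):
    G[k, obs_cell - k] = 0.8**k

# With B = I and R = I (assumed), the scaled operator equals G itself.
U, s, Vt = np.linalg.svd(G, full_matrices=False)

# Sensitivity map from the leading right singular vectors, weighted by
# singular values normalized with the nuclear norm (sum of singular values).
weights = s[:n_lead] / s.sum()
sensitivity = weights @ np.abs(Vt[:n_lead])

upstream = np.arange(obs_cell - n_steps + 1, obs_cell + 1)
outside = np.setdiff1d(np.arange(n_cells), upstream)
print(np.round(sensitivity, 3))
```

In this toy case the map is non-zero only in cells upstream of the observation site that the wind can carry to it within the window, and it peaks at the site itself; a longer window or a different wind direction would shift and extend that footprint accordingly.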

Figure 1. Advection test with $10\Delta t$ DAW and southwesterly wind. Isopleths of the ensemble relative improvements of the concentration and emission rate are shown in the left and right panels, respectively. The point at (12, 10, 0), labelled 'Obs-cfg of conc', marks the invariant observation configuration. The point at (2, 2, 0), labelled 'Emss-source', marks the source of the emission.

Figure 2. Advection test with $35\Delta t$ DAW and southwesterly wind. Plotting conventions are as in Fig. 1.