Optimal Follow Up Designs for Fractional Partial Differential Equations with Application to a Convection-Advection Model.

As the mathematical properties of Fractional Partial Differential Equations are rapidly being developed, there is an increasing desire by researchers to employ these models in real world data oriented contexts. The main barrier to employing these models is the choice of the fractional order α . Recently, [6] show how to both estimate and make inferences about α from a Bayesian perspective. However, for experimental settings one needs to be able to design experiments that will provide optimal information about α . This work demonstrates how to use information based criteria, namely A, D and E optimality, to choose sampling locations in follow-up designs. Specifically, the simultaneous addition of one, two and three measurement locations is considered for a simple example. The simultaneous addition of four and five measurement locations is also considered across a variety of values of α . The results show that each of the criteria provide different optimal measurement locations as the number of additional measurement locations is increased. Indicating that the choice of optimality criteria should not be decided arbitrarily. ©2023 All rights reserved.


Introduction
Fractional Partial Differential Equations (FPDE) are growing in interest among researchers confronted with modeling flow through porous materials [3,12,16,17,18,23,24,28,14,6].The reason behind that is there are many phenomena in nature which are not described adequately by regualar derivatives and so there is a need for fractional calculus.Among them are crowded systems, such as protein diffusion within cells, [40], and diffusion through porous media.Here fractional differential equations sometimes provide a better description, see [17,11].Though fractional calculus has a long history, its application in many fields of science and engineering is relatively new.Recently, there has been many works related to fractional calculus and its application.For example, Sadek et al. [31,32,33,34,35,36] considered their applications to Riccati differential equations of fractional order, controllability and observability for fractional linear dynamical systems.In recent articles Ghanam and Boone [14] took a new approach for which they were interested in estimation and hypothesis testing applied to fractional fluid transport models.In particular, estimatation of various parameters associated with the model such as the fractional derivative parameter α as well as measurement error.Once empirical data is collected the parameters can be estimated using techniques discussed in [6].However, generating this empirical data may require experimentation on samples of the porous materials under study.
In this article we consider the mathematical model presented in [6] for which a parameter estimation for the fractional derivative order α is obtained and apply that to solve an optimization problem related to design of experiment.In this work, we consider how to place sensors for observation along the distance from the pressure source axis x using a follow-up study approach.Figure 1 shows and example of how an experimental set up may look.Pressure is applied to one end of the porous material and at specified distances x 1 , x 2 , ...x k sensors are placed to measure the pressure output at those locations at various times t from the pressure being applied, denoted p(x i , t).Notice that the choice of locations of the sensors may or may not be optimal for learning about the parameters that govern the process.The question posed here is: How can a follow-up study framework allow for better estimation of the parameters, specifically the fraction parameter α.The follow-up approach to design of experiments consist of two parts, first, a base design is conducted where sensors are placed and initial data is obtained and the information is used to help determine where the additional sensors are placed for the second stage of experimentation, the follow-up design.The advantage of follow-up studies is that researchers can place the additional sensors in locations that will allow for the most information about the parameters to be learned from the data.Experimental Design is a broad field in statistics which is covered well by [27] and [19].For this study, a base design with six locations chosen arbitrarily, and follow-up designs with 1, 2, 3, 4 and 5 possible new locations are considered.In Figure 1, the blue sensors reflect the base design and the green sensors reflect the addition of three sensors once data has been collected from the base sensor.These three green sensors are the follow-up design, in which the locations are chosen to minimize the volume of the Variance-Covariance matrix of the model parameters.In this optimization problem, we use three different techniques: A, D and E optimality and make a comparison across these methods.Each of these criteria use an information approach, specifically, Fisher's information.For more information about these methods we refer the reader to [27] and [19].
The outline of the paper is as follows: In Section 2, we give the various definitions of fractional derivatives and some properties related to their Laplace transforms.In Section 3, we state the mathematical model under consideration and define the problem under invistigation.In Sections 4 and 5, we provide simulated examples the design of experiment optimality criteria.In Sections 6 and 7, we provide the new results and discussion.

Preliminaries of Fractional Calculus
In this section, we state some definitions and results from fractional calculus, see [21,37,7,8].Definition 2.1.Riemann-Liouville Fractional Integral of order α for an absolutely integrable function f(t) is defined by when the right hand side exists, and Γ (α) is defined by Definition 2.2.Riemann-Liouville Fractional Derivative of order α > 0 for an absolutely integrable function f(t) is defined by Proposition 2.4.The Relationship between Caputo fractional derivative and Riemann-Liouville fractional derivative is given by (2.5) Definition 2.5.Hilfer Fractional Derivative of order α > 0 and type β for an absolutely integrable function f(t) with respect to t is defined by, Proposition 2.6.The Laplace transforms of fractional derivatives are given by: where where where 13(4) (2023), 130-143 It is clear from Proposition 2.6 that the differences in these Laplace transforms are in the initial data Lemma 2.7.Assume that f(t) is continuous on [0, T ], for some T > 0, then Definition 2.8.A one-parameter Mittag-Leffler Function is defined by , α > 0.

The Problem statement
This work focuses on a particular time-fractional advection-diffusion system given by (3.1) [25,2], which is relevant to applications in transport through porous media.Our task is to provide experimental design so that inferences on parameters for the fraction, α, and the experimental error variance σ 2 carry the optimal amount of information.
where ∂ α P ∂t α is the Caputo fractional derivative of order α defined in Eq. (2.4)

The Model
The model considered here comes from equation (1.3) where we take D = 1 and U = 1 and the initial condition is p(x, 0) = e −cx , (c > 0), and the boundary condition is p(x, t) → 0 as x → ∞.This yields the linear system, For this work we will assume c = 1 for simplicity, and in this case p(x, t) has a closed form solutionn given by: The fractional differential equation of interest is taken from [25] to model the pressure as a function of time t and location x. shows the influence of α across x when α = 0.5, 0.6, 0.7, 0.8 at time t = 1.0.Notice that as α increases the pressure is dampened or compressed towards zero.In Panel (b) the pressures is shown across t for values of α = 0.78, 0.80, 0.82, 0.84.Again, as α increases there is a dampening effect on the pressure.Figure 3 shows the solution across values of x and t when α = 0.82.In Panel (a) the solution is plotted across x for t = 0.1, 0.25, 0.4, 0.55.Here as x increases there is an exponential reduction in the pressure, however, as t increases the pressure also increases.Panel (b) plots the solution across t for x = 0.0, 0.5, 1.0, 1.5.Again, as t increases the pressure does as well, however, as x increases the pressure is reduced.

Statistical Model
A Bayesian approach is used to estimate the parameters in the model which requires that both a likelihood be specified as well as prior distributions on the model parameters [39].
where ϵ(x i , t j ) iid ∼ LogNormal(1, σ 2 ).Here the LogNormal likelihood is chosen to ensure that pressure is always a positive value.Let y(x i , t j ) be the observed value of Y(x i , t j ) where i = 1, ..., n x and j = 1, ..., n t .In this specification, the likelihood has two parameters, α and σ 2 .The prior distribution for α and σ 2 , π(α, σ 2 ) are specified as α ∼ Beta (α * , β * ) to reflect the prior knowledge that α is bound between 0 and 1.For σ 2 the prior distribution is specified as σ 2 ∼ χ 2 (df) to reflect the prior knowledge that σ 2 must be a positive value.For more on prior distribution and selection see [5].
For notation let x be all x i values and t be all the values of t j and y(x, t) be all the corresponding values of y(x i , t j ) observed and p(x, t) be all the solutions at the corresponding location and time values.The posterior distribution π α, σ 2 |p(x, t), y(x, t) can be found using Bayes' Theorem [4]: (3.5) In the case considered here there is no analytic solution to π α, σ 2 |p(x, t), y(x, t) and hence sampling method must be employed to draw samples from the posterior distribution, from which all inferences will be made.There are many choices for the algorithm to sample from the posterior distribution such as Acceptance Sampling, Metropolis-Hastings Sampling, Sampling Importance Resampling, etc.For more on sampling algorithms see [15], [13], [1].The posterior predictive distribution will be needed to determine the impact of adding a new measurement location by obtaining predictions of the outcome, Y new at a new proposed location X new .The posterior predictive distribution is given by: Notice that the predictive distribution is a probability distribution and hence any predictions generated from it are essentially random samples from the distribution.

Simulated Example
Suppose the system of interest is given by (3.3) where α = 0.82.Further suppose that data for y(x, t) has been observed, with noise, at all combinations of n x = 31 equally spaced levels of x from 0.01 to 10 and n t = 11 equally spaced times t from 0.5 to 1.5 and the noise is multiplicative following a LogNormal distribution with mean 1 and σ = 0.1.Figure 4 shows the unperturbed data surface in panel (a) and the perturbed data surface in panel (b).Notice that this set parameter specification produces a quite noisy surface.In real world situations we would expect to observe perturbed data similar to that in panel (b).The goal of this work is to determine if statistical techniques can be used to adequately estimate the parameters of the underlying surface when presented with noisy data.
The prior distributions for α and σ were specified as follows for both Model 1 and Model 2: Using the LogNormal likelihood, the prior above and Bayes Formula the posterior distribution is given as: The Metropolis-Hastings Markov chain Monte Carlo (MCMC) algorithm was employed to draw 1,000 posterior samples to be used for all inferences.Note that the sampler was tuned in order to find appropriate step-sizes and any samples from this tuning were discarded.Traceplots were examined to ensure convergence of the sampler to the region of high posterior density.

Design Optimality Criteria
As the goal of the study is determine new measurement locations along x which will provide better estimates for α and σ, one must define "better".One popular approach is to work with the information generated by the addition of the new measurements.Fisher's Information is often used to provide a theoretical framework for determining the information from a sample which is given by (under certain regularity conditions) [9], [39]: where θ is the parameter vector and f is the probability density associated with the likelihood.In many cases, such as standard linear models, one can find an explicit form for this analytically using Fisher's Information.For example, the model Z ∼ N(Wβ, Σ), the Fisher information matrix is given by I(β) = W ′ Σ −1 W. The variance-covariance of the least squares estimator for β in this general linear model is given  [27].In particular, notice that Cov(β) = I(β) −1 in this case.In general, if the estimator θ is efficient then it achieves the Cramér-Rao lower bound which is given by Cov( θ) = I(θ) −1 .If θ is not efficient then Cov( θ) ⩾ I(θ) −1 .Hence Cov( θ) −1 provides a reasonable approximation to the Fisher information matrix I(θ) [9].In our case, smaller variance of the joint posterior distribution of α and σ is preferable.Hence we wish to minimize the variance-covariance matrix of α and σ.This issue is how does one determine if one matrix is better than another?Typically one would look at the quadratic form associated with the matrix.If C 1 and C 2 are matrices where x ′ C 1 x < x ′ C 2 x for all x ∈ R n , then we would say that C 2 dominates C 1 and hence C 1 would be considered smaller.However, this approach is difficult to implement in practice.Instead other criterion have been developed such as considering the trace of the matrices.If trace(C 1 ) < trace(C 2 ) then C 1 is smaller than C 2 .In the variance-covariance context the trace is simply the sum of the marginal variances of each of the random variables.Another popular technique is to compare the determinants of the matrices.
Similarly, one could consider the largest eigenvalues of each of the matrices and rank on those.
Let Ξ be the set of all experimental designs (measurement locations) under consideration with ξ j be an individual design.Three popular criteria to evaluate experimental designs based on Fisher Information are A-optimality, D-optimality and E-optimality.[27,19,20]: for all ξ j ∈ Ξ.
Let Y new , ξ j denote the new outcome vector p(X new , t) where X new is defined by experimental design ξ j .In this case an analytic for Fisher's Information is not available for π α, σ|y(x i , t i ), p(x i , t i ), Y new,ξ j , ξ j , t .Here the empirical posterior variance-covariance matrix is a substituted as a reasonable approximation to Fisher's information matrix.Let (α 1 , σ 1 ), (α 2 , σ 2 ), ..., (α m , σ m ) be the m MCMC samples drawn from π α, σ|y(x i , t i ), p(x i , t i ), Y new,ξ j , ξ j , t .The variance-covariance matrix of the posterior samples from design ξ j is estimated by: Algorithm 1 gives a step-by-step presentation of method proposed here.Note that in Step 7 the MCMC samples are based on both the base design obervations and the predicted observations for the follow-up design under consideration.
Algorithm 1 Algorithm to determine A, D and E optimal designs.1: Set prior distribution parameters for α and σ. 2: Set base design sampling locations x and sampling times t.3: Conduct the experiment using the base design and taking measurements y(x, t).4: Obtain MCMC samples for α and σ 5: for ξ j ∈ Ξ do 6: Sample Y new,ξ j from π Y new (X new , t)|p(x, t), y(x, t), X new , ξ j , t using MCMC samples from step 4.

9:
Calculate and store A j = tr(C j ), D j = |C j | and E j = maxEigen C j .10: end for 11: Examine A j , D j and E j to determine the optimal design ξ j for each criteria.
Note that in Step 5 in Algorithm 1 there may be a huge number of design points.Suppose there are Λ possible measurement locations and λ B are chosen for the base design and λ F will be chosen for followup design points, then Λ − λ B λ F possible follow-up designs exist.For example, if there are 48 possible measurement locations and six are chosen in the base design and five additional measurement locations will be chosen, then there are 48 − 6 5 = 850, 558 possible follow-up designs.This is a large number of possible designs to calculate, hence a search algorithm such as Simulated Annealing can be employed.While this does not enumerate all the designs in Ξ it searches to find the one that meets the optimality criteria.This will necessitate that each criterion evaluated separately instead of simultaneously in step 5(d).For more on simulated annealing see [22].Both the MCMC sampler and the Simulated Annealing algorithms are programmed in MATLAB 2022b.To explore 1,000 candidate design combinations from the design space, computation took approximately 1,200 seconds using an Intel Core i5 Quad-Core processor at 3.8 GHz and 32GB of RAM.MCMC diagnostics such as trace-plots and ACF plots were examined throughout the process, which showed MCMC sampler converged to the region of high posterior density with high acceptance probability.

New Results for Design of Experiments
To consider the effect of σ and the number of additional measurement locations on the follow-up design location(s) for the A, D and E criteria when α = 0.82 is performed using a base design of 0.2, 1.0, 2.0, 5.0, 7.0, and 10.0.On all studies 1,000 MCMC samples are used for parameter estimation.For one additional location all 42 candidate locations are considered, for two additional locations all 946 candidate combinations are considered and for three additional locations simulated annealing is used with 1,000 steps.Table 6 shows the location results for the follow-up designs for A, D and E criteria when α = 0.82 and σ ∈ {0.01, 0.1}.Notice that for one additional sensor and σ = 0.1 all three criteria produce a follow up measurement location at 4.2.Similarly for σ = 0.01 all three criteria produce the same follow up measurement location at 6.2.When two additional measurement locations are allowed and σ = 0.1 both the D and E criteria produce follow-up locations at 0.4 and 4.2.However, the A criterion produces a different pair of locations at 1.2 and 2.6.This shows that the three criteria do not necessarily produce the same values which would suggest that the choice of criteria can be arbitrary.For the three additional location scenarios the criteria do not produce the same set of locations.However, the D and E criteria do produce one of the locations to be 2.6 in the σ = 0.01 case.Of further note is that for the A optimality criterion the choice of locations are the same across σ = 0.01 and σ − 0.1.

13(4) (2023), 130-143
Table 1: Selected follow-up designs for the A, D, and E criteria.Base design is 0.2, 1.0, 2.0, 5.0, 7.0, and 10.0.For this simulation α = 0.82, σ = 0.1 and σ = 0.01.To examine the effect of α on the optimal design location for four and five simultaneously added locations a simulation study was performed by varying α at values 0.3, 0.4, 0.5, 0.6 and 0.7 and keeping σ = 0.1.Using the approach for the one, two and three additional sensor study above, a simulated dataset was generated for each value of α with base design locations 0.2, 1.0, 2.0, 5.0, 7.0 and 10.0.Using each of these datasets the A, D and E optimal locations were found by employing the MCMC combined with simulated annealing approach with 1,000 search steps for the MCMC estimates and 1,000 simulated annealing steps.
Figure 5 shows the results of the study with the A optimal locations denoted by a blue circle (•), D optimal dented by a black triangle (△) and E-optimal denoted by a red plus sign (+).Notice that the A optimal points tend to be placed at higher and higher values as alpha is increased.This is less prominent in the E optimal locations which do seem to increase as well.The D optimal solutions exhibit a similar pattern of increasing the follow up design points, however, to a much smaller magnitude than both A or E optimality. Figure 6 shows the results of the study with the A optimal locations denoted by a blue circle (•), D optimal dented by a black triangle (△) and E-optimal denoted by a red plus sign (+).Notice that the A optimal design points are consistenly placed above 1.8 and spread across the remaining design points at random with high values near 9.0 and 10.0 common.This is in contrast to the E optimal locations which are much higher for α = 0.3 but spread out more evenly across the design space for the other values of α.Of particular interest is the D optimal solutions that are above 3.5 for α = 0.5 and 0.6 and rather randomly placed across the other values of α.The differences here between the four and five location follow up designs may be due to the base design choice which contains considerable information so additional points can be spread across a wider range of values.

Discussion
This study shows how to use MCMC and Simulated Annealing techniques to simultaneously place new experimental measurement (sensor) locations for FPDEs.By first considering a specific example where α = 0.82 with two different variances and one, two and three addidtional measurement locations.For the one additional measurement location, all three optimality criteria give the same new location, only depending on the model variance.As additinoal measurement locations are added, the three opotimality criteria provide different optimal locations.For a more general exploration α = 0.3, 0.4, 0.5, 0.6 and 0.7 is considered with four and five additional measurment locations.As the number of measurement locations incrases each of the optimality criteria provide different locations.
Of particular interest is when the number of additional measurement locations is increased the more spread out the locations become across the design space.This could be an artifact of the base design chosen as it may not contain enough information to guide placement.An additional study should be conducted to determine if replication of base design measurements provides a stronger pattern in the follow-up measurement location across the values of α.
Another area that should be explored is sequentially adding follow-up measurement locations instead of the simultaneous approach taken here.In a sequential approach, a new measurement location will be chosen based on each of the criteria and that location will be kept and additional locations will be chosen one by one.For example when α = 0.82 and σ = 0.1, Table 6 shows that all three methods chose 4.2 as the next optimal location if only one sensor is added (6.2 was chosen when σ = 0.01).How do the additional measurement locations change across various values for α for each of the different criteria?While the approach seems to be straightforward there are several scenarios that need to be addressed.If the sensors can be reused then one could gain additional measurements at already established sensor locations and hence there is a built in replication occurring, albeit unequal replication.If the sensor cannot be reused then at each step no additional information from other sensors can be obtained.While this may seem trivial, in many applications sensors may be single use and should be studied.
Note that the criteria used here focus on estimating the parameter α.However, in many applications estimation may not be the goal.Instead, a researcher may be interested in whether or not using a FPDE is warranted.In the case when α = 1, a standard PDE could be used.If the goal is to conduct a hypothesis test H 0 : α = 1 versus H A : α ∈ (0, 1), then the criteria should be changed accordingly.A Bayes factor approach to testing this has been presented in [6].A similar study for the simultaneous addition of sensors could be performed.
More complex models and models with larger parameter spaces should be included in future work where numeric solvers more than likely need to be employed to solve such complex fractional differential equations and systems.Many real world applications of fractional calculus and differential equations exist where Bayesian methods and estimation is important.These range from viscoelastic diffusion in complex fluids [30], anomalous diffusion [10], fractional order control problems [26], biological systems [29], and a lot more [38].Determining the accuracy of the numeric solver and its impact on inferences should be examined to ensure that the choice of numeric solver does not unduly influence any parameter inferences or predictive distributions.

Figure 1 :
Figure 1: Example experimental device with pressure input, sensors at various locations and resulting p(x, t) measurements.Blue sensors correspond to the base design and the red sensors are the additional sensor as suggested via the follow-up design.

Figure 2
shows the behavior of the solution across various values of α.Panel (a)