## Thursday, September 15, 2011

### The Separation of V and V

The material in this post started out as background for a magazine article, and turned into the introduction to an appendix of my dissertation [1]. Since I started learning about verification and validation, I’ve always been curious about the reasons for the sharp lines we draw between the two activities. Turns out, you can trace the historical precedent for separating the tasks of verification and validation all the way back to Lewis Fry Richardson, and George Boole spent some space writing on what we’d now call verification. Though neither he nor Richardson uses the modern terms, the influence of their thought is evident in the distinct technical meanings we’ve chosen to give the two common-usage synonyms “verification” and “validation.”

The term verification is given slightly different definitions by different groups of practitioners [2]. In software engineering, the IEEE defines verification as

The process of evaluating the products of a software development phase to provide assurance that they meet the requirements defined for them by the previous phase.

while the definition now commonly accepted in the computational physics community is

The process of determining that a model implementation accurately represents the developer’s conceptual description of the model and the solution to the model. [3]

The numerical weather prediction community speaks of forecast verification, which someone using the above-quoted AIAA definition would probably consider a form of validation, and a statistician might use the term validation in a way the computational physicist would probably consider verification [4]. Arguing over the definitions of words [5] which, in common use, are synonyms is contrary to progress under a pragmatic, “wrong, but useful” conception of the modeling endeavor [6]. Rather, we should be clear on our meaning in a specific context, and thus avoid talking past colleagues in related disciplines. Throughout this work, the term verification is used consistent with the currently accepted definitions in the aerospace, defense, and computational physics communities [3, 7, 8, 9].

In the present diagnostic effort, the forward model code seeks an approximate solution to a discretized partial differential equation (PDE). This PDE is derived from Maxwell’s equations, augmented by conservation equations derived from the general hyperbolic conservation laws through analytical simplification guided by problem-specific assumptions. The purpose of formal verification procedures such as the method of manufactured solutions (MMS) is to demonstrate that the simulation code solves these chosen equations correctly. This is done by showing ordered convergence of the simulated solution in a discretization parameter (such as mesh size or time-step size).
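To make the idea concrete, here is a minimal sketch (my illustrative example, not the dissertation's forward model) of an MMS-style convergence check for a 1-D Poisson problem: the manufactured solution and the scheme are arbitrary choices, but the logic of computing the observed order from errors on successively refined grids is the general procedure.

```python
import numpy as np

def mms_error(n):
    """Solve u'' = f on [0, 1] with u(0) = u(1) = 0 using second-order
    central differences, with f manufactured from u_m = sin(pi x)."""
    x = np.linspace(0.0, 1.0, n + 1)
    h = x[1] - x[0]
    xi = x[1:-1]                         # interior points
    f = -np.pi**2 * np.sin(np.pi * xi)   # forcing term f = u_m''
    # standard tridiagonal central-difference operator
    A = (np.diag(-2.0 * np.ones(n - 1)) +
         np.diag(np.ones(n - 2), 1) +
         np.diag(np.ones(n - 2), -1)) / h**2
    u = np.linalg.solve(A, f)
    # max-norm error against the known manufactured solution
    return np.max(np.abs(u - np.sin(np.pi * xi)))

# halve the mesh spacing and compute the observed order of accuracy
e1, e2 = mms_error(32), mms_error(64)
p_obs = np.log(e1 / e2) / np.log(2.0)   # should approach 2 for this scheme
```

Ordered convergence means `p_obs` matches the scheme's formal order as the grid is refined; a bug in the implementation typically destroys this agreement.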

Understanding the impact of truncation error on the quality of numerical solutions has been a significant concern over the entire developmental history of such methods. Although modern codes tend to use finite element, finite volume or pseudo-spectral methods as opposed to finite differences, George Boole’s concern for establishing the credibility of numerical results is generally applicable to all these methods. In his treatise, written in 1860, Boole stated

...we shall very often need to use the method of Finite Differences for the purpose of shortening numerical calculation, and here the mere knowledge that the series obtained are convergent will not suffice; we must also know the degree of approximation.

To render our results trustworthy and useful we must find the limits of the error produced by taking a given number of terms of the expansion instead of calculating the exact value of the function that gave rise thereto. [10]

In a related vein, Lewis Fry Richardson, under the heading Standards of Neglect in the 1927 paper which introduced the extrapolation method which now bears his name, stated in his characteristically colorful language

An error of form which would be negligible in a haystack would be disastrous in a lens. Thus negligibility involves both mathematics and purpose. In this paper we discuss mathematics, leaving the purposes to be discussed when they are known. [11]

This appendix follows Richardson’s advice and confines discussion to the correctness of mathematics, leaving the purposes and sufficiency of the proposed methods for comparison with other diagnostic techniques in the context of intended applications.

While the development of methods for establishing the correctness and fitness of numerical approximations is certainly of historical interest, Roache describes why this effort in code verification is more urgent now than ever before (and will only increase in importance as simulation capabilities, and our reliance on them, grow).

In an age of spreading pseudoscience and anti-rationalism, it behooves those of us who believe in the good of science and engineering to be above reproach whenever possible. Public confidence is further eroded with every error we make. Although many of society’s problems can be solved with a simple change of values, major issues such as radioactive waste disposal and environmental modeling require technological solutions that necessarily involve computational physics. As Robert Laughlin noted in this magazine, “there is a serious danger of this power [of simulations] being misused, either by accident or through deliberate deception.” Our intellectual and moral traditions will be served well by conscientious attention to verification of codes, verification of calculations, and validation, including the attention given to building new codes or modifying existing codes with specific features that enable these activities. [6]

There is a moral imperative underlying correctness checking simply because we want to tell the truth, but this imperative becomes all the more pressing because of the uses to which our results will be put.

The language Roache uses reflects a consensus in the computational physics community that has given the name verification to the activity of demonstrating the impact of truncation error (usually involving grid convergence studies), and the name validation to the activity of determining if a code has sufficient predictive capability for its intended use [3]. Boole’s idea of “trustworthy results” clearly underlies the efforts of various journals and professional societies [3, 7, 8, 9] to promote rigorous verification of computational results. Richardson’s separation of the questions of correct math and fitness for purpose is reflected in those policies as well. In addition, the extrapolation method developed by Richardson has been generalized to support uniform reporting of verification results [12].

Two types of verification have been distinguished [13]: code verification and calculation verification. Code verification is done once for a particular code version; it demonstrates that a specific implementation solves the chosen governing equations correctly. This process can be performed on a series of grids of any size (as long as they are within the asymptotic range) with an arbitrarily chosen solution (no need for physical realism). Calculation verification, on the other hand, is an activity specific to a given scientific investigation or decision support activity. The solution in this case will be on physically meaningful grids with physically meaningful initial conditions (ICs) and boundary conditions (BCs), and therefore has no a priori known solution. Rather than monitoring the convergence of an error metric, the convergence of solution functionals relevant to the scientific or engineering question at hand is tracked (ideally demonstrating grid/time-step independence).
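The calculation-verification bookkeeping can be sketched in a few lines. This is a hedged illustration (the function names and sample functional values are hypothetical): the observed order of convergence is estimated from a solution functional on three systematically refined grids, in the spirit of Richardson extrapolation and Roache's grid convergence index [12, 13].

```python
import math

def observed_order(f_coarse, f_medium, f_fine, r):
    # observed convergence order estimated from a solution functional
    # on three grids related by a constant refinement ratio r
    return math.log((f_coarse - f_medium) / (f_medium - f_fine)) / math.log(r)

def gci_fine(f_medium, f_fine, r, p, Fs=1.25):
    # grid convergence index on the fine grid, with Roache's
    # recommended safety factor Fs = 1.25 for three-grid studies
    return Fs * abs((f_medium - f_fine) / f_fine) / (r**p - 1.0)

# synthetic functional values behaving like f_exact + C*h**2
p_obs = observed_order(1.08, 1.02, 1.005, r=2.0)
gci = gci_fine(1.02, 1.005, r=2.0, p=p_obs)
```

For physically meaningful runs with no exact solution, monitoring `p_obs` and the GCI for the functionals of interest is what stands in for an error norm.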

The approach taken in this work to achieving verification is based on heavy use of a computer algebra system (CAS) [14]. The pioneering work in using computer algebra to support the development of computational physics codes was performed by Wirth in 1980 [15]. This was quickly followed by other code generation efforts [16, 17, 18, 19], demonstrations of the use of symbolic math programs to support stability analysis [20], and correctness verification for symbolically generated codes solving governing equations in general curvilinear body-fitted coordinate systems [21].

The CAS handles much of the tedious and error-prone manipulation required to implement a numerical PDE solver. It also makes creating the forcing terms necessary for testing against manufactured solutions straightforward for even very complex governing equations. The MMS is a powerful tool for correctness checking and debugging. The parameters of the manufactured solution allow the magnitude of the contribution of each term to the error to be controlled. The recommended approach is to first test a solution with all parameters of order one (hugely different parameter values, which might obtain in a physically realistic solution, can mask bugs). If the code fails to converge, the parameter sizes can then be varied in a systematic way to locate the source of the non-convergence (as convincingly demonstrated by Salari and Knupp with a blind test protocol [22]). This gives the code developer a diagnostic capability for the code itself. The error analysis can be viewed as a sort of group test [23], where the “dilution” of each term’s (member’s) contribution to the total (group) error (response) is governed by the relative sizes of the chosen parameters. Though we fit a parametric model (the error ansatz) to determine the rate of convergence, the response really is expected to be a binary one, as in the classic group test: either the ordered convergence rate is maintained down to the round-off plateau or it is not. The dilution only governs how high the resolution must rise (and the error must fall) for this behavior to be confirmed. Terms with small parameters will require convergence to very fine resolutions to ensure that an ordered error is not lurking below.
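As an illustration of the CAS workflow (a sketch using SymPy rather than the Macsyma-lineage tools cited above; the heat equation and the manufactured solution are arbitrary choices for the example), the forcing term is produced by simply pushing the trial solution through the governing operator:

```python
import sympy

x, t, alpha = sympy.symbols('x t alpha', positive=True)

# an arbitrary smooth manufactured solution that exercises every term
u_m = sympy.exp(-t) * sympy.sin(sympy.pi * x)

# governing operator for the 1-D heat equation: L(u) = u_t - alpha*u_xx;
# the CAS produces the exact forcing term Q = L(u_m) symbolically
Q = sympy.simplify(sympy.diff(u_m, t) - alpha * sympy.diff(u_m, x, 2))

# Q is added to the solver's right-hand side; the discrete solution should
# then converge to u_m at the scheme's formal order
```

For realistic multi-equation systems this symbolic differentiation is exactly the tedious, error-prone step the CAS takes off the developer's hands.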

### References

[1]   Stults, J., Nonintrusive Microwave Diagnostics of Collisional Plasmas in Hall Thrusters and Dielectric Barrier Discharges, Ph.D. thesis, Air Force Institute of Technology, September 2011.

[2]   Oberkampf, W. L. and Trucano, T. G., “Verification and Validation Benchmarks,” Tech. Rep. SAND2002-0529, Sandia National Laboratories, March 2002.

[3]   “AIAA Guide for the Verification and Validation of Computational Fluid Dynamics Simulations,” Tech. rep., American Institute of Aeronautics and Astronautics, Reston, VA, 1998.

[4]   Cook, S. R., Gelman, A., and Rubin, D. B., “Validation of Software for Bayesian Models Using Posterior Quantiles,” Journal of Computational and Graphical Statistics, Vol. 15, No. 3, 2006, pp. 675–692.

[5]   Oreskes, N., Shrader-Frechette, K., and Belitz, K., “Verification, Validation, and Confirmation of Numerical Models in the Earth Sciences,” Science, Vol. 263, No. 5147, 1994, pp. 641–646.

[6]   Roache, P. J., “Building PDE Codes to be Verifiable and Validatable,” Computing in Science and Engineering, Vol. 6, No. 5, September 2004.

[7]   “Standard for Verification and Validation in Computational Fluid Dynamics and Heat Transfer,” Tech. rep., American Society of Mechanical Engineers, 2009.

[8]   AIAA, “Editorial Policy Statement on Numerical and Experimental Uncertainty,” AIAA Journal, Vol. 32, No. 1, 1994, pp. 3.

[9]   Freitas, C., “Editorial Policy Statement on the Control of Numerical Accuracy,” ASME Journal of Fluids Engineering, Vol. 117, No. 1, 1995, pp. 9.

[10]   Boole, G., A Treatise on the Calculus of Finite Differences, Macmillan and Co., London, 3rd ed., 1880.

[11]   Richardson, L. F. and Gaunt, J. A., “The Deferred Approach to the Limit. Part I. Single Lattice. Part II. Interpenetrating Lattices,” Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, Vol. 226, No. 636–646, 1927, pp. 299–361.

[12]   Roache, P. J., “Perspective: A method for uniform reporting of grid refinement studies,” Journal of Fluids Engineering, Vol. 116, No. 3, September 1994.

[13]   Roache, P. J., Verification and Validation in Computational Science and Engineering, Hermosa Publishers, Albuquerque, 1998.

[14]   Fateman, R. J., “A Review of Macsyma,” IEEE Trans. on Knowl. and Data Eng., Vol. 1, March 1989, pp. 133–145.

[15]   Wirth, M. C., On the Automation of Computational Physics, Ph.D. thesis, University of California, Davis, 1980.

[16]   Cook, G. O., Development of a Magnetohydrodynamic Code for Axisymmetric, High-β Plasmas with Complex Magnetic Fields, Ph.D. thesis, Brigham Young University, December 1982.

[17]   Steinberg, S. and Roache, P. J., “Using MACSYMA to Write FORTRAN Subroutines,” Journal of Symbolic Computation, Vol. 2, No. 2, 1986, pp. 213–216.

[18]   Steinberg, S. and Roache, P., “Using VAXIMA to Write FORTRAN Code,” Applications of Computer Algebra, edited by R. Pavelle, Kluwer Academic Publishers, August 1984, pp. 74–94.

[19]   Florence, M., Steinberg, S., and Roache, P., “Generating subroutine codes with MACSYMA,” Mathematical and Computer Modelling, Vol. 11, 1988, pp. 1107–1111.

[20]   Wirth, M. C., “Automatic generation of finite difference equations and fourier stability analyses,” SYMSAC ’81: Proceedings of the fourth ACMsymposium on Symbolic and algebraic computation, ACM, New York, NY, USA, 1981, pp. 73–78.

[21]   Steinberg, S. and Roache, P. J., “Symbolic manipulation and computational fluid dynamics,” Journal of Computational Physics, Vol. 57, No. 2, 1985, pp. 251–284.

[22]   Knupp, P. and Salari, K., “Code Verification by the Method of Manufactured Solutions,” Tech. Rep. SAND2000-1444, Sandia National Labs, June 2000.

[23]   Dorfman, R., “The Detection of Defective Members of Large Populations,” The Annals of Mathematical Statistics, Vol. 14, No. 4, 1943, pp. 436–440.

## Friday, September 2, 2011

### Airframer for Dayton Aerospace Cluster

In my previous post on the Dayton Aerospace Cluster, I mentioned that one of the big missing pieces is an airframer. The DDN recently reported that a Fortune 100 defense contractor that currently makes UAV components, but does not yet build full airframes, is interested in locating a manufacturing/test facility in the Dayton region.

Here are some of the defense contractors on the Fortune 100 (see the full Fortune 500 list). The most interesting one on the list is GD, since two of the three joint ventures listed are with Israeli companies, and Dayton-region representatives signed a trade deal in 2009 with representatives from Haifa.

## Thursday, September 1, 2011

### Fun with Filip

The Filip dataset [1] from the NIST Statistical Reference Datasets came up in discussions about ill-conditioned problems over on the Air Vent. Fitting the certified polynomial is example problem 5.10 in Moler’s MATLAB book [3]. Part of the problem is to compare five different methods of fitting the 10th-order polynomial

1. MATLAB’s polyfit (which does QR on the Vandermonde matrix)
2. MATLAB’s backslash operator (which will do QR with pivoting)
3. Pseudoinverse
 $\beta = X^{\dagger} y$ (1)

4. Normal equations
 $\beta = \left(X^T X\right)^{-1} X^T y$ (2)

5. Centering by the mean and normalizing by the standard deviation

Carrick already demonstrated the centering approach in the comments. I was surprised how well the simple centering approach worked (it’s been quite a while since I’ve done this homework problem). Carrick mentioned that Gram-Schmidt might be another way to go, and I mentioned that QR using rotations or reflections is usually preferred for numerical stability reasons. One of the trade-offs is that Gram-Schmidt gives you the basis one vector at a time, while the other methods give you Q all at once at the end. Depending on the problem, one may be more useful than the other.
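The stability point is easy to demonstrate numerically. In this sketch (my example, not Carrick's), classical Gram-Schmidt visibly loses orthogonality on an ill-conditioned Vandermonde matrix, while the Householder-based QR behind numpy.linalg.qr does not:

```python
import numpy as np

def classical_gram_schmidt(X):
    # builds the orthonormal basis one column at a time, as discussed above
    Q = np.zeros_like(X, dtype=float)
    for j in range(X.shape[1]):
        # subtract projections onto the previously computed basis vectors
        v = X[:, j] - np.dot(Q[:, :j], np.dot(Q[:, :j].T, X[:, j]))
        Q[:, j] = v / np.linalg.norm(v)
    return Q

# an ill-conditioned monomial (Vandermonde) design matrix
xpts = np.linspace(0.0, 1.0, 50)
X = np.vander(xpts, 12)
I = np.eye(12)

Q_cgs = classical_gram_schmidt(X)
Q_hh, _ = np.linalg.qr(X)  # Householder reflections via LAPACK

loss_cgs = np.linalg.norm(np.dot(Q_cgs.T, Q_cgs) - I)  # orthogonality lost
loss_hh = np.linalg.norm(np.dot(Q_hh.T, Q_hh) - I)     # near machine epsilon
```

Modified Gram-Schmidt or reorthogonalization narrows the gap, but the reflection-based methods give orthogonality near machine precision without extra passes.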

The LAPACK routine DGEQRF uses the Householder reflection method to calculate the QR decomposition [2]. This routine is wrapped in scipy.linalg.qr(), which has options to return only the economic factors and to perform pivoting. The Scipy version of polyfit() uses the singular value decomposition (from which you get the pseudoinverse).

So, the two main ways of getting robust numerical algorithms for solving least-squares problems are various flavors of $QR$ and the pseudoinverse. Naive application of some statistical software that doesn’t use these methods by default leads to problems (e.g. SAS and R). Since I don’t like doing other people’s homework I won’t go through Moler’s list, but I did want to demonstrate another method to tackle the problem using the dreaded normal equations (which are terribly conditioned in the case of the Vandermonde matrix).
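The reason the normal equations are "terribly conditioned" is that forming $X^T X$ squares the condition number of the design matrix. A quick check (with an arbitrary small Vandermonde matrix, not the Filip data) makes the point:

```python
import numpy as np

# forming the normal equations squares the condition number:
# cond(X^T X) = cond(X)**2, so you give up half your precision up front
xpts = np.linspace(0.0, 1.0, 50)
X = np.vander(xpts, 8)            # monomial basis, degree 7
kappa_X = np.linalg.cond(X)
kappa_XtX = np.linalg.cond(np.dot(X.T, X))
# log10(kappa_XtX) is about 2*log10(kappa_X); for the degree-10 Filip
# Vandermonde matrix this squaring pushes past double precision entirely
```

QR and the SVD work with $X$ directly, so they only have to live with $\kappa(X)$ rather than $\kappa(X)^2$.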

The basic idea behind both numerical approaches is to build a “good” basis to describe $X$, so that when we write $X$ in terms of this basis, and perform operations using the basis, we don’t lose lots of precision. We can take a more symbolic approach to the problem, and define an orthogonal basis analytically. So instead of using the monomials as in the Vandermonde matrix, we can use an orthogonal set like the Legendre polynomials. It’s easiest to use this basis if you map the $x$ data to the interval $\left[-1,1\right]$.

If you read that thread from the R mailing list, you’ll recognize this approach as the one that was recommended (use the R poly function to define the model matrix with an orthogonal basis rather than the monomials which gives the Vandermonde matrix).

Here’s the Python to do the fit on the raw data just using the built-in polyfit() function.

# polyfit uses pseudoinverse
# (assumes import scipy as sp; x, y are the Filip data and
# beta_NIST holds the NIST-certified coefficients)
nx = 3000
p = sp.polyfit(x,y,10)
px = sp.linspace(x.min(),x.max(),nx)
# map x to [-1,1]
x_map = (x.min() + x.max() - 2.0*x)/(x.min() - x.max())
px_map = sp.linspace(-1,1,nx)
py = sp.polyval(p, px)
pNISTy = sp.polyval(beta_NIST, px)
NISTy = sp.polyval(beta_NIST, x)

Here’s the Python to do the fit using some other matrix factorizations.

# compare polyfit to QR, this version uses Householder reflections
Q,R = qr(X, mode='qr', econ=True)
b = sp.dot(sp.transpose(Q),y)
beta = solve(R,b)
beta = sp.flipud(beta) # polyval expects the coefficients in the other order
qry = sp.polyval(beta, px)

Finally, here is the solution using the Legendre basis to start, which avoids the necessity of generating the basis numerically using QR or the singular value decomposition (scipy.special.legendre).

# compare to pseudoinverse
Xdag = pinv(X)
beta_pinv = sp.flipud(sp.dot(Xdag, y))
pinvy = sp.polyval(beta_pinv, px)
Xmapdag = pinv(X_map)
pinvy_map = sp.polyval(sp.flipud(sp.dot(Xmapdag,y)), px_map)

X1 = sp.zeros((x_map.shape[0],11),dtype=float) # for fitting
X2 = sp.zeros((px_map.shape[0],11),dtype=float) # for plotting
X1[:,0] = 1.0
X2[:,0] = 1.0
portho = []
for i in xrange(11):
    portho.append(legendre(i))
for i in xrange(11):
    X1[:,i] = portho[i](x_map)
    X2[:,i] = portho[i](px_map)
X1dag = pinv(X1)
beta1_pinv = sp.dot(X1dag, y)
pinvy_ortho = sp.dot(X2, beta1_pinv)
# do it the wrong way with X1
b_legendre = sp.dot(sp.transpose(X1),y)
beta_legendre = solve(sp.dot(sp.transpose(X1),X1), b_legendre)
py_legendre = sp.dot(X2, beta_legendre)
LU,piv = lu_factor(sp.dot(sp.transpose(X1),X1))
beta_legendre_lu = lu_solve((LU,piv),b_legendre)
py_legendre_lu = sp.dot(X2, beta_legendre_lu)

The error norm for all of the “good” methods is roughly the same order of magnitude. Even using a bad algorithm, inverting the normal equations, gives a good answer if you start out with a design matrix that is orthogonal to begin with. The QR method is much less analyst-work (and works even without mapping or normalizing the data). Since you just build the basis numerically it requires a bit more cpu-work (which is usually cheap), and it is already coded up and well tested in freely available libraries.

### References

[1]   “Filip dataset,” NIST Statistical Reference Datasets, Linear Least Squares Regression, National Institute of Standards and Technology.

[2]   DGEQRF, LAPACK subroutine, version 3.3.1, http://www.netlib.org/lapack/double/dgeqrf.f

[3]   Moler, C.B., Numerical computing with MATLAB, Chapter 5: Least Squares, Cambridge University Press, 2004.