Theoretical models, both analytical and numerical, are playing an increasingly important role in predicting complex plasma behavior, and providing a scientific understanding of the underlying physical processes.
Since the ability of a theoretical model to predict plasma behavior is a key measure of the model’s accuracy and its ability to advance scientific understanding, it is Physics of Plasmas’ Editorial Policy to encourage the submission of manuscripts whose primary focus is the verification and/or validation of codes and analytical models aimed at predicting plasma behavior.
Monday, August 18, 2014
Validation & Verification in Physics of Plasmas
Thursday, June 26, 2014
HiFiLES v0.1 Release
From the release notes:
Thursday, January 9, 2014
Phil Roe: Colorful Fluid Dynamics
Echoes of Tufte in one of his introductory statements: "It's full of noise, it's full of color, it's spectacular, it's intended to blow your mind away, it's intended to disarm criticism." And further on the dangers of "colorful fluid dynamics":
These days it is common to see a complicated flow field, predicted with all the right general features and displayed in glorious detail that looks like the real thing. Results viewed in this way take on an air of authority out of proportion to their accuracy.
--Doug McLean
Roe wraps up the lecture by referencing a NASA-sponsored study, CFD Vision 2030, that addresses whether CFD will be able to reliably predict turbulent separated flows by 2030. The conclusion is that advances in hardware capability alone will not be enough; significant improvements in numerical algorithms are also required.
Tuesday, October 22, 2013
11th World Congress on Computational Mechanics (WCCM XI)
Monday, September 30, 2013
CFD V&V Case Resources
- CFDOnline Validation and Test Cases: wiki with problem descriptions and some calculation results but not case files / grids (on the handful that I checked)
- ERCOFTAC Classic Collection: "The database has undergone some re-structuring and expansion to include, amongst other things, more details of the test cases, computational results, and results and conclusions drawn from the ERCOFTAC Workshops on Refined Turbulence Modelling. At the moment, each case should contain at least a brief description, some data to download, and references to published work. Some cases contain significantly more information than this." Registration required.
- LaRC Turbulence Modeling Resource: "The objective is to provide a resource for CFD developers to obtain accurate and up-to-date information on widely-used RANS turbulence models, and verify that models are implemented correctly. This latter capability is made possible through "verification" cases. This site provides simple test cases and grids, along with sample results (including grid convergence studies) from one or more previously-verified codes for some of the turbulence models. Furthermore, by listing various published variants of models, this site establishes naming conventions in order to help avoid confusion when comparing results from different codes."
- CFD and Coffee: a list of resources focused on compressible RANS V&V.
- cfd-benchmarks: benchmarks focused on room air distribution
- Saturne Test Cases: a list of test cases for Code Saturne, which also links the Working Group 21 AGARD database among others
- NPARC Alliance Verification and Validation Archive: grids available, but not much grid convergence info
- Vassberg's NACA 0012 Grids: Hegedus Aero code-to-code calcs based on 2009 AIAA paper / grids for NACA 0012
Sunday, August 4, 2013
A Defense of Computational Physics
Here is the publisher's description,
Karl Popper is often considered the most influential philosopher of science of the first half (at least) of the 20th century. His assertion that true science theories are characterized by falsifiability has been used to discriminate between science and pseudo-science, and his assertion that science theories cannot be verified but only falsified have been used to categorically and pre-emptively reject claims of realistic Validation of computational physics models. Both of these assertions are challenged, as well as the applicability of the second assertion to modern computational physics models such as climate models, even if it were considered to be correct for scientific theories. Patrick J. Roache has been active in the broad area of computational physics for over four decades. He wrote the first textbooks in Computational Fluid Dynamics and in Verification and Validation in Computational Science and Engineering, and has been a pioneer in the V&V area since 1985. He is well qualified to confront the mis-application of Popper's philosophy to computational physics from the vantage of one actively engaged and thoroughly familiar with both the genuine problems and normative practice.
Here is a short excerpt from one of Roache's papers that gives a flavor of the argument he is addressing,
In a widely quoted paper that has been recently described as brilliant in an otherwise excellent Scientific American article (Horgan 1995), Oreskes et al (1994) think that we can find the real meaning of a technical term by inquiring about its common meaning. They make much of supposed intrinsic meaning in the words verify and validate and, as in a Greek morality play, agonize over truth. They come to the remarkable conclusion that it is impossible to verify or validate a numerical model of a natural system. Now most of their concern is with groundwater flow codes, and indeed, in geophysics problems, validation is very difficult. But they extend this to all physical sciences. They clearly have no intuitive concept of error tolerance, or of range of applicability, or of common sense. My impression is that they, like most lay readers, actually think Newton’s law of gravity was proven wrong by Einstein, rather than that Einstein defined the limits of applicability of Newton. But Oreskes et al (1994) go much further, quoting with approval (in their footnote 36) various modern philosophers who question not only whether we can prove any hypothesis true, but also “whether we can in fact prove a hypothesis false.” They are talking about physical laws—not just codes but any physical law. Specifically, we can neither validate nor invalidate Newton’s Law of Gravity. (What shall we do? No hazardous waste disposals, no bridges, no airplanes, no...) See also Konikow & Bredehoeft (1992) and a rebuttal discussion by Leijnse & Hassanizadeh (1994). Clearly, we are not interested in such worthless semantics and effete philosophizing, but in practical definitions, applied in the context of engineering and science accuracy.
Roache, P. J., “Quantification of Uncertainty in Computational Fluid Dynamics,” Annual Review of Fluid Mechanics, Vol. 29, 1997, pp. 123–160.
Thursday, January 3, 2013
National Strategy for Advancing Climate Modeling
NAP has a new report out on a national strategy for advancing climate modeling. I've just done a quick skim so far. Some tidbits below that touch on V&V and uncertainty.
Monday, July 23, 2012
Convergence for Falkner-Skan Solutions
There are some things about the paper that are not novel, and others that seem to be nonsense. It is well-known that there can be multiple solutions at given parameter values (non-uniqueness) for this equation; see White. There is the odd claim that "the flow starts to create shock waves in the medium [above the critical wedge angle], which is a representation of chaotic behavior in the flow field." Weak solutions (solutions with discontinuities/shocks) and chaotic dynamics are two different things.

They use the fact that the method they chose does not converge when two solutions are possible as evidence of chaotic dynamics. Perhaps the iterates really do exhibit chaos, but this is purely an artifact of the method (i.e. there is no physical time in this problem, only the pseudo-time of the iterative scheme). By using a different approach you will get different "dynamics", and with a proper choice of method you can get convergence (spectral even!) to any of the multiple solutions, depending on what initial condition you give your iterative scheme.

They introduce a parameter, \(\eta_{\infty}\), for the finite value of the independent variable at "infinity" (i.e. the domain is truncated). There is nothing wrong with this (it's actually a commonly used approach for this problem), but it is not a good idea to solve for this parameter along with the shear at the wall in your Newton iteration. A more careful approach of mapping the boundary point "to infinity" as the grid resolution is increased (following one of Boyd's suggested mappings) removes the need to solve for this parameter, and gives spectral convergence for this problem even in the presence of non-uniqueness and the not uncommon vexation of a boundary condition defined at infinity (all of external aerodynamics has this helpful feature).
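For reference (my summary of the standard setup, not the notation of the paper under discussion), the Falkner-Skan boundary value problem in the Hartree normalization is
\[ f''' + f f'' + \beta \left( 1 - (f')^2 \right) = 0, \qquad f(0) = f'(0) = 0, \qquad f'(\eta) \to 1 \ \text{as} \ \eta \to \infty, \]
and one of Boyd's algebraic maps of the semi-infinite domain onto a finite interval for rational Chebyshev collocation is
\[ \eta = L \, \frac{1 + x}{1 - x}, \qquad x \in [-1, 1), \]
where the map scale \(L\) is a user-chosen tuning parameter, not an additional unknown to be solved for along with the wall shear.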
Sunday, July 22, 2012
VV&UQ for Historic Masonry Structures
Abstract: This publication focuses on the Verification and Validation (V&V) of numerical models for establishing confidence in model predictions, and demonstrates the complete process through a case study application completed on the Washington National Cathedral masonry vaults. The goal herein is to understand where modeling errors and uncertainty originate from, and obtain model predictions that are statistically consistent with their respective measurements. The approach presented in this manuscript is comprehensive, as it considers all major sources of errors and uncertainty that originate from numerical solutions of differential equations (numerical uncertainty), imprecise model input parameter values (parameter uncertainty), incomplete definitions of underlying physics due to assumptions and idealizations (bias error) and variability in measurements (experimental uncertainty). The experimental evidence necessary for reducing the uncertainty in model predictions is obtained through in situ vibration measurements conducted on the masonry vaults of Washington National Cathedral. By deploying the prescribed method, uncertainty in model predictions is reduced by approximately two thirds.
Highlights:
- Developed a finite element model of Washington National Cathedral masonry vaults.
- Carried out code and solution verification to address numerical uncertainties.
- Conducted in situ vibration experiments to identify modal parameters of the vaults.
- Calibrated and validated model to mitigate parameter uncertainty and systematic bias.
- Demonstrated a two thirds reduction in the prediction uncertainty through V&V.
I haven't read the full-text yet, but it looks like a coherent (Bayesian) and pragmatic approach to the problem.
Tuesday, October 11, 2011
Notre Dame V&V Workshop
The purpose of the workshop is to bring together a diverse group of computational scientists working in fields in which reliability of predictive computational models is important. Via formal presentations, structured discussions, and informal conversations, we seek to heighten awareness of the importance of reliable computations, which are becoming ever more critical in our world.
The intended audience is computational scientists and decision makers in fields as diverse as earth/atmospheric sciences, computational biology, engineering science, applied mechanics, applied mathematics, astrophysics, and computational chemistry.
It looks very interesting.
Thursday, September 15, 2011
The Separation of V and V
The material in this post started out as background for a magazine article, and turned into the introduction to an appendix for my dissertation [1]. Since I started learning about verification and validation, I’ve always been curious about the reasons for the sharp lines we draw between the two activities. Turns out, you can trace the historical precedent for separating the tasks of verification and validation all the way back to Lewis Fry Richardson, and George Boole spent some space writing on what we’d call verification. Though neither he nor Richardson uses the modern terms, the influence of their thought is evident in the distinct technical meanings we’ve chosen to give the two common-usage synonyms “verification” and “validation.”
The term verification is given slightly different definitions by different groups of practitioners [2]. In software engineering, the IEEE defines verification as
The process of evaluating the products of a software development phase to provide assurance that they meet the requirements defined for them by the previous phase.
while the definition now commonly accepted in the computational physics community is
The process of determining that a model implementation accurately represents the developer’s conceptual description of the model and the solution to the model. [3]
The numerical weather prediction community speaks of forecast verification, which someone using the above quoted AIAA definition would probably consider a form of validation, and a statistician might use the term validation in a way the computational physicist would probably consider verification [4]. Arguing over the definitions of words [5] that are synonyms in common use is contrary to progress under a pragmatic, “wrong, but useful” conception of the modeling endeavor [6]. Rather, we should be clear on our meaning in a specific context, and thus avoid talking past colleagues in related disciplines. Throughout this work, the term verification is used consistent with currently accepted definitions in the aerospace, defense and computational physics communities [3, 7, 8, 9].
In the present diagnostic effort, the forward model code seeks an approximate solution to a discretized partial differential equation. This partial differential equation (PDE) is derived from Maxwell’s equations augmented by conservation equations derived from the general hyperbolic conservation laws through analytical simplification guided by problem-specific assumptions. The purpose of formal verification procedures such as the method of manufactured solutions (MMS) is to demonstrate that the simulation code solves these chosen equations correctly. This is done by showing ordered convergence of the simulated solution in a discretization parameter (such as mesh size or time-step size).
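To be concrete about what “ordered convergence” means here (this is the standard error ansatz, not anything specific to the code in question): the discretization error is assumed to behave like
\[ E(h) = \| u_h - u_{\mathrm{exact}} \| = C h^p + \mathcal{O}(h^{p+1}), \]
so the observed order computed from solutions on two grid levels,
\[ \hat{p} = \frac{\ln\left( E(h_2)/E(h_1) \right)}{\ln\left( h_2/h_1 \right)}, \]
should approach the formal order of accuracy of the scheme as \(h \to 0\). With a manufactured solution, \(u_{\mathrm{exact}}\) is known by construction, so \(E(h)\) can be evaluated directly rather than estimated.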
Understanding the impact of truncation error on the quality of numerical solutions has been a significant concern over the entire developmental history of such methods. Although modern codes tend to use finite element, finite volume or pseudo-spectral methods as opposed to finite differences, George Boole’s concern for establishing the credibility of numerical results is generally applicable to all these methods. In his treatise, written in 1860, Boole stated
...we shall very often need to use the method of Finite Differences for the purpose of shortening numerical calculation, and here the mere knowledge that the series obtained are convergent will not suffice; we must also know the degree of approximation.
To render our results trustworthy and useful we must find the limits of the error produced by taking a given number of terms of the expansion instead of calculating the exact value of the function that gave rise thereto. [10]
In a related vein, Lewis Fry Richardson, under the heading Standards of Neglect in the 1927 paper which introduced the extrapolation method which now bears his name, stated in his characteristically colorful language
An error of form which would be negligible in a haystack would be disastrous in a lens. Thus negligibility involves both mathematics and purpose. In this paper we discuss mathematics, leaving the purposes to be discussed when they are known. [11]
This appendix follows Richardson’s advice and confines discussion to the correctness of mathematics, leaving the purposes and sufficiency of the proposed methods for comparison with other diagnostic techniques in the context of intended applications.
While the development of methods for establishing the correctness and fitness of numerical approximations is certainly of historical interest, Roache describes why this effort in code verification is more urgently important than ever before (and will only increase in importance as simulation capabilities, and our reliance on them, grow).
In an age of spreading pseudoscience and anti-rationalism, it behooves those of us who believe in the good of science and engineering to be above reproach whenever possible. Public confidence is further eroded with every error we make. Although many of society’s problems can be solved with a simple change of values, major issues such as radioactive waste disposal and environmental modeling require technological solutions that necessarily involve computational physics. As Robert Laughlin noted in this magazine, “there is a serious danger of this power [of simulations] being misused, either by accident or through deliberate deception.” Our intellectual and moral traditions will be served well by conscientious attention to verification of codes, verification of calculations, and validation, including the attention given to building new codes or modifying existing codes with specific features that enable these activities. [6]
There is a moral imperative underlying correctness checking simply because we want to tell the truth, but this imperative moves closer "to first sight" because of the uses to which our results will be put.
The language Roache uses reflects a consensus in the computational physics community that has given the name verification to the activity of demonstrating the impact of truncation error (usually involving grid convergence studies), and the name validation to the activity of determining if a code has sufficient predictive capability for its intended use [3]. Boole’s idea of “trustworthy results” clearly underlies the efforts of various journals and professional societies [3, 7, 9, 8] to promote rigorous verification of computational results. Richardson’s separation of the questions of correct math and fitness for purpose are reflected in those policies as well. In addition, the extrapolation method developed by Richardson has been generalized to support uniform reporting of verification results [12].
Two types of verification have been distinguished [13]: code verification and calculation verification. Code verification is done once for a particular code version; it demonstrates that a specific implementation solves the chosen governing equations correctly. This process can be performed on a series of grids of any size (as long as they are within the asymptotic range) with an arbitrarily chosen solution (no need for physical realism). Calculation verification, on the other hand, is an activity specific to a given scientific investigation or decision support activity. The solution in this case will be on physically meaningful grids with physically meaningful initial conditions (ICs) and boundary conditions (BCs), and therefore has no a priori-known solution. Rather than monitoring the convergence of an error metric, solution functionals relevant to the scientific or engineering question at hand are tracked to ensure they converge (and ideally, become grid/time-step independent).
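As a concrete (and entirely hypothetical) illustration of the bookkeeping involved in a calculation verification study, here is a minimal Python sketch of the observed-order / Richardson-extrapolation / grid convergence index recipe [12, 13] applied to a single solution functional computed on three systematically refined grids; the functional values below are placeholders, not results from any code discussed here.

```python
# A minimal sketch of calculation verification bookkeeping for a single solution
# functional computed on three systematically refined grids (constant refinement
# ratio r). Formulas follow Roache's observed-order / Richardson-extrapolation /
# GCI recipe [12, 13]; the numbers at the bottom are made-up placeholders.
import math

def observed_order(f_fine, f_med, f_coarse, r):
    """Observed convergence order p from three grid levels."""
    return math.log((f_coarse - f_med) / (f_med - f_fine)) / math.log(r)

def richardson_estimate(f_fine, f_med, p, r):
    """Estimated exact value extrapolated from the two finest grids."""
    return f_fine + (f_fine - f_med) / (r**p - 1.0)

def gci_fine(f_fine, f_med, p, r, Fs=1.25):
    """Grid Convergence Index on the fine grid (fractional, not percent)."""
    eps = abs((f_med - f_fine) / f_fine)
    return Fs * eps / (r**p - 1.0)

# Placeholder functional values (e.g. a drag coefficient) on coarse/medium/fine grids:
f3, f2, f1, r = 0.5120, 0.5040, 0.5020, 2.0
p = observed_order(f1, f2, f3, r)
print(f"observed order     p = {p:.2f}")
print(f"extrapolated value   = {richardson_estimate(f1, f2, p, r):.5f}")
print(f"GCI (fine grid)      = {100 * gci_fine(f1, f2, p, r):.2f}%")
```

If the observed order is close to the formal order of the scheme and the GCI is acceptably small, the functional can reasonably be reported with that numerical uncertainty attached.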
The approach taken in this work to achieving verification is based on heavy use of a computer algebra system (CAS) [14]. The pioneering work in using computer algebra to support the development of computational physics codes was performed by Wirth in 1980 [15]. This was quickly followed by other code generation efforts [16, 17, 18, 19], demonstrations of the use of symbolic math programs to support stability analysis [20], and correctness verification for symbolically generated codes solving governing equations in general curvilinear body-fitted coordinate systems [21].
The CAS handles much of the tedious and error-prone manipulation required to implement a numerical PDE solver. It also makes creating the forcing terms necessary for testing against manufactured solutions straightforward, even for very complex governing equations. The MMS is a powerful tool for correctness checking and debugging. The parameters of the manufactured solution allow the magnitude of each term’s contribution to the error to be controlled. The recommended approach is to start with all parameters set to one (the hugely different parameter values that might obtain in a physically realistic solution can mask bugs). If a code fails to converge for such a solution, the parameter sizes can then be varied in a systematic way to locate the source of the non-convergence (as convincingly demonstrated by Salari and Knupp with a blind test protocol [22]). This gives the code developer a diagnostic capability for the code itself. The error analysis can be viewed as a sort of group test [23], where the “dilution” of each term’s (member’s) contribution to the total (group) error (response) is governed by the relative sizes of the chosen parameters. Though we fit a parametric model (the error ansatz) to determine the rate of convergence, the response really is expected to be a binary one, as in the classic group test: either the ordered convergence rate is maintained down to the round-off plateau or it is not. The dilution only governs how high the resolution must rise (and how far the error must fall) for this behavior to be confirmed. Terms with small parameters will require pushing convergence to very fine levels to ensure that an ordered error is not lurking below.
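To make the CAS-generated forcing idea concrete, here is a small illustrative sketch using SymPy (my choice for the example; the work described above used Macsyma-era tools [14, 15, 21]). A one-dimensional heat equation stands in for the much more complex governing equations discussed in the text: pick a smooth manufactured solution with tunable parameters, apply the governing operator to it symbolically, and emit the resulting source term as code.

```python
# Illustrative MMS forcing-term generation with a CAS (SymPy). The governing
# equation (1-D heat equation u_t = alpha * u_xx + s) and the manufactured
# solution are stand-ins chosen for brevity, not the equations from the text.
import sympy as sp

x, t = sp.symbols('x t', real=True)
alpha, a0, a1 = sp.symbols('alpha a0 a1', positive=True)  # tunable MMS parameters

# Manufactured solution: smooth, nontrivial, exercises both terms of the operator.
u_m = a0 * sp.sin(sp.pi * x) * sp.exp(-a1 * t)

# Apply the governing operator to u_m; the residual becomes the forcing term s
# that makes u_m an exact solution of u_t = alpha * u_xx + s.
s = sp.simplify(sp.diff(u_m, t) - alpha * sp.diff(u_m, x, 2))

print(s)            # symbolic source term
print(sp.ccode(s))  # emit C code for pasting into (or generating) the solver
```

Setting the parameters a0 and a1 to one for the initial convergence test, and then varying them systematically to localize a failure, mirrors the parameter-variation strategy described above.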
References
[1] Stults, J., Nonintrusive Microwave Diagnostics of Collisional Plasmas in Hall Thrusters and Dielectric Barrier Discharges, Ph.D. thesis, Air Force Institute of Technology, September 2011.
[2] Oberkampf, W. L. and Trucano, T. G., “Verification and Validation Benchmarks,” Tech. Rep. SAND2002-0529, Sandia National Laboratories, March 2002.
[3] “AIAA Guide for the Verification and Validation of Computational Fluid Dynamics Simulations,” Tech. rep., American Institute of Aeronautics and Astronautics, Reston, VA, 1998.
[4] Cook, S. R., Gelman, A., and Rubin, D. B., “Validation of Software for Bayesian Models Using Posterior Quantiles,” Journal of Computational and Graphical Statistics, Vol. 15, No. 3, 2006, pp. 675–692.
[5] Oreskes, N., Shrader-Frechette, K., and Belitz, K., “Verification, Validation, and Confirmation of Numerical Models in the Earth Sciences,” Science, Vol. 263, No. 5147, 1994, pp. 641–646.
[6] Roache, P. J., “Building PDE Codes to be Verifiable and Validatable,” Computing in Science and Engineering, Vol. 6, No. 5, September 2004.
[7] “Standard for Verification and Validation in Computational Fluid Dynamics and Heat Transfer,” Tech. rep., American Society of Mechanical Engineers, 2009.
[8] AIAA, “Editorial Policy Statement on Numerical and Experimental Uncertainty,” AIAA Journal, Vol. 32, No. 1, 1994, pp. 3.
[9] Freitas, C., “Editorial Policy Statement on the Control of Numerical Accuracy,” ASME Journal of Fluids Engineering, Vol. 117, No. 1, 1995, pp. 9.
[10] Boole, G., A Treatise on the Calculus of Finite Differences, Macmillan and Co., London, 3rd ed., 1880.
[11] Richardson, L. F. and Gaunt, J. A., “The Deferred Approach to the Limit. Part I. Single Lattice. Part II. Interpenetrating Lattices,” Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, Vol. 226, No. 636–646, 1927, pp. 299–361.
[12] Roache, P. J., “Perspective: A method for uniform reporting of grid refinement studies,” Journal of Fluids Engineering, Vol. 116, No. 3, September 1994.
[13] Roache, P. J., Verification and Validation in Computational Science and Engineering, Hermosa Publishers, Albuquerque, 1998.
[14] Fateman, R. J., “A Review of Macsyma,” IEEE Trans. on Knowl. and Data Eng., Vol. 1, March 1989, pp. 133–145.
[15] Wirth, M. C., On the Automation of Computational Physics, Ph.D. thesis, University of California, Davis, 1980.
[16] Cook, G. O., Development of a Magnetohydrodynamic Code for Axisymmetric, High-β Plasmas with Complex Magnetic Fields, Ph.D. thesis, Brigham Young University, December 1982.
[17] Steinberg, S. and Roache, P. J., “Using MACSYMA to Write FORTRAN Subroutines,” Journal of Symbolic Computation, Vol. 2, No. 2, 1986, pp. 213–216.
[18] Steinberg, S. and Roache, P., “Using VAXIMA to Write FORTRAN Code,” Applications of Computer Algebra, edited by R. Pavelle, Kluwer Academic Publishers, August 1984, pp. 74–94.
[19] Florence, M., Steinberg, S., and Roache, P., “Generating subroutine codes with MACSYMA,” Mathematical and Computer Modelling, Vol. 11, 1988, pp. 1107 – 1111.
[20] Wirth, M. C., “Automatic generation of finite difference equations and Fourier stability analyses,” SYMSAC ’81: Proceedings of the Fourth ACM Symposium on Symbolic and Algebraic Computation, ACM, New York, NY, USA, 1981, pp. 73–78.
[21] Steinberg, S. and Roache, P. J., “Symbolic manipulation and computational fluid dynamics,” Journal of Computational Physics, Vol. 57, No. 2, 1985, pp. 251 – 284.
[22] Knupp, P. and Salari, K., “Code Verification by the Method of Manufactured Solutions,” Tech. Rep. SAND2000-1444, Sandia National Labs, June 2000.
[23] Dorfman, R., “The Detection of Defective Members of Large Populations,” The Annals of Mathematical Statistics, Vol. 14, No. 4, 1943, pp. 436–440.
Wednesday, January 19, 2011
Empiricism and Simulation
There are two orthogonal ideas that seem to get conflated in discussions about climate modeling. One is the idea that you’re not doing science if you can’t do a controlled experiment, but of course we have observational sciences like astronomy. The other is that all this new-fangled computer-based simulation is untrustworthy, usually because “it ain’t the way my grandaddy did science.” Both are rather silly ideas. We can still weigh the evidence for competing models based on observation, and we can still find protection from fooling ourselves even when those models are complex.
What does it mean to be an experimental as opposed to an observational science? Do sensitivity studies and observational diagnostics using sophisticated simulations count as experiments? Easterbrook claims that because climate scientists do these two things with their models, climate science is an experimental science [1]. It seems like there is a motivation to claim the mantle of experimental, because it may carry more rhetorical credibility than the merely observational (the critic Easterbrook is addressing certainly thinks so). This is probably because the statements we can make about causality, and the strength of the inferences we can draw, are usually greater when we can run controlled experiments than when we are stuck with whatever natural experiments fortune provides for us (and there are sound mathematical reasons for this, having to do with optimality in experimental design rather than any label we may place on the source of the data). This seeming motivation demonstrated by Easterbrook to embrace the label of empirical is in sharp contrast to the denigration of the empirical by Tobis in his three-part series [2, 3, 4]. As I noted on his site, the narrative Tobis is trying to create with those posts has already been pre-messed with by Easterbrook; his readers just pointed out the obvious weaknesses too. One good thing about blogging is the critical and timely feedback.
The confusions of these two climate warriors are an interesting point of departure. I think they are both saying more than blah blah blah, so it’s worth trying to clarify this issue. The figure below is based on a technical report from Los Alamos [5], which is a good overview and description of the concepts and definitions for model verification and validation as they have developed in the computational physics community over the past decade or so. I think this emerging body of work on model V&V places the respective parts, experiment and simulation, in a sound framework for decision making and reasoning about what models mean.
[Figure 1: the model V&V process flowchart, after the report by Thacker et al. [5]]
The process starts at the top of the flowchart with a “Reality of Interest”, from which a conceptual model is developed. At this point the path splits into two main branches: one based on “Physical Modeling” and the other based on “Mathematical Modeling”. Something I don’t think many people realize is that there is a significant tradition of modeling in science that isn’t based on equations. It is no coincidence that an aeronautical engineer might talk of testing ideas with a wind-tunnel model or a CFD model. Both models are simplifications of the reality of interest, which, for that engineer, is usually a full-scale vehicle in free flight.
Figure 2 is just a look at the V&V process through my Design of Experiments (DoE) colored glasses.
My distorted view of the V&V process is shown to emphasize that there’s plenty of room for experimentalists to have fun (maybe even a job [3]) in this, admittedly model-centric, sandbox. However, the transferability of the basic experimental design skills between “Validation Experiments” and “Computational Experiments” says nothing about what category of science one is practicing. The method of developing models may very well be empirical (and I think Professor Easterbrook and I would agree it is, and maybe even should be), but that changes nothing about the source of the data which is used for “Model Validation.”
The computational experiments highlighted in Figure 2 are for correctness checking, but those aren’t the sorts of computational experiments Easterbrook claimed made climate science an experimental science. Where do sensitivity studies and model-based diagnostics fit on the flowchart? I think sensitivity studies fit well in the activity called “Pre-test Calculations”, which, one would hope, inform the design of experimental campaigns. Diagnostics are more complicated.
Heald and Wharton have a good explanation for the use of the term “diagnostic” in their book on microwave-based plasma diagnostics: “The term ‘diagnostics,’ of course, comes from the medical profession. The word was first borrowed by scientists engaged in testing nuclear explosions about 15 years ago [c. 1950] to describe measurements in which they deduced the progress of various physical processes from the observable external symptoms” [6]. With a diagnostic we are using the model to help us generate our “Experimental Data”, so that would happen within the activity of “Experimentation” on this flowchart. This use of models as diagnostic tools is applied to data obtained from either experiment (e.g. laboratory plasma diagnostics) or observations (e.g. astronomy, climate science), so it says nothing about whether a particular science is observational or experimental. Classifying scientific activities as experimental or observational is of passing interest, but I think far too much emphasis is placed on this question for the purpose of winning rhetorical “points.”
The more interesting issue from a V&V perspective is introducing a new connection in the flowchart that shows how a dependency between model and experimental data could exist (Figure 3). Most of the time the diagnostic model and the model being validated are different. However, the case where they are the same is an interesting and practically relevant one that is not addressed in the current V&V literature that I know of (please share links if you know of any).
It should be noted that even though the same model may be used to make predictions and perform diagnostics, it will usually be run in a different way for those two uses. The significant changes between Figure 1 and Figure 3 are the addition of an “Experimental Diagnostic” box and the change to the mathematical cartoon in the “Validation Experiment” box. The change to the cartoon is to indicate that we can’t measure what we want directly (u), so we have to use a diagnostic model to estimate it based on the things we can measure (b). An example of when the model-based diagnostic is relatively independent of the model being validated might be using a laser-based diagnostic for fluid flow: the equations describing propagation of the laser through the fluid are not the same as those describing the flow. An example of when the two codes might be connected would be if you were trying to use ultrasound to diagnose a flow: the diagnostic model and the predictive model could both be Navier-Stokes with turbulence closures, establishing the validity of which is the aim of the investigation. I’d be interested in criticisms of how I explained this / charted this out.
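One way to write down the dependency I have in mind (my notation, not anything from the report [5]): if the diagnostic model maps the quantity we actually care about, \(u\), to the quantity we can measure, \(b\), through an observation operator \(H\) plus noise, \( b = H(u) + \epsilon \), then the “experimental data” used for validation is really an estimate \( \hat{u} = \arg\min_u \| b - H(u) \| \), and any modeling error in \(H\) leaks into that estimate. When \(H\) is built from the same governing equations as the model being validated, the validation comparison is no longer between independent quantities.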
Afterword
Attempt at Answering Model Questions
I’m not in the target population that professor Easterbrook is studying, but here’s my attempt at answering his questions about model validation [7].
- “If I understand correctly–a model is ’valid’ (is that a formal term?) if the code is written to correctly represent the best theoretical science at the time...”
I think you are using an STS flavored definition for “valid.” The IEEE/AIAA/ASME/US-DoE/US-DoD definition differs. “Valid” means observables you get out of your simulations are “close enough” to observables in the wild (experimental results). The folks from DoE tend to argue for a broader definition of valid than the DoD folks. They’d like to include as “validation” the activities of a scientist comparing simulation results and experimental results without reference to an intended use.
- “– so then what do the results tell you? What are you modeling for–or what are the possible results or output of the model?”
Doing a simulation (running the implementation of a model) makes explicit the knowledge implicit in your modeling choices. The model is just the governing equations; you have to run a simulation to find solutions to those governing equations.
- “If the model tells you something you weren’t expecting, does that mean it’s invalid? When would you get a result or output that conflicts with theory and then assess whether the theory needs to be reconsidered?”
This question doesn’t make sense to me. How could you get a model output that conflicted with theory? The model is based on theory. Maybe this question is about how simplifying assumptions could lead to spurious results? For example, if a simulation result shows failure to conserve mass/momentum/energy in a specific calculation possibly due to a modeling assumption (more likely due to a more mundane error), I don’t think anyone but a perpetual-motion machine nutter would seriously reconsider the conservation laws.
- “Then is it the theory and not the model that is the best tool for understanding what will happen in the future? Is the best we can say about what will happen that we have a theory that adheres to what we know about the field and that makes sense based on that knowledge?”
This one doesn’t make sense to me either. You have a “theory,” but you can’t formulate a “model” of it and run a simulation, or just a pencil and paper calculation? I don’t think I’m understanding how you are using those words.
- “What then is the protection or assurance that the theory is accurate? How can one ‘check’ predictions without simply waiting to see if they come true or not come true?”
There’s no magic; the protection from fooling ourselves is the same as it has always been, only the names of the problems change.
Attempt at Understanding Blah Blah Blah
- “The trouble comes when empiricism is combined with a hypothesis that the climate is stationary, which is implicit in how many of their analyses work.” [8]
The irony of this statement is extraordinary in light of all the criticisms by the auditors and others of statistical methods in climate science. It would be a valid criticism, if it were supported.
- “The empiricist view has never entirely faded from climatology, as, I think, we see from Curry. But it’s essentially useless in examining climate change. Under its precepts, the only thing that is predictable is stasis. Once things start changing, empirical science closes the books and goes home. At that point you need to bring some physics into your reasoning.” [2]
So we’ve gone from what could be reasonable criticism of unfounded assumptions of stationarity to empiricism being unable to explain or understand dynamics. I guess the guys working on embedding dimension stuff, or analogy based predictions would be interested to know that.
- “See, empiricism lacks consilience. When the science moves in a particular direction, they have nothing to offer. They can only read their tea leaves. Empiricists live in a world which is all correlation, and no causation.” [3]
Let's try some definitions.
- empiricism: knowledge through observation
- consilience: unity of knowledge, non-contradiction
How can the observations contradict each other? Maybe a particular explanation for a set of observations is not consilient with another explanation for a different set of observations. This seems to be something that would get straightened out in short order though: it’s on this frontier that scientific work proceeds. I’m not sure how empiricism is “all correlation.” This is just a bald assertion with no support.
- “While empiricism is an insufficient model for science, while not everything reduces to statistics, empiricism offers cover for a certain kind of pseudo-scientific denialism. [...] This is Watts Up technique asea; the measurements are uncertain; therefore they might as well not exist; therefore there is no cause for concern!” [4]
Tobis: Empiricism is an insufficient model for science. Feynman: The test of all knowledge is experiment. Tobis: Not everything reduces to statistics. Jaynes: Probability theory is the logic of science. To be fair, Feynman does go on to say that you need imagination to think up things to test in your experiments, but I’m not sure that isn’t included in empiricism. Maybe it isn’t included in the empiricism Tobis is talking about.
So that’s what all this is about? You’re upset at Watts making a fallacious argument about uncertainty? What does empiricism have to do with this? It would be simple enough to just point out that uncertainty doesn’t mean ignorance.
Not quite blah blah blah, but the argument is still hardly thought out and poorly supported.
References
[1] Easterbrook, S., “Climate Science is an Experimental Science,” http://www.easterbrook.ca/steve/?p=1322, February 2010.
[2] Tobis, M., “The Empiricist Fallacy,” http://initforthegold.blogspot.com/2010/11/empiricist-fallacy.html, November 2010.
[3] Tobis, M., “Empiricism as a Job,” http://initforthegold.blogspot.com/2010/11/empiricism-as-job.html, November 2010.
[4] Tobis, M., “Pseudo-Empiricism and Denialism,” http://initforthegold.blogspot.com/2010/11/pseudo-empiricism-and-denialism.html, November 2010.
[5] Thacker, B. H., Doebling, S. W., Hemez, F. M., Anderson, M. C., Pepin, J. E., and Rodriguez, E. A., “Concepts of Model Verification and Validation,” Tech. Rep. LA-14167-MS, Los Alamos National Laboratory, Oct 2004.
[6] Heald, M. and Wharton, C., Plasma Diagnostics with Microwaves, Wiley series in plasma physics, Wiley, New York, 1965.
[7] Easterbrook, S., “Validating Climate Models,” http://www.easterbrook.ca/steve/?p=2032, November 2010.
[8] Tobis, M., “Empiricism,” http://initforthegold.blogspot.com/2010/11/empiricism.html, November 2010.
Thanks to George Crews and Dan Hughes for their critical feedback on portions of this.
[Update: George left a comment with suggestions on changing the flowchart. Here's my take on his suggested changes.
A slightly modified version of George's chart. I think it makes more sense to have the "No" branch of the validation decision point back at "Abstraction", which parallels the "No" branch of the verification decision pointing at "Implementation". Also switched around "Experimental Data" and "Experimental Diagnostic." Notably absent is any loop for "Calibration"; this would properly be a separate loop with output feeding in to "Computer Model." ]
Saturday, January 8, 2011
Mathematical Science Foundations of Validation, Verification, and Uncertainty Quantification
- A committee of the NRC will examine practices for verification and validation (V&V) and uncertainty quantification (UQ) of large-scale computational simulations in several research communities.
- Identify common concepts, terms, approaches, tools, and best practices of V&V and UQ.
- Identify mathematical sciences research needed to establish a foundation for building a science of V&V and for improving practice of V&V and UQ.
- Recommend educational changes needed in the mathematical sciences community and mathematical sciences education needed by other scientific communities to most effectively use V&V and UQ.
Here's a list of the folks on the committee. It should be interesting to see the study results (it's an 18 month long effort).