This post is another installment in the Lorenz '63 series. In it I'll try to address a common confusion among folks trying to understand what uncertainty in initial conditions means for our ability to predict the future state of a system (in particular as it pertains to climate change). The common pedagogical technique is to describe weather prediction as an initial value problem (IVP) and climate prediction as a boundary value problem (BVP) (for example, see Serendipity). I don't think that's a very good teaching technique (it seems too hand-wavy to me [update: better reasoning given down in this comment and this comment]). I think there are probably better approaches that would give more insight into the problems. I'm a learn-by-doing / example kind of guy, so that's what this post will focus on: using the Lorenz '63 system as a useful toy to give insight into the problem.
In fact, predictions of both climate and weather often use the same models, which approximately solve the same initial-boundary value problem described by partial differential equations (PDEs) for the conservation of mass, momentum, and energy (along with many parametrizations for physical and chemical processes in the atmosphere, as well as sub-grid scale models of unresolved flow features). The real distinction is that climate researchers and weather forecasters care about different functions or statistics of the solutions over different time-scales. A weather forecast depends on providing time-accurate predictive distributions of the state of the atmosphere in the near future. A climate prediction is trying to provide a predictive distribution of a time-averaged atmospheric state which is (hopefully) independent of time far enough into the future (there's an implicit ergodic hypothesis here, which, as reader Tom Vonk points out, still requires some theoretical developments to justify for the PDEs we're actually interested in).
Another concept I’d like to introduce before I show example results from the Lorenz ’63 system is the entropy of a probability distribution. This can be viewed as a measure of informativeness of the distribution. [update: this paper presents the idea in context of climatic predictions] So a very informative distribution would have low entropy, but an uninformative distribution would have maximum entropy. For weather forecasting we would like low entropy in our predictive distributions, because that means we have significant information about what is going to happen. For climate, the distribution itself is what we want, and we are actually in some sense looking for the maximum entropy distribution that is consistent with our constraints. Measuring the “distance” between distributions with different constraints is what climate forecasting is all about.
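To make the low-entropy / maximum-entropy contrast concrete, here's a minimal sketch (the `entropy` helper and the two example distributions are mine, not from the post) comparing a sharply peaked distribution to a uniform one over the same four outcomes:

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a discrete probability mass function."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # 0 * log(0) is taken as 0
    return -np.sum(p * np.log(p))

peaked = [0.97, 0.01, 0.01, 0.01]   # informative: low entropy
uniform = [0.25, 0.25, 0.25, 0.25]  # uninformative: maximum entropy

print(entropy(peaked))
print(entropy(uniform))  # log(4) ~ 1.386, the maximum for 4 outcomes
```

The uniform distribution attains the maximum entropy for a fixed number of outcomes; any extra constraint (information) we can impose pulls the entropy down.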
Now for the toying with the Lorenz '63 system. The "ensembles" I'm running just vary the initial condition in the x-component (using the same approach shown here). I'm also using two slightly different bx parameters in the forcing function. Figure 1 shows the x-component of the resulting trajectories.
The analogy to the weather / climate difference is pretty well illustrated by these results. Up to t ~ 3 we could do some pretty decent "weather" prediction: the ensemble members stay close together. After that things diverge abruptly (exponential growth of the initial differences), which illustrates the need to "spin up" a model when you are interested in the climate distributions. After t ~ 10 we could probably start estimating the "climate" of the two different forcings (histograms for 10 < t < 12 are shown in Figure 2).
These trajectories illustrate another interesting aspect of deterministic chaos: our uncertainty in the future state does not increase monotonically; it grows and shrinks in time (for instance, compare the spread in the ensemble members at t = 4 to that at t = 8 in Figure 1). The entropy of the distribution of the trajectories as a function of time is shown in Figure 3. A Python function to calculate this for my ensemble (stored in a 2D array) is shown below.
# the ensemble runs across the first dimension of x, the time
# runs across the second; return entropy as a function of time
import numpy as np

eps = 1e-16
n_members, n_times = x.shape
ent = np.zeros(n_times, dtype=float)
bin_edges = np.linspace(x.min(), x.max(), int(np.sqrt(n_members)))
for i in range(n_times):
    # the histogram function returns a probability density with density=True
    p, _ = np.histogram(x[:, i], bins=bin_edges, density=True)
    # we would like a probability mass for each bin, so we need to
    # multiply by the width of the bin:
    dx = np.diff(bin_edges)
    p = dx * p
    # normalize (it's generally very close, this is probably
    # unnecessary), and take care of zero-p bins so we don't get
    # NaNs in the log:
    p = p / p.sum() + eps
    ent[i] = -np.sum(p * np.log(p))
The entropy gives us a nice measure of the informativeness of our ensemble. In the initial stages (t < 4) we've got small entropy (we could make "weather" predictions here). There's a significant spike around t = 4, and then we see the entropy drop off for a bit (or a nat, ha-ha) around t = 8, which matches the eyeballing of the ensemble spread we did earlier.
That’s it for toy model results, now for some conclusions and opinions.
There are two honest concerns with climate forecasting (if you know of more, let me hear about them; if you don't think my concerns are honest, let me hear that too). First, are the things we can predict with climate modeling useful for planning mitigation and adaptation policies? Many of the alarming predictions of costs and catastrophes attributed to climate change in the press (and even in the IPCC's reports) depend on particular regional (rather than global) climate changes. I think an open research question is how the time averaging (and large time-steps) involved in calculating the long trajectories for estimating equilibrium climate distributions (setting aside the theoretical underpinning of this ergodic assumption) affects the accuracy of the predicted spatial variations and regional changes (it is, after all, a PDE rather than an ordinary differential equation (ODE)). This seems to be an area of research that is just beginning. Also, the papers I've been able to find so far don't seem to report any grid convergence index results for the solutions (please link in the comments to papers that do have this, if you know of any, thanks). This is an important part of what Roache calls calculation verification (as opposed to the code verification demonstrated here).
Second, what about the details of the "averaging window" (in both space and time)? How useful to policy are the results of long time-averages? Are the equilibrium distributions things we will ever actually reach? These two concerns, the usefulness of the climate modeling product for policy makers (and, in a Republic like mine, the public) and the details of the averaging, seem to be the motivation for Pielke Sr.'s quibble over in this thread about the definition of climate and the implications of chaos. As Pielke points out, take your averaging volume small enough and things start looking pretty chaotic.
My personal view is that deterministic chaos is neat in toy problems (and a fun challenge for applying the method of manufactured solutions), but in the real world our uncertainties about everything (and the stochastic nature of many of the forcings) swamp the infinitesimals. The thing that makes the dynamics of the climate-policy-science “system” interesting is the tension between giving useful insight to decision makers and ensuring that insight is not overly sensitive to our inescapable uncertainties. Right now, the state of that system is far from equilibrium.[Update: These survey results (courtesy of Roger Pielke Sr) are interesting. They seem to indicate that "climate scientists" view climate prediction as an IVP.
The relevant question is "16. How would you rate the ability of global climate models to (very poor 1 2 3 4 5 6 very good): 16c. model temperature values for the next 10 years; 16d. model temperature values for the next 50 years." The mean response for the longer-term prediction is actually lower than for the short-term prediction (3.7 vs. 4.2), though the difference isn't that big considering the standard deviation.
Comparing the "reproduce observations" questions (16a. and 16b.) with the "predict future values" questions (16c. and 16d.) goes to the validation question: would you bet your life on your model's predictions?]