Comments on Various Consequences: Fun with Filip

Finally got around to doing a little post on exper...

2012-07-04T20:35:16.972-04:00

Finally got around to doing a little post on experimental design with an eye to using a good basis.

The google book preview of the reference (Kennedy ...

2012-02-27T19:11:27.956-05:00

The google book preview of the reference (Kennedy and Gentle, 1980) mentioned in R help(poly) is missing a page smack in the middle, so here's a copy of those few pages for your fair use in research, teaching, scholarship, commentary, and, of course, Statistical Computing.

Found these interesting sounding papers while doin...

2012-01-01T10:39:39.609-05:00

Found these interesting sounding papers while doing some unrelated back-ground reading.

Chebyshev-Vandermonde Systems
Abstract. A Chebyshev-Vandermonde matrix
$ V = \left[ p_j \left( z_k \right) \right]^n_{j,k=0} \in C^{(n+1) \times (n+1)} $
is obtained by replacing the monomial entries of a Vandermonde matrix by Chebyshev polynomials $ p_j $ for an ellipse. The ellipse is also allowed to be a disk or an interval. We present a progressive scheme for allocating distinct nodes $ z_k $ on the boundary of the ellipse such that the Chebyshev-Vandermonde matrices obtained are reasonably well-conditioned. Fast progressive algorithms for the solution of the Chebyshev-Vandermonde systems are described. These algorithms are closely related to methods recently presented by Higham. We show that the node allocation is such that the solution computed by the progressive algorithms is fairly insensitive to perturbations in the right-hand side vector. Computed examples illustrate the numerical behavior of the schemes. Our analysis can also be used to bound the condition number of the polynomial interpolation operator defined by Newton's interpolation formula. This extends earlier results of Fischer and the first author.

And a couple others I haven't had a chance to grab the full-text on yet:
[1] Converting Interpolation Series into Chebyshev Series by Recurrence Formulas, Herbert E. Salzer, Mathematics of Computation, Vol. 30, No. 134 (Apr., 1976), pp. 295-302, (article consists of 8 pages), Published by: American Mathematical Society, Stable URL: http://www.jstor.org/stable/2005971

[2] D. Calvetti, L. Reichel, A Chebychev-Vandermonde solver, Linear Algebra and its Applications, Volume 172, 15 July 1992, Pages 219-229, ISSN 0024-3795, 10.1016/0024-3795(92)90027-8.
(http://www.sciencedirect.com/science/article/pii/0024379592900278)

I said, choice of basis should inform the experime...

2011-10-24T20:16:22.773-04:00

I said, choice of basis should inform the experimental design.

Well, there's nothing new under the sun: Optimal Designs for Large Degree Polynomial Regression

For optimal sequential design they recommend just picking a design that is optimal for "high enough" order, and accepting the less than optimal performance for the initial (low-order) screening. This is actually one of the big pragmatic problems I struggled with in trying to use Chebyshev roots points in a real design. Almost all real designs are sequential (test a bit, learn a bit, and then test some more), and equally spaced points lend themselves to this approach very easily.

Thanks for the hints on using the Latex markup. I...

2011-09-13T02:38:55.227-04:00

Thanks for the hints on using the Latex markup.

I started using it on some CO2 models. Take a look at the top 2 posts here:
http://theoilconundrum.blogspot.com/

No Legendre or Chebyshev in any of this, just some simple first-order PD models.

The second one is a Fokker-Planck model. The Latex works out pretty neatly!

Hi Ray, the data for this post is in a text file o...

2011-09-02T21:39:05.829-04:00

Hi Ray, the data for this post is in a text file on the NIST site.

Welcome to my link collection! Please feel free to add more ; - )

Since you seem to be interested in representing da...

2011-09-02T19:38:44.414-04:00

Since you seem to be interested in representing data in a minimal set of basis functions; have you considered using Muntz exponential polynomial series? They have two sets of variables: time (well decay) constants and coefficients. First the time constants would be calculated to minimize the least squares and then coefficients would be computed (actually at the same time the first time). Despite appearances there are orthogonal exponential functions available.
An advantage is that if we believe we have found system functions then different system stimulus should correspond to changing coefficients.
I have been thinking about coding it up for a couple of years; but no motivation.
If your interested I have several papers/links. Or send me data and I can try to get around to some coding.

Thanks Nick, I'll add that one to my little sc...

2011-09-02T07:47:52.785-04:00

Thanks Nick, I'll add that one to my little script. Scipy has lots of options for basis in scipy.special.

Carrick, I was thinking about point spacing some more, and it probably matters less for the overdetermined case than if you're doing interpolation (that is what I've been doing a lot recently so it was stuck in my head). I have enough trouble getting folks to try plain-vanilla experimental designs, I don't know how I'd pitch one based on Chebyshev roots spacing ; - )

Thanks for the links; I'll add them to my reading list; sounds interesting.

For Latex in comments, see this post on MathJax.

JS, It is an interesting problem. I tried using R ...

2011-09-02T06:05:40.185-04:00

JS,
It is an interesting problem. I tried using R with Chebychev polynomials, which worked well. I found the original condition number was 7.66e+21. After centering and normalising to a unit interval, that came down to 8.72e+6. Then after the Chebychev transformation, the condition number was 2743.

I used the stable recurrence relation to calculate the Chebychev polynomials. The coefficients came out as:
Coeficients B0:B10
[1,] -1.4674896142294e+03
[2,] -2.7721795919327e+03
[3,] -2.3163710816083e+03
[4,] -1.1279739409834e+03
[5,] -3.5447823370325e+02
[6,] -7.5124201739355e+01
[7,] -1.0875318035531e+01
[8,] -1.0622149858892e+00
[9,] -6.7019115459320e-02
[10,] -2.4678107827540e-03
[11,] -4.0296252508027e-05
which seems to be pretty much full double precision (default in R).

Here's the code:
# see http://noconsensus.wordpress.com/2011/08/19/slow-feedback/#comment-54341
# Nick Stokes - solving using Chebychev
condno = function(k){
w=eigen(t(k) %*%k)$val;
print(noquote(paste("Condition number =",round(max(w)/min(w),2))));
}
#download.file("http://www.itl.nist.gov/div898/strd/lls/data/LINKS/DATA/Filip.dat","filip.txt");
v=matrix(scan("filip.txt",skip=60,nlines=82),nrow=2); # Read NIST data set
y=v[1,]; x=v[2,]; N=10; i0=0:N;
condno(outer(x,i0,"^")); # Print condition no
c0 = -1/mean(x); c1 = 2/c0/diff(range(x))
x = c1*(1 + x*c0) ; # Transformed to unit unterval
condno(outer(x,i0,"^")); # Print condition no
g=cbind(1,x);
# Now create Chebs by recurrence
for(i in 2:N) g=cbind(g, 2*x*g[,i] - g[,i-1]);
condno(g); ## Print condition number
## Mult modified design matrix to get kernel
k=t(g)%*%g; ky=t(g)%*%y;
### Solve to complete regression
b=drop(solve(k,ky));
## Now make binomial (a) and Cheb coef matrices
binom=cheb=matrix(0,N+1,N+1); a2=0:1; binom[,1]=a1=1;
for(i in 1:N){
j=1:i; cheb[i,j]=a1;
binom[i+1,j+1]=binom[i,j]+binom[i,j+1];
a0=2*c(0,a1)-a2; a2=c(a1,0,0); a1=a0;
}
cheb[N+1,]=a1;
## Now convert Cheb coefs to original polynomial coefs
b1= b %*% cheb;
b2= (((c1^i0)*b1) %*% binom)* (c0^i0) ## These are the NIST numbers
rownames(b2)="Coeficients B0:B10";
print(noquote(format(t(b2),digits=14)));

ugh...preview isn't helping much here.... one...

2011-09-02T05:29:24.640-04:00

ugh...preview isn't helping much here.... one important omission:

"THe location x,y is relative to the position of a tower"

I missed a factor of pi in one place and a few oth...

2011-09-02T05:27:18.134-04:00

I missed a factor of pi in one place and a few other things like dropped words, but hopefully the thrust of my comment makes sense.

Regarding this: That brings up another good point; the choice of basis should inform the experimental design. Maybe that will be the next post.

That is absolutely important in good experimental design. I encountered the following problem as a post doc: We want to measure the gravitational acceleration anomalies (g(x,y,0) - g_model(x,y,0)) and use this to interpolate the value of g(0,0,z), where z > 0 is the height above the earth. (THe location x,y is the position of a tower where we're going to obtain measured values of g at various heights z_k, however, we have no choice where we measure. The tower is a large TV tower, with a central elevator, and a fixed number of platforms where you can safely get out of the elevator to measure g. These locations turn out to be where the tower is guyed, which also minimizes horizontal motion of the tower at those elevations.

Given an infinite amount of time, you'd just use a rectangular grid over x and y. What we did (of course) was transform to polar coordinates r,theta,z, then select the values r_k upon which to make m measurements at equal angles 2*pi/m.

That was theory, practice puts some of the points in the middle of catfish ponds or swamps, so they have to be moved to accommodate reality (and that in turn affects the accuracy of your results);. This experiment (in case you hadn't guessed) was to provide a high-resolution test of Newton's inverse square law up a tower.

An error analysis may be found here but it precedes the more precise experiment performed later. (I was a contributor to the latter but not former experiment).

The size of the outer ring was chosen incidentally to tie in the measurements to an existing data base of surface gravity measurements collected by the petroleum industry.

Yep, you're right about that (of course) It ...

2011-09-02T04:46:17.668-04:00

Yep, you're right about that (of course) It has been over 30 years since I had written that code (written as a grad student, and loosely following Numerical Recipes in C development) so I had to dig up my code to remind myself. Fortunately the code was well documented, and it was to derive your relationship.

I had remembered you needed a special choice of point to minimize Runge's phenomenon, but had forgotten what that choice was. This turns out to be:

x'_k = cos( (k+1/2 pi)/N)

here "N" is the truncation order of the series (we are expanding from polynomial order 0 to N-1) and where,

x = (xmax-xmin)/2 * x' + (xmax + xmin)/2

This maps our original boundaries [xmin,xmax] onto [-1,1] of course.

So it certainly involves a special interval for the approximation to be optimal (I had remembered that part). Note this gives a descending series in x_k (x_k < x_k-1)

Anyway, it's easy enough to show your contention (we'll stick to the boundary near k=0 the other follows mutatis mutandis).

x_k - x_0 = cos((k+1/2) pi/N)-1 ~= [(k+1/2)/N]^2/2

for k small, and voila a clustering 1/N^2 at the boundaries as per your comment!

(Is there a way to insert formulas on your site?).

In general, you can show,

Delta x_k = x'_k - x'_(k-1) = -2 sin(pi/2N) sin(k pi/N)

From this it is obvious that the spacing between points gets clustered at the boundaries.

Good point on Chebyshev; I think to really get the...

2011-09-01T13:29:08.058-04:00

Good point on Chebyshev; I think to really get the benefit from using that basis to avoid Runge phenomena you need the $ 1/n^2 $ clustering of the samples near the boundary.

That brings up another good point; the choice of basis should inform the experimental design. Maybe that will be the next post.

Nice work! I haven't tried it, but I have th...

2011-09-01T10:35:27.609-04:00

Nice work!

I haven't tried it, but I have the Tschebycheff expansion in my canned software routine (I didn't try it because it's C code, so it doesn't have things like the numerical gain of the method computed).

A typical use would be a special function (often implemented the old fashion way, e.g., direct integration of the differential equation defining it). If you had good luck with Legendre, I suspect Tschebycheff would work as well, or even be slightly superior to the Legendre polynomial series.

The main advantage I can think of is the Tschebycheff series is, if you want an optimal interpolating & efficient function, this is it. (As you know the question is "what happens to the fitted polynomial between interpolating points", whereas a plain old polynomial can oscillate wildly.

Function evaluation can also be computed efficiently using Clenshaw's upwards recurrence relation. So again, it all has to do with what you are trying to accomplish, with what is "optimal".