Suppose you are trying to estimate the effect that 6 factors have on a response, and you know that none of the factors influence the effect of the others, so that a simple model like this
(1) |
is the perfect choice. How should you get the data you need to estimate the ’s? You may be tempted to design a test to estimate each of these factors by changing one factor at a time (OFAT). There are no interaction terms (e.g. ) in equation 1. So there’s no need to perform any runs that change several of the ’s at once, right? Wrong.
Table 2 shows a 36 run OFAT design. There are three repeated cases for each treatment. Table 1 shows a 32 run D-optimal design. There are no repeated runs. You might expect that you would be better able to estimate error from the design in Table 2 because of replication, but you’d be wrong. In fact, as Figure 1 shows,
the average standard error in the coefficient estimates for the model in equation 1 are significantly lower for the D-optimal design most of the time even with fewer runs than the OFAT design.
Why does this happen? Each run in the D-optimal design contributes to the estimate of every term in the model. However, each run in the OFAT design can only contribute to the estimate of a single term in the model. The “error bars” for OFAT designs will almost always be significantly larger than D-optimal designs (other optimality criteria give largely the same improvement over OFAT in practice).
Table 1: D-Optimal Design
X1 | X2 | X3 | X4 | X5 | X6 | |
1 | -1 | -1 | -1 | -1 | -1 | -1 |
2 | 1 | -1 | -1 | -1 | -1 | -1 |
3 | -1 | 1 | -1 | -1 | -1 | -1 |
5 | -1 | -1 | 1 | -1 | -1 | -1 |
8 | 1 | 1 | 1 | -1 | -1 | -1 |
9 | -1 | -1 | -1 | 1 | -1 | -1 |
12 | 1 | 1 | -1 | 1 | -1 | -1 |
14 | 1 | -1 | 1 | 1 | -1 | -1 |
15 | -1 | 1 | 1 | 1 | -1 | -1 |
17 | -1 | -1 | -1 | -1 | 1 | -1 |
20 | 1 | 1 | -1 | -1 | 1 | -1 |
22 | 1 | -1 | 1 | -1 | 1 | -1 |
23 | -1 | 1 | 1 | -1 | 1 | -1 |
26 | 1 | -1 | -1 | 1 | 1 | -1 |
27 | -1 | 1 | -1 | 1 | 1 | -1 |
29 | -1 | -1 | 1 | 1 | 1 | -1 |
32 | 1 | 1 | 1 | 1 | 1 | -1 |
34 | 1 | -1 | -1 | -1 | -1 | 1 |
35 | -1 | 1 | -1 | -1 | -1 | 1 |
37 | -1 | -1 | 1 | -1 | -1 | 1 |
40 | 1 | 1 | 1 | -1 | -1 | 1 |
41 | -1 | -1 | -1 | 1 | -1 | 1 |
44 | 1 | 1 | -1 | 1 | -1 | 1 |
48 | 1 | 1 | 1 | 1 | -1 | 1 |
49 | -1 | -1 | -1 | -1 | 1 | 1 |
54 | 1 | -1 | 1 | -1 | 1 | 1 |
55 | -1 | 1 | 1 | -1 | 1 | 1 |
56 | 1 | 1 | 1 | -1 | 1 | 1 |
59 | -1 | 1 | -1 | 1 | 1 | 1 |
60 | 1 | 1 | -1 | 1 | 1 | 1 |
62 | 1 | -1 | 1 | 1 | 1 | 1 |
63 | -1 | 1 | 1 | 1 | 1 | 1 |
Table 2: OFAT Design
X1 | X2 | X3 | X4 | X5 | X6 | |
1 | 1 | 0 | 0 | 0 | 0 | 0 |
2 | 0 | 1 | 0 | 0 | 0 | 0 |
3 | 0 | 0 | 1 | 0 | 0 | 0 |
4 | 0 | 0 | 0 | 1 | 0 | 0 |
5 | 0 | 0 | 0 | 0 | 1 | 0 |
6 | 0 | 0 | 0 | 0 | 0 | 1 |
7 | -1 | -0 | -0 | -0 | -0 | -0 |
8 | -0 | -1 | -0 | -0 | -0 | -0 |
9 | -0 | -0 | -1 | -0 | -0 | -0 |
10 | -0 | -0 | -0 | -1 | -0 | -0 |
11 | -0 | -0 | -0 | -0 | -1 | -0 |
12 | -0 | -0 | -0 | -0 | -0 | -1 |
13 | 1 | 0 | 0 | 0 | 0 | 0 |
14 | 0 | 1 | 0 | 0 | 0 | 0 |
15 | 0 | 0 | 1 | 0 | 0 | 0 |
16 | 0 | 0 | 0 | 1 | 0 | 0 |
17 | 0 | 0 | 0 | 0 | 1 | 0 |
18 | 0 | 0 | 0 | 0 | 0 | 1 |
19 | -1 | -0 | -0 | -0 | -0 | -0 |
20 | -0 | -1 | -0 | -0 | -0 | -0 |
21 | -0 | -0 | -1 | -0 | -0 | -0 |
22 | -0 | -0 | -0 | -1 | -0 | -0 |
23 | -0 | -0 | -0 | -0 | -1 | -0 |
24 | -0 | -0 | -0 | -0 | -0 | -1 |
25 | 1 | 0 | 0 | 0 | 0 | 0 |
26 | 0 | 1 | 0 | 0 | 0 | 0 |
27 | 0 | 0 | 1 | 0 | 0 | 0 |
28 | 0 | 0 | 0 | 1 | 0 | 0 |
29 | 0 | 0 | 0 | 0 | 1 | 0 |
30 | 0 | 0 | 0 | 0 | 0 | 1 |
31 | -1 | -0 | -0 | -0 | -0 | -0 |
32 | -0 | -1 | -0 | -0 | -0 | -0 |
33 | -0 | -0 | -1 | -0 | -0 | -0 |
34 | -0 | -0 | -0 | -1 | -0 | -0 |
35 | -0 | -0 | -0 | -0 | -1 | -0 |
36 | -0 | -0 | -0 | -0 | -0 | -1 |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# | |
# why one-factor-at-a-time testing is a bad idea | |
# http://www.variousconsequences.com/2013/02/no-interactions-ofat-is-still-bad-idea.html | |
# | |
library(AlgDesign) | |
library(xtable) | |
# full factorial of 6 factors at 2 levels: | |
fullfac = gen.factorial(levels=c(2,2,2,2,2,2), center=TRUE) | |
# D-optimal fraction, 32 runs: | |
dx.1 = optFederov(~(X1+X2+X3+X4+X5+X6)^2, data=fullfac, nTrials=32, nullify=2, maxIteration=64, nRepeats=200) | |
# ofat design with replication, 36 runs: | |
dx.2 = data.frame(rbind(diag(6),-diag(6),diag(6),-diag(6),diag(6),-diag(6))) | |
ntrials = 1000 # number of times to fit the model to synthetic data | |
std.err.1 = rep(0,ntrials) # record average standard error for each design | |
std.err.2 = rep(0,ntrials) | |
for(i in seq(ntrials)){ | |
# add a response: | |
dat.1 = transform(dx.1$design, Y=X1+X2+X3+X4+X5+X6+rnorm(nrow(dx.1$design),mean=0,sd=1)) | |
dat.2 = transform(dx.2, Y=X1+X2+X3+X4+X5+X6+rnorm(nrow(dx.2),mean=0,sd=1)) | |
# fit the models | |
lm.1 = lm(Y~X1+X2+X3+X4+X5+X6, data=dat.1) | |
lm.2 = lm(Y~X1+X2+X3+X4+X5+X6, data=dat.2) | |
# average the standard errors for the model coefficients | |
std.err.1[i] = mean(summary(lm.1)$coefficients[,2]) | |
std.err.2[i] = mean(summary(lm.2)$coefficients[,2]) | |
} | |
# estimate probability densities from the trials: | |
std.err.dens.1 = density(std.err.1, bw="sj") | |
std.err.dens.2 = density(std.err.2, bw="sj") | |
#plots and tables: | |
png("ofat_doe_compare.png", width=6.4, height=0.5*6.4, units="in", res=360) | |
par(mar=c(4,4,0,0),oma=c(0,0,0,0)) | |
plot(c(std.err.dens.1$x,std.err.dens.2$x), c(std.err.dens.1$y,std.err.dens.2$y), type="n", xlab="average coefficient standard error", ylab="probability density") | |
lines(std.err.dens.1$x, std.err.dens.1$y, col="blue") | |
lines(std.err.dens.2$x, std.err.dens.2$y, col="red") | |
legend(x="topright",legend=c("D-optimal, 32 runs","OFAT, 36 runs"), col=c("blue","red"), lty=1) | |
dev.off() | |
print(xtable(dx.1$design,caption="D-Optimal Design",label="t:dx1",digits=0),type="latex",file="dx.1.tex",table.placement="htbp",caption.placement="top") | |
print(xtable(dx.2,caption="OFAT Design",label="t:dx2",digits=0),type="latex",file="dx.2.tex",table.placement="htbp",caption.placement="top") |
Hello VC, I am a bit confused by the topic of the post (and also I am not good with R). Basic question: in the table 1 shouldn't these be 0 and 1, not -1 and 1?
ReplyDeleteIt's common practice to center the factor levels so a two-level factor takes values -1 and 1.
DeleteWith the AlgDesign function gen.factorial used in the script above you can change this with the 'center' option (center=FALSE instead of center=TRUE).
My goal with the post certainly wasn't to confuse, so please ask more questions if you've got them. Anything in particular that is especially confusing?