Suppose you are trying to estimate the effect that 6 factors have on a response, and you know that none of the factors influence the effect of the others, so that a simple model like this

$$Y = b_1 X_1 + b_2 X_2 + b_3 X_3 + b_4 X_4 + b_5 X_5 + b_6 X_6 \tag{1}$$

is the perfect choice. How should you get the data you need to estimate the ${b}_{i}$’s? You may be tempted to design a test to estimate each of these factors by changing one factor at a time (OFAT). There are no interaction terms (e.g. ${b}_{7}{X}_{1}{X}_{4}$) in equation 1. So there’s no need to perform any runs that change several of the $X$’s at once, right? Wrong.

Table 2 shows a 36-run OFAT design with three replicates of each treatment. Table 1 shows a 32-run D-optimal design with no repeated runs. You might expect the replication in Table 2 to give you a better estimate of error, but you’d be wrong. In fact, as Figure 1 shows, the average standard error of the coefficient estimates for the model in equation 1 is significantly lower for the D-optimal design most of the time, even though it has fewer runs than the OFAT design.

Why does this happen? Each run in the D-optimal design sets every factor to a nonzero level, so it contributes to the estimate of every term in the model. Each run in the OFAT design moves only one factor away from zero, so it can contribute to the estimate of only a single term. The “error bars” for OFAT designs will almost always be significantly larger than those for D-optimal designs (other optimality criteria give largely the same improvement over OFAT in practice).
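To make that argument concrete, here is a minimal sketch in Python (not the R script used for this post). For a least-squares fit of equation 1, the coefficient standard errors are $\sigma\sqrt{\operatorname{diag}\left((X^{T}X)^{-1}\right)}$. The sketch assumes $\sigma = 1$, and it uses a regular $2^{6-1}$ half fraction as an orthogonal 32-run stand-in for the D-optimal design in Table 1 rather than the exact rows listed there. The function names are hypothetical, chosen only for illustration.

```python
from itertools import product
from math import prod, sqrt


def ofat_design(n_factors=6, repeats=3):
    """OFAT layout as in Table 2: each run moves one factor to +1 or -1."""
    runs = []
    for _ in range(repeats):
        for level in (1, -1):
            for i in range(n_factors):
                runs.append([level if j == i else 0 for j in range(n_factors)])
    return runs


def half_fraction(n_factors=6):
    """Stand-in 2^(k-1) design (an assumption, not the post's D-optimal
    design): the last factor is the product of all the others."""
    return [list(base) + [prod(base)]
            for base in product((-1, 1), repeat=n_factors - 1)]


def coef_std_errors(design, sigma=1.0):
    """Return sigma * sqrt(diag((X'X)^-1)) using Gauss-Jordan elimination."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)]
           for i in range(k)]
    # Augment X'X with the identity, then reduce the left half to identity.
    aug = [row + [float(i == j) for j in range(k)]
           for i, row in enumerate(xtx)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [sigma * sqrt(aug[i][k + i]) for i in range(k)]


if __name__ == "__main__":
    print(round(coef_std_errors(ofat_design())[0], 3))    # 0.408
    print(round(coef_std_errors(half_fraction())[0], 3))  # 0.177
```

In the OFAT layout only 6 of the 36 runs carry information about any given coefficient, so each standard error is $\sigma/\sqrt{6}$. In the orthogonal 32-run layout all 32 runs do, giving $\sigma/\sqrt{32}$. These are the theoretical values with $\sigma$ known; the comparison in Figure 1 is based on estimated standard errors, which is a different calculation.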

Table 1: 32-run D-optimal design.

| Row | X1 | X2 | X3 | X4 | X5 | X6 |
|---|---|---|---|---|---|---|
| 1 | -1 | -1 | -1 | -1 | -1 | -1 |
| 2 | 1 | -1 | -1 | -1 | -1 | -1 |
| 3 | -1 | 1 | -1 | -1 | -1 | -1 |
| 5 | -1 | -1 | 1 | -1 | -1 | -1 |
| 8 | 1 | 1 | 1 | -1 | -1 | -1 |
| 9 | -1 | -1 | -1 | 1 | -1 | -1 |
| 12 | 1 | 1 | -1 | 1 | -1 | -1 |
| 14 | 1 | -1 | 1 | 1 | -1 | -1 |
| 15 | -1 | 1 | 1 | 1 | -1 | -1 |
| 17 | -1 | -1 | -1 | -1 | 1 | -1 |
| 20 | 1 | 1 | -1 | -1 | 1 | -1 |
| 22 | 1 | -1 | 1 | -1 | 1 | -1 |
| 23 | -1 | 1 | 1 | -1 | 1 | -1 |
| 26 | 1 | -1 | -1 | 1 | 1 | -1 |
| 27 | -1 | 1 | -1 | 1 | 1 | -1 |
| 29 | -1 | -1 | 1 | 1 | 1 | -1 |
| 32 | 1 | 1 | 1 | 1 | 1 | -1 |
| 34 | 1 | -1 | -1 | -1 | -1 | 1 |
| 35 | -1 | 1 | -1 | -1 | -1 | 1 |
| 37 | -1 | -1 | 1 | -1 | -1 | 1 |
| 40 | 1 | 1 | 1 | -1 | -1 | 1 |
| 41 | -1 | -1 | -1 | 1 | -1 | 1 |
| 44 | 1 | 1 | -1 | 1 | -1 | 1 |
| 48 | 1 | 1 | 1 | 1 | -1 | 1 |
| 49 | -1 | -1 | -1 | -1 | 1 | 1 |
| 54 | 1 | -1 | 1 | -1 | 1 | 1 |
| 55 | -1 | 1 | 1 | -1 | 1 | 1 |
| 56 | 1 | 1 | 1 | -1 | 1 | 1 |
| 59 | -1 | 1 | -1 | 1 | 1 | 1 |
| 60 | 1 | 1 | -1 | 1 | 1 | 1 |
| 62 | 1 | -1 | 1 | 1 | 1 | 1 |
| 63 | -1 | 1 | 1 | 1 | 1 | 1 |

Table 2: 36-run OFAT design.

| Run | X1 | X2 | X3 | X4 | X5 | X6 |
|---|---|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 1 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 1 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 1 | 0 | 0 |
| 5 | 0 | 0 | 0 | 0 | 1 | 0 |
| 6 | 0 | 0 | 0 | 0 | 0 | 1 |
| 7 | -1 | 0 | 0 | 0 | 0 | 0 |
| 8 | 0 | -1 | 0 | 0 | 0 | 0 |
| 9 | 0 | 0 | -1 | 0 | 0 | 0 |
| 10 | 0 | 0 | 0 | -1 | 0 | 0 |
| 11 | 0 | 0 | 0 | 0 | -1 | 0 |
| 12 | 0 | 0 | 0 | 0 | 0 | -1 |
| 13 | 1 | 0 | 0 | 0 | 0 | 0 |
| 14 | 0 | 1 | 0 | 0 | 0 | 0 |
| 15 | 0 | 0 | 1 | 0 | 0 | 0 |
| 16 | 0 | 0 | 0 | 1 | 0 | 0 |
| 17 | 0 | 0 | 0 | 0 | 1 | 0 |
| 18 | 0 | 0 | 0 | 0 | 0 | 1 |
| 19 | -1 | 0 | 0 | 0 | 0 | 0 |
| 20 | 0 | -1 | 0 | 0 | 0 | 0 |
| 21 | 0 | 0 | -1 | 0 | 0 | 0 |
| 22 | 0 | 0 | 0 | -1 | 0 | 0 |
| 23 | 0 | 0 | 0 | 0 | -1 | 0 |
| 24 | 0 | 0 | 0 | 0 | 0 | -1 |
| 25 | 1 | 0 | 0 | 0 | 0 | 0 |
| 26 | 0 | 1 | 0 | 0 | 0 | 0 |
| 27 | 0 | 0 | 1 | 0 | 0 | 0 |
| 28 | 0 | 0 | 0 | 1 | 0 | 0 |
| 29 | 0 | 0 | 0 | 0 | 1 | 0 |
| 30 | 0 | 0 | 0 | 0 | 0 | 1 |
| 31 | -1 | 0 | 0 | 0 | 0 | 0 |
| 32 | 0 | -1 | 0 | 0 | 0 | 0 |
| 33 | 0 | 0 | -1 | 0 | 0 | 0 |
| 34 | 0 | 0 | 0 | -1 | 0 | 0 |
| 35 | 0 | 0 | 0 | 0 | -1 | 0 |
| 36 | 0 | 0 | 0 | 0 | 0 | -1 |

Hello VC, I am a bit confused by the topic of the post (and also I am not good with R). Basic question: in the table 1 shouldn't these be 0 and 1, not -1 and 1?

It's common practice to center the factor levels so a two-level factor takes values -1 and 1.

With the AlgDesign function gen.factorial used in the script above you can change this with the 'center' option (center=FALSE instead of center=TRUE).

My goal with the post certainly wasn't to confuse, so please ask more questions if you've got them. Anything in particular that is especially confusing?