## Stat.psu.edu

Randomized-block and randomized complete-block designs (Chapters 21 and 25)
Randomized block designs. In a randomized block (RB) design, experimental units are grouped into "blocks" that
are thought to be similar. The random assignment of units to treatments is done separately within each block. The
rationale for doing this is that, in the resulting dataset, the proportion of units receiving each treatment is identical
across blocks. If the blocking factor is related to the outcome, then blocking can substantially increase the precision
of treatment comparisons over a completely randomized (CR) design.

Example. A clinical trial will be conducted to compare the performance of a new experimental cholesterol-reducing
drug (Compound X) against the industry's current leader (Lipitor). Twenty subjects will be recruited to participate
in the trial at each of ten sites, for a total of N = 200 subjects.

In a CR design, the 200 subjects would be randomly divided into two groups of 100 subjects each. The first group
would receive Compound X, the second group would receive Lipitor. The fact that the subjects came from different
sites is not used in the design and does not need to be used in the analysis. The treatment effect can be tested by a
standard one-way ANOVA with two groups, i.e. a pooled two-sample t-test, with error df = 198.

In an RB design, the randomization would be performed separately within each site. In each site, we would randomly
assign 10 subjects to Compound X and 10 to Lipitor. The advantage of doing it this way is that the treatment
groups will be balanced in the sense that the proportions of subjects in Site 1, Site 2, . . . , Site 10 within the
Compound X group will be identical to the proportions of subjects in Site 1, Site 2, . . . , Site 10 within the Lipitor
group. If the randomization is done this way, then the blocking factor (site) should not be ignored in the analysis.
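
The within-site randomization described above can be sketched in R as follows. This is only an illustration, not the trial's actual procedure: the seed and the object names (`site`, `drug`, `alloc`) are assumptions introduced here.

```r
set.seed(512)                       # arbitrary seed, for reproducibility only
n_sites  <- 10                      # number of blocks (sites)
per_site <- 20                      # subjects recruited per site

site <- rep(seq_len(n_sites), each = per_site)

# Randomize separately within each site: shuffle a balanced vector of labels
drug <- unlist(lapply(seq_len(n_sites), function(j) {
  sample(rep(c("Compound X", "Lipitor"), each = per_site / 2))
}))

# Every site contributes exactly 10 subjects to each treatment group
alloc <- table(site, drug)
print(alloc)
```

Because the labels are permuted inside each site, the site composition of the two treatment groups is identical by construction, which is the balance property the RB design is meant to guarantee.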

We will have to fit a linear model that allows site effects to be present. It turns out that the test for a treatment
effect based on this model will be equivalent to (a) computing
$Y_{1j}$ = mean response for Compound X in site $j$, and
$Y_{2j}$ = mean response for Lipitor in site $j$,
for each site $j = 1, \ldots, 10$, and (b) comparing the $Y_{1j}$'s to the $Y_{2j}$'s by a paired t-test with error df = 9.
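
The equivalence between the two routes can be checked numerically. The sketch below uses simulated data (the effect sizes, the seed and all object names are assumptions made for illustration): it reduces the data to one mean per drug per site, fits an additive model to those 20 means, and compares the resulting F-statistic with the square of the paired t-statistic.

```r
set.seed(1)                                   # simulated data, not from a real trial
site <- factor(rep(1:10, each = 20))
drug <- factor(unlist(lapply(1:10, function(j)
  sample(rep(c("Lipitor", "X"), each = 10)))))
site_effect <- rnorm(10, sd = 5)              # hypothetical site-to-site variation
y <- 100 + site_effect[as.integer(site)] - 3 * (drug == "X") + rnorm(200, sd = 4)
dat <- data.frame(y, drug, site)

# (a) reduce to one mean per drug per site
cell <- aggregate(y ~ drug + site, data = dat, FUN = mean)

# additive model on the 20 cell means: residual df = (2-1)(10-1) = 9
fit    <- lm(y ~ drug + site, data = cell)
F_drug <- anova(fit)["drug", "F value"]

# (b) paired t-test on the site means
wide <- tapply(cell$y, list(cell$site, cell$drug), mean)
tt   <- t.test(wide[, "X"], wide[, "Lipitor"], paired = TRUE)

# the F-statistic equals the squared paired t-statistic, both on 9 error df
c(F = F_drug, t_squared = unname(tt$statistic)^2, df = unname(tt$parameter))
```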

Different kinds of blocking factors. In the hypothetical example, we are imagining the ten sites to be exchangeable,
something like a random sample from a hypothetical population of sites. The analysis of pairs $(Y_{1j}, Y_{2j})$ by a paired
t-test implicitly assumes sampling from an infinite population, and in practice researchers will often assume this
whether or not the sites were actually sampled from a master list. The individual sites are not really of interest;
we hope to generalize the treatment effect to broader contexts beyond the ten sites represented in this study. The
paired t-test is implicitly treating the blocking factor (site) as a random factor.

Sometimes a blocking factor will take only a small number of levels that are substantively different from one another
and therefore not exchangeable. For example, prior to the study, some measure of the severity of the patient's
condition (e.g., (LDL-C)/(HDL-C), the ratio of bad cholesterol to good cholesterol) could be used to classify
patients as medium-risk, high-risk and very high-risk, and the randomization could be carried out separately within
these three groups. If the study were done that way, then the blocking factor (Risk, with 3 levels) would be regarded
as a fixed factor. The data could then be analyzed as an F × F factorial design (Risk × Drug), which we already know
how to do. We would be interested not only in the main effect of Drug, but also in the Risk × Drug interaction,
which would allow us to see if the treatment effect varies across the risk groups.

For the remainder of this lecture and the next few lectures, we will be primarily thinking of situations where the
blocks are exchangeable and the blocking factor is a nuisance. This will correspond to the F × R situations that we
discussed last time.

Randomized complete-block designs. In the classic RCB design, we want to measure the effects of a treatment
factor, Factor A, with levels $i = 1, \ldots, a$. The experimental units are grouped together into blocks of $a$ units each,
and within each block, we assign one unit per treatment in a random fashion. The blocking factor, Factor B, has
levels $j = 1, \ldots, b$. An RCB design with $a = 3$ treatments and $b = 5$ blocks is shown below.
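
A layout of this kind can be generated with a few lines of R. The sketch below is illustrative only; the treatment labels `T1`, `T2`, `T3`, the seed and the object name `layout` are assumptions, not part of the original notes.

```r
set.seed(25)                        # arbitrary seed
a <- 3                              # number of treatments
b <- 5                              # number of blocks

# Each row is one block; each block gets its own random ordering of the a treatments
layout <- t(replicate(b, sample(paste0("T", 1:a))))
dimnames(layout) <- list(paste("Block", 1:b), paste("Unit", 1:a))
print(layout)
```

Every row contains each treatment exactly once, which is what "complete" refers to in the name of the design.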

The term complete means that the complete set of treatments appears in each block. When authors talk about RCB
designs, they are usually speaking of situations where each block has $a$ experimental units, and where each of the
treatments $i = 1, \ldots, a$ appears once per block. But there are also situations where each block has $an$ experimental
units, and each treatment is replicated $n$ times within each block. Our textbook calls that a generalized RCB design.

In most cases, there will be no replication (n = 1), and the data will have the same structure as in a two-way
factorial ANOVA with n = 1 observation per cell. Although n = 1 is most common, imagining situations with n > 1
helps to clarify some important theoretical issues in the analysis.

Points of confusion. Our textbook first takes up RCB designs in Chapter 21. In that chapter, the treatment factor
(Factor A) and the blocking factor (Factor B) are both considered to be fixed. In that case, the analysis is equivalent
to two-way factorial ANOVA with one observation per cell, which we covered in a previous lecture. In that setting,
we will usually apply an additive model, because if we estimated a full set of interaction terms, there would be no
df's left to estimate $\sigma^2$. Tukey's test for additivity is then recommended.
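
One common way to carry out Tukey's one-degree-of-freedom test for additivity is to add the squared fitted values from the additive model as a single extra regressor and test that term. The sketch below does this on simulated data; the data, the seed and the names `add_fit`, `nonadd` and `tukey_fit` are assumptions made for illustration.

```r
set.seed(21)                                   # simulated two-way layout, n = 1 per cell
a <- 4; b <- 6
dat <- expand.grid(A = factor(1:a), B = factor(1:b))
dat$y <- 10 + as.integer(dat$A) + 2 * as.integer(dat$B) + rnorm(a * b)

# Additive model: all (a-1)(b-1) remaining df go to the error line
add_fit <- lm(y ~ A + B, data = dat)

# Tukey's test: one extra df for the squared fitted values
dat$nonadd <- fitted(add_fit)^2
tukey_fit  <- lm(y ~ A + B + nonadd, data = dat)

# F-test on 1 and (a-1)(b-1) - 1 df for non-additivity
tukey <- anova(add_fit, tukey_fit)
print(tukey)
```

A small p-value on the `nonadd` line would suggest a multiplicative form of interaction, in which case the additive model is questionable.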

In Chapter 25, the textbook addresses the situation where Factors A and B are both considered to be random (R × R)
and where A is fixed and B is random (F × R). The R × R model is rarely applied to RCB experiments, because the
levels of Factor A are usually regarded as fixed. More often, we would want to apply the F × R model, which regards
A as fixed and B as random.

But, as we learned last time, there are two different versions of the F × R model: unrestricted and restricted. Our
textbook uses the restricted model and clearly says so. Other authors do not always clarify which one they are
using, and that may lead to conflicting statements and confusion. For example, some authors will say, "There is
no standard F-test for the block effect (Factor B) in the RCB design." Other authors will present an F-test for the
block effect, apparently contradicting their colleagues. When authors give conflicting advice, it is usually because
they are making different assumptions. So we will try to be clear about what happens under the restricted and
unrestricted versions.

Unrestricted version. The ANOVA table for the unrestricted version was shown last time. Its lines are: Line 1, Factor A (treatments); Line 2, Factor B (blocks); Line 3, the A × B interaction; and Line 4, error.

Now consider what happens when $n = 1$. In that case, $\epsilon_{ijk}$ becomes confounded with $(\alpha\beta)_{ij}$. The model is

$$Y_{ijk} = \mu + \alpha_i + \beta_j + \epsilon^*_{ijk},$$

where $\epsilon^*_{ijk} = (\alpha\beta)_{ij} + \epsilon_{ijk}$ is a combined error. Line 4 drops out because there are no degrees of freedom to estimate
$\sigma^2$. The MS for Line 3 is now an unbiased estimate of $\sigma^2 + \sigma^2_{\alpha\beta}$, the variance of $\epsilon^*_{ijk}$. Without replication, we can't
separate these variance components. But we do not need to assume that there are no interactions ($\sigma^2_{\alpha\beta} = 0$) to
get a valid test for the effect of treatments or the effect of blocks. Line 3 is an appropriate error term for Line 1
and Line 2, even if interactions are present.

Restricted version. In the restricted model, we constrain the random interactions to sum to zero over the levels of
the fixed factor within each block: $\sum_{i=1}^{a} (\alpha\beta)_{ij} = 0$ for every $j$.
Line 3 still provides an appropriate error term for Line 1, even if interactions are present. But now there is no
suitable error term for Line 2. In this situation, we cannot test the effect of Factor B (blocks) unless we assume
that the model is additive ($\sigma^2_{\alpha\beta} = 0$).

Therefore, if an author says, "There is no exact test for the block effect in an RCB design," you know that this
author is implicitly using the restricted model.
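
The difference between the two versions is easiest to see from the expected mean squares. The display below is a sketch of the standard textbook results for the two-way mixed model with A fixed and B random; the symbols $\sigma^2_\beta$ and $\sigma^2_{\alpha\beta}$ are assumed to denote the block and interaction variance components, and $\sigma^2_{\alpha\beta}$ is scaled differently under the two parameterizations.

```latex
% Unrestricted F x R model
\begin{aligned}
E(MSA)  &= \sigma^2 + n\,\sigma^2_{\alpha\beta} + \frac{nb}{a-1}\sum_{i=1}^{a}\alpha_i^2, \\
E(MSB)  &= \sigma^2 + n\,\sigma^2_{\alpha\beta} + na\,\sigma^2_{\beta}, \\
E(MSAB) &= \sigma^2 + n\,\sigma^2_{\alpha\beta}.
\end{aligned}

% Restricted F x R model
\begin{aligned}
E(MSA)  &= \sigma^2 + n\,\sigma^2_{\alpha\beta} + \frac{nb}{a-1}\sum_{i=1}^{a}\alpha_i^2, \\
E(MSB)  &= \sigma^2 + na\,\sigma^2_{\beta}, \\
E(MSAB) &= \sigma^2 + n\,\sigma^2_{\alpha\beta}.
\end{aligned}
```

In the unrestricted version, $E(MSB)$ and $E(MSAB)$ differ only by the block term, so $MSB/MSAB$ tests the block effect. In the restricted version, the natural denominator for $MSB$ is $MSE$, which does not exist when $n = 1$.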

In practice, this distinction between restricted and unrestricted models doesn't matter much, because when we use
the F × R model, we are usually thinking of the blocks as a nuisance and the test for a block effect is not of great
interest. If the software prints out a test statistic and p-value for the block effect, many authors will tell you to
ignore it.

How to do the analysis in R. We can do this analysis using the lm() function. We would fit the model y ~ A + B
and use the error line (which is formally equivalent to the AB interaction) as the error term for A. You can also use
it as the error term for B if you are thinking of the model as unrestricted.
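
A minimal sketch of this analysis on simulated data follows. The treatment and block effects, the seed and the object names (`dat`, `fit`, `tab`) are assumptions made for illustration.

```r
set.seed(105)                                  # simulated RCB data, n = 1 per cell
a <- 3; b <- 5
dat <- expand.grid(A = factor(1:a), B = factor(1:b))
block_eff <- rnorm(b, sd = 2)                  # hypothetical random block effects
trt_eff   <- c(0, 1, 3)                        # hypothetical fixed treatment effects
dat$y <- 20 + trt_eff[dat$A] + block_eff[dat$B] + rnorm(a * b)

fit <- lm(y ~ A + B, data = dat)
tab <- anova(fit)
print(tab)

# The "Residuals" line plays the role of the A x B interaction:
# it has (a-1)(b-1) df and is the denominator of the F-test for A
F_A <- tab["A", "Mean Sq"] / tab["Residuals", "Mean Sq"]
```

The F value printed on the `B` line uses the same denominator, so it is the block test that applies under the unrestricted model.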

What are we assuming about the interactions? Some authors may say, "The RCB analysis assumes that the
block and treatment effects are additive." Others might say, "The RCB analysis does not require additivity." Which
statement is correct? Either one could be correct. It depends on whether the blocks are regarded as a fixed
or random effect. If you treat them as random, you do not have to assume that $\sigma^2_{\alpha\beta} = 0$, even if there is no
replication ($n = 1$). If you treat blocks as fixed, and if $n = 1$, then you do have to assume that $\sigma^2_{\alpha\beta} = 0$. This is
not a contradiction. In the F × F model, you are testing a hypothesis of no treatment effect among these particular
blocks. In the F × R model, you are testing a hypothesis of no treatment effect in the entire population of blocks.

The hypotheses being tested are different, and the assumptions needed are different.

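Compound symmetry and sphericity. Suppose that we collect the observations for each block into a vector,

$$Y_j = (Y_{1j}, Y_{2j}, \ldots, Y_{aj})^T, \qquad j = 1, \ldots, b.$$

Under the F × R model with $n = 1$, the vectors $Y_1, Y_2, \ldots, Y_b$ can be regarded as an independent sample from a
multivariate normal distribution with mean $(\mu + \alpha_1, \ldots, \mu + \alpha_a)^T$ and covariance matrix

$$\Sigma = (\sigma^2 + \sigma^2_{\beta} + \sigma^2_{\alpha\beta})
\begin{pmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{pmatrix}, \tag{1}$$

where $\rho = \sigma^2_{\beta} / (\sigma^2 + \sigma^2_{\beta} + \sigma^2_{\alpha\beta})$. The standard F × R model implies that the observations within each block are
intercorrelated, and that the correlation between every pair $Y_{ij}$ and $Y_{i'j}$ ($i \neq i'$) is equal. Recall that (1) is called a
compound symmetry or exchangeable covariance structure. If the number of blocks is large relative to the number
of treatments, we can compute a sample covariance matrix from $Y_1, Y_2, \ldots, Y_b$ and formally test whether
this assumption holds.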

It turns out, however, that the F-test for no treatment effects based on the F-statistic $MSA/MSAB$ is valid under
conditions more general than (1). The F-test will be valid under any covariance structure having the property that
$\mathrm{Var}(Y_{ij} - Y_{i'j})$ is equal for every pair $i \neq i'$. This is called sphericity or the Huynh-Feldt conditions. When there are
only $a = 2$ treatments, sphericity automatically holds, but it may be violated if $a \geq 3$ and deserves to be tested.
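
As an informal check, the variances of all pairwise treatment differences can be computed from the $b \times a$ data matrix and compared; R's `mauchly.test()` gives a formal test. The sketch below uses data simulated under compound symmetry, so sphericity holds by construction. The data, the seed and the object names are assumptions made for illustration.

```r
set.seed(129)                                  # simulated data satisfying compound symmetry
a <- 3; b <- 30
block <- rnorm(b, sd = 2)                      # shared block effect induces equal correlations
Ymat  <- sapply(1:a, function(i) 10 + i + block + rnorm(b))   # b x a matrix

# Sphericity: Var(Y_ij - Y_i'j) should be the same for every pair i != i'
pairs    <- combn(a, 2)
diff_var <- apply(pairs, 2, function(p) var(Ymat[, p[1]] - Ymat[, p[2]]))
names(diff_var) <- apply(pairs, 2, paste, collapse = "-")
print(diff_var)

# Mauchly's test of sphericity for the within-block contrasts
mlmfit <- lm(Ymat ~ 1)
mt     <- mauchly.test(mlmfit, X = ~1)
print(mt)
```

With only a handful of blocks the sample variances of the differences are themselves very noisy, so large apparent discrepancies should be read with caution.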

Tests for sphericity are available and have been implemented in many software packages. These tests are not
very powerful unless $b \gg a$. When $b \gg a$, we can apply more sophisticated procedures that estimate the
covariance structure from the data, and we can even construct sandwich-type SEs for treatment effects that are
robust to misspecification of the covariance structure. This falls under the general body of methods known as
generalized estimating equations (GEE). If time permits, we will discuss some of these methods at the end of the
semester in the context of repeated measures.

The efficiency of blocking. Instead of formally testing for an effect of blocks in an RCB design, many authors
suggest that it is more appropriate and interesting to examine whether blocking was an effective strategy for reducing
variance. If we had done a completely randomized experiment (taking all $ab$ experimental units and randomly
assigning them to the $a$ treatment groups without consideration of the blocks), then we would have applied a
standard one-way ANOVA with $a$ groups and $b$ observations per group, and the error variance $\sigma^2$ would probably
have been larger. For reasons that I will not try to explain now, we can estimate the variance that we would have
had under the CR design by

$$\hat{\sigma}^2_{CR} = \frac{SSB + b(a-1)\,MSAB}{ab - 1}.$$

An estimate of the relative efficiency of (a) the RCB design that was actually done to (b) the hypothetical CR
design is

$$RE = \frac{\hat{\sigma}^2_{CR}}{MSAB}.$$

Many textbooks will suggest a modified version of RE that takes into account the differences in degrees of freedom
for error estimation. This df correction makes very little difference unless the degrees of freedom for SSAB is very
small. The modified version is

$$RE' = RE \times \frac{(df_{RCB} + 1)(df_{CR} + 3)}{(df_{RCB} + 3)(df_{CR} + 1)},$$

where $df_{RCB} = (a-1)(b-1)$ is the error degrees of freedom from the RCB design that was actually used, and
$df_{CR} = a(b-1)$ is the error degrees of freedom from the hypothetical CR design that was not used.

Values of RE greater than 1 indicate that the RCB design is more efficient. RE can also be interpreted as the ratio
of sample sizes (number of replicate measurements per treatment) needed to achieve equivalent precision in a CR
design. For example, suppose we have an RCB design with $b = 10$ blocks, which means that each treatment was
applied 10 times. If RE = 4.5, then a CR design would need about 45 units per treatment to achieve the same
precision.
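
The relative-efficiency calculation can be wrapped in a small helper. The function below is a sketch that follows the formulas used in the R session later in these notes; the function name `rcb_efficiency`, the input values and the error df taken for the CR design, $a(b-1)$, are assumptions made for illustration.

```r
# Relative efficiency of an RCB design versus a hypothetical CR design.
# SSB  = block sum of squares, MSAB = error (A x B) mean square,
# a    = number of treatments, b = number of blocks.
rcb_efficiency <- function(SSB, MSAB, a, b) {
  MS_CR  <- (SSB + b * (a - 1) * MSAB) / (a * b - 1)  # estimated CR error variance
  RE     <- MS_CR / MSAB                              # unadjusted relative efficiency
  df_RCB <- (a - 1) * (b - 1)                         # error df, RCB design
  df_CR  <- a * (b - 1)                               # error df, CR design (assumed form)
  RE_adj <- RE * (df_RCB + 1) * (df_CR + 3) / ((df_RCB + 3) * (df_CR + 1))
  c(MS_CR = MS_CR, RE = RE, RE_adj = RE_adj)
}

# Made-up inputs, chosen only to exercise the function
eff <- rcb_efficiency(SSB = 100, MSAB = 1, a = 4, b = 5)
print(round(eff, 3))

# Units per treatment a CR design would need for the same precision
units_needed <- 5 * eff[["RE"]]
```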

Contrasts in RCB analysis with a global error term. In a standard RCB design with $n = 1$, the omnibus test for
treatment effects compares $F = MSA/MSAB$
to an F-distribution with $(a-1), (a-1)(b-1)$ degrees of freedom. If this result is not significant, or if $F$
is less than 4, then we know that there are no significant differences among any of the treatments. But if $F$ is
significant, or if $F > 4$, then we should investigate the treatment effects. Plotting the means $\bar{Y}_{i\cdot}$ will help us to
understand what is happening.

It is often helpful to examine treatment contrasts of the form

$$L = \sum_{i=1}^{a} c_i \alpha_i,$$

where $\sum_{i=1}^{a} c_i = 0$. The sum of squares for the contrast, SSL, represents a portion of the treatment sum of squares
SSA attributable to departures from $H_0: L = 0$. The formula for SSL is identical to the formula for a main-effect
contrast in a two-way factorial design with $n = 1$:

$$SSL = \frac{b\,\hat{L}^2}{\sum_{i=1}^{a} c_i^2}, \qquad \hat{L} = \sum_{i=1}^{a} c_i \bar{Y}_{i\cdot}.$$

The standard test for $H_0: L = 0$ would compare

$$F = \frac{SSL}{MSAB}$$

to an F-distribution with 1 and $(a-1)(b-1)$ degrees of freedom.

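If we have a set of contrasts $L_1, L_2, \ldots, L_{a-1}$ that are mutually orthogonal, we can partition SSA into components,

$$SSA = SSL_1 + SSL_2 + \cdots + SSL_{a-1},$$

and express it in a single ANOVA table, like this:

| Source | SS | df | F |
|---|---|---|---|
| $L_1$ | $SSL_1$ | 1 | $SSL_1 / MSAB$ |
| $L_2$ | $SSL_2$ | 1 | $SSL_2 / MSAB$ |
| ... | ... | ... | ... |
| $L_{a-1}$ | $SSL_{a-1}$ | 1 | $SSL_{a-1} / MSAB$ |
| Blocks (B) | SSB | $b-1$ | |
| Error (A × B) | SSAB | $(a-1)(b-1)$ | |

Note that each contrast uses the same error term, MSAB, with $(a-1)(b-1)$ degrees of freedom. Using this single
"global" error term for all Treatment contrasts is appropriate if the sphericity (Huynh-Feldt) condition is satisfied.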
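
The partition of SSA into orthogonal contrasts, each tested against the global error term, can be sketched as follows. The data are simulated and the names (`C`, `Lhat`, `SSL`, `F_global`) are assumptions made for illustration; the weights are the usual orthogonal polynomial weights for four equally spaced levels.

```r
set.seed(173)                                  # simulated RCB data, a = 4, b = 5
a <- 4; b <- 5
dat <- expand.grid(A = factor(1:a), B = factor(1:b))
block_eff <- rnorm(b, sd = 2)
dat$y <- 50 - 1.5 * as.integer(dat$A) + block_eff[dat$B] + rnorm(a * b)

tab  <- anova(lm(y ~ A + B, data = dat))
SSA  <- tab["A", "Sum Sq"]
MSAB <- tab["Residuals", "Mean Sq"]

# Mutually orthogonal contrast weights, one column per contrast
C <- cbind(linear    = c(-3, -1,  1, 3),
           quadratic = c( 1, -1, -1, 1),
           cubic     = c(-1,  3, -3, 1))

Ybar <- as.vector(tapply(dat$y, dat$A, mean))  # treatment means
Lhat <- drop(crossprod(C, Ybar))               # estimated contrasts
SSL  <- b * Lhat^2 / colSums(C^2)              # contrast sums of squares

# Each contrast is tested against the same global error term MSAB
F_global <- SSL / MSAB
p_global <- pf(F_global, 1, (a - 1) * (b - 1), lower.tail = FALSE)
print(rbind(SSL, F_global, p_global))

# The a - 1 contrast SS's add up to SSA
c(sum_SSL = sum(SSL), SSA = SSA)
```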

Interestingly, there is another way to test contrasts that does not invoke this assumption. The alternative way uses
the data to estimate a "local" error term for each contrast.

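Contrasts in RCB analysis with local error terms. Here is another way to test $H_0: L = 0$ where $L$ is a treatment
contrast. First, we reduce the observations in each block to a single contrast score,

$$L_j = \sum_{i=1}^{a} c_i Y_{ij}, \qquad j = 1, \ldots, b.$$

The average of these contrast scores across blocks becomes the estimated contrast:

$$\hat{L} = \frac{1}{b} \sum_{j=1}^{b} L_j = \sum_{i=1}^{a} c_i \bar{Y}_{i\cdot}.$$

To get a standard error for the estimated contrast, we take the sample variance of the contrast scores,
$S_L^2 = \sum_{j=1}^{b} (L_j - \hat{L})^2 / (b-1)$, divide it by $b$, and take the square root. We then compare

$$T = \frac{\hat{L}}{\sqrt{S_L^2 / b}}$$

to a t-distribution with $b-1$ degrees of freedom, or compare

$$F = T^2 = \frac{b\,\hat{L}^2}{S_L^2}$$

to an F with 1, $b-1$ degrees of freedom.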

If the contrast is a difference between two treatments, $L = \alpha_i - \alpha_{i'}$, this becomes a paired t-test. Therefore, we
can regard this method as a generalization of the paired t-test.

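We can rewrite the F-statistic as

$$F = \frac{SSL}{SS_{L \times B} / (b-1)}, \qquad SS_{L \times B} = \frac{(b-1)\,S_L^2}{\sum_{i=1}^{a} c_i^2},$$

where SSL is the contrast sum of squares, $S_L^2$ is the sample variance of the block-level contrast scores, and $SS_{L \times B}$
can be viewed as a portion of SSAB. In fact, if $L_1, L_2, \ldots, L_{a-1}$ are mutually orthogonal treatment contrasts, then
we can partition SSAB into pieces corresponding to the local error terms,

$$SSAB = SS_{L_1 \times B} + SS_{L_2 \times B} + \cdots + SS_{L_{a-1} \times B}.$$

And we can express the ANOVA table like this:

| Source | SS | df | F |
|---|---|---|---|
| $L_1$ | $SSL_1$ | 1 | $SSL_1 / MS_{L_1 \times B}$ |
| $L_1 \times B$ | $SS_{L_1 \times B}$ | $b-1$ | |
| ... | ... | ... | ... |
| $L_{a-1}$ | $SSL_{a-1}$ | 1 | $SSL_{a-1} / MS_{L_{a-1} \times B}$ |
| $L_{a-1} \times B$ | $SS_{L_{a-1} \times B}$ | $b-1$ | |
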
The advantage of testing contrasts in this manner is that we do not need to rely on the sphericity assumption,
which is often violated in real data examples. The disadvantage is that, if the sphericity assumption does hold, these
tests are going to be less powerful than those using the global error term, because the error df has decreased from
$(a-1)(b-1)$ to $(b-1)$. Unless the number of blocks is very small, the loss of power will be slight.
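
The local-error approach can be sketched in R as follows. The data are simulated and all object names are assumptions made for illustration. The code computes block-level contrast scores, tests each contrast with its own error term, and checks that the local error sums of squares add up to SSAB.

```r
set.seed(196)                                  # simulated RCB data, a = 4, b = 5
a <- 4; b <- 5
block_eff <- rnorm(b, sd = 2)
# b x a matrix: rows are blocks, columns are treatments
Ymat <- sapply(1:a, function(i) 50 - 1.5 * i + block_eff + rnorm(b))

C <- cbind(linear    = c(-3, -1,  1, 3),
           quadratic = c( 1, -1, -1, 1),
           cubic     = c(-1,  3, -3, 1))

scores <- Ymat %*% C                  # contrast score for each block (rows) and contrast (cols)
Lhat   <- colMeans(scores)            # estimated contrasts
S2     <- apply(scores, 2, var)       # sample variances of the contrast scores

F_local <- b * Lhat^2 / S2            # F on 1 and b - 1 df, one local error term per contrast
p_local <- pf(F_local, 1, b - 1, lower.tail = FALSE)
print(rbind(F_local, p_local))

# Local error SS's; for orthogonal contrasts they partition SSAB
SSLxB <- (b - 1) * S2 / colSums(C^2)
long  <- data.frame(y = as.vector(Ymat),
                    A = factor(rep(1:a, each = b)),
                    B = factor(rep(1:b, times = a)))
SSAB  <- anova(lm(y ~ A + B, data = long))["Residuals", "Sum Sq"]
c(sum_local = sum(SSLxB), SSAB = SSAB)

# Same test as a one-sample t-test on the contrast scores
t_lin <- t.test(scores[, "linear"])
```

Each local test is simply a one-sample t-test on the $b$ contrast scores, which is why it does not depend on the covariance structure across treatments.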

Example. Kuehl (2000, Exercise 8.3) describes an experiment to measure the self-inductance of coils under differing
temperatures of the measuring bridge. Five coils were selected, and the self-inductance of each coil was measured
at each of four temperatures. Read in the data and fit the additive model.

> Coil <- read.table("Coil.txt", header=TRUE)

With just five observations per condition, we might think that these differences between the treatments would not
be significant, but the effect of temperature turns out to be highly significant.
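The effect of "blocks" (coils) is also highly significant, if you choose to interpret this as a valid test. (It is valid
under an unrestricted model.) Let's look at the relative efficiency:

> a <- 4; b <- 5   # four temperatures, five coils
> # tmp is the ANOVA table of the additive fit; row 2 is assumed to be coils, row 3 the error line
> SS.B <- tmp$"Sum Sq"[2]
> MS.AB <- tmp$"Mean Sq"[3]
> MS.CR <- (SS.B + b*(a-1)*MS.AB)/(a*b-1)
> RE <- MS.CR/MS.AB
> df.RCB <- (a-1)*(b-1); df.CR <- a*(b-1)
> RE <- RE * (df.RCB+1)*(df.CR+3)/((df.RCB+3)*(df.CR+1))
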
The gain in efficiency is enormous. To perform a CR experiment with the same precision, we estimate that we
would need to assign about 264 × 5 ≈ 1,300 coils to each of the four temperature conditions, taking approximately
1,300 × 4 = 5,200 measurements in all.

Now let's examine the treatment means.

> Ybari. <- tapply(induc, temp, mean)
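They show a tendency to decrease as temperature rises. We can partition the effect of temperature into linear,
quadratic and cubic effects using orthogonal polynomial contrast weights c1, c2 and c3. For four equally spaced
levels, the standard weights are (-3, -1, 1, 3), (1, -1, -1, 1) and (-1, 3, -3, 1).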
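Now compute the F-statistics using a global error term:

> # look at the SS's and make sure that they add up (the sum should equal SSA)
> SSL <- sapply(list(c1, c2, c3), function(w) b*sum(w*Ybari.)^2/sum(w^2)); sum(SSL)
> # F statistics with the global error term
> SSL/MS.AB
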
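The 95th percentile of $F_{1,12}$ is 4.75, so the linear trend is highly significant (F >> 4) but the quadratic and cubic
effects are not. Note that these tests assume sphericity. We can avoid that assumption by computing local error
terms.

> # b x a matrix: one row per coil, one column per temperature
> # (assumes the observations are in the same coil order within every temperature)
> Ymat <- do.call(cbind, split(induc, temp))
> # block-level contrast scores
> L1j <- drop(Ymat %*% c1); L2j <- drop(Ymat %*% c2); L3j <- drop(Ymat %*% c3)
> SSL1xB <- (b-1)*var(L1j)/sum(c1^2)
> SSL2xB <- (b-1)*var(L2j)/sum(c2^2)
> SSL3xB <- (b-1)*var(L3j)/sum(c3^2)
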
The wide variation in these local error SS's (the first one is more than 40 times as large as the second one) suggests
that sphericity has been violated. Now let's repeat the contrast tests using the local error terms.

These test statistics are very noisy, because they are based on 5 - 1 = 4 error degrees of freedom. The 95th percentile
of $F_{1,4}$ is 7.71, so the linear effect is still significant, but the quadratic and cubic effects are not.

Finally, let's look at some residual plots. Under this linear model, the residuals ought to be approximately normally
distributed.

> # quick way to get standardized residuals

[Figure: Normal Q-Q plot of the standardized residuals]
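
One way to obtain standardized residuals and the coordinates of a normal Q-Q plot is sketched below. This is not necessarily the shortcut used in the original session: the data are simulated and the object names (`dat`, `fit`, `r`, `qq`) are assumptions made for illustration.

```r
set.seed(237)                                  # simulated RCB data, a = 4, b = 5
a <- 4; b <- 5
dat <- expand.grid(A = factor(1:a), B = factor(1:b))
block_eff <- rnorm(b, sd = 2)
dat$y <- 50 - 1.5 * as.integer(dat$A) + block_eff[dat$B] + rnorm(a * b)

fit <- lm(y ~ A + B, data = dat)

# Standardized (internally studentized) residuals from the additive fit
r <- rstandard(fit)

# Coordinates of the normal Q-Q plot; use plot.it = TRUE (the default) to draw it
qq <- qqnorm(r, plot.it = FALSE)
head(cbind(theoretical = qq$x, sample = qq$y))
```

Interactively, `qqnorm(r); qqline(r)` draws the plot with a reference line; points that fall close to the line are consistent with approximately normal residuals.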
Source: http://stat.psu.edu/~jls/stat512/lectures/lec31.pdf
