The Performance of Step-Wise Group Screening Designs

In this paper we evaluate the performance of step-wise group screening designs in which group-factors contain an equal number of factors in the initial step. A usual assumption in group screening designs is that the directions of possible effects are known a priori. In practice, however, this assumption is often unreasonable. We examine step-wise group screening designs without errors in observations when this assumption is relaxed, allowing for cancellation of effects within group-factors. The performance of step-wise group screening designs is then compared with the performance of multistage group screening designs.


Introduction
In scientific experimentation, situations arise in which a large number of potentially important factors must be examined. In such situations there is often a need, due to limited resources, for an efficient method of factor screening. One such method, introduced by Watson (1961), is two-stage group screening. This method has been generalized to more than two stages by Li (1962) and Patel (1962).
The notion of step-wise group screening was introduced by Patel and Manene (1987), who based their method on the group testing procedure introduced by Sterret (1957). Odhiambo and Manene (1987) considered step-wise group screening designs with errors in observations. Manene (1997) extended step-wise group screening designs to what are called "two-type step-wise designs". Manene et al. (2002) further extended the two-type step-wise group screening designs to multi-type step-wise group screening designs.

A basic assumption in group screening is that the directions of all suspected effects are known or can be corrected a priori. Under this assumption, factor levels can be assigned in such a way that there is no cancellation of effects within a group-factor. In practice, however, this assumption is inadequate. Mauro and Smith (1982) considered the performance of two-stage group screening when the assumption of known effect directions is false. Odhiambo (1986) generalized this approach and assessed the performance of multistage group screening when there is a non-zero probability of cancellation of effects within a group-factor.
We shall use the approach first suggested by Mauro and Smith (1982), and later used by Odhiambo (1986), to assess the performance of step-wise group screening designs. The performance of step-wise group screening designs with cancellation of effects shall then be compared with the performance of multistage group screening designs.

Assumptions
Suppose that f factors are to be screened for their effect on the response. For detecting the factors having major effects, it is usually adequate to assume a first-order linear model

$$y_u = \beta_0 + \sum_{i=1}^{f} \beta_i x_{iu} + \varepsilon_u, \qquad (2.1)$$

where $y_u$ is the $u$th response, $\beta_0$ is a constant common to every response, $\beta_i$ is the linear effect of the $i$th factor, $x_{iu}$ is the level of the $i$th factor in the $u$th run, and $\varepsilon_u$ is the $u$th error term.
In addition to the model in equation (2.1), we shall assume that: (i) All factors have, independently, the same probability $p_1$ of having a positive effect and the same probability $p_2$ of having a negative effect. Thus, the probability of a factor being active (defective) is $p = p_1 + p_2$. The screening procedure is performed without experimental errors. This means that we take $\varepsilon_u = 0$ in equation (2.1). These are basically the same assumptions made by Mauro and Smith (1982) and Odhiambo (1986).
Suppose that it is desired to classify f factors as active or inactive using the step-wise group screening procedure. The initial step of this procedure consists of dividing the f factors into g first-order group-factors, each of size k (f = kg). The first-order group-factors are then tested for their effect and those found to be effective are set aside. In step two we start with any effective first-order group-factor and test the factors within it one by one until we find an active factor. We set aside the factors that are found to be inactive, keeping the active factor separate. In step three, the remaining factors are regrouped into a single group-factor, which is then tested for its effect. The test procedure carried out in step one and in step two is repeated successively in the subsequent steps until the analysis terminates with a test on a non-effective group-factor or with a group-factor of size one. This test procedure is performed on all group-factors found to be effective in the initial step.
Since there are no experimental errors, it is possible to use designs with the smallest number of runs in the initial step: the number of runs required to test g group-factors is g + 1, where the one extra run is the control run. This control run may be used at every step of the step-wise design.
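The procedure described above can be sketched in code. The following Python sketch is illustrative only. It assumes that active factors have effects of equal magnitude, coded +1 and -1, so that an error-free group test is effective exactly when the summed effects are non-zero, and that factors are tested in list order. The function names are hypothetical and are not taken from the paper.

```python
def group_is_effective(effects):
    """Error-free group test: effective unless the summed effects vanish."""
    return sum(effects) != 0


def analyse_effective_group(effects):
    """Steps two onward for one group-factor found effective in the initial step.

    `effects` holds +1, -1 or 0 for each factor, in the order of testing.
    Returns (runs, detected); `detected` lists positions declared active.
    """
    runs, detected, pos, k = 0, [], 0, len(effects)
    while pos < k:
        # Test factors one by one until an active factor is found.
        while pos < k:
            runs += 1
            pos += 1
            if effects[pos - 1] != 0:
                detected.append(pos - 1)
                break
        rest = effects[pos:]
        if not rest:
            break  # nothing left to regroup
        runs += 1  # one test on the regrouped remaining factors
        if len(rest) == 1:
            # Group-factor of size one: this test classifies the factor itself.
            if rest[0] != 0:
                detected.append(pos)
            break
        if not group_is_effective(rest):
            break  # remainder declared inactive (possibly through cancellation)
    return runs, detected


def stepwise_screen(effects, k):
    """Screen f = k * g factors; returns (total runs, detected factor indices)."""
    groups = [effects[i:i + k] for i in range(0, len(effects), k)]
    runs = len(groups) + 1  # g initial group tests plus the control run
    detected = []
    for g_index, group in enumerate(groups):
        if group_is_effective(group):
            extra_runs, found = analyse_effective_group(group)
            runs += extra_runs
            detected += [g_index * k + j for j in found]
    return runs, detected
```

For instance, `analyse_effective_group([0, 1, 0, -1, 1])` returns `(3, [1])`: two individual tests locate the first active factor, and the regrouped remainder then cancels, so two active factors go undetected.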

The Expected Number of Runs
Suppose that there are f factors divided into g group-factors in the initial step such that each group-factor contains exactly k factors. Let X denote the number of factors with positive effects and Y the number of factors with negative effects contained in a group-factor of size k in the initial step. Let $p_1$ be the probability that a factor chosen at random has a positive effect and $p_2$ the probability that a factor chosen at random has a negative effect. Then

$$P(X = x, Y = y) = \frac{k!}{x!\,y!\,(k-x-y)!}\; p_1^{x}\, p_2^{y}\, (1 - p_1 - p_2)^{k-x-y}, \qquad x, y \geq 0,\; x + y \leq k,$$

and $P(X = x, Y = y) = 0$ otherwise, which is the trinomial distribution with parameters $k$, $p_1$, and $p_2$. This distribution will subsequently be denoted by $B(x; y; p_1, p_2, k)$.

At any step of the step-wise design, a group-factor is active if it contains at least one active factor. However, an active group-factor will have a significant effect (i.e. will be effective) if and only if the factor effects do not cancel completely within the group-factor. Let us define $p^* = P(\text{a group-factor at the initial step is active})$ and $\theta = P(\text{a group-factor at the initial step is effective})$. Then

$$p^* = 1 - (1 - p)^k \quad \text{and} \quad \theta = p^* - \theta^*, \qquad \theta^* = \sum_{x=1}^{\lfloor k/2 \rfloor} B(x; x; p_1, p_2, k),$$

where $\theta^*$ is the probability that the effects within an active group-factor cancel completely.

Let [...] denote the expected number of runs required to classify as effective or non-effective the factors within a group-factor that was found to be effective at the initial step if it contains x factors with positive effects and y factors with negative effects. If $x \neq 0$ and $y = 0$, then [...], which is the same as the expression given by Patel and Manene (1987). Thus [...]. If $x = y$ (i.e. the number of active factors with positive effects equals the number of active factors with negative effects), then the effects cancel completely and such a group-factor is dropped from further analysis in the initial step. Thus [...]
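The initial-step probabilities discussed above can be evaluated numerically. The sketch below is a minimal illustration, assuming that complete cancellation occurs exactly when x = y, as stated in the text; the function names are hypothetical.

```python
from math import comb


def trinomial_pmf(x, y, p1, p2, k):
    """B(x; y; p1, p2, k): probability of x positive and y negative factors among k."""
    if x < 0 or y < 0 or x + y > k:
        return 0.0
    return (comb(k, x) * comb(k - x, y)
            * p1 ** x * p2 ** y * (1 - p1 - p2) ** (k - x - y))


def initial_step_probabilities(p1, p2, k):
    """Return (p_star, theta_star, theta) for a group-factor of size k."""
    # Probability that the group-factor contains at least one active factor.
    p_star = 1 - (1 - p1 - p2) ** k
    # Probability of complete cancellation: x = y >= 1.
    theta_star = sum(trinomial_pmf(x, x, p1, p2, k) for x in range(1, k // 2 + 1))
    # Probability that the group-factor is effective: active and not cancelled.
    theta = p_star - theta_star
    return p_star, theta_star, theta
```

When `p1` or `p2` is zero, every term of `theta_star` vanishes, so `theta` coincides with `p_star` and the known-direction case is recovered.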

Proof
There are two cases to consider, namely (i) the first active factor detected has a positive effect; (ii) the first active factor detected has a negative effect. Considering case (i), the probability that the first factor tested has a positive effect is $1/k$ and the probability that the $(l + 1)$-st factor tested is the first active factor and has a positive effect is [...]. On average we shall require [...] tests to complete the test procedure if the first factor tested is active and has a positive effect, and [...] tests to complete the test procedure if the $(l + 1)$-st factor tested is the first active factor and has a positive effect.
Considering case (ii), the probability that the first factor tested has a negative effect is $2/k$ and the probability that the $(l + 1)$-st factor tested is the first active factor and has a negative effect is [...]. On average we shall require 1 + 1 + [...] tests to complete the test procedure if the first factor tested is active and has a negative effect, and [...] tests to complete the test procedure if the $(l + 1)$-st factor tested is the first active factor and has a negative effect. It follows that [...] (3.12). Rewriting equation (3.12), using the results in equations (3.9) and (3.10), and expanding the resulting expression, we obtain (3.11).
This completes the proof of the lemma.
Proof. We again consider the two cases highlighted in the proof of lemma 3.1 above. Considering case (i), the probability that the first factor tested has a positive effect is $1/k$ and the probability that the $(l + 1)$-st factor tested is the first active factor and has a positive effect is [...]. On average we shall need 1 + 1 + [...] tests to complete the test procedure if the first factor tested is active and has a positive effect, and [...] tests to complete the test procedure if the $(l + 1)$-st factor tested is the first active factor and has a positive effect.
For case (ii), the probability that the first factor tested has a negative effect is $3/k$ and the probability that the $(l + 1)$-st factor tested is the first active factor and has a negative effect is [...]. On average we shall need 1 + 1 + [...] tests to complete the test procedure if the first factor tested is active and has a negative effect, and [...] tests to complete the test procedure if the $(l + 1)$-st factor tested is the first active factor and has a negative effect. Thus [...] (3.14). Using the results in equation (3.9) and lemma 3.1 in equation (3.14), expanding the resulting expression and simplifying, we obtain (3.13).
This completes the proof of lemma 3.2.
Using the same argument as that used in the proofs of lemma 3.1 and lemma 3.2, we obtain [...]. Let [...] denote the number of tests required to analyse a group-factor of size k that is known to be effective. Then [...]. Denote by $R_s$ the number of tests required to analyse all the factors in the N group-factors found to be effective in the initial step. Then [...]. Let R be the total number of runs required to analyse the f factors under investigation; then $R = g + 1 + R_s$, where $g + 1$ is the number of runs in the initial step.
Theorem 3.1 Let R denote the total number of runs required to classify as effective or non-effective all the f factors under investigation in a step-wise group screening experiment. Then [...], where $p_1$ is the a priori probability of a factor being effective with a positive effect, $p_2$ is the a priori probability of a factor being effective with a negative effect ($p = p_1 + p_2$), and k is the size of the group-factor at the initial step.
Proof. From (3.22) we have [...]. Since p, the a priori probability of a factor being effective, is usually small, the probability of having more than four effective factors in an effective group-factor is negligible. We can thus assume that an effective group-factor will not have more than four effective factors. Using this assumption we have [...]. The proof of the theorem follows from using the results in equation (3.24) in equation (3.23) and simplifying the resulting expression.

Expected Number of Active Factors Detected
Let $A_I$ denote the number of active factors within active group-factors declared non-effective in the initial step. Then [...]. Let [...] denote the expected number of active factors declared inactive from among the k factors within a group-factor which was classified as effective in the initial step if it contains x factors with positive effects and y factors with negative effects. Obviously [...]. We consider the order in which the active factors occur for given x and y and the corresponding number of undetected active factors in each case. As an illustration, consider the case when x = 1 and y = 2. Then the possible orders in which active factors appear are as indicated in Table 1, using '-' for a negative effect and '+' for a positive effect.
Table 1. Determining the expected number of undetected active factors for x = 1 and y = 2.
Let [...] denote the number of undetected active factors in a group-factor of size k that was found to be effective in the initial step. Then [...]
Theorem 4.1 Let A denote the total number of undetected active factors in a step-wise group screening design in which an active factor has either a positive effect or a negative effect with respective probabilities $p_1$ and $p_2$ ($p_1 + p_2 = p$). Then [...], where f is the number of factors under investigation, k is the size of each group-factor at the initial step and $\theta^*$ is the probability of cancellation within a group-factor of size k.
Proof. Equation (4. [...] If we assume that at the initial step a group-factor contains no more than four active factors, then using equations (4.4) and (4.5) and simplifying we obtain equation (4.11). This completes the proof of theorem 4.1.
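The enumeration behind the x = 1, y = 2 illustration can be reproduced with a short script. This is a sketch under the assumptions that all distinct sign orders are equally likely and that positive and negative effects have equal magnitude, so that the remaining factors are declared non-effective when their signs sum to zero. Inactive factors do not affect which active factors are found, so only the order of the signs matters. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def undetected_for_order(signs):
    """Undetected active factors for one order of active-factor signs (+1/-1)."""
    pos = 0
    while pos < len(signs):
        pos += 1  # the next active factor is found by one-at-a-time testing
        if sum(signs[pos:]) == 0:
            break  # remainder is empty or cancels completely: analysis stops
    return len(signs) - pos


def expected_undetected(x, y):
    """Mean number of undetected active factors over all distinct sign orders."""
    if x == y:
        raise ValueError("x = y cancels in the initial step; group is not analysed")
    orders = set(permutations([1] * x + [-1] * y))
    total = sum(undetected_for_order(order) for order in orders)
    return Fraction(total, len(orders))
```

For x = 1 and y = 2 the three orders are `+ - -`, `- + -` and `- - +`, leaving 0, 2 and 2 active factors undetected respectively, so `expected_undetected(1, 2)` evaluates to `Fraction(4, 3)` under these assumptions.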
Let A be the total number of active factors that are detected in a step-wise group screening design when cancellation is allowed. Then [...], where fp is the expected total number of active factors among the f factors under investigation. When either $p_1 = 0$ or $p_2 = 0$, $\theta^*$ will also be zero and [...]

Comparison of Relative Performance of Step-Wise Group Screening and s-Stage Group Screening (s = 2, 3)
To measure the efficiency of a step-wise group screening design, we need the efficiency of detecting active factors and the relative testing cost. Let us define $\phi_A$, the expected number of active factors detected expressed as a percentage of the expected number of active factors $fp$, as a measure of the efficiency of a step-wise group screening strategy for detecting the active factors. As another measure of the efficiency of a step-wise group screening procedure we define the relative testing cost, $E_R$, as the ratio, expressed as a percentage, of the expected number of runs required by the step-wise group screening design to the number of runs required to test the factors individually.
A larger value of $\phi_A$ or a smaller value of $E_R$ indicates better performance on average, but both measures should be considered in assessing the performance of a group screening strategy to ensure that they do not conflict. Only if one group screening design has both a larger $\phi_A$ and a smaller $E_R$ than another screening strategy can the first be said to be definitely better than the second.
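Both performance measures can also be estimated by simulation, which gives a rough check on tabulated values. The sketch below is not the paper's analytical method. It assumes equal-magnitude effects coded +1 and -1, f divisible by k, a detection measure computed as detected active factors over total active factors, and f runs as the cost of testing the factors individually (a control run would make this f + 1). All names are hypothetical.

```python
import random


def analyse_effective_group(effects):
    """Runs used and active factors detected after the initial step."""
    runs = detected = pos = 0
    k = len(effects)
    while pos < k:
        while pos < k:  # one-by-one tests until an active factor is found
            runs += 1
            pos += 1
            if effects[pos - 1] != 0:
                detected += 1
                break
        rest = effects[pos:]
        if not rest:
            break
        runs += 1  # test on the regrouped remainder
        if len(rest) == 1:
            detected += int(rest[0] != 0)
            break
        if sum(rest) == 0:
            break  # remainder declared non-effective
    return runs, detected


def estimate_performance(f, k, p1, p2, trials, seed=0):
    """Monte Carlo estimates of (detection efficiency %, relative testing cost %)."""
    rng = random.Random(seed)
    total_runs = total_detected = total_active = 0
    for _ in range(trials):
        effects = rng.choices([1, -1, 0], weights=[p1, p2, 1 - p1 - p2], k=f)
        total_active += sum(e != 0 for e in effects)
        total_runs += f // k + 1  # g group tests plus the control run
        for start in range(0, f, k):
            group = effects[start:start + k]
            if sum(group) != 0:  # effective in the initial step
                runs, detected = analyse_effective_group(group)
                total_runs += runs
                total_detected += detected
    efficiency = 100 * total_detected / total_active if total_active else float("nan")
    relative_cost = 100 * total_runs / (trials * f)
    return efficiency, relative_cost
```

With `p2 = 0` no cancellation is possible, so the estimated detection efficiency is exactly 100; comparing that case with a split such as `p1 = p2` at the same p shows how much efficiency is lost to cancellation for a given group size.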

$E(R)$ and $E(A)$ for the step-wise group screening design allowing for cancellations are given in theorem 3.1 and equation (3.17) respectively. Odhiambo (1986) gave the corresponding expressions for an s-stage group screening design as [...]

$p_2 = p - p_1$
For all partitions, the step-wise group screening design requires fewer runs than the corresponding two-stage group screening design.
Step-wise group screening designs outperform two-stage group screening designs for $p \geq 0.08$, for all partitions of p. That is, for $p \geq 0.08$ step-wise group screening designs both require fewer runs and are more efficient in detecting active factors than the two-stage group screening designs.
From Table 3 we again observe that for the three-stage group screening designs, the minimum expected number of runs for a particular value of p is slightly smaller when [...]

Denote by $A_s$ the number of undetected active factors within the N group-factors of size k found to be effective in the initial step. Further let A denote the total number of undetected active factors from among the f factors under investigation. Then [...]

[...] group-factor is effective and $\theta_r^*$ is the probability of cancellation in an $r$th order group-factor ($r = 1, 2, \ldots, s - 1$; $s \geq 2$).

In the tables, the horizontal lines indicate the beginning of a different partition. It should be noted that Tables 2 and 3 are only for illustration and are not exhaustive.

Table 2. Relative performance of step-wise and two-stage group screening designs for f = 100 and specified values of $p_1$ and $p_2$.

Table 3. Relative performance of step-wise and three-stage group screening designs for f = 100 and specified values of $p_1$ and $p_2$.