Rank Procedures for Repeated Measures with Missing Values

This article presents a nonparametric approach to analyzing research designs that include repeated observations on the same set of individuals or units, such as longitudinal panel studies. The data collected from different individuals are generally assumed to be independent, while several observations from the same individual or unit may be dependent. The approach suggested in this study is nonparametric in the sense that only the distribution functions are used to define the treatment effects, and suggested procedures are extensions of well-established rank order methods. Missing observations are especially common in studies with repeated measures, and three strategies for addressing this problem are compared: complete case analysis, last observation carried forward, and complete set analysis. The authors demonstrate these methods with an analysis of age trends and sex differences in alcohol consumption in a sample of U.S. adolescents who completed questionnaires four times in the 7th through 10th grades.

In many sociological, psychological, or medical studies, the subjects are observed repeatedly under different conditions, called treatments in the terminology of experimental design. Such designs can be described by several more or less complicated models that come under the framework of mixed models. They include growth curves, longitudinal data, or repeated-measures designs. The experimental units, often referred to as subjects or individuals, are typically people in sociological research, but they might instead be groups or organizations. The data collected from different experimental units are generally assumed to be independent, while those from the same unit may be, to some degree, dependent. As a result of this dependence, the analysis and interpretation of such studies become considerably more difficult than for experimental designs in which all observations are independent. Moreover, missing values pose an additional difficulty in studies with repeated measures.
The theory for the analysis of such designs using parametric or semiparametric mixed models is well developed and is well described in some recent textbooks (Diggle, Liang, and Zeger 1994; Lindsey 1993). For the analysis of general nonnormal data (particularly discrete data or data that are measured on an ordinal scale), the theory is less developed, particularly when several factors are present in the experiment or study. Such data occur in many sociological or psychological experiments and in numerous areas in medicine, where ratings or gradings of any type are observed. The designs of these trials are more or less complicated, and appropriate procedures for the analysis are required.
Here, we suggest an approach that is nonparametric in the sense that only the distribution functions are used to define and estimate the treatment effects and to formulate the hypotheses. The suggested procedures are extensions of the Wilcoxon-Mann-Whitney and Kruskal-Wallis tests to multidimensional factorial research designs and to repeated-measures designs. These extensions are based on (1) the idea of Akritas and Arnold (1994) to formulate nonparametric hypotheses in factorial designs by means of the distribution functions and (2) an extension of this idea to unbalanced repeated-measures designs in which the distribution functions may be discontinuous (Akritas and Brunner 1997; Brunner, Munzel, and Puri 1999). To motivate the procedures and to present the ideas, we consider the alcohol consumption study described in the following subsection.
Our empirical example is an analysis of the relationship of gender and age to alcohol consumption in a sample of American students, assessed once each year in the 7th through 10th grades. Prior research leads us to expect that we will find alcohol use to be associated with age and that this age trend will differ between boys and girls. National data sources, such as the Monitoring the Future Study, report that the rate of alcohol use increases sharply over the age span we are studying (Johnston, O'Malley, and Bachman 2000:95, 99-100). These data also indicate that in early adolescence (e.g., 8th grade), there is negligible gender difference in the prevalence of alcohol consumption. By the end of high school, however, males use alcohol at substantially higher rates than females (Johnston et al. 2000:95, 99-100), thus matching the gender difference that is typical of most forms of deviant or illegal behavior (Gottfredson and Hirschi 1990). The relatively early initiation of alcohol use by girls is consistent with Caspi, Moffitt, and Silva's (1993) portrayal of early pubertal maturation of girls prompting attention from older boys, who encourage them to join in activities such as substance use and delinquency.
The example is well suited to the goals of this study because there is a sharp decline in the rate of participation in the study over the four waves of data. Of the 321 students who completed the questionnaire in the 7th grade, 211 (66 percent) did so in the 8th grade, 184 (57 percent) in the 9th grade, and 169 (53 percent) in the 10th grade. Furthermore, there are likely to be systematic differences between the students who participated for different lengths of time.
The largest decline in participation came between the first and second years of the study, and most of that attrition was due to a change from accepting passive parental consent to requiring active consent. Students could participate in the first wave as long as parents did not disapprove, but a signed parental consent form was necessary to participate in later waves. Even after extensive efforts (including letters, phone calls, and inducements to students and teachers), researchers were unable to obtain completed forms from many families. Later attrition primarily resulted from families relocating away from the target schools and not providing a new address, which became more common as students entered high school in the 9th or 10th grades. Overall, it is likely that students who provided fewer waves of data came from families that lived less stable lives and were less cooperative with authorities. Thus, they may well be at higher risk for alcohol use.
These data come from the national evaluation of the G.R.E.A.T. gang prevention program (Esbensen et al. 2001). This evaluation included more than 3,000 students from 153 classrooms in 22 schools, located in six different cities. Although this sample is not representative in any systematic sense, it does capture much of the diversity of the United States. The research sites include many regions of the country, small to very large cities, very poor to solidly middle-class areas, and areas with very different racial and ethnic compositions. In the total sample, 51 percent of respondents were female and 49 percent male. The ethnic/racial composition of the sample was 46 percent white, 17 percent African American, 20 percent Latino, 4 percent Native American, and 4 percent Asian, with the remainder in other groups or of mixed ethnicity.
We report analyses based on a 12 percent random subsample from the larger study. The total sample is large enough that even trivial effects might reach statistical significance, so we preferred this smaller sample for illustrating the method we are presenting. Also, there is considerably less clustering by classroom and school in this smaller sample. Parametric analyses indicate only weak clustering effects in the total sample (Esbensen et al. 2001), and they should be of no consequence in the reduced sample.
The data come from self-administered questionnaires, completed under the supervision of the research team. Teachers were not present. The measure of alcohol use is a single item asking, "How many times in the last six months have you used alcohol?" Respondents answered with a specific number rather than choosing from response categories. The few highest responses were recoded to a maximum of 100. Although respondents are reporting a count of events, it would be unwise to assume that responses will follow formal statistical distributions that typify event counts, such as the Poisson or negative binomial. Almost all respondents who indicated using alcohol more than five times chose round numbers, such as multiples of 10. Limitations of memory and of time available for the task mean that these numbers are not literal counts but rather rough approximations. Thus, it is prudent to analyze these data as ordinal rather than metric, and we will use ranking methods to analyze this example.

STATISTICAL MODEL
Nonparametric models concerning rank methods for repeated measures were first investigated for the paired two-sample design in which the hypothesis was formulated through the marginal distributions of both samples. In the context of longitudinal data, the paired two-sample design corresponds to measurements on the same experimental unit at two different occasions. The marginal distributions are considered to investigate whether the observed variable has changed with time.
The first ideas to use the so-called marginal model to define a treatment effect and to formulate a hypothesis in a nonparametric mixed model date back to Mehra and Puri (1967); Hollander, Pledger, and Lin (1974); and Govindarajulu (1975). This approach was later extended and studied in more detail by Brunner and Neumann (1982), Thompson (1990, 1991), and Brunner and Denker (1994).
The idea underlying the nonparametric marginal model is to make use of solely the independence structure of the observations and to determine from the design which marginal distributions of the observed random vectors are identical. The questions of interest are then investigated using quantities that can be derived from the marginal distributions. Thus, one completely forgoes a parameterization and modeling of the distributions. The simplest way of modeling the independence structure is by associating independent replications with independent random variables while allowing random variables not corresponding to independent replications an arbitrary dependence structure.
To introduce the ideas and to keep the notation simple, we restrict ourselves to the case of two groups of subjects, which corresponds to the repeated-measures design underlying the example in the previous subsection. Here, two groups (boys: $i = 1$, girls: $i = 2$), each consisting of $n_i$ experimental units (subjects), are examined on $t$ different occasions (in the example, $t = 4$), where the measurements $X_{ik1}, \ldots, X_{ikt}$ (alcohol consumption) are observed on the $k$th subject in group $i$. These $t$ repeated measures are written in vector form as $\mathbf{X}_{ik} = (X_{ik1}, \ldots, X_{ikt})'$, $k = 1, \ldots, n_i$. The $n = n_1 + n_2$ vectors $\mathbf{X}_{ik}$ are assumed to be independent for $(i, k) \neq (i', k')$, with marginals

$$X_{iks} \sim F_{is}(x) = \tfrac{1}{2}\left[F_{is}^{-}(x) + F_{is}^{+}(x)\right], \quad i = 1, 2, \quad s = 1, \ldots, t, \tag{1.1}$$

where $F_{is}^{-}(x) = P(X_{iks} < x)$ denotes the left-continuous version, and $F_{is}^{+}(x) = P(X_{iks} \le x)$ denotes the right-continuous version of the distribution function of the random variable $X_{iks}$. The distribution function $F_{is}(x)$, defined in (1.1), is called the normalized version of the distribution function and is useful to describe models for random variables with continuous as well as with discontinuous distribution functions in a unified form (Ruymgaart 1980). Clearly, if the distribution function is continuous, then all three versions are identical; that is, $F_{is} = F_{is}^{-} = F_{is}^{+}$. The components of each vector $\mathbf{X}_{ik}$ can be arbitrarily dependent on one another. Since the observations of different subjects within one group are considered as replications of the experiment, it is reasonable to assume that the common distribution functions $G_{ik}(\mathbf{x})$ of the vectors $\mathbf{X}_{ik}$ are identical (i.e., they do not depend on the index $k$). Additional dependence structures or assumptions on the distributions will not be needed within this framework. This nonparametric model is very general and makes use exclusively of the distributions and the independence of the random vectors.
Thus, the notion of group and time effects and of the interaction between them, as well as the corresponding hypotheses, must be defined by the distributions. This will be provided in the next section.
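The three versions of a distribution function can be illustrated on an empirical distribution with ties. The following is a minimal sketch in Python; the function name `normalized_ecdf` and the sample values are hypothetical and are not taken from the study.

```python
from fractions import Fraction

def normalized_ecdf(sample, x):
    """Return (F_minus, F_plus, F) of the empirical distribution at x."""
    n = len(sample)
    f_minus = Fraction(sum(v < x for v in sample), n)   # left-continuous: P(X < x)
    f_plus = Fraction(sum(v <= x for v in sample), n)   # right-continuous: P(X <= x)
    return f_minus, f_plus, (f_minus + f_plus) / 2      # normalized version

# Hypothetical tied counts of drinking occasions (not data from the study).
sample = [0, 0, 0, 1, 2, 2, 5, 10]

at_jump = normalized_ecdf(sample, 2)      # x = 2 carries probability mass
off_jump = normalized_ecdf(sample, 3)     # no observation equals 3
print(at_jump, off_jump)
```

At a point with ties the three versions differ, and the normalized version lies halfway between the other two; at a point without probability mass all three coincide, as they would everywhere for a continuous distribution.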

RELATIVE EFFECTS
To describe treatment effects in the general marginal model (1.1), we generalize Birnbaum's (1956) idea of describing a nonparametric treatment effect for two independent random variables $Y_1 \sim F_1$ and $Y_2 \sim F_2$ with continuous distribution functions by the probability $p = P(Y_1 \le Y_2)$, applying it to the case of several samples, to data with ties, as well as to the case of repeated measures.
Let $\mathbf{X}_{ik} = (X_{ik1}, \ldots, X_{ikt})'$ denote independent random vectors with marginal distribution functions $F_{is}^{+}(x) = P(X_{iks} \le x)$, $i = 1, 2$, $k = 1, \ldots, n_i$, $s = 1, \ldots, t$. Then a treatment effect for group $i$ at time $s$ with respect to group $r$ at time $\ell$ ($i, r = 1, 2$ and $s, \ell = 1, \ldots, t$) is defined as the probability that a randomly selected observation from group $r$ at time $\ell$ is less than or equal to a randomly selected observation from group $i$ at time $s$ (if $r = i$, then the observations from different subjects $k$ and $k'$ must be selected); that is, $P(X_{rk\ell} \le X_{ik's})$, $(r, k) \neq (i, k')$.
We note that $P(X_{rk\ell} \le X_{ik's}) = P(X_{r1\ell} \le X_{i2s})$, $(r, k) \neq (i, k')$, since the observations $X_{rk\ell} \sim F_{r\ell}^{+}(x)$ and $X_{ik's} \sim F_{is}^{+}(x)$ are independent and identically distributed replications of the experiment. Let $N = t\,(n_1 + n_2)$ denote the total number of observations. Then the weighted mean

$$p_{is} = \frac{1}{N} \sum_{r=1}^{2} \sum_{\ell=1}^{t} n_r \, P(X_{r1\ell} \le X_{i2s}) \tag{2.1}$$

is called a relative treatment effect of $F_{is}^{+}$ with respect to the total experiment. Some explanation is required as to why a weighted sum of all pairwise probabilities is used in this definition. First, it should be noted that it may be difficult to draw consistent conclusions only from a subset of pairwise comparisons since they are not transitive in general. This is easily seen from the following counterexample.
Let $X_i \sim F_i^{+}(x)$, $i = 1, 2, 3$, where [...], and assume that larger values correspond to a better outcome. Then it follows that $P(X_1 \le X_2) = [\ldots]$ (i.e., $F_1^{+}$ is "better" than $F_2^{+}$), $P(X_2 \le X_3) = [\ldots]$ (i.e., $F_2^{+}$ is "better" than $F_3^{+}$), and $P(X_3 \le X_1) = [\ldots]$ (i.e., $F_3^{+}$ is "better" than $F_1^{+}$), which appears to be a paradoxical result (for details, see Gardner 1970). Such inconsistent results can be avoided if all distributions are compared with the same reference distribution $H^{+}(x)$ of a random variable $X_0$. Obviously, $H^{+}(x)$ must cover a wide range of possible observations if different distributions $F_i^{+}(x)$ should correspond to different relative effects $p_i = P(X_0 \le X_i)$. If, for example, $H^{+}(x)$ is chosen as [...], then $p_1 = P(X_0 \le X_1) = p_2 = P(X_0 \le X_2) = p_3 = P(X_0 \le X_3) = 1$, although the pairwise relative effects for the three distributions are different from $\tfrac{1}{2}$. Therefore, a reasonable reference distribution should include all distributions of the experiment. One useful choice of a suitable reference distribution is the mean of all distributions in the experiment,

$$H^{+}(x) = \frac{1}{N} \sum_{r=1}^{2} \sum_{\ell=1}^{t} n_r \, F_{r\ell}^{+}(x),$$

where the distribution functions $F_{r\ell}^{+}(x)$ are weighted by the sample sizes $n_1$ and $n_2$ in the two experimental groups of subjects. The relative treatment effect $p_{is}$, defined in (2.1), can be considered a numerical quantification of the comparison between the independent random variables $X_{iks} \sim F_{is}^{+}(x)$ and $X_0 \sim H^{+}(x)$. We note that the procedures considered by Brunner and Neumann (1982), Kepner and Robinson (1988), Thompson (1990, 1991), and Akritas and Arnold (1994) are based on the relative treatment effects $p_{is}$, defined in (2.1).
The case of distribution functions with common points of discontinuity is taken into account by using the normalized version $F_{is}(x) = \tfrac{1}{2}\left[F_{is}^{-}(x) + F_{is}^{+}(x)\right]$ of the distribution function, which leads to the normalized version of the mean distribution function

$$H(x) = \frac{1}{N} \sum_{r=1}^{2} \sum_{\ell=1}^{t} n_r \, F_{r\ell}(x).$$

By using the normalized version $H(x)$, the relative effect $p_{is}$ ($i = 1, 2$; $s = 1, \ldots, t$) is generalized to possibly discontinuous distribution functions; that is,

$$p_{is} = \int H \, dF_{is} = \frac{1}{N} \sum_{r=1}^{2} \sum_{\ell=1}^{t} n_r \left[ P(X_{r1\ell} < X_{i2s}) + \tfrac{1}{2} P(X_{r1\ell} = X_{i2s}) \right]. \tag{2.4}$$

If the distribution functions $F_{is}(x)$ are continuous, then $P(X_{r1\ell} < X_{i2s}) = P(X_{r1\ell} \le X_{i2s})$ and $P(X_{r1\ell} = X_{i2s}) = 0$, which shows that $p_{is}$, defined in (2.1), is just a special case of $p_{is}$, defined in (2.4). The procedures considered by Akritas and Brunner (1997) and by Brunner et al. (1999) are based on this generalization of the relative treatment effect $p_{is}$.
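The lack of transitivity of pairwise comparisons, and the way a common reference distribution removes it, can be checked numerically. The sketch below uses a set of nontransitive dice chosen purely for illustration (they are an assumption and not necessarily the distributions of the counterexample above), and a hypothetical helper `relative_effect` that evaluates $P(R < X) + \tfrac{1}{2} P(R = X)$ for discrete uniform distributions.

```python
from fractions import Fraction
from itertools import product

def relative_effect(reference, sample):
    """P(R < X) + 1/2 * P(R = X), R uniform on `reference`, X uniform on `sample`."""
    total = Fraction(0)
    for r, x in product(reference, sample):
        if r < x:
            total += 1
        elif r == x:
            total += Fraction(1, 2)
    return total / (len(reference) * len(sample))

# Hypothetical nontransitive dice (illustrative only).
a = [2, 2, 4, 4, 9, 9]
b = [1, 1, 6, 6, 8, 8]
c = [3, 3, 5, 5, 7, 7]

# Pairwise comparisons form a cycle: a beats b, b beats c, c beats a.
pairwise = {
    "a vs b": relative_effect(b, a),
    "b vs c": relative_effect(c, b),
    "c vs a": relative_effect(a, c),
}

# Comparing every die with the pooled (mean) distribution removes the cycle.
pooled = a + b + c
pooled_effects = {name: relative_effect(pooled, die)
                  for name, die in (("a", a), ("b", b), ("c", c))}
print(pairwise, pooled_effects)
```

Each pairwise effect exceeds one half, so no consistent ordering exists, whereas the effects relative to the pooled distribution are all equal to one half. Because the three dice have equal size, pooling them corresponds to the equally weighted mean of the three distributions.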

HYPOTHESES
To conveniently formulate the hypotheses in this design with $t$ repeated measures and two independent groups of subjects, we use vector and matrix notation. Let $\mathbf{F} = (F_{11}, \ldots, F_{1t}, F_{21}, \ldots, F_{2t})'$ denote the vector of the marginal distributions, and let $\mathbf{p} = (p_{11}, \ldots, p_{1t}, p_{21}, \ldots, p_{2t})'$ denote the vector of the relative treatment effects. In the sequel, the situations of no group effect, no time effect, and no interaction between group and time are characterized by means of the distribution functions.
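In the two-sample design with independent observations, the hypothesis of no treatment effect is commonly formulated as $H_0: F_1 = F_2$ or equivalently as $H_0: F_1 - F_2 = 0$, where $0$ denotes a function that is identically 0. Let $\mathbf{F} = (F_1, F_2)'$ denote the vector of the distribution functions. Then this hypothesis can be written in vector notation as $(1, -1)\mathbf{F} = 0$. This concept to formulate nonparametric hypotheses by means of the distribution functions has been extended to factorial designs by Akritas and Arnold (1994) and is applied to the nonparametric marginal model in (1.1). The fixed factor group with the two levels $M$ and $F$ (gender) is denoted by $A$, the fixed factor time with the four levels $1, \ldots, 4$ (time points) is denoted by $T$, and the interaction between these two factors by $AT$. With the means $\bar{F}_{i\cdot} = \frac{1}{t}\sum_{s=1}^{t} F_{is}$, $\bar{F}_{\cdot s} = \frac{1}{2}\sum_{i=1}^{2} F_{is}$, and $\bar{F}_{\cdot\cdot} = \frac{1}{2t}\sum_{i=1}^{2}\sum_{s=1}^{t} F_{is}$, the following hypotheses are considered:
1. No effect of the factor $A$ (group): $H_0(A): \bar{F}_{1\cdot} = \bar{F}_{2\cdot}$.
2. No effect of the factor $T$ (time): $H_0(T): \bar{F}_{\cdot 1} = \cdots = \bar{F}_{\cdot t}$.
3. No interaction between the factors $A$ and $T$: $H_0(AT): F_{is} = \bar{F}_{i\cdot} + \bar{F}_{\cdot s} - \bar{F}_{\cdot\cdot}$, $i = 1, 2$, $s = 1, \ldots, t$.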
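To write these hypotheses equivalently in matrix notation, let $\mathbf{1}_t = (1, \ldots, 1)'$ denote the $t$-dimensional vector with all components equal to 1, let $\mathbf{J}_t = \mathbf{1}_t \mathbf{1}_t'$ denote the $t \times t$ matrix with all elements equal to 1, let $\mathbf{I}_t$ denote the $t$-dimensional identity matrix, and let $\mathbf{P}_t = \mathbf{I}_t - \frac{1}{t}\mathbf{J}_t$ denote the so-called $t$-dimensional centering matrix. Finally, let $\mathbf{A} \otimes \mathbf{B}$ denote the Kronecker product of the two matrices $\mathbf{A}$ and $\mathbf{B}$, which means that each element of the matrix $\mathbf{A}$ is multiplied by the matrix $\mathbf{B}$. Then the three hypotheses are written in matrix notation as

$$H_0(A): \left(\mathbf{P}_2 \otimes \tfrac{1}{t}\mathbf{1}_t'\right)\mathbf{F} = \mathbf{0}, \qquad H_0(T): \left(\tfrac{1}{2}\mathbf{1}_2' \otimes \mathbf{P}_t\right)\mathbf{F} = \mathbf{0}, \qquad H_0(AT): \left(\mathbf{P}_2 \otimes \mathbf{P}_t\right)\mathbf{F} = \mathbf{0}. \tag{2.5}$$

Similar relations follow for the relative treatment effects under these nonparametric hypotheses since from $\mathbf{CF} = \mathbf{0}$, it follows that also $\mathbf{Cp} = \mathbf{0}$ (a formal derivation of this implication is given in Appendix 5.1). In particular, we have

$$H_0(A) \Rightarrow \left(\mathbf{P}_2 \otimes \tfrac{1}{t}\mathbf{1}_t'\right)\mathbf{p} = \mathbf{0}, \qquad H_0(T) \Rightarrow \left(\tfrac{1}{2}\mathbf{1}_2' \otimes \mathbf{P}_t\right)\mathbf{p} = \mathbf{0}, \qquad H_0(AT) \Rightarrow \left(\mathbf{P}_2 \otimes \mathbf{P}_t\right)\mathbf{p} = \mathbf{0}.$$

If the random variables $X_{iks}$ are observed on a metric scale and if the expectations $\mu_{is} = E(X_{iks})$, $i = 1, 2$, $s = 1, \ldots, t$, exist, then, under the nonparametric hypotheses, similar relations hold for the expectations $\mu_{is}$ since $\mathbf{CF} = \mathbf{0} \Rightarrow \mathbf{C}\boldsymbol{\mu} = \mathbf{0}$, where $\boldsymbol{\mu} = (\mu_{11}, \ldots, \mu_{1t}, \mu_{21}, \ldots, \mu_{2t})'$ denotes the vector of the expectations (for a formal derivation, see also Appendix 5.1). In particular, the implications for the three hypotheses of no group effect, no time effect, and no interaction between group and time are

$$H_0(A) \Rightarrow \left(\mathbf{P}_2 \otimes \tfrac{1}{t}\mathbf{1}_t'\right)\boldsymbol{\mu} = \mathbf{0}, \qquad H_0(T) \Rightarrow \left(\tfrac{1}{2}\mathbf{1}_2' \otimes \mathbf{P}_t\right)\boldsymbol{\mu} = \mathbf{0}, \qquad H_0(AT) \Rightarrow \left(\mathbf{P}_2 \otimes \mathbf{P}_t\right)\boldsymbol{\mu} = \mathbf{0}.$$

This means that under the nonparametric hypothesis $H_0: \mathbf{CF} = \mathbf{0}$, the corresponding linear contrasts $\mathbf{Cp}$ of the relative treatment effects, as well as the corresponding linear contrasts $\mathbf{C}\boldsymbol{\mu}$ of the expectations, are equal to $\mathbf{0}$. Therefore, the hypotheses in (2.5) can be used as reasonable characterizations to describe the situations of no group effect, no time effect, or no interaction in the nonparametric model.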
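The Kronecker-product construction can be made concrete for two groups and four time points. The sketch below assumes the usual contrast matrices for this design, namely $\mathbf{P}_2 \otimes \frac{1}{t}\mathbf{1}_t'$ for the group effect, $\frac{1}{2}\mathbf{1}_2' \otimes \mathbf{P}_t$ for the time effect, and $\mathbf{P}_2 \otimes \mathbf{P}_t$ for the interaction. The helper names and the cell means are hypothetical; plain lists and exact fractions are used instead of a matrix library to keep the example dependency-free.

```python
from fractions import Fraction

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def centering(d):
    """Centering matrix P_d = I_d - (1/d) J_d."""
    return [[(1 if i == j else 0) - Fraction(1, d) for j in range(d)]
            for i in range(d)]

def mean_row(d):
    """Row vector (1/d) 1_d' that averages over d levels."""
    return [[Fraction(1, d)] * d]

def matvec(m, v):
    """Matrix-vector product."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

groups, t = 2, 4                                   # two groups, four time points
c_group = kron(centering(groups), mean_row(t))     # 2 x 8 contrast matrix
c_time = kron(mean_row(groups), centering(t))      # 4 x 8 contrast matrix
c_inter = kron(centering(groups), centering(t))    # 8 x 8 contrast matrix

# Hypothetical additive cell means mu_is = alpha_i + beta_s (no interaction),
# ordered (11, ..., 14, 21, ..., 24) like the vector F in the text.
alpha, beta = [0, 3], [1, 2, 4, 7]
mu = [al + be for al in alpha for be in beta]

print(matvec(c_group, mu))   # centered group means
print(matvec(c_time, mu))    # centered time means
print(matvec(c_inter, mu))   # all zero for an additive pattern
```

For an additive pattern the interaction contrasts vanish while the group and time contrasts do not, which mirrors how the three hypotheses separate the effects.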

STRATEGIES FOR ANALYZING DATA WITH MISSING VALUES
First we discuss some strategies for analyzing longitudinal data with missing values. Each of these can be applied in combination with the estimators and statistics described in the next two subsections. The simplest way is to leave out all subjects for whom at least one observation is missing. This strategy is referred to as complete case analysis (CCA). The advantage of CCA is that there are no longer missing values in the data set that is analyzed, and all formulas for complete observations can be used. The disadvantages are that a lot of information is thrown away (in the worst case, all available data) and that the subjects for which all observations are available must be considered a random sample of all subjects in the study. In the terminology of Rubin (1976), this means that the observations are missing completely at random. Because this assumption is questionable in many studies, several strategies are in use to replace the unobserved values by some reasonable values. Such more or less sophisticated estimation procedures take into account that whether a measurement could be observed may be related to the outcome of the study. Such imputation methods have been developed for parametric as well as semiparametric models during the past three decades. It seems to be difficult to apply these ideas to purely nonparametric models as considered in this study, and the development of appropriate procedures remains a task for future research.
A rather simple but nevertheless popular method of imputing unobserved values is the so-called last observation carried forward (LOCF) method, where the missing values are replaced with the last observed measurement. The advantage of LOCF is also that there are no longer missing values in the data set, and all formulas for complete observations can be used. The obvious disadvantage is that missing data are replaced with "guessed" data, which means that there is no real gain of information; instead, a bias may be caused by the virtual data. Moreover, the assumption of identically distributed random vectors $\mathbf{X}_{ik}$ within group $i$ is no longer true since observations carried forward are completely dependent on the last observed value.
A third method, the so-called complete set analysis (CSA), uses all data that have been observed. This leads to technically difficult statistical procedures. Moreover, the estimated covariance matrix may no longer meet the basic properties of a covariance matrix. In particular, it may happen that the variances and covariances do not fulfill the Cauchy-Schwarz inequality and/or that the covariance matrix is not positive semidefinite. This requires rather complicated correction procedures (see the remark in Section 3.3). The advantage is that all information that is available is used. It may be noted, however, that even CSA requires that the observations are missing completely at random.
Currently, the experimenter is left with only one practical recourse when faced with the problem of missing values in the framework of a nonparametric model: One must carry out different analyses and then compare and discuss the results obtained. A careful subgroup analysis may improve the interpretation of these results. Such subgroups can be obtained from a stratification of the data, where the strata are defined by those subjects who drop out after the first time point, after the second time point, and so on. The data are then analyzed for those earlier time points when the groups are complete. This technique will be briefly demonstrated in the next subsection.
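How the three strategies and the dropout stratification change the data set can be sketched as follows. The data layout (one list per subject, `None` for a missing value), the function names, and the values are hypothetical; CSA needs no preprocessing because it uses the rows as they are.

```python
def complete_cases(rows):
    """CCA: keep only subjects with no missing value (None)."""
    return [row for row in rows if all(v is not None for v in row)]

def locf(rows):
    """LOCF: replace each missing value with the subject's last observed one."""
    filled = []
    for row in rows:
        new_row, last = [], None
        for v in row:
            if v is not None:
                last = v
            new_row.append(last)   # stays None if nothing was observed before
        filled.append(new_row)
    return filled

def dropout_strata(rows):
    """Group subjects by the number of time points observed before dropout."""
    strata = {}
    for row in rows:
        observed = 0
        for v in row:
            if v is None:
                break
            observed += 1
        strata.setdefault(observed, []).append(row)
    return strata

# Hypothetical alcohol-use reports for four subjects at four time points.
rows = [
    [0, 1, 2, 5],
    [0, 0, None, None],
    [1, None, None, None],
    [0, 2, 3, None],
]

cca_data = complete_cases(rows)
locf_data = locf(rows)
strata = dropout_strata(rows)
print(cca_data, locf_data, sorted(strata))
```

In this toy data set CCA retains a single subject, which shows how quickly information is lost, while LOCF produces constant tails that are fully determined by the last observed value.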

ESTIMATORS FOR THE RELATIVE EFFECTS
Statistics for testing the nonparametric hypotheses considered in Section 2 are based on consistent estimators of the relative marginal effects $p_{is}$. If there are no missing values in the data set, the notation introduced in Section 1.2 will be sufficient for the derivation of the estimators and statistics. This is also true if there are missing values and the CCA or the LOCF strategy is used to create a complete data set. If, however, the CSA strategy is used, we need further notation. To indicate whether an observation is missing or not, indicators $\lambda_{iks}$ are introduced; that is,

$$\lambda_{iks} = \begin{cases} 1 & \text{if } X_{iks} \text{ is observed,} \\ 0 & \text{if } X_{iks} \text{ is missing.} \end{cases}$$

The indicators $\lambda_{iks}$ can be considered as random variables. If, however, the observations are missing completely at random (Rubin 1976), the indicators are independent of the observations. We can therefore carry out the analysis conditionally on the indicators and treat them as fixed. Furthermore, note that the new notation with the indicators $\lambda_{iks}$ is a generalization of the notation in Section 1.2. If there are no missing values in the original data set or in the data set created by the CCA or LOCF strategy, all indicators are equal to 1. Therefore, we can use the more general notation, regardless of whether there are missing values or which strategy for handling missing values is used.
Denote by $\lambda_{i \cdot s} = \sum_{k=1}^{n_i} \lambda_{iks}$ the number of available observations at time $s$ in group $i$, and let $N = \sum_{i=1}^{2} \sum_{k=1}^{n_i} \sum_{s=1}^{t} \lambda_{iks}$ denote the number of all observations $X_{iks}$. Then the relative treatment effect $p_{is}$, defined in (2.4), can be estimated from the ranks of the observations by

$$\hat{p}_{is} = \frac{1}{N}\left(\bar{R}_{i \cdot s} - \tfrac{1}{2}\right),$$

where $\bar{R}_{i \cdot s} = \lambda_{i \cdot s}^{-1} \sum_{k=1}^{n_i} \lambda_{iks} R_{iks}$, $i = 1, 2$, $s = 1, \ldots, t$, and $R_{iks}$ denotes the rank of $X_{iks}$ among all $N$ observed values. If $X_{iks}$ is missing, then $R_{iks} = 0$, for convenience. Thus, $\bar{R}_{i \cdot s}$ is the mean in group $i$ at time $s$ of the ranks of the $N$ observed $X_{iks}$. For a derivation of the estimators $\hat{p}_{is}$, see Appendix 5.2. The estimator $\hat{p}_{is}$ is consistent for the relative effect $p_{is}$ if the minimal number of subjects at time $s$ in group $i$ for which observations are available tends to infinity (for details, see Appendix 5.3 and Proposition 3.1 of Brunner et al. 1999).
The vector of the estimators $\hat{p}_{is}$ for study group $i$ is denoted by $\hat{\mathbf{p}}_i = (\hat{p}_{i1}, \ldots, \hat{p}_{it})'$, and $\hat{\mathbf{p}} = (\hat{\mathbf{p}}_1', \hat{\mathbf{p}}_2')'$ is the vector of the estimators in both study groups. Let $\bar{\mathbf{R}}_{i \cdot} = (\bar{R}_{i \cdot 1}, \ldots, \bar{R}_{i \cdot t})'$ denote the vector of the rank means $\bar{R}_{i \cdot s}$ for the $t$ repeated measures within group $i$. Then, the rank representation of $\hat{\mathbf{p}}_i$ is given by

$$\hat{\mathbf{p}}_i = \frac{1}{N}\left(\bar{\mathbf{R}}_{i \cdot} - \tfrac{1}{2}\mathbf{1}_t\right).$$

The estimators $\hat{p}_{is}$ of the relative marginal effects $p_{is}$ can be used to describe the results of the study. The quantity $\hat{p}_{is}$ estimates the probability that a randomly selected observation from the total data set (both groups, all subjects, and all time points) is smaller than a randomly selected observation at time point $s$ within group $i$.
For the alcohol consumption study, this means that the lower the value of $\hat{p}_{is}$, the lower the alcohol consumption at time point $s$ in group $i$ compared with the consumption at all time points in the total sample. The time curves of the estimated relative marginal effects $\hat{p}_{is}$ for the two study groups $M$ and $F$ are displayed in Figure 1.
Figure 1 suggests that the relative marginal effects increase over time in both study groups, while no uniform effect of the gender of the subjects is visible. Rather, there seems to be an interaction between gender and time (i.e., differing age trends for boys and girls). To evaluate this descriptive impression by an inferential analysis, the distributional aspects of the estimators $\hat{\mathbf{p}}_i$ are considered below.
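The rank estimator can be sketched in a few lines. The example below assumes the form $\hat{p}_{is} = (\bar{R}_{i \cdot s} - \tfrac{1}{2})/N$ with midranks assigned to tied values and with missing observations simply left out of the ranking (the CSA strategy). The function name, the nested-list data layout, and the values are hypothetical.

```python
from bisect import bisect_left, bisect_right
from fractions import Fraction

def relative_effects(groups):
    """Estimate the relative effects from midranks of all observed values.

    groups[i][k][s] is the value of subject k in group i at time s, or None
    if missing. Every (group, time) cell needs at least one observed value.
    """
    observed = sorted(v for g in groups for subj in g for v in subj
                      if v is not None)
    n_obs = len(observed)                 # N, the number of observed values
    t = len(groups[0][0])

    def midrank(v):
        # (number of smaller values) + (number of tied values + 1) / 2
        return Fraction(bisect_left(observed, v) + bisect_right(observed, v) + 1, 2)

    p_hat = []
    for g in groups:
        row = []
        for s in range(t):
            cell = [subj[s] for subj in g if subj[s] is not None]
            mean_rank = sum(midrank(v) for v in cell) / len(cell)
            row.append((mean_rank - Fraction(1, 2)) / n_obs)
        p_hat.append(row)
    return p_hat

# Hypothetical data: two subjects per group, two time points, one missing value.
boys = [[0, 1], [0, None]]
girls = [[2, 2], [1, 3]]
p_hat = relative_effects([boys, girls])
print(p_hat)
```

A useful plausibility check is that the estimates, weighted by the number of available observations per cell and divided by $N$, average to exactly one half, because the ranks $1, \ldots, N$ always have mean $(N + 1)/2$.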

TEST STATISTICS
For testing the general nonparametric hypothesis $H_0: \mathbf{CF} = \mathbf{0}$, we first consider the large sample distribution of the contrast vector of the relative marginal effects $\sqrt{n}\,\mathbf{C}\hat{\mathbf{p}}$, where $n = n_1 + n_2$ denotes the total number of subjects. In the case that no observations are missing, Akritas and Brunner (1997) have shown that for large sample sizes, $\sqrt{n}\,\mathbf{C}\hat{\mathbf{p}}$ has approximately a multivariate normal distribution with expectation $\mathbf{0}$ and covariance matrix $\mathbf{C}\mathbf{V}_n\mathbf{C}'$ under the hypothesis $H_0: \mathbf{CF} = \mathbf{0}$. The matrix $\mathbf{V}_n$ is a covariance matrix that has a block-diagonal structure and is given by

$$\mathbf{V}_n = \begin{pmatrix} \dfrac{n}{n_1}\mathbf{V}_1 & \mathbf{0} \\ \mathbf{0} & \dfrac{n}{n_2}\mathbf{V}_2 \end{pmatrix}.$$

The unknown covariance matrices $\mathbf{V}_i$, $i = 1, 2$, can be consistently estimated by the sample covariance matrices of the ranks $R_{iks}$. The only assumption needed to show this result is that the ratios $n/n_i \le N_0 < \infty$, $i = 1, 2$, of the total number of subjects $n$ and the number $n_i$ of subjects in study group $i$ remain bounded when $n \to \infty$. In the case of values missing completely at random, a similar result has been derived by Brunner et al. (1999) under the assumption that $n/\lambda_0 \le N_0 < \infty$, where $\lambda_0$ denotes the minimal number of subjects at time $s$ in group $i$ for which observations are available. For details, we refer to Appendix 5.3.
Remark. The estimation of the covariance matrix $\hat{\mathbf{V}}_i$ can yield a matrix that is not positive semidefinite if the CSA strategy is applied in case of missing values. Numerous suggestions on the modification of the matrix ensuring positive semidefiniteness can be found in the literature (see, e.g., Rousseeuw and Molenberghs 1993). In any case, great caution is necessary in the analysis and interpretation when many observations are missing.
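For complete data (no missing values, or after applying CCA or LOCF), the covariance estimate can be sketched as follows. The sketch assumes the form $\hat{\mathbf{V}}_i = \frac{1}{N^2 (n_i - 1)} \sum_k (\mathbf{R}_{ik} - \bar{\mathbf{R}}_{i \cdot})(\mathbf{R}_{ik} - \bar{\mathbf{R}}_{i \cdot})'$ with blocks scaled by $n/n_i$; this form is an assumption taken from the general literature on these rank statistics and is not a transcription of the estimator in Appendix 5.3, which also covers missing values. Function names and rank vectors are hypothetical.

```python
from fractions import Fraction

def rank_covariance(rank_vectors, n_obs):
    """Sample covariance of one group's rank vectors, scaled by 1 / N^2."""
    n_i, t = len(rank_vectors), len(rank_vectors[0])
    means = [sum(r[s] for r in rank_vectors) / Fraction(n_i) for s in range(t)]
    return [[sum((r[s] - means[s]) * (r[u] - means[u]) for r in rank_vectors)
             / ((n_i - 1) * n_obs ** 2)
             for u in range(t)] for s in range(t)]

def block_diagonal(blocks):
    """Place square blocks along the diagonal of a zero matrix."""
    size = sum(len(b) for b in blocks)
    out = [[Fraction(0)] * size for _ in range(size)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, v in enumerate(row):
                out[offset + i][offset + j] = v
        offset += len(b)
    return out

def covariance_vn(groups_ranks):
    """Assumed estimate of V_n: block diagonal of (n / n_i) * V_i_hat."""
    n = sum(len(g) for g in groups_ranks)                      # subjects
    n_obs = sum(len(r) for g in groups_ranks for r in g)       # observations N
    blocks = []
    for g in groups_ranks:
        scale = Fraction(n, len(g))
        blocks.append([[scale * v for v in row]
                       for row in rank_covariance(g, n_obs)])
    return block_diagonal(blocks)

# Hypothetical rank vectors: two subjects per group, two time points, N = 8.
ranks_boys = [[1, 2], [3, 4]]
ranks_girls = [[5, 8], [7, 6]]
v_n = covariance_vn([ranks_boys, ranks_girls])
print(v_n)
```

The off-diagonal blocks are zero because subjects in different groups are independent, while the within-group blocks capture the dependence among the repeated measures of the same subject.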
To test the global hypotheses of no group or time effect or of no interaction, we shall use the quadratic forms $Q_n(\mathbf{C})$ and $F_n(\mathbf{C})$ given below in (3.3) and (3.4), respectively, while hypotheses concerning a conjectured trend or paired comparisons are tested by the linear form $T_n(\mathbf{w})$, given in (3.5).
For testing the global hypothesis $H_0: \mathbf{CF} = \mathbf{0}$, the Wald-type statistic (WTS) $Q_n(\mathbf{C})$ is commonly used. This statistic is a quadratic form in the contrasts $\sqrt{n}\,\mathbf{C}\hat{\mathbf{p}}$, where a generalized inverse $[\mathbf{C}\hat{\mathbf{V}}_n\mathbf{C}']^{-}$ of the estimated covariance matrix $\mathbf{C}\hat{\mathbf{V}}_n\mathbf{C}'$ is used to generate the quadratic form. Note that it is not possible to replace the generalized inverse by a simple inverse in general because the matrix $\mathbf{C}$ might be singular. It follows from Theorem 5.1 in Appendix 5.3 that the statistic

$$Q_n(\mathbf{C}) = n\,\hat{\mathbf{p}}'\mathbf{C}'\left[\mathbf{C}\hat{\mathbf{V}}_n\mathbf{C}'\right]^{-}\mathbf{C}\hat{\mathbf{p}} \tag{3.3}$$

has, asymptotically, a central $\chi^2_{\operatorname{rank}(\mathbf{C})}$-distribution under the hypothesis $H_0$ if $\mathbf{V}_n$ is of full rank. However, extremely large sample sizes are needed to achieve an acceptable approximation by this distribution. Therefore, we will not use the WTS but the so-called ANOVA-type statistic. This statistic is described below.
The problems appearing with Wald-type statistics are mainly caused by estimating the covariance matrix $\mathbf{V}_n$, in particular, when all elements of $\mathbf{V}_n$ must be estimated. The idea, therefore, is to leave out the estimated matrix $\hat{\mathbf{V}}_n$ and consider the quadratic form $\widetilde{Q}_n(\mathbf{C}) = n\,\hat{\mathbf{p}}'\mathbf{M}\hat{\mathbf{p}}$, where $\mathbf{M} = \mathbf{C}'[\mathbf{C}\mathbf{C}']^{-}\mathbf{C}$, and $[\mathbf{C}\mathbf{C}']^{-}$ denotes some generalized inverse of $\mathbf{C}\mathbf{C}'$. Note that $\mathbf{M}$ is a projection matrix and that $\mathbf{MF} = \mathbf{0} \iff \mathbf{CF} = \mathbf{0}$ because $\mathbf{C}'[\mathbf{C}\mathbf{C}']^{-}$ is a generalized inverse of $\mathbf{C}$. Thus, it is reasonable to use the quadratic form $\widetilde{Q}_n(\mathbf{C})$ as a statistic for testing the hypothesis $H_0: \mathbf{CF} = \mathbf{0}$. The large sample distribution of $\widetilde{Q}_n(\mathbf{C})$ can be approximated by a scaled $\chi^2$-distribution. The same approximation technique is used for estimating the degrees of freedom for the $t$-test with unequal variances. The results are summarized below.
Under $H_0: \mathbf{CF} = \mathbf{0}$, the first two moments of the ANOVA-type statistic (ATS),

$$F_n(\mathbf{C}) = \frac{n}{\operatorname{tr}(\mathbf{M}\hat{\mathbf{V}}_n)}\,\hat{\mathbf{p}}'\mathbf{M}\hat{\mathbf{p}}, \tag{3.4}$$

and of the $F(f, \infty)$-distribution coincide asymptotically for

$$f = \frac{\left[\operatorname{tr}(\mathbf{M}\mathbf{V}_n)\right]^2}{\operatorname{tr}(\mathbf{M}\mathbf{V}_n\mathbf{M}\mathbf{V}_n)},$$

where the unknown traces $\operatorname{tr}(\mathbf{M}\mathbf{V}_n)$ and $\operatorname{tr}(\mathbf{M}\mathbf{V}_n\mathbf{M}\mathbf{V}_n)$ can be estimated consistently by replacing $\mathbf{V}_n$ with $\hat{\mathbf{V}}_n$, given in (5.3) in Appendix 5.3; this yields the estimator $\hat{f}$. For the derivation of these results see, for example, Brunner et al. (1999).
Remark. This approximation dates back to Box (1954) and turns out to be quite accurate for independent observations (see Brunner, Dette, and Munk 1997). For repeated measures, the estimator $\hat{f}$ may be slightly biased for small sample sizes.
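Once $\hat{\mathbf{p}}$, the projection matrix, and the covariance estimate are available, the ANOVA-type statistic and its estimated degrees of freedom require only traces and one quadratic form. The sketch below assumes the forms $F_n = n\,\hat{\mathbf{p}}'\mathbf{M}\hat{\mathbf{p}} / \operatorname{tr}(\mathbf{M}\hat{\mathbf{V}}_n)$ and $\hat{f} = [\operatorname{tr}(\mathbf{M}\hat{\mathbf{V}}_n)]^2 / \operatorname{tr}(\mathbf{M}\hat{\mathbf{V}}_n\mathbf{M}\hat{\mathbf{V}}_n)$; the function names and the toy inputs are hypothetical. The projection matrix is passed in directly; for a contrast matrix that is itself symmetric and idempotent, such as $\mathbf{P}_2 \otimes \mathbf{P}_t$, it equals the contrast matrix.

```python
from fractions import Fraction

def matmul(a, b):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

def anova_type_statistic(p_hat, m, v_hat, n):
    """Return (F_n, f_hat) for projection matrix m and covariance estimate v_hat."""
    m_p = [sum(x * y for x, y in zip(row, p_hat)) for row in m]
    quad = sum(x * y for x, y in zip(p_hat, m_p))        # p' M p
    mv = matmul(m, v_hat)
    stat = n * quad / trace(mv)
    f_hat = trace(mv) ** 2 / trace(matmul(mv, mv))
    return stat, f_hat

# Toy inputs (hypothetical): two effects, centering matrix as projection,
# identity matrix as covariance estimate, n = 10 subjects.
half = Fraction(1, 2)
m = [[half, -half], [-half, half]]
v_hat = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
p_hat = [Fraction(2, 5), Fraction(3, 5)]

stat, f_hat = anova_type_statistic(p_hat, m, v_hat, n=10)
print(stat, f_hat)
```

The p-value would then be taken from the $F(\hat{f}, \infty)$-distribution, which is the distribution of a $\chi^2_{\hat{f}}$ variable divided by $\hat{f}$; that step needs a statistics library and is omitted here.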
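To derive test statistics that are especially sensitive to a conjectured patterned alternative, the method of Page (1963) and Hettmansperger and Norton (1987) was extended to repeated-measures designs by Akritas and Brunner (1996). The idea is to weight the estimated treatment effects by a set of constants $w_{11}, \ldots, w_{2t}$ reproducing the conjectured pattern of the alternative, which has to be specified in advance. Let $\mathbf{w} = (w_{11}, \ldots, w_{2t})'$ denote the vector of the weights $w_{is}$. Then, under $H_0: \mathbf{CF} = \mathbf{0}$, it follows from Theorem 5.1 in Appendix 5.3 that the linear rank statistic $L_n(\mathbf{w}) = \sqrt{n}\,\mathbf{w}'\mathbf{C}\hat{\mathbf{p}}$ has, asymptotically, a normal distribution with mean 0 and variance $\sigma_n^2 = \mathbf{w}'\mathbf{C}\mathbf{V}_n\mathbf{C}'\mathbf{w}$, which can be estimated consistently by $\hat{\sigma}_n^2 = \mathbf{w}'\mathbf{C}\hat{\mathbf{V}}_n\mathbf{C}'\mathbf{w}$, where $\hat{\mathbf{V}}_n$ is given in (5.3) in Appendix 5.3. Finally, it follows that the linear form

$$T_n(\mathbf{w}) = \frac{L_n(\mathbf{w})}{\hat{\sigma}_n} \tag{3.5}$$

has, asymptotically, a standard normal distribution under $H_0: \mathbf{CF} = \mathbf{0}$.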
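The patterned-alternative statistic can be sketched as a standardized weighted contrast. The example assumes the form $T_n(\mathbf{w}) = \sqrt{n}\,\mathbf{w}'\mathbf{C}\hat{\mathbf{p}} / \sqrt{\mathbf{w}'\mathbf{C}\hat{\mathbf{V}}_n\mathbf{C}'\mathbf{w}}$ and a one-sided test against an increasing trend. The function name, the single-group setup with a $4 \times 4$ centering matrix, and all numbers are hypothetical.

```python
import math

def trend_statistic(p_hat, c, w, v_hat, n):
    """Return (T_n, one-sided p-value) for weights w and contrast matrix c."""
    dim = len(p_hat)
    # a = C' w collapses the weighted contrasts into a single vector.
    a = [sum(w[r] * c[r][j] for r in range(len(c))) for j in range(dim)]
    l_n = math.sqrt(n) * sum(x * y for x, y in zip(a, p_hat))
    var = sum(a[i] * v_hat[i][j] * a[j] for i in range(dim) for j in range(dim))
    t_n = l_n / math.sqrt(var)
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(t_n / math.sqrt(2))
    return t_n, p_value

# Hypothetical inputs: four time points, centering matrix as contrast matrix,
# identity matrix as covariance estimate, n = 100 subjects.
t = 4
c = [[(1.0 if i == j else 0.0) - 1.0 / t for j in range(t)] for i in range(t)]
v_hat = [[1.0 if i == j else 0.0 for j in range(t)] for i in range(t)]
p_hat = [0.3, 0.4, 0.6, 0.7]
w = [1, 2, 3, 4]            # increasing pattern

t_n, p_value = trend_statistic(p_hat, c, w, v_hat, n=100)
print(t_n, p_value)
```

Large positive values of the statistic support the conjectured increasing pattern; a decreasing pattern would be tested with reversed weights.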
In the next section, we apply these statistics to the repeated-measures design underlying the alcohol consumption study. In the example of the alcohol study, we have two groups of subjects (boys and girls) who report their alcohol consumption at four time points (7th through 10th grades). There is a considerable amount of missing data and dropouts. Only a total of 60 male and 60 female participants reported data for all four time points. The number of observations available at each time point is listed in Table 1, while the relative marginal effects for the three strategies are displayed in Table 2. The analysis of the data is performed by the nonparametric methods described in the previous sections.
First, a potential interaction between gender and time is investigated, where the hypothesis $H_0^F(AT)$ in (2.5) is tested by the ATS (see Table 3). No interaction is detected, so the questions of a potential gender effect or time effect can be investigated by the ATS for the hypotheses $H_0^F(A)$ and $H_0^F(T)$ in (2.5). All results are listed in Table 3. The decisions agree for all three strategies: no effect of gender can be detected, while there is a clear time effect. The conjecture was that alcohol consumption increases with time. To assess an increasing trend, the pattern $\mathbf{w} = (1, 2, 3, 4)'$ is appropriate, and the statistic $T_n(\mathbf{w})$ given in (3.5) is computed. For all three strategies, a highly significant increasing trend is detected. The results are $T_n(\mathbf{w}) = 4.96$ for CCA, $T_n(\mathbf{w}) = 6.47$ for LOCF, and $T_n(\mathbf{w}) = 5.52$ for CSA, each with a p-value far below 0.001. Some caution is necessary for the interpretation of the results since about 50 percent of the data are missing at the end of the study. To assess the question of a potential selection process causing the missingness of the data, which was already discussed in Section 1.1, a subgroup analysis is performed. This investigation, however, should not be considered a confirmatory analysis but rather an exploratory data inspection.
To investigate whether those participants who drop out after the third time point have different time profiles than those who completed the questionnaire at all four time points, the relative marginal effects are computed for the male and female participants separately for the first three time points. The results are displayed in Figure 2. All time profiles are nearly identical, except that of the female participants who drop out after the third time point. This is a strong indication that the observations might not be missing completely at random. This conjecture should be confirmed by an appropriate analysis for the remaining part of the total study.

Figure 2: Relative Marginal Effects for the Male (left) and Female (right) Participants With Complete Data (solid line) and for Those Dropping Out After the Third Time Point (dashed line)

INTEGRAL REPRESENTATION OF RELATIVE EFFECTS
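To show the general relations $\mathbf{C}\mathbf{F} = \mathbf{0} \Rightarrow \mathbf{C}\mathbf{p} = \mathbf{0}$ and $\mathbf{C}\mathbf{F} = \mathbf{0} \Rightarrow \mathbf{C}\boldsymbol{\mu} = \mathbf{0}$ as stated in Section 2.2, we use the integral representation of the relative effect. Let $H(x) = N^{-1} \sum_{i=1}^{2} \sum_{s=1}^{t} n_i F_{is}(x)$ denote the weighted average of all marginal distribution functions, where $N = t(n_1 + n_2)$ denotes the total number of observations. If the random variables $X_{r1\ell}$ and $X_{i2s}$ are independent, then

$$P(X_{r1\ell} < X_{i2s}) + \tfrac{1}{2} P(X_{r1\ell} = X_{i2s}) = \int F_{r\ell}\, dF_{is},$$

and by the representation of $H(x)$ in (2.2), it follows that

$$p_{is} = \int H\, dF_{is} = \frac{1}{N} \sum_{r=1}^{2} \sum_{\ell=1}^{t} n_r \int F_{r\ell}\, dF_{is}.$$

Thus, the vector of the relative treatment effects $\mathbf{p} = (p_{11}, \ldots, p_{1t}, p_{21}, \ldots, p_{2t})'$ can be formally written as $\mathbf{p} = \int H\, d\mathbf{F}$. To see the relation of the hypotheses in (2.5) to the relative marginal effects $p_{is}$, we note that for any matrix $\mathbf{C}$ with $2t$ columns,

$$\mathbf{C}\mathbf{p} = \int H\, d(\mathbf{C}\mathbf{F}),$$

so that $\mathbf{C}\mathbf{F} = \mathbf{0} \Rightarrow \mathbf{C}\mathbf{p} = \mathbf{0}$. To prove a similar relation for the expectations $\boldsymbol{\mu} = (\mu_{11}, \ldots, \mu_{1t}, \mu_{21}, \ldots, \mu_{2t})'$, assume that the random variables $X_{iks}$ are defined on a metric scale and that $\mu_{is} = E(X_{i1s}) < \infty$. Then $\boldsymbol{\mu}$ can be formally written as $\boldsymbol{\mu} = \int x\, d\mathbf{F}$, and it follows that $\mathbf{C}\boldsymbol{\mu} = \int x\, d(\mathbf{C}\mathbf{F})$ and $\mathbf{C}\mathbf{F} = \mathbf{0} \Rightarrow \mathbf{C}\boldsymbol{\mu} = \mathbf{0}$.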
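The integral representation can be turned into a small numerical sketch. It assumes that each marginal distribution function is replaced by the normalized empirical distribution function of the values actually observed in that cell, so that each pairwise integral becomes an average of the count function (1, 1/2, or 0). The data and function names are hypothetical.

```python
def count(a, b):
    """Normalized count function: 1 if a < b, 1/2 if a == b, else 0."""
    if a < b:
        return 1.0
    return 0.5 if a == b else 0.0


def pairwise_effect(ref, target):
    """Plug-in estimate of P(X < Y) + P(X = Y) / 2 for X in ref, Y in target."""
    return sum(count(a, b) for a in ref for b in target) / (len(ref) * len(target))


def relative_marginal_effects(cells, n):
    """cells[i][s]: observed values of group i at time s; n[i]: group size."""
    t = len(cells[0])
    total = t * sum(n)                     # N = t (n_1 + n_2)
    effects = []
    for group_cells in cells:
        row = []
        for target in group_cells:
            acc = 0.0
            for r, ref_cells in enumerate(cells):
                for ref in ref_cells:
                    acc += n[r] * pairwise_effect(ref, target)
            row.append(acc / total)
        effects.append(row)
    return effects


# Hypothetical data: two groups of three subjects, three time points,
# with some values missing (cells shorter than the group size).
cells = [
    [[1, 2, 2], [2, 3], [3, 4, 4]],
    [[1, 1, 2], [2, 2, 3], [3, 4]],
]
n = [3, 3]
p_hat = relative_marginal_effects(cells, n)

# The n_i-weighted mean of all effects equals 1/2 by construction.
weighted_mean = sum(n[i] * p for i, row in enumerate(p_hat) for p in row) / (3 * sum(n))
```

An effect above 1/2 indicates that observations in that cell tend to be larger than those drawn from the weighted average distribution, and an effect below 1/2 indicates the opposite.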

ASYMPTOTIC RESULTS
The estimator $\hat{p}_{is}$ is asymptotically unbiased and consistent for the relative effect $p_{is}$ in the sense that $E(\hat{p}_{is} - p_{is})^2 \to 0$ if $\min_{i,s} \lambda_{i \cdot s} \to \infty$ (see Proposition 3.1 of Brunner et al. 1999). The asymptotic normality of the contrast $\sqrt{n}\,\mathbf{C}\hat{\mathbf{p}}$ under $H_0^F: \mathbf{C}\mathbf{F} = \mathbf{0}$ is stated in the following theorem.
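Theorem 5.1. Let $\mathbf{X}_{ik} = (X_{ik1}, \ldots, X_{ikt})'$, $i = 1, 2$, $k = 1, \ldots, n_i$, be independent random vectors, and let $n = n_1 + n_2$. Let $F_{is}(x)$, $\hat{F}_{is}(x)$, $\lambda_{iks}$, $H(x)$, $\hat{H}(x)$, $p_{is}$, and $\hat{p}_{is}$ be as defined earlier. Furthermore, let $\lambda_0 = \min_{i,s} \lambda_{i \cdot s}$ denote the minimal number of observed data over the groups $i$ and time points $s$. If $n/\lambda_0 \le n_0 < \infty$, then, under the hypothesis $H_0^F: \mathbf{C}\mathbf{F} = \mathbf{0}$, the statistic $\sqrt{n}\,\mathbf{C}\hat{\mathbf{p}}$ has, asymptotically, a multivariate normal distribution with expectation $\mathbf{0}$ and covariance matrix $\mathbf{C}\mathbf{V}_n\mathbf{C}'$. The unknown covariance matrix $\mathbf{V}_n$ is consistently estimated by $\widehat{\mathbf{V}}_n$ in (5.3), where for $i = 1, 2$, the elements of $\widehat{\mathbf{V}}_i = (\hat{v}_i(s, s'))_{s, s' = 1, \ldots, t}$ are estimated by means of the ranks $R_{iks}$, using the constants

$$K_i(s, s') = (\lambda_{i \cdot s} - 1)(\lambda_{i \cdot s'} - 1) + \Lambda_i(s, s') - 1, \qquad \Lambda_i(s, s') = \sum_{k=1}^{n_i} \lambda_{iks}\, \lambda_{iks'}.$$

Proof. See Brunner et al. (1999).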
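The bookkeeping behind the covariance estimator can be sketched from the missingness indicators alone. The code assumes that the normalizing constant for a pair of time points is (number observed at s minus 1) times (number observed at s' minus 1), plus the number of subjects observed at both time points, minus 1; this reading of the constant, the function names, and the indicator matrices are assumptions made for illustration.

```python
def observed_counts(lam):
    """lam[k][s] is 1 if subject k was observed at time s, else 0."""
    return [sum(row[s] for row in lam) for s in range(len(lam[0]))]


def joint_count(lam, s, s2):
    """Number of subjects observed at both time points s and s2."""
    return sum(row[s] * row[s2] for row in lam)


def k_constant(lam, s, s2):
    """(count_s - 1) * (count_s2 - 1) + joint count - 1."""
    marg = observed_counts(lam)
    return (marg[s] - 1) * (marg[s2] - 1) + joint_count(lam, s, s2) - 1


# Complete data for five subjects: the constant reduces to n (n - 1).
complete = [[1, 1, 1] for _ in range(5)]
k_complete = k_constant(complete, 0, 2)

# Hypothetical dropout pattern for the same five subjects.
dropout = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
]
k_dropout = k_constant(dropout, 0, 2)
```

With complete data the constant equals n(n - 1), the familiar divisor of an unbiased covariance estimate scaled by the group size, which is a useful sanity check on the assumed form.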