Title: Assessment and application of a nonparametric confidence set approach based on the mean test

Abstract: In genome-wide linkage analyses one needs to adjust for the number of simultaneous hypotheses tested. Such an adjustment is very difficult because of the complex dependencies among data arising from linkage studies. It is possible, though, to avoid multiple testing issues by constructing confidence sets of markers tightly linked to a genetic trait, which is done by testing hypotheses that reverse the usual setup. Here, we examine the effect of several factors on the confidence set derived using the nonparametric mean statistic based on IBD sharing of affected sib pairs. Simulations are performed to assess the performance of the approach in terms of its true and false discovery rates, under varying conditions of the underlying single-locus disease model, the amount of data, the heterozygosity of the marker, and the number of family members genotyped at the marker locus. We also provide rough guidelines for choosing the coverage probability of the confidence set so that the false discovery rate is at a desirable level. Our methods are then applied to the simulated data from the Genetic Analysis Workshop (GAW) 13, focusing on the high blood pressure trait. Our simulation results show that when the assumption of a single-locus trait with accurate risk estimates is met, the confidence set is able to sufficiently localize the disease-causing gene. However, for a marker that is less than fully polymorphic, a moderate increase of about 30% in the amount of data is necessary for the method to achieve the same power as when the marker is 100% polymorphic. Also, inaccurate estimation of the risks may compromise the power or inflate the false discovery rate of the method, particularly when genetics plays only a limited role in the disease-causing mechanism. Incorrect specification of the number of disease-causing genes also reduces the power of the method.
Nevertheless, the results for the GAW data suggest that, despite potential reduction in power due to deviation from the ideal situation, our method can still achieve significantly higher power than standard nonparametric methods while maintaining comparable false positive rates, especially when the sample size is reasonably large.
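
As a rough illustration of the reversed testing setup described above, the following minimal Python sketch computes the mean IBD sharing of affected sib pairs at each marker and keeps the markers whose sharing is not significantly below the level expected under tight linkage. It is a sketch under stated assumptions, not the authors' implementation: the function names, the normal approximation, and the inputs `mu_linked` and `var_linked` (the expected sharing proportion and its per-pair variance at a tightly linked marker, which would have to come from the assumed single-locus disease model) are all hypothetical and are not specified in the abstract.

```python
import math


def mean_ibd_sharing(ibd_counts):
    """Mean proportion of alleles shared IBD.

    ibd_counts holds one value per affected sib pair: 0, 1, or 2 alleles shared.
    """
    if not ibd_counts:
        raise ValueError("need at least one sib pair")
    return sum(c / 2.0 for c in ibd_counts) / len(ibd_counts)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def confidence_set(ibd_by_marker, mu_linked, var_linked, coverage=0.95):
    """Return the markers NOT rejected as tightly linked.

    Assumption (hypothetical inputs): mu_linked and var_linked are the mean and
    per-pair variance of the sharing proportion under tight linkage, supplied
    by the user from an assumed disease model. A one-sided normal approximation
    is used: a marker is dropped only if its sharing is significantly too low.
    """
    alpha = 1.0 - coverage
    kept = []
    for marker, counts in ibd_by_marker.items():
        n = len(counts)
        z = (mean_ibd_sharing(counts) - mu_linked) / math.sqrt(var_linked / n)
        p_value = normal_cdf(z)  # small when observed sharing is far below mu_linked
        if p_value >= alpha:
            kept.append(marker)
    return kept


# Toy data, invented purely for illustration.
toy = {
    "M1": [2, 2, 1, 2, 1, 2, 1, 2],      # high sharing
    "M2": [0, 1, 0, 1, 1, 0, 1, 0] * 5,  # low sharing, 40 pairs
}
selected = confidence_set(toy, mu_linked=0.7, var_linked=0.1)
```

In this sketch the null hypothesis is that a marker is tightly linked, so each marker is tested against its own model-based expectation rather than against the no-linkage value of 0.5, and the retained markers form the confidence set at the chosen coverage probability.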