Traditional sample size computation based on "power" does not apply directly to multiple comparisons, because the power of a test of homogeneity includes the probability of an incorrect decision. For example, the F-test may reject because the sample mean of treatment 2 is much larger than the sample mean of treatment 3, when in fact the population mean of treatment 2 is smaller than the population mean of treatment 3. Thus, the power of a test of homogeneity includes some probability of incorrect multiple comparison inference, which is undesirable.
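The point above can be illustrated with a small simulation. The following is a minimal sketch under simplifying assumptions, not the computation implemented here: it assumes three treatments, a known common standard deviation, and equal sample sizes, so the F-test is replaced by the corresponding chi-square test of homogeneity (whose 2-degree-of-freedom critical value is exactly -2 ln(alpha)). The function name, the population means, and the sample size are hypothetical choices for illustration.

```python
import math
import random


def rejection_and_wrong_order(means, sigma, n, alpha=0.05, reps=20000, seed=0):
    """Estimate (power, P(reject AND treatments 2 and 3 ordered wrongly)).

    Assumes exactly three treatments with known sigma and n observations each,
    so the homogeneity statistic is chi-square with 2 degrees of freedom.
    """
    if len(means) != 3:
        raise ValueError("this sketch handles exactly three treatments")
    rng = random.Random(seed)
    crit = -2.0 * math.log(alpha)  # exact upper-alpha point of chi-square(2)
    se = sigma / math.sqrt(n)      # standard error of each sample mean
    rejections = 0
    wrong = 0
    for _ in range(reps):
        ybar = [rng.gauss(mu, se) for mu in means]
        grand = sum(ybar) / 3
        stat = sum((y - grand) ** 2 for y in ybar) / se ** 2
        if stat > crit:
            rejections += 1
            # Sample ordering of treatments 2 and 3 contradicts the population ordering.
            if (ybar[1] - ybar[2]) * (means[1] - means[2]) < 0:
                wrong += 1
    return rejections / reps, wrong / reps


# Hypothetical setting: treatment 2 is truly smaller than treatment 3.
power, wrong = rejection_and_wrong_order([0.0, 1.0, 1.2], sigma=1.0, n=5)
print("estimated power:", power)
print("estimated P(reject with treatments 2 and 3 in the wrong order):", wrong)
```

The second number is part of the first: every rejection counted there adds to the "power" even though the sample ordering of treatments 2 and 3 would mislead any directional conclusion.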
The sample size computation implemented here instead computes the joint probability of "correct" and "useful" inference, where
correct inference = all separations are in the right direction
useful inference = all treatments sufficiently far apart are separated
See Appendix C of Multiple Comparisons: Theory and Methods for a discussion of this concept and details of the computation.
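To make the two definitions concrete, here is a rough Monte Carlo sketch of the joint probability. It is not the computation described in Appendix C: it assumes a known standard deviation, equal sample sizes, and all-pairwise comparisons with Bonferroni-adjusted z intervals standing in for the actual multiple comparison method. The names prob_correct_and_useful and smallest_n, the threshold delta, and the target probability are all hypothetical.

```python
import itertools
import math
import random
from statistics import NormalDist


def prob_correct_and_useful(means, sigma, n, delta, alpha=0.05, reps=5000, seed=0):
    """Estimate P(all separations are in the right direction AND
    all pairs whose true means differ by at least delta are separated)."""
    pairs = list(itertools.combinations(range(len(means)), 2))
    # Assumption: known sigma, Bonferroni-adjusted two-sided z critical value.
    z = NormalDist().inv_cdf(1.0 - alpha / (2 * len(pairs)))
    half_width = z * sigma * math.sqrt(2.0 / n)
    se = sigma / math.sqrt(n)
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        ybar = [rng.gauss(mu, se) for mu in means]
        ok = True
        for i, j in pairs:
            est = ybar[i] - ybar[j]
            true = means[i] - means[j]
            separated = abs(est) > half_width  # interval excludes zero
            if separated and est * true <= 0:
                ok = False  # not correct: wrong direction (or no true difference)
            if abs(true) >= delta and not separated:
                ok = False  # not useful: a far-apart pair was not separated
            if not ok:
                break
        hits += ok
    return hits / reps


def smallest_n(means, sigma, delta, target=0.8, n_max=200, **kwargs):
    """Return the first per-treatment n whose estimated probability reaches target."""
    for n in range(2, n_max + 1):
        if prob_correct_and_useful(means, sigma, n, delta, **kwargs) >= target:
            return n
    return None


# Hypothetical setting: three treatments, each adjacent pair one sigma apart.
print(smallest_n([0.0, 1.0, 2.0], sigma=1.0, delta=1.0))
```

Because the estimate is a simulation, the value returned by smallest_n carries Monte Carlo noise; increasing reps tightens it at the cost of run time.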
In a paper titled "On an Approach to Sample Size Determination for Confidence Intervals Proposed by Hsu", which appeared in the JSM97 proceedings of the Biopharmaceutical Section, Olivier Guilbaud of Astra gave a technique for approximating the desired sample size easily and accurately. His idea is as follows. If one lets A be the event {correct inference} and B be the event {useful inference}, then computing P{A and B} by