rGLM {rGLM} | R Documentation |
This function takes a dataset of haplotypes in which rows for individuals of uncertain phase have been augmented by "pseudo-individuals" who carry the possible multilocus genotypes consistent with the single-locus phenotypes. The EM algorithm is used to find penalized MLEs for trait associations with covariates in L1-regularized generalized linear models. Parts of the code, as well as the arguments and values, are borrowed from the hapassoc function in the hapassoc package.
rGLM(form, haplos.list, baseline = "missing", family = binomial, freq = NULL, maxit = 50, tol = 0.001, start = NULL, lambda = 0, trace = FALSE)
form |
model equation in usual R format |
haplos.list |
list of haplotype data from pre.hapassoc, which is a function borrowed from the hapassoc package |
baseline |
optional, haplotype to be used for baseline coding. Default is the most frequent haplotype according to the initial haplotype frequency estimates returned by pre.hapassoc. |
family |
binomial, poisson, gaussian or freq are supported, default=binomial |
freq |
initial estimates of haplotype frequencies, default values are calculated in pre.hapassoc using standard haplotype-counting (i.e. EM algorithm without adjustment for non-haplotype covariates) |
maxit |
maximum number of iterations of the EM algorithm; default=50 |
tol |
convergence tolerance in terms of either the maximum difference in parameter estimates between iterations or the maximum relative difference in parameter estimates between iterations, whichever is larger; default=0.001 |
start |
starting values for parameter estimates in the risk model |
lambda |
tuning parameter lambda for the L1 penalty; default=0 |
trace |
indicates whether or not a list of the genotype variables used to form haplotypes and a list of other non-genetic variables should be printed; default is FALSE. |
haplos.list is the list returned by the pre.hapassoc function. See the pre.hapassoc help file for details.
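The convergence rule described for the tol argument can be sketched in base R as follows. This is a minimal illustration only, not the package's code: the helper name em_converged is hypothetical, and measuring the relative difference against the previous estimate (with a small guard against division by zero) is an assumption.

```r
# Hypothetical helper (not part of rGLM): checks an EM convergence rule of the
# kind described for `tol`. Convergence is declared only when BOTH the maximum
# absolute change and the maximum relative change fall below `tol`, i.e. when
# the larger of the two is below `tol`.
em_converged <- function(old, new, tol = 0.001) {
  abs_diff <- max(abs(new - old))
  # Assumption: relative change is taken with respect to the previous
  # estimate; the pmax() guard avoids dividing by exactly zero.
  rel_diff <- max(abs(new - old) / pmax(abs(old), .Machine$double.eps))
  max(abs_diff, rel_diff) < tol
}

# Small absolute and relative changes: converged
conv_small <- em_converged(c(1, 2), c(1.0001, 2.0001))

# Absolute change is below tol (0.0005) but relative change is 5%: not converged
conv_rel <- em_converged(0.01, 0.0105)
```

Taking the larger of the two differences means that small coefficients, whose absolute change is tiny, must also be stable in relative terms before the iteration stops.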
it |
number of iterations of the EM algorithm |
beta |
estimated regression coefficients |
freq |
estimated haplotype frequencies |
fits |
fitted values of the trait |
wts |
final weights calculated in last iteration of the EM algorithm. These are estimates of the conditional probabilities of each multilocus genotype given the observed single-locus genotypes. |
response |
trait value |
converged |
TRUE/FALSE indicator of convergence. If the algorithm fails to converge, only the converged indicator is returned. |
model |
model equation |
loglik |
the log-likelihood evaluated at the maximum likelihood estimates of all parameters |
call |
the function call |
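To make the wts value more concrete, the following base R sketch shows how weights for pseudo-individuals can be normalized within each individual so that they behave like conditional probabilities. It is an illustration under stated assumptions, not the package's implementation: the toy data, the haplotype frequencies, the Hardy-Weinberg form of the haplotype-pair probabilities, and the trait likelihood values are all hypothetical.

```r
# Toy augmented data set: individual 1 has known phase (one row); individual 2
# has uncertain phase and is represented by two pseudo-individuals.
aug <- data.frame(
  id   = c(1, 2, 2),
  hap1 = c("AA", "AC", "AA"),
  hap2 = c("AA", "CA", "CC"),
  stringsAsFactors = FALSE
)

# Hypothetical haplotype frequency estimates
freq <- c(AA = 0.5, AC = 0.2, CA = 0.2, CC = 0.1)

# Assumption: Hardy-Weinberg proportions for each haplotype pair
# (f^2 for identical haplotypes, 2 * f1 * f2 otherwise).
pair_prob <- ifelse(aug$hap1 == aug$hap2, 1, 2) *
  unname(freq[aug$hap1]) * unname(freq[aug$hap2])

# Hypothetical likelihood of each row's trait value under the current model
trait_lik <- c(1, 0.3, 0.6)

# Unnormalized weight per row, then normalize within each individual
unnorm <- pair_prob * trait_lik
wts <- unnorm / ave(unnorm, aug$id, FUN = sum)
```

Rows belonging to an individual of known phase receive weight 1, and the weights of the pseudo-individuals that share an id sum to 1.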
Burkett K, Graham J, McNeney B. 2006. hapassoc: Software for likelihood inference of trait associations with SNP haplotypes and other attributes. Journal of Statistical Software 16:1-19.
Guo W, Lin S. 2009. Generalized linear modeling with regularization for detecting common disease rare haplotype association. Genetic Epidemiology. DOI: 10.1002/gepi.20382.