STEM-hy: Species Tree Estimation using Maximum likelihood (with hybridization)

Version 1.0

Laura S. Kubatko and Travis Treseder
Department of Statistics
The Ohio State University
Columbus, OH 43210
lkubatko@stat.osu.edu

Copyright 2007-2012 by Laura Salter Kubatko. This software is provided "as is" without warranty of any kind. In no event shall the author be held responsible for any damage resulting from the use of this software. The program package, including source codes, executables, and documentation, is distributed free of charge.

The basic analyses incorporated in STEM-hy are described in the following publication:
Kubatko, L., B. C. Carstens, and L. L. Knowles. 2009. STEM: Species Tree Estimation
using Maximum likelihood for gene trees under coalescence. Bioinformatics
(doi: 10.1093/bioinformatics/btp079).

The hybridization methods are described in the following publication:
Kubatko, L. S. 2009. Identifying Hybridization Events in the Presence of Coalescence
via Model Selection. Systematic Biology 58(5): 478-488.

About the Program

STEM-hy is a program for inferring maximum likelihood species trees from a collection of estimated gene trees under the coalescent model. The program has the following functionality:

Return the ML tree under the coalescent model with gene trees as input data
Compute the likelihood of a user-specified tree
Search for a set of high likelihood trees using simulated annealing
Carry out a bootstrap analysis to obtain bootstrap support for the internal nodes of the ML species tree or to obtain a bootstrap consensus tree
Evaluate hypotheses of hybridization in a model-selection framework
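
The bootstrap item in the list above can be illustrated with a short sketch. It assumes the standard nonparametric bootstrap, in which alignment sites are resampled with replacement within each gene. This is an illustration of the general idea only, not STEM-hy's own code; the function names and the dictionary representation of an alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample the columns (sites) of one gene alignment with replacement.

    alignment: dict mapping taxon name -> aligned sequence (hypothetical
    in-memory representation; STEM-hy itself reads PHYLIP files).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    n_sites = lengths.pop()
    # Draw n_sites column indices with replacement.
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in cols)
            for taxon, seq in alignment.items()}


def bootstrap_replicates(gene_alignments, n_reps, seed=0):
    """Return n_reps bootstrap data sets, each holding one resampled
    alignment per gene."""
    rng = random.Random(seed)
    return [[bootstrap_alignment(aln, rng) for aln in gene_alignments]
            for _ in range(n_reps)]
```

In a full analysis, a gene tree would then be estimated from each resampled alignment, and the gene trees of each replicate would be used to estimate one species tree per replicate.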

Downloads:

stem-hy.jar (Executable only)
STEM-hyv1.0.tar.gz (Supplemental files -- examples, etc.)
STEMv2.0.zip (source code and documentation)
STEMv1.1.zip (source code and documentation)
Documentation (version 2.0) only (pdf)

New - Google Users Group

Version History:

STEM-hy, Version 1.0, Released January 10, 2012.

STEM-hy Version 1.0 includes all functionality of STEM 2.0.
An option for carrying out a bootstrap analysis has been added. Alignments for individual genes must be provided in PHYLIP format. These genes will be bootstrapped a user-specified number of times, and gene trees for each bootstrap data set are estimated using the program SSA. The bootstrap gene trees in each sample are used to estimate the species tree for each bootstrap replicate.
Hybridization analyses as described in Kubatko (2009) can now be carried out. This analysis requires that the user specify a species tree and identify putative hybrid taxa. The program will estimate the hybridization parameters and the AIC for each hypothesis of hybridization.
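
As a rough illustration of the model-selection step, the sketch below compares hypotheses using the standard definition AIC = -2 ln L + 2k, where k is the number of estimated parameters. The function names, hypothesis labels, log-likelihoods, and parameter counts are all made up for the example; how STEM-hy counts parameters and reports results is not specified here.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 ln L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def rank_hypotheses(hypotheses):
    """Rank hypotheses by AIC (smaller is better).

    hypotheses: dict mapping a label -> (log_likelihood, n_params).
    Returns a list of (label, aic, delta_aic) sorted by AIC, where
    delta_aic is the difference from the best-scoring hypothesis.
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in hypotheses.items()}
    best = min(scores.values())
    return sorted(((name, s, s - best) for name, s in scores.items()),
                  key=lambda row: row[1])


# Hypothetical values, for illustration only.
ranking = rank_hypotheses({
    "no_hybridization": (-105.0, 2),
    "taxon_C_hybrid": (-100.0, 3),
})
```

With these invented numbers, the hybrid hypothesis scores 206.0 and the no-hybridization hypothesis 214.0, so the hybrid hypothesis would be preferred despite its extra parameter.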

Version 2.0, Released January 21, 2011.

STEM 2.0 has been completely rewritten in Clojure and is distributed as a JAR file. It runs as a Java application.
A new option for supplying the set of input gene trees has been included. It is now possible to supply a list of file names that contain the gene trees.
The format of the settings file has been standardized to YAML format.
STEM 1.1 could not handle some patterns of missing data. STEM 2.0 has been more extensively tested with a variety of missing data patterns and we believe it now always handles missing data properly.
STEM 2.0 will compute the likelihood for a user-specified tree with user-specified branch lengths, and will also provide maximum likelihood branch lengths for a user-specified tree.

Version 1.1, Released November 26, 2008.

Improved input and output format.
Allows for different taxon samples in each gene, thereby enabling analysis when data are missing for some lineages for some genes.

Version 1.01 beta, Released February 2006.

Acknowledgements

Continued development of STEM/STEM-hy is based upon work supported by the National Science Foundation under Grants DMS 0104290, DMS 0702277, and DEB 0842219. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Contact Info

Questions or comments concerning the program can be e-mailed to me at lkubatko@stat.osu.edu. Please also e-mail me if you'd like to be advised when updated versions of the program become available.
