Accurate Species Tree EstimatoR: a family of optimization algorithms for species tree inference (including ASTRAL & CASTER)

chaoszhang/ASTER

Accurate Species Tree EstimatoR (ASTER❋)

A family of optimization algorithms for species tree inference:

  1. ASTRAL-IV (from unrooted gene tree topologies with integrated CASTLES-II for branch lengths)
  2. ASTRAL-Pro3 (from unrooted gene family tree topologies with integrated CASTLES-Pro)
  3. Weighted ASTRAL (from unrooted gene trees with branch lengths and/or supports)
  4. CASTER-site (from whole genome alignments or aligned sequences)
  5. CASTER-pair (from whole genome alignments or aligned sequences)
  6. WASTER-site (from raw reads)
  7. SISTER (from optical-map-like distance data or shape data)
  8. MONSTER

Announcements

Integrated in PhyloSuite (NEW)

Many ASTER tools have been integrated into PhyloSuite, an integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies.

GUI for Windows users

Please check out our software with a GUI. Simply download the zip file, extract the contents, open the exe folder, and run aster-gui.exe.

Bug Reports

Contact chaozhang@berkeley.edu or aster-users@googlegroups.com, or post on the ASTER issues page.

Documentation

  • The rest of this README file
  • Program specific tutorials (see EXECUTION section)
  • Forums (feel free to ask questions or ask for help running ASTER):

INSTALLATION

For most users, installing ASTER is very easy! Download using one of two approaches:

  • Download the zip file for Windows/MacOS/Linux and extract the contents to a folder of your choice.
  • Alternatively, clone the GitHub repository and check out the branch named Windows/MacOS/Linux.

Binary files should be in the exe folder for Windows or bin folder otherwise. If you are lucky, these may just work as is and you may not need to build at all.
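
The clone-based route above can be sketched as a small shell helper. This is only an illustration: the repository URL is an assumption derived from the repository name chaoszhang/ASTER, the branches are assumed to be literally named Windows, MacOS, and Linux as described above, and the function names `aster_branch_for_os` and `print_clone_commands` are hypothetical, not part of ASTER.

```shell
#!/bin/sh
# Hypothetical helper: map a kernel name (as printed by `uname -s`)
# to the branch name described in this README.
aster_branch_for_os() {
  case "$1" in
    Linux*)                echo "Linux" ;;
    Darwin*)               echo "MacOS" ;;
    MINGW*|MSYS*|CYGWIN*)  echo "Windows" ;;
    *)                     return 1 ;;
  esac
}

# Print (rather than run) the clone commands for a given kernel name.
# The URL is an assumption based on the repository name chaoszhang/ASTER.
print_clone_commands() {
  branch=$(aster_branch_for_os "$1") || {
    echo "unsupported OS: $1" >&2
    return 1
  }
  echo "git clone https://github.com/chaoszhang/ASTER.git"
  echo "cd ASTER"
  echo "git checkout $branch"
}

# Show the commands for the current machine; ignore unknown systems.
print_clone_commands "$(uname -s)" || true
```

The commands are printed instead of executed so that you can review them first. Under WSL, `uname -s` reports Linux, so the Linux branch is suggested.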

For Linux/Unix/WSL users

  1. In a terminal, cd into the downloaded directory and run make.
  • If you see *** Installation complete! *** then you are done!
  • If you see Command 'g++' not found then before rerunning make,
    • Debian (Ubuntu) users try
      sudo apt update
      sudo apt install g++
      
    • CentOS (RedHat) users try
      sudo yum update
      sudo yum install gcc-c++
      
    • Unix (MacOS) users should be prompted to install g++; please click "install". If no prompt appears, try running g++ in the terminal.
  2. Binary files should be in the bin folder.
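
The checks around make can be sketched as a short pre-flight script. It is a minimal sketch, not part of ASTER: the helper names `gxx_install_hint` and `count_executables` are hypothetical, and the install commands are simply the ones listed above joined with `&&`.

```shell
#!/bin/sh
# Hypothetical helper: echo the install commands suggested in this README
# for a given package manager name.
gxx_install_hint() {
  case "$1" in
    apt) echo "sudo apt update && sudo apt install g++" ;;
    yum) echo "sudo yum update && sudo yum install gcc-c++" ;;
    *)   echo "run g++ once and accept the install prompt" ;;
  esac
}

# Hypothetical helper: count regular executable files directly inside a
# directory, e.g. the bin folder after a build.
count_executables() {
  n=0
  for f in "$1"/*; do
    if [ -f "$f" ] && [ -x "$f" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

# Report whether g++ is available; otherwise print a hint for the
# package manager that is found on this machine.
if command -v g++ >/dev/null 2>&1; then
  echo "g++ found: run make, then check the bin folder"
elif command -v apt >/dev/null 2>&1; then
  gxx_install_hint apt
elif command -v yum >/dev/null 2>&1; then
  gxx_install_hint yum
else
  gxx_install_hint other
fi
```

After make finishes, `count_executables bin` run from the downloaded directory should print a number greater than zero if the build produced binaries.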

For Windows users

  • Executables for x86-64 are available in the exe folder, and it is very likely that they already work.
  • Windows Subsystem for Linux (WSL) is HIGHLY recommended if you need to build on your own! Please follow the instructions in the "For Linux/Unix/WSL users" section.
  • To compile Windows executables:
    1. Download MinGW and install the posix version for your architecture (e.g. x86-64)
    2. Add path to bin folder of MinGW to system environment variable PATH
    3. Double click make.bat inside the downloaded directory

EXECUTION

Please click the link below:

  1. ASTRAL-IV
  2. ASTRAL-Pro3
  3. Weighted ASTRAL
  4. CASTER-site
  5. CASTER-pair
  6. WASTER-site

HELP ME CHOOSE A SUITABLE TOOL

Q: I want to reconstruct a phylogenetic tree without alignments.

A: I recommend WASTER-site, which can accurately infer a family-level species tree from short reads, even at 1.5X coverage. This tool can also be used to build an adequate order-level guide tree for CACTUS. (You will be amazed by how accurate this "guide tree" is!)

Q: I have a supermatrix of SNPs in fasta/phylip format and I want a "quick-and-dirty" run to get an adequate phylogenetic tree.

A: I recommend CASTER-site, which is usually 1-2 orders of magnitude faster than concatenation-based maximum likelihood methods, yet more accurate in the presence of incomplete lineage sorting given enough data.

Q: My dataset has a lot of multi-copy genes (e.g. plants) and I want to make an effort to utilize these precious signals.

A: I highly recommend ASTRAL-Pro3, which takes unrooted, unlabelled gene family trees as input. ASTRAL-Pro3 does not need to know the homology relationships of genes, but you still need to reconstruct the gene family trees yourself using RAxML/IQ-TREE/FastTree.

Q: I have aligned genomes (>10M sites) and the average nucleotide identity is >80% between closely related species (e.g. birds, mammals, or abundant taxon sampling).

A: I recommend CASTER-site (faster) and CASTER-pair (slower). These methods are usually 1-2 orders of magnitude faster than concatenation-based maximum likelihood methods, yet more accurate in the presence of incomplete lineage sorting. Please run both and select the species tree that makes more sense.

Q: I have gene trees with branch lengths and bootstrap/Bayesian supports and I know that horizontal gene transfers and hybridizations are rare.

A: I recommend Weighted ASTRAL. It utilizes branch lengths and supports to improve accuracy.

Q: I have gene trees but they do not satisfy the requirements for wASTRAL.

A: You can still use ASTRAL-IV. By the way, ASTRAL-IV is also useful for finding the supertree.
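
The recommendations above can be summarized as a simple lookup. The sketch below is only a memory aid: the function name `choose_aster_tool` and the input-type labels (`raw-reads`, `snp-supermatrix`, and so on) are hypothetical names invented for this example, and the tool names it prints are the names used in this README, not executable names.

```shell
#!/bin/sh
# Hypothetical helper: map a short description of the input data to the
# tool(s) recommended in the Q&A above.
choose_aster_tool() {
  case "$1" in
    raw-reads)            echo "WASTER-site" ;;                # no alignments
    snp-supermatrix)      echo "CASTER-site" ;;                # quick run on SNPs
    multicopy-gene-trees) echo "ASTRAL-Pro3" ;;                # gene family trees
    aligned-genomes)      echo "CASTER-site CASTER-pair" ;;    # run both, compare
    weighted-gene-trees)  echo "Weighted ASTRAL" ;;            # lengths and supports
    gene-trees)           echo "ASTRAL-IV" ;;                  # topologies only
    *)
      echo "unknown input type: $1" >&2
      return 1
      ;;
  esac
}

# Example: which tool fits raw sequencing reads?
choose_aster_tool raw-reads
```

For aligned genomes the lookup returns both CASTER tools, reflecting the advice to run both and keep the species tree that makes more sense.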

ACKNOWLEDGEMENT

ASTER code uses the Regularized Incomplete Beta Function by Lewis Van Winkle under the zlib License. The code was contributed by Chao Zhang, supervised by Siavash Mirarab and Rasmus Nielsen.