Estimate a phylogenetic tree
We use the piq_build_tree app to build phylogenies.
For this simple case, we will build a single phylogeny using the GTR model on one alignment.
piqtree provides a helper, download_dataset, that returns the local path to the example data. We then load that data as an alignment with cogent3. The alignment contains 17 sequences, and all of them are used here.
In [1]:
import cogent3
from piqtree import download_dataset
aln_path = download_dataset("example.phy.gz", dest_dir="data")
aln = cogent3.load_aligned_seqs(aln_path, moltype="dna", format_name="phylip")
aln
Out[1]:
| Name | Sequence (positions 0 to 59) |
| Bird | CTACCACACCCCAGGACTCAGCAGTAATTAACCTTAAGCATAAGTGTAACTTGACTTAGC |
| LngfishAu | ..C.............AC.......G......A..G........C.A.G.......C... |
| LngfishSA | .A..............AA.............T....G.........A........C...T |
| LngfishAf | .A...............A.............AA..GGA................TCC... |
| Frog | AA.TTTGGT..TGT..T........G..A...A..G.A...G..C.A.G..C..T.C..T |
| Turtle | ..T......................G..A..AA...........C.A.G..........T |
| Sphenodon | ..C..............A.......G.....TA.............A............T |
| Lizard | ..T........A...CA........G..A...A...........C.A.G..........T |
| Crocodile | ..C............C.A........G....TA...G.......C.A.G.C....C...T |
| Human | ................AA.......G.........T.......AC.A.GT..A...A... |
17 x 1998 (truncated to 10 x 60) dna alignment
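In the display above, a `.` appears to mark a position that is identical to the sequence in the first row (Bird), so only differences are spelled out. The sketch below is plain Python rather than a cogent3 call, and `expand_dots` is a hypothetical helper name; it rebuilds the full sequence for the first ten columns of the LngfishAu row.

```python
def expand_dots(reference: str, row: str) -> str:
    """Replace each '.' in row with the reference character at that position."""
    if len(reference) != len(row):
        raise ValueError("reference and row must be the same length")
    return "".join(ref if ch == "." else ch for ref, ch in zip(reference, row))


# First ten columns of the Bird and LngfishAu rows from the display above.
bird = "CTACCACACC"
lngfish_au = "..C......."

expanded = expand_dots(bird, lngfish_au)
print(expanded)  # CTCCCACACC
```

Only the third column differs from Bird in this window, which matches the single `C` shown in the dotted row.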
We now take a look at the help for the piq_build_tree app.
In [2]:
from cogent3 import app_help
app_help("piq_build_tree")
Options for making the app
--------------------------
piq_build_tree_app = get_app(
    'piq_build_tree',
    model: piqtree.model._model.Model | str,
    rand_seed=None,
    bootstrap_reps=None,
    num_threads=None,
    other_options='',
)
Reconstruct a phylogenetic tree.
Given a sequence alignment, uses IQ-TREE to reconstruct a phylogenetic tree.
Parameters
----------
aln : Alignment
    The sequence alignment.
model : Model | str
    The substitution model with base frequencies and rate heterogeneity.
rand_seed : int | None, optional
    The random seed - None means no seed is used, by default None.
bootstrap_replicates : int | None, optional
    The number of bootstrap replicates to perform, by default None.
    If 0 is provided, then no bootstrapping is performed.
    At least 1000 is required to perform bootstrapping.
num_threads: int | None, optional
    Number of threads for IQ-TREE to use, by default None (single-threaded).
    If 0 is specified, IQ-TREE attempts to find the optimal number of threads.
other_options: str, optional
    Additional command line options for IQ-TREE.
Returns
-------
PhyloNode
    The IQ-TREE maximum likelihood tree from the given alignment.
Input type
----------
Alignment
Output type
-----------
PhyloNode
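The help text above gives three cases for the bootstrap setting: None (the default) or 0 means no bootstrapping, and a value of at least 1000 requests it. The sketch below restates those rules in plain Python. It is not piqtree code: `wants_bootstrap` is a hypothetical helper, and how the library itself reacts to a value between 1 and 999 is an assumption here, since the help text only says such values are not enough.

```python
from typing import Optional

MIN_BOOTSTRAP_REPS = 1000  # minimum stated in the help text


def wants_bootstrap(bootstrap_reps: Optional[int]) -> bool:
    """Return True if the value requests bootstrapping under the documented rules."""
    if bootstrap_reps is None or bootstrap_reps == 0:
        # None (the default) and 0 both mean no bootstrapping.
        return False
    if bootstrap_reps < MIN_BOOTSTRAP_REPS:
        # Assumption: treat too-small (or negative) values as an error.
        raise ValueError(
            f"bootstrap_reps must be None, 0, or at least {MIN_BOOTSTRAP_REPS}"
        )
    return True


print(wants_bootstrap(None))
print(wants_bootstrap(1000))
```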
We build an app for estimating phylogenies with a GTR model, requesting 1000 bootstrap replicates.
In [3]:
from cogent3 import get_app
phylo_gtr = get_app("piq_build_tree", model="GTR", bootstrap_reps=1000)
phylo_gtr
Out[3]:
piq_build_tree(model='GTR', rand_seed=None, bootstrap_reps=1000, num_threads=None, other_options='')
Run the phylogeny estimation and display branches with support < 90%
In [4]:
tree = phylo_gtr(aln)
dnd = tree.get_figure(show_support=True, threshold=90)
dnd.show(width=600, height=600)
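The figure call above uses threshold=90 to pick out branches whose bootstrap support is below 90%. As a rough illustration of that filter, here is a plain Python sketch. It does not use the cogent3 tree API, `low_support` is a hypothetical helper, and the branch names and support values are invented for the example rather than taken from this alignment.

```python
def low_support(supports: dict, threshold: float = 90.0) -> list:
    """Return the names of branches whose support is strictly below threshold."""
    return [name for name, value in supports.items() if value < threshold]


# Hypothetical support percentages keyed by branch name.
example_supports = {
    "edge.0": 100.0,
    "edge.1": 72.0,
    "edge.2": 90.0,
    "edge.3": 88.5,
}

flagged = low_support(example_supports)
print(flagged)  # ['edge.1', 'edge.3']
```

A branch with support of exactly 90 is not flagged in this sketch, because the comparison is strict, in line with the "< 90%" wording above.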