phruta
behind the scenesphruta
has a plethora of dependencies that enables the
package to perform particular tasks in a (largely…) organized fashion.
This vignette has the objective of acknowledging the maintainers and
creators of the main R packages that phruta
uses in
different of it’s steps. First, both rgbif
(Chamberlain and
Boettiger, 2017) and taxize
(Chamberlain and Szöcs, 2013)
are both used for either retrieving taxonomic information or curating
it. Second, ape
(Paradis and Schliep, 2019) is generally
used for interacting with sequence- and tree-based files within
R
and relative to files in the working directory. Third,
both rentrez
(Winter, 2017) and reutils
(Schöfl, 2016) are used to retrieve sequences from GenBank. We use
functions from two packages to interact with GenBank given differences
in the structure of search strings or expectations in the resulting
objects. Fourth, we use DECIPHER
(Wright, 2016),
Biostrings
(Pagès et al. 2022), msa
(Bodenhofer et al. 2015), and odseq
(Jiménez, 2022) mostly
during sequence alignment-related steps involving both curation and
effective alignment. Fifth, we use ips
(Heibl, 2008) to
interact with external software such as RAxML
(Stamatakis,
2014). Finally, geiger
(Harmon et al. 2008; Pennell et
al. 2014) and Rogue
(Aberer et al. 2013; Smith, 2021, 2022)
are used to time-calibrate a phylogeny using a reference tree and
identify rogue taxa in newly-inferred trees, respectively.
Aberer, A. J., Krompass, D., & Stamatakis, A. (2013). Pruning rogue taxa improves phylogenetic accuracy: an efficient algorithm and webservice. Systematic biology, 62(1), 162-166.
Bodenhofer, U., Bonatesta, E., Horejš-Kainrath, C., & Hochreiter, S. (2015). msa: an R package for multiple sequence alignment. Bioinformatics, 31(24), 3997-3999.
Chamberlain, S. A., & Boettiger, C. (2017). R Python, and Ruby clients for GBIF species occurrence data (No. e3304v1). PeerJ Preprints.
Chamberlain, S. A., & Szöcs, E. (2013). taxize: taxonomic search and retrieval in R. F1000Research, 2. doi:10.5281/zenodo.5037327
Harmon, L. J., Weir, J. T., Brock, C. D., Glor, R. E., & Challenger, W. (2008). GEIGER: investigating evolutionary radiations. Bioinformatics, 24(1), 129-131.
Heibl, C. (2008). PHYLOCH: R language tree plotting tools and interfaces to d Jiménez, J. (2022). odseq: Outlier detection in multiple sequence alignments. R package version 1.24.0.
Pagès, H., Aboyoun, P., Gentleman, R., DebRoy, S. (2022). Biostrings: Efficient manipulation of biological strings. R package version 2.64.0
Paradis, E., & Schliep, K. (2019). ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics, 35(3), 526-528.
Pennell, M. W., Eastman, J. M., Slater, G. J., Brown, J. W., Uyeda, J. C., FitzJohn, R. G., … & Harmon, L. J. (2014). geiger v2. 0: an expanded suite of methods for fitting macroevolutionary models to phylogenetic trees. Bioinformatics, 30(15), 2216-2218.
Schöfl, G. (2016). reutils: Talk to the NCBI EUtils. R package version 0.2.3.
Smith, M. R. (2022). Using information theory to detect rogue taxa and improve consensus trees. Systematic Biology, 71(5), 1088-1094.
Smith, M.R. (2021): Rogue: Identify Rogue Taxa in Sets of Phylogenetic Trees, Zotero.
Stamatakis, A. (2014). RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30(9), 1312-1313.
Winter, D. J. (2017). rentrez: An R package for the NCBI eUtils API (No. e3179v2). PeerJ Preprints.
Wright, E. S. (2016). Using DECIPHER v2. 0 to analyze big biological sequence data in R. R J., 8(1), 352.
The author thanks Heidi E. Steiner for proofreading the vignettes and
documentation in phruta
in addition to early versions of
this manuscript. Anna Krystalli, Rayna Harris, Frederick Boehm, and
Maëlle Salmon provided excellent comments during the peer review process
of phruta
in ROpenSci. Wonkyiun Yim provided early support
in salphycon
. Finally, the author thanks the author and
maintainers of all the packages used in phruta
and
salphycon
for their constant support to the field.