Title: | Access London Natural History Museum Host-Helminth Record Database |
---|---|
Description: | Access to large host-parasite datasets is often hampered by limited data availability and by the difficulty of obtaining the data programmatically for analysis. 'helminthR' provides a programmatic interface to the London Natural History Museum's host-parasite database, one of the largest host-parasite databases currently in existence <https://www.nhm.ac.uk/research-curation/scientific-resources/taxonomy-systematics/host-parasites/>. The package allows the user to query by host species, parasite species, and geographic location. |
Authors: | Tad Dallas [aut, cre] |
Maintainer: | Tad Dallas <[email protected]> |
License: | GPL-3 |
Version: | 1.0.10 |
Built: | 2024-12-12 06:14:18 UTC |
Source: | https://github.com/ropensci/helminthR |
'helminthR': A programmatic interface to the London Natural History Museum's host-parasite database.
The package currently allows you to query by host species, parasite species, and geographic location. No information is provided on parasite prevalence or intensity.
Tad Dallas [email protected]
Gibson, D. I., Bray, R. A., & Harris, E. A. (Compilers) (2005). Host-Parasite Database of the Natural History Museum, London. <http://www.nhm.ac.uk/research-curation/scientific-resources/taxonomy-systematics/host-parasites/>
Given a host-parasite edgelist, this function can validate species names, provide further taxonomic information (thanks to taxize), and remove records identified only to genus level.
cleanData(edge, speciesOnly = FALSE, validateHosts = FALSE)
edge |
Host-parasite edgelist obtained from |
speciesOnly |
boolean flag to remove host and parasite species where data are only available at genus level (default = FALSE) |
validateHosts |
boolean flag to check host species names against Catalogue of Life information and output taxonomic information (default = FALSE) |
Use data(locations) for a list of possible locations.
cleanEdge: the cleaned host-parasite edgelist
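As a rough illustration of what the speciesOnly filter is described as doing, the sketch below drops rows in which either the host or the parasite lacks a species epithet. This is not the package's implementation. The column names `Host` and `Parasite`, the helpers `is_species_level()` and `drop_genus_only()`, the rule that genus-only names are single words or end in "sp."/"spp.", and the placeholder names in the toy edgelist are all assumptions made for illustration.

```r
# Hypothetical helper: TRUE when a name looks like a full binomial.
# Assumes genus-only records are a single word or end in "sp." / "spp.".
is_species_level <- function(x) {
  x <- trimws(x)
  has_two_words <- grepl("^\\S+\\s+\\S+", x)
  genus_only <- grepl("\\s+spp?\\.?$", x)
  has_two_words & !genus_only
}

# Hypothetical helper: keep rows where both host and parasite are species-level.
drop_genus_only <- function(edge, host_col = "Host", parasite_col = "Parasite") {
  keep <- is_species_level(edge[[host_col]]) &
    is_species_level(edge[[parasite_col]])
  edge[keep, , drop = FALSE]
}

# Toy edgelist with placeholder names (not real records).
toy_edge <- data.frame(
  Host = c("Hostus alphus", "Hostus sp.", "Hostus betus"),
  Parasite = c("Parasitus sp.", "Parasitus unus", "Parasitus duo"),
  stringsAsFactors = FALSE
)
cleaned <- drop_genus_only(toy_edge)
```

Only the third toy row survives, because the first has a genus-only parasite and the second a genus-only host.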
Tad Dallas
Given a host genus, species, and/or location, returns a list of parasite
occurrences on that host or for that location.
Use data(locations) for a list of possible locations.
findHost( genus = NULL, species = NULL, location = NULL, citation = FALSE, hostState = NULL, speciesOnly = FALSE, validateHosts = FALSE, parGroup = NULL, removeDuplicates = FALSE )
genus |
Host genus |
species |
Host species |
location |
Geographic location. |
citation |
Boolean. Should the output include the citation link and the number of supporting citations? default is FALSE |
hostState |
number corresponding to one of six different host states. The default value is NULL and includes all host states |
speciesOnly |
boolean flag to remove host and parasite species where data are only available at genus level (default = FALSE) |
validateHosts |
boolean flag to check host species names against Catalogue of Life information and output taxonomic information (default = FALSE) |
parGroup |
name of parasite group to query (default queries all groups) |
removeDuplicates |
(boolean) should duplicate host-parasite combinations be removed? (default is FALSE) |
hostState can take values 1-6, corresponding to where the recorded host was found:
(1) "In the wild"
(2) "Zoo captivity"
(3) "Domesticated"
(4) "Experimental"
(5) "Commercial source"
(6) "Accidental infestation"
A value of NULL should be entered if you would like to include all hostStates.
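For readers who want to label results by host state, a minimal lookup sketch follows. The labels are copied from the list above; the helper name `host_state_label()` and the "All host states" text returned for NULL are assumptions, not part of the package.

```r
# Labels for hostState codes 1-6, in the order listed above.
host_states <- c(
  "In the wild", "Zoo captivity", "Domesticated",
  "Experimental", "Commercial source", "Accidental infestation"
)

# Hypothetical helper: translate a hostState code into its label.
# NULL mirrors the documented default, which includes all host states.
host_state_label <- function(hostState = NULL) {
  if (is.null(hostState)) {
    return("All host states")
  }
  stopifnot(length(hostState) == 1, hostState %in% seq_along(host_states))
  host_states[hostState]
}

host_state_label(1)
host_state_label(NULL)
```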
parGroup can be specified as "Acanthocephalans", "Cestodes", "Monogeans", "Nematodes", "Trematodes", or "Turbs" (Turbellarians etc.). The default is to query all helminth parasite taxa.
Three (or five) column data.frame containing host species, parasite species (shortened name and full name), and citation link and number of citations (if 'citation'=TRUE), with each row corresponding to an occurrence of a parasite species on a host species.
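The removeDuplicates option is described as dropping repeated host-parasite combinations. A base R sketch of that idea, applied to a data.frame after the fact, might look like the following. The column names `Host` and `Parasite`, the helper `dedupe_pairs()`, and the placeholder rows are assumptions; the package may handle duplicates differently.

```r
# Hypothetical helper: keep the first row for each host-parasite pair.
dedupe_pairs <- function(edge, cols = c("Host", "Parasite")) {
  edge[!duplicated(edge[, cols, drop = FALSE]), , drop = FALSE]
}

# Toy edgelist with placeholder names (not real records).
toy <- data.frame(
  Host = c("Hostus alphus", "Hostus alphus", "Hostus betus"),
  Parasite = c("Parasitus unus", "Parasitus unus", "Parasitus duo"),
  stringsAsFactors = FALSE
)
unique_pairs <- dedupe_pairs(toy)
```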
Tad Dallas
Gibson, D. I., Bray, R. A., & Harris, E. A. (Compilers) (2005). Host-Parasite Database of the Natural History Museum, London. <http://www.nhm.ac.uk/research-curation/scientific-resources/taxonomy-systematics/host-parasites/>
gorillaParasites <- helminthR::findHost("Gorilla", "gorilla")

# An example of how to query multiple hosts when you have a
# vector of host species names
hosts <- c("Gorilla gorilla", "Peromyscus leucopus")
plyr::ldply(hosts, function(x) {
  helminthR::findHost(unlist(strsplit(x, " "))[1], unlist(strsplit(x, " "))[2])
})
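The multi-host query above can also be written without plyr. The sketch below splits each binomial once, skips any name whose query raises an error, and row-binds the rest. The helpers `split_binomial()` and `query_many()` are hypothetical; the `query` argument defaults to helminthR::findHost but accepts any function that takes a genus and a species and returns a data.frame, which also makes the wrapper easy to try offline with a stand-in function.

```r
# Split "Genus species" strings into genus and species parts.
split_binomial <- function(x) {
  parts <- strsplit(trimws(x), "\\s+")
  data.frame(
    genus = vapply(parts, function(p) p[1], character(1)),
    species = vapply(parts, function(p) p[2], character(1)),
    stringsAsFactors = FALSE
  )
}

# Hypothetical wrapper: run one query per name and combine the results.
query_many <- function(binomials, query = helminthR::findHost) {
  nm <- split_binomial(binomials)
  results <- lapply(seq_len(nrow(nm)), function(i) {
    tryCatch(
      query(nm$genus[i], nm$species[i]),
      error = function(e) NULL  # skip names whose query fails
    )
  })
  do.call(rbind, results)
}

# Offline stand-in for the real query function, for demonstration only.
stub_query <- function(genus, species) {
  data.frame(Host = paste(genus, species), stringsAsFactors = FALSE)
}
demo <- query_many(c("Gorilla gorilla", "Peromyscus leucopus"), query = stub_query)
```

With the package installed and the database reachable, `query_many(hosts)` would issue the same calls as the plyr example.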
Given a location (available from data(locations)), this function returns all host-parasite associations in that location.
findLocation( location = NULL, group = NULL, citation = FALSE, hostState = NULL, speciesOnly = FALSE, validateHosts = FALSE, removeDuplicates = FALSE )
location |
Location of host-parasite interaction. |
group |
Parasite group - Cestodes, Acanthocephalans, Monogeneans, Nematodes, Trematodes, or Turbellarian etc. (Turb) |
citation |
Boolean. Should the output include the citation link and the number of supporting citations? default is FALSE |
hostState |
number corresponding to one of six different host states. The default value is NULL and includes all host states. |
speciesOnly |
boolean flag to remove host and parasite species where data are only available at genus level (default = FALSE) |
validateHosts |
boolean flag to check host species names against Catalogue of Life information and output taxonomic information (default = FALSE) |
removeDuplicates |
(boolean) should duplicate host-parasite combinations be removed? (default is FALSE) |
hostState can take values 1-6, corresponding to where the recorded host was found:
(1) "In the wild"
(2) "Zoo captivity"
(3) "Domesticated"
(4) "Experimental"
(5) "Commercial source"
(6) "Accidental infestation"
Three (or five) column data.frame containing host species, parasite species (shortened name and full name), and citation link and number of citations (if citation = TRUE), with each row corresponding to an occurrence of a parasite species on a host species.
Tad Dallas
Gibson, D. I., Bray, R. A., & Harris, E. A. (Compilers) (2005). Host-Parasite Database of the Natural History Museum, London. <http://www.nhm.ac.uk/research-curation/scientific-resources/taxonomy-systematics/host-parasites/>
FrenchHostPars <- helminthR::findLocation(location="France")
Given a parasite genus and/or species, this function returns a data.frame containing host-parasite interaction data. Search available locations using data(locations).
findParasite( genus = NULL, species = NULL, group = NULL, subgroup = NULL, location = NULL, citation = FALSE, hostState = NULL, speciesOnly = FALSE, validateHosts = FALSE, removeDuplicates = FALSE )
genus |
Parasite genus |
species |
Parasite species |
group |
Parasite group - Cestodes, Acanthocephalans, Monogeneans, Nematodes, Trematodes, or Turbellarian etc. (Turb) |
subgroup |
Parasite subgroup (family names largely) |
location |
Location of host-parasite interaction. |
citation |
Boolean. Should the output include the citation link and the number of supporting citations? default is FALSE |
hostState |
number corresponding to one of six different host states. The default value is NULL and includes all host states |
speciesOnly |
boolean flag to remove host and parasite species where data are only available at genus level (default = FALSE) |
validateHosts |
boolean flag to check host species names against Catalogue of Life information and output taxonomic information (default = FALSE) |
removeDuplicates |
(boolean) should duplicate host-parasite combinations be removed? (default is FALSE) |
hostState can take values 1-6, corresponding to where the recorded host was found:
(1) "In the wild"
(2) "Zoo captivity"
(3) "Domesticated"
(4) "Experimental"
(5) "Commercial source"
(6) "Accidental infestation"
Three (or five) column data.frame containing host species, parasite species (shortened name and full name), and citation link and number of citations (if citation = TRUE), with each row corresponding to an occurrence of a parasite species on a host species.
Tad Dallas
Gibson, D. I., Bray, R. A., & Harris, E. A. (Compilers) (2005). Host-Parasite Database of the Natural History Museum, London. <http://www.nhm.ac.uk/research-curation/scientific-resources/taxonomy-systematics/host-parasites/>
strongHosts <- helminthR::findParasite(genus = "Strongyloides")

# An example of how to query multiple parasite species when
# you have a vector of parasite species names
parasites <- c("Ascaris aculeati", "Oxyuris flagellum")
plyr::ldply(parasites, function(x) {
  helminthR::findParasite(unlist(strsplit(x, " "))[1], unlist(strsplit(x, " "))[2])
})
Lists geographic locations that can be input to findHost or findParasite, along with the latitude and longitude coordinates of each country's centroid. Georeferencing was originally performed dynamically using the Google Maps API, but access to that API has since been restricted. The location data is now provided in a data file called locations, loaded with data(locations), and is based on an earlier usage of ggmap. The geographic coordinates may not be accurate, and users should check them (and feel free to file an issue or PR on GitHub with corrections).
data(locations)
Name of geographic location
Latitude of location centroid
Longitude of location centroid
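Because queries take a location name, it can help to check a name against the table before querying. The sketch below does a case-insensitive lookup. The helper `match_location()` is hypothetical, and the column names `Location`, `Latitude`, and `Longitude` are assumptions, so check names(locations) first. The toy table uses rough placeholder coordinates, not values from the package.

```r
# Hypothetical helper: return the row for a location name, ignoring case
# and surrounding whitespace, or fail with a clear message.
match_location <- function(x, locs, name_col = "Location") {
  hit <- match(tolower(trimws(x)), tolower(locs[[name_col]]))
  if (is.na(hit)) {
    stop("Unknown location: ", x)
  }
  locs[hit, , drop = FALSE]
}

# Toy stand-in for data(locations); coordinates are rough placeholders.
toy_locs <- data.frame(
  Location = c("France", "Spain"),
  Latitude = c(46.2, 40.5),
  Longitude = c(2.2, -3.7),
  stringsAsFactors = FALSE
)
france <- match_location(" france ", toy_locs)
```

With the package loaded, the same call could be made against the real table by passing `locations` (and the appropriate `name_col`) in place of `toy_locs`.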
Gibson, D. I., Bray, R. A., & Harris, E. A. (Compilers) (2005). Host-Parasite Database of the Natural History Museum, London.