| Title: | Interface to 'the CAVD DataSpace' |
|---|---|
| Description: | Provides a convenient API interface to access immunological data within 'the CAVD DataSpace' (<https://dataspace.cavd.org>), a data sharing and discovery tool that facilitates exploration of HIV immunological data from pre-clinical and clinical HIV vaccine studies. |
| Authors: | Ju Yeong Kim [aut], Sean Hughes [rev], Jason Taylor [aut, cre], Helen Miller [aut], CAVD DataSpace [cph] |
| Maintainer: | Jason Taylor <[email protected]> |
| License: | GPL-3 |
| Version: | 0.7.7 |
| Built: | 2025-10-26 05:13:08 UTC |
| Source: | https://github.com/ropensci/DataSpaceR |
DataSpaceR provides a convenient API for accessing datasets within the DataSpace database.
It uses the Rlabkey package to connect to DataSpace and implements methods for retrieving datasets.
Author: Ju Yeong Kim
Check that there is a netrc file with a valid entry for the CAVD DataSpace.
checkNetrc(netrcFile = getNetrcPath(), onStaging = FALSE, verbose = TRUE)
| Argument | Description |
|---|---|
| netrcFile | A character. File path to netrc file to check. |
| onStaging | A logical. Whether to check the staging server instead of the production server. |
| verbose | A logical. Whether to print the extra details for troubleshooting. |
Returns the name of the netrc file.
try(checkNetrc())
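To make the idea of a "valid entry" more concrete, the following base R sketch looks for a `machine` entry naming the DataSpace host in a netrc file. It is not how `checkNetrc()` is implemented. The helper name `hasNetrcEntry` is hypothetical, and the host `dataspace.cavd.org` is an assumption taken from the package URL above.

```r
# Hypothetical helper (not part of DataSpaceR): report whether a netrc file
# contains a "machine <host>" entry that also has login and password fields.
hasNetrcEntry <- function(netrcFile, host = "dataspace.cavd.org") {
  if (!file.exists(netrcFile)) {
    return(FALSE)
  }
  # Split the whole file into whitespace-separated tokens, so both the
  # one-line and the multi-line netrc layouts are handled.
  tokens <- scan(netrcFile, what = character(), quiet = TRUE)
  starts <- which(tokens == "machine")
  for (i in starts) {
    if (i + 1 <= length(tokens) && tokens[i + 1] == host) {
      # Collect only the tokens that belong to this entry.
      nextStart <- starts[starts > i]
      end <- if (length(nextStart) > 0) nextStart[1] - 1 else length(tokens)
      entry <- tokens[i:end]
      return(all(c("login", "password") %in% entry))
    }
  }
  FALSE
}

# Try it on a throwaway file with placeholder credentials.
tmp <- tempfile()
writeLines(c("machine dataspace.cavd.org", "login user", "password secret"), tmp)
hasNetrcEntry(tmp)
```

A check like this only confirms that an entry exists in the file; it cannot tell whether the server would accept the credentials.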
Constructor for DataSpaceConnection
connectDS(login = NULL, password = NULL, verbose = FALSE, onStaging = FALSE)
| Argument | Description |
|---|---|
| login | A character. Optional argument. If there is no netrc file a temporary one can be written by passing login and password of an active DataSpace account. |
| password | A character. Optional. The password for the selected login. |
| verbose | A logical. Whether to print the extra details for troubleshooting. |
| onStaging | A logical. Whether to connect to the staging server instead of the production server. |
Instantiates a DataSpaceConnection.
The constructor will try to take the values of the various labkey.*
parameters from the global environment. If they don't exist, it will use
default values. These are assigned to 'options', which are then used by the
DataSpaceConnection class.
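The lookup-with-fallback behaviour described above can be sketched in base R as follows. This is an illustration, not the package's code. The helper `getGlobalOrDefault` is hypothetical, and the parameter name `labkey.url.base` and its default value are assumptions chosen to match the `labkey.*` naming pattern.

```r
# Hypothetical helper: return a variable from the global environment if it
# exists there, otherwise fall back to a default value.
getGlobalOrDefault <- function(name, default) {
  if (exists(name, envir = globalenv(), inherits = FALSE)) {
    get(name, envir = globalenv(), inherits = FALSE)
  } else {
    default
  }
}

# Assumed parameter name and default, for illustration only.
baseUrl <- getGlobalOrDefault("labkey.url.base", "https://dataspace.cavd.org")

# Store the resolved value in options() so later code can read it back.
options(labkey.url.base = baseUrl)
getOption("labkey.url.base")
```

Under this pattern, a user who defines the variable before connecting overrides the default without passing any extra argument.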
Returns an instance of DataSpaceConnection.
    ## Not run: 
    con <- connectDS()
    ## End(Not run)

    con <- try(connectDS())
    if (inherits(con, "try-error")) {
      warning("Read README for more information on how to set up a .netrc file.")
    }
The DataSpaceConnection class
config: A list. Stores configuration of the connection object such as URL, path and username.
availableStudies: A data.table. The table of available studies.
availableGroups: A data.table. The table of available groups.
availablePublications: A data.table. The table of available publications.
mabGridSummary: A data.table. The filtered grid with updated n_ columns and geometric_mean_curve_ic50.
mabGrid: A data.table. The filtered mAb grid.
virusMetadata: A data.table. Metadata about all viruses in the DataSpace.
virusNameMappingTables: A list of data.table objects. This list contains 'virusMetadataAll', 'virusLabId', and 'virus_synonym', which are described in the vignette 'Virus_Name_Mapping_Tables'.
new()
Initialize a DataSpaceConnection object.
See connectDS.
DataSpaceConnection$new(login = NULL, password = NULL, verbose = FALSE, onStaging = FALSE)
login: A character. Optional argument. If there is no netrc file a temporary one can be written by passing login and password of an active DataSpace account.
password: A character. Optional. The password for the selected login.
verbose: A logical. Whether to print the extra details for troubleshooting.
onStaging: A logical. Whether to connect to the staging server instead of the production server.
A new 'DataSpaceConnection' object.
print()
Print the DataSpaceConnection object.
DataSpaceConnection$print()
getStudy()
Create a DataSpaceStudy object.
DataSpaceConnection$getStudy(studyName)
studyName: A character. Name of the study to retrieve.
getGroup()
Create a DataSpaceStudy object for a group.
DataSpaceConnection$getGroup(groupId)
groupId: An integer. ID of the group to retrieve.
filterMabGrid()
Filter rows in the mAb grid by specifying the values to keep in the
columns found in the mabGrid field. It takes the column and the
values and filters the underlying tables.
DataSpaceConnection$filterMabGrid(using, value)
using: A character. Name of the column to filter.
value: A character vector. Values to keep in the mAb grid.
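The column-and-values filtering described above can be illustrated with a small base R sketch. It uses a plain data.frame with made-up rows rather than the real mAb grid, and the helper `filterGrid` is hypothetical; only the column names `mab_mixture`, `virus` and `donor_species` come from the package examples. It assumes that successive filters narrow the grid cumulatively, which is what the existence of resetMabGrid() suggests.

```r
# Toy stand-in for the mAb grid; the rows are made up for illustration.
grid <- data.frame(
  mab_mixture = c("mAb A", "mAb A", "mAb B"),
  virus = c("virus 1", "virus 2", "virus 1"),
  donor_species = c("human", "human", "llama"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the rows whose `using` column matches any of `value`.
filterGrid <- function(grid, using, value) {
  stopifnot(using %in% names(grid))
  grid[grid[[using]] %in% value, , drop = FALSE]
}

# Filters applied one after another narrow the grid further.
step1 <- filterGrid(grid, using = "virus", value = "virus 1")
step2 <- filterGrid(step1, using = "donor_species", value = "llama")
step2$mab_mixture
```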
resetMabGrid()
Reset the mAb grid to the unfiltered state.
DataSpaceConnection$resetMabGrid()
getMab()
Create a DataSpaceMab object.
DataSpaceConnection$getMab()
downloadPublicationData()
Download publication data for a chosen publication.
DataSpaceConnection$downloadPublicationData(publicationId, outputDir = getwd(), unzip = TRUE, verbose = TRUE)
publicationId: A character/integer. ID for the publication to download data for.
outputDir: A character. Path to directory to download publication data.
unzip: A logical. If TRUE, unzip publication data to outputDir.
verbose: A logical. Default TRUE.
refresh()
Refresh the connection object to update available studies and groups.
DataSpaceConnection$refresh()
clone()
The objects of this class are cloneable with this method.
DataSpaceConnection$clone(deep = FALSE)
deep: Whether to make a deep clone.
    ## Not run: 
    # Create a connection (Initiate a DataSpaceConnection object)
    con <- connectDS()
    con

    # Connect to cvd408
    # https://dataspace.cavd.org/cds/CAVD/app.view#learn/learn/Study/cvd408?q=408
    cvd408 <- con$getStudy("cvd408")

    # Connect to all studies
    cvd <- con$getStudy("")

    # Connect to the NYVAC durability comparison group
    # https://dataspace.cavd.org/cds/CAVD/app.view#group/groupsummary/220
    nyvac <- con$getGroup(220)

    # Refresh the connection object to update available studies and groups
    con$refresh()
    ## End(Not run)
The DataSpaceMab class
Constructor: DataSpaceConnection$getMab()
config: A list. Stores configuration of the connection object such as URL, path and username.
studyAndMabs: A data.table. The table of available mAbs by study.
mabs: A data.table. The table of available mAbs and their attributes.
nabMab: A data.table. The table of mAbs and their neutralizing measurements against viruses.
studies: A data.table. The table of available studies.
assays: A data.table. The table of assay status by study.
variableDefinitions: A data.table. The table of variable definitions.
new()
Initialize a DataSpaceMab object.
See DataSpaceConnection.
DataSpaceMab$new(mabMixture, filters, config)
mabMixture: A character vector.
filters: A list.
config: A list.
print()
Print the DataSpaceMab object summary.
DataSpaceMab$print()
refresh()
Refresh the DataSpaceMab object to update datasets.
DataSpaceMab$refresh()
getLanlMetadata()
Applies LANL metadata to mabs table.
DataSpaceMab$getLanlMetadata()
clone()
The objects of this class are cloneable with this method.
DataSpaceMab$clone(deep = FALSE)
deep: Whether to make a deep clone.
    ## Not run: 
    # Create a connection (Initiate a DataSpaceConnection object)
    con <- connectDS()

    # Browse the mAb Grid
    con$mabGridSummary

    # Filter the grid by viruses
    con$filterMabGrid(using = "virus", value = c("242-14", "Q23.17", "6535.3", "BaL.26", "DJ263.8"))

    # Filter the grid by donor species (llama)
    con$filterMabGrid(using = "donor_species", value = "llama")

    # Check the updated grid
    con$mabGridSummary

    # Retrieve available viruses in the filtered grid
    con$mabGrid[, unique(virus)]

    # Retrieve available clades for 1H9 mAb mixture in the filtered grid
    con$mabGrid[mab_mixture == "1H9", unique(clade)]

    # Create a DataSpaceMab object that contains the filtered mAb data
    mab <- con$getMab()
    mab

    # Inspect the `nabMab` field
    mab$nabMab
    ## End(Not run)
The DataSpaceStudy class
Constructors: DataSpaceConnection$getStudy() and DataSpaceConnection$getGroup()
study: A character. The study name.
config: A list. Stores configuration of the connection object such as URL, path and username.
availableDatasets: A data.table. The table of datasets available in the DataSpaceStudy object.
cache: A list. Stores the data to avoid downloading the same tables multiple times.
dataDir: A character. Default directory for storing nonstandard datasets. Set with setDataDir(dataDir).
treatmentArm: A data.table. The table of treatment arm information for the connected study. Not available for all study connections.
group: A character. The group name.
studyInfo: A list. Stores the information about the study.
new()
Initialize a DataSpaceStudy object.
See DataSpaceConnection.
DataSpaceStudy$new(study = NULL, config = NULL, group = NULL, studyInfo = NULL)
study: A character. Name of the study to retrieve.
config: A list. Stores configuration of the connection object such as URL, path and username.
group: An integer. ID of the group to retrieve.
studyInfo: A list. Stores the information about the study.
print()
Print the DataSpaceStudy object.
DataSpaceStudy$print()
getDataset()
Get a dataset from the connection.
DataSpaceStudy$getDataset(datasetName, mergeExtra = FALSE, colFilter = NULL, reload = FALSE, outputDir = NULL, ...)
datasetName: A character. Name of the dataset to retrieve. Accepts the value in either the "name" or "label" field from availableDatasets.
mergeExtra: A logical. If set to TRUE, merge extra information. Ignored for non-integrated datasets.
colFilter: A matrix. A filter as returned by Rlabkey's makeFilter.
reload: A logical. If set to TRUE, download the dataset whether or not a cached version exists.
outputDir: A character. Optional. Specifies the directory to download nonstandard datasets to. If NULL, data will be downloaded to dataDir, set with setDataDir(dataDir). If dataDir is not set and outputDir is NULL, a temporary directory will be used.
...: Extra arguments to be passed to labkey.selectRows.
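The interaction between the cache and the reload argument can be sketched in base R as below. This is a simplified illustration under stated assumptions, not the package's implementation. The names `cache`, `fetchDataset` and `getCached` are hypothetical, and the "download" is a stand-in that only counts its calls.

```r
# Hypothetical in-memory cache keyed by dataset name.
cache <- new.env()
downloads <- 0

# Stand-in for a real download; it only counts how often it is called.
fetchDataset <- function(datasetName) {
  downloads <<- downloads + 1
  data.frame(dataset = datasetName, stringsAsFactors = FALSE)
}

# Return the cached copy unless it is missing or `reload` is TRUE.
getCached <- function(datasetName, reload = FALSE) {
  if (reload || !exists(datasetName, envir = cache, inherits = FALSE)) {
    assign(datasetName, fetchDataset(datasetName), envir = cache)
  }
  get(datasetName, envir = cache, inherits = FALSE)
}

getCached("NAb")                 # first call downloads
getCached("NAb")                 # second call is served from the cache
getCached("NAb", reload = TRUE)  # forces a fresh download
downloads
```

Three requests trigger only two downloads here, which is the saving the cache field is meant to provide.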
clearCache()
Clear cache. Remove downloaded datasets.
DataSpaceStudy$clearCache()
getDatasetDescription()
Get variable information.
DataSpaceStudy$getDatasetDescription(datasetName, outputDir = NULL)
datasetName: A character. Name of the dataset to retrieve. Accepts the value in either the "name" or "label" field from availableDatasets.
outputDir: A character. Directory path.
setDataDir()
Set default directory to download non-integrated datasets. If no
dataDir is set, a tmp directory will be used.
DataSpaceStudy$setDataDir(dataDir)
dataDir: A character. Directory path.
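The fallback order documented for getDataset()'s outputDir and for setDataDir() (an explicit outputDir, then the stored dataDir, then a temporary directory) can be written out as a short base R sketch. The helper `resolveOutputDir` is hypothetical and exists only to show the order of precedence.

```r
# Hypothetical helper: pick the download directory in the documented order:
# explicit outputDir first, then the stored dataDir, then a temp directory.
resolveOutputDir <- function(outputDir = NULL, dataDir = NULL) {
  if (!is.null(outputDir)) {
    outputDir
  } else if (!is.null(dataDir)) {
    dataDir
  } else {
    tempdir()
  }
}

resolveOutputDir(outputDir = "downloads", dataDir = "data")  # "downloads"
resolveOutputDir(dataDir = "data")                           # "data"
resolveOutputDir()                                           # session temp dir
```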
refresh()
Refresh the study object to update available datasets and treatment info.
DataSpaceStudy$refresh()
clone()
The objects of this class are cloneable with this method.
DataSpaceStudy$clone(deep = FALSE)
deep: Whether to make a deep clone.
    ## Not run: 
    # Create a connection (Initiate a DataSpaceConnection object)
    con <- connectDS()

    # Connect to cvd408 (Initiate a DataSpaceStudy object)
    # https://dataspace.cavd.org/cds/CAVD/app.view#learn/learn/Study/cvd408?q=408
    cvd408 <- con$getStudy("cvd408")
    cvd408

    # Retrieve Neutralizing antibody dataset (NAb) for cvd408 from DataSpace
    NAb <- cvd408$getDataset("NAb")

    # Get variable information of the NAb dataset
    cvd408$getDatasetDescription("NAb")

    # Take a look at cvd408's treatment arm information
    cvd408$treatmentArm

    # Clear cache of a study object
    cvd408$clearCache()

    # Connect to the NYVAC durability comparison group
    # https://dataspace.cavd.org/cds/CAVD/app.view#group/groupsummary/220
    nyvac <- con$getGroup(220)

    # Connect to all studies
    cvd <- con$getStudy("")

    # Refresh the study object to update available datasets and treatment info
    cvd$refresh()
    ## End(Not run)
Get a default netrc file path
getNetrcPath()
Returns a character vector containing the default netrc file path.
getNetrcPath()
Write a netrc file that is valid for accessing DataSpace.
writeNetrc(login, password, netrcFile = NULL, onStaging = FALSE, overwrite = FALSE)
| Argument | Description |
|---|---|
| login | A character. Email address used for logging in on DataSpace. |
| password | A character. Password associated with the login. |
| netrcFile | A character. Credentials will be written into that file. If left NULL, netrc will be written into a temporary file. |
| onStaging | A logical. Whether to connect to the staging server instead of the production server. |
| overwrite | A logical. Whether to overwrite the existing netrc file. |
The database is accessed with the user's credentials.
A netrc file storing login and password information is required.
See here for instructions on how to register and set DataSpace credentials.
By default, curl will look for the file in your home directory.
Returns a character vector containing the netrc file path.
    # First, create an account in the DataSpace App and read the terms of use
    # Next, create a netrc file using writeNetrc()
    writeNetrc(
      login = "[email protected]",
      password = "yourSecretPassword"
    )
    # Specify `netrcFile = getNetrcPath()` to write netrc in the default path
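For readers unfamiliar with the netrc format, the sketch below writes a single-entry file by hand using base R only. It illustrates the general `machine` / `login` / `password` layout that curl reads, not the exact output of writeNetrc(). The helper `writeNetrcByHand` is hypothetical, the host `dataspace.cavd.org` is an assumption based on the DataSpace URL, and the credentials are placeholders.

```r
# Hypothetical helper: write a single-entry netrc file by hand.
# The host is an assumption; the credentials passed below are placeholders.
writeNetrcByHand <- function(login, password, netrcFile = tempfile(),
                             host = "dataspace.cavd.org") {
  writeLines(
    c(
      paste("machine", host),
      paste("login", login),
      paste("password", password)
    ),
    netrcFile
  )
  # Restrict access to the owner, since the file holds a password
  # (this has little effect on Windows).
  Sys.chmod(netrcFile, mode = "0600")
  netrcFile
}

path <- writeNetrcByHand("user", "yourSecretPassword")
readLines(path)
```

Writing to a temporary file by default mirrors the documented behaviour of leaving netrcFile as NULL, and avoids touching an existing file in the home directory.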