Package 'internetarchive'

Title: An API Client for the Internet Archive
Description: Search the Internet Archive (<https://archive.org>), retrieve metadata, and download files.
Authors: Lincoln Mullen [aut, cre]
Maintainer: Ahmet Akkoc <[email protected]>
License: MIT + file LICENSE
Version: 0.1.6
Built: 2024-12-02 10:24:03 UTC
Source: https://github.com/ropensci/internetarchive

Open an Internet Archive item in the browser

Description

Open an Internet Archive item in the browser

Usage

ia_browse(item_id, type = c("details", "stream"))

Arguments

item_id

The item identifier. If multiple item identifiers are passed in, only the first will be opened.

type

Which page to open: "details" is the metadata page; "stream" is the viewing page for items that have an associated PDF.

Value

Returns the item ID(s) passed to the function.

Examples

# Distinguished Converts to Rome in America
ia_browse("distinguishedcon00scanuoft")
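
The default `type = c("details", "stream")` in the usage above follows the common R idiom in which the first element of the vector acts as the default. The sketch below shows how such an argument is typically resolved with `match.arg()`, and how only the first identifier is used when several are supplied. The helper name `sketch_browse_url()` and the URL pattern are assumptions for illustration; this is not the package's own implementation.

```r
# Hypothetical helper: builds the URL that a call like ia_browse() might open.
# The pattern https://archive.org/<type>/<item_id> is an assumption.
sketch_browse_url <- function(item_id, type = c("details", "stream")) {
  type <- match.arg(type)  # resolves to "details" when type is not supplied
  if (length(item_id) > 1) {
    message("Multiple identifiers supplied; using only the first.")
  }
  paste0("https://archive.org/", type, "/", item_id[1])
}

# Default type
sketch_browse_url("distinguishedcon00scanuoft")

# Explicit type, several identifiers: only the first is used
sketch_browse_url(c("distinguishedcon00scanuoft", "thedamnationofth00133gut"),
                  type = "stream")
```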

Download files for Internet Archive items.

Description

Download files for Internet Archive items.

Usage

ia_download(files, dir = ".", extended_name = TRUE, overwrite = FALSE,
  silence = FALSE)

Arguments

files

A data frame of files returned by ia_files. You should filter this data frame to download only the files that you actually want.

dir

The directory in which to save the downloaded files.

extended_name

If this argument is FALSE, then the downloaded file will have a filename in the following format: itemidentifier.extension, e.g., thedamnationofth00133gut.txt. If an item has multiple files of the same file type, these file names will not be unique. If this argument is TRUE, then the downloaded file will have a filename in the following format: itemidentifier-original-filename.extension, e.g., thedamnationofth00133gut-133.txt.

overwrite

If TRUE, this function will download all files and overwrite them on disk if they have already been downloaded. If FALSE, then if a file already exists on disk it will not be downloaded again but other downloads will proceed normally.

silence

If FALSE, print the item IDs as they are downloaded.

Value

A data frame including the file names of the downloaded files.

Examples

## Not run: 
if (require(dplyr)) {
  dir <- tempdir()
  ia_get_items("thedamnationofth00133gut") %>%
    ia_files() %>%
    filter(type == "txt") %>% # download only the files we want
    ia_download(dir = dir, extended_name = FALSE)
}

## End(Not run)
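
The effect of the extended_name and overwrite arguments can be illustrated without touching the network. The following is a minimal sketch of the documented naming scheme and skip-if-present behaviour; `sketch_filename()` and `needs_download()` are hypothetical helpers, the second remote file name is invented for the demonstration, and none of this is the package's own code.

```r
# Hypothetical helpers illustrating the documented behaviour of the
# extended_name and overwrite arguments; not the package's implementation.

# Build the local file name for a remote file belonging to an item.
sketch_filename <- function(item_id, remote_file, extended_name = TRUE) {
  ext <- tools::file_ext(remote_file)
  if (extended_name) {
    # itemidentifier-original-filename.extension
    stem <- tools::file_path_sans_ext(basename(remote_file))
    paste0(item_id, "-", stem, ".", ext)
  } else {
    # itemidentifier.extension
    paste0(item_id, ".", ext)
  }
}

# Decide which local paths still need to be downloaded.
needs_download <- function(paths, overwrite = FALSE) {
  overwrite | !file.exists(paths)
}

id <- "thedamnationofth00133gut"
sketch_filename(id, "133.txt", extended_name = FALSE)
sketch_filename(id, "133.txt", extended_name = TRUE)

# Two text files for one item (the second name is made up) collide
# when extended_name = FALSE
short_names <- sketch_filename(id, c("133.txt", "133-8.txt"),
                               extended_name = FALSE)
anyDuplicated(short_names) > 0
```

When an item may have several files of the same type, keeping extended_name = TRUE avoids one download silently replacing or blocking another.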

Access the list of files associated with an Internet Archive item

Description

Access the list of files associated with an Internet Archive item

Usage

ia_files(items)

Arguments

items

A list describing Internet Archive items, as returned from the API by ia_get_items.

Value

A data frame listing the files associated with each item. It can be filtered and then passed to ia_download.

Examples

## Not run: 
ats_query <- c("publisher" = "american tract society")
ids       <- ia_search(ats_query, num_results = 3)
items     <- ia_get_items(ids)
files     <- ia_files(items)
files

## End(Not run)
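
The file listing is usually narrowed down before anything is downloaded. The sketch below shows one way to do that in base R, without dplyr. It assumes the result behaves like a data frame with a `type` column (as the `filter(type == "txt")` call in the ia_download example implies); the `id` and `file` column names and all values are hypothetical stand-ins, not real API output.

```r
# Hypothetical stand-in for the result of ia_files(); the column names
# id and file, and all values, are assumptions for illustration only.
files <- data.frame(
  id   = c("item-a", "item-a", "item-b"),
  file = c("item-a.pdf", "item-a.txt", "item-b.txt"),
  type = c("pdf", "txt", "txt"),
  stringsAsFactors = FALSE
)

# Base R equivalent of dplyr::filter(files, type == "txt")
txt_files <- files[files$type == "txt", , drop = FALSE]
txt_files

# Count the files of each type per item
table(files$id, files$type)
```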

Get the metadata for Internet Archive items

Description

Get the metadata for Internet Archive items

Usage

ia_get_items(item_id, silence = FALSE)

Arguments

item_id

A character vector containing the ID for an Internet Archive item. This argument is vectorized, so you can retrieve multiple items at once.

silence

If FALSE, print the item IDs as they are retrieved.

Value

A list containing the metadata returned by the API. List names correspond to the item IDs.

Examples

## Not run: 
ia_get_items("thedamnationofth00133gut")

ats_query <- c("publisher" = "american tract society")
ids       <- ia_search(ats_query, num_results = 2)
ia_get_items(ids)

## End(Not run)

Access the item IDs from Internet Archive items

Description

Access the item IDs from Internet Archive items

Usage

ia_item_id(item)

Arguments

item

A list describing Internet Archive items, as returned from the API. This argument is vectorized.

Value

A character vector containing the item IDs.

Examples

ats_query <- c("publisher" = "american tract society")
ids       <- ia_search(ats_query, num_results = 3)
items     <- ia_get_items(ids)
ia_item_id(items)

List accepted metadata fields

Description

List accepted metadata fields

Usage

ia_list_fields()

Value

A list of the accepted metadata fields

Examples

ia_list_fields()

Access the item metadata from an Internet Archive item

Description

Access the item metadata from an Internet Archive item

Usage

ia_metadata(items)

Arguments

items

A list object describing Internet Archive items, as returned from the API.

Value

A data frame containing the metadata, with columns id for the item identifier, field for the name of the metadata field, and value for the metadata values.

Examples

ats_query <- c("publisher" = "american tract society")
ids       <- ia_search(ats_query, num_results = 3)
items     <- ia_get_items(ids)
metadata  <- ia_metadata(items)
metadata
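
Because the result is in long format (one row per item, field, and value), picking out a single field or reshaping to one row per item takes only a few lines of base R. The data frame below is a hypothetical stand-in that follows the documented id/field/value layout; its values are placeholders, not real metadata.

```r
# Hypothetical stand-in for the result of ia_metadata(); the columns follow
# the documented id/field/value layout, the values are placeholders.
metadata <- data.frame(
  id    = c("item-a", "item-a", "item-b", "item-b"),
  field = c("title", "publisher", "title", "publisher"),
  value = c("Title A", "Publisher X", "Title B", "Publisher X"),
  stringsAsFactors = FALSE
)

# Keep a single field across all items
titles <- metadata[metadata$field == "title", c("id", "value")]
titles

# Reshape to one row per item and one column per field
wide <- reshape(metadata, idvar = "id", timevar = "field", direction = "wide")
names(wide) <- sub("^value\\.", "", names(wide))
wide
```

The wide form only works cleanly when each item has at most one value per field; if a field repeats within an item, `reshape()` keeps the first value and warns, so the long format is the safer default.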

Client for the Internet Archive API

Description

This client permits you to search (ia_search), retrieve item metadata (ia_metadata) and associated files (ia_files), and download files (ia_download) in a pipeable interface.
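
The steps named above chain together naturally. The following is a minimal sketch of an end-to-end workflow using only the functions and arguments shown in this manual; `download_txt()` is a hypothetical wrapper, not part of the package, and it assumes that ia_files returns a data frame with a `type` column. Running it requires the internetarchive package and network access.

```r
# Hypothetical wrapper: search, fetch items, list files, keep the plain-text
# files, and download them. Not part of the internetarchive package.
download_txt <- function(query, dir = tempdir(), num_results = 3) {
  ids   <- internetarchive::ia_search(query, num_results = num_results)
  items <- internetarchive::ia_get_items(ids, silence = TRUE)
  files <- internetarchive::ia_files(items)
  # Assumes ia_files() returns a data frame with a type column
  wanted <- files[files$type == "txt", , drop = FALSE]
  internetarchive::ia_download(wanted, dir = dir, overwrite = FALSE,
                               silence = TRUE)
}

# Not run: needs the package and a network connection
# download_txt(c("publisher" = "american tract society"))
```

Keeping each step in its own variable makes it easy to inspect the intermediate results; the same steps can also be written as a pipe chain, as in the ia_download example.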