Title: | Download and Aggregate Data from Public Hire Bicycle Systems |
---|---|
Description: | Download and aggregate data from all public hire bicycle systems which provide open data, currently including 'Santander' Cycles in London, U.K.; from the U.S.A., 'Ford GoBike' in San Francisco CA, 'citibike' in New York City NY, 'Divvy' in Chicago IL, 'Capital Bikeshare' in Washington DC, 'Hubway' in Boston MA, 'Metro' in Los Angeles CA, 'Indego' in Philadelphia PA, and 'Nice Ride' in Minnesota; 'Bixi' from Montreal, Canada; and 'mibici' from Guadalajara, Mexico. |
Authors: | Mark Padgham [aut, cre] , Richard Ellison [aut], Tom Buckley [aut], Ryszard Szymański [ctb], Bea Hernández [rev] (Bea reviewed the package for ropensci, see https://github.com/ropensci/onboarding/issues/116), Elaine McVey [rev] (Elaine reviewed the package for ropensci, see https://github.com/ropensci/onboarding/issues/116), SQLite Consortium [ctb] (Authors of included SQLite code) |
Maintainer: | Mark Padgham <[email protected]> |
License: | GPL-3 |
Version: | 0.2.5.046 |
Built: | 2024-12-01 04:57:11 UTC |
Source: | https://github.com/ropensci/bikedata |
List of cities currently included in bikedata
bike_cities()
A data.frame of cities, abbreviations, and names of bike systems currently able to be accessed.
bike_cities ()
Extract daily trip counts for all stations
bike_daily_trips( bikedb, city, station, member, birth_year, gender, standardise = FALSE )
bikedb |
A string containing the path to the SQLite3 database.
If no directory specified, it is presumed to be in |
city |
City for which trips are to be counted - mandatory if database contains data for more than one city |
station |
Optional argument specifying bike station for which trips are to be counted |
member |
If given, extract only trips by registered members
( |
birth_year |
If given, extract only trips by registered members whose declared birth years equal or lie within the specified value or values. |
gender |
If given, extract only records for trips by registered
users declaring the specified genders ( |
standardise |
If TRUE, daily trip counts are standardised to the relative numbers of bike stations in operation for each day, so daily trip counts are increased during (generally early) periods with relatively fewer stations, and decreased during (generally later) periods with more stations. |
A data.frame containing daily dates and total numbers of trips.
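The effect of standardise = TRUE can be pictured with a small base-R sketch. It does not call bikedata and is not the package's implementation: the column names (numtrips, nstations) and the scaling rule (mean station count divided by the station count on each day) are assumptions chosen only to illustrate the idea that early days with fewer stations are scaled up and later days scaled down.

```r
# Hypothetical daily counts and numbers of stations in operation on each day
daily <- data.frame (
    date = as.Date ("2016-01-01") + 0:3,
    numtrips = c (100, 120, 300, 330),
    nstations = c (10, 10, 20, 20)
)

# Assumed rule (not taken from the package): scale each day by the ratio of
# the mean station count to the station count on that day
daily$standardised <- daily$numtrips * mean (daily$nstations) / daily$nstations
daily
```

Under this assumed rule the first two days, with half as many stations, are scaled up, and the last two days are scaled down.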
## Not run:
bike_write_test_data () # by default in tempdir ()
# dl_bikedata (city = "la", data_dir = data_dir) # or some real data!
store_bikedata (data_dir = tempdir (), bikedb = "testdb")
# create database indexes for quicker access:
index_bikedata_db (bikedb = "testdb")
bike_daily_trips (bikedb = "testdb", city = "ny")
bike_daily_trips (bikedb = "testdb", city = "ny", member = TRUE)
bike_daily_trips (bikedb = "testdb", city = "ny", gender = "f")
bike_daily_trips (bikedb = "testdb", city = "ny", station = "173", gender = 1)
bike_rm_test_data ()
bike_rm_db ("testdb")
# don't forget to remove real data!
# file.remove (list.files (".", pattern = ".zip"))
## End(Not run)
Extract date-time limits from trip database
bike_datelimits(bikedb, city)
bikedb |
A string containing the path to the SQLite3 database.
If no directory specified, it is presumed to be in |
city |
If given, date limits are calculated only for trips in that city. |
A vector of 2 elements giving the date-time of the first and last trips
## Not run:
data_dir <- tempdir ()
bike_write_test_data (data_dir = data_dir)
# dl_bikedata (city = 'la', data_dir = data_dir) # or some real data!
# Remove one London file that triggers an API call which may fail tests:
file.remove (file.path (tempdir(), "01aJourneyDataExtract10Jan16-23Jan16.csv"))
bikedb <- file.path (data_dir, 'testdb')
store_bikedata (data_dir = data_dir, bikedb = bikedb)
# create database indexes for quicker access:
index_bikedata_db (bikedb = bikedb)
bike_datelimits ('testdb') # overall limits for all cities
bike_datelimits ('testdb', city = 'NYC')
bike_datelimits ('testdb', city = 'los angeles')
bike_rm_test_data (data_dir = data_dir)
bike_rm_db (bikedb)
# don't forget to remove real data!
# file.remove (list.files ('.', pattern = '.zip'))
## End(Not run)
Count number of entries in sqlite3 database tables
bike_db_totals(bikedb, trips = TRUE, city)
bikedb |
A string containing the path to the SQLite3 database. |
trips |
If true, numbers of trips are counted; otherwise numbers of stations |
city |
Optional city for which numbers of trips are to be counted |
## Not run:
data_dir <- tempdir ()
bike_write_test_data (data_dir = data_dir)
bikedb <- file.path (data_dir, 'testdb')
# latest_lo_stns is set to FALSE just to avoid download on CRAN; this should
# normally remain at default value of TRUE:
store_bikedata (data_dir = data_dir, bikedb = bikedb, latest_lo_stns = FALSE)
# create database indexes for quicker access:
index_bikedata_db (bikedb = bikedb)
bike_db_totals (bikedb = bikedb, trips = TRUE) # total trips
bike_db_totals (bikedb = bikedb, trips = TRUE, city = 'ch')
bike_db_totals (bikedb = bikedb, trips = TRUE, city = 'ny')
bike_db_totals (bikedb = bikedb, trips = FALSE) # total stations
bike_db_totals (bikedb = bikedb, trips = FALSE, city = 'ch')
bike_db_totals (bikedb = bikedb, trips = FALSE, city = 'ny')
# numbers of stations can also be extracted with
nrow (bike_stations (bikedb = bikedb))
nrow (bike_stations (bikedb = bikedb, city = 'ch'))
bike_rm_test_data (data_dir = data_dir)
bike_rm_db (bikedb)
# don't forget to remove real data!
# file.remove (list.files ('.', pattern = '.zip'))
## End(Not run)
Static summary of which systems provide demographic data
bike_demographic_data()
A data.frame detailing the kinds of demographic data provided by the different systems.
bike_demographic_data ()
# Examples of filtering data by demographic parameters:
## Not run:
data_dir <- tempdir ()
bike_write_test_data (data_dir = data_dir)
bikedb <- file.path (data_dir, "testdb")
store_bikedata (data_dir = data_dir, bikedb = bikedb)
# create database indexes for quicker access:
index_bikedata_db (bikedb = bikedb)
sum (bike_tripmat (bikedb = bikedb, city = "bo")) # 200 trips
sum (bike_tripmat (bikedb = bikedb, city = "bo", birth_year = 1990)) # 9
sum (bike_tripmat (bikedb = bikedb, city = "bo", gender = "f")) # 22
sum (bike_tripmat (bikedb = bikedb, city = "bo", gender = 2)) # 22
sum (bike_tripmat (bikedb = bikedb, city = "bo", gender = 1)) # = m; 68
sum (bike_tripmat (bikedb = bikedb, city = "bo", gender = 0)) # = n; 9
# Sum of gender-filtered trips is less than total because gender = 0
# extracts only registered users with unspecified genders, while extraction
# without gender filtering includes all trips for registered and
# non-registered users.
# The following generates an error because Washington DC's Capital Bikeshare
# system does not provide demographic data
sum (bike_tripmat (bikedb = bikedb, city = "dc", birth_year = 1990))
bike_rm_test_data (data_dir = data_dir)
bike_rm_db (bikedb)
## End(Not run)
Extract station-to-station distance matrix
bike_distmat(bikedb, city, expand = 0.5, long = FALSE, quiet = TRUE)
bikedb |
A string containing the path to the SQLite3 database.
If no directory specified, it is presumed to be in |
city |
City for which the distance matrix is to be calculated |
expand |
Distances are calculated by routing through the OpenStreetMap street network surrounding the bike stations, with the street network expanded by this amount to ensure all stations can be connected. |
long |
If FALSE, a square distance matrix of (num-stations, num_stations) is returned; if TRUE, a long-format matrix of (stn-from, stn-to, distance) is returned. |
quiet |
If FALSE, progress is displayed on screen |
If long = FALSE, a square matrix of distances between each pair of stations; otherwise a long-form tibble with three columns of (start_station_id, end_station_id, distance).
Distance matrices returned from bike_distmat use all stations listed for a given system, while trip matrices extracted with bike_tripmat will often have fewer stations because operational station numbers commonly vary over time. The two matrices may be reconciled with the bike_match_matrices function, enabling them to be directly compared.
Check whether files in database are the latest published files
bike_latest_files(bikedb)
bikedb |
A string containing the path to the SQLite3 database.
If no directory specified, it is presumed to be in |
A named vector of logical values: TRUE if files in bikedb are the latest versions; otherwise FALSE, in which case store_bikedata could be run to update the database.
## Not run:
data_dir <- tempdir ()
bike_write_test_data (data_dir = data_dir)
# or download some real data!
# dl_bikedata (city = 'la', data_dir = data_dir)
# Remove one London file that triggers an API call which may fail tests:
file.remove (file.path (tempdir(), "01aJourneyDataExtract10Jan16-23Jan16.csv"))
bikedb <- file.path (data_dir, 'testdb')
store_bikedata (data_dir = data_dir, bikedb = bikedb)
# bike_latest_files (bikedb)
# All false because test data are not current, but would pass with real data
bike_rm_test_data (data_dir = data_dir)
bike_rm_db (bikedb)
# don't forget to remove real data!
# file.remove (list.files (data_dir, pattern = '.zip'))
## End(Not run)
Match rows and columns of distance and trip matrices
bike_match_matrices(mat1, mat2)
mat1 |
A wide- or long-form trip or distance matrix returned from
|
mat2 |
The corresponding distance or trip matrix. |
A list of the same matrices with matching start and end stations, and in the same order passed to the routine (that is, mat1 then mat2). Each kind of matrix will be identified and named accordingly as either "trip" or "dist". Matrices are returned in the same format (long or wide) as submitted.
Distance matrices returned from bike_distmat use all stations listed for a given system, while trip matrices extracted with bike_tripmat will often have fewer stations because operational station numbers commonly vary over time. This function reconciles the two matrices through matching all row and column names (or just station IDs for long-form matrices), enabling them to be directly compared.
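The reconciliation described above can be sketched in base R for the wide-format case. This is an illustration of the idea only, not the code used by bike_match_matrices; the station IDs and matrix values are made up, and sorting the common IDs is an arbitrary choice made here to obtain one shared order.

```r
# Hypothetical station IDs; the distance matrix lists one station ("s4")
# that never appears in the trip matrix
dist_ids <- c ("s1", "s2", "s3", "s4")
trip_ids <- c ("s3", "s1", "s2")
dmat <- matrix (seq_len (16), nrow = 4, dimnames = list (dist_ids, dist_ids))
tmat <- matrix (seq_len (9), nrow = 3, dimnames = list (trip_ids, trip_ids))

# Keep only stations present in both matrices, in one common order
common <- sort (intersect (rownames (dmat), rownames (tmat)))
matched <- list (
    dist = dmat [common, common],
    trip = tmat [common, common]
)
```

After matching, both matrices have identical row and column names, so element-wise comparisons such as trips against distance become meaningful.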
If no directory is specified in the bikedb argument passed to store_bikedata, the database is created in tempdir(). This function provides a convenient way to remove the database in such cases by simply passing the name.
bike_rm_db(bikedb)
bikedb |
The SQLite3 database containing the bikedata. |
TRUE if bikedb is successfully removed; otherwise FALSE.
## Not run:
data_dir <- tempdir ()
bike_write_test_data (data_dir = data_dir)
# or download some real data!
# dl_bikedata (city = "la", data_dir = data_dir)
bikedb <- file.path (data_dir, "testdb")
store_bikedata (data_dir = data_dir, bikedb = bikedb)
bike_rm_test_data (data_dir = data_dir)
bike_rm_db (bikedb)
# don't forget to remove real data!
# file.remove (list.files (data_dir, pattern = ".zip"))
## End(Not run)
The function bike_write_test_data() writes several small zip-compressed files to disk. The default location is tempdir(), in which case these files will be automatically removed on termination of the current R session. If, however, any other value for data_dir is passed to bike_write_test_data(), then the resultant files ought to be deleted by calling this function.
bike_rm_test_data(data_dir = tempdir())
data_dir |
Directory in which data were extracted. |
Number of files successfully removed, which should equal six.
## Not run:
bike_write_test_data ()
list.files (tempdir ())
bike_rm_test_data ()
bike_write_test_data (data_dir = getwd ())
list.files ()
bike_rm_test_data (data_dir = getwd ())
## End(Not run)
Extract station matrix from SQLite3 database
bike_stations(bikedb, city)
bikedb |
A string containing the path to the SQLite3 database.
If no directory specified, it is presumed to be in |
city |
Optional city (or vector of cities) for which stations are to be extracted |
Matrix containing data for each station
## Not run:
data_dir <- tempdir ()
bike_write_test_data (data_dir = data_dir)
# or download some real data!
# dl_bikedata (city = 'la', data_dir = data_dir)
bikedb <- file.path (data_dir, 'testdb')
store_bikedata (data_dir = data_dir, bikedb = bikedb)
# create database indexes for quicker access:
index_bikedata_db (bikedb = bikedb)
stations <- bike_stations (bikedb)
head (stations)
bike_rm_test_data (data_dir = data_dir)
bike_rm_db (bikedb)
# don't forget to remove real data!
# file.remove (list.files (data_dir, pattern = '.zip'))
## End(Not run)
Get names of files read into database
bike_stored_files(bikedb, city)
bikedb |
A string containing the path to the SQLite3 database. |
city |
Optional city for which filenames are to be obtained |
## Not run:
data_dir <- tempdir ()
bike_write_test_data (data_dir = data_dir)
bikedb <- file.path (data_dir, 'testdb')
store_bikedata (data_dir = data_dir, bikedb = bikedb)
files <- bike_stored_files (bikedb = bikedb)
# returns a tibble with names of all stored files
bike_rm_test_data (data_dir = data_dir)
bike_rm_db (bikedb)
# don't forget to remove real data!
# file.remove (list.files ('.', pattern = '.zip'))
## End(Not run)
Extract summary statistics of database
bike_summary_stats(bikedb)
bikedb |
A string containing the path to the SQLite3 database.
If no directory specified, it is presumed to be in |
A data.frame containing numbers of trips and stations along with times and dates of first and last trips for each city in the database, and a final column indicating whether the files match the latest published versions.
## Not run:
data_dir <- tempdir ()
bike_write_test_data (data_dir = data_dir)
# dl_bikedata (city = "la", data_dir = data_dir) # or some real data!
# Remove one London file that triggers an API call which may fail tests:
file.remove (file.path (tempdir(), "01aJourneyDataExtract10Jan16-23Jan16.csv"))
bikedb <- file.path (data_dir, "testdb")
store_bikedata (data_dir = data_dir, bikedb = bikedb)
# create database indexes for quicker access:
index_bikedata_db (bikedb = bikedb)
bike_summary_stats ("testdb")
bike_rm_test_data (data_dir = data_dir)
bike_rm_db (bikedb)
# don't forget to remove real data!
# file.remove (list.files (".", pattern = ".zip"))
## End(Not run)
A data set containing, for each of the six cities, a data.frame object of 200 trips.
bike_test_data
A list of one data frame for each of the five cities of (bo, dc, la, lo, ny), plus two more for Chicago stations and trips (ch_st, ch_tr). Each of these (except "ch_st") contains 200 representative trips.
These data are only used to convert to .zip-compressed files using bike_write_test_data(). These .zip files can be subsequently read into an SQLite3 database using store_bikedata.
Extract station-to-station trip matrix or data.frame from SQLite3 database
bike_tripmat( bikedb, city, start_date, end_date, start_time, end_time, weekday, member, birth_year, gender, standardise = FALSE, long = FALSE, quiet = FALSE )
bikedb |
A string containing the path to the SQLite3 database.
If no directory specified, it is presumed to be in |
city |
City for which tripmat is to be aggregated |
start_date |
If given (as year, month, day) , extract only those records from and including this date |
end_date |
If given (as year, month, day), extract only those records to and including this date |
start_time |
If given, extract only those records starting from and including this time of each day |
end_time |
If given, extract only those records ending at and including this time of each day |
weekday |
If given, extract only those records including the nominated weekdays. This can be a vector of numeric, starting with Sunday=1, or unambiguous characters, so "sa" and "tu" for Saturday and Tuesday. |
member |
If given, extract only trips by registered members
( |
birth_year |
If given, extract only trips by registered members whose declared birth years equal or lie within the specified value or values. |
gender |
If given, extract only records for trips by registered
users declaring the specified genders ( |
standardise |
If TRUE, numbers of trips are standardised to the operating durations of each station, so trip numbers are increased for stations that have only operated a short time, and vice versa. |
long |
If FALSE, a square tripmat of (num-stations, num_stations) is returned; if TRUE, a long-format matrix of (stn-from, stn-to, ntrips) is returned. |
quiet |
If FALSE, progress is displayed on screen |
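The weekday argument accepts either numbers (Sunday = 1) or unambiguous abbreviations. A rough base-R sketch of how such input might be resolved is given below; match_weekday is a hypothetical helper written for this illustration and is not part of bikedata, which may resolve weekdays differently.

```r
# Sketch only: resolve weekday arguments to numbers with Sunday = 1
match_weekday <- function (x) {
    days <- c ("sunday", "monday", "tuesday", "wednesday",
               "thursday", "friday", "saturday")
    if (is.numeric (x)) {
        stopifnot (all (x %in% 1:7))
        return (as.integer (x))
    }
    # pmatch() returns NA for abbreviations matching no day or several days
    res <- pmatch (tolower (x), days, duplicates.ok = TRUE)
    if (anyNA (res)) {
        stop ("weekday abbreviations must be unambiguous")
    }
    res
}

match_weekday (c ("f", "sa", "th")) # Friday, Saturday, Thursday
match_weekday (5)                   # Thursday
```

In this sketch a single "s" or "t" is rejected because it could mean two different days, which is why "sa" and "tu" are needed for Saturday and Tuesday.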
If long = FALSE, a square matrix of numbers of trips between each station, otherwise a long-form tibble with three columns of (start_station_id, end_station_id, numtrips).
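The relationship between the two return formats can be illustrated with base R alone. The sketch below uses a made-up 3-by-3 matrix, assumes that rows are start stations and columns are end stations, and produces a plain data.frame rather than the tibble returned by the package.

```r
ids <- c ("a", "b", "c")
# Hypothetical wide-format trip matrix (rows assumed to be start stations)
wide <- matrix (c (0, 2, 1,
                   3, 0, 0,
                   4, 5, 0),
                nrow = 3, byrow = TRUE, dimnames = list (ids, ids))

# as.table() followed by as.data.frame() gives one row per (row, column) pair
long <- as.data.frame (as.table (wide), stringsAsFactors = FALSE)
names (long) <- c ("start_station_id", "end_station_id", "numtrips")
# Drop station pairs with no trips
long <- long [long$numtrips > 0, ]
long
```

Both forms carry the same information; the long form is often more compact when most station pairs have no trips.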
The city parameter should be given for databases containing data from multiple cities, otherwise most of the resultant trip matrix is likely to be empty. Both dates and times may be given either in numeric or character format, with arbitrary (or no) delimiters between fields. Single numeric times are interpreted as hours, with 24 interpreted as day's end at 23:59:59.
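One way to picture this flexible date and time handling is the base-R sketch below. The helpers parse_date and parse_time are hypothetical and written only for this illustration; they assume dates in year, month, day order and zero-padded two-digit time fields, and the package's own parsing may differ.

```r
# Sketch only: strip every non-digit so "2016-12-01", "2016/12/01" and
# 20161201 all reduce to the same eight digits
parse_date <- function (x) {
    digits <- gsub ("[^0-9]", "", as.character (x))
    as.Date (digits, format = "%Y%m%d")
}

# Single numbers are read as hours; 24 is mapped to the last second of the day
parse_time <- function (x) {
    if (is.numeric (x)) {
        if (x == 24) return ("23:59:59")
        return (sprintf ("%02d:00:00", as.integer (x)))
    }
    digits <- gsub ("[^0-9]", "", x)
    # Pad missing minutes and seconds with zeros
    digits <- substr (paste0 (digits, "000000"), 1, 6)
    paste (substring (digits, c (1, 3, 5), c (2, 4, 6)), collapse = ":")
}

parse_date (20161201)
parse_time (1)
parse_time ("01:00")
```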
If standardise = TRUE, the trip matrix will have the same number of trips, but they will be re-distributed as described, with more recent stations having more trips than older stations. Trip numbers are also non-integer in this case, whereas they are always integer-valued for standardise = FALSE.
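The two properties stated above (unchanged total, non-integer cells) can be reproduced with a toy calculation. The weighting used here, dividing each cell by the shorter operating duration of its two stations and then rescaling, is an assumption made purely for illustration; the source does not state the formula that bikedata applies, and the station names and durations are invented.

```r
ids <- c ("old", "mid", "new")
# Hypothetical trip matrix and operating durations in days
tm <- matrix (c (0, 40, 10,
                 30, 0, 5,
                 10, 5, 0),
              nrow = 3, byrow = TRUE, dimnames = list (ids, ids))
days_open <- c (old = 400, mid = 200, new = 100)

# Assumed weighting: divide each cell by the shorter operating duration of
# the two stations involved ...
w <- 1 / outer (days_open, days_open, pmin)
tm_std <- tm * w
# ... then rescale so the total number of trips is unchanged
tm_std <- tm_std * sum (tm) / sum (tm_std)
round (tm_std, 2)
```

With these numbers, trips involving the "new" station are scaled up and trips between the longer-running stations are scaled down, while the overall total stays at 100.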
## Not run:
data_dir <- tempdir ()
bike_write_test_data (data_dir = data_dir)
# or download some real data!
# dl_bikedata (city = "la", data_dir = data_dir)
bikedb <- file.path (data_dir, "testdb")
store_bikedata (data_dir = data_dir, bikedb = bikedb)
# create database indexes for quicker access:
index_bikedata_db (bikedb = bikedb)
tm <- bike_tripmat (bikedb = bikedb, city = "ny") # full trip matrix
tm <- bike_tripmat (bikedb = bikedb, city = "ny",
                    start_date = 20161201, end_date = 20161201)
tm <- bike_tripmat (bikedb = bikedb, city = "ny", start_time = 1)
tm <- bike_tripmat (bikedb = bikedb, city = "ny", start_time = "01:00")
tm <- bike_tripmat (bikedb = bikedb, city = "ny", end_time = "01:00")
tm <- bike_tripmat (bikedb = bikedb, city = "ny",
                    start_date = 20161201, start_time = 1)
tm <- bike_tripmat (bikedb = bikedb, city = "ny",
                    start_date = 20161201, end_date = 20161201,
                    start_time = 1, end_time = 2)
tm <- bike_tripmat (bikedb = bikedb, city = "ny", weekday = 5)
tm <- bike_tripmat (bikedb = bikedb, city = "ny", weekday = c("f", "sa", "th"))
tm <- bike_tripmat (bikedb = bikedb, city = "ny", weekday = c("f", "th", "sa"))
tm <- bike_tripmat (bikedb = bikedb, city = "ny", member = 1)
tm <- bike_tripmat (bikedb = bikedb, city = "ny", birth_year = 1976)
tm <- bike_tripmat (bikedb = bikedb, city = "ny", birth_year = 1976:1990)
tm <- bike_tripmat (bikedb = bikedb, city = "ny", gender = "f")
tm <- bike_tripmat (bikedb = bikedb, city = "ny",
                    gender = "m", birth_year = 1976:1990)
bike_rm_test_data (data_dir = data_dir)
bike_rm_db (bikedb)
# don't forget to remove real data!
# file.remove (list.files (data_dir, pattern = ".zip"))
## End(Not run)
Writes very small test files to disk that can be used to test the package. The entire package works by reading zip-compressed data files provided by the various hire bicycle systems. This function generates some equivalent data that can be read into an SQLite database by the store_bikedata() function, so that all other package functionality can then be tested from the resultant database. This function is also used in the examples of all other functions.
bike_write_test_data(data_dir = tempdir())
data_dir |
Directory in which data are to be extracted. Defaults to
|
## Not run:
bike_write_test_data ()
list.files (tempdir ())
bike_rm_test_data ()
bike_write_test_data (data_dir = '.')
list.files ()
bike_rm_test_data (data_dir = '.')
## End(Not run)
Download data from all public bicycle hire systems which provide open data, currently including:
Santander Cycles | London, U.K. |
citibike | New York City NY, U.S.A. |
Divvy | Chicago IL, U.S.A. |
Capital BikeShare | Washington DC, U.S.A. |
Hubway | Boston MA, U.S.A. |
Metro | Los Angeles CA, U.S.A. |
dl_bikedata | Download data for particular cities and dates |
store_bikedata | Store data in SQLite3 database |
bike_test_data | Description of test data included with package |
bike_write_test_data | Write test data to disk in form precisely reflecting data provided by all systems |
bike_rm_test_data | Remove data written to disk with bike_write_test_data |
bike_daily_trips | Aggregate daily time series of total trips |
bike_stations | Extract table detailing locations and names of bicycle docking stations |
bike_tripmat | Extract aggregate counts of trips between all pairs of stations within a given city |
bike_summary_stats | Overall quantitative summary of database contents. All of the following functions provide individual aspects of this summary. |
bike_db_totals | Count total numbers of trips or stations, either for entire database or a specified city. |
bike_datelimits | Return dates of first and last trips, either for entire database or a specified city. |
bike_demographic_data | Simple table indicating which cities include demographic parameters with their data |
bike_latest_files | Check whether files contained in database are latest published versions |
Mark Padgham
Download data for subsequent storage via store_bikedata.
dl_bikedata(city, data_dir = tempdir(), dates = NULL, quiet = FALSE) download_bikedata(city, data_dir = tempdir(), dates = NULL, quiet = FALSE)
city |
City for which to download bike data, or name of corresponding bike system (see Details below). |
data_dir |
Directory to which to download the files |
dates |
Character vector of dates for which to download data, with dates formatted as YYYYMM. |
quiet |
If FALSE, progress is displayed on screen |
This function downloads (generally) zip-compressed data, by default into R's temporary directory. City names are not case sensitive, and need only be long enough to unambiguously designate the desired city. Names of corresponding bike systems can also be given. Currently possible cities (with minimal designations in parentheses) and names of bike hire systems are:
Boston (bo) | Hubway |
Chicago (ch) | Divvy Bikes |
Washington, D.C. (dc) | Capital Bike Share |
Los Angeles (la) | Metro Bike Share |
London (lo) | Santander Cycles |
Minnesota (mn) | NiceRide |
New York City (ny) | Citibike |
Philadelphia (ph) | Indego |
San Francisco Bay Area (sf) | Ford GoBike |
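The case-insensitive, shortest-unambiguous matching of city names can be sketched with base R as follows. The helper match_city is hypothetical and only mimics the behaviour described above using the cities in the table; it handles two-letter designations and leading substrings, and is not the matching routine used by dl_bikedata.

```r
# City names from the table above, keyed by their minimal designations
cities <- c (bo = "boston", ch = "chicago", dc = "washington dc",
             la = "los angeles", lo = "london", mn = "minnesota",
             ny = "new york city", ph = "philadelphia",
             sf = "san francisco bay area")

# Sketch only: accept a two-letter designation or an unambiguous prefix
match_city <- function (city) {
    city <- tolower (city)
    if (city %in% names (cities)) return (city)
    hit <- which (startsWith (cities, city))
    if (length (hit) != 1) {
        stop ("city name is not recognised or is ambiguous")
    }
    names (cities) [hit]
}

match_city ("Bos")
match_city ("LA")
```

A single letter such as "l" is rejected in this sketch because it could refer to either Los Angeles or London.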
Ensure you have a fast internet connection and at least 100 MB of free space.
Only files that don't already exist in data_dir
will be
downloaded, and this function may thus be used to update a directory of files
by downloading more recent files. If a particular file request fails,
downloading will continue regardless. To ensure all files are downloaded,
this function may need to be run several times until a message appears
declaring that 'All data files already exist'
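The "only fetch what is missing" behaviour described above can be illustrated with a small base R sketch. The helper `files_to_fetch()` and the file names are made up for illustration; this is not how bikedata itself determines which files to request.

```r
# Hypothetical helper (NOT a bikedata function): return the wanted file
# names that are not yet present in data_dir.
files_to_fetch <- function(wanted, data_dir) {
    setdiff(wanted, list.files(data_dir))
}

# Set up a throwaway directory containing one already "downloaded" file.
data_dir <- tempfile("bikedata-sketch")
dir.create(data_dir)
file.create(file.path(data_dir, "trips-201601.zip"))  # made-up file name

wanted <- c("trips-201601.zip", "trips-201602.zip")
missing_files <- files_to_fetch(wanted, data_dir)
missing_files

unlink(data_dir, recursive = TRUE)  # clean up
```

Repeating such a check after each run shows when nothing remains to be fetched, which corresponds to the 'All data files already exist' message.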
## Not run: 
dl_bikedata (city = 'New York City USA', dates = 201601:201612)
## End(Not run)
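Because dates are given as YYYYMM values, an integer range built with `:` that runs past December or crosses a year boundary also contains values that are not months (201613, 201614, and so on). The following base R sketch builds a character vector containing only valid months. The helper name `yyyymm_seq()` is an assumption made for illustration and is not part of bikedata.

```r
# Hypothetical helper (NOT a bikedata function): all months from `from` to
# `to` inclusive, both given as YYYYMM, returned as YYYYMM strings.
yyyymm_seq <- function(from, to) {
    start <- as.Date(paste0(from, "01"), format = "%Y%m%d")
    end <- as.Date(paste0(to, "01"), format = "%Y%m%d")
    format(seq(start, end, by = "month"), "%Y%m")
}

dates <- yyyymm_seq(201611, 201702)
dates

# For comparison: the plain integer range spans far more values,
# most of which are not valid months.
n_naive <- length(201611:201702)
```

The resulting vector could then be passed as the dates argument.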
Add indexes to database created with store_bikedata
index_bikedata_db(bikedb)
bikedb |
The SQLite3 database containing the bikedata. |
## Not run: 
data_dir <- tempdir ()
bike_write_test_data (data_dir = data_dir)
# or download some real data!
# dl_bikedata (city = "la", data_dir = data_dir)
bikedb <- file.path (data_dir, "testdb")
store_bikedata (data_dir = data_dir, bikedb = bikedb)
# create database indexes for quicker access:
index_bikedata_db (bikedb = bikedb)
trips <- bike_tripmat (bikedb = bikedb, city = "LA") # trip matrix
stations <- bike_stations (bikedb = bikedb) # station data
bike_rm_test_data (data_dir = data_dir)
bike_rm_db (bikedb)
# don't forget to remove real data!
# file.remove (list.files (data_dir, pattern = "\\.zip$", full.names = TRUE))
## End(Not run)
A data.frame of station id values, names, and geographic coordinates for 786 stations for London, U.K. These stations are generally (and by default) downloaded automatically to ensure they are always up to date, but such downloading can be disabled in the store_bikedata() function by setting latest_lo_stns = FALSE.
lo_stns
A data.frame of the four columns described above.
Store previously downloaded data (via the dl_bikedata function) in a database for subsequent extraction and analysis.
store_bikedata( bikedb, city, data_dir, dates = NULL, latest_lo_stns = TRUE, quiet = FALSE )
bikedb |
A string containing the path to the SQLite3 database to
use. If it doesn't already exist, it will be created, otherwise data
will be appended to existing database. If no directory specified,
it is presumed to be in |
city |
One or more cities for which to download and store bike data, or names of corresponding bike systems (see Details below). |
data_dir |
A character vector giving the directory containing the
data files downloaded with |
dates |
If specified and no |
latest_lo_stns |
If |
quiet |
If FALSE, progress is displayed on screen |
Number of trips added to database
City names are not case sensitive, and need only be long enough to unambiguously designate the desired city. Names of corresponding bike systems can also be given. Currently possible cities (with minimal designations in parentheses) and names of bike hire systems are:
Boston (bo) | Hubway |
Chicago (ch) | Divvy Bikes |
Washington, D.C. (dc) | Capital Bike Share |
Los Angeles (la) | Metro Bike Share |
London (lo) | Santander Cycles |
Minnesota (mn) | NiceRide |
New York City (ny) | Citibike |
Philadelphia (ph) | Indego |
San Francisco Bay Area (sf) | Ford GoBike |
Data for different cities may all be stored in the same database, with city identifiers automatically established from the names of downloaded data files. This function can take quite a long time to execute, and may generate an SQLite3 database file several gigabytes in size.
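Since the resulting database file may grow to several gigabytes, it can be worth checking its size on disk. A minimal base R sketch follows; the helper `db_size_gb()` is hypothetical (not a bikedata function), and the file written here is only a small placeholder standing in for a real database.

```r
# Hypothetical helper (NOT a bikedata function): size of a file in
# gigabytes (GiB), or NA if the file does not exist.
db_size_gb <- function(path) {
    if (!file.exists(path)) {
        return(NA_real_)
    }
    file.size(path) / 1024^3
}

# Placeholder file of 2048 bytes standing in for a real database
tmp <- tempfile(fileext = ".sqlite")
writeBin(raw(2048), tmp)
size_gb <- db_size_gb(tmp)
missing_gb <- db_size_gb(file.path(tempdir(), "no-such-db"))
unlink(tmp)  # clean up
```

With a real database, the bikedb path used for storage would be passed instead of the placeholder.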
## Not run: 
data_dir <- tempdir ()
bike_write_test_data (data_dir = data_dir)
# or download some real data!
# dl_bikedata (city = "la", data_dir = data_dir)
bikedb <- file.path (data_dir, "testdb")
store_bikedata (data_dir = data_dir, bikedb = bikedb)
# create database indexes for quicker access:
index_bikedata_db (bikedb = bikedb)
trips <- bike_tripmat (bikedb = bikedb, city = "LA") # trip matrix
stations <- bike_stations (bikedb = bikedb) # station data
bike_rm_test_data (data_dir = data_dir)
bike_rm_db (bikedb)
# don't forget to remove real data!
# file.remove (list.files (data_dir, pattern = "\\.zip$", full.names = TRUE))
## End(Not run)