| Title: | Work with Open Road Traffic Casualty Data from Great Britain |
|---|---|
| Description: | Work with and download road traffic casualty data from Great Britain. Enables access to the UK's official road safety statistics, 'STATS19'. Enables users to specify a download directory for the data, which can be set permanently by adding `STATS19_DOWNLOAD_DIRECTORY=/path/to/a/dir` to your `.Renviron` file, which can be opened with `usethis::edit_r_environ()`. The data is provided as a series of `.csv` files. This package downloads, reads-in and formats the data, making it suitable for analysis. See the stats19 vignette for details. Data available from 1979 to 2024. See the official data series at <https://www.data.gov.uk/dataset/cb7ae6f0-4be6-4935-9277-47e5ce24a11f/road-accidents-safety-data>. The package is described in a paper in the Journal of Open Source Software (Lovelace et al. 2019) <doi:10.21105/joss.01181>. See Gilardi et al. (2022) <doi:10.1111/rssa.12823>, Vidal-Tortosa et al. (2021) <doi:10.1016/j.jth.2021.101291>, Tait et al. (2023) <doi:10.1016/j.aap.2022.106895>, and León et al. (2025) <doi:10.18637/jss.v114.i09> for examples of how the data can be used for methodological and empirical research. |
| Authors: | Robin Lovelace [aut, cre] (ORCID: <https://orcid.org/0000-0001-5679-6536>), Malcolm Morgan [aut] (ORCID: <https://orcid.org/0000-0002-9488-9183>), Layik Hama [aut] (ORCID: <https://orcid.org/0000-0003-1912-4890>), Mark Padgham [aut] (ORCID: <https://orcid.org/0000-0003-2172-5265>), David Ranzolin [rev], Adam Sparks [rev, ctb] (ORCID: <https://orcid.org/0000-0002-0061-8359>), Ivo Wengraf [ctb], RAC Foundation [fnd], Blaise Kelly [aut] (ORCID: <https://orcid.org/0000-0003-2623-1598>) |
| Maintainer: | Robin Lovelace <[email protected]> |
| License: | GPL-3 |
| Version: | 4.0.0 |
| Built: | 2026-03-24 17:15:15 UTC |
| Source: | https://github.com/ropensci/stats19 |
Sample of stats19 data (2022 collisions)
A data frame
These were generated using the script in the
data-raw directory (misc.Rmd file).
nrow(accidents_sample_raw)
accidents_sample_raw
Sample of stats19 data (2022 casualties)
A data frame
These were generated using the script in the
data-raw directory (misc.Rmd file).
nrow(casualties_sample_raw)
casualties_sample_raw
Local helper to be reused.
check_input_file(filename = NULL, type = NULL, data_dir = NULL, year = NULL)
filename |
Character string of the filename of the .csv to read. |
type |
One of 'collision', 'casualty', 'vehicle'. |
data_dir |
Where sets of downloaded data would be found. |
year |
Single year for which data are to be read. |
This function cleans the make of the vehicle.
clean_make(make, extract_make = TRUE)
make |
A character vector of vehicle makes. Can be raw generic make/model strings if extract_make is TRUE. |
extract_make |
Logical, whether to extract the make from the input string using extract_make_stats19 first. Default is TRUE. |
clean_make(c("VW", "Mercedez"))
clean_make(c("FORD FIESTA", "LAND ROVER DISCOVERY"), extract_make = TRUE)
This function returns a combined cleaned make and model string.
It uses clean_make and clean_model to standardize both parts.
clean_make_model(generic_make_model)
generic_make_model |
A character vector of generic make/model strings |
clean_make_model(c("FORD FIESTA", "BMW 3 SERIES"))
This function cleans the model of the vehicle.
It extracts the make using extract_make_stats19 and removes it from the string,
returning the remaining text as the model in title case.
clean_model(model)
model |
A character vector of generic make/model strings |
clean_model(c("FORD FIESTA", "BMW 3 SERIES"))
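The make/model split described above (extract the make, remove it from the string, title-case the remainder) can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper `split_make_model()` and the short `known_multiword_makes` lookup are hypothetical, and the actual output of clean_make() and clean_model() may differ.

```r
# Illustrative subset of multi-word makes (hypothetical lookup, not the package's list)
known_multiword_makes = c("LAND ROVER", "ALFA ROMEO", "ASTON MARTIN")

# Hypothetical helper: split generic make/model strings into make and model
split_make_model = function(x) {
  x = toupper(trimws(x))
  # Default assumption: the make is the first word
  make = sub(" .*$", "", x)
  # Override the default where the string starts with a known multi-word make
  for (m in known_multiword_makes) {
    hit = startsWith(x, paste0(m, " ")) | x == m
    make[hit] = m
  }
  # The model is whatever remains after the make, converted to title case
  model = trimws(substring(x, nchar(make) + 1))
  model = gsub("\\b([a-z])", "\\U\\1", tolower(model), perl = TRUE)
  data.frame(make = make, model = model, stringsAsFactors = FALSE)
}

res = split_make_model(c("FORD FIESTA", "LAND ROVER DISCOVERY", "BMW 3 SERIES"))
res
```

Handling multi-word makes through an explicit lookup keeps "LAND ROVER DISCOVERY" from being split after the first word.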
Download STATS19 data for a year
dl_stats19(
  year = NULL,
  type = NULL,
  data_dir = get_data_directory(),
  file_name = NULL,
  ask = FALSE,
  silent = FALSE,
  timeout = 600
)
year |
Single year for which data are to be read |
type |
One of 'collision', 'casualty', 'vehicle'; defaults to 'collision'. |
data_dir |
Where sets of downloaded data would be found. |
file_name |
Character string of a specific STATS19 CSV filename to
download/read. If |
ask |
Should you be asked whether or not to download the files? |
silent |
Boolean. If |
timeout |
Timeout in seconds for the download if current option is less than this value. Defaults to 600 (10 minutes). |
if (curl::has_internet()) {
  # type by default is collisions table
  dl_stats19(year = 2022)
}
This function extracts the make from a generic make/model string, handling multi-word makes.
extract_make_stats19(generic_make_model)
generic_make_model |
A character vector of generic make/model strings |
extract_make_stats19(c("FORD FIESTA", "LAND ROVER DISCOVERY"))
URL-decoded file names. Currently there are 52 file names released by the DfT (Department for Transport). The details describe how these were obtained and how they are kept up to date.
A named list
These were generated using the script in the
data-raw directory (misc.Rmd file).
head(file_names)
Find file names within stats19::file_names.
find_file_name(years = NULL, type = NULL)
years |
Year for which data are to be found |
type |
One of 'collisions', 'casualty' or 'vehicles'; case is ignored. |
find_file_name(2016)
Format STATS19 casualties
format_casualties(x)
x |
Data frame created with |
This function formats raw STATS19 data
if(curl::has_internet()) {
  dl_stats19(year = 2022, type = "casualty")
  x = read_casualties(year = 2022)
  casualties = format_casualties(x)
}
Format STATS19 'collisions' data
format_collisions(x)
x |
Data frame created with |
This is a helper function to format raw STATS19 data
if(curl::has_internet()) {
  dl_stats19(year = 2022, type = "collision")
}
This function takes messy column names and returns clean ones that work well with
R by default. Names that are all lower case with no R-unfriendly characters
such as spaces and - are returned.
format_column_names(column_names)
column_names |
Column names to be cleaned |
Column names cleaned.
if(curl::has_internet()) {
  crashes_raw = read_collisions(year = 2022)
  column_names = names(crashes_raw)
  column_names
  format_column_names(column_names = column_names)
}
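To make the kind of cleaning described above concrete, the following base R sketch lower-cases names and replaces R-unfriendly characters such as spaces and `-`. It is an approximation under stated assumptions: `clean_names_sketch()` is a hypothetical helper, the input names are made up for illustration, and the exact rules applied by format_column_names() may differ.

```r
# Hypothetical helper approximating the cleaning of messy column names
clean_names_sketch = function(column_names) {
  x = tolower(column_names)
  # Replace runs of anything other than lower-case letters and digits with "_"
  x = gsub("[^a-z0-9]+", "_", x)
  # Drop leading and trailing underscores left over from the replacement
  gsub("^_+|_+$", "", x)
}

messy = c("Accident_Index", "Local_Authority_(District)", "1st Road-Class")
cleaned = clean_names_sketch(messy)
cleaned
```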
This function is a wrapper around the spatstat.geom::ppp() function and
it is used to transform STATS19 data into a ppp format.
format_ppp(data, window = NULL, ...)
data |
A STATS19 dataframe to be converted into ppp format. |
window |
A window of observation, an object of class |
... |
Additional parameters that should be passed to
|
A ppp object.
format_sf for an analogous function used to convert
data into sf format and spatstat.geom::ppp() for the original function.
if (requireNamespace("spatstat.geom", quietly = TRUE)) {
  x_ppp = format_ppp(accidents_sample)
  x_ppp
}
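The effect of restricting points to a rectangular window of observation can be illustrated without spatstat.geom. The sketch below is a base R approximation only: it assumes easting/northing columns named `location_easting_osgr` and `location_northing_osgr`, uses three made-up points, and borrows the Leeds bounding box from the get_stats19() examples. A real ppp object carries its window with it and is not limited to rectangles.

```r
# Made-up collision coordinates (British National Grid eastings and northings)
crashes = data.frame(
  location_easting_osgr = c(430000, 420000, 434000),
  location_northing_osgr = c(433000, 433000, 440000)
)

# Rectangular window around Leeds, as used in the get_stats19() examples
xrange = c(425046.1, 435046.1)
yrange = c(428577.2, 438577.2)

# Keep only the points that fall inside the window
inside = crashes$location_easting_osgr >= xrange[1] &
  crashes$location_easting_osgr <= xrange[2] &
  crashes$location_northing_osgr >= yrange[1] &
  crashes$location_northing_osgr <= yrange[2]
crashes_in_window = crashes[inside, ]
nrow(crashes_in_window)
```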
Convert STATS19 data into a spatial (sf) object
format_sf(x, lonlat = FALSE)
x |
Data frame created with |
lonlat |
Should the results be returned in longitude/latitude?
By default |
x_sf = format_sf(accidents_sample)
sf:::plot.sf(x_sf)
Format STATS19 vehicles data
format_vehicles(x)
x |
Data frame created with |
This function formats raw STATS19 data
if(curl::has_internet()) {
  dl_stats19(year = 2022, type = "vehicle", ask = FALSE)
  x = read_vehicles(year = 2022, format = FALSE)
  vehicles = format_vehicles(x)
}
Get data download dir
get_data_directory()
Download vehicle data from the DVSA MOT API using VRM.
get_MOT(vrm, apikey)
vrm |
A list of VRMs as character strings. |
apikey |
Your API key as a character string. |
This function takes a character vector of vehicle registrations (VRMs) and returns vehicle data from MOT records. It returns a data frame of those VRMs which were successfully used with the DVSA MOT API.
Information on the DVSA MOT API is available here: https://dvsa.github.io/mot-history-api-documentation/
The DVSA MOT API requires registration, so the function needs the API key provided by the DVSA. Be aware that the API has usage limits. The function will therefore limit lists with more than 150,000 VRMs.
vrm = c("1RAC","P1RAC")
apikey = Sys.getenv("MOTKEY")
if(nchar(apikey) > 0) {
  get_MOT(vrm = vrm, apikey = apikey)
}
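One way a caller might stay within a size limit like the one mentioned above is to split a long vector of VRMs into batches before querying. The sketch below is an assumption about caller-side usage, not a description of what get_MOT() does internally; `batch_vrms()` is a hypothetical helper and the VRMs are made up.

```r
# Hypothetical helper: split a vector of VRMs into batches of at most batch_size
batch_vrms = function(vrm, batch_size = 150000) {
  split(vrm, ceiling(seq_along(vrm) / batch_size))
}

# Made-up registrations and a small batch size to keep the demonstration short
vrm = sprintf("TEST%03d", 1:10)
batches = batch_vrms(vrm, batch_size = 4)
lengths(batches)
```

Each element of `batches` could then be passed to get_MOT() in turn, with the resulting data frames combined afterwards.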
Download, read and format STATS19 data in one function.
get_stats19(
  year = NULL,
  type = "collision",
  data_dir = get_data_directory(),
  file_name = NULL,
  format = TRUE,
  ask = FALSE,
  silent = FALSE,
  output_format = "tibble",
  engine = "readr",
  where = NULL,
  ...
)
year |
Single year for which data are to be read |
type |
One of 'collision', 'casualty', 'vehicle'; defaults to 'collision'. |
data_dir |
Where sets of downloaded data would be found. |
file_name |
Character string of a specific STATS19 CSV filename to
download/read. If |
format |
Switch to return raw read from file, default is |
ask |
Should you be asked whether or not to download the files? |
silent |
Boolean. If |
output_format |
A string that specifies the desired output format. The
default value is |
engine |
CSV reader backend. Defaults to |
where |
Optional SQL predicate appended to the |
... |
Other arguments to be passed to |
This function gets STATS19 data. Behind the scenes it uses
dl_stats19() and read_* functions, returning a
tibble (default), data.frame, sf or ppp object, depending on the
output_format parameter.
By default, stats19 downloads files to a temporary directory.
You can change this behavior to save the files in a permanent directory.
This is done by setting the STATS19_DOWNLOAD_DIRECTORY environment variable.
A convenient way to do this is by adding STATS19_DOWNLOAD_DIRECTORY=/path/to/a/dir
to your .Renviron file, which can be opened with usethis::edit_r_environ().
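The lookup described above, an environment variable with a temporary-directory fallback, can be sketched as follows. This is a minimal illustration, not the package's own code: `resolve_download_dir()` is a hypothetical stand-in for whatever get_data_directory() does, and the `stats19_data` folder name is made up.

```r
# Hypothetical stand-in: use the environment variable if set, else tempdir()
resolve_download_dir = function() {
  dir = Sys.getenv("STATS19_DOWNLOAD_DIRECTORY", unset = "")
  if (nzchar(dir)) dir else tempdir()
}

# Remember any existing setting so that it can be restored afterwards
old = Sys.getenv("STATS19_DOWNLOAD_DIRECTORY", unset = NA)

# Without the variable, the temporary directory is used
Sys.unsetenv("STATS19_DOWNLOAD_DIRECTORY")
dir_default = resolve_download_dir()

# With the variable set (here only for the current session), it takes precedence
Sys.setenv(STATS19_DOWNLOAD_DIRECTORY = file.path(tempdir(), "stats19_data"))
dir_custom = resolve_download_dir()

# Restore the original state
if (is.na(old)) {
  Sys.unsetenv("STATS19_DOWNLOAD_DIRECTORY")
} else {
  Sys.setenv(STATS19_DOWNLOAD_DIRECTORY = old)
}
```

Setting the variable with Sys.setenv() only lasts for the current R session, which is why the .Renviron route is the convenient option for a permanent directory.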
The function returns data for a specific year (e.g. year = 2022).
Note: for years before 2016, the function may return data from more years than are requested due to the nature of the files hosted at data.gov.uk.
As this function uses the dl_stats19() function, it can download many MB of data,
so ensure you have sufficient disk space.
If output_format is "data.frame", "sf" or "ppp", the output data is transformed into a data.frame, sf or ppp
object using the as.data.frame(), format_sf() or format_ppp()
functions respectively, as shown in the examples.
if(curl::has_internet()) {
  col = get_stats19(year = 2022, type = "collision")
  cas = get_stats19(year = 2022, type = "casualty")
  veh = get_stats19(year = 2022, type = "vehicle")
  class(col)
  # data.frame output
  x = get_stats19(2022, silent = TRUE, output_format = "data.frame")
  class(x)

  # Get 5-years worth of data (commented-out due to large response size):
  # col_5 = get_stats19(year = 5, type = "collision")
  # cas_5 = get_stats19(year = 5, type = "casualty")
  # veh_5 = get_stats19(year = 5, type = "vehicle")

  # Run tests only if endpoint is alive:
  if(nrow(x) > 0) {
    # use duckdb engine
    col_duck = get_stats19(year = 2022, type = "collision", engine = "duckdb")
    # use duckdb with where clause
    col_where = get_stats19(year = 2022, type = "collision", engine = "duckdb", where = "speed_limit = 30")
    # sf output
    x_sf = get_stats19(2022, silent = TRUE, output_format = "sf")
    # sf output with lonlat coordinates
    x_sf = get_stats19(2022, silent = TRUE, output_format = "sf", lonlat = TRUE)
    sf::st_crs(x_sf)

    if (requireNamespace("spatstat.geom", quietly = TRUE)) {
      # ppp output
      x_ppp = get_stats19(2022, silent = TRUE, output_format = "ppp")
      # We can use the window parameter of format_ppp function to filter only the
      # events occurred in a specific area. For example we can create a new bbox
      # of 5km around the city center of Leeds
      leeds_window = spatstat.geom::owin(
        xrange = c(425046.1, 435046.1),
        yrange = c(428577.2, 438577.2)
      )
      leeds_ppp = get_stats19(2022, silent = TRUE, output_format = "ppp", window = leeds_window)
      spatstat.geom::plot.ppp(leeds_ppp, use.marks = FALSE, clipwin = leeds_window)
    }
  }
}
See the DfT's documentation on adjustment factors: "Annex: Update to severity adjustments methodology".
get_stats19_adjustments(
  data_dir = get_data_directory(),
  u = paste0("https://data.dft.gov.uk/road-accidents-safety-data/",
    "dft-road-casualty-statistics-casualty-adjustment-lookup_",
    "2004-latest-published-year.csv")
)
data_dir |
Where sets of downloaded data would be found. |
u |
The URL of the adjustments file to download (a .csv file by default) |
See "Estimating and adjusting for changes in the method of severity reporting for road accidents and casualty data: final report" for details.
## Not run:
if(curl::has_internet()) {
  adjustment = get_stats19_adjustments()
}
## End(Not run)
Download DVLA-based vehicle data from the TfL API using VRM.
get_ULEZ(vrm)
vrm |
A list of VRMs as character strings. |
This function takes a character vector of vehicle registrations (VRMs) and returns DVLA-based vehicle data from TfL's API, including ULEZ eligibility. It returns a data frame of those VRMs which were successfully used with the TfL API. Vehicles are either compliant, non-compliant or exempt. ULEZ-exempt vehicles will not have all vehicle details returned; they will simply be marked "exempt".
Be aware that the API has usage limits. The function will therefore limit API calls to below 50 per minute - this is the maximum rate before an API key is required.
if(curl::has_internet()) {
  vrm = c("1RAC","P1RAC")
  get_ULEZ(vrm = vrm)
}
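Keeping calls below a per-minute rate, as described above, amounts to pausing between requests. The following sketch shows the general idea only and makes no claim about how get_ULEZ() implements it: `throttled_lapply()` is a hypothetical helper, and `nchar()` stands in for the real API request so that the example runs offline.

```r
# Hypothetical helper: apply fun to each element of x, pausing between calls
# so that no more than max_per_minute calls are made per minute
throttled_lapply = function(x, fun, max_per_minute = 49) {
  delay = 60 / max_per_minute  # minimum number of seconds between calls
  results = vector("list", length(x))
  for (i in seq_along(x)) {
    if (i > 1) Sys.sleep(delay)
    results[[i]] = fun(x[[i]])
  }
  results
}

# Demonstration with a stand-in function and a high rate so that it runs quickly
res = throttled_lapply(c("1RAC", "P1RAC"), nchar, max_per_minute = 6000)
res
```

With the default of 49 calls per minute the delay is about 1.2 seconds per request, so large vectors of VRMs will take correspondingly long to process.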
Convert file names to urls
get_url(
  file_name = "",
  domain = "https://data.dft.gov.uk",
  directory = "road-accidents-safety-data"
)
file_name |
Optional file name to add to the url returned (empty by default) |
domain |
The domain from where the data will be downloaded |
directory |
The subdirectory of the url |
# get_url(find_file_name(1985))
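The three arguments suggest that a URL is assembled from a domain, a subdirectory and an optional file name. The sketch below shows one plausible way to combine them. `build_url()` is a hypothetical helper, not the package's get_url(), and the file name is taken from the default `u` argument of get_stats19_adjustments().

```r
# Hypothetical helper: join domain, directory and file name with "/"
build_url = function(file_name = "",
                     domain = "https://data.dft.gov.uk",
                     directory = "road-accidents-safety-data") {
  paste(domain, directory, file_name, sep = "/")
}

adjustments_file = paste0(
  "dft-road-casualty-statistics-casualty-adjustment-lookup_",
  "2004-latest-published-year.csv"
)
url_file = build_url(adjustments_file)
url_base = build_url()
url_file
```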
Locate a file on disk
locate_files(
  data_dir = get_data_directory(),
  type = NULL,
  years = NULL,
  quiet = FALSE
)
data_dir |
Where sets of downloaded data would be found. |
type |
One of 'collision', 'casualty', 'vehicle'; defaults to 'collision'. |
years |
Single year or vector of years for which data are to be read. |
quiet |
Print out messages (files found) |
Pin down a file on disk from parameters.
locate_one_file(
  filename = NULL,
  data_dir = get_data_directory(),
  year = NULL,
  type = NULL
)
filename |
Character string of the filename of the .csv to read. |
data_dir |
Where sets of downloaded data would be found. |
year |
Single year for which data are to be read. |
type |
One of 'collision', 'casualty', 'vehicle'; defaults to 'collision'. |
locate_one_file()
Downloads and processes the UK Department for Transport TAG Data Book table RAS4001, and joins estimated collision costs to STATS19 data.
Three matching modes are available:
"severity" — cost varies only by collision severity
"severity_road" — cost varies by severity and road type
"severity_road_bua" — cost varies by severity & road type, where road
type is determined using ONS Built-Up Area (BUA) polygons (2022)
BUA polygons are downloaded automatically from: https://open-geography-portalx-ons.hub.arcgis.com/api/download/v1/items/ad30b234308f4b02b4bb9b0f4766f7bb/geoPackage?layers=0
Optionally, total costs may be summarised by severity and/or road type.
match_tag(
  crashes,
  shapes_url = paste0("https://open-geography-portalx-ons.hub.arcgis.com/api/download/v1/",
    "items/ad30b234308f4b02b4bb9b0f4766f7bb/geoPackage?layers=0"),
  costs_url = paste0("https://assets.publishing.service.gov.uk/media/",
    "68d421cc275fc9339a248c8e/ras4001.ods"),
  match_with = "severity",
  include_motorway_bua = FALSE,
  summarise = FALSE
)
crashes |
A STATS19 collision data frame. May be |
shapes_url |
URL to download the ONS Built-Up Areas geopackage. Defaults to the official ONS 2022 dataset. |
costs_url |
URL to download the TAG Data Book RAS4001 table (ODS format). Defaults to the Department for Transport asset link. |
match_with |
Character string specifying the matching mode. One of:
The default uses |
include_motorway_bua |
Logical; if |
summarise |
Logical; if |
The function:
Downloads and parses RAS4001 cost tables
Computes road type using STATS19 fields or optional BUA polygons
Joins appropriate cost estimates
Optionally aggregates totals
When crashes is an sf object and summarise = TRUE, geometry is
automatically dropped.
A data frame (or sf object if input is sf) with added columns for
estimated collision costs.
If summarise = TRUE, a summary table of total costs (in millions) is returned.
## Not run:
# Simple severity-based matching
match_tag(stats19_df, match_with = "severity")

# Severity + road type
match_tag(stats19_df, match_with = "severity_road")

# Using ONS Built-Up Areas, with motorway override
match_tag(
  stats19_df,
  match_with = "severity_road_bua",
  include_motorway_bua = TRUE
)

# Summarised totals
match_tag(stats19_df, match_with = "severity", summarise = TRUE)
## End(Not run)
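The "severity" matching mode and the optional summary can be pictured as a join followed by an aggregation. The base R sketch below is an illustration under stated assumptions, not the function's implementation: the column names `collision_severity` and `cost_per_collision` are assumed, and the cost values are placeholders that do NOT come from RAS4001.

```r
# Made-up collisions with an assumed severity column
crashes = data.frame(
  collision_index = c("a", "b", "c", "d"),
  collision_severity = c("Slight", "Serious", "Slight", "Fatal"),
  stringsAsFactors = FALSE
)

# Placeholder cost table: these values are NOT real RAS4001 figures
costs = data.frame(
  collision_severity = c("Fatal", "Serious", "Slight"),
  cost_per_collision = c(100, 10, 1),
  stringsAsFactors = FALSE
)

# Join a cost estimate to every collision by severity
crashes_costed = merge(crashes, costs, by = "collision_severity", all.x = TRUE)

# Optional summary: total cost by severity
totals = aggregate(cost_per_collision ~ collision_severity,
                   data = crashes_costed, FUN = sum)
totals
```

The road-type modes would add further join keys to the same pattern, with the road type derived from STATS19 fields or from the BUA polygons.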
Generate a phrase for data download purposes
phrase()
This dataset represents the 43 police forces in England and Wales. These are described on the Wikipedia page on UK police forces.
An sf data frame
The geographic boundary data were taken from the UK government's official geographic data portal. See http://geoportal.statistics.gov.uk/
These were generated using the script in the
data-raw directory (misc.Rmd file) in the package's GitHub repo:
github.com/ITSLeeds/stats19.
nrow(police_boundaries)
police_boundaries[police_boundaries$pfa16nm == "West Yorkshire", ]
sf:::plot.sf(police_boundaries)
Read in STATS19 road safety data from .csv files downloaded.
read_casualties(
  year = NULL,
  filename = "",
  data_dir = get_data_directory(),
  format = TRUE
)
year |
Single year for which data are to be read |
filename |
Character string of the filename of the .csv to read. If this is given, type and years determine whether there is a target to read; otherwise a disk scan would be needed. |
data_dir |
Where sets of downloaded data would be found. |
format |
Switch to return raw read from file, default is |
Read in STATS19 road safety data from .csv files downloaded.
read_collisions(
  year = NULL,
  filename = "",
  data_dir = get_data_directory(),
  format = TRUE,
  silent = FALSE
)
year |
Single year for which data are to be read |
filename |
Character string of the filename of the .csv to read. If this is given, type and years determine whether there is a target to read; otherwise a disk scan would be needed. |
data_dir |
Where sets of downloaded data would be found. |
format |
Switch to return raw read from file, default is |
silent |
Boolean. If |
This is a wrapper function to access and load STATS19 data in a user-friendly way. The function returns a data frame in which each record is a reported incident in the STATS19 data.
if(curl::has_internet()) {
  dl_stats19(year = 2024, type = "collision")
  ac = read_collisions(year = 2024)
}
Read in stats19 road safety data from .csv files downloaded.
read_vehicles(
  year = NULL,
  filename = "",
  data_dir = get_data_directory(),
  format = TRUE
)
year |
Single year for which data are to be read |
filename |
Character string of the filename of the .csv to read. If this is given, type and years determine whether there is a target to read; otherwise a disk scan would be needed. |
data_dir |
Where sets of downloaded data would be found. |
format |
Switch to return raw read from file, default is |
Interactively select from options
select_file(fnames)
fnames |
Character vector of filenames to select from. |
Set data download dir
set_data_directory(data_path)
data_path |
Valid existing path in which to save downloaded files. |
stats19_schema and stats19_variables contain
metadata on stats19 data.
stats19_schema is a look-up table matching
codes provided in the raw stats19 dataset with
character strings.
The schema data can be (re-)generated using the script in the
data-raw directory.
Sample of stats19 data (2022 vehicles)
A data frame
These were generated using the script in the
data-raw directory (misc.Rmd file).
nrow(vehicles_sample_raw)
vehicles_sample_raw