---
title: "Finding data"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Finding data}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
``` r
library(rnaturalearth)
library(sf)
#> Linking to GEOS 3.12.1, GDAL 3.8.5, PROJ 9.4.0; sf_use_s2() is TRUE
```
## Available data
A large amount of data can be downloaded from [Natural Earth](https://www.naturalearthdata.com/) with `ne_download()`. The vector data are divided into two main categories: _physical_ and _cultural_. The `df_layers_physical` and `df_layers_cultural` data frames included in the `rnaturalearth` package list the layers that can be downloaded and the scales at which each one is available (1 = available, 0 = not available).
### Physical vector data
``` r
data(df_layers_physical)
knitr::kable(
  df_layers_physical,
  caption = "physical vector data available via ne_download()"
)
```
Table: physical vector data available via ne_download()
|layer | scale10| scale50| scale110|
|:----------------------------------|-------:|-------:|--------:|
|antarctic_ice_shelves_lines | 1| 1| 0|
|antarctic_ice_shelves_polys | 1| 1| 0|
|coastline | 1| 1| 1|
|geographic_lines | 1| 1| 1|
|geography_marine_polys | 1| 1| 1|
|geography_regions_elevation_points | 1| 1| 1|
|geography_regions_points | 1| 1| 1|
|geography_regions_polys | 1| 1| 1|
|glaciated_areas | 1| 1| 1|
|lakes | 1| 1| 1|
|lakes_europe | 1| 0| 0|
|lakes_historic | 1| 1| 0|
|lakes_north_america | 1| 0| 0|
|lakes_pluvial | 1| 0| 0|
|land | 1| 1| 1|
|land_ocean_label_points | 1| 0| 0|
|land_ocean_seams | 1| 0| 0|
|land_scale_rank | 1| 0| 0|
|minor_islands | 1| 0| 0|
|minor_islands_coastline | 1| 0| 0|
|minor_islands_label_points | 1| 0| 0|
|ocean | 1| 1| 1|
|ocean_scale_rank | 1| 0| 0|
|playas | 1| 1| 0|
|reefs | 1| 0| 0|
|rivers_europe | 1| 0| 0|
|rivers_lake_centerlines | 1| 1| 1|
|rivers_lake_centerlines_scale_rank | 1| 1| 0|
|rivers_north_america | 1| 0| 0|
The table shows that the `ocean` layer is available at the small scale (110). A scale can be given either as a number (`110`, `50`, `10`) or as the matching name (`small`, `medium`, `large`).
``` r
plot(
  ne_download(type = "ocean", category = "physical", scale = "small")[
    "geometry"
  ],
  col = "lightblue"
)
#> Reading 'ne_110m_ocean.zip' from naturalearth...
```
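
The availability table can also be queried directly rather than read by eye. The following is a minimal sketch, assuming a data frame shaped like `df_layers_physical` (a `layer` column plus `scale10`, `scale50` and `scale110` indicator columns). The helper `layers_at_scale()` is hypothetical and not part of `rnaturalearth`; the small `layers` data frame copies four rows from the table above so that the example runs without the package.

``` r
# Excerpt of the physical layers table shown above (four rows only).
layers <- data.frame(
  layer = c("coastline", "lakes_europe", "ocean", "playas"),
  scale10 = c(1, 1, 1, 1),
  scale50 = c(1, 0, 1, 1),
  scale110 = c(1, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: names of the layers flagged as available (1)
# at the given numeric scale (10, 50 or 110).
layers_at_scale <- function(layers, scale) {
  column <- paste0("scale", scale)
  stopifnot(column %in% names(layers))
  layers$layer[layers[[column]] == 1]
}

layers_at_scale(layers, 110)
#> [1] "coastline" "ocean"
```

With the package loaded, the same call would be `layers_at_scale(df_layers_physical, 110)`.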

### Cultural vector data
``` r
data(df_layers_cultural)
knitr::kable(
  df_layers_cultural,
  caption = "cultural vector data available via ne_download()"
)
```
Table: cultural vector data available via ne_download()
|layer | scale10| scale50| scale110|
|:-----------------------------------------------|-------:|-------:|--------:|
|admin_0_antarctic_claim_limit_lines | 1| 0| 0|
|admin_0_antarctic_claims | 1| 0| 0|
|admin_0_boundary_lines_disputed_areas | 1| 1| 0|
|admin_0_boundary_lines_land | 1| 1| 1|
|admin_0_boundary_lines_map_units | 1| 0| 0|
|admin_0_boundary_lines_maritime_indicator | 1| 1| 0|
|admin_0_boundary_map_units | 0| 1| 0|
|admin_0_breakaway_disputed_areas | 0| 1| 0|
|admin_0_countries | 1| 1| 1|
|admin_0_countries_lakes | 1| 1| 1|
|admin_0_disputed_areas | 1| 0| 0|
|admin_0_disputed_areas_scale_rank_minor_islands | 1| 0| 0|
|admin_0_label_points | 1| 0| 0|
|admin_0_map_subunits | 1| 1| 0|
|admin_0_map_units | 1| 1| 1|
|admin_0_pacific_groupings | 1| 1| 1|
|admin_0_scale_rank | 1| 1| 1|
|admin_0_scale_rank_minor_islands | 1| 0| 0|
|admin_0_seams | 1| 0| 0|
|admin_0_sovereignty | 1| 1| 1|
|admin_0_tiny_countries | 0| 1| 1|
|admin_0_tiny_countries_scale_rank | 0| 1| 0|
|admin_1_label_points | 1| 0| 0|
|admin_1_seams | 1| 0| 0|
|admin_1_states_provinces | 1| 1| 1|
|admin_1_states_provinces_lakes | 1| 1| 1|
|admin_1_states_provinces_lines | 1| 1| 1|
|admin_1_states_provinces_scale_rank | 1| 1| 1|
|airports | 1| 1| 0|
|parks_and_protected_lands_area | 1| 0| 0|
|parks_and_protected_lands_line | 1| 0| 0|
|parks_and_protected_lands_point | 1| 0| 0|
|parks_and_protected_lands_scale_rank | 1| 0| 0|
|populated_places | 1| 1| 1|
|populated_places_simple | 1| 1| 1|
|ports | 1| 1| 0|
|railroads | 1| 0| 0|
|railroads_north_america | 1| 0| 0|
|roads | 1| 0| 0|
|roads_north_america | 1| 0| 0|
|time_zones | 1| 0| 0|
|urban_areas | 1| 1| 0|
|urban_areas_landscan | 1| 0| 0|
``` r
plot(
  ne_download(
    type = "airports",
    category = "cultural",
    scale = 10L
  )["geometry"],
  pch = 21L,
  bg = "grey"
)
#> Reading 'ne_10m_airports.zip' from naturalearth...
```
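
Because a layer is not offered at every scale, it can help to check the table before calling `ne_download()`. The sketch below assumes the correspondence `small` = 110, `medium` = 50, `large` = 10. The helpers `normalise_scale()` and `is_layer_available()` are hypothetical, not `rnaturalearth` functions, and the `cultural` data frame copies three rows from the table above.

``` r
# Excerpt of the cultural layers table shown above (three rows only).
cultural <- data.frame(
  layer = c("airports", "populated_places", "roads"),
  scale10 = c(1, 1, 1),
  scale50 = c(1, 1, 0),
  scale110 = c(0, 1, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn "small"/"medium"/"large" or 110/50/10
# into the numeric scale.
normalise_scale <- function(scale) {
  named <- c(small = 110, medium = 50, large = 10)
  if (is.character(scale)) {
    stopifnot(scale %in% names(named))
    return(unname(named[scale]))
  }
  stopifnot(scale %in% named)
  scale
}

# Hypothetical helper: TRUE when the layer is flagged 1 at that scale.
is_layer_available <- function(layers, layer, scale) {
  column <- paste0("scale", normalise_scale(scale))
  row <- layers[layers$layer == layer, , drop = FALSE]
  nrow(row) == 1 && row[[column]] == 1
}

is_layer_available(cultural, "airports", "large")
#> [1] TRUE
is_layer_available(cultural, "airports", "small")
#> [1] FALSE
```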

## Searching for countries and continents
This section shows how to search the country data available through `rnaturalearth`. We begin by reading the Natural Earth sovereignty dataset with the `read_sf()` function from the `sf` package. The `/vsizip/vsicurl/` prefix tells GDAL to read the zipped shapefile directly from the URL, without a separate download and unzip step.
``` r
df <- read_sf(
  "/vsizip/vsicurl/https://naciscdn.org/naturalearth/10m/cultural/ne_10m_admin_0_sovereignty.zip"
)
head(df)
#> Simple feature collection with 6 features and 168 fields
#> Geometry type: MULTIPOLYGON
#> Dimension: XY
#> Bounding box: xmin: -109.4537 ymin: -55.9185 xmax: 140.9776 ymax: 7.35578
#> Geodetic CRS: WGS 84
#> # A tibble: 6 × 169
#> featurecla scalerank LABELRANK SOVEREIGNT SOV_A3 ADM0_DIF LEVEL TYPE TLC ADMIN ADM0_A3
#>
#> 1 Admin-0 sover… 5 2 Indonesia IDN 0 2 Sove… 1 Indo… IDN
#> 2 Admin-0 sover… 5 3 Malaysia MYS 0 2 Sove… 1 Mala… MYS
#> 3 Admin-0 sover… 0 2 Chile CHL 0 2 Sove… 1 Chile CHL
#> 4 Admin-0 sover… 0 3 Bolivia BOL 0 2 Sove… 1 Boli… BOL
#> 5 Admin-0 sover… 0 2 Peru PER 0 2 Sove… 1 Peru PER
#> 6 Admin-0 sover… 0 2 Argentina ARG 0 2 Sove… 1 Arge… ARG
#> # ℹ 158 more variables: GEOU_DIF , GEOUNIT , GU_A3 , SU_DIF ,
#> # SUBUNIT , SU_A3 , BRK_DIFF , NAME , NAME_LONG , BRK_A3 ,
#> # BRK_NAME , BRK_GROUP , ABBREV , POSTAL , FORMAL_EN ,
#> # FORMAL_FR , NAME_CIAWF , NOTE_ADM0 , NOTE_BRK , NAME_SORT ,
#> # NAME_ALT , MAPCOLOR7 , MAPCOLOR8 , MAPCOLOR9 , MAPCOLOR13 ,
#> # POP_EST , POP_RANK , POP_YEAR , GDP_MD , GDP_YEAR ,
#> # ECONOMY , INCOME_GRP , FIPS_10 , ISO_A2 , ISO_A2_EH ,
#> # ISO_A3 , ISO_A3_EH , ISO_N3 , ISO_N3_EH , UN_A3 , WB_A2 ,
#> # WB_A3 , WOE_ID , WOE_ID_EH , WOE_NOTE , ADM0_ISO ,
#> # ADM0_DIFF , ADM0_TLC , ADM0_A3_US , ADM0_A3_FR , ADM0_A3_RU ,
#> # ADM0_A3_ES , ADM0_A3_CN , ADM0_A3_TW , ADM0_A3_IN , ADM0_A3_NP ,
#> # ADM0_A3_PK , ADM0_A3_DE , ADM0_A3_GB , ADM0_A3_BR , ADM0_A3_IL ,
#> # ADM0_A3_PS , ADM0_A3_SA , ADM0_A3_EG , ADM0_A3_MA , ADM0_A3_PT ,
#> # ADM0_A3_AR , ADM0_A3_JP , ADM0_A3_KO , ADM0_A3_VN , ADM0_A3_TR ,
#> # ADM0_A3_ID , ADM0_A3_PL , ADM0_A3_GR , ADM0_A3_IT , ADM0_A3_NL ,
#> # ADM0_A3_SE , ADM0_A3_BD , ADM0_A3_UA , ADM0_A3_UN , ADM0_A3_WB ,
#> # CONTINENT , REGION_UN , SUBREGION , REGION_WB , NAME_LEN ,
#> # LONG_LEN , ABBREV_LEN , TINY , HOMEPART , MIN_ZOOM ,
#> # MIN_LABEL , MAX_LABEL , LABEL_X , LABEL_Y , NE_ID ,
#> # WIKIDATAID , NAME_AR , NAME_BN , NAME_DE , …
```
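
The path passed to `read_sf()` follows a visible pattern: a scale directory, a category directory, then a file named `ne_<scale>m_<layer>.zip`. As a sketch only, the hypothetical helper `ne_vsi_path()` below rebuilds that string. It assumes other layers are hosted under the same layout, which is inferred from the single URL above and from the file names printed by `ne_download()`, not from documented behaviour.

``` r
# Hypothetical helper (not part of rnaturalearth or sf): build a GDAL
# virtual file system path for a Natural Earth layer.
ne_vsi_path <- function(layer, category, scale) {
  scale <- as.integer(scale)
  sprintf(
    "/vsizip/vsicurl/https://naciscdn.org/naturalearth/%dm/%s/ne_%dm_%s.zip",
    scale, category, scale, layer
  )
}

ne_vsi_path("admin_0_sovereignty", "cultural", 10)
#> [1] "/vsizip/vsicurl/https://naciscdn.org/naturalearth/10m/cultural/ne_10m_admin_0_sovereignty.zip"
```

The result reproduces the string used in the `read_sf()` call above; whether a path built for another layer resolves depends on that file actually existing on the server.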
### Finding countries
One way to find a country is to search the `ADMIN` column. Let's start by plotting the first six countries.
``` r
lapply(
  df$ADMIN[1L:6L],
  \(x) plot(ne_countries(country = x)["geometry"], main = x)
)
```





Suppose we want the polygons for the United States. Which spelling should we pass to `ne_countries()`?
``` r
ne_countries(country = "USA")
ne_countries(country = "United States")
ne_countries(country = "United States Of America")
ne_countries(country = "United States of America")
```
One option is to search the `ADMIN` column with a case-insensitive regular expression that finds every name containing the word _states_.
``` r
grep("states", df$ADMIN, ignore.case = TRUE, value = TRUE)
#> [1] "United States of America" "Federated States of Micronesia"
```
With the exact name in hand, we can now get the data.
``` r
plot(ne_countries(country = "United States of America")["geometry"])
```
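
The spelling check can also be scripted. The sketch below tests which candidate spellings occur verbatim in a vector of names. Here `admin_names` is a short stand-in for `df$ADMIN`, built from names that appear in the output above. The comparison with `%in%` only illustrates exact matching; it is not a description of how `ne_countries()` matches names internally.

``` r
# Stand-in for df$ADMIN, using names that appear in the output above.
admin_names <- c(
  "Chile", "Peru",
  "United States of America", "Federated States of Micronesia"
)

# The spellings tried earlier.
candidates <- c(
  "USA", "United States",
  "United States Of America", "United States of America"
)

# Exact, case-sensitive comparison: only one spelling is present verbatim.
candidates[candidates %in% admin_names]
#> [1] "United States of America"

# Case-insensitive partial search, as with grep() above.
grep("united states", admin_names, ignore.case = TRUE, value = TRUE)
#> [1] "United States of America"
```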

### Continents
Finally, let's plot each continent by passing its name to the `continent` argument of `ne_countries()`. The valid continent names can be read from the `CONTINENT` column.
``` r
unique(df$CONTINENT)
#> [1] "Asia" "South America" "Europe"
#> [4] "Africa" "North America" "Oceania"
#> [7] "Antarctica" "Seven seas (open ocean)"
```
``` r
lapply(
  unique(df$CONTINENT),
  \(x)
    plot(
      ne_countries(
        continent = x,
        scale = "medium"
      )["geometry"],
      main = x
    )
)
```
```
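
To see which countries fall under each continent name, the `ADMIN` column can be grouped by `CONTINENT`. This is a minimal sketch with base R's `split()`, using a six-row stand-in for `df`. The country names are the first six shown earlier; the continent values are filled in here as an assumption, since that column is not printed in the output above.

``` r
# Stand-in for the first six rows of df (continent values assumed).
countries <- data.frame(
  ADMIN = c("Indonesia", "Malaysia", "Chile", "Bolivia", "Peru", "Argentina"),
  CONTINENT = c(
    "Asia", "Asia",
    "South America", "South America", "South America", "South America"
  ),
  stringsAsFactors = FALSE
)

# One character vector of country names per continent.
by_continent <- split(countries$ADMIN, countries$CONTINENT)

by_continent[["Asia"]]
#> [1] "Indonesia" "Malaysia"

# Number of countries listed under each continent name.
lengths(by_continent)
```

With the real data, the same grouping would be `split(df$ADMIN, df$CONTINENT)`.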






