--- title: "Choosing a *clifro* Datatype" author: "Blake Seers" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Choosing a *clifro* Datatype} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- # Introduction The `cf_datatype` function is all that is required to select `clifro` datatypes. This function can be called without any arguments that takes the user through interactive menus, otherwise the datatypes may be chosen programmatically if the menu options are known in advance. Whether the intention is to choose one datatype or many, this vignette details the various methods in choosing them. # Using the menus interactively to choose a datatype Those familiar with the cliflo [datatype selection menu](https://cliflo.niwa.co.nz/pls/niwp/wgenf.choose_datatype?cat=cat1) will recall the myriad datatypes and options available in the National Climate Database. Selection of a datatype requires navigation through trees of menus, check boxes and combo boxes. The `cf_datatype` function mimics this (tedious) behaviour by default, i.e. when no arguments are passed to the function and the datatypes, menus and options are all identical to (actually scraped from) the datatype selection menu. ## A minimal example Let's say the datatype we are interested in is 9am surface wind in knots. ```{r, echo=FALSE} library(clifro) library(pander) surfaceWind.dt = new("cfDatatype" , dt_name = "Wind" , dt_type = "Surface wind" , dt_sel_option_names = list("9amWind") , dt_sel_combo_name = "knots" , dt_param = structure("ls_sfw,1,2,3,4,5", .Names = "dt1") , dt_sel_option_params = list(structure(c("132", "knots"), .Names = c("prm4", "prm5"))) , dt_selected_options = list(c(4, 5)) , dt_option_length = 5 ) menu.opts = function(title, options){ cat(paste(title, "", paste(seq_along(options), options, sep = ": ", collapse = "\n"), sep = "\n")) } ``` ```{r, eval=FALSE} surfaceWind.dt = cf_datatype() # If you prefer pointing and clicking - turn the graphics option on: surfaceWind.dt = cf_datatype(graphics = TRUE) ``` ### Daily and Hourly Observations ```{r, echo=FALSE} menu.opts("Daily and Hourly Observations", c("Combined Observations", "Wind", "Precipitation", "Temperature and Humidity", "Sunshine and Radiation", "Weather", "Pressure", "Clouds", "Evaporation / soil moisture")) ``` The first menu that appears when the above line of code is run in R is the 'Daily and Hourly Observations'. We are interested in 'Wind', therefore we would type in the number of our selection (or select it using the mouse if `graphics = TRUE`), in this case option **2**. ### Submenu for the given datatype ```{r, echo=FALSE} menu.opts("Wind", c("Surface wind", "Max Gust")) ``` The next menu prompts us for the type of wind we are interested in, in this case we are interested in surface wind which is option **1**. ### Options for the given datatype ```{r, echo=FALSE} menu.opts("Surface wind options", c("WindRun", "HlyWind", "3HlyWind", "9amWind") ) ``` The next menu is the options for the chosen datatype, for which we may choose more than one. If more than one option for a given datatype is sought, options must be chosen one at a time. This is made possible by a menu prompting whether or not we would like to select another datatype every time an option is chosen. ```{r, echo=FALSE} menu.opts("Choose another option?", c("yes", "no")) ``` We are interested only in the surface wind at 9am in this example therefore we don't choose another option after we choose option **4**. ### Final options ```{r, echo=FALSE} menu.opts("Units", c("m/s", "km/hr", "knots")) ``` This final options menu is typically associated with the units of the datatype (although not always) and is sometimes not necessary, depending on the datatype. For this example we do have a final menu and it prompts us for the units that we are interested in where we choose option **3**. The surface wind datatype and the associated options are now saved in R as an object called `surfaceWind.dt`. ```{r} surfaceWind.dt ``` # Choosing a datatype without the menus The bold numbers in the minimal example above are emphasised specifically to show the menu order and selections needed to choose the strength of the 9am surface wind in knots datatype, i.e. **2** $\rightarrow$ **1** $\rightarrow$ **4** $\rightarrow$ **3**. In general, if we know the selections needed for each of the four menus then we can choose any datatype without using the menus making datatype selection a lot faster and a much less tedious. ## A minimal example To repeat our minimal example without the use of the menus we would just pass them as arguments to the `cf_datatype` function. These arguments are the selections of each of the four menus (in order) separated by a comma. ```{r, eval = FALSE} surfaceWind.dt = cf_datatype(2, 1, 4, 3) surfaceWind.dt ``` ```{r, echo = FALSE} surfaceWind.dt ``` ## Selecting more than one option for a given datatype Recall that we may choose more than one option at the third menu, equivalent to the check boxes on the cliflo [database query form](https://cliflo.niwa.co.nz/pls/niwp/wgenf.genform1). Using the menu to choose more than one option is an iterative process however we can just update our third function argument to deal with this in a more time-efficient manner. ```{r, echo = FALSE} surfaceWind.dt = new("cfDatatype" , dt_name = "Wind" , dt_type = "Surface wind" , dt_sel_option_names = list(c("HlyWind", "9amWind")) , dt_sel_combo_name = "knots" , dt_param = structure("ls_sfw,1,2,3,4,5", .Names = "dt1") , dt_sel_option_params = list(structure(c("134", "132", "knots"), .Names = c("prm2", "prm4", "prm5"))) , dt_selected_options = list(c(2, 4, 5)) , dt_option_length = 5 ) rainfall.dt = new("cfDatatype" , dt_name = "Precipitation" , dt_type = "Rain (fixed periods)" , dt_sel_option_names = list(c("Daily ", "Hourly")) , dt_sel_combo_name = NA_character_ , dt_param = structure("ls_ra,1,2,3,4", .Names = "dt1") , dt_sel_option_params = list(structure(c("181", "182"), .Names = c("prm1", "prm2"))) , dt_selected_options = list(c(1, 2)) , dt_option_length = 4 ) lightning.dt = new("cfDatatype" , dt_name = "Weather" , dt_type = "Lightning" , dt_sel_option_names = list("Ltng") , dt_sel_combo_name = NA_character_ , dt_param = structure("ls_light,1", .Names = "dt1") , dt_sel_option_params = list(structure("271", .Names = "prm1")) , dt_selected_options = list(1) , dt_option_length = 1 ) temperatureExtremes.dt = new("cfDatatype" , dt_name = "Temperature and Humidity" , dt_type = "Max_min_temp" , dt_sel_option_names = list(c("DlyGrass", "HlyGrass")) , dt_sel_combo_name = NA_character_ , dt_param = structure("ls_mxmn,1,2,3,4,5,6", .Names = "dt1") , dt_sel_option_params = list(structure(c("202", "204"), .Names = c("prm5", "prm6"))) , dt_selected_options = list(c(5, 6)) , dt_option_length = 6 ) ``` ```{r, eval = FALSE} surfaceWind.dt = cf_datatype(2, 1, c(2, 4), 3) surfaceWind.dt ``` ```{r, echo = FALSE} surfaceWind.dt ``` `surfaceWind.dt` now contains the surface wind datatype (in knots) with both 9am wind and hourly wind. Notice how all the other function arguments remain the same. # Selecting multiple datatypes Most applications involving the environmental data contained within the National Climate Database will require selection of more than one option for more than one datatype. This is where the true advantages in using the `clifro` package become apparent. ## An extended example Let us consider an application where we are now interested in hourly and 9am surface wind along with hourly and daily rainfall, hourly counts of lightning flashes and daily and hourly grass temperature extremes. There are a few ways to choose all of these datatypes. Firstly, you could go through the menu options one by one, selecting the corresponding datatypes and options and saving the resulting datatypes as different R objects. A less laborious alternative is to create each of these datatypes without the menus. This does of course assume we know the selections at each branch of the [datatype selection menus](https://cliflo.niwa.co.nz/pls/niwp/wgenf.choose_datatype?cat=cat1). ```{r, eval=FALSE} # Hourly and 9am surface wind (knots) surfaceWind.dt = cf_datatype(2, 1, c(2, 4), 3) surfaceWind.dt ``` ```{r, echo = FALSE} surfaceWind.dt ``` ```{r, eval = FALSE} # Hourly and daily rainfall rainfall.dt = cf_datatype(3, 1, c(1, 2)) rainfall.dt ``` ```{r, echo = FALSE} rainfall.dt ``` ```{r, eval = FALSE} # Hourly counts of lightning flashes lightning.dt = cf_datatype(6, 1, 1) lightning.dt ``` ```{r, echo = FALSE} lightning.dt ``` ```{r, eval = FALSE} # Daily and hourly grass temperature extremes temperatureExtremes.dt = cf_datatype(4, 2, c(5, 6)) temperatureExtremes.dt # Note: only the surface wind datatype requires combo options ``` ```{r, echo = FALSE} temperatureExtremes.dt ``` This results in 4 separate objects in R containing the datatypes and their corresponding options. If we were wanting to submit a query using all of these datatypes at once, having four separate datatypes is less than optimal. The following table shows the options for each of the menus that we are interested in. ```{r, echo = FALSE, results = "asis"} d = data.frame(Menu = c("First selection", "Second selection", "Third selection(s)", "combo box options"), `Surface wind` = c(2, 1, "2 & 4", 3), Rainfall = c(3, 1, "1 & 2", NA), Lightning = c(6, 1, 1, NA), Temperature = c(4, 2, "5 & 6", NA), check.names = FALSE) pandoc.table(d, style = "simple") ``` We can read across the columns to see the selections that are needed to return an R object containing the datatypes we are interested in. We can then just pass these values into the `cf_datatype` function to return a single R object containing all of our datatypes and options. ```{r, echo = FALSE} query1.dt = new("cfDatatype" , dt_name = c("Wind", "Precipitation", "Weather", "Temperature and Humidity" ) , dt_type = c("Surface wind", "Rain (fixed periods)", "Lightning", "Max_min_temp" ) , dt_sel_option_names = list(c("HlyWind", "9amWind"), c("Daily ", "Hourly"), "Ltng", c("DlyGrass", "HlyGrass")) , dt_sel_combo_name = c("knots", NA, NA, NA) , dt_param = structure(c("ls_sfw,1,2,3,4,5", "ls_ra,6,7,8,9", "ls_light,10", "ls_mxmn,11,12,13,14,15,16"), .Names = c("dt1", "dt2", "dt3", "dt4")) , dt_sel_option_params = list(structure(c("134", "132", "knots"), .Names = c("prm2", "prm4", "prm5")), structure(c("181", "182"), .Names = c("prm6", "prm7" )), structure("271", .Names = "prm10"), structure(c("202", "204" ), .Names = c("prm15", "prm16"))) , dt_selected_options = list(c(2, 4, 5), c(1, 2), 1, c(5, 6)) , dt_option_length = c(5, 4, 1, 6) ) ``` ```{r, tidy = FALSE, eval = FALSE} query1.dt = cf_datatype(c(2, 3, 6, 4), c(1, 1, 1, 2), list(c(2, 4), c(1, 2), 1, c(5, 6)), c(3, NA, NA, NA)) query1.dt ``` ```{r, echo = FALSE} query1.dt ``` We can also easily combine separate `cfDatatype` objects in R using the addition symbol `+`, to produce an identical result. This may be useful when you want to conduct multiple queries which include a subset of these datatypes. ```{r} query1.dt = surfaceWind.dt + rainfall.dt + lightning.dt + temperatureExtremes.dt query1.dt ``` ## Extras ```{r, eval=FALSE} # To add another datatype using the menu: query1.dt + cf_datatype() # Is equivalent to: query1.dt + cf_datatype(NA, NA, NA, NA) # Therefore is equivalent to adding a column of NA's to the above table: query1.dt = cf_datatype(c(2, 3, 6, 4, NA), c(1, 1, 1, 2, NA), list(c(2, 4), c(1, 2), 1, c(5, 6), NA), c(3, NA, NA, NA, NA)) # Half an unknown wind datatype i.e. we know first selection = 2 but nothing # further: rain.dt = cf_datatype(2) # Or cf_datatype(2, NA, NA, NA) ```