Title: | Archetypes for Targets |
---|---|
Description: | Function-oriented Make-like declarative pipelines for Statistics and data science are supported in the 'targets' R package. As an extension to 'targets', the 'tarchetypes' package provides convenient user-side functions to make 'targets' easier to use. By establishing reusable archetypes for common kinds of targets and pipelines, these functions help express complicated reproducible pipelines concisely and compactly. The methods in this package were influenced by the 'drake' R package by Will Landau (2018) <doi:10.21105/joss.00550>. |
Authors: | William Michael Landau [aut, cre] , Rudolf Siegel [ctb] , Samantha Oliver [rev] , Tristan Mahr [rev] , Eli Lilly and Company [cph, fnd] |
Maintainer: | William Michael Landau <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.11.0.9002 |
Built: | 2024-12-20 21:19:00 UTC |
Source: | https://github.com/ropensci/tarchetypes |
A pipeline toolkit for R, the targets package brings together function-oriented programming and Make-like declarative pipelines for Statistics and data science. The tarchetypes package provides convenient helper functions to create specialized targets, making pipelines in targets easier and cleaner to write and understand.
tar_age() creates a target that reruns itself when it gets old enough. In other words, the target reruns periodically at regular intervals of time.
tar_age(
  name,
  command,
  age,
  pattern = NULL,
  tidy_eval = targets::tar_option_get("tidy_eval"),
  packages = targets::tar_option_get("packages"),
  library = targets::tar_option_get("library"),
  format = targets::tar_option_get("format"),
  repository = targets::tar_option_get("repository"),
  iteration = targets::tar_option_get("iteration"),
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = targets::tar_option_get("deployment"),
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  storage = targets::tar_option_get("storage"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)
name |
Name of the target.
|
command |
R code to run the target and return a value. |
age |
A |
pattern |
Code to define a dynamic branching pattern for a target.
To demonstrate dynamic branching patterns, suppose we have
a pipeline with numeric vector targets |
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value. |
repository |
Character of length 1, remote repository for target storage. |
iteration |
Character of length 1, name of the iteration mode of the target. |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
A |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
tar_age() uses the cue from tar_cue_age(), which uses the time stamps from targets::tar_meta()$time. See the help file of targets::tar_timestamp() for an explanation of how this time stamp is calculated.
A target object. See the "Target objects" section for background.
Time stamps are not recorded for whole dynamic targets, so tar_age() is not a good fit for dynamic branching. To invalidate dynamic branches at regular intervals, it is recommended to use targets::tar_older() in combination with targets::tar_invalidate() right before calling tar_make(). For example,
tar_invalidate(any_of(tar_older(Sys.time() - as.difftime(1, units = "weeks"))))
invalidates all targets more than a week old. Then, the next tar_make() will rerun those targets.
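The cutoff passed to tar_older() in the example above can be checked on its own. The following is a minimal sketch in base R only; the variable name cutoff is illustrative and not part of either package. Note that Sys.time must be called as a function, Sys.time().

```r
# Base R only: build the cutoff time stamp that tar_older() would receive.
# Sys.time() must be called as a function; subtracting a difftime from the
# bare symbol Sys.time is an error.
cutoff <- Sys.time() - as.difftime(1, units = "weeks")

# The result is a POSIXct time stamp one week in the past.
class(cutoff)
difftime(Sys.time(), cutoff, units = "days")

# In a project with an existing targets data store, the cutoff would then
# be used roughly as follows (not run here):
# targets::tar_invalidate(any_of(targets::tar_older(cutoff)))
```

Computing the cutoff in a separate step makes it easy to inspect the threshold before any targets are invalidated.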
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other cues: tar_cue_age(), tar_cue_force(), tar_cue_skip()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  library(tarchetypes)
  list(
    tarchetypes::tar_age(
      data,
      data.frame(x = seq_len(26)),
      age = as.difftime(0.5, units = "secs")
    )
  )
})
targets::tar_make()
Sys.sleep(0.6)
targets::tar_make()
})
}
An assignment-based domain-specific language for pipeline construction.
tar_assign(targets)
targets |
An expression with special syntax to define a
collection of targets in a pipeline.
Example:
|
A list of tar_target() objects. See the "Target objects" section for background.
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
write.csv(airquality, "data.csv", row.names = FALSE)
targets::tar_script({
  library(tarchetypes)
  tar_option_set(packages = c("readr", "dplyr", "ggplot2"))
  tar_assign({
    file <- tar_target("data.csv", format = "file")

    data <- read_csv(file, col_types = cols()) |>
      filter(!is.na(Ozone)) |>
      tar_target()

    model = lm(Ozone ~ Temp, data) |>
      coefficients() |>
      tar_target()

    plot <- {
      ggplot(data) +
        geom_point(aes(x = Temp, y = Ozone)) +
        geom_abline(intercept = model[1], slope = model[2]) +
        theme_gray(24)
    } |>
      tar_target()
  })
})
targets::tar_make()
})
}
Create a target that responds to a change in an arbitrary value. If the value changes, the target reruns.
tar_change(
  name,
  command,
  change,
  tidy_eval = targets::tar_option_get("tidy_eval"),
  packages = targets::tar_option_get("packages"),
  library = targets::tar_option_get("library"),
  format = targets::tar_option_get("format"),
  repository = targets::tar_option_get("repository"),
  iteration = targets::tar_option_get("iteration"),
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = targets::tar_option_get("deployment"),
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  storage = targets::tar_option_get("storage"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)
name |
Symbol, name of the target.
A target name must be a valid name for a symbol in R, and it
must not start with a dot. Subsequent targets
can refer to this name symbolically to induce a dependency relationship:
e.g. |
command |
R code to run the target.
|
change |
R code for the upstream change-inducing target. |
tidy_eval |
Whether to invoke tidy evaluation
(e.g. the |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
iteration |
Character of length 1, name of the iteration mode of the target. Choices:
|
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
tar_change() creates a pair of targets, one upstream and one downstream. The upstream target always runs and returns an auxiliary value. This auxiliary value gets referenced in the downstream target, which causes the downstream target to rerun if the auxiliary value changes. The behavior is cancelled if cue is tar_cue(depend = FALSE) or tar_cue(mode = "never").
Because the upstream target always runs, tar_outdated() and tar_visnetwork() will always show both targets as outdated. However, tar_make() will still skip the downstream one if the upstream target did not detect a change.
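As a hedged illustration of the upstream/downstream pair, the sketch below defines a target that should rerun whenever the modification time of a file changes. The target name report_data and the file "input.csv" are hypothetical, and the code only constructs the target objects without running a pipeline.

```r
library(targets)
library(tarchetypes)

# Hypothetical example: rerun `report_data` whenever the modification time
# of the (hypothetical) file "input.csv" changes. Neither the command nor
# the change expression is evaluated here; both run later in tar_make().
pair <- tar_change(
  report_data,
  command = nrow(read.csv("input.csv")),
  change = file.mtime("input.csv")
)

# tar_change() returns a list of two target objects:
# one upstream (detects the change) and one downstream (responds to it).
length(pair)
```

In a real _targets.R file, the returned list would be included in the pipeline's final list of targets.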
A list of two target objects, one upstream and one downstream. The upstream one triggers the change, and the downstream one responds to it. See the "Target objects" section for background.
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other targets with custom invalidation rules: tar_download(), tar_force(), tar_skip()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  list(
    tarchetypes::tar_change(x, command = tempfile(), change = tempfile())
  )
})
targets::tar_make()
targets::tar_make()
})
}
Aggregate the results of upstream targets into a new target. tar_combine() expects unevaluated expressions for the name and command arguments, whereas tar_combine_raw() uses a character string for name and an evaluated expression object for command. See the examples for details.
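The difference between the two interfaces can be sketched side by side. This is a minimal illustration, not a definitive usage pattern: the target names all_rows and all_rows_raw are made up for the example, and the code only constructs target objects without running a pipeline.

```r
library(targets)
library(tarchetypes)

target1 <- tar_target(x, head(mtcars))
target2 <- tar_target(y, tail(mtcars))

# tar_combine(): name is an unevaluated symbol and command is unevaluated code.
combined <- tar_combine(
  all_rows, # hypothetical target name
  target1,
  target2,
  command = dplyr::bind_rows(!!!.x)
)

# tar_combine_raw(): name is a character string and command is an
# expression object, matching the default expression(vctrs::vec_c(!!!.x)).
combined_raw <- tar_combine_raw(
  "all_rows_raw", # hypothetical target name
  target1,
  target2,
  command = expression(dplyr::bind_rows(!!!.x))
)
```

The raw variant tends to be the more convenient one when target names are generated programmatically, since the name can be built with ordinary string functions.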
tar_combine(
  name,
  ...,
  command = vctrs::vec_c(!!!.x),
  use_names = TRUE,
  pattern = NULL,
  packages = targets::tar_option_get("packages"),
  library = targets::tar_option_get("library"),
  format = targets::tar_option_get("format"),
  repository = targets::tar_option_get("repository"),
  iteration = targets::tar_option_get("iteration"),
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = targets::tar_option_get("deployment"),
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  storage = targets::tar_option_get("storage"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)

tar_combine_raw(
  name,
  ...,
  command = expression(vctrs::vec_c(!!!.x)),
  use_names = TRUE,
  pattern = NULL,
  packages = targets::tar_option_get("packages"),
  library = targets::tar_option_get("library"),
  format = targets::tar_option_get("format"),
  repository = targets::tar_option_get("repository"),
  iteration = targets::tar_option_get("iteration"),
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = targets::tar_option_get("deployment"),
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  storage = targets::tar_option_get("storage"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)
name |
Name of the new target.
|
... |
One or more target objects or list of target objects.
Lists can be arbitrarily nested, as in |
command |
R command to aggregate the targets. Must contain
|
use_names |
Logical, whether to insert the names of the targets into the command when splicing. |
pattern |
Code to define a dynamic branching pattern for a target.
To demonstrate dynamic branching patterns, suppose we have
a pipeline with numeric vector targets |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
iteration |
Character of length 1, name of the iteration mode of the target. Choices:
|
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
A new target object to combine the return values from the upstream targets. See the "Target objects" section for background.
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other static branching: tar_map()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  library(tarchetypes)
  target1 <- tar_target(x, head(mtcars))
  target2 <- tar_target(y, tail(mtcars))
  target3 <- tar_combine(
    name = new_target_name,
    target1,
    target2,
    command = dplyr::bind_rows(!!!.x)
  )
  target4 <- tar_combine(
    name = new_target_name2,
    target1,
    target2,
    command = dplyr::bind_rows(!!!.x)
  )
  list(target1, target2, target3, target4)
})
targets::tar_make()
})
}
tar_cue_age() creates a cue object to rerun a target if the most recent output data becomes old enough. The age of the target is determined by targets::tar_timestamp(), and the way the time stamp is calculated is explained in the Details section of the help file of that function.
tar_cue_age() expects an unevaluated symbol for the name argument, whereas tar_cue_age_raw() expects a character string for name.
tar_cue_age(
  name,
  age,
  command = TRUE,
  depend = TRUE,
  format = TRUE,
  repository = TRUE,
  iteration = TRUE,
  file = TRUE
)

tar_cue_age_raw(
  name,
  age,
  command = TRUE,
  depend = TRUE,
  format = TRUE,
  repository = TRUE,
  iteration = TRUE,
  file = TRUE
)
name |
Name of the target.
|
age |
A |
command |
Logical, whether to rerun the target if command changed since last time. |
depend |
Logical, whether to rerun the target if the value of one of the dependencies changed. |
format |
Logical, whether to rerun the target if the user-specified
storage format changed. The storage format is user-specified through
|
repository |
Logical, whether to rerun the target if the user-specified
storage repository changed. The storage repository is user-specified
through |
iteration |
Logical, whether to rerun the target if the user-specified
iteration method changed. The iteration method is user-specified through
|
file |
Logical, whether to rerun the target if the file(s) with the return value changed or at least one is missing. |
tar_cue_age() uses the time stamps from tar_meta()$time. If no time stamp is recorded, the cue defaults to the ordinary invalidation rules (i.e. mode = "thorough" in targets::tar_cue()).
A cue object. See the "Cue objects" section for background.
Time stamps are not recorded for whole dynamic targets, so tar_age() is not a good fit for dynamic branching. To invalidate dynamic branches at regular intervals, it is recommended to use targets::tar_older() in combination with targets::tar_invalidate() right before calling tar_make(). For example,
tar_invalidate(any_of(tar_older(Sys.time() - as.difftime(1, units = "weeks"))))
invalidates all targets more than a week old. Then, the next tar_make() will rerun those targets.
A cue object is an object generated by targets::tar_cue(), tarchetypes::tar_cue_force(), or similar. It is a collection of decision rules that decide when a target is invalidated/outdated (e.g. when tar_make() or similar reruns the target). You can supply these cue objects to the tar_target() function or similar. For example, tar_target(x, run_stuff(), cue = tar_cue(mode = "always")) is a target that always calls run_stuff() during tar_make() and always shows as invalidated/outdated in tar_outdated(), tar_visnetwork(), and similar functions.
Other cues: tar_age(), tar_cue_force(), tar_cue_skip()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  library(tarchetypes)
  list(
    targets::tar_target(
      data,
      data.frame(x = seq_len(26)),
      cue = tarchetypes::tar_cue_age(
        name = data,
        age = as.difftime(0.5, units = "secs")
      )
    )
  )
})
targets::tar_make()
Sys.sleep(0.6)
targets::tar_make()
})
}
tar_cue_force() creates a cue object to force a target to run if an arbitrary condition evaluates to TRUE. Supply the returned cue object to the cue argument of targets::tar_target() or similar.
tar_cue_force(
  condition,
  command = TRUE,
  depend = TRUE,
  format = TRUE,
  repository = TRUE,
  iteration = TRUE,
  file = TRUE
)
condition |
Logical vector evaluated locally when the target is
defined. If any element of |
command |
Logical, whether to rerun the target if command changed since last time. |
depend |
Logical, whether to rerun the target if the value of one of the dependencies changed. |
format |
Logical, whether to rerun the target if the user-specified
storage format changed. The storage format is user-specified through
|
repository |
Logical, whether to rerun the target if the user-specified
storage repository changed. The storage repository is user-specified
through |
iteration |
Logical, whether to rerun the target if the user-specified
iteration method changed. The iteration method is user-specified through
|
file |
Logical, whether to rerun the target if the file(s) with the return value changed or at least one is missing. |
tar_cue_force() and tar_force() operate differently. The former defines a cue object based on an eagerly evaluated condition, and tar_force() puts the condition in a special upstream target that always runs. Unlike tar_cue_force(), the condition in tar_force() can depend on upstream targets, but the drawback is that targets defined with tar_force() will always show up as outdated in functions like tar_outdated() and tar_visnetwork() even though tar_make() may still skip the main target if the condition is not met.
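The contrast between eager and deferred evaluation of the condition can be sketched as follows. This is an illustration under stated assumptions: the target names data_eager and data_lazy and the "is it Monday" condition are hypothetical, and the code only constructs target objects without running a pipeline.

```r
library(targets)
library(tarchetypes)

# tar_cue_force(): the condition is evaluated right away, when the cue is
# created, so it cannot refer to other targets.
is_monday <- format(Sys.Date(), "%u") == "1" # hypothetical condition
eager <- tar_target(
  data_eager,
  data.frame(x = seq_len(26)),
  cue = tar_cue_force(is_monday)
)

# tar_force(): the condition is stored as code and evaluated in an extra
# upstream target while the pipeline runs, so it may refer to other targets.
lazy <- tar_force(
  data_lazy,
  data.frame(x = seq_len(26)),
  force = format(Sys.Date(), "%u") == "1"
)

# tar_target() returns one target object; tar_force() returns a list of two.
length(lazy)
```

The eager form keeps tar_outdated() and tar_visnetwork() informative, while the deferred form is the one to reach for when the condition needs the value of an upstream target.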
A cue object. See the "Cue objects" section for background.
A cue object is an object generated by targets::tar_cue(), tarchetypes::tar_cue_force(), or similar. It is a collection of decision rules that decide when a target is invalidated/outdated (e.g. when tar_make() or similar reruns the target). You can supply these cue objects to the tar_target() function or similar. For example, tar_target(x, run_stuff(), cue = tar_cue(mode = "always")) is a target that always calls run_stuff() during tar_make() and always shows as invalidated/outdated in tar_outdated(), tar_visnetwork(), and similar functions.
Other cues: tar_age(), tar_cue_age(), tar_cue_skip()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  library(tarchetypes)
  list(
    targets::tar_target(
      data,
      data.frame(x = seq_len(26)),
      cue = tarchetypes::tar_cue_force(1 > 0)
    )
  )
})
targets::tar_make()
targets::tar_make()
})
}
tar_cue_skip() creates a cue object to skip a target if an arbitrary condition evaluates to TRUE. The target still builds if it was never built before. Supply the returned cue object to the cue argument of targets::tar_target() or similar.
tar_cue_skip(
  condition,
  command = TRUE,
  depend = TRUE,
  format = TRUE,
  repository = TRUE,
  iteration = TRUE,
  file = TRUE
)
condition |
Logical vector evaluated locally when the target is
defined. If any element of |
command |
Logical, whether to rerun the target if command changed since last time. |
depend |
Logical, whether to rerun the target if the value of one of the dependencies changed. |
format |
Logical, whether to rerun the target if the user-specified
storage format changed. The storage format is user-specified through
|
repository |
Logical, whether to rerun the target if the user-specified
storage repository changed. The storage repository is user-specified
through |
iteration |
Logical, whether to rerun the target if the user-specified
iteration method changed. The iteration method is user-specified through
|
file |
Logical, whether to rerun the target if the file(s) with the return value changed or at least one is missing. |
A cue object. See the "Cue objects" section for background.
A cue object is an object generated by targets::tar_cue(), tarchetypes::tar_cue_force(), or similar. It is a collection of decision rules that decide when a target is invalidated/outdated (e.g. when tar_make() or similar reruns the target). You can supply these cue objects to the tar_target() function or similar. For example, tar_target(x, run_stuff(), cue = tar_cue(mode = "always")) is a target that always calls run_stuff() during tar_make() and always shows as invalidated/outdated in tar_outdated(), tar_visnetwork(), and similar functions.
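As a minimal sketch of how a skip cue might be attached to a target, the snippet below skips rebuilds on a continuous integration server. The CI environment variable, the target name slow_model, and the function fit_slow_model() are hypothetical and stand in for your own project code.

```r
library(targets)
library(tarchetypes)

# Hypothetical condition: TRUE when the CI environment variable is "true".
on_ci <- identical(Sys.getenv("CI"), "true")

# The condition is evaluated here, when the target is defined,
# not later when the pipeline runs.
skip_cue <- tar_cue_skip(on_ci)

list(
  tar_target(
    slow_model,
    fit_slow_model(), # hypothetical user-defined function, not run at definition
    cue = skip_cue
  )
)
```

Because the condition is evaluated while _targets.R is being read, it should be cheap to compute and should not depend on other targets.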
Other cues: tar_age(), tar_cue_age(), tar_cue_force()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  library(tarchetypes)
  list(
    targets::tar_target(
      data,
      data.frame(x = seq_len(26)),
      cue = tarchetypes::tar_cue_skip(1 > 0)
    )
  )
})
targets::tar_make()
targets::tar_script({
  library(tarchetypes)
  list(
    targets::tar_target(
      data,
      data.frame(x = seq_len(25)), # Change the command.
      cue = tarchetypes::tar_cue_skip(1 > 0)
    )
  )
})
targets::tar_make()
targets::tar_make()
})
}
Create a target that downloads files from one or more URLs and automatically reruns when the remote data changes (according to the ETags or last-modified time stamps).
tar_download( name, urls, paths, method = NULL, quiet = TRUE, mode = "w", cacheOK = TRUE, extra = NULL, headers = NULL, iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Symbol, name of the target.
In A target name must be a valid name for a symbol in R, and it
must not start with a dot. Subsequent targets
can refer to this name symbolically to induce a dependency relationship:
e.g. |
urls |
Character vector of URLs to track and download. Must be known and declared before the pipeline runs. |
paths |
Character vector of local file paths to download each of the URLs. Must be known and declared before the pipeline runs. |
method |
Method to be used for downloading files. Current
download methods are The method can also be set through the option
|
quiet |
If |
mode |
character. The mode with which to write the file. Useful
values are |
cacheOK |
logical. Is a server-side cached value acceptable? |
extra |
character vector of additional command-line arguments for
the |
headers |
named character vector of additional HTTP headers to
use in HTTP[S] requests. It is ignored for non-HTTP[S] URLs. The
|
iteration |
Character of length 1, name of the iteration mode of the target. Choices:
|
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
tar_download() creates a pair of targets, one upstream and one downstream. The upstream target uses format = "url" (see targets::tar_target()) to track files at one or more URLs, and automatically invalidates the target if the ETags or last-modified time stamps change. The downstream target depends on the upstream one, downloads the files, and tracks them using format = "file".
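The two-target structure described above can be sketched by hand roughly as follows. This is an approximation for illustration, not the package's internal code: the target names x_url and x, the local paths, and the download loop are assumptions.

```r
library(targets)

pipeline <- list(
  # Upstream: track the remote files. The target is invalidated
  # when an ETag or last-modified time stamp changes.
  tar_target(
    x_url,
    c("https://httpbin.org/etag/test", "https://r-project.org"),
    format = "url"
  ),
  # Downstream: download each URL and return the local paths
  # so that format = "file" can track the downloaded files.
  tar_target(
    x,
    {
      paths <- c("downloaded_file_1", "downloaded_file_2")
      for (i in seq_along(x_url)) {
        utils::download.file(x_url[i], paths[i], quiet = TRUE)
      }
      paths
    },
    format = "file"
  )
)
```

Nothing is downloaded when these objects are created. The commands only run during tar_make().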
A list of two target objects, one upstream and one downstream. The upstream one watches a URL for changes, and the downstream one downloads it. See the "Target objects" section for background.
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other targets with custom invalidation rules: tar_change(), tar_force(), tar_skip()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  list(
    tarchetypes::tar_download(
      x,
      urls = c("https://httpbin.org/etag/test", "https://r-project.org"),
      paths = c("downloaded_file_1", "downloaded_file_2")
    )
  )
})
targets::tar_make()
targets::tar_read(x)
})
}
Loop over a grid of values, create an expression object from each one, and then evaluate that expression. Helps with general metaprogramming.
tar_eval() expects an unevaluated expression for the expr object, whereas tar_eval_raw() expects an evaluated expression object.
tar_eval(expr, values, envir = parent.frame()) tar_eval_raw(expr, values, envir = parent.frame())
expr |
Starting expression. Values are iteratively substituted
in place of symbols in
|
values |
List of values to substitute into |
envir |
Environment in which to evaluate the new expressions. |
A list of return values from the generated expression objects. Often, these values are target objects. See the "Target objects" section for background on target objects specifically.
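To make the substitute-then-evaluate loop concrete, here is a small sketch that uses plain arithmetic instead of target objects. The symbols a and b and their values are illustrative only.

```r
library(tarchetypes)

# Illustrative grid: two symbols with two values each.
# Row i of the grid supplies the i-th value of every symbol.
values <- list(a = c(1, 2), b = c(10, 20))

# tar_eval() captures the expression unevaluated, substitutes
# each row of the grid for the symbols, and evaluates the result.
sums <- tar_eval(a + b, values = values)

# tar_eval_raw() does the same but takes an expression object
# that you have already constructed, e.g. with quote().
sums_raw <- tar_eval_raw(quote(a + b), values = values)

str(sums)
str(sums_raw)
```

The raw variant is the one to reach for when the expression is built programmatically rather than typed inline.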
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other Metaprogramming utilities: tar_sub()
# tar_map() is incompatible with tar_render() because the latter
# operates on preexisting tar_target() objects. By contrast,
# tar_eval() and tar_sub() iterate over the literal code
# farther upstream.
values <- list(
  name = lapply(c("name1", "name2"), as.symbol),
  file = list("file1.Rmd", "file2.Rmd")
)
tar_sub(list(name, file), values = values)
tar_sub(tar_render(name, file), values = values)
path <- tempfile()
file.create(path)
str(tar_eval(tar_render(name, path), values = values))
str(tar_eval_raw(quote(tar_render(name, path)), values = values))
# So in your _targets.R file, you can define a pipeline like the one below.
# Just make sure to set a unique name for each target
# (which tar_map() does automatically).
values <- list(
  name = lapply(c("name1", "name2"), as.symbol),
  file = c(path, path)
)
list(
  tar_eval(tar_render(name, file), values = values)
)
Create a pair of targets: one to track a file with format = "file", and another to read the file.
tar_file_read( name, command, read, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = targets::tar_option_get("format"), format_file = c("file", "file_fast"), repository = targets::tar_option_get("repository"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Symbol, name of the target.
In A target name must be a valid name for a symbol in R, and it
must not start with a dot. Subsequent targets
can refer to this name symbolically to induce a dependency relationship:
e.g. |
command |
R code that runs in the |
read |
R code to read the file. Must include |
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
format_file |
Storage format of the file target, either
|
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
A list of two new target objects to track a file and read the contents. See the "Target objects" section for background.
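As a rough sketch of what the pair of targets looks like, the shortcut below is compared with a hand-written approximation. The file name data.csv and the upstream target name data_file in the explicit version are assumptions for illustration.

```r
library(targets)
library(tarchetypes)

# Shortcut: !!.x is a placeholder for the tracked file path.
shortcut <- tar_file_read(
  data,
  "data.csv", # hypothetical input file
  read.csv(file = !!.x)
)

# Approximately equivalent pair written by hand.
explicit <- list(
  # Upstream: return the path and track the file for changes.
  tar_target(data_file, "data.csv", format = "file"),
  # Downstream: read the file whenever the upstream target changes.
  tar_target(data, read.csv(file = data_file))
)
```

Either list can be returned from _targets.R. The file is not read until the pipeline runs.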
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  tar_file_read(data, get_path(), read_csv(file = !!.x, col_types = cols()))
})
targets::tar_manifest()
})
}
Dynamic branching over output or input files.
tar_files() expects an unevaluated symbol for the name argument and an unevaluated expression for command, whereas tar_files_raw() expects a character string for the name argument and an evaluated expression object for command. See the examples for a demo.
tar_files( name, command, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = c("file", "file_fast", "url", "aws_file"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_files_raw( name, command, packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = c("file", "url", "aws_file", "file_fast"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Name of the target.
|
command |
R command for the target.
|
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Character of length 1.
Must be |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
iteration |
Character of length 1, name of the iteration mode of the target. Choices:
|
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
tar_files() creates a pair of targets, one upstream and one downstream. The upstream target does some work and returns some file paths, and the downstream target is a pattern that applies format = "file" or format = "url". (URLs are input-only; they must already exist beforehand.) This is the correct way to dynamically iterate over file/url targets. It makes sure any downstream patterns only rerun some of their branches if the files/urls change. For more information, visit https://github.com/ropensci/targets/issues/136 and https://github.com/ropensci/drake/issues/1302.
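A hand-written approximation of the upstream/downstream pair might look like the sketch below. It is not the package's internal code: the target names x_files and x and the function list_report_paths() are hypothetical, and the always-run cue on the upstream target is an assumption that mirrors the behavior described for tar_files().

```r
library(targets)

pair <- list(
  # Upstream: do some work and return a character vector of paths.
  # It reruns every time so that new or removed files are noticed.
  tar_target(
    x_files,
    list_report_paths(), # hypothetical user-defined function
    cue = tar_cue(mode = "always")
  ),
  # Downstream: one dynamic branch per path, each tracked as a file.
  # A branch is invalidated only when its own file changes.
  tar_target(
    x,
    x_files,
    pattern = map(x_files),
    format = "file"
  )
)
```

Targets further downstream can then use pattern = map(x) so that only the branches whose files changed are rerun.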
A list of two targets, one upstream and one downstream. The upstream one does some work and returns some file paths, and the downstream target is a pattern that applies format = "file" or format = "url". See the "Target objects" section for background.
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other Dynamic branching over files: tar_files_input()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  library(tarchetypes)
  # Do not use temp files in real projects
  # or else your targets will always rerun.
  paths <- unlist(replicate(2, tempfile()))
  file.create(paths)
  list(
    tar_files(name = x, command = paths),
    tar_files_raw(name = "y", command = quote(paths))
  )
})
targets::tar_make()
targets::tar_read(x)
})
}
Dynamic branching over input files or URLs.
tar_files_input() expects an unevaluated symbol for the name argument, whereas tar_files_input_raw() expects a character string for name. See the examples for a demo.
tar_files_input( name, files, batches = length(files), format = c("file", "file_fast", "url", "aws_file"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_files_input_raw( name, files, batches = length(files), format = c("file", "file_fast", "url", "aws_file"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Name of the target.
|
files |
Nonempty character vector of known existing input files to track for changes. |
batches |
Positive integer of length 1, number of batches to partition the files. The default is one file per batch (maximum number of batches) which is simplest to handle but could cause a lot of overhead and consume a lot of computing resources. Consider reducing the number of batches below the number of files for heavy workloads. |
format |
Character, either |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
iteration |
Character, iteration method. Must be a method
supported by the |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
tar_files_input() is like tar_files() but more convenient when the files in question already exist and are known in advance. Whereas tar_files() always appears outdated (e.g. with tar_outdated()) because it always needs to check which files it needs to branch over, tar_files_input() will appear up to date if the files have not changed since last tar_make(). In addition, tar_files_input() automatically groups input files into batches to reduce overhead and increase the efficiency of parallel processing.
tar_files_input() creates a pair of targets, one upstream and one downstream. The upstream target does some work and returns some file paths, and the downstream target is a pattern that applies format = "file", format = "file_fast", or format = "url". This is the correct way to dynamically iterate over file/url targets. It makes sure any downstream patterns only rerun some of their branches if the files/urls change. For more information, visit https://github.com/ropensci/targets/issues/136 and https://github.com/ropensci/drake/issues/1302.
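The idea of batching can be illustrated in base R. The sketch below only shows how a vector of files could be partitioned into a fixed number of groups, each of which would become one dynamic branch. It is not the partitioning code tarchetypes uses, and the file paths are hypothetical.

```r
# Hypothetical input files.
files <- sprintf("data/file_%02d.csv", 1:10)

# Desired number of batches (fewer batches means fewer branches).
batches <- 3

# Assign each file a batch index from 1 to `batches`,
# keeping the files in their original order.
index <- ceiling(seq_along(files) * batches / length(files))

# One list element per batch.
batched <- split(files, index)
str(batched)
```

With batches equal to the number of files, every file gets its own branch, which is the default described for the batches argument.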
A list of two targets, one upstream and one downstream. The upstream one does some work and returns some file paths, and the downstream target is a pattern that applies format = "file" or format = "url". See the "Target objects" section for background.
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other Dynamic branching over files: tar_files()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  library(tarchetypes)
  # Do not use temp files in real projects
  # or else your targets will always rerun.
  paths <- unlist(replicate(4, tempfile()))
  file.create(paths)
  list(
    tar_files_input(
      name = x,
      files = paths,
      batches = 2
    ),
    tar_files_input_raw(
      name = "y",
      files = paths,
      batches = 2
    )
  )
})
targets::tar_make()
targets::tar_read(x)
targets::tar_read(x, branches = 1)
})
}
Create a target that always runs if a user-defined condition is met.
tar_force( name, command, force, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = targets::tar_option_get("format"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Symbol, name of the target.
In A target name must be a valid name for a symbol in R, and it
must not start with a dot. Subsequent targets
can refer to this name symbolically to induce a dependency relationship:
e.g. |
command |
R code to run the target.
In |
force |
R code for the condition that forces a build.
If it evaluates to |
tidy_eval |
Whether to invoke tidy evaluation
(e.g. the |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
iteration |
Character of length 1, name of the iteration mode of the target. Choices:
|
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
tar_force() creates a target that always runs when a custom condition is met. The implementation builds on top of tar_change(). Thus, a pair of targets is created: an upstream auxiliary target to indicate the custom condition and a downstream target that responds to it and does your work.
tar_force() does not actually use tar_cue_force(), and the mechanism is totally different. Because the upstream target always runs, tar_outdated() and tar_visnetwork() will always show both targets as outdated. However, tar_make() will still skip the downstream one if the upstream custom condition is not met.
A list of two target objects: one to indicate whether the custom condition is met, and another to respond to it and do your actual work. See the "Target objects" section for background.
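As a minimal sketch of the returned pair, the snippet below calls tar_force() outside a pipeline and inspects the result. It assumes the targets and tarchetypes packages are installed. The target name report, the command summarize_data(), and the environment variable FORCE_REPORT are hypothetical names chosen only for illustration.

```r
library(targets)
library(tarchetypes)

# Hypothetical target: rerun `report` whenever FORCE_REPORT is "true".
# `summarize_data()` is a placeholder and is not evaluated here, because
# target factories only capture the command as an expression.
pair <- tar_force(
  name = report,
  command = summarize_data(),
  force = identical(Sys.getenv("FORCE_REPORT"), "true")
)

# The result is a list of target objects: the upstream condition target
# and the downstream target that does the work.
length(pair)
vapply(pair, inherits, logical(1), what = "tar_target")
```

In a real project the call would sit inside the list returned by _targets.R, and the condition would be evaluated each time the pipeline runs.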
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets), and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other targets with custom invalidation rules: tar_change(), tar_download(), tar_skip()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
  targets::tar_dir({ # tar_dir() runs code from a temporary directory.
    targets::tar_script({
      list(
        tarchetypes::tar_force(x, tempfile(), force = 1 > 0)
      )
    })
    targets::tar_make()
    targets::tar_make()
  })
}
Nanoparquet storage format for data frames. Uses nanoparquet::read_parquet() and nanoparquet::write_parquet() to read and write data frames returned by targets in a pipeline. Note: attributes such as dplyr row groupings and posterior draws info are dropped during the writing process.
tar_format_nanoparquet(compression = "snappy", class = "tbl")
compression |
Character string, compression type for saving the
data. See the |
class |
Character vector with the data frame subclasses to assign.
See the |
A targets::tar_format() storage format specification string that can be directly supplied to the format argument of targets::tar_target() or targets::tar_option_set().
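The two ways of supplying the format string can be sketched as follows. This is an illustration rather than a prescribed workflow: it assumes the targets, tarchetypes, and nanoparquet packages are installed, and the names parquet_format and one_target are hypothetical.

```r
library(targets)
library(tarchetypes)

# Build the format specification once. "snappy" and "tbl" are the
# documented defaults, written out here for clarity.
parquet_format <- tar_format_nanoparquet(compression = "snappy", class = "tbl")

# Option 1: make it the default storage format for subsequent targets.
tar_option_set(format = parquet_format)

# Option 2: set it explicitly on a single target.
one_target <- tar_target(
  name = data,
  command = data.frame(x = 1),
  format = parquet_format
)

# Restore the default options so later code is unaffected.
tar_option_reset()
```

Setting the option globally is convenient when most targets return data frames, while the per-target form keeps other targets on the default format.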
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
  targets::tar_dir({ # tar_dir() runs code from a temporary directory.
    targets::tar_script({
      library(targets)
      library(tarchetypes)
      list(
        tar_target(
          name = data,
          command = data.frame(x = 1),
          format = tar_format_nanoparquet()
        )
      )
    })
    targets::tar_make()
    targets::tar_read(data)
  })
}
Target factories for targets with specialized storage formats. For example, tar_qs(name = data, command = get_data()) is shorthand for tar_target(name = data, command = get_data(), format = "qs").
Most of the formats are shorthand for built-in formats in targets. The only exception currently is the nanoparquet format: tar_nanoparquet(data, get_data()) is shorthand for tar_target(data, get_data(), format = tar_format_nanoparquet()), where tar_format_nanoparquet() resides in tarchetypes.
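The shorthand relationship can be checked with a small sketch. It uses tar_rds() instead of tar_qs() only so that no extra storage package is needed; it assumes targets and tarchetypes are installed, and get_data() is a hypothetical function that is captured but never evaluated.

```r
library(targets)
library(tarchetypes)

# Shorthand form from tarchetypes.
shorthand <- tar_rds(name = data, command = get_data())

# Equivalent long form from targets.
longhand <- tar_target(name = data, command = get_data(), format = "rds")

# Both calls produce the same kind of target object.
identical(class(shorthand), class(longhand))
```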
tar_format_feather() is superseded in favor of tar_arrow_feather(), and all the tar_aws_*() functions are superseded because of the introduction of the repository argument into targets::tar_target().
tar_url( name, command, pattern = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_file( name, command, pattern = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_file_fast( name, command, pattern = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = 
targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_rds( name, command, pattern = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_qs( name, command, pattern = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_keras( name, command, pattern = NULL, tidy_eval = 
targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_torch( name, command, pattern = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_arrow_feather( name, command, pattern = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = 
targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_parquet( name, command, pattern = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_fst( name, command, pattern = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_fst_dt( name, command, pattern = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = 
targets::tar_option_get("packages"), library = targets::tar_option_get("library"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_fst_tbl( name, command, pattern = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_nanoparquet( name, command, pattern = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), repository = targets::tar_option_get("repository"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = 
targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description"), compression = "snappy", class = "tbl" )
name |
Symbol, name of the target.
In A target name must be a valid name for a symbol in R, and it
must not start with a dot. Subsequent targets
can refer to this name symbolically to induce a dependency relationship:
e.g. |
command |
R code to run the target.
In |
pattern |
Code to define a dynamic branching pattern for a target.
In To demonstrate dynamic branching patterns, suppose we have
a pipeline with numeric vector targets |
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
iteration |
Character of length 1, name of the iteration mode of the target. Choices:
|
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
compression |
Character string, compression type for saving the
data. See the |
class |
Character vector with the data frame subclasses to assign.
See the |
These functions are shorthand for targets with specialized storage formats. For example, tar_qs(name, fun()) is equivalent to tar_target(name, fun(), format = "qs"). For details on specialized storage formats, open the help file of the targets::tar_target() function and read about the format argument.
A tar_target() object with the eponymous storage format. See the "Target objects" section for background.
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets), and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
  targets::tar_dir({ # tar_dir() runs code from a temporary directory.
    targets::tar_script({
      library(targets)
      library(tarchetypes)
      list(
        tar_rds(name = x, command = 1),
        tar_nanoparquet(name = y, command = data.frame(x = x))
      )
    })
    targets::tar_make()
  })
}
Create a target that outputs a grouped data frame with dplyr::group_by() and targets::tar_group(). Downstream dynamic branching targets will iterate over the groups of rows.
tar_group_by( name, command, ..., tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = targets::tar_option_get("format"), repository = targets::tar_option_get("repository"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Symbol, name of the target.
In A target name must be a valid name for a symbol in R, and it
must not start with a dot. Subsequent targets
can refer to this name symbolically to induce a dependency relationship:
e.g. |
command |
R code to run the target.
In |
... |
Symbols, variables in the output data frame to group by. |
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
A target object to generate a grouped data frame that allows downstream dynamic targets to branch over the groups of rows. See the "Target objects" section for background.
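To picture what "groups of rows" means here, the following base R sketch assigns one integer label per combination of the grouping variables and splits the data frame on it. It is an illustration only, not the package's implementation, and the order of the labels that the real function produces may differ. The column name tar_group mirrors the one that targets::tar_group() adds.

```r
# Same toy data as the package example: 2 x 2 x 3 = 12 rows.
data <- expand.grid(var1 = c("a", "b"), var2 = c("c", "d"), rep = c(1, 2, 3))

# One integer label per observed (var1, var2) combination.
data$tar_group <- as.integer(interaction(data$var1, data$var2, drop = TRUE))

# Each element of `groups` is the slice a downstream branch would receive.
groups <- split(data, data$tar_group)
print(vapply(groups, nrow, integer(1)))
```

A downstream target declared with pattern = map(data) would then run once per element of such a split.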
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other Grouped data frame targets: tar_group_count(), tar_group_select(), tar_group_size()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  produce_data <- function() {
    expand.grid(var1 = c("a", "b"), var2 = c("c", "d"), rep = c(1, 2, 3))
  }
  list(
    tarchetypes::tar_group_by(data, produce_data(), var1, var2),
    tar_target(group, data, pattern = map(data))
  )
})
targets::tar_make()
# Read the first row group:
targets::tar_read(group, branches = 1)
# Read the second row group:
targets::tar_read(group, branches = 2)
})
}
Create a target that outputs a grouped data frame
for downstream dynamic branching. Set the maximum
number of groups using count. The number of rows per group
varies but is approximately uniform.
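As a rough picture of "approximately uniform", the base R sketch below spreads row indices over at most count groups whose sizes differ by no more than one row. The helper assign_count_groups() is hypothetical and is not how tarchetypes necessarily computes the groups; the tar_group column name mirrors the one used by targets::tar_group().

```r
# Hypothetical helper (not part of tarchetypes): label each row with one of
# at most `count` groups so that group sizes differ by at most one row.
assign_count_groups <- function(n_rows, count) {
  count <- min(count, n_rows)
  sort(rep_len(seq_len(count), n_rows))
}

data <- expand.grid(var1 = c("a", "b"), var2 = c("c", "d"), rep = c(1, 2, 3))
data$tar_group <- assign_count_groups(nrow(data), count = 5)

# Number of rows that landed in each group.
group_sizes <- table(data$tar_group)
print(group_sizes)
```

With 12 rows and count = 5, no group can hold exactly the same number of rows as every other, which is why count is documented as a maximum on the number of groups and not a guarantee about their size.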
tar_group_count( name, command, count, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = targets::tar_option_get("format"), repository = targets::tar_option_get("repository"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Symbol, name of the target.
A target name must be a valid name for a symbol in R, and it
must not start with a dot. Subsequent targets
can refer to this name symbolically to induce a dependency relationship:
e.g. |
command |
R code to run the target.
In |
count |
Positive integer, maximum number of row groups |
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
A target object to generate a grouped data frame that allows downstream dynamic targets to branch over the groups of rows. See the "Target objects" section for background.
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other Grouped data frame targets: tar_group_by(), tar_group_select(), tar_group_size()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  produce_data <- function() {
    expand.grid(var1 = c("a", "b"), var2 = c("c", "d"), rep = c(1, 2, 3))
  }
  list(
    tarchetypes::tar_group_count(data, produce_data(), count = 2),
    tar_target(group, data, pattern = map(data))
  )
})
targets::tar_make()
# Read the first row group:
targets::tar_read(group, branches = 1)
# Read the second row group:
targets::tar_read(group, branches = 2)
})
}
Create a target that outputs a grouped data frame
with dplyr::group_by() and targets::tar_group().
Unlike tar_group_by(), tar_group_select() expects you to select
grouping variables using tidyselect semantics.
Downstream dynamic branching targets will iterate over the groups of rows.
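The difference from naming the grouping columns one by one can be sketched in base R: the columns are first chosen by a rule on their names, and only then used for grouping. Here startsWith() stands in for the tidyselect helper starts_with("var"). This is a hedged illustration and not the package's code path.

```r
data <- expand.grid(var1 = c("a", "b"), var2 = c("c", "d"), rep = c(1, 2, 3))

# Base R stand-in for the tidyselect helper starts_with("var").
by <- names(data)[startsWith(names(data), "var")]
print(by)

# Group on whichever columns the selection returned.
data$tar_group <- as.integer(interaction(data[by], drop = TRUE))
print(table(data$tar_group))
```

Selecting by rule keeps the pipeline definition unchanged when another column matching the rule is added to the data later.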
tar_group_select( name, command, by = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = targets::tar_option_get("format"), repository = targets::tar_option_get("repository"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Symbol, name of the target.
A target name must be a valid name for a symbol in R, and it
must not start with a dot. Subsequent targets
can refer to this name symbolically to induce a dependency relationship:
e.g. |
command |
R code to run the target.
In |
by |
Tidyselect semantics to specify variables to group over. Alternatively, you can supply a character vector. |
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
A target object to generate a grouped data frame that allows downstream dynamic targets to branch over the groups of rows. See the "Target objects" section for background.
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other Grouped data frame targets: tar_group_by(), tar_group_count(), tar_group_size()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  produce_data <- function() {
    expand.grid(var1 = c("a", "b"), var2 = c("c", "d"), rep = c(1, 2, 3))
  }
  list(
    tarchetypes::tar_group_select(data, produce_data(), starts_with("var")),
    tar_target(group, data, pattern = map(data))
  )
})
targets::tar_make()
# Read the first row group:
targets::tar_read(group, branches = 1)
# Read the second row group:
targets::tar_read(group, branches = 2)
})
}
Create a target that outputs a grouped data frame
for downstream dynamic branching. Each row group has
the number of rows you supply to size (with any remainder
in a group of its own). The total number of groups
varies.
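The "fixed size plus remainder" rule can be sketched in base R by integer-dividing the row index by size. The helper assign_size_groups() is hypothetical and only illustrates the rule; it is not taken from the package, and the tar_group column name mirrors the one used by targets::tar_group().

```r
# Hypothetical helper (not part of tarchetypes): rows 1..size go to group 1,
# the next `size` rows to group 2, and so on; the last group holds the rest.
assign_size_groups <- function(n_rows, size) {
  ceiling(seq_len(n_rows) / size)
}

data <- expand.grid(var1 = c("a", "b"), var2 = c("c", "d"), rep = c(1, 2, 3))
data$tar_group <- assign_size_groups(nrow(data), size = 7)

# Number of rows that landed in each group.
group_sizes <- table(data$tar_group)
print(group_sizes)
```

With the 12-row example data and size = 7, the sketch yields one full group of 7 rows and a remainder group of 5 rows.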
tar_group_size( name, command, size, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = targets::tar_option_get("format"), repository = targets::tar_option_get("repository"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Symbol, name of the target.
A target name must be a valid name for a symbol in R, and it
must not start with a dot. Subsequent targets
can refer to this name symbolically to induce a dependency relationship:
e.g. |
command |
R code to run the target.
In |
size |
Positive integer, maximum number of rows in each group. |
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
A target object to generate a grouped data frame that allows downstream dynamic targets to branch over the groups of rows. See the "Target objects" section for background.
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other Grouped data frame targets: tar_group_by(), tar_group_count(), tar_group_select()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  produce_data <- function() {
    expand.grid(var1 = c("a", "b"), var2 = c("c", "d"), rep = c(1, 2, 3))
  }
  list(
    tarchetypes::tar_group_size(data, produce_data(), size = 7),
    tar_target(group, data, pattern = map(data))
  )
})
targets::tar_make()
# Read the first row group:
targets::tar_read(group, branches = 1)
# Read the second row group:
targets::tar_read(group, branches = 2)
})
}
Prepend R code to the commands of multiple targets.
tar_hook_before() expects unevaluated expressions for the hook and names
arguments, whereas tar_hook_before_raw() expects
evaluated expression objects.
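The effect on a single command, and the difference between the two calling styles, can be sketched with base R language objects. The helpers prepend_hook() and prepend_hook_nse() are hypothetical names used only for illustration; the package's internal implementation may differ.

```r
# Hypothetical helper, not part of tarchetypes: place `hook` in front of
# `command` inside a single braced expression.
prepend_hook <- function(command, hook) {
  as.call(list(as.symbol("{"), hook, command))
}

# "_raw" style: the caller passes expression objects that are already quoted.
command <- quote(task2(x1))
hook <- quote(print("Running hook."))
new_command <- prepend_hook(command, hook)
print(new_command)

# Non-raw style: capture the unevaluated arguments with substitute() first.
prepend_hook_nse <- function(command, hook) {
  prepend_hook(substitute(command), substitute(hook))
}
same_command <- prepend_hook_nse(task2(x1), print("Running hook."))
```

Both calls build the same language object; the only difference is whether quoting is done by the caller or by the function.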
tar_hook_before( targets, hook, names = NULL, set_deps = TRUE, envir = parent.frame() )
tar_hook_before_raw( targets, hook, names = NULL, set_deps = TRUE, envir = parent.frame() )
targets |
A list of target objects. The input target list can be arbitrarily nested, but it must consist entirely of target objects. In addition, the return value is a simple list where each element is a target object. All hook functions remove the nested structure of the input target list. |
hook |
R code to insert.
|
names |
Name of targets in the target list
to apply the hook. Supplied using tidyselect helpers such as starts_with(). The regular hook functions expect unevaluated expressions for the |
set_deps |
Logical of length 1, whether to refresh the dependencies
of each modified target by scanning the newly generated
target commands for dependencies. If |
envir |
Optional environment to construct the quosure for the |
A flattened list of target objects with the hooks applied. Even if the input target list had a nested structure, the return value is a simple list where each element is a target object. All hook functions remove the nested structure of the input target list.
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other hooks: tar_hook_inner(), tar_hook_outer()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  targets <- list(
    # Nested target lists work with hooks.
    list(
      targets::tar_target(x1, task1()),
      targets::tar_target(x2, task2(x1))
    ),
    targets::tar_target(x3, task3(x2)),
    targets::tar_target(y1, task4(x3))
  )
  tarchetypes::tar_hook_before(
    targets = targets,
    hook = print("Running hook."),
    names = starts_with("x")
  )
})
targets::tar_manifest(fields = command)
# With tar_hook_before_raw():
targets::tar_script({
  targets <- list(
    # Nested target lists work with hooks.
    list(
      targets::tar_target(x1, task1()),
      targets::tar_target(x2, task2(x1))
    ),
    targets::tar_target(x3, task3(x2)),
    targets::tar_target(y1, task4(x3))
  )
  tarchetypes::tar_hook_before_raw(
    targets = targets,
    hook = quote(print("Running hook.")),
    names = quote(starts_with("x"))
  )
})
})
}
In the command of each target, wrap each mention of each dependency target in an arbitrary R expression.
tar_hook_inner() expects unevaluated expressions for the hook and names
arguments, whereas tar_hook_inner_raw() expects
evaluated expression objects.
tar_hook_inner( targets, hook, names = NULL, names_wrap = NULL, set_deps = TRUE, envir = parent.frame() )
tar_hook_inner_raw( targets, hook, names = NULL, names_wrap = NULL, set_deps = TRUE, envir = parent.frame() )
targets |
A list of target objects. The input target list can be arbitrarily nested, but it must consist entirely of target objects. In addition, the return value is a simple list where each element is a target object. All hook functions remove the nested structure of the input target list. |
hook |
R code to wrap each target's command.
The hook must contain the special placeholder symbol
|
names |
Name of targets in the target list
to apply the hook. Supplied using tidyselect helpers such as starts_with(). The regular hook functions expect unevaluated expressions for the |
names_wrap |
Names of targets to wrap with the hook
where they appear as dependencies in the commands of other targets.
Use |
set_deps |
Logical of length 1, whether to refresh the dependencies
of each modified target by scanning the newly generated
target commands for dependencies. If |
envir |
Optional environment to construct the quosure for the |
The expression you supply to hook
must contain the special placeholder symbol .x
so tar_hook_inner() knows where to insert each
dependency mention that it wraps.
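For a single command, wrapping the mentions of dependencies can be pictured as a recursive walk over the expression that replaces each matching symbol with a copy of the hook in which .x has been filled in. The function wrap_mentions() below is a hypothetical sketch in base R, not the package's implementation, and it does not handle calls with empty arguments such as x[1, ].

```r
# Hypothetical sketch, not part of tarchetypes: wrap every symbol listed in
# `names_wrap` with `hook`, substituting the symbol for the placeholder `.x`.
wrap_mentions <- function(expr, hook, names_wrap) {
  if (is.symbol(expr) && as.character(expr) %in% names_wrap) {
    return(do.call(substitute, list(hook, list(.x = expr))))
  }
  if (is.call(expr)) {
    parts <- as.list(expr)
    # Leave the function position alone and rewrite only the arguments.
    parts[-1] <- lapply(
      parts[-1],
      wrap_mentions,
      hook = hook,
      names_wrap = names_wrap
    )
    return(as.call(parts))
  }
  expr
}

command <- quote(task3(x2, x1))
new_command <- wrap_mentions(command, quote(fun(.x)), c("x1", "x2"))
print(new_command)
```

Symbols that are not listed, including the function name task3 itself, pass through unchanged.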
A flattened list of target objects with the hooks applied. Even if the input target list had a nested structure, the return value is a simple list where each element is a target object. All hook functions remove the nested structure of the input target list.
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other hooks: tar_hook_before(), tar_hook_outer()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  targets <- list(
    # Nested target lists work with hooks.
    list(
      targets::tar_target(x1, task1()),
      targets::tar_target(x2, task2(x1))
    ),
    targets::tar_target(x3, task3(x2, x1)),
    targets::tar_target(y1, task4(x3))
  )
  tarchetypes::tar_hook_inner(
    targets = targets,
    hook = fun(.x),
    names = starts_with("x")
  )
})
targets::tar_manifest(fields = command)
# With tar_hook_inner_raw():
targets::tar_script({
  targets <- list(
    # Nested target lists work with hooks.
    list(
      targets::tar_target(x1, task1()),
      targets::tar_target(x2, task2(x1))
    ),
    targets::tar_target(x3, task3(x2, x1)),
    targets::tar_target(y1, task4(x3))
  )
  tarchetypes::tar_hook_inner_raw(
    targets = targets,
    hook = quote(fun(.x)),
    names = quote(starts_with("x"))
  )
})
})
}
Wrap the command of each target in an arbitrary R expression.
tar_hook_outer() expects unevaluated expressions for the hook and names
arguments, whereas tar_hook_outer_raw() expects
evaluated expression objects.
tar_hook_outer( targets, hook, names = NULL, set_deps = TRUE, envir = parent.frame() )
tar_hook_outer_raw( targets, hook, names = NULL, set_deps = TRUE, envir = parent.frame() )
targets |
A list of target objects. The input target list can be arbitrarily nested, but it must consist entirely of target objects. In addition, the return value is a simple list where each element is a target object. All hook functions remove the nested structure of the input target list. |
hook |
R code to wrap each target's command.
The hook must contain the special placeholder symbol
|
names |
Name of targets in the target list
to apply the hook. Supplied using tidyselect helpers such as starts_with(). The regular hook functions expect unevaluated expressions for the |
set_deps |
Logical of length 1, whether to refresh the dependencies
of each modified target by scanning the newly generated
target commands for dependencies. If |
envir |
Optional environment to construct the quosure for the |
The expression you supply to hook
must contain the special placeholder symbol .x
so tar_hook_outer() knows where to insert the original command
of the target.
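The role of the placeholder can be sketched with base R's substitute(), which replaces a symbol inside an expression with another expression. The helper wrap_command() is a hypothetical name, and this is an illustration of the idea only; the package may build the new command differently.

```r
# Hypothetical helper, not part of tarchetypes: put `command` wherever the
# placeholder symbol `.x` appears inside `hook`.
wrap_command <- function(command, hook) {
  do.call(substitute, list(hook, list(.x = command)))
}

command <- quote(task2(x1))
hook <- quote(postprocess(.x, arg = "value"))
new_command <- wrap_command(command, hook)
print(new_command)
```

A hook without .x would come back unchanged from this sketch and the original command would be lost, which is one way to see why the placeholder is required.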
A flattened list of target objects with the hooks applied. Even if the input target list had a nested structure, the return value is a simple list where each element is a target object. All hook functions remove the nested structure of the input target list.
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other hooks: tar_hook_before(), tar_hook_inner()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  targets <- list(
    # Nested target lists work with hooks.
    list(
      targets::tar_target(x1, task1()),
      targets::tar_target(x2, task2(x1))
    ),
    targets::tar_target(x3, task3(x2)),
    targets::tar_target(y1, task4(x3))
  )
  tarchetypes::tar_hook_outer(
    targets = targets,
    hook = postprocess(.x, arg = "value"),
    names = starts_with("x")
  )
})
targets::tar_manifest(fields = command)
# Using tar_hook_outer_raw():
targets::tar_script({
  targets <- list(
    # Nested target lists work with hooks.
    list(
      targets::tar_target(x1, task1()),
      targets::tar_target(x2, task2(x1))
    ),
    targets::tar_target(x3, task3(x2)),
    targets::tar_target(y1, task4(x3))
  )
  tarchetypes::tar_hook_outer_raw(
    targets = targets,
    hook = quote(postprocess(.x, arg = "value")),
    names = quote(starts_with("x"))
  )
})
})
}
knitr document.
Shorthand to include a knitr document in a targets pipeline.
tar_knit() expects an unevaluated symbol for the name argument, and it passes named ... arguments to knitr::knit(). tar_knit_raw() expects a character string for name and supports an evaluated expression object knit_arguments for the arguments to knitr::knit().
tar_knit( name, path, output_file = NULL, working_directory = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = "main", priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description"), quiet = TRUE, ... ) tar_knit_raw( name, path, output_file = NULL, working_directory = NULL, packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = "main", priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description"), quiet = TRUE, knit_arguments = quote(list()) )
name |
Name of the target.
|
path |
Character string, file path to the |
output_file |
Character string, file path to the rendered output file. |
working_directory |
Optional character string,
path to the working directory
to temporarily set when running the report.
The default is |
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
quiet |
Boolean; suppress the progress bar and messages? |
... |
Named arguments to |
knit_arguments |
Optional language object with a list
of named arguments to |
tar_knit() is an alternative to tar_target() for knitr reports that depend on other targets. The knitr source should mention dependency targets with tar_load() and tar_read() in the active code chunks (which also allows you to knit the report outside the pipeline if the _targets/ data store already exists). (Do not use tar_load_raw() or tar_read_raw() for this.) Then, tar_knit() defines a special kind of target. It
1. Finds all the tar_load()/tar_read() dependencies in the report and inserts them into the target's command. This enforces the proper dependency relationships.
2. Sets format = "file" (see tar_target()) so targets watches the files at the returned paths and reruns the report if those files change.
3. Configures the target's command to return both the output report files and the input source file. All these file paths are relative paths so the project stays portable.
4. Forces the report to run in the user's current working directory instead of the working directory of the report.
5. Sets convenient default options such as deployment = "main" in the target and quiet = TRUE in knitr::knit().
A tar_target() object with format = "file". When this target runs, it returns a character vector of file paths. The first file paths are the output files (returned by knitr::knit()) and the knitr source file is last. But unlike knitr::knit(), all returned paths are relative paths to ensure portability (so that the project can be moved from one file system to another without invalidating the target).
See the "Target objects" section for background.
Other Literate programming targets: tar_quarto(), tar_quarto_rep(), tar_render(), tar_render_rep()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  library(tarchetypes)
  # Ordinarily, you should create the report outside
  # tar_script() and avoid temporary files.
  lines <- c(
    "---",
    "title: report",
    "output_format: html_document",
    "---",
    "",
    "```{r}",
    "targets::tar_read(data)",
    "```"
  )
  path <- tempfile()
  writeLines(lines, path)
  list(
    tar_target(data, data.frame(x = seq_len(26), y = letters)),
    tar_knit(name = report, path = path),
    tar_knit_raw(name = "report2", path = path)
  )
})
targets::tar_make()
})
}
List the target dependencies of one or more literate programming reports (R Markdown or knitr).
tar_knitr_deps(path)
path |
Character vector, path to one or more R Markdown or
|
Character vector of the names of targets that are dependencies of the knitr report.
Other Literate programming utilities: tar_knitr_deps_expr(), tar_quarto_files()
lines <- c(
  "---",
  "title: report",
  "output_format: html_document",
  "---",
  "",
  "```{r}",
  "targets::tar_load(data1)",
  "targets::tar_read(data2)",
  "```"
)
report <- tempfile()
writeLines(lines, report)
tar_knitr_deps(report)
Construct an expression whose global variable dependencies are the target dependencies of one or more literate programming reports (R Markdown or knitr). This helps third-party developers create their own target factories for literate programming targets (similar to tar_knit() and tar_render()).
tar_knitr_deps_expr(path)
path |
Character vector, path to one or more R Markdown or
|
Expression object to name the dependency targets of the knitr report, which will be detected in the static code analysis of targets.
Other Literate programming utilities: tar_knitr_deps(), tar_quarto_files()
lines <- c(
  "---",
  "title: report",
  "output_format: html_document",
  "---",
  "",
  "```{r}",
  "targets::tar_load(data1)",
  "targets::tar_read(data2)",
  "```"
)
report <- tempfile()
writeLines(lines, report)
tar_knitr_deps_expr(report)
Define multiple new targets based on existing target objects.
tar_map( values, ..., names = tidyselect::everything(), descriptions = tidyselect::everything(), unlist = FALSE, delimiter = "_" )
values |
Named list or data frame with values to iterate over.
The names are the names of symbols in the commands and pattern
statements, and the elements are values that get substituted
in place of those symbols. |
... |
One or more target objects or list of target objects.
Lists can be arbitrarily nested, as in |
names |
Subset of |
descriptions |
Names of a column in |
unlist |
Logical, whether to flatten the returned list of targets.
If |
delimiter |
Character of length 1, string to insert between other strings when creating names of targets. |
tar_map() creates collections of new targets by iterating over a list of arguments and substituting symbols into commands and pattern statements.
A list of new target objects. If unlist is FALSE, the list is nested and sub-lists are named and grouped by the original input targets. If unlist = TRUE, the return value is a flat list of targets named by the new target names.
See the "Target objects" section for background.
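The two return shapes can be compared directly. The sketch below assumes the targets and tarchetypes packages are installed. The target names x and y and the values list are made up for illustration, and the targets are only defined, never run.

```r
library(targets)
library(tarchetypes)

# Illustrative values: the symbol `a` is substituted into each command.
values <- list(a = c(12, 34))

# Default (unlist = FALSE): a nested list with one sub-list
# per original input target.
nested <- tar_map(
  values = values,
  tar_target(x, a + 1),
  tar_target(y, x * 2)
)

# unlist = TRUE: a flat list named by the new target names.
flat <- tar_map(
  values = values,
  tar_target(x, a + 1),
  tar_target(y, x * 2),
  unlist = TRUE
)

names(nested)
names(flat)
```

The nested form makes it easy to pick out every branch of one original target, for example to hand nested[["x"]] to tar_combine().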
Other static branching: tar_combine()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  list(
    tarchetypes::tar_map(
      list(a = c(12, 34), b = c(45, 78)),
      targets::tar_target(x, a + b),
      targets::tar_target(y, x + a, pattern = map(x))
    )
  )
})
targets::tar_manifest()
})
}
Define targets for batched replication within static branches for data frames.
tar_map_rep() expects an unevaluated symbol for the name argument and an unevaluated expression for command, whereas tar_map_rep_raw() expects a character string for name and an evaluated expression object for command.
tar_map_rep( name, command, values = NULL, names = NULL, descriptions = tidyselect::everything(), columns = tidyselect::everything(), batches = 1, reps = 1, rep_workers = 1, combine = TRUE, delimiter = "_", unlist = FALSE, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = targets::tar_option_get("format"), repository = targets::tar_option_get("repository"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_map_rep_raw( name, command, values = NULL, names = NULL, descriptions = quote(tidyselect::everything()), columns = quote(tidyselect::everything()), batches = 1, reps = 1, rep_workers = 1, combine = TRUE, delimiter = "_", unlist = FALSE, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = targets::tar_option_get("format"), repository = targets::tar_option_get("repository"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Name of the target.
|
command |
R code for a single replicate. Must return
a data frame when run.
|
values |
Named list or data frame with values to iterate over.
The names are the names of symbols in the commands and pattern
statements, and the elements are values that get substituted
in place of those symbols. |
names |
Subset of |
descriptions |
Names of a column in |
columns |
A tidyselect expression to select which columns of |
batches |
Number of batches. This is also the number of dynamic
branches created during |
reps |
Number of replications in each batch. The total number
of replications is |
rep_workers |
Positive integer of length 1, number of local R processes to use to run reps within batches in parallel. If 1, then reps are run sequentially within each batch. If greater than 1, then reps within batch are run in parallel using a PSOCK cluster. |
combine |
Logical of length 1, whether to statically combine all the results into a single target downstream. |
delimiter |
Character of length 1, string to insert between other strings when creating names of targets. |
unlist |
Logical, whether to flatten the returned list of targets.
If |
tidy_eval |
Whether to invoke tidy evaluation
(e.g. the |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
A list of new target objects. See the "Target objects" section for background.
In ordinary pipelines, each target has its own unique deterministic pseudo-random number generator seed derived from its target name. In batched replication, however, each batch is a target with multiple replicates within that batch. That is why tar_rep() and friends give each replicate its own unique seed. Each replicate-specific seed is created based on the dynamic parent target name, tar_option_get("seed") (for targets version 0.13.5.9000 and above), batch index, and rep-within-batch index. The seed is set just before the replicate runs. Replicate-specific seeds are invariant to batching structure. In other words,
tar_rep(name = x, command = rnorm(1), batches = 100, reps = 1, ...)
produces the same numerical output as
tar_rep(name = x, command = rnorm(1), batches = 10, reps = 10, ...)
(but with different batch names).
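The invariance property can be illustrated with a toy seed rule in base R. This is a sketch under stated assumptions, not the actual seed algorithm of the package: toy_seed() and run_batches() are hypothetical helpers, and the toy seed depends only on the parent name and the overall position of the replicate.

```r
# Hypothetical toy rule: derive a seed from the parent target name
# and the global position of the replicate across all batches.
toy_seed <- function(name, batch, rep_index, reps) {
  global_index <- (batch - 1L) * reps + rep_index
  sum(utf8ToInt(name)) + global_index
}

# Run one rnorm(1) replicate per (batch, rep) pair, setting the
# replicate-specific seed just before each replicate runs.
run_batches <- function(name, batches, reps) {
  out <- numeric(0)
  for (batch in seq_len(batches)) {
    for (rep_index in seq_len(reps)) {
      set.seed(toy_seed(name, batch, rep_index, reps))
      out <- c(out, rnorm(1))
    }
  }
  out
}

many_small <- run_batches("x", batches = 100, reps = 1)
few_large <- run_batches("x", batches = 10, reps = 10)
identical(many_small, few_large)
```

Both layouts visit the same 100 replicates in the same order with the same seeds, so regrouping the replicates into batches leaves the simulated values unchanged.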
Other target factories with this seed scheme are tar_rep2(), tar_map_rep(), tar_map2_count(), tar_map2_size(), and tar_render_rep(). For the tar_map2_*() functions, it is possible to manually supply your own seeds through the command1 argument and then invoke them in your custom code for command2 (set.seed(), withr::with_seed(), or withr::local_seed()). For tar_render_rep(), custom seeds can be supplied to the params argument and then invoked in the individual R Markdown reports. Likewise with tar_quarto_rep() and the execute_params argument.
Other branching: tar_map2(), tar_map2_count(), tar_map2_size(), tar_rep(), tar_rep2(), tar_rep_map(), tar_rep_map_raw()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  library(tarchetypes)
  # Just a sketch of a Bayesian sensitivity analysis of hyperparameters:
  assess_hyperparameters <- function(sigma1, sigma2) {
    # data <- simulate_random_data() # user-defined function
    # run_model(data, sigma1, sigma2) # user-defined function
    # Mock output from the model:
    posterior_samples <- stats::rnorm(1000, 0, sigma1 + sigma2)
    tibble::tibble(
      posterior_median = median(posterior_samples),
      posterior_quantile_0.025 = quantile(posterior_samples, 0.025),
      posterior_quantile_0.975 = quantile(posterior_samples, 0.975)
    )
  }
  hyperparameters <- tibble::tibble(
    scenario = c("tight", "medium", "diffuse"),
    sigma1 = c(10, 50, 50),
    sigma2 = c(10, 5, 10)
  )
  list(
    tar_map_rep(
      name = sensitivity_analysis,
      command = assess_hyperparameters(sigma1, sigma2),
      values = hyperparameters,
      names = tidyselect::any_of("scenario"),
      batches = 2,
      reps = 3
    ),
    tar_map_rep_raw(
      name = "sensitivity_analysis2",
      command = quote(assess_hyperparameters(sigma1, sigma2)),
      values = hyperparameters,
      names = tidyselect::any_of("scenario"),
      batches = 2,
      reps = 3
    )
  )
})
targets::tar_make()
targets::tar_read(sensitivity_analysis)
})
}
Define targets for batched dynamic-within-static branching for data frames, where the user sets the (maximum) number of batches.
tar_map2_count() expects unevaluated language for arguments name, command1, command2, columns1, and columns2. tar_map2_count_raw() expects a character string for name and an evaluated expression object for each of command1, command2, columns1, and columns2.
tar_map2_count( name, command1, command2, values = NULL, names = NULL, descriptions = tidyselect::everything(), batches = 1L, combine = TRUE, suffix1 = "1", suffix2 = "2", columns1 = tidyselect::everything(), columns2 = tidyselect::everything(), rep_workers = 1, delimiter = "_", tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = targets::tar_option_get("format"), repository = targets::tar_option_get("repository"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_map2_count_raw( name, command1, command2, values = NULL, names = NULL, descriptions = quote(tidyselect::everything()), batches = 1L, combine = TRUE, suffix1 = "1", suffix2 = "2", columns1 = quote(tidyselect::everything()), columns2 = quote(tidyselect::everything()), rep_workers = 1, delimiter = "_", tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = targets::tar_option_get("format"), repository = targets::tar_option_get("repository"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), storage = targets::tar_option_get("storage"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), 
description = targets::tar_option_get("description") )
name |
Name of the target.
|
command1 |
R code to create named arguments to In regular |
command2 |
R code to map over the data frame of arguments
produced by In regular |
values |
Named list or data frame with values to iterate over.
The names are the names of symbols in the commands and pattern
statements, and the elements are values that get substituted
in place of those symbols. |
names |
Subset of |
descriptions |
Names of a column in |
batches |
Positive integer of length 1,
maximum number of batches (dynamic branches within static branches)
of the downstream ( |
combine |
Logical of length 1, whether to statically combine all the results into a single target downstream. |
suffix1 |
Character of length 1,
suffix to apply to the |
suffix2 |
Character of length 1,
suffix to apply to the |
columns1 |
A tidyselect expression to select which columns of In regular |
columns2 |
A tidyselect expression to select which columns of
In regular |
rep_workers |
Positive integer of length 1, number of local R processes to use to run reps within batches in parallel. If 1, then reps are run sequentially within each batch. If greater than 1, then reps within batch are run in parallel using a PSOCK cluster. |
delimiter |
Character of length 1, string to insert between other strings when creating names of targets. |
tidy_eval |
Whether to invoke tidy evaluation
(e.g. the |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
Static branching creates one pair of targets for each row in values. In each pair, there is an upstream non-dynamic target that runs command1 and a downstream dynamic target that runs command2. command1 produces a data frame of arguments to command2, and command2 dynamically maps over these arguments in batches.
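The two-stage pattern described here can be pictured with a small base R sketch. It is not how tarchetypes builds targets: the helper names command1 and command2 are hypothetical stand-ins for the user's commands, the batch assignment is one simple choice, and it assumes command2 is evaluated once per row of arguments.

```r
# Hypothetical stand-ins for the user-supplied commands (not tarchetypes code).
values <- data.frame(arg1 = c("a", "b"))
batches <- 3

# Upstream step: build a data frame of arguments for one row of `values`.
command1 <- function(arg1) {
  data.frame(arg1 = arg1, arg2 = seq_len(6))
}

# Downstream step: assumed here to run once per row of arguments.
command2 <- function(arg1, arg2) {
  data.frame(result = paste(arg1, arg2))
}

results <- lapply(values$arg1, function(value) {
  args <- command1(value)
  # Split the rows of arguments into `batches` consecutive groups.
  batch_id <- sort(rep_len(seq_len(batches), nrow(args)))
  per_batch <- lapply(split(args, batch_id), function(batch) {
    do.call(rbind, unname(Map(command2, batch$arg1, batch$arg2)))
  })
  do.call(rbind, per_batch)
})

# Analogous to combine = TRUE: one data frame with all results.
combined <- do.call(rbind, results)
nrow(combined)
```

In the real pipeline each group of rows would be a dynamic branch, so unchanged batches can be skipped on later runs.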
A list of new target objects. See the "Target objects" section for background.
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
In ordinary pipelines, each target has its own unique deterministic pseudo-random number generator seed derived from its target name. In batched replication, however, each batch is a target with multiple replicates within that batch. That is why tar_rep() and friends give each replicate its own unique seed. Each replicate-specific seed is created based on the dynamic parent target name, tar_option_get("seed") (for targets version 0.13.5.9000 and above), batch index, and rep-within-batch index. The seed is set just before the replicate runs. Replicate-specific seeds are invariant to batching structure. In other words,
tar_rep(name = x, command = rnorm(1), batches = 100, reps = 1, ...)
produces the same numerical output as
tar_rep(name = x, command = rnorm(1), batches = 10, reps = 10, ...)
(but with different batch names).
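To see why per-replicate seeds can be invariant to batching, consider the base R sketch below. It is only an illustration: toy_seed() is a hypothetical function, and its arithmetic is an assumption that does not reproduce the hashing used by targets or tarchetypes. The point is that the seed depends on the overall position of the replicate, not on how replicates are grouped.

```r
# Hypothetical toy seed: NOT the scheme used by targets or tarchetypes.
toy_seed <- function(name, index, base_seed = 0L) {
  raw <- sum(utf8ToInt(name)) * 1000003 + index + base_seed
  as.integer(raw %% .Machine$integer.max)
}

run_batches <- function(name, batches, reps) {
  out <- numeric(batches * reps)
  for (batch in seq_len(batches)) {
    for (rep in seq_len(reps)) {
      # Overall replicate index, independent of the batching structure.
      index <- (batch - 1L) * reps + rep
      set.seed(toy_seed(name, index)) # Seed is set just before the replicate.
      out[index] <- rnorm(1)
    }
  }
  out
}

one_per_batch <- run_batches("x", batches = 100, reps = 1)
ten_per_batch <- run_batches("x", batches = 10, reps = 10)
identical(one_per_batch, ten_per_batch)
```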
Other target factories with this seed scheme are tar_rep2(), tar_map_rep(), tar_map2_count(), tar_map2_size(), and tar_render_rep(). For the tar_map2_*() functions, it is possible to manually supply your own seeds through the command1 argument and then invoke them in your custom code for command2 (set.seed(), withr::with_seed(), or withr::local_seed()). For tar_render_rep(), custom seeds can be supplied to the params argument and then invoked in the individual R Markdown reports. Likewise with tar_quarto_rep() and the execute_params argument.
Other branching: tar_map2(), tar_map2_size(), tar_map_rep(), tar_rep(), tar_rep2(), tar_rep_map(), tar_rep_map_raw()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  tarchetypes::tar_map2_count(
    x,
    command1 = tibble::tibble(
      arg1 = arg1,
      arg2 = seq_len(6)
    ),
    command2 = tibble::tibble(
      result = paste(arg1, arg2),
      random = sample.int(1e9, size = 1),
      length_input = length(arg1)
    ),
    values = tibble::tibble(arg1 = letters[seq_len(2)]),
    batches = 3
  )
})
targets::tar_make()
targets::tar_read(x)
# With tar_map2_count_raw():
targets::tar_script({
  tarchetypes::tar_map2_count_raw(
    name = "x",
    command1 = quote(
      tibble::tibble(
        arg1 = arg1,
        arg2 = seq_len(6)
      )
    ),
    command2 = quote(
      tibble::tibble(
        result = paste(arg1, arg2),
        random = sample.int(1e9, size = 1),
        length_input = length(arg1)
      )
    ),
    values = tibble::tibble(arg1 = letters[seq_len(2)]),
    batches = 3
  )
})
})
}
Define targets for batched dynamic-within-static branching for data frames, where the user sets the (maximum) size of each batch.
tar_map2_size() expects unevaluated language for arguments name, command1, command2, columns1, and columns2. tar_map2_size_raw() expects a character string for name and an evaluated expression object for each of command1, command2, columns1, and columns2.
tar_map2_size(
  name,
  command1,
  command2,
  values = NULL,
  names = NULL,
  descriptions = tidyselect::everything(),
  size = Inf,
  combine = TRUE,
  suffix1 = "1",
  suffix2 = "2",
  columns1 = tidyselect::everything(),
  columns2 = tidyselect::everything(),
  rep_workers = 1,
  delimiter = "_",
  tidy_eval = targets::tar_option_get("tidy_eval"),
  packages = targets::tar_option_get("packages"),
  library = targets::tar_option_get("library"),
  format = targets::tar_option_get("format"),
  repository = targets::tar_option_get("repository"),
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = targets::tar_option_get("deployment"),
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  storage = targets::tar_option_get("storage"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)

tar_map2_size_raw(
  name,
  command1,
  command2,
  values = NULL,
  names = NULL,
  descriptions = quote(tidyselect::everything()),
  size = Inf,
  combine = TRUE,
  suffix1 = "1",
  suffix2 = "2",
  columns1 = quote(tidyselect::everything()),
  columns2 = quote(tidyselect::everything()),
  rep_workers = 1,
  delimiter = "_",
  tidy_eval = targets::tar_option_get("tidy_eval"),
  packages = targets::tar_option_get("packages"),
  library = targets::tar_option_get("library"),
  format = targets::tar_option_get("format"),
  repository = targets::tar_option_get("repository"),
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = targets::tar_option_get("deployment"),
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  storage = targets::tar_option_get("storage"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)
name |
Name of the target.
|
command1 |
R code to create named arguments to In regular |
command2 |
R code to map over the data frame of arguments
produced by In regular |
values |
Named list or data frame with values to iterate over.
The names are the names of symbols in the commands and pattern
statements, and the elements are values that get substituted
in place of those symbols. |
names |
Subset of |
descriptions |
Names of a column in |
size |
Positive integer of length 1,
maximum number of rows in each batch for
the downstream ( |
combine |
Logical of length 1, whether to statically combine all the results into a single target downstream. |
suffix1 |
Character of length 1,
suffix to apply to the |
suffix2 |
Character of length 1,
suffix to apply to the |
columns1 |
A tidyselect expression to select which columns of In regular |
columns2 |
A tidyselect expression to select which columns of
In regular |
rep_workers |
Positive integer of length 1, number of local R processes to use to run reps within batches in parallel. If 1, then reps are run sequentially within each batch. If greater than 1, then reps within batch are run in parallel using a PSOCK cluster. |
delimiter |
Character of length 1, string to insert between other strings when creating names of targets. |
tidy_eval |
Whether to invoke tidy evaluation
(e.g. the |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
Static branching creates one pair of targets for each row in values. In each pair, there is an upstream non-dynamic target that runs command1 and a downstream dynamic target that runs command2. command1 produces a data frame of arguments to command2, and command2 dynamically maps over these arguments in batches.
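As a rough picture of batching by maximum size, the base R sketch below assigns consecutive rows of a data frame of arguments to batches of at most size rows. This is one simple way to respect the limit and is an assumption for illustration; the package may distribute rows across batches differently (for example, more evenly).

```r
# Data frame of arguments, as an upstream command might produce it.
args <- data.frame(arg1 = "a", arg2 = seq_len(6))
size <- 4 # Maximum number of rows per batch.

# Consecutive rows share a batch until the batch holds `size` rows.
batch_id <- ceiling(seq_len(nrow(args)) / size)
batches <- split(args, batch_id)

# Number of rows in each batch; none exceeds `size`.
batch_rows <- vapply(batches, nrow, integer(1))
batch_rows
```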
A list of new target objects. See the "Target objects" section for background.
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
In ordinary pipelines, each target has its own unique deterministic pseudo-random number generator seed derived from its target name. In batched replication, however, each batch is a target with multiple replicates within that batch. That is why tar_rep() and friends give each replicate its own unique seed. Each replicate-specific seed is created based on the dynamic parent target name, tar_option_get("seed") (for targets version 0.13.5.9000 and above), batch index, and rep-within-batch index. The seed is set just before the replicate runs. Replicate-specific seeds are invariant to batching structure. In other words,
tar_rep(name = x, command = rnorm(1), batches = 100, reps = 1, ...)
produces the same numerical output as
tar_rep(name = x, command = rnorm(1), batches = 10, reps = 10, ...)
(but with different batch names).
Other target factories with this seed scheme are tar_rep2(), tar_map_rep(), tar_map2_count(), tar_map2_size(), and tar_render_rep(). For the tar_map2_*() functions, it is possible to manually supply your own seeds through the command1 argument and then invoke them in your custom code for command2 (set.seed(), withr::with_seed(), or withr::local_seed()). For tar_render_rep(), custom seeds can be supplied to the params argument and then invoked in the individual R Markdown reports. Likewise with tar_quarto_rep() and the execute_params argument.
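The idea of supplying your own seeds can be sketched in base R as follows. The data frame stands in for the output of command1 and the function stands in for command2; both names and the seed column are hypothetical, and set.seed() is used here where withr::with_seed() or withr::local_seed() would also fit.

```r
# Stand-in for the output of command1: arguments plus a user-chosen seed.
args <- data.frame(arg = c(10, 20, 30), seed = c(101L, 102L, 103L))

# Stand-in for command2: set the supplied seed, then do the random work.
command2 <- function(arg, seed) {
  set.seed(seed)
  data.frame(arg = arg, draw = rnorm(1, mean = arg))
}

# Running the same rows twice gives the same draws
# because each row carries its own seed.
first <- do.call(rbind, Map(command2, args$arg, args$seed))
second <- do.call(rbind, Map(command2, args$arg, args$seed))
identical(first, second)
```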
Other branching: tar_map2(), tar_map2_count(), tar_map_rep(), tar_rep(), tar_rep2(), tar_rep_map(), tar_rep_map_raw()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  tarchetypes::tar_map2_size(
    x,
    command1 = tibble::tibble(
      arg1 = arg1,
      arg2 = seq_len(6)
    ),
    command2 = tibble::tibble(
      result = paste(arg1, arg2),
      random = sample.int(1e9, size = 1),
      length_input = length(arg1)
    ),
    values = tibble::tibble(arg1 = letters[seq_len(2)]),
    size = 2
  )
})
targets::tar_make()
targets::tar_read(x)
# With tar_map2_size_raw():
targets::tar_script({
  tarchetypes::tar_map2_size_raw(
    name = "x",
    command1 = quote(
      tibble::tibble(
        arg1 = arg1,
        arg2 = seq_len(6)
      )
    ),
    command2 = quote(
      tibble::tibble(
        result = paste(arg1, arg2),
        random = sample.int(1e9, size = 1),
        length_input = length(arg1)
      )
    ),
    values = tibble::tibble(arg1 = letters[seq_len(2)]),
    size = 2
  )
})
})
}
drake-plan-like pipeline DSL
Simplify target specification in pipelines.
tar_plan(...)
... |
Named and unnamed targets. All named targets must follow
the |
Allows targets with just names and commands to be written in the pipeline as target = command instead of tar_target(target, command). Also supports ordinary target objects if they are unnamed.
tar_plan(x = 1, y = 2, tar_target(z, 3), tar_render(r, "r.Rmd"))
is equivalent to
list(tar_target(x, 1), tar_target(y, 2), tar_target(z, 3), tar_render(r, "r.Rmd")).
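The rewriting of named arguments into tar_target() calls can be illustrated with a base R sketch. sketch_plan() is a hypothetical helper, not the tarchetypes implementation: it only builds the unevaluated calls and never runs tar_target(), so it works without targets installed.

```r
# Hypothetical helper for illustration; not part of tarchetypes.
sketch_plan <- function(...) {
  # Capture the arguments without evaluating them.
  commands <- as.list(substitute(list(...)))[-1]
  labels <- names(commands)
  if (is.null(labels)) {
    labels <- rep("", length(commands))
  }
  Map(function(label, command) {
    if (nzchar(label)) {
      # Named argument: rewrite `label = command` as tar_target(label, command).
      as.call(list(quote(tar_target), as.symbol(label), command))
    } else {
      # Unnamed argument: keep the target-producing call as written.
      command
    }
  }, labels, commands)
}

calls <- sketch_plan(x = 1, y = 2, tar_target(z, 3))
call_text <- unname(vapply(calls, deparse, character(1)))
call_text
```

Evaluating each resulting call in an environment where targets is attached would then yield the list of target objects.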
A list of tar_target() objects. See the "Target objects" section for background.
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  library(tarchetypes)
  tar_plan(
    tarchetypes::tar_fst_tbl(data, data.frame(x = seq_len(26))),
    means = colMeans(data) # No need for tar_target() for simple cases.
  )
})
targets::tar_make()
})
}
Shorthand to include a Quarto project in a targets pipeline.
tar_quarto() expects an unevaluated symbol for the name argument and an unevaluated expression for the execute_params argument. tar_quarto_raw() expects a character string for the name argument and an evaluated expression object for the execute_params argument.
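The difference between the two calling conventions can be shown with a small base R sketch. capture_standard() and capture_raw() are hypothetical helpers written for this illustration, not functions from tarchetypes; they only show how an unevaluated symbol and expression can carry the same information as a character string and a quoted expression.

```r
# Hypothetical helpers for illustration; not part of tarchetypes.

# Standard interface: arguments are captured without being evaluated.
capture_standard <- function(name, execute_params) {
  list(
    name = deparse(substitute(name)),
    execute_params = substitute(execute_params)
  )
}

# Raw interface: the caller passes a string and an expression object.
capture_raw <- function(name, execute_params) {
  stopifnot(is.character(name), is.language(execute_params))
  list(name = name, execute_params = execute_params)
}

standard <- capture_standard(report, list(your_param = data))
raw <- capture_raw("report", quote(list(your_param = data)))

# Both interfaces describe the same name and the same expression.
identical(standard$name, raw$name)
identical(deparse(standard$execute_params), deparse(raw$execute_params))
```

The raw form is convenient when names and expressions are generated programmatically, since no non-standard evaluation is involved.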
tar_quarto(
  name,
  path = ".",
  output_file = NULL,
  working_directory = NULL,
  extra_files = character(0),
  execute = TRUE,
  execute_params = list(),
  cache = NULL,
  cache_refresh = FALSE,
  debug = FALSE,
  quiet = TRUE,
  quarto_args = NULL,
  pandoc_args = NULL,
  profile = NULL,
  tidy_eval = targets::tar_option_get("tidy_eval"),
  packages = NULL,
  library = NULL,
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = "main",
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)

tar_quarto_raw(
  name,
  path = ".",
  output_file = NULL,
  working_directory = NULL,
  extra_files = character(0),
  execute = TRUE,
  execute_params = NULL,
  cache = NULL,
  cache_refresh = FALSE,
  debug = FALSE,
  quiet = TRUE,
  quarto_args = NULL,
  pandoc_args = NULL,
  profile = NULL,
  packages = NULL,
  library = NULL,
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = "main",
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)
name |
Name of the target.
|
path |
Character string, path to the Quarto source file if rendering a single file, or the path to the root of the project if rendering a whole Quarto project. |
output_file |
The name of the output file. If using |
working_directory |
Optional character string,
path to the working directory
to temporarily set when running the report.
The default is |
extra_files |
Character vector of extra files and
directories to track for changes. The target will be invalidated
(rerun on the next |
execute |
Whether to execute embedded code chunks. |
execute_params |
Named collection of parameters
for parameterized Quarto documents. These parameters override the
custom elements of the
|
cache |
Cache execution output (uses knitr cache and jupyter-cache respectively for Rmd and Jupyter input files). |
cache_refresh |
Force refresh of execution cache. |
debug |
Leave intermediate files in place after render. |
quiet |
Suppress warning and other messages. |
quarto_args |
Character vector of other |
pandoc_args |
Additional command line arguments to pass on to Pandoc. |
profile |
Quarto project profile(s) to use. Either
a character vector of profile names or |
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
tar_quarto() is an alternative to tar_target() for Quarto projects and standalone Quarto source documents that depend on upstream targets. The Quarto R source documents (*.qmd and *.Rmd files) should mention dependency targets with tar_load() and tar_read() in the active R code chunks (which also allows you to render the project outside the pipeline if the _targets/ data store already exists). (Do not use tar_load_raw() or tar_read_raw() for this.) Then, tar_quarto() defines a special kind of target. It:
1. Finds all the tar_load()/tar_read() dependencies in the R source reports and inserts them into the target's command. This enforces the proper dependency relationships. (Do not use tar_load_raw() or tar_read_raw() for this.)
2. Sets format = "file" (see tar_target()) so targets watches the files at the returned paths and reruns the report if those files change.
3. Configures the target's command to return both the output rendered files and the input dependency files (such as Quarto source documents). All these file paths are relative paths so the project stays portable.
4. Forces the report to run in the user's current working directory instead of the working directory of the report.
5. Sets convenient default options such as deployment = "main" in the target and quiet = TRUE in quarto::quarto_render().
A target object with format = "file". When this target runs, it returns a character vector of file paths: the rendered documents, the Quarto source files, and other input and output files. The output files are determined by the YAML front-matter of standalone Quarto documents and _quarto.yml in Quarto projects, and you can see these files with tar_quarto_files() (powered by quarto::quarto_inspect()). All returned paths are relative paths to ensure portability (so that the project can be moved from one file system to another without invalidating the target). See the "Target objects" section for background.
If you encounter difficult errors, please read https://github.com/quarto-dev/quarto-r/issues/16. In addition, please try to reproduce the error using
quarto::quarto_render("your_report.qmd", execute_dir = getwd())
without using targets at all. Isolating errors this way makes them much easier to solve.
Literate programming files are messy and variable, so functions like tar_render() have limitations:
* Child documents are not tracked for changes.
* Upstream target dependencies are not detected if tar_read() and/or tar_load() are called from a user-defined function. In addition, single target names must be mentioned and they must be symbols. tar_load("x") and tar_load(contains("x")) may not detect target x.
* Special/optional input/output files may not be detected in all cases.
* tar_render() and friends are for local files only. They do not integrate with the cloud storage capabilities of targets.
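The symbol-only limitation above is easier to see with a toy static scan in base R. find_target_deps() is a hypothetical helper and an assumption about the general approach, not the detection code used by tarchetypes: it parses code text, walks the syntax tree, and records the first argument of tar_load() or tar_read() only when that argument is a bare symbol.

```r
# Illustrative only: a toy static scan, not the code used by tarchetypes.
find_target_deps <- function(code) {
  deps <- character(0)
  walk <- function(expr) {
    if (!is.call(expr)) {
      return(invisible(NULL))
    }
    fun <- expr[[1]]
    fun_name <- ""
    if (is.symbol(fun)) {
      fun_name <- as.character(fun)
    } else if (is.call(fun) && identical(fun[[1]], quote(`::`))) {
      # Handle namespaced calls such as targets::tar_read(x).
      fun_name <- as.character(fun[[3]])
    }
    is_loader <- fun_name %in% c("tar_load", "tar_read")
    # Only a bare symbol counts; strings and helper calls are skipped.
    if (is_loader && length(expr) >= 2 && is.symbol(expr[[2]])) {
      deps <<- c(deps, as.character(expr[[2]]))
    }
    for (i in seq_along(expr)) {
      # Skip empty arguments such as the second index in df[1, ].
      if (!identical(expr[[i]], quote(expr = ))) {
        walk(expr[[i]])
      }
    }
    invisible(NULL)
  }
  for (expr in as.list(parse(text = code, keep.source = FALSE))) {
    walk(expr)
  }
  unique(deps[nzchar(deps)])
}

code <- c(
  "tar_read(data)",
  "targets::tar_load(model)",
  "tar_load(\"x\")", # A string, so this scan does not record x.
  "summary(tar_read(fit))",
  "df[1, ]"
)
deps <- find_target_deps(code)
deps
```

A scan of this kind has no way to know which targets tar_load(contains("x")) would select at run time, which is why single symbol names are the safe choice in reports.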
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other Literate programming targets: tar_knit(), tar_quarto_rep(), tar_render(), tar_render_rep()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
# Unparameterized Quarto document:
lines <- c(
  "---",
  "title: report.qmd source file",
  "output_format: html",
  "---",
  "Assume these lines are in report.qmd.",
  "```{r}",
  "targets::tar_read(data)",
  "```"
)
writeLines(lines, "report.qmd")
# Include the report in a pipeline as follows.
targets::tar_script({
  library(tarchetypes)
  list(
    tar_target(data, data.frame(x = seq_len(26), y = letters)),
    tar_quarto(name = report, path = "report.qmd")
  )
}, ask = FALSE)
# Then, run the pipeline as usual.

# Parameterized Quarto:
lines <- c(
  "---",
  "title: 'report.qmd source file with parameters'",
  "output_format: html_document",
  "params:",
  " your_param: \"default value\"",
  "---",
  "Assume these lines are in report.qmd.",
  "```{r}",
  "print(params$your_param)",
  "```"
)
writeLines(lines, "report.qmd")
# Include the report in the pipeline as follows.
unlink("_targets.R") # In tar_dir(), not the user's file space.
targets::tar_script({
  library(tarchetypes)
  list(
    tar_target(data, data.frame(x = seq_len(26), y = letters)),
    tar_quarto(
      name = report,
      path = "report.qmd",
      execute_params = list(your_param = data)
    ),
    tar_quarto_raw(
      name = "report2",
      path = "report.qmd",
      execute_params = quote(list(your_param = data))
    )
  )
}, ask = FALSE)
})
# Then, run the pipeline as usual.
}
Detect the important files in a Quarto project.
tar_quarto_files(path = ".", profile = NULL, quiet = TRUE)
path |
Character of length 1, either the file path to a Quarto source document or the directory path to a Quarto project. Defaults to the Quarto project in the current working directory. |
profile |
Character of length 1, Quarto profile. If |
quiet |
Suppress warning and other messages. |
This function is just a thin wrapper that interprets the output of quarto::quarto_inspect() and returns what tarchetypes needs to know about the current Quarto project or document.
A named list of important file paths in a Quarto project or document:
* sources: source files which may reference upstream target dependencies in code chunks using tar_load()/tar_read().
* output: output files that will be generated during quarto::quarto_render().
* input: pre-existing files required to render the project or document, such as _quarto.yml and quarto extensions.
Other Literate programming utilities: tar_knitr_deps(), tar_knitr_deps_expr()
lines <- c(
  "---",
  "title: source file",
  "---",
  "Assume these lines are in report.qmd.",
  "```{r}",
  "1 + 1",
  "```"
)
path <- tempfile(fileext = ".qmd")
writeLines(lines, path)
# If Quarto is installed, run:
# tar_quarto_files(path)
Targets to render a parameterized Quarto document with multiple sets of parameters.
tar_quarto_rep() expects an unevaluated symbol for the name argument and an unevaluated expression for the execute_params argument.
tar_quarto_rep_raw() expects a character string for the name argument and an evaluated expression object for the execute_params argument.
tar_quarto_rep( name, path, working_directory = NULL, execute_params = data.frame(), batches = NULL, extra_files = character(0), execute = TRUE, cache = NULL, cache_refresh = FALSE, debug = FALSE, quiet = TRUE, quarto_args = NULL, pandoc_args = NULL, rep_workers = 1, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = targets::tar_option_get("format"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") ) tar_quarto_rep_raw( name, path, working_directory = NULL, execute_params = expression(NULL), batches = NULL, extra_files = character(0), execute = TRUE, cache = NULL, cache_refresh = FALSE, debug = FALSE, quiet = TRUE, quarto_args = NULL, pandoc_args = NULL, rep_workers = 1, packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = targets::tar_option_get("format"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description") )
name |
Name of the target.
|
path |
Character string, path to the Quarto source file if rendering a single file, or the path to the root of the project if rendering a whole Quarto project. |
working_directory |
Optional character string,
path to the working directory
to temporarily set when running the report.
The default is |
execute_params |
Code to generate
a data frame or You may also include an
|
batches |
Number of batches. This is also the number of dynamic
branches created during |
extra_files |
Character vector of extra files and
directories to track for changes. The target will be invalidated
(rerun on the next |
execute |
Whether to execute embedded code chunks. |
cache |
Cache execution output (uses knitr cache and jupyter-cache respectively for Rmd and Jupyter input files). |
cache_refresh |
Force refresh of execution cache. |
debug |
Leave intermediate files in place after render. |
quiet |
Suppress warning and other messages. |
quarto_args |
Character vector of other |
pandoc_args |
Additional command line arguments to pass on to Pandoc. |
rep_workers |
Positive integer of length 1, number of local R processes to use to run reps within batches in parallel. If 1, then reps are run sequentially within each batch. If greater than 1, then reps within batch are run in parallel using a PSOCK cluster. |
tidy_eval |
Logical of length 1, whether to use tidy evaluation
to resolve |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
iteration |
Character of length 1, name of the iteration mode of the target. Choices:
|
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
tar_quarto_rep() is an alternative to tar_target() for a parameterized Quarto document that depends on other targets.
Parameters must be given as a data frame with one row per rendered report and one column per parameter.
An optional output_file column may be included to set the output file path of each rendered report.
(See the execute_params argument for details.)
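As a minimal sketch of the parameter layout described above, the following base R code builds a data frame with one row per report, one parameter column, and an optional output_file column. The parameter name par and the file names are hypothetical, chosen only for illustration; the resulting expression is the kind of value one would supply to execute_params.

```r
# Hypothetical parameter grid: one row per rendered report,
# one column per Quarto parameter (here a single parameter, `par`).
execute_params <- data.frame(par = c(1, 2, 3))

# Optional `output_file` column: one output path per rendered report.
# These file names are made up for illustration.
execute_params$output_file <- sprintf(
  "report_par_%s.html",
  execute_params$par
)

# Each row describes one report to render.
print(execute_params)
```

Keeping the output paths in the same data frame as the parameters makes it easy to see which rendered file belongs to which parameter set.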
The Quarto source should mention other dependency targets with tar_load() and tar_read() in the active code chunks
(which also allows you to render the report outside the pipeline if the _targets/ data store already exists
and appropriate defaults are specified for the parameters).
(Do not use tar_load_raw() or tar_read_raw() for this.)
Then, tar_quarto_rep() defines a special kind of target. It
1. Finds all the tar_load()/tar_read() dependencies in the report and inserts them into the target's command. This enforces the proper dependency relationships. (Do not use tar_load_raw() or tar_read_raw() for this.)
2. Sets format = "file" (see tar_target()) so targets watches the files at the returned paths and reruns the report if those files change.
3. Configures the target's command to return the output report files: the rendered document, the source file, and file paths mentioned in extra_files. All these file paths are relative paths so the project stays portable.
4. Forces the report to run in the user's current working directory instead of the working directory of the report.
5. Sets convenient default options such as deployment = "main" in the target and quiet = TRUE in quarto::quarto_render().
A list of target objects to render the Quarto reports. Changes to the parameters, source file, dependencies, etc. will cause the appropriate targets to rerun during tar_make().
See the "Target objects" section for background.
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects.
Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/.
Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
In ordinary pipelines, each target has its own unique deterministic pseudo-random number generator seed derived from its target name.
In batched replication, however, each batch is a target with multiple replicates within that batch.
That is why tar_rep() and friends give each replicate its own unique seed.
Each replicate-specific seed is created based on the dynamic parent target name, tar_option_get("seed") (for targets version 0.13.5.9000 and above), batch index, and rep-within-batch index.
The seed is set just before the replicate runs.
Replicate-specific seeds are invariant to batching structure.
In other words, tar_rep(name = x, command = rnorm(1), batches = 100, reps = 1, ...) produces the same numerical output as tar_rep(name = x, command = rnorm(1), batches = 10, reps = 10, ...) (but with different batch names).
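The idea of batching invariance can be illustrated with a toy example in base R. The seed rule below (toy_seed()) is an assumption made up for this sketch and is not the algorithm tarchetypes uses; it only shows how a seed that depends on the overall position of a replicate, rather than on how replicates are grouped, yields the same draws under different batch layouts.

```r
# Toy seed rule (hypothetical, NOT the tarchetypes algorithm):
# the seed depends only on a base seed and the overall replicate position.
toy_seed <- function(base_seed, batch, rep_index, reps_per_batch) {
  base_seed + (batch - 1L) * reps_per_batch + rep_index
}

# Run batches * reps replicates, setting the seed just before each one.
simulate <- function(batches, reps, base_seed = 1000L) {
  out <- numeric(0)
  for (batch in seq_len(batches)) {
    for (rep_index in seq_len(reps)) {
      set.seed(toy_seed(base_seed, batch, rep_index, reps))
      out <- c(out, rnorm(1))
    }
  }
  out
}

many_small <- simulate(batches = 100, reps = 1)
few_large <- simulate(batches = 10, reps = 10)

# Same 100 draws regardless of how the replicates are batched.
identical(many_small, few_large)
```

Both layouts visit the same 100 seeds in the same order, which is why the two result vectors agree element by element.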
Other target factories with this seed scheme are tar_rep2(), tar_map_rep(), tar_map2_count(), tar_map2_size(), and tar_render_rep().
For the tar_map2_*() functions, it is possible to manually supply your own seeds through the command1 argument and then invoke them in your custom code for command2 (set.seed(), withr::with_seed(), or withr::local_seed()).
For tar_render_rep(), custom seeds can be supplied to the params argument and then invoked in the individual R Markdown reports.
Likewise with tar_quarto_rep() and the execute_params argument.
Literate programming files are messy and variable, so functions like tar_render() have limitations:
* Child documents are not tracked for changes.
* Upstream target dependencies are not detected if tar_read() and/or tar_load() are called from a user-defined function. In addition, single target names must be mentioned and they must be symbols. tar_load("x") and tar_load(contains("x")) may not detect target x.
* Special/optional input/output files may not be detected in all cases.
* tar_render() and friends are for local files only. They do not integrate with the cloud storage capabilities of targets.
If you encounter difficult errors, please read https://github.com/quarto-dev/quarto-r/issues/16.
In addition, please try to reproduce the error using quarto::quarto_render("your_report.qmd", execute_dir = getwd()) without using targets at all.
Isolating errors this way makes them much easier to solve.
Other Literate programming targets: tar_knit(), tar_quarto(), tar_render(), tar_render_rep()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
# Parameterized Quarto:
lines <- c(
  "---",
  "title: 'report.qmd file'",
  "output_format: html_document",
  "params:",
  "  par: \"default value\"",
  "---",
  "Assume these lines are in a file called report.qmd.",
  "```{r}",
  "print(params$par)",
  "```"
)
writeLines(lines, "report.qmd") # In tar_dir(), not the user's file space.
# The following pipeline will run the report for each row of params.
# The two targets need distinct names, hence "report2" below.
targets::tar_script({
  library(tarchetypes)
  list(
    tar_quarto_rep(
      name = report,
      path = "report.qmd",
      execute_params = tibble::tibble(par = c(1, 2))
    ),
    tar_quarto_rep_raw(
      name = "report2",
      path = "report.qmd",
      execute_params = quote(tibble::tibble(par = c(1, 2)))
    )
  )
}, ask = FALSE)
# Then, run the targets pipeline as usual.
})
}
Shorthand to include an R Markdown document in a targets pipeline.
tar_render() expects an unevaluated symbol for the name argument, and it supports named ... arguments for rmarkdown::render() arguments.
tar_render_raw() expects a character string for name and supports an evaluated expression object render_arguments for rmarkdown::render() arguments.
tar_render( name, path, output_file = NULL, working_directory = NULL, tidy_eval = targets::tar_option_get("tidy_eval"), packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = "main", priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description"), quiet = TRUE, ... ) tar_render_raw( name, path, output_file = NULL, working_directory = NULL, packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), error = targets::tar_option_get("error"), deployment = "main", priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description"), quiet = TRUE, render_arguments = quote(list()) )
name |
Name of the target.
|
path |
Character string, file path to the R Markdown source file. Must have length 1. |
output_file |
Character string, file path to the rendered output file. |
working_directory |
Optional character string,
path to the working directory
to temporarily set when running the report.
The default is |
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
quiet |
An option to suppress printing during rendering from knitr,
pandoc command line and others. To only suppress printing of the last
"Output created: " message, you can set |
... |
Named arguments to |
render_arguments |
Optional language object with a list
of named arguments to |
tar_render() is an alternative to tar_target() for R Markdown reports that depend on other targets.
The R Markdown source should mention dependency targets with tar_load() and tar_read() in the active code chunks (which also allows you to render the report outside the pipeline if the _targets/ data store already exists).
(Do not use tar_load_raw() or tar_read_raw() for this.)
Then, tar_render() defines a special kind of target. It
1. Finds all the tar_load()/tar_read() dependencies in the report and inserts them into the target's command. This enforces the proper dependency relationships. (Do not use tar_load_raw() or tar_read_raw() for this.)
2. Sets format = "file" (see tar_target()) so targets watches the files at the returned paths and reruns the report if those files change.
3. Configures the target's command to return both the output report files and the input source file. All these file paths are relative paths so the project stays portable.
4. Forces the report to run in the user's current working directory instead of the working directory of the report.
5. Sets convenient default options such as deployment = "main" in the target and quiet = TRUE in rmarkdown::render().
A target object with format = "file".
When this target runs, it returns a character vector of file paths: the rendered document, the source file, and then the *_files/ directory if it exists.
Unlike rmarkdown::render(), all returned paths are relative paths to ensure portability (so that the project can be moved from one file system to another without invalidating the target).
See the "Target objects" section for background.
Literate programming files are messy and variable, so functions like tar_render() have limitations:
* Child documents are not tracked for changes.
* Upstream target dependencies are not detected if tar_read() and/or tar_load() are called from a user-defined function. In addition, single target names must be mentioned and they must be symbols. tar_load("x") and tar_load(contains("x")) may not detect target x.
* Special/optional input/output files may not be detected in all cases.
* tar_render() and friends are for local files only. They do not integrate with the cloud storage capabilities of targets.
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects.
Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/.
Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other Literate programming targets: tar_knit(), tar_quarto(), tar_quarto_rep(), tar_render_rep()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
# Unparameterized R Markdown:
lines <- c(
  "---",
  "title: report.Rmd source file",
  "output_format: html_document",
  "---",
  "Assume these lines are in report.Rmd.",
  "```{r}",
  "targets::tar_read(data)",
  "```"
)
# Include the report in a pipeline as follows.
targets::tar_script({
  library(tarchetypes)
  list(
    tar_target(data, data.frame(x = seq_len(26), y = letters)),
    tar_render(report, "report.Rmd")
  )
}, ask = FALSE)
# Then, run the targets pipeline as usual.

# Parameterized R Markdown:
lines <- c(
  "---",
  "title: 'report.Rmd source file with parameters'",
  "output_format: html_document",
  "params:",
  "  your_param: \"default value\"",
  "---",
  "Assume these lines are in report.Rmd.",
  "```{r}",
  "print(params$your_param)",
  "```"
)
# Include the report in the pipeline as follows.
targets::tar_script({
  library(tarchetypes)
  list(
    tar_target(data, data.frame(x = seq_len(26), y = letters)),
    tar_render(
      name = report,
      "report.Rmd",
      params = list(your_param = data)
    ),
    # tar_render_raw() has no ... argument, so rmarkdown::render()
    # arguments such as params go through render_arguments.
    tar_render_raw(
      name = "report2",
      "report.Rmd",
      render_arguments = quote(list(params = list(your_param = data)))
    )
  )
}, ask = FALSE)
})
# Then, run the targets pipeline as usual.
}
Targets to render a parameterized R Markdown report with multiple sets of parameters.
tar_render_rep() expects an unevaluated symbol for the name argument, and it supports named ... arguments for rmarkdown::render() arguments.
tar_render_rep_raw() expects a character string for name and supports an evaluated expression object render_arguments for rmarkdown::render() arguments.
tar_render_rep( name, path, working_directory = NULL, params = data.frame(), batches = NULL, rep_workers = 1, packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = targets::tar_option_get("format"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description"), quiet = TRUE, ... ) tar_render_rep_raw( name, path, working_directory = NULL, params = expression(NULL), batches = NULL, rep_workers = 1, packages = targets::tar_option_get("packages"), library = targets::tar_option_get("library"), format = targets::tar_option_get("format"), iteration = targets::tar_option_get("iteration"), error = targets::tar_option_get("error"), memory = targets::tar_option_get("memory"), garbage_collection = targets::tar_option_get("garbage_collection"), deployment = targets::tar_option_get("deployment"), priority = targets::tar_option_get("priority"), resources = targets::tar_option_get("resources"), retrieval = targets::tar_option_get("retrieval"), cue = targets::tar_option_get("cue"), description = targets::tar_option_get("description"), quiet = TRUE, args = list() )
name |
Name of the target.
|
path |
Character string, file path to the R Markdown source file. Must have length 1. |
working_directory |
Optional character string,
path to the working directory
to temporarily set when running the report.
The default is |
params |
Code to generate a data frame or |
batches |
Number of batches. This is also the number of dynamic
branches created during |
rep_workers |
Positive integer of length 1, number of local R processes to use to run reps within batches in parallel. If 1, then reps are run sequentially within each batch. If greater than 1, then reps within batch are run in parallel using a PSOCK cluster. |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
iteration |
Character of length 1, name of the iteration mode of the target. Choices:
|
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
quiet |
An option to suppress printing during rendering from knitr,
pandoc command line and others. To only suppress printing of the last
"Output created: " message, you can set |
... |
Other named arguments to |
args |
Named list of other arguments to |
tar_render_rep() is an alternative to tar_target() for parameterized R Markdown reports that depend on other targets.
Parameters must be given as a data frame with one row per rendered report and one column per parameter.
An optional output_file column may be included to set the output file path of each rendered report.
The R Markdown source should mention other dependency targets with tar_load() and tar_read() in the active code chunks
(which also allows you to render the report outside the pipeline if the _targets/ data store already exists
and appropriate defaults are specified for the parameters).
(Do not use tar_load_raw() or tar_read_raw() for this.)
Then, tar_render_rep() defines a special kind of target. It
1. Finds all the tar_load()/tar_read() dependencies in the report and inserts them into the target's command. This enforces the proper dependency relationships. (Do not use tar_load_raw() or tar_read_raw() for this.)
2. Sets format = "file" (see tar_target()) so targets watches the files at the returned paths and reruns the report if those files change.
3. Configures the target's command to return the output report files: the rendered document, the source file, and then the *_files/ directory if it exists. All these file paths are relative paths so the project stays portable.
4. Forces the report to run in the user's current working directory instead of the working directory of the report.
5. Sets convenient default options such as deployment = "main" in the target and quiet = TRUE in rmarkdown::render().
A list of target objects to render the R Markdown reports. Changes to the parameters, source file, dependencies, etc. will cause the appropriate targets to rerun during tar_make().
See the "Target objects" section for background.
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects.
Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/.
Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
In ordinary pipelines, each target has its own unique deterministic pseudo-random number generator seed derived from its target name.
In batched replication, however, each batch is a target with multiple replicates within that batch.
That is why tar_rep() and friends give each replicate its own unique seed.
Each replicate-specific seed is created based on the dynamic parent target name, tar_option_get("seed") (for targets version 0.13.5.9000 and above), batch index, and rep-within-batch index.
The seed is set just before the replicate runs.
Replicate-specific seeds are invariant to batching structure.
In other words, tar_rep(name = x, command = rnorm(1), batches = 100, reps = 1, ...) produces the same numerical output as tar_rep(name = x, command = rnorm(1), batches = 10, reps = 10, ...) (but with different batch names).
Other target factories with this seed scheme are tar_rep2()
,
tar_map_rep()
, tar_map2_count()
, tar_map2_size()
,
and tar_render_rep()
.
For the tar_map2_*()
functions,
it is possible to manually supply your own seeds
through the command1
argument and then invoke them in your
custom code for command2
(set.seed()
, withr::with_seed
,
or withr::local_seed()
). For tar_render_rep()
,
custom seeds can be supplied to the params
argument
and then invoked in the individual R Markdown reports.
Likewise with tar_quarto_rep()
and the execute_params
argument.
Literate programming files are messy and variable, so functions like tar_render() have limitations:
* Child documents are not tracked for changes.
* Upstream target dependencies are not detected if tar_read() and/or tar_load() are called from a user-defined function. In addition, single target names must be mentioned and they must be symbols. tar_load("x") and tar_load(contains("x")) may not detect target x.
* Special/optional input/output files may not be detected in all cases.
* tar_render() and friends are for local files only. They do not integrate with the cloud storage capabilities of targets.
Other Literate programming targets: tar_knit(), tar_quarto(), tar_quarto_rep(), tar_render()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
# Parameterized R Markdown:
lines <- c(
  "---",
  "title: 'report.Rmd file'",
  "output_format: html_document",
  "params:",
  "  par: \"default value\"",
  "---",
  "Assume these lines are in a file called report.Rmd.",
  "```{r}",
  "print(params$par)",
  "```"
)
# The following pipeline will run the report for each row of params.
targets::tar_script({
  library(tarchetypes)
  list(
    tar_render_rep(
      name = report,
      "report.Rmd",
      params = tibble::tibble(par = c(1, 2))
    ),
    tar_render_rep_raw(
      name = "report2",
      "report.Rmd",
      params = quote(tibble::tibble(par = c(1, 2)))
    )
  )
}, ask = FALSE)
# Then, run the targets pipeline as usual.
})
}
Batching is important for optimizing the efficiency
of heavily dynamically-branched workflows:
https://books.ropensci.org/targets/dynamic.html#batching.
tar_rep() replicates a command in strategically sized batches.
tar_rep() expects unevaluated name and command arguments (e.g. tar_rep(name = sim, command = simulate())), whereas tar_rep_raw() expects an evaluated string for name and an evaluated expression object for command (e.g. tar_rep_raw(name = "sim", command = quote(simulate()))).
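The difference between the unevaluated and "raw" interfaces can be illustrated with a small base R sketch. The functions my_factory() and my_factory_raw() are hypothetical stand-ins written for this illustration; they are not part of tarchetypes and only mimic how a name symbol and a command expression might be captured and handed to a raw counterpart.

```r
# Hypothetical "raw" interface: expects a string and an expression object.
my_factory_raw <- function(name, command) {
  stopifnot(is.character(name), length(name) == 1L, is.language(command))
  list(name = name, command = command)
}

# Hypothetical non-raw interface: captures its arguments unevaluated
# with substitute() and forwards them to the raw version.
my_factory <- function(name, command) {
  my_factory_raw(
    name = deparse(substitute(name)),
    command = substitute(command)
  )
}

# simulate() is never called here; it is only captured as code.
a <- my_factory(sim, simulate())
b <- my_factory_raw("sim", quote(simulate()))
identical(a, b)
```

The raw form tends to be more convenient when names and commands are generated programmatically, because strings and expression objects can be built and passed around like ordinary values.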
tar_rep(
  name,
  command,
  batches = 1,
  reps = 1,
  rep_workers = 1,
  tidy_eval = targets::tar_option_get("tidy_eval"),
  packages = targets::tar_option_get("packages"),
  library = targets::tar_option_get("library"),
  format = targets::tar_option_get("format"),
  repository = targets::tar_option_get("repository"),
  iteration = targets::tar_option_get("iteration"),
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = targets::tar_option_get("deployment"),
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  storage = targets::tar_option_get("storage"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)

tar_rep_raw(
  name,
  command,
  batches = 1,
  reps = 1,
  rep_workers = 1,
  tidy_eval = targets::tar_option_get("tidy_eval"),
  packages = targets::tar_option_get("packages"),
  library = targets::tar_option_get("library"),
  format = targets::tar_option_get("format"),
  repository = targets::tar_option_get("repository"),
  iteration = targets::tar_option_get("iteration"),
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = targets::tar_option_get("deployment"),
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  storage = targets::tar_option_get("storage"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)
name |
Name of the target.
|
command |
R code to run multiple times. Must return a list or
data frame because
|
batches |
Number of batches. This is also the number of dynamic
branches created during |
reps |
Number of replications in each batch. The total number
of replications is |
rep_workers |
Positive integer of length 1, number of local R processes to use to run reps within batches in parallel. If 1, then reps are run sequentially within each batch. If greater than 1, then reps within batch are run in parallel using a PSOCK cluster. |
tidy_eval |
Whether to invoke tidy evaluation
(e.g. the |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
iteration |
Character of length 1, name of the iteration mode of the target. Choices:
|
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
tar_rep() and tar_rep_raw() each create two targets: an upstream local stem with an integer vector of batch ids, and a downstream pattern that maps over the batch ids. (Thus, each batch is a branch.)
Each batch/branch replicates the command a certain number of times.
If the command returns a list or data frame, then the targets from tar_rep() will try to append new elements/columns tar_batch, tar_rep, and tar_seed to the output to denote the batch, rep-within-batch index, and rep-specific seed, respectively.
Both batches and reps within each batch are aggregated according to the method you specify in the iteration argument. If "list", reps and batches are aggregated with list(). If "vector", then vctrs::vec_c(). If "group", then vctrs::vec_rbind().
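To make the three aggregation methods concrete, the sketch below applies list(), vctrs::vec_c(), and vctrs::vec_rbind() to two hand-made replicate outputs. The data frames are invented for illustration and stand in for what a command might return; the snippet assumes the vctrs package is installed and does not run an actual pipeline.

```r
# Two stand-in replicate outputs (invented values, not pipeline results).
rep1 <- data.frame(value = 0.5, tar_rep = 1L)
rep2 <- data.frame(value = 1.5, tar_rep = 2L)

# iteration = "list": replicates are kept as separate list elements.
as_list <- list(rep1, rep2)

# iteration = "vector": replicates are combined with vctrs::vec_c().
as_vector <- vctrs::vec_c(rep1, rep2)

# iteration = "group": replicates are row-bound with vctrs::vec_rbind().
as_group <- vctrs::vec_rbind(rep1, rep2)

length(as_list)
nrow(as_vector)
nrow(as_group)
```

For data frames, both vec_c() and vec_rbind() stack rows, so the visible difference between "vector" and "group" shows up mainly in how downstream targets later slice the result.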
A list of two targets, one upstream and one downstream.
The upstream target returns a numeric index of batch ids, and the downstream one dynamically maps over the batch ids to run the command multiple times.
If the command returns a list or data frame, then the targets from tar_rep() will try to append new elements/columns tar_batch, tar_rep, and tar_seed to the output to denote the batch, rep-within-batch ID, and random number generator seed, respectively.
tar_read(your_target) (on the downstream target with the actual work) will return a list of lists, where the outer list has one element per batch and each inner list has one element per rep within batch. To un-batch this nested list, call unlist(tar_read(your_target), recursive = FALSE).
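As an illustration of un-batching, the sketch below flattens a nested list by one level with base R's unlist(recursive = FALSE). The object batched is a hand-made stand-in with the shape described above (2 batches of 3 reps each); it is an assumption for this example and not real output from tar_read().

```r
# Stand-in for a batched result: outer list = batches, inner lists = reps.
batched <- list(
  list(list(value = 1), list(value = 2), list(value = 3)),
  list(list(value = 4), list(value = 5), list(value = 6))
)

# Remove only the outer (batch) level so each element is one replicate.
flat <- unlist(batched, recursive = FALSE)

length(flat)     # one element per replicate across all batches
flat[[4]]$value  # first replicate of the second batch
```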
In ordinary pipelines, each target has its own unique deterministic
pseudo-random number generator seed derived from its target name.
In batched replication, however, each batch is a target with multiple
replicates within that batch. That is why tar_rep() and friends give each replicate its own unique seed.
Each replicate-specific seed is created based on the dynamic parent target name, tar_option_get("seed") (for targets version 0.13.5.9000 and above), batch index, and rep-within-batch index.
The seed is set just before the replicate runs.
Replicate-specific seeds are invariant to batching structure.
In other words,
tar_rep(name = x, command = rnorm(1), batches = 100, reps = 1, ...)
produces the same numerical output as
tar_rep(name = x, command = rnorm(1), batches = 10, reps = 10, ...)
(but with different batch names).
Other target factories with this seed scheme are tar_rep2(), tar_map_rep(), tar_map2_count(), tar_map2_size(), and tar_render_rep().
For the tar_map2_*() functions, it is possible to manually supply your own seeds through the command1 argument and then invoke them in your custom code for command2 (set.seed(), withr::with_seed(), or withr::local_seed()).
For tar_render_rep(), custom seeds can be supplied to the params argument and then invoked in the individual R Markdown reports.
Likewise with tar_quarto_rep() and the execute_params argument.
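The invariance to batching structure can be demonstrated with a toy model in base R. The seed rule toy_seed() below is an assumption made up for this sketch (it is not the hashing scheme the package uses); it only shows that when a seed depends on the target name and the overall position of the replicate, regrouping the same replicates into different batches leaves the results unchanged.

```r
# Toy seed rule (hypothetical): combine the name with the overall
# replicate position computed from the batch and rep-within-batch indices.
toy_seed <- function(name, batch, rep_index, reps) {
  overall <- (batch - 1) * reps + rep_index
  sum(utf8ToInt(name)) * 1000 + overall
}

# Run one rnorm(1) draw per replicate, setting the seed just before each.
run_reps <- function(name, batches, reps) {
  out <- numeric(0)
  for (batch in seq_len(batches)) {
    for (rep_index in seq_len(reps)) {
      set.seed(toy_seed(name, batch, rep_index, reps))
      out <- c(out, rnorm(1))
    }
  }
  out
}

many_small_batches <- run_reps("x", batches = 100, reps = 1)
few_large_batches <- run_reps("x", batches = 10, reps = 10)
identical(many_small_batches, few_large_batches)
```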
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other branching: tar_map2(), tar_map2_count(), tar_map2_size(), tar_map_rep(), tar_rep2(), tar_rep_map(), tar_rep_map_raw()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  list(
    tarchetypes::tar_rep(
      x,
      data.frame(x = sample.int(1e4, 2)),
      batches = 2,
      reps = 3
    )
  )
})
targets::tar_make()
targets::tar_read(x)
targets::tar_script({
  list(
    tarchetypes::tar_rep_raw(
      "x",
      quote(data.frame(x = sample.int(1e4, 2))),
      batches = 2,
      reps = 3
    )
  )
})
targets::tar_make()
targets::tar_read(x)
})
}
tar_rep()
Batching is important for optimizing the efficiency
of heavily dynamically-branched workflows:
https://books.ropensci.org/targets/dynamic.html#batching.
tar_rep2() uses dynamic branching to iterate over the batches and reps of existing upstream targets.
tar_rep2() expects unevaluated language for the name, command, and ... arguments (e.g. tar_rep2(name = sim, command = simulate(), data1, data2)), whereas tar_rep2_raw() expects an evaluated string for name, an evaluated expression object for command, and a character vector for targets (e.g. tar_rep2_raw("sim", quote(simulate(x, y)), targets = c("x", "y"))).
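The idea of iterating over the reps of upstream batched targets can be pictured with plain data frames. The sketch below is a hypothetical base R illustration, not the package's implementation: data1 and data2 are hand-made stand-ins for one batch of two upstream targets, each carrying tar_batch and tar_rep columns, and the command is applied to the matching slices rep by rep.

```r
# Hand-made stand-ins for one batch of two upstream batched targets.
data1 <- data.frame(value = c(1, 2, 3), tar_batch = 1L, tar_rep = 1:3)
data2 <- data.frame(value = c(10, 20, 30), tar_batch = 1L, tar_rep = 1:3)

# For each rep in the batch, slice both upstream objects to that rep
# and run the "command" (here, adding the two values).
rep_ids <- sort(unique(data1$tar_rep))
per_rep <- lapply(rep_ids, function(r) {
  d1 <- data1[data1$tar_rep == r, ]
  d2 <- data2[data2$tar_rep == r, ]
  data.frame(
    value = d1$value + d2$value,
    tar_batch = d1$tar_batch,
    tar_rep = r
  )
})

# Combine the per-rep results back into one object for the batch.
aggregate_batch <- do.call(rbind, per_rep)
aggregate_batch$value
```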
tar_rep2(
  name,
  command,
  ...,
  rep_workers = 1,
  tidy_eval = targets::tar_option_get("tidy_eval"),
  packages = targets::tar_option_get("packages"),
  library = targets::tar_option_get("library"),
  format = targets::tar_option_get("format"),
  repository = targets::tar_option_get("repository"),
  iteration = targets::tar_option_get("iteration"),
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = targets::tar_option_get("deployment"),
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  storage = targets::tar_option_get("storage"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)

tar_rep2_raw(
  name,
  command,
  targets,
  rep_workers = 1,
  tidy_eval = targets::tar_option_get("tidy_eval"),
  packages = targets::tar_option_get("packages"),
  library = targets::tar_option_get("library"),
  format = targets::tar_option_get("format"),
  repository = targets::tar_option_get("repository"),
  iteration = targets::tar_option_get("iteration"),
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = targets::tar_option_get("deployment"),
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  storage = targets::tar_option_get("storage"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)
name |
Name of the target.
|
command |
R code to run multiple times. Must return a list or
data frame because
|
... |
Symbols to name one or more upstream batched targets
created by |
rep_workers |
Positive integer of length 1, number of local R processes to use to run reps within batches in parallel. If 1, then reps are run sequentially within each batch. If greater than 1, then reps within batch are run in parallel using a PSOCK cluster. |
tidy_eval |
Logical, whether to enable tidy evaluation
when interpreting |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
iteration |
Character of length 1, name of the iteration mode of the target. Choices:
|
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
targets |
Character vector of names of upstream batched targets
created by |
A new target object to perform batched computation. See the "Target objects" section for background.
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
In ordinary pipelines, each target has its own unique deterministic
pseudo-random number generator seed derived from its target name.
In batched replication, however, each batch is a target with multiple
replicates within that batch. That is why tar_rep() and friends give each replicate its own unique seed.
Each replicate-specific seed is created based on the dynamic parent target name, tar_option_get("seed") (for targets version 0.13.5.9000 and above), batch index, and rep-within-batch index.
The seed is set just before the replicate runs.
Replicate-specific seeds are invariant to batching structure.
In other words,
tar_rep(name = x, command = rnorm(1), batches = 100, reps = 1, ...)
produces the same numerical output as
tar_rep(name = x, command = rnorm(1), batches = 10, reps = 10, ...)
(but with different batch names).
Other target factories with this seed scheme are tar_rep2(), tar_map_rep(), tar_map2_count(), tar_map2_size(), and tar_render_rep().
For the tar_map2_*() functions, it is possible to manually supply your own seeds through the command1 argument and then invoke them in your custom code for command2 (set.seed(), withr::with_seed(), or withr::local_seed()).
For tar_render_rep(), custom seeds can be supplied to the params argument and then invoked in the individual R Markdown reports.
Likewise with tar_quarto_rep() and the execute_params argument.
Other branching: tar_map2(), tar_map2_count(), tar_map2_size(), tar_map_rep(), tar_rep(), tar_rep_map(), tar_rep_map_raw()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  library(tarchetypes)
  list(
    tar_rep(
      data1,
      data.frame(value = rnorm(1)),
      batches = 2,
      reps = 3
    ),
    tar_rep(
      data2,
      list(value = rnorm(1)),
      batches = 2, reps = 3,
      iteration = "list" # List iteration is important for batched lists.
    ),
    tar_rep2(
      aggregate,
      data.frame(value = data1$value + data2$value),
      data1,
      data2
    ),
    tar_rep2_raw(
      "aggregate2",
      quote(data.frame(value = data1$value + data2$value)),
      targets = c("data1", "data2")
    )
  )
})
targets::tar_make()
targets::tar_read(aggregate)
})
}
Select the names of targets from a target list.
tar_select_names(targets, ...)
targets |
A list of target objects as described in the "Target objects" section. It does not matter how nested the list is as long as the only leaf nodes are targets. |
... |
One or more comma-separated |
A character vector of target names.
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other target selection:
tar_select_targets()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets <- list(
  list(
    targets::tar_target(x, 1),
    targets::tar_target(y1, 2)
  ),
  targets::tar_target(y2, 3),
  targets::tar_target(z, 4)
)
tar_select_names(targets, starts_with("y"), contains("z"))
})
}
Select target objects from a target list.
tar_select_targets(targets, ...)
targets |
A list of target objects as described in the "Target objects" section. It does not matter how nested the list is as long as the only leaf nodes are targets. |
... |
One or more comma-separated |
A list of target objects. See the "Target objects" section of this help file.
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other target selection:
tar_select_names()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets <- list(
  list(
    targets::tar_target(x, 1),
    targets::tar_target(y1, 2)
  ),
  targets::tar_target(y2, 3),
  targets::tar_target(z, 4)
)
tar_select_targets(targets, starts_with("y"), contains("z"))
})
}
Create a target that cancels itself if a user-defined decision rule is met.
tar_skip(
  name,
  command,
  skip,
  tidy_eval = targets::tar_option_get("tidy_eval"),
  packages = targets::tar_option_get("packages"),
  library = targets::tar_option_get("library"),
  format = targets::tar_option_get("format"),
  repository = targets::tar_option_get("repository"),
  iteration = targets::tar_option_get("iteration"),
  error = targets::tar_option_get("error"),
  memory = targets::tar_option_get("memory"),
  garbage_collection = targets::tar_option_get("garbage_collection"),
  deployment = targets::tar_option_get("deployment"),
  priority = targets::tar_option_get("priority"),
  resources = targets::tar_option_get("resources"),
  storage = targets::tar_option_get("storage"),
  retrieval = targets::tar_option_get("retrieval"),
  cue = targets::tar_option_get("cue"),
  description = targets::tar_option_get("description")
)
name |
Symbol, name of the target.
A target name must be a valid name for a symbol in R, and it
must not start with a dot. Subsequent targets
can refer to this name symbolically to induce a dependency relationship:
e.g. |
command |
R code to run the target.
In |
skip |
R code for the skipping condition. If it evaluates to |
tidy_eval |
Whether to invoke tidy evaluation
(e.g. the |
packages |
Character vector of packages to load right before
the target runs or the output data is reloaded for
downstream targets. Use |
library |
Character vector of library paths to try
when loading |
format |
Optional storage format for the target's return value.
With the exception of |
repository |
Character of length 1, remote repository for target storage. Choices:
Note: if |
iteration |
Character of length 1, name of the iteration mode of the target. Choices:
|
error |
Character of length 1, what to do if the target stops and throws an error. Options:
|
memory |
Character of length 1, memory strategy. Possible values:
For cloud-based dynamic files
(e.g. |
garbage_collection |
Logical: |
deployment |
Character of length 1. If |
priority |
Numeric of length 1 between 0 and 1. Controls which
targets get deployed first when multiple competing targets are ready
simultaneously. Targets with priorities closer to 1 get dispatched earlier
(and polled earlier in |
resources |
Object returned by |
storage |
Character string to control when the output of the target
is saved to storage. Only relevant when using
|
retrieval |
Character string to control when the current target
loads its dependencies into memory before running.
(Here, a "dependency" is another target upstream that the current one
depends on.) Only relevant when using
|
cue |
An optional object from |
description |
Character of length 1, a custom free-form human-readable
text description of the target. Descriptions appear as target labels
in functions like |
tar_skip() creates a target that cancels itself whenever a custom condition is met. The mechanism of cancellation is targets::tar_cancel(your_condition), which allows skipping to happen even if the target does not exist yet. This behavior differs from tar_cue(mode = "never"), which still runs if the target does not exist.
A target object with targets::tar_cancel(your_condition) inserted into the command.
See the "Target objects" section for background.
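The phrase "inserted into the command" can be pictured with a small base R sketch. The helper insert_cancel() is hypothetical and written only for this illustration; it builds an expression in which a targets::tar_cancel() call precedes the user's command, which is one plausible way to read the description above. Nothing is evaluated, so the targets package does not need to be loaded to run it.

```r
# Hypothetical helper: prepend a tar_cancel() call to a command expression.
insert_cancel <- function(command, skip) {
  bquote({
    targets::tar_cancel(.(skip))
    .(command)
  })
}

# Same inputs as tar_skip(x, command = "value", skip = 1 > 0),
# but supplied as already-quoted expressions.
combined <- insert_cancel(command = quote("value"), skip = quote(1 > 0))
print(combined)
```

Because the cancellation check sits at the top of the command, it would run every time the target is attempted, including the first time, which matches the contrast with tar_cue(mode = "never") drawn above.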
Most tarchetypes
functions are target factories,
which means they return target objects
or lists of target objects.
Target objects represent skippable steps of the analysis pipeline
as described at https://books.ropensci.org/targets/.
Please read the walkthrough at
https://books.ropensci.org/targets/walkthrough.html
to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one which generate targets) and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other targets with custom invalidation rules: tar_change(), tar_download(), tar_force()
if (identical(Sys.getenv("TAR_LONG_EXAMPLES"), "true")) {
targets::tar_dir({ # tar_dir() runs code from a temporary directory.
targets::tar_script({
  list(
    tarchetypes::tar_skip(x, command = "value", skip = 1 > 0)
  )
})
targets::tar_make()
})
}
Loop over a grid of values and create an expression object from each one. Helps with general metaprogramming.
tar_sub() expects an unevaluated expression for the expr object, whereas tar_sub_raw() expects an evaluated expression object.
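The substitution idea can be sketched in base R alone. The helpers sub_sketch() and sub_sketch_nse() below are hypothetical names introduced only for illustration. They are not the tarchetypes implementation, and they assume that values is a named list whose elements all have the same length.

```r
# Base R sketch of the substitution idea; not the tarchetypes implementation.
# sub_sketch() takes an already-quoted expression, like tar_sub_raw().
sub_sketch <- function(expr, values) {
  n <- length(values[[1]])
  lapply(seq_len(n), function(i) {
    # Pick the i-th value for every symbol name in `values`.
    row <- lapply(values, `[[`, i)
    # Replace each matching symbol in `expr` with its value.
    do.call(substitute, list(expr, row))
  })
}

# sub_sketch_nse() captures its argument unevaluated, like tar_sub().
sub_sketch_nse <- function(expr, values) {
  sub_sketch(substitute(expr), values)
}

values <- list(
  name = lapply(c("name1", "name2"), as.symbol),
  file = list("file1.Rmd", "file2.Rmd")
)

out <- sub_sketch(quote(tar_render(name, file)), values)
print(out)
```

Here sub_sketch_nse(tar_render(name, file), values) produces the same list as the quoted call above. The only difference between the two helpers is who is responsible for quoting the expression, which mirrors the distinction between tar_sub() and tar_sub_raw().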
tar_sub(expr, values)
tar_sub_raw(expr, values)
expr |
Starting expression. Values are iteratively substituted
in place of symbols in
|
values |
List of values to substitute into |
A list of expression objects. Often, these expression objects evaluate to target objects (but not necessarily). See the "Target objects" section for background.
Most tarchetypes functions are target factories, which means they return target objects or lists of target objects. Target objects represent skippable steps of the analysis pipeline as described at https://books.ropensci.org/targets/. Please read the walkthrough at https://books.ropensci.org/targets/walkthrough.html to understand the role of target objects in analysis pipelines.
For developers, https://wlandau.github.io/targetopia/contributing.html#target-factories explains target factories (functions like this one that generate targets), and the design specification at https://books.ropensci.org/targets-design/ details the structure and composition of target objects.
Other Metaprogramming utilities: tar_eval()
# tar_map() is incompatible with tar_render() because the latter
# operates on preexisting tar_target() objects. By contrast,
# tar_eval() and tar_sub() iterate over code farther upstream.
values <- list(
  name = lapply(c("name1", "name2"), as.symbol),
  file = list("file1.Rmd", "file2.Rmd")
)
tar_sub(tar_render(name, file), values = values)
tar_sub_raw(quote(tar_render(name, file)), values = values)