Title: | Find R Packages Matching Either Descriptions or Other R Packages |
---|---|
Description: | Find R packages matching either descriptions or other R packages. |
Authors: | Mark Padgham [aut, cre] |
Maintainer: | Mark Padgham <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.4.3.097 |
Built: | 2025-03-27 12:32:05 UTC |
Source: | https://github.com/ropensci-review-tools/pkgmatch |
This function generates a selection of test data for the "cran" corpus, to allow functions to be run offline, without having to download the large datasets otherwise required for the package to function.
Note that these data are randomly generated, so results will generally be meaningless. They are generated solely to demonstrate how the package functions, and are not intended to produce meaningful outputs.
generate_pkgmatch_example_data()
(Invisibly) The path to the temporary directory containing the package data.
Other utils: head.pkgmatch(), pkgmatch_browse(), pkgmatch_load_data(), pkgmatch_update_cache(), print.pkgmatch(), text_is_code()
generate_pkgmatch_example_data ()
input <- "curl" # Name of a single installed package
pkgmatch_similar_pkgs (input, corpus = "cran")
Return the URL of the specified ollama API. Default is "127.0.0.1:11434"
get_ollama_url()
The ollama API URL
set_ollama_url
Other ollama: ollama_check(), set_ollama_url()
Head method for 'pkgmatch' objects
## S3 method for class 'pkgmatch'
head(x, n = 5L, ...)
x |
Object for which head is to be printed |
n |
Number of rows of full |
... |
Not used |
A (usually) smaller version of x, with all columns displayed.
Other utils: generate_pkgmatch_example_data(), pkgmatch_browse(), pkgmatch_load_data(), pkgmatch_update_cache(), print.pkgmatch(), text_is_code()
## Not run:
input <- "Download open spatial data from NASA"
p <- pkgmatch_similar_pkgs (input)
p # Default print method, lists 5 best matching packages
head (p) # Shows first 5 rows of full `data.frame` object
## End(Not run)
Performs the following checks:
Check that ollama is installed
Check that ollama is running
Check that ollama has the required models, and download if not
The required models are the Jina AI embeddings: https://ollama.com/jina/jina-embeddings-v2-base-en for text embeddings, and https://ollama.com/ordis/jina-embeddings-v2-base-code for code embeddings.
Note that the URL of a locally-running ollama instance is presumed by default to be "127.0.0.1:11434". Other values can be set using the set_ollama_url function.
ollama_check(sudo = is_docker_sudo())
sudo |
Set to |
TRUE if everything works okay, otherwise the function will error before returning, and issue an informative error message.
Other ollama: get_ollama_url(), set_ollama_url()
## Not run:
chk <- ollama_check ()
## End(Not run)
BM25 values match single inputs to document corpora by weighting terms by their inverse frequencies, so that relatively rare words contribute more to match scores than common words. For each input, the BM25 value is the sum of relative frequencies of each term in the input multiplied by the Inverse Document Frequency (IDF) of that term in the entire corpus. See the Wikipedia page at https://en.wikipedia.org/wiki/Okapi_BM25 for further details.
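As a rough illustration of the weighting described above, the following base-R sketch scores a toy corpus with the standard Okapi BM25 formula. The corpus, the helper functions tokenise() and bm25_scores(), and the parameter values k1 = 1.2 and b = 0.75 are assumptions made for this example only. They are not taken from the pkgmatch source, which may tokenise and weight terms differently.

```r
# Crude tokeniser: lower-case, split on anything that is not a letter or digit.
tokenise <- function (txt) {
    tokens <- strsplit (tolower (txt), "[^a-z0-9]+") [[1]]
    tokens [nzchar (tokens)]
}

# Hypothetical toy corpus of three "package descriptions".
docs <- c (
    pkg_a = "download spatial data from remote servers",
    pkg_b = "parse and download web data",
    pkg_c = "fit statistical models to data"
)
doc_tokens <- lapply (docs, tokenise)

# Okapi BM25 score of `query` against each document.
# `k1` and `b` are the usual free parameters (assumed values).
bm25_scores <- function (query, doc_tokens, k1 = 1.2, b = 0.75) {
    q_terms <- unique (tokenise (query))
    n_docs <- length (doc_tokens)
    avg_len <- mean (lengths (doc_tokens))

    vapply (doc_tokens, function (d) {
        score <- 0
        for (term in q_terms) {
            # Number of documents containing the term:
            n_t <- sum (vapply (doc_tokens, function (x) term %in% x, logical (1)))
            # Inverse Document Frequency: rare terms get larger weights.
            idf <- log ((n_docs - n_t + 0.5) / (n_t + 0.5) + 1)
            # Term frequency within this document:
            f <- sum (d == term)
            score <- score + idf * f * (k1 + 1) /
                (f + k1 * (1 - b + b * length (d) / avg_len))
        }
        score
    }, numeric (1))
}

scores <- bm25_scores ("spatial data", doc_tokens)
sort (scores, decreasing = TRUE)
```

Here "data" occurs in every document and so contributes little, while "spatial" occurs only in pkg_a, which therefore receives the highest score.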
pkgmatch_bm25(input, txt = NULL, idfs = NULL, corpus = NULL)
input |
A single character string to match against the second parameter of all input documents. |
txt |
An optional list of input documents. If not specified, data will
be loaded as specified by the |
idfs |
Optional list of Inverse Document Frequency weightings generated
by the internal |
corpus |
If |
A data.frame of package names and 'BM25' measures against text from whole packages both with and without function descriptions.
Other bm25:
pkgmatch_bm25_fn_calls()
# The following function simulates remote data in temporary directory, to
# enable package usage without downloading. Do not run for normal usage.
generate_pkgmatch_example_data ()
input <- "curl" # Name of a single installed package
pkgmatch_bm25 (input, corpus = "cran")
# Or pre-load document-frequency weightings and pass those:
idfs <- pkgmatch_load_data ("idfs", corpus = "cran", fns = FALSE)
pkgmatch_bm25 (input, corpus = "cran", idfs = idfs)
See ?pkgmatch_bm25 for details of BM25 ranks. This function calculates "BM25" ranks from function-call frequencies between a local R package and all packages in specified corpus. Values are thus higher for packages with similar patterns of function calls, weighted by inverse frequencies, so functions called infrequently across the entire corpus contribute more than common functions.
Note that the results of this function are entirely different from pkgmatch_bm25 with corpus = "ropensci-fns". The latter returns BM25 values from text descriptions of all functions in all rOpenSci packages, whereas this function returns BM25 values based on frequencies of function calls within packages.
pkgmatch_bm25_fn_calls(path, corpus = NULL)
path |
Local path to source code of an R package. |
corpus |
One of "ropensci" or "cran" |
A data.frame of two columns:
"package" Naming the package from the specified corpus;
"bm25" The "BM25" index value for the nominated packages, where high values indicate greater overlap in term frequencies.
Other bm25:
pkgmatch_bm25()
## Not run:
u <- "https://cran.r-project.org/src/contrib/odbc_1.5.0.tar.gz"
path <- file.path (tempdir (), basename (u))
download.file (u, destfile = path)
bm25 <- pkgmatch_bm25_fn_calls (path)
## End(Not run)
Open web pages for pkgmatch results
pkgmatch_browse(p, n = NULL)
p |
A |
n |
Number of top-matching entries which should be opened. Defaults to the value passed to the main functions. |
(Invisibly) A named vector of integers, with 0 for all pages able to be successfully opened, and 1 otherwise.
Other utils: generate_pkgmatch_example_data(), head.pkgmatch(), pkgmatch_load_data(), pkgmatch_update_cache(), print.pkgmatch(), text_is_code()
## Not run:
input <- "genomics and transcriptomics sequence data"
p <- pkgmatch_similar_pkgs (input)
pkgmatch_browse (p) # Open main package pages on rOpenSci
p <- pkgmatch_similar_pkgs (input, corpus = "cran")
pkgmatch_browse (p) # Open main package pages on CRAN
p <- pkgmatch_similar_fns (input)
pkgmatch_browse (p) # Open pages for best-matching rOpenSci functions
## End(Not run)
This function accepts a vector of either names of installed packages, or paths to local source code directories, and calculates language model (LM) embeddings for both text descriptions within the package (documentation, including of functions), and for the entire code base. Embeddings may also be calculated separately for all function descriptions.
The embeddings are currently retrieved from a local 'ollama' server (https://ollama.com) running Jina AI embeddings (https://ollama.com/jina/jina-embeddings-v2-base-en for text, and https://ollama.com/ordis/jina-embeddings-v2-base-code for code).
pkgmatch_embeddings_from_pkgs(packages = NULL, functions_only = FALSE)
packages |
A vector of either names of installed packages, or local paths to directories containing R packages. |
functions_only |
If |
If !functions_only, a list of two matrices of embeddings: one for the text descriptions of the specified packages, including individual descriptions of all package functions, and one for the entire code base. For functions_only, a single matrix of embeddings for all function descriptions.
Other embeddings:
pkgmatch_embeddings_from_text()
packages <- "curl"
emb_fns <- pkgmatch_embeddings_from_pkgs (packages, functions_only = TRUE)
colnames (emb_fns) # All functions of the package
emb_pkg <- pkgmatch_embeddings_from_pkgs (packages, functions_only = FALSE)
names (emb_pkg)
colnames (emb_pkg$text_with_fns) # "curl"
This function accepts a vector of character strings, packages, or paths to local source code directories, and calculates language model (LM) embeddings for each string within the vector.
The embeddings are currently retrieved from a local 'ollama' server (https://ollama.com) running Jina AI text embeddings (https://ollama.com/jina/jina-embeddings-v2-base-en).
pkgmatch_embeddings_from_text(input = NULL)
input |
A vector of one or more text strings for which embeddings are to be extracted. |
A matrix of embeddings, one column for each input item, and a fixed number of rows defined by the embedding length of the language models.
Other embeddings:
pkgmatch_embeddings_from_pkgs()
## Not run:
input <- "Download open spatial data from NASA"
emb <- pkgmatch_embeddings_from_text (input = input)
## End(Not run)
Load pre-computed data for a specified corpus. Data types are:
"embeddings" for language model embeddings;
"idfs" for Inverse Document Frequency weightings;
"functions" for frequency tables for text descriptions of function calls; or
"calls" for frequency tables for actual function calls.
This function is called within the main pkgmatch_similar_pkgs and pkgmatch_similar_fns functions to load required data there, and should not generally need to be explicitly called.
pkgmatch_load_data(
  what = "embeddings",
  corpus = "ropensci",
  fns = FALSE,
  raw = FALSE
)
what |
One of the four data types described above: "embeddings", "idfs", "functions", or "calls". |
corpus |
Must be specified as one of "ropensci" or "cran". If
|
fns |
If |
raw |
Only has effect of |
The loaded data.
Other utils: generate_pkgmatch_example_data(), head.pkgmatch(), pkgmatch_browse(), pkgmatch_update_cache(), print.pkgmatch(), text_is_code()
## Not run:
embeddings <- pkgmatch_load_data ("embeddings")
embeddings_fns <- pkgmatch_load_data ("embeddings", fns = TRUE)
idfs <- pkgmatch_load_data ("idfs")
idfs_fns <- pkgmatch_load_data ("idfs", fns = TRUE)
## End(Not run)
Function matching is only available for functions from the corpus of rOpenSci packages. Function matching is also based on LM output only, and unlike package matching does not combine LM output with BM25 word-frequency matching.
pkgmatch_similar_fns(input, embeddings = NULL, n = 5L, browse = FALSE)
input |
A text string. |
embeddings |
Large Language Model embeddings for a suite of packages, generated from pkgmatch_embeddings_from_pkgs. If not provided, pre-generated embeddings will be downloaded and stored in a local cache directory. |
n |
When the result of this function is printed to screen, the top |
browse |
If |
A modified data.frame object of class "pkgmatch". The data.frame has 3 columns:
"function" with the name of the function in the form
"
"simil" with a similarity score between 0 and 1; and
"rank" as an integer index, with the highest rank of 1 as the first row.
The return object has a default print method which prints the names only of the first 5 best matching functions; see ?print.pkgmatch for details.
Other main:
pkgmatch_similar_pkgs()
## Not run:
input <- "Process raster satellite images"
p <- pkgmatch_similar_fns (input)
p # Default print method, lists 5 best matching functions
head (p) # Shows first 5 rows of full `data.frame` object
## End(Not run)
This function accepts as input either a text description, or a path to a local R package, and ranks all R packages within the specified corpus in terms of how well they match that input. The "corpus" argument can specify either rOpenSci's package suite, or CRAN.
Ranks are obtained from scores derived from:
Cosine similarities between Language Model (LM) embeddings for the input, and corresponding embeddings for the specified corpus.
"Best Match 25" (BM25) scores based on document token frequencies.
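The first of these two components can be sketched in a few lines of base R. The example assumes embeddings are stored with one column per item, as described for the embedding functions in this manual. The function cosine_similarity() and the three-dimensional toy vectors are hypothetical and serve only to show the calculation, not the package's internal code.

```r
# Cosine similarity between one input embedding `x` and every column of `mat`.
cosine_similarity <- function (x, mat) {
    num <- as.vector (crossprod (mat, x)) # dot product with each column
    denom <- sqrt (colSums (mat^2)) * sqrt (sum (x^2))
    stats::setNames (num / denom, colnames (mat))
}

# Hypothetical 3-dimensional "embeddings"; real embeddings are far longer.
corpus_emb <- cbind (
    pkg_a = c (1, 0, 0),
    pkg_b = c (1, 1, 0),
    pkg_c = c (0, 0, 1)
)
input_emb <- c (2, 0, 0)

simil <- cosine_similarity (input_emb, corpus_emb)
# Convert similarities to ranks, with rank 1 for the best match:
ranks <- rank (-simil, ties.method = "min")
```

Because cosine similarity ignores vector length, pkg_a matches the input exactly even though the input vector is twice as long.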
For text input, ranks are generally obtained for packages both including and excluding function descriptions as part of the package text, giving two sets of ranks for a given input. Where input is an entire R package, separate ranks are also calculated for package code and text, thus giving four distinct ranks. The function ultimately returns a single rank, derived by combining individual ranks using the Reciprocal Rank Fusion (RRF) algorithm. The additional parameter of lm_proportion determines the extent to which the final ranking weights the LM versus BM25 components.
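The general idea of Reciprocal Rank Fusion can be sketched as follows. Each ranking contributes 1 / (k + rank) to a package's fused score, and the fused scores are then re-ranked. The function rrf(), the toy ranks, the constant k = 60 (a value commonly used with RRF), and the treatment of lm_proportion as a simple linear weight are all assumptions made for illustration. The package's own implementation may differ in each of these respects.

```r
# Reciprocal Rank Fusion over the columns of a matrix of ranks.
# `weights` gives the relative contribution of each column.
rrf <- function (ranks, weights = rep (1, ncol (ranks)), k = 60) {
    scores <- as.vector ((1 / (k + ranks)) %*% weights)
    names (scores) <- rownames (ranks)
    rank (-scores, ties.method = "min") # rank 1 = best fused score
}

# Hypothetical ranks of three packages from the two components:
ranks <- matrix (
    c (1, 2, 3, # LM ranks
       3, 1, 2), # BM25 ranks
    ncol = 2,
    dimnames = list (c ("pkg_a", "pkg_b", "pkg_c"), c ("lm", "bm25"))
)

lm_proportion <- 0.5 # assumed here to weight LM ranks against BM25 ranks
fused <- rrf (ranks, weights = c (lm_proportion, 1 - lm_proportion))
fused
```

With equal weights, pkg_b comes first because it ranks well in both components. Setting the weights to c (1, 0) reproduces the LM ranks alone, and c (0, 1) the BM25 ranks alone.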
Finally, all components of this function are locally cached for each call (by the memoise package), so additional calls to this function with the same input and corpus should be much faster than initial calls. This means the effect of changing lm_proportion can easily be examined by simply repeating calls to this function.
pkgmatch_similar_pkgs(
  input,
  corpus = NULL,
  embeddings = NULL,
  idfs = NULL,
  input_is_code = text_is_code(input),
  lm_proportion = 0.5,
  n = 5L,
  browse = FALSE
)
input |
Either a text string, a path to local source code of an R package, or the name of any installed R package. |
corpus |
Must be specified as one of "ropensci" or "cran". If
|
embeddings |
Large Language Model embeddings for a suite of packages, generated from pkgmatch_embeddings_from_pkgs. If not provided, pre-generated embeddings will be downloaded and stored in a local cache directory. |
idfs |
Inverse Document Frequency tables for a suite of packages, generated from pkgmatch_bm25. If not provided, pre-generated IDF tables will be downloaded and stored in a local cache directory. |
input_is_code |
A binary flag indicating whether |
lm_proportion |
A value between 0 and 1 to control the relative
contributions of results from Language Models ("LMs") versus results from
traditional token-frequency models. Final rankings are generated by
combining these two kinds of results, so that |
n |
When the result of this function is printed to screen, the top |
browse |
If |
A data.frame with a "package" column naming packages, and one or more columns of package ranks in terms of text similarity and, if input is an R package, of similarity in code structure.
The returned object has a default print method which prints the best 5 matches directly to the screen, yet returns information on all packages within the specified corpus. This information is in the form of a data.frame, with one column for the package name, and one or more additional columns of integer ranks for each package. There is also a head method to print the first few entries of these full data (default n = 5). To see all data, use as.data.frame(). See the example below for how to manipulate these objects.
The first time this function is run without passing either embeddings or idfs, required values will be automatically downloaded and stored in a locally persistent cache directory. Especially for the "cran" corpus, this downloading may take quite some time.
input_is_code
Other main:
pkgmatch_similar_fns()
# The following function simulates remote data in temporary directory, to
# enable package usage without downloading. Do not run for normal usage.
generate_pkgmatch_example_data ()
input <- "curl" # Name of a single installed package
p <- pkgmatch_similar_pkgs (input, corpus = "cran")
p # Default print method, lists 5 best matching packages
head (p) # Shows first 5 rows of full `data.frame` object

# This second call modifies default combining of results equally from language
# model and token frequency (BM25) results. It will be much faster than first
# call, because previously generated embeddings are re-used.
p2 <- pkgmatch_similar_pkgs (input, corpus = "cran", lm_proportion = 0.25)

# Example demonstrating how to combine results using different values of
# `lm_proportion`. Input is a package, so result has columns for "text_rank"
# and "code_rank".
lm_props <- 0:10 / 10
res <- lapply (lm_props, function (p) {
    nm_text <- sprintf ("text_rank_p%02.0f", p * 10)
    nm_code <- sprintf ("code_rank_p%02.0f", p * 10)
    res <- pkgmatch_similar_pkgs (input, corpus = "cran", lm_proportion = p) |>
        dplyr::rename ({{nm_text}} := "text_rank", {{nm_code}} := "code_rank") |>
        dplyr::arrange (package)
    if (p > 0) {
        res <- dplyr::select (res, -package, -version)
    }
    return (res)
})
res <- do.call (cbind, res)
# That then has paired columns of (text rank, code rank) for each of the
# 11 values of `lm_props`.
head (res)
This function uses "treesitter" (https://github.com/tree-sitter/tree-sitter) to tag all function calls made within a local package, and to associate those calls with package namespaces.
This is used as input to the pkgmatch_bm25_fn_calls function, to enable function calls within a local package to be inversely weighted by frequencies within all packages within a corpus. The results of applying this function to the full corpora used in this package are contained within the data listed on https://github.com/ropensci-review-tools/pkgmatch/releases/tag/v0.4.0, as "fn-calls-ropensci.Rds" and "fn-calls-cran.Rds".
pkgmatch_treesitter_fn_tags(path)
path |
Path to local package, or |
A data.frame of all function calls made within the package, with the following columns:
"fn" Name of the package function within which call is made, including namespace identifiers of "::" for exported functions and ":::" for non-exported functions.
"name" Name of function being called, including namespace.
"start" Byte number within file corresponding to start of definition
"end" Byte number within file corresponding to end of definition
"file" Name of file in which fn call is defined.
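To give a feel for what tagging function calls involves, the following sketch extracts calls from a snippet of R code using base R's own parser via utils::getParseData(). This is a deliberately swapped-in technique and not tree-sitter, so it says nothing about how pkgmatch_treesitter_fn_tags() is implemented. The snippet, the object names, and the output columns are assumptions for illustration only.

```r
# Hypothetical snippet of package code containing two function calls.
code <- "
fetch <- function (url) {
    res <- curl::curl_fetch_memory (url)
    rawToChar (res$content)
}
"

# Parse with source references so that token-level data are available.
parsed <- parse (text = code, keep.source = TRUE)
pd <- utils::getParseData (parsed)

# Tokens identified by R's parser as the name of a called function:
calls <- pd [pd$token == "SYMBOL_FUNCTION_CALL", ]

# Where a call is written as `pkg::fn`, the package token shares the same
# parent expression, so it can be looked up from there.
namespace <- vapply (calls$parent, function (p) {
    pkg <- pd$text [pd$parent == p & pd$token == "SYMBOL_PACKAGE"]
    if (length (pkg) == 1L) pkg else NA_character_
}, character (1))

fn_calls <- data.frame (
    name = calls$text,
    namespace = namespace,
    line = calls$line1
)
fn_calls
```

Calls without an explicit namespace, such as rawToChar() here, are left as NA. Resolving those to their packages requires additional information which this sketch does not attempt to recover.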
# Get function calls made within locally-installed packages:
fn_tags <- pkgmatch_treesitter_fn_tags ("curl") # Name of installed package
fn_tags <- pkgmatch_treesitter_fn_tags ("cli") # Name of installed package

# Or get calls from full source code:
u <- "https://cran.r-project.org/src/contrib/odbc_1.5.0.tar.gz"
path <- file.path (tempdir (), basename (u))
## Not run:
download.file (u, destfile = path)
fn_tags <- pkgmatch_treesitter_fn_tags (path)
## End(Not run)
pkgmatch data to latest versions.
This function forces all locally-cached data to be updated with the latest version of remote data provided on the latest release of the GitHub repository at https://github.com/ropensci-review-tools/pkgmatch/releases.
Caching strategies are described in the "Data Caching and Updating" vignette, accessible either locally via vignette("data-caching-and-updating", package = "pkgmatch"), or online at https://docs.ropensci.org/pkgmatch/articles/C_data-caching-and-updating.html. In short, locally-cached data used by this package are updated by default every 30 days (with the vignette describing how to modify this default behaviour). This function forces all locally-cached data to be updated, regardless of update frequencies.
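The kind of age check that a 30-day update policy implies can be sketched in base R as below. The function cache_is_stale() and the use of file modification times are assumptions for illustration. The vignette named above is the authoritative description of how pkgmatch decides when to refresh its cache.

```r
# Hypothetical helper: is a cached file missing, or older than `max_age_days`?
cache_is_stale <- function (path, max_age_days = 30) {
    if (!file.exists (path)) {
        return (TRUE)
    }
    age <- difftime (Sys.time (), file.mtime (path), units = "days")
    as.numeric (age) > max_age_days
}

# A file written just now is not stale:
f <- tempfile (fileext = ".Rds")
saveRDS (list (), f)
fresh <- cache_is_stale (f)

# Pretend the same file was last modified 31 days ago:
Sys.setFileTime (f, Sys.time () - 31 * 24 * 60 * 60)
stale <- cache_is_stale (f)
```

A forced update of the kind performed by this function corresponds to skipping such a check entirely and always downloading.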
pkgmatch_update_cache()
Other utils: generate_pkgmatch_example_data(), head.pkgmatch(), pkgmatch_browse(), pkgmatch_load_data(), print.pkgmatch(), text_is_code()
## Not run:
pkgmatch_update_cache ()
## End(Not run)
This function is intended for internal rOpenSci use only. Usage by any unauthorized users will error and have no effect unless run with upload = FALSE, in which case updated data will be created in the sub-directory "pkgmatch-results" of R's current temporary directory. This updating may take a very long time!
Note that this function is categorically different from pkgmatch_update_cache. This function updates the internal data used by the pkgmatch package, and should only ever be run by package maintainers. The pkgmatch_update_cache function downloads the latest versions of these data to a local cache for use in this package.
pkgmatch_update_data(upload = TRUE)
upload |
If |
Local path to directory containing updated results.
## Not run:
pkgmatch_update_data (upload = FALSE)
## End(Not run)
The main pkgmatch functions, pkgmatch_similar_pkgs and pkgmatch_similar_fns, return data.frame objects of class "pkgmatch". This class exists primarily to enable this print method, which summarises by default the top 5 matching packages or functions. Objects can be converted to standard data.frames with as.data.frame().
## S3 method for class 'pkgmatch'
print(x, ...)
x |
Object to be printed |
... |
Additional parameters passed to default 'print' method. |
The result of printing x, in form of either a single character vector, or a named list of character vectors.
Other utils: generate_pkgmatch_example_data(), head.pkgmatch(), pkgmatch_browse(), pkgmatch_load_data(), pkgmatch_update_cache(), text_is_code()
## Not run:
input <- "Download open spatial data from NASA"
p <- pkgmatch_similar_pkgs (input)
p # Default print method, lists 5 best matching packages
head (p) # Shows first 5 rows of full `data.frame` object
## End(Not run)
Set the URL for local ollama API
set_ollama_url(ollama_url)
ollama_url |
The desired ollama API URL |
The ollama API URL
Other ollama: get_ollama_url(), ollama_check()
This function is used as part of the input of many functions, to determine whether the input is text or whether it is code. All such functions use it via an input parameter named input_is_code, which is set by default to the value returned from this function. That value can always be over-ridden by specifying a fixed value of either TRUE or FALSE for input_is_code.
Values from this function are only approximate, and there are even software packages which can give false negatives and be identified as prose (like rOpenSci's "geonames" package), and prose which may be wrongly identified as code.
text_is_code(txt)
txt |
Single input text string |
Logical value indicating whether or not txt was identified as code.
Other utils: generate_pkgmatch_example_data(), head.pkgmatch(), pkgmatch_browse(), pkgmatch_load_data(), pkgmatch_update_cache(), print.pkgmatch()
txt <- "Some text without any code"
text_is_code (txt)
txt <- "this_is_code <- function (x) { x }"
text_is_code (txt)