Package: textreuse 1.0.1
textreuse: Detect Text Reuse and Document Similarity
Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
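As a rough illustration of two ideas named in the description, shingled n-grams and set-based similarity, here is a minimal sketch in plain Python. It is not the package's R API: the function names `word_shingles` and `jaccard` and the sample sentences are made up for this example, and the word-splitting rule is a simplifying assumption.

```python
import re

def word_shingles(text, n=3):
    """Return overlapping word n-grams (shingles) from lowercased text.

    Hypothetical helper for illustration; splitting on runs of letters,
    digits, and apostrophes is an assumption, not the package's tokenizer.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def jaccard(a, b):
    """Jaccard similarity: |A intersect B| / |A union B|, treating inputs as sets."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0  # convention chosen here for two empty inputs
    return len(set_a & set_b) / len(set_a | set_b)

doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox leaps over the lazy dog"

shingles_a = word_shingles(doc_a, n=3)
shingles_b = word_shingles(doc_b, n=3)

# 4 shared trigrams out of 10 distinct trigrams overall
score = jaccard(shingles_a, shingles_b)
print(round(score, 3))  # prints 0.4
```

Changing a single word disturbs every shingle that overlaps it, which is why the two nine-word sentences above share only four of their seven trigrams each.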
Authors:
textreuse_1.0.1.tar.gz
textreuse_1.0.1.zip (r-4.7), textreuse_1.0.1.zip (r-4.6), textreuse_1.0.1.zip (r-4.5)
textreuse_1.0.1.tgz (r-4.6-x86_64), textreuse_1.0.1.tgz (r-4.6-arm64), textreuse_1.0.1.tgz (r-4.5-x86_64), textreuse_1.0.1.tgz (r-4.5-arm64)
textreuse_1.0.1.tar.gz (r-4.7-arm64), textreuse_1.0.1.tar.gz (r-4.7-x86_64), textreuse_1.0.1.tar.gz (r-4.6-arm64), textreuse_1.0.1.tar.gz (r-4.6-x86_64)
textreuse_1.0.1.tgz (r-4.5-emscripten)
manual.pdf | manual.html
card.svg | card.png
textreuse/json (API)
NEWS
    # Install 'textreuse' in R:
    install.packages('textreuse', repos = c('https://packages.ropensci.org', 'https://cloud.r-project.org'))
Reviews: rOpenSci Software Review #20
Bug tracker: https://github.com/ropensci/textreuse/issues
Pkgdown/docs site: https://docs.ropensci.org
Last updated from: 6f8cbe3802 (on master). Checks: 14 OK. Indexed: yes.
| Target | Result | Time |
|---|---|---|
| linux-devel-arm64 | OK | 158 |
| linux-devel-x86_64 | OK | 159 |
| pkgdown docs | OK | 181 |
| source / vignettes | OK | 193 |
| linux-release-arm64 | OK | 159 |
| linux-release-x86_64 | OK | 162 |
| macos-release-arm64 | OK | 120 |
| macos-release-x86_64 | OK | 283 |
| macos-oldrel-arm64 | OK | 155 |
| macos-oldrel-x86_64 | OK | 239 |
| windows-devel | OK | 153 |
| windows-release | OK | 233 |
| windows-oldrel | OK | 151 |
| wasm-release | OK | 139 |
Exports: align_local, as_sparse_matrix, content, content<-, count_matches, filenames, has_content, has_hashes, has_minhashes, has_tokens, hash_string, hashes, hashes<-, is.TextReuseCorpus, is.TextReuseTextDocument, jaccard_bag_similarity, jaccard_dissimilarity, jaccard_similarity, lsh, lsh_add, lsh_candidates, lsh_compare, lsh_probability, lsh_query, lsh_subset, lsh_threshold, matching_tokens, meta, meta<-, minhash_generator, minhashes, minhashes<-, pairwise_candidates, pairwise_compare, ratio_of_matches, rehash, skipped, TextReuseCorpus, TextReuseTextDocument, token_index, token_index_candidates, tokenize, tokenize_ngrams, tokenize_sentences, tokenize_skip_ngrams, tokenize_words, tokens, tokens<-, wordcount
Dependencies: assertthat, BH, cli, cpp11, digest, dplyr, generics, glue, lattice, lifecycle, magrittr, Matrix, NLP, pillar, pkgconfig, purrr, R6, Rcpp, RcppProgress, rlang, stringi, stringr, tibble, tidyr, tidyselect, utf8, vctrs, withr
Introduction to the textreuse package
Rendered from textreuse-introduction.Rmd using knitr::rmarkdown on May 06 2026. Last update: 2026-05-05
Started: 2015-10-22
Minhash and locality-sensitive hashing
Rendered from textreuse-minhash.Rmd using knitr::rmarkdown on May 06 2026. Last update: 2026-05-05
Started: 2015-10-22
Pairwise comparisons for document similarity
Rendered from textreuse-pairwise.Rmd using knitr::rmarkdown on May 06 2026. Last update: 2026-05-05
Started: 2015-10-22
Text Alignment
Rendered from textreuse-alignment.Rmd using knitr::rmarkdown on May 06 2026. Last update: 2026-05-05
Started: 2015-10-22
