Package: textreuse 0.1.5
textreuse: Detect Text Reuse and Document Similarity
Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
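To make the description above concrete, here is a minimal base-R sketch of two of the ideas it names: shingled word n-grams and Jaccard similarity. It is an illustration only, not the package's implementation. The helper names `shingle_ngrams()` and `jaccard()` are hypothetical, and the package's own exported counterparts (`tokenize_ngrams`, `jaccard_similarity`) may differ in tokenization rules and arguments.

```r
# Illustrative base-R sketch; NOT the textreuse implementation.
# shingle_ngrams() and jaccard() are hypothetical helper names.

# Split text into lowercase words and return overlapping word n-grams.
shingle_ngrams <- function(text, n = 3) {
  # Split on runs of non-alphanumeric characters
  words <- strsplit(tolower(text), "[^[:alnum:]]+")[[1]]
  words <- words[nzchar(words)]          # drop empty pieces
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1)
  vapply(
    starts,
    function(i) paste(words[i:(i + n - 1)], collapse = " "),
    character(1)
  )
}

# Jaccard similarity: size of the intersection over size of the union.
# Returns NaN when both inputs are empty.
jaccard <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  length(intersect(a, b)) / length(union(a, b))
}

a <- shingle_ngrams("The quick brown fox jumps over the lazy dog")
b <- shingle_ngrams("The quick brown fox leaps over the lazy dog")
score <- jaccard(a, b)
print(score)
```

Changing one word in a nine-word sentence alters three of its seven trigrams, so the two documents share 4 of 10 distinct shingles. Longer shingles make the measure more sensitive to small edits.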
textreuse_0.1.5.tar.gz
textreuse_0.1.5.zip (r-4.6), textreuse_0.1.5.zip (r-4.5), textreuse_0.1.5.zip (r-4.4)
textreuse_0.1.5.tgz (r-4.5-x86_64), textreuse_0.1.5.tgz (r-4.5-arm64), textreuse_0.1.5.tgz (r-4.4-x86_64), textreuse_0.1.5.tgz (r-4.4-arm64)
textreuse_0.1.5.tar.gz (r-4.6-arm64), textreuse_0.1.5.tar.gz (r-4.6-x86_64), textreuse_0.1.5.tar.gz (r-4.5-arm64), textreuse_0.1.5.tar.gz (r-4.5-x86_64)
textreuse_0.1.5.tgz (r-4.5-emscripten), textreuse_0.1.5.tgz (r-4.4-emscripten)
textreuse.pdf | textreuse.html
textreuse/json (API)
NEWS
# Install 'textreuse' in R:
install.packages('textreuse', repos = c('https://packages.ropensci.org', 'https://cloud.r-project.org'))
Reviews: rOpenSci Software Review #20
Bug tracker: https://github.com/ropensci/textreuse/issues
Pkgdown site: https://docs.ropensci.org
Last updated 4 months ago from: 895b5ff299 (on master). Checks: 11 NOTE, 4 OK. Indexed: yes.
Target | Result | Total time |
---|---|---|
linux-devel-arm64 | NOTE | 186 |
linux-devel-x86_64 | NOTE | 153 |
pkgdown docs | OK | 191 |
source / vignettes | OK | 202 |
linux-release-arm64 | NOTE | 172 |
linux-release-x86_64 | NOTE | 155 |
macos-release-arm64 | NOTE | 84 |
macos-release-x86_64 | NOTE | 166 |
macos-oldrel-arm64 | NOTE | 93 |
macos-oldrel-x86_64 | NOTE | 195 |
windows-devel | NOTE | 137 |
windows-release | NOTE | 220 |
windows-oldrel | NOTE | 132 |
wasm-release | OK | 119 |
wasm-oldrel | OK | 145 |
Exports: align_local, content, content<-, filenames, has_content, has_hashes, has_minhashes, has_tokens, hash_string, hashes, hashes<-, is.TextReuseCorpus, is.TextReuseTextDocument, jaccard_bag_similarity, jaccard_dissimilarity, jaccard_similarity, lsh, lsh_candidates, lsh_compare, lsh_probability, lsh_query, lsh_subset, lsh_threshold, meta, meta<-, minhash_generator, minhashes, minhashes<-, pairwise_candidates, pairwise_compare, ratio_of_matches, rehash, skipped, TextReuseCorpus, TextReuseTextDocument, tokenize, tokenize_ngrams, tokenize_sentences, tokenize_skip_ngrams, tokenize_words, tokens, tokens<-, wordcount
Dependencies: assertthat, BH, cli, cpp11, digest, dplyr, generics, glue, lifecycle, magrittr, NLP, pillar, pkgconfig, purrr, R6, Rcpp, RcppProgress, rlang, stringi, stringr, tibble, tidyr, tidyselect, utf8, vctrs, withr
Introduction to the textreuse package
Rendered from textreuse-introduction.Rmd using knitr::rmarkdown on Jun 09 2025.
Last update: 2020-05-12
Started: 2015-10-22
Minhash and locality-sensitive hashing
Rendered from textreuse-minhash.Rmd using knitr::rmarkdown on Jun 09 2025.
Last update: 2015-10-31
Started: 2015-10-22
Pairwise comparisons for document similarity
Rendered from textreuse-pairwise.Rmd using knitr::rmarkdown on Jun 09 2025.
Last update: 2015-10-31
Started: 2015-10-22
Text Alignment
Rendered from textreuse-alignment.Rmd using knitr::rmarkdown on Jun 09 2025.
Last update: 2015-10-22
Started: 2015-10-22