Package: textreuse 0.1.5
textreuse: Detect Text Reuse and Document Similarity
Tools for measuring similarity among documents and detecting passages which have been reused. Implements shingled n-gram, skip n-gram, and other tokenizers; similarity/dissimilarity functions; pairwise comparisons; minhash and locality sensitive hashing algorithms; and a version of the Smith-Waterman local alignment algorithm suitable for natural language.
Authors:
textreuse_0.1.5.tar.gz
textreuse_0.1.5.zip (r-4.6), textreuse_0.1.5.zip (r-4.5), textreuse_0.1.5.zip (r-4.4)
textreuse_0.1.5.tgz (r-4.5-x86_64), textreuse_0.1.5.tgz (r-4.5-arm64), textreuse_0.1.5.tgz (r-4.4-x86_64), textreuse_0.1.5.tgz (r-4.4-arm64)
textreuse_0.1.5.tar.gz (r-4.6-arm64), textreuse_0.1.5.tar.gz (r-4.6-x86_64), textreuse_0.1.5.tar.gz (r-4.5-arm64), textreuse_0.1.5.tar.gz (r-4.5-x86_64)
textreuse_0.1.5.tgz (r-4.4-emscripten)
textreuse.pdf | textreuse.html
textreuse/json (API)
NEWS
# Install 'textreuse' in R:
install.packages('textreuse', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
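Once the package is installed, the tokenizers and similarity functions named in the description can be tried on two short strings. The following is a minimal sketch, not taken from the package vignettes: the two example sentences are made up for illustration, and it assumes `tokenize_ngrams()` and `jaccard_similarity()` (both listed under Exports) accept a character string and two token vectors respectively.

```r
library(textreuse)

# Two made-up sentences that differ by a single word (illustrative only)
a <- "The quick brown fox jumps over the lazy dog"
b <- "The quick brown fox leaps over the lazy dog"

# Shingled word trigrams; tokenize_ngrams() lowercases by default
tokens_a <- tokenize_ngrams(a, n = 3)
tokens_b <- tokenize_ngrams(b, n = 3)

# Jaccard similarity: number of shared shingles divided by the
# number of distinct shingles across both documents
similarity <- jaccard_similarity(tokens_a, tokens_b)
print(similarity)
```

Changing one word alters every trigram that overlaps it, so even a small edit lowers the score noticeably; a smaller `n` makes the comparison more tolerant of such edits.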
Reviews: rOpenSci Software Review #20
Bug tracker: https://github.com/ropensci/textreuse/issues
Pkgdown site: https://docs.ropensci.org
Last updated 4 months ago from: 895b5ff299 (on master). Checks: 3 OK, 11 NOTE. Indexed: yes.
Target | Result | Total time |
---|---|---|
pkgdown docs | OK | 164 |
source / vignettes | OK | 233 |
linux-devel-x86_64 | NOTE | 147 |
linux-devel-arm64 | NOTE | 170 |
linux-release-x86_64 | NOTE | 150 |
linux-release-arm64 | NOTE | 169 |
macos-release-x86_64 | NOTE | 223 |
macos-release-arm64 | NOTE | 79 |
macos-oldrel-x86_64 | NOTE | 226 |
macos-oldrel-arm64 | NOTE | 130 |
windows-devel | NOTE | 202 |
windows-release | NOTE | 241 |
windows-oldrel | NOTE | 169 |
wasm-release | OK | 148 |
Exports: align_local, content, content<-, filenames, has_content, has_hashes, has_minhashes, has_tokens, hash_string, hashes, hashes<-, is.TextReuseCorpus, is.TextReuseTextDocument, jaccard_bag_similarity, jaccard_dissimilarity, jaccard_similarity, lsh, lsh_candidates, lsh_compare, lsh_probability, lsh_query, lsh_subset, lsh_threshold, meta, meta<-, minhash_generator, minhashes, minhashes<-, pairwise_candidates, pairwise_compare, ratio_of_matches, rehash, skipped, TextReuseCorpus, TextReuseTextDocument, tokenize, tokenize_ngrams, tokenize_sentences, tokenize_skip_ngrams, tokenize_words, tokens, tokens<-, wordcount
Dependencies: assertthat, BH, cli, cpp11, digest, dplyr, fansi, generics, glue, lifecycle, magrittr, NLP, pillar, pkgconfig, purrr, R6, Rcpp, RcppProgress, rlang, stringi, stringr, tibble, tidyr, tidyselect, utf8, vctrs, withr
Introduction to the textreuse package
Rendered from textreuse-introduction.Rmd using knitr::rmarkdown on May 10 2025.
Last update: 2020-05-12
Started: 2015-10-22
Minhash and locality-sensitive hashing
Rendered from textreuse-minhash.Rmd using knitr::rmarkdown on May 10 2025.
Last update: 2015-10-31
Started: 2015-10-22
Pairwise comparisons for document similarity
Rendered from textreuse-pairwise.Rmd using knitr::rmarkdown on May 10 2025.
Last update: 2015-10-31
Started: 2015-10-22
Text Alignment
Rendered from textreuse-alignment.Rmd using knitr::rmarkdown on May 10 2025.
Last update: 2015-10-22
Started: 2015-10-22