Package: rtika 2.7.0
rtika: R Interface to 'Apache Tika'
Extract text or metadata from over a thousand file types, using Apache Tika <https://tika.apache.org/>. Get either plain text or structured XHTML content.
rtika_2.7.0.tar.gz
rtika_2.7.0.zip (r-4.6), rtika_2.7.0.zip (r-4.5), rtika_2.7.0.zip (r-4.4)
rtika_2.7.0.tgz (r-4.5-any), rtika_2.7.0.tgz (r-4.4-any)
rtika_2.7.0.tar.gz (r-4.6-any), rtika_2.7.0.tar.gz (r-4.5-any)
rtika_2.7.0.tgz (r-4.5-emscripten), rtika_2.7.0.tgz (r-4.4-emscripten)
rtika.pdf | rtika.html
rtika/json (API)
NEWS
# Install 'rtika' in R:
install.packages('rtika', repos = c('https://packages.ropensci.org', 'https://cloud.r-project.org'))
Reviews: rOpenSci Software Review #191
Bug tracker: https://github.com/ropensci/rtika/issues
Pkgdown site: https://docs.ropensci.org
Topics: extract-metadata, extract-text, java, parse, pdf-files, peer-reviewed, tesseract, tika
Last updated 2 years ago from: 64f4be7c75 (on master). Checks: 11 OK. Indexed: yes.
Target | Result | Total time | Artifact |
---|---|---|---|
linux-devel-x86_64 | OK | 131 | |
pkgdown docs | OK | 305 | |
source / vignettes | OK | 157 | |
linux-release-x86_64 | OK | 160 | |
macos-release-arm64 | OK | 92 | |
macos-oldrel-arm64 | OK | 75 | |
windows-devel | OK | 86 | |
windows-release | OK | 86 | |
windows-oldrel | OK | 120 | |
wasm-release | OK | 121 | |
wasm-oldrel | OK | 135 | |
Exports: install_tika, java, tika, tika_check, tika_fetch, tika_html, tika_jar, tika_json, tika_json_text, tika_text, tika_xml
Readme and manuals
Help Manual
Help page | Topics |
---|---|
Install or Update the Apache Tika 'jar' | install_tika |
System Command to Run Java | java |
rtika: R Interface to 'Apache Tika' | rtika |
Main R Interface to 'Apache Tika' | tika |
Check Tika against a checksum | tika_check |
Fetch Files with the Content-Type Preserved in the File Extension | tika_fetch |
Get Structured XHTML | tika_html |
Path to Apache Tika | tika_jar |
Get json Metadata and XHTML Content | tika_json |
Get json Metadata and Plain Text Content | tika_json_text |
Get Plain Text | tika_text |
Get a Structured XHTML Rendition | tika_xml |
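
As a rough illustration of how the help topics above fit together, here is a minimal sketch that extracts plain text and JSON metadata from one file. It assumes Java is available and that the Tika jar has already been downloaded with `install_tika()`. The sample file and the `sample_path` and `tika_ready` names are hypothetical, and `jsonlite` is an assumed extra package used only to parse the JSON string.

```r
# Hypothetical sample document, created here so the example is self-contained.
sample_path <- tempfile(fileext = ".txt")
writeLines("Hello from rtika.", sample_path)

# Run only when rtika is installed and the Tika jar has been downloaded
# (via install_tika()); otherwise skip with a message.
tika_ready <- requireNamespace("rtika", quietly = TRUE) &&
  !is.na(rtika::tika_jar())

if (tika_ready) {
  # Plain text: one character string per input file.
  txt <- rtika::tika_text(sample_path)
  cat(txt[1])

  # Metadata plus XHTML content, returned as one JSON string per input file.
  json <- rtika::tika_json(sample_path)

  # jsonlite is an assumption here; any JSON parser would do.
  if (requireNamespace("jsonlite", quietly = TRUE) && !is.na(json[1])) {
    meta <- jsonlite::fromJSON(json[1])
    print(names(meta))
  }
} else {
  message("rtika or the Tika jar is missing; run rtika::install_tika() first.")
}
```

The same pattern should carry over to `tika_html()` and `tika_xml()`, which take the same kind of input and return XHTML instead of plain text.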