Package 'unrtf'

Title: Extract Text from Rich Text Format (RTF) Documents
Description: Wraps the 'unrtf' utility <https://www.gnu.org/software/unrtf/> to extract text from RTF files. Supports document conversion to HTML, LaTeX or plain text. Output in HTML is recommended because 'unrtf' has limited support for converting between character encodings.
Authors: Jeroen Ooms [aut, cre], Free Software Foundation, Inc [cph]
Maintainer: Jeroen Ooms <[email protected]>
License: GPL-3
Version: 1.4.7
Built: 2024-11-25 05:54:42 UTC
Source: https://github.com/ropensci/unrtf

Help Index


Convert rtf Documents

Description

Converts an RTF document to HTML, plain text or LaTeX. HTML output is recommended because unrtf has limited support for converting between character encodings, which is problematic for non-ASCII text.

Usage

unrtf(
  file = NULL,
  format = c("html", "text", "latex"),
  verbose = FALSE,
  conf_dir = NULL
)

Arguments

file

path or URL of the RTF file

format

output format; must be one of "html", "text" or "latex"

verbose

logical; if TRUE, print some output to stderr

conf_dir

path to a custom directory containing .conf files, which serve as output templates

Details

Output can be customized via a set of .conf files, which serve as templates for the various formats. The default .conf files are located in system.file("share", package = "unrtf"). To modify the output, copy these files to a custom location and pass that directory as the conf_dir argument to unrtf.
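The copy-and-customize workflow described above might look roughly like the following sketch. The scratch directory name "unrtf-conf" is a hypothetical choice, the sample URL is the one used in the Examples section, and the actual editing of the templates is left as a placeholder comment because the layout of the .conf files is not described here.

```r
library(unrtf)

# Locate the default templates shipped with the package
default_conf <- system.file("share", package = "unrtf")

# Copy them into a scratch directory (hypothetical location) for editing
custom_conf <- file.path(tempdir(), "unrtf-conf")
dir.create(custom_conf, showWarnings = FALSE)
file.copy(
  list.files(default_conf, full.names = TRUE),
  custom_conf,
  recursive = TRUE
)

# ... edit the copied .conf files in custom_conf here ...

# Convert using the customized templates
html <- unrtf(
  "https://jeroen.github.io/files/sample.rtf",
  format = "html",
  conf_dir = custom_conf
)
```

Working on a copy leaves the templates installed with the package untouched, so the default output stays available by omitting conf_dir.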

Examples

library(unrtf)
# Convert the same sample document to plain text and to HTML
text <- unrtf("https://jeroen.github.io/files/sample.rtf", format = "text")
html <- unrtf("https://jeroen.github.io/files/sample.rtf", format = "html")
# Print the plain-text version
cat(text)
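
Because HTML is the recommended output, one possible way to get plain text for documents with non-ASCII characters is to convert to HTML first and then strip the markup. The sketch below assumes the separate 'xml2' package is installed; it is not part of 'unrtf', and this route is only one option. The paste() call is a precaution in case the result has more than one element.

```r
library(unrtf)
library(xml2)  # assumption: installed separately, not a dependency stated here

# Convert to HTML, the recommended format
html <- unrtf("https://jeroen.github.io/files/sample.rtf", format = "html")
html <- paste(html, collapse = "\n")

# Optionally keep the HTML on disk (hypothetical temporary file)
out_file <- tempfile(fileext = ".html")
writeLines(html, out_file, useBytes = TRUE)

# Parse the HTML and extract the text of the <body> element
doc <- read_html(html)
body_text <- xml_text(xml_find_first(doc, "//body"))
cat(body_text)
```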