Package: robotstxt 0.7.15.9000
robotstxt: A 'robots.txt' Parser and 'Webbot'/'Spider'/'Crawler' Permissions Checker
Provides functions to download and parse 'robots.txt' files. Ultimately the package makes it easy to check whether bots (spiders, crawlers, scrapers, ...) are allowed to access specific resources on a domain.
Authors:
robotstxt_0.7.15.9000.tar.gz
robotstxt_0.7.15.9000.zip (r-4.5), robotstxt_0.7.15.9000.zip (r-4.4), robotstxt_0.7.15.9000.zip (r-4.3)
robotstxt_0.7.15.9000.tgz (r-4.4-any), robotstxt_0.7.15.9000.tgz (r-4.3-any)
robotstxt_0.7.15.9000.tar.gz (r-4.5-noble), robotstxt_0.7.15.9000.tar.gz (r-4.4-noble)
robotstxt_0.7.15.9000.tgz (r-4.4-emscripten), robotstxt_0.7.15.9000.tgz (r-4.3-emscripten)
robotstxt.pdf | robotstxt.html
robotstxt/json (API)
NEWS
# Install 'robotstxt' in R:
install.packages('robotstxt', repos = c('https://packages.ropensci.org', 'https://cloud.r-project.org'))
Bug tracker: https://github.com/ropensci/robotstxt/issues
Topics: crawler, peer-reviewed, robotstxt, scraper, spider, webscraping
Last updated 1 month ago from: ca6957e6f2 (on main). Checks: OK: 7. Indexed: yes.
Target | Result | Date |
---|---|---|
Doc / Vignettes | OK | Oct 02 2024 |
R-4.5-win | OK | Oct 02 2024 |
R-4.5-linux | OK | Oct 02 2024 |
R-4.4-win | OK | Oct 02 2024 |
R-4.4-mac | OK | Oct 02 2024 |
R-4.3-win | OK | Oct 02 2024 |
R-4.3-mac | OK | Oct 02 2024 |
Exports: %>%, get_robotstxt, get_robotstxt_http_get, get_robotstxts, is_valid_robotstxt, on_client_error_default, on_domain_change_default, on_file_type_mismatch_default, on_not_found_default, on_redirect_default, on_server_error_default, on_sub_domain_change_default, on_suspect_content_default, parse_robotstxt, paths_allowed, request_handler_handler, robotstxt, rt_last_http, rt_request_handler
Dependencies: askpass, cli, codetools, curl, digest, future, future.apply, globals, glue, httr, jsonlite, lifecycle, listenv, magrittr, mime, openssl, parallelly, R6, Rcpp, rlang, spiderbar, stringi, stringr, sys, vctrs
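
The description above mentions parsing 'robots.txt' files and checking bot permissions. As a rough illustration, the following sketch uses the exported functions parse_robotstxt and robotstxt on inline text, so nothing is downloaded. The robots.txt content, the domain "example.com", and the bot name "examplebot" are made up for this example and do not come from the package documentation.

```r
library(robotstxt)

# Hypothetical robots.txt content, supplied inline so no request is made.
txt <- paste(
  "User-agent: *",
  "Disallow: /private/",
  "",
  "User-agent: examplebot",
  "Disallow: /",
  sep = "\n"
)

# Parse the raw text into its components (user agents, permissions, ...).
parsed <- parse_robotstxt(txt)
parsed$permissions

# Build a robotstxt object from the same text; "example.com" is a placeholder.
rt <- robotstxt(domain = "example.com", text = txt)

# Ask whether a given bot may access the given paths.
allowed_any <- rt$check(paths = c("/", "/private/report.html"), bot = "*")
allowed_bot <- rt$check(paths = "/", bot = "examplebot")
```

For a live site, paths_allowed(paths = ..., domain = ..., bot = ...) performs the download and the check in one call; that variant needs network access, which is why the sketch above works from inline text instead.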