| Title: | Scrape News Sites using 'readability.js' |
|---|---|
| Description: | Uses Mozilla's readability.js to scrape text from websites. This is particularly useful for obtaining news articles. |
| Authors: | David Schoch [aut, cre] (ORCID: <https://orcid.org/0000-0003-2952-4812>) |
| Maintainer: | David Schoch <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.0.9000 |
| Built: | 2026-05-19 13:58:17 UTC |
| Source: | https://github.com/schochastics/paperwizard |
Scrape using Readability.js
pw_deliver(x, type = c("static", "dynamic"))
| Argument | Description |
|---|---|
| x | Either a vector of URLs or a data.frame returned by |
| type | Either "static" or "dynamic" if articles are scraped |
A tibble similar to the output of paperboy::pb_deliver().
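
A minimal usage sketch, assuming the paperwizard package is installed and Node.js with npm is available on the system. The URLs are hypothetical placeholders, and the variable names `urls` and `scrape_type` are only illustrative. The calls are guarded with `interactive()` so that nothing is installed or downloaded when the script runs non-interactively.

```r
# Hypothetical article URLs (placeholders; replace with real ones)
urls <- c(
  "https://example.com/news/article-1",
  "https://example.com/news/article-2"
)

# One of the two documented values: "static" or "dynamic"
scrape_type <- "static"

if (interactive() && requireNamespace("paperwizard", quietly = TRUE)) {
  # One-time setup: install the JavaScript dependencies via npm
  paperwizard::pw_npm_install()

  # Scrape the articles; the documented return value is a tibble
  articles <- paperwizard::pw_deliver(urls, type = scrape_type)

  # Inspect the structure of the result without assuming column names
  str(articles)
}
```

Passing the URLs as a plain character vector keeps the sketch free of other packages; a data.frame input is also documented for `x`.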
Run NPM install
Run NPM install to install dependencies.
pw_npm_install()
An installed lib
Summary of delivered articles
pw_report(x, n = 100)
| Argument | Description |
|---|---|
| x | Result from |
| n | Integer cutoff below which articles are considered too short (default 100) |
Nothing. Called for its side effects.
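
A short sketch of calling `pw_report()` with a non-default cutoff, assuming paperwizard is installed and its npm dependencies are already set up. The URL, the variable names, and the cutoff value of 200 are illustrative assumptions. The unit that `n` is measured in is not stated here, so the threshold should be checked against the package documentation before relying on it.

```r
# Hypothetical article URL (placeholder; replace with a real one)
article_urls <- "https://example.com/news/article-1"

# Illustrative cutoff; the documented default is 100
min_length <- 200L

if (interactive() && requireNamespace("paperwizard", quietly = TRUE)) {
  # Scrape first, then summarise the delivered articles
  articles <- paperwizard::pw_deliver(article_urls, type = "static")

  # Called for its side effects; the return value is not used
  paperwizard::pw_report(articles, n = min_length)
}
```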