Package 'paperwizard' reference manual

Title:	Scrape News Sites using 'readability.js'
Description:	Uses Mozilla's readability.js to scrape text from websites. This is particularly useful for obtaining news articles.
Authors:	David Schoch [aut, cre]
Maintainer:	David Schoch <[email protected]>
License:	MIT + file LICENSE
Version:	0.2.0.9000
Built:	2025-02-12 10:26:19 UTC
Source:	https://github.com/schochastics/paperwizard

Scrape using Readability.js

pw_deliver(x, type = c("static", "dynamic"))

`x`	Either a vector of URLs or a data.frame returned by `paperboy::pb_collect()`.
`type`	How the articles are scraped, either "static" or "dynamic".

A tibble similar to the output of `paperboy::pb_deliver()`.
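The two accepted input forms for `x` can be illustrated with a minimal sketch. The URL below is a hypothetical placeholder, and the sketch assumes that the Node dependencies have already been installed with `pw_npm_install()`. The `requireNamespace()` guards only make the snippet a no-op when the packages are missing.

```r
# Hypothetical URL for illustration only; replace with a real article URL.
urls <- c("https://www.example.com/news/some-article")

if (requireNamespace("paperwizard", quietly = TRUE)) {
  # Form 1: pass a character vector of URLs directly.
  articles <- paperwizard::pw_deliver(urls, type = "static")

  # Form 2: pass the data.frame returned by paperboy::pb_collect().
  if (requireNamespace("paperboy", quietly = TRUE)) {
    collected <- paperboy::pb_collect(urls)
    articles <- paperwizard::pw_deliver(collected, type = "static")
  }
}
```

Passing `type` explicitly keeps the chosen scraping mode visible in the call rather than relying on the default.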

Run NPM install

Runs `npm install` to install the JavaScript dependencies.

pw_npm_install()

An installed library of Node dependencies.
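A one-time setup might look like the sketch below. Checking for `npm` on the PATH with `Sys.which()` is an assumption about what the function needs, not something this manual states.

```r
# Locate npm on the PATH; Sys.which() returns "" when it is not found.
npm_path <- Sys.which("npm")

# Assumption: pw_npm_install() needs a working npm installation.
if (nzchar(npm_path) && requireNamespace("paperwizard", quietly = TRUE)) {
  paperwizard::pw_npm_install()
}
```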

Summary of delivered articles

pw_report(x, n = 100)

`x`	Result from `pw_deliver()`.
`n`	Integer cutoff below which articles are considered too short (default 100).

Nothing. Called for its side effects.
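A short sketch of reporting on a scrape with a stricter cutoff than the default. The URL is a hypothetical placeholder, and the value 200 is chosen only for illustration. The manual does not say in which unit `n` is measured, so the sketch makes no assumption about it.

```r
# Illustrative cutoff; the default documented above is 100.
cutoff <- 200L

if (requireNamespace("paperwizard", quietly = TRUE)) {
  # Hypothetical URL for illustration only.
  articles <- paperwizard::pw_deliver(
    "https://www.example.com/news/some-article",
    type = "static"
  )
  # Called for its side effects, so the result is not assigned.
  paperwizard::pw_report(articles, n = cutoff)
}
```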