Package 'webbotparseR' reference manual

Title:	Parse html files containing search engine results
Description:	Parse search engine results which have been scraped with the 'WebBot' browser extension <https://github.com/gesiscss/WebBot>.
Authors:	David Schoch [aut, cre], Chung-hong Chan [aut]
Maintainer:	David Schoch <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.0.9000
Built:	2025-01-31 05:55:17 UTC
Source:	https://github.com/schochastics/webbotparseR

Image data uri to file

Description

Convert a data uri to an image in the correct format and save it to a file.

Usage

base64_to_img(data_uri, slug)

Arguments

`data_uri`	character, base64 image string as returned by parse_search_results
`slug`	character, name of the file to export the image to, without extension

Value

nothing, called for side effects

Examples

## Not run: 
data_uri <- paste0(
    "data:image/png;base64,",
    base64enc::base64encode(system.file("logo.png", package = "webbotparseR"))
)
base64_to_img(data_uri, "logo")

## End(Not run)
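
As a rough illustration of the conversion described above, the following sketch splits a data URI into its MIME subtype and payload, decodes the payload, and writes it to `<slug>.<extension>`. The helper name `data_uri_to_file` and the subtype-to-extension mapping are assumptions made for this example; this is not the package's implementation. It relies on the base64enc package already used in the example above, and unlike `base64_to_img` it returns the output path invisibly.

```r
# Hypothetical helper for illustration only; not part of webbotparseR.
# Requires the base64enc package.
data_uri_to_file <- function(data_uri, slug) {
    # A base64 image data URI has the form "data:image/<subtype>;base64,<payload>".
    pattern <- "^data:image/([A-Za-z0-9.+-]+);base64,(.*)$"
    m <- regmatches(data_uri, regexec(pattern, data_uri))[[1]]
    if (length(m) != 3) {
        stop("not a base64-encoded image data URI")
    }
    # Assumption: map a few MIME subtypes to their conventional extensions,
    # otherwise use the subtype itself (e.g. "png", "gif").
    ext <- switch(m[2], jpeg = "jpg", "svg+xml" = "svg", m[2])
    out <- paste0(slug, ".", ext)
    # Decode the payload to a raw vector and write it as a binary file.
    writeBin(base64enc::base64decode(m[3]), out)
    invisible(out)
}
```

Deriving the extension from the URI header is what lets the caller pass `slug` without an extension.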

Parse metadata from search engine results

Description

Parse metadata from search engine results

Usage

parse_metadata(path)

Arguments

`path`	character. a path to a file that contains search results

Value

a tibble of parsed search engine results

Examples

parse_metadata("www.google.com_climate change_text_2023-03-16_08_16_11.html")
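
The file name in the example above appears to encode the search engine, the query, the result type, and a timestamp, separated by underscores. Assuming that naming convention holds, the components could be recovered with base R as sketched below. The helper `split_result_filename` is hypothetical and is not how `parse_metadata` is documented to work; it only illustrates the apparent layout of the file name.

```r
# Hypothetical helper for illustration only; not part of webbotparseR.
# Assumed layout: <engine>_<query>_<type>_<YYYY-MM-DD>_<HH>_<MM>_<SS>.html
split_result_filename <- function(path) {
    stem <- sub("\\.html$", "", basename(path))
    parts <- strsplit(stem, "_", fixed = TRUE)[[1]]
    n <- length(parts)
    if (n < 7) {
        stop("unexpected file name layout: ", path)
    }
    data.frame(
        engine = parts[1],
        # Everything between the engine and the type is treated as the query,
        # so a query that itself contains underscores is kept intact.
        query = paste(parts[2:(n - 5)], collapse = "_"),
        type = parts[n - 4],
        date = parts[n - 3],
        time = paste(parts[(n - 2):n], collapse = ":"),
        stringsAsFactors = FALSE
    )
}

split_result_filename(
    "www.google.com_climate change_text_2023-03-16_08_16_11.html"
)
```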

Parse search engine results

Description

Parse search engine results

Usage

parse_search_results(path, engine, selectors = "latest")

Arguments

`path`	character. either a path to a file that contains search results or a path to a directory containing search engine result files
`engine`	character.
`selectors`	either character or a `webbot_selectors` S3 object. If character, it names the selectors version; valid choices are those listed in `selectors_versions`, plus "latest" (which selects the most recent version). You can also supply your own `webbot_selectors` object.

Value

a tibble of parsed search engine results

Examples

search_html <- system.file(
    "www.google.com_climatechange_text_2023-03-16_08_16_11.html",
    package = "webbotparseR"
)

parse_search_results(search_html, engine = "google text", selectors = "ver1")
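
The `path` argument may also point to a directory of result files. How the package handles that case internally is not documented here; the sketch below shows one plausible pattern in base R, applying a single-file parser to every `.html` file in a directory and stacking the results. The helper `parse_directory`, its `parse_one` argument, and the added `source_file` column are all hypothetical names introduced for this example.

```r
# Hypothetical helper for illustration only; not part of webbotparseR.
# `parse_one` is any function that takes a file path and returns a data frame.
parse_directory <- function(dir, parse_one) {
    files <- list.files(dir, pattern = "\\.html$", full.names = TRUE)
    if (length(files) == 0) {
        stop("no .html files found in ", dir)
    }
    parsed <- lapply(files, function(f) {
        out <- parse_one(f)
        # Record which file each row came from (also safe for zero-row results).
        out$source_file <- rep(basename(f), nrow(out))
        out
    })
    # Stack the per-file results; assumes every result has the same columns.
    do.call(rbind, parsed)
}

# Possible usage with the package's parser (the directory name is a placeholder):
# parse_directory("results", function(f) {
#     webbotparseR::parse_search_results(f, engine = "google text")
# })
```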
