| Title: | Extract External 'R' Code and Insert Inline |
|---|---|
| Description: | An 'RStudio' and 'Positron' add-in that prompts the user for a web 'URL', fetches the page content, extracts 'R' code chunks, and inserts those code chunks into the active editor at the current cursor position. Supports extraction of raw 'Markdown' or 'Quarto' source files, 'GitHub' Gist and rendered 'HTML' pages that have markup elements with 'R'-related classes. |
| Authors: | VP Nagraj [aut, cre, cph] |
| Maintainer: | VP Nagraj <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-22 08:27:08 UTC |
| Source: | https://github.com/vpnagraj/ctrlvee |
When invoked from the Addins menu or a keyboard shortcut, this function:
1. Opens a dialog asking the user for a web URL.
2. Auto-detects the best extraction strategy.
3. Fetches the page and extracts R code chunks.
4. Inserts the extracted code into the active source editor at the current cursor position.
addin_crawl_chunks()
Called for side effects (i.e., inserting text); returns NULL invisibly.
After installing the package, open
Tools > Modify Keyboard Shortcuts… (RStudio) or the Command Palette
in Positron, search for "Extract External R Code and Insert Inline", and assign your
preferred shortcut (e.g. Ctrl+Shift+U / Cmd+Shift+U).
if (interactive() && rstudioapi::isAvailable()) {
  addin_crawl_chunks()
}
The primary engine of the ctrlvee package. Give it a URL to a raw source file, a GitHub Gist, or a rendered HTML page and it returns the R code chunks found there.
crawl_chunks(url, strategy = c("auto", "raw", "html"), verbose = TRUE)
| url | Character vector of length 1 with the URL to fetch |
|---|---|
| strategy | One of "auto", "raw", or "html" |
| verbose | Logical; default is TRUE |
A character vector of R code chunk bodies. Returns character(0) if no R chunks are found.
## rendered Quarto book chapter (HTML strategy detected)
chunks <- crawl_chunks("https://r4ds.hadley.nz/data-visualize.html")

## GitHub Gist ... plain .R file (raw strategy detected)
chunks <- crawl_chunks(
  "https://gist.github.com/vpnagraj/59fa609c5adf47c8c7a5b156eb261be7"
)

## you can also dictate a specific strategy
chunks <- crawl_chunks("https://r4ds.hadley.nz/data-visualize.html", strategy = "html")
Downloads an HTML page (e.g., a rendered Quarto book chapter) and extracts code blocks that are tagged as R.
crawl_chunks_html(url, verbose = interactive())
| url | Character vector of length 1 with the URL to fetch |
|---|---|
| verbose | Logical; default is interactive() |
The function looks for <code> elements whose class attribute
matches known patterns for R source code (e.g., sourceCode r,
language-r). It then grabs the inner text, stripping any
syntax highlighting <span> elements.
Note that:
- Chunk options (labels, echo, eval, etc.) are lost in rendered HTML because the renderer strips them.
- Output blocks that the renderer styled identically to source blocks may occasionally be captured. A heuristic filters out blocks where the majority (>50%) of lines look like R console output (e.g., lines starting with [1] or ##).
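As a rough illustration of the output-filtering heuristic described above, the base R sketch below flags a block when more than half of its non-empty lines start with an index marker such as [1] or with ##. The function name `sketch_looks_like_output` and the exact patterns are assumptions made for this example; the rules the package actually applies may differ.

```r
# Hypothetical sketch of a ">50% of lines look like console output" filter.
# This is not the package's source code.
sketch_looks_like_output <- function(block) {
  lines <- strsplit(block, "\n", fixed = TRUE)[[1]]
  lines <- lines[nzchar(trimws(lines))]   # ignore blank lines
  if (length(lines) == 0L) return(FALSE)

  # Assumed patterns: an index marker like [1], or a leading ##
  is_output <- grepl("^\\s*(\\[[0-9]+\\]|##)", lines)
  mean(is_output) > 0.5
}

# Keep only the blocks that do not look like console output
blocks <- c("x <- 1:3\nmean(x)", "[1] 2")
kept <- blocks[!vapply(blocks, sketch_looks_like_output, logical(1), USE.NAMES = FALSE)]
```

One consequence of a pattern like this is that a source block made up mostly of `##` comment lines would also be dropped, which is why a majority threshold rather than a single-line match seems a sensible design.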
A character vector where each element is the text content of
one R code block, in document order. Returns character(0) if
none are found.
## extract R code from a rendered Quarto book chapter
chunks <- crawl_chunks_html("https://r4ds.hadley.nz/data-visualize.html")
length(chunks)
cat(chunks[[1]])
Downloads the raw text content of a URL (typically a .qmd, .Rmd,
.md, or .R file served as plain text) and extracts R code.
crawl_chunks_raw(url, verbose = TRUE)
| url | Character vector of length 1 with the URL to fetch |
|---|---|
| verbose | Logical; default is TRUE |
For Markdown-family files, fenced R code blocks are extracted
individually. For plain .R files (or any file where no fenced
chunks are found), the entire file content is returned as a single
chunk.
A character vector where each element is the body of one R
chunk. For plain .R files, a vector of length 1 containing the
full file. Returns character(0) if the file is empty.
## fenced chunks from a Quarto source file
chunks <- crawl_chunks_raw(
  "https://raw.githubusercontent.com/hadley/r4ds/main/data-visualize.qmd"
)

## plain .R file from a GitHub Gist
chunks <- crawl_chunks_raw(
  "https://gist.githubusercontent.com/vpnagraj/59fa609c5adf47c8c7a5b156eb261be7/raw"
)
Examines the URL pattern and returns a strategy string that tells the
caller which extraction function to use. The result is either
"raw" (the URL points directly to a raw source file or a GitHub Gist) or
"html" (the URL is a rendered web page, such as a Quarto book chapter or a pkgdown site).
detect_strategy(url)
| url | Character vector of length 1 with the URL to fetch |
|---|---|
Detection is entirely URL-based (no network requests) and intentionally
conservative; when in doubt it falls back to "html".
Rules (evaluated in order):
1. Contains gist.githubusercontent.com or matches gist.github.com/<user>/<hash>: "raw"
2. Contains raw.githubusercontent.com or ends in .Rmd, .qmd, .md, .R, or .Rmarkdown: "raw"
3. Everything else: "html"
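The ordered rules above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the function name `sketch_detect_strategy`, the regular expressions, the case-sensitive extension match, and the assumption that the Gist hash is hexadecimal are all choices made for this example.

```r
# Hypothetical re-statement of the documented rules; performs no network requests.
sketch_detect_strategy <- function(url) {
  stopifnot(is.character(url), length(url) == 1L)

  # Rule 1: Gist URLs (raw Gist host, or gist.github.com/<user>/<hash>)
  is_gist <- grepl("gist.githubusercontent.com", url, fixed = TRUE) ||
    grepl("gist\\.github\\.com/[^/]+/[0-9a-fA-F]+", url)
  if (is_gist) return("raw")

  # Rule 2: raw GitHub host, or a URL ending in a known source extension
  is_raw <- grepl("raw.githubusercontent.com", url, fixed = TRUE) ||
    grepl("\\.(Rmd|qmd|md|R|Rmarkdown)$", url)
  if (is_raw) return("raw")

  # Rule 3: conservative fallback
  "html"
}

gist_result <- sketch_detect_strategy(
  "https://gist.github.com/vpnagraj/59fa609c5adf47c8c7a5b156eb261be7"
)
raw_result <- sketch_detect_strategy(
  "https://raw.githubusercontent.com/hadley/r4ds/main/data-visualize.qmd"
)
html_result <- sketch_detect_strategy("https://r4ds.hadley.nz/data-visualize.html")
```

Because the checks run in order and return early, anything that no rule matches ends up at the fallback, which mirrors the "when in doubt" behaviour described above.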
Either "raw" or "html".
detect_strategy("https://gist.github.com/vpnagraj/59fa609c5adf47c8c7a5b156eb261be7")
detect_strategy("https://raw.githubusercontent.com/hadley/r4ds/main/data-visualize.qmd")
detect_strategy("https://r4ds.hadley.nz/data-visualize.html")
Helper function that performs the HTML parsing for crawl_chunks_html(). It is also
exported so that users can call it on local HTML strings if needed.
extract_r_chunks_from_html(html)
| html | A character vector of length 1 containing HTML content |
|---|---|
Character vector of extracted R code bodies.
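To make the idea concrete, here is a minimal sketch of class-based extraction from an HTML string. It assumes the xml2 package is installed and uses an XPath query of my own choosing; the function name `sketch_extract_r_code` is hypothetical, and the package's parser may select and filter elements differently.

```r
# Hypothetical sketch: pull the text of <code> elements whose class list
# contains the token "r" (as in "sourceCode r") or "language-r".
sketch_extract_r_code <- function(html) {
  doc <- xml2::read_html(html)

  # Match whole class tokens so that e.g. "language-ruby" is not picked up
  has_class <- function(cls) {
    sprintf("contains(concat(' ', normalize-space(@class), ' '), ' %s ')", cls)
  }
  xpath <- sprintf("//code[%s or %s]", has_class("r"), has_class("language-r"))

  nodes <- xml2::xml_find_all(doc, xpath)
  # xml_text() returns the inner text, so highlighting <span> tags drop out
  xml2::xml_text(nodes)
}

if (requireNamespace("xml2", quietly = TRUE)) {
  page <- paste0(
    "<html><body>",
    "<pre><code class=\"sourceCode r\"><span class=\"fu\">mean</span>(x)</code></pre>",
    "<pre><code class=\"language-python\">print(1)</code></pre>",
    "</body></html>"
  )
  r_code <- sketch_extract_r_code(page)
}
```

Matching on whole class tokens rather than substrings is a deliberate choice in this sketch, since a plain substring test for `language-r` would also match other languages whose names begin with "r".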
A helper that goes line by line through a Markdown file to find the common R code fence
styles used in R Markdown, Quarto, and GitHub Markdown.
Non-R fenced blocks (e.g., python, sql) are skipped.
extract_r_chunks_from_markdown(text)
| text | Character vector of length 1 with Markdown source text |
|---|---|
Character vector of extracted code bodies.
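A line-by-line fence scan of this kind might look like the base R sketch below. It is an illustration under assumptions, not the package's code: the function name `sketch_extract_r_chunks` is hypothetical, and the sketch only recognises backtick fences opened with `{r ...}` or a bare `r`, which may be narrower or broader than what the package accepts.

```r
fence <- strrep("`", 3)  # three backticks, built here to keep the patterns readable

# Hypothetical sketch: collect the bodies of R fenced blocks, skipping other languages.
sketch_extract_r_chunks <- function(text) {
  lines <- strsplit(text, "\r?\n")[[1]]

  any_open <- paste0("^\\s*", fence)
  r_open   <- paste0("^\\s*", fence, "\\s*(\\{[rR]([ ,][^}]*)?\\}|[rR])\\s*$")
  closing  <- paste0("^\\s*", fence, "+\\s*$")

  chunks   <- character(0)
  in_fence <- FALSE         # currently inside some fenced block
  keep     <- FALSE         # that block is an R block
  body     <- character(0)

  for (line in lines) {
    if (!in_fence) {
      if (grepl(any_open, line)) {
        in_fence <- TRUE
        keep     <- grepl(r_open, line)
        body     <- character(0)
      }
    } else if (grepl(closing, line)) {
      if (keep) chunks <- c(chunks, paste(body, collapse = "\n"))
      in_fence <- FALSE
    } else if (keep) {
      body <- c(body, line)
    }
  }
  chunks
}

md <- paste(
  c("# Title",
    paste0(fence, "{r setup, echo=FALSE}"), "x <- 1", "y <- x + 1", fence,
    paste0(fence, "python"), "print('hi')", fence,
    paste0(fence, "r"), "mean(1:3)", fence),
  collapse = "\n"
)
r_chunks <- sketch_extract_r_chunks(md)
```

Tracking non-R fences as well as R fences keeps the scanner from mistaking the closing fence of a skipped block for the start of a new one.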