Title: | Engine for Acronyms and Initialisms |
---|---|
Description: | A tool for generating acronyms and initialisms from arbitrary text input. |
Authors: | VP Nagraj [aut, cre] |
Maintainer: | VP Nagraj <[email protected]> |
License: | GPL-3 |
Version: | 0.1.0 |
Built: | 2024-10-11 05:14:57 UTC |
Source: | https://github.com/vpnagraj/acroname |
The acroname engines include methods to generate acronyms and initialisms. acronym() searches for candidates by constructing words from the characters provided. Each constructed word is compared to the terms in the specified dictionary, and once a match is found the acronym is returned. initialism() takes the first character from each word in the string. Both functions can optionally return a tibble, ignore articles, and/or use a "bag of words" approach (for more, see mince).
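The initialism idea described above can be sketched in base R. This is an illustration only, not the package's implementation: sketch_initialism() is a hypothetical name, and the whitespace splitting and upper-casing are assumptions.

```r
# Hypothetical helper, not part of acroname: a base R sketch of an initialism.
sketch_initialism <- function(input, ignore_articles = TRUE) {
  # Split the input on runs of whitespace
  words <- strsplit(trimws(input), "\\s+")[[1]]
  if (ignore_articles) {
    # Drop English articles before taking first characters
    words <- words[!tolower(words) %in% c("a", "an", "the")]
  }
  # Keep the first character of each word and upper-case the result
  toupper(paste(substr(words, 1, 1), collapse = ""))
}

sketch_initialism("the quick brown fox")                          # "QBF"
sketch_initialism("the quick brown fox", ignore_articles = FALSE) # "TQBF"
```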
acronym(
  input,
  dictionary = NULL,
  acronym_length = 3,
  ignore_articles = TRUE,
  alnum_only = TRUE,
  timeout = 60,
  bow = FALSE,
  bow_prop = 0.5,
  to_tibble = FALSE
)
initialism(
  input,
  ignore_articles = TRUE,
  alnum_only = TRUE,
  bow = FALSE,
  bow_prop = 0.5,
  to_tibble = FALSE
)
input |
Character vector with text to use as the input for the candidate |
dictionary |
Character vector containing the dictionary of terms from which the acronym should be created; default is NULL |
acronym_length |
Number of characters in the acronym; default is 3 |
ignore_articles |
Logical indicating whether or not articles should be ignored; default is TRUE |
alnum_only |
Logical that specifies whether only alphanumeric characters should be used; default is TRUE |
timeout |
Maximum seconds to spend searching for an acronym; default is 60 |
bow |
Logical for whether or not a "bag of words" approach should be used for the "input" vector; default is FALSE |
bow_prop |
Given bow = TRUE, the proportion used when sampling words from the input; default is 0.5 |
to_tibble |
Logical as to whether or not the result should be a tibble; default is FALSE |
If to_tibble = FALSE (default), then a character vector containing the name capitalized, followed by the original string with the letters used in the name capitalized.
If to_tibble = TRUE, then a tibble with the following columns:
formatted: The candidate name and string with letters used capitalized
prefix: The candidate name
suffix: Words used with letters in name capitalized
original: The original string used to construct the name
This function will check if an input word is an article in the English language ('a', 'an', 'the').
find_articles(word)
word |
Character vector of length 1 with word to check |
Logical vector of length one: TRUE if the word is an article and FALSE if not.
find_articles("the")
find_articles("then")
find_articles("whatever")
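For readers who want to see the kind of check involved, the following base R sketch tests a single word against the three English articles. The name is_article() is hypothetical, and the case-insensitive comparison is an assumption rather than documented behaviour of find_articles().

```r
# Hypothetical stand-in for the article check; not the package's own code.
is_article <- function(word) {
  # Mirror the documented contract: a character vector of length 1
  stopifnot(is.character(word), length(word) == 1)
  # Assumption: matching is case-insensitive
  tolower(word) %in% c("a", "an", "the")
}

is_article("the")      # TRUE
is_article("then")     # FALSE
is_article("whatever") # FALSE
```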
This is an unexported helper for acronym(). It is called inside a tryCatch() that uses withTimeout to cap the time spent searching for a candidate acronym.
find_candidate(collapsed, acronym_length, probs, dictionary, words_len)
collapsed |
The collapsed string of characters generated by mince |
acronym_length |
Number of characters in the acronym; default is 3 |
probs |
Vector of probabilities for selecting each character while generating candidate |
dictionary |
Character vector containing the dictionary of terms from which the acronym should be created; default is NULL |
words_len |
Vector of the length of each word in the input |
Named list with three elements:
formatted: The candidate acronym and string with letters used capitalized
prefix: The candidate acronym
suffix: Words used with letters in acronym capitalized
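The search that this helper performs can be pictured as repeated weighted sampling followed by a dictionary lookup. The sketch below is a simplified base R illustration under several assumptions: sketch_find_candidate() and max_tries are hypothetical, sampled positions are kept in their original order, a bounded number of attempts stands in for the timeout, and only the bare candidate word is returned rather than the named list described above.

```r
# Hypothetical sketch of a candidate search; not the package's implementation.
sketch_find_candidate <- function(collapsed, acronym_length, probs, dictionary,
                                  max_tries = 1000) {
  # Break the collapsed string into single characters
  chars <- strsplit(collapsed, "")[[1]]
  for (i in seq_len(max_tries)) {
    # Sample positions without replacement, weighted by probs, then sort them
    # so the candidate keeps the left-to-right order of the input (assumption)
    idx <- sort(sample.int(length(chars), size = acronym_length, prob = probs))
    candidate <- tolower(paste(chars[idx], collapse = ""))
    # Stop at the first candidate that appears in the dictionary
    if (candidate %in% dictionary) {
      return(candidate)
    }
  }
  # No match within the allotted attempts
  NULL
}

set.seed(1)
sketch_find_candidate("purplerain", 4, rep(1, 10), c("pain", "plan"))
```

Bounding the number of attempts keeps the sketch dependency-free; the package itself is described as relying on withTimeout for the same purpose.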
This helper function will extract the first character from a string. The element may be a letter, number, or special character but will be coerced to a character vector in the output.
first_char(string)
string |
Character vector from which the first character will be extracted |
Character vector with the first character from each element in the vector passed to the input "string" argument. This value will be the same length as the original vector.
first_char("purple")
first_char(c("purple","rain"))
first_char(c("nothing","compares","2u"))
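The vectorised behaviour described above (one output element per input element) can be reproduced with base R's substr(). The function name sketch_first_char() is hypothetical and the body is a minimal sketch, not necessarily how first_char() is written.

```r
# Hypothetical base R equivalent of extracting the first character.
sketch_first_char <- function(string) {
  # as.character() coerces non-character input; substr() is vectorised,
  # so the result has the same length as the input vector
  substr(as.character(string), 1, 1)
}

sketch_first_char("purple")                       # "p"
sketch_first_char(c("nothing", "compares", "2u")) # "n" "c" "2"
```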
This helper is used by both acronym and initialism to extract elements needed from the input string.
If the function is used with bow = TRUE, the input will be processed with a "bag of words" approach, in which words are shuffled and sampled without replacement. In this case, the number of characters used is determined by the proportion passed to "bow_prop".
mince(input, ignore_articles = TRUE, alnum_only = TRUE, bow = FALSE, bow_prop = 0.5)
input |
Character vector with text to use as the input for the candidate |
ignore_articles |
Logical indicating whether or not articles should be ignored; default is TRUE |
alnum_only |
Logical that specifies whether only alphanumeric characters should be used; default is TRUE |
bow |
Logical for whether or not a "bag of words" approach should be used for the "input" vector; default is FALSE |
bow_prop |
Given bow = TRUE, the proportion used when sampling words from the input; default is 0.5 |
Named list with the following elements:
words: Vector with one element per word to be used in the acronym or initialism
collapsed: Vector of length 1 containing all characters from words collapsed
words_len: Vector containing length of each word
first_chars: Vector containing first character from each word
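To make the shape of this return value concrete, here is a base R sketch that produces a list with the same four element names. Everything about how it gets there is an assumption: sketch_mince() is a hypothetical name, the regular expression used for alnum_only and the rounding applied to bow_prop are guesses, and the real mince() may differ in each of these details.

```r
# Hypothetical sketch producing a list shaped like the documented return value.
sketch_mince <- function(input, ignore_articles = TRUE, alnum_only = TRUE,
                         bow = FALSE, bow_prop = 0.5) {
  if (alnum_only) {
    # Assumption: strip everything except alphanumerics and whitespace
    input <- gsub("[^[:alnum:][:space:]]", "", input)
  }
  words <- strsplit(trimws(input), "\\s+")[[1]]
  if (ignore_articles) {
    words <- words[!tolower(words) %in% c("a", "an", "the")]
  }
  if (bow) {
    # Assumption: keep a bow_prop share of the words, shuffled,
    # sampled without replacement
    n_keep <- ceiling(length(words) * bow_prop)
    words <- words[sample.int(length(words), n_keep)]
  }
  list(
    words = words,                           # one element per word
    collapsed = paste(words, collapse = ""), # all characters in one string
    words_len = nchar(words),                # length of each word
    first_chars = substr(words, 1, 1)        # first character of each word
  )
}

str(sketch_mince("The quick, brown fox!"))
```

With bow = FALSE the result is deterministic; with bow = TRUE the words element depends on the random seed.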