Convert features into equivalence classes defined by values of a dictionary object.

applyDictionary(x, dictionary, ...)

# S3 method for tokens
applyDictionary(x, ...)

# S3 method for tokenizedTexts
applyDictionary(x, ...)

# S3 method for dfm
applyDictionary(x, ...)

Arguments

x

object to which the dictionary or thesaurus will be applied

dictionary

the dictionary-class object that will be applied to x

...

not used

exclusive

if TRUE, remove all features not in dictionary; otherwise, replace values in dictionary with keys while leaving other features unaffected

case_insensitive

ignore the case of dictionary values if TRUE

capkeys

if TRUE, convert dictionary keys to uppercase to distinguish them from other features

verbose

print status messages if TRUE

Value

an object of the same type as x, with the value-matching features replaced by dictionary keys

Details

applyDictionary.dfm is the deprecated function name for dfm_lookup.
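
As a rough sketch of what moving off the deprecated name might look like, the snippet below calls tokens_lookup() and dfm_lookup(), the replacements named in the deprecation messages. It assumes an installed quanteda version that exports both functions; the texts and dictionaries are made up for illustration and are not from the package.

```r
library(quanteda)

# Illustrative input (an assumption, not package data)
txt <- c(d1 = "The United States and the united nations",
         d2 = "No match here")
toks <- tokens(txt)

# Multi-word values are matched against token sequences
dict_phrases <- dictionary(list(country = "united states",
                                org = "united nations"))
toks_keys <- tokens_lookup(toks, dict_phrases, valuetype = "fixed")

# For a dfm, look up single-word values against the feature names
dict_words <- dictionary(list(article = "the", negation = "no"))
dfmat_keys <- dfm_lookup(dfm(toks), dict_words, valuetype = "fixed")

toks_keys
dfmat_keys
```

With the default exclusive = TRUE, both results should contain only dictionary keys.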

Note

Selecting only features defined in a "dictionary" is traditionally known in text analysis as a dictionary method, even though technically this "dictionary" operates more like a thesaurus. If a thesaurus-like application is desired, set exclusive = FALSE to convert features defined as values in a dictionary into their keys, while keeping all other features.
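
The difference between the two modes can be illustrated with a minimal base R sketch. lookup_sketch() is a hypothetical helper written for this note, not a quanteda function; it handles only single-token values, always matches case-insensitively, and takes a plain named list in place of a dictionary object.

```r
# Hypothetical helper (not part of quanteda): replace tokens that match
# dictionary values with the corresponding dictionary key.
lookup_sketch <- function(tokens, dict, exclusive = TRUE, capkeys = FALSE) {
  # One key per dictionary value, optionally uppercased
  keys <- rep(names(dict), lengths(dict))
  if (capkeys) keys <- toupper(keys)
  # Named vector mapping lowercased value -> key
  map <- setNames(keys, tolower(unlist(dict, use.names = FALSE)))
  hit <- map[tolower(tokens)]   # NA where a token is not a dictionary value
  matched <- !is.na(hit)
  if (exclusive) {
    # Dictionary method: keep only the keys of matched tokens
    unname(hit[matched])
  } else {
    # Thesaurus-like: swap matched tokens for keys, keep everything else
    out <- tokens
    out[matched] <- hit[matched]
    out
  }
}

toks_demo <- c("Taxes", "rose", "while", "spending", "fell")
dict_demo <- list(economy = c("taxes", "spending"))

res_exclusive <- lookup_sketch(toks_demo, dict_demo)
res_thesaurus <- lookup_sketch(toks_demo, dict_demo,
                               exclusive = FALSE, capkeys = TRUE)
res_exclusive  # "economy" "economy"
res_thesaurus  # "ECONOMY" "rose" "while" "ECONOMY" "fell"
```

Uppercasing the keys in the non-exclusive case makes the substituted features easy to tell apart from the tokens that were left alone.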

Examples

toks <- tokens(data_corpus_inaugural)
head(kwic(toks, "united states"))
#> kwic object with 0 rows
dict <- dictionary(list(country = "united states"))
toks2 <- applyDictionary(toks, dict, valuetype = "fixed")
#> Warning: 'applyDictionary.tokens' is deprecated.
#> Use 'tokens_lookup' instead.
#> See help("Deprecated")
toks2
#> tokens from 58 documents.
#> 1789-Washington :
#> [1] "country" "country"
#>
#> 1793-Washington :
#> character(0)
#>
#> 1797-Adams :
#> [1] "country" "country" "country"
#>
#> 1801-Jefferson :
#> character(0)
#>
#> 1805-Jefferson :
#> [1] "country"
#>
#> 1809-Madison :
#> [1] "country" "country"
#>
#> 1813-Madison :
#> [1] "country" "country" "country" "country"
#>
#> 1817-Monroe :
#> [1] "country" "country" "country" "country" "country" "country" "country"
#> [8] "country" "country" "country" "country" "country" "country"
#>
#> 1821-Monroe :
#> [1] "country" "country" "country" "country" "country" "country" "country"
#> [8] "country" "country" "country" "country" "country" "country" "country"
#> [15] "country" "country"
#>
#> 1825-Adams :
#> [1] "country"
#>
#> 1829-Jackson :
#> [1] "country"
#>
#> 1833-Jackson :
#> [1] "country" "country" "country"
#>
#> 1837-VanBuren :
#> [1] "country"
#>
#> 1841-Harrison :
#> [1] "country" "country" "country" "country" "country" "country" "country"
#> [8] "country" "country"
#>
#> 1845-Polk :
#> [1] "country" "country" "country" "country" "country" "country" "country"
#> [8] "country" "country" "country" "country" "country"
#>
#> 1849-Taylor :
#> [1] "country"
#>
#> 1853-Pierce :
#> [1] "country" "country"
#>
#> 1857-Buchanan :
#> [1] "country" "country" "country" "country" "country" "country" "country"
#> [8] "country"
#>
#> 1861-Lincoln :
#> [1] "country" "country" "country" "country" "country"
#>
#> 1865-Lincoln :
#> character(0)
#>
#> 1869-Grant :
#> [1] "country" "country"
#>
#> 1873-Grant :
#> character(0)
#>
#> 1877-Hayes :
#> [1] "country" "country"
#>
#> 1881-Garfield :
#> [1] "country" "country" "country" "country" "country" "country" "country"
#>
#> 1885-Cleveland :
#> [1] "country"
#>
#> 1889-Harrison :
#> character(0)
#>
#> 1893-Cleveland :
#> [1] "country" "country"
#>
#> 1897-McKinley :
#> [1] "country" "country" "country" "country" "country" "country" "country"
#> [8] "country" "country" "country" "country" "country"
#>
#> 1901-McKinley :
#> [1] "country" "country" "country" "country" "country" "country" "country"
#> [8] "country" "country"
#>
#> 1905-Roosevelt :
#> character(0)
#>
#> 1909-Taft :
#> [1] "country" "country" "country" "country" "country" "country"
#>
#> 1913-Wilson :
#> character(0)
#>
#> 1917-Wilson :
#> [1] "country"
#>
#> 1921-Harding :
#> [1] "country"
#>
#> 1925-Coolidge :
#> character(0)
#>
#> 1929-Hoover :
#> [1] "country" "country"
#>
#> 1933-Roosevelt :
#> [1] "country" "country"
#>
#> 1937-Roosevelt :
#> [1] "country" "country" "country"
#>
#> 1941-Roosevelt :
#> [1] "country" "country" "country"
#>
#> 1945-Roosevelt :
#> character(0)
#>
#> 1949-Truman :
#> [1] "country" "country" "country" "country" "country"
#>
#> 1953-Eisenhower :
#> [1] "country"
#>
#> 1957-Eisenhower :
#> character(0)
#>
#> 1961-Kennedy :
#> character(0)
#>
#> 1965-Johnson :
#> character(0)
#>
#> 1969-Nixon :
#> [1] "country"
#>
#> 1973-Nixon :
#> character(0)
#>
#> 1977-Carter :
#> [1] "country"
#>
#> 1981-Reagan :
#> [1] "country"
#>
#> 1985-Reagan :
#> character(0)
#>
#> 1989-Bush :
#> [1] "country"
#>
#> 1993-Clinton :
#> character(0)
#>
#> 1997-Clinton :
#> character(0)
#>
#> 2001-Bush :
#> character(0)
#>
#> 2005-Bush :
#> [1] "country" "country" "country" "country" "country"
#>
#> 2009-Obama :
#> [1] "country"
#>
#> 2013-Obama :
#> [1] "country" "country"
#>
#> 2017-Trump :
#> [1] "country" "country"
#>