R/textstat_lexdiv.R
dfm_split_hyphenated_features.Rd
Takes a dfm that contains hyphenated features, such as
"split-second", and splits them into their component features
in the same way that tokens(x, remove_hyphens = TRUE)
would have done.
dfm_split_hyphenated_features(x)
| Argument | Description |
|---|---|
| x | input dfm |
#> Document-feature matrix of: 1 document, 5 features (0.0% sparse).
#>        features
#> docs    one-two one two three .
#>   text1       1   1   1     1 1
quanteda:::dfm_split_hyphenated_features(dfmat)
#> Document-feature matrix of: 1 document, 5 features (0.0% sparse).
#>        features
#> docs    one two three . -
#>   text1   2   2     1 1 1
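
The counting logic visible in the example output can be sketched in base R, without quanteda. This is only an illustration under stated assumptions, not the package's implementation: `split_hyphenated_counts()` is a hypothetical helper, a single document is represented as a named count vector rather than a dfm, and the hyphen is kept as its own feature because the output above shows a `-` column.

```r
# Hypothetical helper, not part of quanteda: splits hyphenated feature names
# in a named count vector and sums the counts of features that coincide.
split_hyphenated_counts <- function(counts) {
  feats <- names(counts)

  # Break each feature name into its parts, keeping "-" as a separate part
  parts <- lapply(feats, function(f) {
    pieces <- strsplit(f, "-", fixed = TRUE)[[1]]
    if (length(pieces) < 2) return(f)        # nothing to split
    head(as.vector(rbind(pieces, "-")), -1)  # e.g. "one", "-", "two"
  })

  # Each part inherits the count of the feature it came from
  expanded  <- rep(unname(counts), lengths(parts))
  all_parts <- unlist(parts)

  # Sum counts per part, preserving order of first appearance
  groups <- factor(all_parts, levels = unique(all_parts))
  vapply(split(expanded, groups), sum, numeric(1))
}

# Counts taken from the example dfm above
counts <- c("one-two" = 1, "one" = 1, "two" = 1, "three" = 1, "." = 1)
print(split_hyphenated_counts(counts))
```

With these counts, `one` and `two` each total 2 and `three`, `.`, and `-` each total 1, which agrees with the example. The feature order differs from the dfm output because the sketch keeps parts in order of first appearance.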