R/textstat-methods.R
textstat_select.Rd
Users can subset the output object of textstat_collocations, textstat_keyness, or textstat_frequency based on "glob", "regex", or "fixed" patterns using this method.
textstat_select(
  x,
  pattern = NULL,
  selection = c("keep", "remove"),
  valuetype = c("glob", "regex", "fixed"),
  case_insensitive = TRUE
)
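Argument | Description |
---|---|
x | a textstat object, i.e. the output of textstat_collocations, textstat_keyness, or textstat_frequency |
pattern | a character vector, list of character vectors, dictionary, or collocations object. See pattern for details. |
selection | whether to "keep" or "remove" the features that match pattern |
valuetype | the type of pattern matching: "glob" for "glob"-style wildcard expressions; "regex" for regular expressions; or "fixed" for exact matching |
case_insensitive | logical; if TRUE, ignore case when matching pattern |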
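
To make the interaction of selection, valuetype, and case_insensitive concrete, here is a minimal base R sketch. It is not quanteda's implementation: select_features() is a hypothetical helper written for this illustration, the toy data frame holds made-up counts, and the glob-to-regex translation via utils::glob2rx() is an assumption about how glob patterns can be emulated.

```r
# Illustrative sketch only; select_features() is a hypothetical helper,
# not part of quanteda. It mimics the documented arguments on a data frame
# that has a character column named "feature".
select_features <- function(x, pattern,
                            selection = c("keep", "remove"),
                            valuetype = c("glob", "regex", "fixed"),
                            case_insensitive = TRUE) {
  selection <- match.arg(selection)
  valuetype <- match.arg(valuetype)
  feature <- x$feature

  if (valuetype == "fixed") {
    # exact string comparison, optionally ignoring case
    hit <- if (case_insensitive) {
      tolower(feature) %in% tolower(pattern)
    } else {
      feature %in% pattern
    }
  } else {
    # glob patterns become anchored regular expressions; regex is used as given
    rx <- if (valuetype == "glob") utils::glob2rx(pattern) else pattern
    hit <- rep(FALSE, length(feature))
    # a feature is selected if it matches any one of the patterns
    for (p in rx) {
      hit <- hit | grepl(p, feature, ignore.case = case_insensitive)
    }
  }

  # keep the matching rows, or drop them
  if (selection == "keep") x[hit, , drop = FALSE] else x[!hit, , drop = FALSE]
}

# Toy input with made-up counts (not real textstat output)
toy <- data.frame(
  feature = c("america", "American", "people", "pan-america"),
  n = c(3, 2, 5, 1),
  stringsAsFactors = FALSE
)

select_features(toy, "america*")$feature
# "america" "American"
select_features(toy, "america", valuetype = "regex")$feature
# "america" "American" "pan-america"
select_features(toy, "america", valuetype = "fixed", case_insensitive = FALSE)$feature
# "america"
select_features(toy, "america*", selection = "remove")$feature
# "people" "pan-america"
```

In this sketch a glob pattern must match the whole feature, while a regex pattern may match anywhere inside it, which is why "pan-america" appears only in the regex result.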
period <- ifelse(docvars(data_corpus_inaugural, "Year") < 1945, "pre-war", "post-war")
dfmat <- dfm(data_corpus_inaugural, groups = period)
tstat <- textstat_keyness(dfmat)
textstat_select(tstat, 'america*')
#>          feature        chi2            p n_target n_reference
#> 7        america 177.5686921 0.000000e+00      130          54
#> 9      americans 151.2940052 0.000000e+00       67           7
#> 16     america's  94.4420979 0.000000e+00       35           0
#> 107     american  19.3289745 1.100241e-05       69          94
#> 1038    americas   0.8013128 3.707012e-01        2           1
#> 1624  american's   0.2671007 6.052833e-01        1           0
#> 5294 americanism  -0.3706871 5.426300e-01        0           1