R/casechange-functions.R
dfm_tolower.Rd
dfm_tolower() and dfm_toupper() convert the features of the dfm or fcm to lower and upper case, respectively, and then recombine the counts.
dfm_tolower(x, keep_acronyms = FALSE)
dfm_toupper(x)
fcm_tolower(x, keep_acronyms = FALSE)
fcm_toupper(x)
Argument | Description |
---|---|
x | the input object whose character/tokens/feature elements will be case-converted |
keep_acronyms | logical; if |
fcm_tolower() and fcm_toupper() convert both dimensions of the fcm to lower and upper case, respectively, and then recombine the counts. This works only on fcm objects created with context = "document".
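To make "recombine the counts" concrete, the following is a minimal base R sketch, not quanteda's implementation (which operates on sparse matrices). The helper name recombine_cols is hypothetical, and the dense matrix simply copies the counts from the dfm example on this page. It lower-cases the column names and sums the columns that become identical.

```r
# Hypothetical helper (not part of quanteda): sum the columns whose names
# coincide after applying a case-conversion function.
recombine_cols <- function(m, convert = tolower) {
  # rowsum() adds up rows by group, so transpose, group, and transpose back.
  # reorder = FALSE keeps the groups in order of first appearance.
  t(rowsum(t(m), group = convert(colnames(m)), reorder = FALSE))
}

# Counts copied from the dfm example: 2 documents, 5 case-sensitive features
counts <- matrix(
  c(1, 2, 0, 0, 0,
    1, 0, 2, 1, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("text1", "text2"), c("b", "A", "C", "a", "B"))
)

# Columns "b"/"B" and "A"/"a" are merged; "C" is renamed to "c"
lowered <- recombine_cols(counts)
lowered
```

Under the same assumptions, a co-occurrence matrix would need this grouping applied to its rows as well as its columns, for example recombine_cols(t(recombine_cols(t(m)))).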
# for a document-feature matrix
dfmat <- dfm(tokens(c("b A A", "C C a b B")), tolower = FALSE)
dfmat
#> Document-feature matrix of: 2 documents, 5 features (40.00% sparse) and 0 docvars.
#>        features
#> docs    b A C a B
#>   text1 1 2 0 0 0
#>   text2 1 0 2 1 1
dfm_tolower(dfmat)
#> Document-feature matrix of: 2 documents, 3 features (16.67% sparse) and 0 docvars.
#>        features
#> docs    b a c
#>   text1 1 2 0
#>   text2 2 1 2
dfm_toupper(dfmat)
#> Document-feature matrix of: 2 documents, 3 features (16.67% sparse) and 0 docvars.
#>        features
#> docs    B A C
#>   text1 1 2 0
#>   text2 2 1 2

# for a feature co-occurrence matrix
fcmat <- fcm(tokens(c("b A A d", "C C a b B e")), context = "document")
fcmat
#> Feature co-occurrence matrix of: 7 by 7 features.
#>         features
#> features b A d C a B e
#>        b 0 2 1 2 1 1 1
#>        A 0 1 2 0 0 0 0
#>        d 0 0 0 0 0 0 0
#>        C 0 0 0 1 2 2 2
#>        a 0 0 0 0 0 1 1
#>        B 0 0 0 0 0 0 1
#>        e 0 0 0 0 0 0 0
fcm_tolower(fcmat)
#> Feature co-occurrence matrix of: 5 by 5 features.
#>         features
#> features b a d c e
#>        b 1 3 1 2 2
#>        a 1 1 2 0 1
#>        d 0 0 0 0 0
#>        c 2 2 0 1 2
#>        e 0 0 0 0 0
fcm_toupper(fcmat)
#> Feature co-occurrence matrix of: 5 by 5 features.
#>         features
#> features B A D C E
#>        B 1 3 1 2 2
#>        A 1 1 2 0 1
#>        D 0 0 0 0 0
#>        C 2 2 0 1 2
#>        E 0 0 0 0 0