To make the usage as consistent as possible with other packages, quanteda
also provides shortcut wrappers to convert(), designed to be similar in
syntax to analogous commands in the packages to whose format they are
converting.
as.wfm(x)

# S3 method for dfm
as.wfm(x)

as.DocumentTermMatrix(x)

# S3 method for dfm
as.DocumentTermMatrix(x)

dfm2austin(x)

dfm2tm(x, weighting = tm::weightTf)

dfm2lda(x, omit_empty = TRUE)

dtm2lda(x, omit_empty = TRUE)

dfm2dtm(x, omit_empty = TRUE)

dfm2stm(x, docvars = NULL, omit_empty = TRUE)
Argument | Description |
---|---|
x | the dfm to be converted |
weighting | a tm weight, see |
omit_empty | logical; if |
docvars | optional data.frame of document variables used as the |
... | additional arguments used only by |
A converted object determined by the value of to (see above).
See conversion target package documentation for more detailed descriptions
of the return formats.
as.wfm converts a quanteda dfm into the wfm format used by the austin
package.
as.DocumentTermMatrix will convert a quanteda dfm into the tm package's
DocumentTermMatrix format. Note: The tm package version of
as.TermDocumentMatrix() allows a weighting argument, which supplies a
weighting function for TermDocumentMatrix(). Here the default is for
term frequency weighting. If you want a different weighting, apply the
weights after converting using one of the tm functions. For other
available weighting functions from the tm package, see
TermDocumentMatrix.
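The "convert first, weight afterwards" advice above can be sketched as follows. This is a minimal illustration rather than the package's own example: it assumes the quanteda and tm packages are installed, the toy texts and object names (`dfmat_toy`, `dtm_tf`, `dtm_tfidf`) are hypothetical, and `tm::weightTfIdf()` is just one of the tm weighting functions that could be applied.

```r
# Sketch only: assumes the quanteda and tm packages are installed.
library(quanteda)

# A tiny illustrative dfm built from hypothetical toy texts
toks <- tokens(c(doc1 = "apples and oranges",
                 doc2 = "apples and pears and plums"))
dfmat_toy <- dfm(toks)

# Convert first: the result uses term frequency weighting
dtm_tf <- convert(dfmat_toy, to = "tm")

# Then apply a different tm weighting function to the converted object
dtm_tfidf <- tm::weightTfIdf(dtm_tf, normalize = TRUE)

# Check that the reweighted object is still a DocumentTermMatrix
inherits(dtm_tfidf, "DocumentTermMatrix")
```

Weighting after conversion keeps the conversion step itself simple, and leaves the choice of weighting scheme to the tm functions that implement it.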
dfm2lda converts a dfm into the list representation of terms in documents
used by the lda package (a list with components "documents" and "vocab" as
needed by lda::lda.collapsed.gibbs.sampler()).
dfm2ldaformat converts a dfm into the list representation of terms in
documents used by the lda package (a list with components "documents" and
"vocab" as needed by lda::lda.collapsed.gibbs.sampler()).
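To make the list representation described above more concrete, the sketch below inspects the result of converting a small dfm to the lda format. It goes through convert() rather than the wrappers, the toy texts and the names `dfmat_toy` and `lda_input` are hypothetical, and it assumes the conversion runs with the installed quanteda version without further packages.

```r
# Sketch only: assumes quanteda is installed and that the "lda" target
# of convert() is available in the installed version.
library(quanteda)

# Hypothetical toy texts, used only for illustration
toks <- tokens(c(doc1 = "a b b c", doc2 = "b c c d"))
dfmat_toy <- dfm(toks)

# Convert to the list format used by the lda package
lda_input <- convert(dfmat_toy, to = "lda")

# Inspect the components: expected to include "documents" and "vocab"
names(lda_input)

# One element of "documents" per document in the dfm
length(lda_input$documents)
```

The resulting list can then be passed on to lda::lda.collapsed.gibbs.sampler(), which expects the documents and the vocabulary as separate arguments.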
Additional coercion methods to base R objects are also available:
as.data.frame(x)
converts a dfm into a data.frame
as.matrix(x)
converts a dfm into a matrix
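As a small illustration of coercion to a base R object, the sketch below converts a toy dfm to a dense matrix. The texts and the names `dfmat_toy` and `mat` are hypothetical. Only as.matrix() is shown, on the assumption that it is the coercion method least likely to differ between quanteda versions.

```r
# Sketch only: assumes quanteda is installed.
library(quanteda)

# Hypothetical toy texts, used only for illustration
toks <- tokens(c(doc1 = "a b b c", doc2 = "b c c d"))
dfmat_toy <- dfm(toks)

# Coerce the sparse dfm to a dense base R matrix
mat <- as.matrix(dfmat_toy)

# Rows are documents, columns are features
dim(mat)

# Count of the feature "b" in the first document
mat["doc1", "b"]
```

A dense matrix stores every zero cell explicitly, so this coercion is best reserved for small dfm objects.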
corp <- corpus_subset(data_corpus_inaugural, Year > 1970)
dfmat <- dfm(corp)

# shortcut conversion to austin package's wfm format
identical(as.wfm(dfmat), convert(dfmat, to = "austin"))
#> [1] TRUE

if (FALSE) {
# shortcut conversion to tm package's DocumentTermMatrix format
identical(as.DocumentTermMatrix(dfmat), convert(dfmat, to = "tm"))
}

if (FALSE) {
# shortcut conversion to lda package list format
identical(quanteda:::dfm2lda(dfmat), convert(dfmat, to = "lda"))
}

if (FALSE) {
# shortcut conversion to lda package list format
identical(dfm2ldaformat(dfmat), convert(dfmat, to = "lda"))
}