R/dfm-classes.R, R/dfm-subsetting.R
dfm-class.Rd
The dfm class is a type of Matrix-class object with additional slots,
described below. quanteda uses two subclasses of the dfm class,
depending on whether the object can be represented by a sparse matrix,
in which case it is a dfm class object, or if dense, then a dfmDense
object. See Details.
# S4 method for class 'dfm'
t(x)
# S4 method for class 'dfm'
colSums(x, na.rm = FALSE, dims = 1, ...)
# S4 method for class 'dfm'
rowSums(x, na.rm = FALSE, dims = 1, ...)
# S4 method for class 'dfm'
colMeans(x, na.rm = FALSE, dims = 1, ...)
# S4 method for class 'dfm'
rowMeans(x, na.rm = FALSE, dims = 1, ...)
# S4 method for class 'dfm,numeric'
Arith(e1, e2)
# S4 method for class 'numeric,dfm'
Arith(e1, e2)
# S4 method for class 'dfm,index,index,missing'
x[i, j, ..., drop = TRUE]
# S4 method for class 'dfm,index,index,logical'
x[i, j, ..., drop = TRUE]
# S4 method for class 'dfm,missing,missing,missing'
x[i, j, ..., drop = TRUE]
# S4 method for class 'dfm,missing,missing,logical'
x[i, j, ..., drop = TRUE]
# S4 method for class 'dfm,index,missing,missing'
x[i, j, ..., drop = TRUE]
# S4 method for class 'dfm,index,missing,logical'
x[i, j, ..., drop = TRUE]
# S4 method for class 'dfm,missing,index,missing'
x[i, j, ..., drop = TRUE]
# S4 method for class 'dfm,missing,index,logical'
x[i, j, ..., drop = TRUE]
x: the dfm object
na.rm: if TRUE, omit missing values (including NaN) from the calculations
dims: ignored
...: additional arguments not used here
e1: first quantity in an Arith operation for dfm
e2: second quantity in an Arith operation for dfm
i: document names or indices for documents to extract.
j: feature names or indices for features to extract.
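The methods listed above can be illustrated with a minimal sketch. It assumes quanteda is installed; the two toy documents and the object names (`dfmat`, `dfmat_doubled`) are made up for illustration and are not part of the package documentation.

```r
library(quanteda)

# Toy dfm: 2 documents, 4 features (a, b, c, d)
dfmat <- dfm(tokens(c("a b b c", "a c c d")))

# Total count of each feature across all documents
colSums(dfmat)

# Total number of tokens in each document
rowSums(dfmat)

# Transposing swaps documents and features
dim(t(dfmat))

# Arith methods: combine a dfm with a numeric value
dfmat_doubled <- dfmat * 2
colSums(dfmat_doubled)
```

The Arith methods are defined for both `dfm, numeric` and `numeric, dfm`, so `2 * dfmat` should behave the same way as `dfmat * 2`.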
The dfm class is a virtual class that will contain
dgCMatrix-class.
weightTf: the type of term frequency weighting applied to the dfm. Default is
"frequency", indicating that the values in the cells of the dfm are
simple feature counts. To change this, use the dfm_weight()
method.
weightFf: the type of document frequency weighting applied to the dfm. See
docfreq().
smooth: a smoothing parameter, defaults to zero. Can be changed using
the dfm_smooth() method.
Dimnames: These are inherited from Matrix-class but are
named docs and features respectively.
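As a rough sketch of how the weighting and smoothing described above are changed in practice, the example below uses `dfm_weight()` and `dfm_smooth()` on a made-up two-document dfm. The choice of `scheme = "prop"` and `smoothing = 1` is only an illustrative assumption, and the slots themselves are not accessed directly because their layout may differ between quanteda versions.

```r
library(quanteda)

# Toy dfm: 2 documents, 4 features
dfmat <- dfm(tokens(c("a b b c", "a c c d")))

# The dimnames are named "docs" and "features"
names(dimnames(dfmat))

# Reweight counts to within-document proportions
dfmat_prop <- dfm_weight(dfmat, scheme = "prop")
rowSums(dfmat_prop)

# Add a constant of 1 to every cell
dfmat_smooth <- dfm_smooth(dfmat, smoothing = 1)
colSums(dfmat_smooth)
```

After proportional weighting each row should sum to one, and after smoothing every column total should grow by the number of documents.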
# dfm subsetting
dfmat <- dfm(tokens(c("this contains lots of stopwords",
                      "no if, and, or but about it: lots",
                      "and a third document is it"),
                    remove_punct = TRUE))
dfmat[1:2, ]
#> Document-feature matrix of: 2 documents, 16 features (59.38% sparse) and 0 docvars.
#>        features
#> docs    this contains lots of stopwords no if and or but
#>   text1    1        1    1  1         1  0  0   0  0   0
#>   text2    0        0    1  0         0  1  1   1  1   1
#> [ reached max_nfeat ... 6 more features ]
dfmat[1:2, 1:5]
#> Document-feature matrix of: 2 documents, 5 features (40.00% sparse) and 0 docvars.
#>        features
#> docs    this contains lots of stopwords
#>   text1    1        1    1  1         1
#>   text2    0        0    1  0         0
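Because `i` and `j` accept names as well as indices, the same dfm can also be subset by document and feature name, or with a logical vector. The following is a sketch under that assumption; the object names `dfmat_named` and `dfmat_frequent` are illustrative only.

```r
library(quanteda)

dfmat <- dfm(tokens(c("this contains lots of stopwords",
                      "no if, and, or but about it: lots",
                      "and a third document is it"),
                    remove_punct = TRUE))

# Select documents and features by name rather than by position
dfmat_named <- dfmat[c("text1", "text2"), c("lots", "and")]
dim(dfmat_named)

# Keep only the features that occur more than once in the whole dfm
dfmat_frequent <- dfmat[, colSums(dfmat) > 1]
featnames(dfmat_frequent)
```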