Get the number of documents or features in an object.
ndoc(x)
nfeat(x)
| Argument | Description |
|---|---|
| x | a quanteda object: a corpus, dfm, or tokens object, or a readtext object from the readtext package |
Returns an integer count of the number of documents or features.
ndoc returns the number of documents in an object whose texts are organized as "documents" (a corpus, dfm, or tokens object, or a readtext object from the readtext package).
nfeat returns the number of features from a dfm; it is an alias for ntype when applied to dfm objects. This function is only defined for dfm objects because only these have "features". (To count tokens, see ntoken().)
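As a minimal sketch of how the three counts differ, the snippet below builds a tiny dfm from two made-up texts. The texts and the variable names (`txt`, `toks`, `dfmat`) are hypothetical and chosen only for illustration. It assumes that a dfm stores documents as rows and features as columns, so the counts should match the matrix dimensions.

```r
library(quanteda)

# Two short, made-up documents (hypothetical example text)
txt <- c(d1 = "a b c a", d2 = "b c d")

toks <- tokens(txt)   # tokenize each document
dfmat <- dfm(toks)    # build a document-feature matrix

ndoc(toks)    # number of documents in the tokens object
ndoc(dfmat)   # number of documents in the dfm
nfeat(dfmat)  # number of distinct features in the dfm
ntoken(toks)  # tokens per document, for comparison with nfeat()

# Assumption: documents are rows and features are columns of the dfm
stopifnot(ndoc(dfmat) == nrow(dfmat), nfeat(dfmat) == ncol(dfmat))
```

Note that `ntoken()` returns one count per document, whereas `ndoc()` and `nfeat()` each return a single number for the whole object.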
# number of documents
ndoc(data_corpus_inaugural)
#> [1] 59
ndoc(corpus_subset(data_corpus_inaugural, Year > 1980))
#> [1] 11
ndoc(tokens(data_corpus_inaugural))
#> [1] 59
ndoc(dfm(tokens(corpus_subset(data_corpus_inaugural, Year > 1980))))
#> [1] 11

# number of features
toks1 <- tokens(corpus_subset(data_corpus_inaugural, Year > 1980), remove_punct = FALSE)
toks2 <- tokens(corpus_subset(data_corpus_inaugural, Year > 1980), remove_punct = TRUE)
nfeat(dfm(toks1))
#> [1] 3426
nfeat(dfm(toks2))
#> [1] 3412