Get the number of documents or features in an object.

ndoc(x)

nfeat(x)

Arguments

x

a quanteda object: a corpus, dfm, tokens, or tokens_xptr object, or a readtext object from the readtext package

Value

ndoc() returns an integer count of the number of documents in an object whose texts are organized as "documents" (a corpus, dfm, or tokens/tokens_xptr object).

nfeat() returns an integer count of the number of features. It is an alias for ntype() for a dfm. This function is only defined for dfm objects because only these have "features".
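
As a minimal sketch of how the two counts relate to the shape of a dfm, the snippet below builds a tiny corpus and compares ndoc() and nfeat() with the row and column counts of the resulting document-feature matrix. The three texts and the object names (corp, toks, dfmat) are made up for illustration and are not part of the package.

```r
library(quanteda)

# Hypothetical three-document corpus, for illustration only
corp <- corpus(c(d1 = "a b c", d2 = "a a d", d3 = "e"))
toks <- tokens(corp)
dfmat <- dfm(toks)

# The document count is the same at every stage of the pipeline
ndoc(corp)
#> [1] 3
ndoc(toks)
#> [1] 3
ndoc(dfmat)
#> [1] 3

# Features exist only once a dfm has been built: a, b, c, d, e
nfeat(dfmat)
#> [1] 5

# For a dfm, the counts correspond to the matrix dimensions
ndoc(dfmat) == nrow(dfmat)
#> [1] TRUE
nfeat(dfmat) == ncol(dfmat)
#> [1] TRUE
```

Documents are the rows of a dfm and features are its columns, which is why nfeat() applies to a dfm but not to a corpus or tokens object.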

Examples

# number of documents
ndoc(data_corpus_inaugural)
#> [1] 59
ndoc(corpus_subset(data_corpus_inaugural, Year > 1980))
#> [1] 11
ndoc(tokens(data_corpus_inaugural))
#> [1] 59
ndoc(dfm(tokens(corpus_subset(data_corpus_inaugural, Year > 1980))))
#> [1] 11

# number of features
toks1 <- tokens(corpus_subset(data_corpus_inaugural, Year > 1980), remove_punct = FALSE)
toks2 <- tokens(corpus_subset(data_corpus_inaugural, Year > 1980), remove_punct = TRUE)
nfeat(dfm(toks1))
#> [1] 3426
nfeat(dfm(toks2))
#> [1] 3410