Package index • quanteda

Package-level
`quanteda-package` `quanteda`	An R package for the quantitative analysis of textual data
`quanteda_options()`	Get or set package options for quanteda
Data
Built-in data objects.
`data_char_sampletext`	A paragraph of text for testing various text-based functions
`data_char_ukimmig2010`	Immigration-related sections of 2010 UK party manifestos
`data_corpus_inaugural`	US presidential inaugural address texts
`data_dfm_lbgexample`	dfm from data in Table 1 of Laver, Benoit, and Garry (2003)
`data_dictionary_LSD2015`	Lexicoder Sentiment Dictionary (2015)
`data-relocated` `data_corpus_dailnoconf1991` `data_corpus_irishbudget2010`	Formerly included data objects
Corpus functions
Functions for constructing and manipulating corpus class objects.
`corpus()`	Construct a corpus object
`corpus_chunk()`	Segment a corpus into chunks of a given size
`corpus_group()`	Combine documents in corpus by a grouping variable
`corpus_reshape()`	Recast the document units of a corpus
`corpus_sample()`	Randomly sample documents from a corpus
`corpus_segment()` `char_segment()`	Segment texts on a pattern match
`corpus_subset()`	Extract a subset of a corpus
`corpus_trim()` `char_trim()`	Remove sentences based on their token lengths or a pattern match
`docvars()` `docvars<-`() `$`(<corpus>) `$<-`(<corpus>) `$`(<tokens>) `$<-`(<tokens>) `$`(<dfm>) `$<-`(<dfm>)	Get or set document-level variables
`as.character(<corpus>)` `is.corpus()` `as.corpus()`	Coercion and checking methods for corpus objects
Tokens functions
Functions for constructing and manipulating tokens class objects.
`tokens()`	Construct a tokens object
`tokens_annotate()`	Annotate a tokens object using a dictionary
`tokens_chunk()`	Segment tokens object by chunks of a given size
`tokens_compound()`	Convert token sequences into compound tokens
`tokens_group()`	Combine documents in a tokens object by a grouping variable
`tokens_lookup()`	Apply a dictionary to a tokens object
`tokens_ngrams()` `char_ngrams()` `tokens_skipgrams()`	Create n-grams and skip-grams from tokens
`tokens_replace()`	Replace tokens in a tokens object
`tokens_sample()`	Randomly sample documents from a tokens object
`tokens_segment()`	Segment tokens object by patterns
`tokens_select()` `tokens_remove()` `tokens_keep()`	Select or remove tokens from a tokens object
`tokens_split()`	Split tokens by a separator pattern
`tokens_subset()`	Extract a subset of a tokens object
`tokens_tolower()` `tokens_toupper()`	Convert the case of tokens
`tokens_trim()`	Trim tokens using frequency threshold-based feature selection
`tokens_wordstem()` `char_wordstem()` `dfm_wordstem()`	Stem the terms in an object
`is.tokens_xptr()` `as.tokens_xptr()`	Methods for tokens_xptr objects
`types()`	Get word types from a tokens object
`concat()` `concatenator()`	Return the concatenator character from an object
`as.list(<tokens>)` `as.character(<tokens>)` `is.tokens()` `as.tensor()` `as.tokens()`	Coercion, checking, and combining functions for tokens objects
Character functions
Functions for constructing and manipulating character objects.
`char_tolower()` `char_toupper()`	Convert the case of character objects
`corpus_segment()` `char_segment()`	Segment texts on a pattern match
`tokens_ngrams()` `char_ngrams()` `tokens_skipgrams()`	Create n-grams and skip-grams from tokens
`char_select()` `char_remove()` `char_keep()`	Select or remove elements from a character vector
`corpus_trim()` `char_trim()`	Remove sentences based on their token lengths or a pattern match
`tokens_wordstem()` `char_wordstem()` `dfm_wordstem()`	Stem the terms in an object
Text matrix functions
Functions for constructing and manipulating a document-feature matrix (dfm) or feature co-occurrence matrix (fcm) object.
`dfm()`	Create a document-feature matrix
`dfm_compress()` `fcm_compress()`	Recombine a dfm or fcm by combining identical dimension elements
`dfm_group()`	Combine documents in a dfm by a grouping variable
`dfm_lookup()`	Apply a dictionary to a dfm
`dfm_match()`	Match the feature set of a dfm to given feature names
`dfm_replace()`	Replace features in a dfm
`dfm_sample()`	Randomly sample documents from a dfm
`dfm_select()` `dfm_remove()` `dfm_keep()` `fcm_select()` `fcm_remove()` `fcm_keep()`	Select features from a dfm or fcm
`dfm_sort()`	Sort a dfm by frequency of one or more margins
`dfm_subset()`	Extract a subset of a dfm
`dfm_tfidf()`	Weight a dfm by tf-idf
`dfm_tolower()` `dfm_toupper()` `fcm_tolower()` `fcm_toupper()`	Convert the case of the features of a dfm or fcm and combine features that become identical
`dfm_trim()`	Trim a dfm using frequency threshold-based feature selection
`dfm_weight()` `dfm_smooth()`	Weight the feature frequencies in a dfm
`tokens_wordstem()` `char_wordstem()` `dfm_wordstem()`	Stem the terms in an object
`docfreq()`	Compute the (weighted) document frequency of a feature
`featfreq()`	Compute the frequencies of features
`head(<dfm>)` `tail(<dfm>)`	Return the first or last part of a dfm
`as.dfm()` `is.dfm()`	Coercion and checking functions for dfm objects
`as.matrix(<dfm>)`	Coerce a dfm to a matrix or data.frame
`fcm()`	Create a feature co-occurrence matrix
`fcm_sort()`	Sort an fcm in alphabetical order of the features
`as.fcm()`	Coercion and checking functions for fcm objects
Dictionary functions
Constructor and utility functions for working with dictionaries.
`dictionary()`	Create a dictionary object
`as.dictionary()` `is.dictionary()`	Coercion and checking functions for dictionary objects
`as.yaml()`	Convert quanteda dictionary objects to the YAML format
Phrase discovery functions
Functions for exploring and detecting keywords and phrases.
`is.collocations()`	Check if an object is collocations
`kwic()` `is.kwic()` `as.data.frame(<kwic>)`	Locate keywords-in-context
Utility functions
R-like functions to return counts and object information.
`index()` `is.index()`	Locate a pattern in a tokens object
`ndoc()` `nfeat()`	Count the number of documents or features
`nsentence()`	Count the number of sentences
`ntoken()` `ntype()`	Count the number of tokens or types
`print(<corpus>)` `print(<dfm>)` `print(<dictionary2>)` `print(<fcm>)` `print(<kwic>)` `print(<tokens>)`	Print methods for quanteda core objects
`docnames()` `docnames<-`() `docid()` `segid()`	Get or set document names
`featnames()`	Get the feature labels from a dfm
Miscellaneous functions
`phrase()` `as.phrase()` `is.phrase()`	Declare a pattern to be a sequence of separate patterns
`convert()`	Convert quanteda objects to non-quanteda formats
`bootstrap_dfm()`	Bootstrap a dfm
`meta()` `meta<-`()	Get or set object metadata
`spacyr-methods`	Extensions for and from spacy_parse objects
Statistics, models, and plots
Functions for computing statistics, fitting models, and producing visualisations from text.
`sparsity()`	Compute the sparsity of a document-feature matrix
`topfeatures()`	Identify the most frequent features in a dfm
`textmodels`	Models for scaling and classification of textual data
`textplots`	Plots for textual data
`textstats`	Statistics for textual data
