dfm_group() that changed or deleted docvars attributes of dfm objects (#1506).
textplot_xray() that caused incorrect facet labels when a pattern contained multiple list elements or values (#1514).
kwic() now correctly returns the pattern associated with each match as the
"keywords" attribute, for all
textstat_lexdiv() now works on tokens objects, not just dfm objects. New measures of lexical diversity include MATTR (the Moving-Average Type-Token Ratio, Covington & McFall 2010) and MSTTR (Mean Segmental Type-Token Ratio).
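To make the two measures named above concrete, here is a minimal standalone Python sketch of how MATTR and MSTTR are commonly defined. It is not the package's R implementation; the function names `mattr` and `msttr`, the default window and segment size of 100, and the choice to discard a trailing partial segment are assumptions made for illustration.

```python
def mattr(tokens, window=100):
    """Moving-Average Type-Token Ratio (illustrative sketch).

    Averages the type-token ratio over every sliding window of
    `window` consecutive tokens. The window is shrunk to the text
    length for short texts (an assumption of this sketch).
    """
    if not tokens:
        raise ValueError("tokens must not be empty")
    window = min(window, len(tokens))
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


def msttr(tokens, segment=100):
    """Mean Segmental Type-Token Ratio (illustrative sketch).

    Averages the type-token ratio over consecutive, non-overlapping
    segments of `segment` tokens. A trailing partial segment is
    discarded (an assumption of this sketch).
    """
    if not tokens:
        raise ValueError("tokens must not be empty")
    segment = min(segment, len(tokens))
    n_full = len(tokens) // segment
    ratios = [
        len(set(tokens[k * segment:(k + 1) * segment])) / segment
        for k in range(n_full)
    ]
    return sum(ratios) / len(ratios)
```

The difference between the two is only in how the windows are laid out: MATTR slides one token at a time, while MSTTR uses disjoint segments.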
tokens_split() allows splitting single tokens into multiple tokens based on a pattern match. (#1500)
tokens_chunk() allows splitting tokens into new documents of equally sized “chunks”. (#1520)
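The two operations described above can be pictured with a short standalone Python sketch. It is not the package's R code; the helper names `split_tokens` and `chunk_tokens`, the removal of the separator, and the shorter final chunk are assumptions made for illustration.

```python
def split_tokens(tokens, separator):
    """Split each token on `separator`, dropping the separator itself
    and any empty pieces (assumptions of this sketch)."""
    return [
        part
        for token in tokens
        for part in token.split(separator)
        if part
    ]


def chunk_tokens(tokens, size):
    """Cut a token sequence into consecutive chunks of `size` tokens.
    The final chunk may be shorter when the length is not a multiple
    of `size` (an assumption of this sketch)."""
    if size < 1:
        raise ValueError("size must be a positive integer")
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]
```

Each chunk would then be treated as its own document in later processing.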
textstat_entropy() now computes entropy for a dfm across feature or document margins.
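As an illustration of what computing entropy "across feature or document margins" means, the following standalone Python sketch applies Shannon entropy to the rows or columns of a small count matrix. It is not the package's R code; the function names, the `margin` labels, and the default base of 2 are assumptions made for illustration.

```python
import math


def shannon_entropy(counts, base=2):
    """Shannon entropy of a vector of non-negative counts.
    Zero counts contribute nothing to the sum."""
    total = sum(counts)
    return -sum(
        (c / total) * math.log(c / total, base)
        for c in counts
        if c > 0
    )


def matrix_entropy(matrix, margin="documents", base=2):
    """Entropy per row (margin="documents") or per column
    (margin="features") of a document-by-feature count matrix
    given as a list of rows."""
    if margin == "documents":
        vectors = matrix
    elif margin == "features":
        vectors = list(zip(*matrix))  # transpose rows into columns
    else:
        raise ValueError("margin must be 'documents' or 'features'")
    return [shannon_entropy(v, base) for v in vectors]
```

With rows as documents, a document that spreads its counts evenly over many features scores high, and a document that uses a single feature scores zero.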
textstat_readability() is vastly improved, now detailing all formulas and providing full references.
dfm_match() allows a user to specify the features in a dfm according to a fixed vector of feature names, including those of another dfm. Replaces
pattern was a dfm.
textplot_network() to allow more precise control of label sizes, either globally or individually.
dfm_sample() to the number of features, not the number of documents. (#1643)
force = TRUE option and error checking for the situations of applying
dfm_group() to a dfm that has already been weighted. (#1545) The function
textstat_frequency() now allows passing this argument to
textstat_frequency() now has a new argument for resolving ties when ranking term frequencies, defaulting to the “min” method. (#1634)
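For readers unfamiliar with the “min” method of resolving ties, the following standalone Python sketch shows the usual meaning: tied values all receive the smallest rank in their group. It is not the package's R code; the function name `rank_min` and the descending ordering (most frequent term ranked first) are assumptions made for illustration.

```python
def rank_min(frequencies):
    """Rank values in descending order using the "min" tie rule:
    tied values share the smallest rank of their group, and the
    next distinct value skips ahead accordingly."""
    ordered = sorted(frequencies, reverse=True)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        # Record only the first (smallest) position of each value.
        first_position.setdefault(value, position)
    return [first_position[value] for value in frequencies]
```

For example, frequencies of 10, 7, 7 and 3 would be ranked 1, 2, 2 and 4 under this rule, with no term receiving rank 3.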