Version 0.99 • quanteda

Added tokens_segment(), which works on tokens objects in the same way as corpus_segment() does on corpus objects (#902).
Added magrittr pipe support (#927). %>% can now be used with quanteda without needing to attach magrittr (or, as many users apparently believe, the entire tidyverse).
corpus_segment() now behaves more logically and flexibly, and is clearly differentiated from corpus_reshape() in terms of its functionality. Its documentation is also vastly improved. (#908)
Added data_dictionary_LSD2015, the Lexicoder Sentiment 2015 dictionary (#963).
Significant improvements to the performance of tokens_lookup() and dfm_lookup() (#960).
New functions head.corpus() and tail.corpus() provide fast subsetting of the first or last documents in a corpus. (#952)
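The pipe support and the corpus head()/tail() methods listed above might be combined as in the following minimal sketch. It assumes a quanteda release that re-exports %>% as described in this changelog; the example texts and object names are invented for illustration.

```r
library(quanteda)  # magrittr is deliberately not attached

# Hypothetical example texts
txt <- c(d1 = "The quick brown fox.",
         d2 = "The lazy dog sleeps.",
         d3 = "A third short document.")

corp <- corpus(txt)

# Take the first two documents, then tokenize, using the re-exported pipe
toks <- corp %>%
  head(2) %>%
  tokens()

ndoc(toks)           # number of documents that were kept
ndoc(tail(corp, 1))  # tail() on a corpus keeps the last document(s)
```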

Fixed a problem when applying purrr::map() to dfm() (#928).
Added documentation for regex2fixed() and associated functions.
Fixed a bug in textstat_collocations.tokens() caused by “documents” containing only "" as tokens. (#940)
Fixed a bug in cbind.dfm() that occurred when features shared a name starting with quanteda_options("base_featname"). (#946)
Improved dictionary handling and creation, which now correctly handle nested LIWC 2015 categories. (#941)
Number of threads now set correctly by quanteda_options(). (#966)

summary.corpus() now generates a special data.frame, which has its own print method, rather than requiring verbose = FALSE to suppress output (#926).
textstat_collocations() is now multi-threaded.
head.dfm() and tail.dfm() now behave consistently with the base R methods for matrix, with the added argument nfeature. Previously, these methods printed the subset and invisibly returned it. Now, they simply return the subset. (#952)
Dictionary keys are now unique: if multiple identical keys are defined when a dictionary is constructed, their values are merged under a single consolidated key. (#959)
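The key-merging behaviour described above can be illustrated with a minimal sketch. The dictionary contents are hypothetical, and the exact printed form of the result may differ between quanteda versions.

```r
library(quanteda)

# Hypothetical dictionary in which the key "fruit" is defined twice
dict <- dictionary(list(fruit = c("apple", "pear"),
                        fruit = c("banana"),
                        veg   = c("carrot")))

# Each key should now appear only once, with the values of the
# duplicated "fruit" entries merged under the single key
names(dict)
dict
```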

The computation of tfidf is now described more thoroughly in the documentation for this function (#997).
Fixed a bug discovered in #1011 for unused keys in tokens_lookup(..., exclusive = FALSE).

tokens_select() has a new window argument, permitting selection within an asymmetric window around the pattern of selection. (#521)
tokens_replace() now allows token types to be substituted directly and quickly.
textmodel_affinity() adds functionality to fit the Perry and Benoit (2017) class affinity model.
Added a spacy_parse method for corpus objects. Also restored quanteda methods for spacyr spacy_parsed objects.
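Selection within an asymmetric window around a pattern can be sketched with tokens_select(), which accepts a window argument in the quanteda releases this example was written against. The sketch assumes that a length-two window is read as the number of tokens before and after the match; the input text is invented.

```r
library(quanteda)

# Hypothetical single-document tokens object
toks <- tokens("a b c d e f g")

# Keep the match "d" plus one token before it and two tokens after it
sel <- tokens_select(toks, pattern = "d", window = c(1, 2))

as.list(sel)
```

A single number, such as window = 2, would request a symmetric window instead.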

Improved documentation for textmodel_nb() (#1010), and made output quantities from the fitted NB model regular matrix objects instead of Matrix classes.

All of the deprecated functions are now removed. (#991)
tokens_group() is now significantly faster.
The deprecated “list of characters” tokenize() function and all methods associated with the tokenizedTexts object types have been removed.
Added convenience functions for keeping tokens or features: tokens_keep(), dfm_keep(), and fcm_keep(). (#1037)
textmodel_NB() has been replaced by textmodel_nb().
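The tokens_keep() and dfm_keep() convenience functions listed above can be sketched as follows. The sketch assumes they are shorthand for the corresponding *_select() calls with selection = "keep"; the example text is invented.

```r
library(quanteda)

# Hypothetical tokens object
toks <- tokens("apples and pears and plums")

# Convenience form and the longer equivalent it is assumed to wrap
kept_short <- tokens_keep(toks, pattern = c("apples", "plums"))
kept_long  <- tokens_select(toks, pattern = c("apples", "plums"),
                            selection = "keep")

identical(as.list(kept_short), as.list(kept_long))

# The same idea for features of a dfm, using a glob pattern
dfmat <- dfm(toks)
featnames(dfm_keep(dfmat, pattern = "p*"))
```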
