R/corpus-addsummary-metadata.R
summary_metadata.Rd
Functions to add or retrieve corpus summary metadata
add_summary_metadata(x, extended = FALSE, ...)
get_summary_metadata(x, ...)
summarize_texts_extended(x, stop_words = stopwords("en"), n = 100)
add_summary_metadata() returns a corpus with summary metadata added as a data.frame, stored in the top-level list element named summary.
get_summary_metadata() returns the summary metadata as a data.frame.
summarize_texts_extended() returns extended summary information.
This is provided so that a corpus object can be stored with its summary information, which avoids recomputing the summary every time summary.corpus() is called.
In later calls, if !is.null(meta(x, "summary", type = "system")) && !length(list(...)), then summary.corpus() simply returns get_system_meta() rather than computing the summary statistics on the fly, which requires tokenizing the text.
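The caching condition described above can be sketched in base R. This is only an illustration and not quanteda's implementation: summary_with_cache() and the "summary_cache" attribute are hypothetical names, and whitespace splitting stands in for real tokenization.

```r
# Hypothetical stand-in for summary.corpus(); not quanteda's actual code.
summary_with_cache <- function(x, ...) {
  cached <- attr(x, "summary_cache")  # assumed storage location for the summary
  # Reuse the stored summary only if one exists and no extra arguments were given
  if (!is.null(cached) && !length(list(...))) {
    return(cached)
  }
  # Otherwise compute on the fly (crude whitespace tokenization)
  toks <- strsplit(x, "\\s+")
  data.frame(
    Text = names(x),
    Types = vapply(toks, function(tk) length(unique(tk)), integer(1)),
    Tokens = lengths(toks),
    row.names = NULL
  )
}

texts <- c(d1 = "a b b c", d2 = "x y")
# First call computes the summary; store it alongside the texts
attr(texts, "summary_cache") <- summary_with_cache(texts)
# Later calls without extra arguments return the stored data.frame
cached_summary <- summary_with_cache(texts)
```

Passing any additional argument makes length(list(...)) non-zero, so the sketch recomputes instead of returning the stored copy, mirroring the condition given in the details.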
corp <- corpus(data_char_ukimmig2010)
corp <- quanteda:::add_summary_metadata(corp)
quanteda:::get_summary_metadata(corp)
#> Corpus consisting of 9 documents, showing 9 documents:
#>
#>         Text Types Tokens Sentences
#>          BNP  1125   3280        88
#>    Coalition   142    260         4
#> Conservative   251    499        15
#>       Greens   322    679        21
#>       Labour   298    683        29
#>       LibDem   251    483        14
#>           PC    77    114         5
#>          SNP    88    134         4
#>         UKIP   346    723        26
#>
## using extended summary
if (FALSE) {
  extended_data <- quanteda:::summarize_texts_extended(data_corpus_inaugural)
  topfeatures(extended_data$top_dfm)
}