Get or set document-level meta-data. Document-level meta-data are a special type of docvars, meant to contain information about documents that would not be used as a "variable" for analysis. Examples include the source of a document, notes pertaining to its transformation, or copyright information.

Document-level meta-data differ from corpus-level meta-data in that the latter pertain to the collection of texts as a whole, whereas the former can differ from document to document.

metadoc(x, field = NULL)

metadoc(x, field = NULL) <- value

Arguments

x

a corpus object

field

character, the name(s) of the meta-data field(s) to be queried or set

value

the new value of the meta-data field

Value

For metadoc, a data.frame of the document-level meta-data, with one row per document. For metadoc <-, the corpus with the updated meta-data.

Note

Document-level meta-data names are preceded by an underscore character, such as _language, but the underscore can be omitted when naming them in the field argument.
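The naming convention described in this note can be sketched as follows. This is a minimal illustration, assuming a quanteda release in which metadoc() is still available, as documented on this page; the field name "language" and the subset condition are chosen only for the example.

```r
library(quanteda)

# Assumption: this quanteda version still exports metadoc() and metadoc<-.
mycorp <- corpus_subset(data_corpus_inaugural, Year > 2008)

# Set the field using its bare name (no underscore), one value per document.
metadoc(mycorp, "language") <- rep("english", ndoc(mycorp))

# Query the same field, again without the underscore.
lang <- metadoc(mycorp, "language")

# The stored field name carries the underscore prefix.
meta_names <- names(metadoc(mycorp))
print(meta_names)
```

Because value is supplied per document here, the same call could assign a different value to each document, which is what distinguishes document-level from corpus-level meta-data.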

See also

metacorpus

Examples

mycorp <- corpus_subset(data_corpus_inaugural, Year > 1990)
summary(mycorp, showmeta = TRUE)
#> Corpus consisting of 7 documents:
#>
#>          Text Types Tokens Sentences Year President FirstName
#>  1993-Clinton   642   1833        81 1993   Clinton      Bill
#>  1997-Clinton   773   2449       111 1997   Clinton      Bill
#>     2001-Bush   621   1808        97 2001      Bush George W.
#>     2005-Bush   773   2319       100 2005      Bush George W.
#>    2009-Obama   938   2711       110 2009     Obama    Barack
#>    2013-Obama   814   2317        88 2013     Obama    Barack
#>    2017-Trump   582   1660        88 2017     Trump Donald J.
#>
#> Source:  Gerhard Peters and John T. Woolley. The American Presidency Project.
#> Created: Tue Jun 13 14:51:47 2017
#> Notes:   http://www.presidency.ucsb.edu/inaugurals.php

metadoc(mycorp, "encoding") <- "UTF-8"
metadoc(mycorp)
#>              _encoding
#> 1993-Clinton     UTF-8
#> 1997-Clinton     UTF-8
#> 2001-Bush        UTF-8
#> 2005-Bush        UTF-8
#> 2009-Obama       UTF-8
#> 2013-Obama       UTF-8
#> 2017-Trump       UTF-8

metadoc(mycorp, "language") <- "english"
summary(mycorp, showmeta = TRUE)
#> Corpus consisting of 7 documents:
#>
#>          Text Types Tokens Sentences Year President FirstName _encoding
#>  1993-Clinton   642   1833        81 1993   Clinton      Bill     UTF-8
#>  1997-Clinton   773   2449       111 1997   Clinton      Bill     UTF-8
#>     2001-Bush   621   1808        97 2001      Bush George W.     UTF-8
#>     2005-Bush   773   2319       100 2005      Bush George W.     UTF-8
#>    2009-Obama   938   2711       110 2009     Obama    Barack     UTF-8
#>    2013-Obama   814   2317        88 2013     Obama    Barack     UTF-8
#>    2017-Trump   582   1660        88 2017     Trump Donald J.     UTF-8
#>  _language
#>    english
#>    english
#>    english
#>    english
#>    english
#>    english
#>    english
#>
#> Source:  Gerhard Peters and John T. Woolley. The American Presidency Project.
#> Created: Tue Jun 13 14:51:47 2017
#> Notes:   http://www.presidency.ucsb.edu/inaugurals.php