Draws a dispersion or "x-ray" plot of selected word pattern(s) across one or more texts. The layout of the plot depends on the number of documents covered by the kwic class objects passed: if there is only one document, keywords are plotted one below the other. If there are multiple documents, the documents are plotted one below the other, with keywords shown side-by-side. Because this function returns a ggplot2 object, you can modify the plot by adding ggplot2 layers (see Examples).

textplot_xray(..., scale = c("absolute", "relative"), sort = FALSE)

Arguments

...

any number of kwic class objects

scale

whether to scale the token index axis by the absolute position of the token in the document or by its relative position. The default is absolute for a single document and relative for multiple documents.

sort

whether to sort the rows of a multiple document plot by document name
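
As a minimal sketch of how scale and sort interact, the snippet below draws the same kwic object twice: once on the absolute token index, and once on the relative scale with rows sorted by document name. It assumes the quanteda package and its bundled data_corpus_inaugural are installed; the requireNamespace() guard is an assumption added here so that the sketch also runs on installations where textplot_xray() is provided by the companion quanteda.textplots package.

```r
library(quanteda)

# Assumption: on some installations textplot_xray() lives in quanteda.textplots
if (requireNamespace("quanteda.textplots", quietly = TRUE)) {
    library(quanteda.textplots)
}

# Tokenize first so that kwic() receives a tokens object
toks <- tokens(corpus_subset(data_corpus_inaugural, Year > 1970))
kw <- kwic(toks, pattern = "american")

# Absolute token positions on the x-axis
p_abs <- textplot_xray(kw, scale = "absolute")

# Relative positions, with document rows sorted by document name
p_sorted <- textplot_xray(kw, scale = "relative", sort = TRUE)
```

Both calls return ggplot2 objects, so printing p_abs or p_sorted renders the plot.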

Value

a ggplot2 object

Known Issues

These are known issues that we are working to resolve in future versions:

  • textplot_xray() will not display the patterns correctly when these are multi-token sequences.

  • For dictionaries with keys that have overlapping value matches to tokens in the text, only the first match will be used in the plot. The workaround is to produce one kwic per dictionary key and pass them together to textplot_xray().
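
The dictionary workaround described above could be sketched as follows. The dictionary, its key names, and its patterns are hypothetical and chosen only so that the two keys overlap (a token such as "american" matches both). The sketch assumes quanteda and data_corpus_inaugural are available, and that textplot_xray() may come from the companion quanteda.textplots package on some installations.

```r
library(quanteda)

# Assumption: on some installations textplot_xray() lives in quanteda.textplots
if (requireNamespace("quanteda.textplots", quietly = TRUE)) {
    library(quanteda.textplots)
}

toks <- tokens(corpus_subset(data_corpus_inaugural, Year > 1970))

# Hypothetical dictionary whose keys have overlapping matches
dict <- dictionary(list(
    country  = "america*",
    citizens = c("american*", "people")
))

# One kwic object per dictionary key, each built from a one-key dictionary
kw_list <- lapply(seq_along(dict), function(i) kwic(toks, pattern = dict[i]))

# Pass every kwic object as a separate argument through `...`
p <- textplot_xray, kw_list) |> identity()
```

Using do.call() spreads the list elements across the `...` argument, so the same code works for any number of dictionary keys.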

Examples

if (FALSE) {
    corp <- corpus_subset(data_corpus_inaugural, Year > 1970)

    # compare multiple documents
    textplot_xray(kwic(corp, pattern = "american"))
    textplot_xray(kwic(corp, pattern = "american"), scale = "absolute")

    # compare multiple terms across multiple documents
    textplot_xray(kwic(corp, pattern = "america*"),
                  kwic(corp, pattern = "people"))

    # how to modify the ggplot with different options
    library(ggplot2)
    tplot <- textplot_xray(kwic(corp, pattern = "american"),
                           kwic(corp, pattern = "people"))
    tplot + aes(color = keyword) +
        scale_color_manual(values = c('red', 'blue'))

    # adjust the document names
    docnames(corp) <- apply(docvars(corp, c("Year", "President")), 1,
                            paste, collapse = ", ")
    textplot_xray(kwic(corp, pattern = "america*"),
                  kwic(corp, pattern = "people"))
}