Functions to build or re-build core objects, or to upgrade earlier versions of these objects to the current format.
build_dfm(
x,
features,
docvars = data.frame(),
meta = list(),
class = NULL,
...
)
rebuild_dfm(x, attrs)
upgrade_dfm(x)
build_tokens(
x,
types,
padding = TRUE,
docvars = data.frame(),
meta = list(),
class = NULL,
...
)
rebuild_tokens(x, attrs)
upgrade_tokens(x)
build_corpus(x, docvars = data.frame(), meta = list(), class = NULL, ...)
rebuild_corpus(x, attrs)
upgrade_corpus(x)
build_dictionary2(x, meta = list(), class = "dictionary2", ...)
rebuild_dictionary2(x, attrs)
upgrade_dictionary2(x)
build_fcm(
x,
features1,
features2 = features1,
meta = list(),
class = "fcm",
...
)
rebuild_fcm(x, attrs)
upgrade_fcm(x)
an input corpus, tokens, dfm, fcm or dictionary object.
character for feature of resulting dfm
.
data.frame for document level variables created by
make_docvars()
. Names of documents are extracted from the
docname_
column.
list for meta fields
class labels to be attached to the object.
values saved in the object meta fields. They overwrite values
passed via meta
. If not specified, default values in
make_meta()
will be used.
a list of attributes to be reassigned
character for types of resulting the tokens
object.
logical indicating if the tokens
object contains paddings.
character for row feature of resulting fcm
.
character for column feature of resulting fcm
iff.
different from feature1
quanteda:::build_tokens(
list(c(1, 2, 3), c(4, 5, 6)),
docvars = quanteda:::make_docvars(n = 2L),
types = c("a", "b", "c", "d", "e", "f"),
padding = FALSE
)
#> Tokens consisting of 2 documents.
#> text1 :
#> [1] "a" "b" "c"
#>
#> text2 :
#> [1] "d" "e" "f"
#>
quanteda:::build_corpus(
c("a b c", "d e f"),
docvars = quanteda:::make_docvars(n = 2L),
unit = "sentence"
)
#> Corpus consisting of 2 documents.
#> text1 :
#> "a b c"
#>
#> text2 :
#> "d e f"
#>