Word segmentation for PTT post content and comments.
seg_content(x, words = NULL, tags = NULL, user = NULL)
seg_comment(x, words = NULL, tags = NULL, user = NULL)
Argument | Description |
---|---|
x | Column 'content' or 'comment' from a data frame returned by |
words | Character vector. A vector of words to add to the jiebaR dictionary. See |
tags | Character vector. A vector of tags specifying the lexical categories of the words in `words`. Defaults to `n` (noun). See |
user | Character. A string specifying the path to a user-defined dictionary. Defaults to the pttR built-in dictionary. See https://qinwenfeng.com/jiebaR/worker-.html#user- for details. |
For details about the built-in PTT dictionary, see https://liao961120.github.io/PTT-scrapy/.
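
A minimal usage sketch may help show how `words` and `tags` line up. The `posts` data frame, its sample text, and the `extra_words` and `extra_tags` names are hypothetical; only the `content` column name and the function signature come from the description above. The return value of `seg_content()` is not documented here, so the sketch simply prints it, and the call is skipped when pttR is not installed.

```r
# Hypothetical stand-in for a data frame of scraped posts; only the
# 'content' column name is taken from the argument description.
posts <- data.frame(
  content = c("今天天氣真好", "鄉民都在推文"),
  stringsAsFactors = FALSE
)

# Extra words to add to the dictionary, with one tag per word.
extra_words <- c("鄉民", "推文")
extra_tags  <- rep("n", length(extra_words))  # "n" (noun) is the documented default

# Call the package only if it is installed, so the sketch runs either way.
if (requireNamespace("pttR", quietly = TRUE)) {
  segmented <- pttR::seg_content(posts$content,
                                 words = extra_words,
                                 tags  = extra_tags)
  print(segmented)
}
```

`seg_comment()` takes the same arguments, so the same pattern should apply to a 'comment' column. Passing `user` is only needed when replacing the built-in dictionary with your own file.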