A numeric vector specifying the number of decimal places (floating digits) at which the collostruction strength is rounded. The default is 3.
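A hedged sketch of supplying this argument (the argument name is cut off in this fragment; float_digits below is an assumed name modelled on related functions in the package):

collex_TScore(assoc_tb, float_digits = 3L) # hypothetical argument name; rounds the collostruction strength to 3 digits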
Value
A tibble consisting of the collocates (column w),
their co-occurrence frequencies with the node (column a),
the expected co-occurrence frequencies with the node (column a_exp),
the direction of the association, i.e., attraction or repulsion (column assoc),
the T-score (column tscore),
and two unidirectional association measures of Delta P.
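The resulting tibble can be handled with standard dplyr verbs. A minimal sketch (assuming dplyr is installed, and that assoc_tb has been prepared as in the Examples below) that keeps the attracted collocates and ranks them by T-score:

library(dplyr)
collex_TScore(assoc_tb) %>%
  filter(assoc == "attraction") %>% # keep the attracted collocates only
  arrange(desc(tscore))             # strongest attraction first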
Examples
out <- colloc_leipzig(leipzig_corpus_list = demo_corpus_leipzig,
                      pattern = "ke", # a preposition meaning 'to(wards)'
                      window = "r",
                      span = 2L,
                      save_interim = FALSE)
#> Detecting a 'named list' input!
#> You chose NOT to SAVE INTERIM RESULTS, which will be stored as a list in console!
#> 1. Tokenising the "ind_mixed_2012_1M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind_mixed_2012_1M.
#> 2.1 Gathering the collocates for 'ke' ...
#> 1. Tokenising the "ind_news_2008_300K" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind_news_2008_300K.
#> 2.1 Gathering the collocates for 'ke' ...
#> 1. Tokenising the "ind_news_2009_300K" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind_news_2009_300K.
#> 2.1 Gathering the collocates for 'ke' ...
#> 1. Tokenising the "ind_news_2010_300K" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind_news_2010_300K.
#> 2.1 Gathering the collocates for 'ke' ...
#> 1. Tokenising the "ind_news_2011_300K" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind_news_2011_300K.
#> 2.1 Gathering the collocates for 'ke' ...
#> 1. Tokenising the "ind_news_2012_300K" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind_news_2012_300K.
#> 2.1 Gathering the collocates for 'ke' ...
#> 1. Tokenising the "ind_newscrawl_2011_1M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind_newscrawl_2011_1M.
#> 2.1 Gathering the collocates for 'ke' ...
#> 1. Tokenising the "ind_newscrawl_2012_1M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind_newscrawl_2012_1M.
#> 2.1 Gathering the collocates for 'ke' ...
#> 1. Tokenising the "ind_newscrawl_2015_300K" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind_newscrawl_2015_300K.
#> 2.1 Gathering the collocates for 'ke' ...
#> 1. Tokenising the "ind_newscrawl_2016_1M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind_newscrawl_2016_1M.
#> 2.1 Gathering the collocates for 'ke' ...
#> 1. Tokenising the "ind_web_2011_300K" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind_web_2011_300K.
#> 2.1 Gathering the collocates for 'ke' ...
#> 1. Tokenising the "ind_web_2012_1M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind_web_2012_1M.
#> 2.1 Gathering the collocates for 'ke' ...
#> 1. Tokenising the "ind_wikipedia_2016_1M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind_wikipedia_2016_1M.
#> 2.1 Gathering the collocates for 'ke' ...
#> 1. Tokenising the "ind-id_web_2013_1M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind-id_web_2013_1M.
#> 2.1 Gathering the collocates for 'ke' ...
#> 1. Tokenising the "ind-id_web_2015_3M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'ke' in ind-id_web_2015_3M.
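# NOTE: the messages below ("... ALL CORPORA!", "... remove stopwords!") are
# emitted by an intermediate assoc_prepare() step that builds the assoc_tb
# object passed to collex_TScore() further down. A sketch of that elided step
# (argument values are assumptions inferred from the messages and from the
# collogetr package's documented workflow):
assoc_tb <- assoc_prepare(colloc_out = out,
                          window_span = "r2",  # right-hand window, span of 2 (cf. window/span above)
                          per_corpus = FALSE,  # combine data from ALL CORPORA
                          stopword_list = collogetr::stopwords,
                          float_digits = 3L)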
#> You chose to combine the collocational and frequency list data from ALL CORPORA!
#> Tallying frequency list of all words in ALL CORPORA!
#> You chose to remove stopwords!
collex_TScore(assoc_tb)
#> # A tibble: 301 x 8
#> # Groups: node, w [301]
#> node w a a_exp assoc tscore dP_collex_cue_c… dP_cxn_cue_coll…
#> <chr> <chr> <int> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 ke rumah 10 0.578 attracti… 2.98 0.036 0.104
#> 2 ke luar 6 0.273 attracti… 2.34 0.022 0.133
#> 3 ke arah 5 0.127 attracti… 2.18 0.019 0.244
#> 4 ke berbagai 5 0.33 attracti… 2.09 0.018 0.09
#> 5 ke kata 5 1.33 attracti… 1.64 0.014 0.018
#> 6 ke negara 5 0.647 attracti… 1.95 0.017 0.043
#> 7 ke daerah 4 0.489 attracti… 1.76 0.013 0.046
#> 8 ke tempat 4 0.317 attracti… 1.84 0.014 0.074
#> 9 ke belakang 3 0.076 attracti… 1.69 0.011 0.244
#> 10 ke dua 3 0.609 attracti… 1.38 0.009 0.025
#> # … with 291 more rows