Performs a one-tailed Fisher's Exact Test to compute the collostruction/collocation strength. Internally, it calls the utility function fye_compute and applies it row-wise via map_dbl. The p-value is log-transformed to base ten, following the convention in Collostructional Analysis.

collex_fye(df, collstr_digit = 3)

Arguments

df

The output of assoc_prepare.

collstr_digit

A numeric value specifying the number of decimal places (floating digits) of the collostruction strength. The default is 3.

Value

A tibble containing the collocates (column w), their co-occurrence frequencies with the node (column a), the expected co-occurrence frequencies with the node (column a_exp), the direction of the association, e.g., attraction or repulsion (column assoc), the collostruction strength (column collstr), and two uni-directional association measures of Delta P.
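
As a rough illustration of the computation described above, the sketch below derives an FYE-based association strength for a single collocate with base R's fisher.test(). It is only a sketch of the idea, not the package's internal fye_compute(); all cell counts are hypothetical, and the sign convention for repelled collocates may differ from the package's own.

# Minimal sketch of the idea behind the collostruction strength.
# All cell counts below are hypothetical.
o11 <- 150      # collocate co-occurs with the node
o12 <- 850      # node occurs without this collocate
o21 <- 2000     # collocate occurs without the node
o22 <- 996000   # neither the node nor the collocate
cross_tab <- matrix(c(o11, o12, o21, o22), nrow = 2, byrow = TRUE)

a_exp <- (o11 + o12) * (o11 + o21) / sum(cross_tab)    # expected co-occurrence frequency
assoc <- if (o11 > a_exp) "attraction" else "repulsion"

# One-tailed Fisher's Exact Test in the direction of the association,
# with the resulting p-value log-transformed to base ten.
alt     <- if (assoc == "attraction") "greater" else "less"
p_value <- fisher.test(cross_tab, alternative = alt)$p.value
collstr <- round(-log10(p_value), digits = 3)          # cf. the collstr_digit argument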

Examples

out <- colloc_leipzig(leipzig_corpus_list = demo_corpus_leipzig,
                      pattern = "mengatakan",
                      window = "r",
                      span = 3L,
                      save_interim = FALSE)
#> Detecting a 'named list' input!
#> You chose NOT to SAVE INTERIM RESULTS, which will be stored as a list in console!
#> 1. Tokenising the "ind_mixed_2012_1M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind_mixed_2012_1M.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 1. Tokenising the "ind_news_2008_300K" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind_news_2008_300K.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 1. Tokenising the "ind_news_2009_300K" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind_news_2009_300K.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 1. Tokenising the "ind_news_2010_300K" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind_news_2010_300K.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 1. Tokenising the "ind_news_2011_300K" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind_news_2011_300K.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 1. Tokenising the "ind_news_2012_300K" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind_news_2012_300K.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 1. Tokenising the "ind_newscrawl_2011_1M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind_newscrawl_2011_1M.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 1. Tokenising the "ind_newscrawl_2012_1M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind_newscrawl_2012_1M.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 1. Tokenising the "ind_newscrawl_2015_300K" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind_newscrawl_2015_300K.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 1. Tokenising the "ind_newscrawl_2016_1M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind_newscrawl_2016_1M.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 1. Tokenising the "ind_web_2011_300K" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind_web_2011_300K.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 1. Tokenising the "ind_web_2012_1M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind_web_2012_1M.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 1. Tokenising the "ind_wikipedia_2016_1M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind_wikipedia_2016_1M.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 1. Tokenising the "ind-id_web_2013_1M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind-id_web_2013_1M.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 1. Tokenising the "ind-id_web_2015_3M" corpus. This process may take a while!
#> 1.1 Removing one-character tokens...
#> 1.2 Lowercasing the tokenised corpus...
#> At least a match is detected for 'mengatakan' in ind-id_web_2015_3M.
#> 2.1 Gathering the collocates for 'mengatakan' ...
#> 3. Storing all of the outputs...
#> DONE!
assoc_tb <- assoc_prepare(colloc_out = out, stopword_list = stopwords)
#> Your colloc_leipzig output is stored as list!
#> You chose to combine the collocational and frequency list data from ALL CORPORA!
#> Tallying frequency list of all words in ALL CORPORA!
#> You chose to remove stopwords!
am_fye <- collex_fye(df = assoc_tb, collstr_digit = 3)
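
The returned tibble can then be filtered and sorted by the collostruction strength. The lines below are a hypothetical follow-up, assuming dplyr is available and that the assoc column carries the label "attraction" for attracted collocates, as mentioned under Value.

# Hypothetical follow-up: the ten collocates most strongly attracted to the node
library(dplyr)
am_fye %>%
  filter(assoc == "attraction") %>%   # keep attracted collocates only
  arrange(desc(collstr)) %>%          # strongest collostruction strength first
  select(w, a, a_exp, collstr) %>%    # columns documented under Value
  head(10)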