R/corplingr_collex_fye.R
collex_fye.Rd
This is a vectorised wrapper for the dhyper function in the stats package.
The implementation is adapted from Gries (2012).
collex_fye also provides a logical argument, two_sided. If two_sided is TRUE, its value is passed on to the alternative argument of the embedded fisher.test.
collex_fye(
  a = "frequency of co-occurrence of the collocate and the node",
  a_exp = "expected frequency",
  n_w_in_corp = "total frequency of collexemes/collocates in the whole corpus",
  corpus_size = "total size of the corpus",
  n_pattern = "total frequency of the construction/node word in the whole corpus",
  two_sided = FALSE,
  collstr_res = TRUE,
  float = 3
)
Argument | Description |
---|---|
a | frequency of co-occurrence of the collocate and the node (cell a). |
a_exp | expected frequency for cell a. |
n_w_in_corp | the total frequency of the collexemes/collocates of the target construction/node word in the corpus. |
corpus_size | the total size (in word tokens) of the corpus. |
n_pattern | the total frequency of occurrence of the target construction/node word in the corpus. |
two_sided | logical; whether to perform a one-sided test (FALSE, the default) or a two-sided test (TRUE). |
collstr_res | logical; whether to output the FYE p-value as the Collostruction Strength value (TRUE, the default). |
float | the number of floating (decimal) digits of the Collostruction/Collocation Strength. The default value is 3. |
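The arguments above are the margins of a 2-by-2 contingency table. The following is a minimal sketch of how a Fisher-Yates Exact p-value could be derived from them with stats::fisher.test. It is not the package's own code: the helper name fye_p_sketch, the table layout, and the rule that the one-sided test looks in the direction of the observed deviation (a > a_exp) are assumptions made for illustration.

```r
# Hypothetical helper, NOT part of corplingr: FYE p-value for one collocate.
fye_p_sketch <- function(a, a_exp, n_w_in_corp, corpus_size, n_pattern,
                         two_sided = FALSE) {
  # Assumed layout: rows = with the node / elsewhere in the corpus,
  # columns = the collocate / all other words.
  crosstab <- matrix(
    c(a,               n_pattern - a,
      n_w_in_corp - a, corpus_size - n_w_in_corp - n_pattern + a),
    nrow = 2, byrow = TRUE
  )
  # Assumption: a one-sided test checks for attraction when the observed
  # frequency exceeds the expected one, and for repulsion otherwise.
  alternative <- if (two_sided) {
    "two.sided"
  } else if (a > a_exp) {
    "greater"
  } else {
    "less"
  }
  stats::fisher.test(crosstab, alternative = alternative)$p.value
}

# Made-up frequencies: expected frequency is 50 * 100 / 10000 = 0.5
p_attr <- fye_p_sketch(a = 10, a_exp = 0.5, n_w_in_corp = 50,
                       corpus_size = 10000, n_pattern = 100)
p_attr
```

Under this layout the one-sided "greater" p-value is the upper tail of the hypergeometric distribution, which is why a dhyper- or phyper-based computation and fisher.test can be expected to agree.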
Numeric vector of the same length as a, interpreted as the Collostruction Strength of the construction/node word with the collexemes/collocates.
Collostruction Strength is (i) the negative logarithm to the base of ten of the Fisher-Yates Exact test p-value when a > a_exp, and (ii) the positive logarithm when a <= a_exp.
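The sign convention described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: collstr_sketch is a hypothetical helper, not a corplingr function, and it assumes that float is used as the number of digits for rounding.

```r
# Hypothetical helper, NOT part of corplingr: p-value -> Collostruction Strength.
collstr_sketch <- function(p, a, a_exp, float = 3) {
  # Attraction (a > a_exp): -log10(p), a positive value for p < 1.
  # Otherwise (a <= a_exp):  log10(p), a negative value for p < 1.
  strength <- ifelse(a > a_exp, -log10(p), log10(p))
  round(strength, float)  # assumption: `float` is the rounding precision
}

# Two made-up collocates with the same p-value but opposite directions
res <- collstr_sketch(p = c(0.001, 0.001), a = c(10, 2), a_exp = c(0.5, 8))
res
```

With this convention the sign of the value carries the direction of the association (attraction or repulsion), while its magnitude reflects how small the p-value is.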
if (FALSE) {
# do the collocate search using "corpus_path" input-option
library(tidyverse)
df <- colloc_default(corpus_path = orti_bali_path,
                     pattern = "^nuju$",
                     window = "b", # focusing on both left and right context window
                     span = 3) # retrieve 3 collocates to the left and right of the node

# prepare the collexeme analysis input tibble
# and select to focus on R1 and R2 collocates.
collex_tb <- collex_prepare(df, span = c("r1", "r2"))

# run the Fisher-Yates Exact (FYE) Test in vectorised fashion with the help of purrr's pmap
# the example below runs the one-tailed FYE and output the p-value in log10 of CollStr value
collex_tb <- mutate(collex_tb,
                    collstr = purrr::pmap_dbl(list(a, a_exp, n_w_in_corp, corpus_size, n_pattern),
                                              collex_fye, two_sided = FALSE, collstr_res = TRUE))

# preview the results
collex_tb
}