Produce the frequency table required as input for association measures of collocations

assoc_prepare(
  colloc_out = NULL,
  window_span = NULL,
  per_corpus = FALSE,
  stopword_list = NULL,
  float_digits = 3
)

Arguments

colloc_out

The output list of colloc_leipzig, or the vector of output file names if the colloc_leipzig results were saved to disk.

window_span

The window-and-span combination of the collocates to include in the measure: either a single value (e.g., "r1" for 1 word to the right of the node) or a set of values, as in c("r1", "r2"). The default is NULL.
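As a small illustration, a set of window_span values can be built programmatically instead of typed by hand. This is only a sketch: it assumes that each label is the window letter followed by the span number, as in the "r1" example above, and the variable names are hypothetical.

```r
# Assumption: a window_span label is the window letter followed by the
# span number, as in "r1" (1 word to the right of the node).
window <- "r"  # hypothetical: collocates to the right of the node
span <- 3      # hypothetical: up to 3 words away from the node

# Build one label per position within the span.
window_span <- paste0(window, seq_len(span))
window_span
```

The resulting character vector could then be supplied whole, or subsetted (e.g., window_span[1:2]) to focus on fewer positions.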

per_corpus

Logical; whether to process the collocates per corpus file (TRUE) or aggregate the data across the corpus files (FALSE).

stopword_list

A character vector of stopwords to be removed from the collocation measures. The default is NULL.

float_digits

Numeric; the number of decimal digits to keep in the expected frequency values. The default is 3.

Value

A tbl_df with two columns, one of which is a nested column holding the input data for row-wise association measure calculation (e.g., the Fisher-Exact Test with collex_fye).
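To make the idea of input data for an association measure concrete, the sketch below builds a 2-by-2 co-occurrence table for a single collocate in base R, computes one expected frequency rounded to 3 digits, and runs a Fisher-Exact Test with stats::fisher.test. All counts and variable names are hypothetical, and the code illustrates the general technique only; it is not a description of how assoc_prepare or collex_fye are implemented.

```r
# Hypothetical counts for one collocate (not taken from the package).
a <- 30                  # collocate occurring with the node word
node_total <- 200        # all collocate tokens occurring with the node word
colloc_total <- 500      # all tokens of this collocate in the corpus
corpus_size <- 120000    # total number of tokens in the corpus

# Remaining cells of the 2-by-2 table.
b <- node_total - a            # other words occurring with the node
c_ <- colloc_total - a         # collocate occurring elsewhere
d <- corpus_size - a - b - c_  # everything else

observed <- matrix(
  c(a, b, c_, d),
  nrow = 2, byrow = TRUE,
  dimnames = list(collocate = c("yes", "no"), with_node = c("yes", "no"))
)

# Expected frequency of cell a under independence, rounded to 3 digits
# (mirroring the role described for float_digits).
a_exp <- round(node_total * colloc_total / corpus_size, digits = 3)

# One-tailed Fisher-Exact Test for attraction between collocate and node.
fye <- fisher.test(observed, alternative = "greater")
fye$p.value
```

In a table with one row per collocate, a calculation of this kind would be repeated row by row on the nested input data.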

Examples

# Apologies: the examples are commented out because of a parsing error
# in the examples sections of assoc_prepare and colloc_leipzig
# when building the website with pkgdown.
# I have not yet found a solution to this issue.

# If the colloc_leipzig output is stored as a list in the console, run as follows
# assoc_tb <- assoc_prepare(colloc_out = colloc_leipzig_output,
#                           window_span = "r1",
#                           per_corpus = FALSE,
#                           stopword_list = NULL,
#                           float_digits = 3)

# If the output of colloc_leipzig is saved to disk,
# supply the vector of output file names

## Example of running colloc_leipzig with "save_interim = TRUE"
# outfiles <- colloc_leipzig(leipzig_path = c('corp_path1.txt', 'corp_path2.txt'),
#                            pattern = "mengatakan",
#                            window = "r",
#                            span = 3,
#                            save_interim = TRUE, # save interim results to disk
#                            freqlist_output_file = "~/Desktop/out_1_freqlist.txt",
#                            colloc_output_file = "~/Desktop/out_2_collocates.txt",
#                            corpussize_output_file = "~/Desktop/out_3_corpus_size.txt",
#                            search_pattern_output_file = "~/Desktop/out_4_search_pattern.txt"
#                            )

## Example of supplying colloc_out with "outfiles"
# assoc_tb <- assoc_prepare(colloc_out = outfiles,
#                           window_span = "r1",
#                           per_corpus = FALSE,
#                           stopword_list = stopwords,
#                           float_digits = 3)