A function to generate a concordance from parallel, bilingual corpora.

para_conc(
  source_text = "The source text corpus",
  target_text = "The target text corpus",
  pattern = "Search pattern for words in the source text",
  case_insensitive = FALSE,
  conc_sample = 25,
  filename = "parallel_conc.txt"
)

Arguments

source_text

character vector of the source-text corpora

target_text

character vector of the target-text corpora

pattern

regular expression search pattern for the source-text node word

case_insensitive

logical; whether the search pattern is case insensitive. Defaults to FALSE.

conc_sample

integer; number of concordance lines to draw as a random sample. Defaults to 25.

filename

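file name of the parallel concordance output. Defaults to "parallel_conc.txt". Set to FALSE to suppress the output file and only return the tibble.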
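The interplay between pattern and case_insensitive can be previewed with base R before running para_conc(). The sketch below is only an illustration: the toy sentences are invented, and the use of grepl() with perl = TRUE is an assumption about a convenient way to test a pattern, not a description of how para_conc() matches internally.

```r
# Toy source sentences (hypothetical; not taken from the package data).
toy_source <- c("We should go.",
                "Should it rain, stay home.",
                "A broad shoulder.")

# "\\b" marks a word boundary, so "shoulder" is not treated as a match.
pattern <- "\\bshould\\b"

# Case-sensitive search: only the lowercase "should" matches.
sensitive <- grepl(pattern, toy_source, perl = TRUE)

# Case-insensitive search: the sentence-initial "Should" matches as well.
insensitive <- grepl(pattern, toy_source, perl = TRUE, ignore.case = TRUE)

print(sensitive)
print(insensitive)
```

Checking a pattern on a handful of sentences in this way makes it easier to decide whether case_insensitive = TRUE is needed to capture sentence-initial node words.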

Value

A tibble of parallel concordance lines containing the source-text node word, its left and right context, and the corresponding target-text translation. By default, para_conc() also saves the concordance to a tab-separated plain-text file named "parallel_conc.txt". Users can specify their own output file name via filename.
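To make the LEFT / NODE / RIGHT / TRANSLATION layout concrete, here is a minimal base-R sketch that builds a comparable table from invented, index-aligned toy sentences. It is not the implementation of para_conc(): the object names (toy_source, toy_target, toy_conc) are hypothetical, the sketch keeps only the first match per sentence, and it returns a plain data frame rather than a tibble.

```r
# Invented, index-aligned toy corpora (sentence i in toy_target is assumed
# to translate sentence i in toy_source).
toy_source <- c("The valve should be closed.",
                "It should work.",
                "Close the valve.")
toy_target <- c("Katup itu harus ditutup.",
                "Itu seharusnya berhasil.",
                "Tutup katupnya.")

# Locate the first match of the node pattern in each source sentence.
m <- regexpr("\\bshould\\b", toy_source, perl = TRUE)
hit <- m != -1                       # sentences containing the node word
starts <- m[hit]                     # start position of each match
ends <- starts + attr(m, "match.length")[hit] - 1
src <- toy_source[hit]

# Split each matching sentence around the node and attach its translation.
toy_conc <- data.frame(
  LEFT = trimws(substr(src, 1, starts - 1)),
  NODE = substr(src, starts, ends),
  RIGHT = trimws(substr(src, ends + 1, nchar(src))),
  TRANSLATION = toy_target[hit],
  stringsAsFactors = FALSE
)

print(toy_conc)
```

The third toy sentence has no match and is therefore dropped, which mirrors the idea that only sentences containing the node word appear in a concordance.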

Examples

para_conc(sci_en, sci_id, pattern = "should", conc_sample = 20)
#> The output concordance file (called: 'parallel_conc.txt') will be saved in this directory: '/Users/Primahadi/Documents/r-packages/paracorp/docs/reference'
#> The output concordance will ALSO be returned as a tibble data frame in the R console.
#> Detecting the match/pattern...
#> You choose to generate a 20 random-sample of the concordance lines.
#> Creating a 20 random-sample of the concordance lines...
#> Generating the concordance for the match/pattern...
#> Saving the output concordance file (called: 'parallel_conc.txt') in '/Users/Primahadi/Documents/r-packages/paracorp/docs/reference'.
#> # A tibble: 20 × 4
#>    LEFT                   NODE  RIGHT                 TRANSLATION
#>    <chr>                  <chr> <chr>                 <chr>
#>  1 This enzyme should cr… shou… also react to the ho… "Enzim ini tentunya menja…
#>  2 When designating thes… shou… always be borne in m… "Ketika menentukan filum …
#>  3 It                     shou… be admitted that ini… "Perlu diakui bahwa untuk…
#>  4 The minor improvement… shou… be as readily preser… "Perubahan kecil dari gen…
#>  5 To reach that, a cond… shou… be created, either i… "Untuk menggapainya, haru…
#>  6 This was doubly ironi… shou… be denigrated rather… "Hal ini sangat ironis, k…
#>  7 The number of these t… shou… be found all over th… "Jumlah bentuk peralihan …
#>  8 So, in designing a ta… shou… be identified betwee… "Oleh karena itu dalam pe…
#>  9 With thousands of exa… shou… be possible to descr… "Dengan ribuan contoh mut…
#> 10 It                     shou… be realized that not… "Bukan hanya pembakaran k…
#> 11 When airplane has bee… shou… be reliable so the a… "Ketika pesawat telah jad…
#> 12 So that, in arranging… shou… be seriously conside… "Sehingga dalam menyusun …
#> 13 Nucleus also function… shou… be started, executed… "Nukleus juga berfungsi m…
#> 14 The fibre is for stre… shou… be.                   "Serat untuk menguatkan l…
#> 15 Organization in compa… shou… focus on user produc… "Organisasi di perusahaan…
#> 16 Prior to the genom pr… shou… have been gained, Ar… "Sebelum ada proyek genom…
#> 17 It might seem odd tha… shou… lie at the center of… "Mungkin terlihat aneh ba…
#> 18 This very powerful ma… shou… not be confused with… "Baterai utama yang sanga…
#> 19 Products or services … shou… not only know but al… "Penyedia produk atau jas…
#> 20 Nevertheless Posman r… shou… take precautions whe… "Meski begitu, Posman men…

# we delete the automatic output file to remove warning in R CMD check
unlink("parallel_conc.txt")

# example when automatic output file is suppressed with filename = FALSE
# and only producing a tibble/data frame.
para_conc(sci_en, sci_id, pattern = "should", conc_sample = 20, filename = FALSE) # suppress automatic output
#> The output concordance will be returned as a tibble data frame in the R console.
#> Detecting the match/pattern...
#> You choose to generate a 20 random-sample of the concordance lines.
#> Creating a 20 random-sample of the concordance lines...
#> Generating the concordance for the match/pattern...
#> # A tibble: 20 × 4
#>    LEFT                   NODE  RIGHT                 TRANSLATION
#>    <chr>                  <chr> <chr>                 <chr>
#>  1 When designating thes… shou… always be borne in m… "Ketika menentukan filum …
#>  2 The process            shou… be carried by the do… "Proses tersebut harus di…
#>  3 To reach that, a cond… shou… be created, either i… "Untuk menggapainya, haru…
#>  4 The electric resulted… shou… be directly distribu… "Listrik yang dihasilkan …
#>  5 To prevent the incide… shou… be parked in a locat… "Untuk menghindari terjad…
#>  6 The applications is t… shou… be selected as the b… "Aplikasi tersebut adalah…
#>  7 It is essential that … shou… be tightly attached … "Sayap tidak bisa tidak h…
#>  8 Prior to the use of f… shou… be used.              "Sebelum menggunakan sist…
#>  9 The insects for consu… shou… come from areas or r… "Serangga yang akan dikon…
#> 10 The condition          shou… ease veterinary team… "Kondisi ini mempermudah …
#> 11 The number of these t… shou… have been even great… "Jumlah bentuk peralihan …
#> 12 Or there               shou… have existed some re… "Atau seharusnya telah hi…
#> 13 It is necessary that … shou… have gained the skil… "Ia harus memiliki dua je…
#> 14 It might seem odd tha… shou… lie at the center of… "Mungkin terlihat aneh ba…
#> 15 On the contrary, mate… shou… not enter the tissue… "Sebaliknya, zat-zat dala…
#> 16 Products or services … shou… not only know but al… "Penyedia produk atau jas…
#> 17 We                     shou… promote local food r… "Kita harus mempromosikan…
#> 18 Nature harmonious lif… shou… receive attention of… "Gaya hidup selaras denga…
#> 19 We                     shou… thank to the chemist… "Berterimakasihlah kepada…
#> 20 it                     shou… that the birds flyin… "Ini menunjukkan bahwa te…