www.tlab.it

Word Associations


This T-LAB tool allows us to check how co-occurrence relationships determine the local meaning of selected words.

On the left is a table listing the key-terms and their occurrence values within the whole corpus or a subset of it.

On user request (a simple click), T-LAB shows, for each key-term, the lexical units that share co-occurrence contexts with it.

The selection is made by computing an association index (Cosine, Dice or Jaccard).

For each query, T-LAB produces graphs and tables.
Both graphs and tables can be saved using the appropriate buttons.

In the radial diagram (see below) the selected lemma is placed in the center. The others are distributed around it, each at a distance proportional to its degree of association. The significant relationships are therefore one-to-one, between the central lemma and each of the others.
Each click on an item produces a new chart and, by right-clicking, it is possible to open a dialog box which allows several customizations.

A table shows the data used to create the graph.

The reading keys are as follows:

· LEMMA (A) = selected lemma;
· LEMMA (B) = lemmas associated with LEMMA (A);
· COEFF = value of the selected index (Cosine, Dice or Jaccard);
· TOT EC = total number of elementary contexts (EC) in the corpus or in the analysed subset;
· EC_A = total number of EC that contain the selected lemma (A);
· EC_B = total number of EC that contain each associated lemma (B);
· EC_AB = total number of EC where lemmas "A" and "B" co-occur;
· CHI2 = chi-square value for the significance of the co-occurrence.
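
As an illustration, the three indexes can be computed directly from the counts listed above. The following is a minimal Python sketch that uses the standard presence/absence definitions of Cosine, Dice and Jaccard. The function name and the sample counts are hypothetical, and T-LAB's own implementation may differ in detail.

```python
import math


def association_indexes(ec_a: int, ec_b: int, ec_ab: int) -> dict:
    """Return Cosine, Dice and Jaccard for two lemmas from their EC counts.

    ec_a  = number of elementary contexts containing lemma A (EC_A)
    ec_b  = number of elementary contexts containing lemma B (EC_B)
    ec_ab = number of elementary contexts containing both (EC_AB)
    """
    if ec_a <= 0 or ec_b <= 0:
        raise ValueError("EC_A and EC_B must be positive")
    if not 0 <= ec_ab <= min(ec_a, ec_b):
        raise ValueError("EC_AB must lie between 0 and min(EC_A, EC_B)")
    return {
        "cosine": ec_ab / math.sqrt(ec_a * ec_b),
        "dice": 2 * ec_ab / (ec_a + ec_b),
        "jaccard": ec_ab / (ec_a + ec_b - ec_ab),
    }


# Hypothetical counts, not taken from a real corpus.
print(association_indexes(ec_a=40, ec_b=25, ec_ab=10))
```

With these definitions each index ranges from 0 (the two lemmas never share a context) to 1 (they always occur together).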

For the chi-square test, the table analysed for each pair of lemmas ("A" and "B") has the following structure:

                A present     A absent              Total
    B present   nij           Ni - nij              Ni
    B absent    Nj - nij      N - Ni - Nj + nij     N - Ni
    Total       Nj            N - Nj                N

where: nij = EC_AB; Nj = EC_A; Ni = EC_B; N = TOT EC.
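
To make the arithmetic concrete, here is a small Python sketch that derives the four cells of a 2 x 2 table from EC_AB, EC_A, EC_B and TOT EC and computes the Pearson chi-square. It assumes the plain statistic without continuity correction; whether T-LAB applies a correction is not stated here, and the function name is hypothetical.

```python
def chi_square_2x2(ec_ab: int, ec_a: int, ec_b: int, tot_ec: int) -> float:
    """Pearson chi-square (no continuity correction) for a 2 x 2 co-occurrence table."""
    # Cells of the 2 x 2 table, derived from the marginal totals.
    both = ec_ab                              # A and B present (nij)
    only_a = ec_a - ec_ab                     # A present, B absent
    only_b = ec_b - ec_ab                     # B present, A absent
    neither = tot_ec - ec_a - ec_b + ec_ab    # A and B absent
    if min(both, only_a, only_b, neither) < 0:
        raise ValueError("inconsistent counts")
    # Product of the four marginal totals: Nj, N - Nj, Ni, N - Ni.
    denominator = ec_a * (tot_ec - ec_a) * ec_b * (tot_ec - ec_b)
    if denominator == 0:
        raise ValueError("chi-square is undefined when a marginal total is 0 or N")
    return tot_ec * (both * neither - only_a * only_b) ** 2 / denominator


# Hypothetical counts, not taken from a real corpus.
print(round(chi_square_2x2(ec_ab=10, ec_a=40, ec_b=25, tot_ec=1000), 2))
```

The statistic is 0 when the observed co-occurrences equal the number expected under independence (EC_A x EC_B / TOT EC) and grows as the observed count departs from it.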


A double-click on a table item (e.g. "financial") allows us to save an HTML file with all the elementary contexts (i.e. sentences or paragraphs) where the selected lemma co-occurs with the central word (e.g. "financial" and "terrorist").
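
The kind of file described above can be approximated outside T-LAB. The sketch below is only an illustration: the input format (pairs of text and lemma set), the function name and the sample sentences are hypothetical, and the layout of the HTML that T-LAB actually saves is not documented here.

```python
import html


def cooccurrence_contexts_html(contexts, central, selected):
    """Build an HTML page listing the contexts that contain both lemmas.

    contexts: iterable of (text, lemmas) pairs, where lemmas is the set of
    lemmas found in that elementary context (hypothetical input format).
    """
    items = [
        f"<li>{html.escape(text)}</li>"
        for text, lemmas in contexts
        if central in lemmas and selected in lemmas
    ]
    title = html.escape(f"{selected} + {central}")
    return (
        f"<html><head><title>{title}</title></head>"
        f"<body><h1>{title}</h1><ul>{''.join(items)}</ul></body></html>"
    )


# Invented example contexts, not taken from a real corpus.
sample = [
    ("Banks froze terrorist financial assets.",
     {"bank", "freeze", "terrorist", "financial", "asset"}),
    ("The financial markets fell.", {"financial", "market", "fall"}),
]
page = cooccurrence_contexts_html(sample, central="terrorist", selected="financial")
print(page)
```

Only the first sample context is kept, because it is the only one that contains both lemmas.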

Further graphs (bar charts) allow us to appreciate the values of the coefficient used and the percentage of co-occurrence contexts (see below).