T-LAB 7.3 ON-LINE HELP

Elementary Contexts


During the import phase, T-LAB segments the corpus into elementary contexts, both to help the user explore the texts and, above all, to support the analyses that require computing co-occurrences.
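To illustrate why the segmentation matters for co-occurrence analysis, here is a minimal Python sketch that counts, for each pair of words, the number of elementary contexts in which both appear. It is not T-LAB's implementation: the function name `cooccurrences` and the naive tokenisation (lowercasing, splitting on whitespace, stripping punctuation) are assumptions made only for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrences(contexts: list[str]) -> Counter:
    """Count, for each unordered word pair, the contexts containing both words."""
    counts: Counter = Counter()
    for context in contexts:
        # Assumption: naive tokenisation, not T-LAB's lemmatization.
        words = {w.strip(".,;:?!").lower() for w in context.split()}
        words.discard("")
        # Each pair is counted once per context, in alphabetical order.
        for pair in combinations(sorted(words), 2):
            counts[pair] += 1
    return counts


contexts = ["The cat sleeps.", "The cat eats."]
counts = cooccurrences(contexts)
print(counts[("cat", "the")])
```

Because the counts are computed per context, the way the corpus is segmented directly changes which word pairs are recorded as co-occurring.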


According to the user's choices, the elementary contexts can be:

1 - Sentences

Elementary contexts ending with a punctuation mark (. ? !), with a length between 50 and 1,000 characters.

 

2 - Chunks

Elementary contexts of comparable length made up of one or more sentences.

More precisely:

- T-LAB treats as an elementary context every sequence of words terminated by a full stop and carriage return, provided its length is less than 400 characters;

- if no full stop occurs within the maximum length, it searches for other punctuation marks in the following order (? ! ; : ,). If none is found, it segments the text on the basis of a statistical criterion, without splitting lexical units.
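The chunking rules above can be sketched in Python as follows. This is only an illustration under stated assumptions, not T-LAB's actual code: the names `find_cut` and `split_chunks` are hypothetical, carriage returns are not treated specially, and, since the statistical criterion is not described here, the sketch substitutes a simple cut at the last whitespace so that no word is split.

```python
MAX_LEN = 400  # chunks must be shorter than this, as stated above
FALLBACK_MARKS = ["?", "!", ";", ":", ","]  # search order stated above


def find_cut(text: str, max_len: int = MAX_LEN) -> int:
    """Return the index just past the end of the next chunk in `text`."""
    if len(text) < max_len:
        return len(text)
    window = text[: max_len - 1]
    # Prefer the last full stop, then the other marks in the documented order.
    for mark in ["."] + FALLBACK_MARKS:
        pos = window.rfind(mark)
        if pos != -1:
            return pos + 1
    # Assumption: stand-in for the unspecified statistical criterion.
    # Cut at the last whitespace so that no word is split.
    pos = window.rfind(" ")
    if pos > 0:
        return pos
    # A single word longer than the limit is kept whole.
    nxt = text.find(" ", max_len)
    return len(text) if nxt == -1 else nxt


def split_chunks(text: str, max_len: int = MAX_LEN) -> list[str]:
    """Split `text` into chunks shorter than `max_len` where possible."""
    chunks = []
    rest = text.strip()
    while rest:
        cut = find_cut(rest, max_len)
        chunks.append(rest[:cut].strip())
        rest = rest[cut:].strip()
    return chunks


sample = "Alpha beta. Gamma delta, epsilon zeta eta theta"
for chunk in split_chunks(sample, max_len=20):
    print(chunk)
```

With the small limit of 20 characters used in the usage lines, the sample is cut first at the full stop, then at the comma, and finally at a word boundary, which mirrors the order of preference described above.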



3 - Paragraphs

Elementary contexts ending with a punctuation mark (. ? !) and a carriage return, with a maximum length of 2,000 characters.

4 - Short Texts

This option is enabled only when the texts in the corpus are at most 2,000 characters long (e.g. responses to open-ended questions).

N.B.:

- the corpus_segments.dat file contains the result of corpus segmentation;

- the Concordances option in T-LAB lets you check the elementary contexts in which each word (or lemma) occurs.