T-LAB Plus 2022 was released on October 16, 2021.
Below is an illustration of the main new features and improvements found in this version of the software.
1 - The way T-LAB processes Chinese texts has been refined, and three built-in examples in this language have been added: the Analects of Confucius, the 2020 annual report on China's policies and actions to address climate change, and ten thousand Weibo posts related to COVID-19 (see the pictures below).
2 - The ‘Open Table’ option of the Corpus Builder tool now allows one to easily import data in three further formats: .SAV (i.e. SPSS files), .JSON (e.g. Twitter data) and .XML. Moreover, T-LAB is now faster at generating a corpus from a data table with thousands of records.
3 - The way T-LAB imports and exports .XLS and .XLSX files has been improved and no longer requires Microsoft Office to be installed. Also, when importing .CSV files the Corpus Builder tool automatically detects delimiters, and from the main menu the user can now choose the default format of the .CSV files to be exported (see the picture below).
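As an illustration of what automatic delimiter detection involves, here is a minimal Python sketch based on the standard library's csv.Sniffer. It is not T-LAB's own implementation, which is not described here; the function name detect_delimiter, the list of candidate delimiters and the sample table are assumptions made for the example.

```python
import csv
import io


def detect_delimiter(sample: str, candidates: str = ",;\t|") -> str:
    """Guess the delimiter of a CSV sample with the standard library sniffer.

    `candidates` restricts the guess to plausible delimiters (an assumption
    of this sketch, not a T-LAB setting).
    """
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter


# Hypothetical sample: a semicolon-delimited table with one text column.
sample = "id;year;text\n1;2020;First document\n2;2021;Second document\n"

delimiter = detect_delimiter(sample)

# Once the delimiter is known, the table can be read record by record.
rows = list(csv.DictReader(io.StringIO(sample), delimiter=delimiter))
print(repr(delimiter), rows[0]["text"])
```

The same function applied to a comma-delimited sample would return a comma, so one import routine can accept files written with different regional settings.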
4 - Since many T-LAB tools use clustering algorithms, the Cluster Analysis sub-menu has been changed as shown below to guide users. Accordingly, when one of the available options is selected, the user is automatically redirected to the corresponding tool.
For example, when choosing the first of the above options, the following window will appear.
As a reminder, here is a dendrogram which summarizes the main T-LAB tools to which the Cluster Analysis sub-menu may redirect (see below the tools marked with a red bullet point).
5 - Now, before performing an SVD (i.e. Singular Value Decomposition) of a co-occurrence matrix with up to 5,000 columns, it is possible to access several advanced options for word embedding.
After the user sets the advanced options (e.g. co-occurrence context and co-occurrence threshold), T-LAB performs the following steps: (1) builds the co-occurrence matrix; (2) computes PPMI (Positive Pointwise Mutual Information) values; (3) performs an SVD; (4) extracts the first 50 dimensions (i.e. the word embedding).
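The four steps above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not T-LAB's actual code: the function names, the min_count parameter (standing in for the co-occurrence threshold), the choice to count a pair once per context unit, and the scaling of the vectors by the singular values are all hypothetical.

```python
import numpy as np


def cooccurrence_matrix(contexts, vocab, min_count=1):
    """Step 1: count how many context units contain each pair of words."""
    index = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for context in contexts:
        words = sorted({w for w in context if w in index})
        for a in words:
            for b in words:
                if a != b:
                    counts[index[a], index[b]] += 1
    # Assumed threshold: drop pairs that co-occur fewer than min_count times.
    return np.where(counts >= min_count, counts, 0.0)


def ppmi(counts):
    """Step 2: Positive PMI, max(0, log2(P(a,b) / (P(a) * P(b))))."""
    total = counts.sum()
    row = counts.sum(axis=1, keepdims=True)
    col = counts.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log2(counts * total / (row * col))
    pmi[~np.isfinite(pmi)] = 0.0  # zero counts or empty rows/columns
    return np.maximum(pmi, 0.0)


def embed(matrix, n_dims=50):
    """Steps 3 and 4: SVD, then keep the first n_dims dimensions."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    k = min(n_dims, len(s))
    return u[:, :k] * s[:k]


# Toy corpus: each inner list is one context unit (hypothetical data).
contexts = [
    ["climate", "change", "policy"],
    ["climate", "policy", "china"],
    ["covid", "weibo", "posts"],
    ["covid", "posts", "china"],
]
vocab = sorted({w for context in contexts for w in context})

counts = cooccurrence_matrix(contexts, vocab)
weights = ppmi(counts)
vectors = embed(weights, n_dims=50)  # one row per word
print(vectors.shape)
```

With only seven words in the toy vocabulary the embedding cannot have more than seven dimensions; 50 dimensions become available once the matrix has at least 50 columns.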
Also, by clicking the Associations button it is possible to explore the second-order similarities of each item.
N.B.: While first-order indexes point out phenomena concerning the syntagmatic axis ('in praesentia' combination and proximity, i.e. each word 'near to' the other), second-order indexes point out phenomena concerning the paradigmatic axis ('in absentia' association and similarity, i.e. quasi-synonymy between key-terms used within the same corpus).
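The distinction can be made concrete with a small Python sketch. It assumes that a second-order index is computed as the cosine similarity between the co-occurrence profiles of two words; the exact index used by T-LAB is not specified in this note, and the counts below are invented for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two co-occurrence profiles."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical symmetric co-occurrence counts; columns follow `vocab`.
vocab = ["doctor", "physician", "hospital", "patient", "river"]
counts = {
    "doctor":    [0, 0, 4, 5, 0],
    "physician": [0, 0, 3, 4, 0],
    "hospital":  [4, 3, 0, 6, 0],
    "patient":   [5, 4, 6, 0, 1],
    "river":     [0, 0, 0, 1, 0],
}

# First order: do the two words appear near each other?
first_order = counts["doctor"][vocab.index("physician")]

# Second order: do the two words appear near the same other words?
second_order = cosine(counts["doctor"], counts["physician"])

print(first_order, round(second_order, 3))
```

In this toy table 'doctor' and 'physician' never co-occur, so their first-order value is zero, yet their profiles are almost identical and their second-order similarity is close to 1, which is the pattern expected of quasi-synonyms.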
Moreover, depending on the size of the corpus and on the clustering method, it is possible to obtain and explore up to 30 clusters (K-Means method) and up to 20 cluster partitions (Hierarchical method).
6 - Further tables can be exported which allow the user to process T-LAB outputs with other data analysis software. Among these are the adjacency matrix created by the Sequences and Network Analysis tool and the co-occurrence matrix created by the Co-Word Analysis tool, both with up to 5,000 columns (see the pictures below).
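As a hedged example of processing such a table outside T-LAB, the Python sketch below reads a small adjacency matrix and computes the outgoing and incoming weight of each node. The layout is an assumption made for the example (a comma-delimited table with labels in the first row and first column); the actual layout of the exported file should be checked before reusing it, and the values shown are invented.

```python
import csv
import io

# Hypothetical export: rows are predecessors, columns are successors.
exported = """\
,climate,policy,china
climate,0,3,1
policy,2,0,4
china,0,1,0
"""


def load_labeled_matrix(text, delimiter=","):
    """Read a labeled square matrix into a dict of dicts: matrix[row][column]."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    labels = rows[0][1:]
    matrix = {
        row[0]: dict(zip(labels, (float(cell) for cell in row[1:])))
        for row in rows[1:]
    }
    return labels, matrix


labels, matrix = load_labeled_matrix(exported)

# Total weight of the links leaving and entering each node.
out_strength = {src: sum(targets.values()) for src, targets in matrix.items()}
in_strength = {dst: sum(matrix[src][dst] for src in matrix) for dst in labels}

print(out_strength)
print(in_strength)
```

The same dict-of-dicts structure can be passed on to a network analysis or statistics package of the user's choice.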