Text Clustering integrates the functionality provided by the Text Clustering API. It performs automatic clustering of documents in order to group them by similarity and discover significant subjects.
On the right, you can see the sidebar that appears when you click on Text Clustering.
There are two sections in the interface: Select cells with texts to analyze, which we have already covered in the corresponding section, and Analysis settings.
In Analysis settings you can configure three elements:
To use any of our language packs, you need to have access to them. You can request access in the developer home or in the language packs section. You can read more about it here.
The Advanced settings menu contains additional options for Text Clustering. It has a single section, Output configuration, which controls the output of the analysis.
In this section, you will be able to select which fields to show in the output:
You can read more information about these fields in the response section of the API documentation.
The results of the analysis are shown in a new sheet called "Text Clustering". This sheet includes a column with the source text, a column with the IDs if enabled, and one column for each output field selected in the advanced settings.
When a document is included in more than one cluster, each additional cluster is inserted as a new row.
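The row layout described above can be sketched in a few lines of Python. This is only an illustration of the "one row per document and cluster" idea, not the add-in's actual implementation. The function name clusters_to_rows, the field names "title" and "score", the column order, the repetition of the source text on additional rows, and the handling of documents with no cluster are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Sequence


def clusters_to_rows(
    texts: Sequence[str],
    clusters_per_text: Sequence[Sequence[Dict[str, object]]],
    fields: Sequence[str],
    ids: Optional[Sequence[str]] = None,
) -> List[List[object]]:
    """Flatten per-document cluster assignments into sheet-like rows.

    Hypothetical helper: each document yields one row per cluster it
    belongs to, so a document in two clusters produces two rows.
    """
    # Header: source text, optional ID column, then the selected output fields.
    header = ["Text"] + (["ID"] if ids is not None else []) + list(fields)
    rows: List[List[object]] = [header]

    for i, (text, clusters) in enumerate(zip(texts, clusters_per_text)):
        prefix = [text] + ([ids[i]] if ids is not None else [])
        if not clusters:
            # Assumption: a document with no cluster still gets one row,
            # with the output fields left empty.
            rows.append(prefix + [""] * len(fields))
            continue
        for cluster in clusters:
            # One row per cluster; missing fields are left empty.
            rows.append(prefix + [cluster.get(f, "") for f in fields])
    return rows


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample_texts = ["first text", "second text"]
    sample_clusters = [
        [{"title": "Cluster A", "score": 1.0}, {"title": "Cluster B", "score": 0.5}],
        [{"title": "Cluster A", "score": 2.0}],
    ]
    for row in clusters_to_rows(sample_texts, sample_clusters, ["title", "score"]):
        print(row)
```

In this sketch the first text belongs to two clusters and therefore takes up two rows, while the second text takes up one.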
This is an example of a possible output for a number of texts. IDs are not used, and the configuration is set to show all the possible output fields: