The Text Clustering analysis integrates the functionality provided by the Text Clustering API. It allows automatic clustering of documents in order to group them by similarity and discover significant subjects.
This is the interface that will appear when you click the Text Clustering button:
You can see that there are two areas in the interface: Input, which we have already covered in the corresponding section, and Analysis Settings.
In Analysis Settings there are three elements to configure:
We've seen in the Settings section that there's an advanced settings menu with additional configuration options for Text Clustering. These are the options for Text Clustering and their default values:
In this section you will be able to configure which fields are shown in the output:
There's more information about each one of these fields in the response section of the API documentation.
The results obtained from the analysis will be shown in a new Excel sheet called "Text Clustering". This sheet will include a column with the source text, a column with the IDs if enabled, and then a column for each of the output fields configured in the advanced settings.
When a document is included in more than one cluster, each additional cluster is inserted as a new row. The original text and the ID will behave as specified in the "Combine cells in the output" option in the Settings section.
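The row layout described above can be sketched in a few lines of Python. This is only an illustration, not the add-in's actual code: the function `build_rows`, the dictionary keys and the sample texts are all hypothetical, and the sketch assumes that combining cells means the source text appears once per group of rows (represented here by an empty string on the later rows), while leaving the option disabled repeats the text on every row.

```python
from typing import Dict, List


def build_rows(documents: List[Dict], combine_cells: bool = False) -> List[Dict]:
    """Expand documents into sheet rows, one row per (document, cluster) pair."""
    rows = []
    for doc in documents:
        for index, cluster in enumerate(doc["clusters"]):
            # Assumption: with combined cells the text is shown only on the
            # first row of the group; otherwise it is repeated on every row.
            show_text = index == 0 or not combine_cells
            rows.append({
                "text": doc["text"] if show_text else "",
                "cluster": cluster,
            })
    return rows


# Hypothetical sample data: the first text belongs to two clusters.
docs = [
    {"text": "Cheap flights to Rome", "clusters": ["Flights", "Rome"]},
    {"text": "Hotels in Rome", "clusters": ["Rome"]},
]

rows = build_rows(docs)
for row in rows:
    print(row["text"], "|", row["cluster"])
```

Here the first text produces two rows because it belongs to two clusters, and the second text produces one. Calling `build_rows(docs, combine_cells=True)` leaves the text blank on the second of those two rows.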
This is an example of a possible output for a set of texts. IDs are not used, all the possible output fields are shown, and the "Combine cells in the output" option is disabled: