Combine predictive analytics and text mining

Easily integrate MeaningCloud's text mining into the most widespread predictive analytics platform.

RapidMiner is an open-source data science platform, recognized as a leader in the field of advanced analytics tools. RapidMiner enables you to prepare data, create predictive models, validate them, and embed them into business processes quickly and easily.

The MeaningCloud extension for RapidMiner allows you to integrate the most accurate text analytics into your RapidMiner pipelines, thereby combining data and content analysis, with the added benefit of being fully customizable to your domain.


The extension for RapidMiner features a set of operators that give access to some of MeaningCloud's most frequently used features and lets you customize MeaningCloud's functions to your domain to achieve maximum accuracy.

  • Topic Extraction: extracts names of people, organizations, brands or places, abstract concepts, and amounts from the text.
  • Text Classification: categorizes a text according to predefined taxonomies, which include IPTC and IAB out of the box.
  • Sentiment Analysis: detects the positive/negative/neutral polarity expressed in the text.
  • Lemmatization: extracts a list of the lemmas of the words found in the text.
  • Customization: lets you use personal dictionaries and classification or sentiment models created with MeaningCloud's customization tools.


RapidMiner users are provided with the most effective text analytics, thanks to the wide range of analytic functions and powerful customization capabilities, which guarantee the highest accuracy.

MeaningCloud users are provided with the most advanced tools for combining unstructured analytics with structured and multisource data in sophisticated predictive models.

Use scenarios

Use MeaningCloud's extension to expand RapidMiner's data-based predictive models with the most advanced text analytics in scenarios like the following:

Root cause analysis in surveys

Analyze answers to open-ended questions to find out the causes underlying the numerical scores given by users.
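As a rough illustration of this scenario, the sketch below groups survey scores by the topics mentioned in each open-ended answer and ranks the topics from lowest to highest average score. It is a minimal sketch in plain Python, not the extension's operators or the MeaningCloud API: the `responses` data is invented, the `topics` lists are assumed to come from an earlier topic extraction step, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical survey rows: a numeric score plus topics assumed to have been
# extracted from the open-ended answer by a text analytics step.
# The data below is invented for illustration only.
responses = [
    {"score": 2, "topics": ["delivery", "packaging"]},
    {"score": 9, "topics": ["price"]},
    {"score": 3, "topics": ["delivery"]},
    {"score": 8, "topics": ["price", "support"]},
    {"score": 4, "topics": ["support"]},
]


def mean_score_by_topic(rows):
    """Return {topic: mean score of the responses that mention the topic}."""
    scores = defaultdict(list)
    for row in rows:
        for topic in set(row["topics"]):  # count each topic once per response
            scores[topic].append(row["score"])
    return {topic: mean(values) for topic, values in scores.items()}


def rank_topics(rows):
    """Return (topic, mean score) pairs sorted from lowest to highest score."""
    return sorted(mean_score_by_topic(rows).items(), key=lambda item: item[1])


ranking = rank_topics(responses)
for topic, avg in ranking:
    print(f"{topic}: {avg:.1f}")
```

Topics at the top of the ranking are candidates for the causes behind low scores. In a real pipeline this grouping would be fed by actual extraction results and checked against larger samples before drawing conclusions.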

Fraud and churn prevention

Complement numerical predictive models with unstructured and multisource information to increase their predictive capacity and prevent fraud or churn more effectively.


Combine numerical criteria with psychographic profiles based on opinions, affinities, or lifestyles inferred from the users' comments in contact centers and social media to get a 360-degree view of your customers.

Lead scoring and targeting

Incorporate the insights obtained from all types of external and unstructured sources into your scoring and customer profiling models, including opinions and intentions that can be inferred from content published by customers.

Causal analysis for health care

Combine structured patient data (age, blood pressure, cholesterol, etc.) with unstructured information about their symptoms and lifestyle from their medical records to perform risk analysis.

People analytics

Extend the numerical results of workforce climate surveys and employees' demographic data with comments from performance assessments, exit interviews, etc. to detect the attributes of top performers and manage talent within the organization.