Here are some examples of linguistic analyses using MeaningCloud:
To show how to carry out a syntactic analysis extracting topics we will use the following text as an example:
"Robert Downey Jr has topped Forbes magazine's annual list of the highest paid actors for the second year in a row."
The request uses the following parameters:
- txt: the text to analyze.
- lang: the language in which the text is going to be analyzed, in this case English, en.
- key: your license key.
- of: the output format, in this case json.
- uw: set to y so the engine tries to find possible analyses when there are typos in the text.
- tt: set to e to obtain the entities detected in the text in the corresponding token.
With curl, the request is:
curl -XPOST "https://api.meaningcloud.com/parser-2.0?key=<<YOUR OWN KEY>>&of=json&lang=en&txt=Robert%20Downey%20Jr%20has%20topped%20Forbes%20magazine%27s%20annual%20list%20of%20the%20highest%20paid%20actors%20for%20the%20second%20year%20in%20a%20row.&uw=y&tt=e"
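The same request can also be sketched in Python. This is a minimal illustration rather than official client code: the helper names (build_url, fetch_analysis, collect_entities) and the trimmed sample are made up for this example, and it assumes that entities appear under topic_list > entity_list on the tokens of the response, as they do in the response for the example sentence.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

API_URL = "https://api.meaningcloud.com/parser-2.0"  # endpoint taken from the curl example


def build_url(key, text):
    """Assemble the same query string as the curl example (hypothetical helper)."""
    params = {"key": key, "of": "json", "lang": "en", "txt": text, "uw": "y", "tt": "e"}
    # quote_via=quote encodes spaces as %20 rather than "+".
    return API_URL + "?" + urlencode(params, quote_via=quote)


def fetch_analysis(key, text):
    """POST the request and decode the JSON body (needs network access and a valid key)."""
    with urlopen(Request(build_url(key, text), method="POST")) as response:
        return json.load(response)


def collect_entities(token, found=None):
    """Walk the nested token_list tree and gather (form, type) for every entity."""
    if found is None:
        found = []
    for entity in token.get("topic_list", {}).get("entity_list", []):
        found.append((entity["form"], entity.get("sementity", {}).get("type", "")))
    for child in token.get("token_list", []):
        collect_entities(child, found)
    return found


# Hand-trimmed sample with the same shape as the real response, so the walk
# can be tried without calling the API.
sample = {"token_list": [{"type": "sentence", "token_list": [
    {"form": "Robert Downey Jr", "topic_list": {"entity_list": [
        {"form": "Robert Downey Jr", "sementity": {"type": "Top"}}]}},
    {"form": "magazine"},
    {"form": "Forbes", "topic_list": {"entity_list": [
        {"form": "Forbes", "sementity": {"type": "Top>Person>LastName"}},
        {"form": "Forbes", "sementity": {"type": "Top>Product>CulturalProduct>Printing>Magazine"}}]}},
]}]}

for form, sem_type in collect_entities(sample):
    print(f"{form}: {sem_type}")
```

With a real key, collect_entities(fetch_analysis(key, text)) would run the same walk over the live response. The full response returned for the example sentence is: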
{ "status": { "code": "0", "msg": "OK", "credits": "1" }, "token_list": [ { "type": "sentence", "id": "29", "inip": "0", "endp": "113", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "A", "quote_level": "0", "affected_by_negation": "no", "token_list": [ { "type": "phrase", "form": "Robert Downey Jr has topped Forbes magazine's annual list of the highest paid actors for the second year in a row", "id": "44", "inip": "0", "endp": "112", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "_", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "Z-----------", "lemma": "*", "original_form": "Robert Downey Jr has topped Forbes magazine's annual list of the highest paid actors for the second year in a row" } ], "token_list": [ { "type": "phrase", "form": "Robert Downey Jr", "id": "34", "inip": "0", "endp": "15", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "_", "quote_level": "0", "affected_by_negation": "no", "head": "31", "syntactic_tree_relation_list": [ { "id": "28", "type": "isSubject" } ], "analysis_list": [ { "tag": "GNMS3S--", "lemma": "Robert Downey Jr", "original_form": "Robert Downey Jr" }, { "tag": "GNFS3S--", "lemma": "Robert Downey Jr", "original_form": "Robert Downey Jr" } ], "token_list": [ { "type": "multiword", "form": "Robert Downey Jr", "id": "31", "inip": "0", "endp": "15", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "_", "quote_level": "0", "affected_by_negation": "no", "head": "27", "analysis_list": [ { "tag": "NPMS-N-", "lemma": "Robert Downey Jr", "original_form": "Robert Downey Jr", "sense_id_list": [ { "sense_id": "__12123288058840445720" } ] }, { "tag": "NPMP-N-", "lemma": "Robert Downey Jr", "original_form": "Robert Downey Jr", "sense_id_list": [ { "sense_id": "__12123288058840445720" } ] }, { "tag": 
"NPFS-N-", "lemma": "Robert Downey Jr", "original_form": "Robert Downey Jr", "sense_id_list": [ { "sense_id": "__12123288058840445720" } ] }, { "tag": "NPFP-N-", "lemma": "Robert Downey Jr", "original_form": "Robert Downey Jr", "sense_id_list": [ { "sense_id": "__12123288058840445720" } ] } ], "sense_list": [ { "id": "__12123288058840445720", "form": "Robert Downey Jr", "info": "sementity/class=instance@type=Top@confidence=unknown" } ], "topic_list": { "entity_list": [ { "form": "Robert Downey Jr", "id": "__12123288058840445720", "sementity": { "class": "instance", "type": "Top", "confidence": "unknown" } } ] } } ] }, { "type": "multiword", "form": "has topped", "id": "28", "inip": "17", "endp": "26", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "head": "4", "syntactic_tree_relation_list": [ { "id": "34", "type": "iof_isSubject" }, { "id": "43", "type": "iof_isDirectObject" }, { "id": "40", "type": "iof_isComplement" } ], "analysis_list": [ { "tag": "VI-S3PPA-N-N9", "lemma": "top", "original_form": "has topped" } ] }, { "type": "phrase", "form": "Forbes magazine's annual list of the highest paid actors for the second year", "id": "43", "inip": "28", "endp": "103", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "head": "36", "syntactic_tree_relation_list": [ { "id": "28", "type": "isDirectObject" } ], "analysis_list": [ { "tag": "GN-S3D--", "lemma": "list", "original_form": "Forbes magazine's annual list of the highest paid actors for the second year" } ], "token_list": [ { "type": "phrase", "form": "Forbes magazine", "id": "35", "inip": "28", "endp": "42", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "head": "7", "analysis_list": [ { "tag": 
"GN-S3---", "lemma": "magazine", "original_form": "Forbes magazine" } ], "token_list": [ { "form": "Forbes", "id": "6", "inip": "28", "endp": "33", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "NP-S-N-", "lemma": "Forbes", "original_form": "Forbes", "sense_id_list": [ { "sense_id": "db0f9829ff" } ] }, { "tag": "NP-S-N-", "lemma": "Forbes", "original_form": "Forbes", "sense_id_list": [ { "sense_id": "9752b8b5ee" } ] }, { "tag": "NPMS-N-", "lemma": "Forbes", "original_form": "Forbes", "sense_id_list": [ { "sense_id": "4a3369b337" } ] }, { "tag": "NPFS-N-", "lemma": "Forbes", "original_form": "Forbes", "sense_id_list": [ { "sense_id": "4a3369b337" } ] } ], "sense_list": [ { "id": "4a3369b337", "form": "Forbes", "info": "sementity/class=instance@fiction=nonfiction@id=ODENTITY_LAST_NAME@type=Top>Person>LastName\tsemld_list=sumo:LastName" }, { "id": "9752b8b5ee", "form": "Forbes", "info": "sementity/class=instance@fiction=nonfiction@id=ODENTITY_RIVER@type=Top>Location>GeographicalEntity>WaterForm>River\tsemld_list=sumo:River" }, { "id": "db0f9829ff", "form": "Forbes", "info": "sementity/class=instance@fiction=nonfiction@id=ODENTITY_MAGAZINE@type=Top>Product>CulturalProduct>Printing>Magazine\tsemgeo_list/continent=America#id:33fc13e6dd@country=United States#id:beac1b545b#ISO3166-1-a2:US#ISO3166-1-a3:USA\tsemld_list=sumo:Magazine\tsemtheme_list/id=ODTHEME_ECONOMY@type=Top>SocialSciences>Economy" } ], "topic_list": { "entity_list": [ { "form": "Forbes", "id": "4a3369b337", "sementity": { "class": "instance", "fiction": "nonfiction", "id": "ODENTITY_LAST_NAME", "type": "Top>Person>LastName" }, "semld_list": [ "sumo:LastName" ] }, { "form": "Forbes", "id": "9752b8b5ee", "sementity": { "class": "instance", "fiction": "nonfiction", "id": "ODENTITY_RIVER", "type": "Top>Location>GeographicalEntity>WaterForm>River" }, "semld_list": [ 
"sumo:River" ] }, { "form": "Forbes", "id": "db0f9829ff", "sementity": { "class": "instance", "fiction": "nonfiction", "id": "ODENTITY_MAGAZINE", "type": "Top>Product>CulturalProduct>Printing>Magazine" }, "semgeo_list": [ { "continent": { "form": "America", "id": "33fc13e6dd" }, "country": { "form": "United States", "id": "beac1b545b", "standard_list": [ { "id": "ISO3166-1-a2", "value": "US" }, { "id": "ISO3166-1-a3", "value": "USA" } ] } } ], "semld_list": [ "sumo:Magazine" ], "semtheme_list": [ { "id": "ODTHEME_ECONOMY", "type": "Top>SocialSciences>Economy" } ] } ] } }, { "form": "magazine", "id": "7", "inip": "35", "endp": "42", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "NC-S-N5", "lemma": "magazine", "original_form": "magazine", "sense_id_list": [ { "sense_id": "a0a1a5401f" } ] } ], "sense_list": [ { "id": "a0a1a5401f", "form": "magazine", "info": "sementity/class=class@fiction=nonfiction@id=ODENTITY_MAGAZINE@type=Top>Product>CulturalProduct>Printing>Magazine\tsemld_list=sumo:Magazine" } ] } ] }, { "form": "'s", "id": "26", "inip": "43", "endp": "44", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "A", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "WN-", "lemma": "'s", "original_form": "'s" } ] }, { "type": "phrase", "form": "annual list of the highest paid actors for the second year", "id": "36", "inip": "46", "endp": "103", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "head": "11", "analysis_list": [ { "tag": "GN-S3---", "lemma": "list", "original_form": "annual list" } ], "token_list": [ { "form": "annual", "id": "10", "inip": "46", "endp": "51", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": 
"no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "AP-N5", "lemma": "annual", "original_form": "annual" } ] }, { "form": "list", "id": "11", "inip": "53", "endp": "56", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "NC-S-N5", "lemma": "list", "original_form": "list" } ] }, { "type": "phrase", "form": "of the highest paid actors for the second year", "id": "42", "inip": "58", "endp": "103", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "head": "12", "analysis_list": [ { "tag": "GY------", "lemma": "of", "original_form": "of the highest paid actors for the second year" } ], "token_list": [ { "form": "of", "id": "12", "inip": "58", "endp": "59", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "YN9", "lemma": "of", "original_form": "of" } ] }, { "type": "phrase", "form": "the highest paid actors for the second year", "id": "37", "inip": "61", "endp": "103", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "head": "16", "analysis_list": [ { "tag": "GN-P3---", "lemma": "actor", "original_form": "the highest paid actors" } ], "token_list": [ { "form": "the", "id": "13", "inip": "61", "endp": "63", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "TD-PN9", "lemma": "the", "original_form": "the" } ] }, { "form": "highest", "id": "14", "inip": "65", "endp": "71", "style": { "isBold": "no", "isItalics": 
"no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "AS-N5", "lemma": "high", "original_form": "highest" } ] }, { "form": "paid", "id": "15", "inip": "73", "endp": "76", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "VP---ASA-N-N5", "lemma": "pay", "original_form": "paid", "sense_id_list": [ { "sense_id": "ODENTITY_TRANSACTION" }, { "sense_id": "ODENTITY_COMMUNICATION_PROCESS" }, { "sense_id": "ODENTITY_CHANGE_OF_POSSESSION" }, { "sense_id": "ODENTITY_INTENTIONAL_PROCESS" }, { "sense_id": "ODENTITY_INTENTIONAL_PSYCHOLOGICAL_PROCESS" }, { "sense_id": "ODENTITY_MEETING" } ] } ], "sense_list": [ { "id": "ODENTITY_CHANGE_OF_POSSESSION", "form": "pay", "info": "sementity/id=ODENTITY_CHANGE_OF_POSSESSION@type=Top>Process>IntentionalProcess>SocialInteraction>ChangeOfPossession\tsemld_list=sumo:IntentionalProcess" }, { "id": "ODENTITY_COMMUNICATION_PROCESS", "form": "pay", "info": "sementity/id=ODENTITY_COMMUNICATION_PROCESS@type=Top>Process>ContentBearingProcess>CommunicationProcess\tsemld_list=sumo:Entity" }, { "id": "ODENTITY_INTENTIONAL_PROCESS", "form": "pay", "info": "sementity/id=ODENTITY_INTENTIONAL_PROCESS@type=Top>Process>IntentionalProcess\tsemld_list=sumo:IntentionalProcess" }, { "id": "ODENTITY_INTENTIONAL_PSYCHOLOGICAL_PROCESS", "form": "pay", "info": "sementity/id=ODENTITY_INTENTIONAL_PSYCHOLOGICAL_PROCESS@type=Top>Process>IntentionalProcess>IntentionalPsychologicalProcess\tsemld_list=sumo:IntentionalProcess" }, { "id": "ODENTITY_MEETING", "form": "pay", "info": "sementity/id=ODENTITY_MEETING@type=Top>Process>IntentionalProcess>SocialInteraction>Meeting\tsemld_list=sumo:IntentionalProcess" }, { "id": "ODENTITY_TRANSACTION", "form": "pay", "info": 
"sementity/id=ODENTITY_TRANSACTION@type=Top>Process>DualObjectProcess>Transaction\tsemld_list=sumo:Entity" } ] }, { "form": "actors", "id": "16", "inip": "78", "endp": "83", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "NC-P-N5", "lemma": "actor", "original_form": "actors", "sense_id_list": [ { "sense_id": "4014a7dc12" } ] } ], "sense_list": [ { "id": "4014a7dc12", "form": "actor", "info": "sementity/class=class@fiction=nonfiction@id=ODENTITY_VOCATION@type=Top>OtherEntity>Vocation\tsemld_list=sumo:Position\tsemtheme_list/id=ODTHEME_ARTS@type=Top>Arts" } ] }, { "type": "phrase", "form": "for the second year", "id": "41", "inip": "85", "endp": "103", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "head": "17", "analysis_list": [ { "tag": "GY------", "lemma": "for", "original_form": "for the second year" } ], "token_list": [ { "form": "for", "id": "17", "inip": "85", "endp": "87", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "YN5", "lemma": "for", "original_form": "for" } ] }, { "type": "phrase", "form": "the second year", "id": "38", "inip": "89", "endp": "103", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "head": "20", "analysis_list": [ { "tag": "GN-S3T--", "lemma": "year", "original_form": "the second year" } ], "token_list": [ { "form": "the", "id": "18", "inip": "89", "endp": "91", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "TD-SN9", 
"lemma": "the", "original_form": "the" } ] }, { "form": "second", "id": "19", "inip": "93", "endp": "98", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "NC-S-N5", "lemma": "second", "original_form": "second", "sense_id_list": [ { "sense_id": "9a8f54cd04" } ] } ], "sense_list": [ { "id": "9a8f54cd04", "form": "second", "info": "sementity/class=class@fiction=nonfiction@id=ODENTITY_PERIOD@type=Top>Timex>Period\tsemld_list=sumo:TimeInterval" } ] }, { "form": "year", "id": "20", "inip": "100", "endp": "103", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "NC-S-N7", "lemma": "year", "original_form": "year", "sense_id_list": [ { "sense_id": "782b286b79" }, { "sense_id": "50b37cb80f" } ] } ], "sense_list": [ { "id": "50b37cb80f", "form": "year", "info": "sementity/class=class@fiction=nonfiction@id=ODENTITY_DATE@type=Top>Timex>Date\tsemld_list=sumo:DatePeriod" }, { "id": "782b286b79", "form": "year", "info": "sementity/class=class@fiction=nonfiction@id=ODENTITY_PERIOD@type=Top>Timex>Period\tsemld_list=sumo:TimeInterval" } ] } ] } ] } ] } ] } ] } ] }, { "type": "phrase", "form": "in a row", "id": "40", "inip": "105", "endp": "112", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "head": "21", "syntactic_tree_relation_list": [ { "id": "28", "type": "isComplement" } ], "analysis_list": [ { "tag": "GY---C--", "lemma": "in", "original_form": "in a row" } ], "token_list": [ { "form": "in", "id": "21", "inip": "105", "endp": "106", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { 
"tag": "YN6", "lemma": "in", "original_form": "in" } ] }, { "type": "phrase", "form": "a row", "id": "39", "inip": "108", "endp": "112", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "head": "23", "analysis_list": [ { "tag": "GN-S3---", "lemma": "row", "original_form": "a row" } ], "token_list": [ { "form": "a", "id": "22", "inip": "108", "endp": "108", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "QD-SPN9", "lemma": "a", "original_form": "a" } ] }, { "form": "row", "id": "23", "inip": "110", "endp": "112", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "1", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "NC-S-N2", "lemma": "row", "original_form": "row", "sense_id_list": [ { "sense_id": "c5fd6e4956" } ] } ], "sense_list": [ { "id": "c5fd6e4956", "form": "fight", "info": "sementity/class=class@fiction=nonfiction@id=ODENTITY_EVENT@type=Top>Event\tsemld_list=sumo:Meeting" } ] } ] } ] } ] }, { "form": ".", "id": "24", "inip": "113", "endp": "113", "style": { "isBold": "no", "isItalics": "no", "isUnderlined": "no", "isTitle": "no" }, "separation": "A", "quote_level": "0", "affected_by_negation": "no", "analysis_list": [ { "tag": "1D--", "lemma": ".", "original_form": "." } ] } ] } ] }
To show how to work with the output to obtain the morphological analysis of a sentence we will use the following text as an example:
"Robert Downey Jr has topped Forbes magazine's annual list of the highest paid actors for the second year in a row."
We will use:
- txt: the text to analyze.
- lang: the language in which the text is going to be analyzed, in this case English, en.
- key: your license key.
- of: the output format.
- verbose: enabled to obtain an explanation of each morphological tag.
The code in parser-2.0-gist-morpho.php makes a request to the API using these parameters and processes the output to obtain an array (morpho) holding the tokens with their morphological analysis. It then prints them:
php parser-2.0-gist-morpho.php
Tokens:
==============
Robert Downey Jr
    NPMS-N-: noun, proper, masculine, singular
    NPMP-N-: noun, proper, masculine, plural
    NPFS-N-: noun, proper, feminine, singular
    NPFP-N-: noun, proper, feminine, plural
has topped
    VI-S3PPA-N-N9: verb, indicative, singular, 3rd person, present, perfect, active, maximum frequency word
Forbes
    NP-S-N-: noun, proper, singular
    NP-S-N-: noun, proper, singular
    NPMS-N-: noun, proper, masculine, singular
    NPFS-N-: noun, proper, feminine, singular
magazine
    NC-S-N5: noun, common, singular, medium frequency word
's
    WN-: saxon genitive
annual
    AP-N5: adjective, positive, medium frequency word
list
    NC-S-N5: noun, common, singular, medium frequency word
of
    YN9: preposition, maximum frequency word
the
    TD-PN9: article, determiner, plural, maximum frequency word
highest
    AS-N5: adjective, superlative, medium frequency word
paid
    VP---ASA-N-N5: verb, participle, past, simple, active, medium frequency word
actors
    NC-P-N5: noun, common, plural, medium frequency word
for
    YN5: preposition, medium frequency word
the
    TD-SN9: article, determiner, singular, maximum frequency word
second
    NC-S-N5: noun, common, singular, medium frequency word
year
    NC-S-N7: noun, common, singular, high frequency word
in
    YN6: preposition, medium-high frequency word
a
    QD-SPN9: quantifier, determiner, singular, positive, maximum frequency word
row
    NC-S-N2: noun, common, singular, very low frequency word
.
    1D: punctuation, other
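The gist itself is PHP and is not reproduced here. As a rough Python sketch of the processing step it describes, the function below walks a decoded response and builds a morpho list with one entry per leaf token. The name collect_morpho and the small sample are hypothetical, and the sketch assumes that, with verbose enabled, each analysis carries its explanation in a tag_info field next to tag; if that field is absent, the description is simply left empty.

```python
def collect_morpho(token, morpho=None):
    """Depth-first walk that records (form, [(tag, description), ...]) for each leaf token."""
    if morpho is None:
        morpho = []
    children = token.get("token_list", [])
    if children:
        # Sentences and phrases only group other tokens, so descend into them.
        for child in children:
            collect_morpho(child, morpho)
    elif "form" in token:
        analyses = [
            (analysis["tag"], analysis.get("tag_info", ""))  # tag_info is an assumed field name
            for analysis in token.get("analysis_list", [])
        ]
        morpho.append((token["form"], analyses))
    return morpho


# Hand-written sample shaped like the API response; not real API output.
sample = {"token_list": [{"type": "sentence", "token_list": [
    {"type": "phrase", "form": "Forbes magazine", "token_list": [
        {"form": "magazine", "analysis_list": [
            {"tag": "NC-S-N5", "tag_info": "noun, common, singular, medium frequency word"}]},
    ]},
    {"form": "'s", "analysis_list": [{"tag": "WN-", "tag_info": "saxon genitive"}]},
]}]}

print("Tokens:")
print("==============")
for form, analyses in collect_morpho(sample):
    print(form)
    for tag, info in analyses:
        print(f"    {tag}: {info}")
```

Only leaf tokens are reported, which is why phrases such as "Forbes magazine" do not show up as entries of their own.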
To show how to work with the output to lemmatize a sentence we will use the following text as an example:
"Robert Downey Jr has topped Forbes magazine's annual list of the highest paid actors for the second year in a row."
We will use:
- txt: the text to analyze.
- lang: the language in which the text is going to be analyzed, in this case English, en.
- key: your license key.
- of: the output format.
The code in parser-2.0-gist-lemma.php makes a request to the API using these parameters and processes the output to obtain each token together with its lemmas. It then prints them:
php parser-2.0-gist-lemma.php
Tokens:
=============
{{{{Robert Downey Jr|Robert Downey Jr|Robert Downey Jr|Robert Downey Jr|Robert Downey Jr}}{has topped|top}{{{Forbes|Forbes|Forbes|Forbes|Forbes}{magazine|magazine}}{'s|'s}{{annual|annual}{list|list}{{of|of}{{the|the}{highest|high}{paid|pay}{actors|actor}{{for|for}{{the|the}{second|second}{year|year}}}}}}}{{{in|in}{{a|a}{row|row}}}}{.|.}}
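In the output above, each leaf token appears as its form followed by one lemma per analysis, and each sentence or phrase wraps its children in braces. The PHP gist is not shown, so the following Python sketch only assumes those two rules; the function name lemmatize and the sample are hypothetical, and the result is not guaranteed to match the gist's bracketing character for character.

```python
def lemmatize(token):
    """Render a token subtree as nested {form|lemma|...} groups."""
    children = token.get("token_list", [])
    if children:
        # Grouping tokens (sentences, phrases) just wrap their children.
        return "{" + "".join(lemmatize(child) for child in children) + "}"
    lemmas = [analysis["lemma"] for analysis in token.get("analysis_list", [])]
    return "{" + "|".join([token["form"], *lemmas]) + "}"


# Hand-written sample shaped like one sentence of the API response.
sample_sentence = {"type": "sentence", "token_list": [
    {"type": "phrase", "form": "highest paid actors", "token_list": [
        {"form": "highest", "analysis_list": [{"lemma": "high"}]},
        {"form": "paid", "analysis_list": [{"lemma": "pay"}]},
        {"form": "actors", "analysis_list": [{"lemma": "actor"}]},
    ]},
    {"form": ".", "analysis_list": [{"lemma": "."}]},
]}

print(lemmatize(sample_sentence))
# prints {{{highest|high}{paid|pay}{actors|actor}}{.|.}}
```

For a full response, the same function would be applied to every sentence in the top-level token_list.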