Find Country Entities in Texts

This example uses TextCases to recognize the countries mentioned in a text, here the Wikipedia articles for three historical periods.

Construct a list of three historical periods.

In[1]:=
periods = {Entity["HistoricalPeriod", "EuropeanRenaissance"], Entity["HistoricalPeriod", "AgeEnlightenment"], Entity["HistoricalPeriod", "IndustrialRevolution"]};

Extract their respective names.

In[2]:=
names = CommonName[periods]
Out[2]=

Use WikipediaData to retrieve the text of the Wikipedia article for each historical period.

In[3]:=
wikipages = WikipediaData /@ names;

Use TextCases to extract the countries mentioned on each of those pages, deleting duplicate mentions.

In[4]:=
countries = DeleteDuplicates[TextCases[#, "Country" -> "Interpretation"]] & /@ wikipages;
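
To see what this pattern returns before applying it to whole articles, it can help to try it on a short string. The following is a minimal sketch; the sample sentence and the variable names sampleText and mentions are illustrative and not part of the original example.

```wolfram
(* Illustrative sample sentence; not from the original example *)
sampleText = "Trade between Italy and France grew, and Italy later traded with Spain.";

(* Ask TextCases for the interpretation of each recognized country mention;
   Italy is mentioned twice in the sample sentence *)
mentions = TextCases[sampleText, "Country" -> "Interpretation"]

(* Collapse repeated mentions, as is done for each Wikipedia page *)
DeleteDuplicates[mentions]
```

Requesting the "Interpretation" property, rather than the raw matched strings, is what allows the results to be passed directly to geographic functions later on.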

For instance, these are the countries found on the page for the European Renaissance.

In[5]:=
First[countries]
Out[5]=
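
Because DeleteDuplicates discards how often each country is mentioned, a variation that keeps the counts may be useful. This is a sketch under stated assumptions: the helper name countryCounts and the sample sentence are hypothetical, and the approach simply combines TextCases with Tally.

```wolfram
(* Hypothetical helper: tally country mentions in a text, most frequent first *)
countryCounts[text_String] :=
  Reverse[SortBy[Tally[TextCases[text, "Country" -> "Interpretation"]], Last]]

(* Illustrative sample sentence; not from the original example *)
countryCounts["Italy traded with France. France and Spain later rivaled Italy, and France expanded."]
```

Assuming the pages have been retrieved as in the steps above, the same helper could be applied to one of them with countryCounts[First[wikipages]].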

Plot the countries mentioned in each article on a separate world map, first removing the "World" entity from each list of countries.

In[6]:=
countries = DeleteCases[countries, Entity["Country", "World"], {2}];
Table[
  GeoGraphics[
    {EdgeForm[{Black}], Red, Polygon /@ countries[[i]]},
    GeoRange -> "World", ImageSize -> 400,
    PlotLabel -> names[[i]], GeoBackground -> "Coastlines"],
  {i, 3}]
Out[6]=
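
As a possible alternative to building the map from explicit polygons, GeoListPlot can highlight a list of country entities directly. The sketch below uses a small hand-picked list of countries (an assumption for illustration, not output from the example) so that it can be evaluated on its own.

```wolfram
(* Illustrative list of country entities; not taken from the example's output *)
sampleCountries = {Entity["Country", "Italy"], Entity["Country", "France"], Entity["Country", "Germany"]};

(* Highlight the listed countries on a world map *)
GeoListPlot[sampleCountries, GeoRange -> "World", ImageSize -> 400,
  PlotLabel -> "Sample countries"]
```

GeoGraphics with Polygon, as used in the example, gives finer control over styling, while GeoListPlot is a shorter route when default styling is acceptable.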
