working paper

Embedded Lexica: Extracting Topical Dictionaries from Unlabeled Corpora using Word Embeddings

Researchers frequently need to extract information, such as events or target topics, from large corpora. One common solution involves applying semantically related keywords to identify tweets, news articles, or other documents of interest. However, …

Networks of Power: Extracting Measurements of De Jure Power from Constitutional Text

Government actors' political power is a topic of perennial interest in political science. However, measuring political power has thus far been a highly costly process involving hand-coding by experts; consequently, only the …

Over-fishing, Conflict, and the South China Sea

In this paper, we determine whether scarcity of a resource that is in high demand can induce international conflict. Specifically, we test whether the combination of fishery depletion and high fishing activity causes an increase in conflict in the South …

Predicting Left-Right Positions from Hand-Coded Content Analysis using Machine Learning

The Manifesto Project’s widely used left-right index of party policy positions (RILE), built from human-coded sentences from party manifestos, can be predicted using machine learning. We demonstrate this using some simple classifiers to show that …