working paper

Embedded Lexica: Extracting Topical Dictionaries from Unlabeled Corpora using Word Embeddings

The rise of the internet and social media, together with the digitization of archives, has led to an accumulation of untold quantities of unlabeled text data relevant to the social sciences. Efficiently extracting information from those corpora frequently …

Networks of Power: Extracting Measurements of De Jure Power from Constitutional Text

The political power of government actors is a topic of perennial interest in political science. However, measuring political power has thus far been a highly costly process involving hand-coding by experts; consequently, only the …

Over-fishing, Conflict, and the South China Sea

In this paper, we determine whether scarcity of a resource that is in high demand can induce international conflict. Specifically, we test whether the combination of fishery depletion and high fishing activity causes an increase in conflict in the South …

Predicting Left-Right Positions from Hand-Coded Content Analysis using Machine Learning

The Manifesto Project’s widely used left-right index of party policy positions (RILE), built from human-coded sentences from party manifestos, can be predicted using machine learning. We demonstrate this using some simple classifiers to show that …