Trump's SOTU vs. the Past — Sentiment Analysis and Topic Modeling

President Donald Trump delivered his first State of the Union Address (really, enough with this “joint address” stuff) Tuesday evening. As with his inauguration, there is reason to believe Trump’s SOTU was both unprecedentedly Trumpian and fantastically standard. Let’s analyze the text of his speech to figure out which of the two is closer to the truth.

The story I want to tell for Trump’s address is guided by these questions:

  1. What words did he use? Is his word use unique when comparing it with past presidents'?
  2. What policy areas did he talk about?
  3. What sentiment was hidden under the words?
  4. Is there something about Trump's speech that sets him apart?

Without further ado, because we have a lot to talk about, let’s get started:

What words did Trump use?

For months now, we’ve known that Donald Trump uses some words more than others. Some of these words are easy to explain, like “America” and “Great,” yet some are more distinctive. To introduce our analysis, let’s first look at a table of the most used words in Trump’s speech.

word        total
american    30
america     27
must        20
new         19
country     17
world       17
americans   15
people      15
great       13
one         13
tonight     13
united      13
nation      12
dollars     11
just        11
many        11
states      11
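For anyone who wants to build a table like this one, here is a minimal Python sketch. It is not the code behind this post: the function name `top_words`, the sample sentence, and the tiny stopword list are all placeholders I made up for illustration, and a real analysis would use a much longer stopword list.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "and", "of", "to", "a", "in", "our", "we", "is", "that", "for", "will"}

def top_words(text, n=10, stopwords=STOPWORDS):
    """Return the n most frequent non-stopword tokens as (word, count) pairs."""
    # Lowercase the text and keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tok for tok in tokens if tok not in stopwords)
    return counts.most_common(n)

# Made-up sample sentence, not a quote from the speech.
sample = "America will be great. The American people will make America great again."
print(top_words(sample, n=3))
```

Note that this counts “america” and “american” separately, just as the table does; collapsing them would require stemming, which this sketch leaves out.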

The table above is helpful, but sometimes we need simple visual tools to help us out. Here’s a simple word cloud of Trump’s speech, with a slight twist. The word cloud below is not any old word cloud; it’s a cloud of words that are unique to Trump in comparison with State of the Union Addresses given by other presidents since 1970. The blue words are words that identify those speeches by other presidents; the red, Trump’s.

Immediately, you notice that Trump uses “America,” “country,” and similar words more than our past commanders-in-chief did. He also talks about “citizens,” “friends,” and “allies” more often. You’ll also notice that the words in blue, the ones we generally like to think of as “presidential words” (words describing the federal budget, energy and healthcare policy, etc.), are missing from Trump’s side of the cloud. This supports claims that Trump uses rhetoric that is more nationalistic and, contrary to the media’s reporting, less “presidential” than that of past presidents. That being said, the word cloud alone can’t defend that statement well, if at all.
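If you are wondering how a computer decides which words are “unique” to one speaker, here is one simple way to do it, sketched in Python. I am assuming a smoothed log-ratio of relative word frequencies, which is only one of several scoring rules and not necessarily the one behind the cloud above. The function names and the two toy texts are invented for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def distinctive_words(target_text, reference_text, n=5):
    """Rank words by how much more often they appear in target than in reference.

    The score is the log-ratio of add-one-smoothed relative frequencies,
    so a word absent from one side does not cause a division by zero.
    """
    target = Counter(tokenize(target_text))
    reference = Counter(tokenize(reference_text))
    vocab = set(target) | set(reference)
    target_total = sum(target.values()) + len(vocab)
    reference_total = sum(reference.values()) + len(vocab)
    scores = {
        word: math.log((target[word] + 1) / target_total)
        - math.log((reference[word] + 1) / reference_total)
        for word in vocab
    }
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Toy texts, not real speech excerpts.
print(distinctive_words("america america america jobs", "budget budget energy jobs", n=2))
```

Words that both sides use at about the same rate score near zero, which is why shared vocabulary like “jobs” in the toy example does not rise to the top.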

However, this word cloud and the accompanying table do not give us much useful information. At best, they are tools to set the frame of our analysis. In this case, the cloud helps us make the decision to look at Trump’s coverage of public policy and nationalism. Let’s do exactly that:

Trump Vs. Other Presidents: The Issues

As I framed it, Trump departed from past presidents by focusing more on the nation and less on its policy. That’s true in some areas, but not others.

For instance, Trump mentioned words associated with immigration (immigrant, immigration, migrate, “come to America”, etc.) much more than any president in our analysis. In fact, he spoke about immigration five times more than some other POTUSes we’re tracking here.

The above graph shows each president’s coverage of certain issues in each of their speeches. The red and blue shading is, of course, a representation of the partisanship of the president during that particular State of the Union. You’ll notice that President Trump likes to talk about taxes and the budget at a rate comparable to past Republican executives. He even talks like them about war and, to some extent, civil rights — although the trend there is really skewed by Jimmy Carter’s civil rights talk.
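To make “coverage of an issue” concrete, here is a rough Python sketch of a dictionary-based measure: the share of a speech’s words that begin with an issue-related stem. The stems and the name `issue_shares` are my own placeholders rather than the dictionaries used for the graph, and multi-word phrases such as “come to America” are not handled.

```python
import re

# Hypothetical issue dictionaries; each issue maps to a tuple of word stems.
ISSUE_STEMS = {
    "immigration": ("immigra", "migrat"),
    "taxes_budget": ("tax", "budget"),
}

def issue_shares(text, issue_stems=ISSUE_STEMS):
    """Return, per issue, the fraction of tokens starting with one of its stems."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens) or 1  # avoid dividing by zero on empty text
    return {
        issue: sum(tok.startswith(stems) for tok in tokens) / total
        for issue, stems in issue_stems.items()
    }

# Made-up sentence for demonstration.
print(issue_shares("We will reform immigration and cut taxes for immigrants"))
```

Dividing by the length of the speech matters here, because SOTUs vary a lot in length and raw counts would favor the long ones.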

What is really different about Trump’s speech is the sharp drop in coverage of energy policy. The only real comparable time frame is Bill Clinton’s 1995 SOTU. And, of course, there’s one more thing:

Most importantly, we see an enormous increase in the percentage of Trump’s State of the Union spent on immigration. On its face, that is simply a dedication to immigration policy. But when we combine it with Trump’s usage of words about America (i.e., nationalist rhetoric), we get a better idea of what this chat is about.

Effectively, Trump talked more about our country than Obama ever did. On top of that, Trump’s dedication to the state rivals that of Carter during the Cold War and George W. Bush during the wars in Iraq and Afghanistan. Trump appears to be setting up an “America vs. Immigrant” rhetoric. That is a pretty strong conclusion given the evidence, however, and is really only backed up by anecdotal evidence from his past speeches.

So, how can we be sure that Trump’s speech isn’t just praising America for being such a great proponent of immigration? After all, that’s a possible conclusion when just looking at word frequencies. Using sentiment analysis can provide us with an answer.

Trump Vs. Other Presidents: Sentiment

Sentiment analysis is a nifty tool that attaches a feeling to each word and phrase in a given speech. In the two cases below, we look at how Trump’s SOTU stacks up with the others on the basis of its negativity.

In the first graph, we are looking at the number of negative words used in Trump’s speech and comparing it to the number used in others’ speeches. These words are coded using the Lexicoder Sentiment Dictionary, a tool developed by Mark Daku, Stuart Soroka and Lori Young (all initially at McGill University, and now at McGill, Michigan and Pennsylvania respectively). It has been developed specifically with political word usage in mind.

Here, we get the idea that Trump’s speech is more negative than any speech since those delivered by Ronald Reagan.
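Dictionary-based sentiment scoring is simpler than it sounds, and a small Python sketch shows the idea. To be clear, the word list below is a toy I made up; it is not the Lexicoder Sentiment Dictionary, which is far larger and also handles things like negation. The function name `negativity` is likewise just for this example.

```python
import re

# Toy negative-word list for illustration only; NOT the Lexicoder dictionary.
NEGATIVE_WORDS = {"crime", "violence", "terrible", "failed", "danger"}

def negativity(text, negative_words=NEGATIVE_WORDS):
    """Return (number of negative tokens, negative tokens as a share of all tokens)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = sum(tok in negative_words for tok in tokens)
    rate = hits / len(tokens) if tokens else 0.0
    return hits, rate

# Made-up sentence for demonstration.
print(negativity("Crime and violence failed our great cities"))
```

Comparing the rate rather than the raw count is the fairer choice when speeches differ in length.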

We can also analyze the speeches using the Linguistic Inquiry and Word Count engine, which is designed specifically for computer text analysis.

It’s not clear which of these approaches is better — although I’d like us to lean more heavily on Lexicoder — but the story being told is clear: Donald Trump’s first pseudo-State of the Union Address was a pretty negative one.

This is helpful, but we want to know exactly why it was negative. Could it be that Trump was simply angry at those who are preventing him from “Making America Great Again”? Sure, but as we will see, the data rejects that hypothesis.

That’s a question to be answered by the NRC Emotion Lexicon, a tool that looks at the emotions behind words beyond simple negativity and positivity.

Here we see that Trump’s speech ranks highest for joy but is directly followed by fear, anger, and sadness. We can infer here that the “negativity” uncovered above is the result of Trump’s fearful, angry, sad and disgusted rhetoric outweighing the joyous (and, to some extent, the “surprising”).

Thus, I argue that President Trump’s rhetoric about immigration and nationalism is largely fearful, angry, and sad. The idea floating around the press that Trump is being “presidential” is just not true once we look at the words he is using and extract their underlying emotions.

However, it’s theoretically impossible to say that Trump’s word usage is entirely unlike that of any president before him. This is partly because there are only a certain number of words (duh!) and partly because he is like some presidents before him.

Trump Vs. Other Presidents: A Model

There are a couple of ways to arrive at the conclusion that Trump is both strikingly different from and somewhat similar to some past Chief Executives. First, we’ll use the words in his speech to group him with similar presidents, then we’ll determine which topics differentiate Trump from his predecessors.

First, let’s take stock of what we have learned so far:

  1. Trump used "nationalistic" words more often than other words
  2. Trump spoke (generally) about policy less than his predecessors
  3. The policy words he did use were almost classically Republican
  4. His sentiment was negative (angry, fearful, and sad)

This is helpful, but a selection of what we don’t know may be more helpful. We still need answers to the following questions to really make the case that Trump is uniquely nationalistic.

  1. Is Trump's "nationalism" really nationalism, or could it be common patriotism? I.e., does it make him unique?

Initially, we should look at the topics mentioned in state of the union addresses and how they are used over time. We’ve done some of that work above, but it wasn’t “smart” work. Recall from earlier that we selected certain policy issues and nationalism to analyze over time — this is a good approach if we know what we’re looking for. Instead of taking that approach here, we want the computer to tell us which topics are the dominant forces.

A first step in this analysis is to simply group the speeches by topics until we find something interesting. This first step is what the computer-savvy will know as “k-means clustering.” Clustering is a fantastic (and comparatively simple!) tool for learning the topics of speeches without telling the computer what to look for. Don’t worry about the name; it’s a fancy label for a simple process:

Without getting too far into the weeds, k-means clustering is a method of splitting up our texts into ‘k’ subsets, where each speech is assigned to the “closest” cluster based on the distance between the speech’s topics and the topics of the cluster.
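For readers who like to see the gears turn, here is a bare-bones Python sketch of k-means. It assumes each speech has already been turned into a list of numbers (say, the share of words devoted to a few themes), and it starts the cluster centers at the first k speeches, which is a simplification; real implementations usually pick random starting points and run several times. The function name and the toy vectors are mine, not the setup used in this post.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) on equal-length numeric vectors.

    Returns (labels, centroids). Centroids start at the first k points,
    which keeps the sketch deterministic but is not how libraries do it.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy "speeches": (share of nation words, share of policy words). Made-up numbers.
speeches = [(0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8)]
labels, centroids = kmeans(speeches, k=2)
print(labels)
```

The “distance” here is ordinary straight-line (Euclidean) distance, which is the usual default for k-means.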

At the end, we’ll have a bunch of different “clusters” with similar speeches in them. In our case, we have 5 different clusters, and they look like this:

If you squint, you can see that Trump’s speech is grouped with those from Reagan and George H. W. Bush. Just because we can see the bag, however, doesn’t mean we can look inside of it. In this case, we don’t really want to look at the label on the bag, because the topic there is a mumbo jumbo jumble of words.

However, just because we can’t see the specific label does not mean we can’t guess at what the topic of Trump’s, Reagan’s, and H.W. Bush’s bag actually is.

We have already established that Trump gave a speech that is strikingly similar to those from the 1980s, as far as talking about taxes and the budget goes. Here, we can guess that Trump is getting grouped with Republican presidents on these policy-driven grounds. But it could be something else.

What if we wanted to see what sets Trump apart, rather than what makes him similar? To do that, let’s move on to our second step of this section: modeling Trump’s speech as a function of his “Trumpiness.”

To model the “Trumpiness” of the president’s State of the Union Address, we employ a method similar to the k-means clustering described above called topic modeling. Topic modeling allows us to automatically identify topics within the addresses and see how much of each speech falls under each topic.

Below, I make a type of topic model called a Structural Topic Model that explains Trump’s deviation from other SOTUs. Simply put, it looks at the year the speech was delivered and the word usage of Trump and non-Trump addresses, and identifies words that Trump uses that others do not. This is very similar to the initial word cloud in this post, although it is much more scientific and has a host of other benefits.

When we give the model the SOTUs from after 1970, it gives us three topics:

Topic     High-frequency words
Topic 1   free, freedom, great, war, men, world, peace
Topic 2   congress, federal, government, new, must, program, year
Topic 3   America, people, can, new, now, American, years

We can infer that the first topic is made from speeches during times of war or when the public cry for peace is loud. In fact, Topic 1 is most correlated with speeches from the Cold War era and the Bush era.

Topic 2 is the government program category, where SOTUs about policy go. The inclusion of the word “new” in the topic makes me think it’s commonly the Democrats’ category. After all, when was the last time you heard Republicans calling for new social programs?

Topic 3 is the populist party, where rhetoric is centered on the power of the American people to do something immediate. This, we would assume, is the Trump category. Yet, we do not have to assume; we can estimate the effects of the model. Simply defined, the model takes as inputs the year the speech was delivered and whether or not it was delivered by Trump and determines the effects of those variables on each topic.

According to the STM, when a State of the Union Address is delivered by Trump, the speech shifts about 25% towards Topic 3. In other words, Trump’s SOTU was considerably less about freedom and government policy and more about the country, both as a nation and a body politic.
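To give a feel for what “estimating the effect” of Trump on a topic means, here is a simplified Python stand-in. It is not the Structural Topic Model itself: it is ordinary least squares of a topic’s share on the year and a Trump indicator, and I am assuming the topic shares have already been estimated elsewhere. The function names are mine, and the numbers are synthetic, built so that the Trump shift is 0.25 to echo the figure above; they are not results from this analysis.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def topic_effects(years, is_trump, topic_share):
    """Least-squares fit of topic share on an intercept, centered year, and Trump indicator."""
    mean_year = sum(years) / len(years)
    rows = [[1.0, y - mean_year, float(t)] for y, t in zip(years, is_trump)]
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * s for r, s in zip(rows, topic_share)) for i in range(3)]
    intercept, year_slope, trump_shift = solve(xtx, xty)
    return {"intercept": intercept, "year": year_slope, "trump": trump_shift}

# Synthetic numbers for illustration only; these are not results from this post.
years = [1980, 1990, 2000, 2010, 2017]
is_trump = [0, 0, 0, 0, 1]
shares = [0.20, 0.21, 0.22, 0.23, 0.487]
print(topic_effects(years, is_trump, shares))
```

The “trump” coefficient is the estimated change in the topic’s share when the speaker is Trump, holding the year fixed, which is the kind of quantity the 25% figure describes.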

This may seem contrary to the findings from the clustering, which seemed to say “hey, Trump fits in with H. W. Bush and Reagan. What a normal dude!” In reality, this result makes perfect sense. If Trump is using rhetoric that is an aberration from recent SOTUs, and if Trump is favoring a nationalistic rhetoric over a more traditional speech, then Trump would have to be the indicator for that shift.

At this point in the analysis, however, we only have one State of the Union Address (or “joint address”) for Trump, and so the effects of his rhetoric on the topics may be either exaggerated or underestimated.

Wrapping Up

I have found that our 45th POTUS uses rhetoric that is both similar to that from the 1980s and unique in its blending of nationalism with lack of policy specifics. Trump’s speech was remarkably negative as well. Specifically, it contained a lot of fear and anger — although it was joyous at some points.

I have found that Trump serves as a “pretty OK” independent variable when predicting what topic a State of the Union Address will be about, and this holds true when we factor in the time period in which Trump is president. We can make a pretty good case with this evidence that Trump is a nationalist president unlike any we’ve had before. At least, that is what his words say (and his actions probably don’t disagree with that).

At this point in his Presidency, we don’t really know if all of Trump’s State of the Union Addresses will be this uniquely nationalistic. It’s very possible that he reins in the rhetoric next time around. That being said, if we combine the text from his press conferences and news interviews, I doubt we’ll arrive at a substantially different conclusion.

For now, however, it’s clear that Trump is not like any president we’ve had in recent history. If the February SOTU is any indication of a larger trend in our president’s rhetoric, what we’re hearing about a “presidential” President Trump is likely erroneous.

. . . . .

Thanks for reading today, everyone. Tune in to my Twitter for updates and make sure you sign up for my newsletter to get notifications of recent posts.

