R for Political Data Science Week 11: Is Beto O’Rourke the Media’s Sweetheart?

O’Rourke’s bid for the presidency has scored some big media attention

By G. Elliott Morris

On Mar. 15, 2019, in R for Political Data, US Politics, R-Posts



This is part of a series of short posts about politics that seeks to show how we use data science to learn more about the real world. Follow along here.


Beto O’Rourke announced his campaign for president last Thursday. It has been just a few months since his closer-than-expected 2.6-point loss to incumbent Ted Cruz in last year’s U.S. Senate race in Texas, and people have been cheering him on ever since. Read my piece about Beto’s run for the presidency here.

It has long been alleged that O’Rourke is a sort of media sweetheart: he used to play in a rock band, possesses an Obama-esque energy when campaigning, and has a unique ability to captivate the talking heads on both the right and the left. For this reason, his campaign announcement was covered everywhere, constantly. CNN covered the launch, MSNBC opined on his chances, and Fox News immediately revived its tired 2018 attacks on him as a fake Hispanic (raised in El Paso, Texas, he was nicknamed “Beto” by his friends, even though his name is Robert O’Rourke, not Roberto, which is the usual source of the nickname).

I took it upon myself to graph the media attention paid to O’Rourke, measured as word frequencies in the on-screen chyrons of each of the three main cable news outlets.

# setup -------------
library(newsflash) # devtools::install_github("hrbrmstr/newsflash")
library(tidyverse)
library(tidytext) #devtools::install_github("juliasilge/tidytext")
library(lubridate)
library(stringi)

# data from last x days --------
ENDDATE   <- ymd("2019-03-17")
STARTDATE <- ymd("2019-03-14")

days <- seq(from=STARTDATE,
            to=ENDDATE,
            by = "days"
)


# chyrons -------
chyrons_objects <- as.data.frame(list_chyrons()) %>% 
  filter(ts %in% days, type == "cleaned")

chyrons <- data.frame(NULL)
# uncomment these lines if you haven't downloaded the data yet. I'm commenting and loading to save time:
# for(i in 1:nrow(chyrons_objects)){
#   #print(paste("Fetching chyrons for ",chyrons_objects[i,]$ts))
#   chyrons <- chyrons %>%
#     bind_rows(read_chyrons(chyrons_objects[i,]$ts) %>%
#                 mutate(channel = gsub("BBCNEWS","BBC",channel),
#                        channel = gsub("CNNW","CNN",channel),
#                        channel = gsub("MSNBCW","MSNBC",channel),
#                        channel = gsub("FOXNEWSW","FOX",channel)) %>% 
#                 filter(channel %in% c("CNN","MSNBC","FOX"))
#     )
# }

# load the data (comment out if you haven't downloaded yet)
load(file='../../data_no_export/post/2019_03_15-media-beto/beto-media-data.Rdata')

# tokenize words ----------
# get words
chyrons <- chyrons %>% 
  unnest_tokens(word, text,drop = FALSE)


# drop stop words
data("stop_words")
chyrons <- chyrons %>%
  anti_join(stop_words, by = "word")


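# flag mentions --------
# each row is one word, but `text` still holds the full chyron (drop = FALSE above),
# so the flag marks every word that came from a chyron mentioning O'Rourke
mention_chyron <- chyrons %>% 
  mutate(date = date(ts),
         hour = hour(ts),
         time = paste0(date," ",hour,":00:00"),
         text = tolower(as.character(text)),
         mention = grepl("beto|o'rourke", text))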

# percentage of words by channel ---------

# frequency only of keyword mentions
mention_chyron_n <- mention_chyron %>%
  filter(mention) %>% 
  count(time, channel) 

# divide mentions by all words in the same hour and channel;
# hours with no mentions are absent from mention_chyron_n, so the left join drops them
mention_by_time_chyron <- left_join(
  mention_chyron_n, 
  mention_chyron %>%
    count(time, channel, name = "total"),
  by = c("time", "channel")
) %>% 
  mutate(n = n/total)

# datetime object for candidate
mention_by_time_chyron <- mention_by_time_chyron %>%
  mutate(time = as_datetime(time))

# graph
gg <- ggplot(mention_by_time_chyron,aes(x=time, y=n, col=channel,fill=channel)) +
  geom_col() + 
  scale_y_continuous(labels = scales::percent_format()) +
  scale_x_datetime(date_labels = "%b %d %I%p",expand = expansion(mult = 0.1)) + # expansion() replaces the deprecated expand_scale() in ggplot2 >= 3.3.0
  facet_wrap(~channel, scales="fixed",ncol = 1) +
  labs(title="Cable News Outlets Hand Microphone to Beto O'Rourke",
       subtitle="Counting mentions of the candidate in cable networks' scrolling chyrons",
       caption="Source: Internet Archive Third Eye project",
       x="Date",
       y="Percent of Hourly Mentions") +
  theme(legend.position="bottom") +
  scale_color_manual("Network",values=c("CNN"="#F8C471","FOX"="#EC7063","MSNBC"="#5DADE2","BBC"="#D2B4DE")) +
  scale_fill_manual("Network",values=c("CNN"="#F8C471","FOX"="#EC7063","MSNBC"="#5DADE2","BBC"="#D2B4DE")) +
  coord_cartesian(ylim=c(0,max(mention_by_time_chyron$n)))

In the figure below, I show what percentage of each outlet’s news chyrons (the scrolling bar of text at the bottom of the screen) was devoted to O’Rourke, in hourly increments.

preview(gg)

As you can see, O’Rourke got a lot of coverage during his early events on announcement day, March 14, and has continued to draw scattered attention from CNN and MSNBC. Fox has steadily devoted 10-40% of its hourly chyron text to O’Rourke since the 14th.
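To make the measure behind the chart concrete, here is a minimal sketch of the hourly share calculation on a handful of made-up rows. The data frame `toy` and the result `shares` are hypothetical names, not objects from the analysis above, and the chyron text is invented for illustration. Unlike the join used for the chart, this version keeps hours with zero mentions, which may or may not be what you want.

```r
library(dplyr)

# Hypothetical toy data: one row per chyron word, with the full chyron text kept,
# mimicking the shape of the data after unnest_tokens(..., drop = FALSE)
toy <- data.frame(
  time    = c(rep("2019-03-14 09:00:00", 5), rep("2019-03-14 10:00:00", 2)),
  channel = c("CNN", "CNN", "CNN", "CNN", "FOX", "FOX", "FOX"),
  text    = c("beto enters race", "beto enters race",
              "senate votes today", "senate votes today",
              "o'rourke announces",
              "border debate", "border debate"),
  stringsAsFactors = FALSE
)

shares <- toy %>%
  mutate(mention = grepl("beto|o'rourke", tolower(text))) %>%
  group_by(time, channel) %>%
  summarise(mentions = sum(mention),   # words from chyrons that mention him
            total    = n(),            # all words in that hour and channel
            share    = mean(mention),  # same as mentions / total
            .groups  = "drop")

print(shares)
```

Because `share` is just the mean of a TRUE/FALSE flag, an hour in which a channel never mentions the candidate shows up as 0 rather than disappearing from the table.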

I encourage you all to play around a little more with the data here. Does O’Rourke get more attention than others? Is that undeserved? Is there a gender bias?
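If you want a starting point for the first question, one possible approach is to swap the single O’Rourke pattern for a named vector of patterns, one per candidate, and compute the share of chyrons matching each. The sketch below uses base R only. The chyron lines in `toy_text` are invented, and the candidate patterns in `patterns` are my own illustrative assumptions, not part of the analysis above; with the real data you would apply the same idea to the `text` column of `chyrons`.

```r
# Hypothetical chyron text, invented for illustration
toy_text <- c("Beto O'Rourke launches 2020 bid",
              "Sanders rallies in Iowa",
              "Harris, O'Rourke trade barbs",
              "Senate votes on border emergency")

# Assumed, illustrative patterns: one regular expression per candidate
patterns <- c(orourke = "beto|o'rourke",
              sanders = "bernie|sanders",
              harris  = "kamala|harris")

# Share of chyrons that match each candidate's pattern
mention_share <- sapply(patterns, function(p) mean(grepl(p, tolower(toy_text))))

print(mention_share)
```

A single chyron can count toward more than one candidate here, so the shares need not sum to one. Short patterns can also match unrelated words, so it is worth spot-checking the matched text before reading much into the numbers.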






