This is a collection of short posts that answer relevant questions about politics using political data. Most posts are short and sweet, getting at my curiosities for the week in just a few paragraphs and the corresponding lines of code. I like to think we’re answering some salient, yet simple, questions here. Because you can show and hide the computer code used for the analyses, I think the posts are pretty accessible to anyone interested in either politics or data science, but you will probably get more out of them if you have an interest in (or tolerance of!) data-related subjects.
Some of you will recall that about a year ago I began a project that attempted a similar task. I got four posts in and then stopped. Why? I wasn’t holding myself accountable. This year, I’ve resolved to embark on a weekly journey to write simple introductions to concepts and datasets that I have found useful as a data journalist and political analyst (yes, they’ll sometimes be repetitive; data science is too!). My mission is cheesy, I know, but who doesn’t like a good brie? (Stata users, probably.)
For your reference, I have also published my own course about political data analysis in R, titled “Analyzing Polling and Election Data with R” at DataCamp.com. If you’re looking for a more formal and structured guide to learning data science, that’s one place to start.
I am also making all the data and code easily available for you all via GitHub. Don’t be afraid to create issues, make pull requests, etc. And of course, check out my R package that provides some tools for using data science to analyze politics, called politicaldata, which you can install via GitHub.
A Guide to Analyzing Political Data in R
Please see this page for my guide on how to analyze and visualize political data in R. It is not quite long enough to be a book, but will (in the end) be fairly comprehensive.
The organization of this weekly series will be straightforward: there’s no organization. Why does there need to be? Apart from the introductory information on this page, each post should be a standalone answer to a simple question that we can answer with the techniques of data science. So, the posts will be linked together, but don’t think of this as a workshop or short course on R and politics. Again, for that, consult my DataCamp course.
Without further ado…
Heads up: you can also find a list of all the posts at the “R for Political Data” category page. It might be helpful if I forget to update the contents here.
- Week 1: Polarization in the 115th Congress
- Week 2: This Early Before 2020, It’s All About Name Recognition
- Week 3: How Marginal Tax Rates Work
- Week 4: What Happens To Our Algorithms When Socialists Vote in Congress?
- Week 5: The Ideological Diversity of the American Electorate
- Week 6: Just How Liberal Are the 2020 Democratic Candidates?
- Week 7: The 2020 Twitter Primary
- Week 8: Four Parties in America? Probably Not Anytime Soon
- Week 9: The “Strongest” Democrats and Republicans (That Ran for Office in 2018)
- Week 10: What If Each State Allocated Their Electoral College Votes Proportionally?
- Week 11: Beto O’Rourke is a Media Sweetheart