Minute by Minute Twitter Sentiment Timeline from the VP debate


Background

The data for this graph was collected automatically roughly every 60 seconds during the VP debate on 10/11/2012, for a final aggregate sample of 363,163 tweets.  Duplicate tweets were then removed (because of bots), leaving 81,124 unique tweets (52,303 for Biden, 28,821 for Ryan).  Every point in the graph is the mean sentiment of the tweets gathered for that minute.  The farther a point sits above zero, the more positive the sentiment of those tweets; the farther below zero, the more negative.  It would be very interesting to compare this to the debate transcript to infer what drove each movement.  The one very noticeable takeaway is the jump in sentiment as soon as the debate was over at 22:30.
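The de-duplication and per-minute averaging described above can be sketched in base R on a toy data frame. The column names and every value here are made up for illustration; none of it comes from the debate data.

```r
# Toy data: hypothetical scores, not real debate tweets
tweets <- data.frame(
  text      = c("great answer", "great answer", "so rude", "good point", "terrible"),
  score     = c(1, 1, -1, 1, -1),
  candidate = c("Biden", "Biden", "Biden", "Ryan", "Ryan"),
  minute    = c("21:00", "21:00", "21:00", "21:00", "21:01"),
  stringsAsFactors = FALSE
)

# Drop repeated tweet text (e.g. bot reposts), keeping the first occurrence
unique_tweets <- tweets[!duplicated(tweets$text), ]

# Mean sentiment per candidate per minute: each row is one point on the graph
per_minute <- aggregate(score ~ candidate + minute, data = unique_tweets, FUN = mean)
print(per_minute)
```

A minute with one positive and one negative tweet averages to zero, which is why points near the zero line can hide a lot of disagreement.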

R Code for this data collection and graphing

To collect this data I updated my original code from the presidential debate as follows:

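library(twitteR)  # searchTwitter()
library(plyr)     # laply()

# Pull the latest tweets mentioning each candidate, score them, and timestamp the batch.
# score.sentiment(), positive.words and negative.words are assumed to be defined already,
# as in the original presidential debate code.
vp<-function(){
Ryan=searchTwitter('@PaulRyanVP', n=1500)
Biden=searchTwitter('@JoeBiden', n=1500)
textRyan=laply(Ryan, function(t) t$getText())
textBiden=laply(Biden, function(t) t$getText())
resultRyan=score.sentiment(textRyan, positive.words, negative.words)
resultRyan$candidate='Ryan'
resultBiden=score.sentiment(textBiden, positive.words, negative.words)
resultBiden$candidate='Biden'
result<-merge(resultBiden,resultRyan, all=TRUE)
result$candidate<-as.factor(result$candidate)
result$time<-date()
return(result)
}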
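
The function above depends on score.sentiment and two word lists that are not reproduced in this post. As a rough idea of what such a scorer might do, here is a minimal sketch that counts positive words minus negative words per tweet. The name score.sentiment.sketch and the tiny word lists are hypothetical, and this is not necessarily how the original function works.

```r
# Hypothetical word lists; a real run would use a full opinion lexicon
positive.words <- c("good", "great", "win")
negative.words <- c("bad", "rude", "lie")

# Hypothetical stand-in for score.sentiment(): returns one score per sentence
score.sentiment.sketch <- function(sentences, pos.words, neg.words) {
  scores <- sapply(sentences, function(sentence) {
    # lower-case, strip punctuation, split on whitespace
    cleaned <- gsub("[[:punct:]]", "", tolower(sentence))
    words <- unlist(strsplit(cleaned, "\\s+"))
    # score = number of positive words minus number of negative words
    sum(words %in% pos.words) - sum(words %in% neg.words)
  }, USE.NAMES = FALSE)
  data.frame(score = scores, text = sentences, stringsAsFactors = FALSE)
}

res <- score.sentiment.sketch(c("Great answer, good point!", "What a rude lie"),
                              positive.words, negative.words)
print(res)
```

Whatever the scorer, the rest of the code only needs a data frame with a score column and a text column.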

Then, to have R automatically collect the data every 60 seconds in an endless loop (I wasn’t sure at the time when I would want to stop it), you just run it inside a repeat loop.

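debate<-vp()
repeat {
startTime<-Sys.time()
x<-vp()
debate<-merge(x, debate, all=TRUE)
# sleep for whatever is left of the 60-second interval
sleepTime<-60-as.numeric(difftime(Sys.time(), startTime, units="secs"))
if (sleepTime>0)
Sys.sleep(sleepTime)
}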
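
If you do know in advance when you want to stop, the same polling idea can be given a deadline instead of running forever. The sketch below is one way to do that; poll_until and fake_collect are hypothetical names, the stub collector stands in for vp() so the example runs without Twitter access, and rbind is swapped in for merge to keep it simple.

```r
# collect: any zero-argument function returning a data frame (vp() in this post)
poll_until <- function(collect, interval_secs, stop_at) {
  collected <- collect()
  while (Sys.time() < stop_at) {
    start_time <- Sys.time()
    collected <- rbind(collected, collect())
    # sleep only for what is left of the interval after the collection itself
    remaining <- interval_secs - as.numeric(difftime(Sys.time(), start_time, units = "secs"))
    if (remaining > 0) Sys.sleep(remaining)
  }
  collected
}

# Stub collector (hypothetical): one row per call, stamped like vp() does
fake_collect <- function() data.frame(score = 0, time = date(), stringsAsFactors = FALSE)

# Short interval and deadline so the example finishes quickly
out <- poll_until(fake_collect, interval_secs = 0.1, stop_at = Sys.time() + 0.35)
print(nrow(out))
```

For the debate this might have been called with interval_secs = 60 and a stop_at a little after the scheduled end.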

At 10:56pm I got bored and the debate was over, so I just hit stop and ran the following to get the graph:
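library(reshape2)  # melt()
library(ggplot2)
# drop duplicate tweets, then bucket each tweet by the minute it was collected
x<-subset(debate, !duplicated(text))
x$minute<-as.POSIXct(strptime(x$time, "%a %b %d %H:%M:%S %Y"))
x$minute1<-format(x$minute,"%H:%M")
x<-subset(x, minute1>="21:00")
period<-sort(unique(x$minute1))
# mean sentiment score per minute for each candidate
Biden<-sapply(period, function(p) mean(x$score[x$minute1==p & x$candidate=="Biden"]))
Ryan<-sapply(period, function(p) mean(x$score[x$minute1==p & x$candidate=="Ryan"]))
means<-data.frame(period, Biden, Ryan)
# long format: one row per (minute, candidate) pair
dfm<-melt(means, id.vars="period")
ggplot(dfm, aes(period, value, colour=variable, group=variable))+
geom_point()+geom_line()+xlab("time")+
theme(axis.text.x=element_text(angle=45),
axis.ticks=element_blank(), axis.title.y=element_blank())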

I have to admit, doing this actually made watching the debate kind of fun.

For cleaner access to the code, please go to my GitHub.


7 thoughts on “Minute by Minute Twitter Sentiment Timeline from the VP debate”

    • I wouldn’t say it’s all about who has won, but more about immediate reaction to what was being said at the time. Also, there may have been other factors, like demographics, that had something to do with it.

      • I would think that it would start to diverge significantly from what was being said “at the time” in the debate. Indeed, I would expect the conversation of “the twitterverse” to diverge significantly from the content of the debate itself.

        The “Big Bird” explosion from the first presidential debate stands as a good analogy.

