Friday, 28 December 2018

Word Cloud of Tweets with #Trump Using Python

This is a short post. Have you ever tried going through a list of tweets to see what people are saying about a particular topic? Usually, that means searching for or clicking on a hashtag and reading through the results.

There is a fun and visually appealing way to quickly "scan" through the tweets with a hashtag by using Python and Twitter's API.

Python libraries you need:
  • tweepy  - for accessing Twitter's API
  • wordcloud - for creating the wordcloud
  • matplotlib - for displaying the wordcloud as an image

What you also need are access tokens and keys for Twitter's API. These are basically 4 different codes, unique to your Twitter account, that let you access Twitter's API. Twitter has instructions on how to get these: https://developer.twitter.com/en/docs/basics/authentication/guides/access-tokens.html
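
One way to keep those four codes out of the notebook itself is to read them from environment variables. The sketch below is only an illustration: the variable names (TWITTER_CONSUMER_KEY and so on) and the helper load_twitter_credentials are my own assumptions, not something Twitter or tweepy requires.

```python
import os

# Hypothetical environment variable names; pick whatever names suit you.
CREDENTIAL_VARS = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_token_secret": "TWITTER_ACCESS_TOKEN_SECRET",
}


def load_twitter_credentials(environ=None):
    """Return the four credentials as a dict, or raise if any are missing."""
    environ = os.environ if environ is None else environ
    # Collect every variable that is unset or empty so the error lists them all
    missing = [var for var in CREDENTIAL_VARS.values() if not environ.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[var] for name, var in CREDENTIAL_VARS.items()}
```

The returned values can then be used in place of hard-coded strings when you set up the tweepy connection.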



After downloading the latest 1500 tweets (as of 10:07pm AEDT on 28/12/2018), we get the wordcloud below:

It seems that people have been talking about a closer nation and "maga" (Make America Great Again). People have also been talking about the Syrian withdrawal and "love prague". It turns out that a popular news story about Trump over the past day was that Michael Cohen, Donald Trump's former lawyer, allegedly travelled to Prague in 2016.

Anyway, the full code is below. You can search for a different hashtag by replacing 'trump' in the query and adjusting the stopword list to match.

import tweepy
from wordcloud import WordCloud, STOPWORDS
import matplotlib.pyplot as plt
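#Jupyter magic for inline plots; remove this line when running as a plain script
%matplotlib inline

#Input your credentials here (replace the placeholder strings with your own)
consumer_key = "YOUR_CONSUMER_KEY"
consumer_secret = "YOUR_CONSUMER_SECRET"
access_token = "YOUR_ACCESS_TOKEN"
access_token_secret = "YOUR_ACCESS_TOKEN_SECRET"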

#Connect to twitter using your credentials
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth,wait_on_rate_limit=True)

#Create a list of tweets with the hashtag "trump". Limiting to the latest 1500 tweets, since 2018-12-19
#Note: in Tweepy 4.0 and later, api.search is named api.search_tweets
tweets = []

for tweet in tweepy.Cursor(api.search,q="#trump",count=100,
                           lang="en",
                           since="2018-12-19").items(1500):
    tweets.append(tweet.text)


# Create stopword list and include words that are not useful, such as the word "trump" itself:
stopwords = set(STOPWORDS)
stopwords.update(["trump", "realdonaldtrump", "potus", "president", "donald", "donald trump",
                  "rt", "https", "http", "co","retweet"])

#Create one long text of the tweets
text = '\n'.join(tweets)

#Make all text lower case.
text = text.lower()

#Create our word cloud. Limit to 100 words.
wordcloud_trump = WordCloud(stopwords = stopwords, background_color = 'white', max_words = 100).generate(text)

#Create image
plt.figure(figsize = [10,10])
plt.imshow(wordcloud_trump, interpolation="bilinear")
plt.axis("off")
#To store the figure to a file, uncomment the next line (it must run before plt.show())
#plt.savefig("trump.png", format="png")
plt.show()
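
If you want to sanity-check what the wordcloud shows, the same idea can be sketched without any plotting: join the tweets, lower-case them, drop stopwords, and count what is left. This is a rough stand-in and not how the wordcloud library tokenises text internally. The helper name top_words, the regular expressions, and the sample tweets are all made up for illustration.

```python
import re
from collections import Counter


def top_words(tweets, stopwords, n=10):
    """Return the n most common words across the tweets, ignoring stopwords."""
    text = "\n".join(tweets).lower()
    # Drop links first so fragments such as "https" and "co" never get counted
    text = re.sub(r"https?://\S+", " ", text)
    # Treat runs of two or more letters/apostrophes as words
    words = re.findall(r"[a-z']{2,}", text)
    counts = Counter(word for word in words if word not in stopwords)
    return counts.most_common(n)


# Made-up sample tweets, for illustration only
sample = [
    "RT @someone: #Trump talks about the wall https://t.co/abc123",
    "The wall again! #trump #maga",
    "MAGA rally tonight #Trump",
]
print(top_words(sample, {"rt", "trump", "the", "about"}, n=2))
```

Passing the real tweets list and stopwords set from the code above should give a quick text-only view of the words that dominate the cloud.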
