A tinted collage of screenshots of various dataviz Twitter bots.

Using Twitter bots for data visualization

Twitter bots can be an effective medium for data visualization thanks to the site’s reach, with some popular Twitter bots gaining thousands or tens of thousands of followers. Twitter bots are a good fit for datasets that are continuously updated, or that contain a variety of data points to explore.

Here are two of my own examples. @last100bills tweets a daily breakdown of legislative activity in the US government. And every six hours, @nycdatabot cycles through datasets from NYC OpenData that can be displayed on a map.

These two bots are open-sourced on Glitch, and for more examples, check out a collection of dataviz bots over on Botwiki.

In this tutorial I will walk you through the process of recreating one of my bots. I am going to use Python so that we have access to some common Python libraries for working with data, including pandas and seaborn.

We are also going to be using the tweepy library to interact with Twitter’s API.

The first step will be creating a Twitter account for our bot and setting up a developer account. This is all covered in this tutorial on Botwiki.

If you run into any issues during this step, or at any other point, be sure to join other creative botmakers over at botmakers.org. And you can find the finished project files on my GitHub.

Once that’s taken care of, it’s time to set up a structure for our project.

I’m going to start by creating four blank Python files.

  • bot.py — this will be our main script that brings everything together
  • get_data.py — in here we will load and process the data for our chart
  • make_chart.py — here we will create a chart from our data
  • tweet.py — and in this file we’ll write code that handles the bot account

Splitting the code into parts will make it easier to update and maintain it as we add more features.

Before we start writing code, let’s install a few dependencies.

pip3 install tweepy seaborn matplotlib numpy pandas python-dotenv

And then let’s save our dependencies so that we can install them again when we need to.

pip3 freeze > requirements.txt

To reinstall the required packages later, you can run:

pip3 install -r requirements.txt

Note that I’ve added the python-dotenv package to help us manage our API keys from the first step.

There are a few other ways to handle API keys depending on how you’re hosting the bot, and I will touch on this a bit more later on.
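For reference, a .env file in the project root could look like this. The key names match the environment variables read in the code below; the values here are just placeholders you’d swap for your own keys (and it’s a good idea to keep .env out of version control).

```shell
TWITTER_API_KEY=your-api-key
TWITTER_API_SECRET=your-api-secret
TWITTER_ACCESS_TOKEN=your-access-token
TWITTER_ACCESS_TOKEN_SECRET=your-access-token-secret
```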

Now, let’s make our bot tweet out some text to make sure things are working correctly.

Inside the tweet.py file, we’ll start off with a function that will handle the tweeting part of our bot.

import os
import tweepy
from dotenv import load_dotenv

load_dotenv()

def tweet(text):  
    client = tweepy.Client(consumer_key=os.environ.get("TWITTER_API_KEY"),
                           consumer_secret=os.environ.get("TWITTER_API_SECRET"),
                           access_token=os.environ.get("TWITTER_ACCESS_TOKEN"),
                           access_token_secret=os.environ.get("TWITTER_ACCESS_TOKEN_SECRET"))

    response = client.create_tweet(text=text)
    return response
    

And inside our main bot.py script we will call this function.

from tweet import *

tweet("hello world!")

Now you can run python3 bot.py and behold, our bot’s first tweet.

A screenshot of our bot's first tweet saying "hello world!"

If you’re not seeing the tweet posted, make sure your API keys are saved in a .env file. You can also print a response from the API and check for any helpful error messages.

from tweet import *

response = tweet("hello world!")
print(response)

If you’d like to run the script again, you will have to either change the tweeted text or delete your tweet, because Twitter won’t let you post a duplicate tweet and you’ll get an error instead.
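One simple workaround while testing is to append a timestamp to the text, so every run produces a unique tweet. This little helper isn’t part of the final bot, just a convenience for repeated test runs:

```python
from datetime import datetime

def unique_text(text):
    # Append the current date and time so repeated test runs
    # don't trip Twitter's duplicate-tweet check
    return f"{text} ({datetime.now():%Y-%m-%d %H:%M:%S})"

# usage: tweet(unique_text("hello world!"))
```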

We’re able to tweet some text, great work so far! Let’s see if we can also upload an image.

For that, we will have to do things a bit differently, because the tweepy package uses an older version of Twitter’s API to handle image uploads. We will add a new function called upload_image that takes the path to the image file, alt text for the image (which makes our bot accessible to people who rely on screen readers and other assistive technology), and the text that will be tweeted along with the image.

Our tweet.py script will now look like this.

import os
import tweepy
from dotenv import load_dotenv

load_dotenv()

def tweet(text):  
    client = tweepy.Client(consumer_key=os.environ.get("TWITTER_API_KEY"),
                           consumer_secret=os.environ.get("TWITTER_API_SECRET"),
                           access_token=os.environ.get("TWITTER_ACCESS_TOKEN"),
                           access_token_secret=os.environ.get("TWITTER_ACCESS_TOKEN_SECRET"))

    response = client.create_tweet(text=text)
    return response
    
def upload_image(filename, alt_text, text):
    auth = tweepy.OAuth1UserHandler(
        os.environ.get("TWITTER_API_KEY"),
        os.environ.get("TWITTER_API_SECRET"),
        os.environ.get("TWITTER_ACCESS_TOKEN"),
        os.environ.get("TWITTER_ACCESS_TOKEN_SECRET")
    )
    api = tweepy.API(auth)

    response = api.media_upload(filename=filename)
    api.create_media_metadata(response.media_id_string, alt_text=alt_text)
    status = api.update_status(status=text, media_ids=[response.media_id_string])
    return status

Now we need to create an image. Let’s use the seaborn package to do just that. We can use one of the built-in datasets. I’ll go with “penguins”.

Here’s one way our get_data.py script can work. Let’s use a dataset dictionary that will contain the data itself, a description of the data that we can use for the alt text, and a short title to use as the tweet’s text.

The resulting code would look like this.

import seaborn as sns

def get_data():
    dataset = {}
    dataset["data"] = sns.load_dataset("penguins")
    dataset["title"] = "Learning about penguins: https://github.com/mwaskom/seaborn-data/blob/master/penguins.csv"
    dataset["description"] = "A series of charts exploring data about penguins."
    return dataset

Let me pause here for a moment. The alt text should be a lot more descriptive than the example I’m using here for testing, and the chart itself is a bit busy, but it will all suffice for a little demo to make sure everything works.

Next we will turn the data into a chart inside make_chart.py.

import seaborn as sns

def make_chart(data):
    filename = "chart.png"
    plot = sns.pairplot(data, hue="species")
    fig = plot.fig
    fig.savefig(filename)
    return filename

Now we can use the make_chart and upload_image functions in our main bot.py script.

from tweet import *
from get_data import *
from make_chart import *

dataset = get_data()
chart = make_chart(dataset["data"])
response = upload_image(chart, dataset["description"], dataset["title"])
# print(response)

And it works!

A screenshot of our bot posting a series of charts.

If you’re not seeing your chart posted, uncomment the last line in bot.py and check your terminal for any errors. The uploaded image will make each tweet be considered unique, so you don’t have to worry about changing the text if you want to rerun the script.

At this point we have nicely structured code that works pretty well. It’s time to use some more interesting data, and then we can look into automating our bot.

Let’s try recreating one of the two bots that I introduced in the beginning.

First, the original source for @last100bills is here. You can see that the bot was written in JavaScript. Here’s the important part.

const datasetUrl = 'https://www.govtrack.us/api/v2/bill?order_by=-current_status_date',
      datasetName = 'Last 100 bills in the US government',
      datasetLabels = ['group', 'value'];

We can see that the bot loads data from govtrack.us, most of which, according to the About page, updates daily. That’s perfect: we can simply set up our bot to run once a day and pull data from the same URL.

After playing a bit with the code, this is what I came up with for get_data.py

import json
from urllib.request import urlopen

def get_data():
    url = "https://www.govtrack.us/api/v2/bill?order_by=-current_status_date" 
    response = urlopen(url) 
    data_json = json.loads(response.read())

    statuses = [item["current_status"] for item in data_json["objects"]]

    dataset = {} 
    dataset["data"] = { 
        "Introduced": statuses.count("introduced"), 
        "Passed House": statuses.count("pass_over_house"), 
        "Passed House & Senate": statuses.count("passed_bill"), 
        "Concurrent Resolution": statuses.count("passed_concurrentres"),
        "Simple Resolution": statuses.count("passed_simpleres"), 
        "Ordered Reported": statuses.count("reported"), 
        "Enacted": statuses.count("enacted_signed"), 
    }

    dataset["title"] = "Analyzing the last 100 bills introduced in the US congress"

    data = dataset["data"]
    dataset["description"] = "\n".join(f"{label}: {count}" for label, count in data.items())

    return dataset
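To see what the generated alt text will look like, here is the same description-building idea run standalone on some made-up counts (the numbers are just for illustration):

```python
# Made-up counts, just to show the description format
data = {
    "Introduced": 80,
    "Passed House": 7,
    "Enacted": 2,
}

# Same idea as in get_data(): one "label: count" pair per line
description = "\n".join(f"{label}: {count}" for label, count in data.items())
print(description)
```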

Shoutout to cbunn81 for helping me simplify the code above. And here’s the code from make_chart.py.

import matplotlib.pyplot as plt
import seaborn as sns

def make_chart(data):
    filename = "chart.png"

    sns.set_style("darkgrid")
    sns.barplot(x=list(data.values()), y=[f"{label}: {count}" for label, count in data.items()], orient="h")
    plt.tight_layout()
    plt.savefig(filename)

    return filename

The result looks pretty good!

Side by side screenshots showing our Twitter bot posting a bar chart and highlighting alt text with the chart's content.

Deploying your bot

Now, in order for our bot to tweet on its own, we need to schedule the bot.py script to run periodically. There’s a list of options for this over on Botwiki, and going through all of them is way beyond the scope of this tutorial, but let me quickly show you how you can host your bot for free on Pipedream.

After you sign up for an account, add your Twitter API keys in the settings.

If you want to make multiple bots, you can differentiate their API keys by adding the bot’s name to each key’s name, for example:

TWITTER_API_KEY_MY_BOT
TWITTER_API_SECRET_MY_BOT
TWITTER_ACCESS_TOKEN_MY_BOT
TWITTER_ACCESS_TOKEN_SECRET_MY_BOT

TWITTER_API_KEY_MY_OTHER_BOT
TWITTER_API_SECRET_MY_OTHER_BOT
TWITTER_ACCESS_TOKEN_MY_OTHER_BOT
TWITTER_ACCESS_TOKEN_SECRET_MY_OTHER_BOT

Then you will also have to change the key names in your code.
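A small helper can make that change in one place. This is just a sketch using the suffixed key names from the example above (MY_BOT is the example bot name, not a real key):

```python
import os

def get_twitter_keys(bot_name):
    # Look up the four suffixed keys for a given bot,
    # e.g. TWITTER_API_KEY_MY_BOT when bot_name is "MY_BOT"
    names = [
        "TWITTER_API_KEY",
        "TWITTER_API_SECRET",
        "TWITTER_ACCESS_TOKEN",
        "TWITTER_ACCESS_TOKEN_SECRET",
    ]
    return {name: os.environ.get(f"{name}_{bot_name}") for name in names}
```

You could then pass the resulting dictionary’s values into tweepy.Client instead of reading the unsuffixed names directly.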

Once your API keys are saved, head over to your workflow dashboard and create a new workflow.

Screenshot of the Pipedream workflows dashboard with the New button highlighted.

Choose Schedule as a trigger for the bot. For now let’s go with the Custom Interval option to test things out.

Screenshot of a new workflow trigger being added in Pipedream.

Let’s use a short trigger interval, something like 30 seconds.

Screenshot of a new workflow trigger being customized in Pipedream.

Next, we’ll add a Python code block that will run every time our scheduler is triggered.

Screenshot of a new workflow step being added in Pipedream.
Screenshot of a new workflow step being added in Pipedream with Python code block highlighted.
Screenshot of a new workflow step being added in Pipedream with Python code block being edited.

Now we can start adding our bot scripts, splitting them into multiple steps. A few things will need to be adjusted, for example, we will need to use the tmp folder for any files we want to save. Have a look at the finished workflow here.
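For example, the chart filename in make_chart.py could be built along these lines. This is a sketch assuming the tmp folder is at /tmp, which is the usual location on Linux-based hosts like Pipedream:

```python
import os

def chart_path(name="chart.png", out_dir="/tmp"):
    # Pipedream workflows can only write to the tmp folder,
    # so build the output path there instead of the project root
    return os.path.join(out_dir, name)

# in make_chart.py: filename = chart_path()
```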

Before you enable your workflow, be sure to change the trigger frequency. To get an idea of how often your bot should tweet, have a look at this guide on Botwiki, and consult Twitter’s documentation of their rate limits. And if you’re using the free Pipedream plan, be aware that you are limited to 333 daily invocations, which should be plenty for most bots.
