A tinted collage of screenshots of various dataviz Twitter bots.

Using Mastodon bots for data visualization

Note: For Twitter bots, check out this version of the tutorial and some useful resources over at botwiki.org.

Bots can be an effective method for data visualization, particularly for datasets that are continuously updated or that contain a variety of data points to explore. Mastodon’s API is very easy to get started with, and you can host your bot for free on a dedicated instance called botsin.space. (I spoke with the instance’s creator a few years back for Botwiki.)

And one great thing about using Mastodon, other than supporting the free and open web, is that you get an RSS feed of your bot’s posts, like this one.

Here are two of my own examples. @last100bills posts a daily breakdown of legislative activity in the US government. And every six hours, @nycdatabot cycles through datasets from NYC OpenData that can be displayed on a map.

These two bots are open-sourced on Glitch, and for more examples, check out a collection of dataviz bots over on Botwiki.

In this tutorial I will walk you through the process of recreating one of my bots. I am going to use Python so that we can get access to some common Python libraries for working with data, including pandas and seaborn. We are also going to be using the Mastodon.py library to interact with Mastodon’s API.

The first step will be creating an account for our bot and creating an app. This is all covered in this tutorial on Botwiki.

If you run into any issues during this step, or at any other point, be sure to join other creative botmakers over at botmakers.org. And you can find the finished project files on my GitHub.

Once that’s taken care of, it’s time to set up a structure for our project.

I’m going to start by creating four blank Python files.

  • bot.py — this will be our main script that brings everything together
  • mastodon_helper.py — here we can add a few helper functions to keep our main bot.py script tidier
  • get_data.py — in here we will load and process the data for our chart
  • make_chart.py — here we will create a chart from our data

Splitting the code into parts will make it easier to update and maintain it as we add more features.

Before we start writing code, let’s install a few dependencies.

pip3 install Mastodon.py seaborn matplotlib numpy pandas python-dotenv

And then let’s save our dependencies so that we can install them again when we need to.

pip3 freeze > requirements.txt

To reinstall the required packages later, you can run:

pip3 install -r requirements.txt

Note that I’ve added the python-dotenv package to help us manage our API keys from the first step.

There are a few other ways to handle API keys depending on how you’re hosting the bot, and I will touch on this a bit more later on. For now, let’s create a file called .env, with a dot at the beginning of the file name.

And in this file we will put the URL of the instance our bot will be posting on and also the access token from our Mastodon app.

MASTODON_INSTANCE="https://botsin.space"
MASTODON_ACCESS_TOKEN="123-ABCDEFGHIJKLMNOP"

Now, let’s make our bot post out some text to make sure things are working correctly. We’ll start with mastodon_helper.py.

import os
from mastodon import Mastodon
from dotenv import load_dotenv

load_dotenv()

mastodon = Mastodon(
    access_token=os.environ.get("MASTODON_ACCESS_TOKEN"),
    api_base_url=os.environ.get("MASTODON_INSTANCE"),
    request_timeout=100,
)

def post(text):  
    response = mastodon.status_post(text)
    return response

And then in bot.py:

from mastodon_helper import *

post("Hello world!")

Now you can run python3 bot.py and behold, our bot’s first post.

A test post from a Mastodon bot, saying

If you’re not seeing the post, make sure the values saved in your .env file are correct. You can also print the response from the API and check for any helpful error messages.

from mastodon_helper import *

response = post("Hello world!")
print(response)
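If the post still doesn’t show up, one thing worth ruling out is a missing or misspelled variable in your .env file. Here is a minimal sketch of such a check. The check_settings helper is hypothetical (it is not part of Mastodon.py) and only assumes the two variable names used above.

```python
import os

# The two settings our .env file is expected to define.
REQUIRED_KEYS = ("MASTODON_INSTANCE", "MASTODON_ACCESS_TOKEN")

def check_settings(env=os.environ):
    """Return the names of required settings that are missing or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]

missing = check_settings()
if missing:
    print("Missing settings:", ", ".join(missing))
```

You would call this after load_dotenv(), before creating the Mastodon client.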

We’re able to post some text, great work so far! Let’s see if we can also upload an image.

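This will be a three-step process, as we need to do the following:

  • upload the image to Mastodon
  • add a description (alt text) to the uploaded image
  • publish a post with the image attached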

Our updated mastodon_helper.py will look like this. Note that I added a few print function calls so that we can see what our bot is doing.

import os
from mastodon import Mastodon
from dotenv import load_dotenv

load_dotenv()

mastodon = Mastodon(
    access_token=os.environ.get("MASTODON_ACCESS_TOKEN"),
    api_base_url=os.environ.get("MASTODON_INSTANCE"),
    request_timeout=100,
)

def post(text):  
    response = mastodon.status_post(text)
    print("posted")
    return response


def upload_image(filename, alt_text, text):
    try:
        print("uploading media to mastodon")
        media = mastodon.media_post(filename)

        print("adding description")
        mastodon.media_update(media, description=alt_text)

        print("ready to post")
        response = mastodon.status_post(text, media_ids=media)

        print("posted")
        return response
    except Exception as error:
        # Show what went wrong instead of failing silently
        print(f"failed: {error}")
        return None

Let’s use the seaborn package to create our image with one of the built-in datasets. I’ll go with “penguins”.

Here’s one way our get_data.py script can work. Let’s use a dataset dictionary that will contain the data itself, a description of the data that we can use for the alt text, and a short title to use as the text of the post.

The resulting code would look like this.

import seaborn as sns

def get_data():
    dataset = {}
    dataset["data"] = sns.load_dataset("penguins")
    dataset["title"] = "Learning about penguins: https://github.com/mwaskom/seaborn-data/blob/master/penguins.csv"
    dataset["description"] = "A series of charts exploring data about penguins."
    return dataset

Let me pause here for a moment. The alt text should be a lot more descriptive than the example I’m using here for testing. The chart itself is a bit much altogether, but this will all suffice for a little demo to make sure everything works.
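As a rough sketch of how a more descriptive alt text could be assembled from the data itself, here is one option. The describe_rows helper is hypothetical, and the sample rows are made up. It works on plain dictionaries, which you can get from a pandas DataFrame with to_dict("records").

```python
from collections import Counter

def describe_rows(rows, group_key="species"):
    """Summarize how many rows fall into each group, as plain text."""
    counts = Counter(row[group_key] for row in rows)
    groups = ", ".join(f"{count} {name}" for name, count in counts.most_common())
    return f"Pair plot comparing {len(rows)} penguins by {group_key}: {groups}."

# A tiny hand-made sample; the real rows could come from
# sns.load_dataset("penguins").to_dict("records").
sample = [{"species": "Adelie"}, {"species": "Adelie"}, {"species": "Gentoo"}]
print(describe_rows(sample))
```

A summary like this is only a starting point. A good description would also mention what the chart actually shows, such as visible trends or outliers.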

Next we will turn the data into a chart inside make_chart.py.

import seaborn as sns

def make_chart(data):
    filename = "chart.png"
    plot = sns.pairplot(data, hue="species")
    # savefig on the returned grid writes the whole figure to disk
    plot.savefig(filename)
    return filename

Now we can use the make_chart and upload_image functions in our main bot.py script.

from mastodon_helper import *
from get_data import *
from make_chart import *

dataset = get_data()
chart = make_chart(dataset["data"])
response = upload_image(chart, dataset["description"], dataset["title"])
# print(response)

And here’s our chart posted on Mastodon.

A screenshot of a chart posted by our bot.

If you’re not seeing your chart posted, check your terminal for any errors.

At this point we have a nicely structured code that works pretty well. It’s time to use some more interesting data, and then we can look into automating our bot.

Let’s try recreating one of the two bots that I introduced in the beginning.

First, the original source for @last100bills is here. You can see that the bot was written in JavaScript. Here’s the important part.

const datasetUrl = 'https://www.govtrack.us/api/v2/bill?order_by=-current_status_date',
      datasetName = 'Last 100 bills in the US government',
      datasetLabels = ['group', 'value'];

We can see that the bot loads data from govtrack.us. According to the About page, most of this data updates daily, which is perfect: we can simply set up our bot to run once a day and pull data from the same URL.

After playing a bit with the code, this is what I came up with for get_data.py:

import json
from urllib.request import urlopen

def get_data():
    url = "https://www.govtrack.us/api/v2/bill?order_by=-current_status_date" 
    response = urlopen(url) 
    data_json = json.loads(response.read())

    statuses = [item["current_status"] for item in data_json["objects"]]

    dataset = {} 
    dataset["data"] = { 
        "Introduced": statuses.count("introduced"), 
        "Passed House": statuses.count("pass_over_house"), 
        "Passed House & Senate": statuses.count("passed_bill"), 
        "Concurrent Resolution": statuses.count("passed_concurrentres"),
        "Simple Resolution": statuses.count("passed_simpleres"), 
        "Ordered Reported": statuses.count("reported"), 
        "Enacted": statuses.count("enacted_signed"), 
    }

    dataset["title"] = "Analyzing the last 100 bills introduced in the US congress"

    data = dataset["data"]
    dataset["description"] = "\n".join(f"{label}: {count}" for label, count in data.items())

    return dataset

Shoutout to cbunn81 for helping me simplify the code above. And here’s the code from make_chart.py.

import matplotlib.pyplot as plt
import seaborn as sns

def make_chart(data):
    filename = "chart.png"

    sns.set_style("darkgrid")
    # Label each bar with its category name and count
    labels = [f"{label}: {count}" for label, count in data.items()]
    sns.barplot(x=list(data.values()), y=labels, orient="h")
    plt.tight_layout()
    plt.savefig(filename)

    return filename

The result looks pretty good!
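If you want to try the counting logic from get_data.py without calling the API every time, here is a small sketch that runs on a hand-made payload. The count_statuses helper and the sample data are hypothetical. The only assumption about the API response is the shape the code above already relies on, an "objects" list whose items have a "current_status" field.

```python
from collections import Counter

# Labels copied from get_data.py; the sample payload below is made up.
STATUS_LABELS = {
    "introduced": "Introduced",
    "pass_over_house": "Passed House",
    "passed_bill": "Passed House & Senate",
    "passed_concurrentres": "Concurrent Resolution",
    "passed_simpleres": "Simple Resolution",
    "reported": "Ordered Reported",
    "enacted_signed": "Enacted",
}

def count_statuses(payload):
    """Count bills per status label, keeping the label order above."""
    counts = Counter(item["current_status"] for item in payload["objects"])
    return {label: counts[status] for status, label in STATUS_LABELS.items()}

sample = {"objects": [
    {"current_status": "introduced"},
    {"current_status": "introduced"},
    {"current_status": "reported"},
]}
print(count_statuses(sample))
```

Statuses that don’t appear in the payload simply get a count of zero, the same as with the list-based counting above.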

Deploying your bot

Now, in order for our bot to post on its own, we need to schedule the bot.py script to run periodically. There’s a list of options for this over on Botwiki and going through all of them is way beyond the scope of this tutorial, but let me quickly show you how you can host your bot for free on Pipedream.

After you sign up for an account, add your Mastodon instance URL and token in the settings.

A screenshot from Pipedream.com settings page with the
A screenshot from Pipedream.com settings page showing a field for input environmental variables.

If you want to make multiple bots, you can differentiate their API keys by adding the bot’s name to each key’s name, for example:

MASTODON_INSTANCE="https://botsin.space"

MASTODON_ACCESS_TOKEN_BOT_1="123-ABCDEFGHIJKLMNOP"
MASTODON_ACCESS_TOKEN_BOT_2="456-IJKLMNOPQRSTUVWX"
MASTODON_ACCESS_TOKEN_BOT_3="789-RSTUVWXYZABCDEFG"

Then you will also have to change the key names in your code.
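For example, a small lookup function could pick the right token for each bot. This is only a sketch: the token_for helper is hypothetical and assumes the MASTODON_ACCESS_TOKEN_BOT_ naming pattern shown above.

```python
import os

def token_for(bot_number, env=os.environ):
    """Look up the access token for one bot, e.g. MASTODON_ACCESS_TOKEN_BOT_2."""
    key = f"MASTODON_ACCESS_TOKEN_BOT_{bot_number}"
    token = env.get(key)
    if not token:
        # Fail early with the exact variable name that is missing
        raise KeyError(f"{key} is not set")
    return token
```

In mastodon_helper.py you could then pass access_token=token_for(1) when creating the Mastodon client.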

Once your API keys are saved, head over to your workflow dashboard and create a new workflow.

Screenshot of the Pipedream workflows dashboard with the New button highlighted.

Choose Schedule as a trigger for the bot. For now let’s go with the Custom Interval option to test things out.

Screenshot of a new workflow trigger being added in Pipedream.

Let’s use a short trigger interval, something like 30 seconds.

Screenshot of a new workflow trigger being customized in Pipedream.

Next, we’ll add a Python code block that will run every time our scheduler is triggered.

Screenshot of a new workflow step being added in Pipedream.
Screenshot of a new workflow step being added in Pipedream with Python code block highlighted.
Screenshot of a new workflow step being added in Pipedream with Python code block being edited.

Now we can start adding our bot scripts, splitting them into multiple steps. A few things will need to be adjusted. For example, we will need to use the tmp folder for any files we want to save. Have a look at the finished workflow here.
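One way to handle the file location is to build the chart’s path from the system’s temporary folder instead of hard-coding it. This is a sketch under the assumption that the hosting environment exposes a writable temporary directory (on Linux this is typically /tmp). The chart_path helper is hypothetical.

```python
import os
import tempfile

def chart_path(filename="chart.png"):
    """Return a path for the chart inside the system's temporary folder."""
    return os.path.join(tempfile.gettempdir(), filename)

print(chart_path())
```

Inside make_chart you would then set filename = chart_path() and keep the rest of the function unchanged.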

Before you enable your workflow, be sure to change the trigger frequency. To get an idea of how often your bot should post, have a look at this guide on Botwiki, and review Mastodon’s documentation of their API rate limits. And if you’re using the free Pipedream plan, be aware that you are limited to 333 daily invocations, which should be plenty for most bots.
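To put the invocation limit in perspective, here is a bit of quick arithmetic. The helper names are made up for illustration, and the only input taken from above is the figure of 333 daily invocations.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60

def min_interval_seconds(daily_invocations):
    """Shortest interval that stays within a daily invocation budget."""
    return math.ceil(SECONDS_PER_DAY / daily_invocations)

def runs_per_day(interval_hours):
    """How many times a bot runs per day at a fixed interval in hours."""
    return 24 / interval_hours

print(min_interval_seconds(333))
print(runs_per_day(6))
```

In other words, a single bot could run roughly every four and a half minutes before hitting the limit, while a bot posting every six hours uses only four invocations a day.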
