Python Twitch API Tutorial

This article will serve as a brief introduction to the integration of Python and the Twitch API. For the API documentation, visit https://dev.twitch.tv/docs/. This specific implementation will take place in the Jupyter Notebook environment.


Step 1 - Register App

The first step is to create a Twitch account, which is required to register an app and receive the credentials needed to interact with the API. Once you have an account, go to the dev console on Twitch and register the app. Give it whatever name you want and set the redirect URL to http://localhost for local testing. This is the URL a user is sent back to after approving a user token request, and it can be changed at a later time.

Once you have the app registered, two values matter: the client ID and the client secret. Together they identify your app and grant it permission to use the API, so they should never be publicly visible.

To keep these credentials hidden, create an empty .py file in the project directory and name it something like secret or credentials (the latter is what I will use in this example). If the project is stored in a public repository, make sure to add this file to the .gitignore. Within the .py file, add the following lines, replacing the x's with the client ID and secret strings generated for your app.

client_id = 'xxxxxxxxxxxxxxxxxxxxx'
secret = 'xxxxxxxxxxxxxxxxxxxxx'

Import these variables into the Python environment.
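
In a notebook, a plain `from credentials import client_id, secret` is usually enough. As a slightly more defensive sketch, assuming the file is named credentials.py and sits next to the notebook, the hypothetical helper `load_credentials` below (not part of any library) fails early with a clear message if either value is missing:

```python
import importlib


def load_credentials(module_name="credentials"):
    """Return (client_id, secret) read from the local credentials module."""
    module = importlib.import_module(module_name)
    client_id = getattr(module, "client_id", "")
    secret = getattr(module, "secret", "")
    if not client_id or not secret:
        # Catch an empty or incomplete file before any request is made.
        raise ValueError(f"{module_name}.py must define client_id and secret")
    return client_id, secret
```

Calling `client_id, secret = load_credentials()` then makes both values available to the rest of the notebook without ever printing them.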


Step 2 - Authentication

Different pieces of information require different forms of authentication: requesting public information requires an app access token, while personal user information requires a user access token. For this example I just want generic public information on channels, so according to the documentation I should use the OAuth client credentials flow. Apps that require special user permissions would instead use one of the user access token flows, which is where the redirect URL and the necessary scopes come in.

OAuth client credentials flow:

POST https://id.twitch.tv/oauth2/token
    ?client_id=<your client ID>
    &client_secret=<your client secret>
    &grant_type=client_credentials
    &scope=<space-separated list of scopes>
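
The flow above could be sketched in Python roughly as follows. This version uses only the standard library (the `requests` package would work just as well), and the function names are hypothetical. It also makes two assumptions: the parameters are sent as a form-encoded POST body rather than in the query string, and `scope` is left out because public channel data should not need one.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://id.twitch.tv/oauth2/token"


def build_token_request(client_id, client_secret):
    """Build the POST request for the OAuth client credentials flow."""
    form = urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "grant_type": "client_credentials",
    })
    # Assumption: parameters travel in a form-encoded body, not the URL.
    return urllib.request.Request(
        TOKEN_URL, data=form.encode("utf-8"), method="POST"
    )


def get_app_access_token(client_id, client_secret):
    """Send the request and return the access_token field of the reply."""
    request = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["access_token"]
```

With the credentials imported earlier, `token = get_app_access_token(client_id, secret)` would return the token string used in the next step.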

Step 3 - Make Requests

Now that we have an access token, we can make requests. Twitch has deprecated the older Kraken (v5) API, so Helix is the version that should be used. Whenever a request is made, the client ID and the access token need to be passed as headers. The documentation lists the types of queries that can be made; this tutorial uses the users and streams endpoints.

Some queries require special permissions or scopes and need the authorization of the user. Note that the API has a rate limit of 800 points per minute (requests by token).

For example, to look up a specific channel by username, we can call the ./helix/users endpoint with login=gmhikaru.
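
A minimal sketch of such a request function is shown here, again using only the standard library. The names `build_helix_request` and `helix_get` are hypothetical, and the sketch assumes the token is sent as a Bearer token in the Authorization header alongside a Client-Id header.

```python
import json
import urllib.parse
import urllib.request

HELIX_BASE = "https://api.twitch.tv/helix"


def build_helix_request(endpoint, params, client_id, access_token):
    """Build a GET request for a Helix endpoint such as "users"."""
    url = f"{HELIX_BASE}/{endpoint}?{urllib.parse.urlencode(params)}"
    headers = {
        "Client-Id": client_id,
        "Authorization": f"Bearer {access_token}",
    }
    return urllib.request.Request(url, headers=headers)


def helix_get(endpoint, params, client_id, access_token):
    """Send the request and return the decoded JSON body as a dict."""
    request = build_helix_request(endpoint, params, client_id, access_token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

The channel lookup described above would then be `helix_get("users", {"login": "gmhikaru"}, client_id, token)`, with the matching records found under the "data" key of the result.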

Using this function we can begin collecting a dataset. But first we will need to generate a list of streams to get data on.


Step 4 - Collect Data

The JSON data that is returned can easily be converted into a structured dataset using pandas. First, a loop calls the ./helix/streams endpoint, which returns up to 100 results per page. Then the cursor from each response is passed along recursively to move to the subsequent page. Storing each page in a list and then unwrapping the list of pages gives a sequential list of the top currently live streams, sorted by viewer count.
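
The pagination logic might look something like the sketch below. It uses a bounded loop instead of recursion, which is a substitution made here for simplicity. `collect_streams` and its `fetch_page` argument are hypothetical: `fetch_page` stands for any function that takes a dict of query parameters, calls ./helix/streams, and returns the decoded JSON response. The sketch assumes the response holds the streams under "data" and the next cursor under "pagination".

```python
def collect_streams(fetch_page, max_pages=10):
    """Follow pagination cursors and return one flat list of streams."""
    pages = []
    params = {"first": 100}  # ask for 100 results per page
    for _ in range(max_pages):
        body = fetch_page(params)
        pages.append(body.get("data", []))
        cursor = body.get("pagination", {}).get("cursor")
        if not cursor:
            break  # no cursor means there is no further page
        params = {"first": 100, "after": cursor}
    # Unwrap the list of pages into a single sequential list.
    return [stream for page in pages for stream in page]
```

The `max_pages` cap keeps the number of requests bounded, since the list of live streams can run to many pages.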

Note that it is important to limit the frequency of requests, both to avoid placing undue network burden on the server you are requesting from and to avoid hitting the rate limit for your token or IP address, which results in an error status code (such as 429 Too Many Requests).
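
One simple way to space requests out is to wrap the request function so that consecutive calls are always a minimum interval apart. The wrapper below is a hypothetical sketch, and the half-second default is an arbitrary choice rather than a value taken from the Twitch documentation.

```python
import time


def throttled(func, min_interval=0.5, sleep=time.sleep, clock=time.monotonic):
    """Wrap func so consecutive calls are at least min_interval seconds apart."""
    last_call = [None]  # mutable cell holding the time of the previous call

    def wrapper(*args, **kwargs):
        if last_call[0] is not None:
            wait = min_interval - (clock() - last_call[0])
            if wait > 0:
                sleep(wait)  # pause only for the time still remaining
        last_call[0] = clock()
        return func(*args, **kwargs)

    return wrapper
```

Any request function can be wrapped once, for example `slow_get = throttled(my_request_function)`, and then used in place of the original inside the collection loop.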

Now that the data exists as a list of dictionaries, it is as simple as calling pd.DataFrame() to convert it into a usable dataset.
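
As a small illustration, the records below are made-up placeholders shaped like entries from the streams response. Real responses contain more fields, and the values here are not real data.

```python
import pandas as pd

# Hypothetical records; real entries from ./helix/streams have more keys.
streams = [
    {"user_login": "channel_a", "game_name": "Chess", "viewer_count": 1200},
    {"user_login": "channel_b", "game_name": "Just Chatting", "viewer_count": 800},
]

# Each dictionary becomes one row and each key becomes one column.
df = pd.DataFrame(streams)
```

From here the usual pandas tools apply, for example `df.sort_values("viewer_count", ascending=False)` to rank the streams.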