API call scheduler for Python

Question

My favourite 3rd party library isn't getting maintained so now I need to make my own library for interfacing with Riot Games' API. Epic.

Problem is that there are rules, such as:

100 requests in 2 minutes
20 requests in 1 second

So I made an API request scheduler.

import asyncio
import time
from collections import deque
from typing import Deque

import httpx


class SingleRegionCoordinator:
    WAIT_LENIENCY = 2

    def __init__(self, subdomain: str) -> None:
        self.subdomain = subdomain
        self.calls: Deque[float] = deque()
        self.base_url = f"https://{subdomain}.api.riotgames.com"
        self.force_wait = 0

    def _schedule_call(self) -> float:
        """
        Schedule an API call. This call must be atomic (call finishes before
        being called by another coroutine).

        Returns:
            float: The time in seconds to wait to make the API call
        """
        now = time.time()
        # Remove all calls older than 2 minutes
        while self.calls and self.calls[0] < now - 120:
            self.calls.popleft()
        # Figure out how long to wait before there will be less than 100 calls
        # in the last 2 minutes worth of requests
        rate_1s_time, rate_2m_time = 0, 0
        if len(self.calls) >= 100:
            rate_2m_time = (
                self.calls[-100] + 120 + SingleRegionCoordinator.WAIT_LENIENCY
            )
        # Figure out how long to wait before there will be less than 20 calls
        # in the last second worth of requests
        if len(self.calls) >= 20:
            rate_1s_time = (
                self.calls[-20] + 1 + SingleRegionCoordinator.WAIT_LENIENCY
            )
        scheduled_time = max(self.force_wait, rate_2m_time, rate_1s_time, now)
        self.calls.append(scheduled_time)
        return scheduled_time - now

    async def _api_call(
        self, method: str, path: str, params: dict = None
    ) -> dict:
        """
        Make an API call

        Args:
            method: The HTTP method to use
            path: The path to the API endpoint
            params: The parameters to pass to the API endpoint

        Returns:
            dict: The API response
        """
        # Schedule the call
        wait_time = self._schedule_call()
        await asyncio.sleep(wait_time)

        url = f"{self.base_url}{path}"
        headers = {"X-Riot-Token": "code edited for codereview"}
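        # httpx.request is synchronous; awaiting needs an AsyncClient
        async with httpx.AsyncClient() as client:
            response = await client.request(
                method, url, headers=headers, params=params
            )
        res = response.json()

        # Check if we got a rate limit error. Successful responses carry no
        # "status" key, so inspect the HTTP status code instead.
        if response.status_code == 429: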
            # Let the scheduler know that we are in trouble
            self.force_wait = (
                time.time() + 120 + SingleRegionCoordinator.WAIT_LENIENCY
            )
            return await self._api_call(method, path, params)
        return res

We can see it in action (sort of) with some quick test code that shows what it tries to achieve:

import numpy as np

# Not recommended in practice but makes this example clearer
SingleRegionCoordinator.WAIT_LENIENCY = 0

src = SingleRegionCoordinator("")
for _ in range(120):
    src._schedule_call()

arr = np.array(src.calls)
# arr holds the scheduled time of every call; subtract the earliest one so
# the values read as offsets in seconds from the first call
arr -= arr.min()

print(arr)

Output:

[  0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.   0.
   0.   0.   0.   0.   0.   0.   1.   1.   1.   1.   1.   1.   1.   1.
   1.   1.   1.   1.   1.   1.   1.   1.   1.   1.   1.   1.   2.   2.
   2.   2.   2.   2.   2.   2.   2.   2.   2.   2.   2.   2.   2.   2.
   2.   2.   2.   2.   3.   3.   3.   3.   3.   3.   3.   3.   3.   3.
   3.   3.   3.   3.   3.   3.   3.   3.   3.   3.   4.   4.   4.   4.
   4.   4.   4.   4.   4.   4.   4.   4.   4.   4.   4.   4.   4.   4.
   4.   4. 120. 120. 120. 120. 120. 120. 120. 120. 120. 120. 120. 120.
 120. 120. 120. 120. 120. 120. 120. 120.]

Works as intended. It greedily schedules API calls as soon as possible, but respects the rules.

I know that:

This solution won't horizontally scale.
If the application is closed and reopened, it's going to run into a little trouble (though it can resolve by itself).
Exponential backoff is the typical go-to solution for error handling with APIs. However, I find it to be no good with rate limit scenarios, as batching, say, 1000 requests would eventually lead you to be waiting way more time than you need. This is supposed to be the optimal scheduler.
There are additional techniques such as caching requests.

My main questions are:

What are the general best practices when it comes to dealing with API rate limiting? This solution works in my mind but I have little idea how it's actually going to behave in the wild.
Is the code style fine? I now use the Black autoformatter with a line length of 79 plus import ordering (built-in, external, internal, alphabetical). Is a line length of 79 a thing of the past, or is it not important? I like short lines because I can easily put multiple scripts side by side.
Is -> dict OK? I really like TypeScript's interface, where I could do something like:

interface SomethingDTO {
    id: string
    data: Array<SomethingElseDTO>
}

Instead now I have to write classes and... you know. A lot of work and honestly, hardly worth the effort either. Or am I delusional? Even this suggested alternative is not that good since it doesn't deal with nested objects. Some API responses are also just a straight PITA to typehint.

Reinderien · Accepted Answer · 2021-12-05 00:13:47Z

Don't use time.time; use time.monotonic - otherwise, certain OS time changes are going to deliver a nasty surprise.

Make a constant for your 120 seconds.
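As a rough sketch of both points, the scheduling logic could look something like the following. The class name, the constant names and the injectable clock argument are my own placeholders, not anything from your code or from Riot; the limits are the ones you quoted.

```python
import time
from collections import deque
from typing import Callable, Deque

# Hypothetical constant names; the limits themselves come from the question
LONG_WINDOW_SECONDS = 120.0
LONG_WINDOW_CALLS = 100
SHORT_WINDOW_SECONDS = 1.0
SHORT_WINDOW_CALLS = 20


class CallScheduler:
    """Sliding-window scheduler with an injectable monotonic clock."""

    def __init__(
        self,
        leniency: float = 2.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.leniency = leniency
        self.clock = clock
        self.calls: Deque[float] = deque()

    def schedule_call(self) -> float:
        """Record a call and return the seconds to wait before making it."""
        now = self.clock()
        # Forget calls that have already left the long window
        while self.calls and self.calls[0] < now - LONG_WINDOW_SECONDS:
            self.calls.popleft()
        earliest = now
        for limit, window in (
            (LONG_WINDOW_CALLS, LONG_WINDOW_SECONDS),
            (SHORT_WINDOW_CALLS, SHORT_WINDOW_SECONDS),
        ):
            if len(self.calls) >= limit:
                # The call `limit` places back must leave the window first
                candidate = self.calls[-limit] + window + self.leniency
                earliest = max(earliest, candidate)
        self.calls.append(earliest)
        return earliest - now
```

Passing the clock in also lets you test the schedule with a frozen clock, e.g. CallScheduler(leniency=0.0, clock=lambda: 1000.0), instead of depending on wall time.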

You ask:

Is -> dict OK?

Not really. This:

params: dict = None

is first of all incorrect since it would need to be Optional[dict], which is basically Optional[Dict[Any, Any]]. Setting aside the outer Optional, in decreasing order of type strength, your options are roughly:

TypedDict
Dict[str, str] if all of your values are strings but you don't enforce key names
Dict[str, Union[...]] if you know of a value type set and don't enforce key names
Dict[str, Any] if you have no idea what the values are
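To make the first options concrete, here is a minimal sketch that mirrors your TypeScript interface with TypedDict (Python 3.8+). The name field on SomethingElseDTO, the Params alias and the summarise function are invented for illustration. Note that TypedDicts nest without trouble.

```python
from typing import Dict, List, Optional, TypedDict, Union


class SomethingElseDTO(TypedDict):
    name: str  # hypothetical field, not from any real API


class SomethingDTO(TypedDict):
    id: str
    data: List[SomethingElseDTO]


# Query parameters: string keys, a known set of value types
Params = Dict[str, Union[str, int]]


def summarise(dto: SomethingDTO, params: Optional[Params] = None) -> str:
    """Toy consumer; a type checker verifies key names and value types."""
    names = ", ".join(item["name"] for item in dto["data"])
    suffix = f" (params: {sorted(params)})" if params else ""
    return f"{dto['id']}: {names}{suffix}"


# At runtime this is still a plain dict; only the checker sees the shape
response: SomethingDTO = {"id": "abc", "data": [{"name": "x"}, {"name": "y"}]}
print(summarise(response))  # abc: x, y
```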

That's all assuming that you're stuck using a dictionary. Keep in mind that all of the above hinting is hinting only, and is not enforced at runtime. If you want meaningful runtime structure (which, really, you should) then move to @dataclasses: constructing one fails on missing or unexpected field names, although field types are still not checked unless you validate them yourself, for example in __post_init__. They're really not a PITA; they're basically the lightest-weight class definition mechanism and have a convenience asdict which makes API integration a breeze. This is effort worth investing if you care at all about program correctness.
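A minimal sketch of the dataclass route, again using your SomethingDTO shape; the name field and the from_json helper are assumptions of mine for illustration, not part of any library. It also shows one way of dealing with nested objects.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class SomethingElseDTO:
    name: str  # hypothetical field


@dataclass(frozen=True)
class SomethingDTO:
    id: str
    data: List[SomethingElseDTO]

    @classmethod
    def from_json(cls, raw: Dict[str, Any]) -> "SomethingDTO":
        """Build the nested structure from a decoded JSON object.

        A missing top-level key raises KeyError; a missing or unexpected
        item field raises TypeError from the dataclass constructor.
        """
        items = [SomethingElseDTO(**item) for item in raw["data"]]
        return cls(id=raw["id"], data=items)


raw = {"id": "abc", "data": [{"name": "x"}, {"name": "y"}]}
dto = SomethingDTO.from_json(raw)
print(dto.data[1].name)  # y
print(asdict(dto) == raw)  # True: asdict recurses into nested dataclasses
```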
