GitHub Action API Gotchas
As mentioned in the previous post, there are some specific details that one needs to be careful about when it comes to using the GitHub Actions API. As promised, here they are! Now, to be fair, we've been working hard on building Terrateam, so it's entirely possible some of the items on this list are there because I missed something in the documentation. I'm pretty sure they are accurate, but if I made a mistake, shoot me an email at firstname.lastname@example.org and I'll update this post (and give you credit, of course).
We try to make our error messages informative enough that we can tell our users how to resolve the issue. Before running an action, we want to know whether the action is properly set up in the repository. To do that, GitHub has an API to list the workflows (these define how to run the action) in the repository. But GitHub Actions workflows are per-branch, not per-repository, and the API call does not let you query a particular branch. That means a workflow can be listed in the repository, but when we publish the workflow_dispatch event for a branch, it can still fail because the workflow file has been deleted on that branch.
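One way to narrow this gap is to ask the repository contents endpoint whether the workflow file exists on the branch in question before dispatching. The sketch below is an illustration of that idea, not a description of what Terrateam actually does: the function name, the `opener` parameter (there only so the HTTP call can be swapped out), and the example workflow path are all assumptions.

```python
import urllib.error
import urllib.parse
import urllib.request

API = "https://api.github.com"


def workflow_exists_on_branch(owner, repo, workflow_path, branch, token,
                              opener=urllib.request.urlopen):
    """Return True if workflow_path exists on the given branch, False on a 404.

    Hypothetical helper: it queries the repository contents endpoint with a
    ?ref= parameter, because the workflow listing cannot be filtered by branch.
    """
    query = urllib.parse.urlencode({"ref": branch})
    url = (f"{API}/repos/{owner}/{repo}/contents/"
           f"{urllib.parse.quote(workflow_path)}?{query}")
    request = urllib.request.Request(url, headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    })
    try:
        with opener(request) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # The file is missing on this branch (or the branch does not exist).
            return False
        raise
```

Even with a check like this, the file could be deleted between the check and the dispatch, so it only improves the error message in the common case.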
When we get the event about an update to a pull request, we get the commit SHA for the branch. Ideally, we'd like to run the GitHub Action on that specific SHA. But the API for publishing a workflow_dispatch event does not take a SHA, only a branch (or tag) name. That means between getting the event to run the action and the action finally executing, the pull request could be updated again. It's unlikely, but all it takes to get into that state is a few delays in publishing webhooks: someone realizes they made a mistake, quickly updates the branch, and pushes again.
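To make the limitation concrete, here is a minimal sketch of building the dispatch request with Python's standard library. The helper name and the example values are assumptions. The point is the request body: it carries a `ref` (and optional `inputs`), with no field for a commit SHA.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, workflow_file, branch, token, inputs=None):
    """Build (but do not send) a workflow_dispatch request.

    Hypothetical helper for illustration. The body only has room for a ref
    and workflow inputs, so the run uses whatever commit the branch points
    at when the workflow starts.
    """
    url = (f"https://api.github.com/repos/{owner}/{repo}"
           f"/actions/workflows/{workflow_file}/dispatches")
    body = {"ref": branch}
    if inputs:
        body["inputs"] = inputs
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
```

Sending it would be a matter of passing the returned object to `urllib.request.urlopen`.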
To address this, the first thing our action does is determine what SHA it's got in its checkout and then call back to the Terrateam server saying "hey, I got SHA $SHA, is that the one I'm supposed to run on?" The server can respond "yes, keep going" or "no, abort, abort!"
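The handshake described above can be sketched in two small pieces: the action side reads the SHA of its checkout, and the server side compares it with the SHA it expected. This is a simplified illustration, not Terrateam's code. The function names and the "continue"/"abort" strings are assumptions, and the network call between the two is left out.

```python
import subprocess


def current_checkout_sha(repo_dir="."):
    """Action side: return the commit SHA checked out in repo_dir."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def server_verdict(expected_sha, reported_sha):
    """Server side: decide whether the action should keep going.

    expected_sha is the SHA from the pull request event that triggered the
    dispatch; reported_sha is what the action found in its checkout.
    """
    if reported_sha.strip().lower() == expected_sha.strip().lower():
        return "continue"
    return "abort"
```

When the verdict is "abort", the newer push should produce its own event and its own run, so nothing is lost by stopping early.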
When we do start a GitHub Action, it would be nice to get the ID of the run so we can track it in the Terrateam server. That way, if it unexpectedly fails, we can query it and take appropriate action. The downside is that the API to publish a workflow_dispatch event only returns success or failure, not any information about the action. We try to address this by having the action call back home with its run information as the very first thing it does. For the most part this works fine, but we've hit some edge cases where the action fails in the setup phase, before it can call home. There isn't a really foolproof solution to this, just a timeout heuristic.
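A timeout heuristic of this kind could look roughly like the following. It is only a sketch under assumed names (`DispatchedRun`, `record_callback`, `presumed_lost`) and an arbitrary five-minute default. The post does not say how Terrateam implements its heuristic or what timeout it uses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DispatchedRun:
    work_id: str                  # server-side identifier for the dispatched work
    dispatched_at: float          # when the dispatch was published (epoch seconds)
    run_id: Optional[int] = None  # filled in once the action calls home


def record_callback(run: DispatchedRun, run_id: int) -> None:
    """Store the run ID that the action reported when it called home."""
    run.run_id = run_id


def presumed_lost(runs: List[DispatchedRun], now: float,
                  timeout_seconds: float = 300.0) -> List[DispatchedRun]:
    """Return runs that never called home within the timeout.

    These are the candidates for "failed during setup"; the caller decides
    whether to retry them or report an error.
    """
    return [
        run for run in runs
        if run.run_id is None and now - run.dispatched_at > timeout_seconds
    ]
```

The trade-off is the usual one: a short timeout flags slow-starting runs as failures, and a long one delays the error message.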
Those three gotchas add some complexity to the implementation of Terrateam that I wish we didn't have to handle, but that's life, and at least they all have solutions. All APIs, including Terrateam's, have rough edges.