In this article, I’ll explain how we handle community pull requests while avoiding storing Algolia credentials in Travis’s secured environment variables. It requires a bit of time, but for a SaaS company maintaining open source projects, we believe it’s worth it.
First, it’s important to understand that the only way to have a green build for community pull requests is to make the credentials publicly available. It sounds like a bad idea, but I will show you how to do it safely.
Before we start, let’s summarize the issue we want to solve.
Algolia maintains more than 10 open source API clients, and most of them run tests against our API. These tests need credentials to contact the service, so following Travis’s best practices, we store them as encrypted environment variables in our .travis.yml. To test our API clients, we use Algolia API keys that are not read-only, because the tests also insert data into Algolia indices. Thus they can’t be made available outside of our own builds (otherwise people might use the keys to create a lot of objects and indices in our API).
Encrypted environment variables work as follows: you use the travis gem to add encrypted variables to your .travis.yml. Travis then decrypts them automatically at build time, but only for builds triggered by contributors with write access to the repository. They are not decrypted for forks, which means external contributors’ PRs never see them.
The screenshot above illustrates the problem: maintainers’ PRs are green and community PRs are red, not because they’re doing something bad, but because in this context, Travis is not able to call Algolia.
Terrible experience for contributors
At Algolia we care a lot about developer experience. Everything, from our REST API to all the ecosystems we built on top of it, is focused on providing the best developer experience for our users. So when our users are willing to contribute to our open source projects and their experience is less than ideal, it can be…frustrating.
Imagine you took some of your free time to fix a bug or refactor something in a library you use and the first feedback you get from the build system is more or less: “Tests are failing for some reason, figure it out yourself”.
You might look at it and spend some time understanding what’s going on. Then, you’d realize there is nothing you can do and you just wasted some of your precious time. In short: this is terrible for contributors and will definitely not encourage them to come back to us again.
Painful for maintainers
Not only do we want to provide the best developer experience for our users, but we also want our own developers to have a great maintainer experience.
As a maintainer, I’m always hesitant to merge a contribution if I don’t see the results of the tests. Sure enough, if only the README.md was modified, it’s safe to merge. But in general, it leaves you with a feeling of doing something wrong. What I usually do is pull the contributor’s branch and run the tests locally, but this is really time-consuming.
Another solution would be to re-submit the same PR by pushing the branch to the original repository, either manually or using a script. But this means that the original PR gets closed, which complicates the process and is seen by contributors as a strange way to handle PRs.
Overall, this was so painful that we thought it was time to invest some time to find a better solution, once and for all.
First try: using temporary keys with limited capabilities
In the end, the solution was pretty straightforward: every build will use temporary credentials that will expire. We still want to avoid malicious developers using these keys, so we added a set of limitations to each key. It means that even if the key is publicly exposed and has write access to Algolia, there is so little one can do in such a limited time that it is simply not worth the effort.
On the Algolia side, these limitations are different for each repository, but each set of keys:
- Has a subset of all the available ACLs
- Expires automatically after a few minutes
- Only has access to some indices
- Has a limited number of operations per hour
- Has a small nbHits that you can’t override
- Can only be used from Travis IP addresses
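To make the list above more concrete, here is a minimal sketch of how one per-repository set of restrictions could be turned into the parameters of a key-creation request. It is an illustration under assumptions, not our actual configuration: the field names (`acl`, `indexes`, `validity`, `maxQueriesPerIPPerHour`, `maxHitsPerQuery`, `description`) follow the key parameters of Algolia’s REST API as I understand them, while `RepoKeyConfig`, `build_key_params`, and every default value are hypothetical. The IP restriction is not shown.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class RepoKeyConfig:
    """Hypothetical per-repository restrictions for a temporary key."""

    acl: List[str]                           # subset of the available ACLs
    indexes: List[str]                       # index names or patterns the key may touch
    validity_seconds: int = 300              # key expires after a few minutes
    max_queries_per_ip_per_hour: int = 1000  # cap on operations per hour
    max_hits_per_query: int = 20             # small cap on hits returned per query


def build_key_params(config: RepoKeyConfig, description: str) -> Dict[str, Any]:
    """Build the body of a key-creation request from a repository's restrictions."""
    if config.validity_seconds <= 0:
        # A key without an expiry would defeat the purpose of a temporary key.
        raise ValueError("a temporary key must expire")
    return {
        "acl": list(config.acl),
        "indexes": list(config.indexes),
        "validity": config.validity_seconds,
        "maxQueriesPerIPPerHour": config.max_queries_per_ip_per_hour,
        "maxHitsPerQuery": config.max_hits_per_query,
        "description": description,
    }
```

Keeping the restrictions in a small immutable object makes it easy to hold one configuration per repository and to review what each repository’s keys are allowed to do.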
This is how we protect our API keys given Algolia API key features. Of course, it will be different for your own API.
But wait…we realized that, even with such keys, there’s one drawback: in the case of Algolia, you’re still able to create API keys, which means that an attacker could fork one API client repository, change the testing code to create keys using our restricted keys, and still escape the protection. Back to square one: we need another layer of protection.
A proper solution: using an API key dealer
The best way we found to solve this “API keys can create other API keys” issue was to build a little server that would act as an API key dealer. It’s open (no authentication) and holds one configuration per repository (what restrictions to apply to the key). This server is responsible for creating keys on the Algolia side, and giving them back to the build system.
We plan to open source the API Key Dealer we built in the coming weeks, and will let you know when it’s out.
Because it’s a publicly available server, anyone could try to generate keys, so we check that each call is legitimate and originates from Travis’s IP ranges. In addition, when a key is requested, the client sends the build’s TRAVIS_JOB_ID, and we use it to verify that the repository is part of the Algolia organization and that the job is currently running.
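The two checks described above could look roughly like the following. This is a sketch under assumptions rather than the dealer’s real code: the `job` mapping stands in for whatever the Travis API returns for a job ID, its `repository_slug` and `state` fields and the `"started"` value are assumptions, and the CIDR ranges are passed in by the caller instead of being hard-coded.

```python
import ipaddress
from typing import Iterable, Mapping


def ip_in_ranges(remote_ip: str, allowed_cidrs: Iterable[str]) -> bool:
    """Return True if remote_ip belongs to one of the allowed CIDR ranges."""
    try:
        address = ipaddress.ip_address(remote_ip)
    except ValueError:
        # A malformed address is never allowed.
        return False
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


def job_is_legit(job: Mapping[str, str], organization: str) -> bool:
    """Check that the job belongs to the organization and is still running.

    `job` is a hypothetical, simplified view of the job looked up from its ID.
    """
    owner, _, _ = job.get("repository_slug", "").partition("/")
    return owner == organization and job.get("state") == "started"


def should_issue_key(
    remote_ip: str,
    allowed_cidrs: Iterable[str],
    job: Mapping[str, str],
    organization: str = "algolia",
) -> bool:
    """Issue a key only when both the caller's IP and the job check out."""
    return ip_in_ranges(remote_ip, allowed_cidrs) and job_is_legit(job, organization)
```

Both conditions must hold: a request from an allowed range is still refused when its job ID points to a finished job or to a repository outside the organization.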
The whole process is described in this schema:
To be able to call such an API key dealer from Travis, we needed a small client that could be reused in all our repositories.
To do so, we built a small Go script that compiles into one binary file.
This binary is downloaded before each build, so we can easily update it on all the repositories.
If you want more details, read the whole .travis.yml file.
The small binary will:
- Assess if temporary credentials are necessary
- Call the API key dealer
- Print a comment (for help and debug purposes)
- Export the credentials as environment variables
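Our client is a Go binary, but the first and last steps are simple enough to sketch in a few lines of Python. The example below is only an illustration under assumptions: the variable names `ALGOLIA_APPLICATION_ID` and `ALGOLIA_API_KEY` are hypothetical, the call to the dealer is left out, and it assumes the build evaluates the client’s output in its shell so that the `export` lines take effect.

```python
import re
import shlex
from typing import Mapping, Sequence

# Hypothetical names of the variables the test suite expects.
REQUIRED_VARS = ("ALGOLIA_APPLICATION_ID", "ALGOLIA_API_KEY")
_VALID_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def needs_temporary_credentials(
    env: Mapping[str, str], required: Sequence[str] = REQUIRED_VARS
) -> bool:
    """Temporary credentials are needed when any required variable is missing or empty."""
    return any(not env.get(name) for name in required)


def render_exports(credentials: Mapping[str, str], comment: str) -> str:
    """Render a shell snippet: one help comment, then one export line per variable."""
    # Collapse the comment to a single line so it cannot inject shell commands.
    lines = ["# " + " ".join(comment.splitlines())]
    for name in sorted(credentials):
        if not _VALID_NAME.fullmatch(name):
            raise ValueError(f"invalid environment variable name: {name!r}")
        # shlex.quote keeps values with spaces or special characters intact.
        lines.append(f"export {name}={shlex.quote(credentials[name])}")
    return "\n".join(lines)
```

Quoting the values matters here because the output is meant to be interpreted by a shell, and the credentials come from a remote server.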
Challenge: testing advanced admin permissions
Despite having a temporary key with limited capabilities, there might be permissions you cannot give to a public key: managing team members or changing account credentials for instance. The only solution for this is to skip the tests requiring admin permissions in external contributions.
The client is then responsible for determining if the credentials should be grabbed from the env variables (pull request from organization member) or from the API key dealer (pull request from external contributor). So yes, you still need to leave the master credentials in the encrypted env vars of Travis.
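Skipping the admin-only tests can be done with the test framework’s own skip mechanism. Here is a minimal sketch using Python’s `unittest`, assuming the admin key is exposed through a variable named `ALGOLIA_ADMIN_API_KEY` (a hypothetical name) that is only set for builds with access to the encrypted variables. The test body is a placeholder, not a real admin test.

```python
import os
import unittest

# Hypothetical variable name; only set when the encrypted variables are available.
ADMIN_KEY_VAR = "ALGOLIA_ADMIN_API_KEY"


def has_admin_credentials() -> bool:
    """Return True when the admin key is present in the environment."""
    return bool(os.environ.get(ADMIN_KEY_VAR))


class TeamManagementTest(unittest.TestCase):
    def setUp(self):
        # Report the test as skipped rather than failed for external contributions.
        if not has_admin_credentials():
            self.skipTest("admin credentials are not available in this build")

    def test_admin_key_is_present(self):
        # Placeholder for a test that needs admin permissions.
        self.assertTrue(os.environ[ADMIN_KEY_VAR])
```

A skipped test shows up as skipped in the build output, so an external contributor still gets a green build and can see which tests did not run.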
Is it really worth it?
As you can see, we spent some time designing this system, so it’s healthy to ask whether that time was well invested.
The answer: it all depends on your own ecosystem and what experience you want to offer to your contributors. If one day someone decides that they want to take some of their free time to help improve our API clients, we believe they should have the best possible developer experience. Our libraries don’t yet have hundreds of external contributors but we want to reach that milestone and the only way to get there is to show respect to every contributor, no matter their level of engagement or skill set.
I hope you enjoyed this blog post and that it inspires you to contribute to our open source projects.
Thanks to Vincent Voyer, Josh Dzielak, Ivana Ivanovic and Tiphaine Gillet for helping me with this article.