Last month, we wrote an introductory overview of our experience running the second annual Car Hacking Village CTF infrastructure at DEF CON 25. Most notably, our use of Zappa to deploy Flask-based CTFd on AWS Lambda and API Gateway resulted in a $1.50 bill for the entire month of July (excluding database instances), while providing a number of operational advantages over last year’s traditional infrastructure.
In the second post of this series, we’ll go over our inspiration for the idea as well as some of the challenges we encountered – including a show-stopping bug that we didn’t catch until after the contest started Friday morning!
Hack all the things!
About a year ago, Bugcrowd started hosting quarterly company-wide hackathons during which employees are free to explore problems and solutions related to pretty much anything they believe may be of value to themselves, others, or the company. Participants have worked on everything from an Elixir-based API and related Slack bot for monitoring bathroom occupancy based on external lock color to rough prototypes that inspired features now part of Bugcrowd’s platform. Even just the learnings from a couple of days spent prototyping a rough idea can be massively valuable for future related work.
Deploying CTFd with Zappa was an idea conceived at the last minute before our most recent hackathon in early June, and going in we weren’t sure we’d be able to even create a working proof-of-concept in only a couple of days. While our group was able to finish most of the theme design work (got to have the cool-looking demo to please the judges!), much of the follow-up work related to deploying with Zappa was left incomplete.
Regardless, the research on less frequently used AWS technologies, the draft Terraform definitions, and the other non-demoable components still saved us significant time in getting CTFd to work with Zappa over the months leading up to DEF CON.
Naturally, there was fire…
Despite our testing before the conference and some prior experience running CTFd, reports of contestants getting HTTP 500 responses when trying to submit flags started coming in almost immediately after CHV CTF began at 10am.
This oddly didn’t seem to be the case for all contestants, so we retreated from the cellular dead zone on the conference floor back to a suite in order to investigate.
Our logs (stored in AWS CloudWatch but conveniently accessible via zappa tail) revealed that our Postgres RDS instance was being asked to store an excessively large value during each challenge solve attempt. Each challenge “solve” included a number of parameters, but the most suspect was the submitting user’s IP address, which was being converted to and stored as an integer. While CTFd has since moved to storing IP addresses as strings, our version was based on the master branch from prior to that pull request and used the Postgres integer column type to store these integer IP addresses. Python 3’s int type behaves like Python 2’s long type and has no fixed upper bound, but a Postgres integer column holds a signed 32-bit value that tops out at 2,147,483,647. Any IPv4 address of 128.0.0.0 or above converts to a larger number, so the insert failed whenever CTFd tried to store one of those addresses, which would explain why only some contestants were affected.
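To make the overflow concrete, here is a minimal sketch of the arithmetic. It uses Python's standard ipaddress module as a stand-in for the conversion; the helper names ip_to_int and fits_pg_integer are ours for illustration and are not CTFd's actual code.

```python
import ipaddress

# Upper bound of PostgreSQL's signed 4-byte `integer` type.
PG_INTEGER_MAX = 2**31 - 1  # 2147483647


def ip_to_int(ip: str) -> int:
    """Convert a dotted-quad IPv4 address to its unsigned 32-bit value.

    Illustrative stand-in; not the helper CTFd itself used.
    """
    return int(ipaddress.IPv4Address(ip))


def fits_pg_integer(ip: str) -> bool:
    """Return True if the converted address fits in a Postgres `integer`."""
    return ip_to_int(ip) <= PG_INTEGER_MAX


# Addresses below 128.0.0.0 fit; anything at or above it does not.
for ip in ("10.0.0.1", "127.255.255.255", "128.0.0.0", "203.0.113.7"):
    print(ip, ip_to_int(ip), fits_pg_integer(ip))
```

Under these assumptions, roughly half of the IPv4 address space lands above the limit, which is consistent with the errors hitting some contestants and not others.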
Once we were able to reproduce the issue, we performed a test migration against the RDS instance in our Zappa development stage to change the IP address column type to Postgres’ bigint. This seemed to solve the intermittent 500s reported by CHV CTF participants, so we applied the same migration to the production stage. Thankfully, we were able to resume the CTF less than an hour after our original start time and without any further hiccups. (While zappa deploy essentially just uploads a new revision of the application bundle to AWS S3, that ~25MB upload took nearly 20 minutes on our VPN-over-cellular Internet connection at the conference venue!)
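The column change itself comes down to a single Postgres statement. The sketch below only builds that statement as a string; the table name solves and column name ip are hypothetical placeholders, since the exact schema we migrated is not reproduced here, and in practice the change could just as well be expressed through a migration tool.

```python
def widen_to_bigint_sql(table: str, column: str) -> str:
    """Build an ALTER TABLE statement that changes a column's type to bigint.

    Identifiers are double-quoted; this helper is illustrative only and does
    not validate or escape untrusted input.
    """
    return f'ALTER TABLE "{table}" ALTER COLUMN "{column}" TYPE bigint;'


# Hypothetical table and column names, for illustration only.
print(widen_to_bigint_sql("solves", "ip"))
```

Trying a statement like this against a development stage first, as we did, is a cheap way to confirm that it fixes the failing inserts before touching production.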
Stay tuned for the final part of this series where we’ll talk about some of the things we learned this year by the time the CTF had ended – including how our deployment handled a large volume of vulnerability scanner traffic…