Through the Digitalist give back program — where we share expertise to help causes — I joined a student from the University of Saskatchewan tracking COVID-19 cases across Canada. The COVID-19 Tracker Canada (covid19tracker.ca) was one of the first initiatives displaying daily updates across the country. It also housed a dataset of individual cases, despite the challenges in gathering such data.
Since late March, I've been helping make improvements to grow and scale the tracker. Along the way I collected notes of my experience.
When I first joined, I started with a few minor tweaks and fixes. At the time, reported cases had yet to reach 5-digits nationwide. With the rising media attention, traffic to the site soon grew...
The site slowed to a crawl, and was eventually suspended by the web host. The site was deployed on a shared hosting plan — ideal when needs are small, but can quickly become a bottleneck.
We quickly decided to migrate to a VM and in a few hours was back online (DNS propagation aside). Know what your deployment is capable of handling and have a plan in place when the situation arises. Especially when there might be a slashdot effect.
Turns out migrating was not enough. The dataset was steadily growing. Soon, CPU usage started to spike. Even with a generous amount of compute resources provided by the Digital Ocean Hub for Good program, the site was being brought to its knees.
We pinpointed that the issue was happening at the database. Queries that ran fine last month became sluggish the next. A closer look showed that we had plenty of room to optimize and rewrite these queries. Moving forward, we want to have a process in place to review performance and catch issues before they start to compound.
With the troubles behind us, we can refocus on the planned improvements: expand the data to include additional collections like tests and hospitalizations; and rework the API as a separate service. While putting the pieces together, other discoveries start to emerge. As a result, I ran into unexpected “what if” scenarios, data hiccups, and forgotten items.
When you only volunteer a couple of hours each week, you want to maximize the effort. But as the project evolves, we also have to be adaptable. Show and tell work, have discussions early.
The API rework was also a good opportunity to review internal processes. In particular the daily upkeep of the dataset. Additional data processing was required with the new features. I started looking into automating this process, then into admin panels. But in the end, the simple approach fit the bill.
Using the command line interface, a command could ask for key input, then proceed with the work. Solve the functional requirement first, then the icing can come later. Starting simple puts emphasis on the tradeoffs when deciding on an approach.
Contributing to the COVID-19 Tracker Canada has been an experience which recalled old skills and developed new ones. While we may have reached the proverbial 1.0, data continues to be added. And so I will continue to lookout for ways to improve and optimize.
Change will be recurring theme as the world tackles the ongoing pandemic. Find a challenge to tackle, read a book, learn a new skill or refine an existing one.