The Challenge

Sometimes Even Demo Systems Need a Lot of Data

During the first phase of the project, we put together a basic web interface to prove out the concept and allow some limited remote data collection, since it isn’t always possible to have the equipment being monitored in the lab or to have a laptop connected to something in the field.

As the project developed, however, it became apparent that collecting lots of data would be required to develop the product further. This meant we would be putting the relational database we had selected for ease of prototyping to the test.

We quickly realized that the demo approach would not be adequate for the amount of data we needed to collect, and that we would need to beef up our pipeline to avoid locking up the database and potentially missing data.

The Considerations

Scalability Without Eliminating Flexibility

Scaling up a backend is no small task, particularly one that receives large amounts of time-series data, and it is easy to over-optimize for requirements that may shift before the product ships. We needed more capacity than we had, but we also didn’t want to waste our customer’s time and money building a solution that could support millions of devices when they had only hundreds to thousands at the time, potentially locking them into an architecture they didn’t want.

The Solution

Save It Up and Do It All at Once

Inserts are among the slower operations on a relational database, and performing one for every incoming reading multiplies that cost. Our solution was to use AWS SQS as a queue that temporarily holds all of the incoming data and then, after a short interval, insert it into the database in a single batch.
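
The queue-then-batch idea can be sketched in a few lines. This is a minimal illustration, not the project's actual code: Python's queue.Queue stands in for the SQS queue, an in-memory SQLite database stands in for the RDS instance, and the readings table, its columns, and the function names drain, flush_batch, and demo are all hypothetical.

```python
import json
import queue
import sqlite3

# Assumption: queue.Queue plays the role of the SQS queue and an in-memory
# SQLite database plays the role of the relational database. The table and
# field names below are made up for illustration.


def drain(q, max_items=1000):
    """Pull up to max_items pending messages off the queue without blocking."""
    items = []
    while len(items) < max_items:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            break
    return items


def flush_batch(conn, messages):
    """Insert every message in one transaction and return the row count."""
    rows = [
        (m["device_id"], m["ts"], m["value"])
        for m in map(json.loads, messages)
    ]
    if not rows:
        return 0
    with conn:  # a single commit covers the whole batch
        conn.executemany(
            "INSERT INTO readings (device_id, ts, value) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


def demo(n_devices=100):
    """Simulate n_devices reporting at once, then flush them in one batch."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (device_id TEXT, ts REAL, value REAL)")
    q = queue.Queue()
    for i in range(n_devices):
        q.put(json.dumps({"device_id": f"dev-{i}", "ts": 0.0, "value": float(i)}))
    inserted = flush_batch(conn, drain(q))
    total = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
    return inserted, total
```

In a real deployment the drain step would read from the message queue on a timer and delete messages only after the batch commits, so a failed insert does not lose data. Those details depend on the queue and database in use and are left out of this sketch.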

Even this relatively small step took us from handling tens of devices to comfortably passing a load test with ten thousand devices reporting at the same time.

The Technologies Behind This Project

Web Development
AWS SQS, AWS RDS, SQL