We want to provide an update on the downtime that affected our NY3 public server this weekend. The server was down on Oct 21 from 18:39 to 21:39 (EST) and, as we mentioned in our previous posts, this was the first prolonged outage on Blynk since 2015.
It is also important to note that none of the other servers, including our dedicated Business servers in the region, were affected by this incident.
Our latest server update included more than 500 improvements. Although we ran all our manual and automated tests according to our QA processes, a database connectivity issue went undetected. This is atypical for us, as we have multiple safeguards in the Blynk architecture to prevent such cases.
The issue has been fixed, and we have introduced more safety layers to prevent issues like this from happening in the future. Additionally, we’ve developed a series of integration tests to cover similar scenarios.
We fully understand how crucial Blynk’s services are to your development projects, and we don’t take incidents like this lightly. We’re taking proactive steps to improve our response times and ensure that you’re kept in the loop during any emergency situations:
- We’re growing our team with new hires to enhance our 24/7 emergency operations coverage.
- Our critical-situation protocols will be improved to include better communication with clients and the community.
- We will provide server status and uptime data across our products: website, console, and apps. We will announce this once it is implemented.
- Critical updates will be posted promptly on Blynk’s Twitter (X) account. Make sure to follow it.
Nobody’s perfect, but we’re proud to say that we’ve maintained 99.9% uptime on our public cloud from 2015 through 2022. In 2023, we are also at 99.9%, even considering the latest incident on the NY3 server. We’re committed to maintaining and improving this level of service reliability.
While Blynk can work for a wide variety of use cases, it’s important to note that our self-service offering is not intended for life-critical or high-stakes operations, as detailed in our Terms of Service. If your business needs a higher or guaranteed level of reliability, our Enterprise Plan provides a 99.99% SLA.
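For readers who want to check the uptime figures above, here is a minimal sketch of the arithmetic in Python. It assumes a 365-day year and, purely for illustration, treats the three-hour NY3 outage as the only downtime in 2023, which this post does not state. The function names are hypothetical and not part of any Blynk tooling. Under those assumptions, three hours out of 8,760 is roughly 0.034% of the year, which still leaves the yearly figure above 99.9%.

```python
from datetime import datetime

HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year such as 2023


def uptime_percent(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Percentage of the period during which the service was up."""
    return 100.0 * (1.0 - downtime_hours / period_hours)


def downtime_budget_hours(target_percent, period_hours=HOURS_PER_YEAR):
    """Maximum downtime (in hours) that still meets the target uptime."""
    return period_hours * (1.0 - target_percent / 100.0)


# Outage window reported in this post: Oct 21, 18:39 to 21:39.
outage_start = datetime(2023, 10, 21, 18, 39)
outage_end = datetime(2023, 10, 21, 21, 39)
outage_hours = (outage_end - outage_start).total_seconds() / 3600

# Assumption: this outage is the only downtime counted for the year.
print(f"Outage length: {outage_hours:.2f} h")
print(f"Yearly uptime with this outage alone: {uptime_percent(outage_hours):.3f}%")
print(f"Yearly downtime budget at 99.9%:  {downtime_budget_hours(99.9):.2f} h")
print(f"Yearly downtime budget at 99.99%: {downtime_budget_hours(99.99) * 60:.1f} min")
```

The same helpers show why the two targets differ so much in practice: a 99.9% yearly target allows about 8.76 hours of downtime, while a 99.99% target allows less than an hour.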
We’re genuinely sorry for the trouble this incident may have caused. Your understanding, patience, and continued trust mean a lot to us. If you have any questions or concerns, please feel free to comment below or contact our support team directly.
CEO @ Blynk