Please let me explain the issues. Over the last four weeks, a series of events, some within our control and some not, has created a perfect storm of problems. Here’s what has transpired, how we’ve resolved these issues so far, and how we’ll continue to fix them.
(1) At the beginning of March, Cloudflare encountered a problem that created caching issues with our sites. Unfortunately, this was outside our control, and Cloudflare communicated these performance issues publicly. Cloudflare was able to troubleshoot the problem, which mitigated the issues on our end.
(2) The following week, we experienced site performance issues resulting in 504 errors. Our engineering team investigated and discovered that the huge influx of traffic from winter season tournaments and the beginning of the spring season caused our systems to reach over 60% capacity. Our team mitigated this issue and stood up additional instances to manage the load and eliminate the 504 errors. Moving forward, our team is mapping the seasonality of these high-traffic periods and will proactively stand up instances in advance to avoid this happening again.
(3) Unfortunately, once that issue was corrected, we began to see a delay in the publishing of LiveStats and photos. Our engineering team investigated and mitigated the problems, only to see the same issue repeat. They then noticed that the delays followed a pattern, occurring around the same times every day. We engaged our Holding Company CTO as well as additional engineering resources at our Holding Company. They installed a tool called Glowroot to monitor the system and identify the troublesome queries. As of yesterday afternoon, they have identified the culprit and are deleting the problem queries. They began this process during the day yesterday, which caused a few hiccups in performance. They completed some of the clean-up yesterday evening and are completing the rest tonight, when they believe it will cause less site disruption. We’re confident that these fixes will resolve the issues you have experienced lately.
I do apologize that you’ve experienced so much trouble. Each incident on its own is manageable and may be understandable, but having three incidents occur back-to-back-to-back has created a perfect storm of performance issues. The resulting support cases have, in turn, slowed our support team’s response time. Please know that our engineering, support, and customer success teams are working day and night to fix these issues in a manner that won’t crash the entire system. I assure you we are committed to providing the service and support you deserve and expect. This email address and my number below are available to you at any time. If you continue to have issues or concerns, please contact me directly.