Saturday 20th October 2018

Optimus Communications - SIP Trunks Incident Report on Voice Outage 18/10/2018

Priority: P2 - Location: Nationwide - Reference: 486247

Start Date/Time: 18/10/2018 16:07 - Resolution Date/Time: 18/10/2018 16:42 - Duration: 35 minutes


Event Description: [Restored] Degradation of Voice Services - Inbound/Outbound

Customer Impact: Some customers would have experienced issues with both inbound and outbound calls.

Summary of Incident: Following on from the voice outage on the afternoon of 18/10/2018, below is a summary and timeline of the outage:

Our monitoring system indicated that an upstream interconnect Session Border Controller (SBC) was under high load. At this time multiple tests were run to investigate the issue.

SIP Trunk registrations and calls started dropping for customers and an urgent escalation was raised. The upstream interconnect provider decided to restart their SBC, which appeared to resolve the issue, and we subsequently re-routed SIP Trunk calls to a secondary interconnect.

Post event actions: Our engineers are reviewing SIP trunk SRV DNS records to ensure calls and registrations fail over to our secondary interconnects more quickly, and will be implementing a change control process to action this.
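
For readers unfamiliar with SRV-based failover, the following is a minimal illustrative sketch and not our production configuration. It assumes two hypothetical SRV targets (sbc-primary.example.net and sbc-secondary.example.net) and a caller-supplied reachability check. Per RFC 2782, the record with the lowest priority value is tried first, and higher values are used only when the preferred target is unreachable. The weight handling here is a simplification: the RFC specifies weighted random selection among records of equal priority.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class SrvRecord:
    priority: int  # lower value is tried first (RFC 2782)
    weight: int    # relative share among records of equal priority
    port: int
    target: str


def pick_target(
    records: Iterable[SrvRecord],
    is_reachable: Callable[[SrvRecord], bool],
) -> Optional[SrvRecord]:
    """Return the first reachable record in ascending priority order.

    Simplification: within one priority we prefer the higher weight
    deterministically instead of doing weighted random selection.
    """
    for rec in sorted(records, key=lambda r: (r.priority, -r.weight)):
        if is_reachable(rec):
            return rec
    return None


# Hypothetical records, for illustration only.
RECORDS = [
    SrvRecord(priority=10, weight=100, port=5060, target="sbc-primary.example.net"),
    SrvRecord(priority=20, weight=100, port=5060, target="sbc-secondary.example.net"),
]


def primary_down(rec: SrvRecord) -> bool:
    # Simulates the primary interconnect being unreachable.
    return rec.target != "sbc-primary.example.net"


if __name__ == "__main__":
    print(pick_target(RECORDS, lambda rec: True).target)
    print(pick_target(RECORDS, primary_down).target)
```

In practice, how quickly a device fails over also depends on factors this sketch does not model, such as the DNS record TTL and the retry timers of the customer's SIP equipment.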


Timeline:

15:56 – First notification received of high load on host – engineers start investigating. Multiple test calls are placed, all of which are successful.

16:00 – Notification received that the service was up.

16:07 – Continued notification of degraded service.

16:09 – Registrations on SIP Trunk servers start reducing and some calls are failing.

16:17 – Advised via Help Desk of customer calls reporting the issue.

16:20 – Case raised with primary interconnect, who restarted their SBC in an attempt to resolve the issue.

16:22 – Engineers believe the restart of the SBC resolved the issue. Monitoring of the situation continues.

16:35 – Engineers re-routed SIP Trunk calls to a secondary interconnect.

16:42 – Calls and registrations monitored as successful.

16:44 – Issue confirmed resolved and extended monitoring underway.