GitHub February 28th DDoS Incident Report

Brink

Administrator
Staff member
Local time
9:11 PM
Messages
74,808
Location
Oklahoma
On Wednesday, February 28, 2018 GitHub.com was unavailable from 17:21 to 17:26 UTC and intermittently unavailable from 17:26 to 17:30 UTC due to a distributed denial-of-service (DDoS) attack. We understand how much you rely on GitHub and we know the availability of our service is of critical importance to our users. To note, at no point was the confidentiality or integrity of your data at risk. We are sorry for the impact of this incident and would like to describe the event, the efforts we’ve taken to drive availability, and how we aim to improve response and mitigation moving forward.

Background

Cloudflare described an amplification vector using memcached over UDP in their blog post this week, “Memcrashed - Major amplification attacks from UDP port 11211”. The attack works by abusing memcached instances that are inadvertently accessible on the public internet with UDP support enabled. Spoofing of IP addresses allows memcached’s responses to be targeted against another address, like ones used to serve GitHub.com, and send more data toward the target than needs to be sent by the unspoofed source. The vulnerability via misconfiguration described in the post is somewhat unique amongst that class of attacks because the amplification factor is up to 51,000, meaning that for each byte sent by the attacker, up to 51KB is sent toward the target.

Over the past year we have deployed additional transit to our facilities. We’ve more than doubled our transit capacity during that time, which has allowed us to withstand certain volumetric attacks without impact to users. We’re continuing to deploy additional transit capacity and develop robust peering relationships across a diverse set of exchanges. Even still, attacks like this sometimes require the help of partners with larger transit networks to provide blocking and filtering.

The incident

Between 17:21 and 17:30 UTC on February 28th we identified and mitigated a significant volumetric DDoS attack. The attack originated from over a thousand different autonomous systems (ASNs) across tens of thousands of unique endpoints. It was an amplification attack using the memcached-based approach described above that peaked at 1.35Tbps via 126.9 million packets per second.

At 17:21 UTC our network monitoring system detected an anomaly in the ratio of ingress to egress traffic and notified the on-call engineer and others in our chat system. This graph shows inbound versus outbound throughput over transit links:

36830510-bd9ea02e-1cd8-11e8-96c0-dc199a530164.png


Given the increase in inbound transit bandwidth to over 100Gbps in one of our facilities, the decision was made to move traffic to Akamai, who could help provide additional edge network capacity. At 17:26 UTC the command was initiated via our ChatOps tooling to withdraw BGP announcements over transit providers and announce AS36459 exclusively over our links to Akamai. Routes reconverged in the next few minutes and access control lists mitigated the attack at their border. Monitoring of transit bandwidth levels and load balancer response codes indicated a full recovery at 17:30 UTC. At 17:34 UTC routes to internet exchanges were withdrawn as a follow-up to shift an additional 40Gbps away from our edge.

36830539-e48f683a-1cd8-11e8-9ef7-ec7e923a0c7f.png


The first portion of the attack peaked at 1.35Tbps and there was a second 400Gbps spike a little after 18:00 UTC. This graph provided by Akamai shows inbound traffic in bits per second that reached their edge:

36830593-2ae2cf98-1cd9-11e8-944d-dde6248ac0e5.png


Next steps

Making GitHub’s edge infrastructure more resilient to current and future conditions of the internet and less dependent upon human involvement requires better automated intervention. We’re investigating the use of our monitoring infrastructure to automate enabling DDoS mitigation providers and will continue to measure our response times to incidents like this with a goal of reducing mean time to recovery (MTTR).

We’re going to continue to expand our edge network and strive to identify and mitigate new attack vectors before they affect your workflow on GitHub.com.

We know how much you rely on GitHub for your projects and businesses to succeed. We will continue to analyze this and other events that impact our availability, build better detection systems, and streamline response.


Source: February 28th DDoS Incident Report | GitHub Engineering
 

My Computer

Computer type
PC/Desktop
Computer Manufacturer/Model Number
Self built custom
OS
64-bit Windows 11 Pro for Workstations
CPU
Intel i7-8700K OC'd to 5 GHz
Motherboard
ASUS ROG Maximus XI Formula Z390
Memory
64 GB (4x16GB) G.SKILL TridentZ RGB DDR4 3600 MHz
Graphics Card(s)
ASUS ROG-STRIX-GTX1080TI-O11G-GAMING
Sound Card
Integrated
Monitor(s) Displays
2 x Samsung Odyssey G7 27"
Screen Resolution
2560x1440
Hard Drives
1TB Samsung 990 PRO M.2,
4TB Samsung 990 PRO PRO M.2,
TerraMaster F8 SSD Plus NAS
PSU
Seasonic Prime Titanium 850W
Case
Thermaltake Core P3
Cooling
Corsair Hydro H115i
Keyboard
Logitech wireless K800
Mouse
Logitech MX Master 4
Internet Speed
2 Gb/s Download and 100 Mb/s Upload
Antivirus
Malwarebyte Anti-Malware Premium
Browser
Google Chrome
Other Info
Logitech Z625 speaker system,
Logitech BRIO 4K Pro webcam,
HP Color LaserJet Pro MFP M477fdn,
APC SMART-UPS RT 1000 XL - SURT1000XLI,
Galaxy S23 Plus phone
Back
Top