
Twitch leak 2021

Leaks - This article is part of a series.
Part 1: This Article

📣 Since this article was published, Twitch has commented on the incident on its official blog.

Two days after the biggest outage in Facebook’s history, it is now Twitch’s turn to embarrass another member of the GAFAM: its parent company, Amazon.

Last night, a mysterious post appeared on 4chan. After making a few claims, its author shared a magnet link for downloading a large number of internal Twitch documents.

As a result, 128 GB of compressed data is now out in the wild and, importantly, all of it is very recent: the latest information in the leak dates from just yesterday.

Twitch’s teams have not yet responded, so the intrusion may not yet be under control. Caution is therefore advised before renewing passwords and other stream tokens.

For this analysis, we relied on a copy of the leaked data, obtained for journalistic purposes, as well as on the analyses published by various IT experts.

Twitch Leak Contents

Datasets and Databases

Among the mass of data made public (more than 4,000 CSV files), there is highly sensitive and strategic information, such as:

  • Streamers’ earnings
    • Sources of income (Sub, Prime, Donations)
    • Amounts of earnings (In revenue, not profit)
    • Streamer status (Partner, Affiliate)
  • Stream statistics (Time / Month / Streamer)
  • Game statistics (Time / Month)
  • Audience trends by periods (Time / Month / Games)
  • Advertisers list
    • Private contact details

All of this data is raw and needs to be processed by analysts before most people can read it. The initial analyses available to us reveal wide disparities in how broadcasters on the platform are paid.

For example, it has been shown that 50% of the platform’s streamers receive less than one dollar per month as remuneration via Twitch. Enough to surprise some naysayers.
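To give an idea of what "processing the raw data" can look like, here is a minimal sketch that sums revenue per streamer and per month, then measures the share falling under a threshold. The column names (`user_id`, `month`, `source`, `gross_usd`) and the sample rows are assumptions made up for illustration; they are not taken from the leak, and the real files may be laid out quite differently.

```python
import csv
import io
from collections import defaultdict

# Synthetic rows for illustration only; these are NOT values from the leak.
# The column names (user_id, month, source, gross_usd) are assumptions.
SAMPLE_CSV = """user_id,month,source,gross_usd
1001,2021-09,sub,0.00
1001,2021-09,ads,0.40
1002,2021-09,sub,12.50
1003,2021-09,prime,0.75
1004,2021-09,sub,2500.00
1004,2021-09,ads,310.25
"""


def monthly_totals(csv_file):
    """Sum gross revenue per (user_id, month) across all income sources."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        totals[(row["user_id"], row["month"])] += float(row["gross_usd"])
    return dict(totals)


def share_below(totals, threshold_usd):
    """Fraction of streamer-months whose total is strictly below the threshold."""
    if not totals:
        return 0.0
    below = sum(1 for amount in totals.values() if amount < threshold_usd)
    return below / len(totals)


totals = monthly_totals(io.StringIO(SAMPLE_CSV))
print(f"{share_below(totals, 1.00):.0%} of sampled streamer-months are under $1")
```

Note that this counts gross revenue, not profit, and that a real analysis would also have to decide how to treat inactive accounts, which can heavily skew this kind of percentage.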

Code and Security

The leak includes about 6,000 internal repositories. They contain two kinds of material: first, the source code of the platform and of internal tools; second, server configurations and other text files used for GitOps workflows.

There is also a fair amount of internal documentation covering the tools in use, their role, logical models, and even the onboarding guide for new hires.

Here are some figures drawn from a statistical analysis of the files in these roughly 6,000 repositories.

[Figure: Statistics of the source files]
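Statistics of this kind can be gathered with very little code. The sketch below is not the method used for the figures above; it simply counts files by extension under a directory, assuming the repositories have been extracted to a local folder. The function names are hypothetical.

```python
from collections import Counter
from pathlib import Path


def count_by_extension(root):
    """Count regular files under root by lowercase extension, skipping .git folders."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if ".git" in path.parts or not path.is_file():
            continue
        # Files without an extension (README, Makefile, ...) are grouped together.
        counts[path.suffix.lower() or "(none)"] += 1
    return counts


def top_extensions(root, n=10):
    """Return the n most common extensions as (extension, count) pairs."""
    return count_by_extension(root).most_common(n)
```

Calling `top_extensions("path/to/repos")` on an extracted folder gives a rough picture of which languages and configuration formats dominate a codebase.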

Cybersecurity

In terms of cybersecurity practices, there is a bit of everything, both good and not so good. Given the size of the codebase, it is clear that cybersecurity awareness is not uniform across the development teams.

Among the bad practices we noted:

  • Hard-coded lists of platform administrator identifiers
  • Private SSH keys committed directly in the code
  • Administrator accounts without passwords on certain machines
  • The use of a weak hashing function until 2015 (SHA-1, before the transition to bcrypt)
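To illustrate why the last point matters, here is a minimal sketch contrasting a fast, unsalted SHA-1 digest with a salted, deliberately slow password hash. This is not Twitch's code. Since bcrypt is not part of the Python standard library, PBKDF2-HMAC-SHA256 from `hashlib` is used here as a stand-in for it; the function names and the iteration count are assumptions chosen for the example.

```python
import hashlib
import hmac
import os


def sha1_unsalted(password: str) -> str:
    """Weak scheme: fast, unsalted digest. Equal passwords give equal hashes."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def hash_password(password: str, iterations: int = 600_000):
    """Salted, deliberately slow hash (PBKDF2 used as a stand-in for bcrypt)."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest, iterations


def verify_password(password: str, salt: bytes, digest: bytes, iterations: int) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)
```

With the unsalted scheme, two users sharing a password end up with the same stored hash, and an attacker can test billions of candidates per second. The salt and the work factor in the second scheme address both problems, which is the same idea bcrypt builds on.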

In one of the folders, we found a welcome guide for new developers. VPN access, internal tools: everything is explained, including how to use a YubiKey security key.

Video Encoders

In the sources, one folder caught our attention: it contains the source code of the video encoders used by the platform. For a company built on live video streaming, this is one of its most valuable trade secrets. One can easily imagine competitors rushing to study these precious sources to understand how they are implemented.

Recommendation Algorithms

Still on the subject of trade secrets, the leak also contains the recommendation algorithms, the centerpiece of this type of platform. They are what determines which content will or will not be visible.

These algorithms, typically highly opaque, are now out in the open. No doubt the more adventurous data miners will take the time to comb through them. Once the magic formula is understood, streamers will be able to optimize their channels to improve their visibility.

Vapor

This one is quite embarrassing for Twitch: the leak contains the entire client and server code of a game distribution service named Vapor. The company had never announced this direct competitor to Steam and the Epic Games Launcher.

In addition, the leak contains a game built with the Unity3D engine, called VaporWorld, which strongly resembles a VR chat application.

In short, the element of surprise is definitely gone. 😬

Conclusion

With this massive data leak, we are seeing a major American tech company laid bare for the first time. Trade secrets, core business, commercial data: it has all been exposed.

Of course, our thoughts are with the teams who are all hands on deck. Behind every crisis, there are many people under pressure trying to restore order, so strength to you 💪

As always, a big thank you to Nicolas for his careful proofreading 😜

Sources

Twitter

Defend Intelligence

Khaos Farbauti Ibn Oblivion

Akanoa

Twitch

Jake_$

Troy Hunt

Article

50 Nuances d'Octets
Author
50 Nuances d’Octets
No bullshit 🛸
Author
Guillaume Assier
Tech, Cloud and Cybersecurity ⛅