This is a nearly complete collection of Gab accounts, collected between August 10, 2016 and October 28, 2018. There are two files:
Users Information (Node List) (1.2GB)
A JSON Lines text file, with one line per account. It contains many different fields. Many come directly from the API, and most should be self-explanatory. A couple of notes, though:
- username is globally unique
- created_at_month_label is when the account was created
- the users’ bios (bio) are auto-populated with quotations, so empty bios are rare
- has_hate_speech and a few other fields are based on the user’s posts
- hate_probs is the output of a simple hate-speech detector run on the bio
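As a rough illustration of reading the node list, here is a minimal Python sketch that parses a JSON Lines stream and indexes the accounts by username, relying on the note that username is globally unique. The helper name iter_accounts and the sample records are made up for the example, and the real file has many more fields than shown here.

```python
import io
import json


def iter_accounts(lines):
    """Yield one dict per non-empty line of a JSON Lines stream."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Tiny in-memory stand-in for the real file; these records are invented.
sample = io.StringIO(
    '{"username": "alice", "bio": "a quotation"}\n'
    '{"username": "bob", "bio": ""}\n'
)

# username is globally unique, so it can serve as a dictionary key.
accounts = {rec["username"]: rec for rec in iter_accounts(sample)}
print(len(accounts), "accounts loaded")
```

For the real 1.2GB file, the same generator can be fed an open file handle (for example open(path, encoding="utf-8"), with path pointing at wherever the download was saved), so the file is streamed line by line rather than loaded at once.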
Edge List (2.0GB)
A CSV edge list file, with the source following and/or reposting the target. The is_follow column indicates whether the edge is a follow; if it’s false, then the edge consists only of reposts. reposts_count shows how many times the source has reposted the target. Please note that the edges were generated for each user independently, so many edges are present twice.
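Because many edges appear twice, it can help to collapse duplicates while loading. The sketch below is one way to do that in Python, under several assumptions: the column names source and target are guesses (only is_follow and reposts_count are named above), booleans are assumed to be written as the text true/false in any letter case, and duplicate rows are merged by treating the edge as a follow if either copy says so and keeping the larger repost count. The sample rows are invented.

```python
import csv
import io


def load_unique_edges(lines):
    """Return {(source, target): (is_follow, reposts_count)} with duplicates merged."""
    edges = {}
    for row in csv.DictReader(lines):
        # "source" and "target" are assumed column names.
        key = (row["source"], row["target"])
        # Assumes booleans are stored as the text "true"/"false".
        is_follow = row["is_follow"].strip().lower() == "true"
        reposts = int(row["reposts_count"])
        if key in edges:
            prev_follow, prev_reposts = edges[key]
            # Merge rule is an assumption: follow if either copy says so,
            # and keep the larger repost count.
            is_follow = is_follow or prev_follow
            reposts = max(reposts, prev_reposts)
        edges[key] = (is_follow, reposts)
    return edges


# Tiny in-memory stand-in for the real file; these rows are invented.
sample = io.StringIO(
    "source,target,is_follow,reposts_count\n"
    "alice,bob,True,2\n"
    "alice,bob,True,2\n"
    "bob,carol,False,1\n"
)
edges = load_unique_edges(sample)
print(len(edges), "unique edges")
```

Keying on the ordered (source, target) pair keeps the graph directed, so an edge from alice to bob stays distinct from one in the opposite direction.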
More coming soon
I’m working on a collection of chess engines and will make them available once they’re working.