@buley
Created March 3, 2013 16:57
Glass Research Raw Data
* For the (6332 + 1026) Google+ postings whose authors had first names assigned to a gender group, 86.1% (6332/(6332 + 1026)) were guessed to be male and 13.9% (1026/(6332 + 1026)) female. Including 517 uncertain cases, which somewhat generously throw out anything guessed to be less than 95% probability, the numbers water down to 80.4% male (6332/(517 + 6332 + 1026)) and 13.0% female (1026/(517 + 6332 + 1026)). Among the uncertain cases, 76.6% (396/517) were thought to be 'likely male' and 23.4% (121/517) 'likely female'. 27.0% (2907/(6332 + 1026 + 2907 + 517)) of the overall postings were deemed ambiguous.
array(8) {
["all"]=>
int(10782)
["male"]=>
int(6332)
["gendered"]=>
int(10265)
["likely-male"]=>
int(396)
["uncertain"]=>
int(517)
["ambiguous"]=>
int(2907)
["female"]=>
int(1026)
["likely-female"]=>
int(121)
}
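The shares quoted for the Google+ postings follow directly from the raw counts in the dump above. As a minimal sketch in Python (the function name `gender_shares` and the result keys are made up for illustration and are not part of the original analysis), the arithmetic can be reproduced like this:

```python
def gender_shares(male, female, likely_male, likely_female, ambiguous):
    """Return the percentage breakdowns described in the notes, rounded to 0.1."""
    confident = male + female                 # names guessed with high probability
    uncertain = likely_male + likely_female   # guesses below the 95% cutoff
    total = confident + uncertain + ambiguous

    def pct(part, whole):
        return round(100.0 * part / whole, 1)

    return {
        "male_of_confident": pct(male, confident),
        "female_of_confident": pct(female, confident),
        "male_incl_uncertain": pct(male, confident + uncertain),
        "female_incl_uncertain": pct(female, confident + uncertain),
        "likely_male_of_uncertain": pct(likely_male, uncertain),
        "likely_female_of_uncertain": pct(likely_female, uncertain),
        "ambiguous_of_all": pct(ambiguous, total),
    }


# Counts copied from the Google+ dump above.
google_plus = gender_shares(
    male=6332, female=1026, likely_male=396, likely_female=121, ambiguous=2907
)

for label, value in google_plus.items():
    print(f"{label}: {value}%")
```

The same function can be called with the Twitter counts (8407, 2117, 581, 226, 10353) to get the corresponding figures.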
* For the (8407 + 2117) Twitter tweets whose authors had first names assigned to a gender group, 79.9% (8407/(8407 + 2117)) were guessed to be male and 20.1% (2117/(8407 + 2117)) female. Including 807 uncertain cases, the numbers adjust to 74.2% male (8407/(807 + 8407 + 2117)) and 18.7% female (2117/(807 + 8407 + 2117)). Among the uncertain cases, 72.0% (581/807) were thought to be 'likely male' and 28.0% (226/807) 'likely female'. 47.7% (10353/(10353 + 8407 + 2117 + 807)) of the overall postings were deemed ambiguous.
array(8) {
["all"]=>
int(21684)
["male"]=>
int(8407)
["gendered"]=>
int(20877)
["female"]=>
int(2117)
["ambiguous"]=>
int(10353)
["likely-male"]=>
int(581)
["uncertain"]=>
int(807)
["likely-female"]=>
int(226)
}
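The keys in the two dumps are not explained, but the numbers suggest how they relate: "uncertain" looks like the sum of the two "likely" buckets, "gendered" like male + female + ambiguous, and "all" like gendered + uncertain. These relationships are inferred from the counts, not stated by the author. A small Python sketch (the helper `check_counts` is hypothetical) that tests them against both dumps:

```python
# Counts copied verbatim from the two dumps above.
GOOGLE_PLUS = {
    "all": 10782, "male": 6332, "gendered": 10265, "likely-male": 396,
    "uncertain": 517, "ambiguous": 2907, "female": 1026, "likely-female": 121,
}
TWITTER = {
    "all": 21684, "male": 8407, "gendered": 20877, "female": 2117,
    "ambiguous": 10353, "likely-male": 581, "uncertain": 807, "likely-female": 226,
}


def check_counts(counts):
    """Check the inferred (not author-confirmed) relationships between keys."""
    return {
        "uncertain == likely-male + likely-female":
            counts["uncertain"] == counts["likely-male"] + counts["likely-female"],
        "gendered == male + female + ambiguous":
            counts["gendered"] == counts["male"] + counts["female"] + counts["ambiguous"],
        "all == gendered + uncertain":
            counts["all"] == counts["gendered"] + counts["uncertain"],
    }


for name, counts in (("Google+", GOOGLE_PLUS), ("Twitter", TWITTER)):
    for rule, holds in check_counts(counts).items():
        print(f"{name}: {rule}: {holds}")
```

If the relationships hold, the "gendered" total includes the ambiguous names, so it should not be read as the number of postings with a confident gender guess.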