Chris McLaren (aka Big-Headed Canadian Whiskey Man) points to the Gender Genie, which analyzes writing samples and predicts whether the author is male or female. Somewhat concerningly, he said it thinks I’m a man. (Of course, it also thought Emma Bull was.) I decided to test it out with some longer samples.
Scores
The first 1,000 words and change of a new story I just finished called Cassie Says:
Female Score: 1883
Male Score: 1007
The Gender Genie thinks the author of this passage is: female!
So far, so good, but that’s a pretty tight score.
The first chapter of my YA novel Girl’s Gang:
Female Score: 2083
Male Score: 2137
The Gender Genie thinks the author of this passage is: male!
Wrongo!
A longer blog entry, From Two to Four:
Female Score: 1416
Male Score: 1776
The Gender Genie thinks the author of this passage is: male!
Of course, according to the stats, the Genie seems to be running about 41 percent incorrect. Not knowing much about what this is based on (and being unable to see the Nature article), I’m not sure any conclusions can be drawn from this, except perhaps that the site oversimplifies the algorithm or something. (Or it’s just stoopid.) Kind of interesting, though.
Well, I fed that thing seven of my blog entries and it called them all as male, usually with a gap of at least 2000 points. The website’s response when I reported the error: ‘that is one butch chick’.
Perhaps not coincidentally, I’ve been gnashing my teeth at various responses all over the interweb to my recent Babylon 5 post, whose authors insist on referring to me in the masculine. Is it a knee-jerk assumption based on the subject matter, or just laziness on the part of these respondents (my gender-specific name is, after all, right there at the bottom of the post)?
If I remember correctly, the algorithm gives words like ‘I’, ‘me’, ‘mine’, etc. a higher male score. Because men are more egocentric? I don’t know. Anyway, that probably explains the weird blog post scores.
Except that a) my blog posts don’t tend to be confessional and b) you can classify your text as a blog post and presumably compensate for this problem.
I was rather tickled, by the way, to see that ‘the’ is categorized as a male word.
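For anyone wondering how a tool could come up with a pair of numbers like "Female Score" and "Male Score" from word choice alone, here is a rough sketch of the general idea as described in this thread: each keyword carries a weight for one side, the weights are summed over the text, and the larger total wins. This is only a guess at the shape of the thing, not the Genie's actual code. The word lists, the weights, and the function names `score` and `guess` are all made up for illustration.

```python
import re

# Hypothetical weights, invented for this example only.
# They are NOT the Gender Genie's real keyword table.
MALE_WEIGHTS = {"the": 7, "a": 6, "is": 8}
FEMALE_WEIGHTS = {"with": 52, "she": 6, "and": 4}


def score(text):
    """Return (female_total, male_total) for a piece of text."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    # Every occurrence of a keyword adds its weight to that side's total.
    female = sum(FEMALE_WEIGHTS.get(w, 0) for w in words)
    male = sum(MALE_WEIGHTS.get(w, 0) for w in words)
    return female, male


def guess(text):
    """Pick whichever side has the larger total."""
    female, male = score(text)
    if female == male:
        return "undecided"
    return "female" if female > male else "male"


if __name__ == "__main__":
    sample = "The cat is with the dog and she"
    print(score(sample), guess(sample))
```

With a scheme like this, a longer sample piles up bigger totals on both sides, so the raw gap between the two scores says little unless you also know how long the text was.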