KDnuggets : News : 2003 : n13 : item31

Briefs

Software can identify the writer's gender just by reading

Scotsman (UK) (07/09/03); Doherty, James

A computer algorithm developed by Moshe Koppel at Israel's Bar-Ilan University can reportedly determine an author's gender with 80 percent accuracy. The program was first fed 604 documents from the British National Corpus, split equally between male and female authors. After it had sampled those 604 texts, the remaining 3,520 documents in the British National Corpus were added so that the algorithm could identify writing elements characteristic of either men or women, and the program arrived at 50 features that could be used to predict the writer's sex. Using the program, the researchers found that women are more likely than men to use personal pronouns such as "I," "you," and "she," whereas men more often use determiners such as "a," "the," and "these," along with numbers and quantifiers.
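To make the idea of word-frequency features concrete, here is a toy Python sketch. It is not the researchers' program: the brief names only a handful of the reported 50 features and says nothing about how they were weighted, so the word lists, the function names marker_rates and guess_label, and the simple comparison rule are all assumptions made for illustration.

```python
import re

# Marker words quoted in the brief. The actual model reportedly used
# about 50 features; these short lists are only an illustrative subset.
FEMALE_MARKERS = {"i", "you", "she"}
MALE_MARKERS = {"a", "the", "these"}


def marker_rates(text):
    """Return (female_rate, male_rate) as occurrences per 1,000 tokens."""
    # Lowercase the text and split it into word tokens and digit runs.
    tokens = re.findall(r"[a-z']+|\d+", text.lower())
    if not tokens:
        return 0.0, 0.0
    female = sum(1 for t in tokens if t in FEMALE_MARKERS)
    # Digit runs count toward the male-leaning side, following the brief.
    male = sum(1 for t in tokens if t in MALE_MARKERS or t.isdigit())
    return 1000.0 * female / len(tokens), 1000.0 * male / len(tokens)


def guess_label(text):
    """Hypothetical decision rule: compare the two marker rates directly."""
    female_rate, male_rate = marker_rates(text)
    if female_rate == male_rate:
        return "undecided"
    return "female-leaning" if female_rate > male_rate else "male-leaning"
```

A real classifier would learn a weight for each feature from labeled documents instead of comparing two raw rates, and would need far more than one sentence of text to say anything meaningful.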

However, Koppel notes that a paper detailing the research team's findings was rejected by the National Academy of Sciences, which considered the experiment's conclusions sexist. The team then tried to quell criticism by using the program to analyze scientific documents. Old Dominion University linguist Janet Bing also finds fault with gender-detection experiments, which she says cannot account for lesbian, gay, bisexual, and transgender individuals. She says, "This whole rush to categorization usually works against women."




Copyright © 2003 KDnuggets.