In November of 2022, Stepford was presented to and tested by a group of eight volunteers in Manchester, UK.
Naromass gave a short presentation on the development of the tool, including theoretical influences and the representation of AI in media.
In self-selected pairs, participants engaged with the tool, indicating whether they agreed with Stepford’s scoring of texts for sexist bias.
The text segments were split between two conditions, ‘AI-authored’ and ‘human-authored’, with the labels visible to the participants. A member of the development team sat with the participants while they engaged in the testing, which led to rich discussions about the nature of creativity, machine learning, and the future of synthetic media.
After approximately 20–30 minutes of testing, participants were interviewed in the same pairs, using the following questions:
- How was using the tool for you?
- Who might use this tool?
- How might it be misused, or how could it go wrong?
- Any other comments?
Overall, participants found that engaging with the tool in pairs made them feel more secure in their scoring choices than they would have felt using Stepford alone. One mentioned that the tool provided an entry point to the “sort of discussions that can be uncomfortable”, and that differing levels of sensitivity to sexist language within each pair led to conversations that prompted participants to examine their own biases.
Stepford’s scoring output has a certain human character in its expression, which was received positively: members of the group found it easier to react to than a purely informational delivery of the scores.
For the second question, about who and/or what platforms might use the tool, responses included:
- Educational institutions
- Students submitting written work
- Recruiters
- Communication platforms/apps
- Browser plugins
- Speech writers
Participants noted potential similarities to Grammarly, with Stepford’s functionality serving as a reminder to consider the intended audience of a piece of text.
Participants pointed out potential misuses of the tool, echoing some of the feedback we’ve received during previous events, as well as the team’s own predictions based on the proliferation of technologies that reproduce existing hierarchies of agency in society:
- “I was getting into the realm of cancelling people. Who knows where the tool will go and how powerful it will become?”
- “People are complicated and they don’t always get things right and that’s what is brilliant about people.”
- “When it starts to tread the line of censorship. Ugly historical facts are still truth, and when it reinterprets and changes things, it becomes a new reality.”
- “The clean ideal in creative writing… sometimes the portrayal of awful behaviour is important to bring. All the isms aren’t ok but pretending they don’t exist is also wrong.”
Participants wondered aloud how to distinguish between the function of bias in narrative text and the harmful reproduction of bias baked into algorithms; this generated further discussion, though the conclusion reached by many was that human intervention is crucial to creating nuanced tools.
Findings from this event will influence the next phase of Stepford’s development.