An exploration of tweets using cluster analysis. By Ellie Frymire.

Online, things happen quickly. It’s rare to feel any long-lasting sentiment after a trending tweet spins around the world—we’re often quickly on to the next, before fully getting our heads around the implications of the last.

But on October 15, 2017, Alyssa Milano’s #MeToo tweet spurred a flood of stories that, for me and for hundreds of thousands of others, prompted a real moment of pause and retrospection. As I combed through the hashtag online and read the countless experiences shared by friends, colleagues, and acquaintances on Facebook, many uncomfortable and disturbing past encounters began to take on a new meaning. What once felt like an inevitable and unavoidable part of life was now being met with strong resistance and a global network of solidarity. I’ve never spent so long reading through the results of a hashtag, trying to make sense of it all and feeling connected to so many women that I’d never met. And it was the same for Ellie Frymire, a data visualization student at New York’s Parsons School of Design.

For Frymire, who has a bachelor’s degree in mathematics and experience working at an analytics consultancy, the obvious way to get to grips with the #MeToo movement was through data. “I wanted to find out: What are people really saying with #MeToo?” Frymire tells me when we meet at this year’s Design Indaba in Cape Town, where she has just presented her Parsons thesis project as part of the conference’s Global Graduate program. “I was curious about who the people are using the hashtag. What are their stories?”

Using k-means cluster analysis, an unsupervised machine learning method that finds natural groups in unlabeled data, Frymire identified the words and themes associated with #MeToo. Simply put, k-means clustering is an algorithm that sorts a data set into a chosen number of groups based on similarity, so instead of the data scientist defining the groups themselves, the groups form without human intervention. For example, if a tweet contains the words “Trump” and “vote,” it is grouped with other tweets that use similar words. As a result, tweets with words like “Trump” have ended up in clusters with tweets using “Clinton” and “Democrat.” The algorithm doesn’t know they are political tweets, but based on the words within each tweet, it can tell that they belong together.
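To make the idea concrete, here is a minimal sketch of k-means on a handful of toy tweets. It is not Frymire’s pipeline: the article does not say which tools or text features she used, so the example tweets, the simple word-count vectors, the function names, and the hand-picked starting centroids are all assumptions made for illustration.

```python
from collections import Counter

def vectorize(tweets):
    """Turn each tweet into a word-count vector over a shared vocabulary."""
    tokenized = [t.lower().split() for t in tweets]
    vocab = sorted({w for words in tokenized for w in words})
    vectors = []
    for words in tokenized:
        counts = Counter(words)
        vectors.append([counts[w] for w in vocab])
    return vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, init_indices, max_iter=100):
    """Plain k-means: alternate between assigning points and moving centroids.

    init_indices (an assumption of this sketch) names the tweets used as
    starting centroids, which keeps the toy example deterministic.
    """
    centroids = [list(vectors[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each tweet joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # no tweet changed cluster, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Made-up tweets, not taken from the #MeToo data set.
tweets = [
    "trump vote election",
    "vote clinton democrat",
    "trump clinton vote",
    "courage women believe",
    "women courage power",
    "believe women love",
]
labels = kmeans(vectorize(tweets), k=2, init_indices=[0, 3])
print(labels)
```

Nothing in the code knows what “political” means. The first three tweets end up together only because they share words such as “vote,” “trump,” and “clinton.”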

The result of Frymire’s project is a series of clusters: 425, to be exact. The clusters vary in size. One, for example, includes 3,641 tweets brought together by the shared use of words like “courage,” “women,” “finally,” “power,” “believe,” “people,” “love,” and “alone.” Another, of 2,437 tweets, features “rape,” “abuse,” “wondering,” “public,” “men,” and “real” as its top words. When analyzing the clusters, Frymire discovered details such as the fact that the word “power” often occurs in tweets about sexual assault in the workplace.
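One simple way to arrive at a cluster’s size and its “top words” is to count tweets and word frequencies per cluster. The sketch below assumes raw frequency is the ranking criterion, which the article does not confirm. The tweets, cluster ids, and the `top_words` helper are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (tweet text, cluster id) pairs; a real run would take the
# cluster ids from the output of the clustering step.
assigned = [
    ("courage women believe", 0),
    ("women courage power", 0),
    ("believe women love", 0),
    ("trump vote please", 1),
    ("vote trump end", 1),
]

def top_words(assigned, n=3):
    """Return the n most frequent words in each cluster."""
    counts = defaultdict(Counter)
    for text, cluster in assigned:
        counts[cluster].update(text.lower().split())
    result = {}
    for cluster, counter in counts.items():
        # Sort by descending count, breaking ties alphabetically.
        ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        result[cluster] = [word for word, _ in ranked[:n]]
    return result

tops = top_words(assigned, n=2)
sizes = Counter(cluster for _, cluster in assigned)
print(tops)
print(dict(sizes))
```

In practice, very common words would usually be filtered out before counting, or they would dominate every cluster’s list.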

“Once I used an unbiased solution to make the groupings, I could then put my influence on top,” says Frymire. As she looked through her findings, several themes emerged, which Frymire then organized into a series of data visualizations for her project’s web platform. Her first theme is the political clusters, whose tweets have top words including “vote,” “please,” “end,” and “Trump.” Scroll over the visualization online, and you can read the individual tweets that make up the cluster: the larger the dot representing each tweet, the more it was shared. “#MeToo isn’t a political issue, but it was taken to be one,” says Frymire.

The second theme is workplace clusters. “#MeToo sparked conversation about sexual harassment in the workplace,” says Frymire. “With these clusters, we can observe how a core tenet of the movement is the abuse of power.” Next, the angry clusters reveal the anger of both supporters and critics of the movement; their tweets are riddled with words like “fucked,” “shit,” and “disgusting.” The conversation clusters feature inspiring stories and set out to encourage change and discussion. Lastly, the uplifting clusters exemplify how the movement created a community of support and how it commends the brave women and men who have come forward.

“Although I have my own interpretation of the findings, I think it’s important for all of us to find power and growth in this movement,” says Frymire. “This work is not done and never will be. Not only is it growing, but the data is expanding. There are so many ways to understand digital native movements.”

When a moment, or a movement, occurs online, it often gets reduced to numbers. We see how many retweets or likes a sentiment has received, or we hear on the news that X number of people have used a particular hashtag. Yet how, or why, a hashtag is used often gets lost: there is very rarely data on how it’s interpreted, hijacked, or twisted, or on how it unites and inspires. “I wanted to do more than count voices,” says Frymire. “I wanted to make sure voices are also heard.”