When Stanley Kubrick set out to design the videophone featured in his 1968 film 2001: A Space Odyssey, he enlisted help from a Bell Labs researcher. The invention they came up with sits inside a public booth behind a hydraulic glass door and features a full-screen, color image with a camera that follows its subject. A few years earlier, in The Jetsons, videophones perched on walls and tabletops, both at home and in the office, appearing like midcentury televisions with antennas. Video calls also featured on Star Trek in the mid-’60s, where they were displayed on large, conference room-style screens on the bridge of the USS Enterprise.
In all of these retrofuturistic depictions, video call technology has a kind of analog elegance that seems far removed from the digital video calls we make today from our cell phones, tablets, and computers. One important element of video conferencing that fiction did not anticipate? The image of oneself that appears onscreen alongside the participants in the call, commonly referred to as “self-view.” Self-view is not a particularly new feature of today’s video conferencing platforms, and until recently it’s gone relatively unexamined in the popular consciousness. But over the past few months, as COVID-19 has made Zoom calls, Google Hangouts, and FaceTime a regular part of daily life, the prospect of staring at yourself for hours at a time has spurred all kinds of new concerns and discomforts. All of which raises the question: Why does all video chat technology today include the self-view, and where did it come from?
When “self-view” first appeared
Like their fictional counterparts, the real-life versions of video call technology that predated the internet didn’t include self-images the way we see them today. The AT&T Picturephone, for example (which also came out of Bell Labs), was a technically sophisticated but prohibitively expensive two-way video phone first shown in 1964 at the New York World’s Fair. It looked like a desktop television, with a rounded back and a five-inch black-and-white screen. The display options were binary: you could choose to put either yourself or the caller on the screen, with the self-view more or less functioning as a pre-conversation glance in the mirror. The Mitsubishi Luma 2000, an office phone released in 1986, allowed callers to send black-and-white stills of themselves mid-conversation. The following year, Mitsubishi released VisiTel, a home version of the technology that only sent images while the caller spoke on a landline.
When the initial models of PictureTel video conferencing systems appeared in the mid-1990s, they looked a bit like what sci-fi production designers had imagined. They transmitted color images between large monitors (sometimes wired to a television screen) and had dynamic cameras. Desktop video conferencing programs of the 1990s, like CU-SeeMe and VDOPhone Internet, which featured self-view in a separate window, were among the first platforms to offer users a view of their camera’s feed mid-conversation. Early versions of the Polycom ViewStation, revolutionary when it was released in 1998 at six pounds and “only $6,000,” had a built-in “picture-in-picture window (PiP),” which seems to have set the norm for incorporating a self-image into a larger full-screen display. When Skype started offering video calls in 2005, it incorporated self-image into its interface, and that image later became larger with advancements in resolution. As video conferencing hardware got smaller and more mobile, the software got bigger, and displays became more complex.
It seems that designers started incorporating self-images into video call interfaces simply because they could. Almost twenty years after Skype became a verb, Shawn Sprockett, design director of emergent experiences at Godfrey Dadich Partners, says that self-view is a “locked-in” feature that people have come to expect. For many designers, too, this convention has become a given. At his last job, on the UX team for Portal, Facebook’s line of in-home video call devices, Sprockett says no one even considered removing it. One reason this feature has remained so consistent, he believes, is that it solves the problem of orientation. “When I’m sharing a space with you, I know what you can see of me,” he says. “If I stand behind a pole, I know what you can’t see. You don’t have that same advantage when you’re doing a video call.” In other words, self-view shows you the image you’re presenting to others, so you can tell whether you’re out of frame, unflatteringly angled, or poorly lit.
Yet for all its usefulness in self-presentation, the self-view can also be a distraction. In a 2017 study published in Computers in Human Behavior, researchers at Marquette University found that seeing oneself during video calls negatively impacted team performance and individual satisfaction with both process and outcomes. They offered two possible explanations as to why: increased self-awareness and cognitive overload. According to the study, a view of oneself “shifts individuals’ focus from the environment and task,” while too much information of any kind leads to lower performance. Given this, the researchers suggest that “as technology and system bandwidth increase, individual virtual team members may actually become less effective.” Unless, they note, users have the option to turn off self-view, a feature that Zoom offers but most of its competitors currently don’t.
New views on the horizon
As we’ve become acutely aware amid the pandemic, today’s video chat options are still a long way from recreating the feeling of seeing and talking to someone IRL, sitting across a conference table from you. Therefore, we use self-view to remind ourselves to stay attentive. In my conversations with UX designers, the idea of “presence” came up as a consistent goal in video chat design; it’s what can differentiate one platform’s experience from another. Most of today’s prevalent video platforms (Zoom, Skype, Google Hangouts) are one-dimensional and have limited-range cameras. But that’s beginning to change, and when it does, it could reduce the need for self-view.
Take, for example, concepts like The Square, a multi-camera LED screen that looks like a window, or Portal products, which include tabletop devices and televisions. Giving platforms a fuller, multi-camera view of the spaces we live and work in seems to be the direction video platforms are moving in. A more 3D perspective on both sides of a call would make conversations feel more real, but would also require users to share more of their own environment.
Jonas Christiansen, head of product at Bunch, an app that combines video chat with games, estimates that the impact of COVID-19 has accelerated adoption of existing video conference technology by five years. He thinks that whatever platform displaces Zoom will likely come from the bottom up, not the corporate world. As was the case with Facebook, “teenagers will use it first and make it cool,” Christiansen says, and then the rest of us will take notice. It seems unlikely, then, that the next big advancement in video call technology will be hardware. So if we’re stuck with software advancements on existing devices for the moment, a greater adoption of a “hide self-view” option might be the next big design evolution in video calls.
Esther Crawford, CEO of Squad, a mobile and desktop video chat platform that’s been described as the “Gen-Z Zoom,” has no interest in redesigning her product for B2B customers. Squad has a self-view mode that can’t be turned off without opting out of video entirely, but the company has also been toying with voice and image filters, text overlay, and the ability to customize video layouts. She thinks that mainstream video chat services have looked the same for so long because they were designed by the same kinds of people and for the same kinds of uses. Indeed, when watching the promotional videos for Picturephone, it’s hard not to think about how most enterprise-first products are built for and by those who tend to hold the most power in business. But as video chats become a larger part of socializing outside of the corporate world, it makes sense that we’ll start to see more playful interfaces and features.
Still, change is slow in technology made for work. (See: Microsoft Outlook.) Despite the visibility of the Brady Bunch-style tile interface (and Google Meet’s recent decision to add a gallery view to compete with Zoom), the default presentation style for both Zoom and Google Meet is an active speaker design, which makes whoever is talking larger than the other participants in a call. Giving equal space to every person seems like an obvious design choice, but in a corporate culture where the loudest voice is the one that gets listened to, it’s not surprising that this imbalance is the norm. With people from a wider range of industries and demographics than ever before now using video calls regularly, we may start to see more challenges to the norms that have defined the look of virtual communication.
This story is part of an ongoing series about UX Design in partnership with Adobe XD, the collaboration platform that helps teams create designs for websites, mobile apps, and more.