Spike Jonze uses close-ups to teach us about him in Her

Most of us are accomplished watchers of TV and film, so we intuitively understand some of the concepts lurking in dense film-theory tomes. You won’t need them for Internet Film School, The A.V. Club’s column about film and television. In each installment, we explore a basic element of visual composition and analyze examples to understand how the formal properties of film and television manipulate viewers.

Spike Jonze’s Her is a conventional love story with an unconventional constraint—one of the lovers is an inhuman, disembodied consciousness that interacts with the other via an earbud. Typically, romances rely on a small stable of predictable-but-effective techniques that convince the audience it’s witnessing the first, chemical blush of fresh love.

The most basic of these techniques is the two-shot, in which the director places both prospective lovers in the same frame. A series of two-shots, stacked one after the other, has a cumulative effect on the audience, which begins to expect to see these two characters together in every shot. After a while, shots that only contain one of the lovers will strike the audience as oddly empty, even if the sole lover in it is centrally framed in a way that would make it impossible for the other to be in the shot. By manipulating audience expectations in this way, the director turns the missing lover into an absent presence in the film, something the audience wants to see. If the director only includes one lover in shots for an extended period of time, the audience will begin to feel that something is “wrong,” because the director is confounding the expectation he or she created. When the director relents and fulfills that expectation with a two-shot of the lovers reunited, the frame suddenly seems somehow more “correct” to the audience.

The problem Jonze faced in directing Her becomes obvious: Because “she” is an operating system, Scarlett Johansson’s “Samantha” can’t ever occupy the same frame as Joaquin Phoenix’s Theodore. Jonze cannot rely on the emotional resonance created by on-screen proximity, but he still needs to find a way to make the audience feel that these lovers share a profound connection. His solution is simple: He will force the audience to study Phoenix’s face, repeatedly and at some length, by employing unnecessary jump cuts and unnecessarily slow zoom-ins.

That these cuts and zooms involve Phoenix’s face is no accident. Human beings are hardwired to read faces. Large swaths of mental real estate are devoted to facial processing, to recognizing that certain configurations of lips and eyes correspond to a particular set of emotions. The hardware behind facial processing is so powerful that people frequently see faces where none actually exist—for example, on a piece of burnt toast or when a close-parenthesis follows a colon. Jonze exploits this innate desire to understand faces, creating an emotional connection to Theodore that approximates the power of expectation two-shots would have produced.

To wit, when Theodore first decides to acquire his new operating system/future girlfriend, Jonze begins the scene with an off-center medium shot…

The compositional imbalance of the shot alerts the audience to the fact that something is wrong with Theodore, that something is missing. Had he been in the center of the shot, he would have possessed a more commanding presence, almost as if the world were centered on, or at least presently revolved around, the power of his intellect in action. “Leave this important man alone,” it would have suggested. “He is having important thoughts.”

Instead, the audience is presented with an image of a man incapable of dominating a frame even when he’s the only significant element in it. Put differently, he seems as important as the empty space frame-right, which is only important by virtue of its emptiness. If there were something or someone filling it up, after all, he might not even be contemplating buying this new operating system, which is what he’s doing.

Jonze cuts to the commercial for the operating system, then back to Theodore, only instead of reversing back to the original medium shot, he reverses back to a medium close-up. The imbalance still exists, as does the empty space it creates, but it has lost a little of its severity. Watching this commercial has, visually speaking, made Theodore seem a little less lonesome and a little more in control.

More importantly, Jonze has strongly signaled to members of the audience that he wants them to stare at Theodore’s face. His mouth is saying something, even though nothing is currently leaving it, and Jonze is providing the viewer with the images required to further understand what Theodore’s silent face is trying to say. He cuts back to the commercial, then back again to Theodore…

Yet Jonze returns not to the medium close-up he cut away from, but to a proper close-up that is almost perfectly centered, as if making the decision to acquire “Samantha” has provided Theodore a way to become the central element in his own mental environment. Moreover, members of the audience can continue studying Theodore’s face, because Jonze has once again moved them closer to it. If he had only done this once or twice, a viewer would not begin to expect ever-increasing access to the micro-expressions that flash across Phoenix’s face. But Jonze uses this technique throughout the film, creating a pattern of medium shots that either jump-cut or zoom into medium close-ups or close-ups.

For example, when Theodore plays the ukulele for Samantha, Jonze opens with a long shot…

Before jump-cutting to a close-up…

This conversation is shot such that it follows the pattern established by Theodore’s first encounter with her: The camera cuts from an off-balance longer shot to an almost perfectly centered closer one. In compositional terms, Jonze informs the audience that the more Theodore interacts with Samantha, the more “centered” his life becomes; at the same time, he also provides the audience with the means of understanding the emotional balance “she” provides him by cutting closer and closer to his human face. The part of the human brain that processes facial patterns is, with great regularity, being called into action, such that the audience increasingly expects that it will be provided with access intimate enough to read each and every meaningful wrinkle on Theodore’s face.

As Her progresses, long and medium shots begin to feel inadequate to an audience now fully primed to perform hardcore facial analysis. Longer shots lack the visual information the audience now requires—nay, demands—about Theodore’s emotional state. The distance between the camera and Phoenix’s face acquires a psychological dimension, so longer shots make the audience feel like it isn’t as “close” to Theodore as it had been. Zoom-ins like this one become extended exercises in agony…

Over the 32 seconds it takes to move from that initial medium close-up to that final extreme close-up, the audience is slowly doled out more of the facial information it is expecting. Instead of arriving with a brusque cut, though, the additional information is acquired gradually, developing over time, lending the process itself a melancholic twinge not unlike the one that accompanies a slow, painful realization.

Hammering home this impression is the fact that the initial medium close-up possessed compositional balance, whereas the final close-up is off-center. When Theodore first encountered “her,” Samantha brought a sense of balance to his life and to Jonze’s framing of it. Here at the end, though, as Theodore is losing “her,” Jonze reverses the logic of those earlier shots, using centered shots that become increasingly imbalanced as they move closer to Theodore’s face.

Theodore has returned to his imbalanced life, only now it seems sadder to members of the audience, because they are acutely aware not only of the centeredness he has lost, but of the toll that loss has taken on a face they’ve come to know so well.