The Journal of Sonic Studies

Journal of Sonic Studies, volume 3, nr. 1 (October 2012)

EDITORIAL: Rethinking Theories of Television Sound

Carolyn Birdsall, Anthony Enns

Studies on television sound typically begin by emphasizing that television, unlike film, relies more heavily on sounds than images and that the sound practices used in the production of television’s primary genres (including news, sports, game shows, sitcoms, commercials, etc.) are based on practices developed not for film sound but rather for radio. For example, in his 1982 book Visible Fictions John Ellis argues that television, unlike film, employs sound “to ensure a certain level of attention, to drag viewers back to looking at the set” (Ellis 1982: 128). Sound is more important for television, in other words, because it appeals to the sense of hearing rather than the voyeuristic pleasures of the cinematic gaze. Rick Altman’s 1986 essay “Television/Sound” similarly argues that film viewers assume “the stance of the voyeur,” while television employs sound to draw the viewer’s attention away from “surrounding objects of attention” (Altman 1986: 50). Altman also argues that the television soundtrack “serves a value-laden editing function, identifying…the parts of the image that are sufficiently spectacular to merit closer attention on the part of the intermittent viewer” (Altman 1986: 47). Altman thus concludes that the average viewer watches television intermittently, and the soundtrack enables these “intermittent viewers” to follow the program even if they are not watching the images.

The assumption underlying Altman’s argument is that television is primarily viewed in the home and that its target audience consists of housewives. This assumption was directly inspired by Tania Modleski’s famous argument that the structure of television programming is based on the rhythms of domestic labor. A viewer engaged in household duties can thus be considered an “intermittent viewer,” and television programs appeal to these intermittent viewers by “addressing the audience and involving the spectators in dialogue, enjoining them to look, to see, to partake of that which is offered up for vision” (Modleski 1983: 50). The techniques used to structure this appeal were also inherited from radio, another medium that was received in the home and that was forced to compete with household distractions for the listener’s attention.[1]

These arguments have often been echoed in subsequent studies on television sound. In his 1985 essay “The Television Spectator-Subject,” for example, Robert Deming echoes Ellis’ argument that television primarily appeals to the sense of hearing: “[S]ound dominates [and] ensures continuity of attention through direct address” (Deming 1985: 50). Ellen Seiter’s 1987 essay “Semiotics, Structuralism and Television” also supports Altman’s notion that the television soundtrack is designed to draw the attention of “intermittent viewers” who are engaged in household chores: “Because television is a domestic appliance that we tend to have on while we are doing other things—cooking, eating, talking, caring for children, cleaning—our relationship to the television set is often that of auditor rather than viewer” (Seiter 1992: 32). Michel Chion also supports Ellis and Altman’s distinction between film and television in his 1990 book Audio-Vision, in which he argues that “in the cinema everything passes through an image or rather through a place of images,” but “sound, mainly the sound of speech, is always foremost in television” (Chion 1994: 157). Chion adds that unlike the use of sound in the cinema, television sound is “always there, in its place, and does not need the image to be identified” (Chion 1994: 157), which explains why “silent television is inconceivable” (Chion 1994: 165). Chion thus concludes that “television is fundamentally a kind of radio, ‘illustrated’ by images” (Chion 1994: 165). Patricia Holland similarly claims that “the flow of sound holds television programs together” (Holland 2000: 79), and Jeremy Butler even suggests that “[s]o little is communicated in the visuals of some genres…that they would cease to exist without sound” (Butler 2007: 228). Herbert Zettl similarly posits that “television is definitely not a predominantly visual medium.... All television events happen within a specific sound environment, and it is often the sound track that lends authenticity to the pictures and not the other way around” (Zettl 2005: 328-329). Michele Hilmes also emphasizes that “television owes its most basic narrative structures, programme formats, genres, modes of address, and aesthetic practices not to cinema but to radio” (Hilmes 2008: 160).[2] Jane Stadler similarly argues that television “is situated in people’s homes and uses sound to evoke an inclusive, familiar relationship with the audience” (Stadler 2009: 68), and she particularly emphasizes how television sound is used to manipulate audiences, as “television producers tend to amplify the sounds to which they want us to listen” (Stadler 2009: 69). By employing laugh tracks and raising the volume of television commercials, for example, producers are able to direct the viewer’s attention and influence their responses (see Smith 2008: 15-49). These critics thus agree that television sound not only provides coherence to the images and maintains the televisual flow, but it also serves as the primary tool of ideological interpellation by “hailing” the audience, an argument that was already put forward in Philip Tagg’s 1979 book on the Kojak series, which used the military metaphor of the “reveille” to describe how the listener is affectively positioned through the use of sound and music (Tagg 1979). These critics also agree that the fundamental differences between film and television (in terms of structure, content, and modes of address) are a direct result of the fact that film privileges the eye, while television privileges the ear. This distinction aligns television much more closely with radio than with film, and the notion of television as “illustrated radio” has since become the basis of television sound studies.[3]

In recent decades, however, the rise of widescreen televisions, high-definition receivers, home entertainment systems, and 3D technology has effectively brought the cinematic experience into the living room, which has challenged the persistent belief that television is primarily an acoustic rather than a visual medium.[4] John Caldwell points out, for example, that “even when Ellis was writing, MTV and Miami Vice and miniseries like Shogun had established highly visual arenas for narrative, music, and drama,” and “in the decade that followed a growing concern with stylishness evolved out of these forms” (Caldwell 1995: 365). Caldwell thus concludes that “the image has not acquiesced to any inherent low-resolution nature of the sound-driven video image. Far from it. There is, in fact, an obsession with making images that spectacularize, dazzle, and elicit gazelike viewing” (Caldwell 1995: 158). While television images have become increasingly cinematic, so has television sound, as Kevin Donnelly points out in his 2005 book The Spectre of Sound: “Due to the proliferation of ‘home-cinema’ set-ups with multiple speakers including woofers…television has begun to catch up with the sort of sonic experience for which many of the films it shows were actually designed” (Donnelly 2005: 112). With the introduction of stereo and surround sound speaker systems, in other words, television sound has finally “caught up” to the technical standards set by the film industry. For critics like Donnelly, therefore, the rise of “home cinema” requires scholars to shift their emphasis away from radio practices and begin applying new approaches to the study of television sound. Donnelly emphasizes a variety of practices used in the production of television music, for instance, that mirror long-standing practices used in the production of film music, such as the design of “underscores” to convey emotional states and enhance narrative tension.[5]

In addition to the rise of “home cinema,” the production, distribution, and reception of television have undergone a number of other changes in recent years, which further complicate earlier approaches to television sound. First, Altman’s notion that television sound mediates the relationship between “household flow” and “programming flow” is increasingly problematic, as television is now exhibited in a wide range of locations besides the traditional living room, such as lobbies, malls, airports, sports stadiums, and concert halls, where it often assumes the function of an electronic billboard (McCarthy 2001: 111).[6] Chion’s claim that “silent television is inconceivable” also appears to be obsolete, as viewers are increasingly watching television without sound.[7] The use of a “laugh track,” which has been a fundamental part of the sitcom genre and was inherited directly from radio practices, has also been falling out of use in recent years, due in no small part to the fact that the “liveness” of television is increasingly being de-emphasized.[8] This shift in television sound practices has important ideological implications, as the laugh track was one of the principal tools through which television audiences were interpellated by television programs. As Karen Lury points out, “the laugh track is one of television’s most explicit attempts to promote the illusion of sociability, to suggest that television viewing is a social rather than an individual encounter” (Lury 2005: 83). By no longer guiding audience responses, therefore, the absence of a laugh track fundamentally alters the interpellative function of television sound.

Second, the notion of television as “home cinema” is also problematic due to the increased use of portable devices for viewing television programs. These devices include, but are not limited to, laptops, tablets, and cell phones, which transmit sound over low-quality speakers or headphones. The use of these portable devices signals a shift away from home viewing practices as well as a shift away from high fidelity sound reproduction, as portability, transferability, and ease of access have become more important than the simulation of a cinematic experience. The use of these portable devices thus problematizes the “home cinema” model of television and its continued relevance for understanding the function of television sound.

Third, the rise of new media interfaces and web viewing platforms has also altered the experience of watching television. Not only do these interfaces alter the context in which television programs are viewed by time-shifting, eliminating advertising, and removing programs from televisual flow, but they also provide metadata about the programs and their content, which further de-emphasizes the “liveness” of televisual texts.[9] Contemporary television programming is thus viewed in a wider range of contexts, it is viewed on a wider range of devices, and it is increasingly mediated by new media interfaces, which have introduced new sound practices and new modes of address that demand a reevaluation of the form and function of television sound. For example, Altman’s argument concerning television sound was based on institutional practices like the Nielsen rating system, which only calculated the number of television sets in the home that were turned on rather than the number of viewers who were watching them:

The Nielsen rating system…assumes that active viewing is the exclusive model of spectatorship, yet there is a growing body of data suggesting that intermittent attention is in fact the dominant mode of television viewing…. Since network strategists aim not at increasing viewership but at increasing ratings, and since those ratings count operating television sets rather than viewers, the industry has a vested interest in keeping sets on even when no viewers are seated in front of them. (Altman 1986: 42)

It remains unclear, however, whether this argument would still be relevant at a time when television is increasingly viewed in contexts and through devices that are part of a different economic model (i.e. subscription and pay-per-view).[10] If network decisions are less influenced by traditional rating systems, then this would suggest that the television soundtrack no longer serves the same economic function.

The problems outlined above show that there is an urgent need for television scholars to rethink the existing theories of television sound, and the essays collected in this special issue of the Journal of Sonic Studies are intended to provide the first step towards this goal by offering a reexamination of some of the most persistent accounts of television sound from the 1980s to the present. These essays examine the technological and aesthetic changes that have accompanied the rise of new technologies, production practices, and listening perspectives over the past few decades, and they draw on a wide range of genres and categories of television sound, including commentary, voice-over, sound effects, and music soundtracks.

Justin St. Clair’s “White Noise and Television Sound,” for example, looks at the cultural discourse surrounding television sound by comparing Ellis and Altman’s theories to the representation of television sound in Don DeLillo’s 1985 novel White Noise. Like Ellis and Altman, DeLillo’s novel represents television as primarily an acoustic rather than a visual medium, one that employs sound to manipulate and direct the attention of the presumably distracted and inattentive audience. The novel illustrates the power of television sound by describing television as a “ubiquitous aural backdrop in the home” and by showing how the inner lives of the characters (their unconscious dreams and fantasies) are saturated with sound bites from corporate advertising campaigns. The novel also illustrates the power of television sound by muting the television set and by cross-tracking televisual images with unrelated sounds. The former technique makes readers more aware of the impact of sound through its absence, while the latter technique produces juxtapositions of audio and video that are often highly ironic, which effectively undermines the medium’s ability to manipulate and control its audience. St. Clair concludes that DeLillo’s novel “renders television audio an ironic literary device capable of turning the medium back upon itself” by showing readers how television sound permeates their consciousness yet bypasses their conscious awareness. The silent realm of print thus becomes a privileged space where the sonic effects of television can be scrutinized and critiqued.

David Sedman’s “The Legacy of Broadcast Stereo Sound: The Short Life of MTS Audio, 1984-2009” examines how television sound dramatically changed over the course of the 1980s and 1990s. Sedman points out that the development of stereo television in the 1980s was largely driven by improvements in cinema sound. Theaters had already begun promoting their ability to screen films with multi-channel audio in the late 1970s, and the television industry took advantage of this growing interest in stereo sound by heavily promoting programs that employed multi-channel sound mixes, such as late night talk shows, sports programs and prime-time dramas. Stereo sound played a particularly significant role in the development of television programs like Miami Vice and Cop Rock, whose visual and acoustic aesthetics were borrowed directly from music videos. Stereo sound also made it possible for television sound engineers to employ film sound practices by attaching “placement of actors onscreen to an audio space.” A similar shift occurred in television sports, where new methods of microphone placement were used to create a surround sound experience in the home. At the same time that television scholars were first developing the theory of television as “illustrated radio,” therefore, television itself was already in the process of transforming into “home cinema.”

Svein Hoier’s “The Relevance of Point of Audition in Television Sound: Rethinking a Problematic Term” examines the rise of increasingly ambitious sound design in contemporary television, which was made possible through the development of multi-channel audio. Hoier begins by examining the debate surrounding the term “point of audition” (POA). Although this term has been criticized for its dependence on a visual metaphor that does not adequately convey the unique properties of sound, Hoier argues that it is particularly useful for contemporary television scholars, provided that it is expanded into four categories that better convey the diversity of television sound effects: 1) observational POA, which views the action from a distance, 2) active POA, which is closer to the action and conveys a greater sense of intimacy with the actors, 3) individual POA, which conveys the subjective experiences of a particular character, and 4) personal POA, which conveys the inner thoughts, feelings, emotions, or memories of a character. Active POA is extremely common on television programs that employ multiple microphone placements, such as talk shows, televised debates, game shows, morning shows and other voice-centered studio productions. Individual POA is more often used on prime-time dramas where characters experience a highly emotional state, and personal POA is used to give access to a character’s internal world, such as memories, daydreams, hallucinations, or fantasies. Hoier argues that these categories are particularly useful when analyzing contemporary “high end” or quality dramas, which often employ ambitious sound design techniques to convey the inner lives of the characters. These categories are also useful for describing contemporary sports programs, which similarly employ multiple microphones and prerecorded sound effects to produce a more intimate, “close up” sound. Like Sedman, therefore, Hoier concludes that the techniques and aesthetic devices used in contemporary television sound design are consistent with those previously only used for film soundtracks.

James Deaville’s “The Envoicing of Protest: Occupying Television News through Sound and Music” focuses on the use of sound in contemporary television news coverage—particularly the coverage of protests and demonstrations. Deaville begins by describing the sonic tactics that have historically been employed by protest movements, such as the development of the “human microphone” in the 1930s, the use of collective chanting and clapping in the Civil Rights protests of the 1960s, and the use of music in anti-Vietnam war protests. Deaville also examines the history of television news sound beginning with Fox Movietone newsreels in the late 1920s and continuing through the coverage of the Civil Rights movement, the anti-war movement, the Middle East protests of the late 1970s and the more recent Occupy Wall Street (OWS) movement. While protest movements have always employed sound to draw media attention, Deaville argues that the use of sound in television news broadcasts always reflects an underlying ideological bias. By analyzing the tension between the “tactical deployment” of sound and its “manipulative framing” by broadcasters, Deaville argues that television sound represents a site of struggle where the competing interests of protesters and broadcasters are constantly being negotiated. In the coverage of the OWS protests, for example, the participants actively sought a “sonic advantage” by borrowing practices from the Civil Rights and anti-war movements (chanting, clapping, call-and-response, etc.), while the sonic representation of the movement in news reports “rendered Occupy vulnerable to trivialization.” Regardless of how these sounds were framed by media outlets, however, Deaville concludes that the chants of protesters still penetrated into living rooms across the country, and they still have the potential to effect positive change.

Cormac Deane’s “The Hiss of Data” explores the transition to digital television by analyzing the kinds of sounds that are attributed to digital media in prime-time television drama. Deane focuses in particular on CSI and 24, which feature highly advanced digital interfaces and whose narratives focus on “how the forces of law and order fight the battle against criminality (in CSI) and ‘terrorism’ (in 24) to a large degree by means of computation.” Deane begins by examining “fantasy user interfaces” (FUIs) or fictional screens that are designed to represent computation. Deane is particularly interested in the sounds that accompany these FUIs, which transform the computer into an active agent while simultaneously distracting the viewer “from the mundane reality that forensic and other scientific processes on computers take time, that they often produce no valid results, and that results are not usually rendered in immediately readable visualizations.” Computers are thus imagined to be highly efficient tools for processing data, yet in order to maintain this fiction it is necessary to add a sonic element that distracts the viewer from the computer’s intrinsic deficiencies. According to Deane, this sonic element also reflects the “political characteristics of our media environment.” In his conclusion, for example, Deane suggests that television, like FUIs, is saturated with noise, and that television sound, like the sounds that accompany FUIs, serves to disguise this noise as information. Television sound thus provides what the technology itself cannot: it conceals the “deficiencies” of television by encouraging viewers to interpret the noise of television programs as meaningful signals.

Recent changes in the production and exhibition of television thus demand that scholars reevaluate existing theories of television sound and formulate new theories that address the rise of new technologies, production practices, and listening perspectives. The goal of the following issue is to begin this process of reevaluation by reviewing the major trends in the study of television sound and problematizing some of its central notions. While some of the essays collected in this issue focus on how technological changes have influenced the development of new production practices, and others introduce new concepts for describing particular sonic effects, they all clearly demonstrate that the notion of television as “illustrated radio” no longer provides a convincing account of the nature of contemporary television sound.


1. Two instructive accounts of the relationship between radio and listening attention are Kate Lacey’s “Toward a Periodization of Modern Listening” (Lacey 2000: 279-288) and Jo Tacchi’s “Radio and Affective Rhythm in the Everyday” (Tacchi 2009: 171-183).

2. For an example of how Chion’s “illustrated radio” concept of television continues to circulate in recent scholarship, see Claudia Gorbman’s foreword to Music in Television (Gorbman 2011: ix).

3. In earlier television studies, the question of the voice was mainly subsumed within discussions of televisual “talk” (see Scannell 1991; Corner 1995). One recent example that extends the scope of existing scholarship is Shawn VanCour’s account of the technological developments in television sound (VanCour 2011: 57-79).

4. Barbara Klinger, in particular, has discussed the rise of the “home cinema” ideal in the 1990s and 2000s (Klinger 2006).

5. This focus on musical soundtracks has extended to more recent publications, such as Ron Rodman’s Tuning In (Rodman 2010), which does not elaborate on the broader issue of television sound. For two journal issues on the topic of television music, see Popular Music (Negus and Street 2002) and Perfect Beat (Evans 2009).

6. Anna McCarthy argues that television monitors have been prevalent in bars and stores since the 1940s (McCarthy 2001).

7. See, for instance, Karen Lury’s discussion of silent television screens with only the potential to be heard (Lury 2005: 85-87).

8. Some notable examples of this recent trend include Arrested Development (2003-2006), The Office (2005-), Parks and Recreation (2009-) and Modern Family (2009-). See also Jacob Smith’s chapter on the history of the laugh track (Smith 2008: 15-49).

9. For more on the importance of “liveness” in television studies, see Jane Feuer’s “The Concept of Live Television: Ontology as Ideology” (Feuer 1983: 12-21) and Mimi White’s “The Attractions of Television: Reconsidering Liveness” (White 2004: 75-92).

10. For more on the changing economic model operating within American television from the 1980s to the 2000s, see Ted Magder’s “Television 2.0: The Business of American Television in Transition” (Magder 2009: 141-164) and Chad Raphael’s “The Political Economic Origins of Reali-TV” (Raphael 2009: 123-136).


Carolyn Birdsall is Assistant Professor of Media Studies at the University of Amsterdam. Her research interests are in the fields of media and cultural history, with a particular focus on radio, film and television sound, non-fiction and urban studies. Her recent monograph Nazi Soundscapes (2012) examines the significance of radio and sound systems in urban environments during Weimar and National Socialist Germany. She is also co-editor of Sonic Mediations: Body, Sound, Technology (2008) and Inside Knowledge: (Un)Doing Ways of Knowing in the Humanities (2009).

Anthony Enns is Associate Professor of Contemporary Culture in the Department of English at Dalhousie University in Halifax, Nova Scotia. His work on television has appeared in such journals as Popular Culture Review, Studies in Popular Culture, Journal of Popular Film and Television, Quarterly Review of Film and Video, and Science Fiction Studies.