Isaac Vayo


This study examines the present predominance of visuality in relation to narratives of 9/11, concluding that aurality, typically undervalued in such conversations, is a more accurate and effective representation of 9/11-as-event. Within the broader field of 9/11 aurality, three specific examples are subject to more lengthy analysis in terms of their original context and their presentation to audiences via popular media: the voices of pilot-hijackers Mohamed Atta and Ziad Jarrah, the impacts of those jumping from the burning World Trade Center towers, and George W. Bush’s 14 September 2001 speech delivered from atop the rubble at the World Trade Center site. 9/11 aurality, then, succeeds where the visual imagination fails, allowing its account of the event to persist generationally, its internal logic to exist rationally, and its chief interlocutor, Osama bin Laden, to continue the discourse verbally.

In their pseudo-definitive document The 9/11 Commission Report, The National Commission on Terrorist Attacks Upon the United States reaches the conclusion that 9/11 arose from what it termed “a failure of imagination” (The National Commission on Terrorist Attacks Upon the United States 2004), meant in context as an indictment of shortcomings in information-sharing between intelligence organizations and administrative shortsightedness in both the Bush and Clinton administrations. For the Commission, a full accounting of the event is rooted in this imagination, an attempted visualization that accords directly with the visual fixity sought after in both the course of the event and its subsequent portrayal in mainstream media artifacts, an image-ination of sorts. Such a flawed fixation on fixity reflects both the insufficiency of the visual to fully capture 9/11-as-event, as well as the accompanying (and certainly undervalued) importance of the role played by the aural in 9/11 and its aftermath.

What appears as a failure to envision, to appropriately frame 9/11 within the field of the visual in the words of the Commission, is more essentially a failure to listen. It is not simply the visual that fails; it is also that the aural is not given the opportunity to succeed, an omission of a positive trajectory in which the event would find a more welcoming residence within the field of the aural. This is not to say that the event is not yet deeply visual; for many, it is only through live television coverage that 9/11 is first experienced (or firsthand witnessing for those in closer proximity), the iconic images of the smoking towers and their subsequent demise preceding any aurality to follow, with only the roar of pancaking floors and the terror of onlookers registering sonically. Yet, while initially deeply visual, that visuality is paradoxically lacking in depth, is superficial in its preference for the image over the sound. In this sense, the failures of 9/11 are compound, both the breach of national security and the failure to do justice to the event in its rightful aural setting, amplifying into a sound of silence, the eerie calm of the streets of Manhattan in the following days, and of sonic cessation such that the voices of 9/11, be they of the hijackers, of bodies’ encounter with the earth, or of a President, must take on the form of artifacts to find their audience.

These three signal aural artifacts, the voices of the hijackers as preserved on air traffic control tapes and cockpit voice recorders, the booming jointure of those jumping from the upper floors of the World Trade Center towers with the pavement below, and George W. Bush’s declaration from his perch atop the World Trade Center rubble pile that “I can hear you, and the people who knocked down these buildings will be hearing from all of us soon” (AmericanRhetoricOrig 2009), stand as the foundational points from which 9/11 aurality draws its strength. Thus, the battle is joined in the soundscape via Bush’s call, amplifying the devaluation of the aural in favor of the visual into an inverse valuation of the aural, locating the incommensurable nature of the event not in the visual, at which one cannot look but from which one cannot look away, but rather in the aural, which leaves nowhere to turn. The battle joined, Osama bin Laden can only continue the dialogue opened by his foot soldiers, superseding his own spare imagery with a more abundant aurality in his cave tapes.

In brief, this analysis will examine the three aforementioned artifacts and their centrality to 9/11 narrativity, before which an introductory sketch of the nature of each is necessary to place them within the context of the theoretical framework. The hijackers’ voices, those of American Airlines Flight 11 pilot-hijacker Mohamed Atta and United Airlines Flight 93 pilot-hijacker Ziad Jarrah, exist in recorded form only accidentally, due to the likely unintentional and unknowing activation of the talkback button, which broadcast communications meant only for the cabin and passengers across air traffic control channels, a serendipitous audience that eagerly consults those voices for further understanding of the men and their motives. The jumpers’ percussive contributions, their impacts on streets, sidewalks, and surrounding buildings, exist outside of their intended frame, heard primarily in tandem with footage of the jumpers themselves, but also permeating unrelated footage of the attacks, in the background but self-foregrounding within images of evacuation and first response, standing as a martial drumbeat, the first tentative pulses of the march to war. Finally, George W. Bush’s typically bravado-sodden call (preceding a likeminded Verizon Wireless campaign, the immortal “can you hear me now?” [jwyoung5 2009]) is enshrined within news coverage of a president finding his voice, engaging in a call-and-response with those answering the summons to jihad. Embedded audio and video will permit the reader to follow along with each discussion.

This analysis is then concerned with both the production and consumption of 9/11 aurality, how these three artifacts take on their rightful status as the crowning achievement of al Qaeda’s ingenious plot, the most enduring element that will last well beyond the fading of the iconic images of impact and inferno, garnering a resonance, a re-son-ance, a generationality, as well as a reason-ance, a rationality, ringing through and true in a manner that the visual cannot hope to achieve. How the artifacts arise, how the hijackers’ voices make their way into the communicational apparatus aboard the commandeered planes, how the bodies of otherwise anonymous office workers come to a sort of visceral instrumentality, how a president heretofore lacking a signature moment beyond juridical favor seizes the day in a photo op that becomes something more, something aural, and how those artifacts reach the listener are of equal importance, production meeting reception in the act of transmission. 9/11 therefore derives its most accurate meaning, its closest approximation of the lived reality of the event, not through the overvalued visual, but instead through the undervalued aural.

Short-sighted: The Failings of 9/11 Visuality

Before moving on to an individual reevaluation of the undervalued aural in relation to the hijackers’ voices, the jumpers’ impacts, and Bush’s barbaric yawp, it is necessary to first establish the nature of 9/11 visuality in the configuration in which it has dominated discourse surrounding the event, as a means of understanding the shortcomings of that visuality and its eventual supersession by the aural. Though images various and sundry arise from 9/11 as a whole, there are three general groupings which come to dominate the visual rhetoric of the event, each providing a different facet of the accepted narrative to which visuality is a principal contributor: immediate site imagery (the impact of United Airlines Flight 175 into the South Tower of the World Trade Center; the collapse of the South Tower at 9:59 and the North Tower at 10:28; and the subsequent rush from the encroaching ash plume), first responders (FDNY, NYPD, and Port Authority officers, as well as the wave of volunteers from across the country who arrive in the aftermath) and their emphasized status as members of a national symbolic (alongside such standard images as American flags, bald eagles, and the Statue of Liberty), and tears, the collective lachrymosity of a people that gushes forth, performing a dual function in relation to national ocularity.

Taking immediate site imagery as a starting point, this focus on the site itself, only a fraction of 9/11-as-site(s) (omitting extended attention to the Pentagon and Shanksville, Pennsylvania), enacts an editing of sorts, a prioritization of the spectacular city as the fitting site of a spectacular attack, at the same time placing the injury and demise of the World Trade Center within the filmic vernacular common to such a celluloid city. Experience at the level of the visual reads like a blockbuster, the fractured megablock within which the towers arise registering as another The Towering Inferno (John Guillermin 1974), the ravenous ash plume and those in its unrelenting path as The Blob (Irvin S. Yeaworth Jr. 1958). Sight, in this instance, is flight, progressing swiftly from wide-eyed incomprehension to a stolen glance over one’s shoulder in mid-sprint and then, as the ash cloaks lower Manhattan, a filmy darkness that renders sight moot. Here then, the visual is regent, the primary initial experience of the event for most being the looped arrival of Flight 175 and the havoc it wreaks, yet repellent, at once excessive (suggesting the irreality of the filmic analogue) and unsuccessful, discarded on the trot, and worth not a jot amidst the settling ash.

Moving second to first responders, firefighters striding bravely into the breach, police officers NYPD and Port Authority guiding evacuees to safety and clearing the confused remnants from their office refuges, it is these individuals who provide a nationalizable human face to the event. In a time of trauma, those unmoored by the traumatic event look to static figures, authorities, for guidance, and though the deceased first responders can only lead the stricken back into the towers, those who survive them, the volunteers at the World Trade Center site, the assembled legions at the funerals of the following months, can tread the path of recovery, leading those stricken blind by the day into a different, tenable tomorrow. Indeed, this process extrapolates itself into a localized omnipresence in which local authorities, however geographically and culturally removed from the ultimately limited body of FDNY, NYPD, and Port Authority figures, are the recipients of cast-off graciousness, transfer objects whose proximity renders them visual foci, and enables diasporic victims outside the city limits access to these living statues, with subsequent placement of their impassive monumentality in the pantheon of likeminded symbols, including the flag, the bald eagle, and the Statue of Liberty. Here then, likewise, the visual is doubly cognizant, aware of the appropriate object of national affections, yet ultimately unaware that this visual attribution is rendered as a hollow sign, the uniform standing in for the uniformed and now unformed deceased, a uniform now decorporealized, a body no longer visualizable, but spectral, clouded in the manner of the curling ash filigrees on the gilded narrativity of the event.

Finally, all ends in tears, fonts far and wide unfurling their sadnesses, their individual and collective senses of grief and loss, into a distinctly visual outpouring, an inversion of ocular dynamics from reception and process to production, though a production which is ultimately unproductive, which performs dually, setting the visual and the aural more at odds than ever. At a basic level, the tear serves two functions: first, at the physiological level, tears are intended to clean and lubricate the eyes, to grease the wheels of visuality by washing away various irritants, a literal and figurative flooding of the visual realm; and second, at the psychological level, tears serve as a form of aqueous exorcism, a ridding of the traumatic element, flushing the system, draining away retention. In this dual sense, the tears of 9/11, seen so often in images of horrified onlookers, devastated family members, and appropriately distressed national figures, seem to be a positive phenomenon, allowing for a clarity of vision and, by implication, mind. Yet, these tears function alternately as tears, small rips in the illusion of invulnerability, and as moments of visual excess, as a superflui(di)ty, as a flooding of the visual space such that vision becomes cloudy, unreliable, overlubricated and liable to slippages, the sliding away from reality seen in the filmic referentiality of immediate site imagery and the elision of first responders and secondary ones. Here then, the visual is quiescent, seemingly still, tears providing an eventually calming influence, though the passed and past waters run deep in this instance, revealing an overall unreliability of the visual, a failure in each of the three general groupings of 9/11 imagery that may only be remedied by aurality.

A Preliminary Hearing: Opening 9/11 Aurality

In advance of the more thoroughgoing formulation of the centrality of aurality to 9/11 narrativity to follow, it is equally necessary to preview, or rather pre-listen to, what the aural has to say about the event, to its own unique contributions, to its particular capacities for locating and attending to aporia within visual accountings of 9/11. Taken briefly, there are four essential qualities that render the aural of greater value than the visual in terms of capturing the event: the involuntary nature of hearing, the infinite nature of sound, the accompanying averted eye as a differential turning, and the physical proximity of the aural, each of which will receive initial consideration below, and lengthier attention in the specific analyses to follow.

Beginning with the involuntary nature of hearing, this aspect of sound itself is one that garners little notice in the immediate aftermath of the event, though it remains a phenomenon which grows in importance over time through its auto-revocation of the listener’s agency. Unlike sight, which is a product of ocularity and subject to the whims of the eyelid in all but the brightest occurrences, one cannot close one’s ears. Fingers, earplugs, cotton, and any other interferences may be attempted, yet sound persists, resists stoppage, entering instead through skeletal translation, a feeling down to the very bones of the listener, as inescapable as one’s own visceral frame, and as essential to survival and stability. One may endeavor to block the sounds of the hijackers’ voices, the jumpers’ impacts, Bush’s aural offerings, yet no relief is available, the sound ringing still in the absence of its source.

It is this interminability that in part renders the aural such a powerful entity, that paradoxically calls the listener forth when the call itself is but an afterthought. As sound is first issued from its originating locus, the waves run strong and deep; the aural overwhelms. In time, however, waves ebb, and their perceptible impact diminishes beyond the point of registration; or so it seems. Indeed, the waves, the vibrations of sound are subject to a dimming, a thinning; gone is the clamor of the originary iteration of a sound, which overwhelms, exceeds the ability of the listener to process its subtleties. In its stead is the softer sound, which calls to the listener as it calls on that selfsame listener, requiring a level of engagement, a willingness to address the sound on its own terms, to reach into the void, into the apparent silence, producing a positive amplification in its own deamplification, sounding the listener by requiring the listener to repeat the sound, to revivify a dying tone. One may put Atta and Jarrah’s broadcasts off to the side, hoping that their suicides may subside into a silence; one may muffle the similarly muffled demises of the jumpers, muzzling their narrative contribution, hoping that sound may decay as the sounder disintegrates; one may flip off Bush’s megaphone, interrupting the signal to a single instance, never to be repeated; yet, a wave, once made, is made forever more.

The desire to look away, to negate the thrall of horrified vision, is dizzying, nauseating, yet possible. However, in the course of this differential turn, this move from the visual-as-primary to the aural-as-primary, not only is the failure of the visual in evidence, but also the transference of that unease, able to be countered in the realm of the visual by focus on a singular object, to the aural, producing a vertigo of sorts that is not so easily countered, that instead inhabits the inner ear. Where the visual act of looking away, which is itself not so much a direction of vision from one object as a willful redirection of vision to another proxy object taken as the negation of the horror of the first object, suggests a mostly successful evasion, such success is not available in the realm of the aural, a turning away only yielding a turn inward, to the inner ear itself, which is turned dizzying by the grip of vertigo. Thus, vision may be avoided, may be voided, made void, through a differential turn to the aural, though turning from the aural remains internal, turning the stomach and voiding its contents.

Such a viscerality is characteristic of the aural as well, the projected gaze of ocularity and visuality instead giving way to an imbibing, a taking into the body of the sonic wave and, through the connection created by auto-acceptance of the aural artifact, a return to the source, to the intimacy of the aural’s oral, the mouth. As vision constitutes a lesser penetration, the arrival of light data into the eye itself being processed by the optic nerve and interpreted in the brain thereafter, aurality constitutes a greater penetration, burrowing deep into one’s head before being registered, then processed initially further still into one’s brain. This penetration draws the listener back to the penetrating agent, the sound wave and, by proxy, its producer, in most instances the mouth (though proxy mouths may exist, the “mouth” standing more generally for the producing entity, be it tongue and teeth or body and pavement, the softer surface and the harder combining to produce a constellation of sounds, constituting a language), drawing the listener into this moist opening, site of orality, the taking in of the other, both in captivation and copulation. The aural is the oral is the oral, then, the listener soliciting and receiving pleasure from the act, coming at once to and from the event, vectoring away from the visual to the invective of Atta and Jarrah’s, of those plunging, finally, of Bush’s aurality.

An Unholy Trinity: Three Artifacts

Having situated this analysis within the torch-passing from visuality to aurality, one may now move on to an examination of each of the three signal aural artifacts identified above, the hijackers’ voices, the jumpers’ impacts, and Bush’s rubble rhetoric, with an introductory sketch, an understanding of the original form of the artifact, a survey of the artifact’s specific qualities, and a tracing of its broader dissemination for each revealing broader currents within 9/11 aurality. Before engaging with these artifacts, one must address perhaps the best known assemblage of 9/11 aurality more generally defined, the Sonic Memorial Project, to determine its relation to this analysis, and also situate the aural artifacts to follow within the broader field of “chatter” from which the artifacts are birthed, but in which they take no residence.

The Sonic Memorial Project is an online memory repository of sorts, a website to which those with specific aural memories of not only 9/11, but also the World Trade Center throughout its history, may leave voicemail records of their recollections, or upload sound files, with those recordings then being made available to all through both a searchable database and the Sonic Browser function, where a visitor may place their cursor over a thread to receive a semi-randomly selected recording (each thread pertaining to a broad theme). Seemingly, such an archive would be central to the parameters of this analysis, given that the Project focuses solely on the aural, with the implication that it is an important, if not the important, site of 9/11 memory. Yet, in its status as an aural history project whose scope exceeds the bounds of 9/11, the Project more aptly functions as a context and frame for the event rather than a meditation on the event itself. The vast majority of its recordings, including those more specifically about 9/11, are ephemeral, capturing sirens on the surrounding streets, busy signals on local phone networks, and the fleeing crowd on the Manhattan Bridge, with only the inclusion of police scanner and FDNY radio broadcasts approximating the artifacts to follow. Indeed, it is proximity that is the rub here, these recordings holding a generational remove that makes them less of 9/11 (like the hijackers’ voices, jumpers’ impacts, and Bush’s reply) than about 9/11. Some generationality is inevitable, as otherwise those not in air traffic control centers, on the streets surrounding the wounded towers, or near the rubble pile would not hear the artifacts in question; however, in these instances, the original recordings are subject to a lesser removal, fewer generations coming between their generation and their listeners’ reception. 
In the case of the Sonic Memorial Project, direct recordings often are not even included, with discussions of sound rather than sound itself making up the bulk of the included recordings. As such, while certainly an impressive and worthwhile undertaking, the Sonic Memorial Project lies beyond the generations of this analysis.

These generations and their attendant artifacts may be taken as a result of, though not an example of, what the intelligence community and, increasingly, popular audiences refer to as “chatter… gossip, scuttlebutt, the babble of a child” (Keefe 2005), as noted in Patrick Radden Keefe’s Chatter: Dispatches From the Secret World of Global Eavesdropping, though that chatter takes on a more sinister bent in the context of potential terrorist attacks. Chatter follows a specific pattern in these instances: “a sudden spike in chatter, a crescendo of foreign voices. Then silence. Then disaster” (Keefe 2005). This was indeed the case with 9/11, as the rising tide of ominous communications during the long summer of 2001 tapered suddenly, yielding the event, though few were any the wiser given a backlog of intercepts using vague language and regional dialects, highlighting a shortage of skilled translators able to work through recordings in a timely fashion. Yet, Keefe cautions against the reliability of chatter, calling it “a perfect word for conversations culled from the airwaves: fickle, misleading, most often inconsequential” (Keefe 2005). That pre-9/11 chatter proved so very consequential is thus the exception and not the rule, and it is the pre- prefix that is most instructive here in divining this analysis’ artifacts’ relation to chatter. Where chatter, as defined by Keefe, builds in advance of a portended event, and is then silenced as the final preparations are made, the artifacts in question occur within or, in the case of Bush’s speech, shortly after the event, making them a product of what that chatter suggests, rather than chatter itself. Where chatter is unclear, allusive, these artifacts offer stark clarity: control of the hijacked planes, death from a great height, national vengeance. 
Still, both represent missed signals of a sort: chatter, heard retroactively, reveals the seeds of the already occurred; the artifacts, truly heard only after the fact, reveal the centrality of the aural to the event.

Double Jeopardy: Atta and Jarrah’s Aurality

To start off, it is necessary to introduce our speakers, Mohamed Atta and Ziad Jarrah, and the speech that they offer as their primary aural contribution to the event. As the pilot-hijacker at the controls of American Airlines Flight 11, the first of the four hijacked planes to reach its end, Atta fulfills his role as designated leader of the plot, with his words being the first to grace the airwaves, beginning at 8:23am, with the immortal “we have some planes” statement (HistoryCommons 2008), which is followed by a number of platitudes placing the hijacking within the broader context of hostage-oriented hijackings of the past. Shortly thereafter, at 8:24am, Atta cautions that “[i]f you try to make any moves, you’ll endanger yourself and the airplane” (HistoryCommons 2008), putting the onus of danger not on himself, but on the passengers. Further, at 8:33am, Atta reclaims the microphone, cautioning the passengers “don’t try any stupid moves” (HistoryCommons 2008), suggesting perhaps a burgeoning revolt, though there is little documentary indication of such an intervention. Jarrah, as the final pilot-hijacker to seize control of his flight, is somewhat late to the party, making his first entry at 9:31am, introducing himself as “the captain” and calling for all to “keep remaining sitting” in a flawed effort at pacification (HistoryCommons 2008), one that necessitates a 9:39am transmission emphasizing the presence of a (likely fake) bomb on board and urging compliance. These five short transmissions constitute the totality of the hijackers’ voices and are all that remain of Atta and Jarrah. The pilot-hijackers of the other two flights, United Airlines Flight 175’s Marwan al-Shehhi and American Airlines Flight 77’s Hani Hanjour, are not known to have made any statements over open channels, and neither flight’s cockpit voice recorder has yielded a usable recording.

The context of each of these transmissions is essential as, given that both Atta and Jarrah’s words were meant only for the passengers aboard their respective flights and only reached beyond the cabins of those planes due to a mistaken use of the talkback button, which sent their words out over air traffic control channels as well, a broader pulpit was unintentional, indeed perhaps undesirable given its alerting of authorities, though central to the full perpetuation of the hijackers’ voices. In their original form, the hijackers’ voices-as-aural artifacts exist only in recordings made by air traffic controllers and, in the case of United 93, in the cockpit voice recorder recovered in Shanksville. Atta’s words, as captured on reel-to-reel recordings as the hijackings unfold, are the first to reveal the nature of the event, as air traffic controllers return to the tapes to discern his somewhat muffled speech; Jarrah’s indicate another hijacking in the seemingly unending string. The fact that Jarrah’s take place within a larger field of discourse, the remainder of the cockpit voice recording, which also captures discussions within the cockpit between Jarrah and a fellow hijacker, does little to qualify the importance of his two broader transmissions which, in their broader audience (the two aforementioned missives reach air traffic control, and are available online, while the broader cockpit recordings are only available in transcript form), have greater resonance. For both Atta and Jarrah, then, the mistaken use of the talkback button is something of a happy accident, a means of inserting their dictates into a larger auditory space, making those artifacts later available to a much greater potential audience.

This hijacker voice is unusual, rubbing the listener in a wrong but (informatively) pleasurable way due to a number of factors: the voice’s politeness, its constitution of the event itself, and its command (of) English. Though offering relatively traumatic news of the overtaking of the plane by his hijacking team, Atta remains courteous throughout, telling passengers to “just stay quiet and you’ll be ok” (HistoryCommons 2008), though his tone becomes slightly harsher in his third transmission, where he amends that statement to read “[n]obody move, please… don’t try to make any stupid moves” (HistoryCommons 2008), a statement that is exceedingly mannered given its context. Jarrah approaches his task similarly, first asking the passengers to “please sit down and keep remaining sitting” (HistoryCommons 2008), then later requesting that the passengers “[p]lease remain quiet” (HistoryCommons 2008). Though such politeness is not without motive, a subdued body of passengers being less of a threat to a completed hijacking while docile, the hijackers’ gentle touch remains striking in its counterintuitive kindness.

Polite as it might be, Atta’s voice also signals the commencement of the event in such a way that his invocation in fact calls the event into being. Prior to Atta’s emergence as a discussant at 8:23am, little knowledge of the hijacking is even available, the hijacking having been executed a mere nine minutes prior, and not fully having been recognized as such by those on the ground, including air traffic control. As such, the event does not register as a hijacking until Atta’s words are heard over air traffic control channels, his “we have some planes” functioning multiply. First, the “we” indicates that his forces are legion, that he is not alone in his mission; then, the “have” indicates possession, a seizure, a taking of control, though the object that is possessed is yet unidentified; finally, the “some planes” specifies that object, again expanding the field of control. While Atta’s may be the only acknowledged hijacking upon receipt of his transmission, and is indeed the only plane to have been hijacked by that point in time, the others are effectively hijacked by words alone, deed following shortly thereafter.

These voices, Atta and Jarrah’s, also possess a certain grain, a texture, which grants them an authenticity, deepening their accounts with a gravity, a veracity that reflects the serious undertaking that they represent and the associated relationship of command. Speaking in terms of timbre, Atta’s tone is curt, clipped, his voice deep and yet subtly keening, and his words are delivered with the intimate proximity of their initial reception in the plane’s cabin and the air traffic controllers’ headphones. These dualities create a tension between the words themselves, unfailingly polite, if somewhat testy in the second and third transmissions, and the deeds that lie beneath. In each case, his English is clear, crisp and, though tinged with traces of the Arabic of his native Egypt, demonstrates a dual command, not only of the tongue, but by the tongue, Atta calling forth the event and demanding (and receiving) the rapt attention of the passengers, who mount no significant revolt and offer little interference for Atta’s aims. Jarrah is similarly terse, his voice slightly higher and more pained, thin and almost shrill in its underlying sense of panic, though his succinctness is less a consequence of tone than of time, the delayed hijacking leaving him breathless in the face of cockpit alarms and the just-completed bloody takeover of the cockpit. Yet, his English is less precise, the phrase “keep remaining sitting” demonstrating a loose understanding of English grammar conventions, an imprecision that is consistent with Jarrah’s overall lack of surety (he is considered to be the weak link in the plot due to his continued contact with his wife back in Germany, as well as his suspect commitment to the cell), and an accompanying lack of command exists such that control of the cockpit is eventually reverted to other hands, potentially those of the passengers. 
For both, the roughness of their recorded sayings, a product of the mediation of their words through cockpit microphones under rather heated conditions, air traffic control channels, and then reel to reel recording apparatuses, grants them an authenticity, as they are pulled from the midst of the event itself and offered, unscrubbed, to the listener eager to learn more about the men who executed the plot and actively seeking out documentary evidence to that end.

As part of an effort to access such authenticity, to bring the truth of those unvarnished transmissions to their accounts of the event, these original recordings are integrated into both purely documentary films (of which National Geographic: Inside 9/11 [Michael Eldridge and Lance Hori, 2005] will stand as the example) and semi-fictionalized dramatic films (of which United 93 [Paul Greengrass 2006] will stand as the example). Within the field of documentary films surrounding 9/11, none is equal in scope and detail to Inside 9/11 which, in its tracing of the threads of the event back to the Afghan War of the 1980s and forward to the day of 9/11, utilizes the hijacker voice to add weight to its account. Indeed, both the voices of Mohamed Atta and Ziad Jarrah are included in the second part of the documentary, entitled “Zero Hour” and chronicling the day of 9/11 itself, and are subject to a similar treatment which reveals much about the failings of the visual in relation to the event. Upon the first appearance of the hijacker voice, when “Mohamed Atta’s voice crackles over an air traffic controller’s headphones” (Eldridge and Hori 2005), it is only the voice itself that appears, the actual recording playing on the audio track as the words are captioned to the screen (suggesting a lack of clarity in Atta’s words where none in fact exists) and projected over blurry stock footage of a cockpit. Later, when Jarrah’s voice is included, it is dealt with similarly, the actual recording comprising the audio, the words being captioned, and a blurry cockpit providing the backdrop. 
In both instances, the aural, the hijackers’ voices, are unmoored, with no successful linkage being made between their visual production, in the mouths of Atta and Jarrah, and the artifacts themselves, the recordings, manifesting both the failure of the visual to account for the true impact of the event, as well as the manner in which the voice pulls away, is not bound to its visceral originary locus, instead floating free, coming into the cockpit communication apparatus, into air traffic control, onto a length of magnetic tape and, via the documentary, into the listener’s living room and, subsequently, the listener’s ears, lodging there, living as its producing body dies. National Geographic: Inside 9/11, in placing these recordings at a central location within its narrative, emphasizes the importance of aurality, enacting the failure of the visual in its own blurred non-fixations.

Though not purely documentary in the manner of Inside 9/11, Greengrass’ United 93 attempts to achieve a similar proximity to the “truth” of the event (with methods to that end including casting actual pilots, flight attendants, and air traffic controllers, using long takes, and keeping the actor-hijackers and actor-passengers separate during filming), using recordings of the hijackers’ voices to similar ends. Here, however, only Atta’s voice appears, his “we have some planes” statement being included through the headphones of an air traffic controller, then in a tape review room as his message is being clarified. After Atta’s words are first heard over air traffic control channels, amidst other communications from planes on similar frequencies, the tape is reviewed, with a controller concluding “it’s, uh, planes… PLANES. Plural, yeah” (Greengrass 2006), demonstrating full comprehension of the performative nature of Atta’s linguistic choices. Greengrass’ commentary track, available on the DVD release of the film, casts some light on his use of Atta’s voice, when he notes that “he [the air traffic controller] knows at that point that something is wrong” (Greengrass 2006). This inclusion is telling for two reasons: first, it is the only archival audio recording included in the film and, save for a few short clips of news coverage shown in FAA and military headquarters, is the only archival material included at all, reemphasizing the importance of the hijacker voice to the event; and second, it is only Atta’s voice that is used, not Jarrah’s, though Jarrah himself is the protagonist of the film about his hijacked flight. The words that Jarrah speaks in his recorded transmissions are not even included in the film, though words previously only available on the cockpit voice recording transcript are given voice, suggesting that, while voice may be given, visualization of the recorded words may not, representing another failure of visuality.
In both Inside 9/11 and United 93, then, the hijackers’ voices make an appearance, albeit one that does not appear, that cannot appear, that disappears, the visual giving way to the aural in the realm of evidence and event narrativity.

Indecent Descent: The Impactful Jumpers

Atta, Jarrah, the perpetrators have had their say; it is time to hear from the victims, who go out more with a bang than a whimper, none more so than those who jump to their deaths from the upper floors of the World Trade Center towers. Upon the arrival of Atta and Marwan al Shehhi’s respective flights, 11 and 175, at the North and South Towers, hellfire is unleashed, leaving those still alive in the offices and spaces above with few options. With fewer choices, and less time, many elect to seek a reprieve from the inferno behind, with somewhere in the vicinity of two hundred coming down on the side of coming down not with the towers around them, but on their own terms. Minds made, the jumpers step into the void, pirouette inexpertly and, with a rude thud, are no more, in all cases careening with immense force into the pavement and surrounding buildings, producing a sickening sound as sinew, bone, blood are fragmented forth. The specter is so gruesome that many cannot look (yet find it difficult to turn away), though the turn is marked, as expected, by the aural artifact, present in both individual experience and recorded evidence captured by news programs.

The context of these impacts, the method of their capture by those on the scene, is of great importance to this discussion, given that it is primarily through these means that the impacts are able to reach more than those in the immediate vicinity of the World Trade Center site, despite voluntary limitations motivated by concerns over propriety. Unlike the hijackers’ voices, which reside almost exclusively in the realm of the aural, the jumpers’ impacts are a seeming byproduct of efforts to visually fix the event, to record the spectacle as it unfolds, up to and including those leaping from above, with the accompanying soundtrack something of an added bonus to the striking footage of the strike itself. Such footage is recorded predominantly by news cameras on the scene, and is in one instance also recorded by documentary filmmakers Jules and Gédéon Naudet for what would become their film 9/11 (Hanlon, Klug, Naudet and Naudet 2002), though some private footage taken by other witnesses is available as well. Yet, though seemingly an ancillary benefit of visual fixation, the jumpers’ impact soon takes on a central role in the act of artifactual recording, given that, for reasons of taste, decorum, or sheer repulsion, most turn away from the jumpers themselves, unable or unwilling to stomach the final moments of those trapped in the burning towers, in these last seconds exercising a measure of deference, panning away but, perhaps unintentionally, not turning off their cameras, leaving the microphones to capture what the lens no longer can. Some of this footage makes its way to news broadcasts unedited, in the raw first hours of the event, though it is soon deleted (at least from domestic coverage) in the name of propriety, thereby being outsourced to other venues. Lacking the visual referent, the aural becomes paramount (no known footage existing of an actual impact), comes to define the event, picking up the slack for the visual victimized by the averted eye.

Such impacts are characterized by a number of essential qualities common across all recordings, all of which contribute to their signification of the event: the figurative weight of their demises, the visceral wetness of their arrest by building or blacktop, and the halting beat created by their rhythmic departures. In addition to the literal weight of bodies crashing into immovable objects, there is a figurative gravity at play, the event placing the weight of the world not only on the shoulders of the falling, but also those of the failing, those unable to respond to the event with any sort of effectuality (read: intelligence agencies, the military, the Bush administration, among others). Theirs is a burden, one borne lightly in most cases, perhaps too lightly in this instance, leaving the falling to alight less gently than preferred. This alighting is accordingly heavy, thick, clotted with bodies, with viscera once neatly packed, now strewn on sidewalks and rooftops, an icky thump adding a physical reality to what is at first a beautiful act out of context, a delicate dance through the harsh blue of the morning to the dark below, from a darker hell within to a stark one without. Bodies, absent in the footage of the planes’ impacts, are made present in the jumpers, absented again briefly in the respectful turn, but presented eternally in the resonant thwack of impact. The timbre of these impacts is unmistakable: a dense, moist, heavy booming sound that seems at once immediate, at an arm’s length, and impossibly, perversely resonant, cannonading off the surrounding buildings. Theirs is a thwack akin to a skin, less of skin on surface than skin-as-surface, drum-tightened, pounding out a crude beat, irregular, then more steady as the falling proceed apace.
It is a heartbeat, erratic, tachycardia, a sign of distress though, in the same moment, martial, beating one set of lasts (that of the jumpers) and implying another (those to fall beneath the cannon to follow), laying a rigid, unrelenting pattern, knees jerking in time, the unspoken but heard of the jumpers opening to a wider hearing in kind.

In much the same manner as the hijackers’ voices are included in documentary and semi-fictionalized renderings of the event to achieve a certain veracity, the jumpers’ impacts are similarly implemented in the documentary realm, though their most unvarnished availability is located in less formalized channels, with the chief examples of each being the absence-as-presence of the leaping in the aforementioned 9/11 and National Geographic: Inside 9/11, and the presence-in-absence of the considerable amount of YouTube footage of the jumpers gleaned from news and private footage. Within the documentary context, coverage of the jumpers is included primarily in the name of completeness, and also as a notation of severity; no narrative of 9/11 would be complete without at least a token mention of one of its most gruesome phenomena and, in that valuation, the severity of US victimhood is also emphasized. 9/11, in fact, does not even include footage of the jumpers in its domestic version (such footage being deemed too painful for a film released a mere year after the event), attention to the jumpers only remaining in the international release. Yet, the jumpers are present in their willful omission, evincing a narrative chasm that cannot help but be filled by the viewer, whose own recollections of the tumbling bodies and sickening impacts stand in stead of the trimmed reels. Inside 9/11 offers more substantial jumper content, capturing an amazing juxtaposition between a Muzak version of Billy Joel’s “She’s Always a Woman” playing in the WTC plaza with the resounding impacts of the jumpers, described as “the most God awful sound you can imagine” (Eldridge and Hori 2005), with one firefighter wincing in realization that “someone else just died, someone else just died, someone else just died” as the impacts grow in number (Eldridge and Hori 2005). 
Still, other than a few brief clips of the jumpers themselves, mainly shown in situ, framed in shattered office windows rather than en route, there is little visual portrayal offered, the auralized jumper taking precedence.

Contrary to this absence-as-presence, where jumpers are evident by their very inevidence in the realm of documentary film, raw news footage finds another, more provisional, yet relatively stable, home on YouTube, the only location where substantial amounts of jumper imagery may be located, though even there it is subject to a willful attempt at absenting, yielding a presence-as-absence. Where such imagery is generally deemed unfit for broader human consumption, particularly in the US, YouTube, as a user-driven platform for video uploads, allows the public desire for jumper information as a means of attending to narrative lack a venue from which it may hold forth, and many short pieces of jumper footage, either taken directly from news coverage or edited together from multiple sources, are available. This footage is, for the most part, silent, or overlaid with unrelated musical accompaniment, at once rendering the jumpers present, considering the visual newly palatable, while still refusing to touch the aural, suggesting its more profound capture of the horror of the jumpers themselves.

In many cases, this footage is subject to a doubled absence, with not only its sound being stripped, but also, in some cases, its potential audience being in a sense chastised by the necessity of signing into a YouTube account to view such graphic footage. These absences-as-presence, these presences-in-absence, these films and uploaded videos serve only to emphasize the inability and unwillingness of the visual to account for the jumpers, as well as the ability and, upon realization of that ability, attempted neutralization of the aural in its efforts to chronicle the jumpers. Documentary film must include the jumpers to approach truth, though in trading a lesser trauma (visuality) for a greater one (aurality), that truth is a grim one; YouTube uploads, as the work of more aware users, tip the balance oppositely, though the power of the aural remains.

King of the Mountain: Bush’s Bullhorn

To set the scene (or rather the tone): it is September 14, the US is still reeling from 9/11-as-event and its implications for national vulnerability, and it is up to the President to assert leadership and offer guidance in a trying time. George W. Bush travels to New York City, visiting the World Trade Center site, talking with first responders and volunteers, and giving encouragement and conviction that the event will not go unanswered. Perched upon a pile of rubble, with his arm occasionally around an FDNY firefighter beside him, bullhorn to his mouth, Bush delivers a clutch of memorable lines, some more banal, such as the assertion that “America today is on bended knee in prayer for the people whose lives were lost here, for the workers who work here, for the families who mourn” (AmericanRhetoricOrig 2009) (see: Video1), less his desired message than a necessary gesture of decorum, others more inflammatory, such as the legendary response to the complaints that he is inaudible: “I can hear you! I hear you, the rest of the world hears you, and the people who knocked these buildings down will hear all of us soon” (AmericanRhetoricOrig 2009). Bush’s brief comments, lasting just a few minutes, are peppered throughout by calls from the audience, most concerned with the limited amplification of his voice by the bullhorn, “we can’t hear you!” being a frequent shout (AmericanRhetoricOrig 2009) (to which Bush answers “It can’t go any louder” [AmericanRhetoricOrig 2009], underlining his own relative ineffectuality in the wake of the event, available tools being insufficient and necessitating extra-legal additions to the national arsenal), but others intervening in the overall discourse of his speech, including calls to action (“Go get ’em George!” [AmericanRhetoricOrig 2009]) and reassertions of patriotism in the face of external threat (“God Bless America!” [AmericanRhetoricOrig 2009]).
Though brief, Bush’s comments speak volumes, their context and content situating them within discourses of 9/11 and, through placement within news coverage, within the field of the listener.

As per the typical blanket coverage of Presidential activities, Bush’s words are immediately situated within the context of news programming, though their placement within that context does not come without an accompanying decontextualization, a context-as-contextlessness that isolates his already isolating language. Though his comments are relatively brief, even for what is essentially a photo opportunity, they must be further reduced to fit within the limitations of most news programming, necessitating a trimming of Bush’s rhetoric down to the level of a sound bite, a terseness that in some sense mimics that of Atta and Jarrah, and which is similarly situated within an overall environment of pleasantries. From the entirety of his speech, the line that emerges as the sound bite is not Bush’s acknowledgement of the sacrifices of the lost and those who search for them in the rubble, it is not any of his other comments which border on a sanctification of the event through religious invocations of the blesser of America, God; it is, instead, his bravado-laden statement that “the people who knocked these buildings down will hear all of us soon” (AmericanRhetoricOrig 2009), a phrasing meant less to calm than to incense, to channel the bloodlust into a new vein of (implied) militarism. This statement, once excised from the overall context of Bush’s comments, is used often throughout news coverage in the days after the event and contributes to the overall call for war, meeting Atta and Jarrah’s voices with one of its own, regularizing the halting martial beat of the jumpers into a strident path forward, less the path to 9/11 than the path from it.

Brevity may indeed be the soul of wit, and though not often noted for his sparkling wordplay (at least in intentional forms), Bush’s words at the World Trade Center site are qualitatively dense, containing numerous implications for not only himself, but for 9/11 discourse more generally, with four main qualities being in clearest evidence: the call and response nature of the exchange with the audience of workers (an audience broadened by placement within news coverage), Bush’s ability to find a voice via the found voice offered by Atta and Jarrah, the setting atop unstable rubble as symbolic for the Bush administration itself, and failing amplification as a symbol of national vulnerability. The speech functions as a call and response dialogue on three levels: first, in the simple give and take between Bush and the audience of workers, with calls for more volume and defiant patriotism; second, as a give and take between Bush and the broader American audience, where he offers assurance, solace, and vengeance; and finally, as a response to the call issued by Atta and Jarrah, a rejoinder to the “we have some planes” that notes that, in its military might, the US has a few planes at its disposal as well (though Atta and Jarrah possibly see their own transmissions as a response to US aggressions before 9/11, indicating the cyclical nature of the call and response form). It is this hijacker voice that allows Bush to find his own, to craft his own Presidential carriage from the seizure of discourse, though that voice (the hijackers’) must first be located, recognized, and heeded before it may be less rejoined than joined. 
His freighted positionality is reflected in both the setting of the speech and its technological apparatus, the unsteady, still settling rubble of the World Trade Center reflecting the tenuous nature of his administration at that point and the equally suspect war footing that follows the event, especially in the case of the Iraq conflict, the very need for and ineffectuality of the amplification offered by the megaphone implying a certain weakness. The timbre of his communique, thin, reedy, forced in an attempt at reaching the heights of volume demanded by his audience, also reflects a distance, as if Bush is again speaking from an undisclosed location, from a rhetorical positionality somewhat distant from that of his audience more broadly drawn, though the first responders around him seem of one mind.

As noted above, save for the workers and news professionals assembled at the World Trade Center site on September 14, Bush’s words reach the listener via news coverage, though that coverage is itself incomplete, sound bitten, yet the sound bites back, resonating multiply, qualitatively, beyond the immediacy of the curtailed quote. Such a pre-packaged passage seems readymade for enshrinement within the lore of 9/11, for use and reuse in documentary and semi-fictionalized contexts; yet, for the most part, Bush’s words do not find a refuge in those locations, being superseded instead by his address to a joint session of Congress on 20 September 2001 (source of the equally noteworthy “you’re either with us, or you are with the terrorists” bite [boredjoewo 2007]) and, on the rare occasions that they do appear, serving more as a means of diminishing the effectuality of his and his administration’s immediate response rather than inflating it. Indeed, though perhaps representative of the stridency of Bush’s stance in the early days after the event, as per the particulars of the news cycle, the speech is soon vacated for more pressing and impressive information, with its previous existence as sound bite making Bush seem curt, less a warrior than a wordsmith, making another abrupt and unimpressive appearance much like that at Barksdale Air Force Base on 9/11, where the look of a hounded animal undermines any content of the words on offer. For the canny, observant, aware listener, Bush’s 14 September 2001 speech is available through news coverage, and the thirst for information in those early days makes the potential audience substantial; however, after the fact, though locating the language becomes more difficult, Bush’s call, like Atta and Jarrah’s, is collected, maintained, by the responding listener.

Calling an Audible: Conceptualizing 9/11 Aurality

With three examples in hand, all that remains is to formalize the centrality of the aural, to give it a final push into the limelight or, more appropriately, in stereo, and four different tactics will be used to that end: a revisitation of the notions of the failure of imagination, re-son-ance, reason-ance, and bin Laden’s retorts/rhetoric. As demonstrated in the above examples, what we have here is at once a failure of imagination, an inability to image 9/11, to pull it back into the realm of the (overvalued) visual from its locus in the (undervalued) aural, as well as an accompanying failure to listen, to seek where one might find, to give the event the hearing for which it calls and of which it is wholly deserving. In the case of the hijackers’ voices, both National Geographic: Inside 9/11 and United 93 are unsuccessful in their efforts to reattach the voice to its producing body and that body’s accompanying visuality, directed efforts proving directionless on that account, with the visuality of the voice being blurry at best, appropriately absent at worst. In the case of the jumpers’ impacts, a similar lack of success is the rule, as both 9/11 and the YouTube videos lack precision in their approach to the aural, the former omitting imagery entirely and creating a yawning silence filled by the listener, the latter supersaturating the visual field while creating a similar silence through literal omission of the aural, spurring a likeminded filling effort. Finally, in the case of George W. Bush’s speech, both the visual and aural prove suspect (though the latter less so), Bush’s tentative steps atop the rubble undermining any Presidential bearing that might be brought to bear on the moment, and his words being subject to an unforgiving edit at the hands of news coverage, being reduced to a sound bite, an aural appetizer upon which the listener may snack, but from which the listener will not achieve satiety. 
These failures three function as representatives of the larger failure of the visual and success of the aural, as after the turn, it is and always will be the turn of aurality.

While mentioned in passing in each of the examples and their accordant discussions of the move from the aural artifact in the original to the artifact as disseminated to the listener, there is a generationality present in each, where the resonance of the aural gives way to a re-son-ance, a primarily (though not exclusively) patrilineal passing of the aural not only after the event, but also before, pointing to the enduring nature of the aural artifacts. Before Mohamed Atta and Ziad Jarrah, authors of the hijacker voice in their respective transmissions, there is Ramzi Yousef, plotter of the 1993 World Trade Center bombing. From there, the voice is passed down to the listener upon Atta and Jarrah’s deaths, a last will and testament of sorts that testifies to the unstable nature of the post-9/11 world. Before the World Trade Center jumpers, too, there are the Triangle Shirtwaist factory fire jumpers ninety years prior who, though leaping from a lesser altitude, and composed, in the majority, of younger women (rather than the predominantly male population leaping from the World Trade Center, a function of the gendering of certain financial activities practiced in the offices there), are similarly moved to flee flames. From there, the impacts are passed on to the listener as the impacting themselves pass on. Before George W. Bush, of course, there is George H.W. Bush, who similarly speaks of lines in the sand and aggressions which must remain seated, leaving certain Iraqi threads loosened for his descendant, and from there, the word passes to the next in the line of succession, the next conservative leader, Bush’s unnamed Republican successor (in 2012, perhaps?). Though the aural may pass through generations, and may suffer a slight degradation in quality as a function of that repetition, enough clarity remains to keep the listener rapt.

Such attention is attained, and is subsequently maintained, by the internal logic of 9/11 aurality, the way in which it follows from the conditions of its origin, as is true of each of the examples, where the capacity for understanding gives each a leg on which to stand. Though some may deem it so, it is not as if the voices of Atta and Jarrah arise out of some primordial ooze; these voices have a specific origin, one related to the call-and-response ethic demonstrated by George W. Bush’s speech. Due to a number of factors, chief among them being the presence of US troops near the holy sites of Mecca and Medina, US support for Israel, and US desertion of Afghan rebels in the wake of the Afghan war of the 1980s, the fundamentalist Muslims of al Qaeda feel themselves to be under attack and, as a result of this sentiment, the US faces blowback, the repercussions of these instigating actions, in the form of 9/11 more generally and Atta and Jarrah’s dispatches more specifically. Similarly, it is not as if the jumpers merely grow bored with their plight, or take a wrong turn at Albuquerque; instead, theirs is a considered decision, the result of no doubt agonizing machinations, less an act of suicide, a willful self-murder, than a reaction to or logical consequence of homicide, Atta and al Shehhi’s suicide acts yielding delayed homicides, not complete until the jumpers arrive at their bottom. Bush’s words are not without reason either, or at least their context is not, given that his scarce public appearances on September 11 and the days thereafter call for a show of presence, an effort to keep up appearances, despite his own lackluster appearance and the more central aurality of the speech itself. Each example arises not from the ether, from static, but from logic, a kinesis of past events and causality from which only a heightened aurality, produced at altitude (be it plane, tower, or rubble pile), may emerge.

Aurality being designated as the lingua franca, or rather lingua Americana, of the event, bin Laden, its assumed shepherd and subsequent target, may only follow suit, seizing on that aurality to expand his own profile as his literal profile becomes increasingly absent. War, the (seemingly) inevitable consequence of the event, is, in many cases, an increasingly aural affair, missiles being launched from afar by invisible foes, soldiers concealed beneath layers of armored abstraction, noticeable only upon explosive arrival, an extrapolated end of the aural beginning begun by the event. As a function of that conflict, bin Laden takes flight, though on foot rather than wing, becoming necessarily less visible, hiding from sight, leaving its field and becoming, for all intents and purposes, invisible, immune to the spying eyes of drones and informants. His visage, once a trademark not only of al Qaeda, but also of the nebulous “terror” that is now the designated enemy, drops from view, rendering bin Laden a specter, fading further every day until his appearance is nearly forgotten, until potential body doubles may be taken for the original. Upon disappearance, bin Laden’s communiqués change form as well, less the simple framed headshots of his previous addresses than grainy recordings smuggled from distant hideouts, voiceprints to be checked against past speeches for signs of mortality, the products of far off caves, themselves resonant spaces, echoing his words as his words echo those of Atta, Jarrah, and others before, capacious locales that contain the man as he himself evades containment. The visual holds no truck here, in the rugged land of four-wheel drive, and in the event itself, rough-hewn, uneven, unstable; views are obscured, blurred, blinded, and because of, or perhaps in spite of or independent of, these failings, visuality gives way to aurality. 
Recent revelations concerning bin Laden’s true circumstances in the suburban Pakistani compound where he was assassinated notwithstanding, with his relative comfort doing some damage to his self-made and other-made mythos, his spectrality persists, as he remains an unknown and unknowable figure, perhaps less the grand enigma than previously thought, but no less mysterious for it. If some had listened, 9/11 might never have come to be; all that is left is to listen after the fact, to the fact.


Isaac Vayo is an Instructor in Arts and Humanities at Defiance College. His research areas include sound studies and 9/11 (including attention to the use of voice recordings of the hijackers in popular media) and popular music and public memory (in relation to the Holocaust and postwar Germany, as well as 9/11 and the post-event U.S.). He is currently co-editing a collection (with Todd Comer) tentatively entitled Terror and (Post)Cinematic Sublime, as well as a linked special journal issue.


