BY CADDIE ALFORD AND DAMIEN SMITH PFISTER
Right up there with the most popular YouTube videos, from viral classics like “Charlie Bit My Finger” to new hits like Luis Fonsi’s catchy “Despacito,” are kids’ videos. But these videos are not what you might think—we’re not talking clips from Disney movies or KIDZ BOP songs. Rather, the kind of kids’ videos on YouTube that are raking in millions and sometimes billions of views have titles like “Huge Eggs Surprise Toys Challenge” and “Wrong Heads Skeleton Spiderman Tom Finger Family Colors Learn.” In some ways, they are what you might expect from kids’ videos: noisy, brightly colored, and chock-full of references to the Marvel Universe and Pixar characters. But in many other ways they are unsettling train wrecks that can only be described as too many cooks clamoring for attention.
Consider the video (now deleted) “Learn Color Bicycle & Finger Family” by YouTube channel “Kids Spiderman TV.” The video’s soundtrack is a lilting earworm intermixed with disembodied, giggly kid noises. The landscape is confused: an animated mountainside lurks over gigantic bowling balls and pins, which in turn lurk over low-budget renditions of Spiderman and Captain America. A ghostly counting begins, directing us to Spiderman playing a bongo drum and Superman performing a glitchy, out-of-character dance. There is no rhyme or reason, just off-putting idiosyncrasies that increasingly transition from wholesome fun to violence writ large. Elsa from Frozen, for instance, is also in the beginning lineup, but the whole time she is inexplicably doubled over, as if in pain. A minute in, a flamboyant Spiderman runs into Captain America, knocking him down. Later, Elsa falls repeatedly from her bike. Each time, she lies crumpled on her side for a full second before getting up. To the adult eye, this video is disturbing, but it’s bright and fast-paced. Kids love it.
Parents probably do too—there is a patina of educational value promised in the title that is also cued up in the opening seconds of these videos. In a video with the search engine optimized name of “LEARN COLOR BICYCLE & Finger Family — SUPERHEROES – Animation 3D Cartoon Fun for Kids”, superhero bike racers are counted out. Parents can confidently start the video, see that the number work suggests some educational integrity, and then have over ten minutes without intensive childcare. If they glanced over at the video later, they might see RED or BLUE on the screen, but they would likely miss the moment when the cartoon character Shrek mutters, “The gods wish more of me.”
We have a neologism for this genre: “Elsagate,” which refers to the genre’s reliance on a gold mine of unlicensed characters and images twitching along to catchy beats and nursery rhymes with surreal, fast-moving scenes. Perhaps at first glance the numbers and colors suggest something educational, but, like the Elsagate videos mentioned above, violent and eerie scenarios creep in, undisturbed by the cradle of digital capitalism. And sometimes, as is the case with (now terminated) Elsagate channel Lyra Channel TV, the violence doesn’t even hide: the videos’ very thumbnails depict a knife and the threat of punishment (see figure 2).
The person who brought critical attention to the multitude of “decidedly off” YouTube kids’ videos is James Bridle, whose 2017 Medium piece, “Something is Wrong on the Internet,” went viral. Bridle claims that Elsagate videos are being used “to systematically frighten, traumatize, and abuse children, automatically and at scale.” Such widely-viewed YouTube content for kids is more often than not creepy. Popular tropes like “finger family” and “bad baby” that started off completely innocent have become warped as video producers realize this format is an economical way to garner views and ad clicks. A stew of low-budget design studios, stock animation figures, algorithms, and key terms get jumbled around, churning out endless iterations, the byproducts of which are strange at best and violent at worst. Bridle calls this evolution in children’s entertainment a kind of “infrastructural violence” that impacts all of us on levels that are almost impossible to measure or even detect.
Is “infrastructural violence” too strong a term? We don’t think so. YouTube’s platform has long cared more about monetization than users’ mental and emotional well-being. David Farrier, director of HBO docu-thriller Tickled, wrote a piece in 2016, a year before Bridle’s essay, about predators on YouTube creating exploitative challenges to elicit video responses from kids. The kids doing these challenges, like sitting on Legos or pretending they are shrinking, are effectively “making fetish content for adults. They just don’t know it” (Farrier). And it’s not just kids’ entertainment: MSNBC journalist Chris Hayes recently tweeted about how searching for “The Federal Reserve” on YouTube algorithmically directs one to videos about the Illuminati’s role in the world economy. We live in an age when many of us are anxious about the role institutions play, or don’t play, in maintaining some kind of infrastructural order—especially the infrastructures designed and maintained by technology giants with no democratic accountability. In April 2018, we watched Mark Zuckerberg waffle before Congress on how Facebook doesn’t necessarily need to restrict Russian content. Later in the summer, Zuckerberg went on record saying that Facebook allows Holocaust deniers to maintain pages because they weren’t intentionally getting it wrong. Jack Dorsey similarly expressed the virtues of platform neutrality when Twitter defended not shutting down the account of right-wing conspiracist Alex Jones. These technology platforms aim to be “Digital Switzerlands”—an analogy that, given Swiss collusion with Nazis in World War II, is a bit on the nose.
YouTube responded to Bridle’s piece going viral by shutting down channels and videos that he brought attention to. In the wake of Bridle’s 2018 TED Talk, some channels are cleaning up shop; the aforementioned “Learn Color Bicycle & Finger Family,” for instance, has since been deleted. Hundreds, however, will fill their place, and the lack of adequate content moderators makes the task of takedown Sisyphean. Platforms certainly need to assume more responsibility for the infrastructure they are shaping; they need to enforce stricter policies, pull ads, and be smarter about the algorithms they use.

Infrastructure in this more general sense comprises the durable connectors and underpinnings that support activity and structure exchange: the hardware and software, the routers and deep-sea cables and algorithms and platforms. Paradoxically, these infrastructures are as unassuming as they are determinative of what circulates and what meanings are coaxed out of those circulations. Infrastructures are the substances and habitats that nurture through systematizing. Slipping in and out of focus, infrastructure is present without consistent presence. This is what makes the current political moment so fraught: we are suddenly aware of how central infrastructure is, especially in its digital permutation, but infrastructure is typically not ostentatious. We take infrastructures for granted: we travel a bridge to get to a shopping mall; we watch Hulu online and don’t think about the router. We usually only notice infrastructure when it breaks, as it did during the 2016 presidential election in the United States. As a kind of cooperative interface for beings to act upon and with each other, infrastructure sustains and directs what John Durham Peters eloquently refers to as “the ghostly cumulus of bodies at work” (p. 35). There is another kind of infrastructure at play here, though, and it’s one that doesn’t just enable platform experiences but actually provides the grounds for ethical behavior.
The same transactional and subtle interactions between beings and physical infrastructure occur through what we are terming “rhetorical infrastructure.” Just as physical structures organize and inflect basic societal routines, so, too, do our rhetorical foundations: beliefs, opinions, and expectations—what in ancient Greek would be termed doxa. These are the resources we “think with rather than about” (Anderson, p. 8). We get to ignore such foundational resources because they are more the navigational cues than the destination itself. For instance, a deeply-held belief in education’s power might inspire conversation or, on the flipside, a meeting might get out of hand when someone’s value system (e.g., fiscal conservatism, vegetarianism, feminism, etc.) gets mocked by a colleague. These doxa, these opinions and beliefs, make possible rhetorical behaviors and actions—they offer a shared wellspring that coordinates complex social action. When we impulsively reach out to comfort someone, we are traveling an opinion both within and without us; when a crowd forms to celebrate a wedding, the guests intuitively react to the occasion by standing as witnesses. Doxa encompass a rhetorical infrastructure that is there for us, the foundation of every ethical decision we make.
What we are grappling with is some doxic rattling: as a culture, especially because of the internet, “we’re going through a slightly eighth grade moment.” Digital culture has produced new sources of doxa that compete with earlier sources and, more importantly, new methods of producing doxa through the algorithm-driven processes that we have sketched out in the context of creepy kids’ videos. The internet has become a training ground for doxic infrastructure: scan this, tap that, view, favorite, retweet—essentially, check the infrastructure. That opinion is stable enough to push there, but a belief in democracy is crumbling too much to like that post or read this article. Our opinions and beliefs mature in digital contexts, increasingly exercised and challenged through our everyday habits of encountering everyday texts. YouTube is no different: each clip scaffolds—slowly, irreparably, imperceptibly—our precious rhetorical infrastructure. View counts represent beings upon beings, and those views are not, indeed, passive intake of media, but active bursts of beliefs and opinions taking shape, in tandem, for future use.
This isn’t a new phenomenon. Throughout our lives, doxic infrastructures are gauged, strengthened, recalibrated, or demolished. Baptisms or retirement parties, for instance, call on a mutual, yet individual, assessment of the load-bearing beams of our rhetorical infrastructure. We can disagree with the values and opinions that are disclosed during funerals or lectures, but most of the time, the rules of these events do not need to be explicitly detailed because they are fairly routine, and culture prepares us for participating in them. Assembling the right and appropriate rhetorical offering to upkeep our doxic infrastructure rarely leads to anything that feels twisted.
In digital media ecologies, though, the scale of the endeavor is disproportionate to the doxa that are constantly forming, and responsible institutions are nowhere to be found. Information abundance runs wild, and the rules feel like they have been changed and are difficult to figure out. We can develop ways to moderate YouTube kids’ videos: “THESE VIDEOS WILL NOT PASS!” (our standards for acceptable tests of the doxic infrastructure).
Or, alternatively, we can continue to stare askance at videos of humans acting out creepy scenes written by algorithms. Neither of these options is particularly satisfying, in part because they elide the centrality of education in the formation and critique of doxa. We need to be able to access doxa that will make our priorities and personalities legible to digital spheres, but we need the proper conditions for doxa to support us. Unfortunately, without the perspective, attunement, and defenses that most adults have, kids experience Elsagate videos as opportunities to hone doxic infrastructure (although, given the “fake news” phenomenon, we note that this isn’t a problem for children alone). A child can go on YouTube, search for their favorite trope, and any videos linked to that trope will rise to the top. Click on one, and a playlist forms. The playlist cycles through without any prompting. Without institutionalized quality control, toxic doxic infrastructure forms here, at this juncture of innocence, receptivity… and violence.
This is a significant way to approach the issue of “infrastructural violence” because it is our very opinions and beliefs and values that orient us as ethical beings in the world. During encounters with others, where ethical decisions must be enacted, doxa form paths under us. But what if we never had the chance to try out our opinions, to have them turned over with a non-profit, non-algorithmically-driven other? What if instead of getting to enrich our doxa with nuance while protecting them from power-mongers, demagoguery, and out-of-control grotesque entanglements, our childhood’s doxa were fused with the likes of alt-right trolls and desperate algorithms? Where, then, does that leave our collaborative pursuit of an ethic of the good?
Bridle ends his piece wringing his hands: all of us see these videos, and yet “we’re struggling to find a way to even talk about” why they are upsetting—how we might “describe its mechanisms and its actions and its effects.” We will continue to struggle, especially as virtual reality environments become commercially viable. The immersive dimensions of virtual reality are hinted at by the colorful, wacky environments in bizarro kids’ videos—and so are the ethical quandaries. In order to pursue the ethical entailments of new technologies, we need a sustained conversation about how doxa are formed and how they should be formed. We may need to seek more radical solutions—in the etymological sense of “at the root”—than the “algorithmic solutionism” that prescribes better, fitter, and more productive algorithms to address current problems. Perhaps foregrounding doxa as a key element of our rhetorical infrastructure will provide a new angle of vision to see contemporary problems of ethics related to digital media.
Anderson, Dana. Identity’s Strategy: Rhetorical Selves in Conversion. University of South Carolina Press, 2007.
Peters, John Durham. The Marvelous Clouds: Toward a Philosophy of Elemental Media. University of Chicago Press, 2015.
- Caddie Alford is Assistant Professor of Rhetoric and Writing in the English Department at Virginia Commonwealth University. Her work has appeared in enculturation and Rhetoric Review. She recently published a book review of Kate Manne’s Down Girl: The Logic of Misogyny for Rhetoric Society Quarterly and is currently serving as the book review editor at enculturation. Her current book project, Everyday Chatter: The Matter of Opinion in the Age of Social Media, recuperates the ancient Greek concept of doxa for contemporary rhetorical theory, especially in the context of social media.
- Damien Smith Pfister is an Associate Professor of Communication at the University of Maryland, studying the confluence of rhetorical theory, networked media, digital technology, public deliberation, and visual culture. He is the author of Networked Media, Networked Rhetorics: Attention and Deliberation in the Early Blogosphere (Penn State, 2014) and a co-editor, with Michele Kennerly, of Ancient Rhetorics + Digital Networks (Alabama, 2018).