If we can ask our phones for the weather in Albuquerque and compel a plastic cylinder in our living rooms to read the Washington Post out loud, why are we still transcribing interviews by hand?

Well, it turns out we don’t really have to. Automatic transcription tools have been on the market for a while now, and they’re finally getting good. It now takes just a few minutes, and a few dollars, to upload audio or video to a site and receive a fairly comprehensive transcript.

But, like all tools, some are better than others. We tested (or tried to test - more on that later) eight of the most popular transcription tools aimed at journalists, including Dragon Dictation, Happy Scribe, oTranscribe, Recordly, Rev, Sonix, Trint and YouTube. We ran each tool through a variety of real-world scenarios, experimenting with how each one fared against a journalist’s typical use. Though none of the tools were perfect, one edged out the others as the best in the category.

A combination of accuracy, features and ease of use makes Trint the best choice for automatic transcription for journalists. Though it wasn’t the most accurate, most feature-rich or the cheapest tool we tried, its transcript editing tools and ability to fit a little more seamlessly into a journalist’s workflow help it to edge out its competitors. Read on to see why.

As you’ll see, the accuracy rates of these tools are low. That’s because we tried our darndest to confuse them.

First, to reflect a wide range of people, voices and accents, we recorded our sample audio with four participants:

Alexios Mantzarlis, Poynter faculty and director of the International Fact-Checking Network, who hails from Rome and described himself as having a lisp and “some funny words that mix British, Italian and odd American accents”.
Dulce Ramos, program manager for the International Fact-Checking Network, who came to Poynter from Mexico City in September.
Kristen Hare, a reporter at Poynter, who thinks she sounds “slightly valley girl-ish” when she listens to herself on recordings.
Me, and though Kristen said I have a “Buffalo accent,” I think my inclination to mumble, talk too fast and skip parts of words probably proves more challenging for transcriptions (recording yourself in anticipation of being transcribed clearly leads to a little self-reflection).

Kristen joined us via Google Hangouts/YouTube Live (disclosure: a grant from Google News Lab partially funds my position), which most automatic transcription tools openly warn against. Audio from a phone or video chat seems to be universally difficult for them to handle. To torture the algorithms even more, we also read passages at a much faster pace than we usually speak, Dulce and Alexios spoke a variety of foreign languages (Italian, Spanish, French and Greek), we uttered as many proper nouns as possible (Apalachicola, Michael Oreskes and various Greek islands, to name a few), got creative with Urban Dictionary (a portmanteau of Paul Manafort and a crude word describing the state of his legal situation) and talked over each other with some frequency.

We recorded our 14-minute test in Poynter’s webinar studio and were interrupted by the sound of at least one loud plane overhead (there’s an airport a few blocks away), an emergency vehicle and the clamoring of Kristen’s phone. We captured the audio three ways:

With a Zoom H4nPro handheld microphone, placed between us.
With my iPhone 6S Plus, using the Recordly app to record, placed next to the Zoom.
With a private YouTube Live, which is how Kristen joined us.

We then uploaded the audio to each tool and kept track of how long each one took to transcribe. We normalized the resulting transcripts using Microsoft Word, removing timestamps and making sure speaker names were congruent. As a control, I transcribed the audio myself (using oTranscribe) and then listened through several times to check for total accuracy.