Technology has impacted many aspects of journalism, but one thing that’s still stuck in the 1970s is how journalists get the content out of a recording.
For decades, any journalist who’s recorded an interview has had to hit play, stop, and then type, and then go back and hit play and stop and then type away. Journalists can’t get a lot done until they have found that quote or sound bite that’s just right and captures the essence of a story.
Jeff Kofman, an Emmy award-winning journalist and foreign correspondent, is solving this problem.
In late 2014, Jeff left his illustrious career as a broadcast journalist to begin working full-time on developing a high-tech solution to transcription. Today, Jeff is CEO and co-founder of Trint, a start-up which offers an audio and video-to-text transcription service that condenses a journalist’s workflow significantly.
Trint glues audio & video seamlessly to an automated transcript.
Launched in September 2016, Trint can also be used by anyone who wants an audio and video-to-text transcription service, such as academics, lawyers, students, podcast producers, and more.
Trint’s key innovation is its machine learning-powered editor, which stitches the text to the audio, making it incredibly simple to search, verify, and, if necessary, correct. It’s all untouched by human hands. And it’s a truly disruptive technology that’s already been adopted by NPR, Thomson Reuters, ESPN, and other organizations, including universities and publishers.
It transcribes in 13 languages, including English, and in three English accents: North American, Australian, and British.
Based in London, Jeff took time out of his incredibly busy schedule to have a video session with Sheelagh Caygill. The session was then Trinted for publication here.
Next week, we’ll talk to Jeff about his transition from journalist to entrepreneur.
Audio and video transformed seamlessly to an automated transcript
Jeff, many readers outside the world of journalism (or transcription) may not understand the significance of Trint and the breakthrough it heralds. Can you put this into perspective for readers and explain how Trint resolves the challenge?
We live in a digital age and that means that most of the communication we exchange is in the form of recorded talk, whether it’s audio or video. At least 80 percent of the Internet’s content is recorded talk in some form. But the problem is that it’s not searchable.
So let’s take an example. Suppose you listen online to an hour-long speech about climate change in the Arctic. At some point the speaker talks about how polar bears are losing their habitat. For someone interested in only the polar bear points, there is no way to find the spoken reference to polar bears. Instead, the listener has to sit through the whole hour, or someone has to transcribe it.
That’s what is called dark data. Dark data means that there is a recording out there with content on it that nobody can access. It’s like it is off limits.
What Trint does is it shines a light on dark data. You can take that hour-long recording and Trint it, and suddenly it is searchable on Google.
I can then put the words “polar bear” in the search bar and I can find it. I can see three references, I can browse through them, and find the one I want. And because the text and the audio and the video are all glued together, I can hit play and listen to the speaker and I can hear how she or he has said it and instantly I’ve accessed it.
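The idea Jeff describes, text glued to the audio so that a search hit takes you straight to the moment it was spoken, can be sketched in a few lines of Python. This is only an illustration and not Trint’s actual implementation: the `TimedWord` type, the `find_phrase` function, and the sample timings are all hypothetical, and the sketch assumes each transcribed word already carries the time at which it was spoken.

```python
from dataclasses import dataclass


@dataclass
class TimedWord:
    """One transcribed word plus the moment it is spoken (hypothetical type)."""
    text: str
    start: float  # seconds from the start of the recording


def find_phrase(words, phrase):
    """Return the start time of every place the phrase is spoken."""
    target = [w.lower() for w in phrase.split()]
    if not target:
        return []
    # Ignore case and surrounding punctuation when comparing words.
    normalized = [w.text.lower().strip(".,!?;:\"'") for w in words]
    hits = []
    for i in range(len(normalized) - len(target) + 1):
        if normalized[i:i + len(target)] == target:
            hits.append(words[i].start)
    return hits


# Made-up excerpt from an hour-long speech, with made-up timings.
transcript = [
    TimedWord("Arctic", 12.0), TimedWord("ice", 12.4),
    TimedWord("is", 12.7), TimedWord("melting.", 12.9),
    TimedWord("The", 1510.2), TimedWord("polar", 1510.5),
    TimedWord("bear", 1510.9), TimedWord("is", 1511.3),
    TimedWord("losing", 1511.5), TimedWord("habitat.", 1512.0),
    TimedWord("Every", 2890.0), TimedWord("polar", 2890.4),
    TimedWord("bear,", 2890.8), TimedWord("she", 2891.3),
    TimedWord("said.", 2891.6),
]

print(find_phrase(transcript, "polar bear"))  # [1510.5, 2890.4]
```

Each number returned is a point in the recording, in seconds, where a player could start playback, which is what lets a reader jump from a search result to the speaker’s own voice.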
So from what you have described, Trint is clearly a service that will appeal not only to journalists, but many others, too.
Yes, that problematic scenario that I just described applies to so many people, not just journalists. It has become a part of the way we live today because the problem is that the technology’s gotten ahead of our ability to access the content it creates. Trint is our contribution to trying to help people catch up.
So Trint truly is revolutionary and a significant time saver.
In fact, so that people can understand how much transcription is a headache for journalists, part of Trint’s marketing swag is little candies that look like Tylenol or aspirin, and we call them Transcription relief.
A need to search audio and video recordings
Clearly the hours you had to spend transcribing as a broadcast journalist inspired you. Where else is your passion coming from?
You have to think big and have a passion and a vision to do this and I believe that we’re creating a new language.
The world needs a way to search audio and video recordings, find the content, validate that it’s correct, and share it . . . get it out there through journalism, social media, or through colleagues or corporate partners, whatever your needs.
We need to be able to access the content of recorded talk. And Trint is creating an incredibly intuitive language. And what’s critical is to understand that automated speech recognition has made huge strides.
We’ve all seen how good Siri is. When it was first introduced it was kind of a joke. Now it’s really good. Even so, being really good is not the same as being correct and trustworthy. We know that all of these things like Alexa, Siri, and IBM Watson – they make mistakes. And for a journalist, lawyer, business person or academic, those mistakes make it flawed data.
So as good as automated speech recognition has become, its errors disqualify it. As journalists, we can’t use a transcript until we know we can validate it.
And there’s no easy way to do that. You’ve got to go back to the recording, find the moment, listen to it. We might as well just do it all without automated speech because it takes a long time to find what we want.
Trint bridges that gap because we marry the text to the original audio or video. When you play the recording in your browser, you can follow the transcript and correct any errors directly in the browser copy.
How will Trint work with people whose first language is English but who have a heavy regional accent?
That’s actually a really smart question. It works really well and if you look at Trustpilot or our Twitter feed, you will see we get a ton of love.
But the truth is we also get people who think that Trint doesn’t work. They can be quite belligerent in their letters to us, saying, you know, this product isn’t what it claims to be on the box.
There are only two reasons why people would say that. One is they recorded bad audio, which could be from a Skype call, from a noisy coffee shop with music, or from the back of a lecture hall on their iPhone or Android, 100 feet from the speaker. That kind of audio won’t work with machine learning. And that’s an education. The other is heavy regional accents.
It will work with North American, British and Australian English. But with heavily accented English, you probably won’t be happy with the results. With Scottish and Welsh, we will eventually look at doing those. But they are small markets and we’re a small company, and you have to be quite harsh about where you are going to put your resources at the early stage. For us, it’s a matter of prioritizing our markets. We’ve got to pay our bills.
The Trint Player
How will Trint work with podcasts?
We’re building the Trint Player for launch in summer 2017. It will provide embeddable, searchable text aligned with audio and video. Users will be able to embed the Trint Player on websites, making audio and video content searchable on Google. People will be able to find that content and instantly send it out on Twitter, Facebook, wherever. There will even be a captioning facility too.
With the work that’s been done so far, the feedback you’ve been getting must feel incredibly rewarding.
If through Trint we can contribute a way to make journalism more efficient so that people can focus on content production and creation rather than simply being stenographers, then that’s a pretty great contribution that we as a team and Trint can make.
As you know, the problem is that journalists and many others are all being asked to do more with less, and faster, too. Searching content is the bottleneck. It really hasn’t changed amidst all of the changes in journalism over the last 30 years that I’ve witnessed.
The one part that hasn’t changed is how we get the content out of our recordings. And that means that you can’t get anything done until you find your quote or sound bite or clip, whatever you want to call it. What Trint does is condense that workflow massively.
It’s really gratifying to see the kind of e-mails and tweets we’re getting on our Twitter feed. It’s really, really fun and really rewarding to see people get it.
Journalist, entrepreneur, and CEO Jeff Kofman
Jeff Kofman is an award-winning journalist, and foreign and war correspondent. He’s a veteran reporter with ABC, CBS and CBC News and has worked in Canada, the U.S., Latin America, the Caribbean, and the U.K. He has reported on many of the biggest stories of our time including the Iraq War, Hurricane Katrina and the Arab Spring.
In 2011 he won an Emmy for his coverage of the Libyan Revolution and the downfall of Col. Muammar Gadhafi. He has also been recognized with an Edward R. Murrow Award, an Alfred I. duPont-Columbia University Award for Excellence in Journalism, a DuPont Award, and a special Emmy for coverage of the 9/11 World Trade Center attacks.
While with ABC, Jeff reported from around the globe on stories in the U.K. and Europe as well as the Middle East and Africa for World News with Diane Sawyer, Nightline, and Good Morning America. With CBC, he worked on the flagship news show The National with anchor Peter Mansbridge.
© Communicate Influence. Please see Communicate Influence’s Terms and Conditions for information on sharing, adapting or attributing content.