Regardless of medium, when it comes to accessibility, the goal is to make information available to the widest possible audience by presenting it in a variety of formats or through a platform that allows for easy interoperability with external devices that can transform it into other formats.
Key accessibility questions include:
Can we find it? - Do I have sufficient, properly structured metadata to make my content discoverable by as many audiences as possible?
Can we use it? - Is my content viewable or actionable by different methods (e.g. not just mouse clicks, but also keystrokes or menus that are compatible with speech commands)?
Can we read it? - Is my content available in multiple formats, or at least formats that are flexible and interoperable with other tools and software designed to enhance accessibility?
Can we participate or get help? - Does my content include ways for users to comment, interact with, or contact me?
Historically, federal accessibility regulations for audio have been far less rigorous than those for audiovisual media (e.g. closed captioning requirements for television). Web-based media platforms, however, provide much better opportunities for enhancing the accessibility of audio content than were available in the days when radio reigned supreme. Two of the most common options are transcripts and closed captioning.
Audio hosting platforms often lack robust features for transcripts and captioning, so making audio content more accessible usually means embedding an audio player from your hosting site of choice on a different site and augmenting it with supporting text and metadata.
Audio production software varies in its support for screen readers. Two recording and editing platforms that do have dedicated support are the PC version of Audacity and both the Mac and PC versions of Reaper, which uses a third-party plugin for greater screen reader support. VoiceOver support in the Mac applications GarageBand and Logic Pro varies with operating system and software version.
When it comes to planning for accessibility, earlier is always better. It is much more straightforward to design for accessibility in the first place than it is to go back and build it in later. With that in mind, make sure you have a plan for creating, hosting, and making available a transcript of any podcast episode or audio essay you create. Providing text along with audio ensures that what you have worked so hard to create can reach more types of audiences, and promotes equitable access to knowledge and information.
A second, more dynamic option is closed captioning, which provides smaller text snippets in real time along with the audio track as it plays.
Finally, there is a hybrid of the two called an interactive transcript, which displays a full transcript alongside the audio content and highlights the active snippet of text in real time as the audio plays.
As you will note, all of these features place text at the forefront. This is because text can be more easily manipulated and transformed than audio, allowing for devices like screen readers to slow it down, speed it up, search it for key terms, or otherwise transform it with various types of accessibility software.
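To make the idea of an interactive transcript more concrete, here is a minimal Python sketch of the core lookup: given the current playback time, find which snippet of text should be highlighted. The cue list, the `active_cue_index` and `render` helpers, and the `>>` marker are all made up for illustration; a real player would do this in the browser and restyle the active text rather than print it.

```python
from bisect import bisect_right

# Hypothetical transcript cues: (start_seconds, end_seconds, text).
CUES = [
    (0.0, 4.0, "Welcome to the show."),
    (4.0, 9.5, "Today we are talking about accessibility."),
    (12.0, 15.0, "Let's get started."),
]


def active_cue_index(cues, playback_time):
    """Return the index of the cue to highlight, or None during a gap."""
    starts = [start for start, _end, _text in cues]
    # bisect_right counts how many cues start at or before playback_time.
    i = bisect_right(starts, playback_time) - 1
    if i >= 0 and playback_time < cues[i][1]:
        return i
    return None


def render(cues, playback_time):
    """Return the full transcript with '>> ' marking the active cue."""
    active = active_cue_index(cues, playback_time)
    lines = []
    for i, (_start, _end, text) in enumerate(cues):
        marker = ">> " if i == active else "   "
        lines.append(marker + text)
    return "\n".join(lines)


print(render(CUES, 5.2))
```

Because the whole transcript stays on screen as ordinary text, it remains searchable and readable by a screen reader even when nothing is playing.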
Accessibility can also be improved by adding helpful metadata to your content that makes it more discoverable and easier to navigate through. Here is a guide to editing metadata for audio files.
The Web Content Accessibility Guidelines (WCAG) were developed by the W3C, the leading independent standards body for the World Wide Web. They have since been codified in U.S. legal settlements and federal communications regulations.
Key requirements under WCAG include:
Transcripts AND closed captioning are required for all web-based audio and video content. This means there is a sea of non-compliant content out there, including lots of material created by educational institutions.
Playback must be 'keyboard functional.' This means any audio or video player that has mouse controls must have corresponding alternate keyboard control options for playback (stop, play, forward, rewind, pause).
'Equivalent Information.' This requirement is troublesome and hard to parse because of the inherent differences that exist across media types as well as types of sensory perception (e.g. sight vs. sound). That said, the crux of it lies in making sure that if there is auditory or visual information that would not be readily evident in a text transcript (e.g. the arrival of a new speaker or the inclusion of background music), then additional textual cues or metadata should be included to describe or represent these elements.
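As a rough illustration of what those additional textual cues can look like, the Python sketch below formats a transcript so that each change of speaker is labeled and non-speech sounds appear in square brackets. The segment structure (`kind`, `speaker`, `text`) and the `format_transcript` helper are assumptions made up for this example, not a required format.

```python
# Hypothetical segments: each is either speech (with a speaker) or a
# non-speech sound that a listener hears but a plain transcript would omit.
SEGMENTS = [
    {"kind": "sound", "text": "upbeat theme music"},
    {"kind": "speech", "speaker": "Host", "text": "Welcome back."},
    {"kind": "speech", "speaker": "Host", "text": "Our guest joins us now."},
    {"kind": "speech", "speaker": "Guest", "text": "Thanks for having me."},
    {"kind": "sound", "text": "laughter"},
]


def format_transcript(segments):
    """Label every change of speaker and bracket every non-speech sound."""
    lines = []
    current_speaker = None
    for seg in segments:
        if seg["kind"] == "sound":
            lines.append(f"[{seg['text']}]")
            # Force a fresh speaker label after the interruption.
            current_speaker = None
        elif seg["speaker"] != current_speaker:
            lines.append(f"{seg['speaker']}: {seg['text']}")
            current_speaker = seg["speaker"]
        else:
            # Same speaker continues: extend the previous line.
            lines[-1] += " " + seg["text"]
    return "\n".join(lines)


print(format_transcript(SEGMENTS))
```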
For more detailed information on best practices for accessibility, check out U-M Library's Digital Accessibility guide.
This portion of the LibGuide is intended as a resource to assist with audio transcriptions. It includes links, descriptions, and ratings of various third-party programs that use artificial intelligence (AI) to generate transcripts, and explains how you can use these transcripts either for the audio itself or in conjunction with the creation of video captions. Some of these services also offer human transcription, at a higher cost.
There are a variety of uses for transcribing audio in this way:
Our ratings are based on tests run with multiple audio interviews from the Michigan Time podcast, which was created by Maggie Cease, former Design Lab Resident.
Please note that there are other platforms available that were built after we ran these tests. We hope to do a similar test with these platforms in the future and will update the table accordingly. Other platforms to check out include Otter, Descript, and Sonix.
Please use the table to navigate between the different sections (organized by transcription platform), which will give you greater detail about each platform. If you have any questions, please feel free to email the Shapiro Design Lab at firstname.lastname@example.org.
|Audio Transcription Service|Accuracy|Ease of Use|Cost|Response Time|Editing Time|Notes|
| |Okay|Easy|Free|Fast|Medium Slow|Free with Canvas account, intuitive interface, not as accurate as other services|
| |Bad|Intermediate|Free|Slow|Slow|Free, listens in real time (speaker output can be made into input), so speed of audio should be slow. Must be connected to the internet|
| |Best|Easy (browser), Intermediate/Hard (command line)|Low Cost|Fast|Fast|First five hours free, can be done in browser or on command line|
| |Good|Easy|Moderate Cost|Fast|Fast|Free one-time usage, API documentation available but also available in user-friendly interface|
| |Best|Easy|Costly|Fast|Fast|Easily editable, user-friendly interface, harder to delete files|
| |Okay|Intermediate|Free|Medium|Medium Slow|Free, takes audio from .mp4 or other video format file (great for captioning videos)|
Note: You may need to slow down the playback speed of your audio; otherwise dictation.io may miss a few words. We use Audacity, a freely available audio editor, to do so.
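If you would rather script the slowdown, here is a minimal sketch using only Python's standard `wave` module. It is not what Audacity does internally: it simply rewrites an uncompressed WAV file with a lower declared frame rate, so the audio plays more slowly and the pitch drops with it. The `slow_down_wav` helper and the file names in the usage comment are hypothetical.

```python
import wave


def slow_down_wav(src_path, dst_path, speed=0.75):
    """Write a copy of src_path that plays at `speed` times the original rate.

    The samples are unchanged; only the declared frame rate is lowered, so
    the pitch drops along with the speed.
    """
    if not 0 < speed <= 1:
        raise ValueError("speed must be greater than 0 and at most 1")
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(params.nframes)
    new_rate = max(1, round(params.framerate * speed))
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(params.nchannels)
        dst.setsampwidth(params.sampwidth)
        dst.setframerate(new_rate)
        dst.writeframes(frames)
    return new_rate


# Example call (hypothetical file names):
# slow_down_wav("interview.wav", "interview_slow.wav", speed=0.75)
```

This only works for WAV files, so an mp3 would need to be exported as WAV first. If the lower pitch makes speech harder to recognize, a tempo change that preserves pitch in an audio editor is the better choice.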
Note: At the original time of publication, you could only use Rev.AI via the terminal/command line, and the instructions below reflect this workflow. However, recently they have added the ability to use this tool through the browser, meaning that you can upload audio directly to your account and then download either a JSON or TXT file of the final transcript. Use the link below to visit the site and follow their instructions for transcribing a piece of audio. The browser version also gives you the option of entering a custom vocabulary, which can increase accuracy and cut down editing time. Also, the platform can now provide transcriptions in five different languages: English, French, German, Portuguese, and Spanish.
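If you download a JSON file rather than TXT, you will need to turn it into readable text yourself. The sketch below shows one way to do that in Python. The JSON layout used here (a list of `monologues`, each with a numeric `speaker` and a list of `elements`) is an assumption made up for illustration, so check the structure of your own download and adjust the field names to match. The `json_to_text` helper is likewise hypothetical.

```python
import json

# Hypothetical export; the real field names depend on the service.
SAMPLE_JSON = """
{
  "monologues": [
    {"speaker": 0, "elements": [
      {"type": "text", "value": "Hello"},
      {"type": "punct", "value": ", "},
      {"type": "text", "value": "everyone"},
      {"type": "punct", "value": "."}
    ]},
    {"speaker": 1, "elements": [
      {"type": "text", "value": "Hi"},
      {"type": "punct", "value": "!"}
    ]}
  ]
}
"""


def json_to_text(raw, speaker_names=None):
    """Flatten the assumed JSON layout into one labeled line per speaker turn."""
    speaker_names = speaker_names or {}
    data = json.loads(raw)
    lines = []
    for mono in data["monologues"]:
        speaker_id = mono["speaker"]
        # Fall back to a generic label when no name has been supplied.
        name = speaker_names.get(speaker_id, f"Speaker {speaker_id}")
        words = "".join(el["value"] for el in mono["elements"])
        lines.append(f"{name}: {words.strip()}")
    return "\n".join(lines)


print(json_to_text(SAMPLE_JSON, {0: "Interviewer"}))
```

Passing a dictionary of names replaces the generic speaker labels in one step, which saves some editing time later.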
Transcribe Audio File.
Once the file is uploaded, click the yellow “Checkout” button at either the top right or the bottom right of the page.
After checkout, Temi will begin to produce a transcription for you complete with timestamps and speaker identification.
Transcription time will be dependent on how large your file is, but the transcription is typically ready in 5-10 minutes.
Once Temi is done transcribing your file, it will appear under the “Dashboard” tab. To get there from the homepage, click on your username at the top right of the page and choose “Dashboard,” the first option in the drop-down menu. Then click the “View Transcript” button for your file (listed under “Status”).
Make any necessary edits.
Add speaker names by clicking on the “Add speaker” option on the left, which shows up again each time Temi identifies a change in speaker. You have the option to change all “Speaker 1” identifications to a specific name (similar to a “find and replace” feature).
Replace any missed words by clicking on them and typing in the correct word.
Once you have reviewed the transcript and edited it to your satisfaction, click on the “Download” button on the top right of Temi’s interface and save your file.
Temi can export a variety of file types (.docx, .pdf, .txt, .srt, .vtt) and gives you the option to include speaker names, timestamps, or export only highlighted sections.
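The two caption formats in that list, .srt and .vtt, are closely related. If you only have an .srt file and your web player expects WebVTT, a simple conversion is often enough. The Python sketch below is a minimal version that assumes a well-formed SRT file: it adds the `WEBVTT` header and swaps the comma in each timestamp for a period. The `srt_to_vtt` helper and the sample captions are made up for illustration, and styling or positioning information is not handled.

```python
import re

# Matches an SRT timing line such as "00:00:04,000 --> 00:00:09,500".
TIMESTAMP_LINE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})(.*)$"
)

SAMPLE_SRT = """1
00:00:00,000 --> 00:00:04,000
Welcome to the show.

2
00:00:04,000 --> 00:00:09,500
Today we are talking about accessibility.
"""


def srt_to_vtt(srt_text):
    """Convert well-formed SRT caption text to WebVTT text."""
    out = ["WEBVTT", ""]
    # Drop a leading byte-order mark if the file was saved with one.
    for line in srt_text.lstrip("\ufeff").strip().splitlines():
        m = TIMESTAMP_LINE.match(line.strip())
        if m:
            # WebVTT uses a period before the milliseconds.
            out.append(
                f"{m.group(1)}.{m.group(2)} --> {m.group(3)}.{m.group(4)}{m.group(5)}"
            )
        else:
            out.append(line.rstrip())
    return "\n".join(out) + "\n"


print(srt_to_vtt(SAMPLE_SRT))
```

Only the timing lines are rewritten, so commas inside the caption text itself are left alone.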
Converting your mp3 file to mp4 via Adobe Premiere (or MPEG Streamclip if you do not have access to Premiere). You can find Adobe Premiere by signing in through the ‘apps anywhere’ link or on the Library computers across campus.
Uploading your file to YouTube.
If you would like to explore these tools more or need assistance, please feel free to email the Shapiro Design Lab at email@example.com.