Today I spoke with developers, researchers, and repository managers attending the Open Repositories 2012 conference in Edinburgh and obtained valuable feedback on the idea of a device for automated transcription and deposit of audio files.
The most direct way to implement the device is as a mobile phone application. Fortunately, a similar application was presented at last year’s conference in Austin; it allowed smartphone users to deposit mobile camera photos into repositories via the SWORD protocol. That deposit application is described in more detail here:
It should be possible to extend this application to allow SWORD deposit of smartphone-recorded audio files, together with an associated, automatically generated transcription. Here’s an unrefined interface sketch, intended to serve as a discussion aid and to help people find design problems.
From the mobile app, the user can record an audio file, request automatic transcription of the audio file, and deposit files into the repository.
When transcribing audio files, the user can choose between two cloud transcription services: Microsoft Research MAVIS (computerized speech recognition) or Amazon Mechanical Turk (crowdsourced transcription).
When depositing into the repository, the user may choose to submit the original audio, the text transcript, or both files. Depending on repository capabilities, the files can be submitted into a private workspace to protect data that should not be available to the world. However, the application is intended for oral history projects, where the goal is widespread public availability.
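To make the deposit step described above more concrete, here is a minimal Python sketch of the choices the app would need to model. It is not taken from last year’s photo deposit app: the names DepositChoice, Recording, files_to_deposit, target_collection, and sword_headers are hypothetical, as are both collection URLs. It assumes the repository accepts SWORD v2 style binary deposits, and it only builds the request details without sending anything.

```python
import mimetypes
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional

# Hypothetical collection endpoints; a real app would read these from the
# repository's SWORD service document.
PUBLIC_COLLECTION = "https://repository.example.org/sword/collection/oral-history"
PRIVATE_COLLECTION = "https://repository.example.org/sword/collection/private-workspace"


class DepositChoice(Enum):
    AUDIO = "audio"
    TRANSCRIPT = "transcript"
    BOTH = "both"


@dataclass
class Recording:
    audio_path: str
    transcript_path: Optional[str] = None  # stays None until a transcript exists


def files_to_deposit(rec: Recording, choice: DepositChoice) -> List[str]:
    """Return the files the user asked to submit, audio first."""
    if choice is DepositChoice.AUDIO:
        return [rec.audio_path]
    if rec.transcript_path is None:
        raise ValueError("no transcript is available for this recording yet")
    if choice is DepositChoice.TRANSCRIPT:
        return [rec.transcript_path]
    return [rec.audio_path, rec.transcript_path]


def target_collection(private: bool) -> str:
    """Pick the collection to deposit into (assumes the repository offers both)."""
    return PRIVATE_COLLECTION if private else PUBLIC_COLLECTION


def sword_headers(filename: str) -> Dict[str, str]:
    """Build headers for a single-file binary deposit (assumed SWORD v2 style)."""
    mime, _ = mimetypes.guess_type(filename)
    return {
        "Content-Type": mime or "application/octet-stream",
        "Content-Disposition": f"attachment; filename={filename}",
        "Packaging": "http://purl.org/net/sword/package/Binary",
        "In-Progress": "false",
    }
```

For example, files_to_deposit(Recording("interview.wav", "interview.txt"), DepositChoice.BOTH) yields both paths, and each path would then be sent to target_collection(private=False) with its own sword_headers. Whether a private workspace exists at all depends on the repository, so that switch is the part most likely to change in a real prototype.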
I believe that this approach would be feasible as a (long) weekend hack for an experienced mobile app developer, considering that the SWORD deposit groundwork has already been developed.
The app allows for easy transcription and deposit by historians, archivists, and others. In addition, depositing the full-text transcription of audio files into the repository enables full indexing, so that audio files in the repository can be found by keyword search. Finally, the text transcript can provide access to users who prefer or require information as text instead of audio, whether for reasons of social appropriateness, time savings, or auditory impairment.
I am stuck on the important parts of this idea and am seeking people who can refine/prototype/pitch this app for the Open Repositories 2012 Developer Challenge.