Researchers at Uppsala University have a great need for legally compliant and efficient transcription services. Traditionally, transcription has been tedious manual work associated with large costs. In addition, outsourcing the transcription of sensitive data is legally complicated, so a workaround has been to temporarily employ a person to do the transcription. This requires a lot of administration and takes time and resources.
Recent developments in AI have paved the way for automated machine transcription. Until recently, this has been a service mainly offered by companies outside Sweden, again leading us into a legal gray zone regarding the handling and storage of sensitive data.
An on-prem installation of the open-source speech recognition software OpenAI Whisper will solve many of the problems above. It requires substantial compute power to run smoothly, typically a T4 GPU.
- No legal problems: on-prem, so data never leaves UU
- Compute capacity problems
- No costly internal billing administration
- Open-source, free software, so no procurement issues
- Whisper is the best-trained transcription software to date
The outcome of this project will be a step-by-step guide on how to use Whisper effectively at Uppmax.
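As a rough illustration of what one step in such a guide might look like, the sketch below builds a command line for the `whisper` tool that ships with the open-source `openai-whisper` Python package. The file name `interview.wav`, the output directory `transcripts`, the model size `medium`, and the language `sv` are assumptions chosen for the example, not project decisions, and the helper `build_whisper_command` is hypothetical. How the command would be scheduled on a GPU node at Uppmax is not covered here.

```python
import shlex
from pathlib import Path


def build_whisper_command(audio_path, model="medium", language="sv",
                          output_dir="transcripts", device="cuda"):
    """Return the argument list for the openai-whisper command-line tool.

    All default values are illustrative assumptions, not recommendations.
    """
    return [
        "whisper", str(audio_path),
        "--model", model,                 # model size, e.g. tiny/base/small/medium/large
        "--language", language,           # spoken language of the recording
        "--output_dir", str(output_dir),  # where the transcript files are written
        "--output_format", "txt",         # plain-text transcript
        "--device", device,               # "cuda" to run on a GPU, "cpu" otherwise
    ]


if __name__ == "__main__":
    # Hypothetical input file; print the command instead of running it.
    cmd = build_whisper_command(Path("interview.wav"))
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch runnable on a machine without Whisper or a GPU; on a node where Whisper is installed, the same list could be passed to `subprocess.run`.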