In the second part of this series, we describe the transcription component of the implementation. Analytiq Hub developed this project for Boston Medical Data.
Medical Screening Call Transcription
Transcription can be implemented with AWS Transcribe, or, more specifically, given the medical subject of the conversation, AWS Transcribe Medical. Conversation summary notes can be implemented with AWS HealthScribe.
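As a rough illustration of what a batch AWS Transcribe Medical job looks like, here is a minimal Python sketch using boto3. The job name, S3 URI, bucket name, and helper function names are hypothetical placeholders, not part of the project described here. The request builder is kept separate from the AWS call so the parameters can be inspected without credentials.

```python
from typing import Any, Dict


def build_medical_job_request(
    job_name: str,
    media_uri: str,
    output_bucket: str,
    max_speakers: int = 2,
) -> Dict[str, Any]:
    """Build keyword arguments for boto3's start_medical_transcription_job."""
    return {
        "MedicalTranscriptionJobName": job_name,
        "LanguageCode": "en-US",  # Transcribe Medical supports US English only
        "Media": {"MediaFileUri": media_uri},
        "OutputBucketName": output_bucket,  # required for medical jobs
        "Specialty": "PRIMARYCARE",
        "Type": "CONVERSATION",  # a two-way call rather than dictation
        "Settings": {
            # Label speakers so the caller and the screener can be told apart.
            "ShowSpeakerLabels": True,
            "MaxSpeakerLabels": max_speakers,
        },
    }


def start_medical_job(request: Dict[str, Any]) -> str:
    """Submit the job and return its initial status (needs AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("transcribe")
    response = client.start_medical_transcription_job(**request)
    return response["MedicalTranscriptionJob"]["TranscriptionJobStatus"]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    request = build_medical_job_request(
        "screening-call-001",
        "s3://example-input-bucket/calls/call-001.wav",
        "example-output-bucket",
    )
    print(request)
```

Calling `start_medical_job(request)` would submit the job; the transcript is then written as JSON to the output bucket once the job completes.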
AWS has a number of blog posts describing reference implementations for the call transcription workflow:
- Use generative AI to increase agent productivity through automated call summarization (Nov 2023). This describes transcription of support center calls – a slightly different problem, where AWS Transcribe is a good fit for transcription. In our case, for medical calls, AWS Transcribe Medical is a better fit – but the architecture can remain unchanged. Here is the reference architecture from the blog post:
Another AWS blog post describing call transcription is:
- Principal Financial Group uses AWS Post Call Analytics solution to extract omnichannel customer insights (Nov 2023). This design is also for customer contact center calls – and would have to be adapted by replacing AWS Transcribe with AWS Transcribe Medical. Here is the architecture diagram for this solution:
A third blog post describing a reference design for call transcription is:
- Create summaries of recordings using generative AI with Amazon Bedrock and Amazon Transcribe (Dec 2023). Here, also, AWS Transcribe would have to be replaced with AWS Transcribe Medical.
This solution comes with a full GitHub repo implementation from AWS, using CloudFormation, AWS Transcribe, AWS Bedrock, and a Step Functions state machine. The architecture diagram in this implementation is:
The state machine in this design is described below:
What are the differences between AWS Transcribe and AWS Transcribe Medical?
Feature | AWS Transcribe | AWS Transcribe Medical |
---|---|---|
Purpose | General speech-to-text | Medical speech-to-text |
Supported languages | Wider range | US English only |
Accuracy | Good for general audio | Highly accurate for medical terminology |
Features | Speaker ID, custom vocabularies, real-time transcription | Medical speaker diarization, medical term identification, HIPAA compliance |
Use cases | Interviews, meetings, podcasts, etc. | Clinical documentation, pharmacovigilance, telehealth, etc. |
- Both AWS Transcribe and AWS Transcribe Medical support real-time streaming, where transcription results can be read as the audio is being streamed.
- Both also support custom vocabularies. These are pretty easy to set up, and improve transcription accuracy for domain-specific terms.
- AWS Transcribe supports custom language models.
  - You can train a model specifically for your domain to improve accuracy for specialized terminology and speech patterns.
  - Training requires a significant amount of domain-specific text data; up to 2 GB can be supplied.
  - This feature is not supported by AWS Transcribe Medical.
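
The differences in the list above show up directly in the batch job request parameters. The following is a minimal sketch, assuming boto3's `start_transcription_job` and `start_medical_transcription_job` calls. The function name `build_job_request` and all vocabulary, model, and bucket names are hypothetical.

```python
from typing import Any, Dict, Optional


def build_job_request(
    job_name: str,
    media_uri: str,
    *,
    medical: bool,
    vocabulary_name: Optional[str] = None,
    language_model_name: Optional[str] = None,
    output_bucket: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a general or a medical transcription job."""
    if medical:
        # Custom language models are only available in general AWS Transcribe.
        if language_model_name is not None:
            raise ValueError("Transcribe Medical does not support custom language models")
        if output_bucket is None:
            raise ValueError("medical jobs require an output bucket")
        request: Dict[str, Any] = {
            "MedicalTranscriptionJobName": job_name,
            "LanguageCode": "en-US",
            "Media": {"MediaFileUri": media_uri},
            "OutputBucketName": output_bucket,
            "Specialty": "PRIMARYCARE",
            "Type": "CONVERSATION",
        }
        if vocabulary_name is not None:
            # Must name a vocabulary created as a *medical* custom vocabulary.
            request["Settings"] = {"VocabularyName": vocabulary_name}
        return request

    request = {
        "TranscriptionJobName": job_name,
        "LanguageCode": "en-US",
        "Media": {"MediaFileUri": media_uri},
    }
    if vocabulary_name is not None:
        request["Settings"] = {"VocabularyName": vocabulary_name}
    if language_model_name is not None:
        request["ModelSettings"] = {"LanguageModelName": language_model_name}
    return request
```

The resulting dictionary would be passed as `client.start_transcription_job(**request)` or `client.start_medical_transcription_job(**request)` on a boto3 `transcribe` client, depending on the `medical` flag.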