Unmatched Accuracy and Lightning Speed in Python for Speech Recognition



AI Nuggets

AssemblyAI API Tutorial Outline

Getting Started with AssemblyAI's Universal-1 Model

  1. Import AssemblyAI's Python SDK
    • Command: import assemblyai as aai
    • If not installed: pip install assemblyai
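  2. Set AssemblyAI API Key
    • Assign your key to aai.settings.api_key before making any request.
  3. Define Audio Data
    • Provide a URL to an audio or video file, or the path to a local file on your system.
  4. Transcribe Audio Data
    • Create a transcriber object: transcriber = aai.Transcriber()
    • Call its transcribe method with the audio URL or file path: transcript = transcriber.transcribe(audio_url)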
    • To get only the transcription text: transcript.text
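The four steps above can be sketched as a single function. This is a minimal sketch, not the tutorial's exact code: the function name transcribe_text, the ASSEMBLYAI_API_KEY environment variable, and the example.com URL are assumptions made for illustration. The SDK is imported inside the function so the file still loads where assemblyai is not installed.

```python
import os


def transcribe_text(audio_source, api_key=None):
    """Transcribe a URL or local file path and return only the text."""
    # Assumption: the key is passed in or stored in ASSEMBLYAI_API_KEY.
    api_key = api_key or os.environ.get("ASSEMBLYAI_API_KEY")
    if not api_key:
        raise ValueError("Set ASSEMBLYAI_API_KEY or pass api_key explicitly")

    # Step 1: import the SDK (install with: pip install assemblyai).
    import assemblyai as aai

    # Step 2: set the API key.
    aai.settings.api_key = api_key

    # Steps 3 and 4: create a transcriber and transcribe the audio source.
    transcriber = aai.Transcriber()
    transcript = transcriber.transcribe(audio_source)

    # Failed jobs are reported through the status and error fields.
    if transcript.status == aai.TranscriptStatus.error:
        raise RuntimeError(transcript.error)
    return transcript.text


# Usage (placeholder URL, replace with a real audio or video file):
# print(transcribe_text("https://example.com/audio.mp3"))
```

Checking the transcript status before reading transcript.text is a defensive choice in this sketch; the outline itself only mentions reading the text.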

Customizing Output with the AssemblyAI API

  1. Using the Universal-1 Model
    • By default, the API uses the Universal-1 model when calling the transcribe function.
  2. Using Nano Tier for Bulk Transcription
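    • Pass a config argument to the transcriber with speech_model set to nano. The SDK expects an aai.TranscriptionConfig object, not a plain dict.
    • Example: transcriber.transcribe(audio_url, config=aai.TranscriptionConfig(speech_model=aai.SpeechModel.nano))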
  3. Disabling Punctuation and Formatting
    • Set punctuate to False and format_text to False in the config.
    • Example: transcriber.transcribe(audio_url, config=aai.TranscriptionConfig(punctuate=False, format_text=False))
  4. Transcribing Non-English Audio Files
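    • Set language_code to the desired language code, for example 'es' for Spanish.
    • For automatic language detection, set language_detection to True instead of specifying a language_code.
    • Example with a fixed language: transcriber.transcribe(audio_url, config=aai.TranscriptionConfig(language_code='es'))
    • Example with detection: transcriber.transcribe(audio_url, config=aai.TranscriptionConfig(language_detection=True))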
  5. Getting Speaker Labels
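    • Set speaker_labels to True in the config.
    • Print the results of the speaker diarization by iterating over transcript.utterances; each utterance carries a speaker label and its text.
    • Example: transcriber.transcribe(audio_url, config=aai.TranscriptionConfig(speaker_labels=True))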
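The customizations listed above can be sketched together as follows. This is an illustrative sketch under a few assumptions: the helper names build_configs, format_utterances, and transcribe_with_speakers are hypothetical, aai.settings.api_key is assumed to be set already, and the installed SDK version is assumed to expose aai.SpeechModel.nano and the language_detection parameter.

```python
def build_configs():
    """Return one TranscriptionConfig per customization in the list above."""
    # Imported lazily so this file loads even without the SDK installed.
    import assemblyai as aai

    return {
        # Nano tier for bulk transcription.
        "nano": aai.TranscriptionConfig(speech_model=aai.SpeechModel.nano),
        # No punctuation and no text formatting.
        "raw_text": aai.TranscriptionConfig(punctuate=False, format_text=False),
        # Fixed language (Spanish) versus automatic detection.
        "spanish": aai.TranscriptionConfig(language_code="es"),
        "detect_language": aai.TranscriptionConfig(language_detection=True),
        # Speaker diarization.
        "speakers": aai.TranscriptionConfig(speaker_labels=True),
    }


def format_utterances(utterances):
    """Render diarized utterances as 'Speaker X: text' lines."""
    return [f"Speaker {u.speaker}: {u.text}" for u in utterances]


def transcribe_with_speakers(audio_source):
    """Transcribe with speaker labels and return the formatted lines."""
    import assemblyai as aai

    config = aai.TranscriptionConfig(speaker_labels=True)
    transcript = aai.Transcriber().transcribe(audio_source, config=config)
    if transcript.status == aai.TranscriptStatus.error:
        raise RuntimeError(transcript.error)
    return format_utterances(transcript.utterances)


# Usage (placeholder URL, replace with a real file):
# for line in transcribe_with_speakers("https://example.com/meeting.mp3"):
#     print(line)
```

Keeping format_utterances separate from the API call means the output formatting can be checked with stand-in objects, without spending transcription hours.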

Pricing and Usage Tracking

  • Pricing
    • Best tier: $0.37 per hour.
    • Nano tier: $0.12 per hour.
  • Usage Tracking
    • Visit the AssemblyAI dashboard to track transcription hours and costs.
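As a rough aid for budgeting, the per-hour rates above can be turned into a cost estimate. This is a sketch only: the function name estimate_cost is hypothetical, it assumes cost scales linearly with audio duration, and the rates are the ones quoted in this outline, which may differ from current pricing.

```python
from decimal import Decimal

# Per-hour rates as quoted in this outline; verify against current pricing.
RATES_PER_HOUR = {"best": Decimal("0.37"), "nano": Decimal("0.12")}


def estimate_cost(audio_seconds, tier="best"):
    """Estimate the cost in dollars for a given amount of audio."""
    hours = Decimal(audio_seconds) / Decimal(3600)
    # Round to whole cents for display.
    return (hours * RATES_PER_HOUR[tier]).quantize(Decimal("0.01"))


# Ten hours of audio on each tier.
print(estimate_cost(36000, "best"))  # 3.70
print(estimate_cost(36000, "nano"))  # 1.20
```

The dashboard remains the authoritative record of actual usage; the estimate is only useful for planning before a bulk job.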

Additional Resources

  • AssemblyAI Documentation
  • YouTube Tutorials
    • Links to related tutorials are listed in the video description.
  • Next Steps
    • To learn how to transcribe a stream of audio with AssemblyAI, watch the suggested video.

Notes

  • The code examples and configurations can be found in the AssemblyAI documentation under the speech recognition section.
  • The tutorial demonstrates how to use the AssemblyAI Python SDK to transcribe audio data and customize the transcription output.
  • It also covers choosing between the Best and Nano tiers and handling audio in languages other than English.
  • Pricing and usage tracking are included so that users can manage their costs.