ListenX Medium
Speech Recognition Model · Speech-to-Text

What ListenX Medium Does
ListenX Medium is an advanced speech recognition model designed for accurate transcription across multiple languages and accents. Built on state-of-the-art transformer architectures, it delivers exceptional accuracy while maintaining reasonable computational requirements.
The model balances accuracy against computational cost, making it well suited for real-time transcription, voice assistants, and automated captioning systems.
Key Features
- High Accuracy – 95%+ word accuracy across multiple languages
- Real-time Processing – Low latency for live transcription applications
- Multilingual Support – Supports 20+ languages including Arabic and English (see the language-selection sketch after this list)
- Noise Robustness – Performs well in noisy environments
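Multilingual transcription usually requires telling the decoder which language to emit. The snippet below is a minimal, hypothetical sketch of how that could look if ListenX Medium uses a Whisper-style processor exposing get_decoder_prompt_ids; the language choice, audio file name, and 16 kHz sampling rate are illustrative assumptions, not documented behavior of this model.

import librosa
import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

model = AutoModelForSpeechSeq2Seq.from_pretrained("tokenaii/ListenX-Medium")
processor = AutoProcessor.from_pretrained("tokenaii/ListenX-Medium")

# Assumption: a Whisper-style processor that can build language/task prompt ids
forced_decoder_ids = processor.get_decoder_prompt_ids(language="arabic", task="transcribe")

# Illustrative input: 16 kHz mono audio loaded from disk
audio_array, _ = librosa.load("arabic_sample.wav", sr=16000)
inputs = processor(audio_array, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    predicted_ids = model.generate(inputs.input_features, forced_decoder_ids=forced_decoder_ids)

print(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])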
System Requirements

- GPU Memory: 0
- Model Size: ~800MB
- Latency: ~100ms
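As a rough sanity check against the figures above, the snippet below compares free GPU memory with the ~800MB weight size. It is an illustrative sketch only: actual memory use will be higher once activations and batching are included, and the threshold comes from the table, not from profiling.

import torch

MODEL_WEIGHT_BYTES = 800 * 1024**2  # ~800MB weight file, taken from the table above

if torch.cuda.is_available():
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    print(f"Free GPU memory: {free_bytes / 1e9:.2f} GB of {total_bytes / 1e9:.2f} GB")
    print("Room for the weights alone:", free_bytes > MODEL_WEIGHT_BYTES)
else:
    print("No CUDA device found; inference would have to run on CPU.")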
How to Use
Load and Use the Model
"keyword">from transformers "keyword">import AutoModelForSpeechSeq2Seq, AutoProcessor
"keyword">import torch
# Load model and processor
model = AutoModelForSpeechSeq2Seq.from_pretrained("tokenaii/ListenX-Medium")
processor = AutoProcessor.from_pretrained("tokenaii/ListenX-Medium")
# Load audio file
audio_input = processor(audio_file, sampling_rate="number">16000, return_tensors="pt")
# Generate transcription
"keyword">with torch.no_grad():
predicted_ids = model.generate(**audio_input)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens="constant">True)
print("Transcription:", transcription["number">0])Download the Model File Only
"keyword">from huggingface_hub "keyword">import hf_hub_download
# Download the model file "keyword">from the repo
model_path = hf_hub_download(
repo_id="tokenaii/ListenX-Medium",
filename="pytorch_model.bin"
)
print("Model downloaded to:", model_path)Can I Run This Model?
