whisper2024-03-25

NISHIO Hirokazu

prev Looking back on how I have used Whisper
Whisper

I passed a 3h44m recording to the Whisper API
Speech to text - OpenAI API
 python whisper.py  0.37s user 0.18s system 0% cpu 10:00.79 total 
The second half was nothing but noise, so it was effectively 2 hours of audio

Trying it on 1 hour of study-session audio
 python whisper.py  0.34s user 0.14s system 0% cpu 3:27.50 total 
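Unlike the previous example, this one has someone talking for almost the entire hour
For 1 hour of audio: about 3.5 minutes of processing time and about 40 cents in cost
I had Claude summarize the resulting transcript
Transcribing study-session audio and summarizing it with AI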
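
The timing and cost figures above can be sanity-checked with a little arithmetic. This is a rough sketch, assuming whisper-1 is billed at $0.006 per minute of audio (an assumption on my part, not stated in these notes; check current pricing). The helper names are made up for illustration.

```python
# ASSUMPTION: whisper-1 costs $0.006 per minute of audio. Verify against current pricing.
PRICE_PER_MINUTE_USD = 0.006


def estimate_cost_usd(audio_minutes):
    """Estimated API cost for a recording of the given length in minutes."""
    return audio_minutes * PRICE_PER_MINUTE_USD


def speedup(audio_seconds, wall_seconds):
    """How many times faster than real time the transcription ran."""
    return audio_seconds / wall_seconds


# 1-hour study session, wall time 3:27.50 from the `time` output above
print(f"1h:    ${estimate_cost_usd(60):.2f}, "
      f"{speedup(60 * 60, 3 * 60 + 27.5):.1f}x real time")

# 3h44m recording, wall time 10:00.79 from the `time` output above
print(f"3h44m: ${estimate_cost_usd(3 * 60 + 44):.2f}, "
      f"{speedup((3 * 60 + 44) * 60, 10 * 60 + 0.79):.1f}x real time")
```

Under that pricing assumption, one hour comes to roughly 36 cents, which is consistent with the "about 40 cents" noted above.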

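There is nothing difficult about the code
python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# audio_path, indir, and audio are set earlier in the script (not shown here)
with open(audio_path, "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-1", file=audio_file
    )
print(transcription.text)
with open(f"whisper_out/{indir}_{audio}.txt", "w") as f:
    f.write(transcription.text)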

What took the longest was installing ffmpeg; it starts pulling in gcc and other dependencies
$ brew install ffmpeg
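Note that ffmpeg is only there for splitting audio files, so it is unnecessary if the audio file is 25 MB or smaller
The hour of nonstop talking above came to 15 MB, so if you use a recording app that cuts files at one-hour intervals, you don't need it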
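
For files that do exceed the limit, the size check and the split might look something like the sketch below. This is not necessarily how my script does it: it assumes the ffmpeg segment muxer with stream copy, treats 25 MB as 25 * 1024 * 1024 bytes, and uses hypothetical function names and a hypothetical output pattern such as `chunk_%03d.m4a`.

```python
import os
import subprocess

# ASSUMPTION: the 25 MB upload limit is interpreted as 25 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def needs_split(path):
    """True if the file is larger than the upload limit."""
    return os.path.getsize(path) > MAX_UPLOAD_BYTES


def build_split_command(src, out_pattern, chunk_seconds=3600):
    """Build an ffmpeg command that cuts src into pieces of about chunk_seconds each.

    Uses the segment muxer with stream copy, so the audio is not re-encoded.
    out_pattern is a printf-style pattern, e.g. "chunk_%03d.m4a" (hypothetical name).
    """
    return [
        "ffmpeg", "-i", src,
        "-f", "segment",
        "-segment_time", str(chunk_seconds),
        "-c", "copy",
        out_pattern,
    ]


def split_if_needed(src, out_pattern):
    """Run ffmpeg only when the file is too large to upload as-is."""
    if needs_split(src):
        subprocess.run(build_split_command(src, out_pattern), check=True)
```

Going by the 15 MB per hour figure above, one-hour chunks stay well under the limit for this kind of recording; a higher-bitrate recording would need a smaller `chunk_seconds`.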

(C)NISHIO Hirokazu / Converted from [Scrapbox] at 2/26/2026, 5:09:53 PM
