NISHIO Hirokazu

J-Moshi

atsumoto_ohashi The Japanese real-time spoken dialogue model J-Moshi is now available! Based on @kyutai_labs' Moshi, it "speaks" and "listens" at the same time, like a human. It is the first such model available for Japanese. At 7B it is lightweight, so please give it a try! We will be presenting it at #NLP2025. https://nu-dialogue.github.io/j-moshi/

takahiroanno The backchannels (aizuchi), fillers, and interruptions are amazingly natural...! And this is done with a small 7B model!

If we run this on a local machine, we might be able to organize tasks by voice without relying on Advanced Voice.

  • I'm not sure when this changed, but Advanced Voice now disconnects after about 5 minutes of silence, so I can no longer leave it running while I work at my computer and speak to it whenever something comes to mind.

nu-dialogue/j-moshi: J-Moshi: A Japanese Full-duplex Spoken Dialogue System

  • J-Moshi is in the prototype stage, and its responses may be unnatural. In addition, since most of J-Moshi's training data is chat dialogues, it cannot generate responses according to the user's instructions.

    • Hmm, I wonder how much of a limitation this will be in practice.

https://x.com/akkikiki/status/1882913953749287288?s=46&t=gkSZtjGEtUZPO0JCzBxCBw A story about converting it to work on a Mac

2025-01-25 Installed it and got it running as fast as I could. At this point it is still Japanese-only and free of charge.

  • Looks like a kukancho.
  • It is not yet able to do useful tasks in conversation.
    • So that is what "cannot generate responses according to the user's instructions" means.

This page is auto-translated from /nishio/J-Moshi using DeepL. If you find something interesting but the auto-translated English is not good enough to understand it, feel free to let me know at @nishio_en. I'm very happy to share my thoughts with non-Japanese readers.


(C)NISHIO Hirokazu / Converted from Markdown (en)