World’s first open-source LLM for speech detection
Jingzhunxue has launched “FlowMirror-s(V02),” the world’s first open-source large language model (LLM) for speech detection and interaction catering to the education industry. The release is an effort to promote industry knowledge-sharing and the adoption of artificial intelligence-powered learning services.
An Alibaba-funded company, Jingzhunxue provides educational hardware and software products (AI-assisted learning devices). Its AI team uses AI technologies to create proactive learning experiences comparable to or surpassing human education, while striving to reduce technical costs to make these solutions affordable for everyone.
FlowMirror-s(V02) is a self-supervised Chinese speech codec system with its weights initialised from a text LLM, in a departure from the traditional Automatic Speech Recognition and Text-to-Speech pipeline. This allows the model to be trained end-to-end on speech and dialogue data, enabling low-latency speech interactions.
Jingzhunxue has also developed the world’s first “Hyper-Realistic AI One-on-One Tutor” using its FlowMirror-s(V02) model. With its “AI Native” design, this AI tutor mimics real teachers closely, providing personalised, systemised instruction beyond traditional AI tools.
FlowMirror stands out as one of China’s most advanced educational models by leveraging Alibaba’s LLM Tongyi Qianwen (Qwen). With its trillion-parameter framework and multi-GPU Bailian platform from AliCloud, FlowMirror offers a multitude of specialised features.
Incorporating over 2 billion proprietary tokens through a sophisticated data pipeline for educational assistance, it achieves multimodal interaction through Alibaba’s visual model, which enables dynamic problem-solving and tutoring support through voice and visual cues. In addition, with 160,000 hours of educational speech training, the model is able to detect over 40 different emotional and physical states.
Additionally, FlowMirror offers personalised teaching styles inspired by renowned educators and uses virtual teacher technology to produce exclusive, high-definition, real-time interactions from just one hour of video data. To master various educational materials, the knowledge graph utilises millions of data sets gathered from an extensive question bank and learning data pool.
FlowMirror-s v0.1 and v0.2 were trained on 20,000 and 50,000 hours of speech data, respectively, demonstrating the model’s scalability and end-to-end speech capabilities. Because it facilitates seamless speech input-to-output interactions, the model is better suited to educational scenarios with natural, teacher-like dialogues. Several practical applications of Jingzhunxue’s technology will be showcased in the near future.
Jingzhunxue’s AI Native e-learning tablets, such as the Bong series, are now available on e-commerce platforms like Tmall and JD.com.
Picture source: Jingzhunxue