In January 2025, the AI unicorn MiniMax made headlines by releasing its first open-source models, the MiniMax-01 series. The series is the first to use a linear attention architecture in a large model with over 400 billion parameters. The approach not only significantly enhances the model's “memory” but also cuts costs to roughly one-tenth of GPT-4o's, sparking heated discussion overseas, with some hailing it as “China’s AI revolution.”
Behind this innovation is Qin Zhen, an algorithm researcher at Xindong TapTap. Qin began publishing papers on linear attention years ago, and the new MiniMax model draws heavily on his findings. Although Qin does not work at an AI startup, his research provided vital theoretical support for MiniMax.
Qin Zhen’s Journey and Challenges
Qin Zhen discussed his reasons for joining TapTap and his research journey in the field of linear attention. Although linear attention was not widely recognized in its early stages, he said, he persisted in this direction because he believed its value would eventually be recognized.
Despite the challenges and the lack of early interest, he kept faith in the potential of this research, and he sees MiniMax's adoption of it as a testament to its significance.
TapTap’s Approach to Innovation
Lai Hongchang, the head of TapTap’s AI team, noted that their team has always been attentive to cutting-edge technologies and encourages team members to explore according to their interests. He emphasized that significant breakthroughs often require patience, a more inclusive environment, and a long-term perspective.
Reaction to MiniMax’s Model
Qin expressed his delight in seeing his research applied to such a large-scale model, stating that it is a positive sign for the industry.
Future Plans
Lai Hongchang said they will continue to focus on multimodal large models and try to apply them in specific business operations. Qin Zhen, on the other hand, will continue his exploration in the direction of linear models.
In summary, this is a story of perseverance, patience, and long-termism, one that reflects the spirit of exploration in the AI field and the human commitment behind its scientific achievements.