The new family of models, which Mistral is calling “Les Ministraux,” can be used or tuned for a variety of applications, from basic text generation to working in conjunction with more capable models to complete tasks.
Two Les Ministraux models are available, Ministral 3B and Ministral 8B, both of which have a context window of 128,000 tokens, meaning they can ingest roughly 100,000 words, or about the length of a 300-page book.
“Our most innovative customers and partners have increasingly been asking for local, privacy-first inference for critical applications such as on-device translation, internet-less smart assistants, local analytics, and autonomous robotics,” Mistral writes in a blog post. “Les Ministraux were built to provide a compute-efficient and low-latency solution for these scenarios.”
Ministral 8B is available for download as of today, albeit strictly for research purposes. Mistral is requiring devs and companies interested in self-deploying Ministral 8B or Ministral 3B to contact it for a commercial license.
Otherwise, devs can use Ministral 3B and Ministral 8B through Mistral’s cloud platform, La Plateforme, and, in the coming weeks, through other clouds with which the startup has partnered. Ministral 8B costs 10 cents per million tokens (input or output), while Ministral 3B costs 4 cents per million tokens; a million tokens works out to roughly 750,000 words.
There’s been a trend lately toward small models, which are cheaper and quicker to train, fine-tune, and run than their larger counterparts. Google continues to add models to its Gemma small model family, while Microsoft offers its Phi collection of models. In the most recent refresh of its Llama suite, Meta introduced several small models optimized for edge hardware.
Mistral claims that Ministral 3B and Ministral 8B outperform comparable Llama and Gemma models — as well as its own Mistral 7B — on several AI benchmarks designed to evaluate instruction-following and problem-solving capabilities.
Paris-based Mistral, which recently raised $640 million in venture capital, continues to gradually expand its AI product portfolio. Over the past few months, the company has launched a free service for developers to test its models, an SDK to let customers fine-tune those models, and new models, including a generative model for code called Codestral.
Co-founded by alumni from Meta and Google’s DeepMind, Mistral’s stated mission is to create flagship models that rival the best-performing models today, like OpenAI’s GPT-4o and Anthropic’s Claude — and ideally make money in the process. While the “making money” bit is proving to be challenging (as it is for most generative AI startups), Mistral reportedly began to generate revenue this summer.