ADASPEC: Making large language models faster and more efficient across multiple languages

Ishikawa, Japan -- Large language models (LLMs), which are the artificial intelligence (AI) systems behind modern chatbots, translation tools, and virtual assistants, have become revolutionary tools worldwide. Companies, governments, schools, and developers now rely on them to serve users across dozens of languages. Unfortunately, as these systems grow more capable and incorporate support for more and more languages, they also become more computationally demanding. Generating responses from large multilingual models not only costs more but also takes significantly more time.

One of the leading approaches for addressing this issue is called speculative decoding. This technique can speed up LLM output by using a small internal “drafter” model to predict several words ahead at once, which the main model then checks in parallel. While powerful, most existing speculative decoding methods were built and optimized for English; high-quality training data for drafters is widely available in English but scarce or absent for many other languages. As a result, these speed-boosting techniques lose much of their effectiveness when dealing with non-English languages.
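The draft-then-verify loop described above can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the method used by ADASPEC or EAGLE-2: both "models" are hypothetical functions that map a token sequence to the next token, decoding is greedy, and the verification step is written as a plain loop, whereas a real system would score all drafted positions in a single parallel pass of the main model.

```python
from typing import Callable, List

# Hypothetical interface: a "model" maps a token sequence to its next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(target: NextToken, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Baseline: the main model generates one token per step."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(target(tokens))
    return tokens


def speculative_decode(
    target: NextToken,
    drafter: NextToken,
    prompt: List[int],
    max_new_tokens: int,
    draft_len: int = 4,
) -> List[int]:
    """Greedy speculative decoding: the drafter proposes, the target checks."""
    tokens = list(prompt)
    goal = len(prompt) + max_new_tokens
    while len(tokens) < goal:
        # 1. The small drafter guesses several tokens ahead.
        draft: List[int] = []
        for _ in range(draft_len):
            draft.append(drafter(tokens + draft))

        # 2. The main model checks each drafted position. (Written as a loop
        #    here; a real system would batch these checks into one pass.)
        accepted: List[int] = []
        for i in range(draft_len):
            expected = target(tokens + draft[:i])
            accepted.append(expected)
            if draft[i] != expected:
                break  # first mismatch: keep the correction, drop the rest
        else:
            # Every drafted token matched, so the target adds one more token.
            accepted.append(target(tokens + draft))

        tokens.extend(accepted)
    return tokens[:goal]
```

Because every kept token is one the main model would have chosen itself, the output of this sketch matches plain greedy decoding. The potential saving comes from how many drafted tokens are accepted per verification round, which is why a drafter that guesses poorly in a given language yields little or no speedup.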

To tackle this problem, a research team (Do Dinh Truong and Le Nguyen Khang) led by Professor Le-Minh Nguyen from the Japan Advanced Institute of Science and Technology, Japan, developed ADASPEC, a multilingual speculative decoding framework designed to work across languages from the ground up. Their paper, which was published in the Proceedings of the AAAI Conference on Artificial Intelligence (The Fortieth AAAI Conference on Artificial Intelligence, AAAI-26) on March 14, 2026, introduces not only this new framework but also a new benchmark for evaluating multilingual inference speed in LLMs.

The core challenge the team faced was twofold. First, training effective drafter models requires language-specific instruction data, which is limited or unavailable for many languages. Second, the vocabulary set a drafter uses to predict tokens needs to reflect the language being generated, not a one-size-fits-all list.

ADASPEC addresses both problems simultaneously. Rather than relying on existing datasets, it uses the target LLM itself to automatically generate instruction data in any desired language, including low-resource ones. Moreover, it analyzes word frequency across language-specific text sources to build compact and language-tailored vocabulary sets. “During inference, the system dynamically selects the optimal language, drafter model, and vocabulary size based on the recently generated context. By reducing unnecessary vocabulary computations, ADASPEC achieves faster and more stable multilingual inference,” explains Prof. Nguyen. In other words, because the framework can adaptively identify the most suitable language-specific configuration from the generated context and switch drafters and vocabularies on the fly, it is well suited for real-world situations where users may write in any language.
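As a rough illustration of the two ideas in this paragraph, the Python sketch below builds a compact vocabulary from word frequencies and then picks a language from recently generated text. It is not the ADASPEC implementation: the function names `build_vocab` and `pick_language` are hypothetical, whitespace splitting stands in for the subword tokenization a real LLM would use, and the overlap-count score is an assumed stand-in for whatever selection rule the paper actually applies.

```python
from collections import Counter
from typing import Dict, List, Set


def build_vocab(corpus: List[str], size: int) -> Set[str]:
    """Keep only the `size` most frequent words of one language's corpus.

    Assumption: words are split on whitespace for simplicity.
    """
    counts = Counter(word for line in corpus for word in line.split())
    return {word for word, _ in counts.most_common(size)}


def pick_language(recent_words: List[str], vocabs: Dict[str, Set[str]]) -> str:
    """Choose the language whose compact vocabulary covers the most recent words.

    Assumption: a simple overlap count; ties go to the first language listed.
    """
    scores = {
        lang: sum(word in vocab for word in recent_words)
        for lang, vocab in vocabs.items()
    }
    return max(scores, key=lambda lang: scores[lang])


# Tiny made-up corpora, used only to exercise the functions.
corpora = {
    "en": ["the cat sat on the mat", "the dog sat"],
    "de": ["die katze sitzt auf der matte", "der hund sitzt"],
}
vocabs = {lang: build_vocab(lines, size=3) for lang, lines in corpora.items()}
```

In a setup like this, a drafter restricted to a small per-language vocabulary has fewer candidate outputs to score at each step, which is the kind of "unnecessary vocabulary computation" the quote refers to. The selected language could then be used to look up the matching drafter and vocabulary size.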

To validate their approach, the researchers introduced Multi-SpecBench, a novel multilingual benchmark for evaluating speculative decoding that supports more rigorous comparisons. Using this, they tested ADASPEC across seven languages, namely English, German, French, Spanish, Chinese, Japanese, and Vietnamese, and seven task types, including question answering, summarization, code generation, translation, and math reasoning. Notably, the proposed framework consistently outperformed other state-of-the-art techniques, achieving up to a 2.3× speedup over EAGLE-2, one of today’s strongest speculative decoding methods. The team found that some existing speculative decoding methods actually slowed down inference in non-English settings compared with using no acceleration at all, revealing how poorly adapted they are for multilingual use.

The researchers believe ADASPEC holds great potential in any setting where fast LLM responses in multiple languages matter, such as multilingual customer support systems, AI tutors, translation and summarization tools, and real-time conversational agents. Looking further ahead, this kind of research could help reduce the energy and infrastructure costs of running multilingual AI services, and meaningfully narrow the gap between the quality of AI assistance available in English and in other languages. “We expect the proposed technology to reduce response times and computational costs for multilingual AI services in general,” concludes Prof. Nguyen. On top of this, for smaller organizations or communities working with lower-resource languages, a system that can generate its own training data and adapt without manual intervention represents a meaningful step toward more accessible and equitable AI.

###

Reference

Title of original paper:

ADASPEC: Adaptive Multilingual Speculative Decoding with Self-Synthesized Language-Aware Training and Vocabulary Simplification

Authors:

Dinh-Truong Do, Nguyen-Khang Le, and Le-Minh Nguyen

Journal:

Proceedings of the AAAI Conference on Artificial Intelligence; The Fortieth AAAI Conference on Artificial Intelligence (AAAI-26)

DOI:

10.1609/aaai.v40i36.40307

About Japan Advanced Institute of Science and Technology, Japan

Founded in 1990 in Ishikawa prefecture, the Japan Advanced Institute of Science and Technology (JAIST) was the first independent national graduate university in Japan to have its own campus. Now, after 30 years of steady progress, JAIST has become one of Japan’s top-ranking universities. JAIST strives to foster capable leaders with a state-of-the-art education system where diversity is key; about 40% of its alumni are international students. The university has a unique style of graduate education based on a carefully designed coursework-oriented curriculum to ensure that its students have a solid foundation on which to carry out cutting-edge research. JAIST also works closely with both local and overseas communities by promoting industry–academia collaborative research.

About Professor Le-Minh Nguyen from Japan Advanced Institute of Science and Technology, Japan

Dr. Le-Minh Nguyen obtained a Master’s degree from Vietnam National University in 2002 and a Ph.D. from Japan Advanced Institute of Science and Technology (JAIST) in 2004, where he currently serves as a full Professor. He is also the Director of the Research Centre for Interpretable AI at JAIST. His research focuses on artificial intelligence and machine learning, particularly in the domains of natural language understanding, text summarization, deep learning, and knowledge representation. He received the JSAI Best Paper Award in 2025 and the SAC Highlight Award at ACL 2025.
