News

Updated 8 Jan 2026

Spotlight on Innovation: Prof. Yang Hongxia's Work on Democratising AI Featured by Croucher News

In a feature interview with Croucher News, Prof. Yang Hongxia, Executive Director of the PolyU Academy for Artificial Intelligence, Associate Dean (Global Engagement) of the Faculty of Computer and Mathematical Sciences, and Professor at the Department of Computing, discussed her Co-GenAI project. The initiative is designed to democratise generative AI by significantly reducing GenAI development barriers and enabling broader participation in the AI era.

Challenging the Centralised AI Paradigm

Professor Yang describes the current race to build increasingly large AI models as a "rich people's game," exclusively pursued by a handful of well-funded companies. Her research at The Hong Kong Polytechnic University challenges this paradigm. She advocates for a future where multiple stakeholders collaborate to develop high-quality AI, comparing today's centralised labs to the era of mainframe computers before the rise of the personal device.

Key Technological Advances: Co-GenAI and Model Fusion

Her proposed solution, Collaborative Generative AI (Co-GenAI), introduces practical innovations that significantly reduce development barriers:

Advanced Model Fusion: Her team successfully fused four top-tier reasoning models using only around 160 GPU hours, a fraction of the 1-2 million GPU hours typically required to train a similar model from scratch. The resulting model achieved state-of-the-art performance, with average success rates in the mid-80% range across 11 challenging reasoning domains.

Theoretical Breakthrough: Professor Yang's team is the first to theoretically derive the "Model Merging Scaling Law." This pivotal finding suggests that decentralised, collaborative approaches are not just practical but are also a feasible pathway toward more advanced AI systems, offering a viable alternative to pure centralisation.

Real-World Impact: Empowering Healthcare and Beyond

The research has immediate applications in specialised fields such as medicine. Co-GenAI enables hospitals to train AI models on private, high-quality data without ever sharing raw data externally. Multiple local models can then be fused to create a stronger, more knowledgeable foundation model.

Enhanced Privacy, Accuracy & Efficiency: This method ensures patient data remains completely private while reducing inaccuracies common in general-purpose models. Running models locally provides millisecond-speed responses, a critical improvement over cloud-based systems for time-sensitive decisions in clinical settings.

Professor Yang emphasizes that the goal is to support, not to replace, human experts: "The final decision-maker is still the doctor."

A Call for Collaborative and Responsible AI Development

Looking ahead, Professor Yang envisions building comprehensive "science foundation models" by integrating contributions from leading domain experts worldwide. She remains optimistic about AI's potential to revolutionize industries while advocating for smart regulation that promotes responsible use without stifling technology.

Read the full feature article on the Croucher Foundation website: Making powerful generative AI cheaper and more collaborative via https://croucher.org.hk/en/news/making-powerful-generative-ai-cheaper-and-more-collaborative

8 Jan, 2026

PAAI Media Coverage

Updated 10 Dec 2025

PolyU establishes Academy for Artificial Intelligence to develop a world-class AI innovation hub

The Hong Kong Polytechnic University (PolyU) today held the PolyU Academy for Artificial Intelligence (PAAI) Inauguration, demonstrating its support for the Nation’s “Artificial Intelligence (AI) Plus” initiative under the 15th Five-Year Plan to support high-quality development across industries. Leveraging PolyU’s cross-disciplinary strengths in computer science, mathematics and data science, the PAAI strives to foster international collaboration and help position Hong Kong and the Greater Bay Area as a globally influential AI innovation hub. On the same day, PolyU hosted a series of forums, bringing together experts from around the world to advance cutting-edge AI technologies and their innovative applications in healthcare. The Inauguration took place at the PolyU Chiang Chen Studio Theatre, officiated by Prof. SUN Dong, Secretary for Innovation, Technology and Industry of the HKSAR Government of the People’s Republic of China, and Prof. Jin-Guang TENG, President of PolyU. Prof. Sun Dong remarked, “The country’s Recommendations for Formulating the 15th Five-Year Plan reaffirm Hong Kong’s strategic position as the international innovation and technology centre. Our vision to become a global hub for AI development was underscored in the 2025 Policy Address delivered by our Chief Executive, with promotion of AI being top of our agenda, taken forward through multi-pronged measures on key enablers, including talent, data and industry applications.” “The inauguration of the PAAI marks not just a milestone, but a new chapter in our city’s united efforts to expedite the AI development. This Academy will inspire ideas, foster collaboration and fuel Hong Kong's AI ecosystem,” he added. Prof.
Jin-Guang Teng elaborated: “Leveraging PolyU’s strong foundation in AI, computer science, mathematics, data science and other globally recognised disciplines, the PAAI will foster interdisciplinary and international collaboration to drive AI development, positioning Hong Kong and the Greater Bay Area as a leading AI innovation hub. It will deliver sustainable, efficient and impactful solutions for key sectors – ranging from healthcare and finance to education and beyond. It will also cultivate a talent ecosystem that can drive future innovation by leveraging Hong Kong’s international research environment and its government-industry-academia-research network.” Prof. Qiang YANG, PAAI Director, and Prof. Hongxia YANG, PAAI Executive Director, delivered keynotes on “The AI Revolution: Challenges and Opportunities” and “Co-Generative AI (Co‑GenAI)” respectively, elaborating on the University’s key tasks in advancing AI and how related projects are being translated into real-world applications that benefit diverse industries. Addressing future AI challenges, Prof. Qiang Yang noted that the PAAI will continue to advance key technologies including Co‑GenAI, Federated Learning and Edge Foundation Models, while setting out robust technological roadmaps in the priority fields of healthcare, education, finance and robotics. He highlighted Hong Kong’s dual positioning as an international financial centre as well as an international innovation and technology hub. Together with the extensive clinical networks and strong industry demand in the Guangdong-Hong Kong-Macao Greater Bay Area, the PAAI will seek to expand decentralised AI infrastructure, enabling more institutions to use advanced AI technologies under safe and controllable conditions. Prof. Hongxia Yang added that traditional AI training faces hurdles such as high thresholds for computing capacity and data privacy protection. 
By aggregating the strengths of hundreds of industry-specific models, Co‑GenAI can reduce reliance on centralised computing resources and build high-quality foundation models that better reflect real-world application scenarios. The PAAI is working with various medical institutions to implement the “Cancer GenAI” project, while also exploring the potential of AI in infectious disease prevention and control, robotic systems and finance.

The “International Forum on AI 2025”, moderated by Prof. Qiang Yang, and the “Intelligent Oncology Forum”, moderated by Prof. Jing CAI, Head of the PolyU Department of Health Technology and Informatics, convened leading experts from academia and clinical medicine. Participants engaged in in-depth discussions on the deep integration of AI and healthcare, innovative applications and cross-disciplinary technological breakthroughs, contributing insights to further propel AI technologies.

The PAAI will contribute to building Hong Kong into a global testing ground that drives AI innovation in healthcare and smart city development, fostering world-class technologies and talent. It will also strengthen collaboration with industry, medical institutions, schools and government departments to apply AI solutions in public health and education systems. In ShanghaiRanking’s Global Ranking of Academic Subjects announced last month, PolyU ranked first in Hong Kong and 16th worldwide in the newly introduced “Artificial Intelligence” subject area, underscoring the University’s forward-looking strategy and achievements in facilitating AI in education.

Officiating the PAAI Inauguration, Prof. Sun Dong, Secretary for Innovation, Technology and Industry of the HKSAR Government of the People's Republic of China, said that the PAAI marked not just a milestone, but a new chapter in the city’s united efforts to expedite the AI development.

PolyU President Prof. Jin-Guang Teng said that the PAAI would cultivate a talent ecosystem to drive future innovation by leveraging Hong Kong’s international research environment and its government-industry-academia-research network.

During the media interview session, PolyU Senior Vice President (Research and Innovation) Prof. Christopher Chao (2nd from left); PAAI Director Prof. Qiang Yang (2nd from right); PAAI Executive Director Prof. Hongxia YANG (1st from left); and Prof. Jing Cai, Head of the Department of Health Technology and Informatics (1st from right), outlined the PAAI’s strategy and the development and applications of AI across industries.

10 Dec, 2025

PAAI Publicities

Updated 23 Oct 2025

New 23 Oct 2025 Co-GenAI

The Hong Kong Polytechnic University (PolyU) Academy for Artificial Intelligence (PAAI) has announced achieving several milestones in Generative AI (GenAI) research. The PAAI team is pushing the boundaries of AI with a novel collaborative GenAI paradigm known as Co-GenAI, which has the potential to transform frontier model training from a centralised, monolithic approach into a decentralised one. Significantly lowering training resource requirements, protecting data privacy and removing resource barriers such as graphics processing unit (GPU) monopolies paves the way for a more inclusive and accessible environment for global institutions to participate in AI research. Advances in GenAI research are presently constrained by three major barriers: training foundation models being so computationally prohibitive that only a few organisations can afford it, effectively excluding global academia from frontier model development; domain knowledge and data remaining siloed due to privacy and copyright concerns, particularly for sensitive information in healthcare and finance; and foundation models being static and unable to evolve with emerging knowledge, while retraining each frontier model ab initio consumes an enormous amount of resources and makes rapid iteration impossible. To tackle these challenges, the PAAI team has developed a novel model training framework that enables ultra-low-resource training and decentralised model fusion. The framework is theoretically grounded and has been validated through extensive real-world applications. PolyU is the first academic institution to open-source an end-to-end FP8 low-bit training solution that covers both continual pre-training (CPT) and post-training stages. 
This approach will set a new standard for training models with FP8 ultra-low resources while maintaining BF16-level accuracy, in turn revolutionising the practice of model training and positioning PolyU among the few institutions worldwide to master this advanced training technique. Compared with BF16, FP8 delivers over 20% faster training, reduces peak memory by over 10% and dramatically lowers training overheads while maintaining performance. The pipeline integrates CPT, supervised fine-tuning (SFT) and reinforcement learning (RL) to achieve BF16 quality while shortening training time and reducing memory footprint. The team has begun exploring even lower-cost FP4 precision training, with initial results reported in academic publications [1]. In medical applications, the models trained by these pipelines outperform all peer models on diagnosis and reasoning across all key areas [2]. In research agent applications, the models also demonstrate exceptional performance in complex task handling, generalisation and report quality [3]. Until now, foundation model training has followed scaling laws: more parameters yield broader knowledge and stronger performance. However, centralised training typically requires millions of GPU hours, a resource available to only a few organisations. PolyU's InfiFusion approach achieves a key milestone in model fusion research: it uses only hundreds of GPU hours to fuse large models that would otherwise require 1–2 million GPU hours to train from scratch. The team has merged four state-of-the-art models in 160 GPU hours [4, 5], avoiding million-scale training budgets while delivering fused models that significantly outperform the originals across multiple key benchmarks. The team has published the first theoretical validation of model fusion, a concept championed by Thinking Machines Lab.
Through rigorous mathematical derivation, they proposed the “Model Merging Scaling Law,” suggesting there is another viable pathway to artificial general intelligence (AGI) [6]. Prof. YANG Hongxia, Executive Director of PolyU PAAI, Associate Dean (Global Engagement) of the Faculty of Computer and Mathematical Sciences, and Professor of the Department of Computing, stated, “Ultra-low-resource foundation model training, combined with efficient model fusion, enables academic researchers worldwide to advance GenAI research through collaborative innovation.” The team has also demonstrated the potential of its training pipelines through applications across specific domains, including state-of-the-art medical foundation and cancer AI models that achieve best-in-class performance. With the integration of high-quality domain-specific data, these models can adapt to medical devices for different scenarios, including personalised treatment and AI-based radiotherapy for oncology. In this context, the team is now collaborating with Huashan Hospital affiliated to Fudan University, Sun Yat-sen University Cancer Center, Shandong Cancer Hospital and Queen Elizabeth Hospital in Hong Kong. PAAI has also introduced a leading agentic AI application in deep search and academic paper assistance: a graduate-level academic paper writer with agentic capability that supports a multimodal patent-search engine for end-to-end research and manuscript drafting. Prof. Christopher CHAO, Senior Vice President (Research and Innovation) of PolyU, stated, “AI is a key driver in accelerating the development of new quality productive forces. The newly established PAAI is dedicated to expediting AI integration across key sectors and developing domain-specific models for diverse industries. These initiatives will not only solidify the leading position of PolyU in related fields, but also help position Hong Kong as a global hub for GenAI.” The research project led by Prof. Yang Hongxia is supported and funded by the Theme-based Research Scheme 2025/26 under the Research Grants Council, the Research, Academic and Industry Sectors One-plus Scheme under the Innovation and Technology Commission of the HKSAR Government, and the Artificial Intelligence Subsidy Scheme under Cyberport. It marks a significant step forward for Hong Kong in global AI innovation and in accelerating the democratisation and industrial implementation of AI technology.

[1] InfiR2: A Comprehensive FP8 Training Recipe for Reasoning-Enhanced Language Models, https://arxiv.org/html/2509.22536v3
[2] InfiMed: Low-Resource Medical MLLMs with Advancing Understanding and Reasoning, https://arxiv.org/html/2505.23867
[3] InfiAgent: Self-Evolving Pyramid Agent Framework for Infinite Scenarios, https://arxiv.org/html/2509.22502
[4] InfiGFusion: Graph-on-Logits Distillation via Efficient Gromov-Wasserstein for Model Fusion, https://arxiv.org/html/2505.13893
[5] InfiFPO: Implicit Model Fusion via Preference Optimization in Large Language Models, https://arxiv.org/abs/2505.13878
[6] Model Merging Scaling Laws in Large Language Models, https://arxiv.org/html/2509.24244
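For readers unfamiliar with model merging, the general idea of combining several trained models into one can be illustrated with the simplest possible variant: a weighted average of parameters across models that share the same architecture. This is only a toy sketch. It is not the InfiFusion method, whose cited papers describe distillation-based and preference-optimisation-based fusion, and the function name, parameter names and values below are hypothetical.

```python
def merge_parameters(models, coefficients):
    """Weighted average of models that share parameter names and shapes.

    Each model is a dict mapping a parameter name to a flat list of floats.
    This is plain parameter averaging, used here only as an illustration.
    """
    if len(models) != len(coefficients):
        raise ValueError("need exactly one coefficient per model")
    total = sum(coefficients)
    if total <= 0:
        raise ValueError("coefficients must sum to a positive value")
    names = models[0].keys()
    for model in models[1:]:
        if model.keys() != names:
            raise ValueError("all models must have the same parameter names")

    merged = {}
    for name in names:
        length = len(models[0][name])
        # Average each parameter position across models, normalised by the weight total.
        merged[name] = [
            sum(c * m[name][i] for m, c in zip(models, coefficients)) / total
            for i in range(length)
        ]
    return merged


# Two hypothetical domain experts with identical (tiny) architectures.
expert_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
expert_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}
merged = merge_parameters([expert_a, expert_b], [1.0, 1.0])
print(merged)
```

Averaging needs no gradient computation at all, which gives some intuition for why merging can cost far less compute than training from scratch; practical fusion of heterogeneous models requires considerably more than this.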

23 Oct, 2025

PAAI Research Results

Updated 14 Jul 2025

Prof. Yang Qiang gave an in-depth presentation on federated learning at the ICML'25 Vision and Learning Workshop

The Vancouver Vision & Learning Workshop @ ICML 2025, jointly organized by Simon Fraser University (SFU), the University of British Columbia (UBC), and the Vector Institute, was successfully held on July 14, 2025. As a key affiliated event of the prestigious International Conference on Machine Learning (ICML 2025), the workshop brought together leading scholars and industry experts to explore cutting-edge advancements in computer vision and machine learning. Among the highlights of the event was a keynote presentation by Professor Qiang Yang, Director of the Hong Kong Polytechnic University’s PolyU Academy for Artificial Intelligence (PAAI), titled “Federated Learning meets Large Language Models.” Professor Yang's talk attracted significant attention for its in-depth exploration of federated learning—an emerging paradigm in distributed AI that enables collaborative model training across multiple devices or institutions without sharing raw data. This approach plays a pivotal role in building privacy-preserving and efficient cross-domain AI systems. Professor Yang further discussed several promising directions and applications of federated learning, including: Federated Foundation Models that integrate pre-trained large language models with domain-specific models; Agentic Federated Learning, which leverages large language models to develop intelligent edge agents; Industry Collaborations, especially in the financial sector, involving both domestic and international institutions to promote real-world applications; Scientific Research, enhancing cross-institutional collaboration through privacy-preserving AI techniques; And the development of open-source tools to support technology implementation and ecosystem growth. 
In addition to Professor Yang, the workshop featured insightful presentations by leading researchers including Kelsey Allen (UBC, Vector Institute), Jamie Shotton (Wayve), Masashi Sugiyama (RIKEN AIP, University of Tokyo), Alane Suhr (UC Berkeley), and Arash Vahdat (NVIDIA). Their talks covered a wide array of topics spanning vision, learning, and language, highlighting groundbreaking intersections between artificial intelligence, the physical world, and cognitive science. The workshop not only served as a high-level platform for academic exchange but also underscored the importance of cross-institutional collaboration in advancing the frontiers of AI research. These explorations continue to inject new momentum into the field and demonstrate the vast potential of interdisciplinary integration.
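As a rough illustration of the federated learning idea described above (collaborative training without sharing raw data), the sketch below implements federated averaging on a toy linear-regression task. It is a minimal sketch under stated assumptions, not material from the talk: the three synthetic clients, the learning rate and the round counts are all hypothetical. Only model weights travel between clients and server; each client's data stays inside `local_update`.

```python
import random


def local_update(weights, data, lr=0.05, epochs=20):
    """Gradient descent for y ~ w*x + b using one client's private data only."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Server step: average client weights, weighted by local sample counts."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def make_client_data(n, seed):
    """Synthetic private dataset drawn from y = 3x + 1 plus small noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        data.append((x, 3.0 * x + 1.0 + rng.gauss(0, 0.05)))
    return data


def run_federated_training(rounds=30):
    # Three hypothetical institutions with different amounts of data.
    clients = [make_client_data(40, 1), make_client_data(80, 2), make_client_data(20, 3)]
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        # Each client trains locally; only the resulting weights are sent back.
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights


if __name__ == "__main__":
    w, b = run_federated_training()
    print(f"learned w={w:.2f}, b={b:.2f}")
```

The learned weights should land close to the true slope 3 and intercept 1 even though the server never sees a single data point. Real deployments add secure aggregation, communication handling and support for non-identical client data, none of which is shown here.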

14 Jul, 2025

PAAI Scholarly Engagement

Updated 13 Jul 2025

Prof. YANG Qiang co-authored the paper “Federated Machine Learning: Concept and Applications”, which received the Frontiers of Science Award at ICBS 2025.

The 2025 International Congress of Basic Science (ICBS) officially opened in Beijing on July 13 and will run through July 25. Since its inception in 2023 under the leadership of Academician Shing-Tung Yau, ICBS has become a premier international platform in the field of basic science. The Congress focuses on three major areas (mathematics, physics, and information science and engineering) and gathers global elites to drive disciplinary breakthroughs. This year’s event features an esteemed lineup, including Nobel, Turing, and Fields Medal laureates such as Samuel C. C. Ting and Steven Chu. Two major awards, the Lifetime Achievement Award in Basic Science and the Frontiers of Science Award, were presented during the Congress.

One of the highlights of this year’s event was the recognition of the paper "Federated Machine Learning: Concept and Applications," co-authored by Professor Qiang Yang (Chair Professor and Director of the PolyU Academy for Artificial Intelligence, The Hong Kong Polytechnic University), Associate Professor Yang Liu (Department of Computing and Department of Data Science and Artificial Intelligence), Dr. Tianjian Chen, and Professor Yongxin Tong. The paper was honored with the 2025 Frontiers of Science Award.

Professor Yang Liu accepted the award on behalf of all the authors and delivered a speech on behalf of the awardees in the field of information science and engineering. The Congress recognized the paper for “addressing the framework of federated machine learning for privacy preservation, introducing a decentralized training paradigm without data sharing, and pioneering horizontal and vertical federated learning methods. This work has tackled critical issues such as data heterogeneity and security, with wide applications in healthcare, finance, and the Internet of Things, paving new paths for integrating AI with the real economy.”

This prestigious award highlights the strong capabilities of the research team and reflects the deep academic foundation of the PolyU Academy for Artificial Intelligence (PAAI) and its Research Institute for Federated Learning (RIFL).

Federated learning is an advanced distributed AI technology that enables multiple devices to collaboratively train models without sharing raw data. By protecting data privacy, it facilitates secure and efficient cross-domain intelligent collaboration. It has become a core paradigm of privacy-preserving artificial intelligence.

Established in 2025, the PolyU Academy for Artificial Intelligence (PAAI) is co-led by Professor Qiang Yang and Professor Hongxia Yang. It focuses on developing industry-specific large language models and exploring decentralized generative AI, contributing to Hong Kong's leadership in innovation.

Under PAAI, the Research Institute for Federated Learning (RIFL) is one of the world’s first research institutes dedicated to federated learning. Co-directed by Professor Yang Liu and Professor Qiang Yang, RIFL aims to advance fundamental research and real-world applications in this cutting-edge area.

Moving forward, RIFL will continue to work closely with PAAI under the leadership of Professor Yang Liu and Professor Qiang Yang, striving to advance the global development of federated learning to new heights.

13 Jul, 2025

PAAI Awards & Recognitions

Updated 10 Jul 2025 - 2

Prof. Yang Hongxia secures RGC Theme-based Research Scheme funding to develop cost-effective and sustainable Co-GenAI model

Prof. YANG Hongxia, Executive Director of the PolyU Academy for Artificial Intelligence, Associate Dean (Global Engagement) of the Faculty of Computer and Mathematical Sciences, and Professor of the Department of Computing, has received funding from the Theme-based Research Scheme 2025/26 under the Research Grants Council for her pioneering project, “Collaborative Generative AI (Co-GenAI)”.

Professor Wing-tak Wong, Deputy President and Provost of PolyU, remarked, “We are delighted that our scholars have received this significant recognition and support. This visionary project highlights PolyU’s strategic commitment to advancing cutting-edge AI research, with a strong emphasis on inclusivity and sustainability. The establishment of the PolyU Academy for Artificial Intelligence will further enhance interdisciplinary collaboration and open up new frontiers in AI applications.”

10 Jul, 2025

PAAI Funding & Donations

Updated 10 Jul 2025 - 1

PolyU secures RGC Theme-based Research Scheme funding to develop cost-effective and sustainable Co-GenAI model

The Hong Kong Polytechnic University (PolyU) is committed to driving cutting-edge research that creates societal impact and technological advancement. Prof. YANG Hongxia, Executive Director of the PolyU Academy for Artificial Intelligence, Associate Dean (Global Engagement) of the Faculty of Computer and Mathematical Sciences, and Professor of the Department of Computing, has received funding from the Theme-based Research Scheme 2025/26 under the Research Grants Council for her pioneering project, “Collaborative Generative AI (Co-GenAI)”. The project has been awarded total funding of HK$62.6 million, with HK$41.79 million provided by the RGC and the remaining amount matched by PolyU and other participating universities. This initiative aims to reshape the GenAI landscape through a decentralised approach. The research holds significant potential to strengthen Hong Kong’s position as a global leader in GenAI development, with real-world applications in healthcare and technology. Prof. Christopher CHAO, PolyU Vice President (Research and Innovation), said, “We are delighted that our scholar has received this significant support. This pioneering project exemplifies the University’s commitment to advancing cutting-edge AI research, alongside our emphasis on inclusive and sustainable technological development. PolyU will continue to leverage its world-class research capabilities to make a profound impact on the future development of Hong Kong and the global community. With the launch of the PolyU Academy for Artificial Intelligence, we are poised to foster interdisciplinary collaboration and unlock new frontiers in AI applications.” The project, led by Prof. Yang Hongxia, aims to develop a novel collaborative GenAI paradigm known as Co-GenAI.
The system evolves through the integration of several hundred domain-specific models to create a foundation model designed to achieve artificial general intelligence (AGI) with significantly reduced centralised computational demand. By addressing the current constraints imposed by graphics processing unit (GPU) monopolies, this innovative approach is set to democratise AI development and enable broader participation in GenAI research and deployment. Co-GenAI is tailored to enhance different domains and collaborations, with the long-term goal of creating a versatile platform for the next generation of GenAI ecosystem. The project’s key tasks include the development of domain-adaptive continual pre-training infrastructure and the design of a robust, generalisable model ranking methodology. In addition, an advanced model fusion approach will be implemented to merge heterogeneous top-ranked domain-specific models. Prof. Yang expressed her gratitude for the RGC’s support and said, “Backed by a team of world-renowned researchers with extensive expertise, we believe Co-GenAI will play a transformative role in advancing the democratisation of AI, expanding its accessibility across disciplines and enhancing cost-effectiveness. We are confident that this novel paradigm will spark greater innovation and diversity in the field, ultimately paving the way for the development of a global foundation model that is both sustainable and inclusive.” To evaluate Co-GenAI, the research team will implement and deploy the system across a wide range of applications in collaboration with industry partners including Cyberport, Hong Kong Science and Technology Park, Alibaba, and leading hospitals such as Huashan Hospital affiliated to Fudan University, Shandong Cancer Hospital, and Sun Yat-sen University Cancer Center. 
The RGC’s Theme-based Research Scheme aims to pool the academic research efforts of UGC-funded universities to conduct research on topics of strategic importance to Hong Kong’s long-term development. Evaluation criteria include qualification as world-leading by international standards and the potential impact on Hong Kong.

10 Jul, 2025

PAAI Funding & Donations

Updated 9 Jul 2025

PAAI co-hosted the 19th International Congress on Logistics and SCM Systems (ICLS 2025) on AI-empowered logistics and supply chain.

The conference featured a rich program, including panel discussions, workshops, and special sessions, with contributions from academia, industry, and management agencies. Attendees engaged in dynamic exchanges of ideas, fostering interdisciplinary collaboration to enhance the efficiency, resilience, and sustainability of global supply chains.

With its high-level discussions and networking opportunities, the conference successfully attracted worldwide attention from scholars and industry talents, reinforcing Hong Kong’s role as a hub for innovation and international exchange. By bridging academia and industry, the event facilitated meaningful dialogue on AI-driven solutions for resilient and sustainable supply chains.

Prof. Ming Li, Assistant Director of PAAI, remarked, "This conference exemplified the power of AI in revolutionizing logistics and SCM. Through our partnership, we have strengthened the synergy between research and real-world applications, paving the way for smarter global supply chains."

The organizers extended their gratitude to all speakers, attendees, sponsors, and partners for making the event a milestone in the field. With its impactful discussions and networking opportunities, the congress concluded on a high note, setting the stage for future innovations.

9 Jul, 2025

PAAI Scholarly Engagement

Updated 24 Jun 2025

Prof. Hongxia Yang's Team Unlocks SOTA Multimodal Math Reasoning with Small Language Models

In the field of artificial intelligence, large language models (LLMs) have made remarkable strides in reasoning capabilities. However, when these capabilities are extended to multimodal scenarios, where models must process both text and images, researchers face considerable challenges. These challenges are especially pronounced for small multimodal language models with limited parameter sizes. A research team led by Professor Hongxia Yang at The Hong Kong Polytechnic University has proposed a training framework called Infi-MMR, which leverages an innovative three-phase reinforcement learning strategy. This framework successfully unlocks the multimodal reasoning potential of small language models, achieving state-of-the-art (SOTA) performance across several mathematical reasoning benchmarks, even surpassing some larger models in the process.

The team's findings are detailed in their recent preprint titled “Infi-MMR: Curriculum-based Unlocking Multimodal Reasoning via Phased Reinforcement Learning in Multimodal Small Language Models”, now available on arXiv. The paper lists Zeyu Liu, a research assistant at The Hong Kong Polytechnic University, and Yuhang Liu, a master's student at Zhejiang University, as co-first authors. Professor Hongxia Yang is the corresponding author.

The team aims to extend rule-based reinforcement learning achievements from the text domain (such as those from DeepSeek-R1) to the multimodal domain, while addressing inherent challenges in multimodal reinforcement learning. Small language models (SLMs), due to their limited number of parameters, face three core challenges:

Low-Quality Multimodal Reasoning Data: Rule-based reinforcement learning requires verifiable answers. However, most multimodal tasks focus on image captioning, description, or visual question answering, which lack rigorous reasoning elements. Existing datasets rarely offer complex reasoning tasks paired with verifiable outputs.

Degradation of Core Reasoning Abilities: When multimodal LLMs integrate visual and textual data, they often compromise their core reasoning skills, a problem especially severe in smaller models. Moreover, the complexity of cross-modal fusion can disrupt structured reasoning, leading to reduced task performance.

Complex but Unreliable Reasoning Paths: When trained directly on multimodal data using reinforcement learning, models tend to generate overly complex and often inaccurate reasoning processes.

The Infi-MMR framework addresses these issues through its three-stage curriculum learning approach:

Stage 1: Foundational Reasoning Activation. Instead of using multimodal inputs directly, this phase uses high-quality textual reasoning data to activate the model's reasoning capabilities through reinforcement learning. This approach builds a solid logical reasoning foundation and mitigates the degradation seen in standard multimodal models.

Stage 2: Cross-Modal Reasoning Adaptation. With the foundation in place, this phase gradually transitions the model to the multimodal domain using question-answer pairs supplemented with explanatory textual information. This helps the model adapt its reasoning skills to handle multimodal inputs.

Stage 3: Multimodal Reasoning Enhancement. To simulate real-world multimodal scenarios, where image descriptions may be missing, this stage removes textual hints and trains the model to perform reasoning directly from raw visual inputs. This reduces linguistic bias and promotes robust multimodal reasoning.

Notably, the team introduced caption-augmented multimodal data, which aids the model in transferring its text-based reasoning skills to multimodal contexts and enables more reliable cross-modal reasoning. Using the Infi-MMR framework, the team fine-tuned Qwen2.5-VL-3B into Infi-MMR-3B, a small multimodal model focused on mathematical reasoning.
The results are striking. On the MathVerse benchmark, which spans domains like algebra and geometry, Infi-MMR-3B achieved 43.68% accuracy, outperforming models of the same scale and even surpassing some 8-billion-parameter models. On the MathVista benchmark, which assesses comprehensive reasoning ability, it achieved 67.2% accuracy, a 3.8% improvement over the baseline. Its MathVerse result is on par with proprietary models such as GPT-4o (39.4%) and, by these figures, slightly ahead.

These achievements validate the effectiveness of the Infi-MMR framework and demonstrate the successful transfer of reasoning capabilities to the multimodal domain. The team emphasizes that while Infi-MMR-3B is tailored for mathematical reasoning, its core reasoning abilities are generalizable to other fields that require complex decision-making, such as education, healthcare, and autonomous driving.

Looking ahead, the team will continue exploring ways to enhance reasoning in multimodal models, aiming to empower small models with robust and transferable reasoning capabilities.
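The article notes that rule-based reinforcement learning depends on answers that can be verified automatically. The following minimal sketch shows what such a check might look like. It assumes, purely for illustration, that the model wraps its final answer in `<answer>...</answer>` tags and that the reward is 1.0 for a match and 0.0 otherwise; the function names and the tag format are assumptions, not details taken from the Infi-MMR paper.

```python
import re
from typing import Optional

# Assumed output format: the final answer appears inside <answer> tags.
ANSWER_PATTERN = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def extract_answer(response: str) -> Optional[str]:
    """Return the content of the last <answer> block, or None if absent."""
    matches = ANSWER_PATTERN.findall(response)
    return matches[-1].strip() if matches else None


def answers_match(predicted: str, reference: str) -> bool:
    """Compare numerically when both values parse as numbers, else as text."""
    try:
        return abs(float(predicted) - float(reference)) < 1e-6
    except ValueError:
        return predicted.strip().lower() == reference.strip().lower()


def rule_based_reward(response: str, reference: str) -> float:
    """Score a response using fixed rules only, with no learned reward model."""
    predicted = extract_answer(response)
    if predicted is None:
        return 0.0  # no parsable final answer
    return 1.0 if answers_match(predicted, reference) else 0.0
```

A check of this kind is only possible when each question has a single reference answer, which is why captioning or open-ended description data is a poor fit for this style of training.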

24 Jun, 2025

PAAI Research Results

Updated 20 Jun 2025

Prof. Yang Hongxia Wins Prestigious Funding Support from the RAISe+ Scheme

We are proud to announce that Prof. YANG Hongxia, Executive Director of the PolyU Academy for Artificial Intelligence and Associate Dean (Global Engagement) of the Faculty of Computer and Mathematical Sciences, has been selected for funding support under the second batch of the Research, Academic and Industry Sectors One-plus (RAISe+) Scheme. This recognition underscores Prof. Yang’s exceptional research achievements and her leadership in advancing innovation in computing technologies.

Prof. Yang’s funded project, titled “Reallm: World-leading Enterprise GenAI Infrastructure Solution”, aims to develop a comprehensive Generative Artificial Intelligence (GenAI) infrastructure tailored for enterprise applications. The project will:

Establish a decentralised architecture for pretraining and post-training to support distributed model training frameworks;
Develop a domain-adaptive continual training system that optimises large language models using domain-specific unlabelled data, enabling seamless adaptation to target domain distributions;
Design a low-bit training framework that requires only half the computational and storage resources of traditional training, while still achieving high-quality, end-to-end training from pretraining to post-training, significantly lowering the entry barrier for enterprises.

Ultimately, the project will launch an enterprise-grade GenAI platform to facilitate cross-domain collaboration, offering services across Software-as-a-Service, Platform-as-a-Service, and Infrastructure-as-a-Service.

Inaugurated in 2023, the RAISe+ Scheme aims to provide funding, on a matching basis, for at least 100 research teams from universities funded by the University Grants Committee which demonstrate strong potential to evolve into successful startups. Each approved project will receive funding support ranging from HK$10 million to HK$100 million.

20 Jun, 2025

PAAI Funding & Donations
