PolyU Develops Advanced Human-Robot Collaboration System to Empower High-End Manufacturing Tasks

With human-robot collaboration at the core of Industry 5.0, a research team at PolyU has made significant progress in this field, developing a new-generation “human-machine symbiotic” collaborative manufacturing system. The system has been successfully applied to high-end manufacturing tasks such as autonomous drilling on large aircraft and the disassembly of electric vehicle batteries.
Led by Ir Prof. ZHENG Pai, Wong Tit Shing Young Scholar in Smart Robotics and Associate Professor of PolyU’s Department of Industrial and Systems Engineering, the research team has developed the “Mutual Cognitive Human-Robot Collaboration Manufacturing System” (MC-HRCMS). The project was conducted in collaboration with Prof. WANG Lihui, Chair of Sustainable Manufacturing and Director of the Centre of Excellence in Production Research at KTH Royal Institute of Technology, Sweden.
MC-HRCMS is centred on holistic scene perception. By collecting and analysing multimodal sensing data, including vision, haptics, language and physiological signals, the system performs highly accurate and comprehensive environmental analysis while carrying out autonomous decision-making and flexible task execution.
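To make the idea of holistic, multimodal scene perception more concrete, the minimal Python sketch below fuses readings from hypothetical vision, haptic, language and physiological channels into a single scene state and applies a toy decision rule. All class and function names (ModalityReading, SceneState, fuse_modalities, choose_action) are invented for illustration and do not describe the MC-HRCMS implementation.

```python
# Hypothetical sketch of multimodal scene fusion for a collaborative work cell.
# Names and data fields are invented for illustration, not taken from MC-HRCMS.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModalityReading:
    """A single time-stamped observation from one sensing channel."""
    modality: str     # e.g. "vision", "haptics", "language", "physiology"
    timestamp: float  # seconds since the start of the task
    payload: Dict     # modality-specific data (detections, forces, text, ...)


@dataclass
class SceneState:
    """A fused, planner-facing summary of the work cell."""
    objects: List[str] = field(default_factory=list)  # objects reported by vision
    contact_force_n: float = 0.0                      # latest haptic reading
    operator_command: str = ""                        # latest language input
    operator_stress: float = 0.0                      # 0..1 physiological estimate


def fuse_modalities(readings: List[ModalityReading]) -> SceneState:
    """Merge the most recent reading of each modality into one scene state."""
    state = SceneState()
    latest: Dict[str, ModalityReading] = {}
    for reading in sorted(readings, key=lambda r: r.timestamp):
        latest[reading.modality] = reading  # later readings overwrite earlier ones
    if "vision" in latest:
        state.objects = latest["vision"].payload.get("objects", [])
    if "haptics" in latest:
        state.contact_force_n = latest["haptics"].payload.get("force_n", 0.0)
    if "language" in latest:
        state.operator_command = latest["language"].payload.get("text", "")
    if "physiology" in latest:
        state.operator_stress = latest["physiology"].payload.get("stress", 0.0)
    return state


def choose_action(state: SceneState) -> str:
    """Toy decision rule: slow down when contact forces or operator stress are high."""
    if state.operator_stress > 0.7 or state.contact_force_n > 20.0:
        return "reduce_speed_and_wait"
    if state.operator_command:
        return f"execute: {state.operator_command}"
    return "continue_current_task"


if __name__ == "__main__":
    readings = [
        ModalityReading("vision", 0.1, {"objects": ["battery_pack", "bolt"]}),
        ModalityReading("haptics", 0.2, {"force_n": 4.5}),
        ModalityReading("language", 0.3, {"text": "unscrew the top-left bolt"}),
        ModalityReading("physiology", 0.3, {"stress": 0.2}),
    ]
    print(choose_action(fuse_modalities(readings)))
```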
The system features advanced machine learning and 3D scene perception capabilities that improve both efficiency and safety, enabling fluid human-robot interaction in complex manufacturing scenarios. Through industry collaboration projects, the team has tailored human-robot collaboration systems for multiple leading enterprises and successfully deployed them across scenarios that demand precision or involve complex work procedures.
Ir Prof. Zheng said, “The global manufacturing industry is shifting towards a human-machine/robot symbiotic paradigm that emphasises more flexible automation. Our research aims to develop a paradigm that offers multimodal natural perception, cross-scenario skill transfer and domain foundation model-based autonomous execution, so that robots are no longer just tools, but intelligent agents that can co-evolve with human operators. This provides smart factories with a new path beyond pre-programmed automation.”
The team introduced a novel “Vision-Language-Tactile-Guided” planning framework that combines Vision-Language Models (VLMs), Deep Reinforcement Learning (DRL), 6-DoF tactile perception and Mixed-Reality Head-Mounted Displays (MR-HMD), enhancing the system’s ability to execute personalised and otherwise unpredictable production tasks.
A key innovation of the framework is the combination of a vision-language-tactile-guided target object segmentation model with language-command-driven task planning, allowing the system to integrate visual information with language-based instructions. This enables robots to comprehend complex task semantics, interpret dynamic scenes and collaborate efficiently with human operators. In particular, the head-mounted device enables real-time data acquisition and provides immediate, intuitive guidance to operators, redefining the human-machine interaction interface.
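As a rough illustration of how such a pipeline could be wired together, the Python sketch below pairs a vision-language segmentation step with language-command-driven step decomposition. Both model calls are placeholder stubs written for this example rather than the team’s actual models, and the function names (segment_referred_object, decompose_command, plan_from_command) are hypothetical.

```python
# Hypothetical sketch of the two-stage idea described above: segment the object
# a command refers to, then turn the command into executable steps. The model
# calls are stand-in stubs, not the segmentation or planning models of MC-HRCMS.
from typing import Any, Dict, List, Tuple


def segment_referred_object(image: Any, command: str) -> Dict[str, Any]:
    """Stub for a vision-language segmentation model: return the region of the
    object the command refers to. A fixed result stands in for a real model."""
    return {"label": "referred_object", "pixel_centroid": (320, 240), "confidence": 0.9}


def decompose_command(command: str) -> List[str]:
    """Stub for language-command-driven task planning: split a command into
    robot-executable steps. A real system would query a VLM/LLM-based planner."""
    return ["approach_target", "align_tool", "execute_operation", "retract"]


def plan_from_command(image: Any, command: str) -> List[Tuple[str, Dict[str, Any]]]:
    """Pair every planned step with the segmented target so the executor knows
    both what to do and where in the scene to do it."""
    target = segment_referred_object(image, command)
    return [(step, target) for step in decompose_command(command)]


if __name__ == "__main__":
    for step, target in plan_from_command(image=None, command="drill the marked hole"):
        print(f"{step} -> {target['label']} at {target['pixel_centroid']}")
```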


