Clinical applications such as image-guided surgery and noninvasive diagnosis rely heavily on multi-modal images. Medical image fusion plays a central role by integrating information from multiple sources into a single, more interpretable output. We propose a real-time image fusion method that uses pre-trained neural networks to generate a single image containing features from multi-modal sources. The images are merged using a novel strategy based on deep feature maps extracted from a convolutional neural network. These feature maps are compared to generate fusion weights that drive the multi-modal image fusion process. Our method is not limited to the fusion of two images; it can be applied to any number of input sources. We validate the effectiveness of the proposed method on multiple medical fusion categories. The experimental results demonstrate that our technique achieves state-of-the-art performance in visual quality, objective assessment, and runtime efficiency.
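The weighting idea described above (comparing feature maps across sources to obtain per-pixel fusion weights) can be illustrated with a minimal sketch. This is not the implementation proposed here: the function names are hypothetical, `toy_features` is a gradient-based stand-in for the feature maps a pre-trained convolutional network would produce, and the channel-wise L1 norm with sum normalization is just one assumed way of comparing the maps. The sketch also assumes that the feature maps already have the same spatial size as the input images.

```python
import numpy as np


def toy_features(image):
    """Hypothetical stand-in for CNN features: stacks the x and y image
    gradients into a (2, H, W) array. A real pipeline would take feature
    maps from a pre-trained network instead."""
    gy, gx = np.gradient(image.astype(float))
    return np.stack([gx, gy])


def activity_map(features):
    """Collapse a (C, H, W) feature stack into an (H, W) activity map
    using the channel-wise L1 norm (an assumed choice)."""
    return np.abs(features).sum(axis=0)


def fusion_weights(feature_stacks, eps=1e-8):
    """Compare the activity maps of N sources and normalize them so the
    weights at every pixel sum to 1. Returns an (N, H, W) array."""
    activity = np.stack([activity_map(f) for f in feature_stacks]) + eps
    return activity / activity.sum(axis=0, keepdims=True)


def fuse(images, feature_stacks):
    """Fuse N single-channel images of shape (H, W) as a per-pixel
    weighted sum driven by the fusion weights."""
    weights = fusion_weights(feature_stacks)
    return (weights * np.stack(images)).sum(axis=0)


if __name__ == "__main__":
    # Three synthetic "modalities" to show that the number of sources is not fixed at two.
    rng = np.random.default_rng(0)
    sources = [rng.random((8, 8)) for _ in range(3)]
    fused = fuse(sources, [toy_features(s) for s in sources])
    print(fused.shape)
```

Because the weights are normalized across the source axis, the same code handles two, three, or more inputs without modification, and every fused pixel is a convex combination of the corresponding input pixels.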