In recent years, cloud computing has surged in popularity, offering vast computational resources in a scalable, cost-efficient manner. Security concerns persist, however, and many companies adopt cloud computing despite the associated risks. Biometric authentication has attracted significant attention as a way to address the weaknesses of password management and of conventional authentication systems. As the need for personal data security intensifies, multi-biometric fusion-based identification systems have emerged as a promising way to improve recognition accuracy. This paper introduces a novel multimodal biometric recognition technique that automatically authenticates users from face, iris, and fingerprint images using deep learning. Features from the three modalities are integrated using Fusion-Based Feature Extraction (Weighted Sum Rule) and classified using Deep Cross-Modal Retrieval (DCMR), yielding robust representations of facial, iris, and fingerprint characteristics, and a One-Time Password (OTP) is generated to strengthen authentication in the cloud environment. The proposed approach is evaluated against established classifiers, namely Support Vector Machines (SVM), Random Forests, Decision Trees, and K-Nearest Neighbors (KNN), using recognition rate, precision, recall, and F-measure as metrics. It achieves a recognition rate of 99.2%, surpassing the alternative models considered. These findings highlight the potential of deep learning within cloud computing environments to enhance multimodal biometric authentication systems. The approach uses Biometric-as-a-Service (BaaS) to reduce complexity and computational overhead, facilitating broader deployment of robust biometric security measures in cloud-based ecosystems.
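To make the weighted sum rule concrete, the following is a minimal sketch of feature-level fusion across the three modalities, followed by generation of a numeric OTP. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the function names (`l2_normalize`, `weighted_sum_fusion`, `generate_otp`), the per-modality L2 normalization, the requirement that all feature vectors share one length, the example weights, and the six-digit OTP length are all hypothetical choices made for the example.

```python
import secrets

import numpy as np


def l2_normalize(vector):
    """Scale a feature vector to unit L2 norm (assumed preprocessing step)."""
    v = np.asarray(vector, dtype=float)
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def weighted_sum_fusion(features, weights):
    """Fuse equal-length per-modality feature vectors with the weighted sum rule.

    features: dict mapping modality name -> 1-D feature vector
    weights:  dict mapping modality name -> non-negative weight
    Weights are rescaled to sum to 1 before fusion.
    """
    if set(features) != set(weights):
        raise ValueError("features and weights must cover the same modalities")
    total = float(sum(weights.values()))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    lengths = {len(np.atleast_1d(v)) for v in features.values()}
    if len(lengths) != 1:
        raise ValueError("all feature vectors must have the same length")
    # Weighted sum of the normalized vectors, one term per modality.
    return sum((weights[m] / total) * l2_normalize(features[m]) for m in features)


def generate_otp(digits=6):
    """Return a numeric one-time password drawn from a cryptographic RNG."""
    return "".join(secrets.choice("0123456789") for _ in range(digits))


if __name__ == "__main__":
    # Toy two-dimensional vectors standing in for learned deep features.
    features = {"face": [3.0, 4.0], "iris": [1.0, 0.0], "fingerprint": [0.0, 1.0]}
    weights = {"face": 0.5, "iris": 0.25, "fingerprint": 0.25}  # illustrative only
    fused = weighted_sum_fusion(features, weights)
    print("fused feature vector:", fused)
    print("one-time password:", generate_otp())
```

In this sketch the fused vector would be passed to the classifier, and the weights would in practice be tuned on validation data rather than fixed by hand.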