Audio-Visual Speech Recognition (AVSR) and lip reading have emerged as pivotal research areas that exploit visual speech cues, either alone or combined with audio, to enhance the robustness of speech recognition systems. By ...