👋🏼 Hello there, I’m Dibyanayan!
👨🏻‍💻 I’m a PhD candidate in Computer Science & Engineering at IIT Patna, specializing in Natural Language Processing and multimodal AI.
🔍 My research interests include causal interpretability in multimodal AI systems.
📜 I’ve published at top-tier conferences and journals, including IJCAI, EMNLP, and IEEE TCSS, focusing on state-of-the-art language and vision models and their interpretability.
📚 Selected Publications
SEMANTIFY: Unveiling Memes with Robust Interpretability beyond Input Attribution
Dibyanayan Bandyopadhyay, A. Ganguly, B. Gain, A. Ekbal
IJCAI, 2024 (Core A*)
- What it’s about: SEMANTIFY pioneers a fresh approach to understanding AI models by moving beyond surface-level input data to extract hidden, influential keywords from deep within the model. This four-step framework opens up a new dimension in meme interpretation, revealing the inner workings of multimodal models.
- Why it matters: Traditional input attribution often fails to capture complex interactions. SEMANTIFY provides a deeper, layered interpretability that helps us see what truly drives AI’s decisions, moving a step closer to trustworthy AI systems.
Seeing Through VisualBERT: A Causal Adventure on Memetic Landscapes
Dibyanayan Bandyopadhyay, M. Hasanuzzaman, A. Ekbal
EMNLP Findings, 2024 (Core A)
- What it’s about: This study dives into VisualBERT’s inner mechanics, applying causal analysis to identify critical features that drive meme offensiveness detection. By isolating causally important elements, this work provides a unique “causal lens” to view model behavior.
- Why it matters: Input attribution methods often miss the mark in explaining causal relationships. This research pioneers a framework that captures true causality in model predictions, a key step in making AI models more interpretable and reliable.
A Knowledge-Infusion Multitasking System for Sarcasm Detection in Memes
Dibyanayan Bandyopadhyay, G. Kumari, A. Ekbal, S. Pal, A. Chatterjee, V. BN
ECIR, 2023 (Core A)
- What it’s about: Combining sarcasm detection with emotion recognition, this multitasking system on CLIP leverages a novel dataset of Hindi memes, packed with subtle sarcasm cues. By infusing nuanced emotion categories, the model achieves a more refined grasp of sarcasm.
- Why it matters: Sarcasm in memes is notoriously challenging to detect. This research introduces a high-performing solution that can differentiate sarcasm through enriched emotion understanding, setting a new standard in multimodal humor analysis.
Unsupervised Text Style Transfer Through Differentiable Back Translation and Rewards
Dibyanayan Bandyopadhyay, A. Ekbal
PAKDD, 2023 (Core A)
- What it’s about: This work pushes the boundaries of style transfer with an unsupervised system that blends back-translation with reinforcement learning. It achieves high-quality, context-sensitive transformations across styles without needing labeled data.
- Why it matters: Unsupervised style transfer is a tough nut to crack. By creatively combining translation and rewards, this model achieves new state-of-the-art performance, making it a valuable tool for applications in content adaptation and personalized text generation.
A Deep Transfer Learning Method for Cross-Lingual Natural Language Inference
Dibyanayan Bandyopadhyay, A. De, B. Gain, T. Saikh, A. Ekbal
LREC, 2022
- What it’s about: This study taps into the potential of teacher-student learning for cross-lingual knowledge transfer, enhancing the ability of models to perform across languages with minimal loss in accuracy.
- Why it matters: With a 10% boost in performance, this approach shows how leveraging pretrained models in a cross-lingual context can improve understanding in multilingual applications, bridging linguistic divides and expanding AI accessibility.
Each publication tackles real-world challenges in interpretability, humor detection, text style transfer, and multilingual understanding, all geared toward building smarter, more reliable AI systems that can resonate across cultures and contexts.
Selected Experience
👨🏻‍🔬 Research Experience
I was a Junior Research Fellow at IIT Patna, where I developed a Sign Language Translation system and worked on meme detection in multimodal contexts. I also interned at IBM Research, where I augmented large language models with function call graph (FCG) data to improve cross-repository code completion.