👋🏼 Hello there, I’m Dibyanayan!

👨🏻‍💻 I’m a PhD candidate in Computer Science & Engineering at IIT Patna, specializing in Natural Language Processing and multimodal AI.

🔍 My research interests include causal interpretability in multimodal AI systems.

📜 I’ve published in several top-tier conferences and journals, including IJCAI, EMNLP, and IEEE TCSS, focusing on state-of-the-art methods in language and vision models and their interpretability.

📚 Selected Publications

SEMANTIFY: Unveiling Memes with Robust Interpretability beyond Input Attribution

Dibyanayan Bandyopadhyay, A. Ganguly, B. Gain, A. Ekbal
IJCAI, 2024 (Core A*)

What it’s about: SEMANTIFY pioneers a fresh approach to understanding AI models by moving beyond surface-level input data to extract hidden, influential keywords from deep within the model. This four-step framework opens up a new dimension in meme interpretation, revealing the inner workings of multimodal models.
Why it matters: Traditional input attribution often fails to capture complex interactions. SEMANTIFY provides a deeper, layered interpretability that helps us see what truly drives AI’s decisions, moving a step closer to trustworthy AI systems.

Seeing Through VisualBERT: A Causal Adventure on Memetic Landscapes

Dibyanayan Bandyopadhyay, M. Hasanuzzaman, A. Ekbal
EMNLP Findings, 2024 (Core A)

What it’s about: This study dives into VisualBERT’s inner mechanics, applying causal analysis to identify critical features that drive meme offensiveness detection. By isolating causally important elements, this work provides a unique “causal lens” to view model behavior.
Why it matters: Input attribution methods often miss the mark in explaining causal relationships. This research pioneers a framework that captures true causality in model predictions, a key step in making AI models more interpretable and reliable.

A Knowledge-Infusion Multitasking System for Sarcasm Detection in Memes

Dibyanayan Bandyopadhyay, G. Kumari, A. Ekbal, S. Pal, A. Chatterjee, V. BN
ECIR, 2023 (Core A)

What it’s about: Combining sarcasm detection with emotion recognition, this multitask system built on CLIP leverages a novel dataset of Hindi memes, packed with subtle sarcasm cues. By infusing nuanced emotion categories, the model achieves a more refined grasp of sarcasm.
Why it matters: Sarcasm in memes is notoriously challenging to detect. This research introduces a high-performing solution that can differentiate sarcasm through enriched emotion understanding, setting a new standard in multimodal humor analysis.

Unsupervised Text Style Transfer Through Differentiable Back Translation and Rewards

Dibyanayan Bandyopadhyay, A. Ekbal
PAKDD, 2023 (Core A)

What it’s about: This work pushes the boundaries of style transfer with an unsupervised system that blends back-translation with reinforcement learning. It achieves high-quality, context-sensitive transformations across styles without needing labeled data.
Why it matters: Unsupervised style transfer is a tough nut to crack. By creatively combining translation and rewards, this model achieves new state-of-the-art performance, making it a valuable tool for applications in content adaptation and personalized text generation.

A Deep Transfer Learning Method for Cross-Lingual Natural Language Inference

Dibyanayan Bandyopadhyay, A. De, B. Gain, T. Saikh, A. Ekbal
LREC, 2022

What it’s about: This study taps into the potential of teacher-student learning for cross-lingual knowledge transfer, enhancing the ability of models to perform across languages with minimal loss in accuracy.
Why it matters: With a 10% boost in performance, this approach shows how leveraging pretrained models in a cross-lingual context can improve understanding in multilingual applications, bridging linguistic divides and expanding AI accessibility.

Each publication tackles real-world challenges in interpretability, humor detection, text style transfer, and multilingual understanding, all geared toward building smarter, more reliable AI systems that can resonate across cultures and contexts.

Selected Experience

👨🏻‍🔬 Research Experience

I was a Junior Research Fellow at IIT Patna, where I developed an advanced Sign Language Translation system and conducted impactful work on meme detection in multimodal contexts. I also interned at IBM Research, where I enhanced large language models with function call graph (FCG) data, boosting cross-repository code completion.
