Aniruddha Mahapatra

I am a Research Engineer at Adobe Research. I completed my Master's in Computer Vision (MSCV) at Carnegie Mellon University, where I was advised by Prof. Jun-Yan Zhu. My research interests include image and video synthesis and editing using generative models.

I am fortunate to have worked with Aliaksandr Siarohin, Hsin-Ying Lee and Sergey Tulyakov at Snap Research; Kuldeep Kulkarni, Anandhavelu Natarajan and Subrata Mitra at Adobe Research; Jitendra Singh at IBM Research; and Professor Biplab Banerjee and Ranita Biswas at the Indian Institute of Technology, Roorkee.

Email  |  Resume  |  Google Scholar  |  LinkedIn

Adobe Research
Research Engineer
Feb. 24 - Present

CMU
MS in Computer Vision
Aug. 22 - Dec. 23

Snap Research
Research Intern
May 23 - Aug. 23

Adobe Research
Research Associate
Aug. 20 - Aug. 22

Adobe Research
Research Intern
May 19 - Jul. 19

IIT Roorkee
B.Tech. Computer Science
Jul. 16 - Jul. 20

Software
Moving Elements
video | website

"Moving Elements", a new feature in Adobe Photoshop Elements 2023, lets users generate aesthetic cinemagraphs from their photos.
This feature is based on our work, Controllable Animation of Fluid Elements in Still Images (CVPR 2022).

Research
Co-speech Gesture Video Generation with 3D Human Meshes
Aniruddha Mahapatra*, Richa Mishra*, Renda Li, Ziyi Chen, Boyang Ding, Shoulei Wang, Jun-Yan Zhu, Peng Chang, Mei Han, Jing Xiao
ECCV, 2024
webpage | paper | bibTeX

Existing hand gesture video generation methods are primarily limited by the widely adopted 2D skeleton-based gesture representation and still struggle to generate realistic hands. We introduce a co-speech video generation framework to synthesize human speech videos leveraging human mesh-based representations.

On the Content Bias in Fréchet Video Distance
Songwei Ge, Aniruddha Mahapatra, Gaurav Parmar, Jun-Yan Zhu, Jia-Bin Huang
CVPR, 2024
webpage | paper | arXiv | code | bibTeX

We analyse the problems with the I3D features commonly used to compute the FVD metric, which evaluates the quality of generated videos. We develop code and provide pre-computed features for computing FVD with different feature extractors. The toolkit is available in our GitHub repo.

pip install cd-fvd
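
For readers unfamiliar with the metric, the Fréchet distance underlying FVD can be sketched in a few lines. This is a minimal illustration, not the cd-fvd API: it assumes video features have already been extracted by some feature extractor into arrays of shape (num_videos, feature_dim) with feature_dim of at least 2, and the function name `frechet_distance` is hypothetical.

```python
import numpy as np


def frechet_distance(feats_real: np.ndarray, feats_fake: np.ndarray) -> float:
    """Fréchet distance between Gaussians fitted to two feature sets.

    Both inputs are assumed to have shape (num_videos, feature_dim).
    """
    mu_r = feats_real.mean(axis=0)
    mu_f = feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)

    # Tr((cov_r cov_f)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_f; clip tiny negative values caused by
    # floating-point error before taking the square root.
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_f) - 2.0 * trace_sqrt)
```

Swapping the feature extractor changes only the arrays passed in; the distance computation itself stays the same.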

Text-Guided Synthesis of Eulerian Cinemagraphs
Aniruddha Mahapatra, Aliaksandr Siarohin, Hsin-Ying Lee, Sergey Tulyakov, Jun-Yan Zhu
SIGGRAPH Asia, 2023 (ACM Transactions on Graphics)
webpage | paper | arXiv | code | bibTeX

We introduce a fully automated method, Text2Cinemagraph, for creating cinemagraphs from text descriptions. This is an especially challenging task when prompts feature imaginary elements and artistic styles, given the complexity of interpreting the semantics and motions of these images. Our method also gives the user coarse control over the direction of motion in the generated cinemagraphs through text-based directions.

Controllable Animation of Fluid Elements in Still Images
Aniruddha Mahapatra, Kuldeep Kulkarni
CVPR, 2022
webpage | paper | arXiv | video | bibTeX

Given a single input image, a user-provided mask of the region to animate, and any number of arrow directions with associated speeds that specify the desired movement, we propose a method to interactively control the animation of fluid elements (such as water, fire, and clouds) and generate cinemagraphs from the single image.

GEMS: Scene Expansion using Generative Models of Graphs
Aniruddha Mahapatra*, Rishi Agarwal*, Tirupati Saketh Chandra*, Vaidehi Patil*, Kuldeep Kulkarni, Vishwa Vinay
WACV, 2023
paper | arXiv | bibTeX

We design an auto-regressive model, GEMS, for the novel task of conditional scene graph expansion: given a seed scene graph, it hierarchically adds nodes and edges to it. We also propose novel metrics for evaluating expanded scene graphs that capture the coherence of predicted edges and nodes better than traditional MMD-based metrics.

Entity Extraction in Low Resource Domains with Selective Pre-training of Large Language Models
Aniruddha Mahapatra, Sharmila Nangi, Aparna Garimella, Anandhavelu Natarajan
EMNLP, 2022 (Oral)
paper | bibTeX

We introduce effective, unsupervised ways of selecting datasets for pre-training large language models to facilitate domain adaptation to domains with very limited data.

SemIE: Semantically-aware Image Extrapolation
Bholeshwar Khurana, Soumya Ranjan Dash, Abhishek Bhatia, Aniruddha Mahapatra, Hrituraj Singh, Kuldeep Kulkarni
ICCV, 2021
webpage | paper | arXiv | bibTeX | video (infinite zooming-out)

We propose a novel, semantically-aware paradigm for image extrapolation that enables the addition of new object instances.

Unsupervised Domain Adaptation for Remote Sensing Images Using Metric Learning and Correlation Alignment
Aniruddha Mahapatra, Biplab Banerjee
NCVPRIPG 2019, 2021 (Oral)
paper | bibTeX

We propose an end-to-end trainable, neural network-based unsupervised domain adaptation (DA) module for remote sensing (RS) image classification. It learns a shared, discriminative embedding space for both domains by jointly optimizing a contrastive loss and minimizing the difference between the two domains' higher-order statistics.

Assessment of Sentinel-1 and Sentinel-2 Satellite Imagery for Crop Classification in Indian Region During Kharif and Rabi Crop Cycles
Jitendra Singh, Aniruddha Mahapatra, Saurav Basu, Biplab Banerjee
IEEE IGARSS, 2019
paper | bibTeX

We evaluate the potential of Sentinel-1 Synthetic Aperture Radar (SAR) and Sentinel-2 optical imagery for crop classification in an Indian region, applying a multi-class classification algorithm based on the support vector machine (SVM) to temporal features extracted from the two imagery sources.

News
Awards and Scholarships

Source code from Jon Barron