Aniruddha Mahapatra

I am a second-year Master of Science in Computer Vision (MSCV) student at Carnegie Mellon University. I am advised by Prof. Jun-Yan Zhu. My research interests include computer vision and deep learning, specifically image and video synthesis and editing with generative models. My goal is to create automated AI algorithms for generating and editing photorealistic images and videos that would be very time-consuming, or otherwise impossible, to produce manually with existing tools.

I am fortunate to have worked with Aliaksandr Siarohin, Hsin-Ying Lee, and Sergey Tulyakov at Snap Research; Kuldeep Kulkarni, Anandhavelu Natarajan, and Subrata Mitra at Adobe Research; Jitendra Singh at IBM Research; and Professor Biplab Banerjee and Ranita Biswas at the Indian Institute of Technology, Roorkee.

Email  |  Resume  |  Google Scholar  |  LinkedIn

CMU
MS in Computer Vision
Aug. 22 - Dec. 23

Snap
Research Intern
May. 23 - Aug. 23

Adobe
Research Associate
Aug. 20 - Present

Adobe
Research Intern
May. 19 - Jul. 19

IBM
Remote Research Intern
Jun. 18 - Dec. 18

IIT Roorkee
B.Tech. Computer Science
Jul. 16 - Jul. 20

Software
Moving Elements
video | website

Adobe Photoshop Elements 2023's new feature, "Moving Elements," lets users generate aesthetic cinemagraphs from their photos.
This feature is based on our Controllable Animation of Fluid Elements in Still Images (CVPR 2022) work.

Research
Text-Guided Synthesis of Eulerian Cinemagraphs
Aniruddha Mahapatra, Aliaksandr Siarohin, Hsin-Ying Lee,
Sergey Tulyakov, Jun-Yan Zhu
SIGGRAPH Asia, 2023 (ACM Transactions on Graphics)
webpage | paper | arXiv | code | bibTeX

We introduce Text2Cinemagraph, a fully automated method for creating cinemagraphs from text descriptions - an especially challenging task when prompts feature imaginary elements and artistic styles, given the complexity of interpreting the semantics and motions of such images. Our method also gives the user coarse control over the direction of motion in the generated cinemagraphs through text.

Controllable Animation of Fluid Elements in Still Images
Aniruddha Mahapatra, Kuldeep Kulkarni
CVPR, 2022
webpage | paper | arXiv | video | bibTeX

Given a single input image, a mask over the region the user wants to animate, and any number of user-provided arrow directions with associated speeds specifying the desired movement, we propose a method to interactively control the animation of fluid elements (such as water, fire, and clouds) and generate a cinemagraph from the single image.

GEMS: Scene Expansion using Generative Models of Graphs
Aniruddha Mahapatra*, Rishi Agarwal*, Tirupati Saketh Chandra*, Vaidehi Patil*, Kuldeep Kulkarni, Vishwa Vinay
WACV, 2023
paper | arXiv | bibTeX

We design an auto-regressive model, GEMS, for the novel task of conditional scene-graph expansion: given a seed scene graph, GEMS hierarchically adds nodes and edges to it. We also propose novel metrics that capture the coherence of predicted nodes and edges in expanded scene graphs better than traditional MMD-based metrics.

Entity Extraction in Low Resource Domains with Selective Pre-training of Large Language Models
Aniruddha Mahapatra, Sharmila Nangi, Aparna Garimella, Anandhavelu Natarajan
EMNLP, 2022 (Oral)
paper | bibTeX

We introduce effective, unsupervised ways of selecting datasets for pre-training large language models to facilitate domain adaptation to domains with very limited data.

SemIE: Semantically-aware Image Extrapolation
Bholeshwar Khurana, Soumya Ranjan Dash, Abhishek Bhatia, Aniruddha Mahapatra, Hrituraj Singh, Kuldeep Kulkarni
ICCV, 2021
webpage | paper | arXiv | bibTeX | video (infinite zooming-out)

We propose a semantically-aware novel paradigm to perform image extrapolation that enables the addition of new object instances.

Unsupervised Domain Adaptation for Remote Sensing Images Using Metric Learning and Correlation Alignment
Aniruddha Mahapatra, Biplab Banerjee
NCVPRIPG 2019, 2021 (Oral)
paper | bibTeX

We propose an end-to-end trainable, neural network-based unsupervised domain adaptation (DA) module for remote sensing (RS) image classification that learns a shared, discriminative embedding space for both domains by jointly optimizing a contrastive loss and minimizing the difference between the two domains' higher-order statistics.

Assessment of Sentinel-1 and Sentinel-2 Satellite Imagery for Crop Classification in Indian Region During Kharif and Rabi Crop Cycles
Jitendra Singh, Aniruddha Mahapatra, Saurav Basu, Biplab Banerjee
IEEE IGARSS, 2019
paper | bibTeX

We evaluate the potential of Sentinel-1 Synthetic Aperture Radar (SAR) and Sentinel-2 optical imagery for crop classification in an Indian region using a multi-class support vector machine (SVM) classifier applied to temporal features extracted from the two imagery sources.

News
Awards and Scholarships

Source code from Jon Barron