Sept 2023 - Present
Tech: Large Language Model (GPT-3.5), Prompt Engineering, RAG, LLM Fine-tuning, Chatbot, Text-to-image models, Image-to-video models, LangChain, Qdrant, AWS
  • Developed PoCs using image-to-image text style transfer models to generate images with specific font-styled textual content and text-to-image-to-video models to generate visual storyboards based on a textual context.
  • Worked on a prompt-engineered LLM-based Twitter bot for the ICC Men’s Cricket World Cup’23 that replies optimistically and with a personality-specific tone. Curated a knowledge base to keep replies factual.
  • Developed a RAG-based chatbot built on a fine-tuned LLM to answer legal queries. Also worked on automated legal document generation. Deployed it as a microservice on AWS.
Dubverse.ai, Gurgaon, India | DL Engineering Intern
January 2023 - July 2023
Tech: Text-to-speech, Voice cloning, Python, PyTorch, Google Cloud Platform (GCP)
  • Optimized the deployed TTS cross-lingual voice cloning model by plotting the transcript against the spectrogram energy and frequency of the generated speech to locate misaligned portions.
  • Added language-specific and RegEx rules to lower Phoneme Error Rate by 9%.
  • Scaled the optimized model to 4 new languages by training on 200+ hours of data on Google Cloud Platform.
June 2022 - September 2022 | Project Page
Tech: Pre-trained Language Model (BioBERT), Named Entity Recognition, Python, TensorFlow, FastAPI, Google Cloud Platform (GCP)
  • Reduced data curation time by 40% by developing OligoFinder, a BioBERT-based module fine-tuned for Named Entity Recognition to automatically extract oligonucleotides from research papers.
  • Developed an active-learning pipeline using RegEx, NLTK, and bag-of-words (BoW) to curate a dataset of 7000 sentence pairs.
  • The resultant fine-tuned model achieved a Precision of 0.92, Recall of 0.93 and F1 of 0.92.
January 2022 - July 2022
Tech: GAN-based lip-syncing models, YOLOv5, RoBERTa, Information Extraction, TensorBoard, BeautifulSoup, Python, TensorFlow, Flask, Google Cloud Platform (GCP)
  • Implemented a GAN-based lip-syncing system that generates lip movements from dubbed audio. Tuned the model on a custom-curated dataset of Indian speakers to achieve a Lip Sync Error Distance of 6.512.
  • Designed a Flask-based two-stage resume parser that uses YOLOv5 and spaCy v3's RoBERTa to extract 10 information fields with an overall test accuracy of 53.97%. Scraped ~4000 resumes for training and validation using BeautifulSoup.
DESIDOC-DRDO, New Delhi, India | Data Science Intern
June 2022 - July 2022
Tech: Deep Learning, Face detection, Face recognition, Python, TensorFlow, TensorBoard, Streamlit
  • Developed a Person Auto-tagging system that labels renowned personalities across unlabeled photographs using MTCNN and FaceNet. Achieved 94% and 98% accuracy in face detection and recognition tasks, respectively.
January 2021 - February 2021
Tech: Deep Learning, Face detection, Face recognition, Python, TensorFlow, TensorBoard, FastAPI, Google Cloud Platform (GCP)

  • Integrated a face-recognition-based Automated Connection Invite Sender feature into Gullu, a social mobile application focused on traveller communities.
  • Reduced the average time a user spent sending friend requests by 53% by replacing the earlier flow with an automatic selfie-based person search and automatic sending of the connection request over the platform.