Professional Journey

From NLP research labs at IIT Bombay to AI-powered sales automation in Berlin: here's the path.

May 2025 – Present

Full Stack Developer

Bryo

Berlin, Germany

I work as a full-stack engineer at Bryo, an early-stage Berlin startup building AI-powered sales automation software for industrial suppliers. Bryo helps technical sales teams automate the entire quoting process: from reading incoming customer requests and configuring complex products to generating accurate quotes and syncing with enterprise systems such as SAP, Salesforce, and Microsoft Dynamics 365. As one of the core engineers, I lead development across the full stack (frontend, backend, infrastructure, and architecture) and play a central role in taking the product from early stage to production-ready.

Led a critical architecture redesign to consolidate per-client codebases into a single unified codebase, with feature-level customisation toggles per client — reducing duplication, accelerating feature delivery, and simplifying long-term maintenance
Designed and deployed the complete cloud infrastructure from scratch — secure, scalable, and production-hardened — implementing a single-tenant architecture that ensures strict data isolation per client
Built and integrated multiple backend features bridging the frontend with internal AI services and third-party enterprise systems, including SAP, Salesforce, and Microsoft Dynamics 365
Optimised the frontend interface for complex industrial sales workflows, improving usability and reducing the time sales reps spend navigating from an incoming RFQ to a validated quote
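
The per-client customisation toggles mentioned above can be illustrated with a minimal sketch. It assumes a shared set of default feature flags that each client may override; the feature names, the client ID, and the `ClientConfig` class are hypothetical and are not taken from Bryo's codebase.

```python
from dataclasses import dataclass, field

# Hypothetical feature flags shared by every client (illustrative names only).
DEFAULT_FEATURES = {
    "erp_sync": False,
    "auto_quote": True,
    "product_configurator": True,
}


@dataclass(frozen=True)
class ClientConfig:
    """Per-client overrides layered on top of the shared defaults."""

    client_id: str
    overrides: dict = field(default_factory=dict)

    def is_enabled(self, feature: str) -> bool:
        # Reject unknown flags early so typos do not silently disable a feature.
        if feature not in DEFAULT_FEATURES:
            raise KeyError(f"unknown feature: {feature}")
        # A client override wins; otherwise fall back to the shared default.
        return self.overrides.get(feature, DEFAULT_FEATURES[feature])
```

With this shape, a single codebase can branch on `ClientConfig("acme", {"erp_sync": True}).is_enabled("erp_sync")` instead of maintaining a separate fork per client.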

Mar 2024 – Apr 2025

Senior Data Analyst

Tiger Analytics

Bangalore, India

Worked at a leading analytics consultancy, building intelligent data systems for healthcare clients. My most significant contribution was designing a RAG (Retrieval-Augmented Generation) pipeline: think of it as an AI-powered research assistant that searches massive databases for the most relevant records, ranks them by relevance, and summarizes them in plain language. I also built the full production infrastructure to deploy this system reliably at scale. Beyond AI, I automated data collection by building web crawlers (programs that automatically browse websites and extract useful information), cutting manual effort by 85%. I also developed tools to read unstructured documents (such as reports and articles) and automatically fill in missing fields in master datasets.
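
The relevance-ranking step can be sketched as follows. This is a toy illustration that assumes records and the query have already been turned into embedding vectors and that cosine similarity is the ranking metric; the function names and the example vectors are hypothetical, and the production pipeline may have used a different metric or a vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_records(query_vec, records, top_k=3):
    """Return the IDs of the top_k records most similar to the query, best first."""
    scored = [
        (cosine_similarity(query_vec, vec), record_id)
        for record_id, vec in records.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record_id for _, record_id in scored[:top_k]]
```

In a RAG setup, the top-ranked records would then be passed to the LLM as context for summarization.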

Built a RAG pipeline using Large Language Models (LLMs) to generate intelligent query filters and summarize healthcare records, with accuracy improved through similarity-based ranking
Engineered an end-to-end production inference pipeline optimized with parallel processing, significantly reducing execution time at scale
Designed automated web crawlers using Selenium and async programming (asyncio, aiohttp), cutting manual data-collection effort by 85%
Extracted key entities from unstructured documents using LLM prompt tuning for classification, enriching master datasets and enabling downstream analytics
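
The async crawling pattern can be sketched with the standard library alone. This is a simplified illustration, not the production crawler: `fetch_page` is a hypothetical stand-in that simulates a network call, where a real crawler would issue an HTTP request (for example through an aiohttp session), and the concurrency limit is an assumed value.

```python
import asyncio


async def fetch_page(url: str) -> str:
    """Hypothetical stand-in for a real HTTP request; simulates network latency."""
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"


async def crawl(urls, max_concurrency=5):
    """Fetch all URLs concurrently, with at most max_concurrency requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_fetch(url):
        # The semaphore keeps the crawler from overwhelming the target site.
        async with semaphore:
            return url, await fetch_page(url)

    results = await asyncio.gather(*(bounded_fetch(u) for u in urls))
    return dict(results)
```

Calling `asyncio.run(crawl(list_of_urls))` returns a dictionary mapping each URL to its fetched content. Because the waits overlap, total time grows far more slowly than with one-at-a-time fetching.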

May 2022 – Feb 2024

Data Scientist

Docketry.ai

Bangalore, India

Led AI and automation development for Docketry, an intelligent document processing platform that helps businesses automate paperwork workflows. My core contribution was fine-tuning a state-of-the-art document AI model called LayoutLMv2, which reads documents much as humans do: by understanding both the text and the visual layout of the page. I trained this model to classify documents into 10 categories with a 94% F1-score. Beyond the AI model, I designed and built the entire server infrastructure, including the API, authentication system, and cloud deployment, and published a Python package so clients could integrate Docketry into their own systems with minimal effort.

Fine-tuned the LayoutLMv2 document AI model, achieving a 94% F1-score across 10 document categories and enabling automated, highly accurate document classification
Built and led development of a gRPC-based Django server with full authentication and authorization, deployed on Azure behind Nginx for SSL-secured, production-ready hosting
Published the Docketry PyPI package to enable seamless, plug-and-play client-side integrations across diverse business systems
Explored prompt-engineering techniques to extract structured fields from raw text documents, improving processing speed and accuracy for automation workflows
Designed user analytics dashboards to visualize usage logs and provide meaningful summaries in the Docketry dashboard

Dec 2020 – Jun 2022

NLP Researcher

Indian Institute of Technology Bombay

Mumbai, India

Conducted advanced research in Natural Language Processing (NLP), the branch of AI that enables computers to understand and process human language, under Prof. Pushpak Bhattacharyya, one of India's foremost NLP experts. My research focused on a uniquely Indian challenge: understanding code-mixed text, where people blend two languages in a single sentence (e.g., mixing Hindi and English). I built a Named Entity Recognition (NER) system, an AI that automatically identifies names of people, places, movies, and other entities in text, for 5 Indian languages, covering 17 entity types and trained on 100,000+ data entries. I also developed data augmentation techniques to create training data for low-resource Indian languages, leading to a 20% improvement in NER performance on the targeted tags.

Built a multilingual NER model for code-mixed Indian language queries covering 17 entity tags and 5 Indian languages; fine-tuned an ensemble BERT model achieving 94.79% F1-score
Developed a data augmentation pipeline using open-source tools and a custom gazetteer list, achieving a 20% improvement in NER performance for targeted low-resource language tags
Explored a wide range of deep learning architectures (RNNs, LSTMs, Transformers, T5, and GPT models) across NER, transliteration, machine translation, and summarization tasks
Published research paper: 'Aspect-Sentiment-based Opinion Summarization using Multiple Information Sources' at ACM CODS-COMAD 2023
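
The gazetteer-based augmentation mentioned above can be illustrated with a small sketch. It assumes the simplest possible setting: one tag per token, single-token entities, and `O` for non-entities. The tag name, the gazetteer entries, and the example sentence are hypothetical, and the actual research pipeline was likely more involved (for instance, handling multi-token entities).

```python
import random


def augment(tokens, tags, gazetteer, rng):
    """Create a new training example by swapping each tagged entity
    for a random gazetteer entry carrying the same tag."""
    new_tokens = []
    for token, tag in zip(tokens, tags):
        candidates = gazetteer.get(tag)
        if tag != "O" and candidates:
            # Replace the entity; the tag sequence stays valid because
            # the substitute has the same entity type.
            new_tokens.append(rng.choice(candidates))
        else:
            new_tokens.append(token)
    return new_tokens, list(tags)


# Illustrative code-mixed query with a hypothetical MOVIE tag.
tokens = ["mujhe", "Sholay", "dekhni", "hai"]
tags = ["O", "MOVIE", "O", "O"]
gazetteer = {"MOVIE": ["Lagaan", "Dangal"]}
new_tokens, new_tags = augment(tokens, tags, gazetteer, random.Random(0))
```

Each call yields a sentence with the same structure and labels but a different entity, which is one inexpensive way to multiply examples for tags that are rare in the original data.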