Hello, I am

Rohit Saroj

Experienced Data Scientist skilled in Python, SQL, Generative AI, AWS, and advanced data analysis. I specialize in automating processes, designing AI/ML solutions, and deploying production-ready models to solve complex business challenges.

Professional Experience

Aug 2025 – Present

Sr. Data Scientist

Teamlease (Client: Deloitte USI) | Gurgaon, IN

  • Developing a Generative AI–powered chatbot (RWE Agent) using LangGraph for automated clinical cohort building.
  • Implemented prompt optimization using DSPy and integrated Claude 3.5 via VOX API.
  • Designed a human-in-the-loop workflow in LangGraph for user review of extracted criteria.
  • Built an AWS Knowledge Base mapping system to link criteria with Snowflake schema.
  • Engineered a study recommendation module leveraging AWS KB and K-Means clustering.
  • Designed automated ETL pipelines to fetch and preprocess study metadata for S3/AWS KB ingestion.
LangGraph, DSPy, AWS Bedrock, Snowflake, FastAPI

Jun 2023 – Apr 2025

Data Scientist

Stem Inc | Gurgaon, IN

  • Designed and deployed a RAG-based application using OpenAI embeddings & Pinecone, reducing resolution time by 40%.
  • Enhanced predictive accuracy of site load forecasting models, increasing reliability.
  • Automated analytics workflows in Jupyter Notebook, boosting triaging efficiency by 25%.
  • Leveraged AWS, Jenkins, and CI/CD pipelines to deploy scalable data solutions.
Python, LangChain, OpenAI API, Pinecone, Jenkins

Sep 2022 – Jun 2023

Analyst

Manikaran Analytics Limited | New Delhi, IN

  • Refined wind energy forecasting models, reducing Mean Absolute Error (MAE) by 12%.
  • Automated recurring analysis tasks using Python scripts, saving 25% of team resources.
  • Built wind speed prediction models with Random Forest and XGBoost, enhancing precision by 15%.
XGBoost, Random Forest, Pandas, SQL

Mar 2020 – Aug 2022

Performance Analyst

Emergya Wind Turbines Pvt Ltd | Chennai, IN

  • Designed a CNN-based AI model to classify wind turbine power curves, improving efficiency by 50%.
  • Predicted component failures using ML algorithms, reducing maintenance costs by 20%.
  • Automated data reporting via Python-based GUIs.
TensorFlow, CNN, PySpark, Power BI

Mar 2017 – Mar 2020

Engineer

Wind World India Ltd | Mumbai, IN

  • Boosted energy yield by 5% through in-depth turbine performance analysis.
  • Conducted root cause analyses to identify and rectify underperformance issues.
Excel, SQL, Python

Technical Skills

Programming & Tools

Python, SQL, Git, Docker, Jenkins, Jira, Bitbucket

AI & Machine Learning

Generative AI, LLMs, LangChain, NLP, Deep Learning, TensorFlow, PyTorch

Data & Analytics

Pandas, NumPy, Scikit-learn, EDA, Feature Engineering, Power BI, Plotly

Frameworks & Cloud

FastAPI, Flask, Streamlit, PySpark, AWS (S3, EC2, Bedrock)

Education & Certifications

Education

  • PGDBM, Operations Management

    NMIMS, Mumbai | Jul 2022

  • B.E. Instrumentation Engineering

    Rajiv Gandhi Institute of Technology, Mumbai | Aug 2016

Certifications

  • Prompt Design in Vertex AI – Google
  • SQL for Data Science – Great Learning
  • The Data Science Course 2022 – Udemy
  • Generative AI with LangChain – Udemy

Get In Touch

I'm currently open to new opportunities. Whether you have a question or just want to say hi, I'll try my best to get back to you!
