PV Abhiram — Data Analyst and Machine Learning Engineer
From research to production — I build ML systems that work in the real world.
Data analyst at Novartis Healthcare and ML researcher based in Hyderabad, India. I work across the full stack — from CUDA kernels and transformer architectures to CRM platforms and business analytics. BTech CSE (AI & ML), Vellore Institute of Technology Chennai, CGPA 8.78. Open to roles.
Projects
CLIP-to-EdgeNeXt Distillation
Distilled CLIP ViT-B/16 into EdgeNeXt-XX-Small (approximately 66x compression) for on-device crop disease detection across 38 disease classes. 3-stage curriculum with a text-prototype-anchored distillation loss. Targets real-world inference on resource-constrained agricultural hardware.
Stack: CLIP, PyTorch, Edge AI, Knowledge Distillation.
Neurosymbolic AI: Beyond LLM-to-Solver Pipelines
Co-authored position paper arguing that translation, not symbolic inference, is the bottleneck in LLM+solver systems. Evidence from an 85-paper review, multi-scale probing of GPT-2 and DistilBERT, and Phi-2 and Qwen2.5-3B benchmarks on FOLIO with the Z3 SMT solver.
Stack: Position Paper, GPT-2, Phi-2, FOLIO, Z3 SMT.
Adaptive Depth Transformer
Decoder-only transformer that allocates compute per token via representational convergence gating: simple tokens exit early, complex ones run the full depth. 29% FLOPs reduction at matched perplexity on WikiText-103 across 13 architectural patches.
Stack: PyTorch, Transformers, NLP.
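The gating idea can be illustrated with a minimal sketch. This is not the project's code: it assumes cosine similarity between a token's state before and after a layer as the convergence signal, a hypothetical `threshold` value, and toy per-token layers with no attention (in a real decoder, exited tokens would still need to serve as keys and values for later positions).

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Layer = Callable[[Vector], Vector]  # hypothetical per-token layer (no attention)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; zero-norm inputs count as converged."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 1.0


def adaptive_depth_forward(
    tokens: Sequence[Sequence[float]],
    layers: Sequence[Layer],
    threshold: float = 0.999,  # assumed value, purely illustrative
) -> Tuple[List[Vector], List[int]]:
    """Run each token through the layers until its representation stops changing."""
    states = [list(t) for t in tokens]
    active = [True] * len(states)
    depth_used = [0] * len(states)
    for layer in layers:
        for i, state in enumerate(states):
            if not active[i]:
                continue  # this token has already exited
            new_state = layer(state)
            depth_used[i] += 1
            # Exit once consecutive representations are nearly parallel.
            if cosine_similarity(state, new_state) >= threshold:
                active[i] = False
            states[i] = new_state
    return states, depth_used


def toward_target(state: Vector) -> Vector:
    """Toy layer: move halfway toward a fixed target vector."""
    target = [1.0, 0.0]
    return [s + 0.5 * (t - s) for s, t in zip(state, target)]


final_states, depth_used = adaptive_depth_forward(
    [[1.0, 0.0], [0.0, 1.0]], [toward_target] * 4
)
print(depth_used)
```

In this toy run the first token already sits at the target and exits after one layer, while the second keeps changing and uses all four, so `depth_used` is `[1, 4]`.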
Multilingual Bixby Capsule (Samsung R&D)
Led design and integration of Meta's NLLB model into a Bixby capsule for auto-detection and translation across 200+ languages at 98% accuracy. Hosted via Docker and Hugging Face during Samsung R&D internship. First author on findings published on arXiv (arXiv:2403.05982).
Stack: NLLB, Docker, Hugging Face, Bixby.
LLM BI Assistant
Natural-language analytics agent over business databases. Ask a question in plain English, get SQL, a chart, and a narrative summary in one shot. RAG retrieval over schema docs grounds the generated SQL in the real schema, keeping hallucinated tables and columns near zero.
Stack: LangChain, RAG, FastAPI, GPT-4o, Plotly.
Clinical Doc Processor
Multi-agent LLM pipeline for pharma document intelligence — adverse-event extraction, compliance checks, and structured CRUD summaries from unstructured visit notes and Vault documents. Built around real pain points from the Novartis CRM workflow.
Stack: LLM Agents, Python, Pharma, NLP.
Time-Series Forecast API
Production forecasting service using PatchTST for demand and KPI prediction. Exposes a REST endpoint: upload a CSV, get back forecasts with confidence bands and anomaly flags. Packaged with MLflow tracking and Docker Compose deployment.
Stack: PatchTST, MLflow, FastAPI, Docker.
Experience
Associate Data Analyst — Novartis Healthcare (2024 — present)
- Driving analytics for the Pelacarsen launch — defining the target HCO/HCP universe, optimising field force structure and cost, and analysing patient and market dynamics to inform go-to-market and commercial planning.
- Designed a multi-source data validation framework (CRM, external APIs, NLP classification) that classified 90% of previously unknown pediatric HCP addresses, improved targeting coverage by 75%, mitigated $10M in regulatory risk and saved $100K in vendor costs.
- Partnered on the ZAIDYN Field Deployment rollout with ZS Associates — owned KPI design, BRD translation, system integration, testing and validation; delivered $300K in annual savings via an in-house TechOps transition.
Data Science Intern — Vivriti Capital (Jan 2024 — Jul 2024)
- Built an end-to-end pipeline to extract and analyse 10,000+ credit reports from CRISIL, ICRA, CARE Ratings and India Ratings, using NLP to lift P&L items, credit ratings and key financial metrics into standardised summaries and enabling investment-lead screening at a previously infeasible scale.
- Developed an ML-powered KYC solution combining biometric matching between live customer photos and government IDs with OCR-based cross-validation of portal-entered data against PAN card information.
R&D Intern — Samsung R&D Institute India (Dec 2022 — Jul 2023)
- Designed API systems for a Bixby capsule with auto-detection and translation across 200+ languages at 98% accuracy.
- Integrated the NLLB model via Docker and Hugging Face. First author on findings published on arXiv (arXiv:2403.05982).
BTech CSE with AI & ML — Vellore Institute of Technology, Chennai (2020 — 2024)
- CGPA 8.78.
- Specialisation in Artificial Intelligence and Machine Learning.
Skills
ML and Research
PyTorch, TensorFlow, Keras, Scikit-learn, Hugging Face Transformers, Deep Learning, NLP, Computer Vision.
Systems
Python, C++, Docker, PySpark, CUDA.
Data and Analytics
SQL, MySQL, PostgreSQL, Snowflake, FAISS, Dataiku DSS, Alteryx, Power BI.
Engineering
JavaScript, Data Structures and Algorithms, Object-Oriented Programming, Statistical Modeling.
Business and Strategy
Stakeholder Management, Business Requirement Structuring, Solution Design, KPI-driven Decision Making, Salesforce.
Contact
Building something at the intersection of ML and production? Let's talk.