PV Abhiram — Data Analyst and Machine Learning Engineer

From research to production — I build ML systems that work in the real world.

Data analyst at Novartis Healthcare and ML researcher based in Hyderabad, India. I work across the full stack — from CUDA kernels and transformer architectures to CRM platforms and business analytics. BTech CSE (AI & ML), Vellore Institute of Technology Chennai, CGPA 8.78. Open to roles.

Projects

CLIP-to-EdgeNeXt Distillation

Distilled CLIP ViT-B/16 into EdgeNeXt-XX-Small (approximately 66x compression) for on-device crop disease detection across 38 disease classes. 3-stage curriculum with a text-prototype-anchored distillation loss. Targets real-world inference on resource-constrained agricultural hardware.

Stack: CLIP, PyTorch, Edge AI, Knowledge Distillation.

Neurosymbolic AI: Beyond LLM-to-Solver Pipelines

Co-authored a position paper arguing that translation, not symbolic inference, is the bottleneck in LLM+solver systems. Evidence from an 85-paper review, multi-scale probing of GPT-2 and DistilBERT, and Phi-2 and Qwen2.5-3B benchmarks on FOLIO with the Z3 SMT solver.

Stack: Position Paper, GPT-2, Phi-2, FOLIO, Z3 SMT.

Adaptive Depth Transformer

Decoder-only transformer that allocates compute per token via representational convergence gating: simple tokens exit early, while complex ones run the full depth. 29% FLOPs reduction with matched perplexity on WikiText-103 across 13 architectural patches.

Stack: PyTorch, Transformers, NLP.
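
For readers unfamiliar with convergence gating, here is a minimal sketch of the idea for a single token. It is an illustration under stated assumptions, not the project's actual code: the cosine-similarity criterion, the 0.99 threshold, and the names cosine and layers_used are all hypothetical choices made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def layers_used(hidden_states, threshold=0.99):
    """Return how many layers a token runs before its representation converges.

    hidden_states: one vector per layer output for a single token (hypothetical input).
    threshold: assumed similarity above which the token is treated as converged.
    """
    for i in range(1, len(hidden_states)):
        # If this layer barely changed the representation, stop here (early exit).
        if cosine(hidden_states[i - 1], hidden_states[i]) >= threshold:
            return i + 1
    # No convergence detected: the token runs the full depth.
    return len(hidden_states)


# Toy data: the first token stabilises after two layers, the second never does.
easy_token = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0]]
hard_token = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
print(layers_used(easy_token), layers_used(hard_token))
```

In a real model the gate would run inside the forward pass on batched tensors, and the threshold would be tuned against perplexity; this sketch only shows the exit decision itself.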

Multilingual Bixby Capsule (Samsung R&D)

Led design and integration of Meta's NLLB model into a Bixby capsule for auto-detection and translation across 200+ languages at 98% accuracy. Hosted via Docker and Hugging Face during a Samsung R&D internship. First author on findings published on arXiv (arXiv:2403.05982).

Stack: NLLB, Docker, Hugging Face, Bixby.

LLM BI Assistant

Natural-language analytics agent over business databases. Ask a question in plain English, get SQL, a chart, and a narrative summary in one shot. RAG retrieval over schema docs keeps hallucinations near zero.

Stack: LangChain, RAG, FastAPI, GPT-4o, Plotly.

Clinical Doc Processor

Multi-agent LLM pipeline for pharma document intelligence — adverse-event extraction, compliance checks, and structured CRUD summaries from unstructured visit notes and Vault documents. Built around real pain points from the Novartis CRM workflow.

Stack: LLM Agents, Python, Pharma, NLP.

Time-Series Forecast API

Production forecasting service using PatchTST for demand and KPI prediction. Exposes a REST endpoint: upload a CSV, get back forecasts with confidence bands and anomaly flags. Packaged with MLflow tracking and Docker Compose deployment.

Stack: PatchTST, MLflow, FastAPI, Docker.

Experience

Associate Data Analyst — Novartis Healthcare (2024 — present)

  • Driving analytics for the Pelacarsen launch — defining the target HCO/HCP universe, optimising field force structure and cost, and analysing patient and market dynamics to inform go-to-market and commercial planning.
  • Designed a multi-source data validation framework (CRM, external APIs, NLP classification) that classified 90% of previously unknown pediatric HCP addresses, improved targeting coverage by 75%, mitigated $10M in regulatory risk and saved $100K in vendor costs.
  • Partnered on the ZAIDYN Field Deployment rollout with ZS Associates — owned KPI design, BRD translation, system integration, testing and validation; delivered $300K in annual savings via an in-house TechOps transition.

Data Science Intern — Vivriti Capital (Jan 2024 — Jul 2024)

  • Built an end-to-end pipeline to extract and analyse 10,000+ credit reports from CRISIL, ICRA, CARE Ratings and India Ratings, using NLP to lift P&L items, credit ratings and key financial metrics into standardised summaries. This enabled investment-lead screening at a previously infeasible scale.
  • Developed an ML-powered KYC solution combining biometric matching between live customer photos and government IDs with OCR-based cross-validation of portal-entered data against PAN card information.

R&D Intern — Samsung R&D Institute India (Dec 2022 — Jul 2023)

  • Designed API systems for a Bixby capsule with auto-detection and translation across 200+ languages at 98% accuracy.
  • Integrated the NLLB model via Docker and Hugging Face. First author on findings published on arXiv (arXiv:2403.05982).

BTech CSE with AI & ML — Vellore Institute of Technology, Chennai (2020 — 2024)

  • CGPA 8.78.
  • Specialisation in Artificial Intelligence and Machine Learning.

Skills

ML and Research

PyTorch, TensorFlow, Keras, Scikit-learn, Hugging Face Transformers, Deep Learning, NLP, Computer Vision.

Systems

Python, C++, Docker, PySpark, CUDA.

Data and Analytics

SQL, MySQL, PostgreSQL, Snowflake, FAISS, Dataiku DSS, Alteryx, Power BI.

Engineering

JavaScript, Data Structures and Algorithms, Object-Oriented Programming, Statistical Modeling.

Business and Strategy

Stakeholder Management, Business Requirement Structuring, Solution Design, KPI-driven Decision Making, Salesforce.

Contact

Building something at the intersection of ML and production? Let's talk.

  • Email: abhiramp428@gmail.com
  • LinkedIn: linkedin.com/in/pvabhiram
  • GitHub: github.com/Abhiram970
  • Publications: arXiv:2403.05982 — multilingual translation, Samsung R&D