Portfolio

Kiran Paritala

Data Engineer & AI/ML Engineer by craft.

Building TB-scale pipelines, real-time Kafka streaming systems, and ML-powered data products across AWS, GCP, and Azure.

60–70%
Query Cost Reduction
TB+
Scale Pipelines
4
Certifications
2+
Years Production
Get in touch ↗
About Me

Engineer by craft

I'm Kiran Paritala, a Data Engineer and AI/ML Engineer with an MS in Computer Science from Cleveland State University (May 2026). I build systems that move data reliably at scale and layer intelligence on top of it. At Airtel, India's largest telecom, I spent more than two years (Oct 2021 to Feb 2024) engineering TB-scale ETL pipelines, Medallion Architecture across Azure and GCP, and real-time Kafka streaming systems. During my MS, I built four production-grade data and AI/ML projects: a Water Quality ML Pipeline, a zero-cost RAG chatbot, an autonomous LangGraph AI agent, and a Retail BI pipeline serving Power BI dashboards.

I don't wait for problems to escalate — I look for them before they do.

At Airtel, query costs were quietly climbing across reporting systems. Nobody flagged it. I dug into execution plans, identified missing partitioning and clustering strategies, and rebuilt the table architecture — cutting scan costs by 60–70% without touching a single downstream report.

When I built the Ask Athreya AI agent, the model kept hallucinating column names. The obvious fix was prompt engineering. The real fix was injecting the actual dataset schema into the system prompt at startup — eliminating the entire class of errors in one structural change.

Data problems are rarely about the data. They're about the assumptions built into the system that nobody questioned.
Projects · Portfolio
Personal Project · AWS · GCP

Water Quality ML Pipeline

1 hr
Predictive Alert
60–70%
BQ Cost Saved
TB+
Pipeline Scale

Medallion Architecture IoT pipeline with Isolation Forest anomaly detection (Δ=0.78) and a global LSTM model predicting threshold breaches 1 hour early. Cuts BigQuery cost by 60–70% on TB-scale data; writes use idempotent MERGE statements behind a 5-check quality gate.

PySpark · TensorFlow · GCP BigQuery · Airflow · LSTM
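
The idempotent-write idea can be illustrated with a small, self-contained sketch. This is not the project's code: it uses SQLite's upsert as a stand-in for a BigQuery MERGE, and the `readings` table, its columns, and the sample rows are hypothetical. The point it shows is that replaying a batch keyed on (sensor, timestamp) leaves the table unchanged, while a late correction overwrites the old value.

```python
import sqlite3


def merge_readings(conn, rows):
    """Upsert readings keyed on (sensor_id, ts), so re-running a batch is safe."""
    conn.executemany(
        """
        INSERT INTO readings (sensor_id, ts, ph)
        VALUES (?, ?, ?)
        ON CONFLICT (sensor_id, ts) DO UPDATE SET ph = excluded.ph
        """,
        rows,
    )
    conn.commit()


# Hypothetical table standing in for a warehouse target table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    "sensor_id TEXT, ts TEXT, ph REAL, PRIMARY KEY (sensor_id, ts))"
)

batch = [("s1", "2025-01-01T00:00", 7.1), ("s2", "2025-01-01T00:00", 6.8)]
merge_readings(conn, batch)
merge_readings(conn, batch)  # replaying the same batch adds no duplicates

# A late correction for an existing key updates the row in place.
merge_readings(conn, [("s1", "2025-01-01T00:00", 7.3)])

row_count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
s1_ph = conn.execute(
    "SELECT ph FROM readings WHERE sensor_id = 's1'"
).fetchone()[0]
print(row_count, s1_ph)
```

Keying the write on a natural identifier is what makes retries harmless: a failed Airflow task can simply be re-run without a cleanup step.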
Personal Project · Local · Zero API Cost

RAG-Reader

$0
API Cost
Local
Embeddings
Multi‑Q
Retrieval

RAG-powered chatbot that reads any PDF or Word document and answers questions using Retrieval Augmented Generation — fully locally at zero API cost. HuggingFace all-MiniLM-L6-v2 runs on-device with no disk writes. Multi-query retrieval expands vague queries into semantically related variants for better recall on ambiguous inputs. A custom prompt template extracts exact names, dates, and skills from retrieved context rather than returning empty responses.

LangChain · HuggingFace · Groq · RAG
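
A minimal sketch of multi-query retrieval, under loud assumptions: shared-word counting stands in for embedding similarity, the query variants are hard-coded (in a real pipeline an LLM would generate them), and all function names and sample chunks are hypothetical. It shows why the union of results from several variants can recover context that a single vague query misses.

```python
import re


def tokenize(text):
    """Lowercase word set with punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(query, chunk):
    """Toy relevance: shared-word count (stand-in for embedding similarity)."""
    return len(tokenize(query) & tokenize(chunk))


def retrieve(query, chunks, k=2):
    """Top-k chunks for one query, dropping chunks with no overlap at all."""
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    return [c for c in ranked[:k] if score(query, c) > 0]


def multi_query_retrieve(variants, chunks, k=2):
    """Union of per-variant results, de-duplicated, first-seen order kept."""
    seen, merged = set(), []
    for query in variants:
        for chunk in retrieve(query, chunks, k):
            if chunk not in seen:
                seen.add(chunk)
                merged.append(chunk)
    return merged


chunks = [
    "Skills: Python, PySpark, Airflow",
    "Education: MS in Computer Science",
    "Worked as a software engineer from 2021 to 2024",
]

vague_query = "what is their background"
# Hypothetical rewrites of the vague query.
variants = [vague_query, "education degree", "worked experience engineer"]

single = retrieve(vague_query, chunks)
results = multi_query_retrieve(variants, chunks)
print(single)
print(results)
```

The vague query alone matches nothing here, while the expanded set surfaces both the education and the work-history chunks.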
Personal Project · LangGraph · AI Agent

Ask Athreya — AI Data Analyst

34
Pytest Tests
4
Pandas Tools
Multi
Turn Memory

AI agent built with LangChain and LangGraph that answers plain-English questions about CSV/Excel files using 4 custom pandas tools selected autonomously. Eliminated column-name hallucination by injecting the actual dataset schema into the system prompt at startup. Multi-turn memory via LangGraph checkpointing so follow-up questions like "list them" work correctly. 34-test pytest suite plus a separate eval-harness that caught a false-positive bug unit tests missed entirely.

LangChain · LangGraph · Groq Llama 3.3 · pandas
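
The schema-injection fix can be sketched in a few lines. This is an illustration, not the agent's actual code: it reads the header with the standard-library `csv` module rather than pandas, and `build_system_prompt`, the prompt wording, and the sample columns are all hypothetical.

```python
import csv
import io


def build_system_prompt(csv_text):
    """Read the header row and pin the real column names into the prompt."""
    reader = csv.reader(io.StringIO(csv_text))
    columns = next(reader)
    column_list = "\n".join(f"- {name}" for name in columns)
    return (
        "You are a data analyst. Use ONLY these column names, spelled exactly:\n"
        f"{column_list}\n"
        "If a question refers to a column that is not listed, say so "
        "instead of guessing."
    )


# Hypothetical dataset; in practice the text would come from the uploaded file.
sample_csv = "order_id,customer_name,total_amount\n1,Ada,19.99\n"
prompt = build_system_prompt(sample_csv)
print(prompt)
```

Because the prompt is built once at startup from the file itself, the model never has to infer column names from the user's phrasing.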
Personal Project · ETL · Business Intelligence

Retail Product Performance Dashboard

Daily 6AM
Airflow DAG
5 KPIs
Revenue Tracked
4 Sources
Ingested

Daily ETL pipeline ingesting retail order data from PostgreSQL, REST APIs, and S3 CSV files into Databricks. PySpark Medallion Architecture (Bronze→Silver→Gold) with dbt staging and mart models calculating monthly revenue, MoM growth, revenue rank, and top 10 products. Data quality framework covers null checks, duplicate detection, and referential integrity validation. Orchestrated with Apache Airflow at 6AM daily with retry logic and failure alerting. Analytics-ready data loaded to Snowflake and BigQuery — served via Power BI semantic models.

PySpark · dbt · Airflow · Databricks · Snowflake
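
The three data quality checks named above (nulls, duplicates, referential integrity) can be sketched in plain Python. This is a simplified illustration rather than the pipeline's PySpark or dbt implementation; the function names, column names, and sample rows are hypothetical.

```python
def null_check(rows, column):
    """Rows where a required column is missing a value."""
    return [r for r in rows if r.get(column) is None]


def duplicate_check(rows, key):
    """Rows whose key value was already seen earlier in the batch."""
    seen, dupes = set(), []
    for r in rows:
        if r[key] in seen:
            dupes.append(r)
        seen.add(r[key])
    return dupes


def referential_check(rows, fk, valid_keys):
    """Rows whose foreign key has no match in the parent table."""
    return [r for r in rows if r[fk] not in valid_keys]


# Hypothetical sample data.
products = {"p1", "p2"}
orders = [
    {"order_id": 1, "product_id": "p1", "amount": 10.0},
    {"order_id": 2, "product_id": "p9", "amount": 5.0},   # unknown product
    {"order_id": 2, "product_id": "p2", "amount": None},  # duplicate id, null amount
]

failures = {
    "null_amount": null_check(orders, "amount"),
    "duplicate_order_id": duplicate_check(orders, "order_id"),
    "orphan_product_id": referential_check(orders, "product_id", products),
}
gate_passed = not any(failures.values())
print(gate_passed, {name: len(bad) for name, bad in failures.items()})
```

Returning the offending rows, rather than a bare pass/fail flag, makes it easier to attach useful detail to a failure alert.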
Expertise

Technical stack

2+ years across multi-cloud platforms, big-data tooling, and AI/ML frameworks.

📥
Ingest
Apache Kafka · Kafka Connect · REST APIs · Kafka CDC · MySQL · PostgreSQL
Process
PySpark · Python · Apache Spark · pandas · Databricks · Apache Airflow · AWS Glue
🏔
Store
Delta Lake · Apache Iceberg · Snowflake · BigQuery · AWS Redshift · Azure Synapse · Avro
🛡
Govern
dbt · Great Expectations · MLflow · DataHub · Terraform · Grafana · Git / GitHub
🧠
AI / ML
LangChain · LangGraph · TensorFlow · HuggingFace · scikit-learn · Groq · RAG
☁ Cloud Platforms
AWS: S3 · Glue · Redshift
GCP: BigQuery · Dataflow
Azure: Synapse · Data Lake Gen2
Databricks
Power BI · Tableau
Background

Work experience

Graduate Teaching Assistant
Cleveland State University · Cleveland, OH
Jan 2026 – May 2026
  • Led weekly SQL and Python lab sessions for 50+ students covering query optimization, complex joins, and real-world data engineering workflows.
  • Held office hours diagnosing SQL and Python logic errors; collaborated with professor to improve overall class performance.
Software Engineer — Data Engineering
Airtel · India
Oct 2021 – Feb 2024
  • Engineered TB-scale ETL/ELT pipelines in Python and PySpark ingesting from MySQL, PostgreSQL, Kafka CDC, REST APIs into BigQuery, Redshift, and Snowflake.
  • Architected Medallion Architecture (Bronze/Silver/Gold) achieving 60–70% query scan cost reduction via partitioning, clustering, and Z-ordering.
  • Built real-time Kafka streaming pipelines with Spark Structured Streaming for CDC replication into Delta Lake and Apache Iceberg.
  • Provisioned cloud infrastructure using Terraform across AWS, GCP, and Databricks — eliminating manual provisioning and reducing environment drift.
  • Built Grafana dashboards and Airflow SLA alerts routed to Slack and PagerDuty; participated in on-call incident response.
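
Why partitioning reduces scan cost can be shown with a toy model. The sketch below is illustrative only: the fixed row size, the synthetic ten-day table, and all function names are assumptions, and the resulting percentage is an artifact of the made-up data rather than a measurement from any real system.

```python
from collections import defaultdict

ROW_BYTES = 100  # assumed fixed row size, for illustration only


def partition_by_day(rows):
    """Group rows into per-day partitions, as a date-partitioned table would."""
    parts = defaultdict(list)
    for row in rows:
        parts[row["day"]].append(row)
    return parts


def bytes_scanned_full(rows):
    """Unpartitioned table: a date filter still reads every row."""
    return len(rows) * ROW_BYTES


def bytes_scanned_pruned(parts, wanted_days):
    """Partitioned table: only partitions matching the filter are read."""
    return sum(len(parts[d]) for d in wanted_days if d in parts) * ROW_BYTES


# Synthetic table: 10 days with 100 rows each.
rows = [
    {"day": f"2024-01-{d:02d}", "value": i}
    for d in range(1, 11)
    for i in range(100)
]
parts = partition_by_day(rows)

full = bytes_scanned_full(rows)
pruned = bytes_scanned_pruned(parts, ["2024-01-01", "2024-01-02"])
saving = 1 - pruned / full
print(full, pruned, saving)
```

Clustering and Z-ordering apply the same principle within a partition, letting the engine skip blocks whose value ranges cannot match the filter.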
MS in Computer Science
Cleveland State University · Cleveland, OH
May 2026

Get in touch

Let's build
something great.

Open to full-time Data Engineer & AI/ML Engineer roles. Let's connect and build something that scales.

Kiran Paritala © 2026 kiranparitala.dev