Data Engineer · Denver, CO
Sai Praneeth Vella
ETL & ELT Pipelines | Python & SQL | Spark · Kafka · dbt | AWS · GCP · Azure
Data Engineer with 2+ years of software engineering experience building scalable data pipelines, cloud-based data infrastructure, and real-time streaming systems. M.S. in Information & Communications Technology — University of Denver (Aug 2025).
2+
Years Experience
7–10
OTT Platforms Integrated
3
Cloud Platforms
50K+
Test Records Processed
Expertise
Technical Skills
Programming & Query
Data Engineering
Cloud Platforms
Databases & Warehouses
Orchestration & DevOps
Front-End & APIs
Work History
Experience
YuppTV India Pvt. Ltd.
Software Engineer — Videograph.ai
Videograph.ai — AI-powered video intelligence SaaS platform processing large-scale media metadata and streaming analytics for major Indian OTT broadcasters. · Hyderabad, India
- Architected and deployed real-time data ingestion pipelines using Google Firebase and PubNub event streaming — supporting data-driven decisions across audiences of 1K–10K active viewers.
- Drove successful data pipeline integrations with 7–10 major Indian OTT platforms, expanding Videograph.ai's client footprint and contributing to enterprise revenue growth.
- Built end-to-end analytics dashboards in React.js / Next.js surfacing critical KPIs — buffering rates, viewer engagement, drop-off metrics — enabling business teams to identify monetization opportunities and reduce churn.
- Optimized API response handling and implemented front-end caching strategies, improving platform responsiveness and directly enhancing user retention for client-facing analytics tools.
- Standardized data integration documentation and API contracts across all client integrations, reducing partner onboarding time and cutting engineering support overhead.
- Mentored 3–5 interns and junior engineers on data integration patterns and pipeline best practices, increasing sprint delivery capacity without additional headcount cost.
Portfolio
Projects
PROJECT_01
Real-Time E-Commerce Data Pipeline & Analytics Platform
End-to-end streaming pipeline built locally with Docker — ingests simulated e-commerce events through Kafka, transforms with PySpark, and serves analytics via Snowflake + dbt.
Python · Apache Kafka · PySpark
Snowflake · dbt · AWS S3
AWS EMR · Airflow
- Kafka ingestion pipeline consuming simulated e-commerce events — validated end-to-end in local Docker environment
- PySpark jobs benchmarked against Pandas/SQL baselines — consistently faster on 50K–100K record datasets
- Snowflake star schema with dbt — sub-second query performance on loaded test datasets
- Airflow DAGs with retry logic and alerting — full pipeline runs successfully end-to-end locally
- dbt schema + singular tests catching nulls, duplicate keys, and referential integrity issues upstream
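The data-quality rules the dbt tests enforce — null keys, duplicate keys, referential integrity — can be sketched in plain Python. This is an illustrative stand-in, not the project's dbt code; the field names (`order_id`, `customer_id`) are assumed for the example:

```python
# Illustrative sketch of the data-quality checks the pipeline's dbt
# schema/singular tests enforce: null keys, duplicate keys, and
# referential integrity, applied to simulated e-commerce order events.
from typing import Any


def find_quality_issues(
    orders: list[dict[str, Any]],
    known_customer_ids: set[str],
) -> dict[str, list[str]]:
    """Group offending order_ids by the data-quality rule they violate."""
    issues: dict[str, list[str]] = {
        "null_key": [],
        "duplicate_key": [],
        "orphan_customer": [],
    }
    seen: set[str] = set()
    for row in orders:
        oid = row.get("order_id")
        if oid is None:
            # not-null test: every fact row needs a primary key
            issues["null_key"].append(repr(row))
            continue
        if oid in seen:
            # unique test: primary key must not repeat
            issues["duplicate_key"].append(oid)
        seen.add(oid)
        if row.get("customer_id") not in known_customer_ids:
            # relationships test: foreign key must exist in the dimension
            issues["orphan_customer"].append(oid)
    return issues


orders = [
    {"order_id": "o1", "customer_id": "c1"},
    {"order_id": "o1", "customer_id": "c1"},  # duplicate key
    {"order_id": None, "customer_id": "c2"},  # null key
    {"order_id": "o2", "customer_id": "c9"},  # unknown customer
]
report = find_quality_issues(orders, known_customer_ids={"c1", "c2"})
# report["duplicate_key"] == ["o1"]; report["orphan_customer"] == ["o2"]
```

In dbt these same rules are declared as `unique`, `not_null`, and `relationships` tests in the schema YAML, so they run on every pipeline execution rather than ad hoc.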
PROJECT_02
Multi-Cloud Patient Health Data Warehouse & ETL Framework
ETL framework built with synthetic patient data (Faker) — consolidates multi-source records into a Redshift data warehouse with a fully documented multi-cloud orchestration architecture.
Python · PostgreSQL · MongoDB
Amazon Redshift · GCP Dataflow
Azure Data Factory · dbt
- Python ETL consolidating 4 source schemas (PostgreSQL, MongoDB, REST APIs) into a single canonical model
- Redshift star schema — validated query performance across fact & dimension tables on synthetic datasets
- Multi-cloud blueprint: GCP Dataflow + Azure Data Factory — fully documented with pipeline diagrams
- Field-level PII masking (name, DOB, SSN) using AWS KMS-aligned encryption patterns
- GCP Pub/Sub alerting module — tested with simulated failure injection scenarios
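The field-level masking pattern can be illustrated with deterministic HMAC tokenization. This is a minimal sketch, not the project's implementation: a local secret stands in for a KMS-managed data key so the example stays self-contained, and the field names are assumptions:

```python
# Illustrative sketch of field-level PII masking: sensitive fields
# (name, DOB, SSN) are replaced with stable HMAC-SHA256 tokens so
# records stay joinable without exposing raw values. A hard-coded
# key stands in for a KMS-managed data key in this demo.
import hashlib
import hmac

PII_FIELDS = ("name", "dob", "ssn")


def mask_record(record: dict, key: bytes) -> dict:
    """Return a copy of the record with PII fields tokenized."""
    masked = dict(record)
    for field in PII_FIELDS:
        if masked.get(field) is not None:
            digest = hmac.new(key, str(masked[field]).encode(), hashlib.sha256)
            masked[field] = digest.hexdigest()[:16]  # truncated, stable token
    return masked


key = b"local-demo-key"  # stand-in for a KMS data key
patient = {
    "patient_id": "p-100",
    "name": "Jane Doe",
    "dob": "1990-01-01",
    "ssn": "123-45-6789",
}
masked = mask_record(patient, key)
# patient_id survives unchanged; name/dob/ssn become 16-char tokens
```

Deterministic tokens (same input, same key, same token) preserve join keys across sources while keeping the raw values out of the warehouse; rotating the key invalidates all tokens at once.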
Academic Background
Education
Master of Science — Information & Communications Technology
University of Denver (DU) · Denver, CO
Coursework: Database Systems, Cloud Computing, Big Data Analytics, Algorithms & Data Structures, Software Architecture, Data Modeling
Bachelor of Science — Electronics & Communication Technology
Vidya Jyothi Institute of Technology (VJIT) · Hyderabad, India
Capstone: Smart Healthcare System — IoT-based patient vitals monitoring with real-time web dashboard. 🏆 2nd Place, Project Expo.
Get In Touch
Contact
Let's build something great.
Open to mid-level Data Engineer roles. I bring hands-on project experience with Spark, Kafka, dbt, Snowflake, Redshift, and multi-cloud pipeline architecture across AWS, GCP, and Azure — backed by Google, AWS, and dbt certifications.