• Designed and developed scalable ETL pipelines in Spark and Scala to ingest data from multiple sources (Teradata, S3) into distributed systems, ensuring efficient retrieval and timely data availability for stakeholders
• Built and optimized data transformation services that produce clean, high-quality master data for downstream applications and data models, enabling data-driven decision-making across the organization
• Collaborated with Data Science teams to deploy machine learning models at scale, integrating them with data pipelines and user-facing services to support product enhancements
• Scheduled and monitored complex workflows with Apache Airflow, improving the reliability and efficiency of ETL pipelines, reducing downtime, and ensuring timely data delivery
• Applied hands-on experience with RDBMS, NoSQL, and analytics databases, including Redis and Amazon Redshift, to deliver efficient storage and retrieval for large-scale data processing workloads
• Optimized data storage and processing for large-scale distributed systems using cloud services on AWS (S3, Redshift) and GCP (BigQuery, GCS, Dataproc)