Ketan Hadkar

Aspiring Data Engineer | Databricks & AWS Certified

LinkedIn | GitHub | Email | Phone

[email protected]

9326496156

Mumbai, IN

About

Highly motivated early-career Data Engineer with a strong foundation in building and optimizing scalable ETL pipelines and data warehouses on AWS. Proven ability to leverage technologies like Apache Spark, Python, and SQL to process multi-format data, automate workflows, and deliver actionable insights. Eager to apply certified cloud and big data expertise to drive data-driven solutions in a dynamic environment.

Work Experience

Data Engineer Intern

Mactores

Mar 2025 - Present

Mumbai, Maharashtra, India

  • Gaining exposure to Apache Spark, AWS, and Databricks through internship training.
  • Earned the Databricks Certified Data Engineer Associate and AWS Certified Cloud Practitioner certifications, demonstrating skills in data ingestion, transformation, and cloud workflows.
  • Built and tested ETL pipelines in personal projects using Spark, Airflow, and key AWS services.
  • Explored and implemented pipeline design strategies such as incremental loading, partitioning, and orchestration to improve efficiency and scalability (see the orchestration sketch below).
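
The orchestration mentioned in the last bullet could look roughly like the Airflow sketch below. The DAG id, schedule, and extract/transform/load callables are illustrative assumptions, not the actual internship code; the schedule keyword assumes Airflow 2.4+.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Placeholder: pull rows added or changed since the last successful run.
    pass


def transform(**context):
    # Placeholder: clean and repartition the extracted data with Spark.
    pass


def load(**context):
    # Placeholder: upsert the transformed batch into S3 / the warehouse layer.
    pass


with DAG(
    dag_id="incremental_etl_sketch",  # hypothetical name
    start_date=datetime(2025, 1, 1),
    schedule="@daily",  # assumes Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Simple linear dependency: extract -> transform -> load.
    extract_task >> transform_task >> load_task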

Education

Electronics & Telecommunication Engineering (EXTC)

Vasantdada Patil Pratishthan's College of Engineering & Visual Arts, Mumbai University

7.85 CGPA

Aug 2020 - May 2024

Certificates

AWS Certified Cloud Practitioner

Amazon Web Services (AWS)

Aug 2025

Databricks Certified Data Engineer Associate

Databricks

Jul 2025

Projects

ETL Data Pipeline and Warehouse Implementation using AWS

Designed and implemented a robust, scalable ETL pipeline and data warehouse on AWS, integrating multi-format data for advanced analytics and reporting.
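
A minimal PySpark sketch of the multi-format ingestion pattern this project describes. The paths, formats, column names, and partition key are hypothetical placeholders, not the project's actual code.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("multi_format_etl_sketch").getOrCreate()

# Read raw sources that arrive in different formats (paths are placeholders).
orders = spark.read.option("header", True).csv("s3a://example-bucket/raw/orders/")
events = spark.read.json("s3a://example-bucket/raw/events/")

# Light standardisation before landing in the warehouse layer.
orders = orders.withColumnRenamed("order_dt", "order_date")

# Partitioned Parquet lets downstream warehouse queries prune by date.
orders.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3a://example-bucket/curated/orders/"
)
events.write.mode("overwrite").parquet("s3a://example-bucket/curated/events/")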

Built Automated Incremental ETL Pipeline

Designed and automated an ETL pipeline using Apache Spark, MySQL, S3, and Airflow, incorporating full-load and incremental strategies with upsert functionality. Optimized performance through advanced partitioning and mirroring, reducing processing time by 40% for large-scale datasets and enabling faster data retrieval and insights.
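
A minimal sketch of the incremental upsert strategy described above, implemented here as a plain PySpark anti-join merge. The table paths, the customer_id key, and the shared-schema assumption are hypothetical; a production pipeline might use a Delta Lake MERGE instead.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("incremental_upsert_sketch").getOrCreate()

# Current curated table and the new incremental batch (paths are placeholders,
# and both DataFrames are assumed to share the same schema).
target = spark.read.parquet("s3a://example-bucket/curated/customers/")
incoming = spark.read.parquet("s3a://example-bucket/staging/customers_delta/")

# Keep only target rows whose key does NOT appear in the incoming batch,
# then append the batch: the net effect is an upsert on customer_id.
unchanged = target.join(incoming.select("customer_id"), on="customer_id", how="left_anti")
merged = unchanged.unionByName(incoming)

# Write to a fresh location: Spark reads lazily, so overwriting the path that
# is still being read would corrupt the source. A path swap or Delta MERGE
# handles this atomically in a production setup.
merged.write.mode("overwrite").parquet("s3a://example-bucket/curated/customers_upserted/")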

Skills

Programming Languages

  • Python
  • SQL
  • PySpark

Databases

  • MySQL
  • MongoDB

Big Data & ETL Frameworks

  • Apache Spark
  • Hadoop
  • Apache Airflow
  • Databricks

Developer Tools & Concepts

  • Git
  • Docker
  • VS Code
  • PyCharm
  • IntelliJ
  • Eclipse
  • SFTP Server
  • DBeaver
  • YAML
  • Linux (Ubuntu)

Cloud Platforms & Services

  • AWS
  • Databricks