Job Overview

Location
Karachi, Sindh
Job Type
Full Time
Date Posted
13 hours ago

Additional Details

Job ID
1454
Work Mode
On-site
Degree Requirement
Bachelor's
Years of Experience
2

Job Description

We are looking for a hands-on Data Engineer to join our team and lead the development of robust data pipelines powering the IllumiFi Analytics platform. This role involves working with cutting-edge, cloud-native tools to design, implement, and optimize serverless data solutions on AWS. Candidates with experience in Python, PySpark, and modern data lakehouse patterns who thrive in fast-paced environments will find this an excellent opportunity to build impactful solutions from the ground up.

Key Responsibilities:

- Design and implement scalable, serverless data pipelines using AWS services such as Glue, Lambda, Step Functions, Athena, and S3.
- Build and optimize ETL processes in Python and PySpark to ensure efficient data transformations.
- Manage ingestion and processing of structured and semi-structured data formats, including JSON, Parquet, and CSV.
- Integrate with external REST APIs using OAuth2 authentication.
- Model analytical datasets using star and snowflake schemas, implement Slowly Changing Dimension (SCD) Type 2 logic, and define partitioning strategies to improve query performance.
- Develop and maintain a well-organized S3-based data lake to support analytics workloads.
- Apply best practices for version control, data quality, and automation through GitHub workflows.
- Lead technical design discussions and contribute to continuous improvement of infrastructure, tooling, and data processes.

Required Qualifications:

- At least 2 years of experience in data engineering.
- Strong hands-on expertise with AWS services, particularly Glue, Lambda, S3, Step Functions, and Athena.
- Proficiency in Python and PySpark for building and managing ETL pipelines.
- A proven track record of working with large-scale datasets and optimizing data pipeline performance.
- A deep understanding of data modeling, data transformation, and SQL for analytical use cases.
- Familiarity with REST APIs, OAuth2 authentication, and data integration patterns.
- Experience with Git, CI/CD pipelines, and workflow automation.
- Strong problem-solving skills and a passion for scalable, efficient data engineering.
- The ability to work independently while collaborating effectively across teams.

Preferred Qualifications and Additional Skills:

- Experience with Apache Iceberg or Apache Hudi for data lakehouse architectures.
- Knowledge of data quality frameworks such as Great Expectations.
- Familiarity with Terraform or other Infrastructure as Code (IaC) tools for infrastructure automation.
- A background in e-commerce or analytics platforms.
- Exposure to performance tuning for large data workloads.
- Hands-on experience with the AWS Step Functions HTTP Task.

This is a full-time, in-person position based in Karachi. A Bachelor’s degree is required. If you are motivated to contribute to a dynamic team and build scalable data solutions, we encourage you to apply.
