Job Overview

Location
Islamabad, Islamabad
Job Type
Full Time
Date Posted
21 hours ago

Additional Details

Job ID
2047
Work Mode
On-site

Job Description

At Sobah Systems, we are looking for a Data Engineer with strong programming skills in Python to help us optimize and manage our growing data ecosystem. In this role, your primary focus will be on building, improving, and streamlining data pipelines to ensure data is efficiently processed, readily available, and reliable for our AI applications.

You’ll work closely with AI/ML engineers to ensure that the underlying data infrastructure seamlessly supports LLM-based and Generative AI systems, particularly within the US healthcare domain. This role is ideal for someone who enjoys optimizing complex data workflows and writing clean, scalable code to automate data operations.


Responsibilities:

  • Design, build, and optimize data pipelines and ETL workflows for structured and unstructured data.
  • Enhance and refactor existing pipelines to improve performance, reliability, and scalability.
  • Develop and maintain data transformation and integration logic using Python and SQL.
  • Automate recurring data processing and validation tasks through robust, well-documented scripts.
  • Monitor and troubleshoot data jobs to ensure timely and accurate data delivery for AI systems.
  • Collaborate with AI/ML teams to ensure pipelines efficiently serve data for training, fine-tuning, and RAG-based applications.
  • Implement data quality checks, validation layers, and observability tools for production pipelines.
  • Manage and tune databases (preferably MS SQL Server) for optimal query and storage performance.
  • Work with Azure cloud and related services for deployment, orchestration, and storage management.
  • Stay updated on modern data engineering practices, optimization techniques, and AI data infrastructure trends.


Qualifications & Skills:

  • Bachelor’s/Master’s degree in Computer Science, Data Science, or a related field.
  • 3+ years of experience in data engineering, with proven success in pipeline development and optimization.
  • Strong command of Python for data engineering, scripting, and workflow automation.
  • Advanced SQL skills, including performance tuning and query optimization.
  • Experience designing and maintaining ETL/ELT pipelines at scale.
  • Familiarity with Azure (preferred) or other major cloud platforms for data workflows.
  • Exposure to AI/ML data pipelines or LLM workflows is a plus.
  • Solid grasp of data modeling, schema design, and data warehouse architecture.
  • Excellent problem-solving and debugging abilities with a focus on performance optimization.
  • Effective communication and collaboration skills to work across technical teams.


Nice to Have:

  • Experience with workflow orchestration tools (Airflow, Prefect, Azure Data Factory).
  • Familiarity with vector databases (e.g., Pinecone, FAISS, Weaviate).
  • Understanding of healthcare data regulations and concepts (e.g., HIPAA, PHI).


Why Join Us?

  • Work on high-impact data systems that directly support advanced AI solutions.
  • Collaborate with a forward-thinking AI team driving innovation in healthcare technology.
  • Grow your career at the intersection of data engineering and AI infrastructure.
  • Contribute to real-world AI readiness through efficient, optimized data delivery.


Apply at [email protected]

