Job Title: Infrastructure Engineer (Azure / Kubernetes / Docker / Terraform)
Location: Remote / 9am-5pm Eastern Time (Toronto)
Department: Product and Engineering
Overview
We are seeking an experienced Infrastructure Engineer to design, implement, and manage scalable, secure, and automated infrastructure for our AI-driven platforms.
As part of our engineering team, you will help build and maintain cloud-native environments on Microsoft Azure using Terraform, Kubernetes, and Docker, ensuring the performance, reliability, and scalability of the systems that power our AI products and services.
This role offers the opportunity to work with cutting-edge infrastructure technologies that support advanced AI workloads and data-driven applications.
Key Responsibilities
- Design, deploy, and manage Azure cloud infrastructure using Terraform for Infrastructure as Code (IaC).
- Build and maintain Kubernetes clusters (AKS) for scalable, containerized AI and data workloads.
- Create, manage, and optimize Docker images, registries, and container deployments.
- Develop and enhance CI/CD pipelines (GitHub Actions, Azure DevOps, or Jenkins) for infrastructure and application delivery.
- Implement monitoring, logging, and alerting solutions using tools such as Prometheus, Grafana, Azure Monitor, or the ELK stack.
- Collaborate with software, data, and ML engineering teams to ensure infrastructure reliability and seamless deployment of AI applications.
- Apply security, compliance, and cost optimization best practices across environments.
- Troubleshoot complex infrastructure and networking issues across production and staging systems.
- Contribute to disaster recovery, scalability, and performance optimization strategies.
Required Skills & Experience
- 3+ years of experience as an Infrastructure Engineer, DevOps Engineer, or Cloud Engineer.
- Proven experience managing environments on Microsoft Azure (AKS, Networking, Load Balancing, Storage, IAM).
- Advanced proficiency with Terraform (modules, workspaces, state management, CI/CD integration).
- Strong experience with Kubernetes (Helm, ingress controllers, autoscaling, secrets, namespaces).
- Proficiency with Docker (container lifecycle management, image optimization, private registries).
- Experience designing and maintaining CI/CD pipelines for cloud-based infrastructure.
- Scripting or automation experience with Bash or PowerShell.
- Solid understanding of networking, observability, and security best practices in cloud-native environments.
Nice-to-Have
- Experience with Docker Compose and Docker Swarm.
- Familiarity with Ansible, Chef, or Puppet for configuration management.
- Exposure to JavaScript or TypeScript for integration with application or frontend systems.
- Experience managing or deploying MongoDB in cloud or containerized environments.
- Familiarity with vector databases (e.g., Pinecone, Weaviate, Milvus, Qdrant) or AI infrastructure tools (e.g., MLflow, Weights & Biases, Ray, LangChain, OpenAI API).
- Knowledge of service mesh technologies (e.g., Istio, Linkerd).
- Exposure to DevSecOps and SRE principles.
- Relevant certifications such as Microsoft Certified: Azure Administrator Associate or Microsoft Certified: Azure Solutions Architect Expert.
What We Offer
- Competitive salary.
- Flexible remote work environment.
- Opportunity to work on infrastructure supporting cutting-edge AI and machine learning workloads.
- Collaborative, innovation-driven engineering culture.