Neodustria is building the first full PaaS for industrial innovators (automotive, railway, naval, aviation, aerospace, and beyond). Our platform streamlines the full lifecycle from 3D design to advanced AI-driven simulations and market intelligence. We are at the intersection of high-performance computing, cloud-native technologies, and cutting-edge AI.
We are looking for a DevOps/Cloud Engineer who is passionate about building and scaling the infrastructure that powers a new generation of industrial engineering. You will be the backbone of our platform, ensuring our microservices and AI models run seamlessly, securely, and at scale on a global, GPU-accelerated infrastructure.
Your Role
As a DevOps/Cloud Engineer at Neodustria, you will:
- Infrastructure as Code (IaC): Design, build, and maintain our core cloud infrastructure using Terraform to ensure it is scalable, resilient, and fully automated.
- Containerization & Orchestration: Manage the entire lifecycle of our services using Docker and Kubernetes, from deployment and scaling to monitoring and maintenance.
- CI/CD Automation: Architect and manage robust CI/CD pipelines to enable rapid, reliable, and automated releases of our Golang and Python microservices.
- GPU Cluster Management: Provision, manage, and optimize our high-performance GPU clusters (NVIDIA H100s) to support demanding AI/ML training and simulation workloads.
- API Infrastructure: Deploy and manage API gateways and service meshes to ensure secure, reliable, and observable communication between our microservices.
- Observability & Reliability: Implement and manage comprehensive monitoring, logging, and alerting solutions to guarantee platform health and performance.
- Security: Champion DevSecOps principles by integrating security best practices and automated checks throughout the infrastructure and CI/CD pipelines.
- Collaboration: Work closely with backend and AI/ML engineers to optimize application performance, streamline deployment workflows, and troubleshoot complex issues across the stack.
What We’re Looking For
- Must-Have:
  - Strong, hands-on proficiency with Terraform for infrastructure provisioning.
  - Deep expertise in containerization and orchestration with Docker and Kubernetes.
  - Proven experience managing production workloads on a major cloud provider (AWS, GCP, or Azure).
  - Solid experience building and managing automated CI/CD pipelines.
- Highly Valued:
  - Experience managing GPU resources in a Kubernetes environment.
  - A strong understanding of microservices architecture and API design principles.
  - Proficiency in a scripting or programming language such as Python, Go, or Bash.
  - Familiarity with monitoring and observability tools (e.g., Prometheus, Grafana, ELK Stack).
- Bonus Points:
  - Experience with service mesh technologies (e.g., Istio, Linkerd).
  - Knowledge of AI/ML infrastructure and tools (e.g., Kubeflow, PyTorch, TensorFlow).
Why Join Neodustria?
- Impact: Build the foundational infrastructure for a platform poised to revolutionize the industrial engineering landscape.
- Innovation: Work at the frontier of AI, cloud-native technologies, and high-performance computing.
- Collaboration: Join a world-class team of AI researchers, simulation experts, and industrial engineers.
- Growth: Gain unparalleled experience in managing large-scale, GPU-accelerated AI systems from the ground up.