As a Machine Learning Operations (MLOps) Engineer, you will design, deploy, and maintain scalable platforms for machine learning models using open-source tools such as Kubernetes, Kubeflow, and MLflow. Your role spans the entire ML lifecycle, from development to production, with a focus on reliable deployment, efficiency, and security.
Key Responsibilities:
- Build and scale machine learning platforms and automate pipelines for training, deployment, and monitoring.
- Deploy models across cloud and on-premises environments, manage model versions, and handle model retraining.
- Optimize infrastructure, implement containerization and orchestration (Docker, Kubernetes), and ensure high-performance systems.
- Set up monitoring to track model performance in production and troubleshoot issues as they arise.
- Collaborate with data scientists, engineers, and other teams to ensure smooth integration and deployment.
Required Skills and Experience:
- Proficiency in Python and experience with cloud platforms (AWS, GCP, Azure).
- Expertise in containerization, orchestration, and machine learning frameworks (TensorFlow, PyTorch).
- Familiarity with DevOps principles, CI/CD, and automation processes.
- Strong problem-solving, communication, and teamwork skills.
Qualifications:
- Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
- Proven experience in machine learning deployment or a similar area.
- Certifications in cloud platforms or DevOps tools are a plus.
Why Join Us?
- Competitive salary and benefits.
- Opportunities for professional development and career growth.
- Collaborative and innovative work environment.
- Exposure to cutting-edge machine learning technologies.