Job Description
I am looking for an experienced AI/LLM Operations Engineer to oversee the deployment and optimization of large language models (LLMs) across my client's platform. The role covers all aspects of LLM infrastructure, from training and fine-tuning through production deployment, with a focus on high performance and reliability. The engineer will work closely with research and engineering teams to bring advanced AI capabilities to life.
Key Responsibilities:
- Develop and optimize deployment pipelines for LLM inference.
- Fine-tune and adapt language models for specific use cases.
- Collaborate with researchers to train and implement models on large datasets.
- Design and maintain end-to-end systems for LLM operations.
- Monitor model performance and troubleshoot scaling issues in production.
- Stay current with LLM techniques and apply them to improve model performance and efficiency.
Qualifications:
- Bachelor’s or Master’s in Computer Science, Engineering, or related field.
- 4+ years of experience with machine learning model deployment, ideally with LLMs.
- Strong background in model fine-tuning, training, and deployment best practices.
- Proficiency with TensorFlow, PyTorch, Hugging Face, and cloud platforms (AWS, Google Cloud, or Azure).
- Experience with containerized environments, including Kubernetes and Docker.