Jan 28, 2026
AI/LLM Architect | Machine Learning Architect

Job Title: AI/LLM Architect | Machine Learning Architect
Location: Remote
Job Type: Full-time
Salary: $180,000 - $250,000 per year
Job Description
We’re seeking an experienced AI/LLM Architect to lead the design and implementation of our large language model and machine learning infrastructure. In this senior technical role, you’ll architect scalable AI systems, guide our ML strategy, and work closely with engineering, product, and data science teams to deploy cutting-edge LLM solutions that drive business value. This is a high-impact position for someone passionate about pushing the boundaries of what’s possible with AI.
Key Responsibilities
Design and architect end-to-end LLM and ML systems at scale, including model selection, fine-tuning strategies, and deployment pipelines
Lead the development of AI/ML infrastructure that supports training, evaluation, and production deployment of large language models
Define best practices for prompt engineering, RAG (Retrieval-Augmented Generation) implementations, and LLM orchestration
Evaluate and select appropriate foundation models (GPT, Claude, Llama, Mistral, etc.) based on use case requirements
Architect solutions for model fine-tuning, including LoRA, QLoRA, and full fine-tuning approaches
Design and implement robust evaluation frameworks to measure model performance, accuracy, and safety
Collaborate with data engineering teams to build scalable data pipelines for model training and inference
Establish MLOps practices including model versioning, monitoring, A/B testing, and continuous improvement
Provide technical leadership and mentorship to ML engineers and data scientists
Stay current with the rapidly evolving LLM landscape and emerging AI technologies
Ensure AI systems are built with security, privacy, and ethical considerations in mind
Partner with stakeholders to translate business requirements into technical AI/ML solutions
Optimize model performance, latency, and cost-efficiency for production environments
Required Skills and Qualifications
7+ years of experience in machine learning, with at least 3 years focused on LLMs or NLP
Deep expertise in working with large language models (OpenAI, Anthropic, open-source models)
Strong programming skills in Python and ML frameworks (PyTorch, TensorFlow, Hugging Face Transformers)
Proven experience architecting and deploying ML systems at scale in production environments
Expertise in LLM techniques including prompt engineering, few-shot learning, fine-tuning, and RLHF
Experience with vector databases and embedding models for RAG implementations (Pinecone, Weaviate, ChromaDB, etc.)
Strong understanding of transformer architectures and attention mechanisms
Experience with model optimization techniques (quantization, pruning, distillation)
Proficiency with cloud platforms (AWS, GCP, Azure) and their ML services (SageMaker, Vertex AI, Azure ML)
Knowledge of MLOps tools and practices (MLflow, Kubeflow, Weights & Biases, etc.)
Experience with containerization (Docker, Kubernetes) for ML workloads
Strong understanding of distributed training and inference strategies
Excellent communication skills, with the ability to explain complex technical concepts to non-technical stakeholders
Master’s or PhD in Computer Science, Machine Learning, or related field (or equivalent experience)
Preferred Qualifications
Experience with LLM fine-tuning at scale using techniques like LoRA, QLoRA, or full parameter fine-tuning
Hands-on experience with LLM frameworks like LangChain, LlamaIndex, or Semantic Kernel
Knowledge of advanced techniques like Constitutional AI, chain-of-thought prompting, or agent-based systems
Experience building multimodal AI systems (vision-language models, audio processing)
Background in deploying models on edge devices or optimizing for low-latency inference
Contributions to open-source ML/AI projects or published research in top-tier conferences
Experience with AI safety, alignment, and responsible AI practices
Knowledge of GPU optimization and parallel computing frameworks (CUDA, Ray, DeepSpeed)
Experience with data labeling platforms and synthetic data generation
Familiarity with reinforcement learning from human feedback (RLHF) pipelines
Technical Stack
LLM Platforms: OpenAI API, Anthropic Claude, Hugging Face, Azure OpenAI
ML Frameworks: PyTorch, TensorFlow, JAX, Hugging Face Transformers
Vector DBs: Pinecone, Weaviate, Qdrant, Milvus, ChromaDB
Orchestration: LangChain, LlamaIndex, Haystack
MLOps: MLflow, Weights & Biases, Neptune.ai, Kubeflow
Infrastructure: AWS/GCP/Azure, Kubernetes, Docker, Terraform
Monitoring: Prometheus, Grafana, DataDog
What We Offer
Competitive salary ($180,000 - $250,000 based on experience)
Significant equity/stock options package
Comprehensive health, dental, and vision insurance
401(k) with generous company match
Unlimited PTO policy
$5,000+ annual professional development budget for conferences, courses, and certifications
Access to cutting-edge AI tools and compute resources
Remote-first culture with flexible work arrangements
Opportunity to work on challenging problems with real-world impact
Collaborative environment with top-tier AI/ML talent
Conference speaking opportunities and support for publishing research
Email your resume to careers@oliego.com