We are looking for a Senior Solutions Architect to design, develop, and scale innovative AI/ML-driven solutions. You will be responsible for architecting highly scalable, low-latency distributed systems optimized for AI/ML workloads. As a key technical leader, you will solve complex challenges, influence next-generation AI/ML infrastructures, and guide cross-functional teams to deliver state-of-the-art solutions for fast-growing startups and enterprise companies.

Be at the forefront of shaping next-generation AI/ML infrastructures, driving solutions for high-impact products across diverse industries. You'll have the opportunity to influence key architectural decisions and enable real-world applications that scale globally, ensuring innovation and efficiency at every step.

Requirements

You'll be responsible for —

Driving end-to-end GenAI architecture and implementation:

  • Design and deploy multi-agent systems using modern frameworks (LangGraph, CrewAI, AutoGen)
  • Architect RAG solutions with advanced vector store integration
  • Implement efficient fine-tuning strategies for foundation models
  • Develop synthetic data generation pipelines for training and testing

Leading ML infrastructure and deployment:

  • Design high-performance model serving architectures
  • Implement distributed training and inference systems
  • Establish MLOps practices and pipelines
  • Optimize cloud resource utilization and costs
  • Set up monitoring and observability solutions

Driving technical excellence and innovation:

  • Define architectural standards and best practices
  • Lead technical decision-making for AI/ML initiatives
  • Ensure scalability and reliability of AI systems
  • Implement AI governance and security measures
  • Guide teams on advanced AI concepts and implementations

Overseeing production AI systems:

  • Manage model deployment and versioning
  • Implement A/B testing frameworks
  • Monitor system performance and model drift (a drift-check sketch follows this list)
  • Optimize inference latency and throughput
  • Ensure high availability and fault tolerance
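
To give a flavor of what drift monitoring looks like in practice, here's a small, framework-free sketch that flags feature drift with a population stability index (PSI); the feature distributions and the 0.2 alert threshold are illustrative assumptions, not a description of any particular production setup.

```python
import numpy as np

def population_stability_index(baseline: np.ndarray, live: np.ndarray, bins: int = 10) -> float:
    """Compare a live feature distribution against its training baseline.

    PSI near 0 means little shift; values above roughly 0.2 are commonly
    treated as meaningful drift (a convention, not a hard rule).
    """
    # Bin edges come from the training (baseline) distribution.
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_counts, _ = np.histogram(baseline, bins=edges)
    live_counts, _ = np.histogram(live, bins=edges)

    # Convert counts to proportions, flooring at a tiny value to avoid log(0).
    base_frac = np.clip(base_counts / base_counts.sum(), 1e-6, None)
    live_frac = np.clip(live_counts / live_counts.sum(), 1e-6, None)
    return float(np.sum((live_frac - base_frac) * np.log(live_frac / base_frac)))

# Hypothetical feature values from training vs. live traffic.
baseline = np.random.normal(0.0, 1.0, 10_000)
live = np.random.normal(0.3, 1.1, 10_000)
psi = population_stability_index(baseline, live)
if psi > 0.2:  # illustrative alert threshold
    print(f"Drift alert: PSI = {psi:.3f}")
```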

Fostering collaboration and growth:

  • Mentor engineering teams on AI architecture
  • Collaborate with stakeholders on technical strategy
  • Drive innovation in AI/ML solutions
  • Share knowledge through documentation and training
  • Lead technical reviews and architecture discussions

You need —

8+ years of experience in software engineering or architecture, including:

  • 4+ years leading cross-functional GenAI/ML teams
  • Production experience with distributed AI systems
  • Enterprise-scale AI architecture implementation

To lead and architect enterprise-scale GenAI/ML solutions, focusing on:

  • Multi-agent orchestration using LangGraph, CrewAI, and AutoGen (see the LangGraph sketch after this list)
  • Workflow automation with LlamaIndex, LangChain, and LangFlow
  • Agent coordination using the Letta framework
  • Integration of specialized agents for reasoning, planning, and execution
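
For orientation, here's a minimal sketch of this kind of orchestration using LangGraph's StateGraph API (exact APIs vary by version, and CrewAI or AutoGen would express the same flow differently); the planner/executor split and the stubbed node logic are illustrative assumptions.

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class AgentState(TypedDict):
    task: str
    plan: str
    result: str

def planner(state: AgentState) -> dict:
    # In a real system this node would call an LLM to decompose the task.
    return {"plan": f"1) research '{state['task']}'  2) summarize findings"}

def executor(state: AgentState) -> dict:
    # This node would run tools or downstream agents; stubbed for brevity.
    return {"result": f"executed: {state['plan']}"}

graph = StateGraph(AgentState)
graph.add_node("planner", planner)
graph.add_node("executor", executor)
graph.set_entry_point("planner")
graph.add_edge("planner", "executor")
graph.add_edge("executor", END)

app = graph.compile()
print(app.invoke({"task": "compare vector databases", "plan": "", "result": ""}))
```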

To design and implement sophisticated AI architectures incorporating:

Advanced RAG systems using:

  • Vector databases (Chroma, Weaviate, Pinecone, Milvus)
  • Hybrid search with BM25 and semantic embeddings (sketched after this list)
  • Self-querying and recursive retrieval patterns
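
Here's a rough sketch of the hybrid search pattern above: fuse BM25 lexical ranks with dense-embedding ranks via reciprocal rank fusion. The tiny corpus, the all-MiniLM-L6-v2 model, and the fusion constant are illustrative assumptions; in production the dense side would live in one of the vector databases listed.

```python
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

docs = [
    "LoRA adds low-rank adapters to frozen model weights.",
    "Vector databases store embeddings for similarity search.",
    "Reciprocal rank fusion merges rankings from different retrievers.",
]

# Lexical index over whitespace-tokenized documents.
bm25 = BM25Okapi([d.lower().split() for d in docs])

# Dense index: any sentence-embedding model works; this one is an assumption.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = encoder.encode(docs, normalize_embeddings=True)

def hybrid_search(query: str, k: int = 3, rrf_c: int = 60) -> list[str]:
    # Rank documents independently with each retriever (best first).
    lexical = np.asarray(bm25.get_scores(query.lower().split())).argsort()[::-1]
    q_vec = encoder.encode([query], normalize_embeddings=True)[0]
    dense = (doc_vecs @ q_vec).argsort()[::-1]

    # Reciprocal rank fusion: score(d) = sum over retrievers of 1 / (c + rank).
    scores = np.zeros(len(docs))
    for ranking in (lexical, dense):
        for rank, idx in enumerate(ranking):
            scores[idx] += 1.0 / (rrf_c + rank + 1)
    return [docs[i] for i in scores.argsort()[::-1][:k]]

print(hybrid_search("how do hybrid retrievers merge results?"))
```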

Fine-tuning strategies for foundation models:

  • PEFT methods such as LoRA, QLoRA, and adapter tuning (see the sketch below)
  • Parameter-efficient training approaches
  • Instruction fine-tuning and RLHF
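
As a concrete reference point, a minimal LoRA setup with Hugging Face peft is sketched below; the small OPT base model, rank, and target modules are placeholders chosen so the snippet stays lightweight, and QLoRA would additionally load the base weights in 4-bit.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "facebook/opt-350m"  # stand-in base model; swap in your foundation model
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# LoRA trains small low-rank adapter matrices while base weights stay frozen.
lora_cfg = LoraConfig(
    r=16,                                  # adapter rank
    lora_alpha=32,                         # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections; model-specific
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of all parameters
# From here, train with the standard transformers Trainer (or TRL's SFTTrainer)
# on instruction-formatted data.
```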

Multi-agent frameworks integrating:

  • Tool-use and reasoning chains (illustrated in the sketch after this list)
  • Memory systems (short-term and long-term)
  • Meta-prompting and reflection mechanisms
  • Agent communication protocols
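
To make these ideas concrete without tying them to one framework, here's a deliberately framework-free sketch of the loop those frameworks formalize: a tool registry, a short-term memory buffer, and a simple message protocol between a planner and a worker. Every name and the hard-coded routing are illustrative assumptions.

```python
from dataclasses import dataclass, field

# Tool registry: callables the agents may invoke (demo-only, not hardened).
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
    "echo": lambda text: text,
}

@dataclass
class Message:
    sender: str
    recipient: str
    content: str

@dataclass
class Agent:
    name: str
    memory: list = field(default_factory=list)  # short-term memory buffer

    def receive(self, msg: Message) -> None:
        self.memory.append(msg)

class Planner(Agent):
    def act(self) -> Message:
        task = self.memory[-1].content
        # A real planner would call an LLM; here the plan is hard-coded.
        return Message(self.name, "worker", f"calculator:{task}")

class Worker(Agent):
    def act(self) -> Message:
        tool_name, _, arg = self.memory[-1].content.partition(":")
        return Message(self.name, "planner", TOOLS[tool_name](arg))  # tool-use step

# One round of the communication protocol.
planner, worker = Planner("planner"), Worker("worker")
planner.receive(Message("user", "planner", "2 + 2 * 10"))
worker.receive(planner.act())
print(worker.act())  # Message(sender='worker', recipient='planner', content='22')
```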

Expertise in advanced data generation and synthesis (a brief sketch follows this list):

  • Synthetic data generation using Argilla and PersonaHub
  • Privacy-preserving data synthesis
  • Domain-specific data augmentation
  • Quality assessment of synthetic data
  • Data balancing and bias mitigation
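
As one example of persona-driven synthesis (in the spirit of PersonaHub), the sketch below asks an LLM for persona-conditioned question/answer pairs; the OpenAI client, the gpt-4o-mini model name, and the two personas are assumptions, and generated data would then go through deduplication, quality scoring, and bias checks (for example with Argilla).

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

personas = [
    "a hospital billing administrator",
    "a first-year data engineering student",
]

def synthesize(persona: str, n: int = 3) -> list[dict]:
    """Generate persona-conditioned instruction/response pairs."""
    prompt = (
        f"You are {persona}. Return a JSON object with a 'pairs' key holding "
        f"{n} realistic question/answer objects about your daily work, each "
        "with 'question' and 'answer' fields."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(resp.choices[0].message.content)["pairs"]

dataset = [pair for p in personas for pair in synthesize(p)]
# Downstream: dedupe, score for quality, and check for bias before training use.
```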

To architect high-performance ML serving infrastructure focusing on:

  • Model serving platforms (BentoML, Ray Serve, Triton)
  • Real-time processing with Ray, Kafka, and Spark Streaming
  • Distributed training using Horovod, DeepSpeed, and FSDP
  • vLLM and TGI for efficient inference (vLLM sketched below)
  • Integration patterns for hybrid cloud-edge deployments
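
For a sense of the inference layer, here is a minimal vLLM offline-batching sketch; the Llama model name and sampling settings are assumptions, and a production deployment would usually sit behind vLLM's OpenAI-compatible server or a platform such as BentoML, Ray Serve, or Triton.

```python
from vllm import LLM, SamplingParams

# Continuous batching and PagedAttention are what make vLLM throughput-efficient.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # illustrative model
params = SamplingParams(temperature=0.2, max_tokens=128)

prompts = [
    "Summarize the trade-offs between RAG and fine-tuning in two sentences.",
    "List three common causes of tail latency in model serving.",
]
for output in llm.generate(prompts, params):
    print(output.prompt, "->", output.outputs[0].text.strip())
```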

To drive cloud architecture decisions across:

  • Kubernetes orchestration with Kubeflow and KServe
  • Serverless ML with AWS Lambda, Azure Functions, Cloud Run
  • Auto-scaling using HPA, KEDA, and custom metrics
  • Resource optimization with Nvidia Triton and TensorRT
  • MLOps platforms (MLflow, Weights & Biases, DVC); an MLflow tracking sketch follows
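
And on the MLOps side, a small MLflow tracking sketch; the experiment name, parameters, metric value, and artifact file are placeholders.

```python
import mlflow

mlflow.set_experiment("rag-retriever-tuning")  # placeholder experiment name

with mlflow.start_run(run_name="hybrid-rrf-c60"):
    # Log the configuration that produced this run...
    mlflow.log_params({"retriever": "bm25+dense", "rrf_c": 60, "top_k": 5})
    # ...and its evaluation results, so runs stay comparable and reproducible.
    mlflow.log_metric("recall_at_5", 0.87)
    mlflow.log_artifact("eval_report.json")  # assumes this file exists locally
```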

Benefits

Bonus points for —

  • Research publications in AI/ML
  • Open-source project maintenance
  • Technical blog posts on AI architecture
  • Conference presentations
  • AI community leadership

What you get —

  • Best-in-class salary: We hire only the best, and we pay accordingly.
  • Proximity Talks: Meet other designers, engineers, and product geeks — and learn from experts in the field.
  • Keep on learning with a world-class team: Work with the best in the field, challenge yourself constantly, and learn something new every day.

About us —

We are Proximity — a global team of coders, designers, product managers, geeks, and experts. We solve complex problems and build cutting-edge tech at scale. Here's a quick guide to getting to know us better:

  • Watch our CEO, Hardik Jagda, tell you all about Proximity.
  • Read about Proximity's values and meet some of our Proxonauts here.
  • Explore our website, blog, and the design wing — Studio Proximity.
  • Get behind the scenes with us on Instagram! Follow @ProxWrks and @H.Jagda

Location

Santa Clara, California, United States

Job Overview

Job Type: Full Time