This role is for one of Weekday's clients

Salary range: Rs 30,00,000 - Rs 40,00,000 (i.e., INR 30-40 LPA)

Min Experience: 2 years

Location: Bangalore

Job Type: Full-time

We are looking for a skilled and driven NLP Engineer to help scale, optimize, and deploy large language model (LLM)-based solutions within the healthcare domain. Your primary focus will be on building and maintaining production-ready, end-to-end NLP systems—covering backend architecture, inference optimization, and efficient model deployment pipelines. While opportunities exist for fine-tuning LLMs for specific use cases, the core responsibility is ensuring these models run efficiently, reliably, and at scale in production environments.

Additionally, you will develop NLP pipelines leveraging pre-trained LLMs and embedding models, including retrieval-augmented generation (RAG) systems and agentic NLP solutions that integrate multiple models and data sources for real-time, context-aware processing.

Requirements

Key Responsibilities

Production-Grade NLP Systems

  • Design and implement scalable, efficient NLP pipelines using LLMs and embedding models.
  • Integrate RAG and agentic components to enhance NLP capabilities and adaptability.

Inference Optimization & Deployment

  • Optimize model inference performance, reduce latency, and improve throughput using frameworks such as vLLM, TensorRT, and Ray.
  • Implement best practices for containerization, CI/CD, monitoring, and observability to ensure stable, production-ready deployments.

Occasional Model Adaptation

  • Assist with fine-tuning or adapting LLMs for specific healthcare applications, ensuring scalability and efficiency.

Collaboration & Continuous Improvement

  • Work closely with NLP researchers, backend engineers, product managers, and frontend developers to build high-quality NLP solutions.
  • Participate in code reviews and architectural discussions, and stay current on emerging NLP and LLM optimization techniques.

Requirements (Must-Haves!)

  • Bachelor's or Master's degree in Computer Science or a related field.
  • 2+ years of experience (or 1+ year with an advanced degree) in building and deploying ML/NLP systems using Python.
  • Hands-on experience with NLP frameworks (e.g., spaCy, Hugging Face Transformers, LangChain) and deep learning libraries (e.g., PyTorch).
  • Strong background in designing, implementing, and maintaining scalable backend architectures for NLP/LLM-based applications.
  • Experience working with large datasets, including data cleaning, preprocessing, and structuring.
  • Proficiency in containerization, CI/CD, and version control for production-grade deployments.
  • Expertise in LLM inference optimization using frameworks such as vLLM, TensorRT, and Ray.
  • Practical knowledge of deploying NLP models in production, including load balancing and latency reduction.

Preferred (Nice-to-Have!)

  • Experience in building RAG pipelines and integrating embedding models into NLP workflows.
  • Familiarity with agentic systems that leverage multiple models for dynamic, context-aware NLP solutions.
  • Knowledge of prompt engineering, model fine-tuning, and large-scale inference optimization for LLMs.

Location

Bengaluru, Karnataka, India
