The data scientist is an expert in applied NLP capabilities such as text classification, NER, building multilingual products, sentiment analysis, topic classification, information retrieval, and conversational AI.
Hands-on experience with LLMs such as ChatGPT, Vicuna, Llama 2, and FLAN-T5, with prompt engineering, and with managing LLM inputs and outputs is a big plus.
Has knowledge of and experience in building LLM-driven solutions using approaches such as prompt engineering and retrieval-augmented generation (RAG), with some understanding of LLM fine-tuning.
Brings an outside-in perspective to innovative technology adoption, re-engineering of existing capabilities, and implementation of best practices for NLP product development.
Can evaluate the options available in a given situation and recommend the best one based on facts. Acts as the team's go-to person for a few applied NLP areas.
Business Understanding & Solution Implementation
The business need is understood and formalized in a descriptive datasheet or specifications.
The methods are identified through external research and selected on the basis of their theoretical foundations, advantages, and drawbacks.
The results of the analysis or the models are presented to the customers with data visualization tools that show their performance and limitations. The source code is delivered and explained if necessary.
Quality/Capitalization
All stages of a machine learning project are implemented (pipeline, components, tests) in line with quality standards.
All valuable code is reusable by the community. Developments in data analysis models are monitored (technology watch), and partnerships are developed.
Health of the Métier
The métier is well supported by the data scientist network, with proposals for improvements that develop people, technology, and processes.
Qualifications
Required:
9+ years of overall machine learning/advanced analytics experience, with considerable time spent building enterprise-grade NLP solutions.
6+ years of NLP solution building and applied research experience, including 4+ years in Python.
Strong experience developing and implementing NLP applications in a commercial setting.
Hands-on experience in building POCs and solutions using generative AI and LLMs.
Expert in NLP methodologies such as document embedding, text mining, topic classification, sentiment analysis, information extraction, conversational AI, and generative AI.
Self-starter who can define a problem precisely and has the zeal to explore, identify, and implement optimal solutions.
Expert in deep learning models for sequence-to-sequence problems, such as RNNs, LSTMs, and GRUs.
Experience with NLTK, spaCy, Gensim, and Hugging Face Transformers.
MTech/BTech/B.E. degree in Computer Science or a related IT/technology field, or equivalent experience.
Excellent English communication skills, both oral and written.
Team player who can work well in a highly collaborative, multi-cultural and matrixed organization.
Preferred:
Azure knowledge and conceptual familiarity with ML engineering and MLOps.