Job Description
Join our dynamic team as a Data Scientist II and work at the forefront of data-driven innovation. You'll dive deep into exploratory data analysis, modeling, and MLOps to craft impactful solutions across multidisciplinary teams.
Duties and Responsibilities:
Collaborate across cross-functional, multidisciplinary teams in a dynamic environment to understand evolving business needs.
The Data Scientist role covers the following four areas:
Exploratory data analysis:
Conduct in-depth exploratory data analysis (EDA) to uncover hidden patterns, anomalies, and trends.
Utilize statistical techniques and data visualization tools to communicate insights effectively to both technical and non-technical stakeholders.
Perform hypothesis testing and A/B testing to validate assumptions and measure the impact of data-driven decisions.
Feature engineering and enrichment:
Create and refine relevant features from raw data to enhance model performance.
Leverage large language models (LLMs) to generate novel features and augment existing datasets.
Modeling:
Design, develop, and deploy advanced machine learning models (e.g., regression, classification, clustering) to address complex business problems.
Continuously improve model performance by fine-tuning hyperparameters, exploring new algorithms, and incorporating feature engineering techniques.
Collaborate with domain experts to understand business objectives and translate them into actionable data science solutions.
Data Operations and MLOps:
Develop and maintain efficient data pipelines to extract, transform, and load (ETL) data from various sources.
Deploy and manage machine learning models in production environments using MLOps practices.
Monitor model performance, retrain as needed, and implement automated retraining pipelines.
Collaborate with software teams to ensure seamless integration of data science solutions into the broader technology stack.
Requirements and Qualifications:
BS/MS in Science (Statistics, Computer Science, Econometrics, Data Science, Artificial Intelligence).
1-5 years of experience in data science or computer science.
Experience with common data science toolkits, programming languages, visualization tools and SQL/NoSQL databases.
Good applied statistical knowledge, with emphasis on business- and finance-related statistical distributions, statistical testing, modeling, regression analysis, etc.
Good foundation in computer science, such as data structures and operating systems.
Familiar with, or willing to adopt, design thinking methods.
Able to work under pressure and adapt to change, balancing speed, reliability, and interpretability.
Experience with code versioning, code review and documentation.
Effective communication skills
Experience in building part-time projects that focus on:
Additional skill-based requirements:
Machine Learning
Understanding of machine learning algorithms such as k-NN, Naive Bayes, SVM, and decision trees.
Experience using ML frameworks such as TensorFlow, PyTorch, or scikit-learn.
Experience with Google Cloud Platform products and services such as Vision API, Recommendations API, Cloud Natural Language.
GenAI
Leverage generative AI to develop innovative solutions for creative and automation tasks, such as target labeling, chatbot systems, data enrichment, and semantic search.
Experience in designing and implementing RAG systems, including knowledge base construction, information retrieval, and model fine-tuning.
Algorithm Engineering
Strong ability to implement, improve, and deploy ML and mathematical models in Golang or Python.
Conduct systems tests for security, performance, and availability.
Develop and maintain the design and troubleshooting/error documentation.
Create cost-effective, scalable systems and develop innovative algorithmic solutions.
Operations Research
Familiar with modeling problems using mathematical programming, constraint satisfaction, particle swarm optimization, and other appropriate OR methodologies.
Familiar with tools such as CPLEX, Gurobi, and Google OR-Tools.