About Gridware

Gridware is a San Francisco-based technology company dedicated to protecting and enhancing the electrical grid. We pioneered a new class of grid management called Active Grid Response (AGR), focused on monitoring the electrical, physical, and environmental aspects of the grid that affect reliability and safety. Gridware’s Active Grid Response platform uses high-precision sensors to detect potential issues early, enabling proactive maintenance and fault mitigation. This comprehensive approach helps improve safety, reduce outages, and ensure the grid operates efficiently. The company is backed by climate-tech and Silicon Valley investors. For more information, please visit www.Gridware.io.

Role Overview

Gridware is creating cutting-edge technology to increase hazard awareness on the electric distribution system. We are building the observability layer of a safer and more efficient grid.

We are seeking a Machine Learning Engineer to lead the development of robust models and data pipelines for detecting and interpreting events from distributed sensors installed on electrical distribution infrastructure. You will support the full ML product lifecycle, from model development and prototyping to building stable, fully supported production systems for real-time event detection.

You will collaborate with a diverse team of scientists and engineers to build the hardware, software, and operational systems that deliver actionable information to utility operators.

Responsibilities

  • Design, train, and deploy ML models for real-time and batch detection of events, anomalies, or faults from distributed sensor networks.
  • Build end-to-end data and ML pipelines for sensor ingestion, preprocessing, feature extraction, and model inference.
  • Collaborate with hardware teams, data engineers, and product managers to define ML system requirements.
  • Work with distributed streaming systems (e.g., Apache Kafka, Spark Structured Streaming) for real-time data processing and inference.
  • Develop tools and processes for continuous model evaluation, retraining, and performance monitoring (MLOps best practices).
  • Lead the adoption of scalable frameworks for spatial-temporal and graph-based modeling of sensor systems.
  • Mentor junior engineers and participate in architecture and design reviews.

Required Skills

  • 5+ years of experience designing and deploying ML systems in production environments.
  • Proficiency in Python and ML libraries (e.g., PyTorch, TensorFlow, scikit-learn, XGBoost).
  • Strong background in time-series analysis, anomaly detection, or sensor fusion.
  • Experience with real-time or distributed data systems: Spark, Kafka, Flink, or similar.
  • Solid understanding of data engineering fundamentals, including ETL and batch/streaming processing.
  • Experience deploying models via REST APIs or frameworks like MLflow, TorchServe, or FastAPI.
  • Familiarity with cloud-native architectures (AWS, Azure, GCP) and containerization (Docker, Kubernetes).

Bonus Skills

  • Experience with Graph Neural Networks (GNNs), spatial-temporal modeling, or edge ML.
  • Exposure to sensor networks in domains like energy, industrial IoT, transportation, or environmental monitoring.
  • Contributions to open-source ML or data systems.
This describes the ideal candidate; many of us have picked up this expertise along the way. Even if you meet only part of this list, we encourage you to apply!

Benefits

  • Health, Dental & Vision (Gold and Platinum, with some providers' plans fully covered)
  • Paid parental leave
  • Alternating day off (every other Monday)
  • “Off the Grid”, a two-week paid break each year for all employees
  • Commuter allowance
  • Company-paid training

Location

San Francisco, CA

Job Overview
Job Posted: 2 days ago
Job Type: Full Time