Open Position – Staff/Principal AI/ML Deployment Software Engineer
Horizon Surgical Systems Inc.
Horizon Surgical Systems Inc. is revolutionizing the world of surgical ophthalmology by developing a novel, AI-driven, imaging-guided surgical robotic system. Horizon Surgical Systems Inc. aims to expand access to care, provide superior capabilities to the human surgeon, and enhance patient outcomes. Microsurgery in general, and ophthalmology in particular, are subfields of surgery in which outcomes can be significantly improved by robotic systems that provide dexterity, precision, accuracy, and visualization beyond the human surgeon’s own capabilities.
We are seeking highly motivated and intellectually inquisitive individuals looking to make a positive impact on healthcare through the development of robotic technology. The core values of Horizon Surgical Systems Inc. are:
- Commitment to Excellence: We aim to deliver superior patient outcomes and surgeon experiences
- Passion for Creativity and Innovation: We are driven by new ideas and aim to push the boundaries of what's possible
- Teamwork and Camaraderie: We achieve our best when we collaborate and work together
- Welcoming of Critical Opinion: We are enriched by constructive criticism and support the best ideas
- Personal Accountability: We honor our commitments and take responsibility for our actions
Horizon Surgical Systems Inc. offers:
- An opportunity to build autonomous surgical robotic systems driven by image guidance and AI technology for the future of affordable, high-quality healthcare.
- The opportunity to work alongside clinicians, engineers, and global leaders in cutting-edge AI, imaging, and robotics technology.
- Competitive compensation and an excellent company-paid benefits package.
We are looking for an experienced C++ software engineer to deploy AI/ML and vision algorithms in a medical imaging and robotics system. This role requires a deep understanding of AI/ML and machine-vision algorithm development, as well as extensive experience translating models from research prototypes into production-grade, low-latency inference services in a resource-constrained environment.
Responsibilities
As a member of the AI/ML algorithm development team, you will be responsible for the following:
- Deploy AI models (segmentation and detection) in TensorRT format using Triton Inference Server on embedded GPU hardware.
- Design and maintain multi-threaded pipelines for model inference within C++ applications, coordinating with real-time robotic control and perception systems.
- Collaborate with AI scientists to convert, optimize, quantize, and benchmark models for real-time deployment, ensuring numerical stability.
- Collaborate with the product software engineers to integrate the algorithms into the product software, ensuring that the team delivers high-quality algorithms with optimized performance.
- Implement scheduling and resource management strategies to balance GPU/CPU workloads and meet latency/throughput targets.
- Develop tooling for model introspection, logging, and diagnostics to ensure traceability, debuggability, and regulatory readiness.
- Stay current with developments in inference optimization (TensorRT, ONNX, CUDA, etc.) and help inform the platform roadmap.
Required Qualifications
- Bachelor’s or Master’s degree in Computer Science, Robotics, Electrical Engineering, or a related field.
- 3+ years of experience deploying deep learning models to production, preferably in real-time or embedded systems.
- 3+ years of experience in C++ with a strong understanding of multithreading, memory management, and system-level performance profiling.
- Experience with Triton Inference Server, TensorRT, ONNX, and CUDA.
- Deep understanding of model quantization, layer fusion, and precision trade-offs (FP32, FP16, INT8).
- Familiarity with numerical stability and robustness in floating-point computation.
- Experience implementing telemetry, logging, and runtime performance monitoring in production systems.
- Hands-on experience in real-time or latency-sensitive systems, especially on resource-constrained platforms.
Preferred Qualifications
- Experience working in medical devices, robotics, or safety-critical software systems.
- Familiarity with AI frameworks (PyTorch, TensorFlow) and model export pipelines (→ ONNX → TensorRT).
- Knowledge of ROS2, shared memory IPC, or similar communication mechanisms in robotics or distributed systems.
- Comfort with profiling tools such as Nsight Systems, perf, nvprof, or similar.
This is an exciting opportunity to join a high-tech startup that is poised to revolutionize surgical robotics in ophthalmology.