Senior DevOps (System Reliability) Engineer
Company: Dyna Robotics
Location: Redwood City
Posted on: April 7, 2025
Job Description:
Company Overview:
Dyna
Robotics is at the forefront of revolutionizing robotic
manipulation with cutting-edge foundation models. Our mission is to
empower businesses by automating repetitive, stationary tasks with
affordable, intelligent robotic arms. Leveraging the latest
advancements in foundation models, we're driving the future of
general-purpose robotics, one manipulation skill at a time.
Dyna
Robotics was founded by industry leaders who previously achieved a
$350 million exit in grocery deep tech as well as top robotics
researchers from DeepMind and Nvidia. Our team blends world-class
research, engineering, and product innovation to drive the future
of robotic manipulation. With more than $20 million in funding, we're positioned
to redefine the landscape of robotic automation. Join us to shape
the next frontier of AI-driven robotics.
Role Overview:
We are
seeking a talented and driven DevOps / System Reliability Engineer
(SRE) to join our growing team onsite in Redwood City, California.
At Dyna, our software systems run on cloud infrastructure as well
as IoT robot devices deployed to diverse customer locations. You
will be responsible for building, maintaining, and ensuring the
reliability and scalability of our infrastructure and systems in
the cloud and on our robots. This role involves a blend of DevOps
practices and SRE principles, where you will not only automate
deployments and infrastructure but also actively monitor system
health, manage releases, and address production issues. As an early
member of our team, you will have a significant impact on our
development workflow and the overall success of the
company.
Responsibilities:
- CI/CD Pipeline Management: Design, implement, and maintain
robust build tools and CI/CD pipelines for test automation and code
quality. Optimize existing pipelines for speed and efficiency.
- Deployment Automation: Implement and manage deployment tools
like Spinnaker, Helm, or similar for reliable automated deployments
to the cloud and to our robots.
- Security and Access Control: Set up and manage authentication
for internal tools and customer deployments to enhance security and
streamline access.
- Containerization and Orchestration: Automate image builds
within our CI/CD pipeline. Work with containerization technologies
like Docker and orchestration tools like Kubernetes.
- Environment Management: Set up and maintain staging and
production environments for our services to facilitate testing,
validation, and high availability.
- IoT Device Monitoring and Alerting: Implement and manage
comprehensive monitoring, logging, and alerting solutions for Linux
IoT devices to ensure system health, performance, and
observability.
- System Reliability and Availability: Monitor system performance
and availability, proactively address potential issues, and ensure
high uptime. Develop and implement incident response procedures.
Participate in on-call rotations to address production issues.
- Release Management: Manage and coordinate releases, ensuring
smooth deployments and minimal downtime. Troubleshoot and resolve
release-related issues in a timely manner.
- Collaboration and Documentation: Work closely with development
teams to understand their needs and provide DevOps/SRE support.
Document processes, configurations, and incident reports
thoroughly.
Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a
related field, or equivalent practical experience.
- 10+ years of experience in a DevOps, SRE, or similar role.
- Strong understanding of CI/CD and test automation.
- Proficiency in scripting languages such as Bash or Python.
- Experience with build tools like Bazel.
- Familiarity with deployment tools like Spinnaker, Helm, or
similar.
- Knowledge of containerization technologies like Docker and
orchestration tools like Kubernetes.
- Experience with cloud platforms (e.g., AWS, GCP, Azure).
- Experience deploying software to IoT or Linux devices.
- Strong problem-solving and troubleshooting skills, particularly
in production environments.
- Excellent communication and collaboration skills.
- Ability to work independently and in a fast-paced startup
environment.
- Experience with incident management and on-call
responsibilities.
Bonus Points:
- Experience with Airflow.
- Experience setting up cloud infrastructure using
Terraform.
- Experience with database administration and optimization.
- Experience with cross-platform builds and deployments.
- Knowledge of network protocols and security.
Benefits:
- Competitive salary and equity in a seed-stage venture-backed
startup.
- Comprehensive health, dental, and vision insurance.
- Daily catered lunches and dinners with a fully stocked
kitchen.
- Professional growth and development through training,
mentorship, and challenging projects.