Site Reliability Engineer

Posted on Jun 27, 2020


San Diego, CA

The Video Computer Vision organization is working on exciting technologies for future Apple products.

Our focus is on ML-based solutions for real-time image and video.

We have contributed to the FaceID and FaceKit projects in the past and, more recently, to the new iPad LiDAR sensor.

We are looking for the right Site Reliability Engineer to help us take our efforts to the next level.

In this role, you will be part of the core data infrastructure team for the Video Computer Vision organization.

You will be a core contributor in our SRE team to develop and maintain a modern deployment system for cloud services and applications.

You will be responsible for system bring-up, deployment, reliability, security, and service scalability.

This role is highly multi-functional, and you will work very closely with various highly skilled software development and ML teams developing cutting-edge algorithms.

Your core responsibility is to provide operational support for multiple cloud-based applications running on AWS and Apple infrastructure, with an emphasis on deployment, security, scalability, and reliability.

Operations tech stack: Ansible, Terraform, Go, Python, and Prometheus, with some Bash scripting.

Common technologies include: Django, Docker, Kubernetes, Postgres, Redis, and Cassandra.

We have a hybrid infrastructure and make extensive use of Amazon Web Services along with home-grown compute clouds.

What qualities will make you successful?

We are looking for a driven and dedicated, hands-on Site Reliability Engineer with:

- Core operations experience with Linux, Ansible (or similar), Docker, Kubernetes, and Postgres
- Experience engaging various software development teams to collaborate and build services from the ground up
- Expertise in networking with an emphasis on security
- Experience building systems both on-premises (data center) and on public cloud (AWS, GCP, or Azure welcome)
- Working knowledge of deploying microservices (Django, Go, JVMs)
- Experience with schedulers such as Kubernetes, AWS ECS, or EKS
- Ability to write code in one of many high-level languages (Python preferred)
- Extensive experience using Linux, with knowledge of kernel/system tuning
- Last but not least, you are battle-tested and have a few interesting production tales

How to Apply

Follow the application procedure at for more info.

Related positions:

Site Reliability Engineer (SRE)

Covr, Beverly Hills, CA

VP, Site Reliability Engineer

JP Morgan Chase, Wilmington, VA

Lab Site Reliability Engineer

Apple, Cupertino, CA

Site Reliability Engineer II

BigCommerce, Austin, TX
