Site Reliability Engineer (GKE + GCP)
Responsibilities:
• Work on a team of extremely talented platform engineers to help maintain and scale the current and future state services platform.
• Help architect and develop the future state compute platform by leveraging industry best practices and embracing new technologies to support our future growth as a business.
• Help influence the product roadmaps of GCP (our primary cloud provider) to better suit our future state architecture.
• Work collaboratively with business and technical stakeholders to develop and architect enhancements to the compute platform capabilities that enable them to develop and iterate on applications that power the business.
• Identify opportunities to introduce automation and improvements that eliminate repetitive operational tasks (DRY).
• Participate in the on-call rotation to ensure operational excellence and overall platform health.

Requirements:
• 5+ years of experience in platform engineering/SRE roles using an object-oriented language (Python, Golang, etc.)
• Bachelor’s degree in Computer Science, Computer Engineering, or an equivalent combination of education and experience
• Extensive experience working with Kubernetes in a public cloud (GKE, EKS, AKS, etc.)
• Experience working with Istio/Service Mesh
• Experience working with IaC (Terraform, Pulumi, etc.)
• Experience working within a public cloud environment (GCP, AWS, Azure, etc.)
• Experience working with CI/CD tools such as Argo, Buildkite, TravisCI, Jenkins, Spinnaker, etc.
• Experience working with platform observability tools (Prometheus, Thanos, Grafana, Fluent Bit, Cloud Monitoring, Cloud Logging, Datadog, PagerDuty, CloudWatch, Kibana, Elasticsearch, Splunk, VictorOps, etc.)
• Experience with networking
• Experience and desire to work in an agile environment
• Analytical mindset and passion for solving business problems with technology

Nice To Haves:
• Experience working with dev testing tools and patterns such as Garden, Flagger, Canary Deployments, Blue/Green Testing, A/B Testing
• Experience setting up and working with Kubernetes Admission Control (Kyverno, OPA, etc.)
• Experience working with workload scaling (HPA, VPA, Capacity Planning/Reservations, etc.)