Sr. Site Reliability Engineer job in New York

Sr. Site Reliability Engineer - Cloud GRC Automation Series A Start-Up - Remote

We are looking for a Sr. Site Reliability Engineer who will be responsible for ensuring system integrity, availability, and non-functional requirements such as performance, and for designing and implementing the defined tactics for CI and CD platform operations.

The company is a hyper-growth Series A firm with strong recent funding. Its enterprise cloud security solution automates cloud governance, protecting enterprise data, controlling risk, and accelerating success in the cloud.

The business is backed by a VC group that has a prolific track record of success in the cybersecurity market.

Overview:

Ensure holistic system health across the web frontend, API services, and backend services
Lead processes to promote reliability on Kubernetes and AWS infrastructure
Create monitoring and observability platforms to identify SLA / SLO metrics for platform health and customer usage
Collaborate with development to design the aggregation of user and session logs and infrastructure health logs
Design and implement application and infrastructure logging for monitoring, operating, and debugging
Collaborate with engineering to identify necessary logging workflows, log levels, and end-to-end user interaction monitoring (RUM)
Define a disaster recovery and replication strategy to provide zero-downtime failovers
Establish authentication and authorization using IAM principles and OIDC
Generate the RBAC security model using the principle of least privilege
Implement multi-AZ deployment, disaster recovery, and replication strategy
Maintain security posture: container scanning, vulnerability remediation, and WAF and SIEM configuration

This is an exciting opportunity to join a growth start-up in a hyper-growth market.