Senior Site Reliability Engineer (SRE) / Team Lead


Dark Wolf Solutions is seeking a Senior Site Reliability Engineer (SRE) / Team Lead to support the Unified Platform Cyber Operations & Security Center (COSC) in San Antonio, TX. The Senior SRE / Team Lead will be responsible for leading a multi-disciplinary team focused on platform reliability, operational resilience, performance optimization, and cloud infrastructure automation across multi-tenant, classified, and hybrid mission environments. This role blends deep technical expertise in cloud-native operations, Infrastructure as Code (IaC), observability, and automation with leadership responsibilities for mentoring, guiding, and growing a team of SREs.

Key Responsibilities


  • Lead the Site Reliability Engineering team supporting platform monitoring, incident response automation, service resilience, and performance optimization for COSC environments.
  • Architect and oversee deployment of observability solutions including logging, monitoring, alerting, telemetry ingestion, and performance dashboards.
  • Design and maintain Infrastructure as Code (IaC) pipelines to automate provisioning, scaling, and configuration of critical platform components.
  • Implement and enforce Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets to drive operational excellence.
  • Develop and optimize incident response workflows and playbooks; lead root cause analysis (RCA) and post-incident reviews (PIRs).
  • Manage cloud-native infrastructure in AWS GovCloud, Azure Government, and/or classified cloud environments.
  • Collaborate with Platform Engineers, Cloud Security Engineers, and DevSecOps teams to continuously improve system availability and performance.
  • Integrate monitoring and telemetry into the COSC SIEM and observability frameworks.
  • Support compliance efforts by ensuring observability and operational artifacts align with RMF, STIGs, and NIST standards.
  • Mentor junior SREs and cloud engineers, providing technical leadership and professional development support.

Basic Qualifications


  • Bachelor’s degree in Computer Science, Cybersecurity, Engineering, Information Technology, or a related technical field, or equivalent industry experience.
  • Minimum of 8–10 years of experience in system engineering, cloud engineering, or platform operations.
  • Minimum of 3 years of experience in a leadership or technical team lead role.
  • Strong experience operating cloud-native environments with AWS, Azure, and/or Kubernetes orchestration.
  • Deep understanding of Infrastructure as Code (IaC) tooling (e.g., Terraform, CloudFormation, Ansible) and GitOps practices (e.g., ArgoCD).
  • Expertise in monitoring and observability frameworks (Elastic Stack, Prometheus, Grafana, Fluentd, Loki, or equivalent).
  • Strong knowledge of SRE principles, service reliability engineering, and resilience patterns.
  • Experience with security compliance alignment including NIST 800-53, DoD STIGs, and Continuous ATO practices.
  • Experience leading incident management activities and conducting root cause analysis (RCA).
  • US Citizenship required with an active Secret clearance and eligibility for Top Secret/SCI.

Desired Qualifications


  • Certifications such as AWS Certified Solutions Architect, Certified Kubernetes Administrator (CKA), or Certified DevOps Engineer.
  • Familiarity with performance testing, chaos engineering, and fault injection frameworks.
  • Hands-on experience building or maintaining self-service internal developer platforms (IDPs).
  • Experience supporting DoD or Intelligence Community environments.
  • Knowledge of Zero Trust Architecture (ZTA) implementation in cloud-native environments.

The estimated salary range is $155,000.00 - $190,000.00, commensurate with experience, technical expertise, certifications, and clearance level.

Primary work location is San Antonio, TX. Hybrid model with a mix of remote and on-site support; on-site presence required for classified system activities.

We are proud to be an EEO/AA employer: Minorities/Women/Veterans/Disabled and other protected categories. In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire.

Life at Dark Wolf Solutions

Dark Wolf Solutions provides DevSecOps agile software development, information operations, penetration testing and incident response, applied research and rapid prototyping, machine learning, and mission support and engineering services to the Intelligence Community, national security, and Fortune 500 customers. By combining the most innovative emerging technologies with deep federal domain expertise, Dark Wolf operates at the nexus of technical innovation and mission needs.

Thrive Here & What We Value

  1. EEO/AA Employer
  2. Minorities/Women/Veterans/Disabled and other protected categories
  3. Continuous Learning and Improvement Mindset
  4. Hybrid Work Environment Supported
  5. Strong Technical Skills and Analytic Ability Valued
  6. Excellent Communication and Collaboration Skills Emphasized
  7. Attention to Detail and Organizational Abilities Expected
  8. Continuous Monitoring Practices Familiarity Preferred