SME – Observability, ELK Stack & Monitoring Engineer

Location: Fairfax, Virginia, United States
Type: Remote, Onsite

Senior Observability Engineer (SME)


Long-term contract - 2+ years
100% remote in the continental US


Job Description:


Our client's Enterprise Observability team is looking for a senior-level ELK Stack Subject Matter Expert (SME). The team is responsible for enterprise infrastructure, application, and network observability, with a primary focus on log management and metrics. The selected candidate will be joining a team of skilled engineers with a broad background in enterprise observability.

Your Impact:


As an ELK Stack Engineer, you will focus on maintaining the reliability, scalability, and availability of our enterprise Elastic Stack solution, which is used for log management, metrics, and observability. The role relies heavily on automation with tools like Terraform and Ansible, and requires you to maintain performance KPIs and define SLOs for the platform.

Responsibilities:


  • Maintain and deploy monitoring and alerting systems within the ELK Stack.
  • Design, configure, and maintain our large-scale log aggregation solution using Elasticsearch and Logstash.
  • Set up and manage data ingestion pipelines and transformations using tools like Filebeat, Logstash, and/or Fluentd/Fluentbit.
  • Embrace the mindset of "automate any task" to improve efficiency.
  • Build and maintain robust monitoring systems using Elasticsearch, Kibana, and Beats to proactively detect potential issues and trigger timely alerts.
  • Maintain associated documentation as it applies to our audit and certification requirements.
  • Participate in troubleshooting, capacity planning, and performance analysis activities related to the ELK Stack.
  • Research new observability requirements and, in many cases, write code to implement them.
  • Set up monitoring policies, rules, and templates, and write scripts to meet observability requirements.

What you need to succeed:


  • BS/MS in CS/Engineering or equivalent, OR 5+ years of experience.
  • 4+ years of experience working directly with the ELK Stack as an Admin, SME, or Architect.
  • Hands-on experience designing data pipelines using Logstash and/or Fluentd/Fluentbit.
  • Expert-level knowledge of the ELK Stack (on-prem and cloud), including best practices related to performance, security, and component setup (Elasticsearch, Logstash, Kibana, Beats).
  • Fluent in writing scripts to automate tasks in Python and in either Bash or PowerShell.
  • Experience in Terraform and Ansible, including syntax, best practices, and managing complex configurations to build and manage infrastructure and applications.
  • Strong working knowledge of Linux.
  • Highly self-motivated and directed.
  • Good analytical and problem-solving/troubleshooting abilities.
