Senior Site Reliability Engineer Job at MetaRouter, Denver, CO

  • MetaRouter
  • Denver, CO

Job Description

Salary: $130,000 - $170,000

Senior Site Reliability Engineer

About Us

MetaRouter provides highly reliable and robust Customer Data Infrastructure via Software-as-a-Service and Self-Hosted deployment options. Our platform allows organizations to tailor their digital data collection and processing pipelines to their unique needs. MetaRouter is designed to improve how organizations unify real-time data collection and processing while maintaining control over data privacy and security. As a result, our customers gain deeper insights into their consumers, optimize their marketing and advertising operations, mitigate data compliance and security risks, and make data-driven decisions with confidence.

We believe organizations that harness first-party data build trust with their audiences by meeting consumers where they are, with the specific products, services, and experiences they want, at the moment they need them most. Our purpose is to empower customers to take control of their data, unlocking differentiation, driving growth, and creating value for all stakeholders while meeting compliance regulations and respecting individual privacy rights.

About The Role

We are looking for a Senior Site Reliability Engineer (DevOps) with a minimum of 7-10 years of experience automating systems and proactively monitoring and resolving critical issues, who is also comfortable developing processes for a maturing SRE organization.

Core Responsibilities

  • Architect the creation, maintenance, and removal of cloud infrastructure that supports our applications and internal operations.
  • Manage deployment of our applications on cloud infrastructure.
  • Manage upgrades of infrastructure and a wide variety of intermediate software that supports our applications.
  • Set up and maintain dashboards, logs, metrics, and alerting mechanisms, with a focus on creating alerts that provide high signal and low noise.
  • Continuously improve observability by enhancing logging, metrics, and tracing systems to provide deeper insights into system performance, reduce time to resolution, and support proactive incident detection.
  • Lead the investigation and resolution of complex infrastructure and application issues, identifying root causes, driving systemic fixes, and mentoring others in effective troubleshooting practices.
  • Ensure that cloud infrastructure and our applications meet or exceed compliance requirements.
  • Establish and drive standards for infrastructure and process documentation, ensuring clarity, consistency, and long-term maintainability across teams and systems.
  • Drive best practices through code reviews by setting high standards for infrastructure, application, and service reliability, while mentoring engineers and influencing architecture and deployment patterns across teams.
  • Work with customers to determine and implement custom infrastructure requirements in a way that balances flexibility with repeatable, scalable patterns.
  • Lead design and architectural decisions for infrastructure and applications, driving improvements in automation, performance, reliability, and security at scale.
  • Provide technical leadership and mentorship to SRE team members, fostering a culture of growth, ownership, and continuous learning.
  • Partner cross-functionally with platform engineering and other stakeholders to define and deliver scalable infrastructure solutions for internal and customer-facing systems.
  • Apply business and technical acumen to prioritize and guide engineering efforts that maximize impact in resource-constrained environments.
  • Champion a culture of continuous improvement by identifying and implementing strategic process, system, and collaboration enhancements across teams.
  • Proactively identify and address technical and procedural risks before they escalate, exercising sound judgment and autonomy to drive long-term resilience and operational excellence.
  • Lead by example in the on-call rotation, setting standards for incident response, postmortems, and systemic resiliency improvements.
  • Design and implement scalable playbooks and alerting systems that reduce Mean Time To Repair (MTTR) by enabling rapid, consistent, and effective incident response.

Qualifications and Experience

  • 8+ years of experience in SRE or DevOps roles, with a strong track record of owning and scaling infrastructure on at least one major cloud provider (preferably GCP).
  • Deep expertise in configuring, maintaining, and troubleshooting Kubernetes clusters in production environments, including cluster architecture, security, and performance tuning.
  • Advanced proficiency with infrastructure and automation tools such as Bash, CI/CD pipelines, Docker, Git, Helm, Prometheus, Terraform, and YAML, with the ability to evaluate and implement tooling at scale.
  • Demonstrated experience architecting and managing identity and access management (IAM) and single sign-on (SSO) across complex, multi-platform environments.
  • Operational expertise with observability platforms such as New Relic (including NRQL), using telemetry to guide performance optimization, reliability improvements, and incident response strategies.
  • Familiarity with the operational aspects of modern application stacks, including Go and React/Node.js, with the ability to collaborate effectively across application and infrastructure domains.
  • Strong understanding of agile methodologies, with experience leading infrastructure initiatives within iterative development cycles.
  • Proven ability to prioritize and execute across a diverse set of responsibilities in a fast-paced, evolving environment, balancing tactical needs with long-term technical strategy.

Employment Details

Job Type: Full Time

Location: Fully Remote

Benefits

  • Health/Dental/Vision Insurance
  • 401(k)
  • Unlimited Vacation Policy
  • Fully Remote (US)

Job Tags

Full time, Remote work
