Quality Assurance Automation Lead

Teammates.ai


Contract type: Full time
Remote
About Teammates.ai


Teammates.ai builds Autonomous AI Teammates that take full ownership of business functions—customer support, sales, hiring, and more. Our Teammates (Raya for CX, Rashed for Sales, Sara for Interviewing) resolve thousands of real-world interactions every day across phone, email, WhatsApp, and chat in 50+ languages—delivering superhuman results at 1/10th the cost and 45× the speed of traditional teams. Venture-backed and partnered with public-sector innovators such as Dubai Centre for AI, we’re scaling fast and need bulletproof quality to match.


Role Overview


You will own product quality end-to-end—from LLM prompts and voice latency to dashboard UI regressions—so our AI teammates never drop a call, hallucinate, or crash in front of an enterprise user. You’ll design the strategy, build the automation, and champion a culture of quality as we roll out to high-stakes government and Fortune 500 clients.


Key Responsibilities


Quality Strategy

Define and evangelise test plans covering APIs, dashboards, and multilingual LLM outputs.


Automated Test Frameworks

Build/maintain Python test suites (Pytest, Playwright); integrate synthetic voice simulators in collaboration with the VoIP team.
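As a flavour of the work, here is a minimal sketch of the kind of Pytest suite this role would maintain: a fixture standing in for a live API call, plus a custom assertion with a descriptive failure message. The latency budget, fixture data, and function names are illustrative, not part of our actual codebase.

```python
import pytest

# Hypothetical latency budget for a single voice turn, in seconds (illustrative).
LATENCY_BUDGET_S = 1.5

@pytest.fixture
def transcript():
    # In a real suite this fixture would call the service under test;
    # a canned response keeps the sketch self-contained.
    return {"text": "Hello, how can I help?", "latency_s": 0.9}

def assert_within_latency(response, budget=LATENCY_BUDGET_S):
    # Custom assertion: fail loudly with the measured value in the message.
    assert response["latency_s"] <= budget, (
        f"latency {response['latency_s']}s exceeds budget {budget}s"
    )

def test_greeting_latency(transcript):
    assert_within_latency(transcript)
```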


LLM & Agent Evaluation

Implement prompt-drift and regression checks with RAGas / DeepEval / TruLens; maintain golden-set benchmarks.
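In spirit, a golden-set regression check looks like the sketch below: replay a fixed set of prompts, score the outputs against recorded expectations, and fail if the pass rate drops below a baseline. Real suites would use RAGas/DeepEval-style semantic metrics rather than exact string matches; the golden set, model stub, and baseline here are illustrative.

```python
# Illustrative golden set: prompts paired with expected answers.
GOLDEN_SET = [
    {"prompt": "What are your support hours?",
     "expected": "We are available 24/7."},
]

def run_model(prompt):
    # Stand-in for the LLM call under test.
    return "We are available 24/7."

def pass_rate(golden_set, model=run_model):
    # Fraction of golden cases the current model still answers correctly.
    hits = sum(1 for case in golden_set
               if model(case["prompt"]) == case["expected"])
    return hits / len(golden_set)

def check_no_regression(rate, baseline=0.95):
    # Gate: the release fails if accuracy drifts below the recorded baseline.
    return rate >= baseline
```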


CI/CD & Release Gates

Wire tests into GitHub Actions; block builds that exceed latency, accuracy, or error-rate thresholds.
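A release gate of this kind can be as simple as a script that CI runs after the test stage: compare measured metrics against thresholds and exit nonzero on any violation, which fails the build. The threshold names and values below are hypothetical placeholders.

```python
import sys

# Hypothetical release thresholds a CI job might enforce (illustrative values).
THRESHOLDS = {"p95_latency_ms": 1200, "error_rate": 0.01}

def gate(metrics, thresholds=THRESHOLDS):
    """Return human-readable violations; an empty list means the build passes."""
    violations = []
    for key, limit in thresholds.items():
        value = metrics.get(key)
        if value is None or value > limit:
            violations.append(f"{key}={value} exceeds limit {limit}")
    return violations

if __name__ == "__main__":
    measured = {"p95_latency_ms": 980, "error_rate": 0.004}  # illustrative numbers
    failures = gate(measured)
    for failure in failures:
        print(failure, file=sys.stderr)
    sys.exit(1 if failures else 0)  # nonzero exit blocks the build
```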


Cross-Functional Partnering

Work closely with Data Science on eval metrics and with VoIP Engineering to run pre-release call-flow smoke tests.


Requirements


Technical Expertise

  • 5+ yrs building QA automation for SaaS or real-time systems.
  • Deep Python proficiency; comfortable writing fixtures, custom assertions, and CLI tools.
  • Hands-on with Playwright or Cypress for UI; Pytest or Robot for API (or similar).
  • Familiarity with LLM evaluation tooling (RAGas, DeepEval, TruLens) or demonstrable ability to pick it up fast.
  • CI/CD in GitHub Actions, CircleCI, or similar.
  • Clear written communication; you can draft a concise bug report that gets fixed the first time.


Nice to Have

  • Knowledge of prompt-engineering pitfalls, multilingual TTS/ASR quirks, or agentic evaluation methods.
  • Experience testing audio / VoIP flows (RTP, SIP) or real-time streaming APIs.
  • Security/compliance test exposure (SOC 2, ISO 27001).


Success Looks Like
  • <0.5% production escape rate on critical bugs.
  • 10× faster release confidence, reducing manual QA cycles from days to minutes.
  • A self-service, repeatable test harness that any engineer (or even an AI teammate) can run locally or in CI.


Why Join Us
  • Cutting-edge playground: work at the intersection of LLMs, real-time voice synthesis, multi-agent orchestration, and AI SaaS.
  • High-trust team: small, senior crew backed by top-tier VCs; your decisions shape the quality culture for years.
  • High velocity: learn and execute at a pace few teams can match.


Apply Now

If you obsess over catching the bug before the user does—and you’re excited to test AI that actually does real work—send your CV and a note on your proudest quality win.


Your first-round interview will be with Sara, our AI Interviewer.
