AI Prompt Engineer

RecruitSeq • Full-time • San Francisco, CA, US • $90k - $145k / year • 3w ago

San Francisco, CA (On-Site M-F)

Our client is an early-stage, AI-native technology company building AI-powered call center and scheduling agents.

About the Role

As an AI Prompt Engineer, you will own critical behavioral slices of production voice agents, shaping both shared and customer-specific behaviors across thousands of calls. You will design prompts, sub-agent architectures, and evaluation harnesses to iteratively improve automation, booking, and resolution rates using real call data.

Responsibilities

Write, maintain, and version-control production prompts for intent classification, information extraction, scheduling and availability negotiation, verification flows, objection handling, and edge-case recovery across all customer deployments.
Review failed or low-performing calls daily, identify root causes, and ship targeted prompt or configuration updates multiple times per week to measurably improve automation and booking metrics.
Design and manage sub-agent architectures (e.g., routing, specialist agents, fallback handlers) that support complex multi-turn healthcare workflows while maintaining latency and reliability requirements.
Build and maintain offline evaluation harnesses, including curated eval sets, automated prompt optimization workflows (e.g., GEPA-style approaches), and regression test suites for safe shipping of changes.
Collaborate on human-in-the-loop onboarding flows that translate practice-specific intake forms, scheduling rules, and quirks into robust agent configurations, and define customer-specific evaluation metrics.
Simulate real-world caller scenarios, monitor live production performance dashboards, detect drift or degradation early, and coordinate fixes with engineering and operations.
Partner closely with software engineers to integrate prompts and agents into the broader AI stack, ensuring clean interfaces, observability, and reliable deployments in a high-volume environment.

Qualifications

2+ years of experience with AI/ML, NLP, or prompt engineering in production, including hands-on work shipping prompts or agents that real users relied on.
Demonstrated experience writing, testing, and iterating prompts for tasks such as classification, information extraction, scheduling, or conversational flows in high-stakes or operational contexts.
Strong analytical, data-driven mindset with comfort designing experiments, reading dashboards, and justifying changes with metrics (e.g., conversion or booking rate improvements).
Excellent writing skills, including sensitivity to tone, register, and phrasing in spoken or TTS-delivered interactions.
Comfort reading Python and working familiarity with TypeScript, with the ability to collaborate effectively using modern AI coding tools.
On-site availability five days per week in the San Francisco Bay Area.

Preferred Skills

Prior experience with voice AI, TTS, ASR, or telephony platforms and real-time conversational systems.
Automated prompt optimization experience using frameworks or approaches such as DSPy, GEPA, or similar techniques.
Experience building and maintaining evaluation suites, test harnesses, or CI pipelines for LLM-based agents.
Academic or practical training in linguistics, philosophy, cognitive science, or related fields that inform language and conversation design.