Do you want to love what you do at work? Do you want to make a difference, have an impact, and transform people's lives? Do you want to work with a team that believes in disrupting the normal, boring, and average?
If yes, then this is the job you are looking for.
webook.com is Saudi Arabia's #1 event ticketing and experience booking platform in terms of technology, features, agility, and revenue. It serves some of the largest mega events in the Kingdom and has surpassed 2 billion in sales.
Role Overview
Design high-quality prompts, system instructions, and tooling that make our LLM features accurate, safe, and cost-effective. You'll own evaluation, prompt versioning, and continuous improvement.
Key Responsibilities
- Author, refactor, and chain prompts (system/tool/policy) for varied tasks
- Create offline/online evaluation harnesses (rubrics, golden sets, metrics)
- Build prompt libraries with versioning, A/B testing, and telemetry
- Reduce hallucinations via verification, constrained decoding, and tool use
- Implement safety: jailbreak/prompt-injection tests, content policy checks, PII handling
- Partner with engineers to integrate prompts into production features
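To give a flavor of the evaluation and prompt-versioning work listed above, here is a minimal sketch in Python. It is illustrative only and not a description of our actual stack: `PromptVersion`, `GoldenCase`, `evaluate`, and `stub_model` are hypothetical names, the model is a hard-coded stand-in rather than a real LLM call, and exact match is just one simple metric among many.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class PromptVersion:
    """A prompt template tagged with a version so results can be compared across revisions."""
    name: str
    version: str
    template: str  # filled in with str.format(question=...)


@dataclass(frozen=True)
class GoldenCase:
    """One entry of a golden set: an input and the answer we expect."""
    question: str
    expected: str


def exact_match(output: str, expected: str) -> bool:
    """Case- and whitespace-insensitive exact match (a deliberately simple metric)."""
    return output.strip().lower() == expected.strip().lower()


def evaluate(
    prompt: PromptVersion,
    model: Callable[[str], str],
    cases: Sequence[GoldenCase],
) -> float:
    """Return the fraction of golden cases the model answers correctly with this prompt."""
    if not cases:
        raise ValueError("golden set must not be empty")
    hits = sum(
        exact_match(model(prompt.template.format(question=case.question)), case.expected)
        for case in cases
    )
    return hits / len(cases)


def stub_model(prompt_text: str) -> str:
    """Hypothetical stand-in for a real LLM call; it only knows one answer."""
    return "Riyadh" if "capital of Saudi Arabia" in prompt_text else "unknown"


GOLDEN_SET = [
    GoldenCase("What is the capital of Saudi Arabia?", "Riyadh"),
    GoldenCase("What is the capital of France?", "Paris"),
]

PROMPT_V1 = PromptVersion(
    name="qa",
    version="1.0.0",
    template="Answer in one word.\nQuestion: {question}",
)

accuracy = evaluate(PROMPT_V1, stub_model, GOLDEN_SET)
print(f"{PROMPT_V1.name}@{PROMPT_V1.version}: accuracy={accuracy:.2f}")
```

In practice the stub would be replaced by a real model client, and the same `evaluate` call would be run against each prompt version so that a change can be compared on the same golden set before it ships.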
Requirements
- Demonstrated prompt design across multiple task types and models
- Experience building eval datasets and automated scoring (e.g., accuracy, faithfulness, utility, cost/latency)
- Familiarity with retrieval-augmented generation concepts and tool/function calling
- Strong scripting (Python/TypeScript) for data prep, evals, and analysis
- Clear writing; ability to translate business goals into measurable prompt specs
Nice-to-Haves
- Experience with LangChain/LLM orchestration, vector stores, and rerankers
- Knowledge of safety tooling and red-teaming techniques
- Experience with experimentation platforms (feature flags, A/B tests) and analytics