Job Description: Prompt Engineer
Employment Type: Full-time
Work Type: Hybrid
Duration: 12 months (Annual Renewal)
Location: Gurgaon or Hyderabad
ABOUT THE ENGAGEMENT:
Aceolution is hiring for work that supports the development, evaluation, and safety calibration of frontier generative AI systems used by hundreds of millions of users worldwide.
KEY RESPONSIBILITIES
The engagement scope covers tuning and testing client APIs and running projects within the prompt-engineering pod. The work is likely to include agentic and tool-calling components. The day-to-day workflow runs from morning triage of overnight evaluation outputs, through hypothesis-driven prompt iteration, to regression testing and production deployment.
- Pull and review overnight QA logs to identify discrepancies between the client tool orchestration layer’s outputs and the human-validated baseline.
- Categorize errors: hallucinations (the model invented a rule), context misses (the model missed local language or cultural nuance), boundary failures (gray-area cases the model decided too rigidly).
- Form hypotheses on why the tool failed — cluttered context, missing few-shot examples, outdated RAG content — and refactor prompts to test the hypotheses.
- Add negative constraints, update few-shot examples, and request RAG layer updates as needed.
- Run regression tests against the Golden Dataset before deploying any prompt change. Confirm fixes do not silently regress other workflows.
- Deploy approved prompt changes to the live orchestration layer and monitor output for stability.
- Author hypothesis docs, contribute to weekly RCA reports, and participate in client calibration sessions.
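For candidates wondering what the regression step looks like in practice, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than the pod's actual tooling: the `GoldenCase` record, the `allow`/`block` labels, and the two stand-in classifier functions, which take the place of real prompt-plus-model calls. It shows one way to check that a candidate prompt does not break any Golden Dataset case that the current prompt already gets right.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class GoldenCase:
    """One human-validated example (hypothetical schema)."""
    case_id: str
    input_text: str
    expected_label: str


def failing_cases(cases: List[GoldenCase], classify: Callable[[str], str]) -> List[str]:
    """Return IDs of cases where the classifier disagrees with the baseline."""
    return [c.case_id for c in cases if classify(c.input_text) != c.expected_label]


def safe_to_deploy(
    cases: List[GoldenCase],
    current: Callable[[str], str],
    candidate: Callable[[str], str],
) -> Tuple[bool, List[str]]:
    """A candidate passes only if it breaks nothing the current prompt gets right."""
    before = set(failing_cases(cases, current))
    after = set(failing_cases(cases, candidate))
    newly_broken = sorted(after - before)
    return (not newly_broken, newly_broken)


# Toy Golden Dataset; labels and texts are invented for illustration.
GOLDEN = [
    GoldenCase("g1", "buy one get one free", "allow"),
    GoldenCase("g2", "guaranteed cure for diabetes", "block"),
    GoldenCase("g3", "limited time offer", "allow"),
    GoldenCase("g4", "miracle weight loss", "block"),
]


def current_prompt(text: str) -> str:
    # Stand-in for the live prompt: misses the "miracle" case (g4).
    return "block" if "cure" in text else "allow"


def candidate_prompt(text: str) -> str:
    # Stand-in for a refactored prompt: fixes g4 but is too rigid on "free" (g1).
    blocked_terms = ("cure", "miracle", "free")
    return "block" if any(term in text for term in blocked_terms) else "allow"


ok, newly_broken = safe_to_deploy(GOLDEN, current_prompt, candidate_prompt)
print("safe to deploy:", ok, "| newly broken:", newly_broken)
```

In this toy run the candidate fixes one case and silently breaks another, so the check would hold the change back for another iteration.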
MUST-HAVE SKILLS AND EXPERIENCE
- Hands-on production experience with at least one major LLM API: Gemini, OpenAI, or Anthropic. Specific Gemini API experience is strong-to-have but not mandatory; the patterns are highly transferable, and a strong candidate from another LLM platform will be considered.
- Must be able to discuss prompt strategies fluently — context window management, few-shot construction, system prompt design, structured output. Not just “I have used ChatGPT”.
- Strong analytical reasoning and structured problem-solving. The work is hypothesis → experiment → regression test → deploy → monitor.
- Python fluency is non-negotiable. You read, write, modify, and debug Python daily. People in this role are the engineering workhorses of the pod, not prompt hobbyists; “I can run a notebook” is not enough.
- Comfortable with object-oriented Python. The agent and tool-calling frameworks in this space are class-based, so you need to read and extend OOP code, not just write top-to-bottom scripts. You don’t need to be a systems architect, but a scripts-only background will not clear the bar.
- Strong written communication. Daily output includes hypothesis docs, RCA contributions, and inputs to the client-facing weekly review.
- High tolerance for ambiguity. The work involves gray-area judgements where guidelines are 50% indicative; you must be comfortable making subjective decisions and defending them.
- Linguistic precision and cultural awareness.
STRONG-TO-HAVE SKILLS
- Prior experience with prompt-engineering patterns: chain-of-thought, few-shot, ReAct, structured output.
- Familiarity with eval datasets, labelled-data workflows, or human-in-the-loop systems.
- Background in content moderation, ad evaluation, or trust-and-safety operations.
- Familiarity with API tools (Postman) and basic understanding of REST APIs and JSON.
- Multilingual fluency, especially in any major non-English market language.
Important notice:
Aceolution Inc. will never request a monetary deposit for any role or project with the company, and our recruitment and sourcing teams only use @aceolution.com addresses when emailing candidates. Ignore any email from aceolutions.com, a spam domain that has been circulating over the past few months.