Interview Preparation
Great interviews combine structured technical reasoning with concise communication, whether you target specialist AI roles or broader engineering positions.
Before the interview
- Understand the role scope, stack, and product constraints.
- Prepare 3–4 project stories with architecture and measurable outcomes.
- Review likely system design and model or agent evaluation trade-offs.
Technical rounds
- Explain assumptions before jumping into implementation details.
- Discuss reliability, safety, and cost implications early.
- For agentic systems, explain tool selection, planning, and fallback behavior.
- Use simple diagrams or verbal structures to keep answers clear.
Behavioral rounds
- Use STAR format and focus on your contribution and learning.
- Show ownership, conflict resolution, and delivery under constraints.
- Keep stories concise and aligned with the role's priorities.
Post-interview follow-up
Send a short thank-you note, reiterate role fit, and include one concrete point from the discussion to demonstrate attention and professionalism.
Sample GenAI Interview Q&As
These questions come up frequently in AI engineer technical rounds. Practise answering each out loud before your interview.
Q: Explain the difference between RAG and fine-tuning.
A: RAG retrieves context at inference time from an external knowledge base without changing model weights — ideal for frequently updated knowledge. Fine-tuning modifies weights by training on domain data — better for consistent style/vocabulary. Production systems often combine both.
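The "retrieves context at inference time without changing model weights" part can be sketched in a few lines. This is a deliberately naive keyword-overlap retriever over a toy knowledge base (both are illustrative, not a production embedding-based retriever), showing how RAG assembles the prompt at request time:

```python
# Minimal RAG sketch: retrieve relevant passages at inference time and
# assemble them into the prompt -- model weights are never modified.
# The knowledge base and word-overlap scoring below are illustrative only;
# real systems use embedding similarity over a vector index.

KNOWLEDGE_BASE = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am to 5pm, Monday through Friday.",
    "Premium plans include priority email support.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank passages by naive word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda p: len(q_words & set(p.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str) -> str:
    """Ground the generation step in retrieved context."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What is the refund policy?"))
```

Updating the knowledge base here changes answers immediately, with no retraining; that is the operational argument for RAG over fine-tuning when facts change often.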
Q: How do you prevent prompt injection in an agent?
A: Separate instructions from data (never embed raw user input in the system prompt), use structured output schemas, validate all tool call arguments before execution, and add a secondary moderation layer for high-stakes actions.
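The "validate all tool call arguments before execution" step can be sketched as a gate the agent's proposed call must pass before anything runs. The tool names, schemas, and high-stakes list below are hypothetical examples, not a real API:

```python
# Sketch: validate an agent's proposed tool call before executing it.
# Tool names, argument schemas, and the high-stakes list are hypothetical.

ALLOWED_TOOLS = {
    # tool name -> required argument names and their expected types
    "search_docs": {"query": str},
    "send_refund": {"order_id": str, "amount": float},
}

HIGH_STAKES = {"send_refund"}  # route these through a secondary review layer

def validate_tool_call(name: str, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the call may proceed."""
    schema = ALLOWED_TOOLS.get(name)
    if schema is None:
        return [f"unknown tool: {name}"]
    problems = []
    for arg, typ in schema.items():
        if arg not in args:
            problems.append(f"missing argument: {arg}")
        elif not isinstance(args[arg], typ):
            problems.append(f"{arg} must be {typ.__name__}")
    for extra in set(args) - set(schema):
        problems.append(f"unexpected argument: {extra}")
    return problems

def needs_review(name: str) -> bool:
    """High-stakes actions get a secondary moderation pass even if valid."""
    return name in HIGH_STAKES
```

The key property is an allowlist: anything the model invents that is not in `ALLOWED_TOOLS`, or any injected extra argument, is rejected rather than executed.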
Q: How do you handle hallucination in a production RAG system?
A: Ground generation with retrieved context, instruct the model to cite sources, add a faithfulness check (e.g. with RAGAS), validate structured outputs, and flag low-confidence answers for human review.
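The "flag low-confidence answers for human review" step can be sketched with a crude faithfulness proxy: what fraction of the answer's content words appear in the retrieved context? Real faithfulness checks (such as RAGAS) use an LLM judge; this heuristic, stopword list, and threshold are illustrative only:

```python
# Crude faithfulness proxy: fraction of the answer's content words that
# appear in the retrieved context. Real systems use an LLM-as-judge
# (e.g. RAGAS faithfulness); this heuristic is illustrative only.

STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "and"}

def faithfulness_score(answer: str, context: str) -> float:
    answer_words = {w for w in answer.lower().split() if w not in STOPWORDS}
    context_words = set(context.lower().split())
    if not answer_words:
        return 1.0
    return len(answer_words & context_words) / len(answer_words)

def route(answer: str, context: str, threshold: float = 0.6) -> str:
    """Serve grounded answers; send low-scoring ones to human review."""
    if faithfulness_score(answer, context) >= threshold:
        return "serve"
    return "human_review"
```

The point to make in the interview is the routing decision itself: every answer gets a groundedness score, and a threshold decides serve versus escalate.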
Q: Walk me through your agentic system architecture.
A: Describe the planning layer (ReAct or function-calling), tool integrations (web, code execution, APIs), memory management (short-term prompt context plus a long-term vector store), a step limit for loop prevention, and observability via LangSmith or Arize.
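The "step limit for loop prevention" is worth being able to sketch on a whiteboard. Below, the planner and tool are stubs standing in for an LLM with function-calling and real tool implementations; all names are hypothetical:

```python
# Sketch of an agent loop with a hard step cap to prevent runaway loops.
# The planner and the tool registry are stubs; in a real system the
# planner is an LLM call returning a structured tool-call decision.

TOOLS = {
    "lookup": lambda q: f"result for {q!r}",
}

def fake_planner(history: list) -> dict:
    """Stub planner: perform one lookup, then finish with the observation."""
    if not history:
        return {"action": "lookup", "input": "user question"}
    return {"action": "finish", "input": history[-1]}

def run_agent(planner, max_steps: int = 5) -> str:
    history = []
    for _ in range(max_steps):  # hard cap: the loop cannot run forever
        step = planner(history)
        if step["action"] == "finish":
            return step["input"]
        observation = TOOLS[step["action"]](step["input"])
        history.append(observation)
    return "stopped: step limit reached"  # fallback behaviour
```

The fallback branch is the part interviewers probe: what the system does when the cap is hit (degrade gracefully, surface a partial answer, or escalate) is a design decision, not an error path.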
Q: How do you choose a vector database?
A: pgvector for teams already on Postgres (operational simplicity). Pinecone for fully managed scale. Weaviate/Qdrant for self-hosted with schema richness. At 100M+ vectors, purpose-built systems outperform pgvector.
Q: What metrics do you track in a production LLM system?
A: Latency (P50/P95/P99), token cost per request, hallucination rate (LLM-as-judge or RAGAS faithfulness), user satisfaction (thumbs up/down), error rate, and context window utilisation.
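Computing P50/P95/P99 from a window of request latencies is a fair follow-up question. A minimal sketch using the nearest-rank method (the sample latencies are made up):

```python
# Nearest-rank percentile over a window of request latencies.
# The sample data is illustrative; production systems typically use
# streaming sketches (e.g. t-digest) rather than sorting raw samples.
import math

def percentile(values: list[float], p: float) -> float:
    """Nearest-rank percentile for p in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))  # 1-indexed rank
    return ordered[rank - 1]

latencies_ms = [120, 95, 340, 150, 110, 980, 130, 105, 160, 2100]
for p in (50, 95, 99):
    print(f"P{p}: {percentile(latencies_ms, p)} ms")
```

Note how a single slow outlier dominates P95/P99 while leaving P50 untouched, which is exactly why tail percentiles are tracked alongside the median.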