Which role best describes your day-to-day work?
- ML engineer
- Backend engineer
- Data scientist
- MLOps/Platform
- Product engineer
- Researcher
- Architect/Tech lead
- Other
Have you built or maintained a RAG system in the last 6 months?
Which primary content source feeds your retriever today?
- Proprietary documents
- Code repositories
- Product knowledge base
- Web crawl
- Vendor API docs
- Slack/Chat logs
- Support tickets
- Wiki/Confluence
- Database/warehouse
- Not applicable
- Other
In the last 30 days, how well did retrieved context meet task requirements?
Range: 1 – 10
Min: Far below needs | Mid: Adequate | Max: Far above needs
How do you set or tune top-k and related retrieval parameters?
- Manual experimentation
- Grid/Random search
- Bayesian optimization
- Vendor auto-tuning
- Learned retrieval policy
- Not tuned
- Other
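For context on the “Grid/Random search” option, here is a minimal sketch of sweeping candidate top-k values against a frozen eval set. The `retrieve`, `score_fn`, and `eval_queries` names are hypothetical stand-ins for your own retriever, scoring function (e.g., precision@k or answer faithfulness), and labeled queries; this is illustrative only, not a recommended procedure.

```python
# Illustrative only: grid search over top-k, keeping the value with the best
# mean score on a frozen eval set. All inputs are hypothetical stand-ins.
def tune_top_k(retrieve, score_fn, eval_queries, candidate_ks=(3, 5, 10, 20)):
    best_k, best_mean = None, float("-inf")
    for k in candidate_ks:
        # Score each eval query with the retriever configured at this k.
        scores = [score_fn(q, retrieve(q, top_k=k)) for q in eval_queries]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_mean:
            best_k, best_mean = k, mean_score
    return best_k, best_mean
```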
In the last 30 days, how often did you encounter these retrieval issues?
| Issue | Never | Rarely | Sometimes | Often | Very often |
|---|---|---|---|---|---|
| Irrelevant passages retrieved | • | • | • | • | • |
| Relevant items ranked too low | • | • | • | • | • |
| Outdated content surfaced | • | • | • | • | • |
| Duplicate or near-duplicate results | • | • | • | • | • |
| Contexts too long for the task | • | • | • | • | • |
| Sparse or long-tail queries underperformed | • | • | • | • | • |
Attention check: please select “Often” for this item.
- Never
- Rarely
- Sometimes
- Often
- Very often
How are model answers grounded or cited?
- Inline citations with URLs
- Inline citations with document IDs
- Evidence block after the answer
- Tool outputs included verbatim
- Structured JSON evidence list
- No grounding/citations
- Other
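For respondents unsure what “Structured JSON evidence list” refers to, one hypothetical shape is sketched below; every field name and value is made up for illustration and is not a standard format.

```python
import json

# Hypothetical evidence payload attached to a model answer; field names and
# values are illustrative only.
response = {
    "answer": "Rate limits are 600 requests per minute per API key.",
    "evidence": [
        {"doc_id": "kb-1042", "chunk": 7, "score": 0.83,
         "quote": "Limits are enforced at 600 requests/minute per key."},
        {"doc_id": "kb-0311", "chunk": 2, "score": 0.71,
         "quote": "Exceeding the limit returns HTTP 429."},
    ],
}
print(json.dumps(response, indent=2))
```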
Rate your trust in the correctness of cited evidence (last 30 days).
Scale: 10 (star)
Min: Low trust | Max: High trust
Rank your preferred citation/grounding display styles.
Drag to order (top = most important)
- Inline per sentence
- Numbered endnotes
- Collapsible evidence panel
- Top-k sources with scores
- Link to full passages
- Show only on demand
How frequently did you observe hallucinations despite grounding (last 30 days)?
Range: 1 – 10
Min: Never | Mid: Occasional | Max: Very frequent
Briefly describe a recent grounding failure and its impact.
Max 600 chars
Which evaluation tools or libraries do you use for RAG?
- Ragas
- TruLens
- DeepEval
- Promptfoo
- Custom harness
- LlamaIndex evals
- None
- Other
Which metrics best reflect your RAG quality today?
- Precision@k
- Recall@k
- MRR
- nDCG
- Answer faithfulness
- Context precision/recall
- Groundedness score
- Human ratings
- Production usage signals
- Custom internal metrics
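So that the ranking-metric options are read consistently, here is a minimal sketch of precision@k, recall@k, and reciprocal rank (the per-query quantity averaged to get MRR) on a toy ranked list with binary relevance; the document IDs are made up for illustration.

```python
# Illustrative metric definitions on a single query with binary relevance labels.
def precision_at_k(ranked_ids, relevant_ids, k):
    return sum(1 for d in ranked_ids[:k] if d in relevant_ids) / k

def recall_at_k(ranked_ids, relevant_ids, k):
    return sum(1 for d in ranked_ids[:k] if d in relevant_ids) / len(relevant_ids)

def reciprocal_rank(ranked_ids, relevant_ids):
    for rank, d in enumerate(ranked_ids, start=1):
        if d in relevant_ids:
            return 1.0 / rank
    return 0.0

ranked = ["d3", "d7", "d1", "d9"]   # retriever output, best first (made up)
relevant = {"d1", "d7"}             # ground-truth relevant docs (made up)
print(precision_at_k(ranked, relevant, 3))   # 2/3: two of the top 3 are relevant
print(recall_at_k(ranked, relevant, 3))      # 1.0: both relevant docs are in the top 3
print(reciprocal_rank(ranked, relevant))     # 0.5: first relevant doc is at rank 2
```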
Name your key custom metric or how you compute it.
Max 100 chars
How automated is your evaluation workflow?
- None (manual only)
- Some scripts
- CI-integrated checks
- Continuous eval in production
How often do you run RAG benchmarks?
- Before each release
- Weekly
- Biweekly
- Monthly
- Quarterly
- Ad hoc only
Primary vector store or retriever backend in use?
- Pinecone
- Weaviate
- Milvus
- FAISS
- Elasticsearch/OpenSearch
- pgvector
- Chroma
- Vespa
- Not applicable
- Other
Which embedding model is your primary choice?
- OpenAI text-embedding-3-large
- OpenAI text-embedding-3-small
- Cohere Embed
- VoyageAI
- Jina embeddings
- E5 family
- Instructor
- BGE family
- Local model
- Other
Do you use a reranker after initial retrieval?
Which reranker do you use most often?
- Cohere Rerank
- Voyage Rerank
- Jina Reranker
- Cross-encoder (e.g., MS MARCO)
- Self-hosted reranker
- Not applicable
- Other
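As an illustration of the cross-encoder option, here is a minimal post-retrieval reranking sketch using the sentence-transformers `CrossEncoder` class; the model name is one publicly available MS MARCO cross-encoder, shown as an example rather than a recommendation.

```python
# Illustrative reranking: the cross-encoder scores each (query, passage) pair
# jointly, and we keep the top_n highest-scoring passages.
from sentence_transformers import CrossEncoder

def rerank(query, passages, top_n=5,
           model_name="cross-encoder/ms-marco-MiniLM-L-6-v2"):
    model = CrossEncoder(model_name)  # loaded per call for simplicity; cache in practice
    scores = model.predict([(query, p) for p in passages])
    ranked = sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)
    return [passage for passage, _ in ranked[:top_n]]
```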
What is your end-to-end RAG latency target per query (ms)?
Accepts a numeric value
Whole numbers only
Years of professional experience (software/data/ML)?
- 0–1
- 2–4
- 5–9
- 10–14
- 15+
- Prefer not to say
Region you primarily work in?
- Africa
- Asia
- Europe
- North America
- Oceania
- South America
- Prefer not to say
Organization size (employees)?
- 1–10
- 11–50
- 51–200
- 201–1000
- 1001–5000
- 5001+
- Prefer not to say
Primary industry/domain for your RAG work?
- Technology
- Finance
- Healthcare/Life sciences
- Retail/CPG
- Education
- Government/Public sector
- Manufacturing
- Media/Entertainment
- Other
- Prefer not to say
Primary programming language you use for RAG?
- Python
- JavaScript/TypeScript
- Java
- Go
- C
- Rust
- Other
How critical is retrieval quality to your RAG outcomes?
Range: 1 – 10
Min: Not important | Mid: Moderately important | Max: Critical
Overall satisfaction with your RAG system today.
Scale: 10 (star)
Min: Very dissatisfied | Max: Very satisfied
Rank your top priorities for the next 3 months.
Drag to order (top = most important)
- Improve retrieval precision/recall
- Better grounding/citations
- Reduce latency
- Lower cost per query
- Scale to more data sources
- Harden evaluation pipeline
- Security/compliance
- Developer ergonomics
Anything else we should know about your RAG retrieval or grounding?
Max 600 chars
Welcome! This short survey takes about 6–8 minutes. Your responses are anonymous and will be reported in aggregate only.
AI Interview: 2 Follow-up Questions on RAG Retrieval and Grounding
Length: 2 | Personality: Expert Interviewer | Mode: Fast
Thank you for your time; your input helps improve RAG systems!