In the last 14 days, approximately how did you divide your working time? Distribute exactly 100 points across the activities below.
Total must equal 100
- Project/feature work
- Incident response/on-call
- Maintenance/operations changes
- CI/CD and deployments
- Troubleshooting/bug fixing
- Meetings/coordination
- Documentation/runbooks
- Repetitive manual tasks
Min per option: 0
Whole numbers only
In the last 14 days, about how many hours per week did you spend on repetitive manual tasks?
Accepts a numeric value
Whole numbers only
In the last 30 days, which were your main sources of toil? Select up to 5.
- Noisy or flaky alerts
- Manual deployments
- Brittle CI/CD pipelines
- Environment drift or config mismatch
- Access or permissions requests
- Manual change approvals
- Capacity management chores
- Ticket handoffs or coordination
- Limited observability or telemetry gaps
- Flaky tests
- Rollback or roll-forward complexity
- Data migrations or backfills
- Tooling integrations or gaps
Rank the following by how disruptive they are to your focused engineering time (1 = most disruptive).
Drag to order (top = most disruptive)
- Noisy alerts/pages
- Manual deployments
- Access/permissions requests
- Environment setup/configuration
- Manual change approvals
- Capacity/infra changes
Which tooling do you actively use to manage reliability and reduce toil? Select all that apply.
- Alerting/Monitoring (e.g., Prometheus, Datadog)
- Incident management (e.g., PagerDuty, Opsgenie)
- Infrastructure as Code (e.g., Terraform, Pulumi)
- Configuration management (e.g., Ansible, Chef)
- CI/CD orchestration (e.g., Jenkins, GitHub Actions)
- Feature flags/progressive delivery
- SLO/Error budget tooling
- Runbooks/ChatOps automation
- Change management (e.g., ServiceNow)
- Internal developer portal (e.g., Backstage)
- Chaos/Resilience testing
Overall, how automated are your common operations tasks today?
Range: 1 – 10
Min: Not at all automated
Mid: About half automated
Max: Fully automated
How effective are your current tools for each area?
Area | Very ineffective | Ineffective | Neutral | Effective | Very effective |
---|---|---|---|---|---|
Deployments and rollbacks | • | • | • | • | • |
Incident mitigation | • | • | • | • | • |
On-call scheduling and handovers | • | • | • | • | • |
Permissions and access management | • | • | • | • | • |
Infrastructure provisioning and changes | • | • | • | • | • |
Observability and debugging | • | • | • | • | • |
Approximately how many manual steps did you automate or remove from runbooks in the last 30 days?
Accepts a numeric value
Whole numbers only
Attention check: To confirm you are paying attention, please select “I am paying attention.”
- I am paying attention
- I did not read the instructions
- I prefer to skip this question
Roughly how many incidents with user impact occurred in the last 30 days?
Accepts a numeric value
Whole numbers only
Compared to 3 months ago, how has your median time to resolve incidents changed?
- Improved (decreased)
- About the same
- Worsened (increased)
- Not sure/Don’t track
During your most significant incident in the last 30 days, what added the most toil?
- Paging noise or alert confusion
- Manual runbook steps
- Access or permissions delays
- Coordination or hand-off overhead
- Rollback or roll-forward complexity
- Limited data or observability gaps
- Change approvals or governance delays
- No significant incidents in the last 30 days
What single tooling change would most reduce toil for your team?
Max 100 chars
What are the biggest blockers to automating more of your operations work next quarter?
Max 600 chars
What is your primary role?
- SRE/Production Engineer
- Platform/Infrastructure Engineer
- Software Engineer
- DevOps Engineer
- Engineering Manager
- Other
How many years have you worked in this type of role?
Approximately how large is your organization?
- 1–49 employees
- 50–249
- 250–999
- 1,000–4,999
- 5,000–19,999
- 20,000+
Approximately how large is your SRE/Platform team?
How often do you take on-call rotations?
- Never
- Ad hoc/occasionally
- Weekly
- Every 2 weeks
- Monthly
- Less often than monthly
Which region best describes your primary working time zone?
- Americas
- EMEA
- APAC
- Other/Multiple
What is your work location model?
Any other comments about toil, reliability, or tooling that we didn’t cover?
Max 600 chars
AI Interview: 2 Follow-up Questions on Your Responses
Length: 2
Personality: Expert Interviewer
Mode: Fast
Thanks for your time. Your input helps us track toil and prioritize the right reliability tooling.