At Equifax, behind every reliable digital experience lies a team of dedicated engineers relentlessly focused on system reliability, performance, and innovation. Among them is Sumi Sulai, a Site Reliability Engineer (SRE) whose day-to-day work keeps our critical systems running smoothly for millions of users around the world. We caught up with Sumi to understand what her typical day looks like and what drives her passion for technology and teamwork.
Walk us through a typical workday. What are some of the first things you do when you arrive, and how does your day usually progress?
My day typically begins with a review of our monitoring dashboards and alerting systems to check for any anomalies or critical updates from the previous night. This initial assessment helps me set my priorities for the day. I usually split my time between strategic planning, architecture reviews, and operational tasks — all of which are essential to ensuring the long-term reliability of our systems. A significant part of my role also involves collaborating with other SREs and mentoring newer team members at Equifax India. When incidents arise, I take the lead on incident management, focusing on root cause analysis, effective resolution, and thorough post-mortems.
What tools or technologies are essential to your work, and why?
Our team relies heavily on Google Cloud Platform (GCP) for all deployments. For observability we use Prometheus, and for alerting PagerDuty is our platform of choice. Our CI/CD pipeline is built on Jenkins and Argo CD and follows GitOps practices, enabling seamless and reliable delivery. Security is a top priority, so we integrate industry-leading software composition analysis (SCA) tools and code scanners. Together, these tools enable us to operate efficiently, maintain high availability, and uphold our security standards.
What aspects of your work do you find most rewarding or motivating?
I find it most rewarding to contribute to the stability and availability of critical systems that impact millions of users. Solving complex technical problems through collaboration and innovation is intellectually stimulating and deeply satisfying. I also take pride in mentoring my team. Supporting their growth and development, both technically and professionally, is incredibly fulfilling. Beyond that, I value the collaborative spirit within the SRE community — it’s energizing to work in an environment where knowledge is freely shared and solutions are built collectively.
How would you describe the work environment and team dynamics within the PEC SRE community?
The SRE community is rooted in transparency, collaboration, and shared ownership. We prioritize open communication and collective problem-solving, which creates a supportive and empowering environment. It’s a team that embraces continuous learning and constantly seeks to improve both skills and systems. We also believe in building on proven solutions rather than reinventing the wheel, which lets us move faster.

What is your approach to tackling a complex technical problem that has no immediate solution?
I take an organized and collaborative approach. First, I work to fully understand the problem by gathering all the relevant data and insights. Then, I form hypotheses and design controlled experiments to test them. Teamwork is key — we often brainstorm and validate ideas together, which helps us find the best path forward. Once we land on a solution, we focus on implementation and continuous improvement.
What are some of the biggest challenges you face in your role, and how do you overcome them?
One of the primary challenges is balancing rapid innovation with the need for extreme reliability. I address this by involving SREs early in the development cycle, establishing strong SLOs/SLIs, and implementing automation and guardrails from the start. Another challenge is the complexity of managing distributed systems. We mitigate this through robust observability, well-documented runbooks, and consistent knowledge sharing across the team.
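For readers less familiar with SLIs and SLOs, the idea of an SLO-based guardrail can be sketched in a few lines of Python. This is an illustrative sketch only, not Equifax's tooling: the function names, the 99.9% target, and the request counts are all assumptions chosen for the example.

```python
def availability_sli(good_requests: int, total_requests: int) -> float:
    """SLI: the fraction of requests that were served successfully."""
    if total_requests == 0:
        return 1.0  # no traffic in the window, so nothing failed
    return good_requests / total_requests


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative if overspent).

    The error budget is the failure rate the SLO tolerates: 1 - slo_target.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    allowed_failure = 1.0 - slo_target
    actual_failure = 1.0 - sli
    return 1.0 - actual_failure / allowed_failure


def deploy_allowed(sli: float, slo_target: float) -> bool:
    """Hypothetical guardrail: block risky changes once the budget is spent."""
    return error_budget_remaining(sli, slo_target) > 0.0


# Hypothetical numbers: 1,000,000 requests in the window, 500 of them failed.
sli = availability_sli(good_requests=999_500, total_requests=1_000_000)
budget = error_budget_remaining(sli, slo_target=0.999)
print(f"SLI: {sli:.4%}, error budget remaining: {budget:.0%}")
print("deploy allowed:", deploy_allowed(sli, slo_target=0.999))
```

With these assumed numbers, roughly half of the error budget is still available, so the guardrail would let a deployment proceed. Once failures exceed what the SLO tolerates, the same check would hold back further changes until reliability recovers.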
How does your team typically communicate and share information? What tools or methods do you use?
We use Google Chat for real-time communication and alerts. For documentation and work tracking, Confluence and Jira are our primary tools — especially for runbooks and architecture references. Daily stand-ups keep everyone aligned, while email is used for broader updates and meeting invites. We also have scheduled deep-dive discussions to ensure clarity, and structured on-call handoffs to support smooth transitions.
How do you handle disagreements or conflicts within a team setting?
I approach disagreements with open dialogue, active listening, and a strong foundation in data-backed SRE principles. My focus is on fostering mutual understanding, encouraging compromise, and tracking outcomes. This approach, deeply rooted in our One Equifax culture, helps ensure that even tough conversations lead to stronger, more effective solutions.
Learn more about available opportunities at Equifax.