Software Site Reliability Engineer II (Remote/Hybrid PST)

Remote, USA Full-time Posted 2025-02-21

Job Description

iTalent Digital is a leading, woman- and minority-owned global technology consulting company. We are seeking two (2) Software Site Reliability Engineers II to join our diverse and dynamic global team. This is a long-term, ongoing opportunity to assist our Fortune 500 tech client in Silicon Valley. The client is a global financial technology platform that powers prosperity for the people and communities it serves, with approximately 100 million customers worldwide using a broad product portfolio.

The roles may be remote or hybrid in the PST zone. The individual selected will be instrumental in helping us continue to deliver excellence to our base of leading global accounts.

You will also interact closely with iTalent's Communities of Practice, expand your network, and grow your career. This is a unique chance to meet others who think differently and are passionate about challenging the status quo!

Job Title: Software Site Reliability Engineer II

Job Overview

Join the FinTech Payments Platform as a Software Site Reliability Engineer II. You will join a trusted financial expert that empowers financial prosperity for businesses and consumers through a convenient, powerful, AI-native fintech platform providing fast and easy access to funds at the time of need. You will help process millions of transactions every day across various payment methods.

Responsibilities
• Serve as the first level of support: handle and investigate incidents, production issues, and alerts
• Identify, design, and build tooling and observability solutions that ensure high availability, scalability, and performance of our production systems
• Ensure the highest standards for engineering design, implementation, and testing
• Accurately scope effort, identify risks, and clearly communicate trade-offs to team members and other stakeholders
• Investigate production issues and provide valuable insights to the core teams
• Pursue and resolve complex technical problems and share key learnings
• Stay aware of industry trends to inform technology choices and strategic decisions
• Collaborate closely with peers, cross-functional teams, and business units to define, prioritize, sequence, and scope business and functional requirements and drive results forward

Required qualifications and skills
• 2+ years of related experience on an SRE/NOC team
• Six Sigma experience (Green or Black Belt preferred)
• Expert in one of the following: automation, monitoring tools, cloud operations
• Solid AWS experience
• Comfortable with backend or full-stack coding and scripting: strong experience with Java/J2EE, Go, Python, REST, SOAP, JSON
• Skilled in software development lifecycle processes, with experience in Scrum and Agile development
• Knowledge of current trends and best practices in the modern SaaS technology landscape
• Experience in leveraging Amazon Web Services for building scalable applications
• High adaptability and flexibility
• Works well under pressure
• Has a passion for working on systems that are highly reliable, maintainable, scalable, and secure
• High-energy self-starter with a positive mindset

Skills:

Operational Excellence:
• Proactively identifies and resolves product stability issues, thereby improving quality and availability
• Expertise in designing and implementing advanced CI/CD and automation/resiliency concepts such as Progressive Rollouts and Failure Modes and Effects Analysis (FMEA)
• Identifies and drives resiliency, cost optimization, and process improvements
• Manages and performs on-call duties to ensure operational excellence and quick resolution of production incidents

Software Fundamentals:
• Writes and reviews code to eliminate complexity while ensuring security, scalability, performance, testability, resiliency, and maintainability
• Expert at diagnosing and resolving cross-capability issues, with a focus on tooling and observability
• Enhances test coverage, including unit tests, end-to-end tests, and integration tests, to maintain production system robustness
• Experience with metrics, monitoring, and alerting tools such as Splunk, Wavefront, AppDynamics, Prometheus, and PagerDuty

Design and Architecture:
• Promotes standard practices for tooling, monitoring, and observability
• Develops tools that focus on improving system observability, including metrics, logging, and tracing

Communications:
• Can persuade others of the merits of a design, especially for tooling and observability solutions that ensure system reliability and performance
• Is receptive to feedback from peers and acts accordingly, particularly in high-pressure incident resolution scenarios
• Collaborates with other team members to solve problems more effectively, emphasizing cross-functional collaboration during production incidents
• Demonstrated ability to explain complex technical issues to both technical and non-technical audiences

Preferred qualifications and skills

Preferred Experience

Experience with large-scale payment systems

Education

Bachelor's Degree

Company description

About iTalent Digital:

A woman- and minority-owned digital consulting company, we celebrate individuals and diversity, cultivating a culture where our people can excel and lead balanced lives. Recruitment at iTalent is guided by an unwavering principle: Only hire the best. Because we have the best people, we have the privilege of working with the best clients, doing the best work, and effecting transformative change at work and in our communities.

What you get:

You get the chance to work with some of the best brands and high-performance teams out there! iTalent offers our W2 consultants excellent benefits such as medical, dental, vision, life insurance, paid holidays and PTO, and 401(k) with matching. We are growing and we want to see you grow!

Log on to iTalentdigital.com to learn more about what working at iTalent can mean for you.
