Professional System Engineer

Remote, USA | Full-time | Posted 2025-02-21

About the position

As a Professional System Engineer at AT&T, you will be at the forefront of delivering innovative solutions that enhance the customer experience across various channels, including online, retail, and care. Your role will involve supporting systems that are critical to AT&T's customer interactions, particularly through multiple internet-facing eCommerce applications, databases, and technology stacks. You will analyze production alerts and dashboards to identify outages that impact customers, ensuring that system health is continuously monitored to prevent potential issues that could lead to service disruptions.

In this position, you will be responsible for evaluating both upcoming projects and existing production monitoring systems to ensure that applications are equipped with the necessary monitoring tools to capture any customer-impacting outages. You will quantify the effects of production outages on customers, ensuring that appropriate support is provided for critical issues. Your duties will include providing 24x7 production support, troubleshooting incidents, and responding to outages effectively. You will also be tasked with application performance monitoring, dashboard creation, and implementing corrective actions to optimize existing alerts, thereby reducing false alarms and improving alert accuracy.

Your role will require you to engage with various stakeholders across operations, product delivery, infrastructure, and business units to build consensus during high-severity production outages. You will work on incident management and response as part of a Tier 1 production operations team, deploying cloud and other networks that deliver digital products to both employees and customers. Additionally, you will develop and implement customer journey dashboards to proactively monitor the availability of customer experiences, applying your knowledge of operational practices to enhance the organization's capability maturity.

Networking and troubleshooting will be key components of your role, utilizing technologies such as Docker, Kubernetes, Microsoft Azure Cloud, and Unix. You will also create dashboards using tools like Dynatrace, ELK, and Grafana, and utilize customer experience analytics tools including Quantum Metric and Tealeaf, along with visualization tools like Kibana and Grafana. Debugging logs from Java and microservices, as well as utilizing the EFK stack, Dynatrace, Catchpoint, and Nagios, will also be part of your responsibilities.

Responsibilities
• Deliver groundbreaking solutions that provide intuitive and integrated experiences for customers across online, retail, and care channels.
• Support systems that deliver AT&T's customer experience across multiple internet-facing eCommerce applications, databases, platforms, and technology stacks.
• Analyze production alerts and dashboards to identify customer-impacting outages.
• Examine system health status to identify issues that may lead to customer-impacting production outages.
• Evaluate upcoming projects and existing production monitoring to ensure applications have proper monitoring to capture outages.
• Quantify customer impacts of production outages to ensure proper support is provided on critical issues.
• Provide 24x7 production support and first-level troubleshooting of incidents and outage responses.
• Provide application performance monitoring, dashboarding, troubleshooting, and corrective actions.
• Optimize existing alerts to reduce false alarms and improve accuracy.
• Support incident management and problem management processes.
• Engage resources in high-severity production outages.
• Perform production support for mission-critical and high-performance applications including telecom and eCommerce.
• Build cross-organizational consensus across operations, product delivery, infrastructure, and business stakeholders.
• Work on incident management and incident response on a Tier 1 production operations team.
• Deploy cloud and other networks that deliver digital products to employees and customers.
• Develop and implement customer journey dashboards for proactive monitoring of customer experience availability.
• Apply knowledge of operations practices to increase operational capability maturity within the organization.
• Network and troubleshoot utilizing Docker, Kubernetes, Microsoft Azure Cloud, and Unix.
• Create dashboards on Dynatrace, ELK, and Grafana.
• Utilize customer experience analytics tools including Quantum Metric and Tealeaf.
• Utilize visualization tools like Kibana and Grafana.
• Debug Java and microservices logs.
• Utilize EFK stack, Dynatrace, Catchpoint, and Nagios.

Requirements
• Bachelor's degree in Computer Engineering, Computer Science, or Electrical and Electronic Engineering.
• Three years of experience in a related occupation building cross-organizational consensus across operations, product delivery, infrastructure, and business stakeholders.
• Experience working on incident management and incident response on a Tier 1 production operations team.
• Experience deploying cloud and other networks that deliver digital products to employees and customers.
• Experience developing and implementing customer journey dashboards for proactive monitoring of customer experience availability.
• Knowledge of operations practices and of increasing operational capability maturity within an organization.
• Experience networking and troubleshooting utilizing Docker, Kubernetes, Microsoft Azure Cloud, and Unix.
• Experience utilizing customer experience analytics tools including Quantum Metric and Tealeaf.
• Experience creating dashboards on Dynatrace, ELK, and Grafana.
• Experience utilizing visualization tools like Kibana and Grafana.
• Experience debugging Java and microservices logs.
• Experience utilizing EFK stack, Dynatrace, Catchpoint, and Nagios.

Benefits
• Medical/Dental/Vision coverage
• 401(k) plan
• Tuition reimbursement program
• Paid Time Off and Holidays (at least 23 days of vacation each year and 9 company-designated holidays)
• Paid Parental Leave
• Paid Caregiver Leave
• Additional sick leave beyond state and local law requirements
• Adoption Reimbursement
• Disability Benefits (short term and long term)
• Life and Accidental Death Insurance
• Supplemental benefit programs: critical illness/accident hospital indemnity/group legal
• Employee Assistance Programs (EAP)
• Extensive employee wellness programs
• Employee discounts up to 50% off on eligible AT&T mobility plans and accessories, AT&T internet, and AT&T phone.
