Network Operations - Incident Manager

$100,000 - $120,000 yearly
  • The Resume Review - Recruiting Department
  • Richmond, VA, USA
  • Nov 03, 2022
Full time Customer Service Information Technology

Job Description

This role supports end users by responding to issues related to Incident and Problem Management across multiple applications, with a primary focus on leading triage of all business-impacting incidents. Responsibilities include understanding and ensuring compliance with the Incident Management and Problem Management policies and procedures. This role is the key focal point for the customer/client/associate experience and owns restoring any impacts to those experiences, regardless of where the root cause of the impact lies.
 
Position Summary
 
The areas of focus for the Network Technical Recovery Specialist include management of network stability across multiple complex platforms. The specialist acts as a focal point for incident troubleshooting in medium- and high-impact issues. Network Technical Recovery Specialists are expected to be well versed in numerous networking protocols, technologies, and troubleshooting methodologies, including the use of proactive and reactive tools. The work is always aligned with the current and approved Network Services Standards, Incident and Problem Management Policies & Procedures, and the governance and management policies set forth by the firm.
This position will interface directly with internal stakeholders and external suppliers/providers, architecture, product engineering, product management, and business management. At times, the specialist may interface with various levels of senior management. Strong communication and problem-solving skills are a must. The candidate must be able to work independently and in teams of various sizes and locations. Adherence to and use of standards, product sets, templates, systems, and artifacts are important to the success of the individual, the department, and the firm at large. The Network Technical Recovery Specialist will be considered a subject matter expert in their field and is expected to stay current with various technologies, organizational goals, and industry trends to drive end-to-end value.
 
Key Responsibilities:
  • Operational Support of Global Network Technologies
  • Lead production support triage efforts for network infrastructure incidents, manage bridge line troubleshooting and appropriate team engagement, engage in technical research and troubleshooting, and escalate to next level of leadership as needed
  • Identify service impact; interpret monitoring tools, dashboards, and logs; work with senior team members to validate impacts and communicate all impacts to internal stakeholders and leadership
  • Provide status updates and technical detail for awareness communications, ensure accuracy of all communications sent, and ensure any necessary follow-ups are scheduled
  • Identify possible production failure scenarios, vulnerabilities, and opportunities for improvement, and take ownership of the escalation and resolution
  • Participate in the documentation of application flows, upstream/downstream impacts during outages, the customer experience in failure scenarios, and contacts for various support needs, and ensure appropriate documents and wikis are up to date and available for use during triage
  • Accountable for data quality and completion of incident tickets, including ensuring all impacts are accurately recorded and documented in the system of record
 
Required Skills
  • Experience with network technologies: WAN, Wired/Wireless LAN, Optical, Public/Private Internet, Routing, Switching, Firewalls, Load Balancing, Proxy, DDI, and AAA
  • Experience with troubleshooting complex networking problems
  • Knowledge of Jira and Confluence, and experience with the Agile framework and Scrum ceremonies
  • Self-starter/self-directed, organized and detail oriented
  • Strong technical acumen and analytical skills
  • Excellent client interfacing skills
  • Strong verbal and written communication skills and ability to work with all levels of management
  • Experience aligning actions to business impact and service restoral
  • Understanding of configuration management with tools such as Forward Networks and HPNA
  • Knowledge of and experience with advanced tooling (both proactive and reactive), including but not limited to Sniffer/Wireshark, Splunk, SevOne, HPNA, NNMi, OBM, and IBM Watson
  • Experience working with colleagues across different time zones, with a flexible approach to working hours (the ability to work varied hours and weekends is a must) in order to interact and communicate successfully on a global level when needed
 
Desired Skills
  • Experience in Networking-related disciplines within a design, implementation, or operations role
  • Relevant Industry certifications in Network Technologies
  • Cloud or SDN knowledge and experience
  • Experience with SDN: Cisco ACI, VMware NSX, Arista CloudVision
  • Experience with SD-WAN, preferably on CloudGenix
  • Experience with automation tools such as Python, Ansible, YAML or Django, API calls (to ticketing systems and network devices), and frontend web development
  • Experience working in an Agile environment
  • Experience working in Financial Services (Insurance, Banking, Investment Banking)
  • Experience with other network technologies: WAN, MAN, LAN, Optical, Routing, Switching, Proxy/Threat Prevention, DDI, Load Balancing, and AAA