Associate III - Cloud Infrastructure Services

UST


Date: 5 days ago
City: Thiruvananthapuram, Kerala
Contract type: Full time
Role Description

Role Proficiency:

Resolve L1 incidents and service requests within the agreed SLA

Outcomes

  • Monitor customer infrastructure using tools or defined SOPs to identify failures and mitigate the same by raising tickets with defined priority and severity
  • Update SOP with updated troubleshooting instructions and process changes
  • Mentor new team members in understanding customer infrastructure and processes
  • Perform analysis for driving incident reduction
  • Resolve L1 incidents and service requests

Measures Of Outcomes

  • SLA adherence
  • Compliance with the runbook-based troubleshooting process
  • Time-bound elevations and routing of tickets (OLA adherence)
  • Schedule adherence in managing ticket backlogs
  • Number of NCs in internal/external audits
  • Number of KB changes suggested
  • Production readiness of new joiners within the agreed timeline through one-on-one mentorship
  • % completion of all mandatory training requirements
  • Number of tickets reduced by analysis
  • Number of installation SRs handled for endpoints / change tasks completed for infrastructure
  • Number of L1 tickets closed

Monitoring

Outputs Expected:

  • Understand Priority and Severity based on ITIL practice. Understand the SLA agreed with the customer and adhere to it.
  • Perform recurring analysis to find the CIs that generate the most tickets. Adhere to ITIL best practices.

Runbook Reference/Change

  • Follow the runbook for troubleshooting, record troubleshooting steps, and provide inputs for runbook changes.

Escalation/Elevation/Routing Of Tickets

  • Escalate within organization/customer peer in case of resolution delay.
  • Understand the OLA between delivery layers (L1, L2, L3, etc.), adhere to the OLA, route tickets to the relevant queue, and notify the respective teams/customer based on the defined process.

Tickets Backlog/Resolution

  • Follow up on tickets based on agreed timelines; manage ticket backlogs/last activity as per the defined process.
  • Resolve incidents and SRs within agreed timelines. Execute change tasks for infrastructure.

Collaboration

  • Collaborate with different towers of delivery for ticket resolution (within SLA); document learnings for self-reference.
  • Close/resolve L1 tickets with help from the respective tower.
  • Actively participate in team/organization-wide initiatives.

Installation

  • Install software/tools and patches.

Stakeholder Management

  • Lead the customer and vendor calls.
  • Organize meetings with different stakeholders. Participate in RCA meetings.

Process Adherence

  • Thorough understanding of organization and customer defined process.
  • Consult with mentor when in doubt.
  • Adherence to defined processes.
  • Adhere to the organization’s policies and business conduct.

Training

  • On-time completion of all mandatory training requirements of the organization and customer.
  • Provide on-floor training and one-on-one mentorship for new joiners.

Performance Management

  • Update FAST Goals in NorthStar, track, report, and seek continuous feedback from peers and manager.
  • Set goals and provide feedback for mentees.
  • Assist new team members to understand the customer environment.

Skill Examples

  • Good communication skills (written, verbal and email etiquette) to interact with different teams and customers
  • Networking:
      a. Good in monitoring tools and device backup scheduling
      b. Basic DHCP and DNS configuration in routers and switches
      c. Basic troubleshooting skills using ‘show ip route’, ‘sh mac address-table’, etc.
      d. Basics of static and dynamic IP routing protocols
  • Server:
      a. Basic to intermediate PowerShell/Bash/Python scripting skills
      b. Manual patching of a QA server
      c. Analyse space usage on a server and engage the Capacity Mgmt. team for disk expansion
  • Storage and Backup:
      a. Ability to handle storage and backup issues independently
      b. Ability to handle vendor management, device management and storage array management
      c. Perform hardware upgrades, firmware upgrades and vulnerability remediation
      d. Ticket analysis, storage and backup performance management, and various troubleshooting tasks
  • Database:
      a. Patching and upgrading the DB server and application tools
      b. Tweak queries to make them run as fast as possible
      c. Logical and physical schema design (indexing, constraints, partitioning, etc.)
      d. Ability to visualize and debug the end-to-end flow of business transaction models and applications
      e. DB migration and export/import

Knowledge Examples

  • Fair understanding of customer infrastructure and ability to correlate failures
  • Monitoring knowledge in infrastructure tools
  • Networking:
      a. IP addressing and subnetting knowledge
      b. Preferably certified in Cisco's basic certification track
      c. IOS upgrade and IOS patching knowledge
  • Server:
      a. Intermediate-level knowledge in Active Directory, DNS, DHCP, DFS, IIS and patch management
      b. Strong knowledge in backup tools such as Veritas/Commvault/Windows Backup, storage concepts, etc.
      c. Strong virtualization and basic cloud knowledge
      d. AD group policy management, group policy tools and troubleshooting GPOs
      e. Basic AD object creation, DNS concepts, DHCP, DFS
      f. Knowledge of tools like SCCM and SCOM administration
  • Storage and Backup:
      a. In-depth knowledge in storage and backup technology, storage allocation and reclamation, backup policy creation and management
      b. Strong knowledge in server, network and virtualization technologies
  • Tool:
      a. Knowledge in infrastructure and application technologies
      b. Understanding of monitoring concepts and process
      c. Understanding of key network monitoring protocols including SNMP, NetFlow, WMI, syslog, etc.
      d. Knowledge in administration of tools like SCOM, Solarwinds, CA UIM, Nagios, ServiceNow, etc.
  • Monitoring:
      a. Good understanding of networking concepts and protocols
      b. Knowledge in server, backup and storage technologies
      c. Desirable to have knowledge in SQL scripting
      d. Knowledge in ITIL process
  • Database:
      a. Knowledge of database security
  • Quality Analysis:
      a. Exposure to FMEA audit practices
      b. Exposure to technology/processes as per audit requirements
  • Working knowledge of MS Excel, Word, PPT, Outlook, etc.

Additional Comments

Role: Critical Incident Manager. The resource must have Incident and Major Incident Management (MIM) experience and should possess or have experience in the following:

  • Ability to assess and troubleshoot high severity incidents.
  • Initiate and drive bridge calls and send communications.
  • Prepare notifications to be sent to senior and executive stakeholders across the company
  • Ensure normal services are restored as quickly as possible and adverse impact on operations due to incidents under MIM management in IT environment is minimized.
  • Oversee and collaborate with problem management and drive identification of root causes as well as sufficient prevention of incident recurrences
  • Execute MIM according to its processes and tools to provide efficient, high-quality services.
  • During MIM calls, maintain a relentless focus on reducing unplanned downtime to Critical IT Infrastructure as well as Operationally Critical Applications (OCA).
  • After MIM calls, perform analytics work, develop action plans to minimize or eliminate future downtime, and drive the implementation of those action plans.
  • Drive/Manage service quality, performance, and improvement of service delivery processes as per established governance and reporting
  • Prepare Major IM analysis reports on a weekly, monthly, and quarterly basis to be provided to the Leadership team.
  • Demonstrate clear and definitive understanding of business needs and requirements.
  • Address any potential technical and non-technical issues and escalations within and outside of team which might impact the team’s performance adversely.
  • Maintain overview of daily records, incident logs & shift planners.
  • Undertake ad-hoc projects as agreed with the Management team.
  • Provide technical expertise for the planning and definition of new requirements, perform feasibility and performance studies, including benchmark, capacity, planning, sizing, etc.
  • Responsible for first-level Service Continuity of the operations during regional/location outages. Keep team leads and Management apprised of operational activities, any associated risks/issues, workload distribution and accomplishments; participate in achieving resolutions to identified issues.
  • Seek and provide feedback, mentoring, support, and career development to and from team.
  • Routinely provide informal coaching and technical training through suggestions, feedback, and encouragement to less experienced team members.
  • Be the first point of contact in case of any untoward incident with respect to work, team dynamics, or the attitude and approach of team members.
  • Ensure positive team satisfaction and strong relationship is maintained for service delivery with cross functional teams
  • Achieve agreed targets in terms of quality and time.
  • Ensure business is not affected under any circumstance.
  • Achievement of service levels corresponding to industry best practice.
  • Resource must have a solid understanding of data center infrastructure technical components and their basic functioning, especially in the networking domain, along with Azure Cloud knowledge.

Requirements:

  • 1-3 years of experience
  • Qualification: BE, B.Tech, BCA, BSc IT
  • ITIL Certification

Skills

MIM Management, Incident Management, Troubleshooting
