Site Reliability Developer 3
Oracle
Date: 1 day ago
City: Thiruvananthapuram, Kerala
Contract type: Full time

Job Description
Ideal Candidate Profile
We are seeking a detail-oriented professional with a strong technical background, a proven track record in Site Reliability Engineering, and a passion for improving service reliability and performance. The ideal candidate should thrive in a fast-paced environment, excel in cross-functional team collaboration, and take a proactive approach to problem-solving and continuous improvement.
My Oracle Support (MOS)
My Oracle Support (MOS) is Oracle's Enterprise Support solution, and the MOS Development (Dev) team is responsible for creating and maintaining the My Oracle Support application. This includes both the customer-facing portal for external users and the employee-facing portal for internal support engineers. Additionally, the MOS Dev team manages the entire ecosystem that supports customer interactions and support processes.
We work closely with Global Customer Support to understand business needs and ensure these requirements are integrated into MOS releases.
Who are we looking for?
As a Site Reliability Engineer (SRE) for My Oracle Support, you will play a key role in the development, implementation, and maintenance of our support solution. We're looking for someone who excels in:
Key Responsibilities and Skills:
- Cloud Experience: Knowledge of cloud environments (Oracle Cloud Infrastructure (OCI), AWS, Azure, GCP) including design, implementation, and troubleshooting
- Cloud Native Technologies: Understanding of cloud-native technologies such as Prometheus, Kubernetes, Helm, and container runtimes
- Kubernetes knowledge and experience are critical.
- Linux Proficiency: Hands-on experience with Linux environments, troubleshooting, and testing changes via systemd and sysctl. Demonstrated expertise in Linux-based systems and the ability to troubleshoot and resolve complex issues.
- Reliability and Performance: Ensure the reliability, performance, and security of My Oracle Support by proactively managing and optimizing the solution.
- Collaborative Problem-Solving: Work effectively with other professionals to maintain high standards and solve technical challenges.
- Communication: Strong verbal and written communication skills, capable of communicating effectively with technical and management staff during critical events.
- Interpersonal Skills: Ability to present ideas clearly in both business-friendly and user-friendly language. Strong team-oriented mindset.
- Self-Motivation: Highly self-motivated with a keen attention to detail.
- Troubleshooting: Ability to troubleshoot issues affecting large-scale service architectures and application stacks.
- Scripting/Programming: Proficiency in one or more scripting/programming languages (Python, Bash, Java, Ruby, Go).
- Development and Deployment: Collaborate with a skilled team to design, deploy, and enhance the My Oracle Support application.
- Technical Expertise: Demonstrate expertise in Linux-based systems and the ability to troubleshoot and resolve complex issues.
- Database Knowledge: Working knowledge of databases such as Oracle Database or MySQL, including the ability to write and design SQL queries.
- Configuration Management: Experience with tools like Puppet, Chef, or Ansible for configuration management and orchestration.
- Shift Flexibility: Ability to work as part of a global 24x7x365 DevOps team, including non-standard shifts, holidays, and weekends, on a rotational basis. Primary standard shift is US daytime.
- Agile Development: Experience in agile development using tools like Jira and Git.
- Analytical Skills: Strong analytical and problem-solving abilities with a customer service-focused approach.
- Task Management: Ability to prioritize and manage tasks effectively in a high-pressure environment.
Your role will be crucial in maintaining the high-quality support that My Oracle Support delivers to both internal and external users. We look forward to discussing this role with you.
Career Level - IC3
Responsibilities
Key Responsibilities:
- Understand and Manage Support Solutions: Gain a comprehensive understanding of the end-to-end configuration, technical dependencies, and behavior of Oracle's Enterprise support services.
- Maximize Service Availability: Strive to maximize service availability by enhancing the service during non-crisis periods and minimizing impact during crises. Focus on hardening the service to extend the time between service-impacting events.
- Identify Hardening Opportunities: Identify and address opportunities to improve service reliability, including enhancing monitoring coverage and recognizing actionable events that require intervention.
- Enhance SOPs: Develop and refine Standard Operating Procedures (SOPs) by creating documented responses to alerts. Automate these responses and integrate them with actionable events for streamlined incident management.
- Drive Major Incident Response: Actively participate in Major Incident bridges during critical service-impacting events to lead and coordinate effective service mitigation efforts.
- Post-Mortems and Critical Repairs: Engage in Post-Mortems and Critical Repair Items following service-impacting events to prevent recurrence and ensure continuous improvement.
- Monitor and Improve: Understand and communicate the scale, capacity, security, and performance attributes and requirements of the service stack. Continuously work on improving telemetry, automation, and overall service reliability.
- Troubleshooting and Issue Resolution: Act as the ultimate escalation point for complex or critical issues, utilizing deep knowledge of service topology and dependencies to troubleshoot and define mitigations.
- Automation and Orchestration: Demonstrate a strong understanding of automation and orchestration principles to improve service availability, reduce time to mitigate issues, and enhance development velocity.
- Drive Continuous Improvement: Develop tools, drive down incident counts, reduce event severity, and minimize time to mitigate. Foster a “Site Up” culture and continuously review and enhance systems and methods to improve custo
- Technological Analysis: Contribute to the analysis and enhancement of MOS applications and internal tools, identifying and implementing durable solutions to complex challenges.
- Collaborate with Development Teams: Partner with development teams to define and implement improvements in the support service architecture, ensuring that enhancements are aligned with overall goals.
- Articulate Technical Characteristics: Clearly communicate the technical characteristics of services and technology areas, guiding development teams in engineering and integrating advanced capabilities.
- Communication and Problem Solving: Employ excellent communication, technical analysis, and problem-solving skills to methodically address and resolve issues. Communicate clearly and professionally with internal stakeholders during high-priority situations, in both written and spoken form.
- Team Development: Support the training and development of junior team members, sharing knowledge and best practices to foster growth within the team.
- Educational Background: Bachelor’s degree in Computer Science, Information Technology, or a related field. Relevant work experience may be considered in place of a degree.
- Experience:
  - 6 to 10 years of relevant industry experience.
  - Proven experience as a System Engineer, Software Engineer, or in a similar role, preferably with a focus on complex enterprise software solutions. Understanding of enterprise cloud solutions and the ability to delve into complex services.
- Communication Skills: Excellent communication skills, analytical thinking, problem-solving capabilities, and attention to detail.
- Technical Skills: Proficiency in Linux-based systems, including administration, scripting, and troubleshooting.
- Judgment and Independence: Ability to handle varied and complex tasks independently, demonstrating sound judgment in decision-making.
- Monitoring and Performance: Knowledge of system monitoring tools, performance tuning, and capacity planning.
- Problem-Solving: Strong problem-solving abilities with a proven track record of analyzing and resolving complex technical issues.
As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s challenges. We’ve partnered with industry leaders in almost every sector, and we continue to thrive after 40+ years of change by operating with integrity.
We know that true innovation starts when everyone is empowered to contribute. That’s why we’re committed to growing an inclusive workforce that promotes opportunities for all.
Oracle careers open the door to global opportunities where work-life balance flourishes. We offer competitive benefits based on parity and consistency and support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
We’re committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing [email protected] or by calling +1 888 404 2494 in the United States.
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.