Site Reliability Engineer
Ather Energy
Date: 10 hours ago
City: Pune, Maharashtra
Contract type: Full time

You’ll be our: Site Reliability Engineer
You’ll be based at: Pune Zonal Office
You’ll be aligned with: Cloud and Data Platform Lead / Cloud Architect
You’ll be a member of: Cloud and Data Platform Team
Ather's fleet of smart scooters is growing rapidly, and so is the volume of data they generate. Our Vehicle Data Platform (VDP) is the core of this ecosystem, and its stability and scalability are critical to our success. We are looking for a foundational Site Reliability Engineer to join our VDP team, taking full ownership of our data infrastructure and building a robust reliability practice to support our rapid growth.
What You’ll do at Ather:
- Run and own the production environment by managing alerts, leading incident response, conducting root cause analysis (RCA), and implementing permanent fixes.
- Take full ownership of our ClickHouse database clusters as we migrate away from a managed service, handling their performance, reliability, and scaling in-house.
- Build and maintain our core infrastructure using Infrastructure-as-Code principles (Terraform).
- Perform critical, periodic maintenance and upgrades for our infrastructure, with a strong focus on Kubernetes, Cloud SQL, and data workloads like Kafka.
- Partner with the Data Engineering team to support the underlying infrastructure for our new Databricks platform, ensuring robust and efficient data ingestion pipelines.
- Enhance observability by building and refining our monitoring, logging, and tracing systems to proactively identify performance bottlenecks.
- Lead capacity planning and forecasting for all cloud workloads, ensuring our platform can scale effectively for the next 6-12 months.
- Drive cloud cost optimization by monitoring spending, identifying and implementing savings opportunities, and ensuring resource governance.
Our ideal candidate is a strong software engineer at heart with deep expertise in cloud-native infrastructure.
The main focus areas for this role are:
- Significant Coding Experience: You must have a strong software engineering background with significant coding experience in a language like Python, Go, or Java, focusing on writing clean, scalable, and automated solutions.
- Deep Cloud Proficiency: You need deep, hands-on experience with at least one major cloud provider (GCP, AWS, or Azure). A strong background in GCP is highly preferred.
- Production Kubernetes Expertise: You must have proven, hands-on experience designing, running, and troubleshooting applications on Kubernetes in a production environment.
Other key qualifications include:
- Hands-on experience with infrastructure automation tools like Terraform or Ansible.
- Strong expertise in building and managing CI/CD pipelines.
- Experience administering, monitoring, and scaling ClickHouse clusters is highly desirable.
- Familiarity with data platforms like Databricks and their infrastructure requirements.
- Experience with message queues and streaming platforms like Kafka.
- Strong Linux administration, system internals, and network troubleshooting skills.
- A Bachelor’s or Master’s degree in Computer Science or a related engineering field.
- 3 to 6 years of relevant experience as a Site Reliability Engineer, DevOps Engineer, or Software Engineer with a focus on infrastructure.