Job Description
We are seeking a skilled Site Reliability Engineer (SRE) to join our team. As an SRE, you will be responsible for designing, implementing, and maintaining our cloud infrastructure and CI/CD pipelines. You will work closely with our development and operations teams to automate processes, improve system reliability, and accelerate software delivery.
Responsibilities:
- Design, build, and maintain scalable and highly available infrastructure using automation and configuration management tools (e.g., Terraform, Ansible, Kubernetes).
- Monitor system performance, availability, and reliability using monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack).
- Implement and maintain continuous integration and continuous deployment (CI/CD) pipelines to automate software delivery and deployment processes.
- Conduct capacity planning and performance tuning to ensure systems can handle expected loads and scale dynamically as needed.
- Implement disaster recovery and failover mechanisms to minimize downtime and ensure business continuity.
- Troubleshoot and resolve issues related to system reliability, performance, and security.
- Participate in on-call rotations and respond to incidents in a timely manner, following incident response procedures.
- Collaborate with development teams to improve system architecture, reliability, and scalability.
- Document system designs, configurations, and procedures for knowledge sharing and future reference.
Requirements:
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 2+ years of experience in Site Reliability Engineering, DevOps, or related roles.
- Proficient in Linux/Unix systems administration and shell scripting.
- Experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and cloud services (e.g., EC2, S3, RDS).
- Strong understanding of networking concepts, protocols, and security best practices.
- Experience with infrastructure as code (IaC) tools such as Terraform or CloudFormation.
- Knowledge of containerization and orchestration technologies (e.g., Docker, Kubernetes).
- Familiarity with monitoring and observability tools such as Prometheus, Grafana, and the ELK stack.
- Strong problem-solving and troubleshooting skills.
- Excellent communication and collaboration abilities.
- Ability to work effectively in a fast-paced environment and prioritize tasks.
About Us:
Tekriders, Inc. is a leading provider of IT consulting services dedicated to helping businesses harness the power of technology to achieve their goals. With a focus on innovation, expertise, and customer satisfaction, we empower organizations to navigate the complexities of the digital landscape and drive sustainable growth. Our office is located in South Plainfield, NJ. We started our journey in 2012. In the past 12+ years, we have overcome business challenges with custom software engineering and consulting services that add tangible value.
Tekriders, Inc. is committed to a policy of Equal Employment Opportunity and will not discriminate against an applicant or employee based on age, sex, sexual orientation, race, color, creed, national origin, ancestry, disability, marital status, or any other legally protected basis under federal, state, or local law.
How to Apply:
Please submit your resume and portfolio showcasing your projects to HR Dept., Tekriders, Inc., 50 Cragwood Road, Suite 222, South Plainfield, NJ 07080.
Contact Us:
HR Manager
Job ID: TKI/NJ/SRE/02/24
E-Mail: jobs@tekriders.com