We are seeking a passionate and skilled Cloud Performance QA Engineer to evaluate the scalability, responsiveness, and resilience of the Tarana Cloud Suite, a large-scale distributed system. This includes validating cloud microservices, databases, and real-time communication with intelligent radio devices. As a key member of the QA team, you will be responsible for performance, load, stress, and soak testing, as well as chaos testing and fault injection to ensure system robustness. You will simulate production-like environments, analyze bottlenecks, and collaborate closely with development, DevOps, and SRE teams to proactively address performance issues. This role requires a strong understanding of system internals, cloud infrastructure (AWS), and modern observability tools. Your work will directly impact the quality, reliability, and scalability of our next-gen wireless platform.
Responsibilities:
- Understand the Tarana Cloud Suite architecture, including microservices, UI, data/control flows, databases, and AWS-hosted runtime.
- Design and implement robust load, performance, scalability, and soak tests using tools like Locust and JMeter.
- Set up and manage scalable test environments on AWS to mimic production loads.
- Build and maintain performance dashboards using Grafana, Prometheus, or similar observability tools.
- Analyze performance test results and infrastructure metrics to identify bottlenecks and optimization opportunities.
- Integrate performance testing into CI/CD pipelines for automated baselining and regression detection.
- Collaborate with cross-functional teams to define SLAs, set performance benchmarks, and resolve performance-related issues.
- Conduct resilience and chaos testing using fault injection tools to validate system behavior under stress and failures.
- Debug and root-cause performance degradations using logs, APM tools, and resource profiling.
- Tune infrastructure parameters (e.g., autoscaling policies, thread pools, database connections) for improved efficiency.
Requirements:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 6 to 10 years of experience in Performance Testing/Engineering.
- Hands-on expertise with Locust, JMeter, or equivalent load testing tools.
- Strong experience with AWS services such as EC2, ALB/NLB, CloudWatch, EKS/ECS, and S3.
- Familiarity with Grafana, Prometheus, and APM tools like Datadog, New Relic, or similar.
- Strong understanding of system metrics: CPU, memory, disk I/O, network throughput, etc.
- Proficiency in scripting and automation (Python preferred) for custom test scenarios and analysis.
- Experience with testing and profiling REST APIs, web services, and microservices-based architectures.
- Exposure to chaos engineering tools (e.g., Gremlin, Chaos Mesh, Litmus) or fault injection practices.
- Experience with CI/CD tools (e.g., Jenkins, GitLab CI) and integrating performance tests into build pipelines.
- Experience with Kubernetes-based environments and container orchestration.
- Knowledge of infrastructure-as-code tools (Terraform, CloudFormation).
- Background in network performance testing and traffic simulation.
- Experience in capacity planning and infrastructure cost optimization.