Cognite

Senior Software Engineer

Bangalore
Full Time
Senior Level

Senior Data Platform Engineer - Bangalore

Cognite is at the forefront of revolutionizing industrial data management with Cognite Data Fusion, our advanced SaaS platform. We are seeking a talented Senior Data Platform Engineer to join our dynamic team in Bangalore. This role is ideal for an individual who excels at building high-performance distributed systems and thrives in a fast-paced startup environment. You will tackle complex data infrastructure challenges, directly influencing how Fortune 500 industrial companies manage their critical operational data.

Responsibilities

High-Performance Data Systems

  • Design and implement robust data processing pipelines using Apache Spark, Flink, and Kafka for terabyte-scale industrial datasets.
  • Build efficient APIs and services supporting thousands of concurrent users with sub-second response times.
  • Optimize data storage and retrieval for time-series, sensor, and operational data.
  • Implement advanced caching strategies leveraging Redis and in-memory data structures.

Distributed Processing Excellence

  • Engineer Spark applications with a focus on the Catalyst optimizer, partitioning strategies, and performance tuning.
  • Develop real-time streaming solutions processing millions of events per second with Kafka and Flink.
  • Design efficient data lake architectures on S3/GCS with optimized partitioning and file formats (Parquet, ORC).
  • Implement query optimization techniques for OLAP datastores such as ClickHouse, Pinot, or Druid.

Scalability and Performance

  • Scale systems to handle 10K+ QPS while ensuring high availability and data consistency.
  • Optimize JVM performance through advanced garbage collection tuning and memory management.
  • Implement comprehensive monitoring using Prometheus, Grafana, and distributed tracing.
  • Design fault-tolerant architectures incorporating circuit breakers and retry mechanisms.

Technical Innovation

  • Contribute to open-source projects within the big data ecosystem (e.g., Spark, Kafka, Airflow).
  • Research and prototype new technologies to address industrial data challenges.
  • Collaborate with product teams to translate complex requirements into scalable technical solutions.
  • Participate actively in architectural reviews and technical design discussions.

Requirements

Distributed Systems Experience (4-6 years)

  • Production Spark expertise: Proven experience building and optimizing large-scale Spark applications with a deep understanding of internals.
  • Streaming systems proficiency: Experience implementing real-time data processing using Kafka, Flink, or Spark Streaming.
  • JVM Language expertise: Strong programming skills in Java, Scala, or Kotlin, with a focus on performance optimization.

Data Platform Foundations (3+ years)

  • Big data storage systems: Hands-on experience with data lakes, columnar formats, and table formats (e.g., Iceberg, Delta Lake).
  • OLAP query engines: Experience with Presto/Trino, ClickHouse, Pinot, or similar high-performance analytical databases.
  • ETL/ELT pipeline development: Experience building robust data transformation pipelines using tools like DBT, Airflow, or custom frameworks.

Infrastructure and Operations

  • Kubernetes production experience: Experience deploying and operating containerized applications in production environments.
  • Cloud platform proficiency: Hands-on experience with AWS, Azure, or GCP data services.
  • Monitoring and observability: Experience implementing comprehensive logging, metrics, and alerting for data systems.

Technical Depth Indicators

  • Performance Engineering: Proven system optimization experience, delivering measurable performance improvements (e.g., 2x+ throughput gains).
  • Resource efficiency: Experience optimizing systems for cost while meeting performance requirements.
  • Concurrency expertise: Experience designing thread-safe, high-concurrency data processing systems.

Data Engineering Best Practices

  • Data quality frameworks: Experience implementing validation, testing, and monitoring for data pipelines.
  • Schema evolution: Experience managing backward-compatible schema changes in production systems.
  • Data modeling expertise: Experience designing efficient schemas for analytical workloads.

Collaboration and Growth

  • Technical Collaboration: Ability to partner effectively with product managers, ML engineers, and data scientists.
  • Code review excellence: Commitment to providing thoughtful technical feedback and maintaining high code quality.
  • Documentation and knowledge sharing: Experience creating technical documentation and facilitating knowledge transfer.
  • Continuous Learning: Aptitude for quickly learning and applying new technologies.
  • Industry awareness: Staying current with big data ecosystem developments and best practices.
  • Problem-solving approach: Demonstrating a systematic approach to debugging complex distributed system issues.

Startup Mindset

  • Execution excellence: Proven ability to deliver high-quality features rapidly.
  • Technical pragmatism: Skill in making informed trade-offs between technical debt, velocity, and reliability.
  • End-to-end ownership: Taking responsibility for features from design through production and monitoring.
  • Ambiguity comfort: Thriving in environments with evolving requirements.
  • Technology flexibility: Adaptability to new tools and frameworks.
  • Customer focus: Understanding the impact of technical decisions on user experience and business metrics.

Bonus Points

  • Open-source contributions to major Apache projects in the data space (e.g., Apache Spark or Kafka).
  • Conference speaking or technical blog writing experience.
  • Industrial domain knowledge: Prior experience with IoT, manufacturing, or operational technology systems.

Primary Technologies (Technical Stack)

  • Languages: Kotlin, Scala, Python, Java.
  • Big Data: Apache Spark, Apache Flink, Apache Kafka.
  • Storage: PostgreSQL, ClickHouse, Elasticsearch, S3-compatible systems.
  • Infrastructure: Kubernetes, Docker, Terraform.

Technologies You May Work With

  • Table Formats: Apache Iceberg, Delta Lake, Apache Hudi.
  • Query Engines: Trino/Presto, Apache Pinot, DuckDB.
  • Orchestration: Apache Airflow, Dagster.
  • Monitoring: Prometheus, Grafana, Jaeger, ELK Stack.

Company

Cognite

Cognite: Digitalizing the Industrial World

Cognite is a leading global industrial Software-as-a-Service (SaaS) provider dedicated to the digital transformation of asset-intensive industries. We develop...

Bangalore
Posted on InstaHyre