Cognite
Posted 10 days ago on InstaHyre

Software Engineer

Bangalore
Full Time
Senior Level

Qualifications & Requirements

Experience Level: Senior Level

Full Job Description

Job Opportunity: Senior Data Platform Engineer - Bangalore

Cognite is at the forefront of revolutionizing industrial data management with Cognite Data Fusion, our advanced SaaS platform that transforms how industrial companies utilize their data. We are actively seeking a Senior Data Platform Engineer to join our dynamic team in Bangalore. This role is ideal for individuals who excel in building high-performance distributed systems and thrive in a fast-paced startup environment. You will tackle complex data infrastructure challenges that directly influence how Fortune 500 industrial companies manage their critical operational data.

Core Responsibilities:

High-Performance Data Systems:

  • Design and implement robust data processing pipelines using Apache Spark, Flink, and Kafka to handle terabyte-scale industrial datasets.
  • Develop efficient APIs and services supporting thousands of concurrent users with sub-second response times.
  • Optimize data storage and retrieval strategies for time-series, sensor, and operational data.
  • Implement advanced caching mechanisms using Redis and in-memory data structures.
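The caching responsibility above can be illustrated with a minimal in-memory LRU cache. This is an illustrative sketch only (names are hypothetical, not Cognite's code); in production such a layer would typically sit in front of Redis for hot time-series keys:

```python
from collections import OrderedDict


class LRUCache:
    """Minimal in-memory LRU cache sketch: a stand-in for the kind of
    caching layer that might front Redis for frequently read keys."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._store: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self._store:
            return None
        # Move the key to the end to mark it as most recently used.
        self._store.move_to_end(key)
        return self._store[key]

    def put(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.capacity:
            # Evict the least recently used entry (front of the dict).
            self._store.popitem(last=False)
```

A real deployment would also need TTLs, cache-miss stampede protection, and invalidation hooks, which are omitted here for brevity.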

Distributed Processing Excellence:

  • Engineer Spark applications with a deep understanding of the Catalyst optimizer, partitioning strategies, and performance tuning.
  • Develop real-time streaming solutions capable of processing millions of events per second with Kafka and Flink.
  • Design efficient data lake architectures on S3/GCS, utilizing optimized partitioning and file formats like Parquet and ORC.
  • Implement query optimization techniques for OLAP datastores such as ClickHouse, Pinot, or Druid.
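The partitioning strategies mentioned above all rest on the same core idea: deterministically mapping a record key to a partition. The sketch below shows key-hash partitioning in the spirit of Kafka's default partitioner (illustrative only; Kafka's actual implementation uses murmur2, not MD5):

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Deterministically assign a record key to a partition.

    All records with the same key land in the same partition, which
    preserves per-key ordering in a streaming system. Sketch only:
    Kafka's real partitioner uses murmur2 rather than MD5.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 4 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:4], "big") % num_partitions
```

The same principle applies to data lake layout: choosing a partition key with good cardinality (e.g., sensor ID or event date) determines how evenly Parquet files are spread and how much data a query can prune.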

Scalability and Performance:

  • Scale systems to handle 10K+ queries per second (QPS) while ensuring high availability and data consistency.
  • Optimize JVM performance through advanced garbage collection tuning and memory management.
  • Implement comprehensive monitoring solutions using Prometheus, Grafana, and distributed tracing.
  • Design fault-tolerant architectures incorporating robust circuit breakers and retry mechanisms.
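The circuit-breaker pattern named above can be sketched in a few lines. This is a toy, single-threaded illustration (not production code, which would need thread safety, metrics, and jittered retries): the breaker opens after a run of consecutive failures and rejects calls until a reset timeout elapses, protecting a struggling downstream service from retry storms.

```python
import time


class CircuitBreaker:
    """Toy circuit breaker: opens after `max_failures` consecutive
    failures and rejects calls until `reset_timeout` seconds pass,
    after which one trial call is allowed through (half-open state)."""

    def __init__(self, max_failures: int = 3, reset_timeout: float = 30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # timestamp when the breaker opened

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open; call rejected")
            # Timeout elapsed: half-open, allow one trial call through.
            self.opened_at = None
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        else:
            self.failures = 0
            return result
```

In practice a breaker like this is paired with bounded, exponentially backed-off retries so that transient faults are retried but sustained outages fail fast.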

Technical Innovation:

  • Contribute to open-source projects within the big data ecosystem, including Spark, Kafka, and Airflow.
  • Research and prototype new technologies to address unique industrial data challenges.
  • Collaborate closely with product teams to translate complex requirements into scalable technical solutions.
  • Actively participate in architectural reviews and technical design discussions.

Requirements:

Distributed Systems Experience (2-6 years):

  • Proven production experience with Spark, including building and optimizing large-scale applications with an understanding of Spark's internals (e.g., the Catalyst optimizer and shuffle behavior).
  • Proficiency in streaming systems and experience implementing real-time data processing using Kafka, Flink, or Spark Streaming.
  • Expertise in JVM languages such as Java, Scala, or Kotlin, with a strong focus on performance optimization.

Data Platform Foundations (3+ years):

  • Hands-on experience with big data storage systems, data lakes, columnar formats, and table formats (e.g., Iceberg, Delta Lake).
  • Experience with OLAP query engines and high-performance analytical databases like Presto/Trino, ClickHouse, Pinot, or similar.
  • Experience in ETL/ELT pipeline development and building robust data transformation pipelines using tools such as dbt, Airflow, or custom frameworks.
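The orchestration tools listed above all share one core mechanism: executing transform steps in dependency order over a DAG. A minimal sketch of that idea, using only the standard library (task names and the `run_pipeline` helper are hypothetical, not an Airflow API):

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks: dict, deps: dict) -> list:
    """Run transform steps in dependency order, the core idea behind
    DAG orchestrators like Airflow or Dagster (illustrative sketch).

    `tasks` maps a task name to a callable taking the results so far;
    `deps` maps each task name to the set of tasks it depends on.
    Returns the execution order.
    """
    # static_order() yields each node only after all its predecessors.
    order = list(TopologicalSorter(deps).static_order())
    results = {}
    for name in order:
        results[name] = tasks[name](results)
    return order
```

Real orchestrators add scheduling, retries, backfills, and per-task isolation on top of this ordering, but the topological sort is the heart of it.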

Infrastructure and Operations:

  • Production experience with Kubernetes, including deploying and operating containerized applications.
  • Proficiency with cloud platforms (AWS, Azure, or GCP) and their data services.

Monitoring and Observability:

  • Experience implementing comprehensive logging, metrics, and alerting for data systems.

Bonus Points:

  • Contributions to major Apache open-source projects in the data space (e.g., Apache Spark or Kafka).
  • Conference speaking or technical blogging experience.
  • Industrial domain knowledge, or previous work with IoT, manufacturing, or operational technology systems.

Primary Technologies (Technical Stack):

  • Languages: Kotlin, Scala, Python, and Java.
  • Big Data: Apache Spark, Apache Flink, Apache Kafka.
  • Storage: PostgreSQL, ClickHouse, Elasticsearch, S3-compatible systems.
  • Infrastructure: Kubernetes, Docker, Terraform.

Technologies You May Work With:

  • Table Formats: Apache Iceberg, Delta Lake, Apache Hudi.
  • Query Engines: Trino/Presto, Apache Pinot, DuckDB.
  • Orchestration: Apache Airflow, Dagster.
  • Monitoring: Prometheus, Grafana, Jaeger, and ELK Stack.

Company

Cognite

About Cognite: Cognite is a leading global industrial Software-as-a-Service (SaaS) provider dedicated to digitalizing the industrial sector. We develop innovative industrial software that empowers asset...