
Senior Data Platform Engineer
Qualifications & Requirements
Experience Level: Senior Level
Senior Data Platform Engineer - Bangalore
Cognite is at the forefront of revolutionizing industrial data management with Cognite Data Fusion, our advanced SaaS platform. We are seeking a talented Senior Data Platform Engineer to join our dynamic team in Bangalore. This role is ideal for an individual who excels at building high-performance distributed systems and thrives in a fast-paced startup environment. You will tackle complex data infrastructure challenges, directly influencing how Fortune 500 industrial companies manage their critical operational data.
Responsibilities
High-Performance Data Systems
- Design and implement robust data processing pipelines using Apache Spark, Flink, and Kafka for terabyte-scale industrial datasets.
- Build efficient APIs and services supporting thousands of concurrent users with sub-second response times.
- Optimize data storage and retrieval for time-series, sensor, and operational data.
- Implement advanced caching strategies leveraging Redis and in-memory data structures.
Distributed Processing Excellence
- Engineer Spark applications with a focus on the Catalyst optimizer, partitioning strategies, and performance tuning.
- Develop real-time streaming solutions processing millions of events per second with Kafka and Flink.
- Design efficient data lake architectures on S3/GCS with optimized partitioning and file formats (Parquet, ORC).
- Implement query optimization techniques for OLAP datastores such as ClickHouse, Pinot, or Druid.
Scalability and Performance
- Scale systems to handle 10K+ QPS while ensuring high availability and data consistency.
- Optimize JVM performance through advanced garbage collection tuning and memory management.
- Implement comprehensive monitoring using Prometheus, Grafana, and distributed tracing.
- Design fault-tolerant architectures incorporating circuit breakers and retry mechanisms.
Technical Innovation
- Contribute to open-source projects within the big data ecosystem (e.g., Spark, Kafka, Airflow).
- Research and prototype new technologies to address industrial data challenges.
- Collaborate with product teams to translate complex requirements into scalable technical solutions.
- Participate actively in architectural reviews and technical design discussions.
Requirements
Distributed Systems Experience (4-6 years)
- Production Spark expertise: Proven experience building and optimizing large-scale Spark applications with a deep understanding of internals.
- Streaming systems proficiency: Experience implementing real-time data processing using Kafka, Flink, or Spark Streaming.
- JVM Language expertise: Strong programming skills in Java, Scala, or Kotlin, with a focus on performance optimization.
Data Platform Foundations (3+ years)
- Big data storage systems: Hands-on experience with data lakes, columnar formats, and table formats (e.g., Iceberg, Delta Lake).
- OLAP query engines: Experience with Presto/Trino, ClickHouse, Pinot, or similar high-performance analytical databases.
- ETL/ELT pipeline development: Experience building robust data transformation pipelines using tools like dbt, Airflow, or custom frameworks.
Infrastructure and Operations
- Kubernetes production experience: Experience deploying and operating containerized applications in production environments.
- Cloud platform proficiency: Hands-on experience with AWS, Azure, or GCP data services.
- Monitoring and observability: Experience implementing comprehensive logging, metrics, and alerting for data systems.
Technical Depth Indicators
- Performance Engineering: Proven system optimization experience, delivering measurable performance improvements (e.g., 2x+ throughput gains).
- Resource efficiency: Experience optimizing systems for cost while meeting performance requirements.
- Concurrency expertise: Experience designing thread-safe, high-concurrency data processing systems.
Data Engineering Best Practices
- Data quality frameworks: Experience implementing validation, testing, and monitoring for data pipelines.
- Schema evolution: Experience managing backward-compatible schema changes in production systems.
- Data modeling expertise: Experience designing efficient schemas for analytical workloads.
Collaboration and Growth
- Technical Collaboration: Ability to partner effectively with product managers, ML engineers, and data scientists.
- Code review excellence: Commitment to providing thoughtful technical feedback and maintaining high code quality.
- Documentation and knowledge sharing: Experience creating technical documentation and facilitating knowledge transfer.
- Continuous Learning: Aptitude for quickly learning and applying new technologies.
- Industry awareness: Staying current with big data ecosystem developments and best practices.
- Problem-solving approach: Demonstrating a systematic approach to debugging complex distributed system issues.
Startup Mindset
- Execution Excellence: Proven ability for rapid delivery of high-quality features.
- Technical pragmatism: Skill in making informed trade-offs between technical debt, velocity, and reliability.
- End-to-end ownership: Taking responsibility for features from design through production and monitoring.
- Ambiguity comfort: Thriving in environments with evolving requirements.
- Technology flexibility: Adaptability to new tools and frameworks.
- Customer focus: Understanding the impact of technical decisions on user experience and business metrics.
Bonus Points
- Open-source contributions to major Apache projects in the data space (e.g., Apache Spark or Kafka).
- Conference speaking or technical blog writing experience.
- Industrial domain knowledge: Prior experience with IoT, manufacturing, or operational technology systems.
Primary Technologies (Technical Stack)
- Languages: Kotlin, Scala, Python, Java.
- Big Data: Apache Spark, Apache Flink, Apache Kafka.
- Storage: PostgreSQL, ClickHouse, Elasticsearch, S3-compatible systems.
- Infrastructure: Kubernetes, Docker, Terraform.
Technologies You May Work With
- Table Formats: Apache Iceberg, Delta Lake, Apache Hudi.
- Query Engines: Trino/Presto, Apache Pinot, DuckDB.
- Orchestration: Apache Airflow, Dagster.
- Monitoring: Prometheus, Grafana, Jaeger, ELK Stack.
Company
Cognite
Cognite: Digitalizing the Industrial World
Cognite is a leading global industrial Software-as-a-Service (SaaS) provider dedicated to the digital transformation of asset-intensive industries. We develop...