Software Data Engineer

Location: Remote

Job Type: Full-Time

Salary: 77K

About the Role

We are seeking an experienced and highly motivated Data Engineer to join our growing team. In this role, you will be responsible for designing, developing, and maintaining scalable data platforms and pipelines that support business intelligence, analytics, machine learning, and operational reporting initiatives.

You will work closely with data analysts, software engineers, architects, and business stakeholders to deliver robust, high-performance data solutions in a cloud-native AWS environment. The ideal candidate has strong expertise in PySpark, Python, Apache Airflow, AWS services, Terraform, and modern DevOps practices.

SC clearance, or eligibility for SC clearance, is desirable.

Key Responsibilities

Data Engineering & Pipeline Development

Design, develop, and maintain scalable, reliable, and efficient data pipelines using PySpark and Python.
Build batch and real-time data processing solutions that handle large-scale datasets.
Develop, optimize, and monitor ETL/ELT workflows to ensure data quality, consistency, and availability.
Implement data transformation, cleansing, enrichment, and validation processes.
Troubleshoot and resolve data pipeline failures, bottlenecks, and performance issues.

Workflow Orchestration

Design and manage complex workflows using Apache Airflow.
Create and maintain DAGs with robust scheduling, dependency management, alerting, and recovery mechanisms.
Monitor workflow execution and proactively address failures or performance concerns.
Implement workflow best practices to ensure reliability and maintainability.

Cloud Data Architecture (AWS)

Architect and implement cloud-native data solutions on AWS.
Develop scalable and secure data platforms leveraging Amazon S3, Amazon Redshift, AWS Glue, AWS Lambda, Amazon EMR, Amazon API Gateway, Amazon CloudWatch, and AWS IAM.
Ensure adherence to security, governance, and compliance standards.
Optimize cloud resources for performance and cost efficiency.

Infrastructure as Code

Provision and manage AWS infrastructure using Terraform.
Develop reusable Terraform modules and templates.
Implement infrastructure automation to support development, testing, and production environments.
Maintain version-controlled infrastructure and deployment processes.

DevOps & CI/CD

Design and maintain CI/CD pipelines using GitHub Actions.
Automate testing, deployment, monitoring, and infrastructure updates.
Support continuous integration and continuous delivery best practices.
Collaborate with engineering teams to improve deployment reliability and efficiency.

Required Skills & Experience

Strong experience with Python and PySpark.
Hands-on expertise with Apache Airflow.
Extensive experience working with AWS cloud services.
Strong knowledge of Amazon Redshift, AWS Glue, S3, Lambda, EMR, API Gateway, CloudWatch, and IAM.
Experience with Terraform and Infrastructure as Code (IaC).
Proficiency with Git, GitHub Actions, and CI/CD pipelines.
Solid understanding of distributed data processing and Spark optimization.
Experience designing scalable data architectures and data models.
Strong SQL skills and understanding of data warehousing concepts.
Excellent troubleshooting, analytical, and problem-solving abilities.
Strong communication and collaboration skills.

If you are passionate about building scalable data platforms and solving complex data challenges, we would love to hear from you.
