Some front-end ability (Vue, React or Angular good but not necessary); Agile. The following is DESIRABLE, not essential: AWS or GCP; buy-side; data tools such as Glue, Athena, Airflow, Ignite, DBT, Arrow, Iceberg, Dremio; Fixed Income performance, risk or attribution; TypeScript and Node. Role: Python Developer (Software Engineer Programmer Developer Python Fixed Income JavaScript Node Fixed Income) … requires the team to be in the office 1-2 times a week. The tech environment is very new and will soon likely include exposure to the following: Glue, Athena, Airflow, Ignite, DBT, Arrow, Iceberg, Dremio. This is an environment that has been described as the only corporate environment with a start-up/fintech attitude towards technology.
Reston, Virginia, United States Hybrid / WFH Options
ICF
/parallel processing frameworks to prepare big data for the use of data analysts and data scientists. If you have experience with Apache Parquet, Apache Spark, AWS Glue, AWS Athena, Databricks and want your work to contribute to systems that collect healthcare data used by hundreds of thousands of daily users, we want to (virtually) meet you! You will … want you to use your knowledge of Spark to teach others, inform design decisions, and debug runtime problems. Tools & Technology: Python, PySpark, Spark, Databricks, PostgreSQL, Jenkins, AWS Glue, AWS Athena, Java, JavaScript, Git and GitHub, Confluence. Key Responsibilities and Job Duties: Design and build data processing pipelines using tools and frameworks in the AWS ecosystem. Analyze requirements and architecture …
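The pipeline work described in this listing boils down to partition-and-aggregate transforms over event data. As a minimal sketch, here is the kind of rollup a PySpark job would express with `groupBy`/`agg`, written in plain Python so it is self-contained; the field names and healthcare-event framing are illustrative, not from the posting:

```python
from collections import Counter
from datetime import datetime

def daily_event_counts(events):
    """Count events per (day, event_type) pair.

    `events` is an iterable of dicts with an ISO-8601 'timestamp' and an
    'event_type' key -- a stand-in for rows a Spark job would read from
    Parquet on S3 before writing the aggregate back for Athena queries.
    """
    counts = Counter()
    for e in events:
        day = datetime.fromisoformat(e["timestamp"]).date().isoformat()
        counts[(day, e["event_type"])] += 1
    return dict(counts)

rows = [
    {"timestamp": "2024-05-01T09:30:00", "event_type": "claim"},
    {"timestamp": "2024-05-01T11:00:00", "event_type": "claim"},
    {"timestamp": "2024-05-02T08:15:00", "event_type": "enrollment"},
]
print(daily_event_counts(rows))
# {('2024-05-01', 'claim'): 2, ('2024-05-02', 'enrollment'): 1}
```

In PySpark the equivalent would be roughly `events.groupBy(F.to_date("timestamp"), "event_type").count()`, with the result written as partitioned Parquet so Athena can query it.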
of orchestration tooling (e.g., Airflow, Dagster, Azure Data Factory, Fivetran) Desirable: - Experience deploying AI/ML models in production environments - Familiarity with AWS data services (e.g., S3, Glue, Kinesis, Athena) - Exposure to real-time data streaming and analytics paradigms. Together, as owners, let's turn meaningful insights into action. Life at CGI is rooted in ownership, teamwork …
Fairfax, Virginia, United States Hybrid / WFH Options
Metronome LLC
and services that expose AI capabilities to internal and external consumers. Data Tool Development: Design and implement data pipelines, dashboards, and analytics tools using AWS services such as Glue, Athena, Redshift, and QuickSight. Automate data ingestion, transformation, and visualization workflows. Cloud Engineering (AWS): Deploy and manage applications using AWS services, including Lambda, ECS, S3, CloudFormation, and CDK. Implement CI …
Columbia, South Carolina, United States Hybrid / WFH Options
Systemtec Inc
in big data technologies and cloud-based technologies: AWS Services, State Machines, CDK, Glue, TypeScript, CloudWatch, Lambda, CloudFormation, S3, Glacier Archival Storage, DataSync, Lake Formation, AppFlow, RDS PostgreSQL, Aurora, Athena, Amazon MSK, Apache Iceberg, Spark, Python. ONSITE: Partially onsite 3 days per week (Tue, Wed, Thurs) and as needed. Standard work hours: 8:30 AM - 5:00 PM … related degree; 6 years of application development, systems testing. Nice to have: AWS Redshift, Databricks Delta Lake, Unity Catalog, data engineering and processing using Databricks, AI and Machine Learning, Amazon Bedrock, AWS SageMaker, Unified Studio, R Studio/Posit Workbench, R Shiny/Posit Connect, Posit Package Manager, AWS Data Firehose, Kafka, Hive, Hue, Oozie, Sqoop, Git/Git …
Use Terraform to automate infrastructure provisioning, deployment, and configuration, ensuring efficiency and repeatability in cloud environments. Database Design & Optimisation: Design and optimise complex SQL queries and relational databases (e.g., Amazon Redshift, PostgreSQL, MySQL) to enable fast, efficient data retrieval and analytics. Data Transformation: Apply ETL/ELT processes to transform raw financial data into usable insights for business intelligence … understanding of data engineering concepts, including data modelling, ETL/ELT processes, and data warehousing. Proven experience with AWS services (e.g., S3, Redshift, Lambda, ECS, ECR, SNS, EventBridge, CloudWatch, Athena, etc.) for building and maintaining scalable data solutions in the cloud. Technical Skills (must have): Python: Proficient in Python for developing custom ETL solutions, data processing, and integration with …
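The "custom ETL solutions" this listing asks for are, at their core, parse-validate-derive transforms. A hedged sketch of such a step in plain Python, with an entirely hypothetical trade-record schema (in the real pipeline this logic would run in a Lambda or ECS task and load into Redshift):

```python
import csv
import io

def transform_trades(raw_csv: str) -> list:
    """Parse raw trade CSV, drop malformed rows, and derive notional value.

    This is only the transform step of an ETL job; the extract (S3 read)
    and load (Redshift COPY) steps are omitted. Field names are invented.
    """
    out = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        try:
            qty = int(row["quantity"])
            price = float(row["price"])
        except (KeyError, ValueError):
            continue  # skip rows that fail basic validation
        out.append({
            "symbol": row["symbol"].upper(),
            "quantity": qty,
            "notional": round(qty * price, 2),  # derived column
        })
    return out

raw = "symbol,quantity,price\naapl,10,189.50\nmsft,bad,1.0\ngoog,5,140.25\n"
print(transform_trades(raw))
```

Note the malformed `msft` row is silently dropped here; a production job would more likely route rejects to a dead-letter location and emit a CloudWatch metric.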
Platform Engineer to join our team and help develop and maintain a high-scale, fully serverless cloud data platform. Our product ingests and processes large volumes of data into Amazon S3 using a modern architecture based on AWS serverless services and enables powerful querying and analytics through Amazon Athena. In this role, you'll work on a system …
and building agent AI systems. Our technology stack: Python and associated ML/DS libraries (scikit-learn, NumPy, LightGBM, Pandas, TensorFlow, etc.), PySpark, AWS cloud infrastructure (EMR, ECS, S3, Athena, etc.), MLOps (Terraform, Docker, Airflow, MLflow, Jenkins). On-call statement: Please be aware that our Machine Learning Engineers are required to be a part of the technology on-call …
data architecture principles and how these can be practically applied. Experience with Python or other scripting languages. Good working knowledge of the AWS data management stack (RDS, S3, Redshift, Athena, Glue, QuickSight) or the Google data management stack (Cloud Storage, Airflow, BigQuery, Dataplex, Looker). About Our Process: We can be flexible with the structure of our interview process if …
and data ingestion tools such as Airflow and Stitch, along with Python scripting for integrating diverse data sources. Large-scale data processing: Proficient with distributed query engines like AWS Athena or SparkSQL for working with datasets at the scale of billions of rows. Event streaming data: Experienced in working with live streamed event data, including transforming and modeling real …
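Athena and SparkSQL are both ANSI-SQL engines, so the working skill here is writing aggregation queries of the shape below. This sketch uses stdlib `sqlite3` purely as a stand-in engine so the example is runnable; in production the same SQL text would be submitted to Athena or SparkSQL over billions of rows. Table and column names are invented for illustration:

```python
import sqlite3

# In-memory database standing in for a data-lake table queried via Athena.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, event TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [("u1", "click", "2024-05-01"), ("u1", "view", "2024-05-01"),
     ("u2", "click", "2024-05-02")],
)

# A typical event-rollup query: count events by type, most frequent first.
top = conn.execute(
    """
    SELECT event, COUNT(*) AS n
    FROM events
    GROUP BY event
    ORDER BY n DESC
    """
).fetchall()
print(top)  # [('click', 2), ('view', 1)]
```

At Athena scale the practical differences are partition pruning (filtering on partition columns such as a date) and columnar formats like Parquet, which keep the scanned-bytes cost of queries like this down.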
on OCR use-cases and LLM applications within AWS environments. Key Responsibilities: - AWS Data Science Tools: Hands-on with SageMaker, Lambda, Step Functions, S3, Athena. - OCR Development: Experience with Amazon Textract, Tesseract, and LLM-based OCR. - Python Expertise: Skilled in Pandas, NumPy, scikit-learn, PyTorch, Hugging Face Transformers; modular, testable code. - ML Models: Proficient in regression, classification, clustering, and … needs into data-driven solutions and actionable insights. - Stakeholder Engagement: Communicate effectively across technical and non-technical teams. - Data Engineering: Basic skills in SQL and big data tools (e.g., Athena). - Experimentation: A/B testing, statistical analysis, performance metrics. - Compliance: Knowledge of data privacy (GDPR), PII handling. - Agile Working: Experience in Agile/Scrum teams (Jira, Azure DevOps …). Essential Skills & Experience: - 5-7 years in a Data Science role - Strong experience with Amazon Bedrock and SageMaker - Python integration with APIs (e.g., ChatGPT) - Demonstrable experience with LLMs in AWS - Proven delivery of OCR and document parsing pipelines …
flows using Apache Kafka, Apache NiFi and MySQL/PostgreSQL. Develop within the components in the AWS cloud platform using services such as Redshift, SageMaker, API Gateway, QuickSight, and Athena. Communicate with data owners to set up and ensure configuration parameters. Document SOPs related to streaming configuration, batch configuration or API management depending on role requirement. Document details of … and problem-solving skills. Experience in instituting data observability solutions using tools such as Grafana, Splunk, AWS CloudWatch, Kibana, etc. Experience in container technologies such as Docker, Kubernetes, and Amazon EKS. Qualifications: Ability to obtain an active Secret clearance or higher. Bachelor's Degree in Computer Science, Engineering, or other technical discipline required, OR a minimum of 8 years equivalent …
and customise them for different use cases. Develop data models and Data Lake designs around stated use cases to capture KPIs and data transformations. Identify relevant AWS services, such as Amazon EMR, Redshift, Athena, Glue, and Lambda, to design an architecture that can support client workloads/use-cases; evaluate pros/cons among the identified options to arrive at …
Collaborate with development teams to design and implement automated tests for microservices, emphasizing Spring Boot and Java-based architectures. Implement testing strategies for AWS data lakes (e.g., S3, Glue, Athena) with a focus on schema evolution, data quality rules, and performance benchmarks, prioritizing data lake testing over traditional SQL approaches. Automate data tests within CI/CD workflows to … maintain scalable test automation frameworks, with a focus on backend, API, and data systems using tools like Pytest and Postman. Expertise in Pandas, SQL, and AWS analytics services (Glue, Athena, Redshift) for data profiling, transformation, and validation within data lakes. Solid experience with AWS (S3, Lambda, EMR, ECS/EKS, CloudFormation/Terraform) and understanding of cloud-native architectures …
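The "data quality rules" this QA role refers to are typically small assertion functions run over a sample of lake data and wired into CI/CD. A minimal sketch, with invented column names and rule set; each function returns a pass/fail flag plus the offending values so failures are debuggable:

```python
def check_not_null(rows, column):
    """Fail if any row has a NULL/missing value in `column`.

    Returns (passed, indices_of_bad_rows).
    """
    bad = [i for i, r in enumerate(rows) if r.get(column) is None]
    return (len(bad) == 0, bad)

def check_unique(rows, column):
    """Fail if `column` contains duplicate values (e.g. a primary key).

    Returns (passed, duplicated_values).
    """
    seen, dupes = set(), []
    for r in rows:
        v = r.get(column)
        if v in seen:
            dupes.append(v)
        seen.add(v)
    return (len(dupes) == 0, dupes)

# Sample rows as a data-lake query result might return them.
sample = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 2, "amount": 5.0},
]
print(check_not_null(sample, "amount"))  # (False, [1])
print(check_unique(sample, "id"))        # (False, [2])
```

In a Pytest suite each rule becomes a test function asserting the first element of the tuple; in CI/CD the same checks can gate promotion of newly landed partitions.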
of data daily using AWS, Kubernetes, and Airflow. With solid software engineering fundamentals, fluent in Java and Python (Rust is a plus). Knowledgeable about data lake systems like Athena, and big data storage formats such as Parquet, HDF5, ORC, focusing on data ingestion. Driven by working in an intellectually engaging environment with top industry minds, where constructive debates …
with Python coupled with strong SQL skills. In addition, you will also have a strong desire to work with Docker, Kubernetes, Airflow and AWS data technologies such as Athena, Redshift, EMR and various other tools in the AWS ecosystem. You would be joining a team of 25+ engineers across mobile, web, data and platform. We look for engineers …
interested in building data and science solutions to drive strategic direction? Based in Tokyo, the Science and Data Technologies team designs, builds, operates, and scales the data infrastructure powering Amazon's retail business in Japan. Working with a diverse, global team serving customers and partners worldwide, you can make a significant impact while continuously learning and experimenting with cutting … working with large-scale data, excels in highly complex technical environments, and above all, has a passion for data. You will lead the development of data solutions to optimize Amazon's retail operations in Japan, turning business needs into robust data pipelines and architecture. Leveraging your deep experience in data infrastructure and passion for enabling data-driven business impact … and optimizing SQL. - Ability to write code in Python for data processing. - Business level English (written and verbal) PREFERRED QUALIFICATIONS - Experience working with AWS technologies (e.g., Lambda, CloudWatch, QuickSight, Athena, RDS, etc.). - Experience in MLOps, generative AI, large language models (LLMs), and collaborating with data science teams. - Experience providing technical leadership and mentoring other engineers on best practices …
with data privacy regulations. Technical Competencies: The role is a hands-on technical leadership role with advanced experience in most of the following technologies. Cloud Platforms: AWS (Amazon Web Services): Knowledge of services like S3, EC2, Lambda, RDS, Redshift, EMR, SageMaker, Glue, and Kinesis. Azure: Proficiency in services like Azure Blob Storage, Azure Data Lake, VMs, Azure … Lake Formation, Azure Purview. Data Security Tools: AWS Key Management Service (KMS), Azure Key Vault. Data Analytics & BI: Visualization Tools: Tableau, Power BI, Looker, and Grafana. Analytics Services: AWS Athena, Amazon QuickSight, Azure Stream Analytics. Development & Collaboration Tools: Version Control: Git (and platforms like GitHub, GitLab). CI/CD Tools: Jenkins, Travis CI, AWS CodePipeline, Azure DevOps …
ll work across engineering, analytics, and AI functions to ensure data is accessible, reliable, and actionable. Key Responsibilities: Build, maintain, and optimise batch and streaming pipelines using AWS Glue, Athena, Redshift, and S3. Use prompt engineering techniques (training provided) to support LLM-based automation and testing. Collaborate with platform teams to embed GenAI tools (e.g., Cursor, Gemini, Claude) into … of experience in data or analytics engineering. Proficiency in Python and SQL, with strong debugging and performance tuning skills. Experience building pipelines with AWS services such as Glue, S3, Athena, Redshift, and Lambda. Familiarity with orchestration tools (e.g., Airflow, Step Functions) and DevOps practices (e.g., CI/CD, Infrastructure as Code). Interest in Generative AI and a willingness …
Cucumber) Linux environments AWS environment experience MariaDB, Oracle, MySQL, AWS Aurora (2 or more) 2+ years in MySQL scripting within cloud Data Migration projects AWS data migration testing, including Athena, Data Migration Service, or similar tools Additional skills include: Self-management and proactive decision-making with risk assessment Ability to translate technical concepts for non-technical stakeholders Knowledge sharing …