excellent collaboration skills. Grit in the face of technical obstacles. Nice to have: Building SDKs or client libraries to support API consumption. Knowledge of distributed data processing frameworks (Spark, Dask). Understanding of GPU orchestration and optimization in Kubernetes. Familiarity with MLOps and ML model lifecycle pipelines. Experience with AI model training and fine-tuning. Familiarity with event-driven architecture.
field (or equivalent experience). 3-5 years of experience in data engineering (healthcare/medical devices preferred but not required). Strong Python programming and data engineering skills (Pandas, PySpark, Dask). Proficiency with databases (SQL/NoSQL), ETL processes, and modern data frameworks (Apache Spark, Airflow, Kafka). Solid experience with cloud platforms (AWS, GCP, or Azure) and CI/CD for
/monitoring and ML inference services - Proficiency in creating and optimizing high-throughput ETL/ELT pipelines using a Big Data processing engine such as Databricks Workflows, Spark, Flink, Dask, dbt, or similar - Experience building software and/or data pipelines in the AWS cloud (SageMaker Endpoints, ECS/EKS, EMR, Glue). Why Proofpoint: Protecting people is at the heart