DeveloperJobs.io

Job Description

Seeking a Senior Python Developer with Apache Spark expertise to design, build, and optimize large-scale data processing systems in a cloud-native AWS environment. The role is based in Reston, VA, with hybrid work options.

Responsibilities

  • Design, implement, and optimize large-scale data processing pipelines in a cloud-native AWS environment.
  • Build high-performance data pipelines using Spark and Python while handling complex relational datasets.
  • Identify bottlenecks in data workflows, optimize Spark jobs, and author advanced SQL to support hierarchical and analytical use cases.
  • Collaborate with cross-functional teams to deliver scalable, reliable data solutions that support critical business functions in a financial services setting.

Requirements

  • 8+ years of hands-on Python for data engineering, including building and maintaining data pipelines.
  • Deep expertise in Apache Spark, including partitioning, caching, broadcast joins, shuffle optimization, understanding of DAGs/stages/tasks, and memory/resource management.
  • Extensive experience with big data ecosystems such as Hadoop, Hive, and EMR.
  • Advanced proficiency in SQL, including recursive CTEs for hierarchical data, query optimization, indexing strategies, and execution plan analysis.
  • Strong experience with AWS services including EMR, Lambda, Step Functions, EventBridge, Redshift, S3, and Glue.
  • Experience building and consuming APIs, along with data transformation and ingestion workflows.
  • Proven ability to work with large-scale datasets, performing data analysis and extracting actionable insights.
  • Familiarity with data modeling concepts (normalized/denormalized structures, handling hierarchical data).
  • Hands-on experience with CI/CD pipelines and tools such as GitLab and Terraform.
  • Strong understanding of performance troubleshooting in distributed systems and identifying bottlenecks.
  • Ability to clearly explain technical decisions, especially around Spark optimization and SQL logic.
  • Strong analytical thinking, problem-solving skills, and attention to detail.
  • Effective collaboration skills in cross-functional, matrixed environments.
  • Education: Bachelor's degree in Computer Science, Information Systems, or a related field.
  • Desired: Experience in financial services or regulated environments.
  • Desired: Exposure to data visualization tools.
  • Desired: Familiarity with event-driven architectures on AWS.

Technologies

  • Python
  • Apache Spark
  • Hadoop
  • Hive
  • EMR
  • AWS
  • Lambda
  • Step Functions
  • EventBridge
  • Redshift
  • S3
  • Glue
  • SQL
  • GitLab
  • Terraform
  • APIs

Benefits

  • Competitive compensation
  • Comprehensive insurance options
  • Matching contributions through the 401(k) plan and the share purchase plan
  • Paid time off for vacation, holidays, and sick time
  • Paid parental leave
  • Learning opportunities and tuition assistance
  • Wellness and Well-being program

Skills

  • Amazon Web Services Cloud
  • Detail-oriented
  • GitLab
  • Problem Solving
  • Python
  • SQL
  • Terraform
