Senior Python Developer with Spark
Job Description
We are seeking a Senior Python Developer with Spark experience to design, build, and optimize large-scale data processing systems in a cloud-native AWS environment. The role is based in Reston, VA, with hybrid work options.
Responsibilities
- Design, implement, and optimize large-scale data processing pipelines in a cloud-native AWS environment.
- Build high-performance data pipelines using Spark and Python while handling complex relational datasets.
- Identify bottlenecks in data workflows, optimize Spark jobs, and author advanced SQL to support hierarchical and analytical use cases.
- Collaborate with cross-functional teams to deliver scalable, reliable data solutions that support critical business functions in a financial services setting.
Requirements
- 8+ years of hands-on Python experience in data engineering, including building and maintaining data pipelines.
- Deep expertise in Apache Spark, including partitioning, caching, broadcast joins, shuffle optimization, understanding of DAGs/stages/tasks, and memory/resource management.
- Extensive experience with big data ecosystems such as Hadoop, Hive, and EMR.
- Advanced proficiency in SQL, including recursive CTEs for hierarchical data, query optimization, indexing strategies, and execution plan analysis.
- Strong experience with AWS services including EMR, Lambda, Step Functions, EventBridge, Redshift, S3, and Glue.
- Experience building and consuming APIs, along with data transformation and ingestion workflows.
- Proven ability to work with large-scale datasets, performing data analysis and extracting actionable insights.
- Familiarity with data modeling concepts (normalized/denormalized structures, handling hierarchical data).
- Hands-on experience with CI/CD pipelines and tools such as GitLab and Terraform.
- Strong understanding of performance troubleshooting in distributed systems and identifying bottlenecks.
- Ability to clearly explain technical decisions, especially around Spark optimization and SQL logic.
- Strong analytical thinking, problem-solving skills, and attention to detail.
- Effective collaboration skills in cross-functional, matrixed environments.
- Education: Bachelor's degree in Computer Science, Information Systems, or a related field.
- Desired: Experience in financial services or regulated environments.
- Desired: Exposure to data visualization tools.
- Desired: Familiarity with event-driven architectures on AWS.
Technologies
- Python
- Apache Spark
- Hadoop
- Hive
- EMR
- AWS
- Lambda
- Step Functions
- EventBridge
- Redshift
- S3
- Glue
- SQL
- GitLab
- Terraform
- APIs
Benefits
- Competitive compensation
- Comprehensive insurance options
- Matching contributions through the 401(k) plan and the share purchase plan
- Paid time off for vacation, holidays, and sick time
- Paid parental leave
- Learning opportunities and tuition assistance
- Wellness and Well-being program
Skills
- Amazon Web Services Cloud
- Detail-oriented
- GitHub
- Problem Solving
- Python
- SQL
- Terraform