Senior Data Engineer
Location: 100% Remote
Years' Experience: 10 years...
Education: Bachelor's in IT-related field
Work Authorization: Must show that applicant is legally permitted to work in the United States.
Clearance: Applicants must be able to meet the requirements to obtain a Public Trust security clearance. NOTE: United States Citizenship is required to be eligible to obtain this security clearance.
Key Skills:
• 10 years of IT experience focusing on enterprise data architecture and management
• Experience with Databricks required
• 8 years of experience in Conceptual/Logical/Physical Data Modeling and expertise in Relational and Dimensional Data Modeling
• Experience with Great Expectations or other data quality validation frameworks
• Experience with ETL and ELT tools such as SSIS, Pentaho, and/or Data Migration Services
• Advanced level SQL experience (Joins, Aggregation, Windowing functions, Common Table Expressions, RDBMS schema design, Postgres performance optimization)
• Experience with AWS environment, CI/CD pipelines, and Python (Python 3) a bonus
Responsibilities
• Plan, create, and maintain data architectures, ensuring alignment with business requirements
• Obtain data, formulate dataset processes, and store optimized data
• Identify problems and inefficiencies and apply solutions
• Determine tasks where manual participation can be eliminated with automation
• Identify and optimize data bottlenecks, leveraging automation where possible
• Create and manage data lifecycle policies (retention, backups/restore, etc.)
• In-depth knowledge of creating, maintaining, and managing ETL/ELT pipelines
• Create, maintain, and manage data transformations
• Maintain/update documentation
• Create, maintain, and manage data pipeline schedules
• Monitor data pipelines
• Create, maintain, and manage data quality gates (Great Expectations) to ensure high data quality
• Support AI/ML teams with optimizing feature engineering code
• Expertise in Spark/Python/Databricks, Data Lake, and SQL
• Create, maintain, and manage Spark Structured Streaming jobs, including using the newer Delta Live Tables and/or dbt
• Research existing data in the data lake to determine best sources for data
• Create, manage, and maintain ksqlDB and Kafka Streams queries/code
• Data-driven testing for data quality
• Maintain and update Python-based data processing scripts executed on AWS Lambdas
• Unit tests for all Spark, Python data processing, and Lambda code
• Maintain the PCIS Reporting Database data lake with optimizations and maintenance (performance tuning, etc.)
• Experience streamlining data processing, including formalizing concepts of how to handle late data, defining windows, and how window definitions impact data freshness
Qualifications
• 10 years of IT experience focusing on enterprise data architecture and management
• Experience in Conceptual/Logical/Physical Data Modeling and expertise in Relational and Dimensional Data Modeling
• Experience with Databricks, Structured Streaming, Delta Lake concepts, and Delta Live Tables required
• Additional experience with Spark, Spark SQL, Spark DataFrames and DataSets, and PySpark
• Data Lake concepts such as time travel, schema evolution, and optimization
• Structured Streaming and Delta Live Tables with Databricks a bonus
• Experience leading and architecting enterprise-wide initiatives, specifically system integration, data migration, transformation, data warehouse build, data mart build, and data lake implementation/support
• Advanced level understanding of streaming data pipelines and how they differ from batch systems
• Formalize concepts of how to handle late data, defining windows, and data freshness
• Advanced understanding of ETL and ELT and ETL/ELT tools such as SSIS, Pentaho, Data Migration Service, etc.
• Understanding of concepts and implementation strategies for different incremental data loads such as tumbling window, sliding window, high watermark, etc.
• Familiarity and/or expertise with Great Expectations or other data quality/data validation frameworks a bonus
• Understanding of streaming data pipelines and batch systems
• Familiarity with concepts such as late data, defining windows, and how window definitions impact data freshness
• Advanced level SQL experience (Joins, Aggregation, Windowing functions, Common Table Expressions, RDBMS schema design, Postgres performance optimization)
• Indexing and partitioning strategy experience
• Debug, troubleshoot, design, and implement solutions to complex technical issues
• Experience with large-scale, high-performance enterprise big data application deployment and solutions
• Understanding how to create DAGs to define workflows
• Familiarity with CI/CD pipelines, containerization, and pipeline orchestration tools such as Airflow, Prefect, etc. a bonus but not required
• Architecture experience in AWS environment a bonus
• Familiarity working with Kinesis and/or Lambda, specifically how to push and pull data, how to use AWS tools to view data in Kinesis streams, and how to process massive data at scale, a bonus
• Experience with Docker, Jenkins, and CloudWatch
• Ability to write and maintain Jenkinsfiles for supporting CI/CD pipelines
• Experience working with AWS Lambdas for configuration and optimization
• Experience working with DynamoDB to query and write data
• Experience with S3
• Knowledge of Python (Python 3 desired) for CI/CD pipelines a bonus
• Familiarity with pytest and unittest a bonus
• Experience working with JSON and defining JSON Schemas a bonus
• Experience setting up and managing Confluent/Kafka topics and ensuring performance using Kafka a bonus
• Familiarity with Schema Registry, message formats such as Avro, ORC, etc.
• Understanding how to manage ksqlDB SQL files and migrations and Kafka Streams
• Ability to thrive in a team-based environment
• Experience briefing the benefits and constraints of technology solutions to technology partners, stakeholders, team members, and senior levels of management