Fresh Listing

Posted: March 26, 2026 (1 day ago)

This job was posted recently. Fresh listings typically have less competition.

Data Engineer

Environmental Protection Agency

Other Agencies and Independent Organizations

Salary

$110,865 - $158,322

per year

Closes

April 2, 2026

GS-12 Pay Grade

Base salary range: $74,441 - $96,770

Typical requirements: 1 year of specialized experience at the GS-11 level, or an advanced degree plus significant experience.

Note: Actual salary includes locality pay (15-40%+ depending on location).

Job Description

Summary

This data engineer role at the EPA's Office of Inspector General, the agency's independent oversight office, involves building and managing systems that process and organize large volumes of environmental and operational data, keeping it accurate and secure for audits, investigations, and efficiency improvements.

The work supports the agency's mission to protect the environment by preventing waste and fraud through reliable data handling.

A good fit would be someone with strong technical skills in data processing and a detail-oriented mindset, ideally with experience in government or compliance-focused environments.

Key Requirements

  • One year of specialized experience at GS-12 level or equivalent, including building and maintaining ETL pipelines for large, disparate data sources
  • Proficiency in orchestrating pipelines with tools like Airflow or Luigi and accelerating processing with Spark or PySpark
  • Experience implementing data architecture, governance, and compliance in on-premises and cloud environments (e.g., AWS)
  • Skills in version control, collaborative workflows, code reviews, and documenting data pipelines for traceability and lineage
  • Ability to apply advanced data modeling and analytics techniques to meet oversight and customer needs
  • Testing and monitoring data pipelines to ensure accuracy, optimization, and functionality
  • Demonstrated competencies in coding/scripting, database management, critical thinking, and technical documentation
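For applicants sizing up the ETL requirements above, here is a rough, stdlib-only Python sketch of the extract-transform-load pattern the listing describes. All field names and data are invented for illustration; the actual role uses orchestration and processing tools such as Airflow/Luigi and Spark/PySpark at far larger scale:

```python
import csv
import io

def extract(raw_csv: str) -> list:
    """Extract: parse rows from a raw CSV feed (here, an in-memory string)."""
    return list(csv.DictReader(io.StringIO(raw_csv)))

def transform(rows: list) -> list:
    """Transform: normalize key fields, coerce types, drop incomplete records."""
    out = []
    for row in rows:
        if not row.get("site_id"):
            continue  # skip records missing the key field
        out.append({
            "site_id": row["site_id"].strip().upper(),
            "reading": float(row["reading"]),
        })
    return out

def load(rows: list, store: dict) -> int:
    """Load: upsert cleaned records into a destination keyed by site_id."""
    for row in rows:
        store[row["site_id"]] = row["reading"]
    return len(rows)

# Hypothetical feed: one messy-but-valid row, one unusable row, one clean row.
raw = "site_id,reading\n ep-001 ,4.2\n,9.9\nep-002,3.1\n"
store = {}
loaded = load(transform(extract(raw)), store)
```

In practice each stage would be a task in an orchestrated DAG rather than a plain function call, but the shape of the work is the same.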

Full Job Description

We are an independent office within the EPA that helps the agency protect the environment in a more efficient and cost-effective manner.

We perform audits, evaluations, and investigations of the EPA to promote economy and efficiency, and to prevent and detect fraud, waste, and abuse.

We strive to provide solutions to problems that ultimately result in making America a cleaner and healthier place.

NOTE: You must meet qualification requirements by the closing date of this announcement.

Specialized experience is experience that has equipped the candidate with the particular knowledge, skills, and abilities to perform successfully the duties of the position.

To qualify, you must have 1 year (52 weeks) of full-time specialized experience equivalent to at least the GS-12 level, defined as:

1. Building, documenting, and maintaining production-grade advanced extraction, transformation, and loading (ETL) pipelines for ingesting and transforming large, disparate data from databases, APIs, and web-scraped feeds, orchestrated with Airflow/Luigi or similar technologies, and accelerating processing with Spark/PySpark or similar tools that feed into different IT systems including, but not limited to, on-premises SQL and Oracle databases, Amazon Web Services (AWS) environments, BI reporting platforms, and Microsoft Office 365;

2. Implementing data architecture, governance, and compliance solutions for on-premises and cloud environments;

3. Utilizing version control, collaborative workflows, code reviews, and clear documentation of data pipeline structures for traceability across the entire lifecycle of the data, tracking its flow, transformations, and usage from origin to destination to maintain data lineage across the various disparate source systems and to provide transparency to data scientists, data analysts, auditors, evaluators, investigators, and quality assurance teams;

4. Applying advanced techniques for data modeling and analytics to meet customer needs; and

5. Testing and monitoring data pipelines to ensure the underlying data used for analysis is accurate, functioning as intended, and optimized.

Your resume will be assessed for the following competencies: Coding and Scripting, Data Systems, Self Management, Database Management Systems, Information Technology Architecture, Critical Thinking, Data Literacy and Analysis, Statistical Techniques, Information Technology Research and Development, Technology Management, Software Engineering, Technical Documentation, Written Communication, Attention to Detail, and Quality Assurance.

Evidence of the above competencies and specialized experience must be supported by detailed documentation of duties performed in positions held.

Applicants must meet the qualifications for this position within thirty (30) days of the closing date of this announcement.

Major Duties

The ideal candidate will perform the following duties:

Builds, documents, and maintains production-grade advanced extraction, transformation, and loading (ETL) pipelines for ingesting and transforming large, disparate data from databases, APIs, and web-scraped feeds, orchestrated with Airflow/Luigi or similar technologies, and accelerates processing with Spark/PySpark or similar tools that feed into different IT systems including, but not limited to, on-premises SQL and Oracle databases, Amazon Web Services (AWS) environments, BI reporting platforms, and Microsoft Office 365.

Implements data architecture, governance, and compliance solutions for on-premises and cloud environments. Applies advanced techniques for data modeling and analytics to meet oversight needs.

Provides technical leadership in the development or modernization of IT data systems.

Utilizes version control, collaborative workflows, code reviews, and clear documentation of data pipeline structures for traceability across the entire lifecycle of the data, tracking its flow, transformations, and usage from origin to destination. This maintains data lineage across the various disparate source systems and provides transparency to data scientists, data analysts, auditors, evaluators, investigators, and quality assurance teams.
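The lineage responsibility described above can be sketched in miniature. This is a hypothetical, stdlib-only Python illustration of recording a dataset's origin, transformation steps, and destination; real lineage tooling is far richer, and every name here is invented:

```python
from dataclasses import dataclass, field

@dataclass
class LineageRecord:
    """Tracks one dataset's path from origin to destination for audit transparency."""
    dataset: str
    origin: str
    steps: list = field(default_factory=list)
    destination: str = ""

    def add_step(self, description: str) -> None:
        # Record one transformation applied to the data.
        self.steps.append(description)

    def deliver(self, destination: str) -> None:
        # Mark where the transformed data finally landed.
        self.destination = destination

    def trace(self) -> str:
        # Render the full origin-to-destination path for auditors and analysts.
        return " -> ".join([self.origin, *self.steps, self.destination or "(in flight)"])

# Hypothetical dataset moving from an API feed through two transformations.
rec = LineageRecord(dataset="facility_readings", origin="regional_api_feed")
rec.add_step("dedupe on site_id")
rec.add_step("convert units to mg/L")
rec.deliver("audit_warehouse.readings")
```

The point of a record like this is that anyone downstream can see exactly how a value reached them, which is what auditors and quality assurance teams need.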

Consistently and routinely tests and monitors data pipelines to ensure the underlying data used for analysis is accurate, functioning as intended, and optimized to provide the most up-to-date information.
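As a small illustration of that testing-and-monitoring duty, here is a hypothetical Python data-quality check that flags missing or out-of-range values in pipeline output; the thresholds and field names are invented for the example:

```python
def check_pipeline_output(rows, lo=0.0, hi=1000.0):
    """Return a list of data-quality failures found in pipeline output rows."""
    failures = []
    if not rows:
        failures.append("empty output")
    for i, row in enumerate(rows):
        reading = row.get("reading")
        if reading is None:
            failures.append(f"row {i}: missing reading")
        elif not (lo <= reading <= hi):
            failures.append(f"row {i}: reading {reading} out of range")
    return failures

# A clean batch passes; a batch with an out-of-range and a missing value fails twice.
clean = [{"site_id": "EP-001", "reading": 4.2}]
dirty = [{"site_id": "EP-002", "reading": -5.0}, {"site_id": "EP-003", "reading": None}]
```

Checks like these would typically run automatically after each pipeline run, with failures alerting the team before bad data reaches analysts.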


Posted on USAJOBS: 3/26/2026 | Added to FreshGovJobs: 3/27/2026

Source: USAJOBS | ID: OIG-DAD-M-2026-0002