Systems Engineer
Leidos
- Washington Grove, MD
- Permanent
- Full-time
- Evaluate, format, and maintain data from multiple sources; develop and manage data processing workflows in the customer's pipeline; and evaluate, extract, and process system and event logs.
- Develop and maintain processing pipelines and documentation; assist with identifying, classifying, and verifying incoming dataset types; and identify and implement COTS and Open Source Data Enhancement capabilities.
- Code scripts and software modules on an as-needed basis to triage and convert datasets that arrive in non-standard formats into a de-normalized row-and-column data model.
- Store, standardize, extract, reference, and validate data; develop and deploy processing pipelines to cloud analytics platforms; and develop requirements to support data analytics customers and integration partners.
- Develop software standards and practices documentation and support Agile development.
- Active TS/SCI with Polygraph security clearance.
- Typically requires a BS degree and 12+ years of prior relevant experience, or a Master's degree with 10+ years of prior relevant experience. May possess a Doctorate in a technical domain. Additional years of prior relevant experience may be substituted for a degree.
- Demonstrated experience utilizing big data processing technologies such as Spark, PySpark, and Python.
- Demonstrated experience with data mapping, extraction, transformation, and loading.
- Demonstrated experience building analytic reports in tools such as CloudWatch and Kibana.
- Demonstrated experience using AWS Step Functions to coordinate ETL pipelines.
- Demonstrated experience processing and converting OS and data logs into reports and metrics dashboards.
- Demonstrated experience with Regular Expressions (RegEx).
- Demonstrated experience with SQL, MySQL, and PostgreSQL.
- Demonstrated experience processing data file types such as XML and JSON.
- Demonstrated experience with IDEs and data modeling using notebooks and Visual Studio.
- Demonstrated experience ETL'ing data from disparate structured and unstructured data formats into enriched, query-friendly structured data in indexed files.
- Demonstrated experience performing extensive data review and data quality analysis.
- Demonstrated experience developing ETL design documentation including source and target mapping and data dictionary information.
- Demonstrated experience interfacing with customers and integration partners to gather and clarify detailed objectives.
- Demonstrated experience supporting Agile development by contributing to task definition, scoping, and review.
- Demonstrated experience deploying capabilities on the Databricks unified analytics platform.
- Demonstrated experience in optimizing Databricks Delta Tables for query, merge and stream operations.
- Demonstrated experience with CI/CD using Jenkins.
- Demonstrated experience in joining multiple complex data sets using Spark.
- Demonstrated experience in tuning Spark streaming and batch jobs for cluster utilization and speed.
- Demonstrated experience in deploying complex, notebook-based pipelines.
- Demonstrated experience in Python data analysis libraries such as Pandas.
- Demonstrated experience utilizing Cloud services such as Lambda, SNS/SQS, or EC2.
- Demonstrated experience with DevOps tools including CloudWatch, Lambda, SQS, DynamoDB, and RDS.
- Demonstrated experience working with Elasticsearch, Logstash, and Kibana (the ELK stack).