JOB DESCRIPTION:
The role of the Sr. Data Architect is to collaborate with data scientists, researchers, users, modeling and simulation engineers, user experience designers, software engineers, digitizers, cataloguers, and system administrators to design a system that collects, manages, and converts raw data from the DTRIAC archive into usable information for U.S. deterrent missions.
- Design tools and methodologies to process the digital collection in a production mode.
- Develop and implement methods for, and configurations of, the Data Lake to support navigation, search, insertion, or extraction of information or files by the government or other performers without requiring proprietary software, tools, or data other than widely available commercial-off-the-shelf (COTS) tools, and software that can be authorized for use on government IT systems.
- Develop, maintain, and improve capabilities, such as scripting, to efficiently perform maintenance, synchronization, and production processing of data in the Data Lake on Windows- and Linux-based IT systems, including HPCMP clusters.
- Implement, configure, functionally test, and operate the data and applications of the Advanced Search and Discovery (ASD) environment as a hosted capability on government IT systems.
- Leverage the collection, capabilities, and team to perform targeted analyses and studies and to provide dedicated support to missions and end users.
- Create documentation or training materials for Project Products.
- Support integration or hosting of capabilities or products on government IT systems.
- Hold and participate in Gate Reviews.
- Other duties as assigned.
REQUIRED SKILLS AND EXPERIENCE:
- 5+ years of relevant experience.
- Experience building and maintaining secure, end-to-end systems and services.
- Experience building and working with data pipelines and large data sets.
- Experience with schema design and data modeling.
- Deep understanding of algorithms and efficient data structures.
- Current Security+ Certification or equivalent required.
- Proficiency with Python, C++, SQL, and C# required.
- Experience with optical character recognition (OCR) and machine learning technologies and methodologies required.
- Experience and demonstrable proficiency with OpenCV and PostgreSQL are desirable.
- Experience or proficiency using Tesseract OCR with Python is desirable.
- Experience developing and implementing recurrent neural network (RNN) algorithms and integrating long short-term memory (LSTM) is highly desirable.
- Experience with Academy Color Encoding System (ACES) developer tools for integrating data specifications into software and hardware is a plus.
- Experience with the following types of tools is a plus: SAS, Apache Hadoop, Tableau, TensorFlow, BigML, KNIME, RapidMiner, Apache Flink, DataRobot, Apache Spark, MongoDB, Trifacta, Minitab, Apache Kafka, QlikView, Julia, SPSS, Keras, Matplotlib, PyTorch, scikit-learn, Weka, Domino Data Science Platform, IBM Watson Studio, and Google Cloud AI Platform.
CITIZENSHIP/SECURITY CLEARANCE REQUIREMENTS:
- Must be a U.S. citizen.
- Secret clearance required.