This position is located in the Division of Insurance and Research, Large Data Management Section, and leads the performance of complex analytics, including the use of machine learning and artificial intelligence techniques, on large datasets to support the mission of the division and the Corporation.
According to the posting, duties for the Data Scientist include:
- Provides technical leadership and serves as team lead on complex data analysis projects, including projects that use complex analytic approaches common to the field of data science including AI/ML techniques, natural language processing (NLP), statistical analysis, geographic analysis, data visualizations, and application/model development; to identify and monitor risks to the financial system and the deposit insurance fund, and to inform the development of banking policy and the conduct of supervision and resolution-related activities.
- Leads the development of auditable, flexible, repeatable, and scalable extract, transform, and load (ETL) or extract, load, and transform (ELT) capabilities on large structured and unstructured data sources in a variety of formats from multiple sources, including internal and external application programming interfaces (APIs).
- Leads in the development of applications and visualizations of geographic data for analyses of climate and other risks; demonstrates proficiency in evolving geographical information system (GIS) mapping capabilities, including internal and external geographic datasets and mapping software (such as Google Maps API or ArcGIS).
- Utilizes software and/or programming languages to automate complex and intersecting data orchestration workflows. Leads the development of utilities or scripts to automate task execution and the instantiation of data pipelines, and to programmatically deploy and update data visualizations, data sources, and other analytic tools. Develops processes and algorithms for data engineering, including data manipulation, cleaning, analytics, and visualization, using appropriate software and programming language(s) (such as Python, GIS software, SQL, R, and Microsoft Azure solutions).
- Develops and maintains expertise in the efficient use of a variety of cloud technologies (including Microsoft Azure Databricks and Apache Airflow), statistical programming languages and software (including Python and SQL), distributed computing software and databases (such as Apache Spark and PySpark), and GIS software (such as Google Maps API and ArcGIS).