Cost analysis for Maine’s residential retail electricity suppliers
Portland, Maine • Jan 2015 — Present
Developed a data pipeline and framework for comparing Maine’s retail electricity supplier prices with the state’s standard offer. The Maine Public Utilities Commission validated the framework and adopted it in its own study. The analysis showed that Maine customers could have saved $180.5 million from 2012 to 2024. The initial analysis at the Bangor Daily News was done manually and visualized in Tableau. The updated project uses dlt for data ingestion, dbt for transformations, and the Observable Framework to display interactive visualizations.
Semantic similarity analysis in Maine legislative testimony
Portland, Maine • Sep 2024 — Present
Built end-to-end pipeline for 20,000+ bills with semantic clustering (0.535 silhouette score) and a vector embedding system for 3.6M sentences using Hugging Face Transformers. Created vector search in DuckDB for real-time similarity queries, enabling analysis of testimony influence patterns for Sierra Club advocacy.
Improving data quality and data model for healthcare client deliverables: Crossover and Disproportionate Share Hospital reporting
Portland, Maine • Jan 2024 — Dec 2025
Replaced brittle Perl/VBA pipeline with Python ETL for Medicare EDI 835 files and PDF remittance advice, implementing pattern-based extraction, DuckDB storage, and SQL view transformations. Cut annual development costs by 90% while improving modularity, speed, and data quality.
Bias detection in Paycheck Protection Program funding
Portland, Maine • Jan 2022 — Jun 2022
Identified statistically significant biases in PPP loan distribution during COVID-19 by integrating multiple data sources and applying regression modeling that controlled for confounding variables, including rural-urban differences and industry concentration.