Great Data Products
By: Source Cooperative
Language: en
Categories: Technology, Science
A podcast about the ergonomics and craft of data. Brought to you by Source Cooperative.
Episodes
Turning Federal Data Into Action
Jan 10, 2026
Jed talks with Denice Ross, Senior Fellow at the Federation of American Scientists and former U.S. Chief Data Scientist, about federal data's role in American life and what happens when government data tools sunset. Denice led efforts to use disaggregated data to drive better outcomes for all Americans during her time as Deputy U.S. Chief Technology Officer, and now works on building a Federal Data Use Case Repository documenting how federal datasets affect everyday decisions.
The conversation explores why open data initiatives have evolved over the years and how administrative priorities shape public...
Duration: 01:10:01
How Standards Emerge: Lessons from STAC
Dec 27, 2025
[Jed's audio in this sounds terrible because of a hardware setting that Marshall Moutenot very kindly helped us identify. Will sound better in future episodes!]
Jed talks with Matt Hanson from Element 84 about the SpatioTemporal Asset Catalog (STAC) specification and its role in making geospatial data findable and usable. Matt describes STAC as "a simple, developer-friendly way to describe geospatial data so that people can actually find it and use it." The conversation covers how STAC emerged from a 2017 sprint in Boulder with 20 people and grew into a specification now adopted by NASA, USGS, and commercial satellite...
Inside Harvard's data.gov Archive
Nov 21, 2025
Jed talks with Jack Cushman from the Harvard Law School Library Innovation Lab about their project to archive and preserve more than 311,000 datasets from Data.gov. We explore how they use BagIt for long-term preservation, built a serverless search interface that makes 17.9 TB of data discoverable in the browser, and what this means for the future of online archives.
Duration: 01:19:21
Protomaps and PMTiles
Nov 01, 2025
Jed talks with Brandon Liu about building maps for the web with Protomaps and PMTiles. We cover why new formats won't work without a compelling application, how a single-file base map functions as a reusable data product, designing simple specs for long-term usability, and how object storage-based approaches can replace server-based stacks while staying fast and easy to integrate. Many thanks to our listeners from Norway and Egypt who stayed up very late for the live stream!
Links and Resources
- Protomaps – a free, customizable base map you can self-host
- PM...
Duration: 01:17:14
Why LLM Progress is Getting Harder
Oct 02, 2025
Jed Sundwall and Drew Breunig explore why LLM progress is getting harder by examining the foundational data products that powered AI breakthroughs. They discuss how we've consumed the "low-hanging fruit" of internet data and graphics innovations, and what this means for the future of AI development.
The conversation traces three datasets that shaped AI: MNIST (1994), the handwritten digits dataset that became machine learning's "Hello World"; ImageNet (2008), Fei-Fei Li's image dataset that launched deep learning through AlexNet's 2012 breakthrough; and Common Crawl (2007), Gil Elbaz's web crawling project that fueled 60% of GPT-3's training data. Drew argues that...
Duration: 01:51:38