Beyond Open vs. Closed: Enabling Public-Private Collaboration with Semi-Synthetic Datasets
Data too sensitive to be "open" typically remains "closed" as proprietary information. This dichotomy undermines efforts to make algorithmic decision systems more fair, transparent, and accountable. Government agencies need access to proprietary data to enforce policy, researchers need it to evaluate methods, and the public needs it to hold agencies accountable. All of these needs must be met while preserving individual privacy and giving data owners oversight of how their data is used. In this talk, I’ll describe the algorithms we’re developing to generate privacy-preserving and bias-corrected synthetic datasets, and touch on the legal protections that govern their use. These datasets are intended to be shared with academic and private collaborators so that they can experiment with advanced analytics without incurring significant legal risk, and to focus attention on pressing problems in housing, education, and mobility.