Ecosystems benefit from growth and fire -- the same is true for data ecosystems.
Measuring the impact of data levers is a scientific process.
We can pose questions like: if some group of people acts to add, withhold, or alter some data records, how will the capabilities of the affected AI systems change?
For instance, how will AI art capabilities change if a firm releases a model trained with no externally scraped data? Or, how will code-focused language model capabilities change after GitHub users are offered an opt-out?
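To make the shape of such a measurement concrete, here is a minimal sketch of a counterfactual comparison: train one toy model on all records, train another after some records are withheld, and compare both on the same held-out benchmark. Everything in it is an assumption for illustration. The synthetic data, the centroid "model", the accuracy "benchmark", and the names `make_records`, `train`, `accuracy`, and `measure_lever` are hypothetical, and withholding records uniformly at random is a simplification that real opt-outs are unlikely to follow.

```python
import random
from statistics import mean


def make_records(n, rng):
    """Synthetic 1-D records (feature, label); class 1 is shifted to the right."""
    records = []
    for _ in range(n):
        label = rng.randint(0, 1)
        records.append((rng.gauss(2.0 * label, 1.0), label))
    return records


def train(records):
    """Toy stand-in for a model: one centroid per class present in the data."""
    by_label = {}
    for x, y in records:
        by_label.setdefault(y, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}


def accuracy(model, records):
    """Toy stand-in for a capability benchmark: nearest-centroid accuracy."""
    correct = 0
    for x, y in records:
        predicted = min(sorted(model), key=lambda label: abs(x - model[label]))
        correct += predicted == y
    return correct / len(records)


def measure_lever(train_records, test_records, fraction_withheld, rng):
    """Compare a baseline model with one trained after records are withheld.

    Assumption: withheld records are chosen uniformly at random.
    """
    n_keep = max(1, round(len(train_records) * (1 - fraction_withheld)))
    kept = rng.sample(train_records, n_keep)
    baseline = accuracy(train(train_records), test_records)
    after = accuracy(train(kept), test_records)
    return {"baseline": baseline, "after_lever": after, "change": after - baseline}


rng = random.Random(0)  # fixed seed so the comparison is repeatable
train_records = make_records(200, rng)
test_records = make_records(200, rng)
for fraction in (0.0, 0.5, 0.9):
    print(fraction, measure_lever(train_records, test_records, fraction, rng))
```

The same structure applies to adding or altering records: only the step that builds `kept` changes, while the baseline and the benchmark stay fixed so that any difference can be attributed to the lever.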
For now, this page just includes a list of systems to watch. In the future, we'll provide summaries of how the capabilities of lever-impacted systems differ. For connections between measuring data lever impacts and scaling laws, see this notebook.
This page is still under construction. We'll add more discussion of scaling laws and specific capability benchmarks soon!