Seeking Indigenous Data Sovereignty in the Age of Data Scraping: The Digital Archive of Indigenous Language Persistence (DAILP)
Written Communication: An International Quarterly of Research, Theory, and Application
Published online on May 12, 2026
Abstract
Written Communication, Ahead of Print.
The problem of data extraction to feed LLMs impacts all digital archives and open-sourced initiatives. Though the practice of data-scraping bots used to create the LLMs that feed algorithms is recent, extractivist models of language documentation are ...