04:00
2026-06-30
arxiv.org
natural-language-processing
Open but Incompatible: A License Compatibility Analysis of Corpora for Low-Resource African Languages
A new audit of over twenty African NLP corpus families reveals widespread license incompatibilities, including CC-BY-SA and CC-BY-NC datasets that cannot be legally combined and NoDerivs clauses that …