Topic: collaboration academic institutions
EleutherAI Launches Huge Open-Source AI Training Dataset
EleutherAI released Common Pile v0.1, an 8TB open-source dataset of licensed and public-domain text, for training AI models without copyright issues. Models trained on it, such as Comma v0.1-1T and Comma v0.1-2T, are reported to match the performance of proprietary models. The dataset addresses legal concerns in AI training by using vetted sources like ...