These datasets exist, but they are not available for free on GitHub. They are traded on darknet forums or used by specialized blockchain forensic firms (Chainalysis, CipherTrace).
If you want to build a scanner (e.g., to check whether a specific key you lost is valid), the steps are:
High-quality scanners use CUDA or OpenCL to harness graphics cards. A CPU checks roughly 50,000 keys/sec, while a mid-range GPU can check 200-500 million keys/sec. But even at 500M keys/sec, the probability of finding a funded key in a human lifetime is still essentially zero, because the key space holds close to 2^256 possible keys. "Extra quality" here means the developer understands this and has optimized memory bandwidth.
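To make "essentially zero" concrete, here is a rough back-of-the-envelope calculation in Python. It is only a sketch under stated assumptions: the 80-year lifetime and the figure of 50 million funded addresses are illustrative guesses, not numbers from this article, and the estimate treats every funded address as a 160-bit hash (as in classic pay-to-public-key-hash addresses), which makes a hit far more likely than it would be in the full 256-bit key space. The function names are hypothetical.

```python
# Back-of-the-envelope odds of a random-key scanner ever hitting a funded address.
# All constants below except KEYS_PER_SEC are assumptions made for illustration.

KEYS_PER_SEC = 500_000_000        # upper GPU figure quoted in the text
SECONDS_PER_YEAR = 31_557_600     # 365.25 days
LIFETIME_YEARS = 80               # assumed human lifetime
FUNDED_ADDRESSES = 50_000_000     # assumed order of magnitude, not a measured value
ADDRESS_SPACE = 2 ** 160          # size of a 160-bit address hash space


def keys_checked(rate: int, years: int) -> int:
    """Total keys tested when scanning nonstop at `rate` keys/sec for `years`."""
    return rate * years * SECONDS_PER_YEAR


def hit_probability(rate: int, years: int, funded: int) -> float:
    """Upper bound (union bound) on the chance of at least one hit.

    Each random key lands on one of `funded` targets with probability
    funded / ADDRESS_SPACE; summing over all attempts bounds the total.
    """
    return keys_checked(rate, years) * funded / ADDRESS_SPACE


if __name__ == "__main__":
    total = keys_checked(KEYS_PER_SEC, LIFETIME_YEARS)
    p = hit_probability(KEYS_PER_SEC, LIFETIME_YEARS, FUNDED_ADDRESSES)
    print(f"keys checked in {LIFETIME_YEARS} years: {total:.3e}")
    print(f"upper bound on hit probability: {p:.3e}")
```

Under these assumptions the bound comes out on the order of 10^-23, and raising the scan rate or the number of funded addresses by a factor of a thousand still leaves it vanishingly small.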