Tim Bray's Wide Finder project is a wonderful experiment that has been drawing interest from many different people. My favorite solution so far is this one from Fredrik Lundh, in Python. He starts with an intelligent single-threaded version, considers how threads would look, improves on it by switching to multiple processes (it even works on Windows!), and finally brings memory-mapped I/O to the table. I remember a few tricks from my days of trying to rip files off a filesystem as fast as possible (working on backup applications), and the effbot does a great job here. I doubt you can get much faster without dropping down to OS-level primitives, and I doubt that speedup would be worth the loss of portability and maintainability.
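
To make the last two steps of that progression concrete, here is a rough sketch of how multiple processes and memory-mapped I/O can be combined. This is not Fredrik's code. The names `chunk_ranges`, `count_chunk` and `parallel_count` and the `PATTERN` regex are placeholders of my own, and I am assuming a log file with one request per line. Each worker maps the file read-only and counts matches only in the lines that begin inside its byte range, so no line is counted twice or skipped.

```python
import mmap
import os
import re
from collections import Counter
from multiprocessing import Pool

# Hypothetical pattern (an assumption, not from the original post):
# capture the path of a GET request in a log line.
PATTERN = re.compile(rb"GET (/\S+) ")


def chunk_ranges(size, n):
    """Split the byte range [0, size) into at most n contiguous pieces."""
    step = max(1, -(-size // n))  # ceiling division
    return [(lo, min(lo + step, size)) for lo in range(0, size, step)]


def _first_line_start(mm, offset):
    """Return the offset of the first line that begins at or after offset."""
    if offset <= 0:
        return 0
    if offset >= len(mm):
        return len(mm)
    newline = mm.find(b"\n", offset - 1)
    return len(mm) if newline == -1 else newline + 1


def count_chunk(task):
    """Count matches in the lines that begin inside [start, end)."""
    filename, start, end = task
    with open(filename, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Snap both boundaries forward to the next line start, so
            # neighbouring chunks agree on who owns a straddling line.
            lo = _first_line_start(mm, start)
            hi = _first_line_start(mm, end)
            # The regex runs directly over the mapped bytes; nothing is
            # copied except the captured groups.
            return Counter(PATTERN.findall(mm, lo, hi))


def parallel_count(filename, processes=None):
    """Count matches across the whole file using a pool of worker processes."""
    size = os.path.getsize(filename)
    if size == 0:
        return Counter()  # an empty file cannot be memory-mapped
    n = processes or os.cpu_count() or 1
    tasks = [(filename, lo, hi) for lo, hi in chunk_ranges(size, n)]
    total = Counter()
    with Pool(n) as pool:
        for partial in pool.imap_unordered(count_chunk, tasks):
            total.update(partial)
    return total
```

To try it, call something like `parallel_count("access.log").most_common(10)` from under an `if __name__ == "__main__":` guard. The guard matters on Windows, where worker processes are started by re-importing the main module. Passing byte offsets to the workers rather than data keeps the inter-process traffic tiny, since each process maps the file itself.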