It is designed to scale and supports MapReduce-style parallel processing. I would strongly recommend taking a look at it before writing your own.
http://lucene.apache.org/nutch/