
Can anyone tell me how, as a sole developer, it's possible to gain real-world experience with distributed Hadoop and Spark given the massive computing resources required? It just seems like a closed shop to me.


Define "massive". You can learn all the most important aspects of the APIs and programming model on a single machine, either by running in "local" (non-distributed) mode or by running a few VMs to simulate a real cluster. And you can spin up a cluster of 100 machines on either AWS or GCE for about a dollar per hour.


GCP offers a $300 credit that expires after 1 year. It's a good way to get your feet wet with Hadoop and Spark via Cloud Dataproc. (Google Cloud employee speaking.)
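For anyone who goes that route, a rough sketch of the Dataproc workflow looks like the following (cluster name, region, and script name are placeholders, and flags may vary slightly between gcloud versions):

    # create a small single-node cluster for experimentation
    gcloud dataproc clusters create my-test-cluster \
        --region=us-central1 --single-node

    # submit a PySpark job to it
    gcloud dataproc jobs submit pyspark my_job.py \
        --cluster=my-test-cluster --region=us-central1

    # delete the cluster when finished so the credit isn't spent on idle VMs
    gcloud dataproc clusters delete my-test-cluster --region=us-central1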



