Can anyone tell me how, as a sole developer, it's possible to gain real-world experience with distributed Hadoop and Spark given the massive computing resources required? It just seems like a closed shop to me.
Define "massive". You can learn all the most important aspects of the APIs and programming model on a single machine, either by running in "local" (non-distributed) mode or by running a few VMs to simulate a real cluster. And you can spin up a cluster of 100 machines on either AWS or GCE for about a dollar per hour.
GCP offers a $300 credit that expires after 1 year. It's a good way to get your feet wet with Hadoop and Spark via Cloud Dataproc. (Google Cloud employee speaking.)