Home built Hadoop analytics cluster: Part 1
[ Background ] [ Project Summary ] [ Building the cluster ]

Background

I am pursuing the UW Master of Science in Data Science degree through the University of Wisconsin System / UW Extended Campus. My home school is UW La Crosse. For the Fall 2020 semester, I am taking DS730 - Big Data: High-Performance Computing. I have been learning how to implement algorithms that distribute the processing of large data sets across computing clusters, how to write parallel algorithms that can process large data sets, and how to use tools such as Hadoop, Pig, Hive, and Python to compare large data-processing tasks on cloud-computing services such as AWS.

One thing that has struck me so far is how everything in this course was set up for me. While I understand the intent of the course is to focus on the programming skills to handle big data within data science, part of me wants to fo...