[Esip-cloud] Jupyter Hub and RStudio - connecting to a Spark cluster on AWS
Huang, Thomas (398G)
Thomas.Huang at jpl.nasa.gov
Mon Jun 26 13:44:01 EDT 2017
We are looking at EMR on AWS to compare with our current Spark cluster on AWS. I should have more to share in a month or two.
For building a Spark cluster on AWS, our NEXUS distribution includes an example of how to deploy one using Docker.
Jet Propulsion Laboratory
4800 Oak Grove Drive, Mail Stop 158-242, Pasadena, CA 91109
Phone: 818.354.2747, Email: thomas.huang at jpl.nasa.gov
DISCLAIMER: All personal and professional opinions presented herein are my own and do not, in any way, represent the opinion or policy of JPL, NASA or Caltech.
On Jun 26, 2017, at 10:20 AM, Smith, David G. via Esip-cloud <esip-cloud at lists.esipfed.org> wrote:
Looking to get thoughts and insights on options for connecting a multi-user Jupyter Hub and RStudio environment to a Spark cluster on AWS. Some of the options we've been looking at are to either build our own Spark cluster on DC/OS and Mesos in EC2 instances, or to just consume Spark as a managed service via EMR on AWS. We've also been looking at using Livy as a middleman, the other option being to run Jupyter Hub right on the master node of the cluster.
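To make the Livy option concrete, here is a minimal sketch of how a notebook-side client might talk to Livy over REST rather than running on the master node. The host name and the helper function names are hypothetical; the /sessions and /sessions/{id}/statements endpoints and the kind, proxyUser, and code fields are from Livy's REST API. The sketch only builds the requests and does not contact a cluster.

```python
import json
from urllib.request import Request

# Hypothetical Livy endpoint; 8998 is Livy's default port.
LIVY_URL = "http://livy.example.internal:8998"


def _post(url, body):
    """Build a JSON POST request (nothing is sent until urlopen is called)."""
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def session_request(base_url, kind="pyspark", proxy_user=None):
    """Request that asks Livy to start an interactive Spark session.

    proxy_user lets a multi-user front end run the session on behalf of
    a given user, assuming impersonation is enabled on the Livy server.
    """
    body = {"kind": kind}
    if proxy_user:
        body["proxyUser"] = proxy_user
    return _post(f"{base_url}/sessions", body)


def statement_request(base_url, session_id, code):
    """Request that submits a code snippet to an existing Livy session."""
    return _post(f"{base_url}/sessions/{session_id}/statements", {"code": code})
```

With a reachable Livy server, each request could be sent with urllib.request.urlopen(req) and the JSON response read back; a session has to reach the idle state before it accepts statements.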
Are there other, better options we should be looking at? Has anyone already done some comparative analysis to look at pros and cons, performance, limitations, et cetera? Also, thoughts/opinions/insights on running one cluster vs. running multiple clusters?
David G. Smith PE PLS
USEPA Office of Environmental Information
202.566.0797 | http://epa.gov/enviro