[Esip-cloud] Jupyter Hub and RStudio - connecting to a Spark cluster on AWS

Smith, David G. Smith.DavidG at epa.gov
Mon Jun 26 13:20:15 EDT 2017

We're looking for thoughts and insights on options for connecting a multi-user Jupyter Hub and RStudio environment to a Spark cluster on AWS. For the cluster itself, the options we've been looking at are to either build our own Spark cluster on DC/OS and Mesos on EC2 instances, or to consume Spark as a managed service via EMR. For connectivity, we've been looking at using Livy as a middleman between the notebooks and the cluster; the other option is to run Jupyter Hub right on the master node of the cluster.
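
To make the Livy option a bit more concrete, here is a minimal sketch of what "Livy as a middleman" could look like from the notebook side. It only builds the two REST requests Livy expects (create an interactive session, then submit a statement to it) and sends nothing on import. The host name, the user name, and the helper function names are hypothetical and for illustration only; 8998 is Livy's default port, and whether proxyUser works depends on impersonation being enabled on the cluster, which is an assumption here.

```python
import json
from urllib import request

# Hypothetical Livy endpoint; 8998 is Livy's default port.
LIVY_URL = "http://livy.example.internal:8998"


def build_session_request(base_url, kind="pyspark", proxy_user=None):
    """Build (but do not send) a POST /sessions request for an interactive session."""
    payload = {"kind": kind}
    if proxy_user:
        # Assumption: impersonation is enabled, so each hub user
        # gets a Spark session running under their own identity.
        payload["proxyUser"] = proxy_user
    return request.Request(
        url=f"{base_url}/sessions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_statement_request(base_url, session_id, code):
    """Build (but do not send) a POST /sessions/{id}/statements request."""
    return request.Request(
        url=f"{base_url}/sessions/{session_id}/statements",
        data=json.dumps({"code": code}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req):
    """Send a prepared request and decode the JSON reply (needs a reachable Livy server)."""
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Print what would be sent; nothing goes over the network here.
    req = build_session_request(LIVY_URL, proxy_user="some_hub_user")
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

In practice the notebooks would probably not hand-roll these calls: sparkmagic (for Jupyter) and sparklyr's Livy connection method (for RStudio) speak this same REST interface. The point of the sketch is only that, with Livy in the middle, the Jupyter Hub and RStudio hosts need nothing more than HTTP access to the cluster, whereas running the hub on the master node ties the notebook environment to the lifetime of that cluster.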

Are there other, better options we should be looking at? Has anyone already done a comparative analysis of the pros and cons, performance, limitations, et cetera? Also, any thoughts, opinions, or insights on running one cluster vs. running multiple clusters?


David G. Smith PE PLS
USEPA Office of Environmental Information
202.566.0797| http://epa.gov/enviro |