[Esip-cloud] ESIP Cloud Computing Cluster Oct 30 - Arraylake: A Cloud-Native Data Lake Platform for Earth System Science

Aimee Barciauskas aimee at developmentseed.org
Mon Oct 16 17:17:17 EDT 2023


Please join us Monday Oct 30 in welcoming Joe Hamman and Ryan Abernathey
to present Arraylake.

*Topic:* Arraylake: A Cloud-Native Data Lake Platform for Earth System
Science

*When:* Monday October 30th, 10:30-11:30 am PT / 1:30-2:30 pm ET /
7:30-8:30 pm CEST

*Who:* Ryan Abernathey and Joe Hamman, Founders of Earthmover.io

*Where:* Find joining information on ESIP Community Calendar
<https://www.esipfed.org/get-involved/community-calendar>

NOTE: THIS DAY AND TIME IS DIFFERENT FROM THE ORIGINAL ESIP COMMUNITY
CALENDAR. The ESIP community calendar should be updated with this new day
and time soon.

*Abstract:*
The vast amount of earth system data available today is an incredible
resource for understanding our planet and confronting the challenge of
climate change. Traditionally, a few large organizations have provided most
of the data, and users have downloaded data to local computers. This way of
working is becoming increasingly infeasible as data volumes grow and as
AI-based methods demand direct access to full-scale data archives. With
essentially infinite compute and storage capacity, cloud computing has the
potential to revolutionize our interaction with weather and climate data,
allowing everyone to bring their own compute workloads to bear against a
single shared copy of the data. Over the past several years, through our
work in the Pangeo project, we have prototyped a cloud-native approach to
weather and climate data, combining scalable computing technologies such
as Xarray and Dask with analysis-ready, cloud-optimized data in formats
like Zarr. While these tools show great potential, they remain difficult to
deploy and use in an operational context for many scientists and
institutions.

Motivated by this challenge, we founded Earthmover, a company aimed at
democratizing access to state-of-the-art cloud-native data analytics, and
built Arraylake, a data platform which enables teams of any size to manage
and analyze weather and climate data in the cloud. Arraylake users can
access high-quality public datasets alongside their own private data, all
via the high-performance Zarr data standard. This talk describes
Arraylake’s architecture, novel version control system for data, and
approach to supporting all common climate data formats (NetCDF, HDF5, GRIB,
TIFF, Zarr) via a single, user-friendly interface. Via a short demo, we
illustrate how Arraylake helps overcome common data management challenges
that have so far limited widespread adoption of cloud computing in
earth system science.

   - 5 minutes - Welcome and Announcements
   - 30 minutes - Presentation
   - 25 minutes - Q + A

Thanks,
Aimee on behalf of the Cloud Computing Cluster organizing team