[Esip-datareadiness] ESIP Data Readiness Cluster February Meeting [02/20 1 pm ET]

Douglas Rao - NOAA Affiliate douglas.rao at noaa.gov
Wed Feb 14 13:37:30 EST 2024


Dear all,

Happy Valentine's Day! This is a reminder for the ESIP Data Readiness
Cluster meeting on 02/20 (Tuesday) at 1pm ET/UTC-5.

In addition to chocolate on this special day, you can add some tasty
croissants to the mix! (Thanks to Allison for making the flyer for our next
meeting!)

During this meeting, we will have an invited presentation from the MLCommons
Croissant working group (Omar Benjelloun) on their recent work on
Croissant, a high-level metadata vocabulary for machine learning datasets.
Please see the information below.

*Abstract*: Croissant is an open community-built standardized metadata
vocabulary for ML datasets, including key attributes and properties of
datasets, as well as information required to load these datasets in ML
tools. Croissant enables data interoperability between ML frameworks and
beyond, which makes ML work easier to reproduce and replicate.
This talk will provide an overview of the Croissant format, and demonstrate
its benefits 1) for dataset consumers, who can search for Croissant
datasets, access their metadata on ML repositories like Kaggle and
Hugging Face, and load them into popular ML frameworks like TensorFlow, JAX,
and PyTorch, and 2) for dataset creators, who can use the Croissant editor
to easily create, modify, and validate datasets in the Croissant format.

*Speaker*: Omar Benjelloun is a software engineer at Google, where he has
developed data-focused products (Google Public Data Explorer, Google
Dataset Search) and Search features (media reviews, public statistics
answers, related entities, …) for over a decade and a half. Prior to
joining Google, Omar received a PhD in Databases from INRIA / University of
Paris Orsay, and spent two years as a postdoc in the Database group at
Stanford University.

Please join us on 02/20 via the ESIP Community Calendar (
https://www.esipfed.org/get-involved/community-calendar).

-Douglas
-- 
Douglas Rao, Ph.D. (he/him/his)
Research Scientist
North Carolina State University <http://www.ncsu.edu>
Cooperative Institute for Satellite Earth System Studies (CISESS)
<http://www.ncics.org>
NOAA National Centers for Environmental Information
<http://www.ncei.noaa.gov>
151 Patton Ave. Asheville, NC 28801
+1.828.271.4903

I work on a flexible work schedule and across a number of time zones.
Apologies for any out of hours email.

"Every individual matters. Every individual has a role to play. Every
individual makes a difference." – *Dr. Jane Goodall*
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DataReadiness_Feb2024.png
Type: image/png
Size: 714031 bytes
Desc: not available
URL: <https://lists.esipfed.org/pipermail/esip-datareadiness/attachments/20240214/96abe09f/attachment-0001.png>

