<div dir="ltr"><div class="gmail_default" style="font-size:small">Dear ESIP Machine Learning cluster members,</div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">I hope you are doing well as we practice social distancing to fight the pandemic. Bill Teng has shared this <a href="https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge">open research dataset challenge on Kaggle</a> (see details below) with Anne and me. This looks like a very interesting challenge, and some of us may want to apply our AI skills to it to "Make Data Matter"!</div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">Thanks, Bill, for sharing! Stay well!</div><div class="gmail_default" style="font-size:small">Kind regards,</div><div class="gmail_default" style="font-size:small"><div><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><table border="0" cellpadding="2" cellspacing="2" style="border-collapse:collapse;border-spacing:0px;max-width:100%;color:rgb(51,51,51);font-size:16px;border:3px solid rgb(170,170,170);font-family:Times;line-height:14.4px"><tbody><tr><td align="center" height="71" width="71" style="padding:0px"><span style="font-size:11px"><span style="font-family:arial,helvetica,sans-serif"><a href="http://ncics.org/" style="background:0px 0px;color:rgb(204,0,0)" target="_blank"><img height="71" width="71" alt="NCICS" src="https://ncics.org/ncics/img/NCICS_logo-131x131.png" style="max-width: 100%; height: auto; border: 0px; vertical-align: middle; width: 69px;"></a></span></span></td><td valign="top" style="padding:0px"><span style="font-size:11px;font-family:arial,helvetica,sans-serif"><span style="font-weight:700">Yuhan (Douglas) 
Rao</span><br><b>Postdoctoral Research Scholar</b><br><a href="http://ncsu.edu/" style="background:0px 0px;color:rgb(204,0,0)" target="_blank">North Carolina State University</a><br><a href="https://ncics.org/" style="background:0px 0px;color:rgb(204,0,0)" target="_blank">North Carolina Institute for Climate Studies (NCICS)</a><br>151 Patton Ave, Asheville, NC 28801<br>e: <a href="mailto:yrao5@ncsu.edu" target="_blank">yrao5@ncsu.edu</a><br>o: +1 828 271 4903</span></td></tr></tbody></table></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">Below are the details of the challenge (<a href="https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge">https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge</a>).</div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small"><h3 style="margin:4px 0px 16px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-weight:500;font-stretch:inherit;font-size:18.18px;line-height:inherit;font-family:Inter,sans-serif;vertical-align:baseline;color:rgb(0,0,0)">Dataset Description</h3><p style="margin:10.5px 0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-stretch:inherit;font-size:14px;line-height:inherit;font-family:Inter,sans-serif;vertical-align:baseline;color:rgba(0,0,0,0.7)">In response to the COVID-19 pandemic, the White House and a coalition of leading research groups have prepared the COVID-19 Open Research Dataset (CORD-19). CORD-19 is a resource of over 44,000 scholarly articles, including over 29,000 with full text, about COVID-19, SARS-CoV-2, and related coronaviruses. 
This freely available dataset is provided to the global research community to apply recent advances in natural language processing and other AI techniques to generate new insights in support of the ongoing fight against this infectious disease. There is a growing urgency for these approaches because of the rapid acceleration in new coronavirus literature, making it difficult for the medical research community to keep up.</p><h3 style="margin:24px 0px 16px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-weight:500;font-stretch:inherit;font-size:18.18px;line-height:inherit;font-family:Inter,sans-serif;vertical-align:baseline;color:rgb(0,0,0)">Call to Action</h3><p style="margin:10.5px 0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-stretch:inherit;font-size:14px;line-height:inherit;font-family:Inter,sans-serif;vertical-align:baseline;color:rgba(0,0,0,0.7)">We are issuing a call to action to the world's artificial intelligence experts to develop text and data mining tools that can help the medical community develop answers to high priority scientific questions. The CORD-19 dataset represents the most extensive machine-readable coronavirus literature collection available for data mining to date. This allows the worldwide AI research community the opportunity to apply text and data mining approaches to find answers to questions within, and connect insights across, this content in support of the ongoing COVID-19 response efforts worldwide. 
There is a growing urgency for these approaches because of the rapid increase in coronavirus literature, making it difficult for the medical community to keep up.</p><p style="margin:10.5px 0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-stretch:inherit;font-size:14px;line-height:inherit;font-family:Inter,sans-serif;vertical-align:baseline;color:rgba(0,0,0,0.7)">A list of our initial key questions can be found under the <span style="margin:0px;padding:0px;border:0px;font-style:inherit;font-variant:inherit;font-stretch:inherit;font-size:inherit;line-height:inherit;font-family:inherit;vertical-align:baseline"><a href="https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge/tasks" target="_blank" style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:rgb(0,138,188);text-decoration-line:none">Tasks</a></span> section of this dataset. These key scientific questions are drawn from the NASEM’s SCIED (National Academies of Sciences, Engineering, and Medicine’s Standing Committee on Emerging Infectious Diseases and 21st Century Health Threats) <a href="https://www.nationalacademies.org/event/03-11-2020/standing-committee-on-emerging-infectious-diseases-and-21st-century-health-threats-virtual-meeting-1" target="_blank" style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:rgb(0,138,188);text-decoration-line:none">research topics</a> and the World Health Organization’s <a href="https://www.who.int/blueprint/priority-diseases/key-action/Global_Research_Forum_FINAL_VERSION_for_web_14_feb_2020.pdf?ua=1" target="_blank" style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:rgb(0,138,188);text-decoration-line:none">R&D Blueprint</a> for COVID-19.</p><p style="margin:10.5px 
0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-stretch:inherit;font-size:14px;line-height:inherit;font-family:Inter,sans-serif;vertical-align:baseline;color:rgba(0,0,0,0.7)">Many of these questions are suitable for text mining, and we encourage researchers to develop text mining tools to provide insights on these questions.</p><h3 style="margin:24px 0px 16px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-weight:500;font-stretch:inherit;font-size:18.18px;line-height:inherit;font-family:Inter,sans-serif;vertical-align:baseline;color:rgb(0,0,0)">Prizes</h3><p style="margin:10.5px 0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-stretch:inherit;font-size:14px;line-height:inherit;font-family:Inter,sans-serif;vertical-align:baseline;color:rgba(0,0,0,0.7)">Kaggle is sponsoring a <em style="margin:0px;padding:0px;border:0px;font-variant:inherit;font-weight:inherit;font-stretch:inherit;font-size:inherit;line-height:inherit;font-family:inherit;vertical-align:baseline">$1,000 per task</em> award to the winner whose submission is identified as best meeting the evaluation criteria. The winner may elect to receive this award as a charitable donation to COVID-19 relief/research efforts or as a monetary payment. 
More details on the prizes and timeline can be found on the <a href="https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge/discussion/135826" target="_blank" style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:rgb(0,138,188);text-decoration-line:none">discussion post</a>.</p><h3 style="margin:24px 0px 16px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-weight:500;font-stretch:inherit;font-size:18.18px;line-height:inherit;font-family:Inter,sans-serif;vertical-align:baseline;color:rgb(0,0,0)">Accessing the Dataset</h3><p style="margin:10.5px 0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-stretch:inherit;font-size:14px;line-height:inherit;font-family:Inter,sans-serif;vertical-align:baseline;color:rgba(0,0,0,0.7)">We have made this dataset available on Kaggle, and are periodically updating it from its source. To learn more and access the latest copy of the dataset, you can also go here: <a href="https://pages.semanticscholar.org/coronavirus-research" target="_blank" style="margin:0px;padding:0px;border:0px;font:inherit;vertical-align:baseline;color:rgb(0,138,188);text-decoration-line:none">https://pages.semanticscholar.org/coronavirus-research</a>.</p><p style="margin:10.5px 0px;padding:0px;border:0px;font-variant-numeric:inherit;font-variant-east-asian:inherit;font-stretch:inherit;font-size:14px;line-height:inherit;font-family:Inter,sans-serif;vertical-align:baseline;color:rgba(0,0,0,0.7)">The licenses for each dataset can be found in the all_sources_metadata CSV file.</p></div></div>