Hi, this webinar<http://www.nsf.gov/events/event_summ.jsp?cntn_id=137718&WT.mc_id=USNSF_13&WT.mc_ev=click> on 2016-03-03 at 12:30 EST will probably be of interest to the ESIP community. From the bio, the speaker "is the creator of widely used web standards such as RSS, RDF and Schema.org. He is also responsible for products such as Google Custom Search."

Empirical Modeling of Complex Systems

Data Science Webinar Series - Ramanathan Guha - March 3, 2016
Engineering is about building models of phenomena. Traditionally, these models are built using 'foundational' equations, such as those of motion, continuum mechanics, and electromagnetism, that capture the core causal relationships of the domain. Unfortunately, we do not have such equations for the heterogeneous complex systems that we find in the biological, environmental, and behavioral sciences. Recently, exploiting large amounts of data and compute resources, we have started using machine learning to build empirical models of such systems. This technique is behind the success of many widely used products such as Google search and advertising. However, a number of obstacles need to be overcome before empirical modeling becomes more widespread. In this talk, we discuss two of these problems along with their possible solutions.

Data drives empirical modeling, and to get an adequate data set we often need to merge data from different sources. Aligning schemas and resolving references to entities that appear in different data sets under ambiguous names is expensive and error-prone. In this talk, we look at how human communication deals with similar issues to show how these techniques may be adapted to allow very large-scale data sharing.

High infrastructure setup costs have severely restricted the number of researchers experimenting with large datasets. We present the concept of Data Commons, a cloud offering that aggregates multiple datasets and makes them available to users of the cloud. In this model, data is part of the cloud infrastructure, like storage or networking. We discuss the potential impact of Data Commons and report on first steps.

Guha is the creator of widely used web standards such as RSS, RDF and Schema.org. He is also responsible for products such as Google Custom Search. He was a co-founder of Epinions.com and Alpiri. Until recently, he was a Google Fellow and a vice president in research at Google. He has a Ph.D. in computer science from Stanford University and a B.Tech. in mechanical engineering from IIT Chennai.
