Please join NITRD FASTER and Big Data for an informative presentation and discussion with Dr. Kenton McHenry, Senior Research Scientist at the National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign. Dr. McHenry will discuss Brown Dog: A search engine for the other 99 percent (of data). Brown Dog seeks to develop a service that will make uncurated data accessible to scientists.
Please join us on Tuesday, February 17, 2015, from 1:00 to 2:30 pm (EST) in NSF Room I-1235.
Brown Dog: A search engine for the other 99 percent
Date: Tuesday, February 17, 2015
Time: 1:00 to 2:30 pm (EST)
Location: National Science Foundation
4201 Wilson Boulevard, Arlington, Virginia 22230
(Directions to NSF are at: http://www.nsf.gov/about/visit/.)
If you plan to attend in person, please RSVP to firstname.lastname@example.org to facilitate NSF visitors’ badging process. An NSF visitor’s badge is required for this presentation. The entrance to NSF is at the corner of Ninth Street and North Stuart Street (one block on your right as you exit the Ballston metro escalator).
Remote participation is available. Instructions to join our meeting remotely using WebEx are at the end of this message.
Abstract: The persistent growth of “Big Data” presents challenges that include not only the large quantities of digital data produced, but also the large variety of the data. This variety presents a significant challenge to ensuring future scientific reproducibility, especially in what is often referred to as “long-tail” data. Inadequate curation of data, for example, can result in undocumented representations of data or many different representations for similar data types. For future users, without the understanding provided by the original data creator or software, this data becomes difficult or impossible to explore. Another recurring problem in the use of big data is the difficulty of searching, finding, filtering, and comparing data. Without a consistent and uniform index over all the data, or at least over the associated metadata, such actions become prohibitively difficult.
Brown Dog addresses these and similar considerations through the development of a set of services for data format conversion, metadata extraction from existing data, and the generation of indexes over uncurated collections of data. Brown Dog provides the framework for an extensible suite of new and existing tools working together to accomplish these tasks. Domain researchers using the scalable and usable Brown Dog system will be able to apply the whole suite of tools to data collections in order to find and “unlock” previously inaccessible data. During the talk we will go over the Brown Dog services and their constituent components, and discuss the project’s three scientific use cases and a general public use case that will also contribute to the sustainability of Brown Dog.
About The Speaker
Kenton McHenry, Senior Research Scientist
National Center for Supercomputing Applications
University of Illinois at Urbana-Champaign