The Challenges in eScience Workshop is an initiative of the National Laboratory of Scientific Computing with support from the Brazilian Society for the Progress of Science. The event will gather scientists and practitioners to discuss the latest achievements in supporting eScience applications. The availability of ever more powerful instruments and the deployment of high-speed networks are empowering researchers with unprecedented volumes of data that must be cleaned, analyzed, and interpreted. Achieving this is, nevertheless, far from trivial. A complex computing arsenal composed of grids and clouds equipped with multi-core machines is driving the development of software to support scientists in the difficult task of understanding the data deluge. In this context, the Challenges in eScience Workshop has invited world leaders in eScience to discuss solutions and point to directions the community should follow to fulfill the potential of in-silico research.

The one-day workshop will include talks and poster sessions, with a final closing panel. The event will take place at the Vale Real Hotel, Itaipava, Petrópolis, RJ, on the 29th of October, in the context of the 22nd International Symposium on Computer Architecture and High Performance Computing (http://sbac-pad-2010.lncc.br/index.php). Participation is free of charge, but registration is required: send a message using the space below with your name, email address, institution, and one paragraph stating your motivation to attend the workshop.

Invited Speakers

Speaker
Eric Neumann
Title:"Semantic Linked Data in Biomedicine: Shifting the Focus fromApplication-Centricity to Data-Centricity"


Abstract:

Biomedical data generation is growing continuously in both size and complexity. Clinical study data is complicated by the fact that new forms of associated data are continuously created as technologies emerge, including biomarkers, pathway (mechanistic) knowledge, assay platforms, and model systems. W3C semantic standards such as RDF and OWL have been around for several years, but most informatics specialists are unsure where they can be applied effectively. Semantically Linked Data (SLD) can significantly change the organization and re-use of data without requiring a concomitant investment in data systems. SLD is especially useful in dealing with changing data descriptions and the relationships they may have to other data elements, even if these exist externally in other data systems. Applications in the area of clinical data management and analysis will also be presented in the context of SLD.
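To make the idea concrete, the following minimal sketch (illustrative only, not material from the talk) uses the Python rdflib library to express clinical study facts as RDF triples; every URI, class, and property below belongs to an invented example vocabulary:

    # A minimal, hypothetical sketch of Semantically Linked Data with rdflib.
    # All URIs and property names are invented for illustration.
    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF

    EX = Namespace("http://example.org/clinical/")  # hypothetical vocabulary

    g = Graph()
    g.bind("ex", EX)

    # A study subject and one biomarker observation, as plain triples.
    g.add((EX.subject42, RDF.type, EX.StudySubject))
    g.add((EX.subject42, EX.enrolledIn, EX.study001))
    g.add((EX.subject42, EX.hasObservation, EX.obs1))
    g.add((EX.obs1, EX.biomarker, EX.CRP))
    g.add((EX.obs1, EX.valueMgPerL, Literal(12.3)))

    # Linking the biomarker to an external pathway resource is just one more
    # triple, even though that resource lives in another data system.
    g.add((EX.CRP, EX.involvedInPathway,
           URIRef("http://example.org/pathways/inflammation")))

    print(g.serialize(format="turtle"))

Because each fact is an independent triple, a new kind of observation, or a link to data held elsewhere, is simply another triple; no schema migration of the underlying data system is required, which is one way to read the data-centric shift in the title.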


Bio:
Eric K. Neumann is Executive Director of the Clinical Semantics Group, a consulting firm that advises on and develops intelligent clinical systems for the pharmaceutical industry based on applications of semantic standards. He is also the founder and chair of the W3C Semantic Web Health Care and Life Sciences Interest Group (HCLSIG), which brings together industry leaders to advance semantic web applications. Dr. Neumann was Global Head of Knowledge Management for Scientific and Medical Affairs within Sanofi-Aventis, covering all functions of global R&D.
In 2005, Dr. Neumann developed BioDash, the first drug discovery Semantic Web dashboard, and he is currently working on technologies for the rapid mapping of legacy databases onto enterprise semantic infrastructures. Dr. Neumann is an expert in knowledge-based methods for the pharmaceutical industry, with experience going back two decades to his time at Bolt Beranek and Newman. Dr. Neumann holds an S.B. degree from the Massachusetts Institute of Technology and a PhD in neurobiology and developmental genetics from Case Western Reserve University.

Speaker
Philippe Cudre-Mauroux
Title:"SciDB: a Science-Oriented Database Management System"


Abstract:

In this talk I will introduce SciDB, a new open-source and massively parallel platform for array data storage, processing, and analysis. I will review a number of scientific use cases and describe how they have determined the features and functionality of the system. I will introduce SciDB's data model and describe some of the key architectural features of the system, including columnar storage, parallel user-defined functions, and overlapping array partitioning. Finally, I will introduce SSDB, a new benchmark for scientific data management systems, and explain why SciDB is up to two orders of magnitude faster than traditional database systems on common large-scale array processing tasks.
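Of the architectural features mentioned above, overlapping array partitioning is perhaps the least familiar. The toy sketch below (plain Python/NumPy, not SciDB code) shows the underlying idea: each chunk is stored with a small halo of cells borrowed from its neighbors, so windowed operations near chunk boundaries need not fetch data from other nodes:

    # Toy illustration of overlapping array partitioning (not SciDB code):
    # each chunk carries `overlap` extra halo cells from its neighbors.
    import numpy as np

    def partition_1d(a, chunk, overlap):
        """Split a 1-D array into chunks of size `chunk`, padded with
        `overlap` halo cells on each side (clipped at array borders)."""
        parts = []
        for start in range(0, len(a), chunk):
            lo = max(0, start - overlap)
            hi = min(len(a), start + chunk + overlap)
            parts.append(a[lo:hi])
        return parts

    a = np.arange(10)
    for part in partition_1d(a, chunk=4, overlap=1):
        print(part)
    # [0 1 2 3 4]   [3 4 5 6 7 8]   [7 8 9]

With overlap 1, a size-3 moving window produces correct results for every cell a chunk owns, because the halo supplies the missing neighbors at chunk edges.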


Bio:
Philippe Cudre-Mauroux is a postdoctoral associate working in the Database Systems group at MIT. He received his Ph.D. from the Swiss Federal Institute of Technology EPFL, where he won the Doctorate Award and the EPFL Press Mention in 2007. Before joining MIT, he worked on distributed information management systems for HP, IBM T.J. Watson Research, and Microsoft Research Asia. His research interests are in large-scale distributed infrastructures for non-relational data such as spatiotemporal, scientific, or Semantic Web data. Webpage: http://people.csail.mit.edu/pcm/

Speaker
Pedro Leite da Silva Dias
Title:"Extracting Useful Information from a Grand Global Ensemble of Weather Forecasts: the THORPEX/TIGGE program"


Abstract:

Weather forecasters face uncertainty that is inherent to the nonlinear nature of the equations governing the atmospheric state. Significant progress has been achieved in the past two decades in estimating uncertainty and predicting “predictability”. More recently, ten operational weather forecasting centers that produce daily global ensemble forecasts 1-2 weeks ahead have agreed to deliver, in near real time, a selection of forecast data to the TIGGE (THORPEX Interactive Grand Global Ensemble) data archives at CMA, ECMWF, and NCAR. This is offered to the scientific community as a new resource for research and education. The objective of TIGGE is to establish closer cooperation between the academic and operational worlds by encouraging wider use of operational products for research, and to actively explore the concept and benefits of multi-model probabilistic weather forecasts, with a particular focus on severe weather prediction. The data policy and current status of the archives, the exchange procedures, and the complexity of the network will be presented. Examples of the use of multi-model superensembles in South America, relying on extensive use of the internet, will also be shown.
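As a concrete illustration of what a multi-model probabilistic forecast is, the short sketch below (synthetic data, not TIGGE access code) derives the probability that 24-hour precipitation exceeds a threshold from a stack of ensemble members, one of the basic operations behind such ensemble products:

    # Minimal sketch with synthetic data (not TIGGE access code): turn a
    # multi-model "grand ensemble" into a probabilistic forecast, i.e. the
    # chance of 24 h precipitation exceeding a threshold at each grid point.
    import numpy as np

    rng = np.random.default_rng(0)
    # Pretend stack of ensemble members from several centers:
    # (members, lat, lon) -- here 50 members on a tiny 3x4 grid.
    members = rng.gamma(shape=2.0, scale=5.0, size=(50, 3, 4))

    threshold_mm = 20.0
    # Probability = fraction of members exceeding the threshold.
    prob = (members > threshold_mm).mean(axis=0)
    print(np.round(prob, 2))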


Short CV
Bachelor's degree in Applied Mathematics (University of São Paulo/USP, 1974); MSc and PhD in Atmospheric Sciences from Colorado State University in 1977 and 1979, respectively. Professor at the Institute of Astronomy, Geophysics and Atmospheric Sciences (IAG/USP) since 1975. His current position is Director of the National Laboratory for Scientific Computing of the Ministry of Science and Technology. Member of the Brazilian Academy of Sciences. Visiting researcher at NCAR and NCEP in the USA on several occasions. Senior researcher at the National Institute for Space Research (INPE) and head of its Center for Weather Forecasting and Climate Research (CPTEC) between 1988 and 1990. President of the Brazilian Meteorological Society between 1992 and 1994 and Science Director of the Society from 2006 to 2008. Coordinated the Environmental Area of the Institute of Advanced Studies of USP from 1996 to 2007, and coordinates the Regional Weather and Climate Studies Laboratory (MASTER) at USP. He has published about 120 papers and book chapters, mostly in international journals, and about 240 full papers in national and international scientific events. 35 MSc and 24 PhD students have completed their degrees under his supervision.

Speaker
Geoffrey Fox
Title:"eScience: Grids, Clouds and Parallel Computing"


Abstract:

We analyze the different tradeoffs and goals of Grid, Cloud, and parallel (cluster/supercomputer) computing. These approaches trade off performance, fault tolerance, ease of use (elasticity), cost, and interoperability. Different application classes (characteristics) fit different architectures, and we describe a hybrid model with Grids for data, traditional supercomputers for large-scale simulations, and clouds for broad-based "capacity computing", including many data-intensive problems. We discuss the impressive features of cloud computing platforms and compare MapReduce and MPI. We take most of our examples from the life sciences.
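For readers unfamiliar with the comparison, the toy sketch below (illustrative only, not from the talk) shows the MapReduce programming style on the canonical word-count example; an MPI version of the same computation would instead have each process hold a partial count and combine them explicitly with a collective such as MPI_Allreduce:

    # Toy MapReduce-style word count in plain Python (illustration only).
    from collections import defaultdict
    from functools import reduce

    documents = ["grids and clouds", "clouds and cores", "grids of cores"]

    # Map phase: emit (word, 1) pairs independently per document.
    mapped = [(word, 1) for doc in documents for word in doc.split()]

    # Shuffle phase: group values by key.
    groups = defaultdict(list)
    for word, count in mapped:
        groups[word].append(count)

    # Reduce phase: combine each key's values.
    counts = {word: reduce(lambda a, b: a + b, vals)
              for word, vals in groups.items()}
    print(counts)  # {'grids': 2, 'and': 2, 'clouds': 2, 'cores': 2, 'of': 1}

The cloud-friendly property is that the map and reduce steps are stateless and independent, so a runtime can reschedule them freely across unreliable machines, whereas MPI assumes a fixed set of communicating processes.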

Short CV
Fox has worked in a variety of applied computer science fields, with his work on experimental and computational physics evolving into broad contributions to parallel computing, initially involving the hypercube architecture. He has worked on the computing issues of several application areas, currently focusing on earthquake science, polar science, chemical informatics, and DoD net-centric environments. Fox is working with Minority Serving Institutions on the systemic deployment of cyberinfrastructure. Fox has developed parallel compilers and several Web/Grid computing infrastructures, including the NSF Middleware Initiative OGCE project developing Grid portals. Fox is the Vice President for e-Science of the Open Grid Forum (OGF). His group developed NaradaBrokering, an open-source messaging system for Grids, peer-to-peer networks, and A/V conferencing/collaboration. Recent interests include clouds, Web 2.0, and multicore technology, with data mining applications to GIS, bioinformatics, and cheminformatics.

Committee

  • Fabio Porto
  • Bruno Schulze
  • Simone Santana

