Brazilian Symposium on Databases (SBBD) 2020

Event Info

The annual Brazilian Symposium on Databases (SBBD) is the official database event of the Brazilian Computer Society (SBC). It is a leading Latin American forum where database researchers, practitioners, developers, and users explore cutting-edge ideas and results and exchange techniques, tools, and experiences. Along with technical sessions, the symposium includes invited talks, tutorials, and mini-courses given by distinguished speakers from the national and international database research community. Please see the history of SBBD events.

Due to the COVID-19 pandemic, this year's edition will be a different experience, with all activities taking place online only.

For the 2020 edition, SBBD accepts Full, Short, Vision, Industrial, Demos, and Distinguished paper submissions. The 35th SBBD will be held ONLINE from September 28th to October 1st, 2020, organized by the Informatics Department of the Pontifical Catholic University of Rio de Janeiro (PUC-Rio, Brazil).


Keynotes


Getting Rid of Data

Author: Tova Milo (Tel Aviv Univ., Israel)


Abstract

We are experiencing an amazing data-centered revolution. Incredible amounts of data are collected, integrated and analyzed, leading to key breakthroughs in science and society. This well of knowledge, however, is at great risk if we do not dispense with some of the data flood. First, the amount of generated data grows exponentially and by 2025 is expected to be more than five times the available storage. Second, even disregarding storage constraints, uncontrolled data retention risks privacy and security, as recognized, e.g., by the recent EU Data Protection reform. Data disposal policies must be developed to benefit and protect organizations and individuals. Retaining the knowledge hidden in the data while respecting storage, processing and regulatory constraints is a great challenge. The difficulty stems from the distinct, intricate requirements entailed by each type of constraint, the scale and velocity of data, and the constantly evolving needs. While multiple data sketching, summarization and deletion techniques were developed to address specific aspects of the problem, we are still very far from a comprehensive solution. Every organization has to battle the same tough challenges with ad hoc solutions that are application-specific and rarely sharable. In this talk I will discuss the logical, algorithmic, and methodological foundations required for the systematic disposal of large-scale data, for constraint enforcement, and for the development of applications over the retained information. I will give an overview of relevant related work, highlighting new research challenges and potential reuse of existing techniques, as well as the research performed in this direction in the Tel Aviv Databases group.


Bio

Tova Milo received her Ph.D. degree in Computer Science from the Hebrew University, Jerusalem, in 1992. After graduating she worked at the INRIA research institute in Paris and at the University of Toronto, and returned to Israel in 1995, joining the School of Computer Science at Tel Aviv University, where she is now a full Professor and holds the Chair of Information Management. She served as the Head of the Computer Science Department from 2011 to 2014. Her research focuses on large-scale data management applications such as data integration, semi-structured information, Data-centered Business Processes and Crowd-sourcing, studying both theoretical and practical aspects. Tova served as the Program Chair of multiple international conferences, including PODS, VLDB, ICDT, XSym, and WebDB, and as the chair of the PODS Executive Committee. She served as a member of the VLDB Endowment and the PODS and ICDT executive boards and as an editor of TODS, IEEE Data Eng. Bull., and the Logical Methods in Computer Science Journal. Tova has received grants from the Israel Science Foundation, the US-Israel Binational Science Foundation, the Israeli and French Ministries of Science, and the European Union. She is an ACM Fellow, a member of Academia Europaea, a recipient of the 2010 ACM PODS Alberto O. Mendelzon Test-of-Time Award, the 2017 VLDB Women in Database Research award, the 2017 Weizmann award for Exact Sciences Research, and of the prestigious EU ERC Advanced Investigators grant.


Towards Learned Algorithms, Data Structures, and Systems

Author: Tim Kraska (MIT, U.S.A.)


Abstract

All systems and applications are composed of basic data structures and algorithms, such as index structures, priority queues, and sorting algorithms. Most of these primitives have been around since the early beginnings of computer science (CS) and form the basis of every CS intro lecture. Yet, we might soon face an inflection point: recent results show that machine learning has the potential to alter the way those primitives or systems at large are implemented in order to provide optimal performance for specific applications. In this talk, I will provide an overview of how machine learning is changing the way we build systems and outline different ways to build learned algorithms and data structures to achieve “instance-optimality” with a particular focus on data management systems.


Bio

Tim Kraska is an Associate Professor of Electrical Engineering and Computer Science in MIT’s Computer Science and Artificial Intelligence Laboratory and co-director of the Data Systems and AI Lab at MIT (DSAIL@CSAIL). Currently, his research focuses on building systems for machine learning, and using machine learning for systems. Before joining MIT, Tim was an Assistant Professor at Brown, spent time at Google Brain, and was a postdoc in the AMPLab at UC Berkeley after he got his PhD from ETH Zurich. Tim is a 2017 Alfred P. Sloan Research Fellow in computer science and has received several awards, including the VLDB Early Career Research Contribution Award, the VMware Systems Research Award, the university-wide Early Career Research Achievement Award at Brown University, an NSF CAREER Award, as well as several best paper and demo awards at VLDB and ICDE.


Tutorials


Principles of Distributed Database Systems: spotlight on NewSQL

Author: Patrick Valduriez (Inria, University of Montpellier, CNRS, LIRMM, France)


Abstract

The first edition of the book Principles of Distributed Database Systems, co-authored with Prof. Tamer Özsu (University of Waterloo), appeared in 1991, when the technology was new and there were few products. In the Preface to the first edition, we quoted Michael Stonebraker, who claimed in 1988 that in the following 10 years, centralized DBMSs would be an “antique curiosity” and most organizations would move towards distributed DBMSs. That prediction has certainly proved to be correct, and most systems in use today are either distributed or parallel.

The fourth edition of this classic textbook [Özsu & Valduriez 2020] provides major updates, in particular, new chapters on big data platforms, NoSQL, NewSQL and polystores. In this tutorial, we introduce these major updates, with a focus on NewSQL.

NewSQL is the latest technology in the big data management landscape, enjoying fast growth in the DBMS and BI markets. NewSQL combines the scalability and availability of NoSQL with the consistency and usability of SQL. By providing online analytics over operational data, NewSQL opens up new opportunities in many application domains where real-time decision making is critical. Important use cases are eAdvertisement (such as Google AdWords), IoT, performance monitoring, proximity marketing, risk monitoring, real-time pricing, real-time fraud detection, etc. NewSQL may also simplify data management by removing the traditional separation between NoSQL and SQL (ingest data fast, query it with SQL), as well as between operational database and data warehouse / data lake (no more ETLs!). However, a hard problem is scaling out transactions in mixed operational and analytical (HTAP) workloads over big data, possibly coming from different data stores (HDFS, SQL, NoSQL). Today, only a few NewSQL systems have solved this problem.

A first in-depth presentation of NewSQL was given in a tutorial at IEEE Big Data 2019 with Prof. Ricardo Jimenez-Peris (CEO and founder at LeanXcale) [Valduriez & Jimenez-Peris 2019]. In this tutorial, we provide a taxonomy of NewSQL systems based on major dimensions including targeted workloads, capabilities and implementation techniques. We illustrate with popular NewSQL systems such as Google Spanner, LeanXcale, CockroachDB, SAP HANA, MemSQL and Splice Machine. In particular, we give a spotlight on some of the more advanced systems. We also compare with major NoSQL and SQL systems, and discuss integration within big data ecosystems and corporate information systems, using polystores. Finally, we discuss the current trends and research directions.


Bio

Patrick Valduriez is a senior scientist at Inria, France, and the scientific advisor of the LeanXcale company. He has also been a professor of computer science at University Pierre et Marie Curie (UPMC), now Sorbonne University, in Paris (2000-2002) and a researcher at Microelectronics and Computer Technology Corp. in Austin, Texas (1985-1989). He received his Ph.D. degree and Doctorat d'Etat in CS from UPMC in 1981 and 1985, respectively. From 1995 to 2000, he was the manager of the Bull-Inria joint venture (called Dyade), which fostered technology transfer in IT and security. Dyade spun off five successful start-ups, including Kelkoo, based on the Disco software that he built at Inria with his team. He has also consulted for major companies in the USA, Europe, Brazil and France.

He is currently the head of the Zenith team (between Inria and University of Montpellier, LIRMM) that focuses on data science, in particular data management in large-scale distributed and parallel systems and scientific data management. He has authored and co-authored many technical papers and several textbooks, among which “Principles of Distributed Database Systems” (with Professor Tamer Özsu, University of Waterloo). He currently serves as associate editor of several journals, including the VLDB Journal, Distributed and Parallel Databases, and Internet and Databases. He has served as PC chair of major conferences such as SIGMOD and VLDB. He was the general chair of SIGMOD04, EDBT08 and VLDB09.

He has received prestigious awards and prizes, including several best paper awards, among them VLDB00. He was the recipient of the 1993 IBM scientific prize in Computer Science in France and the 2014 Innovation Award from Inria – French Academy of Science – Dassault Systems. He is an ACM Fellow.