Data Science Platform applied to Health

2018 - 2020

Data Science is a field of study that supports the discovery of useful information in large or complex databases, as well as data-driven decision making. It can be defined as a set of strategies, tools and techniques for collecting, transforming and analyzing data (data-driven analysis), carried out by multidisciplinary teams that bring together researchers with substantive knowledge of the problem under analysis (in our case, public health), statisticians, mathematicians and computer scientists.

It combines traditional analysis methods with sophisticated algorithms to process large volumes of data in various formats: structured, semi-structured and unstructured. The analysis process in Data Science involves four phases: (i) collection and ingestion: extraction, transformation and loading (better known as ETL); (ii) pre-processing: selection of records, dimensionality reduction, normalization and creation of data subsets; (iii) exploratory analysis and data mining: mainly analyses aimed at classification, association, clustering, anomaly detection and prediction; (iv) post-processing: pattern interpretation, filtering, visualization, and integration into decision support systems and online visualization platforms.
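The four phases above can be sketched in a few lines of Python. This is only an illustrative sketch, not the platform's actual implementation: the weekly case counts, the function names and the z-score threshold are all hypothetical, and a simple z-score rule stands in for the anomaly detection step of phase (iii). Pre-processing here covers only record selection and type conversion.

```python
import csv
import io
import statistics

# Hypothetical raw data: weekly case counts for one health unit.
# The values are made up for illustration only.
RAW_CSV = """unit,week,cases
A,1,10
A,2,12
A,3,11
A,4,
A,5,48
A,6,9
"""


def collect(raw_text):
    """(i) Collection and ingestion: parse raw text into records."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def preprocess(records):
    """(ii) Pre-processing: drop incomplete records and convert types."""
    clean = []
    for r in records:
        if r["cases"].strip():
            clean.append(
                {"unit": r["unit"], "week": int(r["week"]), "cases": int(r["cases"])}
            )
    return clean


def detect_anomalies(records, threshold=1.5):
    """(iii) Data mining: flag records whose z-score exceeds the threshold.

    The threshold is an assumed value chosen for this toy data set.
    """
    values = [r["cases"] for r in records]
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []
    return [r for r in records if abs(r["cases"] - mean) / stdev > threshold]


def summarize(anomalies):
    """(iv) Post-processing: turn flagged records into readable messages."""
    return [
        f"unit {r['unit']}, week {r['week']}: {r['cases']} cases" for r in anomalies
    ]


if __name__ == "__main__":
    records = preprocess(collect(RAW_CSV))
    for line in summarize(detect_anomalies(records)):
        print(line)
```

In a real setting each function would be replaced by a much heavier component (database extraction, record linkage, statistical or machine learning models, dashboards), but the flow of data from one phase to the next would follow the same order.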

