Opinion article in VIA EMPRESA on data availability

An opinion article on the availability of data about the evolution of the COVID-19 pandemic. After several months analyzing data on the evolution of COVID-19, we have found that some things have improved: more data is available, and it is easier to access.

Having citizens with an ever stronger culture of data use lets us share more of the responsibility in individual decision-making and support the decisions of official bodies with better judgment.

You can find the full article at https://www.viaempresa.cat/opinio/conejero-upv-dades-dades-dades_2142814_102.html

UPV researchers work on the fight against COVID-19 through Data Science

We, Alberto Conejero (Instituto Universitario de Matemática Pura y Aplicada, IUMPA) and Miguel Rebollo (Valencian Research Institute for Artificial Intelligence, VRAIN), both at UPV, are part of the Data Science Working Group in the Fight against COVID-19, of the Commissioner for the Presidency of the Generalitat Valenciana on Strategy for Artificial Intelligence and Data Sciences against COVID-19. The news was covered by UPV, the El Periodic newspaper, and the RUVID website.

Data Science Group COVID-19 (Comunitat Valenciana)

Since April 2020, I have been involved in the Data Science Group COVID-19 of the Valencia Region. This is a multidisciplinary team of volunteers who work side by side with the General Director of Analysis and Public Policies of the Presidency of the Valencia Region Government. The analysis for COVID-19 is coordinated with the Ministry of Health and the rest of the Councils involved. The working group is led by Nuria Oliver, Commissioner of the Presidency of the Generalitat for the Valencian Strategy for Artificial Intelligence and, especially, for the coordination of data intelligence in response to the COVID-19 epidemic in the Valencia Region.

The group includes experts from the Jaume I University, the University of Valencia, the Polytechnic University of Valencia, the Miguel Hernández University, the University of Alacant, the CEU Cardenal Herrera University, Fisabio, and Microsoft, with the collaboration of Esri, the INE, the Secretary of State for Artificial Intelligence, and the three most important mobile phone companies in the country.

The group is divided into three priority areas, each with its own work coordinators: (1) analysis, visualization, and modeling of mobility data, (2) epidemiological models, and (3) data science applied to COVID-19. I work in the epidemiological models area with Antonio Falcó, Miguel Rebollo, Miguel A. Lozano, Emilio Sansano, Xavier Barber, and Francisco Escolano.

The web page of our data science group is http://infocoronavirus.gva.es/es/grup-de-ciencies-de-dades-del-covid-19-de-la-comunitat-valenciana

Paper published in JAMIA: Potential limitations in COVID-19 machine learning due to data source variability: a case study in the nCov2019 dataset

The lack of representative COVID-19 data is a bottleneck for reliable and generalizable machine learning. Data sharing is insufficient without data quality, in which source variability plays an important role. We showcase and discuss potential biases from data source variability for COVID-19 machine learning. In this work, we used the publicly available nCov2019 dataset, which includes patient-level data from several countries. We aimed to discover and classify severity subgroups using symptoms and comorbidities.

In our work published in JAMIA, we have shown that cases from the two countries with the highest prevalence were divided into separate subgroups with distinct severity manifestations. This variability can reduce the representativeness of training data with respect to the model's target populations and increase model complexity, at the risk of overfitting. We conclude that data source variability is a potential contributor to bias in distributed research networks. We call for systematic assessment and reporting of data source variability and data quality in COVID-19 data sharing, as key information for reliable and generalizable machine learning.
Our analysis tool, developed within BDSLab at UPV, can be found at http://covid19sdetool.upv.es/?tab=ncov2019
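
To give a rough feel for what "data source variability" means in practice, here is a minimal Python sketch. It compares how a categorical variable is distributed across data sources using the Jensen-Shannon distance, one common way to measure how far apart two distributions are. This is only an illustration under stated assumptions and not necessarily the method used in the paper or in the tool: the function names, the source names, and the toy severity labels are all hypothetical, and none of the numbers come from the nCov2019 dataset.

```python
import math
from collections import Counter


def distribution(records, categories):
    """Relative frequency of each category among one source's records."""
    counts = Counter(records)
    total = sum(counts[c] for c in categories)
    return [counts[c] / total for c in categories]


def js_distance(p, q):
    """Jensen-Shannon distance with base-2 logs: 0 = identical, 1 = disjoint."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with zero probability contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = (kl(p, m) + kl(q, m)) / 2
    return math.sqrt(max(0.0, divergence))  # guard against tiny negative rounding


# Hypothetical toy data: severity labels reported by three made-up sources.
categories = ["mild", "severe"]
sources = {
    "source_a": ["mild"] * 70 + ["severe"] * 30,
    "source_b": ["mild"] * 30 + ["severe"] * 70,
    "source_c": ["mild"] * 70 + ["severe"] * 30,
}
dists = {name: distribution(recs, categories) for name, recs in sources.items()}

# Pairwise distances: large values flag sources that report the variable differently.
names = sorted(dists)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        print(first, second, round(js_distance(dists[first], dists[second]), 3))
```

In this toy setup, source_a and source_c report the same severity mix, so their distance is zero, while source_b stands apart from both. A model trained mostly on records from one such source could end up learning the source rather than the disease.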