Predictive Model of COVID-19 Incidence and Socioeconomic Description of Municipalities in Brazil

Author(s): Isadora CR Carneiro, Eloiza KGD Ferreira, Janaina C da Silva, Guilherme Soares, Daisy M. Strottmann, Guilherme F. Silveira


A new, highly contagious coronavirus termed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in China in December 2019. The virus has spread rapidly to all continents, causing a potentially lethal human respiratory infection, COVID-19. Although social distancing is the best alternative in the current pandemic context, such measures alone may not be sufficient to prevent the spread of SARS-CoV-2, and the overall impact of the virus is of great concern.


Herein, we describe the demographic and socioeconomic characteristics of 672 cities with at least 1 reported case of COVID-19 as of June 26, 2020, and build a predictive model for the number of cases using data from patients tested for SARS-CoV-2 and the autoregressive integrated moving average (ARIMA) approach.


This is a predictive modeling and epidemiological study based on aggregated data from the recent COVID-19 pandemic. SARS-CoV-2 has spread around the world more widely than any other human viral disease in over a century. To predict the dynamic risk of the disease in subnational regions, we combined a thorough exploratory data analysis of COVID-19 cases, according to the sociodemographic indicators of Brazilian municipalities, with an autoregressive integrated moving average (ARIMA) model.
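To illustrate the ARIMA idea in its simplest form, the sketch below fits an ARIMA(1,1,0) model with drift by conditional least squares and projects cumulative counts forward. It is a minimal illustration only: the abstract does not state the model order or the estimation method used in the study, so the order (1,1,0), the function names `fit_arima_110` and `forecast_arima_110`, and the toy series are all assumptions made here for clarity, not the authors' procedure or data.

```python
def fit_arima_110(series):
    """Fit ARIMA(1,1,0) with drift: d_t = c + phi * d_(t-1) + e_t,
    where d_t is the first difference of the series.
    Estimation is ordinary least squares on the differenced data."""
    diffs = [b - a for a, b in zip(series, series[1:])]  # the "I" step (d = 1)
    x, y = diffs[:-1], diffs[1:]                         # lag-1 regression pairs
    n = len(x)
    if n < 2:
        raise ValueError("need at least 4 observations")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx if sxx else 0.0                      # AR(1) coefficient
    c = mean_y - phi * mean_x                            # drift term
    return c, phi


def forecast_arima_110(series, c, phi, steps):
    """Project the series forward by iterating the fitted AR(1) on the
    differences and summing them back onto the last observed level."""
    level = series[-1]
    diff = series[-1] - series[-2]
    out = []
    for _ in range(steps):
        diff = c + phi * diff
        level += diff
        out.append(level)
    return out


# Toy cumulative counts (invented for illustration, not study data).
toy_cases = [100, 110, 117, 122.5, 127.25, 131.625]
c, phi = fit_arima_110(toy_cases)
next_three = forecast_arima_110(toy_cases, c, phi, steps=3)
```

In practice a full ARIMA fit would also select the orders (p, d, q) and estimate moving-average terms, which usually calls for a statistics library rather than hand-written least squares.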


From the first case of COVID-19 in the country to the end of the reporting period, confirmed cases of the disease were present in cities of all Brazilian states, affecting 36.5% of the municipalities in Rio de Janeiro State. The inhabitants of cities with reported cases of COVID-19 represent more than 73.1% of the Brazilian population. Stratifying the inhabitants by age or gender group does not affect COVID-19 incidence (confirmed cases per 100,000 inhabitants). Likewise, the demographic density, the Municipal Human Development Index (MHDI) and the per capita income of the municipalities with cases of COVID-19 do not affect disease incidence. In addition, according to official data, our model proved to be effective for disease forecasting, predicting 2,358,703 (2,172,930 to 2,544,477) cumulative cases on July 25, 2020.


On that date, the official case count from the Ministry of Health was 2,394,513 and the count in the database used as our source was 2,337,647, so the forecast of the proposed model was more than 98.5% accurate. Comparing the 30 days of predicted values with the observed values gave a coefficient of determination (R²) of 0.99306.
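The accuracy figure can be checked with simple arithmetic. The sketch below assumes that "accuracy" means 100% minus the absolute percentage error of the forecast relative to the observed count. The abstract does not spell out this definition, so it is an assumption, as are the helper names `percent_agreement` and `r_squared`. The R² helper uses the standard definition (1 minus the ratio of residual to total sum of squares) and is shown only as a formula, since the 30 daily values are not given here.

```python
def percent_agreement(predicted, observed):
    """100% minus the absolute percentage error (assumed definition)."""
    return 100.0 * (1.0 - abs(predicted - observed) / observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Figures quoted in the abstract for July 25, 2020.
predicted_cases = 2_358_703
ministry_cases = 2_394_513   # Ministry of Health count
source_db_cases = 2_337_647  # count in the source database

agree_ministry = percent_agreement(predicted_cases, ministry_cases)
agree_source = percent_agreement(predicted_cases, source_db_cases)
print(f"vs Ministry of Health: {agree_ministry:.2f}%")
print(f"vs source database:    {agree_source:.2f}%")
```

Under this definition the forecast agrees with the Ministry of Health count to about 98.50% and with the source database to about 99.10%, consistent with the "more than 98.5%" statement.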

