Disclaimer: The current post expresses the personal opinion of the author, which may or may not coincide with the opinion of the WHO experts, governmental workers, health care professionals, or other involved institutions. Please check the official web site of the World Health Organisation for the most recent updates on the situation and professional advice.
In this post, we show how to fit the often-mentioned ‘logistic curve’ to COVID-19 cases across various countries using a couple of econometric techniques, and how to compute scenario forecasts.
There is one question everyone is asking in Russia: ‘When will our country hit the peak?’ This is a tricky question, since the word ‘peak’ usually means ‘the day with the most new cases’ and has little to nothing to do with policy implications. The more useful question is ‘what will the dynamics look like in 2 months?’ Many articles, even scientific ones, seem to be talking about some ephemeral ‘logistic curve’. However, there is one problem: climbing in is not the same as climbing out, so the dynamics before and after the peak may differ. Tell me, do these two objects look similar?
This is why asymmetry must be modelled explicitly. We adopt the approach of Fernández & Steel (1998), who generalise symmetric probability density functions to incorporate a skewness parameter, namely by stretching one side and shrinking the other by the same factor, and then normalising to match the first two moments of the original distribution. For the sake of interpretability, we are not going to match the moments since we are not modelling densities; we simply take the logistic curve and shift, scale, and skew it to match the observed data. Besides that, the logistic curve has exponentially decaying tails, which might not be too realistic owing to commonplace complacency: people think that when there are 10–15 new cases per day, it is over, begin communicating too closely, and it becomes their (and everyone’s) undoing. Modelling tail heaviness (as with Student’s t distribution) could also be an option, but there might not be enough observations to estimate this parameter reliably. The formula for the new-cases-per-day curve will therefore follow the form: \[ g(x; \mu, \sigma, s, h) \mathrel{\stackrel{\text{def}}{=}} \begin{cases} h \cdot f \left( \frac{x - \mu}{\sigma \cdot s} \right), & \frac{x - \mu}{\sigma} > 0, \\ h \cdot f \left( \frac{s(x - \mu)}{\sigma} \right), & \frac{x - \mu}{\sigma} \le 0. \end{cases} \] Here, \(\mu\) is the mode (the peak of the function), \(\sigma\) is the spread, \(s\) is the asymmetry coefficient that takes values in \((0, +\infty)\) and governs the relative heaviness of the right tail (\(s > 1\) implies a longer right tail), and finally, \(h\) corresponds to vertical scaling (the number of cases per country matters, so the absolute scale should be inferred from the data, too). Remember that \(f\) can be any symmetric function; here, we are using symmetric-density-like ones for convenience. Remember that we want a quick answer!
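To make this concrete, here is a minimal R sketch of \(g\) with a logistic kernel \(f\) (the replication code at the end of the post wraps the same logic, together with a Student’s t variant, into the function mycurve):

# Minimal sketch of g() with a logistic kernel f = dlogis
g <- function(x, mu, sigma, s, h) {
  z <- (x - mu) / sigma
  h * ifelse(z > 0, dlogis(z / s), dlogis(z * s))
}
# With s = 2, the curve decays more slowly after the peak than before it:
g(c(8, 12), mu = 10, sigma = 1, s = 2, h = 1)  # 0.018 at x = 8, 0.197 at x = 12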
First of all, we download the March and April data from Wikipedia pages titled ‘2020 coronavirus pandemic in X’. We shall focus our attention on Spain, Germany, Italy, Luxembourg, Sweden, Czech Republic, Turkey, and Russia. You will find the data in the table below; please note that these data are as of the 1st of May, 2020, and are subject to change due to methodology updates or new reports.
Date | spa | ger | ita | lux | swe | cze | tur | rus |
2020-03-01 | 85 | 117 | 1,694 | 0 | 7 | 3 | 0 | 0 |
2020-03-02 | 121 | 150 | 2,036 | 0 | 7 | 3 | 0 | 3 |
2020-03-03 | 166 | 188 | 2,502 | 0 | 11 | 5 | 0 | 3 |
2020-03-04 | 228 | 240 | 3,089 | 0 | 16 | 5 | 0 | 3 |
2020-03-05 | 282 | 349 | 3,858 | 2 | 17 | 8 | 0 | 4 |
2020-03-06 | 401 | 534 | 4,636 | 3 | 22 | 19 | 0 | 10 |
2020-03-07 | 525 | 684 | 5,883 | 4 | 29 | 26 | 0 | 14 |
2020-03-08 | 674 | 847 | 7,375 | 5 | 38 | 32 | 0 | 17 |
2020-03-09 | 1,231 | 1,112 | 9,172 | 5 | 54 | 38 | 0 | 20 |
2020-03-10 | 1,695 | 1,460 | 10,149 | 7 | 108 | 63 | 0 | 20 |
2020-03-11 | 2,277 | 1,884 | 12,462 | 7 | 181 | 94 | 1 | 28 |
2020-03-12 | 3,146 | 2,369 | 15,113 | 26 | 238 | 116 | 1 | 34 |
2020-03-13 | 5,232 | 3,062 | 17,660 | 34 | 266 | 141 | 5 | 45 |
2020-03-14 | 6,391 | 3,795 | 21,157 | 51 | 292 | 189 | 6 | 59 |
2020-03-15 | 7,988 | 4,838 | 24,747 | 77 | 339 | 298 | 18 | 63 |
2020-03-16 | 9,942 | 6,012 | 27,980 | 81 | 384 | 383 | 47 | 93 |
2020-03-17 | 11,826 | 7,156 | 31,506 | 140 | 425 | 450 | 98 | 114 |
2020-03-18 | 14,769 | 8,198 | 35,713 | 203 | 511 | 560 | 191 | 147 |
2020-03-19 | 18,077 | 10,999 | 41,035 | 335 | 603 | 765 | 359 | 199 |
2020-03-20 | 21,571 | 13,957 | 47,021 | 484 | 685 | 889 | 670 | 253 |
2020-03-21 | 25,496 | 16,662 | 53,578 | 670 | 767 | 1,047 | 947 | 306 |
2020-03-22 | 29,909 | 18,610 | 59,138 | 798 | 845 | 1,161 | 1,236 | 367 |
2020-03-23 | 35,480 | 22,672 | 63,927 | 875 | 927 | 1,287 | 1,529 | 438 |
2020-03-24 | 42,058 | 27,436 | 69,176 | 1,099 | 1,027 | 1,472 | 1,872 | 495 |
2020-03-25 | 50,105 | 31,554 | 74,386 | 1,333 | 1,125 | 1,763 | 2,433 | 658 |
2020-03-26 | 57,786 | 36,508 | 80,539 | 1,453 | 1,228 | 2,022 | 3,629 | 840 |
2020-03-27 | 65,719 | 42,288 | 86,498 | 1,605 | 1,317 | 2,395 | 5,698 | 1,036 |
2020-03-28 | 73,232 | 48,582 | 92,472 | 1,831 | 1,392 | 2,657 | 7,402 | 1,264 |
2020-03-29 | 80,110 | 52,547 | 97,689 | 1,950 | 1,450 | 2,817 | 9,217 | 1,534 |
2020-03-30 | 87,956 | 57,298 | 101,739 | 1,988 | 1,566 | 3,001 | 10,827 | 1,836 |
2020-03-31 | 95,923 | 61,913 | 105,792 | 2,178 | 1,683 | 3,308 | 13,531 | 2,337 |
2020-04-01 | 104,118 | 67,366 | 110,574 | 2,319 | 1,828 | 3,589 | 15,679 | 2,777 |
2020-04-02 | 112,065 | 73,522 | 115,242 | 2,487 | 1,982 | 3,858 | 18,135 | 3,548 |
2020-04-03 | 119,199 | 79,696 | 119,827 | 2,612 | 2,179 | 4,190 | 20,921 | 4,149 |
2020-04-04 | 126,168 | 85,778 | 124,632 | 2,729 | 2,253 | 4,472 | 23,934 | 4,731 |
2020-04-05 | 131,646 | 91,714 | 128,948 | 2,804 | 2,359 | 4,587 | 27,069 | 5,389 |
2020-04-06 | 136,675 | 95,391 | 132,547 | 2,843 | 2,566 | 4,822 | 30,217 | 6,343 |
2020-04-07 | 141,942 | 99,225 | 135,586 | 2,970 | 2,763 | 5,017 | 34,109 | 7,497 |
2020-04-08 | 148,220 | 103,228 | 139,422 | 3,034 | 2,890 | 5,312 | 38,226 | 8,672 |
2020-04-09 | 153,222 | 108,202 | 143,626 | 3,115 | 3,036 | 5,569 | 42,282 | 10,131 |
2020-04-10 | 158,273 | 113,525 | 147,577 | 3,223 | 3,100 | 5,732 | 47,029 | 11,917 |
2020-04-11 | 163,027 | 117,658 | 152,271 | 3,270 | 3,175 | 5,902 | 52,167 | 13,584 |
2020-04-12 | 166,831 | 120,479 | 156,363 | 3,281 | 3,254 | 5,991 | 56,956 | 15,770 |
2020-04-13 | 170,099 | 123,016 | 159,516 | 3,292 | 3,369 | 6,059 | 61,049 | 18,328 |
2020-04-14 | 174,060 | 125,098 | 162,488 | 3,307 | 3,539 | 6,141 | 65,111 | 21,102 |
2020-04-15 | 180,659 | 127,584 | 165,155 | 3,373 | 3,665 | 6,301 | 69,392 | 24,490 |
2020-04-16 | 184,948 | 130,450 | 168,941 | 3,444 | 3,785 | 6,433 | 74,193 | 27,938 |
2020-04-17 | 190,839 | 133,830 | 172,434 | 3,480 | 3,909 | 6,549 | 78,546 | 32,008 |
2020-04-18 | 194,416 | 137,439 | 175,925 | 3,537 | 3,964 | 6,654 | 82,329 | 36,793 |
2020-04-19 | 198,674 | 139,897 | 178,972 | 3,550 | 4,027 | 6,746 | 86,306 | 42,853 |
2020-04-20 | 200,210 | 141,672 | 181,228 | 3,558 | 4,173 | 6,900 | 90,980 | 47,121 |
2020-04-21 | 204,178 | 143,457 | 183,957 | 3,618 | 4,284 | 7,033 | 95,591 | 52,763 |
2020-04-22 | 208,389 | 145,694 | 187,327 | 3,654 | 4,406 | 7,132 | 98,674 | 57,999 |
2020-04-23 | 213,024 | 148,046 | 189,973 | 3,665 | 4,515 | 7,187 | 101,790 | 62,773 |
2020-04-24 | 219,764 | 150,383 | 192,994 | 3,695 | 4,610 | 7,273 | 104,912 | 68,622 |
2020-04-25 | 223,759 | 152,438 | 195,351 | 3,711 | 4,686 | 7,352 | 107,773 | 74,588 |
2020-04-26 | 226,629 | 154,175 | 197,675 | 3,723 | 4,756 | 7,404 | 110,130 | 80,949 |
2020-04-27 | 229,422 | 155,193 | 199,414 | 3,729 | 4,884 | 7,445 | 112,261 | 87,147 |
2020-04-28 | 232,128 | 156,337 | 201,505 | 3,741 | 4,980 | 7,504 | 114,653 | 93,558 |
2020-04-29 | 236,899 | 157,641 | 203,591 | 3,769 | 5,040 | 7,579 | 117,589 | 99,399 |
2020-04-30 | 239,369 | 159,119 | 205,463 | 3,784 | 5,051 | 7,682 | 120,204 | 106,498 |
Secondly, we have to choose the estimation method. Basically, for every country, we have pairs of observations \((1, Y_1), (2, Y_2), \ldots, (61, Y_{61})\), and the functional form is assumed to be known up to 4 (logistic) or 5 (Student) parameters. The most obvious way of estimation is minimising some statistic based on the discrepancies between the real data and the model predictions, the easiest case being squared errors and the sum thereof. However, in practice, more robust penalty functions are often used, and sometimes, more robust statistics, so here, we shall also try minimising the sum of absolute deviations (as in quantile non-linear regression) and the median of squared errors (as in the least-median-of-squares regression due to Rousseeuw). We shall be minimising these 3 criteria; note that the last one is not smooth in the parameters, so there might be a continuum of parameter values yielding the global minimum of the criterion. \[ C_1 = \frac{1}{T} \sum_{t=1}^T (Y_t - g(t; \theta))^2, \quad C_2 = \frac{1}{T} \sum_{t=1}^T |Y_t - g(t; \theta)|, \quad C_3 = \operatorname{median}[(Y_t - g(t; \theta))^2] \]

Thirdly, since we are modelling parametric curves with four intuitive parameters, we pick some ballpark estimates in order to roughly match the dynamics. The criterion function is highly non-linear in the parameters, so we begin by manually fiddling with the parameters for each country, then construct a rather generous hypercube around the eyeballed values and run a stochastic optimisation routine to pick the candidate for the best solution. We further refine it with a constrained Nelder–Mead optimiser.
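To see why the choice of penalty matters, here is a toy illustration of the three criteria on a residual vector with one large outlier (the replication code below implements them as rls, nla, and nms, taking square roots where convenient for comparability; the minimisers are unaffected):

# Toy residuals with one large outlier
e <- c(-3, 1, 0.5, -0.2, 10)
mean(e^2)    # C1, mean of squares: dominated by the outlier (22.06)
mean(abs(e)) # C2, mean absolute deviation: less sensitive (2.94)
median(e^2)  # C3, median of squares: ignores the outlier entirely (1)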
We implement the general curve via a custom function that depends on the data points (days elapsed since the 1st of March), the distribution type (we consider logistic and Student’s t, but others are possible), and the distribution parameters (a vector of length 4 or 5: mode, scale, skewness, height multiplier, and, for Student’s t, the number of degrees of freedom).
Remember, we do not refer to this function as a density since we do not know the real scale; it is just a curve whose unit of measurement is, well, people: it measures the number of cases per day. This is how all the dynamics look in one graph.
Now, it is time to choose reasonable starting values for the optimiser, because the parameters live on very different scales. For numerical stability, especially for gradient-based methods, it is reasonable to have all parameters of the same order of magnitude (say, via rescaling); since the vertical scale parameter is in the range of tens of thousands, we divide it by 10,000. For every country, we eyeballed the parameters \((\tilde \mu, \tilde\sigma, \tilde s, \tilde h\ [, \widetilde{df}])\), and then set the following ranges for the stochastic search: from \((\tilde\mu/2, \tilde\sigma/3, \tilde s/3, \tilde h / 4\ [, 0.1])\) to \((1.5\tilde\mu, 3\tilde\sigma, 3\tilde s, 3\tilde h\ [, 50])\).
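For a single country, the search hypercube is built like this (hypothetical eyeballed values for illustration; the replication code below does the same for every country and both distributions):

# Hypothetical eyeballed values; h is measured in tens of thousands
start <- c(mu = 27, sigma = 8, s = 1.8, h = 3)
lower <- start / c(2, 3, 3, 4)
upper <- start * c(1.5, 3, 3, 3)
rbind(lower, start, upper)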
For optimisation, we are using the hydroPSO library with its excellent parallel capabilities and many tweaks. Since high-dimensional spaces are sparser than one might imagine, for a more thorough sweep, we use a population of 400 or 600 particles. Convergence is quite quick for the logistic model and somewhat slower for the Student’s t one, so we use 50–100 iterations for the former and 75–125 for the latter (this is a rough first-step pre-estimation anyway). One iteration took 0.25–0.5 seconds, so estimating all models took 8 ⋅ (50 + 75 + 100 + 75 + 100 + 125) ⋅ 0.35 ≈ 1500 seconds, or roughly half an hour, a small price to pay for robustness against multiple local optima.
You would think that estimation can continue without a hindrance... right? Wrong!
There is one cloud on the horizon: so far, there has been no peak in Russia... yet! This opens a whole new can of worms: strictly speaking, the skewness and scale parameters are not identified because we observe only the left side of the distribution: one can multiply the skewness parameter by 2 and the scale parameter by 2 and obtain the same left tail. Relying on the rightmost end of the data is a dangerous affair in this case. It is well known that estimation of truncated normal parameters is very unreliable when the mean lies outside the truncation range; in our situation, the peak is also outside the range, which makes our problem even worse. This means that, for the purposes of estimation, the skewness parameter must be fixed to that of another country, so that only the scale is estimated. E.g., if Russia were to follow the right-tail decay of Luxembourg, what would the total number of cases be? In other words, we can make scenario forecasts: a Turkish scenario, an Italian scenario, an average scenario.
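The non-identification is easy to check numerically: the left branch of \(g\) depends on \(\sigma\) and \(s\) only through the ratio \(s/\sigma\), so doubling both changes nothing to the left of the peak (a minimal check, reusing the sketch of g() from above):

g <- function(x, mu, sigma, s, h) {
  z <- (x - mu) / sigma
  h * ifelse(z > 0, dlogis(z / s), dlogis(z * s))
}
g(7,  mu = 10, sigma = 3, s = 2, h = 1)  # left of the peak: 0.105
g(7,  mu = 10, sigma = 6, s = 4, h = 1)  # identical value, same s/sigma
g(13, mu = 10, sigma = 3, s = 2, h = 1)  # right of the peak: 0.235
g(13, mu = 10, sigma = 6, s = 4, h = 1)  # now they diverge: 0.249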
The second-step minimisation from the candidate solution was done in the blink of an eye, and the estimates themselves did not change much. However, the results were more volatile for Sweden, one model returned strange results for Turkey, and, as expected, the results for Russia were quite strange, because we have not applied the magic fix yet.
Now it is obvious that there is no clear winner among the models: the logistic and Student’s t specifications yield similar fits. For the sake of parsimony, we restrict our analysis to the logistic fits only and forget about tail-heaviness estimation.
Besides that, in this estimation, we are not worrying about statistical inference. Of course, for the NLS estimator, one can compute robust standard errors, but for these data, it is non-trivial to do so for the two other estimators, and the data are dependent, so there is no obvious solution. Remember, we are keeping it simple here.
Now comes the part everyone has been eagerly waiting for: how many cases are expected in the upcoming months? Or, say, on which date do we expect fewer than one new infection per day? Or, even better, when do we expect fewer than 1 case in two weeks (that is, 1/14 cases per day), and what will the toll be up until that date, so that we can call it the end of the pandemic? For this, we invoke a very simple computation: extend the time frame to the right as far as we want and find the points where the fitted curve takes the values 1 and 1/14. The total expected number of cases is the cumulative sum of the curve up until that date.
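A minimal sketch of this computation with uniroot() on hypothetical parameter values (the replication code below wraps it into find.day() and get.cases()):

g <- function(x, mu, sigma, s, h) {
  z <- (x - mu) / sigma
  h * ifelse(z > 0, dlogis(z / s), dlogis(z * s))
}
pars <- c(25, 7, 2, 2e4)  # hypothetical mu, sigma, s, h
# Day when the curve crosses 1 new case per day (search to the right of the peak)
d1 <- uniroot(function(x) g(x, pars[1], pars[2], pars[3], pars[4]) - 1,
              lower = pars[1], upper = pars[1] + 365)$root
# Total expected cases: cumulative sum of the curve up to that day
sum(g(1:ceiling(d1), pars[1], pars[2], pars[3], pars[4]))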
Now, we use the logistic models to forecast the number of cases in each country until the day when fewer than 1 case (or 1/14 cases) per day is expected. For every country, we pick the best model (out of the three logistic ones) as the one whose predicted total number of cases up to the last date is closest to the observed total (i.e. the best in-sample fit).
Country | Real cases by April, 30 | Mean least squares | Mean abs. dev. | Median least sq. |
Spain | 239,369 | 237,304 | 242,555 | 203,910 |
Germany | 159,119 | 158,465 | 164,505 | 183,371 |
Italy | 205,463 | 203,246 | 204,339 | 212,175 |
Luxembourg | 3,784 | 3,749 | 4,108 | 5,147 |
Sweden | 5,051 | 5,076 | 5,046 | 5,160 |
Czech Republic | 7,682 | 7,646 | 7,710 | 8,033 |
Turkey | 120,204 | 121,944 | 116,104 | 102,600 |
Country | Cases by April, 30 | Days from March, 1 till 1 new case/day | Days from March, 1 till 1 new case/14 days | Cases till 1 new case/day | Cases till 1 new case/14 days |
Spain | 239,369 | 189 | 231 | 282,401 | 282,415 |
Germany | 159,119 | 142 | 172 | 171,369 | 171,379 |
Italy | 205,463 | 201 | 249 | 241,977 | 241,993 |
Luxembourg | 3,784 | 83 | 109 | 3,838 | 3,847 |
Sweden | 5,051 | 148 | 196 | 6,752 | 6,768 |
Czech Republic | 7,682 | 95 | 120 | 8,029 | 8,037 |
Turkey | 120,204 | 140 | 166 | 145,922 | 145,930 |
Country | Peak (μ) | Horiz. scale (σ) | Skewness (s) | Vert. scale (h) |
Spain | 25.383 | 7.859 | 2.026 | 2.854 |
Germany | 27.533 | 7.050 | 1.612 | 2.180 |
Italy | 20.519 | 9.619 | 1.881 | 2.103 |
Luxembourg | 20.131 | 3.999 | 2.410 | 0.068 |
Sweden | 32.858 | 12.033 | 1.519 | 0.052 |
Czech Republic | 27.818 | 6.764 | 1.400 | 0.113 |
Turkey | 41.888 | 7.754 | 1.277 | 1.829 |
But... wait! What about Russia?
We have 7 countries and 7 different scenarios. In terms of parameters, these scenarios are described by the peak day (\(\mu\)), steepness of the curve (\(\sigma\)), asymmetry (\(s\)), overall scale (\(h\)), and for Russia, we cannot know the degree of asymmetry. Compare these two plots.
For the same left tail, we get a red right tail that is 4 times heavier than the blue one. This has huge implications for the forecasts as well: the blue curve describes a pandemic that subsides to fewer than 1 new case per day in 66 days with 37,469 total cases, whereas the red one takes a whopping 232 days with 127,451 infections! This means that the initial values for Russia must be tweaked according to the 7 scenarios estimated for the other countries.
After we re-estimate the logistic model for Russia with the skewness fixed at 1 (least absolute deviations yielded a slightly better total number of cases, with a relative error just below 1%), we can, for each donor country, multiply the scale \(\sigma\) by that country’s skewness and set \(s\) to it, compute the two sacred days (fewer than 1 new case per day or per 14 days), and obtain the total number of cases.
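A minimal sketch of this substitution with rounded estimates (the replication code below loops over all 7 donor countries plus the average): multiplying \(\sigma\) by \(s^*\) keeps the left tail of the Russian fit intact, while the right tail now decays \((s^*)^2\) times more slowly, which for the average \(s^* = 1.73\) gives the factor of roughly 3 mentioned in the conclusions.

g <- function(x, mu, sigma, s, h) {
  z <- (x - mu) / sigma
  h * ifelse(z > 0, dlogis(z / s), dlogis(z * s))
}
rus <- c(56.1, 6.33, 2.49e4)  # mu, sigma, h from the s = 1 fit (rounded)
s.star <- 1.73                # e.g. the average skewness of the other countries
g(40, rus[1], rus[2], 1, rus[3])                # left of the peak, s = 1 fit
g(40, rus[1], rus[2] * s.star, s.star, rus[3])  # identical left tail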
The estimates for Russia are \(\hat\mu = 56.1\), \(\hat\sigma = 6.33\) (with \(s\) fixed at 1; recall that only the left-tail ratio \(\sigma/s\) is identified), and \(\hat h = 2.49\). This means that, despite the absence of a clear peak by day 61, the fitted peak lies at day 56, just before the end of data collection.
Scenario | Days till 1 new case/day | Days till 1 new case/14 days | Cases till 1 new case/day | Cases till 1 new case/14 days |
Spain scenario | 320 | 388 | 402,422 | 402,445 |
Germany scenario | 223 | 267 | 283,730 | 283,745 |
Italy scenario | 284 | 343 | 357,887 | 357,907 |
Luxembourg scenario | 429 | 526 | 536,540 | 536,573 |
Sweden scenario | 205 | 243 | 260,712 | 260,724 |
Czech Republic scenario | 182 | 215 | 233,328 | 233,339 |
Turkey scenario | 161 | 188 | 207,443 | 207,452 |
Average scenario (s = 1.73) | 249 | 299 | 315,371 | 315,387 |
Before we jump to the conclusion, there are several ‘buts’ worth mentioning:
- Inferring the shift of the curve (\(\hat\mu\)) from its left part alone is very unreliable, so there is a huge confidence band around that estimate. How huge? It is up to the dear reader to compute the standard errors, but no one should even assume normality of the estimator, since it is well known that the behaviour of estimators on the boundary of the parameter space is far from asymptotically normal (cf. the failure of maximum likelihood for the estimation of extreme order statistics).
- The ‘peak’, although defined as the maximum number of cases, corresponds in our estimation to the peak of the smoother underlying curve, even if individual observations jump above or below it, assuming, among other things, that the error has zero mean. If there is systematic under-reporting because not enough tests are available and the real numbers are higher, this overly simplistic model cannot learn anything useful from the data, and other, more sophisticated methods must be used.
- There are many asymptomatic cases that are not registered anywhere and that make up the underwater part of the iceberg; if that iceberg of unobservedness were raised entirely above water, the peak would shift to the right.
- The reported daily numbers depend heavily on testing tactics, the weather (this is not the ‘sunspot economics’ envisioned by Jevons: good weather correlates with the number of people gathering outside), presidential speeches, and other hidden variables.
All these caveats hint that what we obtained is an under-estimate of the real peak; dealing with the problems outlined above would require better spatial data and more information about population density and geographical distribution, which is a real job for real epidemiologists.
We conclude the following (if Russia follows the dynamics of some European countries and if one blissfully ignores the caveats above):
- The peak is most likely to have already occurred at the very end of April, 2020;
- The total number of infected people will be between 200 and 540 thousand Russian citizens;
- Under the assumption of average asymmetrical European scenario (where the decrease of cases per day after the peak is roughly 3 times slower than the growth prior to reaching the peak), the total number of cases is estimated to be 315 thousand;
- In early November, the number of new cases per day will drop below 1, and by the end of December, there will be fewer than 1 case per 2 weeks.
Only time will tell how accurate these quick estimates are, so you are encouraged to redo the estimation on new data (maybe with more countries) and update the forecasts.
Copy the table above into R as a data frame named ds (or rebuild it from the raw dumps, as in the first block below) and use the code to replicate these results.
library(hydroPSO)
library(stargazer)
cs <- c("spa", "ger", "ita", "lux", "swe", "cze", "tur", "rus")
csn <- c("Spain", "Germany", "Italy", "Luxembourg", "Sweden", "Czech Republic", "Turkey", "Russia")
# Uncomment this for Cyrillic names
# csn <- c("Испания", "Германия", "Италия", "Люксембург", "Швеция", "Чехия", "Турция", "Россия")
# Build ds from the raw ‘country.fsf’ dumps; skip this block if ds already exists
d <- lapply(cs, function(x) readLines(paste0(x, ".fsf")))
clean <- function(x) {
x <- gsub("\\s+", " ", x)
x <- gsub(" $", "", x)
x <- gsub(",", "", x)
x <- gsub("\\([^)]+\\)", "", x)
x
}
d <- lapply(d, function(x) strsplit(clean(x), " "))
d <- lapply(d, function(y) lapply(y, function(x) if (length(x) == 2) x else x[1:2]) )
d <- lapply(d, function(x) as.data.frame(do.call(rbind, x), stringsAsFactors = FALSE))
d <- lapply(d, function(x) {colnames(x) <- c("Date", "Cases"); x})
ds <- Reduce(
function(x, y) merge(x, y, all = TRUE, by = "Date"),
d
)
colnames(ds) <- c("Date", cs)
for (i in 2:ncol(ds)) {
ds[, i] <- as.numeric(ds[, i])
if (is.na(ds[1, i])) {
nm <- which(is.finite(ds[, i]))[1]
ds[1:(nm-1), i] <- 0
}
}
ds <- ds[1:61, ]
stargazer(ds, type = "html", summary = FALSE, out = "blog.html", rownames = FALSE)
mycurve <- function(x, distrib, pars) {
pars[4] <- pars[4] * 1e4
xstd <- (x - pars[1]) / pars[2]
f <- switch(distrib,
logistic = ifelse(xstd > 0, dlogis(xstd/pars[3]), dlogis(xstd*pars[3])) * pars[4],
student = ifelse(xstd > 0, dt(xstd/pars[3], df = pars[5]), dt(xstd*pars[3], df = pars[5])) * pars[4]
)
return(f)
}
png("plot1-logistic-curve-coronavirus-italy.png", 640, 400, type = "cairo", pointsize = 16)
plot(diff(ds$ita), type = "l", xlab = "Days since the 1st of March", ylab = "New cases per day",
bty = "n", lwd = 2, main = "Logistic curve for Italy")
lines(1:60, mycurve(1:60, "logistic", c(25, 6, 1, 2.45)), col = "red", lty = 2, lwd = 2)
text(50, 1500, "WTF?!", col = "red", cex = 2)
dev.off()
mycols <- c("#000000", "#008800", "#AA0000")
png("plot2-density-logistic-curve-for-corona-virus-flattening.png", 720, 360, pointsize = 16)
par(mar = c(2, 2, 3, 1), mfrow = c(1, 2))
curve(dlogis(x), -6, 6, n = 1001, main = "Skewed logistic density", bty = "n",
lwd = 2, ylim = c(0, dlogis(0)), ylab = "", xlab = "")
curve(mycurve(x, "logistic", c(0, 1, 1.5, 1e-4)), -6, 6, n = 1001, add = TRUE,
col = mycols[2], lwd = 2, lty = 2)
curve(mycurve(x, "logistic", c(0, 1, 0.5, 1e-4)), -6, 6, n = 1001, add = TRUE,
col = mycols[3], lwd = 2, lty = 2)
legend("topright", c("log.", "log., skew = 1.5", "log., skew = 0.5"), col = mycols, pch = 16, bty = "n")
curve(dt(x, 5), -6, 6, n = 1001, main = "Skewed Student’s density", bty = "n",
lwd = 2, ylim = c(0, dt(0, 5)))
curve(mycurve(x, "student", c(0, 1, 1.5, 1e-4, 5)), -6, 6, n = 1001, add = TRUE,
col = mycols[2], lwd = 2, lty = 2)
curve(mycurve(x, "student", c(0, 1, 0.5, 1e-4, 5)), -6, 6, n = 1001, add = TRUE,
col = mycols[3], lwd = 2, lty = 2)
legend("topright", c("t(5)", "t(5, skew = 1.5)", "t(5, skew = 0.5)"), col = mycols, pch = 16, bty = "n")
dev.off()
ds[, (length(cs)+2):(length(cs)*2+1)] <- sapply(ds[, 2:(length(cs)+1)], function(x) diff(c(NA, x)))
colnames(ds)[(length(cs)+2):(length(cs)*2+1)] <- paste0("d", cs)
ds <- ds[2:nrow(ds), ]
mycols <- rainbow(length(cs), end = 0.8, v = 0.8)
png("plot3-new-covid-19-cases-as-a-fraction-of-maximum.png", 640, 400, pointsize = 16)
plot(NULL, NULL, xlim = c(1, nrow(ds)), ylim = c(0, 1), xlab = "Days since March, 1",
ylab = "Cases / Cases at peak", bty = "n", main = "Scaled dynamics of new cases per day")
for (i in (length(cs)+2):(length(cs)*2+1)) lines(ds[, i] / max(ds[, i]),
type = "l", lwd = 2, col = mycols[i - length(cs) - 1])
legend("topleft", cs, pch = 16, col = mycols, bty = "n")
dev.off()
start.pars <- list(list(c(27, 8, 1.8, 3), c(26, 11, 2, 2, 5)), # Spain
list(c(26, 6, 2, 2.3), c(27, 9, 2, 1.5, 5)), # Germany
list(c(20, 9, 2.1, 2.2), c(20, 12, 2.1, 1.5, 5)), # Italy
list(c(25, 5, 1.5, 0.08), c(25, 7, 1.5, 0.055, 5)), # Luxembourg
list(c(35, 8, 1.2, 0.07), c(35, 12, 1.2, 0.045, 5)), # Sweden
list(c(30, 7, 1.3, 0.11), c(30, 11, 1.3, 0.07, 5)), # Czech,
list(c(43, 7, 1.3, 1.8), c(43, 11, 1.3, 1.2, 5)), # Turkey
list(c(60, 7, 1, 2.8), c(58, 10, 1, 1.8, 5)) # Russia
)
start.pars <- lapply(start.pars, function(x) {names(x) <- c("logistic", "student"); x})
leg <- function(i) {
c(parse(text = paste0('mu == ', start.pars[[i]][[1]][1], '*" | "*', start.pars[[i]][[2]][1])),
parse(text = paste0('sigma == ', start.pars[[i]][[1]][2], '*" | "*', start.pars[[i]][[2]][2])),
parse(text = paste0('s == ', start.pars[[i]][[1]][3], '*" | "*', start.pars[[i]][[2]][3])),
parse(text = paste0('h == ', start.pars[[i]][[1]][4], '*" | "*', start.pars[[i]][[2]][4])),
parse(text = paste0('df(t) == ', start.pars[[i]][[2]][5])))
}
png("plot4-initial-values-for-search.png", 720, 1200, pointsize = 16)
par(mfrow = c(4, 2))
for (i in 1:length(cs)) {
plot(ds[, paste0("d", cs[i])], type = "l", lwd = 2,
xlab = "Days since March, 1", ylab = "Registered cases", bty = "n",
main = paste0(csn[i], " initial values"), ylim = c(0, max(ds[, paste0("d", cs[i])])))
lines(mycurve(1:nrow(ds), "logistic", start.pars[[i]][[1]]), col = mycols[i], lwd = 2, lty = 2)
lines(mycurve(1:nrow(ds), "student", start.pars[[i]][[2]]), col = mycols[i], lwd = 2, lty = 3)
legend("topleft", legend = leg(i), bty = "n")
}
dev.off()
criterion <- function(pars, observed, distrib, fun = function(x) mean(x^2)) {
theor <- mycurve(x = 1:length(observed), distrib = distrib, pars = pars)
return(fun(theor - observed))
}
rls <- function(x) sqrt(mean(x^2))   # Root mean squared error; the root is for comparability
nla <- function(x) mean(abs(x))      # Mean absolute deviation
nms <- function(x) sqrt(median(x^2)) # Root median of squared errors
opt.res <- vector("list", length(cs))
for (i in 1:length(cs)) {
sl <- start.pars[[i]][[1]]
st <- start.pars[[i]][[2]]
ll <- start.pars[[i]][[1]] / c(2, 3, 3, 4) # Lower bound for logistic
ul <- start.pars[[i]][[1]] * c(1.5, 3, 3, 3) # Upper bound for logistic
lt <- c(start.pars[[i]][[2]][1:4] / c(2, 3, 3, 4), 0.1) # Lower bound for Student
ut <- c(start.pars[[i]][[2]][1:4] * c(1.5, 3, 3, 3), 50) # Upper bound for Student
contl1 <- list(npart = 400, REPORT = 1, parallel = "parallel", par.nnodes = 6, maxit = 50, plot = TRUE)
contl2 <- list(npart = 400, REPORT = 1, parallel = "parallel", par.nnodes = 6, maxit = 75, plot = TRUE)
contl3 <- list(npart = 400, REPORT = 1, parallel = "parallel", par.nnodes = 6, maxit = 100, plot = TRUE)
contt1 <- list(npart = 600, REPORT = 1, parallel = "parallel", par.nnodes = 6, maxit = 75, plot = TRUE)
contt2 <- list(npart = 600, REPORT = 1, parallel = "parallel", par.nnodes = 6, maxit = 100, plot = TRUE)
contt3 <- list(npart = 600, REPORT = 1, parallel = "parallel", par.nnodes = 6, maxit = 125, plot = TRUE)
set.seed(1); cat(csn[i], "logistic: least squares\n")
opt.log.nls <- hydroPSO(par = sl, fn = function(x) criterion(x, ds[, paste0("d", cs[i])], "logistic", rls),
lower = ll, upper = ul, control = contl1)
set.seed(1); cat(csn[i], "logistic: least absolute deviations\n")
opt.log.nla <- hydroPSO(par = sl, fn = function(x) criterion(x, ds[, paste0("d", cs[i])], "logistic", nla),
lower = ll, upper = ul, control = contl2)
set.seed(1); cat(csn[i], "logistic: median of least squares\n")
opt.log.nms <- hydroPSO(par = sl, fn = function(x) criterion(x, ds[, paste0("d", cs[i])], "logistic", nms),
lower = ll, upper = ul, control = contl3)
set.seed(1); cat(csn[i], "student: least squares\n")
opt.t.nls <- hydroPSO(par = st, fn = function(x) criterion(x, ds[, paste0("d", cs[i])], "student", rls),
lower = lt, upper = ut, control = contt1)
set.seed(1); cat(csn[i], "student: least absolute deviations\n")
opt.t.nla <- hydroPSO(par = st, fn = function(x) criterion(x, ds[, paste0("d", cs[i])], "student", nla),
lower = lt, upper = ut, control = contt2)
set.seed(1); cat(csn[i], "student: median of least squares\n")
opt.t.nms <- hydroPSO(par = st, fn = function(x) criterion(x, ds[, paste0("d", cs[i])], "student", nms),
lower = lt, upper = ut, control = contt3)
opt.res[[i]] <- list(opt.log.nls, opt.log.nla, opt.log.nms, opt.t.nls, opt.t.nla, opt.t.nms)
save(opt.res, file = "opt.RData", compress = "xz")
}
ui4 <- diag(4)
ci4 <- c(10, 1, 0.01, 1e-5)
ui5 <- diag(5)
ci5 <- c(10, 1, 0.01, 1e-5, 0.1)
opt.refined <- vector("list", length(cs))
for (i in 1:length(cs)) {
cat(csn[i], "\n")
sl1 <- opt.res[[i]][[1]]$par # If you skipped the stochastic optimisation, replace
sl2 <- opt.res[[i]][[2]]$par # these with start.pars[[i]]$logistic ...
sl3 <- opt.res[[i]][[3]]$par
st1 <- opt.res[[i]][[4]]$par # ... and these with start.pars[[i]]$student
st2 <- opt.res[[i]][[5]]$par
st3 <- opt.res[[i]][[6]]$par
opt.log.nls <- constrOptim(sl1, function(x) criterion(x, ds[, paste0("d", cs[i])], "logistic", rls),
grad = NULL, ui = ui4, ci = ci4, outer.iterations = 300)
opt.log.nla <- constrOptim(sl2, function(x) criterion(x, ds[, paste0("d", cs[i])], "logistic", nla),
grad = NULL, ui = ui4, ci = ci4, outer.iterations = 300)
opt.log.nms <- constrOptim(sl3, function(x) criterion(x, ds[, paste0("d", cs[i])], "logistic", nms),
grad = NULL, ui = ui4, ci = ci4, outer.iterations = 300)
opt.t.nls <- constrOptim(st1, function(x) criterion(x, ds[, paste0("d", cs[i])], "student", rls),
grad = NULL, ui = ui5, ci = ci5, outer.iterations = 300)
opt.t.nla <- constrOptim(st2, function(x) criterion(x, ds[, paste0("d", cs[i])], "student", nla),
grad = NULL, ui = ui5, ci = ci5, outer.iterations = 300)
opt.t.nms <- constrOptim(st3, function(x) criterion(x, ds[, paste0("d", cs[i])], "student", nms),
grad = NULL, ui = ui5, ci = ci5, outer.iterations = 300)
opt.refined[[i]] <- list(opt.log.nls, opt.log.nla, opt.log.nms, opt.t.nls, opt.t.nla, opt.t.nms)
}
save(opt.res, opt.refined, file = "opt.RData", compress = "xz")
xmax <- 120
xinf <- 1000
mycols <- c("#CC0000", "#0000FFCC")
png("plot5-fitted-logistic-curve-total-infected-coronavirus-cases.png", 720, 1200, type = "cairo", pointsize = 16)
par(mfrow = c(4, 2))
for (i in 1:length(cs)) {
plot(ds[, paste0("d", cs[i])], xlim = c(1, xmax), type = "l", lwd = 3,
xlab = "Days since March, 1", ylab = "Cases", bty = "n", main = paste0("Fitted curves for ", csn[i]))
for (j in 1:3) {
lines(mycurve(1:xmax, distrib = "logistic", pars = opt.refined[[i]][[j]]$par), lwd = 2, lty = j, col = mycols[1])
lines(mycurve(1:xmax, distrib = "student", pars = opt.refined[[i]][[j + 3]]$par), lwd = 2, lty = j, col = mycols[2])
}
legend("topleft", c("Logis.", "Stud."), pch = 16, col = mycols, bty = "n")
legend("topright", c("Mean(least squares)", "Mean(least abs. dev.)", "Median(least abs. sq.)"),
lty = 1:3, lwd = 2, bty = "n")
}
dev.off()
res.tab <- vector("list", length(cs))
for (i in 1:length(cs)) {
cat(csn[i])
a <- cbind(c(start.pars[[i]]$logistic, NA), c(opt.refined[[i]][[1]]$par, NA),
c(opt.refined[[i]][[2]]$par, NA), c(opt.refined[[i]][[3]]$par, NA),
start.pars[[i]]$student, opt.refined[[i]][[4]]$par,
opt.refined[[i]][[5]]$par, opt.refined[[i]][[6]]$par)
colnames(a) <- c("Log. init. val.", "Log. LS", "Log. LAD", "Log. LMS",
"Stud. init. val.", "Stud. LS", "Stud. LAD", "Stud. LMS")
row.names(a) <- c("Peak", "Horiz. scale", "Skew", "Vert. scale", "Stud. DoF")
res.tab[[i]] <- a
# stargazer(a, out = paste0(csn[i], ".html"), type = "html", summary = FALSE, digits = 2)
}
find.day <- function(cases, distrib, pars) {
uniroot(function(x) mycurve(x, distrib, pars) - cases,
lower = pars[1], upper = pars[1] + 180, extendInt = "downX", maxiter = 200, tol = 1e-8)$root
}
get.cases <- function(xmax, distrib, pars) {
round(sum(mycurve(1:xmax, distrib, pars)))
}
days.to.one <- days.to.tw <- cases.one <- cases.tw <- cases.insample <- matrix(NA, nrow = length(cs), ncol = 6)
best.mod <- best.log <- numeric(length(cs))
for (i in 1:length(cs)) {
days.to.one[i, ] <- c(sapply(1:3, function(j) find.day(1, "logistic", opt.refined[[i]][[j]]$par)),
sapply(4:6, function(j) find.day(1, "student", opt.refined[[i]][[j]]$par)))
days.to.tw[i, ] <- c(sapply(1:3, function(j) find.day(1/14, "logistic", opt.refined[[i]][[j]]$par)),
sapply(4:6, function(j) find.day(1/14, "student", opt.refined[[i]][[j]]$par)))
cases.one[i, ] <- c(sapply(1:3, function(j) get.cases(ceiling(days.to.one[i, j]), "logistic", opt.refined[[i]][[j]]$par)),
sapply(4:6, function(j) get.cases(ceiling(days.to.one[i, j]), "student", opt.refined[[i]][[j]]$par)))
cases.tw[i, ] <- c(sapply(1:3, function(j) get.cases(ceiling(days.to.tw[i, j]), "logistic", opt.refined[[i]][[j]]$par)),
sapply(4:6, function(j) get.cases(ceiling(days.to.tw[i, j]), "student", opt.refined[[i]][[j]]$par)))
cases.insample[i, ] <- c(sapply(1:3, function(j) get.cases(nrow(ds), "logistic", opt.refined[[i]][[j]]$par)),
sapply(4:6, function(j) get.cases(nrow(ds), "student", opt.refined[[i]][[j]]$par)))
best.mod[i] <- which.min(abs(ds[nrow(ds), cs[i]] - cases.insample[i, ]))
best.log[i] <- which.min(abs(ds[nrow(ds), cs[i]] - cases.insample[i, 1:3]))
cat(csn[i], "has the best in-sample fit from model", best.mod[i], "and among logistic models,", best.log[i], "\n")
}
colnames(days.to.one) <- colnames(days.to.tw) <- c("Log. LS", "Log. LAD", "Log. LMS", "Stud. LS", "Stud. LAD", "Stud. LMS")
row.names(days.to.one) <- row.names(days.to.tw) <- csn
mod.inds <- c("Mean least squares", "Mean abs. dev.", "Median least sq.")
cases.ins <- cbind(as.numeric(ds[nrow(ds), 2:(length(cs))]), cases.insample[1:7, 1:3])
colnames(cases.ins) <- c("Real cases by April, 30", paste0(mod.inds, " forecast"))
row.names(cases.ins) <- csn[1:7]
stargazer(cases.ins, type = "html", out = "insample.html", summary = FALSE)
best.log
best.pars <- lapply(1:7, function(i) opt.refined[[i]][[best.log[i]]]$par)
best.pars <- do.call(rbind, best.pars)
row.names(best.pars) <- csn[1:7]
colnames(best.pars) <- c("Peak (mu)", "Horiz. scale (sigma)", "Skewness (s)", "Vert. scale (h)")
stargazer(best.pars, type = "html", out = "bestpars.html", summary = FALSE)
days.to <- ceiling(cbind(sapply(1:7, function(i) days.to.one[i, best.log[i]]),
sapply(1:7, function(i) days.to.tw[i, best.log[i]])))
cases <- ceiling(cbind(sapply(1:7, function(i) cases.one[i, best.log[i]]),
sapply(1:7, function(i) cases.tw[i, best.log[i]])))
dcases <- cbind(as.numeric(ds[nrow(ds), 2:length(cs)]), days.to, cases)
colnames(dcases) <- c("Cases by April, 30", "Days from March, 1 till 1 new case/day",
"Days from March, 1 till 1 new case/14 days",
"Cases till 1 new case/day", "Cases till 1 new case/14 days")
row.names(dcases) <- csn[1:7]
stargazer(dcases, type = "html", out = "days-to-one.html", summary = FALSE)
png("plot6-density-logistic-curve-yielding-different-tails.png", 600, 400, pointsize = 16)
par(mar = c(2, 2, 3, 1))
curve(mycurve(x, "logistic", c(10, 3, 2, 1)), 0, 10, n = 1001, col = "blue",
lwd = 2, bty = "n", xlab = "n", ylab = "n",
main = "Identification problems without the right tail (logistic)", xlim = c(0, 40))
curve(mycurve(x, "logistic", c(10, 6, 4, 1)), 0, 10, n = 1001, add = TRUE, col = "red", lwd = 2)
curve(mycurve(x, "logistic", c(10, 3, 2, 1)), 10, 40, n = 1001, add = TRUE, col = "blue", lwd = 2, lty = 2)
curve(mycurve(x, "logistic", c(10, 6, 4, 1)), 10, 40, n = 1001, add = TRUE, col = "red", lwd = 2, lty = 2)
legend("right", c(expression(sigma == 3 ~ ", " ~ s == 2), expression(sigma == 6 ~ ", " ~ s == 4)),
bty = "n", col = c("blue", "red"), lwd = 2)
legend("topleft", c(expression(mu == 10), expression(h == 1)), bty = "n")
dev.off()
d.good <- find.day(1, "logistic", c(10, 3, 2, 1))
d.bad <- find.day(1, "logistic", c(10, 6, 4, 1))
get.cases(ceiling(d.good), "logistic", c(10, 3, 2, 1))
get.cases(ceiling(d.bad), "logistic", c(10, 6, 4, 1))
criterion.log.rest <- function(pars, skew, observed, fun = function(x) mean(x^2)) {
theor <- mycurve(x = 1:length(observed), distrib = "logistic", pars = c(pars[1:2], skew, pars[3]))
return(fun(theor - observed))
}
set.seed(1); cat("Russia logistic: least squares\n")
opt.log.nls <- hydroPSO(par = start.pars[[8]]$logistic[c(1, 2, 4)],
fn = function(x) criterion.log.rest(x, 1, ds[, "drus"], rls),
lower = start.pars[[8]]$logistic[c(1, 2, 4)] / 4,
upper = start.pars[[8]]$logistic[c(1, 2, 4)] * 4, control = contt3)
set.seed(1); cat("Russia logistic: least absolute deviations\n")
opt.log.nla <- hydroPSO(par = start.pars[[8]]$logistic[c(1, 2, 4)],
fn = function(x) criterion.log.rest(x, 1, ds[, "drus"], nla),
lower = start.pars[[8]]$logistic[c(1, 2, 4)] / 4,
upper = start.pars[[8]]$logistic[c(1, 2, 4)] * 4, control = contt3)
rus.nls <- c(opt.log.nls$par[1:2], 1, opt.log.nls$par[3])
rus.nla <- c(opt.log.nla$par[1:2], 1, opt.log.nla$par[3])
rus.nls.cases <- get.cases(nrow(ds), "logistic", rus.nls)
rus.nla.cases <- get.cases(nrow(ds), "logistic", rus.nla)
c(ds[nrow(ds), "rus"], ds[nrow(ds), "rus"] - rus.nls.cases, ds[nrow(ds), "rus"] - rus.nla.cases)
# Least absolute deviations is better
russia <- matrix(NA, nrow = 8, ncol = 4)
mycols <- rainbow(length(cs), end = 0.8, v = 0.8)
for (i in 1:7) {
these.pars <- c(rus.nla[1], rus.nla[2] * best.pars[i, 3], best.pars[i, 3], rus.nla[4])
to1 <- ceiling(find.day(1, "logistic", these.pars))
to14 <- ceiling(find.day(1/14, "logistic", these.pars))
c1 <- get.cases(to1, "logistic", these.pars)
c14 <- get.cases(to14, "logistic", these.pars)
russia[i, ] <- c(to1, to14, c1, c14)
}
these.pars <- c(rus.nla[1], rus.nla[2] * mean(best.pars[, 3]), mean(best.pars[, 3]), rus.nla[4])
to1 <- ceiling(find.day(1, "logistic", these.pars))
to14 <- ceiling(find.day(1/14, "logistic", these.pars))
c1 <- get.cases(to1, "logistic", these.pars)
c14 <- get.cases(to14, "logistic", these.pars)
russia[8, ] <- c(to1, to14, c1, c14)
png("plot7-russia-coronavirus-covid-19-cases-forecast-daily.png", 640, 400, pointsize = 14)
plot(ds[, "drus"], xlim = c(1, 305), ylim = c(0, max(ds[, "drus"])), type = "l",
lwd = 3, bty = "n", main = "Conditional forecast COVID-19 pandemic in Russia for 2020",
xlab = "", ylab = "Daily cases", xaxt = "n")
axis(1, seq(1, 305, length.out = 11), c(month.name[3:12], ""), las = 2)
for (i in 1:7) {
these.pars <- c(rus.nla[1], rus.nla[2] * best.pars[i, 3], best.pars[i, 3], rus.nla[4])
lines(nrow(ds):305, mycurve(nrow(ds):305, "logistic", these.pars), lwd = 2, col = mycols[i])
}
these.pars <- c(rus.nla[1], rus.nla[2] * mean(best.pars[, 3]), mean(best.pars[, 3]), rus.nla[4])
lines(nrow(ds):305, mycurve(nrow(ds):305, "logistic", these.pars), lwd = 3, lty = 2, col = "black")
legend("topright", paste0(c(csn[1:7], "Average"), " scenario"),
col = c(mycols[1:7], "black"), lwd = 2, lty = c(rep(1, 7), 2), bty = "n")
dev.off()
png("plot8-russia-coronavirus-covid-19-cases-forecast-cumulative.png", 640, 400, pointsize = 14)
plot(cumsum(ds[, "drus"]), xlim = c(1, 305), ylim = c(0, max(russia)),
type = "l", lwd = 3, bty = "n", main = "Forecast dynamics of COVID-19 pandemic in Russia",
xlab = "", ylab = "Total cases, thousands", xaxt = "n", yaxt = "n")
axis(1, seq(1, 305, length.out = 11), c(month.name[3:12], ""), las = 2)
axis(2, seq(0, 6e5, 1e5), as.character(seq(0, 6e2, 1e2)), las = 2)
for (i in 1:7) {
these.pars <- c(rus.nla[1], rus.nla[2] * best.pars[i, 3], best.pars[i, 3], rus.nla[4])
lines(nrow(ds):305, cumsum(mycurve(nrow(ds):305, "logistic", these.pars)) + ds[nrow(ds)-1, "rus"],
lwd = 2, col = mycols[i])
}
these.pars <- c(rus.nla[1], rus.nla[2] * mean(best.pars[, 3]), mean(best.pars[, 3]), rus.nla[4])
lines(nrow(ds):305, cumsum(mycurve(nrow(ds):305, "logistic", these.pars)) + ds[nrow(ds)-1, "rus"],
lwd = 3, lty = 2, col = "black")
legend("topleft", paste0(c(csn[1:7], "Average"), " scenario"),
col = c(mycols[1:7], "black"), lwd = 2, lty = c(rep(1, 7), 2), bty = "n")
dev.off()
mean(best.pars[, 3])
colnames(russia) <- colnames(dcases)[2:5]
row.names(russia) <- c(paste0(csn[1:7], " scenario"),
paste0("Average scenario (s = ", round(mean(best.pars[, 3]), 2), ")"))
stargazer(russia, type = "html", out = "russia-final.html", summary = FALSE)
References
Fernández, C., & Steel, M. F. J. (1998). On Bayesian modeling of fat tails and skewness. Journal of the American Statistical Association, 93(441), 359–371.