On December 31, 2019, the World Health Organization (WHO) was alerted to several cases of pneumonia of unknown etiology in Wuhan, the capital city of Hubei Province in Central China. A novel coronavirus (2019-nCoV) was identified as the causative virus by Chinese authorities on January 7, 2020 (1), and China CDC has named the associated disease novel coronavirus-infected pneumonia (NCIP) (2-3). As of January 23, 2020, the National Health Commission (NHC) of China had confirmed a total of 830 cases of NCIP in Mainland China, including 177 in critical condition, 25 fatalities, and 34 recoveries. In Wuhan, the origin of the NCIP outbreak, 495 cases, including 24 fatalities, have been confirmed. In addition to the cases in Mainland China, two cases have been detected in Hong Kong Special Administrative Region, China, two in Macao Special Administrative Region, China, one in Taiwan, China, and a total of nine cases have been detected outside China in Thailand (4 cases), Vietnam (2 cases), USA (1 case), Japan (1 case), Republic of Korea (1 case) and Singapore (1 case) (4).
The current epidemiological information indicates that most of the global cases were directly imported from Wuhan. Therefore, a careful and precise understanding of the total number of cases in Wuhan is crucial for decision making and prevention of NCIP. Considerable resources have already been invested in Wuhan to combat the spread of NCIP. However, estimating the magnitude of the epidemic in Wuhan from the reported number of confirmed cases is difficult because of the virus's lengthy incubation period and variable symptom presentation (sometimes without fever or other symptoms), overburdened medical resources and personnel, and the additional time needed to receive test results from China's NHC and China CDC. In this article, a method for estimating the total number of NCIP-onset cases within Wuhan is proposed using the number of cases detected outside Hubei Province.
-
A total of 3,933 cases of NCIP with symptom onset by January 19, 2020 were estimated in Wuhan (95% confidence interval [CI]: 3,454–4,450). An earlier estimate of 1,723 cases (95% CI: 427–4,471), based on a statistical model (5), was given by another research team on January 12, 2020. Compared with that model (5), the present model is improved by 1) including more data from regions outside Wuhan rather than just the three (now nine) confirmed cases outside China used in that model (5), which yields a much narrower CI; and 2) letting the probability of traveling to region $i$, namely $p_i$, differ across regions $i$ rather than being a constant as in that model (5), which makes the model more realistic and detailed.
-
The proposed model is based on the following assumptions:
1. Wuhan International Airport has a catchment population of 19 million individuals.
2. There is, on average, a $d$ = 10-day window between infection and detection, which includes a 5- to 6-day incubation period and a 4- to 5-day delay from symptom onset to detection.
3. Trip durations are long enough that a traveling patient infected in Wuhan will develop symptoms and be detected in other places rather than after returning to Wuhan.
4. All travelers departing from Wuhan, including transfer passengers, have the same risk of infection as local residents.
5. We only consider symptomatic cases with disease severity of a level that can be detected and do not consider asymptomatic or mild cases.
6. Traveling is independent of the exposure risk to 2019-nCoV or of infection status.
7. Patient recoveries are not considered in the model.
8. The proportion used to scale air travel volume up to total passenger volume is constant across regions.
Aside from Assumption 8, the same assumptions were also made in the previous model (5). Assumption 7 was not explicitly stated in the previous model (5) but is implicitly required. Some of the above assumptions are unrealistic, but the data needed to account for these assumptions are not currently available. The following points are further noted:
(i) Violation of Assumption 2 (e.g., the mean time from infection to detection is longer than 10 days) would cause an overestimation of the total number of cases in Wuhan.
(ii) Violation of Assumption 4 (e.g., travelers have a lower risk of infection than residents in Wuhan) would cause an underestimation.
(iii) Violation of Assumption 6 (e.g., infected individuals are less likely to travel due to the health condition) would cause an underestimation.
(iv) Given that there were very few recoveries before January 19, 2020, Assumption 7 should not significantly influence the outcome.
Assumptions
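The direction of the biases in points (i) to (iii) can be read off the point estimator given in Appendix A. The following is only a sketch: the symbols $d^{*}$ (a hypothetical true mean time from infection to detection) and $r$ (a hypothetical relative infection risk or travel propensity of travelers compared with residents) are introduced here for illustration and are not part of the original model.

$$ {\hat N_K} = \frac{\sum\nolimits_{i = 1}^m {X_{K + d,i}}}{d \times \sum\nolimits_{i = 1}^m {p_i}} $$

If the true window is $d^{*} > d$, the expected number of detected cases is approximately $d^{*} \times N_K \sum\nolimits_{i = 1}^m {p_i}$, so that

$$ E\left[ {\hat N_K} \right] \approx \frac{d^{*}}{d}{N_K} > {N_K} $$

which is an overestimation. If travelers carry the infection (or travel while infected) at a rate $r < 1$ relative to residents, the expected number of detected cases is approximately $d \times r \times N_K \sum\nolimits_{i = 1}^m {p_i}$, so that

$$ E\left[ {\hat N_K} \right] \approx r \times {N_K} < {N_K} $$

which is an underestimation.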
-
Table 1 lists the top four regions outside of Hubei Province with a relatively large number of reported confirmed cases alongside the corresponding maximum seating capacity for flights from Wuhan. The number of confirmed cases is positively related to passenger volume from Wuhan. Hence, the following model was considered: the number of imported cases $X_{K+d,i}$ from Wuhan to region $i$ by Day $(K+d)$ follows a Binomial($10 N_K$, $p_i$) distribution, $i = 1, 2, \ldots, m$, where $N_K$ is the total number of cases in Wuhan by Day $K$ to be estimated, $p_i$ is the daily probability of traveling from Wuhan to region $i$, which can be estimated using the ratio of the daily volume of passengers to the catchment population of Wuhan airport, and $d$ is the mean time from infection to detection (see details of the model explanation in Appendix A). The calculated daily number of travelers based on flight capacity is further described in Appendix B.
| Region | Total seats | Cases |
|---|---|---|
| Guangdong | 111,624 | 53 |
| Zhejiang | 46,528 | 43 |
| Beijing | 59,364 | 26 |
| Shanghai | 51,517 | 20 |

Table 1. Number of confirmed cases and seating capacity for 4 regions in China.
Determining the number of imported cases in region $i$, namely $X_{K+d,i}$, plays a crucial role in the modeling procedure. Table 2 shows the number of reported confirmed cases in various provinces/cities/countries (excluding Hubei Province) within and outside of China on January 23, 2020. The column titled "No. of local cases" indicates the number of cases that were not directly imported from Wuhan. Despite the rapid spread of the epidemic, the current situation outside Hubei Province is relatively controlled given the adequate medical support being allocated towards the current outbreak. This suggests that the number of reported cases outside Hubei, as of January 23, 2020, is a fairly accurate representation of the actual epidemic situation in the surrounding regions. Note that only cases directly imported from Wuhan were considered. For example, among the 53 confirmed cases reported in Guangdong Province, 8 were local cases, so the actual number of imported cases, $X_{K+d,i}$, was regarded as 45. Moreover, the one patient in Singapore departed from the airport in Guangzhou; hence, it was a non-directly imported case and the corresponding $X_{K+d,i}$ is 0. Furthermore, observations from a few nearby provincial-level administrative divisions (PLADs), including Hunan, Anhui, Henan, Jiangxi, and Tibet, and from other cities within Hubei Province were dropped because of challenges with estimating the daily probability of travel without air transportation data from Wuhan.
| Region | No. of cases | No. of local cases |
|---|---|---|
| Guangdong | 53 | 8 |
| Zhejiang | 43 | |
| Beijing | 26 | |
| Shanghai | 20 | 1 |
| Chongqing | 27 | |
| Sichuan | 15 | |
| Guangxi | 13 | 1 |
| Jiangsu | 9 | |
| Shandong | 9 | 1 |
| Hainan | 8 | |
| Fujian | 5 | |
| Tianjin | 4 | |
| Liaoning | 4 | 1 |
| Heilongjiang | 4 | |
| Jilin | 3 | |
| Shaanxi | 3 | 1 |
| Guizhou | 3 | 1 |
| Ningxia | 2 | |
| Xinjiang | 2 | |
| Gansu | 2 | 1 |
| Yunnan | 1 | |
| Inner Mongolia | 1 | |
| Shanxi | 1 | |
| Qinghai | 0 | |
| Hunan | 24 | 1 |
| Anhui | 15 | |
| Henan | 9 | 1 |
| Jiangxi | 7 | |
| Hebei | 2 | |
| Tibet | 0 | |
| Macau, China | 2 | |
| Hong Kong, China | 2 | |
| Taiwan, China | 1 | |
| Japan | 1 | |
| South Korea | 1 | |
| USA | 1 | |
| Thailand | 3 | |
| Singapore | 1* | |
| Vietnam | 2 | 1 |

*The patient was a resident of Wuhan city but departed from the airport in Guangzhou.

Table 2. Number of reported confirmed cases within (excluding Hubei) and outside China on 23 January 2020.
Using $X_{K+d,i}$ obtained from domestically and internationally reported cases and the corresponding estimated travel probability $p_i$, where $i = 1, 2, \ldots, m$, it is possible to infer the magnitude of comparable cases, $N_K$, within Wuhan that may have occurred by Day $K$ through a binomial model. The maximum likelihood estimate (MLE) of $N_K$ is 3,933 and the corresponding 95% CI is (3,454–4,450). Note that $X_{K+d,i}$ was obtained on 23 January 2020; hence, the estimated $N_K$ is the number of total cases (including those in the incubation period) as of January 14, 2020, or the number of cases with symptom onset by January 19, 2020.
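The estimation step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: it uses only the four regions of Table 1 (so it is not expected to reproduce the 25-region estimate of 3,933), it assumes the seat totals in Table 1 cover the 22-day IATA window described in Appendix B, it treats blank local-case cells in Table 2 as zero, and it replaces exact chi-square quantiles with the Wilson-Hilferty approximation so that only the standard library is needed. All function and constant names are hypothetical.

```python
from statistics import NormalDist

D_DAYS = 10              # mean infection-to-detection window d (Assumption 2)
CATCHMENT = 19_000_000   # catchment population of Wuhan airport (Assumption 1)
LOAD_FACTOR = 0.81       # average load factor (Appendix B)
AIR_SHARE = 0.2647       # share of long trips made by air (Appendix B)
SEAT_WINDOW_DAYS = 22    # ASSUMPTION: Table 1 seat totals span the 22-day IATA window

# (region, total seats, reported cases, local cases) from Tables 1 and 2.
# ASSUMPTION: a blank local-case cell in Table 2 means zero local cases.
REGIONS = [
    ("Guangdong", 111_624, 53, 8),
    ("Zhejiang", 46_528, 43, 0),
    ("Beijing", 59_364, 26, 0),
    ("Shanghai", 51_517, 20, 1),
]


def travel_probability(seats):
    """Daily probability p_i of traveling from Wuhan to a region."""
    daily_passengers = seats * LOAD_FACTOR / AIR_SHARE / SEAT_WINDOW_DAYS
    return daily_passengers / CATCHMENT


def chi2_quantile(q, dof):
    """Wilson-Hilferty approximation to the chi-square quantile (not exact)."""
    z = NormalDist().inv_cdf(q)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3


def estimate_cases(regions, alpha=0.05):
    """Poisson MLE of N_K and an approximate (1 - alpha) interval."""
    imported = sum(cases - local for _, _, cases, local in regions)
    rate_per_case = D_DAYS * sum(travel_probability(s) for _, s, _, _ in regions)
    n_hat = imported / rate_per_case
    lower = chi2_quantile(alpha / 2, 2 * imported) / (2 * rate_per_case)
    upper = chi2_quantile(1 - alpha / 2, 2 * imported + 2) / (2 * rate_per_case)
    return n_hat, lower, upper


if __name__ == "__main__":
    n_hat, lower, upper = estimate_cases(REGIONS)
    print(f"N_K estimate: {n_hat:.0f} (95% CI: {lower:.0f}-{upper:.0f})")
```

Because the interval width is driven by the total count of imported cases, adding more regions to `REGIONS` narrows the interval, which is the mechanism behind the narrower CI reported for the full model.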
-
The number of confirmed cases in Wuhan reported by China's NHC has increased rapidly in recent weeks. However, the 495 cases reported in Wuhan as of January 23, 2020 are still far below our estimate of 3,933. This may be due to insufficient medical resources in Wuhan and Hubei Province given the suddenness of the outbreak. We suggest boosting medical resources through specific measures, such as increasing the number of hospital beds to accommodate all fever patients with pneumonia or a severe respiratory disease in Wuhan, in order to expedite virus testing and to allow the region to respond more adequately to this public health crisis.
-
Assume Day 1 is the date of infection of the very first case. Let $N_j$ denote the number of cases (including those in the incubation period) in Wuhan by Day $j$, $Y_j$ the number of cases traveling to region $l$ on Day $j$, $X_j$ the number of cases detected in region $l$ by Day $j$, $p$ the pre-defined probability of traveling to region $l$ described in Appendix B, and $d$ the mean time from infection to detection (here we suppress the index $l$ for conciseness). Then $Y_j$ follows the binomial distributions listed in Table 3 below. Note that from Day $d+1$ on, the number of trials in the binomial is no longer $N_j$ but $N_j - (N_{j-d} - Y_{j-d})$ under Assumption 2. Because $Y_{j-d}$ is relatively small compared with $N_{j-d}$, we drop $Y_{j-d}$ here for simplicity. Therefore,
| Date | Distribution | Period of $Y_i$ being detected |
|---|---|---|
| Day 1 | $Y_1 \sim \mathrm{Binomial}(N_1, p)$ | $Y_1$ is expected to be detected on Day $d+1$ |
| Day 2 | $Y_2 \sim \mathrm{Binomial}(N_2, p)$ | $Y_2$ is expected to be detected on Day $d+1$ and Day $d+2$ |
| $\vdots$ | $\vdots$ | $\vdots$ |
| Day $d$ | $Y_d \sim \mathrm{Binomial}(N_d, p)$ | $Y_d$ is expected to be detected between Day $d+1$ and Day $2d$ |
| Day $d+1$ | $Y_{d+1} \sim \mathrm{Binomial}(N_{d+1} - N_1, p)$ | $Y_{d+1}$ is expected to be detected between Day $d+2$ and Day $2d+1$ |
| $\vdots$ | $\vdots$ | $\vdots$ |
| Day $2d-1$ | $Y_{2d-1} \sim \mathrm{Binomial}(N_{2d-1} - N_{d-1}, p)$ | $Y_{2d-1}$ is expected to be detected between Day $2d$ and Day $3d-1$ |
| Day $2d$ | $Y_{2d} \sim \mathrm{Binomial}(N_{2d} - N_d, p)$ | $Y_{2d}$ is expected to be detected between Day $2d+1$ and Day $3d$ |

Table 3. Binomial distributions on Day $i$.
$$ \sum\limits_{j = 1}^K {Y_j} \sim \mathrm{Binomial}\left( {\sum\limits_{j = K - d + 1}^K {N_j},\;p} \right),\;\;K > d $$

However, note that $Y_j$ would not be directly observed on Day $j$ or on any other single day but would be detected over the period listed in Table 3. For example, suppose that $N_K$ is of interest; then $\sum\nolimits_{i = 1}^K {Y_i}$ needs to be calculated. Note that $Y_1, \ldots, Y_K$ would all be included in $X_{K+d}$, but $\sum\nolimits_{i = 1}^K {Y_i} \le {X_{K + d}}$ because the observed $X_{K+d}$ would also include parts of $Y_{K+1}, \ldots, Y_{K+d-1}$. A straightforward but rough way to approximate $\sum\nolimits_{i = 1}^K {Y_i}$ is to use $X_{K+d/2}$. The other problem is that, with such a binomial model, what can be estimated is $\sum\nolimits_{i = K - d + 1}^K {N_i}$ rather than a single $N_i$. We suggest using $\sum\nolimits_{i = K - d + 1}^K {N_i}/d$ as an estimate of $N_{K-d/2}$, that is,

$$ {X_{K + d/2}} \sim \mathrm{Binomial}\left( {d \times {N_{K - d/2}},\;p} \right),\;\;K > d $$

A binomial distribution can be approximated by a Poisson distribution if the number of trials is large while the probability of success is small. Hence,

$$ {X_{K + d}} \approx \mathrm{Poisson}\left( {d \times p \times {N_K}} \right),\;\;K > d/2 $$

Including multiple regions in the model, we have

$$ {X_{K + d,i}} \approx \mathrm{Poisson}\left( {d \times {p_i} \times {N_K}} \right)\;\;{\rm{for}}\;i = 1,2, \cdots ,m, $$

and therefore,

$$ \sum\limits_{i = 1}^m {X_{K + d,i}} \sim \mathrm{Poisson}\left( {d \times {N_K}\sum\limits_{i = 1}^m {p_i}} \right) $$

where $m = 25$ is the total number of regions used in our model. Note that if $p_i = p$ for all $i$, our model is almost identical to the previous model (5). The total number of cases on Day $K$, $N_K$, is estimated by its maximum likelihood estimate (MLE), that is,

$$ {\hat N_K} = \frac{{\sum\nolimits_{i = 1}^m {X_{K + d,i}}}}{{d \times \sum\nolimits_{i = 1}^m {p_i}}} $$

and the corresponding $(1 - \alpha)$ CI is derived using the relation between the Poisson distribution and the chi-square distribution (6):

$$ \left( {\frac{{\chi_{2\left( {\sum\nolimits_{i = 1}^m {X_{K + d,i}}} \right),\;\alpha /2}^2}}{{2 \times d\sum\nolimits_{i = 1}^m {p_i}}},\;\frac{{\chi_{2\left( {\sum\nolimits_{i = 1}^m {X_{K + d,i}}} \right) + 2,\;1 - \alpha /2}^2}}{{2 \times d\sum\nolimits_{i = 1}^m {p_i}}}} \right) $$

-
The daily probability of traveling from Wuhan to region $i$, $p_i$, can be estimated using the ratio of the daily volume of passengers to region $i$ to the catchment population of Wuhan airport. Below are the details for obtaining the daily volume of passengers to region $i$.
There were a total of 7,122 flights from Wuhan to 84 airports in Mainland China in the 30 days from December 22, 2019 to January 20, 2020, of which 6,586 flights were to the top 50 destinations, accounting for 6,586/7,122 = 92.47% of the total volume (7). Meanwhile, IATA data reported 854,383 seats on flights to the top 50 destinations in the 22 days between December 30, 2019 and January 20, 2020 (8). Hence, the average number of seats on a single flight can be estimated as 854,383/(6,586×22/30) = 177. Over the Spring Festival/Lunar New Year period, Wuhan airport is expected to handle 24,600 flights and 3.52 million passengers in 40 days (9); thus, each flight is expected to carry on average 3,520/24.6 = 143 passengers, which gives an average load factor for flights departing from Wuhan of 143/177 = 0.81. Therefore, the total volume of air travel during the Spring Festival/Lunar New Year period can be estimated as 854,383×0.81/0.9247/22×40 = 1.35 million. In addition, based on historical evidence, 15 million passengers are expected to depart Wuhan by rail, road, and air, 66% of whom are estimated to travel within 300 km (10), leaving 34% on longer trips. That would imply, on average, that 135/(1,500×0.34) = 26.47% of trips longer than 300 km would be by air. Therefore, the total passenger volume from Wuhan to other regions in Mainland China can be calculated as the number of seats × 0.81/0.2647. Note that Hainan Province is a special case because of its geographical location: a majority of passengers from Wuhan to Hainan Province will likely travel by air. As a result, we use the number of seats × 0.81 for Hainan Province. For other international regions, we use the estimate of 3,301 passengers per day given by the previous model (5).
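The chain of arithmetic above can be retraced with a short Python sketch. It is illustrative only: the function and key names are hypothetical, the 34% long-trip share is taken from the divisor used in the text, and because the code keeps full precision at each step, its results differ slightly from the rounded figures quoted above (177, 143, 0.81, 26.47%).

```python
FLIGHTS_ALL = 7_122            # flights from Wuhan in 30 days (7)
FLIGHTS_TOP50 = 6_586          # of which to the top 50 destinations (7)
SEATS_TOP50 = 854_383          # seats to the top 50 destinations in 22 days (8)
FESTIVAL_FLIGHTS = 24_600      # expected flights over the 40-day festival period (9)
FESTIVAL_PASSENGERS = 3_520_000  # expected air passengers over the same period (9)
TOTAL_DEPARTURES = 15_000_000  # expected departures by rail, road, and air (10)
LONG_TRIP_SHARE = 0.34         # share of trips longer than 300 km used in the text


def appendix_b_factors():
    """Recompute the intermediate quantities without rounding between steps."""
    top50_share = FLIGHTS_TOP50 / FLIGHTS_ALL
    # Flight counts cover 30 days but seat counts cover 22, hence the 22/30 scaling.
    seats_per_flight = SEATS_TOP50 / (FLIGHTS_TOP50 * 22 / 30)
    passengers_per_flight = FESTIVAL_PASSENGERS / FESTIVAL_FLIGHTS
    load_factor = passengers_per_flight / seats_per_flight
    # Scale top-50 seats to all destinations, to passengers, and to 40 days.
    air_volume_40_days = SEATS_TOP50 * load_factor / top50_share / 22 * 40
    air_share = air_volume_40_days / (TOTAL_DEPARTURES * LONG_TRIP_SHARE)
    return {
        "top50_share": top50_share,
        "seats_per_flight": seats_per_flight,
        "passengers_per_flight": passengers_per_flight,
        "load_factor": load_factor,
        "air_volume_40_days": air_volume_40_days,
        "air_share": air_share,
    }


def total_passengers(seats, air_only=False):
    """Total passenger volume for a region, using the rounded factors in the text.

    air_only=True mirrors the Hainan special case (no scaling up from air travel).
    """
    air_passengers = seats * 0.81
    return air_passengers if air_only else air_passengers / 0.2647


if __name__ == "__main__":
    for name, value in appendix_b_factors().items():
        print(f"{name}: {value:,.4f}")
```

Dividing `total_passengers(seats)` by the number of days covered by the seat count and by the 19 million catchment population then gives the daily travel probability $p_i$ for a region.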
Appendix A
Appendix B