Cite this as: Amini MH, Menhaj MB, Talebi HA (2021) A modified CNN-based Covid-19 detection using CXR. Arch Community Med Public Health 7(2): 142-145. DOI: 10.17352/2455-5479.000154
In this paper, we present a deep neural network for detecting COVID-19 from chest X-ray (CXR) images. Because this pandemic has emerged only recently, no large dataset is available for it, so care must be taken not to use methods with high variance. A deep neural network, however, usually needs a very large amount of data to reach acceptable performance; otherwise, issues such as overfitting may arise. To resolve this problem, we use transfer learning. Training a deep neural network with transfer learning on 2 datasets available on the web, we achieved a COVID-19 detection accuracy of 98% on about 1000 test samples.
COVID-19 has affected the whole world for months, and nearly everyone has had to deal with it. A particular difficulty is that an infected person may show no symptoms for about 2 weeks while still being able to transmit the disease to others, which drives its rapid spread. Timely detection of COVID-19 can therefore save many lives.
Two of the main methods of detecting COVID-19 are the examination of chest X-rays and of CT images. Although CT images may yield better performance, chest X-rays are also widely used, in part because they are cheaper and less hazardous.
In this paper, we classify CXR images into two categories: COVID-19 present or not. We therefore have a binary hypothesis testing problem. Although the current datasets would also allow us to go further and, for example, detect other diseases from CXR images, we focus only on detecting COVID-19.
We use deep convolutional neural networks for this problem. Although these networks can achieve high accuracy, using them raises some fundamental challenges. One is interpretability: we cannot really understand how the network arrives at its classification, so we have little insight into its false negatives and false positives. Preparing large datasets with trustworthy labels can nevertheless improve performance.
Do these challenges mean that we should not use deep neural networks in such cases? We strongly believe they do not. At the very least, these networks can assist doctors and speed up detection for some patients. For example, a patient predicted positive may be seen sooner by a doctor for further examination. This helps prioritize patients, especially when many people are waiting to be examined and only a few doctors are available.
Since we are working with images, a natural choice of network structure is the well-known Convolutional Neural Network (CNN). However, creating a new CNN and training it from scratch requires a dataset far larger than those available for COVID-19. A CNN trained from scratch on the current COVID-19 datasets does not generalize well; it may either underfit or overfit. Transfer learning helps us overcome this problem.
As mentioned earlier, training a convolutional neural network from scratch on the current COVID-19 datasets is not a good idea. On the other hand, there are plenty of deep convolutional neural networks that have been trained on large image datasets and perform well on object recognition. These networks, such as Inception, ResNet, MobileNet, VGG16, and VGG19, are trained on datasets of millions of images and can classify hundreds of different objects with good performance. This performance indicates that their convolutional layers, whose task is feature extraction, do their job well. So why not use these convolutional layers for our task of COVID-19 detection? Doing so may help our network extract useful features and thereby improve its performance considerably, avoiding the aforementioned issues and generalizing well. We therefore use the convolutional layers of a pre-trained deep neural network as the convolutional layers of our network, and we do not need to train them any further. We simply add some fully-connected layers after them to perform the classification task of COVID-19 detection.
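The idea of keeping a feature extractor fixed and training only a small classifier on top of it can be illustrated with a toy sketch. This is not the network used in this paper: `frozen_features` is a hypothetical stand-in for pre-trained convolutional layers, the "images" are synthetic 4x4 grids, and the head is a single logistic unit trained by plain gradient descent.

```python
import math
import random

def frozen_features(image):
    """Hypothetical stand-in for pre-trained convolutional layers.

    It has no trainable parameters, so training never changes it.
    """
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return [mean, var]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def head_output(x, w, b):
    """Trainable classification head: one logistic unit on the fixed features."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_head(features, labels, epochs=1000, lr=0.5):
    """Gradient descent on the mean binary cross-entropy, updating only w and b."""
    w, b = [0.0] * len(features[0]), 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in zip(features, labels):
            err = head_output(x, w, b) - y  # d(loss)/d(logit) for cross-entropy
            grad_w = [g + err * xi / n for g, xi in zip(grad_w, x)]
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def toy_image(low, high, rng, size=4):
    """Synthetic image with pixel values drawn uniformly from [low, high]."""
    return [[rng.uniform(low, high) for _ in range(size)] for _ in range(size)]

rng = random.Random(0)
images = ([toy_image(0.0, 0.4, rng) for _ in range(20)]
          + [toy_image(0.6, 1.0, rng) for _ in range(20)])
labels = [0] * 20 + [1] * 20

features = [frozen_features(img) for img in images]  # computed once, never retrained
w, b = train_head(features, labels)
```

Because the extractor is fixed, its outputs can be computed once and reused in every epoch, and only the few head parameters (`w`, `b`) need to be estimated from the small dataset.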
Table 1 lists some of the pre-trained models and their properties.
Since we defined our problem as the detection of COVID-19 from CXR images, we have a binary hypothesis testing, or binary classification, task. For the loss function, we choose the binary cross-entropy:
$$\ell = -y \log\big(h(X)\big) - (1 - y) \log\big(1 - h(X)\big) \tag{1}$$
where y is the target, h is the function computed by our neural network, and X is the CXR image. The main reason for choosing this loss function is that we want the output to be the probability that the patient has COVID-19. Minimizing this loss is equivalent to maximum likelihood estimation under a Bernoulli model of the label, so the network output h(X) can be interpreted as an estimate of the probability of infection by COVID-19.
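As a small numeric illustration of the loss in (1), the following sketch evaluates it for a single sample. The clipping constant `eps` is an assumption added here to avoid taking the logarithm of zero; it is not part of the formula.

```python
import math

def binary_cross_entropy(y, p, eps=1e-12):
    """Loss (1) for one sample: y is the 0/1 target, p = h(X) the predicted probability."""
    p = min(max(p, eps), 1.0 - eps)  # keep log() finite at p = 0 or p = 1
    return -y * math.log(p) - (1 - y) * math.log(1 - p)

confident_and_right = binary_cross_entropy(1, 0.9)  # about 0.105
confident_and_wrong = binary_cross_entropy(1, 0.1)  # about 2.303
```

The loss is small when the predicted probability agrees with the label and grows quickly when the network is confidently wrong.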
Neural network structure: For the implementation, we use a pre-trained VGG-19 network. We keep its convolutional layers, discard the rest, and add 2 fully-connected layers on top. We train only the fully-connected layers and not the convolutional layers, since the latter have already been trained and extract features well. Fine-tuning the last convolutional layers could improve performance further, but we achieved acceptable performance without doing so, so we left them frozen.
Our network architecture is shown in Figure 1.
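To give a sense of how many parameters stay frozen and how many are trained, the sketch below counts them from the standard VGG-19 layout (sixteen 3x3 convolutions in five blocks). The head dimensions are assumptions for illustration only: the text does not state the input size or the width of the fully-connected layers, so a 224x224 input (giving a 7x7x512 feature map) and 256 hidden units are hypothetical choices.

```python
# (output_channels, number_of_3x3_conv_layers) for the five VGG-19 blocks
VGG19_BLOCKS = [(64, 2), (128, 2), (256, 4), (512, 4), (512, 4)]

def conv_params(in_ch, out_ch, k=3):
    """Weights plus biases of one k x k convolution layer."""
    return k * k * in_ch * out_ch + out_ch

def vgg19_conv_param_count(in_ch=3):
    """Total parameters in the convolutional part of VGG-19 (frozen here)."""
    total = 0
    for out_ch, n_layers in VGG19_BLOCKS:
        for _ in range(n_layers):
            total += conv_params(in_ch, out_ch)
            in_ch = out_ch
    return total

def dense_params(n_in, n_out):
    """Weights plus biases of one fully-connected layer."""
    return n_in * n_out + n_out

def head_param_count(n_features, hidden_units):
    """Two fully-connected layers: hidden layer, then a single output unit."""
    return dense_params(n_features, hidden_units) + dense_params(hidden_units, 1)

frozen = vgg19_conv_param_count()
# Assumed: 224x224 input, five 2x2 poolings -> 7x7x512 features; 256 hidden units.
trainable = head_param_count(7 * 7 * 512, 256)

print(frozen)     # 20024384
print(trainable)  # 6423041
```

Under these assumed head dimensions, roughly three quarters of the parameters are never updated, which is what keeps the number of quantities estimated from the small COVID-19 dataset manageable.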
We used 2 datasets available on the web. One is the COVID-19 image data collection available on GitHub; the other is the CoronaHack - Chest X-Ray-Dataset available on Kaggle [1,2]. We gathered the images of both datasets. Although the two datasets label different features, we simply divided the images into 2 classes according to whether the patient has COVID-19 or not. Our null hypothesis therefore consists of normal cases as well as patients with diseases other than COVID-19.
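The merging of heterogeneous labels into two classes can be sketched as follows. The label strings are hypothetical examples; the actual column names and label values in the two datasets are not given in the text and would need to be checked against their metadata files.

```python
def to_binary_label(finding):
    """Return 1 if the (hypothetical) label string mentions COVID-19, else 0."""
    return 1 if "covid-19" in finding.strip().lower() else 0

# Hypothetical raw labels as they might appear after gathering both datasets.
raw_labels = ["COVID-19", "Normal", "Pneumonia (bacterial)", "covid-19 ", "SARS"]
binary_labels = [to_binary_label(s) for s in raw_labels]

print(binary_labels)  # [1, 0, 0, 1, 0]
```

Normal cases and all non-COVID-19 diseases fall into class 0, matching the null hypothesis described above.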
We chose 4718 training samples and 1152 test samples. Of these, 330 training samples and 73 test samples are COVID-19 cases. Note that these are the raw samples from the two datasets. Techniques such as random jittering can increase the number of samples. Although this may not add information, it can help the model generalize better.
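One simple reading of random jittering is a small random translation of each image. The sketch below is an assumption about what such an augmentation could look like, not the procedure used in this paper; it works on an image stored as a list of rows and fills uncovered pixels with zeros.

```python
import random

def random_jitter(image, max_shift=2, rng=random):
    """Return a copy of a 2D image shifted by a random offset, zero-filled at the edges."""
    h, w = len(image), len(image[0])
    dy = rng.randint(-max_shift, max_shift)
    dx = rng.randint(-max_shift, max_shift)
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            src_r, src_c = r - dy, c - dx
            if 0 <= src_r < h and 0 <= src_c < w:
                out[r][c] = image[src_r][src_c]
    return out

def augment(images, copies=3, max_shift=2, rng=random):
    """Keep each original image and add `copies` jittered versions of it."""
    augmented = []
    for img in images:
        augmented.append(img)
        augmented.extend(random_jitter(img, max_shift, rng) for _ in range(copies))
    return augmented
```

For example, `augment(train_images, copies=3)` would turn N raw images into 4N samples, where `train_images` is a hypothetical list of images in this format.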
With 10 epochs of training, we obtained an average accuracy of about 98% on our test set. Table 2 gives the training details.
Figures 2 and 3 show some examples of our model's predictions. As can be seen, almost all COVID-19 cases in the test set are predicted with a probability above 0.6 (most of them above 0.9).
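Predicted probabilities become class decisions once a threshold is chosen, and false negatives and false positives can then be counted directly. The sketch below shows one way to do this; the probabilities and labels are made-up toy values, not results from this paper, and the threshold of 0.6 is only borrowed from the observation above. The last lines use the test-set counts stated earlier (73 COVID-19 cases among 1152 samples) to compute the accuracy of always predicting the negative class.

```python
def confusion_counts(probs, labels, threshold=0.5):
    """Count true/false positives and negatives at a given probability threshold."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def summarize(tp, fp, tn, fn):
    """Accuracy, sensitivity (recall on positives) and specificity."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }

# Hypothetical toy predictions, for illustration only.
toy_probs = [0.95, 0.70, 0.40, 0.10, 0.05, 0.65]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_metrics = summarize(*confusion_counts(toy_probs, toy_labels, threshold=0.6))

# Counts from the text: 73 COVID-19 cases among 1152 test samples.
all_negative_accuracy = (1152 - 73) / 1152  # about 0.937
```

Since a classifier that always answers "no COVID-19" would already score about 93.7% accuracy on a test set with this class balance, sensitivity and specificity are useful companions to the accuracy figure.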