In silico design of angiotensin-converting enzyme 2 (ACE2) recombinant protein to block the S1 protein pathway of COVID-19 virus

Coronaviruses are a large family of viruses that includes common cold viruses and the SARS virus. The novel coronavirus is a respiratory virus that emerged in late 2019 and early 2020 in Wuhan, Hubei province, China, and the disease it causes became known as COVID-19. The COVID-19 virus genome is a positive-sense single-stranded RNA (ssRNA(+)) of 29,903 nucleotides that encodes twelve different proteins. One of these is the S-protein. During the infection cycle, the S-protein is cleaved into two subunits, S1 and S2. The S1 subunit, which contains the Receptor Binding Domain (RBD), binds directly to the protease domain of the Angiotensin-Converting Enzyme 2 (ACE2) protein, and the virus enters the cell through it. In this study, the ACE2 protein sequence was first extracted from the NCBI site. To convert the protein into an extracellular protein that is secreted from the cell, a signal peptide sequence was added to the beginning of the recombinant protein, and two amino acids, cysteine and asparagine, were added to both sides of the signal peptide sequence to create a self-catalyzing process similar to that found in inteins. A motif with the amino acid sequence F-H-L, which is recognizable by ACE2-group proteases, was then added incompletely to both sides of the signal peptide sequence. In addition, the amino acids involved in direct interaction between the two subunits of the ACE2 protein were removed from the amino acid sequence to prevent dimerization. Finally, to improve the interaction between the designed ACE2 protein and the viral S1 protein, the physicochemical properties of the designed protein were enhanced using the ProtParam and GPMAW sites.


Introduction
In the vernacular, the virus is called the Wuhan coronavirus. After the number of victims of the coronavirus exceeded 1,000, the World Health Organization (WHO) chose the official name COVID-19 for the disease, which refers to Corona, Virus, Disease and 2019. The COVID-19 virus genome is a positive-sense single-stranded RNA (ssRNA(+)) [1,2] that encodes twelve different proteins. One of these proteins is the S-protein. During the infection cycle, the S-protein is cleaved into two subunits, S1 and S2. The S1 subunit, which contains the Receptor Binding Domain (RBD), binds directly to the protease domain of the angiotensin-converting enzyme 2 (ACE2) protein. The ACE2 protein acts as a homodimer. The individuals is more likely at this time. The basic reproduction number (R0) [17] of this virus is estimated to be between 1.4 and 3.9 [18][19][20][21][22]. This means that, if left unchecked, each infected person will typically cause 1.4 to 3.9 new infections. The virus has been shown to be able to pass along a transmission chain of at least four people [23].
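As a rough illustration of what an R0 between 1.4 and 3.9 implies, the following sketch computes the expected number of new cases in each successive generation of an unchecked transmission chain that starts from a single case. It assumes a constant R0, no depletion of susceptible individuals and no interventions. The function name is illustrative and is not taken from the cited references.

```python
def expected_cases_by_generation(r0, generations):
    """Expected new cases in generations 0..generations, starting from one case.

    Simplifying assumption: every case causes exactly r0 new cases on average,
    so generation n contains r0 ** n expected cases.
    """
    if r0 < 0 or generations < 0:
        raise ValueError("r0 and generations must be non-negative")
    return [r0 ** n for n in range(generations + 1)]


# Lower and upper R0 estimates quoted in the text, over four generations.
low = expected_cases_by_generation(1.4, 4)
high = expected_cases_by_generation(3.9, 4)
print([round(x, 2) for x in low])
print([round(x, 2) for x in high])
```

Under these assumptions, the fourth generation contains about 3.8 expected cases at R0 = 1.4 and about 231 at R0 = 3.9, which shows how strongly the outcome depends on where in the estimated range the true value lies.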
The new coronavirus appears to be less dangerous than SARS and is severe in 15 to 20 percent of cases. Preliminary estimates suggest that the mortality rate for the virus is between 2 and 3 percent [24]. The World Health Organization has published several protocols for testing for the disease [25,26]. The standard test method is real-time reverse transcription polymerase chain reaction (rRT-PCR) [27]. This test is performed on respiratory samples obtained by a variety of methods, including a pharyngeal swab or a sputum sample [28]. Results are generally available within a few hours to two days [29,30]. Blood tests can also be used, but these require two blood samples taken two weeks apart, and the results are not of immediate value [31]. Chinese scientists were able to isolate a strain of the coronavirus and release its nucleic acid sequence so that laboratories around the world could independently develop Polymerase Chain Reaction (PCR) tests to detect viral infection [32][33][34][35][36][37].
In this study, an attempt was made to develop a preventive approach against COVID-19 by designing a recombinant ACE2 receptor. The ACE2 protein plays a key role in the entry of the virus into the cell. The designed ACE2 protein follows the secretory pathway to the extracellular space with the help of a signal peptide sequence added to its beginning. As a result, the COVID-19 virus S1 protein, which must interact directly with the ACE2 protein to enter the cell, encounters the designed protein in the extracellular space before reaching the cell membrane, and its activity is blocked. Amino acids are placed on either side of the signal peptide sequence so that it is excised in a manner similar to the excision of an intein from a protein sequence: after the designed protein is transferred to the extracellular space, the signal peptide sequence leaves the main protein sequence. Once the signal peptide sequence has been removed, a motif recognizable by ACE2-group proteases appears; this motif had been added incompletely to both sides of the signal peptide sequence. Cleavage at this motif causes the virus protein to separate from the receptor surface. Because the motif is incomplete until the signal peptide is removed, no cleavage occurs before the receptor has been transferred out of the membrane.

Materials and methods
First, the amino acid sequence of the ACE2 protein was extracted from the NCBI database (http://www.ncbi.nlm.nih.gov), and the protein was analyzed using bioinformatics software. To prevent homodimer formation, the part of the protein structure involved in homodimerization was altered. Accordingly, the amino acids involved in the direct interaction between the two subunits were identified and excluded from the initial sequence (Figure 1).
The sequence region involved in direct binding to the virus's S-protein was retained unaltered in the designed protein.
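The sequence-editing step described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name, the toy sequence and all residue positions are hypothetical placeholders, and the real dimer-interface and S-protein-binding positions would have to come from the structural analysis.

```python
def remove_positions(sequence, positions_to_remove, protected_positions=()):
    """Return `sequence` without the given 1-based positions.

    `protected_positions` are residues that must be kept unaltered (for
    example, residues that bind the S-protein); asking to remove one of
    them raises an error instead of silently changing the binding region.
    """
    to_remove = set(positions_to_remove)
    out_of_range = [p for p in to_remove if p < 1 or p > len(sequence)]
    if out_of_range:
        raise ValueError(f"positions outside the sequence: {sorted(out_of_range)}")
    clash = to_remove & set(protected_positions)
    if clash:
        raise ValueError(f"refusing to remove protected positions: {sorted(clash)}")
    return "".join(
        residue
        for index, residue in enumerate(sequence, start=1)
        if index not in to_remove
    )


# Toy example with made-up positions: drop residues 2 and 3, keep residue 6.
toy_sequence = "ACDEFGHIK"
edited = remove_positions(toy_sequence, [2, 3], protected_positions=[6])
print(edited)
```

Keeping the protected positions as an explicit argument makes the two requirements of the design, removing the dimerization residues while leaving the binding region intact, checkable in one place.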
To convert the membrane protein into a secretory protein by means of a signal peptide sequence, the signal peptides of a number of secretory proteins were predicted with the SignalP site (http://www.cbs.dtu.dk/services/SignalP), and their conserved regions were identified with the ClustalO site (http://www.uniprot.org/align/). On this basis, a signal peptide sequence was designed for secretion of the protein out of the cell and was added at the beginning of the recombinant protein, after the S-protein binding motif, so that the virus binds to the designed protein before it can reach the cell membrane and bind the native protein, and its pathway of activity is thereby blocked.
Two amino acids, cysteine and asparagine, were added to the two sides of the signal peptide sequence to carry out the same catalytic process as in inteins, so that the signal peptide sequence is removed from the main sequence after the protein has been transferred out of the cell. Removal of the signal peptide sequence exposes a short three-amino-acid motif, which had been added incompletely to both sides of the signal peptide sequence and which is the site for recognition and cleavage by ACE2 proteases. Cleavage in this region releases the virus protein bound to the designed protein from the protein surface.
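The logic of the split motif can be illustrated with a small string-level sketch. It rests on several assumptions that are not stated in the text: that cysteine sits on the N-terminal side and asparagine on the C-terminal side of the signal peptide (as at the ends of an intein), that both are removed together with the signal peptide, and that the F-H-L motif is split as "FH" before and "L" after the cassette. The flanking sequences and the signal peptide below are made-up placeholders, not the designed sequences.

```python
CYS, ASN = "C", "N"   # one-letter codes for cysteine and asparagine
MOTIF = "FHL"         # protease recognition motif named in the text


def build_construct(n_flank, signal_peptide, c_flank, split=2):
    """Assemble flank + partial motif + Cys-signal-Asn cassette + rest of motif + flank.

    `split` is an assumed split point: MOTIF[:split] goes before the
    cassette and MOTIF[split:] after it, so the motif is incomplete
    while the signal peptide is still present.
    """
    return (n_flank + MOTIF[:split] + CYS + signal_peptide + ASN
            + MOTIF[split:] + c_flank)


def excise_signal_peptide(construct, signal_peptide):
    """Remove the Cys-signal-Asn cassette and join the two flanks."""
    cassette = CYS + signal_peptide + ASN
    start = construct.find(cassette)
    if start == -1:
        raise ValueError("cassette not found in construct")
    return construct[:start] + construct[start + len(cassette):]


# Placeholder sequences, chosen only so that they contain no F-H-L themselves.
n_flank, signal, c_flank = "AAAA", "MKLLLLLLLLA", "GGGG"
construct = build_construct(n_flank, signal, c_flank)
mature = excise_signal_peptide(construct, signal)

print(MOTIF in construct)  # motif is interrupted before excision
print(MOTIF in mature)     # motif is complete after excision
```

In this toy model the complete motif only exists after the cassette has been excised, which is the property the design relies on to prevent cleavage before the protein leaves the cell.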

Results
The ACE2 protein sequence is shown, with the amino acids involved in the interaction between ACE2 and the S1 protein of the COVID-19 virus marked in red and the amino acids involved in the interaction between the two ACE2 subunits for dimerization marked in yellow. The protease recognition motif is shown at the end of the sequence.
Two amino acids, cysteine and asparagine, were added to both sides of the signal peptide sequence to create a self-catalyzing process similar to that found in inteins.

WTLALENVVGAKNMNVRPLLNYFEPLFTWLKDQNKNSFVGW-STDWSPYADGSLEVLFQ
The amino acids F-H-L form a motif that is recognized and cleaved by ACE2-group proteases. Because the motif is added to the protein sequence immediately after the virus-binding domain, cleavage in this region separates the domain and the bound virus from the surface of the receptor.
4. Adding amino acids to create a disulfide bond between the two sides of the signal peptide sequence.
>pdb|6VW1|A Chain A, Angiotensin-converting enzyme 2
Elimination of the amino acids involved in direct interaction between the two subunits of the ACE2 protein, to prevent dimerization.
>pdb|6VW1|A Chain A, Angiotensin-converting enzyme 2
According to the ProtParam website, the half-life of the ACE2 protein in mammals is 1.9 hours, which ProtParam attributes to the amino acid serine. Half-life is the time it takes for half of the protein to disappear from the cell after its synthesis [38]. According to Table 1, the longest half-life in mammals is that of the amino acid valine, at 100 hours. Accordingly, a valine was added to the amino acid sequence designed for the ACE2 protein.
The instability index (II) [41] is calculated according to the following equation:
$$ II = \frac{10}{L} \sum_{i=1}^{L-1} \mathrm{DIWV}(x[i]\,x[i+1]) $$
In the above relation, L represents the length of the protein, and the expression DIWV(x[i] x[i+1]) is the instability weight value for the dipeptide starting at position i. For stable proteins, the instability index (II) is less than 40; an index higher than 40 means that the protein is unstable. With the changes applied to the protein sequence, the instability index of the recombinant protein was reduced to 39.92.
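The instability index calculation described above can be sketched in a few lines. The function below takes the dipeptide weight table as an argument because the full 400-entry DIWV table is not reproduced here. The toy weights in the example are invented purely to show the arithmetic and are not the published values.

```python
def instability_index(sequence, diwv):
    """Compute II = (10 / L) * sum of DIWV over all consecutive dipeptides.

    `diwv` maps two-letter dipeptide strings to instability weight values.
    A missing dipeptide raises KeyError rather than being silently ignored.
    """
    length = len(sequence)
    if length < 2:
        raise ValueError("sequence must contain at least two residues")
    total = sum(diwv[sequence[i:i + 2]] for i in range(length - 1))
    return 10.0 / length * total


def is_predicted_stable(index, threshold=40.0):
    """Apply the threshold from the text: below 40 is classed as stable."""
    return index < threshold


# Invented weights for a four-residue toy sequence (NOT the published table).
toy_weights = {"AC": 2.0, "CD": -1.0, "DE": 5.0}
toy_index = instability_index("ACDE", toy_weights)
print(toy_index, is_predicted_stable(toy_index))
```

For real sequences, an existing implementation with the published weight table is preferable to a hand-written one; Biopython, for example, exposes this calculation as `ProteinAnalysis(sequence).instability_index()` in `Bio.SeqUtils.ProtParam`.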

Discussion
Although many attempts have been made to produce a vaccine, there is still no vaccine or antiviral drug to eradicate the virus. There is no definitive cure, prevention or response to coronavirus infections in general [42]. In scientific and medical studies, Chinese and Japanese researchers have reported that antiviral drugs such as Lopinavir/Ritonavir are useful in preventing the progression of coronavirus infection and even in treating the disease. Overall, the evidence indicates that antiviral drugs have saved the lives of many people with coronavirus [45].
In general, recombinant products, which are associated with genetic manipulation and DNA changes in various organisms, have brought about a huge change in the type and variety of pharmaceutical products in use, so that today high molecular weight recombinant pharmaceutical products are used instead of small chemical molecules. Protein drugs have a very specific function. They therefore do not have an adverse effect on other, unrelated biological processes, and in this respect they have fewer side effects. The rapid growth of biological data has made it difficult for biologists and biotechnologists to gather and store information, to the point that this may no longer be possible without the use of new technologies. Alongside these capabilities, however, uncertainty about the less predictable effects of this knowledge points to a challenging future that touches most aspects of society and, perhaps most of all, genomics.
The function of a protein depends on its spatial structure, or tertiary structure, so that many protein defects and dysfunctions are due to changes in the spatial structure of the protein [46,47]. Modern research methods in molecular biology have produced extensive protein sequence data. The rapid growth of laboratory information on protein sequences, combined with the difficulty of determining protein structures, has made structure prediction more important than ever before [47]. Despite extensive research into the tertiary structure of proteins, the physical basis for the stability of protein structure is not yet fully understood [48].
On the other hand, the strong dependence of protein function on structure has opened new perspectives in the treatment of diseases. Today, the use of information about the tertiary structure of proteins is one of the basic methods in rational drug design [49].
In general, the strategies used to design a vaccine against coronavirus differ from the strategies previously used to prevent influenza and colds, and are based more on molecular methods such as DNA, RNA, and recombinant proteins [50].
Among these, the greatest focus is on protein subunits and recombinant proteins [51,52]. The ACE2 protein, as the initiator of the viral infection cycle, can be a good choice for treatment.
This protein, which acts as a homodimer, has been the subject of extensive research because of its high therapeutic potential. fusion inhibitor called IPB02, which showed great potential for inhibiting cell-cell binding [57]. Shuai et al. developed a series of EK1-derived lipopeptides previously designed to target the second HR1 on protein S and found that EK1C4 was the most potent fusion inhibitor against protein-mediated protein-