Cite this as: Díaz J (2020) SARS-Cov-2 Systems Biology. Ann Syst Biol 3(1): 029-032. DOI: 10.17352/asb.000009
The aim of this mini review is to analyze the advances in research on the molecular structure and pathogenesis of SARS-CoV-2 from a systems biology approach.
Introduction: The experimental analysis of the interactions between viral and host proteins, or interactome, by Gordon and collaborators has been a fundamental contribution to understanding how the SARS-CoV-2 virus takes control of the host molecular network to produce new virions and propagate the infection. This result allows the construction of a network representation of the interactome and its statistical analysis. The formulation of network models of this interactome is a basic tool for identifying drug targets capable of interrupting the viral replication cycle, and for the design of novel therapeutic agents.
Discussion: The SARS-CoV-2 interactome is a scale-free hierarchical modular structure in which the open reading frame 8 protein (orf8), the nucleocapsid protein (N), and nonstructural protein 7 (Nsp7) are the central hubs. This kind of organization confers an extra level of complexity on this molecular network, allowing it to resist drug attacks on single nodes. However, simultaneous suppression of these three hubs can effectively disrupt the network.
Conclusions: The systems biology approach to the analysis of the SARS-CoV-2 interactome reveals the existence of nine nodes that belong to high modularity classes (open reading frame 8 protein (orf8), Membrane protein (M), open reading frame 9b protein (orf9b), Nucleocapsid protein (N), open reading frame 10 protein (orf10), Envelope protein (E), open reading frame 6 protein (orf6), open reading frame 7 protein (orf7) and Spike protein (S)) and control the flow of information during infection. These proteins are possible targets for new drugs, principally for antagonists of the three proteins orf8, M and Nsp7, which are the main bottlenecks through which most of the information required for the production of new virus must flow. A therapeutic attack on these hubs can increase the probability of defeating the viral infection as an alternative to a vaccine.
SARS-CoV-2: Severe Acute Respiratory Syndrome Coronavirus 2; COVID-19: Coronavirus Disease 2019; ACE2: Angiotensin-Converting Enzyme 2; Ang II: Angiotensin II; Ang (1-7): Angiotensin (1-7); Nsp: Nonstructural protein; RTC: Replication-Transcription Complex; AI: Artificial Intelligence; ODE: Ordinary Differential Equation; orf: Open Reading Frame; E: Envelope protein; S: Spike protein; M: Membrane protein; N: Nucleocapsid protein
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is an intracellular parasite whose replication cycle depends on host cell structures and functions; in particular, it uses the translational apparatus of different types of infected cells to express its proteins. SARS-CoV-2 causes Coronavirus Disease 2019 (COVID-19), which has infected around 21,000,000 people worldwide since the end of 2019 and killed about 776,000 of them. There are neither therapeutic drugs nor an effective vaccine to defeat SARS-CoV-2 infection.
The SARS-CoV-2 virion is formed by four proteins: Spike (S), Envelope (E), Membrane (M) and Nucleocapsid (N), which enclose the virus genome. In normal cells, the surface receptor Angiotensin-Converting Enzyme 2 (ACE2), highly abundant in lung alveolar type II cells, converts Angiotensin II (Ang II) into Angiotensin (1-7) (Ang (1-7)). However, when the SARS-CoV-2 virion infects the organism, ACE2 binds with high affinity to the S protein and forms a molecular complex that begins the fusion of the virion envelope with the host cell membrane. The nucleocapsid containing the viral genome is then released into the host cytoplasm.
The SARS-CoV-2 genome consists of a positive-sense, nonsegmented, single-stranded RNA ((+)ssRNA) of about 30 kb. The open reading frames 1a (orf1a) and 1b (orf1b) are located near the 5’ untranslated region (5’UTR) of the (+)ssRNA and code for the polyproteins pp1a and pp1ab. Translation occurs in the cytoplasm and produces these viral polyproteins, whose maturation results in 11 nonstructural proteins (Nsp) from the orf1a segment (Nsp1 to Nsp11) and 5 nonstructural proteins from the orf1b segment (Nsp12 to Nsp16). The Nsp proteins form the replication-transcription complex (RTC) in a double-membrane vesicle, where a set of nested subgenomic minus-strand RNAs ((-)sgRNAs) is synthesized by discontinuous transcription. These (-)sgRNAs serve as templates for the production of the subgenomic mRNAs from which the structural proteins E, M, N and S, together with the accessory proteins orf3a, orf6, orf7a, orf7b, orf8, orf9b, orf9c and orf10, are synthesized [2,6,7]. SARS-CoV-2 redirects the host translational machinery to viral protein synthesis and replication, while cellular mRNA translation is inhibited. Viral proteins are inserted into the host molecular machinery to modify and redirect a great number of host cell functions towards the production of more virus particles [8,9]. The experimental analysis of the interactions between viral and host proteins, or interactome, by Gordon and collaborators has been a fundamental contribution to understanding how the SARS-CoV-2 virus takes control of the host molecular network to produce new virions and propagate the infection. Gordon and collaborators cloned, tagged and expressed 26 viral proteins in human cells and used affinity-purification mass spectrometry to identify the human proteins physically associated with each of them. They found 332 SARS-CoV-2-human protein-protein interactions that form the virus interactome.
This result allows the construction of a network representation of the interactome and its statistical analysis. The formulation of network models of this interactome is a basic tool for identifying drug targets capable of interrupting the viral replication cycle, and for the design of novel therapeutic agents [2,11-13].
Systems biology uses network theory as a tool to understand the organization and dynamics of complex systems. A theoretical approach to the structure and function of biological networks allows the integration of dispersed experimental data into a coherent model of the spatio-temporal dynamics of interconnected cellular processes. The number of nodes, the number of connections of each node to its neighbors, and the distribution of these connections in the network determine its complexity, structure and dynamical properties.
In the particular case of SARS-CoV-2, the analysis of the statistical properties of the undirected network model of its interactome (Figure 1) [2,10,15] shows that the degree distribution of its 332 proteins, or nodes, is a decaying exponential, with a great number of nodes with few connections and a low number of nodes with degree above 20. A model of the Calu-3-specific human-SARS-CoV-2 interactome, with 4,123 nodes, also shows a decaying exponential degree distribution. Both results suggest that the SARS-CoV-2 network has a scale-free structure. In particular, the orf8, M, Nsp7, orf9c, Nsp12 and Nsp13 proteins are the most connected nodes in the network and can be considered hubs. Three hubs from this set control approximately 102 viral-host processes, as shown in Figure 1: orf8 (47 links), M (29 links) and Nsp7 (26 links). The identification of hubs is of great importance because these highly connected nodes control the flow of information in the infected cell, indicating that orf8, M and Nsp7 are the main organizers of SARS-CoV-2 activity during infection and the possible source of the virus pathogenicity [2,13,15].
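The degree bookkeeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three hub degrees (47, 29 and 26 links) are taken from the text, while the host-protein names, the helper functions `build_edges` and `degree_table`, and the star-shaped toy network itself are hypothetical and do not reproduce the Gordon et al. data set.

```python
from collections import Counter

# Hub degrees quoted in the text; everything else here is a toy assumption.
HUB_LINKS = {"orf8": 47, "M": 29, "Nsp7": 26}

def build_edges(hub_links):
    """Return undirected viral-host edges as (viral, host) pairs."""
    edges = []
    for viral, n_links in hub_links.items():
        for i in range(n_links):
            # Placeholder host-protein name, not a real gene symbol.
            edges.append((viral, f"host_{viral}_{i}"))
    return edges

def degree_table(edges):
    """Count how many links touch each node."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree

edges = build_edges(HUB_LINKS)
degree = degree_table(edges)

# Hubs: nodes with degree above 20, the threshold mentioned in the text.
hubs = [node for node, k in degree.most_common() if k > 20]

# Degree histogram: degree value -> number of nodes with that degree.
histogram = Counter(degree.values())

print("hubs:", hubs)
print("links carried by hubs:", sum(degree[h] for h in hubs))
print("nodes with a single link:", histogram[1])
```

In this toy network the three hubs together carry the 102 links mentioned in the text, and every other node has a single connection, which reproduces qualitatively the pattern of many poorly connected nodes and very few highly connected ones.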
All networks have a structure determined by the connectivity of their nodes and the distribution of their links. In random networks, connectivity and link distribution are arbitrary, and the degree distribution is binomial. However, for a large network with a small link probability, the binomial distribution is well approximated by a Poisson distribution, which also decays rapidly at high degrees. Consequently, a decaying exponential degree distribution is not sufficient to ensure that the SARS-CoV-2 interactome is not a random network, i.e., one in which, in every case of infection, viral proteins are inserted at random into the host cell without a specific coordinated function.
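The closeness of the binomial and Poisson degree distributions in a random network can be checked numerically. The sketch below is a minimal illustration: the number of nodes (332) is taken from the text, whereas the link probability `p = 0.01` is an arbitrary assumption chosen only for the example and is not estimated from any data.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson probability of k events with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

n_nodes = 332          # number of nodes quoted in the text
p = 0.01               # assumed link probability (illustrative only)
trials = n_nodes - 1   # each node can link to every other node
lam = trials * p       # mean degree of the random network

# Largest pointwise difference between the two distributions for k = 0..30.
max_gap = max(
    abs(binomial_pmf(k, trials, p) - poisson_pmf(k, lam)) for k in range(31)
)

print(f"mean degree: {lam:.2f}")
print(f"largest gap between binomial and Poisson: {max_gap:.5f}")
```

Under these assumptions the two distributions are nearly indistinguishable, which is why the shape of the degree distribution alone cannot settle whether a network is random.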
Centrality measures are necessary to determine whether an undirected network is random or not. Different measures have been employed to investigate the structure of the SARS-CoV-2 interactome: clustering coefficient, closeness centrality, betweenness centrality, modularity class, load centrality, information centrality and PageRank index [2,15]. From this set of measures, modularity is the key to determining whether the host cell-virus interactome has any kind of nonrandom structure.
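Two of the measures listed above, the clustering coefficient and closeness centrality, can be illustrated on a very small undirected network. The following sketch uses only the Python standard library; the five-node graph, its node names and the helper functions are hypothetical and serve only to show how the quantities are computed, not how they were obtained in the cited studies.

```python
from collections import deque

# Hypothetical five-node undirected network; node names are placeholders.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]

def adjacency(edges):
    """Build an adjacency map: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    nbrs = sorted(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention for nodes with fewer than two neighbours
    linked = sum(
        1 for i in range(k) for j in range(i + 1, k) if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * linked / (k * (k - 1))

def closeness_centrality(adj, node):
    """(Reachable nodes) / (sum of shortest-path distances), via BFS."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0

adj = adjacency(EDGES)
clustering = {n: clustering_coefficient(adj, n) for n in adj}
closeness = {n: closeness_centrality(adj, n) for n in adj}

for n in sorted(adj):
    print(n, round(clustering[n], 3), round(closeness[n], 3))
```

In this toy graph node C has the highest closeness because it is, on average, the fewest steps away from every other node, while node A has the highest clustering coefficient because its two neighbours are linked to each other.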
Modularity is the fraction of links that fall within a cluster minus the expected fraction if links were distributed at random; it identifies the groups of nodes that are more densely connected among themselves than with the rest of the network. In a nonrandom network, modularity has a value between 0 and 1 and reveals clues about the structure and the vulnerable spots of the network. A common method for community detection is the Louvain method [2,16], with which the modularity of the undirected SARS-CoV-2 network was calculated as approximately 0.85, far from the negative value for a random network. Furthermore, in the Gordon et al. interactome, 21 clusters were detected in which the number of nodes is larger than expected by chance. The viral proteins orf8, M, orf9b, N, orf10, E, orf6, orf7 and S also belong to high modularity classes. These results suggest that the SARS-CoV-2 interactome is not a random network but a scale-free modular hierarchical structure in which orf8, N, and Nsp7 are the central hubs. This kind of organization confers an extra level of complexity on the SARS-CoV-2 molecular network, allowing it to resist drug attacks on single nodes. However, simultaneous suppression of these three hubs could effectively disrupt the network organization and stop the viral replication cycle. These results reveal the complexity of the SARS-CoV-2 molecular network, and open the possibility of understanding the factors that determine the pathogenicity of the virus [2,13] and how they can be modified to decrease it [2,13,15,18].
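The definition of modularity given above can be made concrete with the standard formula Q = Σ_c [L_c/m − (d_c/2m)²], where m is the total number of links, L_c the number of links inside community c, and d_c the sum of the degrees of its nodes. The sketch below evaluates this formula for a given partition; it is not the Louvain method, which searches for the partition that maximizes Q. The toy graph (two triangles joined by one link), the node labels and the function name `modularity` are assumptions made for illustration.

```python
def modularity(edges, community_of):
    """Newman-Girvan modularity of a partition of an undirected graph.

    edges: list of (u, v) pairs, each undirected link listed once.
    community_of: dict mapping every node to its community label.
    """
    m = len(edges)
    internal = {}    # community -> number of links inside it (L_c)
    degree_sum = {}  # community -> sum of node degrees (d_c)
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Toy graph: two triangles (1-2-3 and 4-5-6) joined by the link 3-4.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]

two_modules = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
one_module = {node: "all" for node in range(1, 7)}

q_two = modularity(edges, two_modules)
q_one = modularity(edges, one_module)

print(f"two modules: Q = {q_two:.3f}")
print(f"one module:  Q = {q_one:.3f}")
```

Splitting the toy graph into its two triangles gives a clearly positive modularity, whereas placing all nodes in a single community gives zero, which illustrates why a high value such as the one reported for the interactome points to a genuinely modular organization.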
In silico analysis of the structure and dynamics of the SARS-CoV-2 network can aid in the identification of drug targets and in the design of novel drugs against COVID-19. Artificial Intelligence (AI) and bioinformatics methods have been used for this purpose with great success [19,20].
CoV-responsive human genes and their functional roles were identified from available genomic data, on the basis of both the relative synonymous codon usage (RSCU)-based correlation of viral genes with human genes and differential gene expression analysis. From this analysis, potential drugs for COVID-19 treatment based on these genes were predicted. Furthermore, experimental data from Gordon et al. suggest that a drug attack on orf8, N, and Nsp7 can block the effect of these three hubs on downstream nodes. However, there are no therapeutic drugs that directly hit these proteins, although there are FDA-approved drugs that block their downstream activity. For example, orf8 is a target of Rapamycin, which also targets Nsp2 and N. Rapamycin blocks Tor1a activity in the quality control of protein folding in the Endoplasmic Reticulum (ER), disrupting the effects of orf8 [22,23]. Unfortunately, Rapamycin has strong immunosuppressant effects, which make it inadequate for the treatment of COVID-19. More research is necessary to design drugs that directly target orf8, M and Nsp7 as a complement or an alternative to a vaccine.
The systems biology approach to the analysis of the SARS-CoV-2 interactome reveals that it is a modular scale-free hierarchical network in which nine nodes (orf8, M, orf9b, N, orf10, E, orf6, orf7 and S) that belong to high modularity classes control the flow of information during infection. These proteins are possible targets for new drugs, principally for antagonists of the three proteins orf8, N, and Nsp7, which are the main bottlenecks through which most of the information required for the production of new virus must flow. A therapeutic attack on these hubs can increase the probability of defeating the viral infection as an alternative or a complement to a vaccine.
The next step is the formulation of an Ordinary Differential Equation (ODE) model of the SARS-CoV-2 interactome to understand in depth the dynamical behavior of this complex network. This kind of modeling is necessary because viral protein insertion is a perturbation that distorts the original phase space topology of the host cell. Probably, viral proteins produce some type of bifurcation, with a restructuring of the original fixed points, which drives the overall host cell dynamics towards the production of new virus. Characterizing the kind of bifurcation involved in SARS-CoV-2 infection is a necessary step to know the factors that determine the pathogenicity of the virus, and how they can be modified with specific drugs to decrease it.
However, owing to the novelty of this virus species, there is a lack of the quantitative information necessary to propose an ODE-based continuous model for the analysis of the spatio-temporal dynamics of the infection.
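The idea of a bifurcation that restructures the fixed points of a system can be illustrated with a generic one-variable ODE, dx/dt = r·x − x², which undergoes a transcritical bifurcation at r = 0. This is purely a textbook example and not a model of SARS-CoV-2 infection: the equation, the control parameter `r`, the initial condition and the integration settings are all assumptions chosen for illustration.

```python
def simulate(r, x0=0.1, dt=0.01, steps=5000):
    """Integrate dx/dt = r*x - x**2 with the explicit Euler method.

    r is a hypothetical control parameter; x0, dt and steps are
    illustrative choices. Returns the state after steps*dt time units.
    """
    x = x0
    for _ in range(steps):
        x += dt * (r * x - x * x)
    return x

# Fixed points are x = 0 and x = r; their stability swaps at r = 0.
before = simulate(r=-1.0)  # r < 0: the state relaxes to x = 0
after = simulate(r=1.0)    # r > 0: the state relaxes to x = r

print(f"r = -1: final state {before:.6f}")
print(f"r = +1: final state {after:.6f}")
```

Starting from the same initial condition, the system settles on a different fixed point once the parameter crosses the bifurcation value, which is the kind of qualitative change in long-term behavior that an ODE model of the interactome would need to characterize.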
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
I thank CIDC colleagues for their constructive comments and suggestions. I also thank Erika Juarez Luna for logistical support.