In the last post, I hypothesised that the total degree of a node in the GO:0005575 network follows two rules, namely:
I also proposed that the network would follow a Power Law distribution. In this post, I’m going to elaborate on this.
In my post “Protein Function Prediction — The Network Constructed”, I already ran an analysis on GO:0071944. This time I’m going to do the same on GO:0005575.
In this particular GO, there are over 210K nodes, whose total degree of connection ranges from 2 all the way to 100. My hypothesis about this degree distribution can be divided into two parts: nodes with a total degree equal to or below 10, and nodes with a total degree above 10.
To analyse this particular set of nodes, I have the following helper functions to support the analysis.
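The original helper functions are not reproduced here, so what follows is only a minimal sketch of what such helpers might look like, not the code used for the results in this post. It assumes the degrees are available as a plain list of integers. The function names `degree_histogram` and `fit_power_law_alpha` are hypothetical. The exponent is estimated with the commonly used approximate maximum-likelihood formula for discrete data, alpha ≈ 1 + n / Σ ln(x / (xmin − 0.5)).

```python
import math
from collections import Counter


def degree_histogram(degrees):
    """Return {degree: number of nodes with that degree}, sorted by degree."""
    return dict(sorted(Counter(degrees).items()))


def fit_power_law_alpha(degrees, xmin=1):
    """Approximate MLE of the power-law exponent for discrete data.

    Assumption: `degrees` is a list of positive integers and only values
    >= xmin belong to the power-law tail. Returns (alpha, sigma), where
    sigma is an approximate standard error of alpha.
    """
    tail = [d for d in degrees if d >= xmin]
    if len(tail) < 2:
        raise ValueError("need at least two values >= xmin")
    # Each term is positive because d >= xmin > xmin - 0.5.
    log_sum = sum(math.log(d / (xmin - 0.5)) for d in tail)
    alpha = 1.0 + len(tail) / log_sum
    sigma = (alpha - 1.0) / math.sqrt(len(tail))
    return alpha, sigma


if __name__ == "__main__":
    sample = [2, 2, 2, 3, 3, 4, 5, 8, 10]  # made-up degrees, for illustration only
    print(degree_histogram(sample))
    print(fit_power_law_alpha(sample, xmin=2))
```

A dedicated fitting library would also estimate xmin and provide goodness-of-fit statistics; this sketch fixes xmin by hand to keep the idea visible.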
I then collected the fitted parameters using the following script.
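The collection script itself is not shown here, so the block below is a hedged sketch of one way such a loop could be organised. It makes two loud assumptions: that each node is represented as an `(in_degree, out_degree)` pair, and that the quantity fitted within each in-degree group is the total degree. Neither is confirmed by the original script. The names `fit_alpha` and `collect_fits_by_in_degree` are hypothetical.

```python
import math
from collections import defaultdict


def fit_alpha(values, xmin):
    """Approximate discrete power-law MLE; `values` must all be >= xmin."""
    log_sum = sum(math.log(v / (xmin - 0.5)) for v in values)
    return 1.0 + len(values) / log_sum


def collect_fits_by_in_degree(nodes, xmin=1, min_group_size=2):
    """Group nodes by in-degree and fit one exponent per group.

    Assumption: `nodes` is an iterable of (in_degree, out_degree) pairs and
    the fitted quantity is the total degree (in + out) of each node.
    Groups with fewer than `min_group_size` usable values are skipped.
    """
    groups = defaultdict(list)
    for in_deg, out_deg in nodes:
        total = in_deg + out_deg
        if total >= xmin:
            groups[in_deg].append(total)

    rows = []
    for in_deg in sorted(groups):
        totals = groups[in_deg]
        if len(totals) < min_group_size:
            continue
        rows.append({
            "in_degree": in_deg,
            "n": len(totals),
            "alpha": fit_alpha(totals, xmin),
        })
    return rows


if __name__ == "__main__":
    # Made-up nodes, for illustration only.
    toy_nodes = [(1, 1), (1, 2), (1, 5), (2, 1), (2, 3), (3, 1)]
    for row in collect_fits_by_in_degree(toy_nodes):
        print(row)
```

The resulting list of rows can then be plotted with the in-degree on the x-axis and each fitted parameter on the y-axis.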
Then I plotted each fitted parameter against the in-degree.
An interesting observation from the p-value diagram is that when the in-degree takes a value between 4 and 6 incoming connections, the p-value remains significant at the 0.05 level. (You can set a different significance level; I simply picked this value because it is commonly used in scientific journal writing. There is no particular rationale behind it!)
I’d argue that in-degree 4 (or 6) could be the focus for further study. (See below for how I structured the collection of relevant nodes to fit the Power Law distribution.)
Of course, the above observation might be different if I change another parameter, “min_node_length”, to a value other than zero, which is what I used in the above script.
Before I examine the impact of changing the parameter “min_node_length”, let’s take another look at how the combination of in-degree and out-degree changes the value of alpha, and whether the distribution is best modelled by a Power Law.
From the above two diagrams, you can tell that only certain combinations of in-/out-degree have a significant alpha. For example, the following pairs are significant at the 0.1 level (for ease of explanation, I simply refer to the area coloured in RED in the P-Value diagram).
So I’m going to cover the rest of the nodes and see how they are distributed. Do they follow a Power Law distribution?
Here I impose constraints on both in-degree and out-degree: as a starter, both must be larger than 10. The results are as follows.
From the above two diagrams, it’s clear that the assertion that these nodes follow a Power Law distribution is not justified. They could belong to any distribution from the exponential family.
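One way to make this kind of judgement concrete is a log-likelihood ratio test between a Power Law fit and an exponential fit. The block below is a minimal sketch of that idea, not the procedure used to produce the diagrams above. It assumes continuous approximations of both distributions (rough for integer degrees) and a hand-picked `xmin`. The function name `compare_power_law_vs_exponential` is hypothetical. The p-value uses the normal approximation of Vuong's test on the pointwise log-likelihood differences.

```python
import math


def compare_power_law_vs_exponential(values, xmin):
    """Compare a continuous power law with a shifted exponential on the tail.

    Returns a dict with the fitted alpha and lambda, the log-likelihood
    ratio R (positive favours the power law, negative the exponential)
    and a two-sided p-value for R being different from zero.
    """
    tail = [float(v) for v in values if v >= xmin]
    n = len(tail)
    log_ratio_sum = sum(math.log(v / xmin) for v in tail)
    if n < 2 or log_ratio_sum == 0.0:
        raise ValueError("need at least two tail values, not all equal to xmin")

    # Maximum-likelihood estimates for both candidate distributions.
    alpha = 1.0 + n / log_ratio_sum
    lam = n / sum(v - xmin for v in tail)

    # Pointwise log-likelihoods under each model.
    ll_power = [math.log((alpha - 1.0) / xmin) - alpha * math.log(v / xmin)
                for v in tail]
    ll_expon = [math.log(lam) - lam * (v - xmin) for v in tail]

    diffs = [a - b for a, b in zip(ll_power, ll_expon)]
    ratio = sum(diffs)
    mean = ratio / n
    variance = sum((d - mean) ** 2 for d in diffs) / n
    if variance > 0.0:
        p_value = math.erfc(abs(ratio) / math.sqrt(2.0 * n * variance))
    else:
        p_value = float("nan")
    return {"alpha": alpha, "lambda": lam, "R": ratio, "p": p_value}


if __name__ == "__main__":
    # Made-up degrees, for illustration only.
    print(compare_power_law_vs_exponential([11, 12, 12, 13, 15, 18, 25, 40], xmin=11))
```

A small p-value together with a negative R would support the reading that an exponential-type distribution describes the nodes better than a Power Law. A large p-value means the data cannot distinguish between the two.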
Given this, I’d argue that this particular GO network is constructed in a way that defies standard network analysis. It follows a Power Law distribution with a very significant alpha of value >3 in certain sub-networks, while the entire network behaves like a Random Network. Also, the relatively small size of the network (200K+ nodes at most) makes the usual assumption of network size approaching infinity inappropriate. An alternative formulation of how the network grows into this special topology is needed.
In this post, I have covered the statistical behaviour of a Power Law distribution for nodes of different in-/out-degree.
In the next post, I will focus on the topology of the network, such as the centrality of nodes, to see if I can extract further insights from this network.
Stay tuned!