The following analysis was conducted to assess the changes in the landscape and to understand the causes of the dramatic model collapse over the past 2–3 weeks and its recovery in the last few days.
This analysis was conducted and the following report prepared for the $IYKYK community.
Background
Two predictive models were successfully trained and tested to predict (at launch time) whether a Pump.fun launch WILL or WILL NOT reach King of the Hill and Raydium respectively.
CatBoost, a machine learning model developed by Yandex, was chosen for this binary classification task. CatBoost is considered the most accurate of the gradient boosting algorithms; it is also the fastest at inference time and is effective even with small datasets.
Disproportionate Performance Degradation
Degradation of model performance was observed on the 27th August 2024, at which point an immediate investigation was required because the number of buy signals had declined to zero on the 26th August.
The chart below shows buy signals per day (these include both true positives and false positives). The shaded pink area represents the early stages of model training, when relatively insufficient training data had been captured. Also annotated on this plot is the launch of SunPump and its subsequent flipping of Pump.fun (more on this in the next section).
Changes in Landscape
During August 2024, there were some notable changes in the meme coin landscape, most notably the launch and rapid adoption of SunPump on Tron.
SunPump, a Pump.fun clone on Tron, launched on 9th August 2024, and on 21st August SunPump surpassed Pump.fun in daily launch volume. While this flippening lasted only one day, the rise of SunPump should not be ignored or underestimated.
While Pump.fun’s all-time high (daily launches) exceeded 20K launches on 13th August 2024, this number steadily declined until stabilising around 6K launches on 25th August 2024. (Source: Dune Analytics)
Initial Hypotheses
As you can see from the above two charts, the model performance degradation (in terms of signals only) is vastly disproportionate to the fall in launch volume.
For example: one would expect a 50% reduction in launches to bring a 50% reduction in signals. However, between the 13th and 17th August we observed roughly a 50% reduction in launches with an approx. 70% reduction in signals.
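The comparison above can be made concrete with a little arithmetic. The sketch below uses illustrative counts (assumptions, not figures from the dataset) chosen only to reproduce the stated 50% and 70% reductions; `pct_reduction` and `signal_rate_change` are hypothetical helper names.

```python
def pct_reduction(before: float, after: float) -> float:
    """Fractional reduction from `before` to `after` (0.5 means a 50% drop)."""
    return (before - after) / before


def signal_rate_change(launch_reduction: float, signal_reduction: float) -> float:
    """Relative change in signals per launch implied by the two reductions."""
    return (1 - signal_reduction) / (1 - launch_reduction) - 1


# Illustrative counts only (assumed), matching the ~50% and ~70% reductions.
launches_before, launches_after = 20_000, 10_000
signals_before, signals_after = 100, 30

launch_drop = pct_reduction(launches_before, launches_after)
signal_drop = pct_reduction(signals_before, signals_after)
rate_change = signal_rate_change(launch_drop, signal_drop)

print(f"launches down {launch_drop:.0%}, signals down {signal_drop:.0%}")
print(f"signals per launch changed by {rate_change:.0%}")
```

If signals scaled proportionally with launches, the signals-per-launch rate would be unchanged. Under these assumed numbers it falls by about 40%, which is the disproportion being investigated.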
By the 26th, 27th, 30th and 31st of August, and also the 1st September, the number of daily signals had become very concerning.
Initial hypotheses and potential issues to rule out included:
- Had Pump.fun launches deteriorated to close to zero? This was immediately ruled out based on both the volume and distribution of launches logged in the proprietary dataset, and confirmed with third-party Dune Analytics data.
- Had Pump.fun changed their internal data structure? A quick look at the database ruled this possibility out. The data structure was correct and in order. Nothing had changed.
- Were there any other bugs related to the running system? No, all the error logs for training, inference, and data storage were fine and consistent with previous time frames.
- Had the “Initial Buy” distribution changed? Was this getting filtered incorrectly at the start of the data pipeline?
- Did the “predictable devs” which the model had previously identified (perhaps due to them using a programmatic approach) move to Tron?
- Did the model performance collapse because the window of time from which the observations were taken and trained against straddled one or more change points?
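The last hypothesis, a training window straddling one or more change points, can be checked mechanically. The following is a minimal sketch, not the pipeline used in this analysis: the change-point dates are the ones named in this report (times assumed to be midnight UTC), while the example window and the function name `straddled_change_points` are hypothetical.

```python
from datetime import datetime

# Change points named in this report; midnight UTC is an assumption.
CHANGE_POINTS = {
    "SunPump launch": datetime(2024, 8, 9),
    "Pump.fun bans India": datetime(2024, 8, 14),
    "SunPump flips Pump.fun": datetime(2024, 8, 21),
}


def straddled_change_points(window_start, window_end, change_points=CHANGE_POINTS):
    """Return the names of change points that fall strictly inside the window."""
    return [
        name
        for name, ts in change_points.items()
        if window_start < ts < window_end
    ]


# Hypothetical training window, for illustration only.
window = (datetime(2024, 8, 12), datetime(2024, 8, 23))
print(straddled_change_points(*window))
```

A window that returns an empty list was drawn entirely from one regime; a non-empty list flags training data that mixes behaviour from before and after an event.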
The chart below shows the initial buy by the dev. This was also examined to rule out any significant changes in buy distribution which may have led to predictable devs being filtered out in error.
Analysis of Model Accuracy over Time
Let’s look at the actual daily buy signals from live data for the King of the Hill model. The chart below shows true positives (TP) and false positives (FP) over time.
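For readers who want a single number per day, TP and FP counts reduce to a hit ratio (precision), TP / (TP + FP). The sketch below is illustrative only: the dates and counts are made up, and `daily_precision` is a hypothetical helper, not part of the production system.

```python
def daily_precision(counts):
    """Map each day to TP / (TP + FP); None when the day had no buy signals."""
    result = {}
    for day, (tp, fp) in counts.items():
        signals = tp + fp
        result[day] = tp / signals if signals else None
    return result


# Illustrative (assumed) counts: day -> (true positives, false positives).
counts = {
    "2024-08-25": (3, 1),
    "2024-08-26": (0, 0),  # a day with zero buy signals
    "2024-08-27": (1, 1),
}

for day, precision in daily_precision(counts).items():
    print(day, "no signals" if precision is None else f"{precision:.0%}")
```

Days with zero signals are reported as `None` rather than 0%, since a hit ratio is undefined when nothing was predicted.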
Using the current optimised hyperparameters, 50K-observation slices of training data, stepping backwards in time in 24-hour intervals, were trained and tested against, with TP and FP results plotted.
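One way to build such slices is sketched below, under stated assumptions: observations are sorted by timestamp, each slice holds the most recent N observations at or before a cutoff, and the cutoff steps back 24 hours at a time. The function name `backward_slices` and the toy data are hypothetical, and the train/test split within each slice is not specified in this report, so it is left out.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def backward_slices(timestamps, slice_size, steps, step=timedelta(hours=24)):
    """Yield (cutoff, start_idx, end_idx) for slices ending at earlier and earlier cutoffs.

    `timestamps` must be sorted ascending. Each slice covers the `slice_size`
    most recent observations at or before its cutoff; slices that cannot be
    filled completely are skipped.
    """
    cutoff = timestamps[-1]
    for _ in range(steps):
        end = bisect_right(timestamps, cutoff)
        start = end - slice_size
        if start >= 0:
            yield cutoff, start, end
        cutoff -= step


# Toy data (assumed): one observation per hour for four days, 48-row slices.
t0 = datetime(2024, 8, 20)
timestamps = [t0 + timedelta(hours=h) for h in range(96)]
slices = list(backward_slices(timestamps, slice_size=48, steps=3))

for cutoff, start, end in slices:
    print(cutoff, start, end)
```

In the analysis described here `slice_size` would be 50,000 (and later 100,000); each `(start, end)` pair indexes the rows used for one train-and-test run.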
50,000 observation training set.
Now for a cumulative plot, because bullish!
The above analysis tested a reduced training dataset. The number of observations was reduced from 150,000 to 50,000 in order to rule out the possibility that the model was fitted to identify “predictable devs” who had moved over to Tron.
The time range of each 50,000-observation slice, with respect to the annotated change points, can be seen in the following plot.
Time Span
100,000 observation training set.
Cumulative
Time Span
Temporal Analysis
The chart below shows the average daily Pump.fun launch times for the previous 4 weeks in 7-day windows.
Indian Dev Change Point
Pump.fun banned India on 14th August, presumably in response to the week-to-week launch time distribution anomaly, which shows a dramatic increase in launches between 9am and 11am UTC (9am UTC is 2.30pm in India).
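The time-of-day reasoning can be sketched in a few lines. India Standard Time is UTC+05:30, so 09:00 UTC corresponds to 14:30 in India. The helper `share_in_utc_hours` and the toy launch times are hypothetical and only illustrate how the share of launches in the 9am to 11am UTC band could be measured for a window of launches.

```python
from datetime import datetime, timedelta, timezone

# India Standard Time is a fixed offset of UTC+05:30 (no daylight saving).
IST = timezone(timedelta(hours=5, minutes=30), "IST")


def to_ist(ts_utc):
    """Convert a timezone-aware datetime to India Standard Time."""
    return ts_utc.astimezone(IST)


def share_in_utc_hours(launch_times, start_hour=9, end_hour=11):
    """Fraction of launches whose UTC hour lies in [start_hour, end_hour)."""
    if not launch_times:
        return 0.0
    hits = sum(
        start_hour <= ts.astimezone(timezone.utc).hour < end_hour
        for ts in launch_times
    )
    return hits / len(launch_times)


nine_utc = datetime(2024, 8, 14, 9, 0, tzinfo=timezone.utc)
print(to_ist(nine_utc).strftime("%H:%M %Z"))  # 14:30 IST

# Toy launch times (assumed): UTC hours 8, 9, 10, 10, 11 and 15 on one day.
launches = [
    datetime(2024, 8, 13, h, 0, tzinfo=timezone.utc) for h in (8, 9, 10, 10, 11, 15)
]
print(share_in_utc_hours(launches))  # 0.5
```

Computing this share per 7-day window would show whether the 9am to 11am UTC band grew or shrank around the ban.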
The model’s strength has historically been in accurately identifying scam devs by their seemingly programmatic predictability, leading to a high volume of accurate predictions.
The analysis of model accuracy over time shows a notable recovery in both signal volume (relative to the total number of Pump.fun launches) and model recall accuracy. Over the next 3 weeks, we will extend the training data window back to 150,000 observations in such a way that it does not straddle the significant change points, with a view to improving the hit ratio.
It’s worth keeping in mind that the events and change points identified in this analysis can cause the model to be brittle. Any similar changes in future should be closely monitored.
As the model is recovering, auto-buy and auto-sell will now be implemented. This was put on hold for obvious reasons.
Follow me on X: @Eth_Moon_