Will my stock trading model exceed 100% CAGR?
39
Ṁ12k
Feb 1
7%
chance

Late last year, I created several markets to assist with the development of my deep learning stock trading model. Thanks in part to Manifold machine learning experts, we have gotten it into live testing. However, as every machine learning engineer knows, and as the time it took OpenAI to get the GPT store online shows, building the infrastructure surrounding the models (like the API calls, the server hardware, and the traditional code that obtains the data) is far harder than training intelligent models.

Full details of the Seregon AI Investing organization: https://shoemakervillage.org/temp/transition_trio.pdf, pages 6-7.

The model is a binary classifier that uses 15+ layers of different types with custom activation functions. It started out as a cryptocurrency trading bot, but we transitioned to stock trading due to the lack of SIPC insurance at cryptocurrency exchanges. It is the best of 1400 models that were trained with different hyperparameters and layer configurations and data processing on a bank of 4090 cards over the course of a year. It was trained on more than 5TB of data related to financial markets from multiple exchanges.

The model has an accuracy of 92% and backtests at 656% CAGR on forward unseen test data when a fee/slippage of 0.15% on each buy and sell is assumed. It exclusively trades S&P 500 stocks. On January 22, 2024, its accuracy was 87% and it exceeded the backtesting (1.19% daily) by earning 1.31% at the time the exchange integration code crashed and failed to sell.

Earnings for January 22, 2024:

ID     Pair        Profit (USD)
-----  --------    --------------
4      ANSS/USD    -0.17% (-0.58)
23     DAL/USD     3.82% (12.78)
3      NWL/USD     3.95% (13.53)
1      BXP/USD     2.30% (7.95)
2      DFS/USD     -1.24% (-3.75)
5      PVH/USD     -1.05% (-2.54)
7      APA/USD     0.39% (1.32)
6      ARE/USD     -0.88% (-2.18)
8      VFC/USD     4.39% (14.91)
29     USB/USD     1.13% (3.76)
27     ALGN/USD    0.35% (0.95)
24     VNO/USD     2.58% (8.64)
21     INTC/USD    -1.47% (-4.97)
22     CFG/USD     1.22% (4.00)
28     UAL/USD     9.37% (32.40)
25     BA/USD      1.03% (2.19)
10     CTLT/USD    0.04% (0.12)
14     ALB/USD     6.09% (14.48)
20     PAYC/USD    2.37% (4.60)
16     ZBRA/USD    0.02% (0.04)
11     FCX/USD     2.16% (7.29)
12     CMA/USD     1.94% (6.18)
19     AMD/USD     -1.26% (-4.26)
13     EXR/USD     0.20% (0.60)
17     CCL/USD     1.58% (5.46)
9      NCLH/USD    2.36% (8.20)
15     ALK/USD     4.63% (14.58)
18     KEY/USD     1.70% (5.76)
26     UAA/USD     2.22% (7.65)
-----  --------    --------------
Total              159.11 USD

When the infrastructure bugs are worked out and the full amount of real money is used, I will add a comment stating the model has started running in production. From that point, the total returns will be calculated.

If this model, or an improved derivative, doubles its starting capital within one year of that date, the market will resolve to YES immediately on the day the money is doubled. Otherwise, it will resolve to NO, including in cases where some unrelated issue, such as troubles with "traditional" software engineering and exchange integration prevents us from running the model for long enough to earn that much money.

The money earned will be calculated after all expenses related to running the model, such as electricity, server costs, data subscriptions, and trade fees, but before taxes, are subtracted. The earnings on the first day indicated above are after all these expenses were deducted.

I will post weekly updates of the model's progress towards its goal of making the market resolve YES. Are its goals realistic and can it achieve them?


UPDATED: The model's "production entry" date has been set to January 29. The deadline for the bot to achieve its goals is January 28, 2025.

Because of how long it takes to move money, we expect that it will be at least three weeks before all the money is able to be moved to the accounts through the slow ACH system. We are also seeking investment and hope to borrow as much money as possible.

Therefore, rather than track absolute gains, I will post daily percentage gains or losses on however much was bought that day and compound them.
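A minimal sketch of the compounding described above, assuming each day's result is expressed as a percentage of whatever was traded that day. The function name and the sample numbers are illustrative, not actual trading data.

```python
from math import prod

def compound_returns(daily_pct):
    """Compound a sequence of daily percentage returns into one total percentage return."""
    growth = prod(1.0 + r / 100.0 for r in daily_pct)
    return (growth - 1.0) * 100.0

# Hypothetical daily results in percent (placeholders, not real results).
example_days = [1.31, -0.40, 0.75]
print(f"Compounded return: {compound_returns(example_days):.3f}%")
```

Because the figure is compounded regardless of the amount traded, it tracks the strategy's percentage performance rather than the dollars actually earned.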


The current adjusted profit (compounded regardless of amount traded) is 17.062% over 45 days the model was active (ending May 31, 2024.) The model was inactive for 65 additional days because a failed motherboard led us to erroneously believe we were receiving poor validation statistics.


During June, the model earned another 4.9%, bringing the total earned by the model to approximately 22%. A three-month rate of 22% continues to exceed the target 12-month gain of 100%; however, it would be useful for bettors to consider the few months when the models were offline due to the failed motherboard, and how that affects the resolution criteria.
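To make the annualization concrete, here is a small sketch that converts a total return over a period into a compounded annual rate. Treating the 22% as a quarter-year result follows the comment above; measuring it against calendar time from the January 29 production date through June 30 is an assumption added here for comparison.

```python
from datetime import date

def annualize(total_return_pct, days, days_per_year=365.0):
    """Convert a total return earned over `days` into a compounded annual rate, in percent."""
    growth = 1.0 + total_return_pct / 100.0
    return (growth ** (days_per_year / days) - 1.0) * 100.0

# 22% treated as a three-month (quarter-year) result.
active_rate = annualize(22.0, 365.0 / 4)

# Assumption: the same 22% measured over calendar time since the
# January 29 production date, through June 30, 2024.
calendar_days = (date(2024, 6, 30) - date(2024, 1, 29)).days
calendar_rate = annualize(22.0, calendar_days)

print(f"Annualized over the active quarter: {active_rate:.1f}%")
print(f"Annualized over {calendar_days} calendar days: {calendar_rate:.1f}%")
```

The two figures differ because the offline months count toward the one-year deadline even though no trading happened during them.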

I thought I should update people on this market.

The models were offline between February and May 3 for troubleshooting. We could not get our models to reproduce with new training, so we took them offline out of caution. It turns out that a bad motherboard was feeding bad data to the GPUs, and that 5 months of work had to be discarded. The simplest things always cause the biggest problems.

Once we got a new motherboard, we were able to validate that the models were always performing as expected. We created an improved model and put it online. It earned $37,509 on $263,000 from May 3 to May 31, for a total of 14.2%. The S&P 500 gained 2.9% during that period. The 14.2% comes after slippage (2.03% of the portfolio) and margin fees (0.05% of the portfolio) were accounted for, but before taxes were paid.

Remember, this market does not consider taxes because I lost $7 million in the BlockFi, Genesis, FTX, and Celsius scams, and will likely never owe Federal taxes again.

We already have a superior strategy based on the same model in paper trading; it is backtested to perform about 8x better than the existing strategy with the same amount of money. Our belief is that we have solved the deep learning part, but that we can dramatically optimize our trading by figuring out how to use the model's outputs better. Our goals this month are:

  1. Increase the amount at stake with the existing strategy

  2. Allow the new strategy to continue paper trading and promote it to production unless bugs are found

  3. Continue backtesting strategies to develop even better strategies using the same model

  4. Maintain slippage at or below 2% of the portfolio per month despite increasing amounts

Our deadlines are:

  • On June 3, we are increasing our leverage to 2x, from 1.5x before, for a total of $550,000 at stake

  • By June 10, we hope to finally get the remainder of our money unstuck from banks to increase to $600,000

  • By June 17, we are targeting the addition of an additional $100,000 (multiplied by 2x) from selling BTC from the BlockFi bankruptcy claim payout

  • By June 30, we hope to deploy the better strategy that has currently begun paper trading

We have gained about 17% in two months of trading, which is well above a rate of 100% CAGR. However, the downtime caused by the bad motherboard, and then waiting for the new equipment to arrive, set us back during the non-trading period and actually allowed 4 months to elapse.

With the new strategy in development, I believe it is much more likely than not that 100% or even 200% will be achieved - however, we are not trading to win this Manifold market. Slippage, not predicting stock prices, has turned out to be the #1 concern. As probably should be apparent to everyone by now, deep learning can solve any problem for which data exists, so prediction has turned out to be a non-issue. The problem is that there is no orderbook data available to train on slippage.

It's possible that (for example) we double our money at stake, and slippage increases 1.9x. That might cause the percentage gains to decline, but still make us more money, which could cause us to win bigger but still result in NO. I'll update these comments every month.
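The scenario above can be worked through with the May figures quoted earlier ($37,509 net on $263,000, slippage of 2.03% and margin fees of 0.05% of the portfolio). This sketch assumes the gross return before slippage stays the same when capital doubles, and it reads "slippage increases 1.9x" as the slippage percentage rising 1.9x; both are assumptions, not statements about how the strategy actually scales.

```python
def net_result(capital, gross_pct, slippage_pct, margin_fee_pct=0.0):
    """Return (net return in percent, net profit in dollars) after slippage and margin fees."""
    net_pct = gross_pct - slippage_pct - margin_fee_pct
    return net_pct, capital * net_pct / 100.0

base_capital = 263_000
slippage_pct, margin_pct = 2.03, 0.05

# Gross return implied by the reported net figure (an inference, not a reported number).
net_pct_may = 37_509 / base_capital * 100.0
gross_pct = net_pct_may + slippage_pct + margin_pct

base = net_result(base_capital, gross_pct, slippage_pct, margin_pct)
scaled = net_result(2 * base_capital, gross_pct, 1.9 * slippage_pct, margin_pct)

print(f"Base:   {base[0]:.2f}% net, ${base[1]:,.0f}")
print(f"Scaled: {scaled[0]:.2f}% net, ${scaled[1]:,.0f}")
```

Under these assumptions the percentage return falls while the dollar profit rises, which is the tension with the market's percentage-based resolution.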

Update:

We have been chasing ghosts for the past few weeks. There was an "error" where the training loss was higher than the testing loss. It turns out that Keras reports training loss to the user as the average of the entire "epoch," whereas testing is the state of the model at the end of training. We aren't using "epochs" because we have far more data than the 4090 cards can handle, and this uncommon situation was apparently not considered by the people who designed Keras.
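The effect can be reproduced without Keras at all. The sketch below uses a purely synthetic loss curve that falls linearly over one long pass through the data, then compares the running average of the batch losses (what a per-epoch training figure reports) with the loss at the end of the pass (what an evaluation of the final model sees). The function name and numbers are illustrative.

```python
def simulate_pass(start_loss=1.0, end_loss=0.5, batches=1000):
    """Return (average batch loss over the pass, loss at the end of the pass)
    for a synthetic model whose per-batch loss falls linearly."""
    step = (start_loss - end_loss) / (batches - 1)
    batch_losses = [start_loss - i * step for i in range(batches)]
    reported_train = sum(batch_losses) / batches  # running mean over all batches
    final_state = batch_losses[-1]                # model state when evaluation runs
    return reported_train, final_state

reported, final = simulate_pass()
print(f"Reported training loss: {reported:.3f}")
print(f"Loss at end of pass:    {final:.3f}")
```

The longer the pass and the faster the model improves within it, the larger the gap, so a training loss above the testing loss is not by itself evidence of a bug.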

During this period, we have been using the existing models with human-aided trading, because we didn't trust them any more due to the above issue. The human-aided trading achieved a gain of a little over 8%. The gain was not higher because we trained inferior models to get around this issue that turned out not to be an issue at all.

Additionally, we are talking to an investor. However, the investor would prefer to have a higher Sharpe ratio and lower profits. That would make this market difficult to resolve.

If the investor gives us ten times as much money as we have now, but wants us to earn, say, 20% of the potential profits so that we have very high Sortino ratios, it would still be in our best interest to take the money. We would get a cut of the profits, and if we took a 10% cut of 10x more money, we would still earn 1.2x what we would have earned if the model was going full-tilt on our own money.
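One way to arrive at the 1.2x figure is sketched here. It assumes our own capital keeps running at full tilt (counted as 1.0) while the investor's capital runs at 20% of potential profit and we keep a 10% cut of it; that reading is an assumption, and the function name is illustrative.

```python
def relative_earnings(investor_multiple, throttle, profit_cut, own_share=1.0):
    """Earnings relative to running only our own capital at full tilt (which equals 1.0)."""
    return own_share + investor_multiple * throttle * profit_cut

# 10x outside money, run at 20% of potential profit, with a 10% cut for us.
print(f"{relative_earnings(10, 0.20, 0.10):.2f}x")
```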

So, even if the model would have earned 100%, having to seek investment from others may result in a perverse incentive not to actually seek that much profit.

It's great to see all the comments here!

I really think though that people are focused on the wrong thing. If this system fails to produce the expected returns, it's most likely to be because we run into data API call limits, can't further parallelize the data ingest, have the broker execute our trades too slowly, or have some other issue with the code surrounding the models.

It could also be because the cost of the auxiliary services, like hosting and data, is so high that it takes an appreciable chunk out of the starting capital when we only have $300,000 to trade with. We're also overloaded with taxes and can't afford to pay an engineer $25,000 to write the code to do our taxes.

Everyone is focused on the strategy itself, which is great - but that's not where we are spending most of our time or what we are worried about. Our training GPUs are actually off temporarily right now. I strongly suspect that people should be using a revenues/expenses analysis here, because whether the strategy would double money by itself is a lot different than whether our expenses and difficulties in surrounding code are so high that they cut the profits from 110% to 70%.
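A minimal sketch of the revenues/expenses view suggested above. The $300,000 capital and the 110% and 70% figures come from the comments; the $120,000 expense total is simply the amount that arithmetic implies, not a reported cost.

```python
def net_return_pct(capital, gross_return_pct, annual_expenses):
    """Net return in percent after subtracting fixed annual expenses from gross profit."""
    gross_profit = capital * gross_return_pct / 100.0
    return (gross_profit - annual_expenses) / capital * 100.0

capital = 300_000
# Hypothetical expense total: the amount that would turn 110% gross into 70% net.
implied_expenses = 120_000
print(f"Net return: {net_return_pct(capital, 110.0, implied_expenses):.1f}%")
```

Fixed costs weigh more heavily on a small account: the same expenses on ten times the capital would take only 4 points off the return.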

I hope you're successful, seems like you've put in quite a bit of effort... but it's also very hard to achieve 100% returns on S&P 500 stocks. (Though, not too hard to have a ~50% chance of doing so, but I assume you're not just going for super high volatility)

bought Ṁ100 YES

I'll calculate the current profitability and update the daily returns later this week.

also yuge limits up

How's the trading been in the last couple of weeks?

My estimate is that we're up about 11% at this point, on track for the targeted 176% CAGR. However, there are two issues right now that we're slammed by.

First, we are suffering from insufficient CPU power. We took the system offline to try to expand from 500 to 5000 stocks, and when we did that, we found our 10-year-old servers were inadequate. We had to order new servers, and it's been a litany of problems since - a bait and switch on eBay, misthreaded screws, needing to buy tools, watercooling blocks not fitting, etc. We will have 56 cores with 512GB of RAM coming online this week.

Second, we are slammed with taxes, and can't concentrate on the business right now, and that's hurting us. Last year, it took me about 230 hours of work to do taxes, and this year my brother is doing them; they have been his full time job since mid-January. Taxes are one of the reasons we have left the cryptocurrency industry - even after I'm done with bitcoins, I still have to account for 202 bitcoins over 10 years that were lost in the Genesis, BlockFi, and FTX scams, and every one of the 1500 instances where those were purchased in 2013 and earlier needs to be listed. We expect the taxes to be about 250 pages this year. Without GPT-4 to write the code to do the taxes, it might have been impossible to reconcile the bitcoin losses.

As I mentioned in an earlier comment, everything outside of the GPUs and models is the hard part - actually creating superintelligent machines at this point in human history is very easy.

We're hoping to get back online within two days expanded to the 10x more stocks. The backtests show that having more stocks to trade approximately triples the expected profit.

@SteveSokolowski neat, thanks for the update! my calculation based on your reported results for 1/29-2/28 below was that you were up 2.7% so far, which extrapolated to a CAGR of 45%

@SteveSokolowski Why are you not just running this on the cloud? Can you get better latency with your own servers?

seriously just use aws or if thats too cloud for you ovh or literally anything else other than physical hardware you actually touch. if you can actually >2x a meaningful amount of money in a year (which, you know, eh) you are wasting your time with watercooling blocks and screws

@jacksonpolack There are a few issues with that.

First, we already have a hosting contract and the servers from the mining pool business, and Comcast would charge a $15,000 termination fee if we cancel it.

Second, AWS is very expensive - it would actually cost $25,000 per year in additional costs for training and inferencing versus buying 4090s.

Third, more importantly, we never lost a single satoshi with this self-hosting method in 11 years, while competitors like NiceHash constantly lost $60m+. That is because no remote hypervisor ever had access to any private key - ever. Now that we've moved onto AI, as soon as the weights go to AWS, we have no idea who can gain access to them. There can be security vulnerabilities, or a crooked employee can copy them from memory. Unlike with the private key, we won't even know that this has happened, and could lose money indefinitely.

Fourth, when you rely on a third party entirely for something like this, you learn that they can discontinue service to you at any time, for no reason at all. I've had more than 30 companies suddenly decide to discontinue service over the past decade, ranging from banks to E-Mail marketers like Constant Contact. I'm still mired in litigation against these companies, most notably with Wells Fargo, who I am likely headed to court with later this month:

/SteveSokolowski/will-wells-fargo-sue-me

We always use open source and avoid third party commercial companies at all costs for this reason. Putting everything in the hands of one company - Amazon - would be an awful decision.

I don't think AWS is pareto optimal, but it's definitely better than worrying about misthreaded screws. I don't think AWS is going to steal your private keys, they have very strong isolation and security. If it's really true that ">100% CAGR" that's just ... a more important thing to spend your time on

@SteveSokolowski Messing with your own servers is much slower and more expensive than moving to AWS.

Hypervisor breaches are so far entirely theoretical, as are crooked Amazon employees stealing data. In all our years of cloud experience, none of that happened. You know what does happen often? People misconfiguring their own servers and networks and causing security issues. Do yourself a favor and switch to a provider that makes it easier for you to maintain a safe environment.

Also, this isn't a bitcoin mining operation. You are not a lucrative stealing target. Even if somebody could copy your model it wouldn't help them unless they can also copy your business model.

Idk what the hell you were doing that caused 30 companies to refuse you service, but as opposed to whatever caused that to happen, algo-trading is entirely legal. You don't need to worry about that.

@jacksonpolack That's only the case if you have a lot of money, and your point would be very important if I still had the $7m that was lost in the BlockFi and Genesis scams.

We now only have $300,000 though. Doubling that is a lot, but after being split three ways and paying 50% in taxes, that yields approximately $50,000 per person per year. Meanwhile, if I can save $25,000 per year in AWS billing, that is after tax spending, so I'm actually saving twice that in investment capital.

In most cases, you should always prefer saving money over trying to earn more, because of how taxes are so high.

When you say "It is the best of 1400 models...", is that based on a validation set and then the 656% result is based on a held-out test set? Or did you train 1400 models and pick the best based on a single hold-out test set without an intermediate validation set?

@Weepinbell The test set is a forward test set. There is 20 years of training data, a gap of several months where the data is discarded, and then five months of test data, which contains both a crash and a recovery.
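The forward split described above (a long training window, a discarded gap, then a later test window) can be sketched as follows. The function, the row format, and the synthetic dates are illustrative assumptions, not the actual pipeline.

```python
from datetime import date, timedelta

def forward_split(rows, train_end, gap_days, test_days):
    """Split (date, value) rows chronologically, discarding a gap after the training window."""
    test_start = train_end + timedelta(days=gap_days)
    test_end = test_start + timedelta(days=test_days)
    train = [r for r in rows if r[0] <= train_end]
    test = [r for r in rows if test_start < r[0] <= test_end]
    return train, test

# Synthetic daily rows, scaled down from the real 20-year / 5-month windows.
rows = [(date(2023, 1, 1) + timedelta(days=i), i) for i in range(400)]
train, test = forward_split(rows, train_end=date(2023, 6, 30), gap_days=90, test_days=150)
print(len(train), "training rows;", len(test), "test rows")
```

The gap keeps windowed features computed near the end of training from overlapping with the first test rows.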

Every time a new better model is identified, it is then promoted to a development environment, where it runs with the same settings as the production model in the Alpaca "paper trading" environment, which still requires the same API calls and has the same delays and slippage, to see how it differs. Every time so far, after some time to ensure there isn't a regression, we have promoted the better model to production and sunsetted the existing model. If a model ever failed to be promoted, we would then reevaluate what has changed in the market to invalidate our testing process.

@SteveSokolowski I think that's a no? In that case you can't realistically expect to get this kind of return.

@SteveSokolowski I agree with Shump, I think right now that lack of a validation set (aside from strong negative priors) is my main object-level reason for skepticism. From my perspective you've just picked the model that got the luckiest on the test set of 1400 models - and of 1400 models, you would expect some models to be very, very lucky! But that doesn't imply anything about future returns.
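The selection effect raised here can be illustrated with a small simulation. It assumes 1400 models with no skill at all (each prediction is a coin flip) scored on one shared test set, then re-scores the winner on fresh data. The sample sizes and seed are arbitrary, and this says nothing about the actual model.

```python
import random

def best_of_n_by_luck(n_models=1400, n_test=200, n_fresh=200, seed=0):
    """Return (best test accuracy among skill-free models, that winner's accuracy on fresh data)."""
    rng = random.Random(seed)
    best_acc = 0.0
    for _ in range(n_models):
        # Each model guesses right with probability 0.5 on every test example.
        acc = sum(rng.random() < 0.5 for _ in range(n_test)) / n_test
        best_acc = max(best_acc, acc)
    # The winner has no real skill, so fresh data is a coin flip again.
    fresh_acc = sum(rng.random() < 0.5 for _ in range(n_fresh)) / n_fresh
    return best_acc, fresh_acc

best, fresh = best_of_n_by_luck()
print(f"Best of 1400 on the shared test set: {best:.3f}")
print(f"Same model on fresh data:            {fresh:.3f}")
```

The gap between the two numbers is pure selection, which is why the winner needs to be scored on data that played no part in picking it.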

@Weepinbell Three comments, one of which might also help @Shump .

First, I'm not sure what you're referring to here specifically, but if you're suggesting that we should have three different sets of data, I intentionally ruled that out. We found that having a third set that we just ran the model against doesn't add any value. It just becomes another set of "training" data that you're fitting for after you use it enough. We found that after the validation data shows the model isn't overfitted, it's better to actually run the model in paper trading for a few weeks. That means we are using brand new, different, data for each model, eliminating fitting to a specific set, and it also (more importantly) tests factors that cannot otherwise be tested, like slippage and market reaction. Outperforming an existing model when both models have never seen the data before is far better than a third test set, so I disagree with Weepinbell here.

Second, in regards to the comments about high returns, hedge funds typically achieve returns you would consider absurd - one has maintained 70% for 30 years, before machine learning. The reason we can do better than it is because that fund is sloshing around tens of billions of dollars, and we have just $300,000. When you hear people talking about fund returns, they are almost always talking about funds that have billions of dollars under management, which can't get into and out of position every day.

Third, the world conditions people to believe that index funds should be a baseline metric because "investment advisors," which manage 90% of the country's stock wealth, aren't actually trying to earn the most money. Their goal is to limit max drawdown and maximize Sharpe (not even Sortino), to an absurd degree. They advertise "safe investment for retirement" and actively advise against high-volatility assets. My advisor up until I started doing this warned me against putting 33% of my net worth in bitcoins in 2013. Then, after I made $300,000 by selling the $7 million those bitcoins would have been worth in bankruptcy claims, she warned me against using it all to buy nVidia, AMD, TSMC, Microsoft, and (most importantly) META stock - but I did it anyway. These stocks aren't actually all that likely to fail, but they aren't "diversified." Like bitcoin, I was told they could lose half their value. I bought META at $170; you can do the math for what happens in the worst case.

Someone mentioned "priors" here. Using "priors" is usually a signal that you are basing your decision upon incomplete or erroneous data that you need to rethink. In this case, it's that index funds are the baseline. The real baseline is spending a week doing research about various industries and buying a portfolio of 9-13 stocks with the best companies, regardless of industry or volatility. That baseline is significantly higher than the "hold index fund" baseline that society is accustomed to.

@SteveSokolowski

  1. Actually you might be right here. What you call a test set is functioning as a validation set in this case, and the paper trading environment is functioning as the test set. I would just report expected CAGR based on the paper trading environment.

  2. Lol, 70% CAGR is not the norm by any means. Maybe there's one that can do that, but that involves an insane combination of luck and skill. Having small AUM doesn't mean you can achieve that. The fact that you think you can get several times that much is revealing.

  3. Investment advisors, mutual funds, individual smart investors, and virtually everyone other than a small elite of hedge funds and extremely successful investors, all cannot beat the market consistently. Just consider how banks and other institutionals don't do that, even though they have all the resources they need, and can allocate at least some money to super risky strategies.

  4. Just because you got lucky twice by making risky investments doesn't mean you're better at investing than your investment manager. You could also have bought bitcoin at the top. If you gamble you sometimes win.

  5. I didn't explicitly say priors, but I did basically say what they were. The fact that you think that you can just beat the market with a week of research convinces me that you are out of your depth.

@Shump But I didn't buy bitcoins at the top. I'm not buying bitcoins now and I'm not buying any more nVidia now either. That's not random guessing and it's not luck. I evaluated the facts on the ground and made a decision based upon those facts. I would never have put so much money into bitcoins if they had been worth more than $600; that was always my plan and I stopped buying at that figure because the risk/reward was too high for that high of an allocation above $600.

I did make 77% without the machine learning models in the stock market from February 1, 2022 to March 13, 2023. Again, it's not luck; I had been following @EliezerYudkowsky my entire life, and saw that what he said was occurring and that a lot of money could be made off of it (even though I disagree the world will end.) It's why I sold the bankruptcy claim in the first place.

I disagree with your statement about how much hedge funds make. I've talked to several people and they told me the Sharpe and Sortino ratios they require to invest, and one of them wants to invest with us. The ratio they require is enormous; far above what is listed on any public Internet sites.

Finally, I didn't get lucky twice. I continually evaluated the markets and decided not to sell the bitcoins, because I determined those crashes would recover and because the fraudulent data provided to me by the lenders indicated not to sell. I spent over 500 hours on this research over the course of that period. I still continue to watch the bitcoin charts to determine when I need to buy puts on the bitcoins that BlockFi stupidly bought with the cash they owe but hasn't paid to me yet.

We can argue over whether this model will be successful, of course, and you would be correct if you said that it is harder to avoid scams than it is to make money in the stock market.

But some of your existing statements here are simply false. I have actually beaten the market, earning a total of $9.8 million in profit, starting out with $50,000, over 11 years. It simply isn't true that I'm "out of my depth," or that I am not better than an investment manager. What metric is an investment manager measured by other than making money? The investment managers I used only doubled the money I gave them over that decade. Those statements are not supported by a reasonable reading of the facts.

bought Ṁ100 YES

Note: I've started to bet on this market, since it is a quantitative resolution.

It's weird to me that people bid down the market to 17%, when even 0.17% per day would double the money in a year and the model is tracking at 0.5%.
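The daily rate needed to double in a year depends on how many compounding days are assumed. This sketch computes the threshold for both calendar days and trading days; the 365 and 252 day counts are assumptions about how the daily figures are compounded.

```python
def daily_return_to_double(days):
    """Daily percentage return that doubles capital after `days` compounding periods."""
    return (2.0 ** (1.0 / days) - 1.0) * 100.0

def growth_multiple(daily_pct, days):
    """Capital multiple after compounding `daily_pct` percent for `days` periods."""
    return (1.0 + daily_pct / 100.0) ** days

for days in (365, 252):
    print(f"{days} days: {daily_return_to_double(days):.3f}% per day needed to double")

print(f"0.5% per day over 252 trading days: {growth_multiple(0.5, 252):.2f}x")
```

With compounding, the threshold works out to about 0.19% per calendar day or about 0.28% per trading day, so a sustained 0.5% per day would clear it comfortably under either count.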

It's interesting that the NO bids come in on days when the model performs poorly. I notice that in BTC markets too. Even if the model is exceeding its backtesting or the charts indicate accumulation, people bet according to the present instead of the future.

@SteveSokolowski when you are a trillionaire, remember that Al Quinn believed in you!

@AlQuinn Alas, I won't be a trillionaire. To be there, I would need to get someone to actually believe me and give me money to trade with. As it stands, I hope to earn a few hundred thousand to make up for the $7m I lost in the FTX-related scams until an AGI codes a better model and mine stops working.

I've commented on that in other forums. It's possible to create narrow superintelligent models with astonishing ease now. I was able to do this in just 3500 man hours of work during one year - with one person working on it. I used four 4090s to train it. Yet, LLM research groups put out papers on their models, and they have charts that are perfect green with perfect recall out to 1M tokens, and you get people saying that the papers are a scam.

I hate to say this, but @EliezerYudkowsky is probably one of the few people who would actually believe that it's possible to solve the stock market with a model right now. I don't share his views on anything else except one thing: that people better WAKE UP, because all this AI stuff is happening now - not in a few years.

@SteveSokolowski I guess where I get suspicious of this working long term is that: 1) others will start doing it and 2) more traditional quant models will also evolve. As other models enter the fray at scale, the patterns and correlations your model is trained on may meaningfully change and impact returns. You could continuously retrain, but this just becomes an arms race and eventually there are no more $20 bills on the ground, except for those with the very best cutting edge approaches.

@SteveSokolowski personally I bet NO because if it seems too good to be true, it's most likely because it is, and that statement is especially true for anything to do with stock market returns.

It also fails the second sanity check for investments. If it's so profitable, why isn't everyone else doing it, therefore eliminating any competitive advantage?

There are plenty of hedge funds that employ the cream of the crop to do similar algo-trading strategies. Did you find some previously unknown breakthrough that allows you to achieve these exceptional returns?

@Shump But it isn't too good to be true. The strategy has made about 12% (updated from 11% yesterday's post.) This is in line with expectations; the backtesting results of 600-1000% were not expected to be achieved because backtesting can't account for the effects of your own buys and sells on the market.

As to why others aren't doing it, one major lesson I learned over the past ten years - and this is for any field in general - is that the bar for a "good job" is extraordinarily low. Two jobs ago, I worked with people who would spend two hours a day chatting in an empty conference room, and another two hours in Scrum meetings, and another two surfing the Internet and talking about the news of the day whenever someone walked by. I'd be surprised if they actually spent two hours writing code.

This is also the case when I hired employees - I found that no matter how "smart" they were, all but one of them would take the easiest possible road on all tasks, and a "full" day of work to them was three hours of actual labor. After all my money was lost in the FTX and Genesis and BlockFi and Celsius scams, my brother and I realized that not only did we have to fire them, we are doing acceptably now with only 15% of the staff we had before and will never hire again.

With Claude 3 plus working 10 hours a day, 7 days a week, I calculated I can write about 13-19 times what the "average" developer accomplished in 2020. For example, between last Sunday and this Wednesday, I replaced an open-source backtesting framework that took 40s to run with one I wrote that takes 0.007s to run, but produces the same results and has more unit tests.

@Shump, when you ask why everyone else isn't doing it, the simple answer is that the vast majority of people just don't work all that hard because there is no reason right now for them to do so when unemployment is at 3%. I worked 360 days in 2023, most of them for at least 9 hours, and analyzed my time and stopped visiting useless sites like reddit and turned off my phone during the day.

I know you will probably see this statement as arrogant, but I stand by it.

bought Ṁ150 NO from 14% to 12%

@SteveSokolowski Just want to start by saying I hope this doesn't feel like a pile-on or anything - while I'm skeptical, I am genuinely curious.

In your pdf doc, the techniques you mention for this stock trading algorithm seem to be things that existed 10+ years ago - LSTMs (which as I understand most have dropped in favor of attention), regularization, dropout, etc. I kind of expected your algorithm to leverage the new hotness somehow (LLMs/transfer learning on top of them/etc), but the techniques you seem to be using are much older. Even if big hedge funds aren't as efficient as you, there are enough of them with enough employees that one would think in the last 10 years many of them would have managed something similar.

If you don't want to go too deep into what you consider your secret sauce that's totally understandable, but I'm curious what you think your technical edge is over other companies that have been using ML for stock trading for the last decade.
