Will I get LLaMA (v1) to run on Modal.com using tinygrad by end of market?
Resolved YES (Jul 25)

I am trying to get:

  • LLaMA (the large language model), 7B parameters

  • running via tinygrad (a PyTorch alternative by George Hotz with fewer operation types)

  • on modal.com (a machine-learning-oriented cloud)

The repo for tinygrad already has the code to run LLaMA in llama.py, so this is a "why the hell ain't it working" sort of thing, rather than a coding exercise.

It runs, but produces garbage output no matter which options I try — including switching between CPU/GPU/CUDA backends and adding lots of debugging print statements. I have verified the weights against their checksums.
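For reference, the LLaMA weight releases ship with a `tokenizer_checklist.chk` file in the standard `md5sum` format (`<md5>  <filename>` per line). A minimal sketch of the kind of checksum verification described above — the function names here are my own, not from tinygrad:

```python
import hashlib
from pathlib import Path

def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 of a file in chunks, so multi-GB weight files
    never have to fit in memory at once."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_checklist(checklist_path):
    """Compare each '<md5>  <filename>' line of an md5sum-style checklist
    against the files next to it; return {filename: matches?}."""
    base = Path(checklist_path).parent
    results = {}
    for line in Path(checklist_path).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split()
        results[name] = md5sum(base / name) == expected
    return results
```

Running `verify_checklist("llama/7B/checklist.chk")` (path is illustrative) flags any file whose on-disk hash disagrees with the release — which, as it turned out later, is exactly the kind of check that would have caught the corrupted tokenizer files.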

Other information: people on Discord have been helping me and say it works for them. I don't currently have the resources to test it on a local machine, even CPU-only.



Working now; the issue was I had a corrupted tokenizer.model/tokenizer_checklist.chk for some reason. I copied over fresh ones and it worked.
