You cannot run the full DeepSeek model on a Raspberry Pi. It is simply not possible to run a 671-billion-parameter model on anything less than a cluster of workstation GPUs; the weights alone won't fit in memory.
What you're talking about is running the distilled versions of the model, which are significantly weaker than the full R1, and while still impressive, they're more for hobby developers than enterprise users.
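For a sense of scale, here's a rough back-of-envelope memory estimate. The 4-bit quantization assumption and the Pi RAM figures are my own ballpark numbers, not from any official spec sheet:

```python
# Rough weight-memory estimate; ignores KV cache and runtime overhead.
def weight_memory_gb(params_billions: float, bits_per_param: float = 4.0) -> float:
    """Memory needed just to hold the weights, in GB."""
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

print(f"Full R1, 671B @ 4-bit: ~{weight_memory_gb(671):.0f} GB")  # ~336 GB
print(f"Distilled 7B @ 4-bit:  ~{weight_memory_gb(7):.1f} GB")    # ~3.5 GB
# A Raspberry Pi tops out around 8-16 GB of RAM, so only the small
# distilled checkpoints can even be loaded, never mind run at a usable speed.
```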
The real benefit of DeepSeek is the ~20x reduction in enterprise API costs (according to benchmarks; DeepSeek generates more thinking tokens, so it costs a bit more than it might seem from a raw cost-per-token comparison with o1). That reduction is a consequence of several factors, of which cheaper training is only a small contributor.
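To make the thinking-token point concrete, here's a sketch of an effective cost-per-query comparison. The per-million-token prices and token counts below are illustrative placeholders, not actual benchmark figures:

```python
def cost_per_query(price_per_million_tokens: float, output_tokens: int) -> float:
    """Dollar cost for a single request, counting thinking + output tokens."""
    return price_per_million_tokens * output_tokens / 1_000_000

# Hypothetical numbers: an o1-like model at $60/M output tokens producing
# 3k tokens, vs. an R1-like model at $2.20/M producing 5k (more thinking tokens).
o1_like = cost_per_query(60.00, 3_000)
r1_like = cost_per_query(2.20, 5_000)

print(f"o1-like: ${o1_like:.3f} per query")
print(f"R1-like: ${r1_like:.3f} per query")
print(f"Raw cost/token ratio:  {60.00 / 2.20:.0f}x")        # ~27x
print(f"Effective query ratio: {o1_like / r1_like:.0f}x")   # ~16x once extra thinking tokens are counted
```

The exact ratio obviously moves with how many reasoning tokens each model emits on your workload, which is why the effective savings land below the raw cost/token gap.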
It took Meta ~$60 million to train Llama 3.1. But how much money do you think they spent on engineering resources within the company? A team of ~200 researchers/engineers costs a Silicon Valley company >$250 million a year to keep around between compensation and benefits. Add all the GPUs they had to buy to build up infrastructure, the support staff, and the facilities, and we're talking a billion dollars a year for a generative AI team. It is well known that OpenAI pays its researchers $1 million a year, so they're probably spending even more.
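Putting that back-of-envelope math in one place: the training and headcount figures are the ones quoted above, the other buckets are rough placeholders of mine, not anything from an actual budget:

```python
# Annual cost sketch for a frontier-model team, in millions of USD.
# Training and headcount figures are the ones quoted above; the rest
# are illustrative placeholders.
annual_costs_musd = {
    "flagship training run (Llama 3.1-scale)": 60,
    "~200 researchers/engineers":              250,
    "GPU infrastructure (amortized)":          500,   # placeholder
    "support staff, facilities, overhead":     190,   # placeholder
}

total = sum(annual_costs_musd.values())
training_share = annual_costs_musd["flagship training run (Llama 3.1-scale)"] / total

print(f"Total:          ~${total / 1000:.1f}B per year")
print(f"Training share: {training_share:.0%}")  # well under 10% of total spend
```

However you shuffle the placeholder numbers, the headline training run stays a small slice of what it costs to keep the whole operation going.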
That's the thing DeepSeek was able to get around, not just by significantly reducing model training costs, but through superior value per dollar spent, an advantage shared by all Chinese companies relative to the West. The cost to train a model is just a fraction of what it takes to maintain a viable LLM product; it's everything else where the cost difference makes the most impact. That's also why I think that, longer term, China's real advantage won't come from algorithmic innovations, which will be quickly copied, but from structural advantages, which are far more sustainable.