Artificial Intelligence thread

tamsen_ikard

Junior Member
Registered Member
There must have been people making the same claims when steam engines were first used in mines and workshops. We are only at the early stages of AI as a productivity tool.
Yes, that's the thing. AI application is in its infancy. It will take decades to actually build out the proper applications and reliability. But the US stock bubble will burst long before that happens.
 

vincent

Grumpy Old Man
Staff member
Moderator - World Affairs
Chatbot AI is a massive bubble that will burst soon. As I have said before, chatbot AI has not replaced a single human yet, and it won't anytime soon, because it simply lacks general competence in every single domain, as well as the social skills to communicate with customers, stakeholders and bosses.

Chatbot AI will make things easier and more efficient for humans. Dumb people will be able to act smart with the help of chatbot AI. But good luck replacing a single worker.
Look up Salesforce's Agentforce. It is a replacement for Indian call centres.
 

Fatty

Junior Member
Registered Member
CS professor complains that DeepSeek is not open enough to be called truly open source.


OK, but they are still sharing more information than most frontier model papers from closed labs. Also, they still need some amount of moat, no? I think they strike a nice balance.
This professor is just being purposefully daft. There are a ton of legal and privacy reasons why the data is not open source. Even if DeepSeek wanted to open-source the data, it likely would not be able to.
 

iewgnem

Junior Member
Registered Member
Feel bad for Kimi. They keep reminding us that their model is out and that it's a great model, but it got completely overshadowed by DeepSeek. It shows that in life a lot is just timing and luck. Hopefully they eventually get the recognition they deserve in the West.

-----

CS professor complains that DeepSeek is not open enough to be called truly open source.


OK, but they are still sharing more information than most frontier model papers from closed labs. Also, they still need some amount of moat, no? I think they strike a nice balance.
The idea that a for-profit company would release its latest and greatest model in its entirety, for free, on day one is only somehow expected because it's so advanced that people can't imagine they're holding anything back.

And the implication if they were...
 

tokenanalyst

Brigadier
Registered Member
My NVIDIA investment is cooked but I’ve never been so happy about it.
Not necessarily. DeepSeek's highly efficient methods could open the door for smaller startups to enter the AI business, which means more GPUs. Many people were abandoning the idea of an AI business because training/fine-tuning costs were becoming prohibitive. The DeepSeek guys proved everyone wrong: there is more optimization to be done. As I pointed out in my previous post, the lack of optimization in the AI landscape over the past few years was strange.
 

OptimusLion

New Member
Registered Member
Qwen has officially launched the open-source Qwen2.5-1M models along with the corresponding inference framework support.

Tongyi Qianwen released two new open-source models this time: Qwen2.5-7B-Instruct-1M and Qwen2.5-14B-Instruct-1M. This is the first time Qwen has extended the context of its open-source models to 1M tokens.

To help developers deploy the Qwen2.5-1M series more efficiently, the Qwen team has fully open-sourced a vLLM-based inference framework with an integrated sparse attention method, which makes the framework 3 to 7 times faster when processing 1M-token inputs.

Key technology:

Training on long sequences demands substantial computing resources, so a progressive length-expansion method is adopted to grow the context length of Qwen2.5-1M from 4K to 256K in multiple stages, starting from an intermediate checkpoint of the pre-trained Qwen2.5 where the context length is 4K.

In the pre-training stage, the context length is gradually increased from 4K to 256K, and the Adjusted Base Frequency (ABF) scheme is used to raise the RoPE base frequency from 10,000 to 10,000,000.
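For anyone wondering what raising the RoPE base actually does, here's a quick sketch (the head dimension of 128 is my own illustrative pick, not a number from the announcement). A larger base slows the rotation rate of each dimension pair, so distant positions stay distinguishable at long context:

```python
import numpy as np

def rope_inv_freq(dim, base):
    # Standard RoPE inverse frequencies for a head dimension `dim`:
    # one rotation rate per (even) dimension pair.
    return 1.0 / (base ** (np.arange(0, dim, 2) / dim))

dim = 128  # illustrative head dimension, not from the post
low = rope_inv_freq(dim, 10_000)        # original base
high = rope_inv_freq(dim, 10_000_000)   # Adjusted Base Frequency

# With the larger base, the slowest-rotating dimension completes far fewer
# radians over a 256K-token window, reducing aliasing of distant positions.
print(low[-1] * 256_000, high[-1] * 256_000)
```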

In the reinforcement learning stage, the model is trained on short texts (up to 8K tokens). After this training, the final Instruct model can handle sequences up to 256K tokens.

At this point the model's context length is still only 256K tokens. To extend it to 1M tokens, a length-extrapolation technique is applied.

As for inference speed, to accelerate the prefill stage the research team introduced a sparse attention mechanism based on MInference.
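The rough idea behind that kind of sparse prefill, as a toy sketch (this is my simplified illustration, not the actual MInference algorithm, which selects per-head sparse patterns much more cleverly): score blocks of keys with a cheap proxy and only attend within the top-scoring blocks.

```python
import numpy as np

def sparse_attention_blocks(q, k, block=64, keep=4):
    # Toy block selection: mean-pool each key block as a cheap relevance
    # proxy against the query, then keep only the top-`keep` blocks.
    # Full attention would touch all n keys; this touches keep*block.
    n = k.shape[0]
    nblocks = n // block
    pooled = k[: nblocks * block].reshape(nblocks, block, -1).mean(axis=1)
    scores = pooled @ q                 # one score per block
    top = np.argsort(scores)[-keep:]    # indices of blocks to attend to
    return sorted(top.tolist())

rng = np.random.default_rng(0)
q = rng.standard_normal(32)
k = rng.standard_normal((1024, 32))
print(sparse_attention_blocks(q, k))  # 4 of 16 block indices attended
```

With 1M-token inputs the savings from skipping most key blocks during prefill is where the claimed 3-7x speedup comes from.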

In addition, several further improvements are proposed: chunked prefill, an integrated length-extrapolation scheme, sparsity refinement, and other optimizations.

Deployment memory requirements:

Qwen2.5-7B-Instruct-1M: requires at least 120GB of GPU memory (total across multiple GPUs).
Qwen2.5-14B-Instruct-1M: requires at least 320GB of GPU memory (total across multiple GPUs).
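For anyone curious where numbers like 120GB come from, here's a back-of-envelope KV-cache estimate. The architecture values below (28 layers, 4 KV heads under GQA, head dim 128, bf16) are my assumptions for Qwen2.5-7B from memory, so check the technical report for the exact figures:

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_elem=2):
    # 2x accounts for storing both keys and values at every layer.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

# Assumed Qwen2.5-7B-style config at a full 1M-token context, bf16 cache.
gb = kv_cache_bytes(28, 4, 128, 1_000_000) / 1e9
print(f"{gb:.1f} GB")  # prints "57.3 GB": KV cache alone, before weights
```

Add roughly 15GB of bf16 weights for a 7B model plus activations and framework overhead, and a 120GB multi-GPU minimum stops looking surprising.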

Model link, technical report and online demo: (URLs require forum login to view)
 

Maikeru

Major
Registered Member
There must have been people making the same claims when steam engines were first used in mines and workshops. We are only at the early stages of AI as a productivity tool.
Some tried to make sure that the new machines wouldn't replace them by throwing their wooden shoes - sabots - into the machinery. Hence "sabotage". Organised bands of saboteurs left notices signed in the name of one "Ned Ludd" - hence, "Luddites".
 

Eventine

Junior Member
Registered Member
AI will be replacing workers pretty soon. In fact it’s already happening as many software companies are implementing de facto hiring freezes and slow downs. Same for translators, legal assistants, copy editors, artists, musicians, technical support, and more.

Just because companies aren’t announcing the replacements doesn’t mean it’s not happening. Across many white collar industries, companies are hiring less, laying off more, and demanding more productivity from the remaining employees (via AI tools). This directly translates to lost jobs.


This is what annihilation looks like. We’re in the middle of an AI software boom, yet software hiring is falling off a cliff; think about why that is. In any other industry boom we’d be seeing an explosion of new hires.

And we’re just getting started, as DeepSeek showed that “reinforcement learning is all you need,” which had been the critical missing piece of the puzzle for continuous model improvement.
 