Artificial Intelligence thread

tokenanalyst

Brigadier
Registered Member

Global-batch load balance: almost free lunch to improve your MoE LLM training.

Background

The Mixture-of-Experts (MoE) architecture has become a popular technique for scaling up model parameters. Typically, one MoE layer consists of a router (often parameterized as a single Linear layer) and a group of experts (for transformer-based models, each expert is one feedforward layer). Given an input, only a subset of experts is activated, and their outputs are then aggregated according to the scores the router assigned. Specifically,
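A minimal sketch of the routing described above, not the formulas from the original post. It assumes a softmax router, top-k selection, and aggregation weighted by the raw softmax scores; the names `moe_layer`, `router_w`, and `top_k` are made up for illustration, and each toy expert is a single matrix standing in for a feedforward layer.

```python
import numpy as np

def moe_layer(x, router_w, experts, top_k=2):
    """Route one token vector x through a toy MoE layer.

    x:        (d,) input vector
    router_w: (num_experts, d) weights of the single linear router
    experts:  list of callables, each mapping (d,) -> (d,)
    """
    logits = router_w @ x                      # one score per expert
    scores = np.exp(logits - logits.max())
    scores /= scores.sum()                     # softmax over all experts
    chosen = np.argsort(scores)[-top_k:]       # indices of the top_k experts
    # Weighted sum of the chosen experts' outputs. Assumption: the weights
    # are the raw softmax scores, not renormalized over the chosen subset.
    out = sum(scores[i] * experts[i](x) for i in chosen)
    return out, chosen, scores

rng = np.random.default_rng(0)
d, n = 4, 8
router_w = rng.normal(size=(n, d))
mats = [rng.normal(size=(d, d)) for _ in range(n)]
experts = [lambda v, m=m: m @ v for m in mats]   # toy linear "experts"
x = rng.normal(size=d)
out, chosen, scores = moe_layer(x, router_w, experts)
```

Only the `top_k` chosen experts are ever evaluated, which is where the compute saving comes from: the other experts' parameters exist but are skipped for this token.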


tphuang

Lieutenant General
Staff member
Super Moderator
VIP Professional
Registered Member
btw, I tried the search function of the DeepSeek app. It basically works like RAG: it fetches 30+ pages and then adds them to your prompt, and the results are super good. I think I'm going to just use this for any search request now. Who needs Perplexity?
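For anyone wondering what "fetch pages, then add them to the prompt" looks like, here is a toy sketch. It is not how the DeepSeek app actually works (that isn't public in this thread); the function name `build_augmented_prompt`, the word-overlap ranking, and the prompt layout are all assumptions for illustration.

```python
def build_augmented_prompt(question, pages, max_pages=30, max_chars=500):
    """Prepend the most relevant fetched pages to the user's question."""
    q_words = set(question.lower().split())

    def overlap(page):
        # Toy relevance score: number of question words found in the page.
        return len(q_words & set(page.lower().split()))

    # Keep the best-scoring pages and truncate each one to max_chars.
    ranked = sorted(pages, key=overlap, reverse=True)[:max_pages]
    context = "\n\n".join(
        f"[{i + 1}] {page[:max_chars]}" for i, page in enumerate(ranked)
    )
    return f"Sources:\n{context}\n\nQuestion: {question}"

pages = [
    "Ollama runs language models locally.",
    "A recipe for tomato soup.",
    "DeepSeek R1 is a reasoning model.",
]
prompt = build_augmented_prompt("what is deepseek r1", pages, max_pages=2)
print(prompt)
```

A real search pipeline would rank with a search engine or embeddings rather than word overlap, but the shape is the same: retrieve, trim, and put the text in front of the question.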

I got DeepSeek R1 working on my local machine. It's just a regular desktop with an RTX 3090. It's super easy and amazingly good for something that's literally running on a home computer.

And you can too! It only took 30 minutes to install.

Instructions:
1. Download Ollama and Docker.

2. Figure out how much VRAM your graphics card has, and open a command prompt and type one of these:

For 12GB to 16GB GPUs
ollama run deepseek-r1:14b
(14 billion parameters is a bit limiting, so this version won't be great)

For 24GB GPUs, or if you have 2 16GB GPUs
ollama run deepseek-r1:32b
(This is when the model starts getting amazingly good for a locally hosted one. I recommend getting this tier if you can)

For 2 24GB GPUs
ollama run deepseek-r1:70b
Are these all distilled Qwen versions?

What can I run with my MacBook Pro from 2024?

Well, according to DeepSeek:

Recommendations:

  • For Most Users: If your MacBook Pro 2024 has 16GB of RAM, the DeepSeek R1 Distill Llama 8B is the best choice, offering a good balance of performance and resource efficiency.
Maybe I will try that.
 

siegecrossbow

General
Staff member
Super Moderator
I got DeepSeek R1 working on my local machine. It's just a regular desktop with an RTX 3090. It's super easy and amazingly good for something that's literally running on a home computer.

And you can too! It only took 30 minutes to install.

Instructions:
1. Download Ollama and Docker.

2. Figure out how much VRAM your graphics card has, and open a command prompt and type one of these:

For 12GB to 16GB GPUs
ollama run deepseek-r1:14b
(14 billion parameters is a bit limiting, so this version won't be great)

For 24GB GPUs, or if you have 2 16GB GPUs
ollama run deepseek-r1:32b
(This is when the model starts getting amazingly good for a locally hosted one. I recommend getting this tier if you can)

For 2 24GB GPUs
ollama run deepseek-r1:70b
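
The VRAM tiers above can be written down as a small lookup. This is only a sketch of the post's rule of thumb: the thresholds come straight from the list above, while the helper name `pick_deepseek_r1_tag` and the idea of simply summing VRAM across GPUs are assumptions, not anything Ollama provides.

```python
def pick_deepseek_r1_tag(gpu_vram_gb):
    """Map a list of per-GPU VRAM sizes (in GB) to the tag suggested above."""
    total = sum(gpu_vram_gb)   # assumption: VRAM adds up across GPUs
    if total >= 48:            # e.g. two 24GB GPUs
        return "deepseek-r1:70b"
    if total >= 24:            # one 24GB GPU, or two 16GB GPUs
        return "deepseek-r1:32b"
    if total >= 12:            # 12GB to 16GB
        return "deepseek-r1:14b"
    return None                # the post gives no recommendation below 12GB

for gpus in ([12], [24], [16, 16], [24, 24]):
    print(gpus, "->", "ollama run " + pick_deepseek_r1_tag(gpus))
```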

After Ollama downloads the model, install Open WebUI in Docker by typing this in the command prompt:


Finally, open Open WebUI in your browser, and you now have your own LLM! In my testing with the 32B-parameter version, it's at least as good as ChatGPT 3.5. Absolutely amazing - same performance using just 5% of ChatGPT 3.5's hardware.


Example output:

Input:
how would a modern amphibious assault of the same scale differ from D-Day? how would it be similar?

Output:
deepseek-r1:32b

<think> Okay, so I need to figure out how a modern amphibious assault on the scale of D-Day would differ and also what similarities there would be. Hmm, D-Day was June 6, 1944, right? The Normandy landings with all those troops and equipment being landed by sea. Let me think about what's changed since then.
First off, technology has advanced a lot. Back in WWII, they didn't have things like GPS or satellites. So navigation must have been tricky back then. Now, modern navies probably use GPS to pinpoint landing zones accurately. That should make landings more precise and maybe reduce casualties because they can get where they need to be without getting lost.

Then there's the issue of detection. During D-Day, the Allies had to avoid German radar and submarines. But now, with satellites and advanced surveillance, it's easier to spot enemy movements beforehand. Maybe modern forces could even disrupt enemy communications or use cyber warfare to confuse them before the assault begins.

Air support has also evolved. In WWII, planes were slower and less accurate. Now, we have fighter jets that can provide close air support more effectively. Plus, precision-guided munitions mean they can hit targets without causing as much collateral damage. That should help in softening up defenses before the troops land.

Another thing is the use of helicopters. During D-Day, there were no helicopters, so troops had to come in by sea or gliders. Helicopters allow for more flexible insertions, like landing behind enemy lines or reinforcing areas quickly once the initial assault is underway. That adds another layer of strategic depth.

Logistics must be different too. Moving such a large force requires advanced planning. Modern militaries use computer systems to coordinate ships, supplies, and troop movements in real-time. This should make logistics more efficient and reduce the chances of supply shortages or delays.

Defensive measures have also changed. In 1944, beaches were defended with concrete bunkers and machine gun nests. Today, enemies might use mobile missile systems, anti-ship missiles, and maybe even submarines equipped with torpedoes. So modern forces would need better protection against these threats.

In terms of armor and vehicles, today's tanks are more advanced with better armor and firepower. They can move faster and are harder to detect thanks to things like stealth technology or thermal imaging. That makes them more effective in breaking through enemy lines once they've landed.

Cyber warfare is a new factor that didn't exist back then. Modern forces could use cyber attacks to disable enemy communication networks, disrupt command structures, or even take control of their radar systems. This would give the assaulting troops a significant advantage before and during the landing.

Special Forces operations are another difference. During D-Day, there wasn’t the same level of specialized units that we have today. These teams can conduct pre-assault missions to gather intelligence, sabotage enemy infrastructure, or eliminate key targets, making the main assault smoother.

Amphibious vehicles have advanced too. Back then, landing crafts were more basic. Now, they have things like hovercraft and advanced landing craft that can handle different terrains and are harder for enemies to target because of their speed and maneuverability.

But despite all these changes, there are some similarities. The basic concept remains the same: land a large force on enemy-held beaches to establish a foothold. The element of surprise is still crucial, just like it was in 1944. Without catching the enemy off guard, the assault becomes much harder.

Coordination among different branches is another similarity. D-Day involved navies, air forces, and armies working together seamlessly. Modern operations would require even more coordination since they involve cyber and space components as well. It's a complex dance that needs meticulous planning.

Training is still essential. Just like in WWII, modern troops need to practice amphibious landings extensively to handle the chaos and stress of combat. Realistic simulations help them prepare for unexpected situations.

The strategic importance of such an assault hasn't changed either. D-Day was pivotal in turning the tide of WWII in Europe. A modern D-Day would likely be just as significant, maybe even more so given today's geopolitical landscape.

So putting it all together, a modern amphibious assault would have better technology, more advanced logistics, new threats to face, and incorporate things like cyber warfare and special forces. But at its core, the goal of establishing a beachhead with coordinated forces remains similar to D-Day. </think>

Modern Amphibious Assault: Differences and Similarities Compared to D-Day

Differences:


  1. Technology and Navigation:
    • Utilizes GPS and satellite technology for precise navigation, enhancing accuracy and reducing casualties.
  2. Surveillance and Detection:
    • Advanced satellites and surveillance systems allow early detection of enemy movements.
    • Cyber warfare and communication disruption can confuse enemy defenses before the assault.
  3. Air Support and Precision Munitions:
    • Modern fighter jets provide effective close air support with precision-guided munitions, minimizing collateral damage.
  4. Helicopter Insertions:
    • Flexible use of helicopters for strategic insertions behind enemy lines or rapid reinforcements.
  5. Logistics Coordination:
    • Computer systems manage real-time coordination of ships, supplies, and troop movements for efficient logistics.
  6. Defensive Measures:
    • Counter advanced threats like mobile missile systems, anti-ship missiles, and submarines with improved protection strategies.
  7. Armor and Vehicles:
    • Advanced tanks with stealth technology, thermal imaging, and enhanced armor provide better mobility and effectiveness.
  8. Cyber Warfare:
    • Exploitation of cyber capabilities to disrupt enemy communications and command structures.
  9. Special Forces Operations:
    • Deployment of specialized units for pre-assault missions, including intelligence gathering and sabotage.
  10. Amphibious Vehicles:
    • Use of advanced landing craft like hovercraft for better terrain handling and increased survivability.

Similarities:

  1. Objective of Establishing a Foothold:
    • The core goal remains to land a large force on enemy beaches to establish a beachhead.
  2. Element of Surprise:
    • Crucial for success, as surprise tactics were pivotal in D-Day and remain so today.
  3. Inter-Branch Coordination:
    • Seamless integration of navy, air force, army, cyber, and space components is essential for mission success.
  4. Extensive Training:
    • Troops undergo rigorous simulations to prepare for combat stress and chaos.
  5. Strategic Importance:
    • The assault holds significant strategic value, much like D-Day, influencing broader geopolitical outcomes.

In summary, a modern amphibious assault would leverage advanced technology, sophisticated logistics, and new operational strategies while maintaining the fundamental objectives and coordination seen in D-Day.

Someone asked how to design a sixth-gen fighter aircraft for the PLAAF (keep in mind that the data cuts off in 2023), and it suggested either two TBCC engines or fifth-gen turbofans like the WS-15…
 