Artificial Intelligence thread

9dashline

Captain
Registered Member
Are you doing that for coding? For me, Kimi 2.6 + Kimi Code in VS Code already does everything I need it to do. I basically just approve changes. I don't let it get more automated than that.
Yup, the only things Opus 4.7 is actually better at aren't coding but emotional intelligence, creative writing, and interpersonal "calibration".
 

magmunta

Junior Member
Registered Member
I use ChatGPT 5.5, the $200/month version, and its Pro mode takes several minutes to answer simple questions that free Kimi answers in less than a minute. That's kind of a downside of GPT 5.5, because Pro always thinks a lot even when there's no need for it. For example, when you ask both to find certain answers based on data, free Kimi, at least in my case, was faster than GPT 5.5. Of course, I am not saying Kimi is better than GPT 5.5.
 

Eventine

Senior Member
Registered Member
Opus 4.7 was not that well received, I guess.

Nonetheless, the next wave of Western frontier models should be arriving soon. GPT 5.5 was the first to break the "Opus 4.6/4.7" hold, while Google is soon to release Gemini 3.5 Pro, which should be GPT 5.5+ level. Anthropic is expected to respond with 4.8 and Mythos 1 to maintain its agentic lead.

The real question is how much it's going to cost. Western AI labs have enough compute to hyperscale their models, but do they have sufficiently cheap compute to be cost-effective? That matters more with each iteration, since every new generation of models costs more in inference.
 

iewgnem

Captain
Registered Member
I'm currently using a combination of Codex (GPT 5.5) and Claude Code (DeepSeek V4). Simple preliminary tasks like information gathering are handled by V4, while GPT 5.5 defines user needs and develops a plan.

Once the plan is finalized, I only need to use /goal to let DeepSeek V4 Pro complete it. Because the plan is already defined, it's less likely to go astray. A task that would normally require the full 5-hour CodexPlus limit can be completed with only 20% of the limit plus V4 Pro. It takes several times as long, but costs less than a dollar.

And this data will definitely benefit the training of V4.1. It will no longer be free-roaming without a harness, nor will it be a distillation of GPT-5.5. It will be completed step by step under the guidance of GPT-5.5's framework. There will be errors in the process, but it will not go astray.
I just use Kimi if the task requires working on multiple code bases together, and DS if it's a single code base.
I use Kimi CLI for Kimi and OpenCode for DS (OpenCode doesn't play well with Kimi 2.6).
Not a single problem so far, and not a single thought about usage limits. The idea of rationing how many tokens I'm using is frankly insane at this point.

I mean, once you get to unlimited frontier model use, everything just changes. I don't even bother doing git commits myself anymore.
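
For anyone who wants to script the split described above (Kimi via Kimi CLI for multi-code-base tasks, DS via OpenCode for single-code-base ones), here is a minimal sketch. The `Route` type and the `pick_route` helper are hypothetical names made up for illustration; they are not part of Kimi CLI or OpenCode, and the rule is only a rough reading of the post, not a tested setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical pairing of a model with the tool used to drive it."""
    model: str
    tool: str


def pick_route(num_codebases: int) -> Route:
    """Pick a model/tool pair from how many code bases the task touches.

    Assumed rule, taken from the post: more than one code base goes to
    Kimi through Kimi CLI, a single code base goes to DeepSeek through
    OpenCode.
    """
    if num_codebases < 1:
        raise ValueError("a task must touch at least one code base")
    if num_codebases > 1:
        return Route(model="Kimi", tool="Kimi CLI")
    return Route(model="DeepSeek", tool="OpenCode")


if __name__ == "__main__":
    print(pick_route(1))  # single code base
    print(pick_route(3))  # several code bases
```

The rule is kept as one small function so the threshold is easy to change if your own experience with either tool differs.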
 

meedicx

Junior Member
Registered Member

Kimi 2.6 is just so good and well versed. I don't understand why people don't use this instead of Opus.

From personal experience and reading a lot of reviews from other devs, Opus/GPT 5.5 feels more useful the less you know about the code and the higher-level the prompt. If you understand the structure of your code and could code the feature yourself, Kimi K2.6 / DeepSeek V4 Pro is more than enough. At the prompt level, this is the difference between describing high-level feature details and referring to class/variable names directly.

However, the downside of not having "cognitive ownership" of your code is now being discussed. Fully relying on Claude to do everything while not understanding your code often leads to an unsustainable code base, where even small changes can cause regressions in random places and you don't know why, since you don't understand your code.

Claude/GPT also perform better in more obscure frameworks/languages, like game programming in Godot, due to more training data, whereas DeepSeek/Kimi fall back on raw reasoning, using many more tokens with less accuracy. For enterprises with lots of code, this actually suggests the optimal future setup would be to CPT/RL DeepSeek Flash on their own code base, making it perform much better than even the largest models.
 

iewgnem

Captain
Registered Member
The price difference is so massive that the only way to feel Opus/GPT is more useful than Kimi is if you're not paying the bill, so the feeling doesn't include the pain of spending money or rationing token use.

Cognitive ownership has different levels. I have projects that are now fully AI-written, but I still designed them, because I told it how they should be structured. I also have legacy projects that AI has taken over. Honestly, I haven't felt any difference in Kimi's performance between the two.

But I can't speak for projects that are a total black box to me. Maybe Opus does better, maybe not, but frankly those are also not projects that can do anything useful other than benchmarks. At the end of the day, if you have no idea how the code works, you're still not gonna build anything useful regardless.

IMO the real productivity difference is the ability to use frontier models for the most mundane things without a second thought, while having high confidence they will do the job correctly. I'm now at a point where I'd tell Kimi to copy files between folders because I can't be bothered to navigate, or make system config changes because otherwise I'd have to open the AppData folder, or install a program because I'm too lazy to go to the website and click on the link. IMO it's the combination of cost and trust that really feels like the biggest change, and I can't imagine doing the same thing if I had to think about token usage.
 