IcaroRibeiro said:
Regarding training on all code (including shit code): there are metrics applied during the training phase to score the model's output. Even if there is lots of trash code in the training data, the learning algorithm will strongly punish low-quality answers (in this case, shit code) and give good scores to high-quality answers (high-quality code).

I strongly recommend using the Premium version of Claude. It has the ability to understand contextual project patterns and generate new code following the same principles, even if you need to explicitly tell it to follow the coding standards already present in the source code. The rate of actual coding errors is minimal for almost everything outside front-end development closely tied to HTML rendering and user interaction (among other more niche and esoteric applications). For automation, backend development, machine learning, and anything strongly logic-based, it's much more efficient and fast, and it produces far fewer bugs than humans.

I'm using Gemini 3 in my current project (mostly data engineering code), and it's absurd how quickly you can write shit code to teach Gemini what business logic you actually want, and it will turn your slop into bug-free, production-ready code instantly. The team has been using it for 5 months, and the number of bugs related to actual coding has dropped to literally zero. Unit testing has been rendered completely useless, because the AI already covers edge cases and side effects. All bugs now come from badly defined (or badly understood) business logic, or from problems in the input data. Really a game changer.
We're working with a relatively large codebase, much of which could be called legacy. We're still building on much of that legacy, because plenty of other tech debt is prioritized higher for fixing while we still need to keep developing the system, and AI is simply unreliable here. It's not necessarily bad, but it can't understand the codebase the way humans can, so it makes dumb mistakes because of that, and from time to time it also makes dumb mistakes unrelated to that. I'm sure it would do better in a smaller project, or one built to higher standards from the ground up, but unfortunately ours isn't.
For what we're doing, the risk of bugs resulting from poor AI code is simply such that basically everything the AI does needs to be reviewed, preferably carefully, lest stupid mistakes slip through. That makes it slow work; maybe still somewhat faster than doing it manually, but reviewing that much code is not fun.
We're using paid versions of several models, I believe, including Claude. I think the models available to us are top-notch, at least among what's on the market right now.
I'm glad it's working out for you, but for us there's still a long way to go before AI can really be a game-changer. Right now, I feel like it's more of a minor convenience at best.