the 1000x before bit has quite a few side effects to it as well.
lesser-used languages suffer because there isn’t enough training data. this gets annoying quickly when the model’s suggestions override your static tools and turn out to be nonsense.
larger training sets contain more vulnerabilities, since most code is pretty terrible and a lot of it is just snippets that someone used once and threw away. OWASP has a top 10 for a reason. take input validation, for example: if I’m parsing a string there’s usually context, such as whether the data is trusted or untrusted. if I don’t have that mental model of where the data comes from, I might see generated code and think it looks correct when in reality it’s extremely dangerous.
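to make the trusted vs untrusted point concrete, here’s a rough python sketch. the names (`BASE_DIR`, `resolve_naive`, `resolve_untrusted`) and the upload directory are made up for illustration, not taken from any real codebase. the first function is the kind of thing that looks correct in a suggestion and is fine for trusted input; the second is one possible way to handle the same string if you assume it’s attacker-controlled.

```python
from pathlib import Path

# hypothetical directory that user-supplied file names should stay inside
BASE_DIR = Path("/srv/uploads")


def resolve_naive(name: str) -> Path:
    # reads fine at a glance, and is okay if `name` is trusted.
    # with untrusted input, "../../etc/passwd" or an absolute path
    # like "/etc/passwd" walks straight out of BASE_DIR.
    return BASE_DIR / name


def resolve_untrusted(name: str) -> Path:
    # assume `name` is attacker-controlled: normalise the path first,
    # then check that the result is still located under BASE_DIR.
    base = BASE_DIR.resolve()
    candidate = (base / name).resolve()
    if base not in candidate.parents:
        raise ValueError(f"path escapes base directory: {name!r}")
    return candidate
```

both versions type-check and pass a happy-path test with `"report.txt"`, which is exactly why you need to know where the string came from before accepting either one.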