GPT-4 Gets “Dumber”: Evidence Emerges That It Caches Past Replies, Telling One Joke 800 Times and Ignoring Requests for a New One
Original source: QbitAI (量子位)
A netizen has turned up fresh evidence that GPT-4 has become “dumber”.
He alleges:
OpenAI caches historical responses, letting GPT-4 retell previously generated answers verbatim.
As evidence, even after he turned the model’s temperature up, GPT-4 kept repeating the same “scientists and atoms” joke:
“Why don’t scientists trust atoms? Because they make up everything.”
Worse, leaving the parameters untouched, rewording the prompt, and explicitly stressing that it should tell a new, different joke didn’t help either.
This suggests that GPT-4 not only caches replies but also clusters similar queries, rather than matching each question exactly.
The benefit is self-evident: responses can come back much faster.
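To picture what the netizen is accusing OpenAI of, here is a purely speculative sketch of such a semantic cache: embed each prompt, and if it lands close enough to one already answered, return the stored reply instead of sampling a new one. None of this is OpenAI’s actual implementation; the model names, the 0.9 threshold, and the in-memory list are all illustrative assumptions.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()
cache: list[tuple[np.ndarray, str]] = []  # (normalized embedding, stored reply)

def embed(text: str) -> np.ndarray:
    """Embed a prompt and normalize it for cosine similarity."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    vec = np.array(resp.data[0].embedding)
    return vec / np.linalg.norm(vec)

def answer(prompt: str) -> str:
    q = embed(prompt)
    for cached_q, reply in cache:
        # Near-duplicate prompts ("tell me a joke" vs. "tell a new joke")
        # hit the cache even though they don't match character for character.
        if float(q @ cached_q) > 0.9:
            return reply
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    reply = resp.choices[0].message.content
    cache.append((q, reply))
    return reply
```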
But nobody is happy to pay a premium for a subscription only to be served cache lookups.
And if that is the case, isn’t it unfair that we keep using GPT-4 to grade the answers of other large models?
Previous studies have shown that ChatGPT repeats the same 25 jokes 90% of the time.
Hard Evidence That GPT-4 Replies From a Cache
Beyond GPT-4 ignoring the temperature value, this netizen also found:
Changing the model’s top_p value is useless too; GPT-4 keeps serving the same reply.
(top_p controls how the model samples its output: turn it down for more precise, fact-based answers, and up for more diverse ones.)
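The test the netizen describes is easy to reproduce. Below is a minimal sketch assuming the official openai Python SDK (v1+) and an OPENAI_API_KEY in the environment; the prompt and parameter values are illustrative. If replies were truly sampled, cranking temperature or top_p should yield varied jokes; identical replies across every setting point to caching (or severe mode collapse).

```python
from openai import OpenAI

client = OpenAI()

def ask_joke(temperature: float, top_p: float) -> str:
    """One request with explicit sampling parameters."""
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Tell me a joke."}],
        temperature=temperature,
        top_p=top_p,
    )
    return resp.choices[0].message.content

# Sweep the sampling knobs; with real sampling the jokes should vary.
for t, p in [(0.2, 1.0), (1.0, 1.0), (1.8, 1.0), (1.0, 0.1)]:
    print(f"temperature={t}, top_p={p} -> {ask_joke(t, p)}")
```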
It is worth mentioning that others seem to have observed a similar phenomenon with locally run models.
So the question is: how exactly would a large model cache our chat information?
Some netizens, though, doubt that caching is the explanation at all. Their reasoning: the example the author picked happens to be a joke.
After all, back in June of this year, two German scholars tested ChatGPT and found that of 1,008 responses to a request for a random joke, 90% were variations of the same 25 jokes.
That alone could make it look as if earlier answers were being cached.
Some netizens therefore proposed re-running the test with other types of questions.
The author, however, insists there is no need to switch questions: simply measuring the latency makes it easy to tell whether a reply came from a cache.
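A sketch of that latency check, under the same SDK assumption as above: time several identical requests and compare. A reply served from a cache should return markedly faster than one generated token by token; the exact threshold would have to be calibrated against normal GPT-4 latency.

```python
import time
from openai import OpenAI

client = OpenAI()

def timed_joke() -> tuple[float, str]:
    """Return (latency in seconds, reply) for one identical request."""
    start = time.perf_counter()
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Tell me a joke."}],
    )
    return time.perf_counter() - start, resp.choices[0].message.content

# Identical prompts, repeated: a sudden drop in latency on later runs
# (with an identical reply) would be consistent with a cache hit.
for i in range(5):
    latency, joke = timed_joke()
    print(f"run {i}: {latency:.2f}s | {joke[:60]}")
```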
And what is wrong with GPT-4 telling the same joke all the time?
Haven’t we always stressed that large models should produce consistent, reliable answers? Look how obedient it is being (tongue in cheek).