VERY slow GPT-4 requests

In recent days, I’ve observed that GPT-4 requests are experiencing significant slowdowns, with queue lengths reaching up to 89 positions and resolution times stretching into minutes. Often, these requests error out upon reaching position 0 in the queue. This issue primarily occurs in the early morning hours in Europe, which corresponds to post-midnight on the US west coast and 2-4 AM on the US east coast. This leads me to believe that the problem isn’t due to high traffic but possibly due to reduced capacity during these hours. It would be beneficial if the capacity could be adjusted to better accommodate the current high demand from Europe.

[screenshot of the request queue]


I think this limit was meant for Claude, but it has been applied to everything… Please fix it as soon as possible…

Hi there - apologies about the large slow pool. We are working to scale with the increased usage; please feel free to message me here whenever this becomes an issue!


It’s not Europe. It’s Japan, where it’s about noon when the queue becomes overwhelming. They are big proponents of Cursor on Twitter.

It’s been an issue for a while, to the point of being unusable at times. Please fix.

Cursor is unusable every day at this time. Please, fix.


I’d suggest a different model than 500 fast messages a month…

For example, there are companies offering an unlimited number of messages (but with a pretty small context size and a bad autocompletion tool). There are companies with a pretty big context and a limited but still fairly generous number of messages a day…

But as a Cursor Pro subscriber, I always worry that I can’t relax and experiment with edits because messages are being counted, and at the end of the day I’ll be 80th in a queue… I love Cursor and have paid for it for many months, but this problem is driving me nuts.

You can buy more than 500 fast requests per month.

Fair enough.

Also, it’d be great if the ‘unlimited’ aspect were honoured, not with regard to latency but to the ‘large number of slow requests’ error.

I’m confident the guys are working hard to keep everything running smoothly while keeping their financials healthy.

I’m still looking for Cursor updates but for the same money, I see more value in other “places” (and for me, there are better assistants for autocompletion and for chat). And I’m not talking about GitHub copilot which is the most popular but actually pretty meh thing with outdated models.

Don’t get me wrong, I’m trying to be useful for devs because people in my case usually don’t write anything. But I liked Cursor and still wonder if it’s going to be better.

I guess it depends on how much money they are ready to lose and for how long.

For example, Phind was incredible value: 500 Opus/GPT-4 requests per day. Then they downgraded to 100 Opus, and now they are down to 10. In the end, someone is paying the bill; if it isn’t you, it’s only a matter of time before they cut you off.

As long as the new GPT-4 is slightly better than Opus (in general) I don’t see a big problem here.


Note that this ticket doesn’t refer to Claude-3-Opus.

Today, it is unusable:

[screenshots of the request queue]

Here again, the long-requested feature of being able to choose between fast and slow requests would be great. Then you could save fast requests and essentially have high and low priority requests. I still can’t understand why this feature doesn’t exist. I can only imagine it’s because people would buy fewer fast requests and this would impact finances.

But still… I sometimes switch to ChatGPT for smaller, often less important questions to save fast requests. While this saves Cursor requests, it defeats the whole purpose of having an integration…

Please consider this improvement!

Devs have consistently ignored this issue for a while now, I wouldn’t hold my breath for any improvement to be made.


I can. There are peak usage times and quiet times. People would use slow requests during quiet times and fast requests during peak times, which would make things even worse for slow requests during those peak periods. So if they were to offer that feature, they would need to lower the number of fast requests or charge for each of them (like with Opus).

Cursor uses dedicated instances of GPT-4; they don’t pay per request. So they have a fixed capacity and need to share it among all users. If everyone wanted fast requests at the same time of day, it would just not work. When queues are very long, people would simply switch to fast requests, which would make things even worse.

Also, by making you spend your fast requests first, it lowers your overall usage: you will think more before using one than if it were just a free, unlimited slow request.
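The fixed-capacity argument above can be sketched as a toy queue model. All the numbers here are invented for illustration (not Cursor’s real capacity or traffic); the point is just that when fast requests are served first out of a fixed budget, a peak-time shift toward fast requests starves the slow queue:

```python
# Toy model of a fixed-capacity server with two priority classes.
# Fast requests are served first each tick; whatever capacity is left
# over goes to the slow queue. Numbers are made up for illustration.
def slow_queue_backlog(capacity, fast_arrivals, slow_arrivals):
    """Return the slow-queue length after each tick."""
    backlog = 0
    history = []
    for fast, slow in zip(fast_arrivals, slow_arrivals):
        served_fast = min(fast, capacity)   # fast requests preempt capacity
        leftover = capacity - served_fast   # what remains for the slow queue
        backlog = max(0, backlog + slow - leftover)
        history.append(backlog)
    return history

# Quiet hours: few fast requests, the slow queue never builds up.
print(slow_queue_backlog(10, [2] * 5, [6] * 5))  # → [0, 0, 0, 0, 0]

# Peak hours: everyone switches to fast, the slow queue grows every tick.
print(slow_queue_backlog(10, [9] * 5, [6] * 5))  # → [5, 10, 15, 20, 25]
```

This is the “even worse during peaks” effect: the same slow-request arrival rate goes from zero backlog to an unbounded one purely because fast traffic claims the shared capacity first.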

That’s why you won’t get any answer on this.

Yes, of course. But as I’ve mentioned previously, I’m not a fan of this approach, and I often find myself copying and pasting into ChatGPT, which seems counterproductive. However, there would certainly be more overall requests if I could ask questions directly within the integration.

Yes, I’m aware of the dedicated instance. However, I’m not convinced by your conclusions. Certainly, during peak times, fast requests might slow down, but I’m not sure it would be worse overall. Sometimes I’m willing to wait and other times not. I rarely reach the 500 fast-request limit, especially now with 10 Opus requests available. I still prefer to save my fast requests, just in case. Sometimes I can multitask, and waiting is not an issue for me. So someone like me, who almost never hits the 500-request mark, might actually use fewer fast requests. The impact greatly depends on how users are distributed. There are likely some downsides for Cursor, or else they would have implemented this feature by now. However, I don’t agree with the general view that this change would make things worse.

With GPT-4o, which is 50% cheaper, I think we can get more fast requests.

Also, Sam mentioned that the mysterious im-also-good-gpt-bot was theirs, and its coding Elo is good.


Awaiting a Cursor announcement about GPT-4o; it makes Opus obsolete, and it should fix the capacity issue as well.