"Local mode" is misleading, even with BYO OpenAI Key

I wanted to see if I could use Cursor for work, so I did a bit of vetting. Under no circumstances are we allowed to send work-related code to third parties unless explicitly greenlit by compliance and security. OpenAI is greenlit, so the “bring your own API key” option combined with “local mode” sounded great.

Was surprised to see that “local mode” isn’t local at all.

Asking anything about code will upload the code to *.api.cursor.sh, including my OpenAI key in the form of {"modelName":"gpt-3.5-turbo","apiKey":"xxx","azureState":{}},

together with stuff like

{"relativeWorkspacePath":"testappTests/testappTests.swift","range":{"startPosition":{"line":1,"column":1},"endPosition":{"line":27,"column":6}},"contents":"//\n//  testappTests.swift\n//  testappTests\n//\n//.\n//\n\nimport XCTest\n@testable import testapp\n\nfinal class testappTests: XCTestCase {\n\n    override func setUpWithError() throws {\n        // Put setup code here. This method is called before the invocation of each test method in the class.\n    }\n\n    override func tearDownWithError() throws {\n        // Put teardown code here. This method is called after the invocation of each test method in the class.\n    }\n\n    func testExample() throws {\n        // This is an example of a functional test case.\n        // Use XCTAssert and related functions to verify your tests produce the correct results.\n        // Any test you write for XCTest can be annotated as throws and async.\n        // Mark your test throws to produce an unexpected failure when your test encounters an uncaught error.\n        // Mark your test async to allow awaiting for asynchronous code to complete. Check the results with assertions afterwards.\n    }"},"score":0.711380124}

Asking about the codebase results in all files of the codebase being uploaded to the cursor.sh APIs, which I guess then call the OpenAI API.

I read through the forums and understand that Cursor needs to store the index in the form of a vector database on its servers, but sending code and the API key to those servers when “local mode” is explicitly enabled with an OpenAI key is misleading and needs to be clearly stated.

Can’t the data vectors be computed locally? Why does code need to leave my machine, and why can’t my machine directly use the openai APIs when I already specified my key?

While Cursor looks great, IMHO the current data aggregation is a bit much and could never pass a corporate compliance review. It also makes me a little uncomfortable using it for my private projects, since I have no idea what happens on Cursor’s servers, how data is stored (or not stored), and how logging is handled. It’s all just “we don’t store it” at the moment, without further details.

So what exactly is “local mode” when it still communicates everything to the cursor servers?


Hi! Apologies for the lack of clarity here. We’ll certainly consider changing the language. Do you have any suggestions for a phrase that would feel more correct to you?

With local mode enabled:

  1. We do not store any code data in plaintext.
  2. We only store what’s truly needed for you to be able to use Cursor (e.g., your email, your request count so we can enforce the usage limits, the @docs we’ve crawled for you, etc.)
  3. Unless you’re on the enterprise plan, OpenAI stores the prompts we send them for 30 days, but does not train on them. If you’re on the enterprise plan, OpenAI guarantees zero-day retention.
  4. If you ask us to index a repository, we store lossy embedding vectors on our servers, with pointers to filenames + file lines. When you do a codebase-wide chat query, we retrieve the right locations and then read the code from your local files. You can turn off the indexing, and most features will still work (but be lower quality).
  5. When you make a request to the AI, we send up relevant context to our server (e.g., parts of your current file, other potentially relevant files, linter errors, etc.). We construct the prompt on our server, and then send it to OpenAI.
  6. In transit, your data is sent encrypted over https (as basically all other internet traffic is these days with SSL).
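To make point 4 concrete, here is a rough sketch of what such an index might look like: lossy vectors server-side, pointers back into the local workspace, and retrieval that returns only locations. All names and values here are hypothetical, not Cursor's actual schema.

```javascript
// Hypothetical shape of a server-side index entry: a lossy embedding
// vector plus a pointer back into the local workspace; no source stored.
// (Tiny 3-dimensional vectors for illustration; real ones have hundreds of dims.)
const index = [
  { path: 'testappTests/testappTests.swift', startLine: 1, endLine: 27,
    vector: [0.12, -0.40, 0.89] },
  { path: 'Sources/App.swift', startLine: 10, endLine: 80,
    vector: [0.75, 0.10, -0.20] },
];

// Cosine similarity between two equal-length vectors.
const cosine = (a, b) => {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
};

// Retrieval returns only file locations; the actual code would then be
// read from the user's local files, as described in point 4 above.
const retrieve = (queryVector, k = 1) =>
  index
    .map((entry) => ({ ...entry, score: cosine(queryVector, entry.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(({ path, startLine, endLine, score }) =>
      ({ path, startLine, endLine, score }));
```

The point of the sketch is that only `{ path, startLine, endLine }` pointers plus lossy vectors live server-side; the code itself stays on disk.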

Please let me know if anything is still unclear here!

At some point in the future we hope to store and compute the embeddings locally. For now, though, it is easier for us to do it all on the server — we’re still a very small team.

The same is true for why the requests hit our servers, instead of just hitting OpenAI directly — it is just easier for us to develop that way. At some point in the future, we hope to be able to support a mode where our servers are not touched at all.


Thanks for the reply. Yeah, I understand what you’re saying, but the critical point I see here is the sending of data to the servers, when “local mode” implies that things happen locally.

You’re saying you don’t store any details in plaintext, or only what’s necessary, but that’s still problematic because the data has to go to you in the first place (in plaintext), and there is no external audit or security review yet that validates that this data is really not stored. For example, you could have some debug log somewhere that prints stuff that’s sent to the server and pipes it into logstash/datadog/etc, and all of a sudden data is stored, just not by you.

What I would like to see is direct communication with the OpenAI API from my computer in local mode, and the data that is sent to Cursor’s servers already being embeddings, not plaintext source code or secrets (like the OpenAI API key).
I understand it’s easier to do it on the server, but I feel this could be a big security risk waiting to happen, especially for a small team without a dedicated security team.

I also think it’s odd that specifying an API key will cause Cursor to send that API key to the Cursor API instead of using it directly.
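For comparison, going straight to OpenAI is a single HTTPS request from the client, and the key never leaves the machine. A rough sketch: the `buildDirectRequest` helper is hypothetical, while the endpoint and headers are OpenAI's standard chat completions API.

```javascript
// Hypothetical helper: build a chat completions request that goes
// directly to OpenAI, so the API key is never sent to a middle server.
const buildDirectRequest = (apiKey, messages) => ({
  url: 'https://api.openai.com/v1/chat/completions',
  options: {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      // The key travels only between this client and OpenAI.
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: 'gpt-3.5-turbo', messages }),
  },
});

// Usage sketch (needs a real key and network access, so not run here):
// const { url, options } = buildDirectRequest(process.env.OPENAI_API_KEY, [
//   { role: 'user', content: 'Explain testappTests.swift to me.' },
// ]);
// const reply = await fetch(url, options).then((r) => r.json());
```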

Do you have any suggestions for a phrase that would feel more correct to you?

Honestly, I’m not even sure what the current local mode is doing. What exactly happens locally, compared to not having it turned on? To me it looks like nothing is really computed locally and the editor still relies heavily on the Cursor API for all the heavy lifting :thinking:

After writing down a definition of what local and non-local mode each do, I would see what name fits better for either mode.


Makes sense! We hope to be able to introduce support for direct communication with OpenAI at some point. We also hope to get SOC 2-verified soon. Unfortunately, we are a small team, and aren’t able to focus on this right now, so Cursor may not be a fit for you just yet.

Based on your feedback, we’re considering changing the name of local mode to “privacy mode” or “no-storage mode”, which would more accurately reflect that the key benefit is not having any code data stored by us.

To be clear, if you are on local mode, we do not do any logging at all. The only place where your data would be stored is at OpenAI for 30 days, unless you have a business subscription, in which case your data wouldn’t be stored anywhere at all.

Privacy mode is IMHO still misleading, because it’s currently just a pinky promise. Of the suggested options, I would not even call it a mode, but rather a “don’t store data server-side” toggle. It wasn’t clear to me what this toggle does until now; it’s just a flag that tells the server not to store.

To be clear, if you are on local mode, we do not do any logging at all. The only place where your data would be stored is at OpenAI for 30 days, unless you have a business subscription, in which case your data wouldn’t be stored anywhere at all.

I know you keep mentioning this, but it doesn’t change the fact that the data is going to the server in the first place, when “local” implies that it doesn’t. What’s happening on the server is out of my control and I can’t vet it, so you might log, or you might not. Until audits are done it will probably stay like that.

In any case, I would love a real local mode, or a way for me to host the server component on my own Mac, so that Cursor communicates with this local server. It could even be a binary that doesn’t expose the source code.

I sound critical, but I love the idea of Cursor and want to use it; I currently can’t because of the implications mentioned above.


Totally get it! I appreciate your feedback here.

Unfortunately, I don’t see us supporting a self-hosted version of our server anytime soon. The main reason is that most of the new features we’re building (e.g., the /edit feature, with many more to come) rely not only on public OpenAI models, but also on custom, in-house models that aren’t publicly accessible. A self-hosted server wouldn’t be able to access those new features, so it’s unlikely that this is a path we’ll be able to prioritize.


I am coming from using VS Code with Continue and local AI models on my Mac via Ollama and Codellama. I love this approach for two reasons: 1) with straightforward privacy, it is incredibly easy to get buy-in from companies I am coaching on adopting AI coding practices, and 2) I can code with AI even when my internet goes down.

I hope this encourages you to consider the competitive advantage of offering a truly “local” mode.

Here’s some code to help you implement local embeddings. It’s not very complicated, runs fast, and causes no noticeable performance degradation. Replace with the model of your choice:

/**
 * Embed a chunk of text using local embeddings.
 * @param {string} chunk - The text to be embedded
 * @returns {Promise<number[]>} - A promise that resolves to the embedding vector of the input text
 */
let embedder; // cache the pipeline so the model is only loaded once

export const localEmbeddings = async (chunk) => {
  if (!embedder) {
    const { pipeline } = await import('@xenova/transformers');
    embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
  }

  const result = await embedder(chunk, {
    pooling: 'mean',
    normalize: true,
  });

  // Convert the output tensor to a plain array of numbers
  return result.tolist()[0];
};
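One note on using the snippet above: because the vectors come back mean-pooled and normalized, comparing two chunks locally is just a dot product. The helper below is pure JavaScript; the commented usage assumes `@xenova/transformers` is installed.

```javascript
// Dot product of two equal-length vectors. Since localEmbeddings returns
// normalized vectors, this already equals their cosine similarity.
const dot = (a, b) => a.reduce((sum, v, i) => sum + v * b[i], 0);

// Usage sketch (requires @xenova/transformers and a model download):
// const a = await localEmbeddings('func testExample() throws {}');
// const b = await localEmbeddings('final class testappTests: XCTestCase {}');
// console.log(dot(a, b)); // similarity in [-1, 1]; higher = more related
```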

What about a local vector store? With continuous improvements in embeddings, and in the reconstruction of data from those embeddings, this is becoming increasingly significant, especially since the embeddings are being stored. I’m also wondering: what is the lifetime of these embeddings? Will they be removed when the index is deleted?

Would love to see a solution where no code information (in any form) is stored in the cloud. :heart:

If you turn off indexing, does regular chat become worse if you tend to copy and paste some code into the chat, or only codebase-wide queries?

Say I want Cursor to write some new code for me, but I need to give it the context of two files in my codebase, each about 60 lines.

If I have indexing turned off, will it be similar to just pasting the two files into the chat and letting ChatGPT decide everything? As in, the only context will be those two files, and they will be fully converted to tokens?

I copy-paste. Frankly, I don’t index at all, because indexing doesn’t work for me over SSH. Just by copy-pasting I was able to create a fully functioning web tool. Each code file was around 500–800 lines.
