Desensitize your conversation with LLMs (that you don't trust)

I've been using gptel since the beginning of the AI craze, and in my opinion it's a work of art. However, manually checking for and replacing names, phone numbers, and project details before every request is both annoying and unreliable. For people who use LLMs daily, automated desensitization could save a lot of time.

Several weeks ago, after reading a news report about ChatGPT calling a user by his real name [1], I was freaked out and finally decided to take the time to write a function that automates desensitization. After a few weeks of use, I've found that it works without major problems. The process is straightforward: it first checks whether the text being sent to an LLM contains any strings matching predefined sensitive information. If it does, it asks whether you want to abort the request or mask the information. In the latter case, it sends the desensitized text and, after receiving the LLM's response, replaces each placeholder with its predefined restore string.

In the rest of this post, I'll briefly explain the process and how you can adapt it for your own use.

Define the patterns of sensitive information

Each piece of sensitive information is described by three components: the information you want to hide (the pattern), the placeholder that will be visible to LLMs, and the restore string that replaces the placeholder when you get a response from an LLM. Together these form a triplet, and the triplets are stored in a list, like the llm:sensitive-list variable shown below:

  (setq llm:sensitive-list
        ;; STRUCTURE: '(pattern placeholder restore-string)
        '(("\\b\\(G\\|g\\)eorge\\b \\b\\(O\\|o\\)rwell\\b"      "person-name-2V75" "George Orwell")
          ("Albert Einstein"                                    "person-name-x9Pf" "Albert Einstein")
          ("Albert einstein"                                    "person-name-x9Pf" "Albert Einstein")
          ("albert Einstein"                                    "person-name-x9Pf" "Albert Einstein")
          ("albert einstein"                                    "person-name-x9Pf" "Albert Einstein")
          ("\\b13211111111111111\\b"                            "identity-no-zV5Y" "13211111111111111")
          ("7[ .]*77[ .]*77[ .]*77[ .]*77\\b"                   "phone-number-w9rD" "7.77.77.77.77")
          ("\\b\\(mistral\\|Mistral\\|MisTral\\|MISTRAL\\)\\b"  "project-name-rqOu" "Mistral")
          ("\\b\\(CNRS\\|cnrs\\)\\b"                            "institute-name-JxMi" "CNRS")
          ;; DANGEROUS ZONE - API key format in REGEX - this might impede ‘gptel’ from even functioning
          ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
          ;; ("[a-zA-Z0-9-_.]\\{24,40\\}"                       "an-api-key" "AN-API-KEY") ;;
          ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
          ))

If you have any ideas about desensitizing API keys, please leave a comment.

You can use plain text as the pattern, like "Albert Einstein", or a regular expression. Plain text has the inconvenience of needing several patterns for one name, but it comes with an advantage: performance. This is also why I prefer plain-text patterns for masking API keys: writing a regex for API keys is easy, but it is extremely costly for Emacs to process, if it works at all, not to mention the number of false alarms it produces.
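One caveat with plain-text patterns: the filter passes every pattern to re-search-backward, so characters such as "." or "+" are still read as regex operators. Below is a minimal sketch of a helper that escapes them with regexp-quote. The name my/llm-literal-entry, the placeholder, and the sample key are made up for illustration and are not part of the setup described in this post.

```elisp
;; Hypothetical helper for illustration; not part of the original setup.
(defvar llm:sensitive-list nil
  "List of (PATTERN PLACEHOLDER RESTORE-STRING) triplets.")

(defun my/llm-literal-entry (secret placeholder)
  "Return a (PATTERN PLACEHOLDER RESTORE-STRING) triplet for SECRET.
PATTERN is SECRET with regex special characters escaped, so it only
matches SECRET literally."
  (list (regexp-quote secret) placeholder secret))

;; The key below is fake.  Without `regexp-quote', its "." and "+" would
;; act as regex operators instead of literal characters.
(add-to-list 'llm:sensitive-list
             (my/llm-literal-entry "sk-abc.def+123" "api-key-Q7tZ"))
```

Since the secret is written only once, the pattern and the restore string cannot drift apart.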

I intentionally gave the placeholders the form “person-name-x9Pf” or “phone-number-w9rD”, because this way most LLMs still understand what is happening in the conversation without learning any concrete personal information. The one thing to take care of is to append a random string to the end of each placeholder, instead of using strings like “a-person-name”, so that neither you nor your LLM mixes the names up.
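If you don't feel like inventing the random suffixes by hand, something along these lines could generate them. This is only a sketch: my/llm-make-placeholder is a hypothetical helper, and the four-character default simply mirrors the placeholders used in this post.

```elisp
;; Hypothetical helper for illustration; not part of the original setup.
(defun my/llm-make-placeholder (kind &optional n)
  "Return KIND followed by a dash and N random alphanumeric characters.
N defaults to 4."
  (let ((chars "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789")
        (suffix ""))
    (dotimes (_ (or n 4))
      ;; Pick one character at a random index and append it to the suffix
      (setq suffix (concat suffix (string (aref chars (random (length chars)))))))
    (concat kind "-" suffix)))

;; Returns a string such as "person-name-x9Pf"; the suffix differs on each call.
(my/llm-make-placeholder "person-name")
```

Generate a placeholder once and paste it into llm:sensitive-list; it has to stay the same between sending a request and restoring the response.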

The filter function

The filter function, llm:filter-sensitive-info, simply checks the text against the patterns defined in the sensitive information list. It was inspired by this reply to an issue on gptel's GitHub repo:

  (defun llm:filter-sensitive-info ()
    "Mask every match of `llm:sensitive-list' in the current buffer, asking first."
    (let ((case-fold-search nil))       ; match patterns case-sensitively
      ;; For each entry, replace every match of its pattern with its placeholder
      (dolist (entry llm:sensitive-list)
        (goto-char (point-max))         ; start at the end and search backward
        (while (re-search-backward (car entry) nil t)
          ;; Ask for the user's permission to continue
          (when (y-or-n-p (concat "Sensitive item found: " (caddr entry)
                                  ", abandon this task? Press ‘n’ to mask it and continue: "))
            ;; Signal an error to stop the request
            (error "Abandoned"))
          ;; FIXEDCASE and LITERAL are t so the placeholder is inserted verbatim
          (replace-match (cadr entry) t t)))))
  ;; Run the above function every time you send a request to an LLM
  (add-hook 'gptel-prompt-filter-hook #'llm:filter-sensitive-info)

Once the filter function is defined, don't forget to add it to the hook gptel-prompt-filter-hook, as in the last line above.
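Because the filter asks a question for every match, it is awkward to use for testing new patterns. A non-interactive sketch like the one below can help you check what a pattern list would mask before you rely on it. The names my/llm-demo-entries and my/llm-mask-string are hypothetical, and the function works on a plain string rather than on the buffer gptel sends.

```elisp
;; Hypothetical helpers for illustration; not part of the original setup.
(defvar my/llm-demo-entries
  '(("\\b\\(G\\|g\\)eorge\\b \\b\\(O\\|o\\)rwell\\b" "person-name-2V75"    "George Orwell")
    ("\\b\\(CNRS\\|cnrs\\)\\b"                       "institute-name-JxMi" "CNRS"))
  "A small (PATTERN PLACEHOLDER RESTORE-STRING) list used only for testing.")

(defun my/llm-mask-string (text entries)
  "Return TEXT with every pattern in ENTRIES replaced by its placeholder."
  (let ((case-fold-search nil))         ; case-sensitive, like the filter
    (dolist (entry entries text)
      ;; The two trailing t arguments are FIXEDCASE and LITERAL:
      ;; insert the placeholder exactly as written.
      (setq text (replace-regexp-in-string
                  (nth 0 entry) (nth 1 entry) text t t)))))

(my/llm-mask-string "George Orwell works at CNRS." my/llm-demo-entries)
;; => "person-name-2V75 works at institute-name-JxMi."
```

Evaluating the last form in the scratch buffer shows the text an LLM would receive for that sentence.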

The restoring function

With llm:sensitive-list and llm:filter-sensitive-info in place, you have made sure that nothing matching the blacklist leaves your computer. But it would be annoying if the LLM's responses were full of “person-name-x9Pf”, right?

The restoring function turns placeholders back into the actual information, and it does so locally on your computer. With the following function, each placeholder is replaced by the third component of its entry in llm:sensitive-list. Again, after defining the function, remember to add it to the hook gptel-post-response-functions.

  (defun llm:restore-sensitive-info (beg end)
    "Restore sensitive information in the LLM response from BEG to END."
    ;; Drop the pattern from each triplet, leaving (placeholder restore-string)
    (let ((gptel-whitelist (mapcar #'cdr llm:sensitive-list)))
      (save-excursion
        (save-restriction
          (buffer-disable-undo)         ; drops the buffer's existing undo history
          (narrow-to-region beg end)    ; restrict edits to the response
          (atomic-change-group
            ;; For each entry, replace every placeholder with its restore string
            (dolist (entry gptel-whitelist)
              (goto-char (point-max))   ; start at the end of the response and search backward
              (while (re-search-backward (regexp-quote (car entry)) nil t)
                ;; FIXEDCASE and LITERAL are t so the restore string is inserted verbatim
                (replace-match (cadr entry) t t))))))))
  ;; Run the above function every time you get a response from an LLM
  (add-hook 'gptel-post-response-functions #'llm:restore-sensitive-info)

The complete code

Voilà! That's about all that I wanted to share. You can just copy the following code and adapt llm:sensitive-list for your own needs. Thanks for reading, and happy Emacs-ing!

  (setq llm:sensitive-list
        ;; STRUCTURE: '(pattern placeholder restore-string)
        '(("\\b\\(G\\|g\\)eorge\\b \\b\\(O\\|o\\)rwell\\b"      "person-name-2V75" "George Orwell")
          ("Albert Einstein"                                    "person-name-x9Pf" "Albert Einstein")
          ("Albert einstein"                                    "person-name-x9Pf" "Albert Einstein")
          ("albert Einstein"                                    "person-name-x9Pf" "Albert Einstein")
          ("albert einstein"                                    "person-name-x9Pf" "Albert Einstein")
          ("\\b13211111111111111\\b"                            "identity-no-zV5Y" "13211111111111111")
          ("7[ .]*77[ .]*77[ .]*77[ .]*77\\b"                   "phone-number-w9rD" "7.77.77.77.77")
          ("\\b\\(mistral\\|Mistral\\|MisTral\\|MISTRAL\\)\\b"  "project-name-rqOu" "Mistral")
          ("\\b\\(CNRS\\|cnrs\\)\\b"                            "institute-name-JxMi" "CNRS")
          ;; DANGEROUS ZONE - API key format in REGEX - this might impede ‘gptel’ from even functioning
          ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
          ;; ("[a-zA-Z0-9-_.]\\{24,40\\}"                       "an-api-key" "AN-API-KEY") ;;
          ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
          ))

  (defun llm:filter-sensitive-info ()
    "Mask every match of `llm:sensitive-list' in the current buffer, asking first."
    (let ((case-fold-search nil))       ; match patterns case-sensitively
      ;; For each entry, replace every match of its pattern with its placeholder
      (dolist (entry llm:sensitive-list)
        (goto-char (point-max))         ; start at the end and search backward
        (while (re-search-backward (car entry) nil t)
          ;; Ask for the user's permission to continue
          (when (y-or-n-p (concat "Sensitive item found: " (caddr entry)
                                  ", abandon this task? Press ‘n’ to mask it and continue: "))
            ;; Signal an error to stop the request
            (error "Abandoned"))
          ;; FIXEDCASE and LITERAL are t so the placeholder is inserted verbatim
          (replace-match (cadr entry) t t)))))
  ;; Run the above function every time you send a request to an LLM
  (add-hook 'gptel-prompt-filter-hook #'llm:filter-sensitive-info)

  (defun llm:restore-sensitive-info (beg end)
    "Restore sensitive information in the LLM response from BEG to END."
    ;; Drop the pattern from each triplet, leaving (placeholder restore-string)
    (let ((gptel-whitelist (mapcar #'cdr llm:sensitive-list)))
      (save-excursion
        (save-restriction
          (buffer-disable-undo)         ; drops the buffer's existing undo history
          (narrow-to-region beg end)    ; restrict edits to the response
          (atomic-change-group
            ;; For each entry, replace every placeholder with its restore string
            (dolist (entry gptel-whitelist)
              (goto-char (point-max))   ; start at the end of the response and search backward
              (while (re-search-backward (regexp-quote (car entry)) nil t)
                ;; FIXEDCASE and LITERAL are t so the restore string is inserted verbatim
                (replace-match (cadr entry) t t))))))))
  ;; Run the above function every time you get a response from an LLM
  (add-hook 'gptel-post-response-functions #'llm:restore-sensitive-info)

Footnotes