Open-source proxy strips PII before AI prompts go external

Summary
– Dataiku’s Kiji Privacy Proxy is an open-source local gateway that detects and masks PII in requests to external AI APIs like OpenAI and Anthropic.
– The tool flags 16+ PII categories, replaces them with realistic dummy values before sending the request, and restores original values in the response.
– PII detection runs locally via a quantized DistilBERT model on ONNX Runtime, with latency under 100 milliseconds and a 94% F1 score on a benchmark.
– Distribution includes a macOS Electron app, a Linux server binary, and a Chrome extension, all routing traffic through the proxy.
– Sending PII to third-party APIs can trigger obligations under GDPR, HIPAA, and CCPA; a 2026 Dataiku survey found 85% of CIOs had seen AI projects delayed or blocked over gaps in traceability or explainability, with privacy concerns a significant factor.
Enterprise developers frequently send prompts to external large language models that include customer emails, support transcripts, and other identifying details, often without any sanitization layer between the application and the API. Dataiku has introduced Kiji Privacy Proxy, an open-source local gateway designed to detect and mask personally identifiable information (PII) before requests leave the network.
The tool sits between local applications and external AI APIs like OpenAI and Anthropic. A machine learning model scans each outgoing request for 16 or more categories of PII, such as email addresses, phone numbers, Social Security numbers, credit card numbers, and IP addresses. The proxy replaces each detected value with a realistic dummy value, forwards the sanitized request to the upstream API, and then restores the original values in the response, so the calling application receives output that refers to the data it actually sent.
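The mask, forward, restore round trip can be illustrated with a minimal sketch. It is not Kiji's implementation: two regular expressions stand in for the ML detector, only two PII categories are covered, and the names `PATTERNS`, `make_dummy`, `mask_pii`, and `restore_pii` are hypothetical.

```python
import re

# Assumption: regexes stand in for the ML detector the article describes.
# Only two of the 16+ categories are covered here.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def make_dummy(kind, n):
    """Return a plausible-looking placeholder for the n-th detected value."""
    if kind == "email":
        return f"user{n}@example.com"
    return f"000-00-{n:04d}"


def mask_pii(text):
    """Replace detected PII with dummies; return masked text and a dummy-to-original map."""
    mapping = {}  # dummy value -> original value
    seen = {}     # original value -> dummy value, so repeats map consistently
    for kind, pattern in PATTERNS.items():
        def repl(match, kind=kind):
            original = match.group(0)
            if original not in seen:
                dummy = make_dummy(kind, len(seen) + 1)
                seen[original] = dummy
                mapping[dummy] = original
            return seen[original]
        text = pattern.sub(repl, text)
    return text, mapping


def restore_pii(text, mapping):
    """Swap dummies in an upstream response back to the original values."""
    for dummy, original in mapping.items():
        text = text.replace(dummy, original)
    return text


prompt = "Email jane.doe@acme.io about SSN 123-45-6789."
masked, mapping = mask_pii(prompt)
# A real proxy would forward `masked` upstream; here the response is simulated.
response = f"Done: {masked}"
restored = restore_pii(response, mapping)
```

Keeping the dummy-to-original map on the local side is what lets the response be restored without the upstream API ever seeing the real values.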
Local inference and deployment options
PII detection relies on a quantized DistilBERT model running locally through ONNX Runtime on the user’s machine, with no external calls for the detection step. According to the project documentation, latency remains under 100 milliseconds for most requests. The base model achieved a 94 percent F1 score on an industry benchmark dataset.
Distribution covers three form factors. macOS users can install a native Electron desktop application that configures Proxy Auto-Config, routing Safari and Chrome traffic through Kiji on port 8081 without needing manual environment variables. Linux users run a standalone server binary and set the HTTP_PROXY and HTTPS_PROXY environment variables. A separate Chrome extension directs web requests through the proxy for users interacting with services like ChatGPT via a browser.
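For the Linux case, the environment-variable setup might look like the sketch below. It assumes the proxy listens on 127.0.0.1 at port 8081, the port the article gives for macOS; the Linux default may differ. `KIJI_PROXY_URL` and `with_proxy_env` are hypothetical names, not part of the project.

```python
import os

# Assumption: a local proxy on the port the article cites for macOS.
KIJI_PROXY_URL = "http://127.0.0.1:8081"


def with_proxy_env(base_env=None, proxy_url=KIJI_PROXY_URL):
    """Return a copy of the environment with the conventional proxy variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["HTTP_PROXY"] = proxy_url
    env["HTTPS_PROXY"] = proxy_url
    return env


child_env = with_proxy_env({"PATH": "/usr/bin"})
```

The resulting dictionary can be passed to a child process, for example `subprocess.run(["python", "app.py"], env=with_proxy_env())`, so only that process is routed through the proxy and the parent shell's environment is left unchanged.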
Compliance drivers
Sending PII to a third-party API can trigger obligations under GDPR, HIPAA, and CCPA, leading many enterprises to restrict what data leaves the corporate perimeter. A 2026 Dataiku survey of 600 CIOs revealed that 85 percent had seen AI projects delayed or blocked due to gaps in traceability or explainability, with privacy concerns playing a significant role.
Kiji Privacy Proxy is available for free on GitHub.
(Source: Help Net Security)
