1. User sends a prompt containing sensitive data (SSNs, API keys, credit card numbers)
2. Real-time detection and redaction across 14+ pattern types in under 50 ms
3. Sanitized request is forwarded to OpenAI, Claude, or another provider for processing
4. Output is scanned again, and a safe response is returned to the user
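To make the detect-and-redact step concrete, here is a minimal, hypothetical sketch of pattern-based redaction. The pattern names and regexes below are our own illustration (the real engine ships 14+ pattern types with more robust detection):

```python
import re

# Illustrative patterns only -- a small subset of what a real engine covers.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CARD": re.compile(r"\b\d{4}[- ]\d{4}[- ]\d{4}[- ]\d{4}\b"),
    "API_KEY": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
}

def redact(prompt: str) -> str:
    """Replace each detected value with a [TYPE_REDACTED] placeholder."""
    for name, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{name}_REDACTED]", prompt)
    return prompt

print(redact("My SSN is 123-45-6789"))
# → My SSN is [SSN_REDACTED]
```

The same scan runs on the model's response before it is returned, which is what makes the filtering bidirectional.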
Use our purpose-built SDKs for Python, Java, or Node.js with additional features like session tracking and custom rules.
```bash
# Install the Data Hawk SDK
pip install datahawk-shield
```

```python
# Import and configure
from datahawk import ShieldedOpenAI

client = ShieldedOpenAI(
    shield_url="https://api.datahawk.io",
    api_key="your-openai-key",
    redaction_mode="MASK",  # MASK, REPLACE, HASH, TOKEN
)

# Use exactly like the OpenAI client
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": "My credit card is 4532-1111-2222-3333"
    }]
)
# Automatically redacted to: "My credit card is [CARD_REDACTED]"
```
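The four `redaction_mode` settings differ in what replaces a detected value. The sketch below is a hypothetical illustration of the general techniques behind each mode, not the SDK's actual implementation:

```python
import hashlib
import uuid

def mask(value: str) -> str:
    # MASK: hide all but the last 4 characters
    return "*" * (len(value) - 4) + value[-4:]

def replace(value: str, label: str = "CARD") -> str:
    # REPLACE: swap the value for a typed placeholder
    return f"[{label}_REDACTED]"

def hash_value(value: str) -> str:
    # HASH: one-way digest; same input always maps to the same output
    return hashlib.sha256(value.encode()).hexdigest()[:12]

TOKEN_VAULT = {}  # token -> original value; enables reversible redaction

def tokenize(value: str) -> str:
    # TOKEN: store the original so it can be restored later
    token = f"tok_{uuid.uuid4().hex[:8]}"
    TOKEN_VAULT[token] = value
    return token

card = "4532-1111-2222-3333"
print(mask(card))     # masks all but the last 4 digits
print(replace(card))  # → [CARD_REDACTED]
```

MASK and REPLACE are irreversible; HASH preserves equality checks without exposing the value; TOKEN keeps a vault-side mapping so the original can be restored in the response.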
Deploy as an API Gateway for centralized protection across all teams and applications. Perfect for enterprise-wide enforcement.
```nginx
# NGINX configuration: route all LLM traffic through Data Hawk
upstream datahawk_shield {
    server shield-1.datahawk.io:8090;
    server shield-2.datahawk.io:8090;
    server shield-3.datahawk.io:8090;
}

location /v1/ {
    proxy_pass http://datahawk_shield;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Correlation-ID $request_id;
}

# Your apps continue using standard endpoints:
#   https://api.yourcompany.com/v1/chat/completions
#   ↓ automatically routed through Data Hawk Shield
#   ↓ then forwarded to OpenAI/Claude/etc.
```
The simplest integration — just point your LLM endpoint to Data Hawk. Works with any OpenAI-compatible SDK.
```bash
# Just change your environment variables
export OPENAI_BASE_URL="https://shield.datahawk.io/v1"
export OPENAI_API_KEY="your-openai-key"
```

```python
# Your existing code works unchanged
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_BASE_URL and OPENAI_API_KEY automatically
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "My SSN is 123-45-6789"}],
)
# Data Hawk automatically filters the prompt before it reaches OpenAI
```
| Feature | Data Hawk | Cloud-Based DLP |
|---|---|---|
| Deployment | ✓ 100% Self-Hosted | ⚠ Cloud SaaS |
| Data Sovereignty | ✓ Complete Control | ⚠ Data leaves network |
| LLM Provider Support | ✓ Any Provider | ⚠ Limited integrations |
| Latency (P95) | ✓ <50ms | ⚠ 100-500ms |
| Bidirectional Filtering | ✓ Input + Output | ✗ Input only |
| Reversible Redaction | ✓ Tokenization | ✗ Permanent |
| Pricing Model | ✓ Predictable licensing, no per-call fees | ⚠ Usage-based charges |
| Air-Gapped Deployment | ✓ Supported | ✗ Not possible |
| Custom Patterns | ✓ Full control | ⚠ Limited customization |
| Code Changes Required | ✓ Zero | ⚠ Varies by provider |
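The "Reversible Redaction" row deserves a word: with tokenization, sensitive values become opaque tokens before the request leaves your network and are restored in the response, so the provider never sees the raw data. A hypothetical end-to-end sketch (the token format and vault are our illustration, not the product's implementation):

```python
import re
import uuid

vault = {}  # token -> original value, kept inside your network

def tokenize_input(prompt: str) -> str:
    """Replace SSN-shaped values with opaque tokens before sending upstream."""
    def repl(match):
        token = f"<TOK_{uuid.uuid4().hex[:8]}>"
        vault[token] = match.group(0)
        return token
    return re.sub(r"\b\d{3}-\d{2}-\d{4}\b", repl, prompt)

def detokenize_output(response: str) -> str:
    """Restore original values in the model's response on the way back."""
    for token, original in vault.items():
        response = response.replace(token, original)
    return response

shielded = tokenize_input("Verify SSN 123-45-6789 please")
# ... the shielded prompt goes to the LLM; suppose it echoes the token back ...
restored = detokenize_output(f"Verified: {shielded}")
```

Because the vault never leaves your deployment, this is compatible with the self-hosted and air-gapped rows above.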