Data controller
istari.ai GmbH
Julius-Hatry-Straße 1
68163 Mannheim, Germany
Data Protection Officer: Dr. Sebastian Schmidt. Contact: support@istari.ai.
Where data lives
Application data is processed and stored in Google Cloud europe-west3 (Frankfurt). The Postgres database, application servers, and audit logs are all in-region.
What we log
Every MCP tool call produces one row in our request_log table. Each row stores:
- Your istari.ai user ID and tier.
- The tool name (e.g. search_organizations).
- The structured arguments you passed: for example, your search query text, the domain list you asked us to fetch, the filter values you applied. We log these so we can enforce monthly quotas, debug failures, and improve relevance.
- The result count returned and the wall-clock duration.
- HTTP status and error code (if any).
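As an illustration of the fields listed above, one logged row could be modelled roughly as follows. This is a sketch only: the class name, field names, types, and the example values (including the tier name "free") are assumptions made for illustration, not the actual request_log schema.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class RequestLogRow:
    """Hypothetical shape of one request_log row; all field names are assumed."""

    user_id: str                 # istari.ai user ID
    tier: str                    # account tier of that user
    tool_name: str               # MCP tool that was called
    arguments: dict[str, Any]    # structured arguments as passed by the caller
    result_count: int            # number of results returned
    duration_ms: float           # wall-clock duration of the call
    http_status: int             # HTTP status of the response
    error_code: Optional[str] = None  # set only when the call failed


# Example row for a successful search call (all values are made up).
row = RequestLogRow(
    user_id="user_123",
    tier="free",
    tool_name="search_organizations",
    arguments={"query": "industrial robotics"},
    result_count=10,
    duration_ms=182.4,
    http_status=200,
)
```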
How long we keep it
- Server / HTTP access logs: retained for at most 7 days, per the istari.ai privacy policy.
- Application request log (the request_log table described above): retained for 24 months, then deleted.
- Embedding cache: embeddings of submitted reference domains are retained indefinitely. They are derived numerical features, not personal data.
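The retention periods above can be expressed as a simple expiry check. The sketch below is not the actual cleanup job: the function names are hypothetical, and it assumes "24 months" means calendar months counted from the row's creation time, which the policy text does not specify.

```python
import calendar
from datetime import datetime, timedelta

ACCESS_LOG_RETENTION = timedelta(days=7)   # server / HTTP access logs
REQUEST_LOG_RETENTION_MONTHS = 24          # request_log rows


def add_months(ts: datetime, months: int) -> datetime:
    """Shift ts forward by whole calendar months, clamping the day if needed."""
    month_index = ts.month - 1 + months
    year = ts.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return ts.replace(year=year, month=month, day=min(ts.day, last_day))


def request_log_expired(created_at: datetime, now: datetime) -> bool:
    """True once a request_log row has reached the assumed 24-month limit."""
    return now >= add_months(created_at, REQUEST_LOG_RETENTION_MONTHS)


def access_log_expired(created_at: datetime, now: datetime) -> bool:
    """True once an access log entry is older than 7 days."""
    return now - created_at > ACCESS_LOG_RETENTION
```

Under these assumptions, a row created on 29 February 2024 would become eligible for deletion on 28 February 2026, because the day is clamped to the last day of the target month.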
Cross-tenant isolation
- Your queries and results are never visible to other users.
- Your queries are not used to train AI models.
- Your queries are not sold or shared with third parties beyond the sub-processors listed in the istari.ai privacy policy (Google Cloud for hosting, Clerk for authentication, Azure OpenAI for embedding generation).
Scraping and the embedding cache
When you pass a reference domain to find_similar_organizations or find_similar_with_steering, GOI MCP first looks up that domain in our database. If the domain is not already indexed, the server fetches the public website and generates an embedding from it on demand.
- The fetched URLs are public web pages: the same content anyone with a browser can see.
- We cache the resulting embedding (a numerical vector), not the raw page content. The cache is shared across the user base so that the same domain isn’t re-fetched repeatedly. This is purely a performance optimization.
- Subsequent requests for the same domain, from you or from any other user, re-use the cached embedding and trigger no further HTTP fetch.
robots.txt directives.
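The lookup-then-fetch behaviour described in this section can be sketched as a small shared cache. The class and function names below are hypothetical, the in-memory dictionary stands in for the real database, and the stub replaces the real fetch and embedding step (which runs against the public website); none of this is the actual GOI MCP implementation.

```python
from typing import Callable

Embedding = list[float]


class EmbeddingCache:
    """Hypothetical cache shared by all users: domain -> embedding vector."""

    def __init__(self, fetch_and_embed: Callable[[str], Embedding]) -> None:
        self._fetch_and_embed = fetch_and_embed
        self._store: dict[str, Embedding] = {}

    def get(self, domain: str) -> Embedding:
        # Only a cache miss triggers a fetch; only the vector is stored,
        # never the raw page content.
        if domain not in self._store:
            self._store[domain] = self._fetch_and_embed(domain)
        return self._store[domain]


# Stub standing in for "fetch the public website and embed it".
fetch_calls: list[str] = []


def fake_fetch_and_embed(domain: str) -> Embedding:
    fetch_calls.append(domain)
    return [float(len(domain)), 0.0]


cache = EmbeddingCache(fake_fetch_and_embed)
first = cache.get("example.com")   # cache miss: the stub fetch runs once
second = cache.get("example.com")  # cache hit: no further fetch
```

After both calls, fetch_calls contains a single entry, which mirrors the statement that repeat requests for the same domain trigger no further HTTP fetch.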