Baseline OpenAI End-to-End Chat Reference Architecture

Microsoft published the baseline OpenAI end-to-end chat reference architecture. It describes the components, flows, and security of the solution, and gives guidance on performance, monitoring, and deployment. Microsoft also prepared a reference implementation for deploying and running the solution.

The baseline end-to-end chat architecture with OpenAI hosts the chat UI using components similar to those in the baseline App Service web application. Beyond that, it prioritises components for orchestrating chat flows, data services, and access to Large Language Models (LLMs). Azure Machine Learning is used for training, deploying, and managing machine learning models. Azure Storage stores prompt flow source files, while Azure Container Registry manages container images for deployment. Azure OpenAI provides access to LLMs and enterprise features. Azure AI Search supports search functionality in chat applications, implementing the Retrieval-Augmented Generation (RAG) pattern for query extraction and retrieval.

Baseline end-to-end chat architecture with OpenAI (Source: Microsoft Blog)
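The RAG pattern mentioned above can be illustrated with a minimal, self-contained sketch. It does not call Azure AI Search or Azure OpenAI: the in-memory `INDEX`, the word-overlap scoring in `retrieve`, and the prompt wording in `build_prompt` are all hypothetical stand-ins chosen for illustration. A real deployment would replace them with a search service query and an LLM call.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


def retrieve(query: str, index: list[Document], top_k: int = 2) -> list[Document]:
    """Rank documents by word overlap with the query (stand-in for a search service)."""
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(doc.text)), doc) for doc in index]
    # Keep only documents that share at least one term, best matches first.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [doc for _, doc in ranked[:top_k]]


def build_prompt(query: str, passages: list[Document]) -> str:
    """Ground the model by placing retrieved passages ahead of the user question."""
    context = "\n".join(f"[{doc.doc_id}] {doc.text}" for doc in passages)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"


# Hypothetical knowledge base; a real system would query a search index instead.
INDEX = [
    Document("kb-1", "Private endpoints keep traffic to PaaS services off the public internet."),
    Document("kb-2", "Blue/green deployments shift traffic gradually between two versions."),
    Document("kb-3", "Content filtering detects and blocks harmful content."),
]

question = "How do private endpoints protect traffic?"
prompt = build_prompt(question, retrieve(question, INDEX))
print(prompt)
```

The point of the sketch is the order of operations: retrieve first, then pass only the retrieved passages to the model together with the question.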

The baseline end-to-end chat architecture with OpenAI prioritises network security alongside identity-based access. Key aspects include a secure entry point for chat UI traffic, filtered network traffic, and end-to-end encryption with TLS for data in transit. Data exfiltration is minimised through Private Link usage. Network resources are logically segmented and isolated, ensuring robust network flows. The architecture involves routing calls from the App Service-hosted chat UI through a private endpoint to the Azure Machine Learning online endpoint, which then directs calls to a server running the deployed flow. Calls to Azure PaaS services are routed through managed private endpoints for added security.

This architecture restricts access to the Azure Machine Learning workspace to private endpoints. Private endpoints are employed throughout, allowing the chat UI hosted in App Service to connect securely to PaaS services.

This architecture establishes security measures at both the network and identity levels. The network perimeter allows only the chat UI to be reached from the Internet, via Application Gateway, while the identity perimeter ensures authentication and authorisation for requests. Access to the Azure Machine Learning workspace is managed through default roles like Data Scientist and Compute Operator, alongside specialised roles for workspace secrets and registry access.

Microsoft also shared deployment suggestions and strategies, among them blue/green deployments and A/B testing, which make releases safer and help evaluate changes.
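As an illustration of the blue/green idea, the following sketch routes requests between two deployments according to percentage weights. It is not taken from the reference implementation: the function `choose_deployment`, the deployment names `blue` and `green`, and the rollout stages are hypothetical, and the hash-based bucketing is just one simple way to keep a caller on the same deployment while the split is unchanged.

```python
import hashlib


def choose_deployment(request_key: str, traffic: dict[str, int]) -> str:
    """Map a request key to a deployment according to percentage weights."""
    if sum(traffic.values()) != 100:
        raise ValueError("traffic percentages must add up to 100")
    # Hash the key to a stable bucket in [0, 100) so a given caller
    # keeps hitting the same deployment while the split is unchanged.
    digest = hashlib.sha256(request_key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    threshold = 0
    for name, percent in traffic.items():
        threshold += percent
        if bucket < threshold:
            return name
    raise AssertionError("unreachable when percentages sum to 100")


# Hypothetical rollout stages: start with a small share on green, then cut over.
stages = [
    {"blue": 100, "green": 0},
    {"blue": 90, "green": 10},
    {"blue": 0, "green": 100},
]

for traffic in stages:
    counts = {name: 0 for name in traffic}
    for i in range(1000):
        counts[choose_deployment(f"session-{i}", traffic)] += 1
    print(traffic, "->", counts)
```

Shifting traffic in small steps lets a team compare the new deployment against the old one and roll back by resetting the weights, without redeploying anything.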

When it comes to monitoring, all services except Azure Machine Learning and Azure App Service are set to capture all logs. Azure Machine Learning diagnostics are configured specifically to capture audit logs, which include all resource logs documenting customer interactions with data or service settings. For Azure App Service, logging settings encompass AppServiceHTTPLogs, AppServiceConsoleLogs, AppServiceAppLogs, and AppServicePlatformLogs.
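The logging rules above can be summarised as a small lookup. This is only a sketch: the service keys (`app_service`, `machine_learning`, and so on) and the function `log_settings` are hypothetical, and the `categoryGroup` values `allLogs` and `audit` are assumed to match the Azure Monitor diagnostic settings schema, which should be checked against the official documentation before use. The four App Service category names come from the article.

```python
# App Service log categories named in the article.
APP_SERVICE_CATEGORIES = [
    "AppServiceHTTPLogs",
    "AppServiceConsoleLogs",
    "AppServiceAppLogs",
    "AppServicePlatformLogs",
]


def log_settings(service: str) -> list[dict]:
    """Return the log entries a diagnostic setting for `service` would enable."""
    if service == "app_service":
        # App Service: an explicit list of categories.
        return [{"category": c, "enabled": True} for c in APP_SERVICE_CATEGORIES]
    if service == "machine_learning":
        # Azure Machine Learning: audit logs only (assumed category group name).
        return [{"categoryGroup": "audit", "enabled": True}]
    # Every other service: capture all logs (assumed category group name).
    return [{"categoryGroup": "allLogs", "enabled": True}]


# Hypothetical service keys, used only to show the three cases.
for service in ["app_service", "machine_learning", "openai", "ai_search"]:
    print(service, log_settings(service))
```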

Azure OpenAI service also offers content filtering to detect and prevent harmful content. This includes abuse monitoring to detect violations, though exemptions can be requested for sensitive data or legal compliance.

In the LinkedIn thread, Balz Zuerrer asked if this solution could be built on Azure AI Studio. Tobias Kluge answered:

From my understanding, this blueprint is for the whole application including the security boundaries for sensitive and user-related data. AI studio is for testing and playing with the model and some of your data. But it does not say anything about how you build and deploy the whole application in a secure environment for production.
This is why this blueprint is so valuable for all of us.

Apart from this question, the post drew many positive comments. Rishi Nikhilesh added:

Amazingly built on top of network isolation and disabling Azure ML workspaces on public endpoint(s). Fascinating to see how the app service is communicating with deployed ML prompt flow keeping security intact.

Microsoft engineers prepared a reference implementation for deploying this scenario.

About the Author

Robert Krzaczyński
