Microsoft extends Copilot with open standard extensions

Microsoft Copilot logo on blurred background.
Image: Adobe Stock

Microsoft’s Build developer conference pulled back the curtain on how the company plans to let developers add their own content and application integrations to its Copilot applications. It’s an approach that should make the copilots more relevant, keep their results focused on specific tasks, and make them less likely to go off the rails.

It is important to understand that, after training, a large language model such as GPT-4 needs additional data to stay focused. That’s why Microsoft’s various copilots are built on its own data sources: GitHub, Power Platform, Microsoft Graph, and most obviously Bing. It’s a mostly successful approach that reduces the risk of hallucinations and prompt overruns, but it still confines the AI platform to limits that Microsoft defines.

In its current state, Bing’s copilot can only answer questions about what is in Bing’s search index. While that’s powerful, it can’t answer questions about data behind users’ firewalls or in the apps they want to use. Nor can the service pass its responses to other applications, use their results to shape its output, or carry out an interaction on the user’s behalf. Users can ask Bing Chat to name the best restaurants in New Orleans or to create a three-day travel itinerary, but it can’t reserve a table.

Add plugins to support artificial intelligence

That’s where plugins can help, providing additional data sources and new interactions. Users can already use plugins built for ChatGPT, and Microsoft is building on the same plugin architecture for new Bing plugins. It initially offers OpenTable and Wolfram Alpha support, with plugins for services like Expedia, Instacart, Zillow, TripAdvisor, and more to follow. So, for example, someone using the Instacart plugin can quickly turn a menu from Bing into a shopping list, and then into a delivery order for the ingredients that aren’t in their cupboard. Interestingly, these plugins will include one for ChatGPT itself.

Microsoft goes further: the same plugin model is also used for Microsoft 365 Copilot and for the AI tools in the Microsoft Edge browser. A common model for LLM plugins makes a lot of sense. It allows code to be written once and reused across the different applications a user works in.

Working with a standard plugin architecture also lets developers offer their code to other users and organizations. Once they’ve built a tool that integrates a Salesforce app with Bing Chat, for example, they can sell it as a product or open source it and share it.

Build plugins quickly

So how do developers go about creating a ChatGPT plugin? Plugins are interfaces between existing application APIs and ChatGPT, defined by a manifest and an OpenAPI specification for the APIs they use. The Bing Chat service acts as an orchestration tool, calling APIs as needed and formatting the responses using its natural language tools.

These tools allow users to ask, “Can you tell me all the deals that closed in the first quarter?” Bing Chat then connects to their customer relationship management system and pulls the necessary information from the sales data, displaying it as a chat response. Users can then ask whether they need to order more raw materials. A second plugin that links to an enterprise resource planning platform checks inventory levels, and the chat asks the user to approve an order for the necessary materials and parts.
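
The chained CRM and ERP interaction described above can be sketched in a few lines of Python. This is a toy illustration, not how Bing Chat is implemented: the plugin names, the functions `crm_closed_deals` and `erp_inventory`, and the data they return are all hypothetical, and the sequence of calls is hard-coded here, whereas in a real copilot the model decides which plugin to call.

```python
from typing import Callable

# Hypothetical stand-ins for plugin-backed APIs; real plugins would make HTTP calls.
def crm_closed_deals(quarter: str) -> list[dict]:
    return [{"deal": "Acme renewal", "quarter": quarter, "value": 12000}]

def erp_inventory(item: str) -> dict:
    return {"item": item, "on_hand": 40, "reorder_point": 100}

# Registry mapping a (made-up) plugin name to the function that calls its API.
PLUGINS: dict[str, Callable[..., object]] = {
    "crm.closed_deals": crm_closed_deals,
    "erp.inventory": erp_inventory,
}

def run_step(plugin: str, **params) -> dict:
    """Call one plugin and wrap the raw data for the model to phrase."""
    data = PLUGINS[plugin](**params)
    return {"plugin": plugin, "params": params, "data": data}

def needs_reorder(stock: dict) -> bool:
    """Flag items whose stock has fallen below the reorder point."""
    return stock["on_hand"] < stock["reorder_point"]

# Two steps of the conversation: sales data first, then an inventory check.
deals = run_step("crm.closed_deals", quarter="Q1")
stock = run_step("erp.inventory", item="steel sheet")
print(deals["data"])
print("Reorder needed:", needs_reorder(stock["data"]))
```

Each step returns plain data; turning it into a conversational answer, and asking the user to approve the order, is left to the language model.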

The result is a tool that helps users work with the applications they normally use, orchestrating interactions and turning complex tasks into micro-tasks, so that users can concentrate on other work.

Because plugins build on existing API definitions and a standard definition format, they should be simple to develop. Developers who have not yet written an OpenAPI definition for a REST API can use tools like Postman to generate one automatically. Description fields in the OpenAPI definition help Bing or ChatGPT generate text around queries and choose which API to use. The resulting plugin definition is added to the LLM prompt (hidden from the chat UI), but it still counts against the context and uses tokens. It is important to note that plugins must be invoked directly by users; they are not available for all queries.
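
As a rough illustration of the role description fields play, the sketch below builds a small OpenAPI 3 definition as a Python dictionary and extracts the text a model could use to choose an operation. The Sales API, its `/deals` path and the `listDeals` operation are hypothetical, and the four-characters-per-token figure is only a rule of thumb, not a real tokenizer.

```python
import json

# Hypothetical OpenAPI 3 description of a small sales API.
openapi_spec = {
    "openapi": "3.0.1",
    "info": {
        "title": "Sales API",
        "version": "1.0.0",
        "description": "Look up closed deals in the CRM.",
    },
    "paths": {
        "/deals": {
            "get": {
                "operationId": "listDeals",
                "summary": "List deals closed in a given quarter",
                "parameters": [{
                    "name": "quarter",
                    "in": "query",
                    "required": True,
                    "description": "Fiscal quarter, for example Q1",
                    "schema": {"type": "string"},
                }],
                "responses": {"200": {"description": "Matching deals"}},
            }
        }
    },
}

def operation_descriptions(spec: dict) -> list[str]:
    """Collect the human-readable text a model can use to pick an operation."""
    lines = []
    for path, methods in spec["paths"].items():
        for method, op in methods.items():
            lines.append(f"{method.upper()} {path}: {op.get('summary', '')}")
    return lines

def rough_token_estimate(spec: dict) -> int:
    """Very rough size estimate, assuming about four characters per token."""
    return len(json.dumps(spec)) // 4

print(operation_descriptions(openapi_spec))
print(rough_token_estimate(openapi_spec))
```

Since the definition travels with the prompt, concise but specific summaries are a reasonable trade-off between guiding the model and spending context.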

The first thing to do is create a manifest for the plugin in YAML or JSON format. The developer hosts it in a specific folder at the top of their domain, under a predefined name, so that the GPT host can easily find it. Helpfully, the OpenAI plugin specifications include ways to handle authentication, so only authenticated users can reach internal APIs. Using OpenAPI descriptions also lets developers restrict GPT access to certain parts of their APIs, by editing the API definition to hide calls they don’t want the model to make. For example, a developer can expose only read operations on an API that also has update and delete capabilities.
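
A minimal sketch of both steps follows. The manifest field names are those of the OpenAI plugin manifest as I understand it (served from `/.well-known/ai-plugin.json`); every value, including the example.com URLs, is a placeholder. The `read_only` helper is a hypothetical function of my own that strips write operations from an OpenAPI definition before it is published.

```python
import copy
import json

# Field names follow the OpenAI plugin manifest; all values are placeholders.
manifest = {
    "schema_version": "v1",
    "name_for_human": "Sales Lookup",
    "name_for_model": "sales_lookup",
    "description_for_human": "Look up closed deals in the CRM.",
    "description_for_model": "Use this to list deals closed in a given quarter.",
    "auth": {"type": "none"},
    "api": {"type": "openapi", "url": "https://example.com/openapi.yaml"},
    "logo_url": "https://example.com/logo.png",
    "contact_email": "support@example.com",
    "legal_info_url": "https://example.com/legal",
}

WRITE_METHODS = {"post", "put", "patch", "delete"}

def read_only(spec: dict) -> dict:
    """Return a copy of an OpenAPI spec with write operations removed."""
    trimmed = copy.deepcopy(spec)
    for path, item in list(trimmed.get("paths", {}).items()):
        for method in list(item):
            if method.lower() in WRITE_METHODS:
                del item[method]
        # Drop paths that no longer expose any read-style operation.
        if not any(m in item for m in ("get", "head", "options")):
            del trimmed["paths"][path]
    return trimmed

# Hypothetical API with read, update and delete operations.
full_spec = {
    "openapi": "3.0.1",
    "info": {"title": "Sales API", "version": "1.0.0"},
    "paths": {
        "/deals/{id}": {"get": {}, "put": {}, "delete": {}},
        "/deals/import": {"post": {}},
    },
}

public_spec = read_only(full_spec)
print(json.dumps(manifest, indent=2))
print(list(public_spec["paths"]))
```

Hiding an operation from the definition only stops the model from being told about it; the API itself should still enforce authentication and authorization.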

Making extensions better

Plugins do not add data to Bing or ChatGPT; they add direction and focus to the output, run only when requested by the user, and return only data that is part of the response to the original query. Developers should avoid returning natural language responses, because the GPT model generates its own text around the data returned by the plugin’s API.
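
One way to follow that advice is to have the API behind the plugin return compact, structured JSON that contains only the fields the query needs. The sketch below assumes a hypothetical deals record; the field names are illustrative.

```python
import json

def deals_response(rows: list[dict]) -> str:
    """Return compact JSON containing only the fields the query needs."""
    wanted = ("name", "value", "closed_on")
    return json.dumps([{key: row[key] for key in wanted} for row in rows])

# Hypothetical CRM rows, including a field that should never reach the model.
rows = [{
    "name": "Acme renewal",
    "value": 12000,
    "closed_on": "2023-03-14",
    "internal_notes": "do not share",
}]

print(deals_response(rows))
```

The model receives the data and decides how to phrase it; the internal notes field never leaves the service.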

A useful feature of the plugin manifest is the “model description” attribute, which lets developers refine the prompt generated from the API description by adding further instructions. As developers test their plugin, they can add more controls over how it is used. ChatGPT provides a way to debug plugins by displaying requests and responses, usually in JSON format. This helps developers understand what data from their applications is being used by the AI, if not exactly how it is used or how the original request was constructed.

More complex plugins can work with vector databases to retrieve and use documents. This approach is probably most useful for applications that work with users’ document stores, which can be preprocessed with embeddings and indexed for vector search to speed up access to complex business information. That information can then be used to create documents based on responses from other applications, with the most relevant content structuring any generated text.
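
The retrieval idea can be sketched without any external services. In the example below a bag-of-words counter stands in for a real embedding model and an in-memory dictionary stands in for a vector database; both are simplifications, and the documents are made up.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real plugin would call an embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical document store, preprocessed into an in-memory index.
documents = {
    "returns-policy": "customers may return goods within thirty days of delivery",
    "supplier-terms": "raw materials are ordered from suppliers on net sixty terms",
}
index = {doc_id: embed(text) for doc_id, text in documents.items()}

def search(query: str, top_k: int = 1) -> list[str]:
    """Return the ids of the documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda doc_id: cosine(q, index[doc_id]), reverse=True)
    return ranked[:top_k]

print(search("how are raw materials ordered"))
```

A plugin built this way would return the matching passages as data, leaving the model to weave them into the generated text.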

Convert existing Microsoft Teams apps to plugins

Another interesting option is to use existing Teams message extensions with Microsoft 365 Copilot. This approach can make it quick to add AI to existing Teams bots, connecting the developer’s web services to Copilot through the Bot Framework. The key here is to write clear application descriptions and skill parameters, because they are used to create the Copilot LLM prompt and the plugin’s content requests. Outputs are embedded in chat sessions as Adaptive Cards. It’s even possible to modify an extension to be a fully conversational system via the GPT-4 model that underlies most Microsoft copilots.
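
For readers unfamiliar with Adaptive Cards, the following sketch builds a simple card payload in Python. The card structure uses the public Adaptive Cards schema; the `deal_card` helper and its fields are hypothetical, and how the card is attached to a message extension response is not shown.

```python
import json

def deal_card(name: str, value: int) -> dict:
    """Build a small Adaptive Card payload summarizing one deal."""
    return {
        "$schema": "http://adaptivecards.io/schemas/adaptive-card.json",
        "type": "AdaptiveCard",
        "version": "1.4",
        "body": [
            {"type": "TextBlock", "text": name, "weight": "Bolder", "size": "Medium"},
            {"type": "FactSet", "facts": [{"title": "Value", "value": f"${value:,}"}]},
        ],
    }

card = deal_card("Acme renewal", 12000)
print(json.dumps(card, indent=2))
```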

Microsoft’s approach to extending Bing and its other copilots is a good one for now. It’s still early days for generative AI, so a standard plugin format makes a lot of sense, allowing APIs to support more than one AI platform and reducing the need to build the same plugin multiple times. Code that works with ChatGPT will work in Bing Chat and Microsoft 365, as well as anywhere else Microsoft adds Copilot functionality in the future.