Data security and AI: How to use Copilot AI without exposing sensitive data


Generative artificial intelligence (GenAI) is already changing the way we work. 

However, the technology also comes with considerable data security and privacy concerns.

Maximizing your investment in AI means navigating these roadblocks without compromising the privacy and security of sensitive company information, customer data, and intellectual property. 

Read on to learn more about how your organization can leverage AI to its maximum potential while protecting your data.

Building the foundation by starting with user education

As with any new technology, educating your users on the responsible use of artificial intelligence is crucial. 

They must understand how to most effectively leverage AI capabilities and what information they should (and shouldn’t) input into tools like Copilot, ChatGPT, or Gemini. The potential for accidental data exposure through AI tools is very real, and a few high-profile incidents have already been reported.

In May 2023, Samsung opted for a company-wide ban on third-party generative AI tools. Previously, the company allowed engineers working for its semiconductor division to use ChatGPT to fix problems with their source code. Unfortunately, this resulted in multiple data leaks, including the source code for a new program, confidential hardware data, and meeting notes. 

Even if information is not directly leaked to the public, an organization’s data may still be used to train AI models, putting it at risk of compromise. You need to ensure your employees understand the risks and know which tools to avoid sharing sensitive data with.

Are you unsure where to start when developing your own training? Microsoft offers many courses focused on the practical and responsible use of AI, including Fundamentals of AI Security, Responsible Generative AI, and AI Fluency. Regular cybersecurity awareness training is also imperative.

Purchase paid subscriptions to further protect your data

Companies using AI tools at an enterprise level should consider purchasing paid subscriptions. 

These subscriptions typically provide increased data security over free tools and assurance that company data will not be used to train AI models. Additionally, paid subscriptions often give access to greater functionality than non-paid plans.

A paid subscription to Microsoft Copilot provides direct integration with Microsoft Word, Excel, PowerPoint, OneNote, and Outlook. It also provides preferred access to the latest AI models, higher usage limits, and advanced features like Copilot Voice. 

Microsoft also protects Copilot through a comprehensive defense-in-depth strategy, though this applies equally to both the free and professional versions of the tool.

Restrict AI usage for teams handling sensitive data

It’s also important to consider how AI tools fit into your network infrastructure, from the data center and workplace to your employees’ home offices. 

Even if your employees do everything right, there’s still the risk of data exfiltration—or potentially even a data breach—without proper safeguards. Slack AI is a perfect example of the potential risks. 

Though no known incidents are currently associated with the tool, security firm PromptArmor reported in August 2024 that it contained a prompt injection vulnerability. The firm advised that threat actors could potentially use Slack AI to exfiltrate data from private Slack channels. Though Slack patched the vulnerability a few days later, it’s a reminder of the importance of controlling what AI bots can access and who can use AI.

Using Microsoft Purview sensitivity labels and Microsoft Purview Information Protection, your IT admins can classify sensitive content and enforce permissions on what your AI tools can reach. They can also isolate those tools from specific systems and devices by combining Azure service tags with additional authorization and authentication. In addition to these controls, you should limit the use of AI bots to employees who have undergone training on best practices and responsible use.
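To make that last point concrete, here's a minimal sketch of how an internal AI bot could check a user's membership in a dedicated Entra ID security group (one your trained employees are added to) before answering a request. The group, app registration, and helper names are placeholders for illustration, not a prescribed setup.

```python
# Sketch: allow an internal AI bot to serve only users in an "approved" Entra ID group.
# Assumes an app registration with the Graph GroupMember.Read.All application permission
# and a hypothetical security group that trained employees are added to.
import requests
from azure.identity import ClientSecretCredential

TENANT_ID = "<tenant-id>"
CLIENT_ID = "<app-client-id>"
CLIENT_SECRET = "<app-client-secret>"
APPROVED_GROUP_ID = "<object-id-of-approved-users-group>"  # hypothetical group

credential = ClientSecretCredential(TENANT_ID, CLIENT_ID, CLIENT_SECRET)


def user_is_approved(user_id: str) -> bool:
    """Return True if the user belongs to the approved-for-AI security group."""
    token = credential.get_token("https://graph.microsoft.com/.default").token
    resp = requests.get(
        f"https://graph.microsoft.com/v1.0/users/{user_id}/memberOf",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    # Pagination is omitted for brevity in this sketch.
    group_ids = {item["id"] for item in resp.json().get("value", [])}
    return APPROVED_GROUP_ID in group_ids


def handle_bot_request(user_id: str, prompt: str) -> str:
    if not user_is_approved(user_id):
        return "AI access is limited to staff who have completed responsible-use training."
    # ...forward the prompt to your AI backend here...
    return "OK"
```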

Put strong data governance measures in place

Who has access to your data? How can they access it? What authentication measures are in place to prevent unauthorized access? 

These are all crucial questions to ask before you consider using any AI tool. 

You need to know where all your data is stored and how it’s stored. It’s also important to periodically check and update access permissions to ensure no one has access to anything they shouldn’t.

In other words, everyone in the organization, from interns to the CEO, should have access only to the systems and data they need to do their jobs. This is the principle of least privilege. 
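One way to make those periodic reviews routine is to script them. The sketch below uses the Azure SDK for Python to list every role assignment at a subscription scope so it can be compared against what each person actually needs; the subscription ID is a placeholder, and this is just one possible approach.

```python
# Sketch: enumerate Azure RBAC role assignments at a given scope for a periodic
# least-privilege review. The subscription ID below is an illustrative placeholder.
from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient

SUBSCRIPTION_ID = "<subscription-id>"
SCOPE = f"/subscriptions/{SUBSCRIPTION_ID}"

credential = DefaultAzureCredential()
auth_client = AuthorizationManagementClient(credential, SUBSCRIPTION_ID)

# List every role assignment that applies at the subscription scope so each one can be
# reviewed against the principle of least privilege.
for assignment in auth_client.role_assignments.list_for_scope(SCOPE):
    print(assignment.principal_id, assignment.role_definition_id, assignment.scope)
```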

Work to establish a single source of truth for all company data, consolidating to eliminate silos and unnecessary redundancy. From there, you have two options. Azure Local allows companies to run their own large language models (LLMs) on-premises, ensuring both training data and company data remain within the local data center rather than being hosted in the cloud.

Alternatively, Azure Data Lake Storage provides a secure, cloud-based centralized repository designed to store massive volumes of structured and unstructured data that’s easily accessible for analytics and model training.
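For the cloud route, landing a document in that centralized repository can be as simple as the following sketch using the Azure Data Lake Storage Python SDK; the account, file system, and path names are placeholders.

```python
# Sketch: upload a document into an Azure Data Lake Storage Gen2 file system that
# serves as the centralized repository. Names below are illustrative placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

ACCOUNT_URL = "https://<storage-account>.dfs.core.windows.net"

service = DataLakeServiceClient(account_url=ACCOUNT_URL, credential=DefaultAzureCredential())
file_system = service.get_file_system_client("company-data")          # the consolidated repository
file_client = file_system.get_file_client("contracts/2024/q1.parquet")

with open("q1.parquet", "rb") as data:
    # overwrite=True replaces any existing copy so the repository stays the single source of truth
    file_client.upload_data(data, overwrite=True)
```

From there, analytics and model-training jobs can read from the same file system, which keeps the single source of truth intact.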

Encrypt and secure your information

Governance isn’t the only thing to consider when creating policies for managing your data.

There’s also the matter of security. You’ll want to make sure you have all the basics in place. This includes things like: 

  • At-rest and in-transit encryption
  • Regular malware checks and file integrity scans (a minimal integrity-check sketch follows this list)
  • Full visibility into how, where, and to whom your data is transmitted
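To illustrate the file-integrity item, here's a minimal sketch (standard library only, with hypothetical paths) that hashes files and flags anything that no longer matches a recorded baseline:

```python
# Sketch: a simple file-integrity check that hashes files and compares them against
# a previously recorded baseline. Paths and baseline format are illustrative only.
import hashlib
import json
from pathlib import Path

BASELINE_FILE = Path("integrity_baseline.json")  # hypothetical baseline location
WATCHED_DIR = Path("/srv/shared-data")           # hypothetical directory to monitor


def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_integrity() -> list[str]:
    """Return the files whose contents no longer match the recorded baseline."""
    baseline = json.loads(BASELINE_FILE.read_text())
    changed = []
    for relative_path, expected_hash in baseline.items():
        current = WATCHED_DIR / relative_path
        if not current.exists() or sha256_of(current) != expected_hash:
            changed.append(relative_path)
    return changed


if __name__ == "__main__":
    for name in check_integrity():
        print(f"Integrity check failed: {name}")
```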

We also recommend encrypting all communication between AI bots and Azure. 

Fortunately, with the Azure ecosystem, you can access a wide range of security options, including: 

  • 256-bit AES encryption for protecting all data stored in Azure
  • Microsoft Entra ID for identity and access management (IAM), illustrated in the sketch after this list
  • Microsoft Defender to protect against malware 
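To show what Entra ID-based access looks like in code, the sketch below authenticates with DefaultAzureCredential and pulls a secret from Azure Key Vault instead of hard-coding keys; the vault URL and secret name are placeholders.

```python
# Sketch: use Microsoft Entra ID (via DefaultAzureCredential) rather than hard-coded
# keys to retrieve a secret from Azure Key Vault. URL and secret name are placeholders.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

VAULT_URL = "https://<your-key-vault>.vault.azure.net"

# DefaultAzureCredential tries managed identity, environment credentials, Azure CLI sign-in,
# and so on, so no secrets need to live in the application code itself.
credential = DefaultAzureCredential()
secrets = SecretClient(vault_url=VAULT_URL, credential=credential)

api_key = secrets.get_secret("ai-service-api-key").value  # hypothetical secret name
```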

Microsoft has also built Copilot as a secure layer on top of OpenAI’s platform, offering different service levels for government and enterprise sectors. 

Although all deployments of Copilot are built to a high security standard, Microsoft 365 Copilot GCC lets administrators tighten data security by controlling features such as web grounding, and it provides access to capabilities like auditing and eDiscovery.

Other security controls to consider

Lastly, you should consider how to customize your company’s AI model and control the data it’s trained on. 

Consider turning off Internet capabilities to ensure a model is trained exclusively on company data. 

If you’re using a data lake or something similar, try connecting the model directly to that. However you decide to go about training, Microsoft’s Azure AI Foundry provides access to a wide range of LLMs beyond OpenAI’s, allowing you to develop and deploy a custom AI app or API for just about any use case.
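As an illustrative sketch, calling a model you’ve deployed through Azure AI Foundry from a custom app can look like the following, using the Azure OpenAI client from the openai Python package. The endpoint, deployment name, and API version are placeholders, and other model providers in AI Foundry are reached through their own clients.

```python
# Sketch: query a model deployed through Azure AI Foundry / Azure OpenAI from a custom app.
# Endpoint, deployment name, and API version below are illustrative placeholders.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key="<key-from-key-vault>",   # ideally retrieved at runtime, never hard-coded
    api_version="2024-06-01",
)

response = client.chat.completions.create(
    model="<your-deployment-name>",   # the deployment you created, not a public model name
    messages=[
        {"role": "system", "content": "Answer using only approved internal documentation."},
        {"role": "user", "content": "Summarize our data-handling policy for new hires."},
    ],
)

print(response.choices[0].message.content)
```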

You also have the option to run your models locally, allowing you to keep your data on-premises. Alternatively, you can create custom models in AI Builder, which also offers several prebuilt models designed for specific scenarios and use cases.
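If you do run a model locally, one common pattern, sketched here under the assumption that your on-premises inference server exposes an OpenAI-compatible endpoint (as many popular local runtimes do), is to point the same client at localhost so prompts never leave your network:

```python
# Sketch: send prompts to a locally hosted model through an OpenAI-compatible endpoint,
# keeping data on-premises. The URL, port, and model name are assumptions about your setup.
from openai import OpenAI

local_client = OpenAI(
    base_url="http://localhost:8000/v1",  # your on-premises inference server
    api_key="not-needed-locally",         # many local servers ignore the key, but the client requires one
)

reply = local_client.chat.completions.create(
    model="<local-model-name>",
    messages=[{"role": "user", "content": "Draft a summary of this internal report."}],
)

print(reply.choices[0].message.content)
```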

Unlock the full potential of GenAI in Microsoft Teams

GenAI is undoubtedly one of the most disruptive technologies in recent years, and we’re only at the beginning of its evolution. 

However, for all its potential, adopting it without thinking about security and privacy is a substantial risk.

Need help determining the best way to integrate AI into your workflows? Book a demo today to see how to add powerful AI capabilities to Microsoft Teams and more.
