Meta just announced that it is installing software on US-based employees' computers that captures mouse movements, clicks, keystrokes, and periodic screenshots. The tool, called Model Capability Initiative (MCI), is designed to train AI agents to use computers the way humans do.
The program runs across hundreds of apps and websites, including Google, LinkedIn, Slack, GitHub, Atlassian, and Meta's own platforms. According to Reuters, the data will be used to improve Meta's AI models in areas where they struggle to replicate human-computer interaction, like choosing from dropdown menus and using keyboard shortcuts.
Multiple Meta employees described the initiative as "dystopian" in internal messages. Others raised concerns about passwords, immigration details, health information, and product development secrets being captured. Meta says safeguards are in place to protect "sensitive content" but has not specified what data gets excluded from collection.
This is not just a Meta story. It is a preview of decisions every organization will face as AI development moves from using public datasets to harvesting internal workflows. The security, privacy, and governance implications are significant, and most organizations are not prepared for them.
What Meta Is Actually Doing
According to internal memos viewed by CNBC, a staff AI research scientist posted in a channel for Meta's Superintelligence Labs team that MCI would capture "on-screen content as the context of what was being manipulated or interacted with." The memo stated that in order to "teach our models to be able to use computers," Meta requires a "big and unbiased" data set reflecting how employees work on their corporate devices.
Meta CTO Andrew Bosworth described the broader vision in a separate internal memo: "The vision we are building towards is one where our agents primarily do the work and our role is to direct, review and help them improve." The goal, he wrote, was "a closed loop" in which agents could "automatically see where we felt the need to intervene so they can be better next time."
Meta spokesperson Andy Stone stated that the data would not be used for performance assessments and that safeguards exist to protect sensitive content, though he did not elaborate on what types of data would be excluded.
The same week Meta announced MCI, the company also confirmed it would lay off approximately 8,000 employees, roughly 10% of its workforce, as part of a continued push for efficiency. The company will also leave 6,000 open positions unfilled.
The juxtaposition is hard to ignore. Employees are training the systems designed to automate their roles, using data collected from their own workflows, with unclear boundaries on what is being captured.
The Privacy Problem
The privacy implications of workplace AI training data collection extend well beyond Meta.
In the United States, there is currently no federal law limiting employer surveillance of employee computer activity. According to Ifeoma Ajunwa, a law professor who studies workplace surveillance and was quoted by Reuters, "On the U.S. side, federally, there is no limit on worker surveillance." State-level laws, where they exist, require at most that employees be broadly informed that monitoring is occurring.
European law tells a different story. Valerio De Stefano, a law professor at York University who studies technology and comparative labor law, told Reuters that such monitoring would likely be prohibited under GDPR. In Italy, using electronic monitoring to track employee productivity is explicitly illegal. In Germany, courts have held that employers can deploy keystroke logging only in exceptional circumstances, such as suspicion of a serious criminal offense.
For organizations operating in multiple jurisdictions, the regulatory landscape creates immediate compliance risk. A monitoring tool that is permissible in the US could expose the company to significant penalties under GDPR, which carries fines of up to 20 million EUR or 4% of global annual turnover, whichever is higher.
The Security Risks Nobody Is Talking About
Beyond the privacy debate, there are concrete security risks that organizations need to consider before deploying any system that captures employee workflows for AI training.
Credential exposure. Keystroke logging and screenshot capture inevitably encounter authentication events. Employees type passwords. They navigate to credential management interfaces. They interact with API keys, tokens, and secrets in development environments. Unless the capture system has robust, verified mechanisms to exclude this data, credentials end up in training datasets. Once they are in a training dataset, they are extremely difficult to remove and could surface in model outputs.
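To make the mitigation side concrete, here is a minimal sketch of the kind of pre-ingestion redaction a capture pipeline would need, assuming captured text arrives as plain strings. The patterns and function name are illustrative, not a description of Meta's actual safeguards; production systems rely on dedicated secret scanners with far larger rule sets.

```python
import re

# Illustrative patterns for common credential formats. Real secret
# scanners maintain hundreds of rules plus entropy-based heuristics.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                        # AWS access key ID
    re.compile(r"ghp_[A-Za-z0-9]{36}"),                     # GitHub personal access token
    re.compile(r"(?i)(password|passwd|pwd)\s*[=:]\s*\S+"),  # inline passwords
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._\-]+"),           # bearer tokens
]

def redact_secrets(captured_text: str) -> str:
    """Replace anything matching a known secret pattern before the
    captured text is written to a training dataset."""
    for pattern in SECRET_PATTERNS:
        captured_text = pattern.sub("[REDACTED]", captured_text)
    return captured_text
```

Pattern matching like this is inherently lossy: it misses credential formats it has never seen, and it does nothing for secrets captured in screenshots, which would first require OCR. That gap is exactly why unspecified "safeguards" deserve scrutiny.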
Sensitive data aggregation. Employee workflows touch regulated data constantly. An HR employee reviewing benefits applications handles protected health information. A finance team member reviewing loan applications handles financial PII. A legal team member working on litigation holds handles privileged communications. Capturing all on-screen content across these workflows creates a concentrated repository of sensitive data that was never intended to be aggregated in this way.
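A stronger control is to decide at capture time, rather than after aggregation, whether a workflow may be recorded at all. Here is a minimal sketch of a context blocklist, assuming the capture agent knows which application or URL is in the foreground; the domains are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical contexts whose on-screen content should never be
# captured because they routinely display regulated data.
BLOCKED_DOMAINS = {
    "benefits.example.com",   # HR workflows (protected health information)
    "lending.example.com",    # financial PII
    "legalhold.example.com",  # privileged communications
}

def should_capture(active_url: str) -> bool:
    """Decide whether a capture event may be recorded at all.
    Dropping the event at the source beats redacting it later."""
    host = urlparse(active_url).hostname or ""
    return host not in BLOCKED_DOMAINS
```

Excluding entire contexts at the source means the concentrated repository never gets built, which is a stronger guarantee than any post-hoc redaction.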
Insider threat amplification. A system that captures and stores detailed records of how employees interact with internal systems, including which tools they use, what data they access, and how they navigate sensitive workflows, creates a high-value target. If that training data repository is compromised, an attacker gains not just raw data but a detailed map of how to navigate the organization's internal systems the way a trusted employee would.
Supply chain contamination. If employee workflow data is used to train AI models that are then deployed externally or shared across organizational boundaries, there is a risk of sensitive information leaking into model weights or outputs. Training data extraction attacks, where adversaries probe a model to recover its training data, are a documented and actively researched threat. A 2023 study from researchers at Google DeepMind, the University of Washington, Cornell, and other institutions demonstrated the extraction of gigabytes of training data from production language models, including ChatGPT. An AI model trained on employee keystrokes that included customer data, internal communications, or trade secrets could potentially surface that information in responses to external users.
What This Means for Your Organization
Even if your company is not Meta, this development raises questions that security and privacy leaders should be asking now, before similar initiatives arrive at your organization.
Do you know what data your existing AI tools are capturing? Many organizations already use AI-powered productivity tools that ingest employee communications, documents, and workflows. The difference between those tools and Meta's MCI is one of degree, not kind. Understanding what data your current AI tools collect, how it is stored, how it is used for model training, and what opt-out mechanisms exist is a baseline governance requirement.
Do you have an AI data governance policy? Most organizations have data classification policies and data handling procedures. Few have extended those policies to cover data used for AI model training. Questions like "Can employee workflow data be used for model training?" and "What categories of data must be excluded?" and "Who approves new AI data collection initiatives?" need documented answers.
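As an illustration of what documented answers can look like when they are machine-enforceable, here is a minimal sketch of a policy gate, assuming your classification framework assigns each data source a category. All names and categories here are hypothetical, not a standard.

```python
from enum import Enum

class DataCategory(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"
    REGULATED = "regulated"   # PHI, financial PII, privileged material

# Hypothetical policy: only these categories may feed model training.
TRAINING_ALLOWED = {DataCategory.PUBLIC, DataCategory.INTERNAL}

def approve_training_use(source_name: str, category: DataCategory) -> bool:
    """Gate a proposed training data source against documented policy.
    Denied sources should route to the governance approval process."""
    if category in TRAINING_ALLOWED:
        return True
    print(f"{source_name}: {category.value} data requires governance review")
    return False

approve_training_use("slack-export-2025", DataCategory.CONFIDENTIAL)  # denied
```

Encoding the policy this way makes the approval path explicit: anything outside the allowed categories routes to a named governance process instead of defaulting to collection.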
Is your vendor doing this? Meta is doing this internally and publicly. Other organizations, including your AI vendors, may be doing something similar with less visibility. Review your vendor agreements and data processing addendums. Look specifically for language around model training, data retention for model improvement, and whether your organization's data can be used to train models that serve other customers.
Are your employees aware? Transparency matters both legally and culturally. Even in jurisdictions where employee monitoring is legal, springing a keystroke logger on your workforce without clear communication and context creates trust issues that are difficult to repair. If your organization is considering any form of AI-driven workflow capture, involve your legal, HR, and communications teams before deployment.
The Bigger Picture
Meta's MCI is a signal of where enterprise AI is heading. As foundation models mature, the bottleneck is shifting from model architecture to training data. Scale AI's co-founder Alexandr Wang said it directly: "For a lot of the capabilities that we want to build into the models, the biggest blocker is actually a lack of data." Meta paid $14.3 billion for a 49% stake in Scale AI in 2025 and installed Wang as the head of its new superintelligence team. Scale built its entire business on turning human work into training data.
The pressure to capture real human-computer interaction data will grow across the industry. Organizations that want to build or fine-tune AI agents capable of performing knowledge work will need training data that reflects how knowledge work actually gets done. Employee workflows are the richest available source of that data.
This is not inherently wrong. But it requires governance, transparency, and security controls that most organizations have not built yet. The organizations that get ahead of this will be the ones that establish clear policies now, before the pressure to collect becomes urgent.
What You Should Do Now
If you are a security or privacy leader, this is the checklist:
Audit your current AI tools. Understand what employee data they collect, how it is stored, whether it is used for model training, and what data processing agreements govern that usage.
Establish AI data governance policies. Define what categories of employee data can and cannot be used for AI model training. Document approval processes for new AI data collection initiatives. Ensure these policies align with your existing data classification framework.
Review vendor agreements. Look for language around model training, data improvement, and cross-customer data usage. If your vendor's terms allow training on your data by default, negotiate an opt-out or find a vendor with clearer boundaries.
Assess regulatory exposure. If you operate across jurisdictions, understand where employee monitoring for AI training purposes is permissible and where it creates liability. GDPR, state-level privacy laws, and emerging AI regulations all have implications.
Plan for the conversation. Whether or not your organization pursues AI workflow capture, your employees are going to hear about Meta's MCI and wonder if something similar is happening where they work. Have an answer ready.
The question is no longer whether organizations will use employee data to train AI. The question is whether they will do it with the governance, security, and transparency that the moment requires.
At Ayliea, we help mid-market organizations navigate the intersection of AI adoption, security, and compliance. If you are evaluating how AI tools interact with your organization's data, or need help building an AI governance framework, book a free scoping call to discuss your situation.
