AI agents are all the rage, and with that have come various stories of agents performing unwanted and consequential actions that diverge from what the user asked of them. From Claude deleting a company’s database to Meta AI’s alignment director having her emails deleted by an autonomous agent, giving an AI agent excess access to data and tools is fraught with issues. Whilst fully autonomous, access-all-areas agents may be a utopian dream (or nightmare), large institutions like the University, with an absolute obligation to protect personal data and maintain stability, must put the necessary controls and guardrails in place so that we can harness the power of agentic AI whilst eliminating the risk of an agent causing damage.

What is Agentic AI?
Agentic AI is an extension of standard AI chatbots, like ChatGPT and Claude, in which the LLM is given access to tools and knowledge so that it can act autonomously to achieve goals with minimal human supervision. For example, rather than just answering questions, a customer service agent could raise service desk tickets or proactively try to fix a complaint. This introduces the risk of an agent doing the wrong thing, rather than a chatbot just saying the wrong thing.
Agents are given access to tools and resources through various methods, but the most popular and widely supported is the Model Context Protocol (MCP). MCP is like giving the AI a set of ‘approved remote controls’ – each remote can only do a specific thing (read a calendar, create a ticket), and we decide which remotes it’s allowed to hold. An MCP Server is a granular set of tools provided to an agent, along with context on what each tool does and how and when the agent should use it. An MCP Server can be provided by a third party like Microsoft, or built internally so that agents can interact with internal data.
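To make the ‘remote control’ idea concrete, here is a minimal Python sketch of an allowlisted tool set. It does not use the real MCP SDK: the Tool class, the tool names, the handlers and the call_tool helper are all hypothetical, and only illustrate that an agent can invoke nothing beyond the remotes it has been handed.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Tool:
    """One 'remote control': a name, a description for the agent, and a handler."""
    name: str
    description: str
    handler: Callable[[dict], str]


def read_calendar(args: dict) -> str:
    # Hypothetical handler; a real tool would query a calendar service.
    return f"No events found for {args['date']}"


def create_ticket(args: dict) -> str:
    # Hypothetical handler; a real tool would call the service desk system.
    return f"Ticket created: {args['summary']}"


# Every tool that exists (hypothetical catalogue).
ALL_TOOLS: Dict[str, Tool] = {
    t.name: t
    for t in (
        Tool("read_calendar", "Read calendar entries for a given date.", read_calendar),
        Tool("create_ticket", "Raise a service desk ticket.", create_ticket),
    )
}

# The remotes this particular agent is allowed to hold (assumed for illustration).
AGENT_ALLOWLIST = {"read_calendar"}


def call_tool(name: str, args: dict) -> str:
    """Run a tool only if it is on the agent's allowlist."""
    if name not in AGENT_ALLOWLIST:
        raise PermissionError(f"Tool '{name}' is not available to this agent")
    return ALL_TOOLS[name].handler(args)
```

In this sketch the agent can read a calendar, but a request to create a ticket is refused before any handler runs, because that remote was never handed over.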
The secret to agentic AI security isn’t tearing down our existing security governance, best practice or frameworks, but adapting these proven approaches to meet the unique challenge created by AI agents and MCP.
Tim Packwood – Head of IT Innovation
How we securely use MCP
Whilst this is new technology, many of the same security principles still apply: if you wouldn’t give an uninformed intern a connection that can delete a database, why would you give one to an agent? When building MCP connections, we will adhere to six key guardrails:
Least Privilege – Any MCP Server will have the minimum access it requires to perform its function. A tool to read files on a SharePoint site will not have write access.
Read-Only by default – Any MCP Server that requires write access must be approved, with its use case, blast radius and associated risk assessed.
Human-in-the-loop – We will apply varying human-in-the-loop protocols according to the risk level of each action. At the highest risk level, any action the MCP Server wants to take must be approved outside of the chatbot interface.
Trust Nothing – The rise of LLMs has introduced a whole new set of security threats, the main one being prompt injection: maliciously crafted text that manipulates the chatbot into performing actions outside its intended use case. This risk can be reduced by sanitising the input and output of the MCP Server and applying AI safety filters to the content.
Enforce Security at every layer – From the chat interface to the database, the user’s security and access controls will be enforced throughout the authorisation flow.
Full Audit Trail – The whole process will be logged to track the tool usage and allow for full visibility if anything does go wrong.
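As a rough illustration of how some of these guardrails could fit together, the Python sketch below gates every tool call by risk level, refuses write actions that lack a recorded approval, and logs each decision. It is a sketch under stated assumptions, not our implementation: the Risk levels, the run_action function, the approved_by parameter (standing in for an approval given outside the chatbot) and the in-memory AUDIT_LOG are all hypothetical.

```python
import json
from datetime import datetime, timezone
from enum import Enum
from typing import Callable, List, Optional


class Risk(Enum):
    READ = "read"    # read-only: allowed by default
    WRITE = "write"  # changes data: needs approval from a human


# Hypothetical in-memory audit trail; a real system would use durable storage.
AUDIT_LOG: List[str] = []


def audit(event: str, **details) -> None:
    """Append one timestamped JSON record to the audit trail."""
    record = {"time": datetime.now(timezone.utc).isoformat(), "event": event, **details}
    AUDIT_LOG.append(json.dumps(record))


def run_action(
    tool: str,
    risk: Risk,
    user: str,
    action: Callable[[], str],
    approved_by: Optional[str] = None,
) -> str:
    """Run a tool call only if its risk level allows it, logging every decision."""
    if risk is not Risk.READ and approved_by is None:
        # Read-only by default: anything else is blocked until a human approves it.
        audit("blocked", tool=tool, user=user, risk=risk.value)
        raise PermissionError(f"'{tool}' requires human approval")
    result = action()
    audit("executed", tool=tool, user=user, risk=risk.value, approved_by=approved_by)
    return result
```

For example, run_action("read_file", Risk.READ, "alice", lambda: "contents") runs straight away, while the same call with Risk.WRITE raises PermissionError unless approved_by names the person who signed it off. Both outcomes leave a record in the audit trail.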
Is the data secure?
All agents developed for the university will be built on NebulaOne, the university’s secure AI platform. NebulaOne provides access to leading AI models such as ChatGPT and Claude with a key guarantee: nothing you enter is ever used to train the underlying models.
Any MCP tools built to power these agents will fall into one of two categories — either developed through a trusted third-party provider with full data security agreements in place, or built directly on the university’s own infrastructure, keeping data entirely within our control. Either way, your data stays protected.
What this means
By developing stringent guardrails for AI agents, we can provide safe and secure agentic AI capabilities for all members of the university through NebulaOne. We can maximise the functionality of agents without the risk of deleted emails or production databases. As with everything in AI, things are moving quickly, so these guardrails will continue to evolve and inform our development as new ways of working with agents emerge.