In August 2025, the UK’s First-tier Tribunal (FTT) ordered HMRC to reveal whether it had used generative Artificial Intelligence (AI), such as ChatGPT, in correspondence to taxpayers about R&D tax relief claims (Elsbury v Information Commissioner [2025] UKFTT 915 (GRC)).
For the first time, an English court confronted the secrecy surrounding algorithmic decision-making in tax administration. The case highlights a tension that exists globally: tax authorities are embracing AI to detect fraud and manage compliance at scale, while taxpayers and courts are demanding transparency, accountability and fairness.
This article examines how AI is being deployed in tax administration, the litigation risks it creates, and what corporate taxpayers should be doing now to prepare. The article draws on recent OECD findings, UK experience with HMRC’s “Connect” analytics platform, international case law, comparative perspectives and practical strategies to help businesses and their advisers navigate this rapidly changing landscape.
The new reality: AI-driven tax enforcement
AI has moved decisively from experimental pilots to operational deployment in tax administrations worldwide. Machine learning (ML), computer vision, and large language models (LLMs) are now integral to risk assessment, fraud detection, case triage, and taxpayer services. These tools accelerate enforcement, widen what can be detected, and compress the timetable between anomaly-spotting and enforcement action. Yet they also introduce significant new legal risks: algorithmic bias, opacity of reasoning, weak oversight, privacy intrusions, and contested disclosure of model evidence.
The OECD’s 2025 report ‘Governing with Artificial Intelligence’ (the OECD AI Report) documents how member countries are embedding AI into revenue collection. While the efficiency gains are clear, so are the potential challenges. In the UK, HMRC’s long-standing 'Connect' analytics platform, coupled with emerging experiments in LLMs, underscores how deeply AI has permeated compliance and enforcement. This reality forces companies to adapt in two ways:
- become 'algorithm-ready', ensuring that data, narratives, and governance are coherent and defensible when scrutinised by AI-driven systems; and
- hardwire tax dispute strategies that integrate domestic remedies with cross-border stabilisers such as Mutual Agreement Procedures (MAP) and Advance Pricing Agreements (APA).
What’s new: imaging analytics can now spot undeclared assets (e.g. constructions, pools), ML engines flag anomalies across VAT, corporation tax and customs returns at national scale, and courts are experimenting with AI for case routing and drafting. Expect more tax enquiries triggered by opaque models, quicker escalation of enquiries to enforcement, and discovery disputes over the use of AI models, training data and error rates. The litigation battleground (in tax and beyond) will centre on adequacy of reasoning, proportionality, data-protection compliance, and access to the AI model's design, training data, and results.
How tax administrations use AI today
Risk assessment and enquiry selection
Tax authorities now routinely integrate returns, third-party data, financial transactions, and customs feeds into ML models that generate risk scores. The OECD reports that 29 of 38 members use AI in tax enforcement, primarily to detect evasion and fraud, support decision-making, and provide taxpayer services. According to the OECD AI Report, Austria’s Predictive Analytics Competence Centre, for example, processed millions of cases across income tax, VAT, and corporate tax in 2023, recovering €185m in incremental revenue attributable to AI-assisted detection. Greece’s Independent Authority for Public Revenue combines over 100 criteria to flag anomalies in markets vulnerable to fraud, such as fuel distribution. For multinational taxpayers, this means that consistency of narratives and reconciliations across jurisdictions is no longer optional; inconsistencies are more visible than ever to risk engines.
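As a deliberately simplified illustration of how such an engine might combine signals into a score, consider the following Python sketch. The inputs, weights and threshold logic are entirely hypothetical; real systems such as HMRC's Connect combine far more signals and their internals are not public:

```python
# A deliberately simplified, hypothetical risk-scoring sketch.
# Real tax-authority engines (e.g. HMRC's Connect) combine far more
# signals and their internals are not public.
from dataclasses import dataclass

@dataclass
class TaxpayerRecord:
    declared_turnover: float   # from the filed return
    bank_inflows: float        # from third-party financial data
    sector_fraud_rate: float   # historical fraud rate for the sector (0-1)

def risk_score(r: TaxpayerRecord) -> float:
    """Combine weighted signals into a 0-1 risk score (illustrative weights)."""
    # Unexplained gap between third-party inflows and declared turnover.
    gap = max(0.0, (r.bank_inflows - r.declared_turnover) / max(r.bank_inflows, 1.0))
    return min(1.0, 0.6 * gap + 0.4 * r.sector_fraud_rate)

record = TaxpayerRecord(declared_turnover=800_000,
                        bank_inflows=1_000_000,
                        sector_fraud_rate=0.3)
print(f"Risk score: {risk_score(record):.2f}")  # 0.24 -> enquiry if above a threshold
```

Even in this toy form, the sketch shows why cross-source consistency matters: the score is driven by the gap between what a taxpayer declares and what third-party feeds report.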
Litigation implications: opaque or 'black-box' enquiry selection can invite judicial challenge in some jurisdictions, especially where taxpayers can argue that they were unfairly targeted, subjected to biased profiling or unfettered discretion, or given no intelligible explanation for their selection.
Computer vision and alternative data sources
France’s 'Foncier Innovant' programme uses aerial imagery to detect undeclared developments such as swimming pools and extensions, cross-checking them against filed tax returns. Greece has used geospatial analytics to identify undeclared pools for local taxation. These projects demonstrate authorities’ growing reliance on non-traditional data and image analytics to expand the enquiry beyond filed returns.
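The cross-checking step itself can be pictured very simply. The Python sketch below, using invented parcel identifiers, illustrates the kind of set comparison involved: parcels where imagery detects a pool but the filings declare none are queued for review:

```python
# Hypothetical cross-check: parcels where aerial imagery detected a pool
# but the corresponding filings declare none. Identifiers are invented.
detected_pools = {"parcel-001", "parcel-007", "parcel-019"}  # image analytics output
declared_pools = {"parcel-001"}                              # from filed returns

flagged = detected_pools - declared_pools
print(sorted(flagged))  # ['parcel-007', 'parcel-019'] -> queued for human validation
```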
Litigation implications: disputes will increasingly revolve around accuracy (e.g. false positives from image misclassification), proportionality, and privacy compliance under GDPR. Authorities often point to human validation steps to demonstrate procedural fairness, but courts will ultimately decide whether that validation is meaningful or merely rubber-stamps automated detections.
Administrative decision-support and case triage
In Brazil, administrative courts have tested ML to cluster similar cases and assign them to consistent panels, reporting sensitivity and specificity levels above 80% in pilots. While this promises greater 'productivity' in the sense of 'throughput', it also risks templated or mechanised reasoning. If litigants feel that their case has been decided by a formula rather than an individualised assessment, challenges on grounds of adequacy of reasons and procedural fairness could proliferate.
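For context, sensitivity and specificity are standard classification metrics: sensitivity is the proportion of truly similar cases the model correctly groups together, and specificity is the proportion of dissimilar cases it correctly keeps apart. A minimal Python sketch, using hypothetical counts rather than the Brazilian pilot's actual figures, shows how they are computed:

```python
# Sensitivity and specificity from a binary confusion matrix.
# All counts below are hypothetical, purely for illustration.

def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: correctly matched cases / all cases that should match."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """True negative rate: correctly excluded cases / all cases that should not match."""
    return tn / (tn + fp)

# Hypothetical pilot: 850 correct matches, 150 missed,
# 820 correct exclusions, 180 false matches.
print(f"Sensitivity: {sensitivity(850, 150):.0%}")  # 85%
print(f"Specificity: {specificity(820, 180):.0%}")  # 82%
```

Note what "above 80%" implies in practice: even at these levels, a meaningful minority of cases is routed wrongly, which is precisely where adequacy-of-reasons challenges would arise.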
Taxpayer services and LLM-enabled guidance
Authorities worldwide are deploying chatbots and LLMs to triage taxpayer queries, prepopulate forms, and even provide tailored guidance. While this improves access and compliance, it introduces risks: if a chatbot provides incorrect advice and the taxpayer relies on it, disputes may arise over legitimate expectation and fairness. Preserving transcripts, identifiers, and metadata of such interactions will be essential for taxpayers seeking to defend their reliance in future litigation.
The new litigation risk landscape
Bias and discrimination
Models that disproportionately flag certain sectors, business sizes, or geographies may be challenged as discriminatory. Courts in various countries are increasingly interested in how representative training datasets are, how features were selected, and what governance exists. The OECD urges adoption of 'trustworthy AI' principles to mitigate such risks.
Explainability and reasons
The duty to give reasons is fundamental. Black-box scoring systems that lack human-interpretable rationales are susceptible to attack. Tax authorities may need to produce 'model cards' or feature-importance explanations to meet this obligation.
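By way of illustration, the sketch below shows the kind of feature-importance output that could accompany a risk score. It uses the open-source scikit-learn library on synthetic data; the feature names and figures are entirely hypothetical and not drawn from any tax authority's system:

```python
# Hypothetical feature-importance explanation of the kind an authority
# might disclose. Feature names and data are invented; requires scikit-learn.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
features = ["vat_gap", "sector_risk", "filing_delay_days", "related_party_ratio"]

X = rng.random((500, len(features)))        # synthetic taxpayer data
y = (X[:, 0] + X[:, 3] > 1.1).astype(int)   # synthetic "flagged" label

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# One human-readable weight per input, sorted by influence.
for name, weight in sorted(zip(features, model.feature_importances_),
                           key=lambda p: -p[1]):
    print(f"{name}: {weight:.2f}")
```

An output of this kind, attached to a 'model card', converts an opaque score into a rationale a tribunal can test, without necessarily disclosing source code or training data.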
Evidence, disclosure, and confidentiality
Access to model documentation, validation data, and error rates is becoming central to disputes involving AI. Authorities will resist wholesale disclosure, citing confidentiality and system integrity. Courts may put special arrangements in place to protect sensitive information.
Automated decision-making and human oversight
Where decisions such as withholding/delaying refunds or penalty assessments are materially automated, taxpayers may challenge them as lacking sufficient human review. Evidence of human-in-the-loop oversight, override logs, and accountable sign-offs will be key.
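What such evidence might look like in practice can be sketched as a structured audit record. The Python example below is a hypothetical illustration of an override log, not any authority's actual schema:

```python
# Hypothetical structure for a human-override audit record -- the kind of
# evidence that could show meaningful human-in-the-loop review. Not any
# authority's actual schema.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class OverrideLogEntry:
    case_id: str
    model_recommendation: str  # e.g. "withhold refund"
    reviewer_id: str           # accountable human sign-off
    decision: str              # e.g. "release refund"
    rationale: str             # free-text reasons, supporting the duty to give reasons
    timestamp: datetime

entry = OverrideLogEntry(
    case_id="R-2025-0042",
    model_recommendation="withhold refund",
    reviewer_id="officer-117",
    decision="release refund",
    rationale="Turnover gap explained by documented intercompany transfer.",
    timestamp=datetime(2025, 8, 1, 10, 30),
)
print(entry)
```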
Privacy and data protection
GDPR and equivalent regimes require lawful basis, purpose limitation, and minimisation when processing alternative data such as social media signals or aerial images. Breaches not only risk regulatory penalties but also fuel arguments in litigation that data was used unfairly or improperly.
Case studies: UK, Netherlands and France
In the UK, Elsbury represents the first tribunal judgment directly addressing AI opacity in tax administration.
Thomas Elsbury made a Freedom of Information (FOI) request about HMRC’s use of LLMs in R&D tax compliance. HMRC first confirmed it held the information, then reversed its position, claiming risks to tax collection.
The Tribunal rejected this and ordered HMRC to clarify its use of AI. It found that AI-generated letters had already damaged trust, discouraging legitimate R&D relief claims. The judgment stressed that lack of transparency can undermine taxpayer confidence and policy effectiveness.
The case paves the way for future FOI and judicial review challenges involving AI.
In the Netherlands, in the SyRI case (2020), the Hague District Court struck down the government’s System Risk Indication welfare fraud detection tool for violating privacy and proportionality under Article 8 of the European Convention on Human Rights. The ‘black box’ nature of the algorithm lacked transparency, preventing individuals from understanding the data and functioning of the system used to assess their risk of fraud. This landmark case shows courts’ willingness to curb opaque AI-driven risk scoring.
In France in 2022, the Conseil d’État (France’s highest administrative court), in a 360-page report on AI in the public sector, recommended a doctrine of “trusted public AI”, with seven principles including transparency, accountability, human oversight, and auditability.
Cross-border tax disputes in the AI era
AI-driven adjustments, especially in transfer pricing, profit attribution, and permanent establishment determinations, are likely to trigger double taxation. The OECD framework emphasises MAP and APA as stabilisers. The UK’s 2024 Dispute Resolution Profile shows how mature competent authority processes operate. For companies, there is a clear imperative: align data and narratives across jurisdictions to minimise MAP exposure, and deploy APAs where recurrent risk profiles are flagged by their own AI-driven engines.
Taxing transparency: AI in Civil and Common Law systems
Civil law and common law traditions diverge in their handling of AI opacity. In France, the Conseil d’État has proposed guidelines requiring algorithmic decision-making by public authorities to comply with principles of transparency, intelligibility, and accessibility, going so far as to state that the administration can be held liable before an administrative court where its decisions are proven to have caused harm to citizens.
Common law systems such as the UK, by contrast, do not have formal rules that require public bodies to explain how their algorithms work. Instead, courts rely on established principles of fairness, proportionality, and reason-giving. The Elsbury case demonstrates how opacity itself may be challenged as undermining trust and procedural fairness. While UK courts are unlikely to demand disclosure of source code or proprietary algorithms, they may require explanations of decision-making processes, evidence of human oversight, or disclosure of validation metrics where necessary to ensure fairness.
This legal contrast matters for multinational taxpayers. Strategies that compel disclosure in civil law jurisdictions may not succeed in common law courts, where the focus is on procedure rather than substantive transparency. Understanding this comparative dimension is crucial for designing litigation strategies in cross-border disputes.
Practical steps for taxpayers
To navigate this environment, taxpayers should adopt a three-stage approach: pre-investigation, enquiry response, and litigation.
Pre-investigation
- Establish a single source of truth for data across entities and jurisdictions. Inconsistent or incomplete records can be misinterpreted by automated risk engines.
- Maintain clear documentation of business model, data lineage, and explanations of anomalies in case they are queried by HMRC or its automated tools.
- Embed AI-aware controls within tax governance frameworks, including monitoring of refunds, referrals, and reliance on official chatbots.
- Identify potential cross-border exposures early and consider whether APA or MAP might be appropriate if challenged.
When flagged or investigated
- Request intelligible reasons for selection beyond numeric scores, and challenge unreasonable information requests where possible.
- Seek disclosure of validation metrics, error rates, or policies where AI tools are suspected to have driven the selection.
- Probe human oversight, including evidence of review, overrides, and accountability structures.
- Initiate MAP promptly where adjustments may lead to double taxation (relevant where transfer pricing or cross-border adjustments exist).
In litigation
- Retain expert witnesses capable of interpreting AI models and explaining them to courts.
- Ensure evidence and document hygiene when using AI tools in the defence, documenting methodology and validation.
- Design remedies strategically, focusing on procedural flaws to preserve substantive leverage for settlement or MAP.
Conclusion
AI is compressing tax enforcement timelines and reshaping the litigation landscape. To remain resilient, taxpayers must stabilise their data narratives, embed AI-aware controls, and view procedural rights as strategic levers. In cross-border matters, MAP and APA will remain vital to defuse disputes arising from automated adjustments. The Elsbury case underscores that opacity is itself a risk: transparency is both a compliance necessity and a defensive strategy. The comparative analysis of civil and common law traditions further shows that litigation strategies must adapt to jurisdictional contexts. Ultimately, AI in tax administration is not merely a technological change; it is a transformation of the legal and procedural terrain of tax disputes.