Automating DAC8 Reporting: Build vs Buy for CASPs in 2026
Reviewed by Wag3s Editorial Team — verified against Council Directive (EU) 2023/2226 and European Commission DAC8 guidance · Last reviewed May 2026
DAC8 reporting is not a form you fill in once a year. It is a recurring data pipeline that has to capture, normalize, and aggregate a full year of activity per reportable user, then emit a schema-valid report. This article covers the mechanics of that pipeline: the three layers it decomposes into, what to automate at each, where human judgement has to stay, and how to make the build-vs-buy call. Adjacent articles drill into the pieces: the exact reportable fields, the XML schema and templates, and the penalty exposure that makes data integrity the priority.
The build-vs-buy picture in five points
- DAC8 is a pipeline, not a filing. Data collection had to be live from 1 January 2026.
- It decomposes into three layers: identity and tax residency (Layer 1), transaction aggregation (Layer 2), output formatting (Layer 3).
- Layer 2 and Layer 3 automate well. Layer 1 verification and the judgement calls inside Layer 2 (classification edge cases, discrepancy investigation) do not fully automate.
- The riskiest layer to get wrong is Layer 2 data integrity: well-formatted but incomplete reports.
- On build vs buy: buy the regulatory-commodity layers, own the part tied to your unique transaction data.
Why "automation" is the wrong frame on its own
The instinct is to ask "which tool generates the DAC8 XML?" That is the last 10% of the problem. The report-generation step is largely deterministic once the inputs are correct. The work — and the risk — is upstream: getting complete, correctly attributed, correctly aggregated data into the generator.
A pipeline that automates output but is fed incomplete transaction data produces a report that passes schema validation, files on time, and is wrong. That is a data-integrity failure, the highest-frequency DAC8 penalty exposure (see DAC8 penalties). Automation that accelerates a defective process only produces defects faster.
The three layers, and what automates
Layer 1 — Identity and tax residency
Collect and verify legal name, date of birth, address, jurisdiction(s) of tax residence, TIN per jurisdiction, and a verified self-certification (see DAC8 data collected).
- Automatable: capture flows, TIN format validation, re-certification reminders, registry lookups where APIs exist.
- Not fully automatable: resolving conflicting residency indicia, judging an ambiguous self-certification, handling refusals. These need a documented human decision.
Layer 1 is the hardest to fully automate because verification is a judgement, and a wrong reportable-user determination propagates into every subsequent layer.
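As an illustration of the automatable part of Layer 1, a TIN format check might look like the sketch below. The two patterns are simplified placeholders (length and digits only), not authoritative rules; real validation needs each jurisdiction's official specification, including check digits. The function name and status labels are assumptions made for this example. Anything the check cannot decide is routed to human review rather than silently accepted.

```python
import re
from typing import Optional

# Simplified, illustrative patterns only. Real TIN rules (check digits,
# entity-specific formats) must come from each jurisdiction's specification.
TIN_PATTERNS = {
    "DE": re.compile(r"\d{11}"),
    "FR": re.compile(r"\d{13}"),
}


def check_tin_format(jurisdiction: str, tin: Optional[str]) -> str:
    """Return 'ok', 'invalid_format', 'missing', or 'needs_review'."""
    if tin is None or not tin.strip():
        return "missing"
    pattern = TIN_PATTERNS.get(jurisdiction.upper())
    if pattern is None:
        # No rule on file for this jurisdiction: escalate to a human.
        return "needs_review"
    # Strip common separators before matching.
    cleaned = re.sub(r"[\s\-./]", "", tin)
    return "ok" if pattern.fullmatch(cleaned) else "invalid_format"
```

A format check only says the TIN is well-formed. Whether it belongs to the user, and whether the stated residency is plausible, remains a documented human decision.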
Layer 2 — Transaction aggregation
Normalize all activity across chains and venues, attribute it per user, and produce the annual aggregates (acquired/disposed vs fiat and vs crypto, transfers).
- Automatable: ingestion, normalization, per-user per-asset aggregation, counterparty categorization rules.
- Not fully automatable: classification edge cases (NFT scope — see DAC8 and NFTs; EMT vs CBDC vs tokenized security — see DAC8 and the digital euro) and discrepancy investigation.
This is the layer most tied to your specific data, and the riskiest to get wrong.
Layer 3 — Output formatting
Emit the report in the required XML schema for exchange via the EU Common Communication Network. The EU schema aligns with the OECD's CARF/CRS XML schemas published to support cross-authority transmission.
- Automatable: almost entirely, once Layer 2 outputs are correct and the target schema is fixed.
- Not fully automatable: per-Member-State variations where national transposition adds fields (see DAC8 transposition by country).
Layer 3 is the cheapest to automate and the least risky — precisely because it is downstream of the layers that carry the judgement.
Build vs buy, per layer
| Layer | Typical decision | Rationale |
|---|---|---|
| Layer 1 (identity/residency) | Buy | High regulatory specificity, commodity engineering, KYC vendors specialize here |
| Layer 2 (aggregation) | Own/build the normalization tied to your data; buy the aggregation engine | This is your unique data; the engine is reusable, the connectors are yours |
| Layer 3 (output) | Buy | Deterministic, schema-driven, low differentiation |
The recurring anti-pattern is trying to buy a single end-to-end tool. The three layers need different specialists; one product rarely does all three well. The other anti-pattern is building everything in-house — Layers 1 and 3 are commodity regulatory plumbing where vendors are ahead and staying ahead.
The defensible architecture for most CASPs: a KYC/self-certification vendor (Layer 1), a transaction-data platform that owns your chain/venue normalization (Layer 2), and a reporting-output tool (Layer 3), with documented human checkpoints at the judgement points.
Where vendors fit
- Sumsub — Layer 1: self-certification capture, TIN verification, re-certification flags.
- TaxBit — Layer 3 (and parts of Layer 2 output): generating DAC8/CARF-shaped reports.
- Cryptio — Layer 2: normalizing fragmented on-chain and exchange data with retained lineage.
The mapping is almost one tool per layer, which is why DAC8 automation is a stack-design problem, not a tool-selection problem.
Where Wag3s sits: the Layer 2 foundation
Layer 2 is the part of the pipeline tied to your unique transaction data, and the part this article flags as both the hardest to buy whole and the riskiest to get wrong. Wag3s Ledger is built for exactly that layer:
- Multi-chain normalization across 20+ chains and exchange APIs (see multi-chain reconciliation)
- Per-user, per-asset annual aggregation matching the DAC8 reportable categories
- Counterparty categorization and instrument-level tagging (EMT vs CBDC vs tokenized security)
- Retained per-transaction lineage so any aggregate is auditable and discrepancies are investigable
- Clean outputs that feed a Layer 3 reporting tool or in-house XML generation
It does not file the report or make the reportable-user determinations — those judgement points and the submission stay with your compliance team and counsel. Wag3s supplies the audited substance they file on. See the Wag3s Ledger product page for module details.
Step-by-step: building a DAC8 pipeline for a mid-size CASP
A CASP operating on 5 EVM chains with 80,000 reportable users, processing roughly 3 million transactions per year, needs a pipeline that is reliable enough to withstand audit but not over-engineered to the point of being undeliverable by the 30 September 2027 deadline for FY 2026 data. The following architecture is illustrative.
Layer 1 — Identity and tax residency (buy). Select a KYC and self-certification vendor (e.g. Sumsub or a peer) with a DAC8-ready self-certification module. The module should capture: legal name, date of birth, address, tax residence jurisdiction(s), and TIN per jurisdiction. It should send re-certification reminders at least 30 days before annual deadlines and flag users where TIN validation returns an error. Build internal procedures for the human review of flagged cases — the vendor handles the infrastructure; the compliance team handles the edge cases.
Layer 2 — Transaction aggregation (own the normalization; buy the engine where possible). This is the hardest layer to buy entirely, because the normalization is specific to the CASP's data. For each supported chain and exchange venue, the CASP's engineering team must produce a normalization adapter that converts raw chain data (or exchange API data) into a canonical transaction record with: user identifier, transaction type (acquired vs disposed vs transferred), asset type (crypto-asset, EMT, tokenized security), quantity, fair value in EUR at the transaction date, and counterparty type (CASP, non-CASP, unhosted wallet).
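The canonical record described above could be sketched as follows. The field names, the enum labels, and the shape of the raw exchange payload (`account_id`, `side`, `symbol`, `qty`, `price_eur`, `ts`, `fill_id`) are all assumptions for illustration; a real adapter maps whatever the venue actually returns. The sketch also assumes the EUR price at execution time is already available to the adapter.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal
from enum import Enum


class TxType(Enum):
    ACQUIRED = "acquired"
    DISPOSED = "disposed"
    TRANSFERRED = "transferred"


class AssetType(Enum):
    CRYPTO_ASSET = "crypto_asset"
    EMT = "emt"
    TOKENIZED_SECURITY = "tokenized_security"


class CounterpartyType(Enum):
    CASP = "casp"
    NON_CASP = "non_casp"
    UNHOSTED_WALLET = "unhosted_wallet"


@dataclass(frozen=True)
class CanonicalTx:
    user_id: str
    tx_type: TxType
    asset: str
    asset_type: AssetType
    quantity: Decimal
    value_eur: Decimal          # fair value in EUR at the transaction date
    counterparty: CounterpartyType
    executed_at: datetime
    source: str                 # lineage: which venue or chain
    source_ref: str             # lineage: the raw record identifier


def normalize_exchange_fill(raw: dict, asset_types: dict) -> CanonicalTx:
    """Adapter for a hypothetical exchange fill payload."""
    sides = {"buy": TxType.ACQUIRED, "sell": TxType.DISPOSED}
    if raw["side"] not in sides:
        raise ValueError(f"unknown side: {raw['side']!r}")
    quantity = Decimal(raw["qty"])
    return CanonicalTx(
        user_id=raw["account_id"],
        tx_type=sides[raw["side"]],
        asset=raw["symbol"],
        # KeyError on an unclassified asset is deliberate: fail fast.
        asset_type=asset_types[raw["symbol"]],
        quantity=quantity,
        value_eur=quantity * Decimal(raw["price_eur"]),
        # Assumption: fills on this venue are matched by the venue itself.
        counterparty=CounterpartyType.CASP,
        executed_at=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
        source="example-exchange",
        source_ref=raw["fill_id"],
    )
```

Keeping `source` and `source_ref` on every record is what lets an aggregate be traced back to raw data when a discrepancy has to be investigated.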
Once normalized, the aggregation step (summing per user, per asset, per year, per reportable category) is deterministic and can use an off-the-shelf aggregation engine or a SQL-based data warehouse. Maintain each DAC8 reporting category separately in the aggregation model:
- (a) aggregate consideration and units for transactions where the reportable user acquired or disposed of crypto-assets against fiat;
- (b) aggregate consideration and units for crypto-to-crypto exchanges;
- (c) aggregate transfers to unhosted wallets (unit count and value);
- (d) retail payment transactions.
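A minimal sketch of that aggregation step, assuming normalized records arrive as dictionaries with the keys shown. The category labels (`fiat`, `crypto_to_crypto`, `transfer_unhosted`, `retail_payment`) are shorthand invented for this example, not DAC8 schema names. `Decimal` is used so that sums stay exact.

```python
from collections import defaultdict
from decimal import Decimal

# Shorthand labels for the reporting categories; names are illustrative.
CATEGORIES = {"fiat", "crypto_to_crypto", "transfer_unhosted", "retail_payment"}


def aggregate(records):
    """Sum units and EUR value per (user, asset, year, category)."""
    totals = defaultdict(
        lambda: {"units": Decimal("0"), "value_eur": Decimal("0"), "count": 0}
    )
    for r in records:
        if r["category"] not in CATEGORIES:
            # An unknown category must stop the run, not vanish from totals.
            raise ValueError(f"unknown category: {r['category']!r}")
        key = (r["user_id"], r["asset"], r["year"], r["category"])
        bucket = totals[key]
        bucket["units"] += Decimal(r["units"])
        bucket["value_eur"] += Decimal(r["value_eur"])
        bucket["count"] += 1
    return dict(totals)
```

Because the category is part of the key, the categories never mix, and the `count` field gives reviewers a quick sanity check against the raw transaction count.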
Layer 3 — Output formatting (buy). The DAC8 XML schema, used for exchange over the EU Common Communication Network, closely mirrors the OECD CARF XML schema. A reporting-output tool should accept the Layer 2 aggregates, map them to the correct DAC8 XML elements, and validate the output against the schema. This step is almost entirely deterministic once the inputs are correct. Most schema errors at this stage are caused by missing or malformed Layer 2 data (a null TIN, an unsupported asset type), not by the output tool itself.
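Since most Layer 3 failures trace back to bad inputs, a pre-flight check before XML generation is cheap insurance. The sketch below assumes each row is a dictionary with `tin` and `asset_type` keys; those names and the supported-type set are illustrative, and a real check would cover every field the target schema makes mandatory.

```python
# Illustrative set; align it with the asset types your output tool accepts.
SUPPORTED_ASSET_TYPES = {"crypto_asset", "emt", "tokenized_security"}


def preflight(rows):
    """Return a list of (row_index, problem) pairs; empty means clean."""
    errors = []
    for i, row in enumerate(rows):
        if not row.get("tin"):
            errors.append((i, "missing TIN"))
        if row.get("asset_type") not in SUPPORTED_ASSET_TYPES:
            errors.append((i, f"unsupported asset type: {row.get('asset_type')!r}"))
    return errors
```

Running this before the output tool turns an opaque schema-validation failure into a list of specific rows to fix upstream.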
Human checkpoints. Three mandatory human review points:
- After Layer 1: review the list of users classified as reportable vs non-reportable, and spot-check 50–100 cases for plausibility.
- After Layer 2 aggregation but before output: run a statistical review of the aggregates. Check for users with implausibly high or low aggregate values, zero-value aggregates for active users, or categories that appear empty when activity is known to exist.
- After Layer 3 output but before filing: confirm the filing covers the correct reporting period, the correct Member State, and that the schema validation passes without errors.
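The statistical review after Layer 2 can be seeded by a simple script that produces a worklist for the reviewer. In the sketch below, the input shapes and the 100x-median threshold are assumptions chosen for illustration; the right threshold depends on the user base. The script only flags cases, and a person decides what each flag means.

```python
from decimal import Decimal
from statistics import median


def review_flags(aggregates, activity_counts, ratio=Decimal("100")):
    """Flag users for human review.

    aggregates: {user_id: total EUR value as Decimal}
    activity_counts: {user_id: number of raw transactions in the year}
    """
    flags = []
    nonzero = [v for v in aggregates.values() if v > 0]
    threshold = median(nonzero) * ratio if nonzero else None

    # Active in the raw data but zero or absent in the aggregates.
    for user, n_tx in activity_counts.items():
        if n_tx > 0 and aggregates.get(user, Decimal("0")) == 0:
            flags.append((user, "active user with zero aggregate"))

    # Implausibly high compared with the rest of the population.
    for user, total in aggregates.items():
        if threshold is not None and total > threshold:
            flags.append((user, "aggregate far above median"))

    return sorted(flags)
```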
Common automation failures and how to prevent them
Missing exchange or chain. A new exchange was integrated mid-year but its API was not added to the normalization pipeline until Q3. Transactions from H1 are missing from that exchange's data. Prevention: maintain a complete inventory of data sources; require that every new integration is added to the DAC8 pipeline on the same day it goes live for users.
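One way to catch this gap mechanically is to compare the data-source inventory against what the pipeline has actually ingested. The sketch assumes an inventory of go-live dates and a map of the earliest ingested transaction date per source; the names and the seven-day tolerance are illustrative.

```python
from datetime import date


def coverage_gaps(inventory, first_ingested, year=2026, tolerance_days=7):
    """Return {source: problem} for sources with missing early data.

    inventory: {source: date the source went live for users}
    first_ingested: {source: earliest transaction date in the pipeline}
    """
    period_start = date(year, 1, 1)
    period_end = date(year, 12, 31)
    gaps = {}
    for source, go_live in inventory.items():
        if go_live > period_end:
            continue  # not live during this reporting year
        expected_from = max(go_live, period_start)
        seen = first_ingested.get(source)
        if seen is None:
            gaps[source] = "no data ingested"
        elif (seen - expected_from).days > tolerance_days:
            gaps[source] = f"data starts {seen}, expected from {expected_from}"
    return gaps
```

A quiet venue can legitimately have no trades for a week, so a flagged gap is a prompt to investigate, not proof of missing data.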
Wrong asset classification. A stablecoin pegged to EUR is classified as a crypto-asset rather than an EMT in the normalization layer. EMTs (electronic money tokens) are reported differently under DAC8. Prevention: maintain an asset classification register tied to the normalization layer; review it when new assets are listed.
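An asset classification register can be as simple as a lookup that fails closed. In this sketch the tickers `EURX` and `BONDX` are hypothetical, and the classification labels are placeholders; the point is that an unlisted asset raises an error instead of defaulting to a generic crypto-asset.

```python
# Hypothetical register; in practice this is a reviewed, versioned table.
ASSET_REGISTER = {
    "BTC": "crypto_asset",
    "EURX": "emt",                  # hypothetical EUR-pegged e-money token
    "BONDX": "tokenized_security",  # hypothetical tokenized bond
}


def classify(symbol: str) -> str:
    """Look up an asset's classification; never guess for unknown assets."""
    try:
        return ASSET_REGISTER[symbol.upper()]
    except KeyError:
        raise LookupError(f"{symbol!r} is not in the asset register") from None


def unclassified(listed_assets):
    """Return listed assets that still need a classification decision."""
    return sorted({s.upper() for s in listed_assets} - set(ASSET_REGISTER))
```

Running `unclassified` against the listing catalogue whenever a new asset goes live gives the compliance team a concrete review queue.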
User merge failure. A user changes their email address and the system creates a duplicate account. Transactions are split across two user records; the DAC8 aggregate shows two low-activity users instead of one high-activity one. Prevention: user identity resolution (linking all accounts for the same individual) must be done before aggregation, not after.
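Identity resolution before aggregation can be sketched with a union-find structure: every pair of accounts known to belong to the same person is merged, and each account then maps to one canonical identifier. How the links are established (matching verified identity documents, for example) is outside this sketch, and the account identifiers are hypothetical.

```python
def resolve_identities(accounts, links):
    """Map each account to a canonical id shared by all linked accounts.

    accounts: iterable of account ids
    links: iterable of (account_a, account_b) pairs for the same person
    """
    parent = {a: a for a in accounts}

    def find(a):
        # Walk to the root, halving the path as we go.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a, b in links:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller id as root so the result is deterministic.
            low, high = sorted((root_a, root_b))
            parent[high] = low

    return {a: find(a) for a in parent}
```

Aggregating on the canonical id instead of the raw account id yields one high-activity user where the raw data would show two low-activity ones.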
Self-certification gap. Users who registered before the DAC8 self-certification requirement was implemented never completed a tax-residency self-certification. They are in the trading population but absent from Layer 1. Prevention: backfill self-certification for all pre-DAC8 users; send reminders, then restrict trading for users who do not complete it within a grace period, consistent with the CASP's terms of service and legal advice.
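Finding the gap is a set difference between the trading population and the certified population. The sketch below also tracks a grace period; the 30-day default and the action labels are assumptions for illustration. The final step is an escalation, not an automatic block, because restricting an account is a decision for compliance and counsel.

```python
from datetime import date, timedelta


def certification_actions(active_users, certified_users, reminded_on,
                          today, grace_days=30):
    """Return {user_id: action} for active users lacking a self-certification.

    reminded_on: {user_id: date the first reminder was sent}
    """
    actions = {}
    for user in sorted(set(active_users) - set(certified_users)):
        first_reminder = reminded_on.get(user)
        if first_reminder is None:
            actions[user] = "send_reminder"
        elif today - first_reminder > timedelta(days=grace_days):
            # Grace period elapsed: hand over for a human decision.
            actions[user] = "escalate_for_restriction"
        else:
            actions[user] = "await_response"
    return actions
```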
Further reading
- DAC8 Compliance Guide 2026
- DAC8 Data Collected
- DAC8 Reporting Templates & XML Format
- DAC8 CASP Penalties
- DAC8 and NFTs
- DAC8 Transposition by Country
- Multi-Chain Reconciliation
Sources
- Council Directive (EU) 2023/2226 (DAC8) — EUR-Lex
- European Commission — DAC8 overview
- OECD Crypto-Asset Reporting Framework — model rules, XML schema and guidance