Job Title: Data Architect
Job Location: London, UK
Job Type: Permanent / Contract
Note on Snowflake
Candidates must have significant, hands-on Snowflake architecture experience, including schema design, ingestion patterns, security configuration, and cost governance. Experience with other cloud warehouses (BigQuery, Redshift, Databricks) will be considered where the candidate can demonstrate the ability to reach Snowflake depth quickly.
Data Architect
CAGM Engineering | Project Compass
Accounts (RTL) | Payments Engine | FX | Enterprise Data Platform
About the Role
We are seeking a senior Data Architect to join the CAGM Global Markets engineering organisation as part of Project Compass. This programme is delivering next-generation capabilities across Accounts (Real-Time Ledger), Payments Engine, and Foreign Exchange, all of which generate, consume, and depend on high-quality, well-governed data at scale.
The Data Architect will own the end-to-end data architecture across CAGM, spanning Snowflake as the enterprise data warehouse and a landscape of in-house application databases (relational, time-series, document, and in-memory stores) that serve real-time operational workloads. You will define how data flows from source systems into the warehouse, how application databases are modelled and managed, and how data products are exposed to downstream consumers within and beyond the organisation.
This is a hands-on, delivery-focused role. You will work closely with Integration Architects, platform engineers, and domain product teams to translate business data requirements into durable, governed, and scalable data solutions.
Key Responsibilities
Data Architecture & Strategy
- Define and own the CAGM data architecture target state, covering the Snowflake enterprise data warehouse, application databases, and the data flows that connect them
- Establish a unified data modelling standard across relational (PostgreSQL, Oracle), in-memory (Redis), time-series (TimescaleDB / InfluxDB), and document (MongoDB) stores used by CAGM applications
- Design the data ingestion and movement architecture: real-time CDC pipelines, batch ETL/ELT patterns, and event-driven feeds from the NATS messaging layer into Snowflake
- Define data domain boundaries, ownership, and lineage standards aligned with Project Compass product domains (RTL, Payments, FX)
- Produce and maintain authoritative data architecture artefacts: entity-relationship models, data flow diagrams, data dictionaries, and Architecture Decision Records (ADRs)
Snowflake & Data Warehouse
- Lead the design and evolution of the Snowflake data warehouse, including schema design (Raw / Conformed / Consumption layers), virtual warehouse sizing, and cost governance
- Define standards for data loading (Snowpipe, Streams & Tasks, external stages), transformation (dbt patterns), and data sharing across business units
- Establish Snowflake data access controls, row-level security, dynamic data masking, and PII governance in line with regulatory requirements (GDPR, BCBS 239)
- Champion Snowflake best practices for performance tuning, clustering keys, materialised views, and query optimisation
- Evaluate Snowflake-native capabilities (Snowpark, Cortex AI, Dynamic Tables) and recommend adoption where they accelerate CAGM data product delivery
Application Database Architecture
- Govern the application database landscape across CAGM, reviewing schema designs, indexing strategies, and data lifecycle management for all in-house databases
- Define patterns for operational data stores (ODS) that bridge real-time application databases and the analytical warehouse layer
- Ensure consistency between transactional data models and their warehouse representations, minimising transformation complexity and maximising fidelity
- Set standards for database change management, migration tooling (Liquibase / Flyway), and schema versioning across the CAGM application estate
- Identify and remediate data quality issues at source, defining data contracts between application teams and downstream consumers
Data Governance & Quality
- Define and implement data governance frameworks covering data ownership, stewardship, classification (PII, sensitive, public), and retention policies
- Establish data lineage and cataloguing standards, working with tooling such as Apache Atlas, Collibra, or Snowflake Horizon Catalog
- Design and enforce data quality rules and SLAs at ingestion, transformation, and consumption layers
- Collaborate with the Risk and Compliance function to ensure CAGM data architecture meets BCBS 239 Risk Data Aggregation and Reporting requirements
- Champion Master Data Management (MDM) principles for shared reference data (counterparty, instrument, currency) across CAGM domains
AI, Analytics & Data Products
- Define the architecture for CAGM data products: curated, well-documented datasets served to analytics, reporting, and AI/ML consumers
- Design feature stores and data pipelines that support AI/ML model training and inference for use cases such as FX pricing, payment anomaly detection, and limit utilisation forecasting
- Evaluate and integrate AI-assisted data tooling (AI-powered cataloguing, natural language querying, automated data quality) where it accelerates productivity
- Partner with the Analytics Engineering team to establish dbt modelling standards, testing frameworks, and documentation practices
Collaboration & Leadership
- Work hands-on across multiple CAGM product teams as a data authority, balancing strategic design with direct delivery contribution
- Guide and mentor application engineers on data modelling, query optimisation, and data quality best practices
- Engage senior stakeholders across Technology, Finance, Risk, and Operations to communicate data strategy, risks, and trade-offs
- Facilitate data architecture working groups with platform, BI, and enterprise architecture teams to align on shared standards
Core Technical Skills
- Data Warehouse: Snowflake (schema design, Snowpipe, Streams & Tasks, Snowpark, dynamic data masking, cost governance)
- Transformation: dbt (data build tool), covering modelling layers, testing, documentation, and incremental strategies
- Application Databases: PostgreSQL, Oracle, Redis, MongoDB, TimescaleDB / InfluxDB (schema design, indexing, replication)
- Data Integration: CDC (Debezium / Kafka Connect), ETL/ELT pipelines, NATS event feeds, AWS Glue, Apache Spark
- Cloud Platform: AWS S3, RDS, Aurora, Redshift (migration context), Glue, Lake Formation, IAM, VPC
- Data Governance: Data lineage, cataloguing (Apache Atlas / Collibra / Snowflake Horizon), GDPR, BCBS 239, MDM
- Architecture Practice: ERDs, data flow diagrams, data contracts, ADRs, C4 modelling, domain-driven data design
- AI / ML Data: Feature stores, ML pipeline data design, Snowflake Cortex AI, vector stores, LLM data patterns
- Query & Performance: SQL optimisation, clustering keys, partitioning, query profiling, cost-based tuning
CAGM Data Landscape
The Data Architect will work across the following technology landscape. Candidates should have direct experience with the majority of these platforms and the ability to define coherent architecture across heterogeneous stores:
| Platform / Store | Primary Use in CAGM | Key Architecture Concerns |
| --- | --- | --- |
| Snowflake | Enterprise data warehouse, analytics, reporting, data sharing | Layer design, ingestion patterns, security, cost governance |
| PostgreSQL | Transactional data: ledger entries, client records, audit | Schema design, CDC, replication lag, index strategy |
| Oracle DB | Legacy core banking integration, reference data | Migration strategy, data contracts, schema versioning |
| Redis | Real-time caches: FX rates, limit state, session data | Cache invalidation, persistence strategy, data consistency |
| MongoDB | Document stores: client profiles, trade enrichment data | Schema evolution, aggregation pipelines, CDC integration |
| TimescaleDB | Time-series market data: ticks, position history | Hypertable design, retention policies, compression |
| NATS JetStream | Event streaming: payments, ledger events, FX confirmations | Event schema contracts, consumer group design, replay strategy |
| AWS S3 / Glue | Data lake staging, archival, batch ingestion into Snowflake | Partitioning, file format (Parquet/ORC), Lake Formation governance |
Finance Domain Knowledge
Candidates should have hands-on data architecture experience in one or more of the following financial services domains:
| Domain | Key Data Concepts |
| --- | --- |
| Real-Time Ledger | Double-entry accounting data models, event-sourced ledgers, real-time balance aggregation, reconciliation datasets |
| Payments Engine | Payment message data (ISO 20022 / SWIFT), settlement instructions, payment status lifecycle, fee and charge data |
| Foreign Exchange | Trade data models, rate feeds and time-series storage, position keeping, P&L attribution data |
| Limit Management | Exposure data models, limit hierarchy, breach event data, real-time risk aggregation feeds |
| Client Onboarding | Client master data, KYC / AML data structures, account hierarchy, regulatory reporting feeds |
| Regulatory Reporting | BCBS 239 data lineage, EMIR / MiFID trade reporting data, data quality SLAs for regulatory submissions |
Experience & Profile
- 15+ years of progressive technology experience, with at least 5 years in senior data architecture roles
- Deep, hands-on experience with Snowflake as an enterprise data warehouse, ideally holding the SnowPro Core or SnowPro Advanced: Architect certification
- Proven track record of designing data architectures across heterogeneous application database landscapes in large financial institutions or fintech organisations
- Demonstrated experience implementing data governance frameworks, lineage tooling, and data quality programmes at programme scale
- Comfortable working hands-on (writing dbt models, reviewing SQL, profiling queries) while operating at senior stakeholder and architecture level
- Experience with CDC-based real-time data pipelines and event-driven data integration patterns
- Strong communicator, able to convey complex data architecture decisions to both engineering teams and business stakeholders
- Familiarity with AI/ML data architecture patterns (feature stores, vector databases, LLM data pipelines) is a strong advantage
- AWS Solutions Architect or AWS Data Analytics certification is advantageous