A Governed Data Platform Built on Red Hat

A focused architecture for teams who need controlled data movement, reliable event contracts, and near-real-time integration around the systems that already run the business.

This page goes deeper than the broader DevOps overview and zooms in on the data layer itself. Debezium turns committed database changes into reliable event streams, Apache NiFi orchestrates how those events move and transform, Apache Kafka (or Red Hat AMQ Streams) carries them at scale, and Apache Avro keeps the contracts stable. All of it runs comfortably on Red Hat OpenShift, alongside whatever core databases your organization already depends on — Oracle, SQL Server, IBM DB2, PostgreSQL, MySQL, or others.

Data Platform on Red Hat

From a Database Change to a Reliable Stream — Running on OpenShift

Debezium turns changes in Oracle, SQL Server, IBM DB2, and other databases into events. Apache NiFi shapes the path, Apache Kafka carries them at scale, and Apache Avro keeps the schemas stable — all running on Red Hat OpenShift.

Red Hat OpenShift · Debezium CDC · Apache NiFi · Apache Kafka · Apache Avro · Oracle · SQL Server · DB2
[Illustration: Apache NiFi data flow]

Data Flow with NiFi and Kafka

Apache NiFi handles routing, enrichment, and transformation visually. Apache Kafka delivers events durably, in order, and at scale. Operations teams can see every step and diagnose problems instead of guessing.

Oracle · SQL Server · DB2 — Core Preserved

Modernization Around the Core

Whatever runs at the core — Oracle, SQL Server, IBM DB2, or another engine — the goal stays the same: protect the transactional core and modernize the data-distribution, API, and service layers around it, on Red Hat OpenShift.

Debezium

Change data capture

Capture committed database changes from Oracle, SQL Server, IBM DB2, PostgreSQL, MySQL, and MongoDB — and turn them into a governed event stream.

NiFi

Flow orchestration

Route, enrich, transform, buffer, and control enterprise integration flows with a visual interface and full operational visibility.

Kafka

Event backbone

Apache Kafka — or Red Hat AMQ Streams — provides durable, ordered, high-throughput delivery between producers and consumers.
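
As a concrete sketch of what "durable, ordered" means in practice: Kafka guarantees ordering per partition, and a consistent message key keeps related events together. The short Python example below uses the kafka-python client; the broker address, topic name, and keys are illustrative placeholders.

    import json
    from kafka import KafkaProducer

    # Broker address, topic, and keys below are placeholders.
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        key_serializer=lambda k: k.encode("utf-8"),
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    # Messages sharing a key land in the same partition, so Kafka
    # preserves their order: every event for account-42 reaches
    # consumers in the sequence it was produced.
    producer.send("account-events", key="account-42", value={"type": "credit", "amount": 100})
    producer.send("account-events", key="account-42", value={"type": "debit", "amount": 30})
    producer.flush()  # block until the broker acknowledges the sends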

Avro

Schema discipline

Keep producer and consumer contracts stable as pipelines, services, and downstream systems evolve over time.

The Red Hat Ecosystem Behind This Platform

Red Hat OpenShift provides the runtime. Debezium, Apache NiFi, and Apache Kafka (or Red Hat AMQ Streams) form the capture, flow, and event-transport layer, while Apache Avro keeps the contracts stable. It's a mature ecosystem, well proven in regulated environments.

Why the Data Platform Deserves Its Own Architecture

In most enterprise modernization programs, the application runtime isn't the hardest problem. The hardest problem is the data path between old and new systems. New digital channels need fresher data. Downstream teams need cleaner contracts. Integration teams need flows that are observable, supportable, and not held together by fragile custom jobs.

Our answer is to separate concerns cleanly. The core transactional systems — whether they run on Oracle, SQL Server, IBM DB2, or another engine — stay authoritative. Debezium exposes their committed changes as events. Apache NiFi orchestrates how those events move, transform, and reach their destinations. Apache Kafka (or Red Hat AMQ Streams) acts as the event backbone. Apache Avro keeps every event's structure under control.

The whole platform runs on Red Hat OpenShift, which means it inherits the same security model, deployment discipline, and operational tooling as the rest of the delivery stack. New services, integrations, and data products can evolve quickly while the core systems keep doing what they're trusted to do.

[Diagram: official Debezium CDC architecture, showing connectors streaming changes from databases through Kafka Connect to Apache Kafka]

Official Debezium Architecture

Debezium — Change Data Capture

Modernize without rewriting the source

Debezium captures committed changes from databases like Oracle, SQL Server, IBM DB2, PostgreSQL, and MySQL — without modifying the source application. That matters when the source system is critical, sensitive, or owned by a team that can't take on direct changes.
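
In practice, "without modifying the source application" means the work happens in connector configuration rather than in code. The sketch below registers a hypothetical Debezium SQL Server connector through the Kafka Connect REST API; every hostname, credential, and table name is a placeholder, and exact property names should be checked against your Debezium version.

    import requests

    connector = {
        "name": "inventory-sqlserver-cdc",
        "config": {
            "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
            "database.hostname": "sqlserver.example.internal",  # placeholder host
            "database.port": "1433",
            "database.user": "cdc_reader",                      # read-only CDC account
            "database.password": "********",
            "database.names": "inventory",
            "topic.prefix": "erp",            # topic names derive from this prefix
            "table.include.list": "dbo.orders,dbo.customers",
            "schema.history.internal.kafka.bootstrap.servers": "kafka:9092",
            "schema.history.internal.kafka.topic": "schema-history.erp",
        },
    }

    # Kafka Connect exposes a REST endpoint for connector management.
    resp = requests.post("http://connect.example.internal:8083/connectors", json=connector)
    resp.raise_for_status()  # 201 Created means the connector is registered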

Real-time visibility into transactional events

Every committed change becomes an event that downstream consumers can use for synchronization, analytics, cache invalidation, or workflow triggers. Integration becomes near-real-time instead of overnight, while the database remains the system of record.
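
A minimal sketch of a downstream consumer, assuming Debezium's JSON converter with schemas enabled so each event carries a payload envelope with before, after, and op fields. The topic name, broker address, and cache helpers are illustrative placeholders.

    import json
    from kafka import KafkaConsumer

    def refresh_cache(row):
        print("refresh:", row)   # placeholder for a real cache update

    def evict_cache(row):
        print("evict:", row)     # placeholder for a real cache eviction

    consumer = KafkaConsumer(
        "erp.inventory.dbo.orders",          # illustrative Debezium change topic
        bootstrap_servers="localhost:9092",
        group_id="order-cache-invalidator",
        value_deserializer=lambda v: json.loads(v.decode("utf-8")) if v else None,
    )

    for message in consumer:
        if message.value is None:            # tombstone after a delete; skip it
            continue
        payload = message.value["payload"]
        op = payload["op"]                   # c = create, u = update, d = delete, r = snapshot
        if op == "d":
            evict_cache(payload["before"])   # last known state of the deleted row
        else:
            refresh_cache(payload["after"])  # current state after the change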

Decouple the consumers from the source

Instead of every downstream system reading directly from production tables, consumers subscribe to a governed event stream. Coupling drops, interfaces become easier to reason about, and changes on either side become much safer.
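
The decoupling is easy to see in code: each consumer group keeps its own offsets on the same topic, so teams read the stream independently, at their own pace, without ever touching production tables. Group ids and the topic name below are illustrative.

    from kafka import KafkaConsumer

    analytics = KafkaConsumer("erp.inventory.dbo.orders",
                              bootstrap_servers="localhost:9092",
                              group_id="analytics-loader")
    alerts = KafkaConsumer("erp.inventory.dbo.orders",
                           bootstrap_servers="localhost:9092",
                           group_id="operational-alerts")

    # poll() hands each group whatever it has not yet consumed; neither
    # group's progress affects the other, and neither touches the source.
    print(analytics.poll(timeout_ms=1000))
    print(alerts.poll(timeout_ms=1000))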

Apache NiFi — Data Flow Orchestration

Visual control over every integration path

NiFi gives you a clear, graphical view of how data moves: routing, transformation, retries, throttling, prioritization, and delivery. It is especially useful when several downstream systems need different treatments of the same upstream events.

Built for operational support

Enterprise data flows need more than execution — they need to be supportable. NiFi shows where data is, why a transfer failed, what's queued, and which path each payload took. Operations teams can actually answer questions instead of guessing.
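
That supportability is scriptable as well as visual. As a rough sketch, NiFi's REST API exposes aggregate flow status; the instance address below is a placeholder, the call assumes an unsecured instance for brevity, and the exact response fields should be verified against your NiFi version.

    import requests

    resp = requests.get("http://nifi.example.internal:8080/nifi-api/flow/status")
    resp.raise_for_status()
    status = resp.json()["controllerStatus"]

    # Aggregate queue depth answers "where is the data right now?"
    print("flowfiles queued:", status["flowFilesQueued"])
    print("bytes queued:", status["bytesQueued"])
    print("active threads:", status["activeThreadCount"])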

Enrichment and transformation without glue code

As data flows out of core systems, NiFi can enrich, filter, redact, reshape, and route records toward APIs, analytics platforms, and internal services — replacing piles of bespoke integration scripts with a single, observable platform.

Apache Avro — Schema and Contracts

Schema as a delivery control

Avro gives teams a practical way to define and evolve schemas. In event-driven systems, that's what stops consumers from breaking silently when a producer changes a field or shifts the meaning of a payload.
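
A small sketch of what safe evolution looks like, using the fastavro library with illustrative schemas: a consumer on the new contract can still read records written under the old one, because the added field carries a default.

    import io
    from fastavro import parse_schema, schemaless_writer, schemaless_reader

    # The contract as the producer knew it.
    writer_schema = parse_schema({
        "type": "record", "name": "OrderEvent",
        "fields": [
            {"name": "order_id", "type": "string"},
            {"name": "amount", "type": "double"},
        ],
    })

    # The evolved contract: adding a field with a default is a
    # backward-compatible change, so old events remain readable.
    reader_schema = parse_schema({
        "type": "record", "name": "OrderEvent",
        "fields": [
            {"name": "order_id", "type": "string"},
            {"name": "amount", "type": "double"},
            {"name": "currency", "type": "string", "default": "USD"},
        ],
    })

    buf = io.BytesIO()
    schemaless_writer(buf, writer_schema, {"order_id": "A-1001", "amount": 250.0})
    buf.seek(0)

    # Old bytes, new contract: Avro schema resolution fills in the default.
    record = schemaless_reader(buf, writer_schema, reader_schema)
    print(record)  # {'order_id': 'A-1001', 'amount': 250.0, 'currency': 'USD'}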

Stable contracts across teams

When several teams depend on the same family of events, schema discipline becomes a governance need, not a preference. Avro makes it explicit which fields can change and which must stay stable.

Easier scaling of consumers

With contracts that hold, adding new reporting tools, analytics jobs, operational services, or machine-learning consumers becomes a much smaller exercise — no need to renegotiate the meaning of each payload every time.

Modernizing Around Your Core Transactional Systems

In most enterprises, the core transactional systems — ledgers, customer cores, settlement, billing — were built years ago on databases like Oracle, IBM DB2, or SQL Server. They run the business, they're trusted, and rewriting them quickly is rarely a sensible option. The risk is too high and the value too embedded.

The better pattern is to leave those cores doing what they do best, and modernize the layer around them. Debezium externalizes their committed changes as events. Apache NiFi shapes and routes the resulting flows. Apache Avro keeps the contracts disciplined. Apache Kafka (or Red Hat AMQ Streams) carries the events at scale. Services running on Red Hat OpenShift consume those streams or expose new APIs on top of them.

The result is a modernization path that's incremental and reversible: the perimeter — digital channels, data products, integration services, operational automation — moves forward, while the systems of record stay protected. Speed goes up, risk goes down, and the organization gets new capabilities without betting everything on a single replacement program.

Where Teams Use This

  • Expose changes from Oracle, SQL Server, or IBM DB2 to digital channels without coupling them directly to production tables.
  • Replace fragile nightly or interval batch jobs with CDC-driven, near-real-time integration flows.
  • Build a governed event-contract layer so multiple consumers can safely use the same business events.
  • Use NiFi to separate routing, transformation, and policy logic out of application code and into an observable platform.
  • Carry events at scale on Apache Kafka or Red Hat AMQ Streams running natively on OpenShift.
  • Modernize the perimeter — APIs, data products, analytics, automation — ahead of any decision to replace the core itself.

[Illustration: Apache NiFi provenance and data lineage]

Reference Data-Platform Architecture

Operational Source Layer

Existing transactional databases — Oracle, SQL Server, IBM DB2, PostgreSQL, or MySQL — remain authoritative for core business state and committed events.

CDC Extraction Layer

Debezium captures committed changes and publishes them as a clean stream of events, ready for controlled downstream consumption.

Event Backbone

Apache Kafka — or Red Hat AMQ Streams on OpenShift — provides durable, ordered, partitioned delivery of events across the platform at enterprise scale.

Flow and Policy Layer

Apache NiFi applies routing logic, enrichment, filtering, buffering, error handling, and consumer-specific data-movement policies in a visual, supportable way.

Contract Layer

Apache Avro provides a schema-governed event model so producers and consumers can evolve independently without uncontrolled breakage.

Consumption Layer

Services running on Red Hat OpenShift, reporting tools, analytics platforms, and operational APIs consume the governed event and data flows.

Printable Summary

If you need a downloadable summary alongside this page, you can use the current platform brief while the dedicated Data Platform PDF is being prepared.