Architecture
This chapter explains FLUX architecture in depth: runtime boundaries, module responsibilities, request paths, data ownership, concurrency model, and durability semantics.
FLUX is currently a single-broker system, but its internal decomposition is intentionally designed to evolve toward distributed control/data plane separation.
1) Architectural Overview
At runtime, FLUX is one broker process that opens a TCP port and accepts plain text requests one line at a time.
Internally it is split into these domains:
- Protocol domain: parse/format wire envelopes (
V1|<correlation_id>|<command>|<args>) - Network domain: per-connection request loop and command dispatch
- Storage domain: segmented append-only logs + indexes + retention + recovery
- Replication domain: ISR/high watermark state and ack semantics
- Coordination domain: consumer-group lifecycle and offset commit validation
- Composition domain: dependency wiring (
broker.Broker)
Even in single-node mode, this separation makes behavior explicit and testable.
2) Process and Runtime Model
Startup sequence
- load and validate config (
internal/config) - initialize storage (including on-disk replay)
- initialize coordinator (offset + group metadata load)
- initialize replication manager
- bind TCP listener
- accept connections and spawn per-connection goroutines
Connection model
- each accepted socket is handled by
network.HandleConnection - requests are line-delimited
- each line is parsed independently
- each response is one line in protocol envelope format
This is a simple request/response model rather than a streaming multiplex protocol.
3) Module-by-Module Deep Dive
cmd/broker
Responsibility:
- runtime entrypoint
- config loading
- dependency construction
- listener lifecycle
Important property:
- keeps business logic out of
main; orchestration only
internal/broker
Responsibility:
- composition root
- holds pointers to Storage, Coordinator, Replication manager
Important property:
- command handlers depend on a single runtime aggregate instead of global state
internal/network
Responsibility:
- parse request lines
- route command names to command handlers
- translate domain errors into protocol errors
Important property:
- protocol-level validation in handlers is centralized and deterministic
internal/protocol
Responsibility:
- request parsing (
V1|correlation|command|args) - response formatting (
OK/ERRenvelopes) - argument conversion helpers
Important property:
- wire contract is isolated from domain modules
internal/storage
Responsibility:
- append records to partition segments
- segment rollover
- index append (offset/time)
- startup recovery and checksum validation
- retention enforcement (age/size)
Important property:
- storage is authoritative for record existence and partition append order
internal/replication
Responsibility:
- track leader offset per partition
- track follower progress and freshness
- compute ISR and high watermark
- enforce
acks=allwait semantics - manage partition role (
leader/follower)
Important property:
- commit visibility boundary is represented by high watermark
internal/coordinator
Responsibility:
- group membership state machine (
JOIN,SYNC,HEARTBEAT,LEAVE) - generation advancement and rebalance
- assignment strategies (range/round_robin)
- commit fencing (generation + ownership)
- persistence of group metadata and offsets
Important property:
- stale members are fenced via generation checks, reducing ownership races
4) Data Ownership and State Boundaries
Understanding FLUX requires knowing who owns which state.
Storage-owned state
- log segments
- in-memory partition message cache
- segment/index metadata
Replication-owned state
- leader offsets
- follower offsets / last seen
- ISR membership flags
- high watermark
- partition role
Coordinator-owned state
- group generation
- member liveness
- assignment map
- committed offsets (with offset manager)
Cross-domain interaction rule
- Storage writes are not automatically “committed-visible”
- visibility to consumers is gated by replication high watermark
- offset commit validity is gated by coordinator generation ownership
This explicit split mirrors distributed systems design patterns where durability, commit, and ownership are distinct concepts.
5) End-to-End Request Paths
Produce path (PRODUCE)
- network handler validates args (
topic,key:value,acks) - partition chosen via key hash
- replication role checked (
leaderrequired) - storage append returns
(partition, offset) - replication leader offset advanced
- if
acks=all, wait until high watermark reaches target and ISR minimum satisfied - response returns
partition,offset,hw
Key architectural point:
- append success and commit success are separate phases
Consume path (CONSUME)
- parse
topic/partition/offset - read current high watermark from replication manager
- read candidate messages from storage
- filter output to
offset <= high watermark - return committed-visible set only
Key architectural point:
- consumers do not observe uncommitted records
Group lifecycle path (JOIN / SYNC / HEARTBEAT / LEAVE)
JOIN: creates/updates member and may bump generation and rebalanceSYNC: validates generation and returns assignment snapshotHEARTBEAT: refreshes liveness for current generationLEAVE: explicit removal and rebalance
Key architectural point:
- generation is an epoch fence for ownership correctness
Commit path (COMMIT)
- parse group/topic/member/generation/partition/offset
- coordinator validates member is current + owns partition
- offset manager persists commit
Key architectural point:
- commit is a coordination decision, not just a write
6) Durability Model and On-Disk Layout
FLUX writes several classes of files under FLUX_DATA_DIR.
Record data and indexes
<topic>-<partition>-segment-<id>.log<topic>-<partition>-offset.idx<topic>-<partition>-time.idx
Replication visibility artifacts (current local simulation)
<topic>-<partition>-replica-<id>-segment-<id>.log
Coordination metadata
offsets.json(committed offsets)groups.json(group generation, members, assignments)
Recovery behavior
On restart:
- storage replays segments with checksum verification
- offsets/group metadata reload from JSON snapshots
- runtime resumes from persisted control state
7) Concurrency and Synchronization
FLUX uses explicit mutex boundaries per subsystem.
- storage mutex: append/read/retention state consistency
- replication mutex: ISR/high-watermark transitions consistency
- coordinator mutex: membership/generation/assignment consistency
Why this matters:
- avoids cross-subsystem races in current single-process model
- makes tests deterministic
- provides clear migration points for future distributed state separation
8) Error Model and Failure Semantics
Protocol-level errors are explicit and stable.
Examples:
BAD_REQUEST: malformed argsUNKNOWN_COMMAND: unsupported commandNOT_LEADER: produce attempted on follower role partitionREPLICATION_TIMEOUT:acks=alldeadline exceededGENERATION_MISMATCH: stale or invalid member epoch/ownership
Design property:
- errors map directly to architectural invariants (leadership, ISR commit, epoch fencing)
9) Current Limits vs Future Architecture
Current state (intentionally constrained):
- single broker process
- local replication simulation
- no external metadata quorum
Planned evolution path:
- metadata quorum and controller plane
- broker registration/fencing
- true broker-to-broker replication
- leader-aware client routing
This keeps the codebase pedagogically useful while progressively introducing production distributed complexity.
10) How to Read the Code with This Model
Recommended code-reading order:
internal/network/handler.gointernal/protocol/protocol.gointernal/storage/storage.gointernal/replication/manager.gointernal/coordinator/group.goandoffset.go- tests for each module
If you follow this order, the architectural invariants above map directly to implementation details and test cases.