blockchaindev
☰ Lessons

Build a blockchain in Rust · Lesson 03 of 07

Building the block and chain structures

TL;DR — A block is a header (prev_hash, merkle_root, timestamp, height) plus a body of transactions. Hash the header's canonical bytes, chain blocks by embedding the parent hash, and validate the link + height + merkle root on every append.

What a block actually is

A blockchain is an append-only list where each entry commits to the one before it. That commitment is the whole trick: change any historical byte and every subsequent hash changes, so tampering is detectable with a single comparison. To build this we need three things — a header that carries the metadata we hash, a block that pairs the header with its transactions, and a chain that enforces the rules on append.

We split the header from the body deliberately. The header is small and fixed-shape, so it’s cheap to hash and cheap to gossip. The body (transactions) is summarized inside the header by a single merkle_root, which you built in the previous lesson. That means hashing the header alone transitively commits to every transaction — you never need to re-hash megabytes of body to identify a block.

This lesson assumes a Hash type and a hashing helper from earlier lessons. To keep the code self-contained and runnable, we define a minimal Hash here using sha2. Substitute your own.

Cargo.toml
[dependencies]
sha2 = "0.10"
thiserror = "1"

The header

The BlockHeader holds exactly the fields that define a block’s position and content commitment: prev_hash (the parent’s hash — this is the “chain” link), merkle_root (commitment to the transaction set), timestamp (seconds since the Unix epoch), and height (the block’s index, with genesis at 0).

block.rs
use sha2::{Digest, Sha256};

/// 32-byte content hash. In your project this comes from the hashing lesson.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug, Default)]
pub struct BlockHash(pub [u8; 32]);

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct BlockHeader {
    pub prev_hash: BlockHash,
    pub merkle_root: BlockHash,
    pub timestamp: u64,
    pub height: u64,
}

impl BlockHeader {
    /// Canonical, deterministic byte encoding of the header.
    /// Fixed field order + fixed-width big-endian integers => same bytes
    /// on every machine, which is mandatory for hashes to agree across nodes.
    fn canonical_bytes(&self) -> Vec<u8> {
        let mut buf = Vec::with_capacity(32 + 32 + 8 + 8);
        buf.extend_from_slice(&self.prev_hash.0);
        buf.extend_from_slice(&self.merkle_root.0);
        buf.extend_from_slice(&self.timestamp.to_be_bytes());
        buf.extend_from_slice(&self.height.to_be_bytes());
        buf
    }

    /// The block's identity: SHA-256 over the canonical header bytes.
    pub fn hash(&self) -> BlockHash {
        let digest = Sha256::digest(self.canonical_bytes());
        let mut out = [0u8; 32];
        out.copy_from_slice(&digest);
        BlockHash(out)
    }
}

The non-negotiable detail here is canonical_bytes. Two nodes must produce identical bytes for identical headers or their hashes diverge and consensus falls apart. That’s why we hand-serialize with a fixed field order and to_be_bytes() instead of leaning on a debug format or a hashmap iteration order. Endianness is pinned (big-endian) so an ARM node and an x86 node agree. This is the single most common source of “works on my machine” hash mismatches in toy chains — don’t skip it.

The block

A Block is just the header plus the transactions it commits to. We keep transactions as opaque byte vectors here so the lesson stays focused on structure; in a real chain this is your transaction type. The body’s hash is summarized by header.merkle_root, which the producer computes from transactions before sealing the block.

block.rs
pub type Transaction = Vec<u8>;

#[derive(Clone, Debug)]
pub struct Block {
    pub header: BlockHeader,
    pub transactions: Vec<Transaction>,
}

impl Block {
    pub fn hash(&self) -> BlockHash {
        self.header.hash()
    }

    /// Recompute the merkle root from the body and compare to what the
    /// header claims. Use your real merkle function from the prior lesson;
    /// this stand-in concatenates tx bytes and hashes them.
    pub fn body_commitment_is_valid(&self) -> bool {
        merkle_root(&self.transactions) == self.header.merkle_root
    }
}

/// Placeholder merkle root — replace with the tree from the hashing lesson.
fn merkle_root(txs: &[Transaction]) -> BlockHash {
    let mut hasher = Sha256::new();
    for tx in txs {
        hasher.update((tx.len() as u64).to_be_bytes());
        hasher.update(tx);
    }
    let mut out = [0u8; 32];
    out.copy_from_slice(&hasher.finalize());
    BlockHash(out)
}

Note the length prefix in merkle_root: hashing len || tx rather than bare concatenation prevents two different transaction sets from colliding by re-splitting the same byte stream. Even in a placeholder, get domain separation right.

The chain: validate, then append

Now the rules. A Blockchain owns an ordered Vec<Block>. The interesting method is append, which is where we reject invalid blocks instead of silently accepting them. We enforce four invariants on every candidate:

  1. Height is exactly tip.height + 1 — no gaps, no replays.
  2. prev_hash equals the current tip’s hash — the link is real.
  3. timestamp is non-decreasing — time moves forward (a weak but standard sanity rule).
  4. The body commitment matches the header’s merkle root — the header isn’t lying about its contents.

We model failures as a typed error with thiserror so callers can match on the specific reason rather than parse a string.

chain.rs
use thiserror::Error;

#[derive(Debug, Error, PartialEq, Eq)]
pub enum ChainError {
    #[error("expected height {expected}, got {got}")]
    BadHeight { expected: u64, got: u64 },
    #[error("prev_hash does not match current tip")]
    BrokenLink,
    #[error("timestamp {got} is older than tip timestamp {tip}")]
    BackwardsTime { got: u64, tip: u64 },
    #[error("merkle_root does not match the block body")]
    BadBodyCommitment,
}

pub struct Blockchain {
    blocks: Vec<Block>,
}

impl Blockchain {
    /// Genesis: height 0, prev_hash all zeroes. Every chain starts here,
    /// and all nodes must agree on it byte-for-byte.
    pub fn new(genesis_txs: Vec<Transaction>) -> Self {
        let header = BlockHeader {
            prev_hash: BlockHash([0u8; 32]),
            merkle_root: merkle_root(&genesis_txs),
            timestamp: 0,
            height: 0,
        };
        let genesis = Block { header, transactions: genesis_txs };
        Blockchain { blocks: vec![genesis] }
    }

    pub fn tip(&self) -> &Block {
        // Safe: the chain is never empty — genesis is inserted in `new`.
        self.blocks.last().expect("chain always has genesis")
    }

    pub fn height(&self) -> u64 {
        self.tip().header.height
    }

    pub fn append(&mut self, block: Block) -> Result<(), ChainError> {
        let tip = self.tip();

        let expected_height = tip.header.height + 1;
        if block.header.height != expected_height {
            return Err(ChainError::BadHeight {
                expected: expected_height,
                got: block.header.height,
            });
        }
        if block.header.prev_hash != tip.hash() {
            return Err(ChainError::BrokenLink);
        }
        if block.header.timestamp < tip.header.timestamp {
            return Err(ChainError::BackwardsTime {
                got: block.header.timestamp,
                tip: tip.header.timestamp,
            });
        }
        if !block.body_commitment_is_valid() {
            return Err(ChainError::BadBodyCommitment);
        }

        self.blocks.push(block);
        Ok(())
    }

    /// Walk the whole chain and re-verify every link. Useful on startup
    /// when loading persisted blocks you didn't validate this run.
    pub fn validate(&self) -> Result<(), ChainError> {
        for window in self.blocks.windows(2) {
            let (parent, child) = (&window[0], &window[1]);
            if child.header.height != parent.header.height + 1 {
                return Err(ChainError::BadHeight {
                    expected: parent.header.height + 1,
                    got: child.header.height,
                });
            }
            if child.header.prev_hash != parent.hash() {
                return Err(ChainError::BrokenLink);
            }
        }
        Ok(())
    }
}

Two design notes. First, tip() uses .expect() rather than returning an Option, because the genesis block makes “empty chain” unrepresentable — encoding that invariant in new lets the rest of the code stop checking for it. Second, validate() uses slice::windows(2) to walk adjacent (parent, child) pairs, which is the cleanest idiomatic way to verify pairwise links across a Vec.

Proving it works

A producer builds the next block by reading the tip, then a verifier (the chain) accepts or rejects it. The test below builds a valid second block, appends it, then tampers with a third block’s prev_hash to confirm the link check fires.

chain.rs
#[cfg(test)]
mod tests {
    use super::*;

    fn next_block(chain: &Blockchain, txs: Vec<Transaction>, ts: u64) -> Block {
        let tip = chain.tip();
        let header = BlockHeader {
            prev_hash: tip.hash(),
            merkle_root: merkle_root(&txs),
            timestamp: ts,
            height: tip.header.height + 1,
        };
        Block { header, transactions: txs }
    }

    #[test]
    fn appends_valid_blocks_and_rejects_broken_links() {
        let mut chain = Blockchain::new(vec![b"genesis".to_vec()]);
        assert_eq!(chain.height(), 0);

        let b1 = next_block(&chain, vec![b"tx-a".to_vec()], 100);
        chain.append(b1).expect("valid block appends");
        assert_eq!(chain.height(), 1);

        // Tamper: point at the wrong parent.
        let mut bad = next_block(&chain, vec![b"tx-b".to_vec()], 200);
        bad.header.prev_hash = BlockHash([9u8; 32]);
        assert_eq!(chain.append(bad), Err(ChainError::BrokenLink));

        // Chain is still internally consistent.
        assert!(chain.validate().is_ok());
    }
}

Run it with cargo test. The valid block raises the height to 1; the tampered block is rejected with BrokenLink, and validate() confirms the stored chain is still sound.

Where this goes next

You now have the structural backbone: hashed headers, body commitments, and an append path that refuses to extend the chain with anything that doesn’t link cleanly. What’s missing is where the transactions come from. Producers don’t invent them — they pull from a pool of pending transactions, ordered by fee and filtered for validity. That’s the mempool, and it’s the next lesson. See the sha2, HashMap, and thiserror docs for the primitives used above.

Sources