Features

June 24, 2025

Beyond the Buzzwords: Building a Real Blockchain from First Principles

In this post, I build a simple blockchain in Python, focusing on mining, cryptography and the creation of the first coin.

Posted by

Joao Teixeira

Introduction

In a previous blockchain post, I gave a brief introduction to what a blockchain is. In this article, I will go a step further and build a simple blockchain in Python to make the concept more concrete.

This post was inspired by a Medium article titled “From Zero to Genius: How I Built a Blockchain from Scratch in Python” by Pasindu Rangana. I appreciated the clarity of the original, but I wanted to walk through the same process from a different lens, focusing more on first principles. While the core structure is similar (a blockchain is a blockchain, after all), my goal here is not to reinvent the wheel but to gain a deeper understanding of it.

What Is a Blockchain?

A blockchain is, essentially, an accounting ledger where a list of transactions is recorded. These transactions are linked: each record in this “book” references the previous one. Because of this, it's not possible to alter a past entry without also changing all subsequent entries.

More precisely, in a blockchain, each block contains a cryptographic hash of the previous block, a timestamp, and transaction data. Since each block refers back to the previous one, they form a chain, hence the name. This structure makes transactions resistant to tampering: once a block is recorded, its contents cannot be altered without modifying all subsequent blocks. And even then, such a change would require network consensus to be accepted.

To gain a better understanding of blockchain technology, check out our blog post titled "What is a Blockchain?"

What Is a Cryptographic Hash?

A cryptographic hash function is a mathematical algorithm that takes an input, any piece of data, and returns a fixed-size string of characters, which appears random. This output is called the hash (or digest).

Let's consider, for example, the following function that converts a string into a SHA-256 hash:

import hashlib
def hash_data(data):
    return hashlib.sha256(data.encode()).hexdigest()

hashlib is a Python standard-library module that provides a common interface to many secure hash algorithms. Running print(hashlib.algorithms_available) shows every hash algorithm available in your installation.

SHA-256 is a cryptographic hash function that produces a 256-bit hash value (32 bytes, since a byte is 8 bits) from input data of any size. It's part of the SHA-2 family and is widely used for security and data integrity verification in various applications, including digital signatures, authentication, and as a building block in password hashing schemes. SHA-256 is the cryptographic hash used by the Bitcoin blockchain.

In Python, data.encode() converts a string (which is Unicode by default) into a bytes object. This is necessary because many libraries, like hashlib, work with bytes, not strings.

hexdigest() gives us the hash as a hexadecimal string. Since each hexadecimal digit represents 4 bits, a SHA-256 hash is always a 64-character string (256 / 4 = 64).
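
As a quick sanity check of the sizes mentioned above, the following sketch inspects a hash object directly. The input string "hello" is arbitrary and chosen only for illustration.

```python
import hashlib

# Hash an arbitrary example string (str -> bytes -> SHA-256 hash object)
h = hashlib.sha256("hello".encode())

# The raw digest is 32 bytes, i.e. 256 bits
print(h.digest_size)           # 32
print(len(h.digest()) * 8)     # 256

# The hex form uses two hexadecimal characters per byte
print(len(h.hexdigest()))      # 64
```

The same lengths hold for any input, whether it is an empty string or a multi-gigabyte file.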

A key property of cryptographic hashes is the avalanche effect: even a tiny change in the input produces a completely different hash. Let's consider the following two strings, which differ only in the capitalization of a single letter:

str_1 = "TradingShepherd is great"
str_2 = "TradingShepherd is Great"
print("str_1 hash: ", hash_data(str_1))
print("str_2 hash: ", hash_data(str_2))

The hexadecimal hashes are:

str_1 hash: 09bdebeb9027d397b147387209221e85fb9525f1072521f1570a2b61a5474c80
str_2 hash: 9360709488a55fa78e331c08c387a36769676d66c029b684e36161dd47dcbd7a

Hash functions are deterministic, meaning the same input will always produce the same output, but they are also one-way: you can’t reverse-engineer the original input from the hash. In the context of a blockchain, hashes are used to link blocks together. Each block contains the hash of the previous block, so if any data in a previous block is changed, its hash changes and that breaks the entire chain.

As we will see in more detail later, this is one of the reasons blockchains are so tamper-resistant: any alteration becomes immediately visible because it invalidates all subsequent hashes.

A Note on Quantum Threats

While today's blockchain security relies on cryptographic functions like SHA-256, researchers are already exploring how quantum computing may impact these foundations. For example, Grover’s algorithm offers a quadratic speedup for brute-force search: finding an input that hashes to a given target (or a nonce that satisfies a Proof of Work condition) would take roughly the square root of the number of hash evaluations needed classically, which could weaken the effective difficulty of Proof of Work. However, this threat remains mostly theoretical at present. The blockchain community continues to monitor quantum advances closely, and research into quantum-resistant cryptography is already underway.

Hashes Are Finite, so Are Collisions Possible?

One common question is whether two different inputs could ever produce the same hash. The short answer is yes: there are only 2^256 possible SHA-256 outputs but an unlimited number of possible inputs, so collisions must exist in theory. But SHA-256 is specifically engineered to make finding such a collision computationally infeasible. To date, no SHA-256 collision has been publicly reported, and the chances of randomly hitting one are astronomically low, comparable to winning the lottery many times in a row.
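
To make the "finite outputs imply collisions" argument tangible, here is a small sketch that deliberately weakens SHA-256 by keeping only its first 4 hex digits (16 bits). The helper names short_hash and find_truncated_collision, and the "input-N" strings, are hypothetical and exist only for this illustration. With just 65,536 possible outputs, a collision shows up almost immediately, while the full 256-bit hashes of the same two inputs remain different.

```python
import hashlib

def short_hash(text, hex_chars=4):
    """First `hex_chars` hex digits of SHA-256: a deliberately weakened hash."""
    return hashlib.sha256(text.encode()).hexdigest()[:hex_chars]

def find_truncated_collision(hex_chars=4):
    """Try inputs one by one until two of them share a truncated hash."""
    seen = {}  # truncated hash -> first input that produced it
    counter = 0
    while True:
        text = f"input-{counter}"
        key = short_hash(text, hex_chars)
        if key in seen:
            return seen[key], text, counter + 1
        seen[key] = text
        counter += 1

first, second, tries = find_truncated_collision()
print(f"{first!r} and {second!r} collide on {short_hash(first)} after {tries} tries")

# The full SHA-256 hashes of the same two inputs are still different
print(hashlib.sha256(first.encode()).hexdigest())
print(hashlib.sha256(second.encode()).hexdigest())
```

Each additional hex digit kept multiplies the search space by 16, which is why the same brute-force approach is hopeless against all 64 digits.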

How Do Blocks Use Hashes?

A block in the blockchain is essentially a data container. These blocks can be represented by a dictionary or a class. The block will typically contain the following fields:

Index: the position of the block in the blockchain.
Timestamp: the time at which the block was created.
Data: the actual block data (a transaction).
Previous hash: the hash of the previous block.
Hash: the hash of the current block.

Using a Python dictionary, a typical function to create a block could look as follows:

import hashlib
import time
import json

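def create_block(index, data, previous_hash):
    # Record the creation time (seconds since the Unix epoch)
    timestamp = time.time()
    # Serialize with sorted keys so identical data always yields the same string
    data_string = json.dumps(data, sort_keys=True)
    # Hash the concatenation of the block's fields
    block_string = f"{index}{timestamp}{data_string}{previous_hash}"
    block_hash = hashlib.sha256(block_string.encode()).hexdigest()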

    block = {
        "index": index,
        "timestamp": timestamp,
        "data": data,
        "previous_hash": previous_hash,
        "hash": block_hash
    }
    return block

Serializing the block's data using JSON with sorted keys ensures that the same data always results in the same hash. This is a crucial detail when verifying a blockchain's integrity.
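
The effect of sort_keys can be seen in a short sketch. The two dictionaries below (tx_a and tx_b, both made up for this example) hold the same content but were built with their keys in a different order; the helper digest is likewise hypothetical.

```python
import hashlib
import json

# Same content, different key insertion order
tx_a = {"from": "Bob", "to": "Alice", "amount": 1.0}
tx_b = {"amount": 1.0, "to": "Alice", "from": "Bob"}

def digest(obj, sort_keys):
    """SHA-256 of the JSON serialization of obj."""
    serialized = json.dumps(obj, sort_keys=sort_keys)
    return hashlib.sha256(serialized.encode()).hexdigest()

# Without sorting, the JSON strings differ, and so do the hashes
print(digest(tx_a, sort_keys=False) == digest(tx_b, sort_keys=False))  # False

# With sorted keys, both serialize to the same string and hash identically
print(digest(tx_a, sort_keys=True) == digest(tx_b, sort_keys=True))    # True
```

Without a canonical serialization, two nodes holding identical data could compute different hashes and disagree about whether a block is valid.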

I will now give a simple example of how to create the first two blocks in our blockchain.

Keeping the tradition initiated by Rivest, Shamir, and Adleman in their 1978 paper "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems" (the so-called RSA paper, which introduced the RSA cryptosystem), let's consider a transaction between two characters named Alice and Bob.

Let's start by creating Block 0, known as the genesis block, which is hardcoded since it has no parent; therefore, we assign "0" as its previous hash.

# Block 0 — Genesis Block
genesis_data = {
    "message": "Genesis Block - TradingShepherd Ledger Initialized"
}
block_0 = create_block(0, genesis_data, "0")
print(json.dumps(block_0, indent=2))

# Output
{
  "index": 0,
  "timestamp": 1750531832.8070593,
  "data": {
    "message": "Genesis Block - TradingShepherd Ledger Initialized"
  },
  "previous_hash": "0",
  "hash": "556b0da8dea73096e6f828bf71caed1524a75691d72495b2feaefbee96c6d5eb"
}

Let's now build Block 1 with a sample transaction: Bob sells 1 BTC to Alice for 103500 USD.

# Block 1 — First trade transaction
trade_data = {
    "from": "Bob",
    "to": "Alice",
    "asset": "BTC",
    "amount": 1.0,
    "price_usd": 103500
}
block_1 = create_block(block_0["index"]+1, trade_data, block_0["hash"])
print(json.dumps(block_1, indent=2))

# Output
{
  "index": 1,
  "timestamp": 1750531890.3329918,
  "data": {
    "from": "Bob",
    "to": "Alice",
    "asset": "BTC",
    "amount": 1.0,
    "price_usd": 103500
  },
  "previous_hash": "556b0da8dea73096e6f828bf71caed1524a75691d72495b2feaefbee96c6d5eb",
  "hash": "fcae57f9d1b4d39d8d16b9a388d5b62389540014d13045ce6ef52bcef4099c14"
}

This example is trivial, of course, but it reflects a real pattern: each block includes the hash of the previous one, forming a chain of verifiable, tamper-resistant records.

Building a Blockchain

A blockchain is simply a list of blocks, where, as we saw before, each new block includes the hash of the previous one. By storing the blocks in order and linking them through these hashes, we create a verifiable, tamper-resistant chain of records: the blockchain.

# Initialize the blockchain with the genesis block
blockchain = [block_0]

# Append the next block
blockchain.append(block_1)

# Print the chain
for block in blockchain:
    print(json.dumps(block, indent=2))
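
One way to see the tamper resistance in action is to write a validator that recomputes every hash and checks every link. The sketch below is not part of the original walkthrough: compute_hash and is_chain_valid are hypothetical helpers, and create_block is restated compactly (using the same hashing recipe as before) so that the snippet runs on its own.

```python
import hashlib
import json
import time

def compute_hash(index, timestamp, data, previous_hash):
    """Recompute a block hash from its fields (same recipe as create_block)."""
    data_string = json.dumps(data, sort_keys=True)
    block_string = f"{index}{timestamp}{data_string}{previous_hash}"
    return hashlib.sha256(block_string.encode()).hexdigest()

def create_block(index, data, previous_hash):
    timestamp = time.time()
    return {
        "index": index,
        "timestamp": timestamp,
        "data": data,
        "previous_hash": previous_hash,
        "hash": compute_hash(index, timestamp, data, previous_hash),
    }

def is_chain_valid(chain):
    """Check that every stored hash matches its block and every link holds."""
    for position, block in enumerate(chain):
        recomputed = compute_hash(
            block["index"], block["timestamp"], block["data"], block["previous_hash"]
        )
        if block["hash"] != recomputed:
            return False  # block contents no longer match the stored hash
        if position > 0 and block["previous_hash"] != chain[position - 1]["hash"]:
            return False  # link to the previous block is broken
    return True

block_0 = create_block(0, {"message": "Genesis"}, "0")
block_1 = create_block(1, {"from": "Bob", "to": "Alice", "amount": 1.0}, block_0["hash"])
blockchain = [block_0, block_1]
valid_before = is_chain_valid(blockchain)
print(valid_before)        # True

# Tamper with the genesis data: its stored hash no longer matches
block_0["data"]["message"] = "Forged"
valid_after_tamper = is_chain_valid(blockchain)
print(valid_after_tamper)  # False

# Even if the attacker recomputes block 0's hash, block 1 still points to the old one
block_0["hash"] = compute_hash(
    block_0["index"], block_0["timestamp"], block_0["data"], block_0["previous_hash"]
)
valid_after_rehash = is_chain_valid(blockchain)
print(valid_after_rehash)  # False
```

The last step is the important one: fixing up the forged block is not enough, because every later block would have to be rebuilt as well.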

The Role of Mining in Blockchain Integrity

So far, I have illustrated how to create a block and link it cryptographically. But that alone isn't secure. If anyone can create and add blocks instantly, an attacker could rewrite history just as quickly. This is where consensus protocols come into play.

Mining is a mechanism that makes adding a block costly, requiring real computational effort. That is, the miner must present the so-called Proof of Work (PoW) to add a block.

Proof of Work is a consensus mechanism in which a computer (the miner) must solve a cryptographic puzzle before its block is accepted. Typically, PoW forces miners to find a special number (a nonce) such that the hash of the block starts with a certain number of zeros. Finding such a nonce is computationally expensive, but checking it takes a single hash evaluation. The more leading zeros required, the harder the puzzle.

def mine_block(index, data, previous_hash, difficulty=4):
    prefix = "0" * difficulty
    nonce = 0
    timestamp = time.time()
    data_string = json.dumps(data, sort_keys=True)

    while True:
        block_string = f"{index}{timestamp}{data_string}{previous_hash}{nonce}"
        block_hash = hashlib.sha256(block_string.encode()).hexdigest()
        if block_hash.startswith(prefix):
            break
        nonce += 1

    block = {
        "index": index,
        "timestamp": timestamp,
        "data": data,
        "previous_hash": previous_hash,
        "nonce": nonce,
        "hash": block_hash
    }
    return block
    
# Mining Block 0: genesis block
genesis_data = {"message": "Genesis Block - TradingShepherd Ledger Initialized"}
block_0 = mine_block(0, genesis_data, "0", difficulty=4)
print(json.dumps(block_0, indent=2))

# Output:
{
  "index": 0,
  "timestamp": 1750527539.385674,
  "data": {
    "message": "Genesis Block - TradingShepherd Ledger Initialized"
  },
  "previous_hash": "0",
  "nonce": 53580,
  "hash": "00001e4c505038858de38fdca96c92a8ab4038dbfc68a7a98dc36196b6672d84"
}

It can be seen in the previous output that appending the number 53580 (the nonce) to the block string results in a hash with 4 leading zeros.
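
Mining takes many attempts, but checking the result takes one. The sketch below shows that asymmetry under the same hashing recipe as mine_block. The names block_digest and verify_pow are hypothetical, and a compact miner is included only so that the snippet is self-contained; difficulty 3 is used to keep the demo fast.

```python
import hashlib
import json
import time

def block_digest(index, timestamp, data, previous_hash, nonce):
    """Hash the block fields plus the nonce, as mine_block does."""
    data_string = json.dumps(data, sort_keys=True)
    block_string = f"{index}{timestamp}{data_string}{previous_hash}{nonce}"
    return hashlib.sha256(block_string.encode()).hexdigest()

def mine_block(index, data, previous_hash, difficulty):
    """Compact miner: increment the nonce until the hash has the required prefix."""
    prefix = "0" * difficulty
    timestamp = time.time()
    nonce = 0
    while not block_digest(index, timestamp, data, previous_hash, nonce).startswith(prefix):
        nonce += 1
    return {
        "index": index,
        "timestamp": timestamp,
        "data": data,
        "previous_hash": previous_hash,
        "nonce": nonce,
        "hash": block_digest(index, timestamp, data, previous_hash, nonce),
    }

def verify_pow(block, difficulty):
    """A single hash evaluation: recompute, compare, and check the prefix."""
    recomputed = block_digest(
        block["index"], block["timestamp"], block["data"],
        block["previous_hash"], block["nonce"],
    )
    return recomputed == block["hash"] and recomputed.startswith("0" * difficulty)

mined = mine_block(0, {"message": "Genesis"}, "0", difficulty=3)
print(verify_pow(mined, difficulty=3))   # True

# Changing the nonce (or any other field) invalidates the stored hash
forged = dict(mined, nonce=mined["nonce"] + 1)
print(verify_pow(forged, difficulty=3))  # False
```

This is what lets every node in a network cheaply reject blocks whose proof of work is missing or forged.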

This computational puzzle is what protects the blockchain from tampering. An attacker trying to rewrite history would need to re-mine not just one block but all subsequent ones (each requiring a new valid PoW) faster than the rest of the network combined. This makes large-scale manipulation practically infeasible. Thus, mining is not just a block creation mechanism; it is the guardian of the blockchain’s integrity.
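
To put rough numbers on that effort, here is a back-of-the-envelope sketch. It assumes SHA-256 output behaves like a uniformly random hex string, so each leading zero is a 1-in-16 event and a prefix of d zeros is found after about 16^d attempts on average. The function names expected_attempts and rewrite_cost are hypothetical.

```python
def expected_attempts(difficulty):
    """Average number of nonces to try for `difficulty` leading hex zeros."""
    return 16 ** difficulty

def rewrite_cost(blocks_to_redo, difficulty):
    """Average hash evaluations needed to re-mine a run of blocks."""
    return blocks_to_redo * expected_attempts(difficulty)

for difficulty in range(1, 7):
    print(difficulty, expected_attempts(difficulty))

# Re-mining the last 10 blocks at difficulty 4
print(rewrite_cost(10, 4))  # 655360
```

At difficulty 4 the average is 65,536 attempts, the same order of magnitude as the nonces found in this post (53580 and 72886). Real networks use far higher difficulty, and an attacker must also outpace the honest miners who keep extending the chain in the meantime.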

Creating the First Coin: the Coinbase Transaction

Where do coins come from in the first place? The answer lies in a special kind of transaction called a coinbase transaction. This is how new coins are introduced into the system. When a miner successfully mines a block, they are rewarded with newly created coins (not taken from any existing wallet) as defined by the protocol rules. This transaction has no sender and simply assigns the block reward to the miner’s address. It's the only way new coins enter the blockchain economy, and it forms the foundation of monetary issuance in PoW systems like Bitcoin.

# Block 1 — Coinbase transaction rewarding Alice for mining
reward_data = {
    "from": "network",
    "to": "Alice",
    "asset": "TSC", # TradingShepherdCoin
    "amount": 50
}
block_1 = mine_block(block_0["index"] + 1, reward_data, block_0["hash"], difficulty=4)
print(json.dumps(block_1, indent=2))

# Output:
{
  "index": 1,
  "timestamp": 1750528751.4374795,
  "data": {
    "from": "network",
    "to": "Alice",
    "asset": "TSC", 
    "amount": 50
  },
  "previous_hash": "00001e4c505038858de38fdca96c92a8ab4038dbfc68a7a98dc36196b6672d84",
  "nonce": 72886,
  "hash": "0000deb8adc81b01e8e30b94c1b4c17644040daf4a449710153e1f77a55be870"
}

In this case, the "from" field is set to "network", emphasizing that this is not a typical transaction between two users but rather a minting event. This coinbase mechanism is fundamental in most PoW blockchains, incentivizing participants to secure the network by rewarding them with newly minted coins.

In summary, in this simplified chain, the genesis block contains no real transaction, only a symbolic message, and the first coins are minted in Block 1. Real networks differ in the details: Bitcoin's genesis block, for instance, does include a 50 BTC coinbase reward, although that reward can never be spent. In every later block, the miner receives coins as a block reward through the coinbase transaction. These transactions are special in that they have no sender (here the "from" field is set to "network"; it could also simply be omitted).

The supply mechanism is encoded into the blockchain protocol. Everyone agrees on the rules (e.g., 50 coins per block, halving after a fixed number of blocks; in Bitcoin, every 210,000 blocks). Block rewards issued under these rules are the only way new coins enter the system. Nodes validate the block and the coinbase reward during consensus. If someone breaks the rule (e.g., tries to reward themselves 100 coins), their block is rejected.
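
A reward schedule of this kind fits in a few lines. The sketch below is illustrative only: the initial reward of 50 follows the example above, while the halving interval of 4 blocks is an arbitrary assumption chosen so the effect is visible in a short run. block_reward and is_reward_valid are hypothetical names.

```python
INITIAL_REWARD = 50.0
HALVING_INTERVAL = 4  # assumed toy value, chosen only for this illustration

def block_reward(height, initial_reward=INITIAL_REWARD, halving_interval=HALVING_INTERVAL):
    """Reward allowed at a given block height: halves every `halving_interval` blocks."""
    halvings = height // halving_interval
    return initial_reward / (2 ** halvings)

def is_reward_valid(height, claimed_amount):
    """A node accepts the coinbase only if it does not exceed the schedule."""
    return claimed_amount <= block_reward(height)

for height in range(0, 12):
    print(height, block_reward(height))

print(is_reward_valid(1, 50))    # True
print(is_reward_valid(1, 100))   # False: the block would be rejected
```

Because every node computes the same schedule from the block height alone, no central authority is needed to decide how many coins may be issued.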

Validating a Transaction vs Mining

Validating a single transaction is lightweight, a quick cryptographic check that any full node can perform. Mining, on the other hand, is intentionally hard: it requires solving a puzzle whose difficulty applies to the entire block, regardless of how many or what kind of transactions it includes. This separation ensures that transactions propagate quickly while only one valid block is added at a time, keeping the blockchain both fast to use and resistant to attacks.

A regular transaction (like Bob sending 1 BTC to Alice) moves coins that already exist from one address to another. The network must validate it to ensure that:

the sender has sufficient balance,
the transaction is properly signed, and
the same coins haven't been spent elsewhere (preventing double spending).
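
Two of these checks can be sketched with a simple account-balance model. This is an assumption made for brevity: Bitcoin itself tracks unspent outputs rather than account balances, and signature verification is left out because it requires a public-key cryptography library. The names validate_transaction and apply_transaction, and the "id" field, are hypothetical.

```python
def validate_transaction(tx, balances, applied_ids):
    """Return (ok, reason) for a transfer under a toy account-balance model."""
    if tx["id"] in applied_ids:
        return False, "transaction already applied"
    if tx["amount"] <= 0:
        return False, "amount must be positive"
    if balances.get(tx["from"], 0) < tx["amount"]:
        return False, "insufficient balance"
    return True, "ok"

def apply_transaction(tx, balances, applied_ids):
    """Validate the transfer and, if it passes, update balances and history."""
    ok, reason = validate_transaction(tx, balances, applied_ids)
    if ok:
        balances[tx["from"]] -= tx["amount"]
        balances[tx["to"]] = balances.get(tx["to"], 0) + tx["amount"]
        applied_ids.add(tx["id"])
    return ok, reason

balances = {"Bob": 1.0}
applied_ids = set()

tx_1 = {"id": "tx-1", "from": "Bob", "to": "Alice", "amount": 1.0}
tx_2 = {"id": "tx-2", "from": "Bob", "to": "Carol", "amount": 1.0}

print(apply_transaction(tx_1, balances, applied_ids))  # (True, 'ok')
print(apply_transaction(tx_1, balances, applied_ids))  # (False, 'transaction already applied')
print(apply_transaction(tx_2, balances, applied_ids))  # (False, 'insufficient balance')
print(balances)                                        # {'Bob': 0.0, 'Alice': 1.0}
```

The second and third calls are two flavors of a double-spend attempt: replaying the same transfer, and trying to spend the same coin with a different recipient.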

In contrast, a coinbase transaction is special: it creates new coins out of thin air according to protocol rules. This transaction is always the first entry in a newly mined block and is not validated in the same way. Instead of requiring inputs (like coins being spent), it simply assigns the block reward (say 50 TSC) to the miner's address. The network accepts it because it's part of the consensus rules: miners are allowed to mint a reward for themselves when they successfully add a valid block to the chain.

To prevent abuse, every full node checks that a block contains only one coinbase transaction and that the reward amount is correct (not more than allowed).
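
A block-level version of that check might look like the sketch below. It assumes each block carries a list of transactions (the toy blocks in this post hold a single "data" dictionary instead) and that a coinbase is recognized by "from" being "network", as in the example above. Transaction fees, which real miners also collect, are ignored. is_coinbase and validate_coinbase are hypothetical helpers.

```python
MAX_REWARD = 50  # allowed block reward in TSC, per the example rules

def is_coinbase(tx):
    """In this toy model, a coinbase is any transaction sent by 'network'."""
    return tx.get("from") == "network"

def validate_coinbase(transactions, max_reward=MAX_REWARD):
    """Exactly one coinbase, placed first, paying no more than the allowed reward."""
    if not transactions or not is_coinbase(transactions[0]):
        return False  # the first entry must be the coinbase
    if any(is_coinbase(tx) for tx in transactions[1:]):
        return False  # a second coinbase would mint extra coins
    return transactions[0]["amount"] <= max_reward

honest = [
    {"from": "network", "to": "Alice", "asset": "TSC", "amount": 50},
    {"from": "Alice", "to": "Bob", "asset": "TSC", "amount": 5},
]
greedy = [{"from": "network", "to": "Alice", "asset": "TSC", "amount": 100}]
double = honest + [{"from": "network", "to": "Alice", "asset": "TSC", "amount": 50}]

print(validate_coinbase(honest))  # True
print(validate_coinbase(greedy))  # False: reward too large
print(validate_coinbase(double))  # False: two coinbase transactions
```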

In short, regular transactions redistribute existing coins, while the coinbase transaction creates new ones, but both are part of the same cryptographically linked, consensus-secured block.

Final Remarks

This article covered only the basics. We've walked through the core mechanics that power blockchains: blocks, hashes, proof-of-work mining, and even the first coin creation.

I have skipped over some real-world complexities, such as transaction validation logic and account state management, which could be topics for a future post.

Also, there is much more to explore, of course: peer-to-peer networking, consensus forks, smart contracts, and staking, to mention just a few.

Thanks for joining me on this journey into the ledger beneath the buzzwords.
