A hash function is a mathematical function that produces an identical result of fixed length from a content of any length by means of a simple calculation.
There are countless developments and modifications of hash functions.
Hash functions of this type are often called cryptographic hash functions.
Hash functions are widely used in computing, for example to quickly locate data using hash tables. The size of the hash values is generally smaller than the size of the possible input data. Therefore, many input data points will share a single hash value.
Note that although the hash value “behaves” as if it were random, it is still deterministic: since for an input, its hash value will always be the same.
Bitcoin uses cryptographic hashing functions to perform a proof of work. Cryptographic hash functions (sometimes called secure hash functions) impose additional requirements over traditional hash functions:
- Unidirectional. Given the hash value, it should not be possible to calculate the input data. As will appear in the following chapter, this is a key property for the application of work evidence.
- High collision resistance. It is not calculable to find two input data points that result in the same hash value.
The Bitcoin specification uses hash functions in several places.
These include bitcoin addresses, transaction references in blocks, or block references in the blockchain. Probably the most important application within Bitcoin, the proof of work in mining (this topic is further explored in the section “Mining“).
Specifically, hash functions SHA256 and RIPEMD160 are used.
In almost all areas, the hash function is called twice, for example SHA256 (SHA256 (message)).SHA256 is part of the “Secure Hash algorithms” and was part of DSA – a standard for digital signatures – in the United States.
The National Institute of Standards and Technology (NIST) is likely to collaborate with the “United States”. National Security Agency (NSA). The SHA2 functions (on which SHA256 is also supported) represents a further development of the SHA1 functions, classified as hazardous since 2005. The result of function SHA256 always contains exactly 64 characters. RIPEMD160 in turn has a result of 40 characters and is used in Bitcoin in places where shorter results are desirable. RIPEMD160 was developed in 1996 at the University of Louvain (Belgium).
The diagram above shows an example of hash function SHA256^2.
The input value at the top of the diagram is the string “Message from Predya 1”.
The hash value in hexadecimal is “a32b08cec25 …”. Then, the same message is repeated except that a character “Message from Predya 2” is changed and the hash value becomes “d51d866921…”. Note how changing a single number in the message leads to a completely different hash value. A good hash function acts as a random mapping from the input value to the hash value.
SHA256 meets the requirement of resistance to preimage attacks, given the hash value, it is computer impractical to retrieve the message that generated it. Infeasible calculation means that there is no known algorithm that can retrieve the message. In practice the most known algorithms to break a hash function, i.e. get the message from the hash alone, are brute force algorithms that take an impractical (exponential) duration (several years or ten years).
Many cryptographic hash functions, including SHA256, are built from an older and simpler system called the compression function. A compression function operates on a fixed length input and produces an output strictly less than the input length.
The purpose of a compression function is to scramble the input bits in a deterministic but complicated way to get to the output. This is achieved by putting the original message through a series of offset operations and mixing data with random type constants.
The Merkle-Damgård construction is a system for building cryptographic hash functions that accept input data of arbitrary length using a compression function as a construction block that depend on each other.
Merkle and Damgård have shown that, while the compression function is collision resistant, the entire construction is also collision resistant. The diagram above shows the general construction. In the case of SHA256, the compression function (marked F in the Schema) works on 256 bits of data.
The compression function accepts two inputs: an intermediate hash value and an input data block. To calculate the SHA256 hash, the input data is first divided into 256-bit blocks. The end of the message is completed by zeros and the length of the message (the Schema is a simplification), The intermediate hash value is initialized to IHV (Initial Hash Value). Then the compression function is applied to each block of the message, using the hash value from the last step as the intermediate hash value in the next step.
The hash value provided by the last step is the SHA256 hash of the entire message.
The “00021” corresponds to the zeros to complete the message as well as its size (the number of characters) SHA256, as with most cryptographic hashing functions, was designed to be fast on consumer hardware. In addition, since the compression function only uses bit-level operations and additions with 32-bit registers, it is suitable for efficient hardware implementation.