|
@@ -0,0 +1,660 @@
|
|
|
+<pre>
|
|
|
+ PIP: PIP-0036
|
|
|
+ Title: RandomHash2: Enhanced GPU & ASIC Resistant Hash Algorithm
|
|
|
+ Type: Protocol
|
|
|
+ Impact: Hard-Fork
|
|
|
+ Author: Herman Schoenfeld <i><[email protected]></i>
|
|
|
+ Comments-URI: https://discord.gg/sJqcgtD (channel #pip-0036)
|
|
|
+ Status: Active
|
|
|
+ Created: 2019-07-31
|
|
|
+</pre>
|
|
|
+
|
|
|
+## Summary
|
|
|
+
|
|
|
+A major revision of RandomHash, a GPU and ASIC resistant hashing algorithm, is proposed.
|
|
|
+
|
|
|
+## Motivation
|
|
|
+
|
|
|
+Since the fundamental motivations behind RandomHash2 follow that of [RandomHash][1], only the changes and motivations of moving from RandomHash to RandomHash2 are discussed in this document. The reader is advised to familiarize themselves with [PIP-0009][1] for full context.
|
|
|
+
|
|
|
+In short, the purpose of RandomHash2 is to address the the need for faster block-header validation currently afflicting PascalCoin due to the predecessor RandomHash PoW algorithm.
|
|
|
+
|
|
|
+## Background
|
|
|
+
|
|
|
+Whilst RandomHash appears to have empirically achieved it's GPU and ASIC resistivity goals, it's computationally-heavy nature has resulted in an unforeseen consequence of slow blockchain validation.
|
|
|
+
|
|
|
+First of all, RandomHash introduces the concept of nonce-dependency between nonces. This means to evaluate a nonce, the partial evaluation of other random neighboring nonces is required. This allows RandomHash to operate in two modes, validation mode and mining mode.
|
|
|
+
|
|
|
+In validation mode, RandomHash is simply used to hash a block-header in order to verify it's the correct block in the blockchain -- just the same as how SHA2-256D does in Bitcoin.
|
|
|
+
|
|
|
+In mining mode, RandomHash is used to mine the next block by enumerating many nonces to to find an acceptable Proof-of-Work for the next block. In this mode, RandomHash exploits the partial calculations from previous rounds by resuming them in subsequent hashing rounds. In RandomHash, this mode operates at twice the hashrate as validation mode, and is the basis for the "CPU Bias" and important to achieve GPU resistivity. It is biased towards CPU's since re-using the partially calculated nonces is a serial optimization that cannot be efficiently exploited in a parallelized, batch-computation context (such as GPU mining). In other words, the optimal nonce-set is enumerated on-the-fly and cannot be pre-determined into a range for parallel mining as GPU mining does.
|
|
|
+
|
|
|
+However, on a typical 2019 desktop computer, the validation hashrate of RandomHash is approximately 20 hashes per second. At 288 blocks per day, that's 14 seconds to validate a days worth of blocks and over 1 hour to validate 1 year of blocks. Whilst multi-threading, performance tuning and other optimizations have massively optimize this performance oversight, it remains a fundamental a long-term issue that needs to be resolved.
|
|
|
+
|
|
|
+RandomHash2 offers order of magnitude improvement in both validation hashrate and mining hashrate. In RandomHash2, the same machine validates at ~300 hashes per second yet mines at ~1,000 hashes per second with far less memory-hardness. Whilst not empirically tested, these numbers suggest a 333% CPU bias.
|
|
|
+
|
|
|
+#### RandomHash vs RandomHash2 Measured Performance
|
|
|
+
|
|
|
+| Algorithm | Mean Hashrate (H/s) | Mean Mem Per Hash (b) | Min Mem Per Hash (b) | Max Mem Per Hash (b) | Sample Std Dev. (b) |
|
|
|
+| :------------------------- | :------------------------ | :-------------------- | :------------------- | :------------------- | :------------------ |
|
|
|
+| RandomHash (validation) | 23 | 5,018,876 | 5,018,628 | 5,019,116 | 83.86 |
|
|
|
+| RandomHash (mining) | 44 | 7,528,288 | 7,527,872 | 7,528,732 | 136 |
|
|
|
+| RandomHash2 (validation) | 309 | 16,719 | 1,380 | 221,536 | 29,420 |
|
|
|
+| **RandomHash2 (mining)** | 1,051 | 16,693 | 1,312 | 251,104 | 29,374 |
|
|
|
+
|
|
|
+_**Machine**: AMD FX-8150 8 Core 3.60 Ghz utilizing 1 thread_
|
|
|
+
|
|
|
+**NOTE:**
|
|
|
+- RandomHash2 (validation) is the mode used to validate blocks during load and sync.
|
|
|
+- **RandomHash2 (mining)** is the mode the mining network will use on V5 activation.
|
|
|
+
|
|
|
+## Specification
|
|
|
+
|
|
|
+RandomHash2 is similarly structured to RandomHash, but has some key differences. The below overview outlines the full algorithm.
|
|
|
+
|
|
|
+### Overview
|
|
|
+
|
|
|
+1. Hashing a nonce requires ```N``` iterations (called levels), which are evaluated recursively;
|
|
|
+2. ```N``` varies per nonce in a non-deterministic manner between ```MIN_N=2``` and ```MAX_N=4```, inclusive;
|
|
|
+3. Each level in (1) also depends on ```J``` neighboring nonces, determined randomly and non-deterministically;
|
|
|
+4. The value ```J``` is restricted to ```MIN_J=2``` and ```MAX_J=4```;
|
|
|
+5. Each round selects a random hash function from a set of 18 well-known hash algorithms;
|
|
|
+6. The input digest hashed at level ```x``` is a compression of the transitive closure of the hash outputs of the it's prior-levels (1) and it's neighbouring nonce's prior-levels (3);
|
|
|
+7. The output of every level is expanded for (low) memory-hardness;
|
|
|
+8. Randomness is generated using ```Mersenne Twister``` algorithm;
|
|
|
+9. Randomness is always seeded using last DWORD of a cryptographic hash algorithm output;
|
|
|
+10. The first input is pre-hashed as using ```SHA2-256```;
|
|
|
+11. The last output is post-hashed using ```SHA2-256```;
|
|
|
+
|
|
|
+### Differences to RandomHash 1
|
|
|
+
|
|
|
+The primary differences to predecessor are:
|
|
|
+
|
|
|
+** Constants
|
|
|
+```pascal
|
|
|
+ MIN_N = 2;
|
|
|
+ MAX_N = 4;
|
|
|
+ MIN_J = 0;
|
|
|
+ MAX_J = 8;
|
|
|
+ M = 256;
|
|
|
+```
|
|
|
+
|
|
|
+**N is now random per nonce**
|
|
|
+```pascal
|
|
|
+ Result := (ARound = MAX_N) OR ((ARound >= MIN_N) AND (GetLastDWordLE(LOutput) MOD MAX_N = 0));
|
|
|
+```
|
|
|
+
|
|
|
+**Block Header is pre-hashed before used (last DWORD is seed)**
|
|
|
+```pascal
|
|
|
+ if ARound = 1 then begin
|
|
|
+ LRoundInput := FHashAlg[0].ComputeBytes(ABlockHeader).GetBytes;
|
|
|
+ LSeed := GetLastDWordLE( LRoundInput );
|
|
|
+ LGen.Initialize(LSeed);
|
|
|
+```
|
|
|
+
|
|
|
+**Random number of dependent neighbour nonces**
|
|
|
+```pascal
|
|
|
+ LNumNeighbours := (LGen.NextUInt32 MOD (MAX_J - MIN_J)) + MIN_J;
|
|
|
+ for i := 1 to LNumNeighbours do begin
|
|
|
+```
|
|
|
+
|
|
|
+**Neighbouring nonces are only cached when fully evaluated**
|
|
|
+```pascal
|
|
|
+LNeighbourNonceHeader := SetLastDWordLE(ABlockHeader, LGen.NextUInt32); // change nonce
|
|
|
+LNeighbourWasLastRound := CalculateRoundOutputs(LNeighbourNonceHeader, ARound - 1, LNeighborOutputs);
|
|
|
+LRoundOutputs.AddRange(LNeighborOutputs);
|
|
|
+
|
|
|
+if LNeighbourWasLastRound then begin
|
|
|
+ LCachedHash.Nonce := GetLastDWordLE(LNeighbourNonceHeader);
|
|
|
+ LCachedHash.Header := LNeighbourNonceHeader;
|
|
|
+ LCachedHash.Hash := ComputeVeneerRound(LNeighborOutputs);
|
|
|
+ // if header is different (other than nonce), clear cache
|
|
|
+ if NOT BytesEqual(FCachedHeaderTemplate, LCachedHash.Header, 0, 32 - 4) then begin
|
|
|
+ FCachedHashes.Clear;
|
|
|
+ FCachedHeaderTemplate := SetLastDWordLE(LCachedHash.Header, 0);
|
|
|
+ end;
|
|
|
+ FCachedHashes.Add(LCachedHash);
|
|
|
+end;
|
|
|
+```
|
|
|
+
|
|
|
+### Full Code
|
|
|
+
|
|
|
+```pascal
|
|
|
+
|
|
|
+ TRandomHash2 = class sealed(TObject)
|
|
|
+ const
|
|
|
+ MIN_N = 2; // Min-number of hashing rounds required to compute a nonce, min total rounds = J^MIN_N
|
|
|
+ MAX_N = 4; // Max-number of hashing rounds required to compute a nonce, max total rounds = J^MAX_N
|
|
|
+ MIN_J = 0; // Min-number of dependent neighbouring nonces required to evaluate a nonce round
|
|
|
+ MAX_J = 8; // Max-number of dependent neighbouring nonces required to evaluate a nonce round
|
|
|
+ M = 256; // The memory expansion unit (in bytes), max total bytes per nonce = M * ((MAX_J+1)^MAX_N (MAX_N-2) + 2)
|
|
|
+ NUM_HASH_ALGO = 18;
|
|
|
+
|
|
|
+ public type
|
|
|
+
|
|
|
+ TCachedHash = record
|
|
|
+ Nonce : UInt32;
|
|
|
+ Header : TBytes;
|
|
|
+ Hash : TBytes;
|
|
|
+ end;
|
|
|
+
|
|
|
+ private
|
|
|
+ FMurmurHash3_x86_32 : IHash;
|
|
|
+ FHashAlg : array[0..17] of IHash; // declared here to avoid race-condition during mining
|
|
|
+ FCachedHeaderTemplate : TBytes;
|
|
|
+ FCachedHashes : TList<TCachedHash>;
|
|
|
+
|
|
|
+ function GetCachedHashes : TArray<TCachedHash>; inline;
|
|
|
+ function ContencateByteArrays(const AChunk1, AChunk2: TBytes): TBytes; inline;
|
|
|
+ function MemTransform1(const AChunk: TBytes): TBytes; inline;
|
|
|
+ function MemTransform2(const AChunk: TBytes): TBytes; inline;
|
|
|
+ function MemTransform3(const AChunk: TBytes): TBytes; inline;
|
|
|
+ function MemTransform4(const AChunk: TBytes): TBytes; inline;
|
|
|
+ function MemTransform5(const AChunk: TBytes): TBytes; inline;
|
|
|
+ function MemTransform6(const AChunk: TBytes): TBytes; inline;
|
|
|
+ function MemTransform7(const AChunk: TBytes): TBytes; inline;
|
|
|
+ function MemTransform8(const AChunk: TBytes): TBytes; inline;
|
|
|
+ function Expand(const AInput: TBytes; AExpansionFactor: Int32; ASeed : UInt32) : TBytes;
|
|
|
+ function Compress(const AInputs: TArray<TBytes>; ASeed : UInt32): TBytes; inline;
|
|
|
+ function SetLastDWordLE(const ABytes: TBytes; AValue: UInt32): TBytes; inline;
|
|
|
+ function GetLastDWordLE(const ABytes: TBytes) : UInt32; inline;
|
|
|
+ function ComputeVeneerRound(const ARoundOutputs : TArray<TBytes>) : TBytes; inline;
|
|
|
+ function CalculateRoundOutputs(const ABlockHeader: TBytes; ARound: Int32; out ARoundOutputs : TArray<TBytes>) : Boolean; overload;
|
|
|
+ public
|
|
|
+ constructor Create;
|
|
|
+ destructor Destroy; override;
|
|
|
+ property CachedHashes : TArray<TCachedHash> read GetCachedHashes;
|
|
|
+ function HasCachedHash : Boolean; inline;
|
|
|
+ function PopCachedHash : TCachedHash; inline;
|
|
|
+ function PeekCachedHash : TCachedHash; inline;
|
|
|
+ function TryHash(const ABlockHeader: TBytes; AMaxRound : UInt32; out AHash : TBytes) : Boolean;
|
|
|
+ function Hash(const ABlockHeader: TBytes): TBytes; overload; inline;
|
|
|
+ class function Compute(const ABlockHeader: TBytes): TBytes; overload; static; inline;
|
|
|
+ end;
|
|
|
+
|
|
|
+ { ERandomHash2 }
|
|
|
+
|
|
|
+ ERandomHash2 = class(Exception);
|
|
|
+
|
|
|
+resourcestring
|
|
|
+ SUnSupportedHash = 'Unsupported Hash Selected';
|
|
|
+ SInvalidRound = 'Round must be between 0 and N inclusive';
|
|
|
+ SOverlappingArgs = 'Overlapping read/write regions';
|
|
|
+ SBufferTooSmall = 'Buffer too small to apply memory transform';
|
|
|
+ SBlockHeaderTooSmallForNonce = 'Buffer too small to contain nonce';
|
|
|
+
|
|
|
+implementation
|
|
|
+
|
|
|
+uses UMemory, URandomHash;
|
|
|
+
|
|
|
+{ TRandomHash2 }
|
|
|
+
|
|
|
+constructor TRandomHash2.Create;
|
|
|
+begin
|
|
|
+ FMurmurHash3_x86_32 := THashFactory.THash32.CreateMurmurHash3_x86_32();
|
|
|
+ SetLength(Self.FCachedHeaderTemplate, 0);
|
|
|
+ FCachedHashes := TList<TCachedHash>.Create;
|
|
|
+ FHashAlg[0] := THashFactory.TCrypto.CreateSHA2_256();
|
|
|
+ FHashAlg[1] := THashFactory.TCrypto.CreateSHA2_384();
|
|
|
+ FHashAlg[2] := THashFactory.TCrypto.CreateSHA2_512();
|
|
|
+ FHashAlg[3] := THashFactory.TCrypto.CreateSHA3_256();
|
|
|
+ FHashAlg[4] := THashFactory.TCrypto.CreateSHA3_384();
|
|
|
+ FHashAlg[5] := THashFactory.TCrypto.CreateSHA3_512();
|
|
|
+ FHashAlg[6] := THashFactory.TCrypto.CreateRIPEMD160();
|
|
|
+ FHashAlg[7] := THashFactory.TCrypto.CreateRIPEMD256();
|
|
|
+ FHashAlg[8] := THashFactory.TCrypto.CreateRIPEMD320();
|
|
|
+ FHashAlg[9] := THashFactory.TCrypto.CreateBlake2B_512();
|
|
|
+ FHashAlg[10] := THashFactory.TCrypto.CreateBlake2S_256();
|
|
|
+ FHashAlg[11] := THashFactory.TCrypto.CreateTiger2_5_192();
|
|
|
+ FHashAlg[12] := THashFactory.TCrypto.CreateSnefru_8_256();
|
|
|
+ FHashAlg[13] := THashFactory.TCrypto.CreateGrindahl512();
|
|
|
+ FHashAlg[14] := THashFactory.TCrypto.CreateHaval_5_256();
|
|
|
+ FHashAlg[15] := THashFactory.TCrypto.CreateMD5();
|
|
|
+ FHashAlg[16] := THashFactory.TCrypto.CreateRadioGatun32();
|
|
|
+ FHashAlg[17] := THashFactory.TCrypto.CreateWhirlPool();
|
|
|
+end;
|
|
|
+
|
|
|
+destructor TRandomHash2.Destroy;
|
|
|
+var i : integer;
|
|
|
+begin
|
|
|
+ FCachedHashes.Clear;
|
|
|
+ FreeAndNil(FCachedHashes);
|
|
|
+ FMurmurHash3_x86_32 := nil;
|
|
|
+ for i := Low(FHashAlg) to High(FHashAlg) do
|
|
|
+ FHashAlg[i] := nil;
|
|
|
+ inherited Destroy;
|
|
|
+end;
|
|
|
+
|
|
|
+class function TRandomHash2.Compute(const ABlockHeader: TBytes): TBytes;
|
|
|
+var
|
|
|
+ LHasher : TRandomHash2;
|
|
|
+ LDisposables : TDisposables;
|
|
|
+begin
|
|
|
+ LHasher := LDisposables.AddObject( TRandomHash2.Create ) as TRandomHash2;
|
|
|
+ Result := LHasher.Hash(ABlockHeader);
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.TryHash(const ABlockHeader: TBytes; AMaxRound : UInt32; out AHash : TBytes) : Boolean;
|
|
|
+var
|
|
|
+ LOutputs: TArray<TBytes>;
|
|
|
+ LSeed: UInt32;
|
|
|
+begin
|
|
|
+ if NOT CalculateRoundOutputs(ABlockHeader, AMaxRound, LOutputs) then
|
|
|
+ Exit(False);
|
|
|
+ AHash := ComputeVeneerRound(LOutputs);
|
|
|
+ Result := True;
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.Hash(const ABlockHeader: TBytes): TBytes;
|
|
|
+begin
|
|
|
+ if NOT TryHash(ABlockHeader, MAX_N, Result) then
|
|
|
+ raise ERandomHash2.Create('Internal Error: 984F52997131417E8D63C43BD686F5B2'); // Should have found final round!
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.ComputeVeneerRound(const ARoundOutputs : TArray<TBytes>) : TBytes;
|
|
|
+var
|
|
|
+ LSeed : UInt32;
|
|
|
+begin
|
|
|
+ LSeed := GetLastDWordLE(ARoundOutputs[High(ARoundOutputs)]);
|
|
|
+ // Final "veneer" round of RandomHash is a SHA2-256 of compression of prior round outputs
|
|
|
+ Result := FHashAlg[0].ComputeBytes(Compress(ARoundOutputs, LSeed)).GetBytes;
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.CalculateRoundOutputs(const ABlockHeader: TBytes; ARound: Int32; out ARoundOutputs : TArray<TBytes>) : Boolean;
|
|
|
+var
|
|
|
+ LRoundOutputs: TList<TBytes>;
|
|
|
+ LNeighbourWasLastRound : Boolean;
|
|
|
+ LSeed, LNumNeighbours: UInt32;
|
|
|
+ LGen: TMersenne32;
|
|
|
+ LRoundInput, LNeighbourNonceHeader, LOutput : TBytes;
|
|
|
+ LCachedHash : TCachedHash;
|
|
|
+ LParentOutputs, LNeighborOutputs, LToArray, LBuffs2: TArray<TBytes>;
|
|
|
+ LHashFunc: IHash;
|
|
|
+ i: Int32;
|
|
|
+ LDisposables : TDisposables;
|
|
|
+ LBuff : TBytes;
|
|
|
+begin
|
|
|
+ if (ARound < 1) or (ARound > MAX_N) then
|
|
|
+ raise EArgumentOutOfRangeException.CreateRes(@SInvalidRound);
|
|
|
+
|
|
|
+ LRoundOutputs := LDisposables.AddObject( TList<TBytes>.Create() ) as TList<TBytes>;
|
|
|
+ LGen := LDisposables.AddObject( TMersenne32.Create(0) ) as TMersenne32;
|
|
|
+ if ARound = 1 then begin
|
|
|
+ LRoundInput := FHashAlg[0].ComputeBytes(ABlockHeader).GetBytes;
|
|
|
+ LSeed := GetLastDWordLE( LRoundInput );
|
|
|
+ LGen.Initialize(LSeed);
|
|
|
+ end else begin
|
|
|
+ if CalculateRoundOutputs(ABlockHeader, ARound - 1, LParentOutputs) = True then begin
|
|
|
+ // Previous round was the final round, so just return it's value
|
|
|
+ ARoundOutputs := LParentOutputs;
|
|
|
+ Exit(True);
|
|
|
+ end;
|
|
|
+
|
|
|
+ // Add parent round outputs to this round outputs
|
|
|
+ LSeed := GetLastDWordLE( LParentOutputs[High(LParentOutputs)] );
|
|
|
+ LGen.Initialize(LSeed);
|
|
|
+ LRoundOutputs.AddRange( LParentOutputs );
|
|
|
+
|
|
|
+ // Add neighbouring nonce outputs to this round outputs
|
|
|
+ LNumNeighbours := (LGen.NextUInt32 MOD (MAX_J - MIN_J)) + MIN_J;
|
|
|
+ for i := 1 to LNumNeighbours do begin
|
|
|
+ LNeighbourNonceHeader := SetLastDWordLE(ABlockHeader, LGen.NextUInt32); // change nonce
|
|
|
+ LNeighbourWasLastRound := CalculateRoundOutputs(LNeighbourNonceHeader, ARound - 1, LNeighborOutputs);
|
|
|
+ LRoundOutputs.AddRange(LNeighborOutputs);
|
|
|
+
|
|
|
+ // If neighbour was a fully evaluated nonce, cache it for re-use
|
|
|
+ if LNeighbourWasLastRound then begin
|
|
|
+ LCachedHash.Nonce := GetLastDWordLE(LNeighbourNonceHeader);
|
|
|
+ LCachedHash.Header := LNeighbourNonceHeader;
|
|
|
+ LCachedHash.Hash := ComputeVeneerRound(LNeighborOutputs);
|
|
|
+ // if header is different (other than nonce), clear cache
|
|
|
+ if NOT BytesEqual(FCachedHeaderTemplate, LCachedHash.Header, 0, 32 - 4) then begin
|
|
|
+ FCachedHashes.Clear;
|
|
|
+ FCachedHeaderTemplate := SetLastDWordLE(LCachedHash.Header, 0);
|
|
|
+ end;
|
|
|
+ FCachedHashes.Add(LCachedHash);
|
|
|
+ end;
|
|
|
+ end;
|
|
|
+ // Compress the parent/neighbouring outputs to form this rounds input
|
|
|
+ LRoundInput := Compress( LRoundOutputs.ToArray, LGen.NextUInt32 );
|
|
|
+ end;
|
|
|
+
|
|
|
+ // Select a random hash function and hash the input to find the output
|
|
|
+ LHashFunc := FHashAlg[LGen.NextUInt32 mod NUM_HASH_ALGO];
|
|
|
+ LOutput := LHashFunc.ComputeBytes(LRoundInput).GetBytes;
|
|
|
+
|
|
|
+ // Memory-expand the output, add to output list and return output list
|
|
|
+ LOutput := Expand(LOutput, MAX_N - ARound, LGen.NextUInt32);
|
|
|
+ LRoundOutputs.Add(LOutput);
|
|
|
+ ARoundOutputs := LRoundOutputs.ToArray;
|
|
|
+
|
|
|
+ // Determine if final round
|
|
|
+ Result := (ARound = MAX_N) OR ((ARound >= MIN_N) AND (GetLastDWordLE(LOutput) MOD MAX_N = 0));
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.SetLastDWordLE(const ABytes: TBytes; AValue: UInt32): TBytes;
|
|
|
+var
|
|
|
+ ABytesLength : Integer;
|
|
|
+begin
|
|
|
+ // Clone the original header
|
|
|
+ Result := Copy(ABytes);
|
|
|
+
|
|
|
+ // If digest not big enough to contain a nonce, just return the clone
|
|
|
+ ABytesLength := Length(ABytes);
|
|
|
+ if ABytesLength < 4 then
|
|
|
+ exit;
|
|
|
+
|
|
|
+ // Overwrite the nonce in little-endian
|
|
|
+ Result[ABytesLength - 4] := Byte(AValue);
|
|
|
+ Result[ABytesLength - 3] := (AValue SHR 8) AND 255;
|
|
|
+ Result[ABytesLength - 2] := (AValue SHR 16) AND 255;
|
|
|
+ Result[ABytesLength - 1] := (AValue SHR 24) AND 255;
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.GetLastDWordLE(const ABytes: TBytes) : UInt32;
|
|
|
+var LLen : Integer;
|
|
|
+begin
|
|
|
+ LLen := Length(ABytes);
|
|
|
+ if LLen < 4 then
|
|
|
+ raise EArgumentOutOfRangeException.CreateRes(@SBlockHeaderTooSmallForNonce);
|
|
|
+
|
|
|
+ // Last 4 bytes are nonce (LE)
|
|
|
+ Result := ABytes[LLen - 4] OR
|
|
|
+ (ABytes[LLen - 3] SHL 8) OR
|
|
|
+ (ABytes[LLen - 2] SHL 16) OR
|
|
|
+ (ABytes[LLen - 1] SHL 24);
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.Compress(const AInputs : TArray<TBytes>; ASeed : UInt32): TBytes;
|
|
|
+var
|
|
|
+ i: Int32;
|
|
|
+ LSource: TBytes;
|
|
|
+ LGen: TMersenne32;
|
|
|
+ LDisposables : TDisposables;
|
|
|
+begin
|
|
|
+ SetLength(Result, 100);
|
|
|
+ LGen := LDisposables.AddObject( TMersenne32.Create( ASeed ) ) as TMersenne32;
|
|
|
+ for i := 0 to 99 do
|
|
|
+ begin
|
|
|
+ LSource := AInputs[LGen.NextUInt32 mod Length(AInputs)];
|
|
|
+ Result[i] := LSource[LGen.NextUInt32 mod Length(LSource)];
|
|
|
+ end;
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.GetCachedHashes : TArray<TCachedHash>;
|
|
|
+begin
|
|
|
+ Result := FCachedHashes.ToArray;
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.HasCachedHash : Boolean;
|
|
|
+begin
|
|
|
+ Result := FCachedHashes.Count > 0;
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.PopCachedHash : TCachedHash;
|
|
|
+begin
|
|
|
+ Result := FCachedHashes.Last;
|
|
|
+ FCachedHashes.Delete(FCachedHashes.Count - 1);
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.PeekCachedHash : TCachedHash;
|
|
|
+begin
|
|
|
+ Result := FCachedHashes.Last;
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.ContencateByteArrays(const AChunk1, AChunk2: TBytes): TBytes;
|
|
|
+begin
|
|
|
+ SetLength(Result, Length(AChunk1) + Length(AChunk2));
|
|
|
+ Move(AChunk1[0], Result[0], Length(AChunk1));
|
|
|
+ Move(AChunk2[0], Result[Length(AChunk1)], Length(AChunk2));
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.MemTransform1(const AChunk: TBytes): TBytes;
|
|
|
+var
|
|
|
+ i, LChunkLength : UInt32;
|
|
|
+ LState : UInt32;
|
|
|
+begin
|
|
|
+ // Seed XorShift32 with last byte
|
|
|
+ LState := GetLastDWordLE(AChunk);
|
|
|
+ if LState = 0 then
|
|
|
+ LState := 1;
|
|
|
+
|
|
|
+ // Select random bytes from input using XorShift32 RNG
|
|
|
+ LChunkLength := Length(AChunk);
|
|
|
+ SetLength(Result, LChunkLength);
|
|
|
+ for i := 0 to High(AChunk) do
|
|
|
+ Result[i] := AChunk[TXorShift32.Next(LState) MOD LChunkLength];
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.MemTransform2(const AChunk: TBytes): TBytes;
|
|
|
+var
|
|
|
+ i, LChunkLength, LPivot, LOdd: Int32;
|
|
|
+begin
|
|
|
+ LChunkLength := Length(AChunk);
|
|
|
+ LPivot := LChunkLength SHR 1;
|
|
|
+ LOdd := LChunkLength MOD 2;
|
|
|
+ SetLength(Result, LChunkLength);
|
|
|
+ Move(AChunk[LPivot + LOdd], Result[0], LPivot);
|
|
|
+ Move(AChunk[0], Result[LPivot + LOdd], LPivot);
|
|
|
+ // Set middle-byte for odd-length arrays
|
|
|
+ if LOdd = 1 then
|
|
|
+ Result[LPivot] := AChunk[LPivot];
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.MemTransform3(const AChunk: TBytes): TBytes;
|
|
|
+var
|
|
|
+ i, LChunkLength: Int32;
|
|
|
+begin
|
|
|
+ LChunkLength := Length(AChunk);
|
|
|
+ SetLength(Result, LChunkLength);
|
|
|
+ for i := 0 to High(AChunk) do
|
|
|
+ Result[i] := AChunk[LChunkLength - i - 1];
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.MemTransform4(const AChunk: TBytes): TBytes;
|
|
|
+var
|
|
|
+ i, LChunkLength, LPivot, LOdd: Int32;
|
|
|
+begin
|
|
|
+ LChunkLength := Length(AChunk);
|
|
|
+ LPivot := LChunkLength SHR 1;
|
|
|
+ LOdd := LChunkLength MOD 2;
|
|
|
+ SetLength(Result, LChunkLength);
|
|
|
+ for i := 0 to Pred(LPivot) do
|
|
|
+ begin
|
|
|
+ Result[(i * 2)] := AChunk[i];
|
|
|
+ Result[(i * 2) + 1] := AChunk[i + LPivot + LOdd];
|
|
|
+ end;
|
|
|
+ // Set final byte for odd-lengths
|
|
|
+ if LOdd = 1 THEN
|
|
|
+ Result[High(Result)] := AChunk[LPivot];
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.MemTransform5(const AChunk: TBytes): TBytes;
|
|
|
+var
|
|
|
+ i, LChunkLength, LPivot, LOdd: Int32;
|
|
|
+begin
|
|
|
+ LChunkLength := Length(AChunk);
|
|
|
+ LPivot := LChunkLength SHR 1;
|
|
|
+ LOdd := LChunkLength MOD 2;
|
|
|
+ SetLength(Result, LChunkLength);
|
|
|
+ for i := Low(AChunk) to Pred(LPivot) do
|
|
|
+ begin
|
|
|
+ Result[(i * 2)] := AChunk[i + LPivot + LOdd];
|
|
|
+ Result[(i * 2) + 1] := AChunk[i];
|
|
|
+ end;
|
|
|
+ // Set final byte for odd-lengths
|
|
|
+ if LOdd = 1 THEN
|
|
|
+ Result[High(Result)] := AChunk[LPivot];
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.MemTransform6(const AChunk: TBytes): TBytes;
|
|
|
+var
|
|
|
+ i, LChunkLength, LPivot, LOdd: Int32;
|
|
|
+begin
|
|
|
+ LChunkLength := Length(AChunk);
|
|
|
+ LPivot := LChunkLength SHR 1;
|
|
|
+ LOdd := LChunkLength MOD 2;
|
|
|
+ SetLength(Result, LChunkLength);
|
|
|
+ for i := 0 to Pred(LPivot) do
|
|
|
+ begin
|
|
|
+ Result[i] := AChunk[(i * 2)] xor AChunk[(i * 2) + 1];
|
|
|
+ Result[i + LPivot + LOdd] := AChunk[i] xor AChunk[LChunkLength - i - 1];
|
|
|
+ end;
|
|
|
+ // Set middle-byte for odd-lengths
|
|
|
+ if LOdd = 1 THEN
|
|
|
+ Result[LPivot] := AChunk[High(AChunk)];
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.MemTransform7(const AChunk: TBytes): TBytes;
|
|
|
+var
|
|
|
+ i, LChunkLength: Int32;
|
|
|
+begin
|
|
|
+ LChunkLength := Length(AChunk);
|
|
|
+ SetLength(Result, LChunkLength);
|
|
|
+ for i := 0 to High(AChunk) do
|
|
|
+ Result[i] := TBits.RotateLeft8(AChunk[i], LChunkLength - i);
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.MemTransform8(const AChunk: TBytes): TBytes;
|
|
|
+var
|
|
|
+ i, LChunkLength: Int32;
|
|
|
+begin
|
|
|
+ LChunkLength := Length(AChunk);
|
|
|
+ SetLength(Result, LChunkLength);
|
|
|
+ for i := 0 to High(AChunk) do
|
|
|
+ Result[i] := TBits.RotateRight8(AChunk[i], LChunkLength - i);
|
|
|
+end;
|
|
|
+
|
|
|
+function TRandomHash2.Expand(const AInput: TBytes; AExpansionFactor: Int32; ASeed : UInt32): TBytes;
|
|
|
+var
|
|
|
+ LSize, LBytesToAdd: Int32;
|
|
|
+ LOutput, LNextChunk: TBytes;
|
|
|
+ LRandom: UInt32;
|
|
|
+ LGen: TMersenne32;
|
|
|
+ LDisposables : TDisposables;
|
|
|
+begin
|
|
|
+ LGen := LDisposables.AddObject( TMersenne32.Create (ASeed) ) as TMersenne32;
|
|
|
+ LSize := Length(AInput) + (AExpansionFactor * M);
|
|
|
+ LOutput := Copy(AInput);
|
|
|
+ LBytesToAdd := LSize - Length(AInput);
|
|
|
+
|
|
|
+ while LBytesToAdd > 0 do
|
|
|
+ begin
|
|
|
+ LNextChunk := Copy(LOutput);
|
|
|
+ if Length(LNextChunk) > LBytesToAdd then
|
|
|
+ SetLength(LNextChunk, LBytesToAdd);
|
|
|
+
|
|
|
+ LRandom := LGen.NextUInt32;
|
|
|
+ case LRandom mod 8 of
|
|
|
+ 0: LOutput := ContencateByteArrays(LOutput, MemTransform1(LNextChunk));
|
|
|
+ 1: LOutput := ContencateByteArrays(LOutput, MemTransform2(LNextChunk));
|
|
|
+ 2: LOutput := ContencateByteArrays(LOutput, MemTransform3(LNextChunk));
|
|
|
+ 3: LOutput := ContencateByteArrays(LOutput, MemTransform4(LNextChunk));
|
|
|
+ 4: LOutput := ContencateByteArrays(LOutput, MemTransform5(LNextChunk));
|
|
|
+ 5: LOutput := ContencateByteArrays(LOutput, MemTransform6(LNextChunk));
|
|
|
+ 6: LOutput := ContencateByteArrays(LOutput, MemTransform7(LNextChunk));
|
|
|
+ 7: LOutput := ContencateByteArrays(LOutput, MemTransform8(LNextChunk));
|
|
|
+ end;
|
|
|
+ LBytesToAdd := LBytesToAdd - Length(LNextChunk);
|
|
|
+ end;
|
|
|
+ Result := LOutput;
|
|
|
+end;
|
|
|
+
|
|
|
+end.
|
|
|
+
|
|
|
+```
|
|
|
+
|
|
|
+### Analysis
|
|
|
+
|
|
|
+#### Cryptographic Strength
|
|
|
+
|
|
|
+Since hashing starts and ends with a ```SHA2-256``` the cryptographic strength of RandomHash2 is **at least** that of ```SHA2-256D``` as used in Bitcoin. Even if one were to assume the data transformations in between the start/end were cryptographically insecure, it wouldn't change this minimum security guarantee.
|
|
|
+
|
|
|
+However, the transformations in between are not weak and involve the use of 18 other cryptographically strong hash algorithms. As a result, RandomHash2 is orders of magnitude more stronger than standard cryptographic hash algorithms since they are combined in random, non-determinstic ways. However this achievement is paid for by significant performance overhead (which is intentional).
|
|
|
+
|
|
|
+However, within the 18 hash algorithms used, some are considered "cryptographically weak" such as MD5. The use of some weak algorithms is inconsequential to overall security since their purpose is not to add to security but to computational complexity to prevent ASIC manufacturing.
|
|
|
+
|
|
|
+In order to get a grasp of the minmum security provided by RandomHash2, consider it's high-level algorithmic structure as essentially a set of nested hashes as follows:
|
|
|
+```
|
|
|
+RandomHash2(Data) = SHA2_256( H1( H2( .... H_N( SHA2_256(DATA)) ...) )
|
|
|
+where
|
|
|
+ H_i = a randomly selected hash function based on the output of H_(i-1)
|
|
|
+ N = a random number determined by the nonce and neighbouring nonces (indeterminable but bound)
|
|
|
+```
|
|
|
+
|
|
|
+It follows that the weakest possible RandomHash2 for some ```WeakestDigest``` would comprise of 2 levels of evaluation (``MIN_N=2```) with each of those 2 levels having 0 nieghbouring nonce dependencies (```MIN_J```). Also, we assume the hash algorithms are ```MD5``` for all levels, as it is considered weakest of the 18 possible algorithms. In this case,
|
|
|
+```
|
|
|
+RandomHash2(WeakestDigest) = SHA2_256( MD5 ( MD5 ( SHA2_256( WeakestDigest ) ) ) )
|
|
|
+```
|
|
|
+
|
|
|
+Clearly the above is still far stronger than the typical SHA2-256D used in almost all cryptocurrencies
|
|
|
+```pascal
|
|
|
+SHA2-256D(WeakestDigest) = SHA2-256 ( SHA2-256 ( WeakestDigest ) )
|
|
|
+```
|
|
|
+
|
|
|
+
|
|
|
+In addition to the above, RandomHash2 internally transforms data using expansions and compressions which are themselves cryptographically secure. As a result, it's clear that RandomHash2's cryptographic strength is at least as strong as Bitcoin's ```SHA2-256D``` with likelihood of being orders of magnitude stronger.
|
|
|
+
|
|
|
+### Nonce Scanning Attack
|
|
|
+
|
|
|
+In RandomHash2, the number of levels ```N``` required to mine a nonce is now random and varies per nonce in a non-deterministic manner. The randomization of ```N``` introduces new level of randomness and executive decision-making into the core algorithm in order to enhance GPU and ASIC resistivity. However, it introduces a new attack vector called "Nonce Scanning Attack". In this attack, a miner can implement a simplified miner that only tries to mine "simple nonces" that require few levels to evaluate whilst rejecting "complex nonces" that require more levels to evaluate. By reducing the number of computations required to evaluate a nonce and simplifying the algorithm implementation, a higher hashrate could be achieved and an ASIC implementation made viable.
|
|
|
+
|
|
|
+To thwart this attack, RandomHash2 restricts the range of values ```N``` can take to be between ```MIN_N = 2``` and ```MAX_N = 4```, inclusive.
|
|
|
+
|
|
|
+By forcing a nonce evaluation to have at least 2 levels of computation, the miner necessarily requires the full algorithm implementation which prevents simplified ASIC miners. Also, since each nonce requires at least 2 levels of evaluation, and each of those levels depends on at least 2 more nonces evaluated to at least 1 level each, the number of computations saved by nonce-scanning is balanced by the higher pre-computed cached nonces a miner has by honestly mining without nonce-scanning (due to higher number of dependent neighboring nonces).
|
|
|
+
|
|
|
+In order to determine if this is true, an empirical nonce-scanning attack was conducted. The below table shows empirical results from nonce-scanning ```N=MIN_N``` to ```N=MAX_N```.
|
|
|
+
|
|
|
+| N | Mean Hashrate (H/s) | Mean Mem Per Hash (b) | Min Mem Per Hash (b) | Max Mem Per Hash (b) | Sample Std Dev. (b) |
|
|
|
+| :------ | :------------------------ | :-------------------- | :------------------- | :------------------- | :------------------ |
|
|
|
+| 2 (min) | 240 | 4,175 | 1,312 | 7,184 | 1,854 |
|
|
|
+| 3 | 651 | 5,984 | 1,312 | 49,436 | 6,293 |
|
|
|
+| 4 (max) | 1,051 | 16,693 | 1,312 | 251,104 | 29,374 |
|
|
|
+
|
|
|
+_**Machine**: AMD FX-8150 8 Core 3.60 Ghz utilizing 1 thread_
|
|
|
+
|
|
|
+As the above table shows, nonce-scanning (via CPU) yields a hashrate penalty, not a benefit. In the opinion of the author, it is unlikely a future implementation optimization would necessarily change this result since it would improve all scanning levels proportionally. However, a line of inquiry is to investigate whether or not the reduced memory-hardness may yield a benefit for GPU-based nonce-scanning.
|
|
|
+
|
|
|
+#### CPU Bias
|
|
|
+
|
|
|
+The RandomHash2 algorithm, like it's predecessor, is inherently biased towards CPU mining due to it's highly serial nature, use of non-deterministic recursion and executive-decision making. In addition, RandomHash2 can now evaluate many nonces when evaluating one, allowing CPU miners to enumerate the optimal nonce-set on the fly. Testing shows a 300% - 400% advantage for serial mining over batch mining, which indicates a proportional CPU bias.
|
|
|
+
|
|
|
+#### Memory Complexity
|
|
|
+
|
|
|
+RandomHash is memory-light in order to support low-end hardware. A CPU will only need 300KB of memory to verify a hash. Unlike RandomHash, mining does not consume additional memory since the cached nonces are fully evaluated.
|
|
|
+
|
|
|
+#### GPU Resistance
|
|
|
+
|
|
|
+GPU performance is generally driven by parallel execution of identical non-branching code-blocks across private regions of memory. RandomHash2 is a highly serial and recursive algorithm requiring a lot of executive-decision making, and decisions driven by Mersenne Twister random number generator. These characteristics make GPU implementations quite tedious and inefficient. Since the predecessor algorithm was shown to be GPU resistant, and this algorithm only exarcerbates these characteristics (except for memory hardness), it is expected that GPU resistance is maintained, although not confirmed as of the writing of this PIP.
|
|
|
+
|
|
|
+#### ASIC Resistance
|
|
|
+
|
|
|
+ASIC-resistance is fundamentally achieved on an economic basis. Due to the use of 18 sub-hash algorithms and the use of recursion in the core algorithm, it is expected that the R&D costs of a RandomHash ASIC will mirror that of building 18 independent ASICs rather than 1. This moves the economic viability goal-posts away by an order of magnitude. For as long as the costs of general ASIC development remain in relative parity to the costs of consumer grade CPUs as of today, a RandomHash ASIC will always remain "not worth it" for a "rational economic actor".
|
|
|
+
|
|
|
+Furthermore, RandomHash offers a wide ASIC-breaking attack surface. This is due to it's branch-heavy, serial, recursive nature and heavy dependence on sub-algorithms. By making minor tweaks to the high-level algorithm, or changing a sub-algorithm, an ASIC design can be mostly invalidated and sent back the drawing board with minimal updates to the CPU miner.
|
|
|
+
|
|
|
+This is true since ASIC designs tend to mirror the assembly structure of an algorithm rather than the high-level algorithm itself. Thus by making relatively minor tweaks at the high-level that necessarily result in significant low-level assembly restructuring, an ASIC design is made obsolete. So long as this "tweak-to-break-ASIC" policy is maintained by the PascalCoin Developers and Community, ASIC resistance is guaranteed.
|
|
|
+
|
|
|
+### Hard-Fork Activation
|
|
|
+
|
|
|
+The PIP requires a hard-fork activation involving various aspects discussed below.
|
|
|
+
|
|
|
+## Rationale
|
|
|
+
|
|
|
+Aside from a hash algorithm change, the only other known option to resolve slow validation time is to ship the client with precomputed lookup tables to speed up verification. This has already been done for RandomHash1 periods, but is not a viable option long-term.
|
|
|
+
|
|
|
+## Backwards Compatibility
|
|
|
+
|
|
|
+This PIP is not backwards compatible and requires a hard-fork activation. Previous hashing algorithm must be retained in order to validate blocks mined prior to the hard-fork.
|
|
|
+
|
|
|
+## Reference Implementation
|
|
|
+
|
|
|
+A reference implementation of RandomHash can be found [here][2].
|
|
|
+
|
|
|
+## Links
|
|
|
+
|
|
|
+1. [PIP-0009 RandomHash: GPU & ASIC Resistant Hash Algorithm][1]
|
|
|
+2. [RandomHash2 Reference Implementation][2]
|
|
|
+
|
|
|
+[1]: https://github.com/PascalCoin/PascalCoin/blob/master/PIP/PIP-0009.md
|
|
|
+[2]: https://github.com/PascalCoin/PascalCoin/blob/master/src/core/URandomHash2.pas
|