Skip to content
Snippets Groups Projects

WIP: RFC 1 : Abstract Syntax Tree-based output script locks

Closed nanocryk requested to merge dip/0001 into master
Compare and
1 file
+ 264
0
Compare changes
  • Side-by-side
  • Inline
DIP0001.md 0 → 100644
+ 264
0
 
```
 
DIP: 1
 
Title: New transaction output script system using Abstract Syntaxic Trees
 
Author: Nanocryk <nanocryk@gmail.com>
 
Status: Proposition
 
Type: Hard-fork (transaction document)
 
Created: 2017-11-22
 
License: GPL-3
 
```
 
 
**This document is currently being written and thus it's not complete. Specifications may change in the future.**
 
 
# Abstract
 
 
In the current Duniter implementation input and output conditions are stored in machine-readable BNF text format. This is good for human readability, but it requires parsing and takes place due to its textual format.
 
 
To lower blocksize and transactions weights; or to allow more complex condtions, it could be preferable to use a more simple binary format which need little or no parsing, close to memory and that can be converted back to text format. It can also allow to add new features without hard-forking the protocol.
 
 
# Script AST exemples
 
 
A output script condition can be represented as a tree composed on different operators as branches, and input data as leaves.
 
 
With an operator `ED(pubkey, signature)` we can check if the current document is signed with the provided public key :
 
 
```
 
ED
 
------------
 
| |
 
pubkey signature
 
```
 
 
With a `OR(branches)` we can test 1 out of 2 signatures :
 
 
```
 
OR
 
----------------------
 
ED ED
 
| |
 
------------ ------------
 
| | | |
 
pubkey signature pubkey signature
 
```
 
 
With `SUM(branches)` and `GE(a,b)` operators when can check easely for a 2 out of 3 signatures :
 
 
```
 
GE
 
---------
 
SUM 2
 
|
 
-------------------------------------------
 
ED ED ED
 
| | |
 
------------ ------------ ------------
 
pubkey signature pubkey signature pubkey signature
 
```
 
 
# Binary representation
 
 
As seen in exemples above, we have to manage different types of tree nodes :
 
 
- **Operators** : above some branches and operate on them
 
- **Local values of different types** : leaf nodes with values stored in the script (like pubkeys)
 
- **Extern values** : leaf nodes with values provided as input to the script (like signatures)
 
 
We'll also need a valid way to not provide an ununsed/undefined input, and another one to provide the hash of an AST.
 
 
We can define opcodes to encode this information :
 
 
| Code | Name | Description | Parameters
 
|:------:|:----------:|:-----------------|:-------------
 
| `0x00` | `NOP` | Nothing/void | None
 
| `0x01` | `EXT` | Extern value | `<input index>`
 
| `0x02` | `MASTH` | MAST hash | `<AST>`
 
| | | |
 
| `0x08` | `U8` | 1 byte value | `<value>`
 
| `0x09` | `U16` | 2 bytes value | `<value>`
 
| `0x0a` | `U32` | 4 bytes value | `<value>`
 
| `0x0b` | `U64` | 8 bytes value | `<value>`
 
| `0x0c` | `U128` | 16 bytes value | `<value>`
 
| `0x0d` | `U256` | 32 bytes value | `<value>`
 
| `0x0e` | `U512` | 64 bytes value | `<value>`
 
| `0x0f` | `U1024` | 128 bytes value | `<value>`
 
 
All other AST opcodes are defined with this syntax :
 
`<opcode> <children count> <children data>`
 
 
To allow as many opcodes as we need, we define opcodes in range `[0x03;0xfe]`. With `0xff`, the next byte is accounted in opcode resolution, and can again be `0xff`. Thus `0xffff14` is a valid opcode.
 
 
# List of operators
 
 
| Code | Name | Description | Return type | Children
 
|:------:|:----------:|:------------|:-----------:|:--------
 
| `0x10` | `NOT` | Binary NOT | Boolean (U8)| 1 boolean (U8)
 
| `0x11` | `OR` | Binary OR | Boolean (U8)| Any amount of booleans (U8)
 
| `0x12` | `AND` | Binary AND | Boolean (U8)| Any amount of booleans (U8)
 
| `0x13` | `XOR` | Binary XOR | Boolean (U8)| 2 boolean (U8)
 
| | | | |
 
| `0x20` | `EQ` | == | Boolean (U8)| Any numbers of the same size
 
| `0x21` | `LT` | < | Boolean (U8)| 2 numbers of same size
 
| `0x22` | `GT` | > | Boolean (U8)| 2 numbers of same size
 
| `0x23` | `LE` | <= | Boolean (U8)| 2 numbers of same size
 
| `0x24` | `GE` | >= | Boolean (U8)| 2 numbers of same size
 
| | | | |
 
| `0x30` | `SUM` | Sum | Number (size of inputs) | Any numbers of the same size
 
| | | | |
 
| `0x50` | `ATL`| Verify absolute time lock | Boolean (U8) | Timestamp (UNIX seconds)(U64)
 
| `0x51` | `RTL`| Verify relative time lock | Boolean (U8) | Delay since insertion in blockchain (seconds)(U64)
 
| `0x52` | `SMR` | Run script with merkle root | Boolean (U8) | Hash (U32), AST
 
| | | | |
 
| `0x70` | `PH` | Verify password hash | Boolean (U8) | Hash (U32), password (U32)
 
| `0x71` | `ED` | Verify Ed25519 signature | Boolean (U8) | Public key (U32), signature (U64)
 
| `0x72` | `EDPKH` | Verify Ed25519 signature on pubkey hash | Boolean (U8) | Pubkey hash (U32), Public key (U32), signature (U64)
 
 
 
 
We can now express our previous *visual* trees as combinaisons of opcodes :
 
 
- Simple signature :
 
 
```
 
ED 2
 
U32 <public_key>
 
EXT 0 # Signature in parameter 0
 
```
 
 
- 1 out of 2 signatures :
 
 
```
 
OR 2
 
ED 2
 
U32 <public_key_a>
 
EXT 0 # Signature in parameter 0
 
ED 2
 
U32 <public_key_b>
 
EXT 1 # Signature in parameter 1
 
```
 
 
- 2 out of 3 signatures :
 
 
```
 
GE 2
 
SUM 3
 
ED 2
 
U32 <public_key_a>
 
EXT 0 # Signature in parameter 0
 
ED 2
 
U32 <public_key_b>
 
EXT 1 # Signature in parameter 1
 
ED 2
 
U32 <public_key_c>
 
EXT 2 # Signature in parameter 2
 
U8 2
 
```
 
 
# Input parameters
 
 
Input parameters are provided in a list of AST with indexes.
 
 
In the exemple of 2 out of 3 signatures above, the input could be :
 
 
```
 
3
 
U64 <signature_a>
 
NOP
 
U64 <signature_c>
 
```
 
 
In the document, they will be expressed as `<size on 2 bytes> <ast1> <ast2> ...`.
 
 
If a script try to access to an out of bound parameter, `NOP` will be used.
 
If an operator get invalid inputs (not the correct size or `NOP`), it will return an *undefined value*.
 
 
# Undefined operators
 
 
Undefined operators returns an *undefined value*. *Undefined values* propagate up when used by all operators expect with `OR` operator if another *defined value* is true.
 
 
## Exemple
 
 
```
 
OR
 
--------------------------------
 
| |
 
ED AND
 
------------ --------------------
 
| | | |
 
pubkey signature ED UNDEFINED
 
------------
 
| |
 
pubkey signature
 
```
 
 
Since the `AND` branch is dependent of all of its children, the **undefined state** propagate up to the `AND`.
 
 
```
 
OR
 
--------------------------------
 
| |
 
ED UNDEFINED
 
------------
 
| |
 
pubkey signature
 
```
 
 
If the left signature verification returns true, the `OR` operator returns true. Otherwise it will propagate the **undefined state**.
 
 
If at the end the returned value is `UNDEFINED`, the execution is concidered invalid and the transaction can't be spent.
 
 
# Merklized trees
 
 
+4
The next step now is to transform our script tree into a Merkle Tree. We can hash each leaf with `SHA256(leaf)`, and then each parent with `SHA256(parent hash + leaf hash + leaf hash + ...)`, recursivly until one hash is left : the Merkle Root hash.
 
 
This Merkle Root can be used to store a script in an output condition only with a hash. Then when spending, we rebuild the Merkle Root of the given script tree to check if this is the correct script. We can also replace unused branches in the script (while using an `OR` operator) by their hash, reducing the data size and hiding perhaps secret conditions; while keeping the same Merkle Root.
 
 
## Exemple
 
 
Let's make a 2 out of 3 signatures with a MAST. This script is made of this AST :
 
 
```
 
GE 2
 
SUM 3
 
ED 2
 
U32 <public_key_a>
 
EXT 0 # Signature in parameter 0
 
ED 2
 
U32 <public_key_b>
 
EXT 1 # Signature in parameter 1
 
ED 2
 
U32 <public_key_c>
 
EXT 2 # Signature in parameter 2
 
U8 2
 
```
 
 
 
We calculate the Merkle Root `<merkle root>`, and then build the final output script condition :
 
 
```
 
SMR 2
 
U32 <merkle_root>
 
EXT 3 # Script AST
 
# We need to use index 3 because 0 to 2
 
# are already used by the hashed script
 
```
 
 
Then when you want to use it, you provide as parameters :
 
 
```
 
4
 
U64 <signature_a>
 
NOP
 
U64 <signature_c>
 
GE 2
 
SUM 3
 
ED 2
 
U32 <public_key_a>
 
EXT 0 # Signature in parameter 0
 
MASTH <hash of this unused branch>
 
ED 2
 
U32 <public_key_c>
 
EXT 2 # Signature in parameter 2
 
U8 2
 
```
 
 
When parameter 3 will be used with `SMR` operator, the merkle root will be computed. If the operator `MASTH` is found, its hash parameter will be used instead of hashing the operator. Then providing the hash or the children gives the same hash, and thus the same Merkle Root.
 
\ No newline at end of file
Loading