- #1 (closed) Abstract Syntax Tree-based output script locks
- Abstract
- Table of Contents
- 1. Basic script structure
- 1.1. Scripts as ASTs
- 1.2. Signatures (SIG)
- 1.3 Unlock inputs (INPUT)
- 1.4. Avoid redudancies (INCLUDE)
- 1.5. Avoid providing unused branches (AST_HASH/EXEC_AST_HASH)
- 2. Binary format
- 2.1. Document syntax
- 2.2. Values
- 2.3. Operators
- 2.3.1. Language operators ([0x00;0x0f])
- 2.3.2. Document data ([0x10;0x2f])
- 2.3.3. Boolean operators ([0x30;0x3f])
- 2.3.4. Arithmetic operators ([0x40;0x6f])
- 2.3.5. Locks ([0x70;0x9f])
- 2.4. AST group format
- 2.5. Parameter group format
- 3. Examples
- 3.1. Simple signature
- 3.2. 2 out of 3 signatures (simple)
- 3.3. 2 out of 3 signatures (merkelized + includes)
#1 (closed) Abstract Syntax Tree-based output script locks
RFC: 1 (Update 2)
Title: Abstract Syntax Tree-based output script locks
Type: Protocol change (Hardfork)
Parent: None
Status: WIP
Author: nanocryk <nanocryk@gmail.com>
Created: 2017-11-22
Last edited: 2017-12-20
License: AGPL-3
This document is currently being written and thus it's not complete. Specifications may change in the future.
Abstract
In the current Duniter protocol input and output conditions are stored in machine-readable BNF text format. This is good for human readability, but it requires parsing and takes place due to its textual format.
To lower blocksize and transactions weights, or to allow more complex conditions; it could be preferable to use a more simple binary format which need little or no parsing, close to memory and that can be converted back to text format. It can also allow to add new features without hard-forking the protocol.
Table of Contents
1. Basic script structure
1.1. Scripts as ASTs
An output script allow to define conditions on how funds can be spent. The most basic case is to lock funds with a given public key, and requiring a signature (made with the associated private key) to use them as input of a new transaction. Since we define a condition, we can express it as a boolean function.
We want to have multiple ways of spending an output, such as multi-signatures or time locks. It means we must have operators which combines conditions into one. One way to store these functions are as Abstract Syntax Trees : Each operator is a tree with its inputs as leaves. This leaves can be primitive values or other operators.
For a classic 1-signature lock, an AST could look like :
SIG
|
-------------
PublicKey Signature
-
SIG
would be an operator verifying the association between the signature and the public key. -
PublicKey
is stored in the script. -
Signature
is provided as an input when unlocking funds.
If we want a 1 out of 2 signatures, we could have :
OR
|
----------------------------
SIG SIG
| |
--------------- --------------
PublicKeyA SignatureA PublicKeyB SignatureB
-
OR
would be an operator returningTrue
if at least one input isTrue
.
For a 2 out of 3 signatures, we could have :
GREAT_EQUAL
-----------
SUM 2
|
----------------------------------------------------
SIG SIG SIG
| | |
--------------- -------------- ---------------
PublicKeyA SignatureA PublicKeyB SignatureB PublicKeyC SignatureC
-
SUM
would be an operator computing the sum of its leaves. -
GREAT_EQ
would be an operator comparing if the left leaf is greater or equal to the right leaf.
SIG
)
1.2. Signatures (We want our system to cover as many cases as possible. Currently the Duniter protocol use only Ed25519 signatures, but it could in the future use othe schemes such as Schnorr signatures.
We also want the impossibility to change a script defined as an unlock condition, so the content of the script must be signed, and this signature required to unlock the funds.
It means multiple things :
- Our
SIG
operator should be usable with multiple cryptographic systems. We must have a parameter provided which system is used. - Since keys and signatures lengths may vary from a system to another, it should accept inputs of any length.
- We'll use document signatures as signature source. We need an operator to fetch this data from the document.
With these requirements we have :
-
SIG
operator defined asSIG(System, PublicKey, Signature)
. - An operator capable of providing data of any length.
- An operator capable of fetching data outside of script.
INPUT
)
1.3 Unlock inputs (When a user want to spend an output, it must provide inputs making the lock script returns true.
They are stored as list of pairs of indexes and ASTs.
A script can use an INPUT(Index)
operator returning the input.
INCLUDE
)
1.4. Avoid redudancies (Some parts of a script may required in multiple places the same computed result. To avoid redudancies, we allow to provide an script with multiple ASTs. Each AST is executed in order and only the output of the last one is taken as the result. An AST can use the result of an above script with an INCLUDE
operator. With this structure it's not possible to have recursion or cycles which avoid infinite loops and Turing-Completeness. All ASTs are grouped into a single structure (an AST group or a script).
AST_HASH
/EXEC_AST_HASH
)
1.5. Avoid providing unused branches (It could be bad for privacy to expose a complete script with all of the possibilities of spending an output. With merkelized scripts, we allow the user to provide only the hash of unused branches which can be used to prove it's the same script while not providing its complete definition.
We define an AST hash (on 32 bytes) as HASH(AST) = SHA256(OperatorCode + HASH(Child0) + HASH(Child1) + ...)
. Since we want to use provide only the hash of a branch while keeping the script valid, we defined an operator AST_HASH(Hash)
for which HASH(AST_HASH(Hash)) = Hash
. To allow only hashes of unused branches, this operator will return an Undefined
value and should be used with aggregators such as OR
or SUM
.
Because we are using groups, we can define the group hash as GHASH(Group) = SHA256(HASH(Group.0) + HASH(Group.1) + ...)
Then we can define a EXEC_AST_HASH
operator taking the group and it's GHASH
. It will first verify that the hash is correct, then execute it.
2. Binary format
2.1. Document syntax
- Operators are written in CAPS.
- Arguments are written as
<Name:Type>
. - Type can be :
- An in-place sized value (in bytes)
1
,2
, ... - An AST returning a sized value :
AST:1
,AST:2
, ...
- An in-place sized value (in bytes)
Non fixed sizes can be expressed as ?1
, ?2
, etc. Multiple parameters with the same tag ?1
mut have the same size.
The script is invalid if operator arguments have wrong sizes.
These rules only apply for this document and won't appear in the binary format. They just set rules to respect at script validation and execution.
2.2. Values
All values are raw arrays of bytes. Their meaning depends of their usage and with which operators they'll be used.
There is an alias B
for boolean values of size 1
. Value 0
means False
, and any other value means True
. Boolean operators however must return 0
or 1
. This allow to easily sum boolean results together.
2.3. Operators
We store an operator as a 1 byte value representing its type :
[0x00;0x0f]
)
2.3.1. Language operators (-
0x00
orVALUE:Size <Size:1> <Value:Size>
Return givenValue
of givenSize
.
Value
must have a size ofSize
bytes or the script is unreadable and thus invalid. -
0x01
orINPUT:? <Index:1>
Return an input parameters as an AST.
It this input don't exist,Undefined
is returned. If this signature don't exist,Undefined
is returned. -
0x03
orINCLUDE:? <Index:1>
Return theVALUE
returned by another script. -
0x04
orAST_HASH:32 <Hash:32>
ReturnUndefined
.
It's hash is the providedHash
. -
0x05
orEXEC_AST_HASH:? <Hash:32> <Script:AST:1>
Return script resultVALUE
is the hash is correct,Undefined
otherwise. -
0x06
orRESIZE:Size <Size:1> <Value:AST:?>
Resize the value by truncating or adding padding (0x00
).
[0x10;0x2f]
)
2.3.2. Document data (-
0x10
orFETCH_SIG:32 <Index:1>
Return the signature (hash) from the document signatures. This hash will be used to retrieve the real signature by theSIG
operator. -
0x11
orFETCH_LOCKTIME:8
Return the timestamp of the locking transaction. -
0x12
orFETCH_UNLOCKTIME:8
Return the timestamp of the (new) unlocking transaction.
[0x30;0x3f]
)
2.3.3. Boolean operators (-
0x30
orNOT:B <Script:AST:B>
ReturnFalse
ifScript
returns true,True
otherwise. -
0x31
orNORM:B
<Script:AST:B>
ReturnFalse
ifScript
returns false,True(1)
otherwise.
Allow to normalise anyTrue(...)
value toTrue(1)
. -
0x32
orOR:B <Count:1> <Script0:AST:B> <Script1:AST:B> ...
ReturnTrue
if at least one child returnsTrue
,False
otherwise.
Undefined
values are consideredFalse
. -
0x33
orAND:B <Count:1> <Script0:AST:B> <Script1:AST:B> ...
ReturnFalse
if at least one child returnsFalse
,True
otherwise. -
0x34
orXOR:B <Script0:AST:B> <Script1:AST:B>
ReturnTrue
if one but not both returnsTrue
.
[0x40;0x6f]
)
2.3.4. Arithmetic operators (-
0x40
orEQUAL:B <Script0:AST:?> <Script1:AST:?>
True
if values are equal,False
otherwise. -
0x41
orNEQUAL:B <Script0:AST:?> <Script1:AST:?>
True
if values are not equal,False
otherwise. -
0x42
orLESS:B <Script0:AST:?> <Script1:AST:?>
True
ifScript0
return value is less thanScript1
return value. -
0x43
orGREAT:B <Script0:AST:?> <Script1:AST:?>
True
ifScript0
return value is greater thanScript1
return value. -
0x44
orLESS_EQUAL:B <Script0:AST:?> <Script1:AST:?>
True
ifScript0
return value is less or equal thanScript1
return value. -
0x45
orGREAT_EQUAL:B <Script0:AST:?> <Script1:AST:?>
True
ifScript0
return value is greater or equal thanScript1
return value. -
0x50
orSUM:?1 <Count:1> <Script0:AST:?1> <Script1:AST:?1> ...
Return the sum of given values and discard any overflow. If you don't want overflows, try toRESIZE
values first.Undefined
values are ignored.
[0x70;0x9f]
)
2.3.5. Locks (-
0x70
orPASS:B <PasswordHash:32> <Password:AST:32>
ReturnTrue
ifPasswordHash = SHA256(Password)
.
Password
must be 32 bytes long, so the original password should be hashed a first time to havePassword
, then a second time to havePasswordHash
. -
0x71
orSIG:B <Cryptosystem:2> <PublicKeyHash:32> <SignatureHash:AST:32>
Verify the signature of the transaction with the public key. To avoid dynamic sizes, only the hash is provided and will allow to fetch the real signature.
2.4. AST group format
An AST group is a list of ASTs of a fixed size.
3
<AST 0>
<AST 1>
<AST 2>
They are executed in order and a AST can INCLUDE
a result from an previous AST by its zero-based index.
The last AST must return a 1 byte boolean value (0
is False
, True
otherwise).
This is the structure storing the unlocking conditions of an ouput.
2.5. Parameter group format
A parameter group is a map of AST groups associated with their own parameter group.
3 (3 elements)
0 <Parameter group for 0> <Group 0>
1 <Parameter group for 1> <Group 1>
2 <Parameter group for 2> <Group 2>
This structure allow groups to be simple values or complex merkelized scripts with includes. Indices are provided to avoid listing unused parameters. INPUT
and INCLUDE
in a script will fetch in the same level group/parameter group.
It can become very complicated, there are examples below.
3. Examples
3.1. Simple signature
All
< >
in these examples are placeholders and must be replaced with appropriate data.
Lock :
1 (1 ast)
SIG
0 (if 0 corresponds to Ed25519)
<PublicKeyHash> (on 32 bytes)
INPUT 0 (signature on 32 bytes)
Lock in hex :
01
71 00 <PublicKeyHash> 01 00
Unlocking :
1 (1 parameter)
1 (parameter 1)
0 (no parameters in this group)
1 (1 ast)
FETCH_SIG 0
Unlocking in hex :
01
01
00
01
10 00
3.2. 2 out of 3 signatures (simple)
Lock :
1
GREAT_EQUAL
SUM 3
SIG 0 <PKHash0> INPUT 0
SIG 0 <PKHash1> INPUT 1
SIG 0 <PKHash2> INPUT 2
2
Lock in hex :
01
45
50 03
71 00 <PKHash0> 01 00
71 00 <PKHash1> 01 01
71 00 <PKHash2> 01 02
02
Unlocking :
2 (2 parameters)
0 (parameter 0)
0 (no parameters in this group)
1 (1 ast)
FETCH_SIG 0 (doc signature 0)
2 (parameter 2)
0 (no parameters in this group)
1 (1 ast)
FETCH_SIG 1 (doc signature 1)
Unlocking in hex :
02
00
00
01
10 00
02
00
01
10 01
3.3. 2 out of 3 signatures (merkelized + includes)
Node : This one is overcomplicated and is provided as an example of what it's possible to do. However it's very simple compared to the far more complex contracts you can make with that.
Original Merkelized AST :
4 (4 asts)
SIG 0 <PKHash0> INPUT 0 (sig 0)
SIG 0 <PKHash1> INPUT 1 (sig 1)
SIG 0 <PKHash2> INPUT 2 (sig 2)
GREAT_EQUAL
SUM 3
INCLUDE 0 (include above script 0)
INCLUDE 1 (include above script 1)
INCLUDE 2 (include above script 2)
2
Original Merkelized AST in hex :
04
71 00 <PKHash0> 01 00
71 00 <PKHash1> 01 01
71 00 <PKHash2> 01 02
45
50 03
03 00
03 01
03 02
02
Let's consider <MHASH>
as the hash of the AST above.
Lock :
1 (1 ast)
EXEC_AST_HASH (execute correct script in param 0)
<MHASH>
INPUT 0
Lock in hex :
01
05 <MHASH> 01 00
This lock is contained in the transaction, and the merkelized AST need to be stored offchain. It can then be only known by the participants and used partialy at unlock as shown below.
Unlock :
1 (1 parameter (the merklized script))
0 (our script)
2 (2 parameters for our script (signatures))
0 (parameter 0)
0 (no parameters in this group)
1 (1 ast)
FETCH_SIG 0 (doc signature 0)
2 (parameter 2)
0 (no parameters in this group)
1 (1 ast)
FETCH_SIG 1 (doc signature 1)
4 (our script have 4 asts)
SIG 0 <PKHash0> INPUT 0 (sig 0)
AST_HASH <Hash> (we provide only the hash of the unused AST)
SIG 0 <PKHash2> INPUT 2 (sig 2)
GREAT_EQUAL
SUM 3
INCLUDE 0 (include above script 0)
INCLUDE 1 (include above script 1 / could be replaced with an AST_HASH too)
INCLUDE 2 (include above script 2)
2
Unlock in hex :
Same procedure as before
Notes :
In this exemple merkelized scripts allow to hide each public keys needed to spend the output, and PKHash1
is never sent on the blockchain. Also, it's impossible to know what was the content of the branch, so it could have been any test of any complexity and nobody will never now it by looking at the blockchain.
In this case it takes more place than the classic way, but for far more complicated contracts it'll be shorter to provide just a 32 bytes hash than the complete branch.