Skip to content
Snippets Groups Projects
Select Git revision
  • dewif
  • ws2p_v2
  • dubp_v13
  • master default protected
  • authentication_file_format
  • graphql_api_rfc
  • messages_encryption_and_signature
  • ascii_armor_messages
  • ws2p_v1
  • rfc5-duniter-protocol-rework
  • dip/0001
  • dip/0002
12 results

0001 Abstract Syntax Tree-based output script locks.md

Blame
  • #1 (closed) Abstract Syntax Tree-based output script locks

    RFC: 1 (Update 2)
    Title: Abstract Syntax Tree-based output script locks
    Type: Protocol change (Hardfork)
    Parent: None
    Status: WIP
    Author: nanocryk <nanocryk@gmail.com>
    Created: 2017-11-22
    Last edited: 2017-12-20
    License: AGPL-3

    This document is currently being written and thus it's not complete. Specifications may change in the future.

    Abstract

    In the current Duniter protocol input and output conditions are stored in machine-readable BNF text format. This is good for human readability, but it requires parsing and takes place due to its textual format.

    To lower blocksize and transactions weights, or to allow more complex conditions; it could be preferable to use a more simple binary format which need little or no parsing, close to memory and that can be converted back to text format. It can also allow to add new features without hard-forking the protocol.

    Table of Contents

    1. Basic script structure
      1. Scripts as ASTs
      2. Signatures (SIG)
      3. Unlock inputs (INPUT)
      4. Avoid redudancies (INCLUDE)
      5. Avoid providing unused branches (AST_HASH/EXEC_AST_HASH)
    2. Binary format
      1. Document syntax
      2. Values
      3. Operators
        1. Language operators ([0x00;0x0f])
        2. Document data ([0x10;0x2f])
        3. Boolean operators ([0x30;0x3f])
        4. Arithmetic operators ([0x40;0x6f])
        5. Locks ([0x70;0x9f])
        6. AST group format
        7. Parameter group format
    3. Examples
      1. Simple signature
      2. 2 out of 3 signatures (simple)
      3. 2 out of 3 signatures (merkelized + includes)

    1. Basic script structure

    1.1. Scripts as ASTs

    An output script allow to define conditions on how funds can be spent. The most basic case is to lock funds with a given public key, and requiring a signature (made with the associated private key) to use them as input of a new transaction. Since we define a condition, we can express it as a boolean function.

    We want to have multiple ways of spending an output, such as multi-signatures or time locks. It means we must have operators which combines conditions into one. One way to store these functions are as Abstract Syntax Trees : Each operator is a tree with its inputs as leaves. This leaves can be primitive values or other operators.

    For a classic 1-signature lock, an AST could look like :

               SIG
                |
          -------------
      PublicKey   Signature
    • SIG would be an operator verifying the association between the signature and the public key.
    • PublicKey is stored in the script.
    • Signature is provided as an input when unlocking funds.

    If we want a 1 out of 2 signatures, we could have :

                              OR
                              |
                 ----------------------------
                SIG                        SIG
                 |                          |
          ---------------            --------------
      PublicKeyA   SignatureA   PublicKeyB   SignatureB
    • OR would be an operator returning True if at least one input is True.

    For a 2 out of 3 signatures, we could have :

                                            GREAT_EQUAL
                                            -----------
                                           SUM        2
                                            |
                 ----------------------------------------------------
                SIG                        SIG                      SIG
                 |                          |                        |
          ---------------            --------------           ---------------
      PublicKeyA   SignatureA   PublicKeyB   SignatureB   PublicKeyC   SignatureC
    • SUM would be an operator computing the sum of its leaves.
    • GREAT_EQ would be an operator comparing if the left leaf is greater or equal to the right leaf.

    1.2. Signatures (SIG)

    We want our system to cover as many cases as possible. Currently the Duniter protocol use only Ed25519 signatures, but it could in the future use othe schemes such as Schnorr signatures.

    We also want the impossibility to change a script defined as an unlock condition, so the content of the script must be signed, and this signature required to unlock the funds.

    It means multiple things :

    • Our SIG operator should be usable with multiple cryptographic systems. We must have a parameter provided which system is used.
    • Since keys and signatures lengths may vary from a system to another, it should accept inputs of any length.
    • We'll use document signatures as signature source. We need an operator to fetch this data from the document.

    With these requirements we have :

    • SIG operator defined as SIG(System, PublicKey, Signature).
    • An operator capable of providing data of any length.
    • An operator capable of fetching data outside of script.

    1.3 Unlock inputs (INPUT)

    When a user want to spend an output, it must provide inputs making the lock script returns true. They are stored as list of pairs of indexes and ASTs. A script can use an INPUT(Index) operator returning the input.

    1.4. Avoid redudancies (INCLUDE)

    Some parts of a script may required in multiple places the same computed result. To avoid redudancies, we allow to provide an script with multiple ASTs. Each AST is executed in order and only the output of the last one is taken as the result. An AST can use the result of an above script with an INCLUDE operator. With this structure it's not possible to have recursion or cycles which avoid infinite loops and Turing-Completeness. All ASTs are grouped into a single structure (an AST group or a script).

    1.5. Avoid providing unused branches (AST_HASH/EXEC_AST_HASH)

    It could be bad for privacy to expose a complete script with all of the possibilities of spending an output. With merkelized scripts, we allow the user to provide only the hash of unused branches which can be used to prove it's the same script while not providing its complete definition.

    We define an AST hash (on 32 bytes) as HASH(AST) = SHA256(OperatorCode + HASH(Child0) + HASH(Child1) + ...). Since we want to use provide only the hash of a branch while keeping the script valid, we defined an operator AST_HASH(Hash) for which HASH(AST_HASH(Hash)) = Hash. To allow only hashes of unused branches, this operator will return an Undefined value and should be used with aggregators such as OR or SUM.

    Because we are using groups, we can define the group hash as GHASH(Group) = SHA256(HASH(Group.0) + HASH(Group.1) + ...)

    Then we can define a EXEC_AST_HASH operator taking the group and it's GHASH. It will first verify that the hash is correct, then execute it.

    2. Binary format

    2.1. Document syntax

    • Operators are written in CAPS.
    • Arguments are written as <Name:Type>.
    • Type can be :
      • An in-place sized value (in bytes) 1, 2, ...
      • An AST returning a sized value : AST:1, AST:2, ...

    Non fixed sizes can be expressed as ?1, ?2, etc. Multiple parameters with the same tag ?1 mut have the same size.

    The script is invalid if operator arguments have wrong sizes.

    These rules only apply for this document and won't appear in the binary format. They just set rules to respect at script validation and execution.

    2.2. Values

    All values are raw arrays of bytes. Their meaning depends of their usage and with which operators they'll be used.

    There is an alias B for boolean values of size 1. Value 0 means False, and any other value means True. Boolean operators however must return 0 or 1. This allow to easily sum boolean results together.

    2.3. Operators

    We store an operator as a 1 byte value representing its type :

    2.3.1. Language operators ([0x00;0x0f])

    • 0x00 or VALUE:Size <Size:1> <Value:Size>
      Return given Value of given Size.
      Value must have a size of Size bytes or the script is unreadable and thus invalid.

    • 0x01 or INPUT:? <Index:1>
      Return an input parameters as an AST.
      It this input don't exist, Undefined is returned. If this signature don't exist, Undefined is returned.

    • 0x03 or INCLUDE:? <Index:1>
      Return the VALUE returned by another script.

    • 0x04 or AST_HASH:32 <Hash:32>
      Return Undefined.
      It's hash is the provided Hash.

    • 0x05 or EXEC_AST_HASH:? <Hash:32> <Script:AST:1>
      Return script result VALUE is the hash is correct, Undefined otherwise.

    • 0x06 or RESIZE:Size <Size:1> <Value:AST:?>
      Resize the value by truncating or adding padding (0x00).

    2.3.2. Document data ([0x10;0x2f])

    • 0x10 or FETCH_SIG:32 <Index:1>
      Return the signature (hash) from the document signatures. This hash will be used to retrieve the real signature by the SIG operator.

    • 0x11 or FETCH_LOCKTIME:8
      Return the timestamp of the locking transaction.

    • 0x12 or FETCH_UNLOCKTIME:8
      Return the timestamp of the (new) unlocking transaction.

    2.3.3. Boolean operators ([0x30;0x3f])

    • 0x30 or NOT:B <Script:AST:B>
      Return False if Script returns true, True otherwise.

    • 0x31 or NORM:B <Script:AST:B>
      Return False if Script returns false, True(1) otherwise.
      Allow to normalise any True(...) value to True(1).

    • 0x32 or OR:B <Count:1> <Script0:AST:B> <Script1:AST:B> ...
      Return True if at least one child returns True, False otherwise.
      Undefined values are considered False.

    • 0x33 or AND:B <Count:1> <Script0:AST:B> <Script1:AST:B> ...
      Return False if at least one child returns False, True otherwise.

    • 0x34 or XOR:B <Script0:AST:B> <Script1:AST:B>
      Return True if one but not both returns True.

    2.3.4. Arithmetic operators ([0x40;0x6f])

    • 0x40 or EQUAL:B <Script0:AST:?> <Script1:AST:?>
      True if values are equal, False otherwise.

    • 0x41 or NEQUAL:B <Script0:AST:?> <Script1:AST:?>
      True if values are not equal, False otherwise.

    • 0x42 or LESS:B <Script0:AST:?> <Script1:AST:?>
      True if Script0 return value is less than Script1 return value.

    • 0x43 or GREAT:B <Script0:AST:?> <Script1:AST:?>
      True if Script0 return value is greater than Script1 return value.

    • 0x44 or LESS_EQUAL:B <Script0:AST:?> <Script1:AST:?>
      True if Script0 return value is less or equal than Script1 return value.

    • 0x45 or GREAT_EQUAL:B <Script0:AST:?> <Script1:AST:?>
      True if Script0 return value is greater or equal than Script1 return value.

    • 0x50 or SUM:?1 <Count:1> <Script0:AST:?1> <Script1:AST:?1> ...
      Return the sum of given values and discard any overflow. If you don't want overflows, try to RESIZE values first. Undefined values are ignored.

    2.3.5. Locks ([0x70;0x9f])

    • 0x70 or PASS:B <PasswordHash:32> <Password:AST:32>
      Return True if PasswordHash = SHA256(Password).
      Password must be 32 bytes long, so the original password should be hashed a first time to have Password, then a second time to have PasswordHash.

    • 0x71 or SIG:B <Cryptosystem:2> <PublicKeyHash:32> <SignatureHash:AST:32>
      Verify the signature of the transaction with the public key. To avoid dynamic sizes, only the hash is provided and will allow to fetch the real signature.

    2.4. AST group format

    An AST group is a list of ASTs of a fixed size.

    3
    <AST 0>
    <AST 1>
    <AST 2>

    They are executed in order and a AST can INCLUDE a result from an previous AST by its zero-based index.

    The last AST must return a 1 byte boolean value (0 is False, True otherwise).

    This is the structure storing the unlocking conditions of an ouput.

    2.5. Parameter group format

    A parameter group is a map of AST groups associated with their own parameter group.

    3   (3 elements)
    0 <Parameter group for 0> <Group 0>
    1 <Parameter group for 1> <Group 1>
    2 <Parameter group for 2> <Group 2>

    This structure allow groups to be simple values or complex merkelized scripts with includes. Indices are provided to avoid listing unused parameters. INPUT and INCLUDE in a script will fetch in the same level group/parameter group.

    It can become very complicated, there are examples below.

    3. Examples

    3.1. Simple signature

    All < > in these examples are placeholders and must be replaced with appropriate data.

    Lock :

    1                     (1 ast)
      SIG
        0                 (if 0 corresponds to Ed25519)
        <PublicKeyHash>   (on 32 bytes)
        INPUT 0           (signature on 32 bytes)

    Lock in hex :

    01
      71 00 <PublicKeyHash> 01 00

    Unlocking :

    1                     (1 parameter)
      1                   (parameter 1)
        0                 (no parameters in this group)
        1                 (1 ast)
          FETCH_SIG 0

    Unlocking in hex :

    01
      01
        00
        01
          10 00

    3.2. 2 out of 3 signatures (simple)

    Lock :

    1
      GREAT_EQUAL
        SUM 3
          SIG 0 <PKHash0> INPUT 0
          SIG 0 <PKHash1> INPUT 1
          SIG 0 <PKHash2> INPUT 2
        2

    Lock in hex :

    01
      45
        50 03
          71 00 <PKHash0> 01 00
          71 00 <PKHash1> 01 01
          71 00 <PKHash2> 01 02
        02

    Unlocking :

    2               (2 parameters)
      0               (parameter 0)
        0             (no parameters in this group)
        1             (1 ast)
          FETCH_SIG 0   (doc signature 0)
    
      2               (parameter 2)
        0             (no parameters in this group)
        1             (1 ast)
          FETCH_SIG 1   (doc signature 1)

    Unlocking in hex :

    02
      00
        00
        01
          10 00
      02
        00
        01
          10 01

    3.3. 2 out of 3 signatures (merkelized + includes)

    Node : This one is overcomplicated and is provided as an example of what it's possible to do. However it's very simple compared to the far more complex contracts you can make with that.

    Original Merkelized AST :

    4                           (4 asts)
      SIG 0 <PKHash0> INPUT 0   (sig 0)
    
      SIG 0 <PKHash1> INPUT 1   (sig 1)
    
      SIG 0 <PKHash2> INPUT 2   (sig 2)
    
      GREAT_EQUAL
        SUM 3
          INCLUDE 0             (include above script 0)
          INCLUDE 1             (include above script 1)
          INCLUDE 2             (include above script 2)
        2
    

    Original Merkelized AST in hex :

    04
      71 00 <PKHash0> 01 00
      71 00 <PKHash1> 01 01
      71 00 <PKHash2> 01 02
      45
        50 03
          03 00
          03 01
          03 02
        02

    Let's consider <MHASH> as the hash of the AST above.

    Lock :

    1                           (1 ast)
      EXEC_AST_HASH               (execute correct script in param 0)
        <MHASH>
        INPUT 0

    Lock in hex :

    01
      05 <MHASH> 01 00

    This lock is contained in the transaction, and the merkelized AST need to be stored offchain. It can then be only known by the participants and used partialy at unlock as shown below.

    Unlock :

    1                               (1 parameter (the merklized script))
      0                               (our script)
        2                               (2 parameters for our script (signatures))
          0                               (parameter 0)
            0                               (no parameters in this group)
            1                               (1 ast)
              FETCH_SIG 0                     (doc signature 0)
    
          2                               (parameter 2)
            0                               (no parameters in this group)
            1                               (1 ast)
              FETCH_SIG 1                     (doc signature 1)
    
        4                                (our script have 4 asts)
          SIG 0 <PKHash0> INPUT 0          (sig 0)
    
          AST_HASH <Hash>                  (we provide only the hash of the unused AST)
    
          SIG 0 <PKHash2> INPUT 2          (sig 2)
    
          GREAT_EQUAL
            SUM 3
              INCLUDE 0             (include above script 0)
              INCLUDE 1             (include above script 1 / could be replaced with an AST_HASH too)
              INCLUDE 2             (include above script 2)
            2

    Unlock in hex :

    Same procedure as before

    Notes :

    In this exemple merkelized scripts allow to hide each public keys needed to spend the output, and PKHash1 is never sent on the blockchain. Also, it's impossible to know what was the content of the branch, so it could have been any test of any complexity and nobody will never now it by looking at the blockchain.

    In this case it takes more place than the classic way, but for far more complicated contracts it'll be shorter to provide just a 32 bytes hash than the complete branch.