Efficient Decentral Governance of Structured Data

Denis Erfurt

1. Abstract

Imagine if you could not only vote for reddit posts, but govern the entire platform with your interactions.

With the emergence of decentral organizations such as Bitcoin, Ethereum or Applications written on top of Ethereum new forms of governance are needed to direct and evolve this organizations. Such schemas need to be transparent, fair (influence is directly proportional to the ownership), accessible (easy to interact with), extendable, proposed candidates have to be previewable, and verifiable and decissions have to be automatically enforcible. Currently this is still an unsolved problem. In this paper, we analyze the problem space and propose a schema which could serve as a building block in the roadmap to a full self-governance solution.

In the presented method, governance is understood as the task for a group of actors (organization) with a clear ownership distribution to deterministically find a consensus out of a set of possible candidates.

Efficient governance tries to optimize the amount of necessary interaction done by each actor to arrive at his desired and reachable state.

The main idea is the tightly coupled relationship of user interactions such as voting and delegations statements with a meta data structure.

The paper is structured in the following way: first we introduce the problem space and a general model for governance in 2. Based on this we introduce an extended model in 3. and discuss the role of data structures for governance, in particular those, which can be described with a regular grammar. In section 4 we present our current proof of concept implementation. In 5. an real world use case is given which illustrates the key value proposition of the presended model. In 4. we present the roadmap for further iterations.

2 general model

2.1 organization

Suppose there is organization $O$ with is owned by Alice, Bob, Charlie and Dave who have agreed to find a name for it. This problem is kind of trivial, but we want to study it in depth to lay out the mathematical foundations behind it.

Every person, who has impact on this decision is member of a set of owners mathematically represented as a set $A$:

$$A = \lbrace Alice, Bob, Charlie, Dave \rbrace$$

The influence of each Member on this decision don't has to be equal. Suppose there are 100 shares in total. The ownership of shares represents the influence each owner has on the outcome. We can define a share function which assigns each owner the number of shares he own.

$$share: A \rightarrow \mathbb{N}$$

The sum of all shares is called the size of this organization: $$|O|: = \sum_{a \in A} share(a)$$

In our example Alice has 40 shares which represents 40%, Bob 30, Charlie 20 and Dave 10.

Now everybody can propose a name:

We can call all proposals - candidates. Our final candidate set looks like the following:

$K=\lbrace "Awesomecorp", "omg\, systems",$ $\rbrace$

Now every owner can rate each candidate:

Awesomecorp omg systems
Alice 1 0.4 0
Bob 0 1 0
Charlie 0 0 1
Dave 1 1 1

The evaluation of the candidates by the owners can be represented by a vote function: $$vote: A\times K \rightarrow [0,1]$$

The organization now contains all information we need, to determine the actual winner:

We define an organization state as a tuple: $$O:=(A,K,<,share,vote)$$

2.2 consensus

We can calculate the actual outcome by considering the votes for each candidate together with the shares of each voter:

The winner candidate is "omg systems" with 2 shares ahead of "Awesomecorp".

Note here that its possible for two candidates to receive the same maximal value. (For example if charlie had voted 0.1 with 20 shares which would put "Awesomecorp" to 52 shares total.) In this case we need a system to decide which candidate actually has won: Since "Awesomecorp" was proposed before "omg systems", we simply will take the oldest candidate with the maximum value: Mathematically speaking we need a strict total order over the candidates:

$$<\subset K\times K$$

This approach gives us a consensus function, which is deterministic on every organization state:

$$consens(O):= min_<(\lbrace k\ |\ value(k) = \max_{k'\in K}(value(k'))\rbrace )$$ $$value(k):=\sum_{a\in A} share(a)*vote(a,k)$$


In order to make voting more efficient we can bring in the concept of transitive voting (or delegation). Suppose in our scenario Dave don't really have the time to evaluate all candidates in this difficult decisions. He also trusts his fried Bob to represent his interest, so he decides to delegate his vote to Bob. Now, every time bob votes or delegates, Daves votes will be taken in to account.

Mathematically we define a delegation set: $$D \subseteq A\times A$$

And in our example Dave is delegating to Bob: $$D=\lbrace (Dave, Bob) \rbrace $$

When we look again at our vote matrix, the result would look like this:

Awesomecorp omg systems
Alice 1 0.4 0
Bob 0 1 0
Charlie 0 0 1
Dave 0 1 0

In this example his delegation actually didn't change the outcome but did had an effect on the overall rating of the candidates: Awesomecorp is rated with 40 shares and omg systems with 56 and with 20.

3 extended model

3.1 types

Did you notice what happened with Daves proposal? Didn't the organization initially agree to decide on a name? And clearly isn't one. This can be solved by a type system: An organization can be restricted to an underlaying structure: a set of types it will accepts as valid candidates denoted as $\mathcal{L}$.

A type of an element can be denoted with the colon notation. For example $"cat,corp": String$ states, that the candidate "cat corp" has the type String. Similarly we can say $: Image$.

In our example Dave couldn't have proposed if the organization had the set of types or "language" $\mathcal{L}_1=\lbrace String \rbrace$. However, it would be a valid candidate with the language $\mathcal{L}_2=\lbrace String, Image\rbrace$. The proposed candidate has to satisfy just one of the alternatives in the set in order to be valid. Another notation form for alternatives is $\mathcal{L}_2=(String | Image)$. For a language $\mathcal{L}$ we also adopt the colon notation to state that a candidate matches one of the types in the typeset: $"cat,corp":\mathcal{L}_2$ is valid.

If an organization would want to find a match composed out of a name and a logo, they could have chosen the language $\mathcal{L}_3=\lbrace (String, Image) \rbrace$. A valid candidate must to contain a String and an Image:

e.g. $("catcorp",$ $):\mathcal{L}_3$ would be a valid.


md_222 =` Note here that the type $(String, Image)$ is a composition out of the type $String$ and the type $Image$, so composing types is a valid operation to produce another type. We can write composition with the dot notation: $String\cdot Image$.

Also we want to allow the star operator for a language description: $\mathcal{L}_4 = String*$ states that the type $String$ can be composed arbitrary times: $\mathcal{L}_4=\lbrace (), (String), (String, String), (String, String, String), ... \rbrace$

e.g. $("so", "much", "feature"):\mathcal{L}_4$ is valid.

We can now use the operations (alternative, composition, star) to describe a complex structure of our organization!

An example for a complex language is $\mathcal{L}_5=(String, ( Image | String* ) )=$"WIP Name + Logo":

"Either a name and logo-sketch, or a name and a list of requirements for the logo"

Which can be understood as the language:

$\mathcal{L}_5=\lbrace (String, Image), (String), (String,String), (String, String, Sting), ... \rbrace$

So the statement: $("Catcorp", "Black, and, White, Image"): \mathcal{L}_5$ is valid.

3.2 regular grammar

In theoretical computer science a description schema to produce a language is a grammar, in our case a regular grammar.

A regular grammar is a composition out of the rules: concatenation, alternative and Kleene. Formally it is:

The "WIP Name + Logo" language looks like following in a grammar description $G$:

$T = \lbrace String, Image \rbrace$
$N = \lbrace Start, ImageCtx, ImageDescriptionCtx \rbrace$
$S = Start$
$P = \lbrace$
  $ Start \rightarrow String \cdot ImageCtx, $
  $ Start \rightarrow String\cdot ImageDescriptionCtx, $
  $ ImageCtx \rightarrow Image, $
  $ ImageDescriptionCtx \rightarrow String \cdot ImageDescriptionCtx, $
  $ ImageDescriptionCtx \rightarrow \varepsilon $
$ \rbrace$

The grammar is a way to describe the language, by starting with the Start rule and following the production rules untill the end. With this one can proof or disproof the validity of a candidate.

For example the type: $(String\cdot String)$ can be prooven via:


$\rightarrow String\cdot ImageDescriptionCtx $

$\rightarrow String\cdot String\cdot ImageDescriptionCtx $

$\rightarrow String\cdot String $

This is also a proof for the candidate: $"Catcorp" \cdot "Black, and, White, Image"$.

We can restrict an organization to a grammar: $O_G$ which will handle the candidate validation.

3.3 context

The nonterminals of a grammar can also be interpreted as a named context. This is handy for context based statements such as contextual delegations.

Imagine in our organization in the "WIP Name + Logo" language Alice is aware about her terrible decision making for Logos. Here she can delegate her votes only in the context of ImageCtx to Charlie. Anyone still can make proposals, but Dave has a voting weight in the context of ImageCtx of 50%. In another context (like ImageDescriptionCtx) he still only has his initial voting weight of 10%.

Also imagine that Bob thinks that at the current status, images should't be considered at all. He also likes the name "OMG Systems" so he votes 0.8 for the $"OMG, Systems"\cdot ImageDescriptionCtx$. With this statement all candidates which can be derived out of this context by using the production of the organization will receive his vote. Note that also future coming candidates automatically receives his vote! This is called contextual voting.

Lets see how our voting table looks

("OMG Systems") ("cat corp", "Black and White Image") ("cat corp" ,)
Alice 0 1 0
Bob 0.8 0.8 0
Charlie 1 0 1
Dave 0 0 1
total $0.830+120=44$ $1 * 40 + 0.8 * 30 = 62$ $1 * (10 + 40) + 20 = 70$

Our consensus candidate is therefore the last candidate.

3.4 evolutions

In the previous examples the organization started with the Grammar "Name" and evolved into an organization of the type "WIP Name + Logo". It made this transition by changing the grammar and by porting some of the candidates and possibly making other adjustments.

We can define an Evolution formally by a mapping from one organizational state to another: $$evolution := O_{G_1} \rightarrow O_{G_2}$$

This takes a grammar which the organization is defined in ($G_1$) and maps or filters all of its candidates to candidates which are valid in the grammar $G_2$. It can also make other changes like minting new shares and giving them to the actor, who proposed the winning candidate or change votes and delegations. After the change, it takes the new grammar $G_2$ as its new foundation.

Evolutions are a powerful tool and therefore are only applied when their votes hit a $\frac{2}{3}$ majority.

The set of all defined grammars and evolution-schemas can be seen as a global knowledge base and a path from the current grammar representation to a desired as a "blueprint" or a "recipe" to arrive at the desired goal by following a step by step guide.

3.5 Interpretations

Interpretations are basically anything, which refers to a consensus of an organization in a fixed specific grammar. The interpretation generate a domain specific output based on the consensus candidate and the underlaying language of the linked organization.

For example a candidate with the "Name" lananguage can be displayed in as a webpage header. The webpage would display "OMG Systems", but this could change if the consensus becomes "cat corp" whithout ever changing the website. An interpretation also brings in an easy way to view the underlaying datastructure and therefore preview all proposed candidates before have to vote on them.

An organization can also state its preferred representation: $("LandingPageData"\cdot "Theme")$ is a valid type, where "LandigpageData" represents the type which describes all data of a landingpage such as name, description, logo image etc. and "Theme" is a representation pointer.