jolestar

https://twitter.com/jolestar

Why Move: Ecosystem-Building Capability

As a proponent of Move, I am often asked the same questions when I introduce it to developers: What advantages does Move have? Why Move? It is like introducing a new partner to your friends; you face much the same scrutiny. Such questions are not easy to answer. If you list pros and cons one by one, there will always be skeptics, especially because the ecosystem of a new language is immature and the choice can only be judged on potential. So let me state my conclusion up front: of all smart contract programming languages, Move has the greatest potential to build an ecosystem comparable to Solidity's, or even to surpass it.

Target audience: developers, and anyone interested in blockchain technology. This article tries to explain, in plain terms and with as little code as possible, the challenges smart contracts currently face and what Move attempts to do about them. I hope readers who do not know programming languages can still follow the general idea. That is hard to achieve, so feedback is welcome.

Two Paths for Smart Contracts#

If we rewind time a few years, there were mainly two ways to support Turing-complete smart contract programming languages on new public chains:

One is to cut down existing programming languages and run them on general virtual machines like WASM. The advantage of this approach is that it can leverage the current programming language and WASM virtual machine ecosystem.

The other is to create a dedicated smart contract programming language and virtual machine, constructing the language and virtual machine ecosystem from scratch. Solidity follows this path, and so does Move.

At that time, people generally did not have high hopes for the Solidity & EVM ecosystem, thinking that Solidity seemed to have no use beyond issuing tokens, its performance was poor, and its tools were weak, resembling a toy. Many chains aimed to allow developers to use existing languages for smart contract programming, believing that the first path was more promising, and few new public chains directly copied Solidity & EVM.

However, after a few years of development, especially after the rise of DeFi, everyone suddenly realized that the Solidity ecosystem had changed. Meanwhile, the smart contract ecosystem that followed the first path did not grow. Why? I summarize a few reasons.

  1. The execution environment for blockchain programs is very different from that of operating system programs. If we discard system calls, file I/O, hardware, network, concurrency, and related libraries, and consider the execution costs on-chain, there is very little code that existing programming languages can share with smart contracts.
  2. In theory the first approach can support many languages, but in practice languages that carry a runtime produce very large files when compiled to a virtual machine such as WASM, which does not suit blockchain scenarios. The usable languages are mainly C, C++, Rust, and the like. Their learning curve is no gentler than that of Solidity, a dedicated smart contract language, and supporting several languages at once may fragment an early ecosystem.
  3. Different chains have different state handling mechanisms. Even if they all use WASM virtual machines, smart contract applications on different chains cannot be directly migrated, and a common programming language and developer ecosystem cannot be shared.

For application developers, what they directly face is the smart contract programming language, the foundational libraries of the programming language, and the availability of reusable open-source libraries. The security requirements of DeFi necessitate that smart contract code undergoes auditing, and every line of audited code represents money. Developers can slightly modify existing code for replication, thereby reducing auditing costs.

Now it seems that although Solidity took a seemingly slow path, it actually built an ecosystem faster. Many now consider Solidity & EVM to be the endpoint of smart contracts, and many chains have begun to be compatible with or port Solidity & EVM. At this point, new smart contract programming languages need to prove their stronger ecosystem-building capabilities to persuade everyone to pay attention and invest.

The new question is: how can we measure the ecosystem-building capabilities of a programming language?

Ecosystem-Building Capabilities of Programming Languages#

The ecosystem-building capability of a programming language can be simply described as its code reuse ability, mainly reflected in two aspects:

  1. The dependency methods between programming language modules.
  2. The composition methods between programming language modules. "Composability" is a feature that smart contracts boast, but in reality, all programming languages have composability; the interfaces, traits, etc., we invented are all aimed at facilitating composition.

First, let's talk about dependency methods. Programming languages implement dependencies mainly through three methods:

  1. By using static libraries, where dependencies are statically linked at compile time, packaging them in the same binary.
  2. By using dynamic libraries, where dependencies are dynamically linked at runtime; dependencies are not in the binary but must be pre-deployed on the target platform.
  3. By using remote calls (RPC) to depend at runtime. This refers to various APIs that can be called remotely.

Methods 1 and 2 are generally used in scenarios involving foundational library dependencies. Foundational libraries are generally stateless because it is difficult to assume how applications handle state, such as which file to write to or which database table to store in. This type of call occurs within the context of the same process and method call, sharing the call stack and memory space, with no security isolation (or very weak isolation), requiring a trusted environment.

Method 3 actually calls processes on another machine, communicating via messages, with each process responsible for its own state, thus providing state dependencies, and calls have security isolation.

Each of these three methods has its pros and cons. Method 1 includes dependencies in the final binary, which is advantageous because it has no dependencies on the target platform's environment, but the downside is that the binary is relatively large. Method 2 has the advantage of a smaller binary but has prerequisites for the runtime environment. Method 3 can build cross-language dependency relationships, generally used in cross-service and cross-organization collaboration scenarios, often simulated as method calls through SDKs or code generation for developer convenience.

In the history of technology, many programming languages and operating system platforms have spent considerable effort trying to bridge the gap between remote and local calls, aiming for seamless remote calls and composition. Well-known technologies such as COM (Component Object Model), CORBA, SOAP, and REST were all created to solve these problems. The dream of seamless calling and composition has been shattered, and engineers still stitch Web2 services together by hand, but the spark of the dream remains.

Smart contracts have introduced new changes to the dependency methods between applications.

Changes Brought by Smart Contracts#

The traditional dependency method between enterprise applications can be represented as follows:

[Figure: Web2 system RPC call]

  1. Systems connect services running on different machines through various RPC protocols.
  2. There are various technical and artificial "walls" between machines to ensure security.

The execution environment for smart contracts is a sandbox environment constructed by the nodes of the chain, where multiple contract programs run in different virtual machine sandboxes within the same process, as shown below:

[Figure: blockchain smart contract call]

  1. Calls between contracts occur between different smart contract virtual machines within the same process.
  2. Security relies on the isolation between smart contract virtual machines.

Taking Solidity as an example, a Solidity contract (a module declared with the contract keyword) declares functions as public, and other contracts can call it directly through those public functions. For comparison, below is the process of an RPC call:

[Figure: how an RPC call works, from Client to Server]

Image source: https://docs.microsoft.com/en-us/windows/win32/rpc/how-rpc-works

The chain effectively takes over all communication processes between the Client and Server in the above diagram, automatically generating stubs, implementing serialization and deserialization, making developers feel that remote calls are just like local method calls.
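The idea that a cross-contract call is really a message handed from one isolated sandbox to another can be sketched in a few lines. The following is a toy model in Rust, not the EVM or any real chain: `Chain`, `ContractState`, `Handler`, and `counter_contract` are names invented for illustration, and the byte encoding is an assumption.

```rust
use std::collections::HashMap;

// Each contract owns a private key-value state; other contracts never touch it directly.
#[derive(Default)]
pub struct ContractState {
    slots: HashMap<String, u64>,
}

// A contract is modelled as a handler from (its own state, encoded message) to an encoded reply.
pub type Handler = fn(&mut ContractState, &[u8]) -> Vec<u8>;

// The chain keeps every contract's code and state, and routes calls between them.
#[derive(Default)]
pub struct Chain {
    contracts: HashMap<u64, (Handler, ContractState)>,
}

impl Chain {
    pub fn deploy(&mut self, addr: u64, handler: Handler) {
        self.contracts.insert(addr, (handler, ContractState::default()));
    }

    // Plays the role of the stub: arguments cross the boundary as bytes,
    // and the callee runs only against its own state.
    pub fn call(&mut self, addr: u64, msg: &[u8]) -> Option<Vec<u8>> {
        let entry = self.contracts.get_mut(&addr)?;
        let handler = entry.0; // fn pointers are Copy
        Some(handler(&mut entry.1, msg))
    }
}

// A hypothetical counter contract: the message is a little-endian u64 to add to a running total.
pub fn counter_contract(state: &mut ContractState, msg: &[u8]) -> Vec<u8> {
    if msg.len() != 8 {
        return Vec::new(); // malformed message: empty reply
    }
    let mut buf = [0u8; 8];
    buf.copy_from_slice(msg);
    let total = state.slots.entry("total".to_string()).or_insert(0);
    *total += u64::from_le_bytes(buf);
    total.to_le_bytes().to_vec()
}
```

Only bytes go in and come out; nothing in this model lets a caller hand the callee a reference or a value that must not be duplicated, which is the limitation discussed in the following sections.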

Of course, technology does not have a silver bullet; there is no one-size-fits-all solution, and new solutions always bring new challenges that need to be addressed.

Dependency Challenges of Smart Contracts#

From the previous analysis, we understand that calls between smart contracts are essentially a method similar to remote calls. But what if we want to make dependency calls through libraries?

In Solidity, modules declared with the library keyword play the role of static libraries; they must be stateless. A library dependency is packaged into the final contract binary at compile time.

The problem this brings is that if a contract is complex and has too many dependencies, the compiled contract may become too large to deploy. However, if we split it into multiple contracts, we cannot directly share state, and internal dependencies become dependencies between remote services, increasing call costs.

Could we take the second path of dynamic library loading? For example, most contracts on Ethereum depend on the SafeMath.sol library, with each contract including its binary. Since the code is already on-chain, why can't it be shared directly?

Thus, Solidity provides the delegatecall method, similar to a dynamic linking library solution, embedding the code of another contract into the context of the current contract call, allowing the other contract to read and write the state of the current contract directly. But this has two requirements:

  1. The caller and the callee must have a completely trusting relationship.
  2. The storage layouts of the two contracts must be aligned.

Non-smart contract developers may not fully understand this issue. A Java developer might understand it this way: Each Solidity contract is equivalent to a Class, and when deployed, it runs as a singleton Object. If you want to load methods from another Class at runtime to modify properties in the current Object, the fields defined in both Classes must be the same, and the newly loaded method acts as an internal method, with the internal properties of the Object fully visible to it.

This limits the usage scenarios and degree of reuse for dynamic linking, which is now mainly used for internal contract upgrades.
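The alignment requirement is easier to see with a toy model. The sketch below is Rust, not Solidity, and it is only an analogy: storage is a fixed array of slots, `lib_increment` stands in for code borrowed through delegatecall, and all names are hypothetical.

```rust
// Toy model of delegatecall: the borrowed code runs against the caller's storage.
// Storage is a flat array of numbered slots.
pub type Storage = [u64; 2];

// "Library" code compiled against the layout: slot 0 = owner, slot 1 = counter.
pub fn lib_increment(storage: &mut Storage) {
    storage[1] += 1; // the library assumes slot 1 is the counter
}

// This caller declares the same layout, so the borrowed code does what it expects.
pub struct AlignedContract {
    pub storage: Storage, // slot 0 = owner, slot 1 = counter
}

// This caller declares its fields in the opposite order.
pub struct MisalignedContract {
    pub storage: Storage, // slot 0 = counter, slot 1 = owner
}

impl AlignedContract {
    pub fn delegatecall_increment(&mut self) {
        lib_increment(&mut self.storage);
    }
    pub fn owner(&self) -> u64 { self.storage[0] }
    pub fn counter(&self) -> u64 { self.storage[1] }
}

impl MisalignedContract {
    pub fn delegatecall_increment(&mut self) {
        lib_increment(&mut self.storage);
    }
    pub fn counter(&self) -> u64 { self.storage[0] }
    pub fn owner(&self) -> u64 { self.storage[1] }
}
```

In the misaligned case the same call silently changes the owner slot instead of the counter, which is why the caller has to trust the callee completely and keep both layouts in step.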

For the reasons mentioned above, Solidity struggles to provide a rich standard library (stdlib) like other programming languages, which can be pre-deployed on-chain for other contracts to depend on, and can only offer a limited number of precompiled methods.

This also inflates the EVM instruction set. Much data that Solidity code could have read from state has to be exposed instead as virtual machine instructions that fetch it from the runtime context. Block information, for instance, could have been read from state through system contracts in a standard library; the programming language itself would not need to know about blocks.

This issue is one that all chains and smart contract programming languages will encounter. Traditional programming languages have not considered security issues within the same method call stack (or have considered them very little). Once moved to the chain, they can only resolve dependencies through static and remote dependencies, and it is generally challenging to provide solutions similar to Solidity's delegatecall.

So how can we achieve a method of calling between smart contracts that resembles dynamic library linking? Can calls between contracts share the same method call stack and directly pass variables?

Doing so brings two security challenges:

  1. The security of contract states must be isolated through the internal security of the programming language, rather than relying on the virtual machine for isolation.
  2. Variables passed across contracts must be handled safely: it must be guaranteed that they cannot be arbitrarily copied or discarded, especially variables that represent assets.

State Isolation of Smart Contracts#

As mentioned earlier, smart contracts essentially execute the code of different organizations in the same process. Therefore, isolating the state of contracts (which can be simply understood as the results generated during contract execution that need to be saved for future executions) is necessary. If one contract is allowed to read and write the state of another contract directly, it will certainly lead to security issues.

The isolation solution is actually quite simple: give each contract an independent state space. When a smart contract executes, its state space is bound to the virtual machine, so the contract can only read and write its own state. If it needs another contract's state, it must go through the inter-contract call described earlier, which actually executes in another virtual machine.

However, when trying to make dependency calls through dynamic libraries, such isolation is insufficient. This is because, in reality, another contract runs within the execution stack of the current contract, and we need isolation at the language level rather than at the virtual machine level.

Additionally, isolating based on the state space of contracts brings up the issue of state ownership. In this case, all states belong to the contract, with no distinction between public and private states, which complicates state billing and could lead to state explosion issues in the long run.

So how can we achieve state isolation at the programming language level for smart contracts? The idea is quite simple and is based on types.

  1. Utilize the visibility constraints provided by the programming language for types, a feature supported by most programming languages.
  2. Utilize the mutability constraints provided by the programming language for variables; many programming languages distinguish between mutable and immutable references, such as Rust.
  3. Provide external storage based on types as keys, restricting the current module to only use its defined types as keys to read external storage.
  4. Provide the ability to declare copy and drop for types at the programming language level, ensuring that asset-type variables cannot be arbitrarily copied or discarded.

The Move language employs the above solutions, with points 3 and 4 being unique to Move. This solution is relatively easy to understand; if we cannot give each smart contract program a separate state space at the virtual machine level, isolating states within contracts based on types is a more understandable approach, as types have clear ownership and visibility.
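Points 1 and 3 can be approximated in Rust to show the shape of the idea. This is a sketch under assumptions, not how the Move VM is implemented: `GlobalStorage`, `move_to`, `borrow_global`, and the `counter` module are illustrative names, and Rust cannot enforce Move's rule that only the defining module may use its type as a storage key, so the sketch relies on a private field instead.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

// Hypothetical global storage: one slot per (account address, resource type).
#[derive(Default)]
pub struct GlobalStorage {
    slots: HashMap<(u64, TypeId), Box<dyn Any>>,
}

impl GlobalStorage {
    // Store `value` under `addr`; hands the value back if the slot is already occupied.
    pub fn move_to<T: Any>(&mut self, addr: u64, value: T) -> Result<(), T> {
        let key = (addr, TypeId::of::<T>());
        if self.slots.contains_key(&key) {
            return Err(value);
        }
        self.slots.insert(key, Box::new(value));
        Ok(())
    }

    // Look up the slot for type T under `addr`.
    pub fn borrow_global<T: Any>(&self, addr: u64) -> Option<&T> {
        self.slots.get(&(addr, TypeId::of::<T>()))?.downcast_ref::<T>()
    }
}

pub mod counter {
    use super::GlobalStorage;

    // The field is private: code outside this module can neither build a Counter
    // nor read what is inside one.
    pub struct Counter {
        value: u64,
    }

    pub fn publish(storage: &mut GlobalStorage, addr: u64) -> bool {
        storage.move_to(addr, Counter { value: 0 }).is_ok()
    }

    pub fn get(storage: &GlobalStorage, addr: u64) -> Option<u64> {
        storage.borrow_global::<Counter>(addr).map(|c| c.value)
    }
}
```

All modules share one storage, yet each module's data is reachable only through the type it defines, so isolation comes from the type system rather than from separate virtual machines.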

Thus, in Move, calls between smart contracts become as shown in the following diagram:

[Figure: Move module call]

Programs from different organizations and institutions can be combined into the same application through dynamic libraries, sharing the same memory world of the programming language. Interactions between organizations can not only transmit messages but also pass references and resources. The rules and protocols for interactions between organizations are only constrained by the rules of the programming language (the definition of resources is described later).

This change has two consequences:

  1. The programming language and the chain can provide a rich foundational library, pre-deployed on the chain. Applications can directly depend on and reuse it without needing to include the foundational library part in their own binaries.
  2. Since the code of different organizations exists within the same memory state of the programming language, it can provide richer and more complex composition methods. This topic will be elaborated on later.

Although this dependency method in Move is similar to the dynamic library model, it simultaneously leverages the state hosting characteristics of the chain, introducing a new dependency model to the programming language.

In this model, the chain serves as both the execution environment for smart contracts and the binary repository for smart contract programs. Developers can freely combine smart contracts on the chain through dependencies to provide a new smart contract program, and this dependency relationship is traceable on-chain.

Of course, Move is still in its early stages, and the capabilities provided by this dependency method have not yet been fully realized, but the prototype is emerging. It can be envisioned that in the future, incentive mechanisms based on dependency relationships and new open-source ecosystems built on this incentive model will certainly emerge. Next, we will continue to discuss the issue of "composability."

Composability of Smart Contracts#

The composability between programming language modules is another important characteristic for building the ecosystem of programming languages. It can be said that it is precisely because of the composability between modules that dependency relationships arise, and different dependency methods provide different composition capabilities.

Based on the previous analysis of dependency methods, when discussing the composability of smart contracts in the Solidity ecosystem, we are primarily referring to the composition between contract modules, rather than between library modules. As mentioned earlier, the dependencies between contract modules are similar to remote call dependencies, where what is actually passed is messages, not references or resources.

The term resource is used here to emphasize that variables of this type cannot be arbitrarily copied or discarded within the program, which is a characteristic brought by linear types, a concept that is not yet widespread in programming languages.

Linear types originate from linear logic, which was designed to express the consumption of resources, something classical logic cannot express. In classical logic, "milk" can imply "cheese," but there is no way to say that the milk is used up, for example that X units of "milk" yield Y units of "cheese." Linear logic fills this gap, and linear types bring it into programming languages.

In programming languages, the first resource to manage is memory, so one application of linear types is tracking memory use to ensure that memory is correctly reclaimed, as in Rust. If the feature is generalized, however, we can model and express any kind of resource in a program.
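The milk and cheese example can be written down directly in Rust, whose move semantics give a close approximation. It is only an approximation: Rust still lets a value be silently dropped, whereas a Move resource without the drop ability must be explicitly consumed. The types and the conversion rate below are made up for illustration.

```rust
// Neither type derives Clone or Copy, so a value can be handed over but never duplicated.
pub struct Milk {
    litres: u32,
}

pub struct Cheese {
    grams: u32,
}

impl Cheese {
    pub fn grams(&self) -> u32 {
        self.grams
    }
}

pub fn buy_milk(litres: u32) -> Milk {
    Milk { litres }
}

// Takes `milk` by value: after this call the caller no longer has it.
pub fn make_cheese(milk: Milk) -> Cheese {
    Cheese { grams: milk.litres * 100 } // the rate is invented for the example
}
```

Calling `make_cheese(milk)` a second time with the same variable is rejected by the compiler as a use of a moved value, which is exactly the "X units of milk yield Y units of cheese" bookkeeping that classical logic cannot express.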

So why is it important to be able to pass resources during composition? Let's first understand the current interface-based composition method, which most programming languages, including Solidity, adopt.

To combine multiple modules, the key is to agree on the functions to be called and the types of parameters and return values, generally referred to as the function's "signature." We typically use interfaces to define these constraints, but the specific implementation is left to each party.

For example, the commonly mentioned ERC20 Token is an interface that provides the following methods:

function balanceOf(address _owner) public view returns (uint256 balance)
function transfer(address _to, uint256 _value) public returns (bool success)

This interface definition includes methods for transferring funds to a specific address and querying balances, but it does not have a direct withdrawal method. In Solidity, tokens are treated as services rather than types. Below is a similar method defined in Move:

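module Token {
    // A token is a value of this type; TokenType says which token it is.
    struct Token<TokenType> {
        value: u128,
    }
}
module Account {
    // Signatures only; function bodies are omitted.
    public fun withdraw(sender: &signer, amount: u128): Token<STC>;
    public fun deposit(receiver: address, token: Token<STC>);
    public fun transfer(sender: &signer, receiver: address, amount: u128);
}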

As can be seen, Token is a type that can be withdrawn from an account. One might ask, what is the significance of this?

We can use a more straightforward analogy to compare the differences in composition methods between the two. A Token object is akin to cash in real life; when you want to buy something at a mall, there are two payment methods:

  1. The mall interfaces with the bank, integrating an electronic payment system, and when you pay, you directly initiate a request for the bank to transfer funds to the mall.
  2. You withdraw cash from the bank and pay directly at the mall. In this case, the mall does not need to interface with the bank in advance; it just needs to accept this cash type. As for whether the cash is locked in a safe or deposited back into the bank, that is up to the mall to decide.

The latter can be called resource-type-based composition. A resource that flows between the contracts of different organizations in this way is in what I call a "free state."
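The cash analogy can be sketched as two modules that share nothing but a type. This is a Rust approximation rather than Move; `bank`, `mall`, `Cash`, and `Till` are hypothetical names, and unlike a Move resource, a Rust value can still be silently dropped.

```rust
pub mod bank {
    // Cash is a value. It derives neither Clone nor Copy, so it can only be moved.
    pub struct Cash {
        amount: u64, // private field: only `bank` can create Cash
    }

    impl Cash {
        pub fn amount(&self) -> u64 {
            self.amount
        }
    }

    pub struct Account {
        balance: u64,
    }

    impl Account {
        pub fn new(balance: u64) -> Self {
            Account { balance }
        }

        pub fn balance(&self) -> u64 {
            self.balance
        }

        // Turns part of the balance into a free-standing Cash value.
        pub fn withdraw(&mut self, amount: u64) -> Option<Cash> {
            if amount > self.balance {
                return None;
            }
            self.balance -= amount;
            Some(Cash { amount })
        }

        // Consumes the Cash; the caller cannot use it afterwards.
        pub fn deposit(&mut self, cash: Cash) {
            self.balance += cash.amount;
        }
    }
}

pub mod mall {
    use super::bank::Cash;

    // The mall knows the Cash type but nothing about bank accounts.
    pub struct Till {
        held: Vec<Cash>,
    }

    impl Till {
        pub fn new() -> Self {
            Till { held: Vec::new() }
        }

        pub fn pay(&mut self, cash: Cash) {
            self.held.push(cash);
        }

        pub fn total(&self) -> u64 {
            self.held.iter().map(|c| c.amount()).sum()
        }

        // Hands over everything in the till; what happens next is up to the mall.
        pub fn empty(&mut self) -> Vec<Cash> {
            std::mem::take(&mut self.held)
        }
    }
}
```

The mall never calls the bank. It accepts `Cash` values and later decides for itself whether to hold them or deposit them, which is the second payment method in the analogy.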

The key advantages of resource-type-based composition are twofold:

  1. It effectively reduces the nesting depth of interface-based composition. Those interested can refer to a previous talk of mine on flash loans; since some readers may not know the background of flash loans, I will not elaborate here.
  2. It can clearly separate the definition of resources from resource-based behaviors. A typical example is soulbound NFTs.

The concept of soulbound NFTs was proposed by Vitalik, intending to use NFTs to express identity relationships that should not be transferable, such as diplomas and honor certificates.

However, the NFT standards on Ethereum are all interfaces, such as the methods in ERC721:

function ownerOf(uint256 _tokenId) external view returns (address);
function safeTransferFrom(address _from, address _to, uint256 _tokenId) external payable;

To add a new behavior, such as binding, a new interface must be defined. This also affects the old methods: once an NFT is soulbound it can no longer be transferred, which inevitably causes compatibility problems. The case is even harder when an item is transferable at first and stops circulating once bound, as with certain game items.

However, if we consider NFTs as items, their properties only determine how they are displayed and what attributes they have, while the ability to transfer should be handled by higher-level encapsulation.

For example, here is how an NFT is defined in Move, as a type:

struct NFT<NFTMeta: copy + store + drop, NFTBody: store> has store {
    creator: address,
    id: u64,
    base_meta: Metadata,
    type_meta: NFTMeta,
    body: NFTBody,
}

Then we can envision the upper-level encapsulation as different containers, each with different behaviors. For instance, when an NFT is placed in a personal exhibition hall, it can be taken out, but once placed in a special container, taking it out requires other rules and restrictions, thus achieving "binding."

For instance, Starcoin's NFT standard implements a soulbound NFT container called IdentifierNFT:

/// IdentifierNFT contains an Option NFT, which is empty by default, equivalent to a box that can hold NFTs
struct IdentifierNFT<NFTMeta: copy + store + drop, NFTBody: store> has key {
        nft: Option<NFT<NFTMeta, NFTBody>>,
}

/// Users initialize an empty IdentifierNFT under their account through the Accept method
public fun accept<NFTMeta: copy + store + drop, NFTBody: store>(sender: &signer) {
  move_to(sender, IdentifierNFT<NFTMeta, NFTBody> {
    nft: Option::none(),
  });
}

/// Developers grant the receiver the NFT through MintCapability, embedding the NFT into the IdentifierNFT
public fun grant_to<NFTMeta: copy + store + drop, NFTBody: store>(_cap: &mut MintCapability<NFTMeta>, receiver: address, nft: NFT<NFTMeta, NFTBody>) acquires IdentifierNFT {
     let id_nft = borrow_global_mut<IdentifierNFT<NFTMeta, NFTBody>>(receiver);
     Option::fill(&mut id_nft.nft, nft);
}

/// Developers can also retrieve the NFT from the owner's IdentifierNFT through BurnCapability
public fun revoke<NFTMeta: copy + store + drop, NFTBody: store>(_cap: &mut BurnCapability<NFTMeta>, owner: address): NFT<NFTMeta, NFTBody>  acquires IdentifierNFT {
     let id_nft = move_from<IdentifierNFT<NFTMeta, NFTBody>>(owner);
     let IdentifierNFT { nft } = id_nft;
     Option::destroy_some(nft)
}

The NFT in this box can only be granted or retrieved by its issuer; the user can only decide whether to accept it. A diploma is an example: the school can issue it and revoke it. Of course, developers can implement other rules for containers, but the NFT standard remains unified. Those interested in this specific implementation can refer to the links at the end of the article.
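For readers who do not read Move, the same container idea can be approximated in Rust. This is a simplified sketch, not Starcoin's implementation: there is no on-chain storage, `revoke` leaves an empty box behind instead of removing it from the account, and `issuer_capabilities` is a hypothetical stand-in for however the issuer obtains its capabilities.

```rust
pub mod identifier_nft {
    // The NFT itself. It derives no Clone, so a given value exists exactly once.
    // (The public field keeps the sketch short; real minting would be gated.)
    pub struct Nft {
        pub id: u64,
    }

    // Capabilities held by the issuer. The private unit field means only this
    // module can create them.
    pub struct MintCapability(());
    pub struct BurnCapability(());

    // The box a user opts into. Its field is private, so the only way to take
    // the NFT out is through `revoke`.
    pub struct IdentifierNft {
        nft: Option<Nft>,
    }

    // Hypothetical: hands the issuer both capabilities.
    pub fn issuer_capabilities() -> (MintCapability, BurnCapability) {
        (MintCapability(()), BurnCapability(()))
    }

    // The user creates an empty box.
    pub fn accept() -> IdentifierNft {
        IdentifierNft { nft: None }
    }

    // The issuer puts an NFT into the box; hands it back if the box is full.
    pub fn grant_to(_cap: &MintCapability, holder: &mut IdentifierNft, nft: Nft) -> Result<(), Nft> {
        if holder.nft.is_some() {
            return Err(nft);
        }
        holder.nft = Some(nft);
        Ok(())
    }

    // The issuer takes the NFT back out of the box.
    pub fn revoke(_cap: &BurnCapability, holder: &mut IdentifierNft) -> Option<Nft> {
        holder.nft.take()
    }

    pub fn held_id(holder: &IdentifierNft) -> Option<u64> {
        holder.nft.as_ref().map(|n| n.id)
    }
}
```

The holder of the box has no function that returns the `Nft` by value, so the "binding" is a property of the container, while the `Nft` type itself stays unchanged and could be placed in other containers with different rules.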

This section illustrates a new composition method that Move's linear types make possible. However, language features alone do not produce an ecosystem; there must also be application scenarios. Next, we discuss how the application scenarios of Move are expanding.

Application Scenario Expansion of Smart Contracts#

Move was initially designed as the smart contract programming language for the Libra chain, with a variety of application scenarios in mind. Starcoin was being designed at the same time, and Move's characteristics matched the goals Starcoin pursued, so Move was applied to the public chain scenario. Later, as the Libra project was shelved, several public chain projects were incubated, exploring different directions:

  • MystenLabs' Sui introduced immutable states, attempting to implement a UTXO-like programming model in Move.
  • Aptos is exploring parallel execution of transactions on Layer 1 and high performance.
  • Pontem is attempting to bring Move into the Polkadot ecosystem.
  • Starcoin is exploring layered expansion models from Layer 2 to Layer 3.

Meanwhile, Meta (Facebook)'s original Move team is trying to run Move on top of EVM. Although this will lose the feature of passing resources between contracts, it will help expand the Move ecosystem and integrate it with the Solidity ecosystem.

Currently, the Move project has become independent, evolving into a fully community-driven programming language. It now faces several challenges:

  1. How to find the greatest common denominator among the needs of different chains, so that the language stays general?
  2. How to allow different chains to implement their special language extensions?
  3. How to share foundational libraries and application ecosystems across multiple chains?

These challenges are also opportunities. They pull against each other, so trade-offs are required and a balance has to be found as development proceeds. No language has attempted this before. Striking that balance would let Move explore more application scenarios, not only those tied to blockchain.

In this regard, the way Solidity interacts with the chain through virtual machine instructions creates a problem: the Solidity & EVM ecosystem is bound entirely to the chain and needs a simulated chain environment to run. This limits Solidity's expansion into other scenarios.

Regarding the future of smart contract programming languages, there are many different views, generally falling into four categories:

  • There is no need for a Turing-complete smart contract language; Bitcoin's script is sufficient. My objection is that without Turing-complete smart contracts it is hard to achieve general arbitration capabilities, which limits the application scenarios of the chain. I discuss this point in my previous article, "Unlocking the 'Three Locks' of Bitcoin Smart Contracts."
  • There is no need for a dedicated smart contract language; existing programming languages are sufficient, a viewpoint we have already analyzed above.
  • A Turing-complete smart contract language is needed, but its application scenarios are limited to on-chain, similar to stored procedure scripts in databases. This is the view of most current smart contract developers.
  • Smart contract programming languages will expand into other scenarios, ultimately becoming a general-purpose programming language.

The last viewpoint can be termed maximalism for smart contract languages, and I personally hold this view. The reasoning is simple: in the Web3 world, whether in games or other applications, when disputes arise, there needs to be a digital dispute arbitration solution. The key technical points of blockchain and smart contracts are about the proof of state and computation, and the arbitration mechanisms explored in this field can certainly be applied to more general scenarios. When a user installs an application and is concerned about its security, hoping that the application can provide proof of state and computation, that is when application developers must choose to implement the core logic of the application using smart contracts.

Conclusion#

This article discusses the implementation paths of on-chain smart contracts and the current challenges faced by smart contracts in terms of dependencies and composability, using as straightforward language as possible to explain the attempts made by Move in this direction and the potential for ecosystem building based on these attempts.

Considering the article is already quite lengthy, many aspects have not been addressed. I will write a series based on this topic, and here is a preview:

Why Move: Ecosystem-Building Capability#

This is the current article.

Why Move: Security of Smart Contracts#

The security of smart contracts is a widely shared concern, and articles promoting Move often highlight "security" as a feature. But how can we compare the security of different programming languages? There is a saying that you cannot stop someone from shooting themselves in the foot; programming languages are tools, and when developers use these tools to shoot themselves in the foot, what can the programming language itself do? Smart contracts allow programs from different organizations to run in the same process, maximizing the role of programming languages, but also bringing new security challenges. This article will discuss this issue from a holistic perspective.

Why Move: State Explosion and Layering#

Move implements state isolation within the programming language, while also providing more possibilities for solutions in this field. Contracts can handle the storage location of states more freely, such as storing states in the user's own state space, which is more conducive to state billing and incentivizing users to release space. For example, can we truly achieve the migration of states between different layers, thereby transferring Layer 1 states to Layer 2, fundamentally solving the state explosion problem? This article will explore some possibilities in this direction.

If you are interested in the follow-up content, you can follow me through the following ways:

Original link:

https://jolestar.com/why-move-1/

  1. https://github.com/move-language/move. New repository for the Move project.
  2. awesome-move: Code and content from the Move community. A resource collection for Move-related projects, including public chains and libraries implemented in Move.
  3. Soulbound (vitalik.ca). Vitalik's article on soulbound NFTs.
  4. SIP22 NFT. Starcoin's NFT standard, including the description of IdentifierNFT.
  5. Unlocking the 'Three Locks' of Bitcoin Smart Contracts (jolestar.com).