Compound V2 Utilization Rate Prediction

Intro

In this project, our aim is to predict the utilization rate of a given market on Compound V2; in particular, we will focus on USDC. Utilization rate is the ratio of the total amount borrowed to the total amount supplied in a given market (e.g. if 100 USDC is supplied but only 40 USDC is borrowed, the utilization rate is 0.4). Generally, a higher utilization rate means higher fees collected by a lending protocol, but also higher leverage taken on by users and therefore higher solvency risk. Additionally, if the utilization rate is too high, some users might not be able to close their positions because there would not be enough liquidity left in the market.
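
As a quick illustration of the calculation above, here is a minimal sketch (not part of the project code):

def utilization_rate(total_borrowed: float, total_supplied: float) -> float:
    # Utilization rate = total borrowed / total supplied for a given market
    if total_supplied == 0:
        return 0.0
    return total_borrowed / total_supplied

print(utilization_rate(40, 100))  # 0.4, matching the example above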

Approach

We will use daily Compound supply and borrow data, as well as APYs and TVLs from other protocols, to predict the USDC utilization rate. This data will be collected with the giza-datasets package. We will define our data processing and training pipeline with Giza's actions-sdk library and use a simple feedforward neural network built in PyTorch to serve our predictions. Finally, we will use the giza-cli to transpile, deploy, and run a verifiable inference on our model.

Potential use-cases

  • Liquidity Management for multi-chain protocols

    Some multi-chain lending protocols provide liquidity on their own markets. If they could predict the utilization rate on each chain, they could redistribute their liquidity between them. That could serve multiple purposes, from maintaining sufficient liquidity on every chain to optimizing the utilization rate across the protocol's instances.

Installation

We will use a few tools from the Giza stack: the Giza CLI, giza-actions, and giza-datasets. For each of them, follow the installation guide in the respective docs.

Install the remaining packages

certifi==2023.5.7
giza==0.10.0
giza_actions==0.1.2
giza_datasets==0.1.1
numpy==1.24.1
polars==0.20.7
requests==2.31.0
torch==1.13.1
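
Assuming the pinned versions above are saved to a requirements.txt file, they can be installed in one step:

pip install -r requirements.txt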

Setup

Confirm that your session is active with giza users me or log into the Giza CLI with giza users login.

Create a new Giza Workspace with giza workspaces create or retrieve the existing workspace information with giza workspaces get.

Create a new Giza model with giza models create. Save the model-id and version-id to use later at the model deployment stage.
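
For reference, the setup commands mentioned above are:

giza users login        # or `giza users me` to confirm an active session
giza workspaces create  # or `giza workspaces get` to reuse an existing workspace
giza models create      # note the model-id and version-id for the deployment stage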

Visit the giza-hub repo to access the full code, including all the Python scripts and a Jupyter notebook.

Load and preprocess the data

As already mentioned, we will use giza-datasets to fetch all the relevant data for this project. In particular, we will focus on these datasets:

  • compound-daily-interest-rates: This dataset contains the daily supply and borrow interest rates in all Compound V2 markets on Ethereum mainnet. We will extract the dependent variable from this dataset as well as construct some features from markets other than USDC. You can find more information about this dataset here.

  • top-pools-apy-per-protocol: This dataset contains the Annual Percentage Yields (APYs) of top pools across multiple protocols. More info about the dataset here.

  • tvl-per-project-tokens: This dataset contains the Total Value Locked (TVL) of different assets across multiple protocols. More info on the dataset here.

  • tokens-daily-prices-mcap-volume: This dataset contains market cap, volume, and price data for multiple tokens. More details can be found here.

We will load the datasets using the DatasetsLoader object from the giza_datasets library. The code below serves as an example of how this is achieved:

from giza_datasets import DatasetsHub, DatasetsLoader
import os
import certifi

os.environ["SSL_CERT_FILE"] = certifi.where()
loader = DatasetsLoader()
compound_df = loader.load("compound-daily-interest-rates")
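
The remaining datasets listed above can be loaded in the same way (a sketch; the variable names are ours):

apy_df = loader.load("top-pools-apy-per-protocol")
tvl_df = loader.load("tvl-per-project-tokens")
mcap_df = loader.load("tokens-daily-prices-mcap-volume")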

After loading the datasets, we process them to fill in null values and extract the relevant features and the target variable. All of the processing steps are wrapped in tasks; we will not discuss them in detail here for the sake of brevity. The final task responsible for collecting and processing the data is shown below:

from giza_actions.task import task

@task(name="Load and process data")
def load_and_process():
    comp_df, min_dt = parse_compound_df(assets_to_keep)
    apy_df = parse_apy_df(assets_to_keep)
    tvl_df = parse_tvl_df(assets_to_keep, min_dt)
    price_df, vol_df, mcap_df = parse_mcap_df(assets_to_keep, min_dt)
    final_df = combine_dfs(comp_df, apy_df, tvl_df, vol_df, mcap_df, price_df)
    final_df = clean_final(final_df)
    return final_df

All the other processing code can be found in the repo containing this project.
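
To give a flavor of what these helpers do, below is a hypothetical cleaning step in polars; the actual implementations in the repo differ, and the function name is ours:

import polars as pl

def fill_and_drop_nulls(df: pl.DataFrame) -> pl.DataFrame:
    # Forward-fill nulls in every column, then drop any rows that are still
    # incomplete (e.g. leading rows before a given series has started).
    df = df.with_columns(pl.all().fill_null(strategy="forward"))
    return df.drop_nulls()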

Train and export the model

As you can see in the code below, we define our neural network using PyTorch and wrap the training process within a Giza task. We also create a task for exporting the trained model to the ONNX format.

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

# Define the neural network architecture
class SimpleNN(nn.Module):
    def __init__(self, input_size):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(input_size, 64)  # First hidden layer
        self.fc2 = nn.Linear(64, 32)  # Second hidden layer
        self.fc3 = nn.Linear(32, 1)  # Output layer

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x

@task(name="Get train and test sets")
def get_train_test(final_df):

    X_df = final_df.drop(["USDC_utilization_rate", "date"])
    Y_df = final_df.select(["USDC_utilization_rate"])

    X_pandas = X_df.to_pandas()
    Y_pandas = Y_df.to_pandas()

    # Split the data into training and testing sets based on the time order
    # Assuming 80% for training and 20% for testing as an example
    split_index = int(len(X_pandas) * 0.8)
    X_train, X_test = X_pandas[:split_index], X_pandas[split_index:]
    y_train, y_test = Y_pandas[:split_index], Y_pandas[split_index:]
    return X_train, X_test, y_train, y_test


@task(name="Train the model")
def train_model(model, X_train, y_train, X_test, y_test, epochs=100000, lr=0.001):
    X_train_tensor = torch.tensor(X_train.to_numpy().astype(np.float32))
    y_train_tensor = torch.tensor(y_train.to_numpy().astype(np.float32).reshape(-1, 1))
    X_test_tensor = torch.tensor(X_test.to_numpy().astype(np.float32))
    y_test_tensor = torch.tensor(y_test.to_numpy().astype(np.float32).reshape(-1, 1))

    # The model instance passed in from the calling action is trained in place
    # Loss function and optimizer
    criterion = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr=lr)

    # Training loop
    for epoch in range(epochs):
        optimizer.zero_grad()
        outputs = model(X_train_tensor)
        loss = criterion(outputs, y_train_tensor)
        loss.backward()
        optimizer.step()

        if epoch % 10000 == 0:  # Print loss every 10,000 epochs
            print(f"Epoch [{epoch+1}/{epochs}], Loss: {loss.item()}")

    model.eval()  # Set the model to evaluation mode
    with torch.no_grad():
        y_pred_tensor = model(X_test_tensor)
        test_loss = criterion(y_pred_tensor, y_test_tensor)
        test_rmse = torch.sqrt(test_loss)

    print(f"Model RMSE: {test_rmse.item()}")


@task(name="Export model to ONNX")
def export_to_onnx(model, X_train, onnx_model_path):
    sample_input = torch.randn(1, X_train.shape[1], dtype=torch.float32)

    # Export the model
    torch.onnx.export(
        model,  # Model being exported
        sample_input,  # Model input (or a tuple for multiple inputs)
        onnx_model_path,  # Where to save the model
        export_params=True,  # Store the trained parameter weights inside the model file
        opset_version=11,  # ONNX version to export the model to
        do_constant_folding=True,  # Whether to execute constant folding for optimization
        input_names=["input"],  # Model's input names
        output_names=["output"],  # Model's output names
        dynamic_axes={
            "input": {0: "batch_size"},  # Variable length axes
            "output": {0: "batch_size"},
        },
    )

    print(f"Model has been converted to ONNX and saved to {onnx_model_path}")

Create and Deploy a Giza Action

Finally, we are ready to combine everything into a giza action and deploy it.

from giza_actions.action import Action, action

@action(name=f"Execution", log_prints=True)
def execution():
    df = load_and_process()
    X_train, X_test, y_train, y_test = get_train_test(df)
    model = SimpleNN(input_size=X_train.shape[1])
    model = train_model(model, X_train, y_train, X_test, y_test)
    export_to_onnx(model, X_train, "ff_nn_compound_ur_prediction.onnx")


if __name__ == "__main__":
    action_deploy = Action(entrypoint=execution, name="compound_ur_prediction")
    action_deploy.serve(name="compound_ur_prediction")

Executing this code should create a new deployment in your workspace. You can access the dashboard from the URL provided in the output message. All of the steps discussed so far, including the deployment of the action, are contained in the train_and_deploy_action.py script within the giza-hub repo.

Transpile the ONNX model

Execute the following commands in your terminal from the root directory of your project to transpile and build the model.

giza transpile ff_nn_compound_ur_prediction.onnx --output-path transpiled_model
cd transpiled_model/inference
scarb build

It should generate an inference.sierra file in the transpiled_model/inference/target/dev/ directory. You will use this file in the deployment command.

Deploy a Giza Model

Execute the following in your terminal, from the transpiled_model/inference directory, with the correct model-id and version-id.

giza deployments deploy --model-id <YOUR_MODEL_ID> --version-id <YOUR_VERSION_ID> ./target/dev/inference.sierra

Run unverifiable inference

If you want to run an inference on your model without generating a ZK proof of the process, you can use the following code (from the unverifiable_inference.py file):

from giza_actions.action import Action, action
from giza_actions.task import task
from giza_actions.model import GizaModel
import numpy as np

onnx_model_path = "ff_nn_compound_ur_prediction.onnx"
in_x = np.load("X_test_sample.npy")
model_input_2d = in_x.reshape(1, -1)  # Reshape to 2D array with 1 row


@task(name="Unverifiable Prediction with ONNX")
def prediction(model_input):
    model = GizaModel(model_path=onnx_model_path)
    result = model.predict(
        input_feed={model.session.get_inputs()[0].name: model_input}, verifiable=False
    )
    return result


@action(name="Unverifiable Execution: Prediction with ONNX", log_prints=True)
def execution():
    predicted_val = prediction(model_input_2d)
    print(f"Predicted val: {predicted_val}")
    return predicted_val


execution()
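
Both inference scripts assume that a single row of test features has been saved to X_test_sample.npy. One way to produce such a file at the end of the training pipeline (a sketch, assuming access to the X_test set returned by get_train_test) is:

import numpy as np

# Save the first test row as a float32 feature vector for the inference scripts
np.save("X_test_sample.npy", X_test.to_numpy()[0].astype(np.float32))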

Run verifiable inference

To run an inference with a ZK proof generated, so that you can verify its correctness, run the verifiable_inference.py script, which contains the following code:

from giza_actions.action import Action, action
from giza_actions.task import task
from giza_actions.model import GizaModel
import numpy as np

in_x = np.load("X_test_sample.npy")
model_input_2d = in_x.reshape(1, -1)  # Reshape to 2D array with 1 row
model_id = <YOUR_MODEL_ID>
version_id = <YOUR_VERSION_ID>


@task(name="Verifiable Prediction with Cairo")
def prediction(model_input, model_id, version_id):
    # Initialize a GizaModel with model and version id.
    model = GizaModel(id=model_id, version=version_id)

    # Call the predict function with `verifiable` set to True so the inference
    # runs on the deployed Cairo model and a proof request is created.
    (result, request_id) = model.predict(
        input_feed={"model_input": model_input},
        verifiable=True,
    )
    return result, request_id


@action(name="Verifiable Execution: Prediction with Cairo", log_prints=True)
def execution():
    (result, request_id) = prediction(model_input_2d, model_id, version_id)
    return result, request_id


result, request_id = execution()
print(f"Result: {result}, Request ID: {request_id}")

Download the proof

Execute the following in your terminal to download the proof.

giza deployments download-proof --model-id <MODEL_ID> --version-id <VERSION_ID> --deployment-id <DEPLOYMENT_ID> --proof-id <PROOF_ID> --output-path <OUTPUT_PATH>

Verify the proof

giza verify --proof <PATH_OF_THE_PROOF>
