Risk Scores

Yearn uses risk scores to quantify and assess the risk of each strategy and vault. This document outlines how we measure risk vectors and use them to find the optimal balance of security and innovation.

Strategy Risk Score: defines each dimension of risk for a strategy and how we quantify it
Vault Risk Score Proposal: aggregates all strategy scores for a vault, averaging by TVL (this is in draft stage)
Overall Risk Score Proposal: aggregates strategy/vault scores into overall scores (this is in draft stage)

Strategy Risk Score

Risk for different strategies is quantified using a 1-5 point system developed as part of Yearn's strategy deployment process. The higher the risk score, the riskier the strategy. The risk assessment evaluates eight dimensions:

Audit, Code Review, Complexity, Longevity, Protocol Safety, Team Knowledge, Testing Score, and TVL Impact.

This risk framework is an ongoing process to ensure the security of Yearn strategies. Yearn recognized that, due to its unique approach to deploying strategies, it could not rely on a traditional waterfall process (heavy analysis and design, testing, multiple audits before release, etc.) to deploy contracts. Strategies are deployed and capped by their risk score. As we reduce the risk in any of the eight dimensions, the strategy can grow its TVL. This system allows Yearn to compare the risk score of two strategies and prioritize risk mitigation and preventive actions, such as forming a committee to spread knowledge on the code, getting more audits, migrating current code to improved versions of the strategy, etc.

The current version of the risk score system works for Yearn's current needs, but we are always looking to improve and expand it to the vaults. We want to provide our users with a better understanding of what is happening behind the scenes in the vaults. The development of vault risk scoring is still in progress!

Audit

Auditing is a process in which an audit firm or external security researcher reviews code for potential vulnerabilities and presents a report for mitigation. Audits take longer than an internal security review and are not immediately available due to the high demand for audits in the space. As such, most strategies are sent to production without an audit, which leads to a high risk score and in turn limits their Total Value Locked (TVL). This approach allows strategies to be validated in production while still managing risk, and the risk score helps determine which strategies should get an audit first, based on impact and other scoring criteria. The table below outlines the scoring criteria associated with audits.

Score	Audit
5	No audit by a trusted firm or security researcher.
4	Audit by trusted firm or security researcher conducted more than 6 months ago.
3	Audit by trusted firm or security researcher conducted more than 3 months ago.
2	Audit conducted less than 3 months ago by an independent trusted firm.
1	Audit conducted less than 3 months ago by 3 or more independent trusted firms.
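
As a rough illustration, the audit table above could be encoded as a small lookup function. This is only a sketch: the function name `audit_score`, its parameters, and the handling of boundary cases (an audit exactly 3 months old, or a count of zero trusted firms) are assumptions, since the table does not spell them out.

```python
from typing import Optional


def audit_score(months_since_audit: Optional[float], independent_firms: int = 0) -> int:
    """Return the 1-5 audit score; pass None when no trusted audit exists."""
    # No audit by a trusted firm or security researcher.
    if months_since_audit is None or independent_firms < 1:
        return 5
    # Most recent audit is more than 6 months old.
    if months_since_audit > 6:
        return 4
    # Assumption: an audit exactly 3 months old falls in the "more than 3 months" tier.
    if months_since_audit >= 3:
        return 3
    # Recent audit: 3 or more independent firms earn the best score.
    return 1 if independent_firms >= 3 else 2


print(audit_score(None))   # no audit
print(audit_score(1, 3))   # recent audit by three firms
```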

Code Review

This is the process that reviews strategy code going to production. It is done in two major phases:

Phase 1: Two internal peers (strategists) review the strategy for any potential issues regarding handling accounts, profits, losses, etc. After this phase is completed, the strategy can go to ape.tax for live testing and validation.

Phase 2: An internal security reviewer from Yearn reviews the code, focusing on security concerns. Once Phase 2 is completed, the strategy gets a risk score in all dimensions, which is usually deemed enough for it to go to production with limited TVL based on scoring.

After these steps, a recurring review is scheduled, in which a second security reviewer, either internal or external, takes another look at the code:

Score	Code Review
5	0-1 reviewers of the code, or the most recent review was done 6+ months ago
4	2 reviewers of the code, the most recent of which was done 3+ months ago
3	3 reviewers of the code, the most recent of which was done 3+ months ago (1 of the reviewers is an internal security dev)
2	4 reviewers of the code (2 peers and 2 internal security devs)
1	5 reviewers of the code (2 strategist peers, 2 security reviewers, and either external protocol devs or external security researchers)

Complexity

This dimension reflects how the strategy earns its returns: is it a simple strategy like MasterChef staking, or does it have complex mechanics such as leverage, risk of liquidation, and involvement with multiple protocols? The more components a strategy needs, the higher its complexity score. This score is essential in an emergency to evaluate how difficult it is to mitigate a live issue:

Score	Complexity
5	Strategy is highly complex, uses leverage or debt, and is not easy to unwind. No health check available
4	Uses leverage or debt, and is not easy to unwind. No health check available
3	Has potential for losses, withdrawal fees, or requires detailed queue management to prevent losses. No health check available
2	Strategy is relatively simple, and is easy to migrate/unwind. Has a health check
1	Strategy is easy to understand, and can be migrated/unwound easily. No leverage and no publicly accessible methods. Highly unlikely to incur losses.

Longevity

How long the strategy has been running live on yearn.finance:

Score	Longevity
5	Code is new and did not go to ape.tax before going live on yearn.finance
4	Code has been running for less than one month
3	Code has been running between 1-4 months
2	Code has been running for 4+ months
1	Code has been running for 8+ months with no critical issues and no changes in code base

Protocol Safety

Protocol Safety evaluates the resilience of the protocol the strategy uses. It takes into account the safety measures in place given current DeFi security standards, based on our internal assessments and due diligence compared to the top projects in DeFi. This includes multisig health, decentralization, bounty programs, audits, etc.

We hope to improve this dimension with the help of the DeFi community, potentially replacing our current scoring table with a standard scoring system that is widely accepted in the ecosystem:

Score	Protocol Safety
5	No due diligence (DD) document for this strategy. The protocol contracts used are very recent and not audited/verified. An EOA (externally owned account) owns the contracts and can upgrade them.
4	DD took place. Protocol contracts audited/verified. A multisig is required or contracts are upgradable. Multisig has a low threshold of signers. No bounty program.
3	DD took place. Protocol contracts are audited/verified by at least one reputable audit firm. A multisig with an appropriate threshold is required and/or contracts are immutable. Has a good bounty program.
2	DD took place. Protocol contracts are audited/verified by at least two reputable audit firms. A multisig with an appropriate threshold is required and/or contracts are immutable. Has a good bounty program.
1	Protocols involved in contracts are trusted blue chip protocols with a good track record of security. For example: Maker, Uniswap, Curve, AAVE, and Compound. These protocols meet all the criteria specified in item 2 and more.

Team Knowledge

Measures the amount of expertise on a strategy that is shared amongst Yearn contributors. How many contributors can manage the strategy and respond in an emergency? The fewer people who can manage and respond during an emergency, the riskier the strategy is assessed to be in this dimension:

Score	Team Knowledge
5	Only 1 person on the team has in-depth knowledge of the strategy code.
4	1 strategist has in-depth knowledge, and 1 strategist is somewhat familiar with the strategy code.
3	2 strategists have in-depth knowledge of the strategy code.
2	2 strategists have in-depth knowledge, and 1 strategist is somewhat familiar with the strategy code.
1	A team of 3+ strategists are very familiar with the strategy code and the protocol the strategy is utilising.

Testing Score

Testing score is a metric of how much of the codebase for the strategy has been tested. It uses the test coverage number as a reference: higher coverage means the developer/strategist took time to test most of the operations of the strategy in a unit test or fork environment. This score assumes that a less tested strategy entails more risk, since we know less about what to expect from the code:

Score	Testing Score
5	Less than 20% coverage in testing
4	Less than 40% coverage in testing
3	40% to 80% coverage
2	Over 80% coverage
1	Over 90% coverage in testing. A second developer validated the existing tests and added new ones for uncovered cases while reviewing. You can pull the repository and the tests are currently passing

TVL Impact

The TVL (total value locked) metric guides how much to allocate to new, riskier strategies so that a hack or issue does not turn into a catastrophic event. The lower the impact, the more likely it is that Yearn’s treasury can recover from an incident. TVL is measured in USD and changes dynamically based on strategy allocations onchain. Through yearn.watch, we keep track of the TVL and risk score to make fund allocation decisions and apply mitigations if a strategy group has fallen into the “red” high-risk zone:

Score	TVL Impact
5	Extreme: greater than USD 100 MM
4	Very high: less than USD 100 MM
3	High: less than USD 50 MM
2	Medium: less than USD 10 MM
1	Low: less than USD 1 MM
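
For illustration, the TVL tiers above translate into a threshold lookup. The function name `tvl_impact_score` is hypothetical, and treating a TVL of exactly USD 100 MM as "Extreme" is an assumption, since the table only covers values below and above that figure.

```python
def tvl_impact_score(tvl_usd: float) -> int:
    """Map a TVL in USD to the 1-5 TVL Impact score."""
    if tvl_usd < 0:
        raise ValueError("TVL cannot be negative")
    # (exclusive upper bound in USD, score), checked from the lowest tier up.
    tiers = [
        (1_000_000, 1),    # Low
        (10_000_000, 2),   # Medium
        (50_000_000, 3),   # High
        (100_000_000, 4),  # Very high
    ]
    for upper_bound, score in tiers:
        if tvl_usd < upper_bound:
            return score
    # Assumption: exactly USD 100 MM counts as "Extreme".
    return 5


print(tvl_impact_score(5_000_000))    # Medium tier
print(tvl_impact_score(250_000_000))  # Extreme tier
```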

Vault Risk Score Proposal

A vault is a contract that holds funds for up to 20 strategies. The vault risk score is the TVL-weighted average of the risk scores of its active strategies. For example:

Strategy X has $5000 in funds deposited
Strategy Y has $1000 in funds deposited

This vault's risk score would be calculated as follows:

Vault risk score = ((Strategy X risk) * 5000 + (Strategy Y risk) * 1000) ÷ 6000
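
The same TVL-weighted average can be sketched in Python. The function name `vault_risk_score` and the risk scores used for Strategy X and Strategy Y (2 and 4) are hypothetical values chosen only to make the arithmetic concrete; the deposits match the example above.

```python
from typing import List, Tuple


def vault_risk_score(strategies: List[Tuple[float, float]]) -> float:
    """TVL-weighted average of (risk_score, tvl_usd) pairs for active strategies."""
    total_tvl = sum(tvl for _, tvl in strategies)
    if total_tvl <= 0:
        raise ValueError("vault has no funds deposited in active strategies")
    return sum(risk * tvl for risk, tvl in strategies) / total_tvl


# Hypothetical scores: Strategy X has risk 2, Strategy Y has risk 4.
score = vault_risk_score([(2.0, 5000), (4.0, 1000)])
print(f"{score:.2f}")  # (2 * 5000 + 4 * 1000) / 6000 -> 2.33
```

Because Strategy X holds most of the funds, the vault score sits closer to X's score than to Y's.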

Overall Risk Score Proposal

Some risks are more relevant to a project than others. Before calculating the overall score, we first define the weights for the context in which we want to apply the framework, and then take a weighted average across all risk dimensions for each risk profile:

Risk Profile = Weighted table of which risk dimension is more important given the current context
Risk Score = Weighted average of all 8 dimensions using the risk profile weights
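
A minimal sketch of the two definitions above, assuming a risk profile is a mapping from dimension name to weight. The dimension keys, the function name `profile_score`, and all scores and weights in the example are hypothetical; the proposal does not prescribe a data format.

```python
from typing import Dict

DIMENSIONS = (
    "audit", "code_review", "complexity", "longevity",
    "protocol_safety", "team_knowledge", "testing", "tvl_impact",
)


def profile_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of the 8 dimension scores under one risk profile."""
    missing = [d for d in DIMENSIONS if d not in scores or d not in weights]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    total_weight = sum(weights[d] for d in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[d] * weights[d] for d in DIMENSIONS) / total_weight


# Hypothetical strategy scores, one per dimension.
scores = {
    "audit": 5, "code_review": 3, "complexity": 2, "longevity": 2,
    "protocol_safety": 1, "team_knowledge": 3, "testing": 2, "tvl_impact": 2,
}
# Two hypothetical profiles: equal weights, and one that doubles the audit weight.
equal_profile = {d: 1.0 for d in DIMENSIONS}
audit_heavy_profile = {**equal_profile, "audit": 2.0}

print(profile_score(scores, equal_profile))        # 20 / 8 = 2.5
print(profile_score(scores, audit_heavy_profile))  # 25 / 9
```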

A project may have many risk profiles. A score is calculated for each profile, and the resulting list of profile scores is then summarized using its median to reach the final result.

The project's overall risk score is presented as 3 values:

high: profile score for a risk-averse user
low: profile score for a risk-seeking user
median: profile score for a median representative user

Each of these values is derived from the median of the final list:

high: median + 1.5 IQR
low: median - 1.5 IQR
median: the median value from the distribution

Where IQR stands for the interquartile range of the distribution

Here is what the final result looks like:

{
  'overallScore': {
    'high': 3.37675585284281,
    'low': 2.5463210702341135,
    'median': 2.9615384615384617,
  },
}
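
The high, low, and median values can be computed from a list of profile scores as sketched below. The function name `overall_score` and the input scores are hypothetical. The proposal does not say which quartile convention is used, so the `inclusive` method of Python's `statistics.quantiles` is an assumption, and other conventions give slightly different IQR values.

```python
from statistics import median, quantiles
from typing import Dict, List


def overall_score(profile_scores: List[float]) -> Dict[str, Dict[str, float]]:
    """Summarize per-profile scores as median and median +/- 1.5 IQR."""
    if len(profile_scores) < 2:
        raise ValueError("need at least two profile scores to compute an IQR")
    # Assumption: quartiles use the "inclusive" interpolation method.
    q1, _, q3 = quantiles(profile_scores, n=4, method="inclusive")
    iqr = q3 - q1
    mid = median(profile_scores)
    return {
        "overallScore": {
            "high": mid + 1.5 * iqr,  # risk-averse user
            "low": mid - 1.5 * iqr,   # risk-seeking user
            "median": mid,            # median representative user
        }
    }


# Hypothetical profile scores for one project.
result = overall_score([2.0, 2.5, 3.0, 3.5, 4.0])
print(result)
```

Note that nothing in this sketch clamps the result, so with widely spread profile scores the high and low values can fall outside the 1-5 range.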