Query Complexity

By design, GraphQL allows clients to request exactly the data they need. However, this flexibility can be exploited to craft overly complex queries that strain server resources, leading to performance degradation or denial of service. To mitigate these risks, it is essential to enforce query complexity limits in your GraphQL router configuration.

This guide explains how to configure the GraphQL router to enforce query complexity limits to prevent abusive queries. For the complete configuration options, see limits in the configuration reference.

Protection against malicious complex queries

One of the main benefits of GraphQL is that data can be requested field by field. However, this also makes it possible for attackers to send operations with deeply nested selection sets that can block other requests from being processed. Infinite loops are impossible by design, since a fragment cannot reference itself, but that does not prevent attackers from sending selection sets that are hundreds of levels deep.

The following schema:

type Query {
  author(id: ID!): Author!
}

type Author {
  id: ID!
  posts: [Post!]!
}

type Post {
  id: ID!
  author: Author!
}

Would allow sending and executing queries such as:

query {
  author(id: 42) {
    posts {
      author {
        posts {
          author {
            posts {
              author {
                posts {
                  author {
                    posts {
                      author {
                        posts {
                          author {
                            id
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

There are a few ways to mitigate this risk, which are covered in this guide.

Reject operations based on size / token count

Parsing a GraphQL operation document is not cheap; it is an expensive, compute-intensive operation. If an attacker sends very complex operation documents with slight variations over and over again, they can easily degrade the performance of your GraphQL API server.

Unfortunately, because of the possibility of such variations, caching parsed documents is not a reliable way to mitigate this risk. Instead, you can limit the size of incoming operations.

A potential solution is to limit the maximum number of tokens in a GraphQL document.

In computer science, lexical analysis, lexing or tokenization is the process of converting a sequence of characters into a sequence of lexical tokens.

For example, take the following GraphQL operation:

query {
  me {
    id
    user
  }
}

The tokens are query, {, me, {, id, user, }, and }, which gives a total of 8 tokens.
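As a rough illustration of how such counting works, the following TypeScript sketch uses the Lexer exported by the graphql reference implementation; countTokens is a hypothetical helper introduced here, not a router API:

import { Lexer, Source, TokenKind } from 'graphql';

// Count the lexical tokens in a GraphQL document,
// excluding the synthetic <SOF> and <EOF> tokens.
function countTokens(document: string): number {
  const lexer = new Lexer(new Source(document));
  let count = 0;
  while (lexer.advance().kind !== TokenKind.EOF) {
    count += 1;
  }
  return count;
}

console.log(countTokens('query { me { id user } }')); // 8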

The optimal maximum token count for your application depends on the complexity of your GraphQL operations and documents. A limit of 800-2000 tokens is usually a sane default.

You can use tools like GraphQL Inspector to analyse and find the best defaults for your use cases.

On the router side, you can configure the maximum token count as shown below:

router.config.yaml
limits:
  max_tokens:
    n: 1000

In that example, any incoming GraphQL query that exceeds 1000 tokens will be rejected with an error.
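Conceptually, the check works like the following sketch, which reuses the hypothetical countTokens helper from above; the router's actual implementation and error shape will differ:

const MAX_TOKENS = 1000;

// Reject an operation before executing it if it exceeds the token budget.
// Sketch only; the real router returns its own error format.
function assertWithinTokenLimit(document: string): void {
  if (countTokens(document) > MAX_TOKENS) {
    throw new Error(`Operation exceeds the maximum of ${MAX_TOKENS} tokens.`);
  }
}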

Prevent deeply nested queries

If you build an API that is available to third-party users, it is recommended to limit the maximum depth of incoming GraphQL queries to prevent overly complex operations.

router.config.yaml
limits:
  max_depth:
    n: 10

In that example, any incoming GraphQL query that exceeds a depth of 10 will be rejected with an error.

query {
  user {
    posts {
      comments {
        text
      }
    }
  }
}

The above query has a depth of 4 (user -> posts -> comments -> text), so it would be accepted. If a query exceeded a depth of 10, it would be rejected.

This can prevent malicious API users from executing GraphQL operations with deeply nested selection sets. Tune the maximum depth an operation's selection set is allowed to have based on your schema and needs, as the right value can vary between use cases.
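To make the depth-counting convention used above concrete, here is a minimal TypeScript sketch built on the graphql package's parser. The selectionSetDepth and operationDepth helpers are hypothetical names introduced here, and fragments are ignored for brevity:

import { parse, Kind, SelectionSetNode } from 'graphql';

// Depth of a selection set: every field adds one level,
// so a leaf field has depth 1. Fragments are ignored for brevity.
function selectionSetDepth(set: SelectionSetNode): number {
  let max = 0;
  for (const selection of set.selections) {
    if (selection.kind === Kind.FIELD) {
      const depth = selection.selectionSet
        ? 1 + selectionSetDepth(selection.selectionSet)
        : 1;
      max = Math.max(max, depth);
    }
  }
  return max;
}

function operationDepth(document: string): number {
  let max = 0;
  for (const def of parse(document).definitions) {
    if (def.kind === Kind.OPERATION_DEFINITION) {
      max = Math.max(max, selectionSetDepth(def.selectionSet));
    }
  }
  return max;
}

console.log(operationDepth('query { user { posts { comments { text } } } }')); // 4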

Why both max_depth and max_tokens?

Both max_depth and max_tokens serve different purposes in protecting your GraphQL API from abusive queries.

  • max_depth specifically targets the structure of the query, preventing excessively nested selection sets that can lead to performance issues. This is particularly important in GraphQL, where deeply nested queries can be constructed even without a large number of tokens.
  • max_tokens, on the other hand, provides a broader safeguard by limiting the overall size of the query. This helps to prevent attacks that exploit the complexity of queries through a large number of fields, arguments, and other GraphQL constructs, regardless of their nesting level.

The following operation has 24 tokens and a depth of 5, so it would be rejected by both limits if, for example, max_depth were set to 4 and max_tokens to 15:

query {
  author(id: 1) {
    id
    posts {
      id
      author {
        id
        posts {
          id
        }
      }
    }
  }
}

You might therefore think max_depth alone is sufficient. However, consider the following operation, which has 20 tokens but a depth of only 2:

query {
  me {
    id
    name
    email
  }
  post(id: 1) {
    id
    title
    content
  }
}

This operation passes a max_depth of 2 but would be rejected by a max_tokens limit of 10.
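For reference, running the hypothetical countTokens and operationDepth helpers sketched earlier over both sample operations reproduces these numbers:

const nested =
  'query { author(id: 1) { id posts { id author { id posts { id } } } } }';
const flat =
  'query { me { id name email } post(id: 1) { id title content } }';

console.log(countTokens(nested), operationDepth(nested)); // 24 5
console.log(countTokens(flat), operationDepth(flat));     // 20 2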

By implementing both max_depth and max_tokens, you create a more robust defense against a wider range of potential query abuses, ensuring better performance and reliability for your GraphQL API.
