Pentesting GraphQL 101 Part 1 - Discovery


Recent statistics suggest that you have probably queried at least one GraphQL endpoint today. For me, as a penetration tester, that is a matter of concern, especially since high-quality pentesting guides and articles are scarce online, which suggests that GraphQL security is still immature.

So I decided to start this series of articles, Pentesting GraphQL 101, where I try to share my GraphQL security experience from an attacker's/pentester's point of view; each piece will handle a, dare I say, chronological step in the pentesting process.

Today's article will handle the first essential step of pentesting: Discovery.

What is Discovery?

A good colleague once said: "The discovery step starts but never ends." And he is right; when pentesting, you start by asking the "what am I dealing with?" questions, and answers to them keep popping up from the first interaction with a server until you write your report. Nevertheless, the initial, conscious discovery step is still essential, and it includes:

  • Understanding the limits enforced on the endpoint
  • Determining the verbosity of the endpoint
  • Fetching all the information possible about the architecture

Resources that you'll need

Fortunately for us, Escape Technologies made a well-maintained awesome-graphql-security list of all the GraphQL tools we'll need for the Pentesting GraphQL 101 series.

For now, you'll need a GraphQL IDE to interact with a GraphQL endpoint in the easiest way possible; my personal favorite is Altair.

Sending the first query

I've heard a lot of people claim that introspection is the first query to send to any GraphQL endpoint, since the introspection query lets you fetch the resources available in the current API schema. I can't entirely agree. Without denying the importance of the introspection query, which I'll handle later in this article, I argue that the first query to send to an endpoint is this little beauty:

query{
    __typename
}

So open your IDE, enter your GraphQL endpoint URL, and send that query.

We asked the server to return the name of the root operation type, which is usually "Query". That is the most basic operation a GraphQL endpoint can perform; nevertheless, it is still a query, which is the GraphQL method of fetching data.

Before continuing, let us do a little "where to look" exercise. As a pentester, you should pay attention to several elements whenever you send a query.
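If you prefer a script to an IDE, the same first query can be sent as a plain JSON POST. The following is a minimal sketch using only the Python standard library; the endpoint URL and the helper name `build_post_request` are placeholders of my own, and the actual sending is left commented out so nothing touches the network by accident.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with a target you are authorized to test.
ENDPOINT = "https://example.com/graphql"


def build_post_request(endpoint: str, query: str) -> urllib.request.Request:
    """Wrap a GraphQL document in the usual JSON-over-POST envelope."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_post_request(ENDPOINT, "query { __typename }")

# Sending is commented out so the sketch has no network side effects:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.load(response))
```

Keeping the request building separate from the sending makes it easy to reuse the helper for every other query in this article.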

  • The HTTP method

The method can also be found through the network tab in the browser's developer tools. But why does it matter? Well, earlier it was mentioned that queries are what GraphQL uses to return data; to modify data, GraphQL uses what are known as "mutations".

The most basic mutation to send is, you guessed it:

mutation{
    __typename
}

And the ability to send mutations, or data-changing requests, through GET requests exposes the endpoint to a vulnerability called CSRF (cross-site request forgery), which is explained splendidly in this article.
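To check this by hand, you can build a GET URL that carries the mutation in a `query` URL parameter, which is the common convention for GraphQL over GET. The sketch below only builds the URL; the endpoint and the helper name `build_get_url` are hypothetical. If the server executes the mutation instead of rejecting it, that is worth noting.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_get_url(endpoint: str, document: str) -> str:
    """Encode a GraphQL document into the `query` URL parameter."""
    return f"{endpoint}?{urlencode({'query': document})}"


# Hypothetical endpoint: replace with a target you are authorized to test.
probe_url = build_get_url("https://example.com/graphql", "mutation { __typename }")

# Decoding the URL again shows exactly what the server will receive.
decoded = parse_qs(urlsplit(probe_url).query)
```

Paste `probe_url` into a browser or any HTTP client and compare the response with the one you got over POST.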

  • The time taken by the query

The habit of looking at the query time is essential for getting a feeling of whether a query was executed, as well as of the computational cost of a sent query. Typically, queries that fail with minor errors take less time than queries that execute and cause a particular operation to occur, while queries that take a long time are likely costly queries that require a lot of computation.
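When scripting, timing can be wrapped around whatever function sends the request. This is a small sketch under my own assumptions: `timed` is a helper I made up, and `fake_send` stands in for a real HTTP call so the example runs offline.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(send: Callable[[], T]) -> Tuple[T, float]:
    """Run `send` and return its result with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = send()
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_send() -> dict:
    """Stand-in for a real request; sleeps briefly to simulate server work."""
    time.sleep(0.05)
    return {"data": {"__typename": "Query"}}


response, seconds = timed(fake_send)
```

A single measurement is noisy, so repeating each query a few times and comparing medians gives a more reliable picture.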

Determining Some Limits

  • Aliasing

GraphQL has a feature called aliasing, which I can describe simply as batching many queries (unique or duplicated) into one request, with a title (the alias) given to each query.

The request/response pair would look like this:

Now start increasing the number of aliases until you get an error, without forgetting to keep checking the request time. You'll end up understanding the aliasing limit for this endpoint and how an increasing number of aliased queries affects processing.
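Writing hundreds of aliases by hand gets old quickly, so here is a minimal sketch of a generator. The function name and the `a0`, `a1`, ... alias naming scheme are my own choices; any valid GraphQL names would do.

```python
def build_aliased_query(field: str, count: int) -> str:
    """Repeat `field` `count` times in one query, each under a unique alias."""
    aliases = " ".join(f"a{i}: {field}" for i in range(count))
    return f"query {{ {aliases} }}"


small = build_aliased_query("__typename", 3)
# small == "query { a0: __typename a1: __typename a2: __typename }"

# A ladder of sizes to try one after another while watching errors and timing.
ladder = [build_aliased_query("__typename", n) for n in (10, 100, 1000)]
```

Send each entry of `ladder` in turn and note the first size at which the server errors out or slows down noticeably.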

  • Character Limit

Let us recap the last part: the idea of adding a title to a query. Well, let's make this title big, like very big, and check when the server decides to block the request. This is the easiest way of generating a valid query with a huge number of characters.
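The oversized title can be generated the same way. In this sketch the alias is just the letter `a` repeated; the function name and the list of sizes are assumptions for illustration.

```python
def build_long_alias_query(length: int) -> str:
    """Build a valid query whose single alias is `length` characters long."""
    alias = "a" * length
    return f"query {{ {alias}: __typename }}"


# Grow the query step by step; the fixed part of the document adds 22 characters.
sizes = [1_000, 10_000, 100_000]
queries = [build_long_alias_query(size) for size in sizes]
```

The first size that gets rejected gives you a rough upper bound, which you can then narrow down with a binary search between the last accepted and first rejected lengths.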

Determining Verbosity

Verbosity is how much information is given back when an error occurs; this is directly linked to the next section, in which we use errors to find out more information about the internal architecture.

To check for it, we need an error from the GraphQL endpoint. The simplest way to trigger an error is by sending a non-existent query (get creative).

In the preceding example, we notice a stacktrace returned with the error message, which is very helpful in many situations.
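Spotting a stack trace can be automated once you have the raw response. The sketch below is built on assumptions: GraphQL responses carry an `errors` list, but where a server puts the stack trace varies, so the `extensions.stacktrace` and `extensions.exception.stacktrace` locations checked here are guesses to adapt to the engine in front of you. The sample response is made up for illustration.

```python
import json
from typing import Any, Dict, List


def find_verbose_errors(response_text: str) -> List[Dict[str, Any]]:
    """Return the errors that leak a stack trace in their extensions."""
    payload = json.loads(response_text)
    findings = []
    for error in payload.get("errors", []):
        extensions = error.get("extensions") or {}
        exception = extensions.get("exception")
        if not isinstance(exception, dict):
            exception = {}
        # Assumed locations; adjust for the engine you are testing.
        stacktrace = extensions.get("stacktrace") or exception.get("stacktrace")
        if stacktrace:
            findings.append({"message": error.get("message"), "stacktrace": stacktrace})
    return findings


# Made-up response, shaped like a verbose server answer to a bogus field.
sample = json.dumps({
    "errors": [{
        "message": 'Cannot query field "doesNotExist" on type "Query".',
        "extensions": {"stacktrace": ["GraphQLError: ...", "    at validate (...)"]},
    }]
})
leaks = find_verbose_errors(sample)
```

An empty result does not prove the endpoint is quiet; it only means the trace is not in the places this sketch looks.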

Fetching The Architecture

It's now time for introspection. To fetch the schema, simply send the introspection query to the endpoint that is being pentested, and voila, you have all the resources available in the current API schema.

To visualize them, copy the introspection query result into any of the GraphQL visualizers on the awesome-graphql-security list. I'm using Voyager. You can now visualize all the objects, queries, and mutations the GraphQL endpoint offers.
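For reference, here is a trimmed-down introspection query wrapped in a short Python sketch. It is not the full query that IDEs send; it only asks for type names, kinds, and field names, which is enough for a first look. The helper `list_type_names` and the sample result are my own, made up to show the response shape.

```python
import json
from typing import Any, Dict, List

# A reduced introspection query: root types, plus every type with its fields.
INTROSPECTION_QUERY = """
query IntrospectionSketch {
  __schema {
    queryType { name }
    mutationType { name }
    types {
      kind
      name
      fields {
        name
        type { kind name ofType { kind name } }
      }
    }
  }
}
"""

payload = json.dumps({"query": INTROSPECTION_QUERY})


def list_type_names(result: Dict[str, Any]) -> List[str]:
    """List schema type names, skipping built-in introspection types."""
    types = result["data"]["__schema"]["types"]
    return [t["name"] for t in types if not t["name"].startswith("__")]


# Made-up result, only to illustrate the shape of a response.
sample_result = {"data": {"__schema": {"types": [
    {"kind": "OBJECT", "name": "Query", "fields": []},
    {"kind": "OBJECT", "name": "User", "fields": []},
    {"kind": "OBJECT", "name": "__Schema", "fields": []},
]}}}
names = list_type_names(sample_result)
```

Visualizers typically expect the result of the full introspection query, so use the one your IDE or client library generates when you want to feed a tool.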

Example:

Every individual box represents an object, and the connections (arrows) point to objects that are a field (or a part) of another object.

But if you have read any "GraphQL security best practices" article (I recommend this one), you probably know that endpoints usually disable introspection, so you cannot fetch information about the available resources this way.

Let's Get Tricky:

Many GraphQL engines provide a minor feature: they suggest a valid name when the user sends a wrong query or field name.

We can send a list of random words, wait for the "did you mean" messages, and slowly but surely uncover a considerable chunk of the schema.

The awesome-graphql-security list includes a tool called Clairvoyance that does this so well that it recovers a considerable chunk of the schema and outputs it in a valid introspection format.
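To get a feel for the suggestion trick before reaching for a tool, here is a minimal sketch that pulls suggested names out of an error message. The message wording is an assumption modeled on what JavaScript-based engines tend to return; other engines phrase it differently, so the parsing may need adjusting. This is not how Clairvoyance itself is implemented.

```python
import re
from typing import List


def extract_suggestions(message: str) -> List[str]:
    """Return the quoted names that follow a "did you mean" hint, if any."""
    index = message.lower().find("did you mean")
    if index == -1:
        return []
    # Only look at quoted names after the hint, not the misspelled field itself.
    return re.findall(r'"([^"]+)"', message[index:])


# Assumed message format, for illustration only.
message = 'Cannot query field "usr" on type "Query". Did you mean "user" or "users"?'
suggestions = extract_suggestions(message)
# suggestions == ["user", "users"]
```

Feed every new suggestion back in as the next guess and the known part of the schema grows with each round.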

Conclusion

You never finish the discovery phase, and it is arguably the basis of the whole pentesting process, but to sum up, remember the following three points:


☑️ Keep an eye on the time the queries take
☑️ Know the limiting factors of the endpoint
☑️ Fetch the introspection and visualize it