Black Sheep Code

Reasons to not use GraphQL

Published:

In this post I'm going to make the case against GraphQL, but first, let's make the case for it.

The case for GraphQL

GraphQL reduces loading waterfalls

A RESTful scenario (one I've been before!) that can be solved with GraphQL looks like this:

  • The user navigates to their todos page, and this requests GET /todos.
  • Once the todo's have been fetched, we inspect the todo.projectGroupID and we fetch the relevant project groups with GET /groups/{groupId}.
  • Once the individual project groups have been fetched, we inspect them to get the projectGroup.groupLeaderUserId and we fetch the leaders of each group with GET /users/{userId}.

This causes a serial loading waterfall, meaning that realistically the page might not ready for a full second.

GraphQL solves this problem by having that serial loading waterfall happen on the backend - hopefully right next to the database.

The serial loading waterfall still exists, it'll just occur faster.

Counterpoint: This loading waterfall can also be solved with server side rendering.

GraphQL allows consumers to quickly build out bespoke functionality by writing their own GraphQL queries

The idea is, once the GraphQL API exists, consumers of it don't need to wait for the API maintainers to create new endpoints, they can just write the single graphql query that has all the data they need.

This is actually a dishonest characterisation of the advantage - the GraphQL API still needs to implemented such that all the data required is available. The consumer could similarly write a series of RESTful requests that provides the data they will need.

That said, I'll admit that writing a series of joining logics is a lot easier with GraphQL, particularly with Apollo's GraphQL sandbox.

GraphQL abstracts the details of a multitude of upstreams

Enterprises appear to be attracted to GraphQL because they have a situation where they have a multitude of upstream services, each with slightly different conventions, and so a backend-for-frontend can serve as a mechanism for hiding the uglyness and providing a coherent experience to ultimate consumer of the APIs.

While a BFF can be helpful here, it doesn't need to be a GraphQL one. A RESTful one might suffice.

The case against GraphQL

You're losing a wealth of existing tooling developed for REST APIs

In the desire to reduce serial waterfalls you're throwing the baby out with the bath water by adopting GraphQL - there's a lot of tooling you get with a REST API that you'll lose.

HTTP Semantics

The HTTP verbs - GET, POST, PUT, PATCH, DELETE and HTTP status codes, 200, 201, 400, 401, 403, 404, 500 are the ones I most commonly use, entail rich semantic meaning. While technically I don't believe any of them have any actual behavioural difference, there's entail well established semantic meaning, meaning at a glance we can see what is happening by looking at the HTTP method and the response status code.

GraphQL reduces everything to a single method and a single URL, and collapses many of the of the errors into a 200 response.

HTTP Caching Semantics

GraphQL funnels all requests through usually a POST /graphql endpoint.

This means conventional HTTP caching semantics do not work, which rely on caching on a per URL basis.

Instead, caching needs to be achieved with implementation specific solutions - such as Apollo's cache directives.

HTTP Cache headers on the other hand, are old hat in the internet, and work automatically in a variety of tools, include the browser, cache tools like Varnish and various CDNs.

In order to take advantage of HTTP caching semantics, the server just needs to add the cache headers and the variety of tools will just work with them.

GraphQL on the other hand is not nearly so universally supported, and it's likely that you'll need to configure implementation specific mechanisms where ever you need the cache configuration to be respected.

OpenAPI based tooling

There's a wealth of OpenAPI tooling, from code generation to documentation to mock generation.

Granted there are GraphQL tools that also help with these things - the ecosystem for OpenAPI is much richer.

Observability tooling

Traditional observability tools rely heavily on the API being RESTful - they're using URLs and headers to group the requests.

Observability tools commonly don't inspect request and response bodies that's Personally Identifiable Information.

GraphQL though - puts what would otherwise be query parameters into the request body - meaning that some observability tools may lose track of it.

Granted that many observability tools do now offer GraphQL compatability, these solutions will be less battle tested than their REST counterparts.

Bitty GraphQL issues

Lack of type literal support

In TypeScript we could define an object like this:

type User = {
  userType: "student", 
  classesEnrolledIn: string[]
}  | {
  userType: "teacher", 
  classesTeaching: string[], 
  qualifications: string[]  
}

This is what's known as a discriminating union.

We can then discriminate based on the userType property:

function processUser(user: User) {
  if(user.userType === "student") {
    // definitely exists
    user.classesEnrolledIn

    // definitely does not exist
    user.qualifications
  }
}

An OpenAPI spec could define a similar shape like:

OpenAPI yaml
# Example OpenAPI 3.0 spec for the discriminated union User type
openapi: 3.0.0
info:
  title: User API
  version: 1.0.0
paths:
  /users/{userId}:
    get:
      summary: Get a user by ID
      parameters:
        - name: userId
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: A user object (student or teacher)
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/User'
components:
  schemas:
    User:
      oneOf:
        - $ref: '#/components/schemas/StudentUser'
        - $ref: '#/components/schemas/TeacherUser'
      discriminator:
        propertyName: userType
        mapping:
          student: '#/components/schemas/StudentUser'
          teacher: '#/components/schemas/TeacherUser'
    StudentUser:
      type: object
      required: [userType, classesEnrolledIn]
      properties:
        userType:
          type: string
          enum: [student]
        classesEnrolledIn:
          type: array
          items:
            type: string
    TeacherUser:
      type: object
      required: [userType, classesTeaching, qualifications]
      properties:
        userType:
          type: string
          enum: [teacher]
        classesTeaching:
          type: array
          items:
            type: string
        qualifications:
          type: array
          items:
            type: string

GraphQL does not support literal types in this same way.

In GraphQL we might define a schema like

enum UserType {
  STUDENT
  TEACHER
}

type StudentUser {
  userType: UserType!
  classesEnrolledIn: [String!]!
}

type TeacherUser {
  userType: UserType!
  classesTeaching: [String!]!
  qualifications: [String!]!
}

union User = StudentUser | TeacherUser

But this will lack the ability to discriminate based on the userType property.

Clients like Apollo have the __typename which can be used to discriminate, but of course that requires you to use Apollo.

Ambiguity on nullability of input fields

Let's say we have a UserProfile object:

type UserProfile = {
  name: string;
  address: string; 
  favouriteColor: string | null;  
}

In a RESTful API we could use a PATCH operation. to just update a single field.

ie. we might do some code like:


//update the name
updateUserProfile(userId, {name: "Bob"})

// Update the name and address
updateUserProfile(userId, {name: "Bob", address: "123 Bob Lane"})

// Update the favouriteColor
updateUserProfile(userId, {favouriteColor: "green"})

// Unset the favouriteColor
updateUserProfile(userId, {favouriteColor: null})

In a GraphQL API there is ambiguity between omitting a field, and explicitly setting it as null.

ie. We could achieve the same functionality with the following GraphQL schema:

type UserProfile {
  name: String!
  address: String!
  favouriteColor: String
}

input PatchUserProfileInput {
  name: String
  address: String
  favouriteColor: String
}

type Mutation {
  patchUserProfile(userProfile: PatchUserProfileInput!): UserProfile
}

but in this case note that we could submit a payload:

{
  "address": "123 Bob Lane",
  "name": null
}

and this might mean the same as

{
  "address": "123 Bob Lane",
}

We don't have a clear way to distingush between ommiting the property, and setting it to null.

You will now need to deal with problems that are specific to GraphQL

The n+1 problem

The n+1 problem is an issue with GraphQL implementations where, in a naive implementation GraphQL can request exact same resource from an upstream, multiple times, for the same query. |

For example, if we have a list of todos, and we're fetching the projectGroups, many of the todos may have the same projectGroupId, and without adding a mitigation GraphQL ends up making the same request multiple times.

In fairness, this is not entirely unique to GraphQL. A RESTful BFF that attempted to reduce the serial loading waterfalls could encounter the same problem.

Malicious query construction

With GraphQL it's possible for an attacker to create malicious queries that will cause your GraphQL server to grind to a halt.

Such queries don't require resources by the attacker, like a conventional DOS attack would - the idea is the attacker can send a single query that causes a lot of work for the GraphQL server and its upstreams to then resolve.

One solution for this is persisted queries.

Persisted queries are actually an optimisation to reduce bandwidth, by reducing the GraphQL query to just a content hash. But this can be used mitigate malicious queries as well, by allowing only a whitelist of query hashes to reach the GraphQL server.

But note that this negates one of the core advantages of using GraphQL - the ability for consumers to build custom queries on the fly!

By introducing query whitelisting any consumer now has a dependency on the team maintaining the query whitelist before their code is production ready.

Conclusions

If the reason you are considering GraphQL is because you've seen the conference presentation and you think it's going to be all smooth sailing - it's not.

Consider what you'll be losing.

If you have a lot of chained/dependent queries, then sure, GraphQL might be good for you.

But if your application is a fairly straight forward CRUD style app - consider whether you're actually going to be making use of the features of GraphQL.



Questions? Comments? Criticisms? Get in the comments! 👇

Spotted an error? Edit this page with Github