Welcome back for another instalment of the SubQuery Mastery Series, where we help you master the SubQuery Network and ensure your data needs are satisfied! In this article, we’ll focus on optimising your indexer so you can take full advantage of the indexing performance we’ve worked hard to build into SubQuery.

Performance is a crucial factor in quickly accessing the on-chain data you need, both in terms of initial indexing sync times and the indexer’s latency from the moment events are finalised on chain. So, how can you optimise your SubQuery project to speed it up? Fortunately, there are several things you can do to improve both indexing and query speed.

Read on for our hottest tips covering the most common issues, advice on reviewing your indexer’s overall architecture, and more.

Common Issues and Top Suggestions

Avoid Using Block Handlers Where Possible

blockHandlers can slow down your project as they are executed on every block. Use them only when absolutely necessary, and consider adjusting the project architecture to reduce reliance on them.

  • Optimise block handlers: if a block handler is necessary, ensure that all code paths it calls are thoroughly optimised. Since block handlers run on every block, the time they take grows linearly with the chain’s length. To run them more efficiently, use a modulo filter, which executes the handler only at a set interval (e.g., every 50 blocks). This helps group and calculate data more efficiently while cutting out unnecessary executions; a minimal manifest sketch is shown below.
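To illustrate, here’s a minimal sketch of what this looks like in a Substrate project manifest (project.ts); the handler name and interval are illustrative:

```typescript
import { SubstrateDatasourceKind, SubstrateHandlerKind, SubstrateProject } from "@subql/types";

// Sketch: a block handler that only runs every 50th block via the modulo
// filter, instead of on every single block. Handler name is illustrative.
const project: Partial<SubstrateProject> = {
  dataSources: [
    {
      kind: SubstrateDatasourceKind.Runtime,
      startBlock: 1,
      mapping: {
        file: "./dist/index.js",
        handlers: [
          {
            kind: SubstrateHandlerKind.Block,
            handler: "handleBlockSnapshot",
            filter: { modulo: 50 }, // runs only when blockHeight % 50 === 0
          },
        ],
      },
    },
  ],
};

export default project;
```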

Always Use a Dictionary

  • For all major chains we already provide valid SubQuery dictionaries, but if you’re indexing a custom chain, you may want to implement your own dictionary in your SubQuery Project to speed up your indexer. Examples of dictionary creation can be found in the dictionary repository, and if one already exists for your network, it should be added to your SubQuery Project automatically.
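If you do need to point at a dictionary explicitly, it’s configured in the network section of the manifest. A minimal sketch, using the well-known Polkadot values as an example:

```typescript
// Sketch: the `network` section of project.ts with an explicit dictionary
// endpoint. Swap in the chainId, RPC endpoint, and dictionary published for
// your own network.
const network = {
  chainId: "0x91b171bb158e2d3848fa23a9f1c25182fb8e20313b2c1eb49219da7a70ce90c3", // Polkadot genesis hash
  endpoint: ["wss://polkadot.api.onfinality.io/public-ws"],
  dictionary: "https://api.subquery.network/sq/subquery/polkadot-dictionary",
};

export default network;
```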

Use Filter Conditions in Mapping Handlers

  • In your project manifest, apply the strictest filter conditions to mapping handlers where possible. This reduces the number of events or transactions that need to be processed, minimising unnecessary data queries.
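For example, here’s a hedged sketch of a strict event filter in an Ethereum project, using the full event signature so unrelated logs are skipped at the fetch stage (handler and event names are illustrative):

```typescript
import { EthereumHandlerKind } from "@subql/types-ethereum";

// Sketch: filter an event handler down to exactly one event signature, so
// only matching logs are ever passed to the mapping function.
const transferHandler = {
  kind: EthereumHandlerKind.Event,
  handler: "handleTransfer",
  filter: {
    topics: ["Transfer(address indexed from, address indexed to, uint256 amount)"],
  },
};

export default transferHandler;
```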

Set Start Block in Project Manifest

  • Always set the start block in your project manifest to the point when the contract was initialised or, better yet, when the first relevant event/transaction occurred. This helps streamline the data indexing process.
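A sketch of what this looks like on an Ethereum datasource, with the deployment height and address as placeholders:

```typescript
import { EthereumDatasourceKind } from "@subql/types-ethereum";

// Sketch: start indexing from the contract's deployment block rather than
// genesis, and scope the datasource to that single contract.
const dataSource = {
  kind: EthereumDatasourceKind.Runtime,
  startBlock: 4719568, // hypothetical deployment height of your contract
  options: {
    abi: "erc20",
    address: "0x0000000000000000000000000000000000000000", // your contract address
  },
  // assets, mapping, and handlers omitted for brevity
};

export default dataSource;
```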

Leverage Node Worker Threads

  • Move block fetching and processing into separate worker threads to speed up indexing (potentially up to 4x faster). Enable this using the --workers=<number> flag, but be mindful of your CPU core count when choosing the number. More details can be found here.

Review Your Indexer’s Architecture

When your project involves indexing a large volume of data (blocks, transactions, and more specific data), it’s wise to consider splitting it into separate SubQuery indexing projects, each responsible for different data sources. This approach mirrors the architectural choice between microservices and a monolithic design in software development.

Why should you consider this separation? For one, indexing all blocks can be a time-consuming process that slows your project down significantly. If you later need to make changes—such as adjusting filters or altering the shape of your entities—you might have to clear your entire database and reindex the project from scratch. In large projects, this reindexing can take a lot of time and resources.

A practical example of this strategy is to create a larger project that indexes everything for internal analysis of your contracts, alongside a smaller, optimised project specifically designed for your dApp. The larger, all-encompassing project might remain static and only require initial indexing, avoiding the costly process of reindexing. Meanwhile, the smaller, optimised project can evolve alongside your dApp, and when changes are needed, reindexing will be much quicker and more manageable.

This separation offers a smoother development experience and a more efficient workflow. By isolating tasks, you can iterate faster on the parts that matter most to your users, without being bogged down by the slower indexing of everything else. It’s an approach that balances performance, flexibility, and scalability—one that can set your project up for long-term success.

Maximising Indexing Performance: Best Practices for Efficient Data Handling

When building data-intensive projects, particularly historical ones, optimising indexing performance is key to maintaining speed and efficiency. Here are some crucial tips to help you enhance the performance of your indexing processes:

Add Indexes to Boost Query Performance

  • When filtering or sorting data, especially in historical projects, adding indexes to your entity fields can dramatically improve query performance. Simply use the @index or @index(unique: true) annotation on any non-key field that you plan to filter by. This helps speed up lookups and ensures more efficient data retrieval. For detailed guidance, check out the SubQuery documentation on indexing.
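As a quick sketch (entity and field names are made up), the annotations live in your schema.graphql, shown here as a tagged string for illustration:

```typescript
// Sketch of a schema.graphql excerpt: `blockHeight` gets a plain index
// because queries filter on it, while `txHash` gets a unique index.
export const schemaExcerpt = /* GraphQL */ `
  type Transfer @entity {
    id: ID!
    txHash: String! @index(unique: true)
    blockHeight: BigInt! @index
    amount: BigInt!
  }
`;
```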

Leverage Parallel and Batch Processing

  • Whenever possible, use parallel or batch processing to speed up your project’s operations (a sketch combining the first two techniques follows this list). For example:
    1. Use Promise.all() to run multiple async functions concurrently, rather than awaiting them one by one.
    2. If you need to create many entities within a single handler, use store.bulkCreate(entityName: string, entities: Entity[]) to insert them in bulk rather than one at a time. This reduces overhead and improves performance. Learn more in the advanced store documentation.
    3. Use api.queryMulti() to optimise Polkadot API calls within mapping functions, querying multiple items in a single batched call instead of looping.
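Here’s a minimal sketch of the first two techniques together, assuming a hypothetical Transfer entity generated by subql codegen and the store global that SubQuery injects into mappings:

```typescript
import { Transfer } from "../types"; // hypothetical entity from `subql codegen`

// Sketch: resolve per-record async work concurrently with Promise.all(),
// then persist all entities in one bulk write instead of awaiting each
// entity.save() sequentially. `store` is the injected SubQuery store.
export async function saveTransfers(ids: string[]): Promise<void> {
  const entities = await Promise.all(
    ids.map(async (id) => {
      // any per-record async lookups would happen here
      return Transfer.create({ id, amount: BigInt(0) });
    })
  );
  // one bulk insert instead of ids.length sequential round trips
  await store.bulkCreate("Transfer", entities);
}
```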

Minimise External API Calls

  • API calls—whether querying the state or reaching third-party services—can slow down your indexing. To optimise this, try to minimise calls and rely more on extrinsic, transaction, or event data when possible. You can also persist frequently used data in the store and update it periodically, reducing the need for constant API calls.
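One hedged sketch of the caching idea, assuming a hypothetical ChainState entity (id, value, updatedAt) and the api global that SubQuery injects into Substrate mappings:

```typescript
import { ChainState } from "../types"; // hypothetical cache entity

const REFRESH_INTERVAL = 100n; // blocks between refreshes (illustrative)

// Sketch: read a frequently used on-chain value from the store, and only hit
// chain state when the cached copy is older than REFRESH_INTERVAL blocks.
export async function getCachedTotalIssuance(blockHeight: bigint): Promise<bigint> {
  let cached = await ChainState.get("totalIssuance");
  if (!cached || blockHeight - cached.updatedAt >= REFRESH_INTERVAL) {
    const issuance = await api.query.balances.totalIssuance(); // state call only when stale
    cached = ChainState.create({
      id: "totalIssuance",
      value: BigInt(issuance.toString()),
      updatedAt: blockHeight,
    });
    await cached.save();
  }
  return cached.value;
}
```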

Enable Reverse Lookups

  • Enabling reverse lookups on entities can simplify queries and boost efficiency. To do this, attach the @derivedFrom annotation to the field and point it to the reverse lookup field of another entity. This approach creates a relationship between entities that streamlines your queries. Find out more about reverse lookups here.
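As a sketch (names are illustrative, schema shown as a tagged string): sentTransfers below is virtual and derived from Transfer.from, so no extra column is stored on Account.

```typescript
// Sketch of a schema.graphql excerpt with a reverse lookup.
export const reverseLookupSchema = /* GraphQL */ `
  type Account @entity {
    id: ID!
    sentTransfers: [Transfer] @derivedFrom(field: "from")
  }

  type Transfer @entity {
    id: ID!
    from: Account!
    amount: BigInt!
  }
`;
```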

Simplify Your Schema Design

  • A simpler schema design leads to faster performance. Reduce unnecessary fields and columns, create indexes where needed, and use reverse lookups judiciously. Keeping your schema lean and optimised minimises data redundancy and improves efficiency across your project.

Handle BigInts with Care

  • Be mindful that JSON.stringify doesn’t support native BigInts and will throw if it encounters one, which can break logging; convert BigInt values to strings first, as in the sketch below.
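A minimal sketch of the workaround, assuming the logger that SubQuery injects into mappings: pass JSON.stringify a replacer that converts BigInt values to strings.

```typescript
// Sketch: JSON.stringify(payload) alone would throw
// "TypeError: Do not know how to serialize a BigInt", so convert first.
const payload = { amount: 12345678901234567890n };

const safeJson = JSON.stringify(payload, (_key, value) =>
  typeof value === "bigint" ? value.toString() : value
);

logger.info(safeJson); // logs {"amount":"12345678901234567890"}
```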
By following these indexing performance tips, you’ll be able to build more efficient, scalable projects, ensuring that your queries run faster and your overall workflow remains smooth. Keep performance in mind from the start, and your project will be well-equipped to handle growth and complexity over time.

Boosting Query Performance in GraphQL: Best Practices for Efficient Data Retrieval

When working with GraphQL, optimising query performance is essential to maintaining fast and efficient data retrieval. Here are some best practices that can help you get the most out of your queries:

Use Cursor-Based Pagination for Better Efficiency

  • Cursor-based pagination is a much more efficient way to handle large datasets than traditional offset-based pagination. Instead of scanning past every skipped row on each page, it retrieves only the slice you need, improving performance and reducing the load on your server. If you’re unfamiliar with how it works, you can dive deeper into cursor-based pagination here.
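A hedged sketch of what such a query looks like against a SubQuery project (entity and order-by names are illustrative): fetch a fixed-size slice, then pass pageInfo.endCursor back in as after for the next page.

```typescript
// Sketch: cursor-based pagination; the `after` variable carries the cursor
// returned by the previous page.
export const pagedQuery = /* GraphQL */ `
  query ($after: Cursor) {
    transfers(first: 100, after: $after, orderBy: BLOCK_HEIGHT_ASC) {
      nodes {
        id
        amount
      }
      pageInfo {
        hasNextPage
        endCursor
      }
    }
  }
`;
```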

Query Only the Fields You Need

  • One of the key advantages of GraphQL is its flexibility in querying specific fields. However, querying more fields than necessary can slow down performance. By requesting only the data you actually need, you can reduce query time and server processing power.
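For instance, a lean query that requests only what a UI actually renders (field names are illustrative):

```typescript
// Sketch: select two fields instead of every column on the entity.
export const leanQuery = /* GraphQL */ `
  {
    transfers(first: 10) {
      nodes {
        id
        amount
      }
    }
  }
`;
```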

Avoid Querying totalCount Without Conditions for Large Data Tables

  • For large datasets, querying totalCount can be resource-intensive, especially if no filters or conditions are applied. To avoid slow query responses, it's best to add specific conditions or avoid querying totalCount entirely if it's not required.
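If you genuinely do need a count, here’s a hedged sketch of constraining it with a filter (field names are illustrative) so the database counts a narrow slice instead of the whole table:

```typescript
// Sketch: count only rows matching a condition, not the entire table.
export const countQuery = /* GraphQL */ `
  {
    transfers(filter: { blockHeight: { greaterThan: "1000000" } }) {
      totalCount
    }
  }
`;
```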

Restrict Query Complexity for Greater Control

  • Controlling the complexity of your queries is another critical step in maintaining optimal performance. By restricting query complexity, you can ensure that the server isn't overwhelmed by unnecessarily complicated requests. Learn more about how to implement this in your project here.

By following these tips, you'll ensure faster, more efficient queries, leading to better performance across your entire project. Small adjustments can make a big difference when scaling your dApp or handling larger datasets!

Running High-Performance SubQuery Infrastructure

More information focused on the DevOps and configuration side of running high-performance SubQuery projects can be found here.

You’re Ready!

By following the strategies outlined in this article—whether it's leveraging parallel processing, reducing unnecessary API calls, or designing an efficient schema—you can unlock significant improvements in speed and efficiency. The key takeaway is to be mindful of how data is handled at every stage, from indexing to querying, allowing your project to grow without being weighed down by performance bottlenecks. With these tips in hand, you'll be well on your way to running a high-performance SubQuery project that meets your data needs, now and in the future.

About SubQuery

SubQuery Network is innovating web3 infrastructure with tools that empower builders to decentralise the future. Our fast, flexible, and open data indexer supercharges dApps on over 200 networks, enabling a user-focused web3 world. Soon, our Data Node will provide breakthroughs in the RPC industry and deliver decentralisation without compromise. We pioneer the web3 revolution for visionaries and forward-thinkers. We’re not just a company — we’re a movement driving an inclusive and decentralised web3 era. Let’s shape the future of web3, together.

Linktree | Website | Discord | Telegram | Twitter | Blog | Medium | LinkedIn | YouTube
