You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Improvement of NIP-45 Event Counts by Linear Counting
Abstract
NIP-45 Event Count10 provides a mechanism for clients to obtain the number of events matching specific filters, but suffers from accuracy issues when events are distributed across multiple relays. When the same query is sent to different relays containing overlapping but non-identical event sets, the aggregated count becomes inaccurate due to potential double-counting or missing events.
This paper proposes an enhancement to NIP-45 using linear counting, a probabilistic algorithm that enables accurate event counting across distributed relays while minimizing data transfer overhead. Our approach uses bitmaps and hash functions to create mergeable count estimates that can be combined from multiple relay responses.
The proposed method introduces configurable bitset sizes (128 bytes to 8192 bytes) allowing clients to balance accuracy requirements against bandwidth usage. Unlike existing solutions that require transferring complete event lists or relying on third-party Data Vending Machines, linear counting provides near-accurate counts with fixed, small data payloads regardless of the actual number of events.
We demonstrate that this approach significantly reduces bandwidth requirements compared to REQ-based counting (from ~1.75MB to ~1.4KB for 1000 events across 5 relays) while maintaining counting accuracy suitable for common use cases such as follower counts, reactions, and reposts. The method is particularly well-suited for Nostr's decentralized architecture and outbox model, where events may be distributed across user-selected relay sets.
Introduction
NIP-45 Event Count enables clients to obtain the number of events matched to filters. This can be used for counting followers, the number of reposts and the number of reactions. However, NIP-45 has a problem: results may vary by relay.
For example, if relay X has events A and B, relay Y has events B and C, and we send a COUNT query with filters matches those events to relay X and Y, the both of COUNT results would be 2. This can happen in several situations: the user successfully publish an event to some relays but failed for other relays; the user publish to the set of relays (kind:30002) rather than the usual relay list (kind:10002); or the user start or stop to use some relays.
One solution is just to include event ids in the COUNT response20. However, if there are a lot of matched events, the list of ids would be large after all. If there are 1000 events and each id is 64 letters in hex, the size of the list is about 64KB. And, if we receive this from 5 relays, the total transferred bytes is about 256KB. To use first 8 letters of id like previous tag in NIP-2930 will reduce the size to one-eighth but the size is still big.
Other solution is Data vending machines (DVM, NIP-90)40. DVM allows user to ask some provider to process the user's request. The DVM job to count events is already defined50. This will be the one of the most efficient way to count events correctly because the result has only number without extra data. The provider can collect the data frequently from relays around the world and count in advance of the request. However, providers may choose to collect data at the time the request is published, then it may take some time to get the result and it will have a negative impact on the user experience. The user may need to choose which provider to trust and use before use.
Currently, we need to use REQ to count events if we want to know almost exact amount. This means the client will receives a lot of events from relays. If every single event has 350 bytes, there are 1000 events and the client request to 5 relays, the amount of data the client will receive would be 1.75 MB.
Outbox model encourage clients to send reactions and reposts to the relay where the target event is posted. However, the user may change relays to use.
In this paper, we propose a method to solve this problem using linear counting algorithm. The results of linear counting from multiple relays can be integrated. The linear counting provides good estimated counts with a small amount of data.
Linear Counting
Linear counting60 is a probabilistic algorithm to count unique items. The core idea is to use bitmap (bitset) and hash function.
The following describes how to add an item:
calculate the hash value of the data to be added with hash function.
calculate the remainder of the hash value divided by the bitmap length.
bit in that position is set to 1.
To estimate the number of items, use following formula60:
- m * log2(1 - n/m)
m is the size of bitset and n is the number of bits set to 1.
The algorithm has .
To get the
The remainder values
The remainder values can be the same for different data, namely hash collision.
The algorithm has :
merge-able
If we want combine two results, we just need to calculate bit-or of two bit-sets.
Use linear counting in COUNT
To apply linear counting to COUNT,
id is a SHA-256 hash value of content and considered as unique, this can be used for linear counting.
Note that Proof of Work [NIP-13]
Proposal
Request
This proposal introduces these special prefixes:
size
prefix
bitset size
id bits
0
lc:0,
128 bytes (1024 bits)
10
1
lc:1,
256 bytes (2048 bits)
11
2
lc:2,
512 bytes (4096 bits)
12
3
lc:3,
1024 bytes (8192 bits)
13
4
lc:4,
2048 bytes (16384 bits)
14
5
lc:5,
4096 bytes (32768 bits)
15
6
lc:6,
8192 bytes (65536 bits)
16
"size": values for request format B.
"prefix": values for request format A.
"bitset size": size of bitsets
"id bits": last n bits are used for bitset
If the result bitset is almost full and the error is considered large or willing to know more accurate number, client CAN retry with larger size (refer "Accuracy" section for more detail).
If server doesn't support this linear counting, it is expected to ignore the linear counting parameter and respond with the original NIP-45 response.
If there are no matching events, the server SHOULD omit "linear_counting" field from response and set "count" to 0 to reduce the size of the response. The client will interpret the response as either no data or the client received data with all bits zero (unset).
Consideration
Bitset size
The client can choose the bitset size according to the expected number of events.
For example, if the client expects the number of events to be less than 100, the client can choose size=0. This could be suitable for the case of the number of reposts and the number of reactions.
If the client expects the number of events to be less than 2000, the client can choose size 1. This can be used for the case of the number of followers.
Of course, the client can choose a larger size if the client wants more accurate results.
Transferred data size
Base64 encoded data will be 33% bigger than original size.
Original size
Base64 size
128 bytes (1024 bits)
171 chars
256 bytes (2048 bits)
341 chars
512 bytes (4096 bits)
680 chars
1024 bytes (8192 bits)
1362 chars
The larger the bitset size, the more accurate the result will be. However, the larger the bitset size, the more data will be transferred.
WebSocket compression will reduce the size of the bitset size.
Break-even point
Database load
The database is needed to return ids rather than just COUNT(id). The traffic between the relay and the database will increase and database load will also increase a bit.
To reduce the amount of traffic, we can write a database query that take the last few bytes of the id instead of the entire id. For example, a client requests COUNT where size is 2, only the last 12 bits (2 bytes) is needed to calculate: SELECT SUBSTRING(id FROM -2) FROM events WHERE ...;
Accuracy (error and bitset size)
If the error is considered to be large, the client can request with a larger size again.
According to the linear counting paper,
Suitable and Non suitable use-cases
Suitable use-cases: Counting replies, mentions, reposts and reactions.
Non-suitable use-cases: Counting reactions for each emojis.
NIP-2570 allows user to use emojis in their reactions. It is hard to count emojis with this mechanism. If we simply create bitset for each emoji, the size of the result will be large and it is not desirable.
Attack vectors: malicious relay returns untrue result
The malicious relay can tell a lie about the count by setting/un-setting some bits in the bitmap. The client can verify the result by comparing the COUNT result and the number of REQ results. It is also possible to compare a linear counting bitmap generated by the client from the REQ results with the bitmap generated by relay. If the bits in the client's bitmap were much different compared to the relay's bitmap, the server seems to lie. In that case, The client can choose to ignore the result from the relay for this use-case and inform the user of it. This lie can be verifiable but the problem is that it needs data transferring. However, there is more simple way to tell a lie -- just creating such events with random keys.
Attack vectors: collision
This might be overthinking.
By mining IDs, attackers can intentionally cause collisions. This might trick the client into underestimating the number of events and retrieving a large amount of data.
To prevent this, the client can include a seed value in requests. The ID can be combined with the client's seed using a hash function like HMAC-SHA-256. The result is then used to update the bitset.
The seed should be unpredictable. Instead of using a fixed value, the client should generate a new seed for each request or use one that is valid only for the session.
It might be acceptable to use the subscription ID of format A as a seed. For Request Format B, we can add a seed parameter.
Attack vectors: DoS
The attacker won't use this feature to do DoS attack because the memory consumption is small.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
/* Visit https://aka.ms/tsconfig to read more about this file */
/* Projects */
// "incremental": true, /* Save .tsbuildinfo files to allow for incremental compilation of projects. */
// "composite": true, /* Enable constraints that allow a TypeScript project to be used with project references. */
// "tsBuildInfoFile": "./.tsbuildinfo", /* Specify the path to .tsbuildinfo incremental compilation file. */
// "disableSourceOfProjectReferenceRedirect": true, /* Disable preferring source files instead of declaration files when referencing composite projects. */
// "disableSolutionSearching": true, /* Opt a project out of multi-project reference checking when editing. */
// "disableReferencedProjectLoad": true, /* Reduce the number of projects loaded automatically by TypeScript. */
/* Language and Environment */
"target": "esnext", /* Set the JavaScript language version for emitted JavaScript and include compatible library declarations. */
// "lib": [], /* Specify a set of bundled library declaration files that describe the target runtime environment. */
// "jsx": "preserve", /* Specify what JSX code is generated. */
// "experimentalDecorators": true, /* Enable experimental support for legacy experimental decorators. */
// "emitDecoratorMetadata": true, /* Emit design-type metadata for decorated declarations in source files. */
// "jsxFactory": "", /* Specify the JSX factory function used when targeting React JSX emit, e.g. 'React.createElement' or 'h'. */
// "jsxFragmentFactory": "", /* Specify the JSX Fragment reference used for fragments when targeting React JSX emit e.g. 'React.Fragment' or 'Fragment'. */
// "jsxImportSource": "", /* Specify module specifier used to import the JSX factory functions when using 'jsx: react-jsx*'. */
// "reactNamespace": "", /* Specify the object invoked for 'createElement'. This only applies when targeting 'react' JSX emit. */
// "noLib": true, /* Disable including any library files, including the default lib.d.ts. */
// "useDefineForClassFields": true, /* Emit ECMAScript-standard-compliant class fields. */
// "moduleDetection": "auto", /* Control what method is used to detect module-format JS files. */
/* Modules */
"module": "commonjs", /* Specify what module code is generated. */
// "rootDir": "./", /* Specify the root folder within your source files. */
// "moduleResolution": "node10", /* Specify how TypeScript looks up a file from a given module specifier. */
// "baseUrl": "./", /* Specify the base directory to resolve non-relative module names. */
// "paths": {}, /* Specify a set of entries that re-map imports to additional lookup locations. */
// "rootDirs": [], /* Allow multiple folders to be treated as one when resolving modules. */
// "typeRoots": [], /* Specify multiple folders that act like './node_modules/@types'. */
// "types": [], /* Specify type package names to be included without being referenced in a source file. */
// "moduleSuffixes": [], /* List of file name suffixes to search when resolving a module. */
// "allowImportingTsExtensions": true, /* Allow imports to include TypeScript file extensions. Requires '--moduleResolution bundler' and either '--noEmit' or '--emitDeclarationOnly' to be set. */
// "resolvePackageJsonExports": true, /* Use the package.json 'exports' field when resolving package imports. */
// "resolvePackageJsonImports": true, /* Use the package.json 'imports' field when resolving imports. */
// "customConditions": [], /* Conditions to set in addition to the resolver-specific defaults when resolving imports. */
// "maxNodeModuleJsDepth": 1, /* Specify the maximum folder depth used for checking JavaScript files from 'node_modules'. Only applicable with 'allowJs'. */
/* Emit */
// "declaration": true, /* Generate .d.ts files from TypeScript and JavaScript files in your project. */
// "declarationMap": true, /* Create sourcemaps for d.ts files. */
// "emitDeclarationOnly": true, /* Only output d.ts files and not JavaScript files. */
// "inlineSourceMap": true, /* Include sourcemap files inside the emitted JavaScript. */
// "outFile": "./", /* Specify a file that bundles all outputs into one JavaScript file. If 'declaration' is true, also designates a file that bundles all .d.ts output. */
// "outDir": "./", /* Specify an output folder for all emitted files. */
// "declarationDir": "./", /* Specify the output directory for generated declaration files. */
// "preserveValueImports": true, /* Preserve unused imported values in the JavaScript output that would otherwise be removed. */
/* Interop Constraints */
// "isolatedModules": true, /* Ensure that each file can be safely transpiled without relying on other imports. */
// "verbatimModuleSyntax": true, /* Do not transform or elide any imports or exports not marked as type-only, ensuring they are written in the output file's format based on the 'module' setting. */
// "allowSyntheticDefaultImports": true, /* Allow 'import x from y' when a module doesn't have a default export. */
"esModuleInterop": true, /* Emit additional JavaScript to ease support for importing CommonJS modules. This enables 'allowSyntheticDefaultImports' for type compatibility. */
// "preserveSymlinks": true, /* Disable resolving symlinks to their realpath. This correlates to the same flag in node. */
"forceConsistentCasingInFileNames": true, /* Ensure that casing is correct in imports. */
/* Type Checking */
"strict": true, /* Enable all strict type-checking options. */
// "noImplicitAny": true, /* Enable error reporting for expressions and declarations with an implied 'any' type. */
// "strictNullChecks": true, /* When type checking, take into account 'null' and 'undefined'. */
// "strictFunctionTypes": true, /* When assigning functions, check to ensure parameters and the return values are subtype-compatible. */
// "strictBindCallApply": true, /* Check that the arguments for 'bind', 'call', and 'apply' methods match the original function. */
// "strictPropertyInitialization": true, /* Check for class properties that are declared but not set in the constructor. */
// "noImplicitThis": true, /* Enable error reporting when 'this' is given the type 'any'. */
// "useUnknownInCatchVariables": true, /* Default catch clause variables as 'unknown' instead of 'any'. */