# This is a work in progress. There are still bugs. Once it is production-ready this will become a full repo.
def count_tokens(text, model_name="gpt-3.5-turbo", debug=False):
    """
    Count the number of tokens in a given text string without using the OpenAI API.

    This function tries three methods in the following order:
    1. tiktoken (preferred): Accurate token counting similar to the OpenAI API.
    2. nltk: Token counting using the Natural Language Toolkit library.
    3. split: Simple whitespace-based token counting as a fallback.

    Usage:
    ------
    text = "Your text here"
    result = count_tokens(text, model_name="gpt-3.5-turbo", debug=True)
    print(result)

    Required libraries:
    -------------------
    - tiktoken: Install with 'pip install tiktoken'
    - nltk: Install with 'pip install nltk'

    Parameters:
    -----------
    text : str
        The text string for which you want to count tokens.
    model_name : str, optional
        The OpenAI model for which you want to count tokens (default: "gpt-3.5-turbo").
    debug : bool, optional
        Set to True to print error messages (default: False).

    Returns:
    --------
    result : dict
        A dictionary containing the number of tokens and the method used for counting.
    """
    # Try using tiktoken (most accurate for OpenAI models)
    try:
        import tiktoken
        encoding = tiktoken.encoding_for_model(model_name)
        num_tokens = len(encoding.encode(text))
        return {"n_tokens": num_tokens, "method": "tiktoken"}
    except Exception as e:
        if debug:
            print(f"Error using tiktoken: {e}")

    # Try using nltk
    try:
        import nltk
        nltk.download("punkt", quiet=True)  # no-op if the tokenizer data is already present
        tokens = nltk.word_tokenize(text)
        return {"n_tokens": len(tokens), "method": "nltk"}
    except Exception as e:
        if debug:
            print(f"Error using nltk: {e}")

    # If tiktoken and nltk both fail, fall back to a simple whitespace split
    tokens = text.split()
    return {"n_tokens": len(tokens), "method": "split"}
class TokenBuffer:
    def __init__(self, max_tokens=2048):
        self.max_tokens = max_tokens
        self.chunks = []         # text chunks, oldest first
        self.token_lengths = []  # token count of each chunk
        self.token_count = 0

    def update(self, text, model_name="gpt-3.5-turbo", debug=False):
        new_tokens = count_tokens(text, model_name=model_name, debug=debug)["n_tokens"]
        self.chunks.append(text)
        self.token_lengths.append(new_tokens)
        self.token_count += new_tokens
        # Evict whole chunks from the front until the total is within the limit,
        # so the stored text always stays in sync with the recorded token counts.
        while self.token_count > self.max_tokens and self.chunks:
            self.chunks.pop(0)
            self.token_count -= self.token_lengths.pop(0)

    def get_buffer(self):
        return "".join(self.chunks)
Example usage for TokenBuffer:
from token_counter import TokenBuffer
# Initialize a TokenBuffer with a maximum token count of 30
buffer = TokenBuffer(max_tokens=30)
# Add a sentence to the buffer
buffer.update("Hello, how are you doing?")
print(buffer.get_buffer())
print("Token count:", buffer.token_count)
# Add another sentence to the buffer
buffer.update("I'm doing well, thank you!")
print(buffer.get_buffer())
print("Token count:", buffer.token_count)
# Add a longer sentence to the buffer
buffer.update("I've been working on a project and making great progress.")
print(buffer.get_buffer())
print("Token count:", buffer.token_count)
# Add one more sentence to the buffer
buffer.update("That's great to hear, keep up the good work!")
print(buffer.get_buffer())
print("Token count:", buffer.token_count)
Output (YMMV):
Hello, how are you doing?
Token count: 6
Hello, how are you doing?I'm doing well, thank you!
Token count: 11
Hello, how are you doing?I'm doing well, thank you!I've been working on a project and making great progress.
Token count: 24
I'm doing well, thank you!I've been working on a project and making great progress.That's great to hear, keep up the good work!
Token count: 30
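count_tokens can also be called on its own. A minimal sketch (the n_tokens value below is taken from the same run as the output above, so your counts will vary with whichever tokenizer is available):
from token_counter import count_tokens
result = count_tokens("Hello, how are you doing?", model_name="gpt-3.5-turbo", debug=True)
print(result)
# e.g. {'n_tokens': 6, 'method': 'tiktoken'} when tiktoken is installed,
# or {'n_tokens': 5, 'method': 'split'} when neither tiktoken nor nltk is available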
tiktoken's GitHub repo: https://github.com/openai/tiktoken
NLTK's GitHub repo: https://github.com/nltk/nltk
To use the TokenBuffer class, create an instance with an optional max_tokens argument and call the update() method to add text. The get_buffer() method returns the current buffer contents.
The update() method ensures the buffer never holds more than max_tokens by evicting the oldest chunks of text whenever the limit is exceeded.
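For example, with a deliberately tiny limit, the oldest chunk is dropped as soon as the total goes over (a sketch; the token counts assume the whitespace 'split' fallback):
from token_counter import TokenBuffer
buf = TokenBuffer(max_tokens=5)
buf.update("one two three four ")  # 4 tokens via the split fallback
buf.update("five six seven ")      # total would be 7, so the oldest chunk is evicted
print(buf.get_buffer())            # "five six seven "
print(buf.token_count)             # 3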
Using a token buffer like the one implemented in the TokenBuffer class is useful when working with OpenAI's API, for several reasons:
Token Limit: OpenAI's models have a maximum token limit per API call (e.g., 4096 tokens for gpt-3.5-turbo). By using a token buffer, you can manage and control the text input to ensure it stays within the allowed token limit, preventing errors when making API calls.
Cost Control: OpenAI's API pricing is based on the number of tokens processed. By maintaining a token buffer, you can keep track of the tokens used, helping you manage costs more effectively and avoid exceeding your budget.
Text Truncation: When dealing with long text inputs or a stream of text, using a token buffer can help you truncate or remove less relevant text while preserving the most recent or relevant information. This is particularly useful when working with conversational AI applications, where the latest information might be more important for generating appropriate responses.
Rate Limiting: OpenAI's API has rate limits based on tokens processed per minute. A token buffer helps you stay within these rate limits, ensuring that your application can operate smoothly without encountering rate limit errors.
Overall, using a token buffer like the TokenBuffer class is a practical way to manage tokens when working with OpenAI's API, helping you stay within token limits, control costs, and manage text inputs more effectively.
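Putting the pieces together, here is a minimal sketch of the token-limit and cost-control points above. The CONTEXT_LIMIT, RESPONSE_BUDGET, and PRICE_PER_1K_TOKENS values are illustrative assumptions (check the current model limits and pricing for your model), and no actual API call is made:
from token_counter import TokenBuffer

MODEL = "gpt-3.5-turbo"
CONTEXT_LIMIT = 4096         # assumed context window for the model
RESPONSE_BUDGET = 512        # assumed headroom reserved for the model's reply
PRICE_PER_1K_TOKENS = 0.002  # assumed price; check OpenAI's current pricing

# Keep the rolling conversation small enough that prompt + reply fit the window.
history = TokenBuffer(max_tokens=CONTEXT_LIMIT - RESPONSE_BUDGET)

for user_message in ["Hello!", "Summarize our chat so far."]:
    history.update(user_message + "\n", model_name=MODEL)

prompt = history.get_buffer()
# The buffer never exceeds its max_tokens, so the prompt stays within the
# per-request limit, and token_count gives a cost estimate before sending.
print(f"Prompt tokens: {history.token_count}")
print(f"Estimated prompt cost: ${history.token_count / 1000 * PRICE_PER_1K_TOKENS:.6f}")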