Created
March 21, 2023 09:41
Get the memory consumption of the data in an Apache ORC format file
import pyorc
import sys

# Open the ORC file to read
with open("./output.orc", "rb") as data:
    # Create a reader object
    reader = pyorc.Reader(data)
    # Initialize a variable to hold the total size
    total_size = 0
    # Iterate over every row in the file
    for row in reader:
        # Add the in-memory size of the current row to the total.
        # Note: sys.getsizeof is shallow; it counts the row object itself,
        # not the values it refers to.
        total_size += sys.getsizeof(row)

# Print the total size
print(f"The total size of the orc file content in memory is {total_size} bytes.")
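The script above sums `sys.getsizeof(row)`, which measures only the row container and not the strings, numbers, or nested lists it points to, so the reported total is likely to understate real memory use. The following is a minimal sketch of a recursive alternative, not part of the original gist. The helper name `deep_getsizeof` and the sample `rows` are hypothetical, and the sketch assumes rows arrive as plain Python tuples, lists, or dicts. The result is still an approximation, since CPython may share small integers and interned strings across rows.

```python
import sys


def deep_getsizeof(obj, seen=None):
    """Approximate size of obj in bytes, including objects it references."""
    if seen is None:
        seen = set()
    # Count each distinct object only once (shared or cyclic references).
    if id(obj) in seen:
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_getsizeof(key, seen) + deep_getsizeof(value, seen)
    elif isinstance(obj, (tuple, list, set, frozenset)):
        for item in obj:
            size += deep_getsizeof(item, seen)
    return size


# Hypothetical rows standing in for the rows read from an ORC file.
rows = [(1, "alice", [1.5, 2.5]), (2, "bob", [3.5])]
shallow_total = sum(sys.getsizeof(row) for row in rows)
deep_total = sum(deep_getsizeof(row) for row in rows)
print(f"shallow: {shallow_total} bytes, deep: {deep_total} bytes")
```

To apply this to the script, `total_size += sys.getsizeof(row)` could be replaced with `total_size += deep_getsizeof(row)`.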