@Forgo7ten
Last active November 8, 2021 14:16
A simple Python script to remove duplicate lines from a .txt text file
# coding=utf-8
# @File : removeSameRow.py
# @Desc : Detect duplicate lines in a file and remove them
# @Author : Forgo7ten
# @Time : 2021/11/3

# Input file name
readFilePath = "./original.txt"
# Output file name
outFilePath = "./new.txt"

# Open both files; the with statement flushes and closes them automatically
with open(readFilePath, "r", encoding='UTF-8') as readfile, \
        open(outFilePath, "w", encoding='UTF-8') as outfile:
    lines_seen = set()
    for line in readfile:
        # Strip leading/trailing spaces and newlines
        line = line.strip(' \n')
        if line not in lines_seen:
            # First occurrence: write it to the new file
            outfile.write(line + '\n')
            # Remember the line in the set
            lines_seen.add(line)
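
The same idea can be tried without touching any files. The sketch below is not part of the original script: `unique_lines` and `sample` are hypothetical names introduced only for illustration. It assumes the same normalization as the script (stripping spaces and newlines from both ends) and keeps the first occurrence of each line in its original order.

```python
from typing import Iterable, Iterator


def unique_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield each distinct line once, in first-seen order (hypothetical helper)."""
    seen = set()
    for raw in lines:
        # Same normalization as the script: strip spaces and newlines
        line = raw.strip(' \n')
        if line not in seen:
            seen.add(line)
            yield line


# Made-up sample data; "  apple \n" matches "apple\n" after stripping
sample = ["apple\n", "banana\n", "  apple \n", "cherry\n", "banana\n"]
result = list(unique_lines(sample))
print(result)  # ['apple', 'banana', 'cherry']
```

Because the comparison happens after stripping, lines that differ only in surrounding spaces count as duplicates, and all blank lines collapse into a single empty line. The set holds every distinct line in memory, which is usually fine for ordinary text files but may matter for very large inputs.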