@fomightez
Last active August 16, 2022 20:42
Text list ===> Python-code list object

Quick example of how to paste a list copied from text elsewhere, with each item on a separate line, and convert it to a Python list. This works well for small lists and avoids needing a separate file, or running a regex in an editor to add commas and then formatting the result as a Python list by hand.
You can get this entire thing as Python code in the raw gist.
This markdown renders nicely here or here.

STEP 1: Paste the list as a triple-quoted string, below the first line and before the closing `'''`.

list_as_string = '''
RF2
9S_rRNA
tP(UGG)Q
ORI1 
15S_RRNA
tW(UCA)Q
ORI8
COX1
AI1
AI2
AI3
AI4
AI5_ALPHA
AI5_BETA
ATP8
ATP6
ORI7
ORI2
tE(UUC)Q
COB
BI2
BI3
BI4
ORI6
ATP9
tS(UGA)Q2
VAR1
ORI3
ORI4
21S_rRNA
SCEI
tT(UGU)Q1
tC(GCA)Q
tH(GUG)Q
tL(UAA)Q
tQ(UUG)Q
tK(UUU)Q
tR(UCU)Q1
tG(UCC)Q
tD(GUC)Q
tS(GCU)Q1
tR(ACG)Q2
tA(UGC)Q
tI(GAU)Q
tY(GUA)Q
tN(GUU)Q
tM(CAU)Q1
COX2
Q0255
tF(GAA)Q
tT(UAG)Q2
tV(UAC)Q
COX3
ORI5
tM(CAU)Q2
RPM1
'''

STEP 2: Use `split()` on the string to get a Python list.

py_list = list_as_string.split()
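
One caveat worth illustrating: `split()` with no arguments breaks on every run of whitespace, not only on line breaks. That is fine for the identifiers above, but an item containing an internal space would be cut in two. The sketch below compares it with a line-based alternative; `sample_text` and the item `my gene` are made-up values for illustration only.

```python
# Hypothetical sample; the last item contains an internal space on purpose,
# and "ORI1 " carries a trailing space like the pasted list above.
sample_text = '''
RF2
9S_rRNA
ORI1 
my gene
'''

# split() breaks on every run of whitespace, so "my gene" becomes two items.
by_whitespace = sample_text.split()

# splitlines() keeps each line whole; strip() drops stray spaces and the
# condition skips blank lines.
by_line = [line.strip() for line in sample_text.splitlines() if line.strip()]

print(by_whitespace)
print(by_line)
```

If none of the items contain spaces, both approaches give the same list, and plain `split()` is the shorter option.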

OPTIONAL STEP 3: If the pasted list contains more items than you need, keep only the items that also appear in a separate list.

# Keep only the items in the new list that are also present in another list.
final_py_list = [x for x in py_list if x in some_list]
# If the other list is a column of a dataframe, replace `df.dfcol` with a reference to your dataframe and column:
final_py_list = [x for x in py_list if x in df.dfcol.tolist()]
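
For longer lists, one possible variation is to convert the reference list to a set first, so that each membership check does not rescan the whole list. This is only a sketch: the contents of `py_list` and `some_list` here are small stand-ins, and `wanted` is a name introduced for illustration.

```python
# Stand-in for the list produced in STEP 2.
py_list = ["COX1", "COX2", "COX3", "ATP6", "VAR1"]
# Hypothetical reference list to overlap with.
some_list = ["ATP6", "COX1", "COB"]

# Build the set once; set lookups do not scan the whole reference list.
wanted = set(some_list)

# The comprehension walks py_list in order, so the original order is kept.
final_py_list = [x for x in py_list if x in wanted]
print(final_py_list)
```

The same idea should carry over to a dataframe column by building the set from the column values once, outside the comprehension.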
# Text list ===> Python-code list object
# Quick example of how to paste a list copied from text elsewhere, with each
# item on a separate line, and convert it to a Python list. This works well for
# small lists and avoids needing a separate file, or running a regex in an editor
# to add commas and then formatting the result as a Python list by hand.
# STEP 1: Paste the list as a triple-quoted string, below the first line and before the closing '''.
list_as_string = '''
RF2
9S_rRNA
tP(UGG)Q
ORI1
15S_RRNA
tW(UCA)Q
ORI8
COX1
AI1
AI2
AI3
AI4
AI5_ALPHA
AI5_BETA
ATP8
ATP6
ORI7
ORI2
tE(UUC)Q
COB
BI2
BI3
BI4
ORI6
ATP9
tS(UGA)Q2
VAR1
ORI3
ORI4
21S_rRNA
SCEI
tT(UGU)Q1
tC(GCA)Q
tH(GUG)Q
tL(UAA)Q
tQ(UUG)Q
tK(UUU)Q
tR(UCU)Q1
tG(UCC)Q
tD(GUC)Q
tS(GCU)Q1
tR(ACG)Q2
tA(UGC)Q
tI(GAU)Q
tY(GUA)Q
tN(GUU)Q
tM(CAU)Q1
COX2
Q0255
tF(GAA)Q
tT(UAG)Q2
tV(UAC)Q
COX3
ORI5
tM(CAU)Q2
RPM1
'''
# STEP 2: Use split() on the string to get a Python list.
py_list = list_as_string.split()
# OPTIONAL STEP 3: If the pasted list contains more items than you need, keep
# only the items that also appear in a separate list.
# Keep only the items in the new list that are also present in another list.
final_py_list = [x for x in py_list if x in some_list]
# If the other list is a column of a dataframe, replace `df.dfcol` with a reference to your dataframe and column:
final_py_list = [x for x in py_list if x in df.dfcol.tolist()]
@fomightez

Alternatively, use pandas to read the list in.
