@Joelfranklin96
Created December 12, 2019 23:40
get_dummies
import numpy as np
import pandas as pd

# 'nucleotide_sequence' is assumed to be an existing dataframe with one row per
# sequence: columns 0..56 hold one base each ('a', 'c', 'g' or 't') and the
# last column, 'Class', holds '+' or '-'.
n_rows = nucleotide_sequence.shape[0]
n_positions = nucleotide_sequence.shape[1] - 1  # every column except 'Class'
# Initialize the dataframe 'numerical_nucleotide' of shape (106, 57 * 4) with zeros
numerical_nucleotide = pd.DataFrame(np.zeros((n_rows, n_positions * 4), dtype=int))
# Define the dictionary 'key1', mapping each base to a 4-bit one-hot code
key1 = {'a': '1000', 'c': '0100', 'g': '0010', 't': '0001'}
# Assign values to 'numerical_nucleotide', one row per sequence
for i in range(n_rows):
    temp1 = ''
    for j in range(n_positions):
        # Column label j, row position i
        temp1 = temp1 + key1[nucleotide_sequence[j].iloc[i]]
    temp2 = [int(x) for x in temp1]
    numerical_nucleotide.iloc[i] = temp2
# Assign 'Class' column to 'numerical_nucleotide',
# replacing '+' and '-' with values 1 and 0 respectively
numerical_nucleotide['Class'] = nucleotide_sequence['Class'].map({'+': 1, '-': 0}).to_numpy()
print(numerical_nucleotide)
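
The loop above builds the one-hot encoding by hand. Since this gist is named get_dummies, here is a minimal sketch of how the same layout could be produced with pd.get_dummies. It is not taken from the original code: the three-row 'nucleotide_sequence' below is a hypothetical stand-in for the real data, and the names 'bases', 'positions' and 'one_hot' are made up for illustration.

```python
import pandas as pd

# Hypothetical toy stand-in for 'nucleotide_sequence': 3 sequences of length 4
nucleotide_sequence = pd.DataFrame([list("acgt"), list("ttga"), list("cagc")])
nucleotide_sequence["Class"] = ["+", "-", "+"]

# Fix the category set so every position yields four indicator columns
# in a, c, g, t order, even when a base never occurs at that position
bases = ["a", "c", "g", "t"]
positions = nucleotide_sequence.drop(columns="Class")
positions = positions.astype(pd.CategoricalDtype(categories=bases))

# One indicator column per (position, base) pair, named like '0_a', '0_c', ...
one_hot = pd.get_dummies(positions, dtype=int)

# Encode the label column: '+' becomes 1 and '-' becomes 0
one_hot["Class"] = nucleotide_sequence["Class"].map({"+": 1, "-": 0})
print(one_hot)
```

Declaring the categories up front keeps the column count at four per position, which matches the '1000' / '0100' / '0010' / '0001' codes in 'key1'. Without it, get_dummies only creates columns for the bases it actually sees in each position.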