@rg3915
Last active April 8, 2019 02:26
Reading a txt file and regrouping its items into JSON

Starting from the following txt file:

https://gist.github.com/rg3915/641e7245908a191f445f54137c9948d2#file-rna-txt

I am trying to generate JSON in the format below:

[
    {
        'id': 'GCVF01004444.1.2369',
        'reino': 'Bacteria',
        'rna': 'CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA
                ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU
                GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG
                AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA
                UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG
                GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU
                AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC
                GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG
                CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU
                GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG
                AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA
                UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG
                GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU
                AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC
                GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG'
    },
    {
        'id': 'GCVF77777777.1.1963',
        'reino': 'Bacteria',
        'rna': 'GCGCAAACGGUGGAUGCCUAGGCAGUAAGAGGCGAUGAAGGACGUGGAAUCCUGCGAAAAGCUAUGGUGAGCUGGAAACA
                AGCGCUGAGCCGUAGAUGUCCGAAUGGGGAAACCCGGCCAUAUGCAGAUAUGGUCACUCAUAAGUGAAUACAUAGGUUAU
                GAGGGCGAACUCGGGGAACUGAAACAUCUAAGUACCCGAAGGAAAAGAAAUCAAACGAGAUUCCCUAAGUAGCGGCGAGC
                GAACGGGGAGGAGCCUGGUGUGAUAUAGGUAAGAACUAAGUGGAAGCAACUGGAAAGUUGAGACAUAGAGGGUGAUAUCC
                CCGUACACGAAGAGACUGCUGGAACUAAGCACACGAACAAGUAGGUCGGAACACGAGAAAUUCUGAUUGAAUAUGGGUGG
                ACCAUCAUCCAAGGCUAAAUACUCCUUACUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGUGAAAAGAACCCCGGAG
                AGGGGAGUGAAAUAGAUCCUGAAACCGUUUGCGUACAAGCAGUGGGAGCAUGGGCUUAGGCUUCGUGUGACUGCGUACCU
                UUUGUAUAAUGGGUCAGCGAGUUACUUUCAGUGGCGAGGUUAACAAAGAAGGAAGCCGUAGAGAAAUCGAGUCUUAAAAG
                GGCGCGAGUCGCUGGGAGUAGACCCGAAACCGGGCGAUCUAGCCAUGUCCAGGAUGAAGGUUGGGUAACACCAAGUGGAG
                GUCCGAACCGGGUAAUGUUGAAAAAUUAUCGGAUGAGGUGUGGCUAGGAGUGAAAGGCUAAUCAAGCCCGGAGAUAGCUG
                CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU
                GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG
                AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA
                UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG
                GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU
                AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC
                GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG'
    },
]
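
One possible way to get this shape, sketched under assumptions: the linked file is taken to be FASTA-like, where each record starts with a header line such as `>ID Kingdom;...` and is followed by sequence lines containing only A, C, G and U, and the `rna` value is taken to be those lines concatenated. The helper name `parse_records` and the sample lines are illustrative and do not come from the gist. Valid JSON also needs double quotes and no trailing comma, which `json.dumps` handles.

```python
import json


def parse_records(lines):
    """Group FASTA-like lines into one dict per '>' header (hypothetical helper)."""
    records = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith('>'):
            # Assumed header layout: ">ID Kingdom;more;taxonomy"
            parts = line[1:].split(None, 1)
            taxonomy = parts[1] if len(parts) > 1 else ''
            # A fresh dict per header, so records never share state
            records.append({
                'id': parts[0],
                'reino': taxonomy.split(';')[0],
                'rna': '',
            })
        elif records and set(line) <= set('ACGU'):
            # Sequence line: append it to the most recent record
            records[-1]['rna'] += line
    return records


# Made-up sample in the assumed layout (not the real RNA.txt)
sample = [
    '>ID0001.1.10 Bacteria;Proteobacteria\n',
    'CGUGCACGGU\n',
    'GGAUGCCUUG\n',
    '>ID0002.1.5 Archaea;Euryarchaeota\n',
    'GCGCA\n',
]

records = parse_records(sample)
print(json.dumps(records, indent=4))
```

With the real file, the list passed in would be the result of `f.readlines()` on `RNA.txt`.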

Here is the code I have tried so far:

https://gist.github.com/rg3915/641e7245908a191f445f54137c9948d2#file-rna-ipynb

But I got stuck at this point and could not continue:

[{'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'},
 'CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA',
 'ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',
 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',
 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',
 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',
 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',
 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',
 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',
 'CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',
 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',
 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',
 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',
 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',
 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',
 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',
 {'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'},
 'GCGCAAACGGUGGAUGCCUAGGCAGUAAGAGGCGAUGAAGGACGUGGAAUCCUGCGAAAAGCUAUGGUGAGCUGGAAACA',
 'AGCGCUGAGCCGUAGAUGUCCGAAUGGGGAAACCCGGCCAUAUGCAGAUAUGGUCACUCAUAAGUGAAUACAUAGGUUAU',
 'GAGGGCGAACUCGGGGAACUGAAACAUCUAAGUACCCGAAGGAAAAGAAAUCAAACGAGAUUCCCUAAGUAGCGGCGAGC',
 'GAACGGGGAGGAGCCUGGUGUGAUAUAGGUAAGAACUAAGUGGAAGCAACUGGAAAGUUGAGACAUAGAGGGUGAUAUCC',
 'CCGUACACGAAGAGACUGCUGGAACUAAGCACACGAACAAGUAGGUCGGAACACGAGAAAUUCUGAUUGAAUAUGGGUGG',
 'ACCAUCAUCCAAGGCUAAAUACUCCUUACUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGUGAAAAGAACCCCGGAG',
 'AGGGGAGUGAAAUAGAUCCUGAAACCGUUUGCGUACAAGCAGUGGGAGCAUGGGCUUAGGCUUCGUGUGACUGCGUACCU',
 'UUUGUAUAAUGGGUCAGCGAGUUACUUUCAGUGGCGAGGUUAACAAAGAAGGAAGCCGUAGAGAAAUCGAGUCUUAAAAG',
 'GGCGCGAGUCGCUGGGAGUAGACCCGAAACCGGGCGAUCUAGCCAUGUCCAGGAUGAAGGUUGGGUAACACCAAGUGGAG',
 'GUCCGAACCGGGUAAUGUUGAAAAAUUAUCGGAUGAGGUGUGGCUAGGAGUGAAAGGCUAAUCAAGCCCGGAGAUAGCUG',
 'CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',
 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',
 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',
 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',
 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',
 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',
 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',
 {'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'},]
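
A possible way past this point, as a sketch only: walk the flat list once, start a new record whenever a dict appears, and append every following string to that record's `rna`. The names `regroup` and `flat` are illustrative. Separately, every dict in the output above carries the same id; the notebook code creates `mydict` once and appends that same object for each header, so the last header overwrites the earlier ones. Regrouping does not repair that, so a new dict would have to be created per header when the list is built.

```python
def regroup(flat):
    """Turn [dict, str, str, ..., dict, str, ...] into a list of complete records."""
    records = []
    for item in flat:
        if isinstance(item, dict):
            # Copy the header dict so records stay independent of each other
            records.append({**item, 'rna': ''})
        elif records:
            # A string belongs to the most recent header seen
            records[-1]['rna'] += item
    return records


# Shortened, made-up stand-in for the flat list shown above
flat = [
    {'id': 'ID0001', 'reino': 'Bacteria'},
    'CGUG',
    'ACCG',
    {'id': 'ID0002', 'reino': 'Bacteria'},
    'GCGC',
]

result = regroup(flat)
print(result)
```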
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import re"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"with open('RNA.txt', 'r') as f:\n",
"    lines = f.readlines()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"def is_rna(sequence):\n",
"    rna = set('ACGU')\n",
"    if sequence:\n",
"        if set(sequence).issubset(rna):\n",
"            return True\n",
"    return False"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'},\n",
" 'CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA',\n",
" 'ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',\n",
" 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',\n",
" 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',\n",
" 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',\n",
" 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',\n",
" 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',\n",
" 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',\n",
" 'CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',\n",
" 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',\n",
" 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',\n",
" 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',\n",
" 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',\n",
" 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',\n",
" 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',\n",
" {'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'},\n",
" 'GCGCAAACGGUGGAUGCCUAGGCAGUAAGAGGCGAUGAAGGACGUGGAAUCCUGCGAAAAGCUAUGGUGAGCUGGAAACA',\n",
" 'AGCGCUGAGCCGUAGAUGUCCGAAUGGGGAAACCCGGCCAUAUGCAGAUAUGGUCACUCAUAAGUGAAUACAUAGGUUAU',\n",
" 'GAGGGCGAACUCGGGGAACUGAAACAUCUAAGUACCCGAAGGAAAAGAAAUCAAACGAGAUUCCCUAAGUAGCGGCGAGC',\n",
" 'GAACGGGGAGGAGCCUGGUGUGAUAUAGGUAAGAACUAAGUGGAAGCAACUGGAAAGUUGAGACAUAGAGGGUGAUAUCC',\n",
" 'CCGUACACGAAGAGACUGCUGGAACUAAGCACACGAACAAGUAGGUCGGAACACGAGAAAUUCUGAUUGAAUAUGGGUGG',\n",
" 'ACCAUCAUCCAAGGCUAAAUACUCCUUACUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGUGAAAAGAACCCCGGAG',\n",
" 'AGGGGAGUGAAAUAGAUCCUGAAACCGUUUGCGUACAAGCAGUGGGAGCAUGGGCUUAGGCUUCGUGUGACUGCGUACCU',\n",
" 'UUUGUAUAAUGGGUCAGCGAGUUACUUUCAGUGGCGAGGUUAACAAAGAAGGAAGCCGUAGAGAAAUCGAGUCUUAAAAG',\n",
" 'GGCGCGAGUCGCUGGGAGUAGACCCGAAACCGGGCGAUCUAGCCAUGUCCAGGAUGAAGGUUGGGUAACACCAAGUGGAG',\n",
" 'GUCCGAACCGGGUAAUGUUGAAAAAUUAUCGGAUGAGGUGUGGCUAGGAGUGAAAGGCUAAUCAAGCCCGGAGAUAGCUG',\n",
" 'CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',\n",
" 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',\n",
" 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',\n",
" 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',\n",
" 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',\n",
" 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',\n",
" 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',\n",
" {'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'},\n",
" 'CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA',\n",
" 'ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',\n",
" 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',\n",
" 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',\n",
" 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',\n",
" 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',\n",
" 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',\n",
" 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',\n",
" 'CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',\n",
" 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',\n",
" 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',\n",
" 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',\n",
" 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',\n",
" 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',\n",
" 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',\n",
" {'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'},\n",
" 'CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA',\n",
" 'ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',\n",
" 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',\n",
" 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',\n",
" 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',\n",
" 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',\n",
" 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',\n",
" 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',\n",
" 'CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',\n",
" 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',\n",
" 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',\n",
" 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',\n",
" 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',\n",
" 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',\n",
" 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG']"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"aux = []\n",
"aux_rna = []\n",
"aux_final = []\n",
"mydict = {}\n",
"mydict_rna = {}\n",
"_items = [re.split('\\s', line) for line in lines]\n",
"for i, item in enumerate(_items):\n",
"    if item[0].startswith('>'):\n",
"        # NOTE: mydict is created once above, so every append stores the same dict object\n",
"        mydict['id'] = item[0][1:]\n",
"        mydict['reino'] = item[1].split(';')[0]\n",
"        aux.append(mydict)\n",
"    if is_rna(sequence=item[0]):\n",
"        aux.append(item[0])\n",
"aux"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'1': 'CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA'}\n",
"{'1': 'ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU'}\n",
"{'1': 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG'}\n",
"{'1': 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA'}\n",
"{'1': 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG'}\n",
"{'1': 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU'}\n",
"{'1': 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG'}\n",
"{'1': 'CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU'}\n",
"{'1': 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG'}\n",
"{'1': 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA'}\n",
"{'1': 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG'}\n",
"{'1': 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU'}\n",
"{'1': 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GCGCAAACGGUGGAUGCCUAGGCAGUAAGAGGCGAUGAAGGACGUGGAAUCCUGCGAAAAGCUAUGGUGAGCUGGAAACA'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'AGCGCUGAGCCGUAGAUGUCCGAAUGGGGAAACCCGGCCAUAUGCAGAUAUGGUCACUCAUAAGUGAAUACAUAGGUUAU'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GAGGGCGAACUCGGGGAACUGAAACAUCUAAGUACCCGAAGGAAAAGAAAUCAAACGAGAUUCCCUAAGUAGCGGCGAGC'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GAACGGGGAGGAGCCUGGUGUGAUAUAGGUAAGAACUAAGUGGAAGCAACUGGAAAGUUGAGACAUAGAGGGUGAUAUCC'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'CCGUACACGAAGAGACUGCUGGAACUAAGCACACGAACAAGUAGGUCGGAACACGAGAAAUUCUGAUUGAAUAUGGGUGG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'ACCAUCAUCCAAGGCUAAAUACUCCUUACUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGUGAAAAGAACCCCGGAG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'AGGGGAGUGAAAUAGAUCCUGAAACCGUUUGCGUACAAGCAGUGGGAGCAUGGGCUUAGGCUUCGUGUGACUGCGUACCU'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'UUUGUAUAAUGGGUCAGCGAGUUACUUUCAGUGGCGAGGUUAACAAAGAAGGAAGCCGUAGAGAAAUCGAGUCUUAAAAG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GGCGCGAGUCGCUGGGAGUAGACCCGAAACCGGGCGAUCUAGCCAUGUCCAGGAUGAAGGUUGGGUAACACCAAGUGGAG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GUCCGAACCGGGUAAUGUUGAAAAAUUAUCGGAUGAGGUGUGGCUAGGAGUGAAAGGCUAAUCAAGCCCGGAGAUAGCUG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC'}\n",
"{'1': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '2': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '3': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG', '4': 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG'}\n"
]
},
{
"data": {
"text/plain": [
"[{'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'},\n",
" {'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'},\n",
" {'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'},\n",
" {'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'}]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"i, j = 0, 0\n",
"lista0 = []\n",
"lista2 = []\n",
"l = []\n",
"d = {}\n",
"for k, item in enumerate(aux):\n",
"    if isinstance(item, dict):\n",
"        i += 1\n",
"        lista0.append(item)\n",
"    else:\n",
"        j = i\n",
"        d[str(j)] = item\n",
"        print(d)\n",
"        # lista2.append(d[j] = item)\n",
"lista1"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"lista2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"i, j = 0, 0\n",
"lista0 = []\n",
"lista2 = []\n",
"l = []\n",
"d = {}\n",
"for k, item in enumerate(aux):\n",
"    if isinstance(item, dict):\n",
"        i += 1\n",
"        lista0.append(item)\n",
"    else:\n",
"        j = i\n",
"        d[str(j)] = item\n",
"        print(d)\n",
"        # lista2.append(d[j] = item)\n",
"lista1"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"def funcao(aux):\n",
" a = []\n",
" for item in aux:\n",
" a.append(item)\n",
" if isinstance(item,dict):\n",
" return aux\n",
" aux = []\n",
" if len(aux) != 0:\n",
" return aux\n",
" "
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'},\n",
" 'CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA',\n",
" 'ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',\n",
" 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',\n",
" 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',\n",
" 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',\n",
" 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',\n",
" 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',\n",
" 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',\n",
" 'CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',\n",
" 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',\n",
" 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',\n",
" 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',\n",
" 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',\n",
" 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',\n",
" 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',\n",
" {'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'},\n",
" 'GCGCAAACGGUGGAUGCCUAGGCAGUAAGAGGCGAUGAAGGACGUGGAAUCCUGCGAAAAGCUAUGGUGAGCUGGAAACA',\n",
" 'AGCGCUGAGCCGUAGAUGUCCGAAUGGGGAAACCCGGCCAUAUGCAGAUAUGGUCACUCAUAAGUGAAUACAUAGGUUAU',\n",
" 'GAGGGCGAACUCGGGGAACUGAAACAUCUAAGUACCCGAAGGAAAAGAAAUCAAACGAGAUUCCCUAAGUAGCGGCGAGC',\n",
" 'GAACGGGGAGGAGCCUGGUGUGAUAUAGGUAAGAACUAAGUGGAAGCAACUGGAAAGUUGAGACAUAGAGGGUGAUAUCC',\n",
" 'CCGUACACGAAGAGACUGCUGGAACUAAGCACACGAACAAGUAGGUCGGAACACGAGAAAUUCUGAUUGAAUAUGGGUGG',\n",
" 'ACCAUCAUCCAAGGCUAAAUACUCCUUACUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGUGAAAAGAACCCCGGAG',\n",
" 'AGGGGAGUGAAAUAGAUCCUGAAACCGUUUGCGUACAAGCAGUGGGAGCAUGGGCUUAGGCUUCGUGUGACUGCGUACCU',\n",
" 'UUUGUAUAAUGGGUCAGCGAGUUACUUUCAGUGGCGAGGUUAACAAAGAAGGAAGCCGUAGAGAAAUCGAGUCUUAAAAG',\n",
" 'GGCGCGAGUCGCUGGGAGUAGACCCGAAACCGGGCGAUCUAGCCAUGUCCAGGAUGAAGGUUGGGUAACACCAAGUGGAG',\n",
" 'GUCCGAACCGGGUAAUGUUGAAAAAUUAUCGGAUGAGGUGUGGCUAGGAGUGAAAGGCUAAUCAAGCCCGGAGAUAGCUG',\n",
" 'CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',\n",
" 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',\n",
" 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',\n",
" 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',\n",
" 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',\n",
" 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',\n",
" 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',\n",
" {'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'},\n",
" 'CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA',\n",
" 'ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',\n",
" 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',\n",
" 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',\n",
" 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',\n",
" 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',\n",
" 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',\n",
" 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',\n",
" 'CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',\n",
" 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',\n",
" 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',\n",
" 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',\n",
" 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',\n",
" 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',\n",
" 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',\n",
" {'id': 'GCVF02004444.1.2369', 'reino': 'Bacteria'},\n",
" 'CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA',\n",
" 'ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',\n",
" 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',\n",
" 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',\n",
" 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',\n",
" 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',\n",
" 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',\n",
" 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG',\n",
" 'CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU',\n",
" 'GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG',\n",
" 'AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA',\n",
" 'UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG',\n",
" 'GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU',\n",
" 'AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC',\n",
" 'GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG']"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"funcao(aux)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
>GCVF01004444.1.2369 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Alcanivoraceae;Alcanivorax;Thalassiosira rotula
CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA
ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU
GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG
AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA
UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG
GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU
AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC
GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG
CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU
GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG
AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA
UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG
GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU
AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC
GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG
>GCVF77777777.1.1963 Bacteria;Proteobacteria;Gammaproteobacteria;Legionellales;Legionellaceae;Legionella;Thalassiosira rotula
GCGCAAACGGUGGAUGCCUAGGCAGUAAGAGGCGAUGAAGGACGUGGAAUCCUGCGAAAAGCUAUGGUGAGCUGGAAACA
AGCGCUGAGCCGUAGAUGUCCGAAUGGGGAAACCCGGCCAUAUGCAGAUAUGGUCACUCAUAAGUGAAUACAUAGGUUAU
GAGGGCGAACUCGGGGAACUGAAACAUCUAAGUACCCGAAGGAAAAGAAAUCAAACGAGAUUCCCUAAGUAGCGGCGAGC
GAACGGGGAGGAGCCUGGUGUGAUAUAGGUAAGAACUAAGUGGAAGCAACUGGAAAGUUGAGACAUAGAGGGUGAUAUCC
CCGUACACGAAGAGACUGCUGGAACUAAGCACACGAACAAGUAGGUCGGAACACGAGAAAUUCUGAUUGAAUAUGGGUGG
ACCAUCAUCCAAGGCUAAAUACUCCUUACUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGUGAAAAGAACCCCGGAG
AGGGGAGUGAAAUAGAUCCUGAAACCGUUUGCGUACAAGCAGUGGGAGCAUGGGCUUAGGCUUCGUGUGACUGCGUACCU
UUUGUAUAAUGGGUCAGCGAGUUACUUUCAGUGGCGAGGUUAACAAAGAAGGAAGCCGUAGAGAAAUCGAGUCUUAAAAG
GGCGCGAGUCGCUGGGAGUAGACCCGAAACCGGGCGAUCUAGCCAUGUCCAGGAUGAAGGUUGGGUAACACCAAGUGGAG
GUCCGAACCGGGUAAUGUUGAAAAAUUAUCGGAUGAGGUGUGGCUAGGAGUGAAAGGCUAAUCAAGCCCGGAGAUAGCUG
CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU
GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG
AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA
UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG
GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU
AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC
GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG
>GCVF01004444.1.2369 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Alcanivoraceae;Alcanivorax;Thalassiosira rotula
CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA
ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU
GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG
AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA
UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG
GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU
AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC
GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG
CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU
GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG
AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA
UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG
GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU
AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC
GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG
>GCVF02004444.1.2369 Bacteria;Proteobacteria;Gammaproteobacteria;Oceanospirillales;Alcanivoraceae;Alcanivorax;Thalassiosira rotula
CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA
ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU
GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG
AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA
UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG
GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU
AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC
GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG
CCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU
GAGGCGAACGCGGGGAACUGAAACAUCUAAGUACCCGUAGGAACAGAAAUCAAUUGAGAUUCCCUGAGUAGCGGCGAGCG
AACGGGGAUUAGCCCUUAAGCUGAUGACUGAUUAGGAGAACGGUCUGGGAAGGCCGACCAUAGUGGGUGAUAGUCCCGUA
UCCGAAAAUCUGAUUCAGUGAAAACGAGUAGGUCGGGGCACGUGUAACCUUGACUGAACAUGGGGGGACCAUCCUCCAAG
GCUAAAUACUCCUGGCUGACCGAUAGUGAACCAGUACCGUGAGGGAAAGGCGAAAAGAACCCCGGAGAGGGGAGUGAAAU
AGAUCCUGAAACCGUGCACGUACAAGCAGUCGGAGCCCGCUUUGUUGGGUGACGGCGUACCUUUUGUAUAAUGGGUCAGC
GACUUAUUCUCAGUAGCGAGGUUAACCAUCUAGGGGAGCCGUAGGGAAACCGAGUCUGAAUAGGGCGUUGAGUUGCUGGG
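
The notebook output above still mixes the header dicts with loose sequence strings instead of grouping them. The following is a minimal sketch of one way to regroup the txt into the `id` / `reino` / `rna` layout described at the top. The names `parse_records` and `sample` are hypothetical and do not come from the original notebook. It assumes that each record starts with a `>` line, that `reino` is the first `;`-separated field after the id, and that `rna` should be the sequence lines joined into a single string. The sample reuses only a few lines of the file to keep it short.

```python
import json


def parse_records(lines):
    """Group '>' header lines with the sequence lines that follow them."""
    records = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # Header: ">ID Kingdom;Phylum;...": split the id from the taxonomy.
            record_id, _, taxonomy = line[1:].partition(" ")
            current = {
                "id": record_id,
                "reino": taxonomy.split(";")[0],
                "rna": [],
            }
            records.append(current)
        elif current is not None:
            # Sequence line: belongs to the most recent header.
            current["rna"].append(line)
    # Assumption: the sequence is wanted as one string without line breaks.
    for record in records:
        record["rna"] = "".join(record["rna"])
    return records


# A shortened excerpt of the txt (hypothetical variable, real lines).
sample = (
    ">GCVF01004444.1.2369 Bacteria;Proteobacteria;Gammaproteobacteria;"
    "Oceanospirillales;Alcanivoraceae;Alcanivorax;Thalassiosira rotula\n"
    "CGUGCACGGUGGAUGCCUUGGCAGCCAGAGGCGAUGAAGGACGUUGUAGCCUGCGAUAAGCUCCGGUUAGGUGGCAAACA\n"
    "ACCGUUUGACCCGGAGAUCUCCGAAUGGGGCAACCCACCCGUUGUAAGGCGGGUAUCACCGACUGAAUCCAUAGGUCGGU\n"
    ">GCVF77777777.1.1963 Bacteria;Proteobacteria;Gammaproteobacteria;"
    "Legionellales;Legionellaceae;Legionella;Thalassiosira rotula\n"
    "GCGCAAACGGUGGAUGCCUAGGCAGUAAGAGGCGAUGAAGGACGUGGAAUCCUGCGAAAAGCUAUGGUGAGCUGGAAACA\n"
)

records = parse_records(sample.splitlines())
print(json.dumps(records, indent=4))
```

For the full file, the same function could be fed an open file handle, for example `parse_records(open("rna.txt"))`, assuming the txt is saved under that name. Records are kept in file order, so ids that appear more than once in the txt produce separate entries rather than being merged.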