@ionox0
Last active April 7, 2021 20:15
find_duplicate_fastq_requests
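from django.db.models import Count

# File and FileMetadata are the project's Django models, assumed to be imported already (e.g. in a Django shell).
# Request IDs of all MSK-ACCESS fastq files.
access_filems = FileMetadata.objects.filter(metadata__recipe='MSK-ACCESS_v1', file__file_type__name='fastq')
access_request_ids = {f.metadata['requestId'] for f in access_filems}
# --> 103 ACCESS request IDs total

# fastq file names that appear more than once among those requests.
duplicate_file_names = (
    File.objects
    .filter(
        filemetadata__metadata__requestId__in=access_request_ids,
        file_type__name='fastq',
    )
    .values('file_name')
    .annotate(file_name_count=Count('file_name'))
    .filter(file_name_count__gt=1)
)

# Metadata of every file with one of those names. The lookup is by file name only,
# so same-named files outside the ACCESS requests are matched as well.
duplicates = FileMetadata.objects.filter(file__file_name__in=[f['file_name'] for f in duplicate_file_names])
requests_with_duplicate_files = {f.metadata['requestId'] for f in duplicates}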
"""
--> ACCESS requests with duplicate files:
{'05500_FZ',
'05500_GB',
'06302_AC',
'06302_AD',
'06302_AE',
'06302_AF',
'06302_AG',
'06302_AH',
'06302_AN',
'06745_J',
'07250_BS',
'07250_BX',
'09324_G',
'09342_B',
'10151_F',
'10212_H'}
"""
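
The query above needs a live database, so here is a minimal sketch of the same grouping logic on plain Python data, which may help when checking the approach without Django. The `records` list, its file names and request IDs, and the helper `find_requests_with_duplicate_files` are hypothetical stand-ins, not part of the original gist or of the project's models.

```python
from collections import Counter

# Hypothetical in-memory stand-ins for fastq file rows: (file_name, requestId).
records = [
    ("sample_A_R1.fastq.gz", "REQ_1"),
    ("sample_A_R1.fastq.gz", "REQ_2"),  # same file name registered a second time
    ("sample_B_R1.fastq.gz", "REQ_2"),
    ("sample_C_R1.fastq.gz", "REQ_3"),
]


def find_requests_with_duplicate_files(records):
    """Return the request IDs linked to any file name that occurs more than once."""
    # Counterpart of .values('file_name').annotate(Count('file_name'))
    name_counts = Counter(name for name, _ in records)
    # Counterpart of .filter(file_name_count__gt=1)
    duplicate_names = {name for name, count in name_counts.items() if count > 1}
    # Counterpart of collecting requestId from the matching metadata rows
    return {request_id for name, request_id in records if name in duplicate_names}


requests_with_duplicates = find_requests_with_duplicate_files(records)
print(sorted(requests_with_duplicates))
```

As in the ORM version, a request is reported as soon as it holds one file whose name is shared with any other record, whether the other copy sits in the same request or a different one.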