@ljnmedium
Created September 29, 2023 07:37
example_upload_data.py
data = [
{'content': 'This universal registration document, eco-designed from start to finish for the second year, captures the essence \nof what makes Eiffage different and embodies its vision and commitments to advancing the environmental transition.\nUNIVERSAL REGISTRATION DOCUMENT 2021Advancing the environmental transition \nand fulfilling our social responsibility',
'metadata': {'source': 'Universal Registration Document/10_EIFFAGE_URD2021_VA.pdf',
'page': 0,
'id': 0}},
{'content': '01In your hands you have a document designed \nusing a low-carbon approach for the second \nyear in a row. Its plain and simple content, \nthe\xa0typefaces selected, the visuals, colours, \nprinting techniques and distribution methods… \nAll these decisions help to\xa0make this annual \nreport a resource-efficient, low-impact \nproduct. \nEvery year, we refine our approach using \nthe\xa0lessons we learned from our previous \nexperience, as all the Group’s teams do. \nWe\xa0continue to work hard to devise and \nimplement solutions that will enable us \nto\xa0fulfil\xa0our ambition of contributing every day, \nat\xa0every level, to the environmental transition \neverywhere we operate.Interwiew with Benoît de Ruffray p. 02-03 \nProfile and key figures p. 04-17\nHighly engaged teams p. 18-29\nOur commitment to a sustainable offer p. 30-55\nNon-financial performance statement p. 57-156\nFinancial and governance information p. 157-295 \nGeneral information p. 296-303 \nCross-references tables p. 305-312CONTENT\n100% \nrecycled paper\nClean design with \nno large areas of colour\nMonitoring and compliance \nwith ICPE thresholds\nPlant-based \ninks usedZero emails sent \nduring production \nof the projectInk coverage \nreduced by 20%\n0% paper wasted \nby optimising printing\nGreen electricity, \n100% generated \nfrom renewable sourcesNumber of documents \nprinted cut by 12%',
'metadata': {'source': 'Universal Registration Document/10_EIFFAGE_URD2021_VA.pdf',
'page': 2,
'id': 1}},
{'content': '02INTERVIEW\nWe can look to the future \nwith confidence despite the \nuncertainties \ngenerated \nby the major \ngeopolitical \ncrisis on \nEurope’s borders given the growth \nmomentum we achieved in 2021 \nand the visibility we\xa0have on 2022 \nthanks to our high order book, \nas well as the tremendous \nopportunities arising from \nthe environmental and digital \ntransition. Our driving force comes \nfrom our team’s engagement \nand the dynamic performance \nof all our business lines.Benoît \nde Ruffray, \nChairman\nand Chief\nExecutive \nOfficer \nof Eiffage',
'metadata': {'source': 'Universal Registration Document/10_EIFFAGE_URD2021_VA.pdf',
'page': 3,
'id': 2}}
]
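# Retreiver, EMBEDDING_MODEL and SPARSE_MODEL_FILE_NAME are not defined in this file;
# they are assumed to come from the accompanying project code.
retriever = Retreiver(
    index_name='project_hello',
    embedd_openai_model=EMBEDDING_MODEL,
    sparse_model_file_name=SPARSE_MODEL_FILE_NAME,
)
# Upload the records into the "Eiffage" namespace of the index.
retriever.upsert_batch(data, namespace="Eiffage")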
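# Check the index statistics after the upload; example output is shown in the comment below.
print(retriever.index.describe_index_stats())
# {'dimension': 1536, 'index_fullness': 0.1, 'namespaces': {'Eiffage': {'vector_count': 344}}, 'total_vector_count': 6173}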
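Each record above pairs a 'content' string with a 'metadata' dict holding 'source', 'page' and 'id'. A minimal sketch of a pre-upload check is given here, assuming those keys are what the upload expects. That is inferred from the sample data only, not from the Retreiver class, and the `validate_records` helper and `REQUIRED_METADATA_KEYS` constant are hypothetical names introduced for illustration.

```python
from typing import Any

# Assumption: these keys are inferred from the sample records, not from Retreiver itself.
REQUIRED_METADATA_KEYS = {"source", "page", "id"}


def validate_records(records: list[dict[str, Any]]) -> list[str]:
    """Return a list of problems found; an empty list means the records look consistent."""
    problems: list[str] = []
    seen_ids: set[Any] = set()
    for pos, record in enumerate(records):
        # 'content' should be a non-empty string, since it is the text to embed.
        content = record.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"record {pos}: 'content' must be a non-empty string")

        metadata = record.get("metadata")
        if not isinstance(metadata, dict):
            problems.append(f"record {pos}: 'metadata' must be a dict")
            continue

        missing = REQUIRED_METADATA_KEYS - metadata.keys()
        if missing:
            problems.append(f"record {pos}: missing metadata keys {sorted(missing)}")

        # Duplicate ids would make records indistinguishable by id.
        if "id" in metadata:
            if metadata["id"] in seen_ids:
                problems.append(f"record {pos}: duplicate id {metadata['id']!r}")
            seen_ids.add(metadata["id"])
    return problems
```

It could be called as `problems = validate_records(data)` just before `retriever.upsert_batch(...)`, uploading only when the returned list is empty.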