
@wware
Last active November 13, 2023 21:22
This is a README for a GPT I'm working on

RDF Scholar

RDF Scholar is a specialized AI that helps convert scientific text into Turtle/RDF, making scientific literature more machine-readable. It draws on ontologies such as ModSci and PharmSci and aligns with resources like the Semantic Web for Life Sciences so that its RDF output is accurate and semantically rich. The goal is to facilitate the organization, sharing, and discovery of scientific knowledge through structured data. Its functionality is being improved continuously, based on user feedback, to better translate scientific texts into RDF graphs.
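To make the target format concrete, here is a minimal sketch of what a Turtle rendering of one sentence might look like. It is not RDF Scholar's actual output. The `ex:` namespace, the class and property names, the source sentence, and the `to_turtle` helper are all assumptions made for illustration; a real translation would use terms from published ontologies.

```python
# Hypothetical namespaces and terms, chosen only for illustration.
PREFIXES = {
    "ex": "http://example.org/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def to_turtle(prefixes, triples):
    """Serialize (subject, predicate, object) triples as Turtle text.

    Each term is expected to already be a prefixed name (ex:Aspirin),
    the keyword 'a' (shorthand for rdf:type), or a quoted literal.
    """
    lines = [f"@prefix {name}: <{iri}> ." for name, iri in prefixes.items()]
    lines.append("")
    for subject, predicate, obj in triples:
        lines.append(f"{subject} {predicate} {obj} .")
    return "\n".join(lines)


# Made-up source sentence: "Aspirin inhibits the enzyme COX-1."
triples = [
    ("ex:Aspirin", "a", "ex:Drug"),
    ("ex:COX1", "a", "ex:Enzyme"),
    ("ex:Aspirin", "ex:inhibits", "ex:COX1"),
    ("ex:Aspirin", "rdfs:label", '"Aspirin"'),
]

turtle_text = to_turtle(PREFIXES, triples)
print(turtle_text)
```

Even this tiny graph separates typing statements (`a`) from the relationship the sentence asserts (`ex:inhibits`), which is the kind of structure a reasoner can work with.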

I think it's a good idea to enable machines to read and understand our scientific literature as well as they can. They might generate new ideas or propose experiments we hadn't thought of. They can work non-stop without getting tired, so they might help science and medicine advance more quickly.

How did I put this thing together?

What follows is the approximate history of the prompts and instructions I used; the wording is not exact.

The initial specification for RDF Scholar:

Name: RDF Scholar
Description: Translator from scientific text to Turtle/RDF
Context: RDF Scholar operates with a balanced tone, being approachable and friendly while
         maintaining the formality expected in scientific discourse. It is designed to
         inspire confidence in researchers, providing translations of medical research
         into Turtle/RDF with accuracy and clarity. When clarification is needed, it inquires
         politely, ensuring users feel respected and supported throughout their interaction.
Abilities: browser, dalle
Welcome Message: Hello! Send me scientific text and I'll translate it to Turtle/RDF.
RDF Scholar does not have a profile picture...

User

At the end of the last run it said

This is a simplified example and the actual RDF would include URIs for entities and properties, definitions for each of the properties and classes used, and potentially more granular data about the relationships between entities.

What can we do to get that more granular data? My ultimate goal is to generate a semantic network that represents the real content of the paper in a way that a reasoner like Jena or RDFlib could meaningfully reason about the ideas in it, and become a useful contributor to the scientific process.

GPT Builder

To achieve a more granular representation of scientific papers in Turtle/RDF format, RDF Scholar will now aim to delve deeper into the content, capturing intricate relationships and the contextual framework of the research. It will be tailored to ask probing questions to extract additional layers of information, ensuring that the RDF output is sufficiently detailed for reasoning engines like Jena or RDFlib to analyze the ideas effectively. RDF Scholar will also focus on understanding and applying relevant ontologies, defining URIs, and characterizing properties with precision, thereby contributing meaningfully to the scientific process.

With this enhanced capability, RDF Scholar will become a valuable tool for researchers looking to translate their findings into a semantic network that can be used for advanced reasoning and analysis. The updated behavior will be reflected in future interactions. If you have more specific features in mind or further details to add, please let me know!
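As an aside to the exchange above, here is a rough sketch of what it means for a reasoner to work over such a graph. It does not use Jena or RDFLib; it is a toy forward-chaining loop in plain Python that applies two RDFS entailment rules (rdfs9 and rdfs11). The `ex:` terms and the `rdfs_closure` helper are hypothetical names introduced for this example.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply two RDFS rules repeatedly until no new triples appear.

    rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    rdfs9:  (x type A),       (A subClassOf B) => (x type B)
    """
    inferred = set(triples)
    while True:
        new = set()
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS_OF]
        for a, b in subclass_pairs:
            for s, p, o in inferred:
                if p == SUBCLASS_OF and s == b:
                    new.add((a, SUBCLASS_OF, o))  # rdfs11
                if p == RDF_TYPE and o == a:
                    new.add((s, RDF_TYPE, b))  # rdfs9
        if new <= inferred:
            return inferred  # fixed point reached
        inferred |= new


# Hypothetical facts of the kind a translated paper might contain.
facts = {
    ("ex:Aspirin", RDF_TYPE, "ex:NSAID"),
    ("ex:NSAID", SUBCLASS_OF, "ex:AntiInflammatoryDrug"),
    ("ex:AntiInflammatoryDrug", SUBCLASS_OF, "ex:Drug"),
}

closure = rdfs_closure(facts)
derived = sorted(closure - facts)
for triple in derived:
    print(triple)
```

The derived triples are only as useful as the class and property definitions in the input, which is why the granularity of the generated RDF matters so much for the reasoning goal.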

User

I want to recommend a few ontologies that I think may assist in the goal of more meaningful organization and representation of scientific literature.

I'll add more resources like these as I stumble across them. Please feel free to use these ontologies anywhere they may be helpful.

Other things I found later

User

When providing Turtle-formatted responses, please try to make the results as rich and detailed as possible. Follow out the semantic connections. The goal is to build a richly interlinked machine-readable scientific literature, so strive toward that goal.

Feeble, inexpressive semantic networks remain a problem. I think the ultimate fix is to sit down with NLTK and write my own text-to-Turtle translator. That is a lot of work that I don't have time for at the moment.
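For a sense of where such a translator might start, here is a deliberately naive sketch. It does not use NLTK at all: it uses a regular expression to match three-word "subject verb object" sentences and maps the verb to a predicate. The pattern, the `VERB_TO_PREDICATE` table, the `ex:` prefix, and the function names are all assumptions for illustration. A real translator would need tokenization, parsing, entity linking, and ontology lookup, which is where the work lies.

```python
import re

# Hypothetical verb-to-predicate table; a real translator would draw
# predicates from an ontology rather than a hand-written dictionary.
VERB_TO_PREDICATE = {
    "inhibits": "ex:inhibits",
    "activates": "ex:activates",
}

# Matches exactly three tokens: subject, lowercase verb, object.
PATTERN = re.compile(r"^([A-Za-z0-9-]+) ([a-z]+) ([A-Za-z0-9-]+)\.?$")


def local_name(term):
    """Turn a surface term such as 'COX-1' into a prefixed name such as 'ex:COX1'."""
    return "ex:" + re.sub(r"[^A-Za-z0-9]", "", term)


def sentence_to_triple(sentence):
    """Return a (subject, predicate, object) triple, or None if the sentence is not understood."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    subject, verb, obj = match.groups()
    predicate = VERB_TO_PREDICATE.get(verb)
    if predicate is None:
        return None
    return (local_name(subject), predicate, local_name(obj))


for sentence in ["Aspirin inhibits COX-1.", "The results were surprising."]:
    print(sentence, "->", sentence_to_triple(sentence))
```

Almost every real sentence falls through to `None` here, which illustrates the gap between pattern matching and a translator that captures the content of a paper.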
