@anoff
Last active August 12, 2017 21:13
Open Mined

Sonar

Q: How does pySonar / Sonar work?

A: pySonar is a user-facing library exposing the models to build.

Q: Is pySonar only run locally on the data scientist's machine?

A: Yes. Just use it in a notebook or tooling of your choice and submit your model into the Open Mined network.

Q: Is Sonar just a smart contract?

A: It is a set of smart contracts plus an abstraction layer over the underlying blockchain tech.

Q: Is the schema for blockchain info fixed?

A:

Q: How is the model serialized/deserialized (pySonar?)

A: Just a pickled model for now.
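A minimal sketch of what "just pickled" could look like, using only Python's standard `pickle` module. The dict below is a hypothetical stand-in for a model; the real model objects come from syft and are not shown here.

```python
import pickle

# Hypothetical stand-in for a model: a plain dict of named weights.
# (Assumption for illustration only; not the syft model format.)
model = {"name": "demo", "weights": [0.5, -1.0, 2.0]}

# Serialize to bytes, e.g. before pushing the model into the network.
payload = pickle.dumps(model)

# Deserialize on the receiving side. Only unpickle data you trust:
# pickle can execute arbitrary code while loading.
restored = pickle.loads(payload)

print(restored == model)
```

Because unpickling can run arbitrary code, a network of untrusted parties would probably need a safer format later on, which fits the "for now" in the answer.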

Q: Do pySonar and syft need to be kept in sync/version controlled because models are implicitly shared between the two?

A: syft is the library used by the data scientist to create the model. pySonar is only used to push the model into the ecosystem.

Overall?

Q: When we talk about gradients what exactly are they? :-)

A:

Q: Who is responsible for calculating the success (gradient?) of the Mine?

A: The data scientist generating the model has a validation set. The new gradients are checked against the validation set.
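A rough sketch of such a check, assuming a linear model, mean squared error and a fixed learning rate. All function names here are hypothetical and nothing below is taken from the Sonar contracts.

```python
def predict(weights, x):
    # Linear model: dot product of weights and features.
    return sum(w * xi for w, xi in zip(weights, x))


def validation_loss(weights, validation_set):
    """Mean squared error over (features, target) pairs."""
    errors = [(predict(weights, x) - y) ** 2 for x, y in validation_set]
    return sum(errors) / len(errors)


def apply_gradient(weights, gradient, learning_rate=0.1):
    # One plain gradient descent step.
    return [w - learning_rate * g for w, g in zip(weights, gradient)]


def improvement(weights, gradient, validation_set):
    """Positive if the proposed gradient lowers the validation loss."""
    before = validation_loss(weights, validation_set)
    after = validation_loss(apply_gradient(weights, gradient), validation_set)
    return before - after


validation_set = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)]
weights = [0.0, 0.0]

helpful = [-2.0, 1.0]  # moves the weights towards the targets
harmful = [2.0, -1.0]  # moves the weights away from the targets

print(improvement(weights, helpful, validation_set) > 0)
print(improvement(weights, harmful, validation_set) > 0)
```

The size of the improvement could then feed into the reward, but how the payout is actually computed is not covered in these notes.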

Q: Can multiple Mines train at once?

A: Yes. Gradients are computed by multiple Mines, then aggregated and applied to the model afterwards. Compare it to batching in traditional deep learning: multiple Miners together generate one batch.
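To make the batch analogy concrete, here is a small sketch that aggregates per-Mine gradients by averaging them. Averaging is an assumption; the notes only say "aggregating". The function names are hypothetical.

```python
def aggregate(gradients):
    """Average per-parameter gradients coming from several Mines."""
    count = len(gradients)
    return [sum(parts) / count for parts in zip(*gradients)]


def step(weights, gradient, learning_rate=0.1):
    # Apply the aggregated gradient once, like one mini-batch update.
    return [w - learning_rate * g for w, g in zip(weights, gradient)]


# One gradient per Mine, all computed against the same model version.
mine_gradients = [
    [1.0, -2.0],
    [3.0, 0.0],
    [2.0, -1.0],
]

batch_gradient = aggregate(mine_gradients)
new_weights = step([0.5, 0.5], batch_gradient)
print(batch_gradient, new_weights)
```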

Q: Cont. above: can the changes generated by a Mine be considered differential changes and be applied in random order?

e.g. with Model = Mo, Mine1 = M1, Mine2 = M2 and final model parameters = F: would Mo -> M1 -> M2 result in the same F as Mo -> M2 -> M1?

A: No, mining should occur in a roughly sequential order where you build your gradients on top of other people's gradients. The batch analogy works here again: you can't train the model with one huge batch.
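A toy illustration of why the order matters once each Mine computes its gradient on top of the previous update. It assumes a single parameter and one quadratic loss per Mine; the numbers are made up.

```python
def gradient(w, curvature, target):
    """Gradient of the loss 0.5 * curvature * (w - target) ** 2."""
    return curvature * (w - target)


def train_sequentially(w, mines, learning_rate=0.1):
    # Each Mine sees the parameters left behind by the previous one.
    for curvature, target in mines:
        w = w - learning_rate * gradient(w, curvature, target)
    return w


m1 = (1.0, 4.0)   # hypothetical Mine 1: (curvature, target)
m2 = (5.0, -2.0)  # hypothetical Mine 2
w0 = 0.0          # initial model Mo

f_12 = train_sequentially(w0, [m1, m2])  # Mo -> M1 -> M2
f_21 = train_sequentially(w0, [m2, m1])  # Mo -> M2 -> M1

print(f_12, f_21)
```

The two results differ because the second gradient depends on where the first update left the parameters. Gradients that are all computed against the same Mo and then summed (one batch) would not depend on order.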

Q:

Q: Is the priority to make it easy for Data Scientists to create models or Miners to create their mine?

More of a meta topic I'd like to get a feeling of the teams view

A: Personally I'd focus on 2 (Miners), possibly even through gamification to get people interested, while of course making sure that 1 (Data Scientists) is doable; basic programming skills are required there anyway.

Mine

Q: Does the Mine regularly poll the blockchain?

A: The full history of training is on the blockchain.

Q: How does the Mine determine if it is able to run a network?

A:

Q: How is data passed to syft (what's the command to spawn it? data as file would be best)

Q: What libraries should the mine have?

A: sonar-probe, adapters

Q: What does the Mine do with the data it fetches from IPFS?

Q: What does the Mine do with the results from syft?

Q: Can Miners specify a minimum reward they expect for their data? e.g. mining costs need to be taken care of

A: Yes,

Q: Is the first Miner on the network the one gaining the biggest reward?

A: At the beginning the error goes down a lot more than near the end. However it is necessary for Miners to stick around.

Data Scientist

Q: From the workflow it seems like he needs to have some software running as well..?

Q: This thing needs a name

A: Lab 👯?

Q: What happens to Mine results if ... is offline?

Q: Does a data scientist have an influence on mine availability?

A: Miners might get incentivized to stay online and do reproducible training. Maybe an improved payout for subsequent trainings with the same dataset.

Misc

Q: Should we blawg on medium about progress, collaboration experiences, tech stuff?

Q: Most of the descriptions of homomorphic encryption I've seen say that it's impractical because it is computationally expensive. How is it different now?

Q: What are adapters?

A: Used to convert raw data into a format usable by the model/syft. Needs more discussion on separating data ingestion and feature engineering.

  • What docker images to create?
  • How to automate image creation?
    • circle CI to dockerhub
  • When should images be created?
    • develop=:edge, master=:latest
  • How to tag/release?
    • when going master (milestone)
    • add git tag -> automatically build tagged docker images
  • What compose files should we have?
  • Where do we keep compose files? pySonar
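The branch/tag rules above could be sketched as a small helper. This is only an illustration of the mapping; the function name is hypothetical and the real automation would live in the CI configuration, not in Python.

```python
def docker_tags(branch=None, git_tag=None):
    """Return the docker image tags to build for a given commit.

    Mapping assumed from the notes: develop -> edge, master -> latest,
    and any git tag additionally produces an image with that tag.
    """
    tags = []
    if branch == "develop":
        tags.append("edge")
    elif branch == "master":
        tags.append("latest")
    if git_tag:
        tags.append(git_tag)
    return tags


print(docker_tags(branch="develop"))
print(docker_tags(branch="master", git_tag="0.1.0"))
print(docker_tags(branch="feature/foo"))
```

Feature branches produce no tags here, which means no image gets published for them; whether that is desired is still open.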

Use libraries as static images (no CMD/ENTRYPOINT) to build larger images

Todos:

  • issue: put sonar into a separate static image to be required for pysonar/mine builds
  • figure out how to add unique tags to each build
  • shouldn't we use dockerhub builds so we don't have to worry about dependencies anymore? (look at this shitty graph)
  • Can dockerhub autodeploy handle multiple dockerfiles in one repo (and how?)
@cereallarceny

Q: Would it be possible for someone to scrape a website and upload many users' data in order to receive a larger bounty?

@anoff

anoff commented Aug 1, 2017

Q: Is there one or two roles to Capsule? It's definitely responsible for generating PGP public/private keys with Sonar, but should it also be responsible for delivering decrypted results to a data scientist?

A: The smart contract can contact Capsule to decrypt specific values. No other part of the ecosystem can decrypt anything.

Q: Should adapters be a part of a "marketplace" where you can convert data from any format into the Open Mined schemas?

@anoff

anoff commented Aug 1, 2017

Q: Would it be possible for someone to scrape a website and upload many users' data in order to receive a larger bounty?

A: Verified profiles might get an improved bounty (chosen by the data scientist).

@anoff

anoff commented Aug 1, 2017

Q: Are miners identifiable?

A: No.

Q: Is reputation tied to the Miner or to the combination of Miner & Model?

A: If a user consistently gives useful gradients, the 'batch size' might be increased and a Miner's data used longer without validating it. Basically building a system of trust.
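One way this could work, sketched under invented rules: double the number of unvalidated updates after every useful gradient, and fall back to the base value after a bad one. The class name, the doubling rule and the limits are all assumptions for illustration.

```python
class MinerTrust:
    """Tracks how many updates a Miner may submit between validations."""

    def __init__(self, base_batch=1, max_batch=16):
        self.base_batch = base_batch
        self.max_batch = max_batch
        self.batch_size = base_batch

    def record(self, useful):
        """Update trust after a validation and return the new batch size."""
        if useful:
            # Useful gradient: allow a longer run without validation.
            self.batch_size = min(self.batch_size * 2, self.max_batch)
        else:
            # Harmful gradient: trust is reset to the starting point.
            self.batch_size = self.base_batch
        return self.batch_size


trust = MinerTrust()
history = [trust.record(useful) for useful in (True, True, True, False, True)]
print(history)
```

Since Miners are not identifiable, something like this would have to be keyed on a pseudonymous identifier rather than a real identity.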

@anoff

anoff commented Aug 1, 2017

Q: Do mines automatically enroll for a neural net they have data for on behalf of the owner?

A: Miners should be able to set their mine into an auto mode that automatically mines things according to preferences set by the miner.