Sonar

Q: How does pySonar / Sonar work?

A: pySonar is the user-facing library that exposes the models to build.

Q: Is pySonar only run locally on the data scientist's machine?

A: Yes. Just use it in a notebook or tooling of your choice and submit your model to the OpenMined network.

Q: Is Sonar just a smart contract?

A: Smart contracts plus an abstraction layer over the underlying blockchain tech.

Q: Is the schema for blockchain info fixed?

Q: How is the model serialized/deserialized (pySonar?)

A: Just a pickled model for now.
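
A minimal sketch of what "just pickled" could look like with Python's standard `pickle` module. The dict of parameters and the function names `serialize_model` / `deserialize_model` are hypothetical stand-ins, not the actual pySonar or syft types or API.

```python
import pickle


def serialize_model(model):
    # Turn the model object into bytes that can be stored or sent around.
    return pickle.dumps(model)


def deserialize_model(payload):
    # Only unpickle data from a trusted source: unpickling can run arbitrary code.
    return pickle.loads(payload)


# Hypothetical model: a plain dict of parameters standing in for a real model object.
model = {"weights": [0.5, -1.0], "bias": 2.0}

payload = serialize_model(model)
restored = deserialize_model(payload)
```

Since unpickling can execute code, this format is only reasonable between parties that trust each other, which may be one reason the answer says "for now".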

Q: Do pySonar and syft need to be kept in sync/version controlled because models are implicitly shared between the two?

A: syft is the library used by the data scientist to create the model. pySonar is only used to push the model into the ecosystem.

Overall?

Q: When we talk about gradients what exactly are they? :-)

Q: Who is responsible for calculating the success (gradient?) of the Mine?

A: The data scientist generating the model has a validation set. The new gradients are checked against the validation set.
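
One way to picture this check, as a sketch only: apply a proposed gradient to a copy of the weights and compare the validation error before and after. The linear model, the mean squared error, the learning rate, and all function names are assumptions for illustration; the notes don't say how the check is implemented.

```python
def predict(weights, features):
    # Linear model without bias: dot product of weights and features.
    return sum(w * x for w, x in zip(weights, features))


def validation_error(weights, validation_set):
    # Mean squared error over (features, target) pairs.
    total = 0.0
    for features, target in validation_set:
        total += (predict(weights, features) - target) ** 2
    return total / len(validation_set)


def apply_gradient(weights, gradient, learning_rate=0.1):
    # Returns new weights; the input list is left untouched.
    return [w - learning_rate * g for w, g in zip(weights, gradient)]


def gradient_improves_model(weights, gradient, validation_set, learning_rate=0.1):
    # True if the validation error drops after applying the gradient.
    before = validation_error(weights, validation_set)
    after = validation_error(
        apply_gradient(weights, gradient, learning_rate), validation_set
    )
    return after < before


validation_set = [([1.0], 2.0), ([2.0], 4.0)]  # target = 2 * x
weights = [0.0]
helpful = [-10.0]  # moves the weight towards 2
harmful = [10.0]   # moves the weight away from 2

accepted = gradient_improves_model(weights, helpful, validation_set)
rejected = gradient_improves_model(weights, harmful, validation_set)
```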

Q: Can multiple Mines train at once?

A: Yes. Gradients are computed by multiple Mines at once, and the model is changed afterwards by aggregating all of those gradients. Compare it to batching in traditional deep learning: multiple Miners together generate one batch.
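
The batching comparison can be sketched like this, assuming the aggregation is a plain element-wise average of the per-Miner gradients (the notes don't specify the aggregation rule). Function names and numbers are hypothetical.

```python
def aggregate_gradients(miner_gradients):
    # Element-wise average over the gradients of all Miners in the "batch".
    if not miner_gradients:
        raise ValueError("need at least one gradient")
    count = len(miner_gradients)
    return [sum(values) / count for values in zip(*miner_gradients)]


def apply_update(weights, gradient, learning_rate=0.5):
    # One gradient-descent step with the aggregated gradient.
    return [w - learning_rate * g for w, g in zip(weights, gradient)]


# Three Miners each report a gradient for the same two-parameter model.
miner_gradients = [
    [1.0, -2.0],
    [3.0, 0.0],
    [2.0, 2.0],
]

averaged = aggregate_gradients(miner_gradients)
new_weights = apply_update([1.0, 1.0], averaged)
```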

Q: Cont. above: can the changes generated by a Mine be considered differential changes and be applied in random order?

e.g. Model = Mo, Mine1 = M1, Mine2 = M2, final model parameters = F: would Mo -> M1 -> M2 result in the same F as Mo -> M2 -> M1?

A: No, mining should occur in a somewhat sequential order, where you build your gradients on top of other people's gradients. The batch analogy works here again: you can't train the model with one huge batch.
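
A toy illustration of why the order matters when each Mine computes its gradient on the model left by the previous one. It assumes a single weight, a squared-error loss per Mine, and a fixed learning rate; none of this is taken from the actual implementation.

```python
def gradient(weight, target):
    # Gradient of the loss (weight - target) ** 2 with respect to weight.
    return 2.0 * (weight - target)


def train_sequentially(weight, targets, learning_rate=0.25):
    # Each Mine computes its gradient on the model left by the previous Mine.
    for target in targets:
        weight = weight - learning_rate * gradient(weight, target)
    return weight


mine1_target, mine2_target = 1.0, 3.0

f_12 = train_sequentially(0.0, [mine1_target, mine2_target])  # Mo -> M1 -> M2
f_21 = train_sequentially(0.0, [mine2_target, mine1_target])  # Mo -> M2 -> M1
```

Here `f_12` ends at 1.75 and `f_21` at 1.25: the second Mine's gradient depends on where the first Mine left the model, so the two orders give different final parameters.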

Q: Is the priority to make it easy for Data Scientists to create models or Miners to create their mine?

More of a meta topic; I'd like to get a feeling for the team's view.

A: Personally I'd focus on the second (Miners), possibly even through gamification to get people interested, while of course making sure that the first (Data Scientists) is doable; basic programming skills are required there anyway.

Mine

Q: Does the Mine regularly poll the blockchain?

A: The full training history is on the blockchain.

Q: How does the Mine determine if it is able to run a network?

Q: How is data passed to syft (what's the command to spawn it? data as file would be best)

Q: What libraries should the mine have?

A: sonar-probe, adapters

Q: What does the Mine do with the data it fetches from IPFS?

Q: What does the Mine do with the results from syft?

Q: Can Miners specify a minimum reward they expect for their data? e.g. mining costs need to be taken care of

A: Yes.

Q: Is the first Miner on the network the one gaining the biggest reward?

A: At the beginning the error goes down a lot more than near the end. However, it is necessary for Miners to stick around.

Data Scientist

Q: From the workflow it seems like he needs to have some software running as well..?

Q: This thing needs a name

A: Lab 👯?

Q: What happens to Mine results if ... is offline?

Q: Does a data scientist have an influence on mine availability?

A: Miners might get incentivized to stay online and do reproducible training. Maybe an improved payout for subsequent trainings with the same dataset.

Misc

Q: Should we blawg on medium about progress, collaboration experiences, tech stuff?

Q: Most descriptions of homomorphic encryption I've seen say it's impractical because it is computationally expensive. How is it different now?

Q: What are adapters?

A: Used to convert raw data into a format usable by the model/syft. Needs more discussion on separating data ingestion and feature engineering.
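
To make the idea concrete, here is a sketch of what an adapter might do, assuming the raw data is CSV text with numeric columns. The function name `csv_adapter`, the column names, and the output shape are all hypothetical; the notes don't define an adapter interface.

```python
import csv
import io


def csv_adapter(raw_text, label_column):
    """Hypothetical adapter: raw CSV text -> (features, labels) as float lists."""
    reader = csv.DictReader(io.StringIO(raw_text))
    features, labels = [], []
    for row in reader:
        labels.append(float(row[label_column]))
        # Every other column is treated as a numeric feature, in header order.
        features.append(
            [float(value) for key, value in row.items() if key != label_column]
        )
    return features, labels


raw = "age,income,defaulted\n30,4000,0\n45,2500,1\n"
features, labels = csv_adapter(raw, label_column="defaulted")
```

In this sketch ingestion (parsing the CSV) and feature engineering (choosing and converting columns) happen in one function, which is exactly the coupling the answer suggests discussing.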

What docker images to create?
How to automate image creation?
- CircleCI to Docker Hub
When should images be created?
- develop=:edge, master=:latest
How to tag/release?
- when going master (milestone)
- add git tag -> automatically build tagged docker images
What compose files should we have?
Where do we keep compose files? pySonar
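
The tagging rules above can be written down as a small helper that a CI script might call. This is a sketch under assumptions: the function name is hypothetical, and the behaviour for branches other than develop and master (publish nothing) is not stated in the notes.

```python
def docker_tags(branch=None, git_tag=None):
    """Map a build's branch or git tag to the Docker image tags to publish."""
    if git_tag:
        # A git tag (release/milestone) produces an image with the same tag.
        return [git_tag]
    if branch == "develop":
        return ["edge"]
    if branch == "master":
        return ["latest"]
    # Assumption: other branches do not publish an image.
    return []


develop_tags = docker_tags(branch="develop")
master_tags = docker_tags(branch="master")
release_tags = docker_tags(branch="master", git_tag="v0.1.0")
feature_tags = docker_tags(branch="feature/foo")
```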

Use libraries as static images (no CMD/ENTRYPOINT) to build larger images

Todos:

issue: put sonar into a separate static image to be required for pysonar/mine builds
figure out how to add unique tags to each build
Shouldn't we use Docker Hub so we no longer have to worry about dependencies? (look at this shitty graph)
Can Docker Hub autodeploy handle multiple Dockerfiles in one repo (and how)?

anoff/2017-08-02-questions.md

anoff commented Aug 1, 2017
