Q: How does pySonar / Sonar work?
A: pySonar is a user-facing library exposing the models to build.
Q: Is pySonar only run locally on the data scientist's machine?
A: Yes. Just use it in a notebook or tooling of your choice and submit your model to the Open Mined network.
Q: Is Sonar just a smart contract?
A: It is a set of smart contracts plus an abstraction layer over the underlying blockchain tech.
Q: Is the schema for blockchain info fixed?
A:
Q: How is the model serialized/deserialized (pySonar?)
A: The model is just pickled for now.
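As a rough illustration of what "just pickled" means, here is a minimal sketch using Python's standard `pickle` module. The `model` dict is a hypothetical stand-in for a real model object; it is not how pySonar or syft actually represent models.

```python
import pickle

# Hypothetical stand-in for a trained model: a plain dict of parameters.
model = {"weights": [0.5, -1.0, 2.0], "bias": 0.1}

# Serialize to bytes; this blob is what would be stored or shared.
blob = pickle.dumps(model)

# Deserialize on the receiving side.
# Only unpickle data you trust: loading a pickle can execute arbitrary code.
restored = pickle.loads(blob)

print(restored == model)
```

Pickling custom classes additionally requires the same class definitions to be importable on the receiving side, which is one reason library versions can matter.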
Q: Do pySonar and syft need to be kept in sync/version controlled because models are implicitly shared between the two?
A: syft is the library used by the data scientist to create the model. pySonar is only used to push the model into the ecosystem.
Q: When we talk about gradients what exactly are they? :-)
A:
Q: Who is responsible for calculating the success (gradient?) of the Mine?
A: The data scientist generating the model has a validation set. The new gradients are checked against the validation set.
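One way to picture this check is the sketch below. It assumes a tiny linear model, mean squared error as the metric, and a plain gradient-descent update; the function names (`mse`, `apply_gradient`, `score_gradient`) are hypothetical and not part of Sonar.

```python
def mse(weights, data):
    """Mean squared error of a linear model on (features, target) pairs."""
    total = 0.0
    for x, y in data:
        pred = sum(w * xi for w, xi in zip(weights, x))
        total += (pred - y) ** 2
    return total / len(data)

def apply_gradient(weights, gradient, lr):
    """Plain gradient-descent step."""
    return [w - lr * g for w, g in zip(weights, gradient)]

def score_gradient(weights, gradient, validation_set, lr=0.1):
    """Validation-loss improvement from applying the gradient (positive = better)."""
    before = mse(weights, validation_set)
    after = mse(apply_gradient(weights, gradient, lr), validation_set)
    return before - after

# Validation set held by the data scientist (toy data for y = 2x).
validation_set = [([1.0], 2.0), ([2.0], 4.0)]
weights = [0.0]

helpful = score_gradient(weights, [-10.0], validation_set)  # moves w towards 2
harmful = score_gradient(weights, [10.0], validation_set)   # moves w away from 2
print(helpful > 0, harmful < 0)
```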
Q: Can multiple Mines train at once?
A: Yes. Gradients are computed by multiple Mines in parallel, and the model is changed afterwards by aggregating all of those gradients. Compare it to batching in traditional deep learning: multiple Miners together generate a batch.
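The batch analogy can be sketched as follows. Averaging is assumed here as the aggregation rule (the answer above only says "aggregating"), and `aggregate_gradients` and `apply_update` are hypothetical helper names.

```python
def aggregate_gradients(gradients):
    """Average per-parameter gradients coming from several Mines."""
    n = len(gradients)
    return [sum(parts) / n for parts in zip(*gradients)]

def apply_update(weights, gradient, lr):
    """Single gradient-descent step using the aggregated gradient."""
    return [w - lr * g for w, g in zip(weights, gradient)]

# Each inner list is the gradient one Mine computed on its own data,
# all starting from the same model parameters.
miner_gradients = [
    [1.0, 2.0],
    [3.0, 4.0],
    [2.0, 0.0],
]

batch_gradient = aggregate_gradients(miner_gradients)
new_weights = apply_update([1.0, 1.0], batch_gradient, lr=0.5)
print(batch_gradient, new_weights)
```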
Q: Cont. above: can the changes generated by a Mine be considered differential changes and be applied in random order? E.g. with Model = Mo, Mine1 = M1, Mine2 = M2, final model parameters = F, would Mo -> M1 -> M2 result in the same F as Mo -> M2 -> M1?
A: No, mining should occur in a somewhat sequential order where you build your gradients on top of other people's gradients. The batch analogy works here again: you can't train the model with one huge batch.
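A toy example of why the order matters: if each Mine computes its gradient at the current model parameters, then the second Mine's gradient depends on what the first one did. The sketch assumes a single parameter, a squared-error loss per Mine, and a hypothetical `miner_step` helper.

```python
def miner_step(w, target, lr=0.25):
    """One Mine's update: gradient of (w - target)^2 evaluated at the current w."""
    gradient = 2.0 * (w - target)
    return w - lr * gradient

w0 = 0.0            # initial model Mo
m1_target = 1.0     # what Mine1's data pulls the parameter towards
m2_target = 3.0     # what Mine2's data pulls the parameter towards

f_12 = miner_step(miner_step(w0, m1_target), m2_target)  # Mo -> M1 -> M2
f_21 = miner_step(miner_step(w0, m2_target), m1_target)  # Mo -> M2 -> M1

print(f_12, f_21)   # the two orders end at different parameters
```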
Q: Is the priority to make it easy for (1) Data Scientists to create models or (2) Miners to create their mine? This is more of a meta topic; I'd like to get a feeling for the team's view.
A: Personally I'd focus on (2), the Miners, possibly even through gamification to get people interested, while of course making sure that (1) is doable. Basic programming skills are required for (1) anyway.
Q: Does the Mine regularly poll the blockchain?
A: Full history of training is on blockchain.
Q: How does the Mine determine if it is able to run a network?
A:
Q: How is data passed to syft (what's the command to spawn it? Data as a file would be best)
Q: What libraries should the mine have?
A: sonar-probe, adapters
Q: What does the Mine do with the data it fetches from IPFS?
Q: What does the Mine do with the results from syft?
Q: Can Miners specify a minimum reward they expect for their data? e.g. mining costs need to be taken care of
A: Yes,
Q: Is the first Miner on the network the one gaining the biggest reward?
A: At the beginning the error goes down a lot more than near the end. However, it is necessary for Miners to stick around.
Q: From the workflow it seems like he needs to have some software running as well..?
Q: This thing needs a name
A: Lab 👯?
Q: What happens to Mine results if ... is offline?
Q: Does a data scientist have an influence on mine availability?
A: Miners might get incentivized to stay online and do reproducible training, maybe through an improved payout for subsequent trainings with the same dataset.
Q: Should we blawg on medium about progress, collaboration experiences, tech stuff?
Q: Most of the descriptions of homomorphic encryption I've seen say that it's impractical because it is computationally expensive. How is it different now?
Q: What are adapters?
A: They are used to convert raw data into a format usable by the model/syft. We need more discussion on separating data ingestion and feature engineering.
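To make the idea concrete, an adapter could look roughly like the sketch below. Everything here is an assumption for illustration: the `csv_adapter` name, the CSV input, and the output shape (lists of floats) are hypothetical and do not describe an existing Open Mined interface.

```python
import csv
import io

def csv_adapter(raw_text, feature_columns, label_column):
    """Convert raw CSV text into numeric feature rows and labels."""
    reader = csv.DictReader(io.StringIO(raw_text))
    features, labels = [], []
    for row in reader:
        # Data ingestion: pick the relevant columns and cast them to floats.
        features.append([float(row[col]) for col in feature_columns])
        labels.append(float(row[label_column]))
    return features, labels

raw = "age,height,score\n30,1.8,0.5\n25,1.6,0.9\n"
features, labels = csv_adapter(raw, ["age", "height"], "score")
print(features, labels)
```

Feature engineering (scaling, encoding and so on) would be a separate step applied to `features`, which matches the open question about keeping it apart from ingestion.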
A: The smart contract can contact capsule to decrypt specific values. No other part in the ecosystem can decrypt stuff.
Q: Should adapters be a part of a "marketplace" where you can convert data from any format into the Open Mined schemas?