Use the GitHub API to batch download .sch files.
- GitHub's code search API requires you to limit searches to users, orgs, or repos, so we first have to get a list of repos that use the KiCad or EAGLE "languages".
https://api.github.com/search/repositories?q=language:kicad&per_page=100&page=1
https://api.github.com/search/repositories?q=language:eagle&per_page=100&page=1
The search API returns at most 1,000 results per query, which is 10 pages at 100 results per page (or 34 pages at the default of 30 per page).
- Once we have a list of repos, we can extract the users from these results and target our next set of queries at those users. For instance, the mossmann/hackrf repo showed up in our first query. From this we can assume that mossmann might have some more KiCad/EAGLE repos, so let's include him in our .sch file search.
https://api.github.com/search/code?q=endComp+in:file+language:kicad+user:mossmann
Here we are looking for files that include the text endComp in repositories written in KiCad by the mossmann user. We use endComp as our search query because we are only interested in .sch files that actually include components, and $endComp is a required tag for such files.
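The two-step search above can be sketched in shell. This is only a minimal sketch under stated assumptions, not the tooling actually used here: the function names (repo_search_urls, unique_owners, code_search_url) are hypothetical, the number of pages is passed in as a parameter rather than assumed, and fetching the URLs and pulling each repo's full_name out of the JSON response is left out.

```shell
# Hypothetical helpers: they only build URLs and reshape text (no network access).

# Print one repository-search URL per page for a language qualifier.
# $1 = language (kicad or eagle), $2 = number of pages to request.
repo_search_urls() (
  language="$1"
  pages="$2"
  page=1
  while [ "$page" -le "$pages" ]; do
    printf 'https://api.github.com/search/repositories?q=language:%s&per_page=100&page=%s\n' \
      "$language" "$page"
    page=$((page + 1))
  done
)

# Read repo names of the form "owner/repo" on stdin and print each owner once.
unique_owners() {
  cut -d/ -f1 | sort -u
}

# Print the code-search URL for one user, mirroring the query shown above.
# $1 = user, $2 = language.
code_search_url() {
  printf 'https://api.github.com/search/code?q=endComp+in:file+language:%s+user:%s\n' \
    "$2" "$1"
}
```

For example, piping the single line mossmann/hackrf into unique_owners prints mossmann, and code_search_url mossmann kicad prints the code-search URL shown above.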
List files smaller than n bytes (here n = 4096):
find data/sch/*.sch -type f -size -4096c
That output can also be piped to xargs so that the contents of those files are concatenated and saved to disk. The output path matches the FILE_BASENAME used for pre-processing:
mkdir -p data/concated
find data/sch/*.sch -type f -size -4096c -print0 | xargs -0 cat > data/concated/4KB_concated.txt
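The same filter-and-concatenate step can be wrapped in a reusable function. This is a sketch, assuming the source directory, size threshold, and output path are passed as arguments; the name concat_small_sch is hypothetical. It uses find's -exec ... + form rather than xargs so that file names containing spaces are handled.

```shell
# Concatenate every .sch file smaller than a byte threshold into one file.
# $1 = directory to search, $2 = size limit in bytes, $3 = output file.
# Keep the output file outside the searched directory.
concat_small_sch() (
  src_dir="$1"
  max_bytes="$2"
  out_file="$3"
  mkdir -p "$(dirname "$out_file")"   # create the output directory if needed
  # "-size -Nc" matches files strictly smaller than N bytes.
  find "$src_dir" -name '*.sch' -type f -size "-${max_bytes}c" \
    -exec cat {} + > "$out_file"
)

# Example (not run here):
#   concat_small_sch data/sch 4096 data/concated/4KB_concated.txt
```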
If no Docker container is running, launch one (with a mounted volume) like:
# mounting data/ from current directory
docker run -ti -v "$(pwd)/data:/root/torch-rnn/data" crisbal/torch-rnn:base bash
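The "if no container is running" check can itself be scripted. The sketch below assumes that any running container created from the crisbal/torch-rnn:base image is the one to reuse; the function names are hypothetical, and only the standard docker ps, docker exec, and docker run subcommands are used.

```shell
IMAGE='crisbal/torch-rnn:base'

# Print the ID of the first running container created from $IMAGE, if any.
running_container_id() {
  docker ps -q --filter "ancestor=$IMAGE" | head -n 1
}

# Open a shell in the running container, or launch a new one with data/ mounted.
attach_or_launch() (
  id="$(running_container_id)"
  if [ -n "$id" ]; then
    docker exec -ti "$id" bash
  else
    # mounting data/ from current directory
    docker run -ti -v "$(pwd)/data:/root/torch-rnn/data" "$IMAGE" bash
  fi
)
```

Calling attach_or_launch from the project root then either attaches to the existing container or starts a fresh one.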
Then pre-process data with:
FILE_BASENAME='data/concated/4KB_concated'
python scripts/preprocess.py \
--input_txt "$FILE_BASENAME.txt" \
--output_h5 "$FILE_BASENAME.h5" \
--output_json "$FILE_BASENAME.json"
And train with default hyperparameters like:
th train.lua \
-input_h5 "$FILE_BASENAME.h5" \
-input_json "$FILE_BASENAME.json" \
-checkpoint_every 250 \
-gpu -1 # disable gpu support
Samples can be generated from checkpoints with:
CHECKPOINT=1000
SAMPLE_SIZE=2000
th sample.lua -checkpoint "cv/checkpoint_$CHECKPOINT.t7" \
-length $SAMPLE_SIZE \
-start_text "EESchema Schematic File Version " \
-gpu -1
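Rather than hard-coding CHECKPOINT, the most recent checkpoint can be looked up from the files on disk. This is a sketch that assumes checkpoints follow the cv/checkpoint_N.t7 naming used in the sample command above; the function name latest_checkpoint is hypothetical.

```shell
# Print the highest iteration number among <dir>/checkpoint_<N>.t7 files.
# $1 = checkpoint directory (defaults to cv, as in the sample command).
# Prints nothing if no checkpoint files are found.
latest_checkpoint() (
  dir="${1:-cv}"
  for f in "$dir"/checkpoint_*.t7; do
    [ -e "$f" ] || continue        # the glob matched nothing
    n="${f##*checkpoint_}"         # drop everything up to the number
    printf '%s\n' "${n%.t7}"       # drop the extension
  done | sort -n | tail -n 1
)

# Example (not run here):
#   CHECKPOINT="$(latest_checkpoint cv)"
#   th sample.lua -checkpoint "cv/checkpoint_$CHECKPOINT.t7" -length 2000 -gpu -1
```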