@breandan
breandan / doc_synth_graphcodebert.txt
Last active October 15, 2021 14:40
GraphCodeBERT document synthesis experiments.
> python embedding_server.py --model=microsoft/graphcodebert-base --offline
Starting embeddings server...
Started embeddings server at http://localhost:8000/?query=
Ground truth doc: This method is called during OGNL's bytecode enhancement optimizations in order to determine better-
Synth origin doc: Helpers to provide a string suitable in calling all Java methods of Object for each object
Synth refact doc: This provides two forms to access index value to object properties to avoid using any abstract access
Rouge score before refactoring: 0.33604336043360433
Rouge score after refactoring: 0.12059620596205962
Relative difference: 1.7865168539325842
Put 1.7865168539325842 in (14, renameTokens)
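The logged relative difference is numerically consistent with (before - after) / after. A minimal sketch of that arithmetic follows. It assumes this is how the figure was derived, since the gist does not show the original implementation, and the function name is hypothetical.

```python
def relative_difference(before: float, after: float) -> float:
    """Change from `after` to `before`, expressed as a fraction of `after`.

    Assumption: this formula is inferred from the logged numbers only;
    the code that produced the log is not shown.
    """
    if after == 0:
        raise ValueError("relative difference is undefined when `after` is 0")
    return (before - after) / after


# ROUGE scores copied from the log above.
rouge_before = 0.33604336043360433
rouge_after = 0.12059620596205962
print(relative_difference(rouge_before, rouge_after))
```

A larger value means the refactoring cost the synthetic doc more of its overlap with the ground truth.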
\begin{table}[H]
\begin{tabular}{l|ccc}
Complexity & renameTokens & permuteArgument & swapMultilineNo \\ \hline
10-20 & 0.018 ± 0.004 (594) & 0.071 ± 0.024 (541) & 0.0 ± 0.0 (594) \\
20-30 & 0.030 ± 0.003 (440) & 0.081 ± 0.018 (439) & 0.002 ± 2.675 (440) \\
30-40 & 0.032 ± 0.004 (242) & 0.096 ± 0.014 (241) & 0.005 ± 6.368 (242) \\
40-50 & 0.036 ± 0.003 (144) & 0.088 ± 0.015 (144) & 0.014 ± 0.001 (144) \\
50-60 & 0.026 ± 0.004 (286) & 0.106 ± 0.011 (286) & 0.017 ± 0.002 (286) \\
60-70 & 0.036 ± 0.005 (149) & 0.091 ± 0.009 (149) & 0.032 ± 0.005 (149) \\
70-80 & 0.051 ± 0.005 (49) & 0.078 ± 0.013 (49) & 0.063 ± 0.005 (49) \\
\begin{table}[H]
\begin{tabular}{l|ccc}
Complexity & renameTokens & permuteArgument & swapMultilineNo \\ \hline
10-20 & 0.014 ± 0.003 (1691)& 0.050 ± 0.017 (1584)& 1.774 ± 1.770 (1691)\\
20-30 & 0.021 ± 0.004 (1656)& 0.051 ± 0.015 (1640)& 0.001 ± 2.482 (1656)\\
30-40 & 0.030 ± 0.004 (789) & 0.080 ± 0.019 (787) & 0.004 ± 6.568 (789) \\
40-50 & 0.032 ± 0.003 (519) & 0.064 ± 0.014 (517) & 0.013 ± 0.001 (519) \\
50-60 & 0.037 ± 0.005 (623) & 0.081 ± 0.013 (624) & 0.017 ± 0.002 (623) \\
60-70 & 0.044 ± 0.005 (370) & 0.068 ± 0.007 (370) & 0.028 ± 0.003 (370) \\
70-80 & 0.046 ± 0.005 (168) & 0.088 ± 0.013 (168) & 0.049 ± 0.006 (168) \\
breandan / graphcodebert_base.tex
Last active October 11, 2021 23:04
Average decrease in completion accuracy across 10 masked tokens before and after code transformation. Columns = code transformations, rows = cyclomatic complexity.
\begin{table}[H]
\begin{tabular}{l|ccc}
Complexity & renameTokens & permuteArgument & swapMultilineNo \\ \hline
10-20 & 0.012 ± 0.003 (638) & 0.044 ± 0.015 (578) & 0.0 ± 0.0 (638) \\
20-30 & 0.027 ± 0.003 (476) & 0.054 ± 0.011 (475) & 0.005 ± 5.921 (476) \\
30-40 & 0.023 ± 0.003 (284) & 0.072 ± 0.014 (283) & 0.005 ± 8.444 (284) \\
40-50 & 0.032 ± 0.003 (161) & 0.068 ± 0.011 (161) & 0.009 ± 0.001 (161) \\
50-60 & 0.030 ± 0.003 (305) & 0.094 ± 0.011 (305) & 0.020 ± 0.002 (305) \\
60-70 & 0.030 ± 0.003 (165) & 0.085 ± 0.013 (165) & 0.036 ± 0.004 (165) \\
70-80 & 0.041 ± 0.004 (57) & 0.074 ± 0.007 (57) & 0.070 ± 0.008 (57) \\
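The layout described above (rows are cyclomatic complexity ranges, columns are transformations, cells are an average decrease with a sample count) could be produced by a grouping step along these lines. This is only a sketch: the tuple layout, the half-open bucket boundaries, and the sample values are all assumptions made for illustration, and the quantity after the ± sign in the table is not reproduced because the gist does not say what it is.

```python
from collections import defaultdict
from statistics import mean


def bucket_label(complexity: int, width: int = 10) -> str:
    """Map a complexity value to a bucket label such as '10-20'.

    Assumption: buckets are half-open, so 20 falls in '20-30'.
    """
    low = (complexity // width) * width
    return f"{low}-{low + width}"


def average_decrease(samples):
    """Group (complexity, transformation, acc_before, acc_after) tuples.

    Returns {(bucket, transformation): (mean_decrease, sample_count)}.
    """
    groups = defaultdict(list)
    for complexity, transformation, acc_before, acc_after in samples:
        key = (bucket_label(complexity), transformation)
        groups[key].append(acc_before - acc_after)
    return {key: (mean(values), len(values)) for key, values in groups.items()}


# Hypothetical measurements, invented purely for illustration.
samples = [
    (12, "renameTokens", 0.5, 0.25),
    (17, "renameTokens", 0.5, 0.5),
    (23, "permuteArgumentOrder", 0.75, 0.5),
]
for (bucket, transformation), (avg, count) in sorted(average_decrease(samples).items()):
    print(f"{bucket} {transformation}: {avg:.3f} ({count})")
```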
breandan / palprimes.txt
Created September 21, 2021 17:21
Numbers with exactly n factors, all of which are palindromes.
n=5
257 65537 107
n=4
364621 257
257 397379
400067 440171
475159 257
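The listed values are palindromic in binary rather than decimal (257 is 100000001 in base 2, and 107 is 1101011), so the sketch below checks palindromes in a configurable base with 2 as the default. It also assumes that "factors" means prime factors counted with multiplicity, which the description does not state. All function names are hypothetical.

```python
from __future__ import annotations


def digits(n: int, base: int) -> list[int]:
    """Digits of a non-negative integer in `base`, least significant first."""
    if n == 0:
        return [0]
    out = []
    while n > 0:
        n, remainder = divmod(n, base)
        out.append(remainder)
    return out


def is_palindrome(n: int, base: int = 2) -> bool:
    """True if the digits of `n` in `base` read the same in both directions."""
    d = digits(n, base)
    return d == d[::-1]


def prime_factors(n: int) -> list[int]:
    """Prime factors with multiplicity, by trial division (fine for small n)."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


def has_n_palindromic_factors(n: int, count: int, base: int = 2) -> bool:
    """True if `n` has exactly `count` prime factors, each a palindrome in `base`."""
    factors = prime_factors(n)
    return len(factors) == count and all(is_palindrome(f, base) for f in factors)


print(is_palindrome(257), is_palindrome(257, base=10))
```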
breandan / dokka_error.log
Created September 18, 2021 18:18
Build log after updating from Dokka `1.4.32` to `1.5.30`.
Run git submodule update --init --recursive
Submodule 'kaliningraph' (https://github.com/breandan/kaliningraph) registered for path 'kaliningraph'
Cloning into '/home/runner/work/kotlingrad/kotlingrad/kaliningraph'...
Submodule path 'kaliningraph': checked out 'c4776d73bbb792cb6e19f6f9ead4913834743d7f'
Downloading https://services.gradle.org/distributions/gradle-7.2-bin.zip
..........10%...........20%...........30%...........40%...........50%...........60%...........70%...........80%...........90%...........100%
Welcome to Gradle 7.2!
breandan / codeTxLogs.txt
Created September 13, 2021 01:19
GraphCodeBERT code transformations.
Running accuracy of microsoft/graphcodebert-base with [renameTokens] transformation (10.0 samples): 0.2
Running accuracy of microsoft/graphcodebert-base with [permuteArgumentOrder] transformation (10.0 samples): 0.4
Running accuracy of microsoft/graphcodebert-base with [swapMultilineNoDeps] transformation (10.0 samples): 0.3
Running accuracy of microsoft/graphcodebert-base with [same] transformation (10.0 samples): 0.4
Running accuracy of microsoft/graphcodebert-base with [renameTokens] transformation (20.0 samples): 0.2
breandan / eval_doc_synthesis_task.txt
Last active September 1, 2021 04:24
Sample Javadoc comments from evaluating GraphCodeBERT on documentation synthesis. The score is the overlap between the original and synthetic Javadoc after expanding each word's English synonyms to depth 3.
| original doc | synthetic doc |
|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
| /**~ See {@link #add(Object)} for general comments. */~ | /** |
| | ** * Add all objects to the database. ** |
| | ** */** |
ROUGE-Synonym-Depth3 score: 0.26424870466321243
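One way to read a depth-3 synonym overlap is: expand every word by following synonym links up to three hops, then count how many words of the original doc share at least one expanded term with the synthetic doc. The sketch below follows that reading. The scoring formula, the whitespace tokenization, and the toy thesaurus are all assumptions; the gist does not show how ROUGE-Synonym-Depth3 is actually computed.

```python
def expand(word, synonyms, depth):
    """All words reachable from `word` within `depth` synonym hops, including `word`."""
    seen = {word}
    frontier = {word}
    for _ in range(depth):
        frontier = {s for w in frontier for s in synonyms.get(w, ())} - seen
        if not frontier:
            break
        seen |= frontier
    return seen


def synonym_overlap(original, synthetic, synonyms, depth=3):
    """Fraction of original words that share an expanded synonym with the synthetic text."""
    original_words = original.lower().split()
    if not original_words:
        return 0.0
    synthetic_expanded = set()
    for word in synthetic.lower().split():
        synthetic_expanded |= expand(word, synonyms, depth)
    hits = sum(
        1 for word in original_words
        if expand(word, synonyms, depth) & synthetic_expanded
    )
    return hits / len(original_words)


# Hypothetical toy thesaurus; a real run would need a lexical database.
SYNONYMS = {
    "add": ["insert"],
    "insert": ["add", "put"],
    "put": ["insert", "store"],
    "store": ["put"],
    "objects": ["items"],
    "items": ["objects"],
}
print(synonym_overlap("add items", "store objects", SYNONYMS, depth=3))
print(synonym_overlap("add items", "store objects", SYNONYMS, depth=1))
```

Deeper expansion makes the score more forgiving: in the toy thesaurus, "add" and "store" only meet after several hops.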
breandan / document_generation.txt
Last active August 28, 2021 23:19
Synthetically generated documentation from GraphCodeBERT-base. The second line of each comment is sampled using autoregressive masking with greedy decoding.
| prompt given to model | predicted comment |
|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
| /** | /** |
| * | * **Private methods ** |
| */ | */ |
| | |
| private Net()
breandan / samples.txt
Last active August 24, 2021 18:00
Examples of code snippets, synthetic variants, and transformer predictions for masked tokens.
Each diff is a triplet comparing:
1. the original code snippet
2. the synthetically generated variant
3. the model's prediction vs. ground truth
| original | synthetic variant |
|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
| fun VecIndex.knn(v: DoubleArray, i: Int, ~exact:~ Boolean = false) = | fun VecIndex.knn(v: DoubleArray, i: Int, **involve:** Boolean = false) = |
| if(~exact~) ~exactKNNSearch~(v, i + 10) | if(**involve**) **involveKNNSearch**(v, i + 10) |
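In the sample rows above, renaming `exact` to `involve` also rewrites `exactKNNSearch` to `involveKNNSearch`, which suggests the rename applies to the leading component of camelCase identifiers as well as to whole tokens. A minimal regex-based sketch of that behaviour follows. It is an assumption drawn from this one example, not the transformation's actual implementation, and the replacement name is simply passed in by the caller.

```python
import re


def rename_token(code: str, old: str, new: str) -> str:
    """Rename `old` where it is a whole identifier or a leading camelCase component.

    The lookahead rejects matches followed by a lowercase letter, digit, or
    underscore, so `exact` matches in `exactKNNSearch` but not in `exactly`.
    """
    pattern = re.compile(rf"\b{re.escape(old)}(?![a-z0-9_])")
    return pattern.sub(new, code)


original = (
    "fun VecIndex.knn(v: DoubleArray, i: Int, exact: Boolean = false) =\n"
    "  if(exact) exactKNNSearch(v, i + 10)"
)
print(rename_token(original, "exact", "involve"))
```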