breandan’s gists

breandan / doc_synth_graphcodebert.txt

Last active October 15, 2021 14:40

GraphCodeBERT document synthesis experiments.

	> python embedding_server.py --model=microsoft/graphcodebert-base --offline
	Starting embeddings server...
	Started embeddings server at http://localhost:8000/?query=
	Ground truth doc: This method is called during OGNL's bytecode enhancement optimizations in order to determine better-
	Synth origin doc: Helpers to provide a string suitable in calling all Java methods of Object for each object
	Synth refact doc: This provides two forms to access index value to object properties to avoid using any abstract access
	Rouge score before refactoring: 0.33604336043360433
	Rouge score after refactoring: 0.12059620596205962
	Relative difference: 1.7865168539325842
	Put 1.7865168539325842 in (14, renameTokens)
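
The logged relative difference is consistent with (before - after) / after applied to the two ROUGE scores. A minimal Python sketch of that arithmetic follows; the function name `relative_difference` is hypothetical, and the formula is inferred from the logged numbers rather than taken from the experiment's source.

```python
def relative_difference(before: float, after: float) -> float:
    """Relative change of `before` with respect to `after` (assumed formula)."""
    if after == 0:
        raise ValueError("relative difference is undefined when `after` is 0")
    return (before - after) / after


# Scores copied from the log above.
before = 0.33604336043360433
after = 0.12059620596205962
diff = relative_difference(before, after)
print(diff)
```

Under this reading, a larger value means the synthetic doc drifted further from the ground truth after refactoring.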

breandan / roberta.tex

Last active October 12, 2021 04:18

	\begin{table}[H]
	\begin{tabular}{l|ccc}
	Complexity & renameTokens & permuteArgument & swapMultilineNo \\ \hline
	10-20 & 0.018 ± 0.004 (594) & 0.071 ± 0.024 (541) & 0.0 ± 0.0 (594) \\
	20-30 & 0.030 ± 0.003 (440) & 0.081 ± 0.018 (439) & 0.002 ± 2.675 (440) \\
	30-40 & 0.032 ± 0.004 (242) & 0.096 ± 0.014 (241) & 0.005 ± 6.368 (242) \\
	40-50 & 0.036 ± 0.003 (144) & 0.088 ± 0.015 (144) & 0.014 ± 0.001 (144) \\
	50-60 & 0.026 ± 0.004 (286) & 0.106 ± 0.011 (286) & 0.017 ± 0.002 (286) \\
	60-70 & 0.036 ± 0.005 (149) & 0.091 ± 0.009 (149) & 0.032 ± 0.005 (149) \\
	70-80 & 0.051 ± 0.005 (49) & 0.078 ± 0.013 (49) & 0.063 ± 0.005 (49) \\

breandan / codebertbase-mlm.tex

Last active October 12, 2021 04:00

	\begin{table}[H]
	\begin{tabular}{l|ccc}
	Complexity & renameTokens & permuteArgument & swapMultilineNo \\ \hline
	10-20 & 0.014 ± 0.003 (1691)& 0.050 ± 0.017 (1584)& 1.774 ± 1.770 (1691)\\
	20-30 & 0.021 ± 0.004 (1656)& 0.051 ± 0.015 (1640)& 0.001 ± 2.482 (1656)\\
	30-40 & 0.030 ± 0.004 (789) & 0.080 ± 0.019 (787) & 0.004 ± 6.568 (789) \\
	40-50 & 0.032 ± 0.003 (519) & 0.064 ± 0.014 (517) & 0.013 ± 0.001 (519) \\
	50-60 & 0.037 ± 0.005 (623) & 0.081 ± 0.013 (624) & 0.017 ± 0.002 (623) \\
	60-70 & 0.044 ± 0.005 (370) & 0.068 ± 0.007 (370) & 0.028 ± 0.003 (370) \\
	70-80 & 0.046 ± 0.005 (168) & 0.088 ± 0.013 (168) & 0.049 ± 0.006 (168) \\

breandan / graphcodebert_base.tex

Last active October 11, 2021 23:04

Average decrease in completion accuracy over 10 masked tokens, measured before and after a code transformation. Columns are code transformations; rows are cyclomatic complexity ranges.

	\begin{table}[H]
	\begin{tabular}{l|ccc}
	Complexity & renameTokens & permuteArgument & swapMultilineNo \\ \hline
	10-20 & 0.012 ± 0.003 (638) & 0.044 ± 0.015 (578) & 0.0 ± 0.0 (638) \\
	20-30 & 0.027 ± 0.003 (476) & 0.054 ± 0.011 (475) & 0.005 ± 5.921 (476) \\
	30-40 & 0.023 ± 0.003 (284) & 0.072 ± 0.014 (283) & 0.005 ± 8.444 (284) \\
	40-50 & 0.032 ± 0.003 (161) & 0.068 ± 0.011 (161) & 0.009 ± 0.001 (161) \\
	50-60 & 0.030 ± 0.003 (305) & 0.094 ± 0.011 (305) & 0.020 ± 0.002 (305) \\
	60-70 & 0.030 ± 0.003 (165) & 0.085 ± 0.013 (165) & 0.036 ± 0.004 (165) \\
	70-80 & 0.041 ± 0.004 (57) & 0.074 ± 0.007 (57) & 0.070 ± 0.008 (57) \\
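
The averaging described for this table can be sketched in Python as grouping per-snippet accuracy drops by complexity range. Everything here is an assumption made for illustration: the record layout `(complexity, acc_before, acc_after)`, the helper names, the bucket boundaries (a value of 20 falls in `20-30`), and the toy records, which are not data from the tables.

```python
from collections import defaultdict
from statistics import mean


def bucket_label(complexity: int, width: int = 10) -> str:
    """Map a complexity value to a range label such as '10-20'."""
    low = (complexity // width) * width
    return f"{low}-{low + width}"


def mean_decrease_by_bucket(samples):
    """Average (accuracy before - accuracy after) per complexity bucket.

    `samples` is an iterable of (complexity, acc_before, acc_after) tuples.
    Returns {label: (mean decrease, sample count)}.
    """
    buckets = defaultdict(list)
    for complexity, acc_before, acc_after in samples:
        buckets[bucket_label(complexity)].append(acc_before - acc_after)
    return {label: (mean(vals), len(vals)) for label, vals in buckets.items()}


# Made-up illustrative records, not measurements.
toy = [(12, 0.5, 0.4), (18, 0.6, 0.6), (25, 0.7, 0.5)]
summary = mean_decrease_by_bucket(toy)
for label, (avg, count) in sorted(summary.items()):
    print(f"{label}: {avg:.3f} ({count})")
```

The sketch reports only the mean and the count; it does not attempt to reproduce the ± term, whose definition is not stated here.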

breandan / palprimes.txt

Created September 21, 2021 17:21

Numbers with exactly n factors, all of which are palindromes.

	n=5

	257 65537 107

	n=4

	364621 257
	257 397379
	400067 440171
	475159 257
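
The listed values are not palindromes in decimal, but 257, 65537 and 107 read the same in both directions when written in binary, so base 2 is assumed in the sketch below. The function name `is_binary_palindrome` is hypothetical, and the sketch only checks the palindrome property, not the factor count.

```python
def is_binary_palindrome(n: int) -> bool:
    """True if the base-2 digits of n read the same in both directions."""
    bits = bin(n)[2:]  # strip the '0b' prefix
    return bits == bits[::-1]


# 10 is 1010 in binary and serves as a negative example.
checks = {n: is_binary_palindrome(n) for n in (257, 65537, 107, 10)}
print(checks)
```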

breandan / dokka_error.log

Created September 18, 2021 18:18

Build log after updating from Dokka `1.4.32` to `1.5.30`.


	Run git submodule update --init --recursive
	Submodule 'kaliningraph' (https://github.com/breandan/kaliningraph) registered for path 'kaliningraph'
	Cloning into '/home/runner/work/kotlingrad/kotlingrad/kaliningraph'...
	Submodule path 'kaliningraph': checked out 'c4776d73bbb792cb6e19f6f9ead4913834743d7f'
	Downloading https://services.gradle.org/distributions/gradle-7.2-bin.zip
	..........10%...........20%...........30%...........40%...........50%...........60%...........70%...........80%...........90%...........100%

	Welcome to Gradle 7.2!

breandan / codeTxLogs.txt

Created September 13, 2021 01:19

GraphCodeBERT code transformations.

	Running accuracy of microsoft/graphcodebert-base with [renameTokens] transformation (10.0 samples): 0.2

	Running accuracy of microsoft/graphcodebert-base with [permuteArgumentOrder] transformation (10.0 samples): 0.4

	Running accuracy of microsoft/graphcodebert-base with [swapMultilineNoDeps] transformation (10.0 samples): 0.3

	Running accuracy of microsoft/graphcodebert-base with [same] transformation (10.0 samples): 0.4

	Running accuracy of microsoft/graphcodebert-base with [renameTokens] transformation (20.0 samples): 0.2
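
A running accuracy like the one logged above can be kept as a count of correct predictions over the number of samples seen so far. This is a minimal sketch under that assumption; the class name `RunningAccuracy` and the outcome list are hypothetical and not taken from the experiment.

```python
class RunningAccuracy:
    """Tracks the fraction of correct predictions seen so far."""

    def __init__(self) -> None:
        self.correct = 0
        self.total = 0

    def update(self, is_correct: bool) -> float:
        """Record one prediction and return the accuracy so far."""
        self.correct += int(is_correct)
        self.total += 1
        return self.correct / self.total


# Made-up outcomes: 2 correct predictions out of 10 samples.
outcomes = [True, False, False, False, False, True, False, False, False, False]
tracker = RunningAccuracy()
for outcome in outcomes:
    accuracy = tracker.update(outcome)
print(f"Running accuracy ({tracker.total} samples): {accuracy}")
```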

breandan / eval_doc_synthesis_task.txt

Last active September 1, 2021 04:24

Sample Javadocs from evaluating GraphCodeBERT on documentation synthesis. The score is the depth-3-expanded English synonym overlap between the original and synthetic Javadoc.

	| original doc | synthetic doc |
	|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
	| /*~ See {@link #add(Object)} for general comments. /~ | /** |
	| | ** * Add all objects to the database. ** |
	| | ** /* |


	ROUGE-Synonym-Depth3 score: 0.26424870466321243
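
One way to read "depth-3-expanded synonym overlap" is a unigram overlap where a reference word counts as matched if it is reachable from some candidate word within three synonym hops. The Python sketch below follows that reading. The recall-style scoring, the function names, and the toy synonym graph are all assumptions, and the real metric may be defined differently.

```python
def expand(word, synonyms, depth=3):
    """All words reachable from `word` in at most `depth` synonym hops."""
    seen = {word}
    frontier = {word}
    for _ in range(depth):
        frontier = {s for w in frontier for s in synonyms.get(w, ())} - seen
        if not frontier:
            break
        seen |= frontier
    return seen


def synonym_overlap(reference, candidate, synonyms, depth=3):
    """Fraction of reference tokens matched by an expanded candidate token."""
    ref_tokens = reference.lower().split()
    if not ref_tokens:
        return 0.0
    reachable = set()
    for token in candidate.lower().split():
        reachable |= expand(token, synonyms, depth)
    hits = sum(1 for token in ref_tokens if token in reachable)
    return hits / len(ref_tokens)


# Toy synonym graph, hypothetical and far smaller than a real thesaurus.
toy_synonyms = {"add": ["insert"], "insert": ["put"], "put": ["store"], "store": ["save"]}
score = synonym_overlap("store all objects", "add all objects", toy_synonyms)
shallow = synonym_overlap("store all objects", "add all objects", toy_synonyms, depth=1)
print(score, shallow)
```

With the toy graph, "store" is three hops from "add", so it matches at depth 3 but not at depth 1.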

breandan / document_generation.txt

Last active August 28, 2021 23:19

Synthetically generated documentation from GraphCodeBERT-base. Second lines are sampled using autoregressive masking with greedy decoding.

	| prompt given to model | predicted comment |
	|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
	| / | / |
	| * | * Private methods |
	| / | / |
	| | |
	| private Net()
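
Autoregressive masking with greedy decoding can be sketched as a loop that appends one mask token, asks the model to score that slot, keeps the top-scoring token, and repeats. The Python sketch below is model-agnostic. `greedy_fill`, `toy_scorer`, the `<mask>` and `<eol>` tokens, and the score table are hypothetical stand-ins, not GraphCodeBERT's API or outputs.

```python
from typing import Callable, Dict, List

MASK = "<mask>"  # placeholder mask token; a real tokenizer's token may differ


def greedy_fill(prompt: List[str],
                score_next: Callable[[List[str]], Dict[str, float]],
                max_tokens: int = 8,
                stop: str = "<eol>") -> List[str]:
    """Append one mask at a time and replace it with the top-scoring token."""
    tokens = list(prompt)
    for _ in range(max_tokens):
        scores = score_next(tokens + [MASK])  # scores for the masked slot
        best = max(scores, key=scores.get)    # greedy: take the argmax
        if best == stop:
            break
        tokens.append(best)
    return tokens


def toy_scorer(tokens: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for a masked language model."""
    table = {"*": {"Private": 0.9, "<eol>": 0.1},
             "Private": {"methods": 0.8, "<eol>": 0.2},
             "methods": {"<eol>": 1.0}}
    # tokens[-1] is the mask, so condition on the token before it.
    return table.get(tokens[-2], {"<eol>": 1.0})


completed = greedy_fill(["/**", "*"], toy_scorer)
print(" ".join(completed))
```

Greedy decoding is deterministic for a fixed scorer; sampling from the scores instead would give varied comments at the cost of reproducibility.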

breandan / samples.txt

Last active August 24, 2021 18:00

Examples of code snippets, synthetic variants, and transformer predictions for masked tokens.

	Each diff is a triplet comparing:

	1. the original code snippet
	2. the synthetically generated variant
	3. the model's prediction vs. ground truth

	| original | synthetic variant |
	|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------|
	| fun VecIndex.knn(v: DoubleArray, i: Int, ~exact:~ Boolean = false) = | fun VecIndex.knn(v: DoubleArray, i: Int, involve: Boolean = false) = |
	| if(~exact~) ~exactKNNSearch~(v, i + 10) | if(involve) involveKNNSearch(v, i + 10) |

