|
2020 |
abdelaziz2020graph4code |
Graph4Code: A Machine Interpretable Knowledge Graph for Code |
|
2019 |
agashe2019julce |
JuICe: A Large Scale Distantly Supervised Dataset for Open Domain Context-based Code Generation |
|
2015 |
aggarwal2015using |
Using Machine Translation for Converting Python 2 to Python 3 Code |
|
2020 |
ahmad2020transformer |
A Transformer-based Approach for Source Code Summarization |
|
2021 |
ahmad2021unified |
Unified Pre-training for Program Understanding and Generation |
|
2019 |
ahmed2019learning |
Learning Lenient Parsing & Typing via Indirect Supervision |
|
2022 |
ahmed2022learning |
Learning code summarization from a small and local dataset |
|
2023 |
ahmed2033improving |
Improving Few-Shot Prompts with Relevant Static Analysis Products |
|
2021 |
alet2021largescale |
A large-scale benchmark for few-shot program induction and synthesis |
|
2022 |
allal2022santacoder |
SantaCoder: don’t reach for the stars! |
|
2013 |
allamanis2013mining |
Mining Source Code Repositories at Massive Scale Using Language Modeling |
|
2014 |
allamanis2014learning |
Learning Natural Coding Conventions |
|
2014 |
allamanis2014mining |
Mining Idioms from Source Code |
|
2015 |
allamanis2015bimodal |
A Bimodal Modelling of Source Code and Natural Language |
|
2015 |
allamanis2015suggesting |
Suggesting Accurate Method and Class Names |
|
2016 |
allamanis2016convolutional |
A Convolutional Attention Network for Extreme Summarization of Source Code |
|
2017 |
allamanis2017mining |
Mining Semantic Loop Idioms from Big Code |
|
2017 |
allamanis2017smartpaste |
SmartPaste: Learning to Adapt Source Code |
|
2018 |
allamanis2018learning |
Learning to Represent Programs with Graphs |
|
2019 |
allamanis2019adverse |
The Adverse Effects of Code Duplication in Machine Learning Models of Code |
|
2020 |
allamanis2020typilus |
Typilus: Neural Type Hints |
|
2021 |
allamanis2021self |
Self-Supervised Bug Detection and Repair |
|
2019 |
alon2018code2seq |
code2seq: Generating Sequences from Structured Representations of Code |
|
2018 |
alon2018general |
A General Path-Based Representation for Predicting Program Properties |
|
2019 |
alon2019code2vec |
code2vec: Learning Distributed Representations of Code |
|
2019 |
alon2019structural |
Structural Language Models for Any-Code Generation |
|
2017 |
amodio2017neural |
Neural Attribute Machines for Program Generation |
|
2020 |
arakelyan2020towards |
Towards Learning Representations of Binary Executable Files for Security Tasks |
|
2020 |
ashwath2020predicting |
Predicting Vulnerability in Large Codebases With Deep Code Representation |
|
2020 |
aye2020learning |
Learning Autocompletion from Real-World Datasets |
|
2020 |
aye2020sequence |
Sequence Model Design for Code Completion in the Modern IDE |
|
2021 |
bai2021jointly |
Jointly Learning to Repair Code and Generate Commit Message |
|
2022 |
bareiss2022code |
Code Generation Tools (Almost) for Free? A Study of Few-Shot, Pre-Trained Language Models on Code |
|
2022 |
barke2022grounded |
Grounded Copilot: How Programmers Interact with Code-Generating Models |
|
2017 |
barone2017parallel |
A parallel corpus of Python functions and documentation strings for automated code documentation and code generation |
|
2022 |
bavarian2022efficient |
Efficient Training of Language Models to Fill in the Middle |
|
2017 |
bavishi2017context2name |
Context2Name: A Deep Learning-Based Approach to Infer Natural Variable Names from Usage Contexts |
|
2019 |
bavishi2019autopandas |
AutoPandas: neural-backed generators for program synthesis |
|
2017 |
beltramelli2017pix2code |
pix2code: Generating Code from a Graphical User Interface Screenshot |
|
2018 |
bennun2018neural |
Neural Code Comprehension: A Learnable Representation of Code Semantics |
|
2021 |
berabi2021tfix |
TFix: Learning to Fix Coding Errors with a Text-to-Text Transformer |
|
2016 |
bhatia2016automated |
Automated Correction for Syntax Errors in Programming Assignments using Recurrent Neural Networks |
|
2018 |
bhatia2018neurosymbolic |
Neuro-symbolic program corrector for introductory programming assignments |
|
2016 |
bhoopchand2016learning |
Learning Python Code Suggestion with a Sparse Pointer Network |
|
2020 |
bian2020sinkfinder |
SinkFinder: harvesting hundreds of unknown interesting function pairs with just one seed |
|
2016 |
bichsel2016statistical |
Statistical Deobfuscation of Android Applications |
|
2020 |
bieber2020learning |
Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks |
|
2022 |
bieber2022static |
Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descriptions |
|
2016 |
bielik2016phog |
PHOG: Probabilistic Model for Code |
|
2020 |
bielik2020adversarial |
Adversarial Robustness for Code |
|
2020 |
brauckmann2020compiler |
Compiler-based graph representations for deep learning models of code |
|
2020 |
brauckmann2020compy |
ComPy-Learn: A toolbox for exploring machine learning representations for compilers |
|
2020 |
briem2020offside |
OffSide: Learning to Identify Mistakes in Boundary Conditions |
|
2019 |
brockschmidt2019generative |
Generative Code Modeling with Graphs |
|
2020 |
brody2020structural |
A Structural Model for Contextual Code Changes |
|
2009 |
bruch2009learning |
Learning from Examples to Improve Code Completion Systems |
|
2019 |
buech2019learning |
Learning-based Recursive Aggregation of Abstract Syntax Trees for Code Clone Detection |
|
2018 |
bui2018bilateral |
Bilateral Dependency Neural Networks for Cross-Language Algorithm Classification |
|
2018 |
bui2018cross |
Cross-Language Learning for Program Classification using Bilateral Tree-Based Convolutional Neural Networks |
|
2018 |
bui2018hierarchical |
Hierarchical Learning of Cross-Language Mappings through Distributed Vector Representations for Code |
|
2019 |
bui2019learning |
SAR: Learning Cross-Language API Mappings with Little Knowledge |
|
2021 |
bui2021efficient |
Self-Supervised Contrastive Learning for Code Retrieval and Summarization via Semantic-Preserving Transformations |
|
2021 |
bui2021infercode |
InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees |
|
2020 |
cai2020tag |
TAG : Type Auxiliary Guiding for Code Comment Generation |
|
2019 |
cambronero2019deep |
When Deep Learning Met Code Search |
|
2014 |
campbell2014syntax |
Syntax Errors Just Aren’t Natural: Improving Error Reporting with Language Models |
|
2013 |
cerulo2013hidden |
A Hidden Markov Model to Detect Coded Information Islands in Free Text |
|
2015 |
cerulo2015irish |
Irish: A Hidden Markov Model to detect coded information islands in free text |
|
2016 |
chae2016automatically |
Automatically generating features for learning program analysis heuristics |
|
2018 |
chakraborty2018tree2tree |
CODIT: Code Editing with Tree-Based Neural Machine Translation |
|
2021 |
chakraborty2020deep |
Deep Learning based Vulnerability Detection: Are We There Yet? |
|
2021 |
chakraborty2021multimodal |
On Multi-Modal Learning of Editing Source Code |
|
2019 |
chen2019capturing |
Capturing source code semantics via tree-based convolution over API-enhanced AST |
|
2019 |
chen2019literature |
A Literature Study of Embeddings on Source Code |
|
2019 |
chen2019mining |
Mining Likely Analogical APIs across Third-Party Libraries via Large-Scale Unsupervised API Semantics Embedding |
|
2019 |
chen2019sequencer |
SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repair |
|
2021 |
chen2021evaluating |
Evaluating Large Language Models Trained on Code |
|
2021 |
chen2021plur |
PLUR: A Unifying, Graph-Based View of Program Learning, Understanding, and Repair |
|
2022 |
chen2022codet |
CodeT: Code Generation with Generated Tests |
|
2023 |
chen2023diversevul |
DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection |
|
2019 |
chibotaru2019scalable |
Scalable Taint Specification Inference with Big Code |
|
2020 |
chirkova2020empirical |
Empirical Study of Transformers for Source Code |
|
2021 |
chirkova2021embeddings |
On the Embeddings of Variables in Recurrent Neural Networks for Source Code |
|
2023 |
chow2023beware |
Beware of the Unexpected: Bimodal Taint Analysis |
|
2020 |
ciurumelea2020suggesting |
Suggesting Comment Completions for Python using Neural Language Models |
|
2020 |
clement2020pymt5 |
PyMT5: multi-mode translation of natural language and Python code with transformers |
|
2021 |
clement2021distilling |
Distilling Transformers for Neural Cross-Domain Search |
|
2021 |
clement2021long |
Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchy |
|
2019 |
commit2vec2019lozoya |
Commit2Vec: Learning Distributed Representations of Code Changes |
|
2020 |
compton2020embedding |
Embedding Java Classes with code2vec: Improvements from Variable Obfuscation |
|
2015 |
corley2015exploring |
Exploring the Use of Deep Learning for Feature Location |
|
2017 |
cummins2017end |
End-to-end Deep Learning of Optimization Heuristics |
|
2017 |
cummins2017synthesizing |
Synthesizing benchmarks for predictive modeling |
|
2018 |
cummins2018compiler |
Compiler Fuzzing through Deep Learning |
|
2020 |
cummins2020programl |
ProGraML: Graph-based Deep Learning for Program Optimization and Analysis |
|
2018 |
cvitkovic2018open |
Open Vocabulary Learning on Source Code with a Graph-Structured Cache |
|
2016 |
dam2016deep |
A deep language model for software code |
|
2018 |
dash2018refinym |
RefiNym: Using Names to Refine Types |
|
2019 |
david2019neural |
Neural Reverse Engineering of Stripped Binaries |
|
2018 |
defreez2018path |
Path-Based Function Embedding and its Application to Specification Mining |
|
2020 |
derezendemartins2020concra.md |
CoNCRA: A Convolutional Neural Network Code Retrieval Approach |
|
2020 |
devanbu2020deep |
Deep Learning & Software Engineering: State of Research and Future Directions |
|
2017 |
devlin2017semantic |
Semantic Code Repair using Neuro-Symbolic Transformation Networks |
|
2021 |
deze2021mulcode |
MulCode: A Multi-task Learning Approach for Source Code Understanding |
|
2022 |
deze2022bridging |
Bridging Pre-trained Models and Downstream Tasks for Source Code Understanding |
|
2020 |
dinella2020hoppity |
Hoppity: Learning Bug Detection and Repair |
|
2021 |
dinella2021deepmerge |
DeepMerge: Learning to Merge Programs |
|
2022 |
dinella2022toga |
TOGA: A Neural Method for Test Oracle Generation |
|
2019 |
ding2019asm2vec |
Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimization |
|
2021 |
ding2021contrastive |
Contrastive Learning for Source Code with Structural and Functional Properties |
|
2022 |
doderlein2022piloting |
Piloting Copilot and Codex: Hot Temperature, Cold Prompts, or Black Magic? |
|
2023 |
dong2023codescore |
CodeScore: Evaluating Code Generation by Learning Code Execution |
|
2021 |
drain2021deepdebug |
DeepDebug: Fixing Python Bugs Using Stack Traces, Backtranslation, and Code Skeletons |
|
2021 |
drain2021generating |
Generating Bug-Fixes Using Pretrained Transformers |
|
2019 |
edelmann2019neural |
Neural-Network Guided Expression Transformation |
|
2019 |
ederhardt2019unsupervised |
Unsupervised Learning of API Aliasing Specifications |
|
2019 |
efstathiou2019semantic |
Semantic Source Code Models Using Identifier Embeddings |
|
2022 |
eghbali2022crystalbleu |
CrystalBLEU: Precisely and Efficiently Measuring the Similarity of Code |
|
2021 |
ellis2021dreamcoder |
DreamCoder: bootstrapping inductive program synthesis with wake-sleep library learning |
|
2021 |
elnaggar2021codetrans |
CodeTrans: Towards Cracking the Language of Silicon's Code Through Self-Supervised Deep Learning and High Performance Computing |
|
2020 |
feng2020codebert |
CodeBERT: A Pre-Trained Model for Programming and Natural Languages |
|
2019 |
fernandes2019structured |
Structured Neural Summarization |
|
2016 |
fowkes2016parameter |
Parameter-Free Probabilistic API Mining across GitHub |
|
2017 |
fowkes2017autofolding |
Autofolding for Source Code Summarization |
|
2015 |
franks2015cacheca |
CACHECA: A Cache Language Model Based Code Suggestion Tool |
|
2022 |
fried2022incoder |
InCoder: A Generative Model for Code Infilling and Synthesis |
|
2019 |
fu2019coda |
Coda: An End-to-End Neural Program Decompiler |
|
2019 |
gao2019neural |
A Neural Model for Method Name Generation from Functional Description |
|
2022 |
garg2022deepperf |
DeepPERF: A Deep Learning-Based Approach For Improving Software Performance |
|
2021 |
gholamian2021naturalness |
On the Naturalness and Localness of Software Logs |
|
2015 |
glassman2015overcode |
OverCode: visualizing variation in student solutions to programming problems at scale |
|
2019 |
goens2019case |
A case study on machine learning for synthesizing benchmarks |
|
2020 |
gros2020code |
Code to Comment "Translation": Data, Metrics, Baselining & Evaluation |
|
2016 |
gu2016deep |
Deep API Learning |
|
2017 |
gu2017deepam |
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning |
|
2018 |
gu2018deep |
Deep Code Search |
|
2022 |
gui2022cross |
Cross-Language Binary-Source Code Matching with Intermediate Representations |
|
2014 |
gulwani2014nlyze |
NLyze: Interactive Programming by Natural Language for SpreadSheet Data Analysis and Manipulation |
|
2017 |
guo2017semantically |
Semantically enhanced software traceability using deep learning techniques |
|
2020 |
guo2020graphcodebert |
GraphCodeBERT: Pre-training Code Representations with Data Flow |
|
2022 |
guo2022learning |
Learning to Complete Code with Sketches |
|
2022 |
guo2022unixcoder |
UniXcoder: Unified Cross-Modal Pre-training for Code Representation |
|
2017 |
gupta2017deepfix |
DeepFix: Fixing Common C Language Errors by Deep Learning |
|
2018 |
gupta2018deep |
Deep Reinforcement Learning for Programming Language Correction |
|
2018 |
gupta2018intelligent |
Intelligent code reviews using deep learning |
|
2019 |
gupta2019neural |
Neural Attribution for Semantic Bug-Localization in Student Programs |
|
2015 |
gvero2015synthesizing |
Synthesizing Java expressions from free-form queries |
|
2019 |
habib2019neural |
Neural Bug Finding: A Study of Opportunities and Challenges |
|
2019 |
hajipour2019samplefix |
SampleFix: Learning to Correct Programs by Sampling Diverse Fixes |
|
2020 |
haldar2020multiperspective |
A Multi-Perspective Architecture for Semantic Code Search |
|
2020 |
haque2020improved |
Improved Automatic Summarization of Subroutines via Attention to File Context |
|
2022 |
haque2022semantic |
Semantic Similarity Metrics for Evaluating Source Code Summarization |
|
2018 |
harer2018learning |
Learning to Repair Software Vulnerabilities with Generative Adversarial Networks |
|
2018 |
hashimoto2018retrieve |
A Retrieve-and-Edit Framework for Predicting Structured Outputs |
|
2018 |
hata2018learning |
Learning to Generate Corrective Patches using Neural Machine Translation |
|
2021 |
hazoom2021text |
Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data |
|
2019 |
he2019learning |
Learning to Fuzz from Symbolic Execution with Application to Smart Contracts |
|
2021 |
he2021learning |
Learning to Find Naming Issues with Big Code and Small Supervision |
|
2022 |
he2022distribution |
On Distribution Shift in Learning-based Bug Detectors |
|
2015 |
hellendoorn2015will |
Will they like this? Evaluating Code Contributions With Language Models |
|
2017 |
hellendoorn2017deep |
Are Deep Neural Networks the Best Choice for Modeling Source Code? |
|
2018 |
hellendoorn2018deep |
Deep Learning Type Inference |
|
2020 |
hellendoorn2020global |
Global Relational Models of Source Code |
|
2022 |
henkel2020semantic |
Semantic Robustness of Models of Source Code |
|
2020 |
heyman2020neural |
Neural Code Search Revisited: Enhancing Code Snippet Retrieval through Natural Language Intent |
|
2012 |
hindle2012naturalness |
On the Naturalness of Software |
|
2020 |
hoang2020cc2vec |
CC2Vec: Distributed Representations of Code Changes |
|
2021 |
hong2021fix |
Fix-Filter-Fix: Intuitively Connect Any Models for Effective Bug Fixing |
|
2014 |
hsiao2014using |
Using Web Corpus Statistics for Program Analysis |
|
2017 |
hu2017codesum |
CodeSum: Translate Program Language to Natural Language |
|
2021 |
huang2021cosqa |
CoSQA: 20,000+ Web Queries for Code Search and Question Answering |
|
2019 |
husain2019codesearchnet |
CodeSearchNet Challenge: Evaluating the State of Semantic Code Search |
|
2019 |
hussain2019deep |
Deep Transfer Learning for Source Code Modeling |
|
2016 |
iyer2016summarizing |
Summarizing Source Code using a Neural Attention Model |
|
2018 |
iyer2018mapping |
Mapping Language to Code in Programmatic Context |
|
2019 |
iyer2019learning |
Learning Programmatic Idioms for Scalable Semantic Parsing |
|
2020 |
jain2020contrastive |
Contrastive Code Representation Learning |
|
2019 |
jayasundara2019treecaps |
TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing |
|
2021 |
jesse2021learning |
Learning Type Annotation: Is Big Data Enough? |
|
2022 |
jesse2022learning |
Learning To Predict User-Defined Types |
|
2023 |
jesse2023large |
Large Language Models and Simple, Stupid Bugs |
|
2021 |
jian2021multimodal |
Multimodal Representation for Neural Code Search |
|
2022 |
jian2022assemble |
Assemble Foundation Models for Automatic Code Summarization |
|
2017 |
jiang2017automatically |
Automatically Generating Commit Messages from Diffs using Neural Machine Translation |
|
2021 |
jiang2021treebert |
TreeBERT: A Tree-Based Pre-Trained Model for Programming Language |
|
2020 |
johnson2020learning |
Learning Graph Structure With A Finite-State Automaton Layer |
|
2021 |
jung2021commitbert |
CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model |
|
2019 |
kacmajor2019automatic |
Automatic Acquisition of Annotated Training Corpora for Test-Code Generation |
|
2020 |
kanade2020pretrained |
Pre-trained Contextual Embedding of Source Code |
|
2014 |
karaivanov2014phrase |
Phrase-Based Statistical Translation of Programming Languages |
|
2019 |
karampatsis2019deep |
Maybe Deep Neural Networks are the Best Choice for Modeling Source Code |
|
2020 |
karampatsis2020big |
Big Code != Big Vocabulary: Open-Vocabulary Models for Source Code |
|
2020 |
karampatsis2020scelmo |
SCELMo: Source Code Embeddings from Language Models |
|
2021 |
karmakar2021what |
What do pre-trained code models know about code? |
|
2022 |
karmakar2022jemma |
JEMMA: An Extensible Java Dataset for ML4Code Applications |
|
2015 |
karpathy2015visualizing |
Visualizing and Understanding Recurrent Networks |
|
2019 |
katz2019towards |
Towards Neural Decompilation |
|
2022 |
key2022speak |
I Speak, You Verify: Toward Trustworthy Neural Program Synthesis |
|
2022 |
kharkar2022learning |
Learning to Reduce False Positives in Analytic Bug Detectors |
|
2020 |
kim2020code |
Code Prediction by Feeding Trees to Transformers |
|
2017 |
koc2017learning |
Learning a Classifier for False Positive Error Reports Emitted by Static Code Analysis Tools |
|
2022 |
kocetkov2022stack |
The Stack: 3TB of permissively licensed source code |
|
2021 |
korbak2021energy |
Energy-Based Models for Code Generation under Compilability Constraints |
|
2022 |
kovalchuk2022human |
Human perceiving behavior modeling in evaluation of code generation models |
|
2019 |
kovalenko2019pathminer |
PathMiner: A Library for Mining of Path-Based Representations of Code |
|
2007 |
kremenek2007factor |
A Factor Graph Model for Software Bug Finding |
|
2019 |
kulal2019spoc |
SPoC: Search-based Pseudocode to Code |
|
2020 |
kurbatova2020recommendation |
Recommendation of Move Method Refactoring Using Path-Based Representation of Code |
|
2013 |
kushman2013using |
Using Semantic Unification to Generate Regular Expressions from Natural Language |
|
2020 |
lachaux2020unsupervised |
Unsupervised Translation of Programming Languages |
|
2019 |
lacomis2019neural |
A Neural Approach to Decompiled Identifier Renaming |
|
2018 |
lanchantin2018exploring |
Exploring the Naturalness of Buggy Code with Recurrent Neural Network |
|
2019 |
leclair2019neural |
A Neural Model for Generating Natural Language Summaries of Program Subroutines |
|
2019 |
leclair2019recommendations |
Recommendations for Datasets for Source Code Summarization |
|
2020 |
leclair2020improved |
Improved Code Summarization via a Graph Neural Network |
|
2020 |
lee2020montage |
Montage: A Neural Network Language Model-Guided JavaScript Engine Fuzzer |
|
2021 |
lee2021cotraining |
Co-Training for Commit Classification |
|
2017 |
levy2017learning |
Learning to Align the Source Code to the Compiled Object Code |
|
2022 |
lherondelle2022topical |
Topical: Learning Repository Embeddings from Source Code using Attention |
|
2016 |
li2016gated |
Gated Graph Sequence Neural Networks |
|
2017 |
li2017code |
Code Completion with Neural Attention and Pointer Networks |
|
2017 |
li2017software |
Software Defect Prediction via Convolutional Neural Network |
|
2019 |
li2019improving |
Improving Bug Detection via Context-Based Code Representation Learning and Attention-Based Neural Networks |
|
2019 |
li2019neural |
Neural Code Search Evaluation Dataset |
|
2019 |
li2019using |
Using GGNN to recommend log statement level |
|
2020 |
li2020dlfix |
DLFix: Context-based Code Transformation Learning for Automated Program Repair |
|
2020 |
li2020learning |
Learning Code-Query Interaction for Enhancing Code Searches |
|
2021 |
li2021learning |
Learning to Extend Program Graphs to Work-in-Progress Code |
|
2021 |
li2021toward |
Toward Less Hidden Cost of Code Completion with Acceptance and Ranking Models |
|
2022 |
li2022codereviewer |
CodeReviewer: Pre-Training for Automating Code Review Activities |
|
2022 |
li2022exploring |
Exploring Representation-Level Augmentation for Code Search |
|
2021 |
liguori2021shellcode_ia32 |
Shellcode_IA32: A Dataset for Automatic Shellcode Generation |
|
2017 |
lin2017program |
Program Synthesis from Natural Language Using Recurrent Neural Networks |
|
2018 |
lin2018nl2bash |
NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System |
|
2019 |
lin2019impact |
On the Impact of Refactoring Operations on Code Naturalness |
|
2016 |
ling2016latent |
Latent Predictor Networks for Code Generation |
|
2020 |
ling2020adaptive |
Adaptive Deep Code Search |
|
2020 |
ling2020deep |
Deep Graph Matching and Searching for Semantic Code Retrieval |
|
2016 |
liu2016towards |
Towards Better Program Obfuscation: Optimization via Language Models |
|
2018 |
liu2018neural |
Neural-Machine-Translation-Based Commit Message Generation: How Far Are We? |
|
2019 |
liu2019deepfuzz |
DeepFuzz: Automatic Generation of Syntax Valid C Programs for Fuzz Testing |
|
2019 |
liu2019generating |
Generating commit messages from diffs using pointer-generator network |
|
2019 |
liu2019learning |
Learning to Spot and Refactor Inconsistent Method Names |
|
2019 |
liu2019neural |
Neural query expansion for code search |
|
2020 |
liu2020automating |
Automating Just-In-Time Comment Updating |
|
2022 |
liu2022open |
Open-ended Knowledge Tracing |
|
2018 |
louis2018deep |
Deep Learning to Detect Redundant Method Comments |
|
2020 |
louis2020where |
Where should I comment my code? A dataset and model for predicting locations that need comments |
|
2017 |
loyola2017neural |
A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes |
|
2018 |
loyola2018content |
Content Aware Source Code Change Description Generation |
|
2019 |
lu2019program |
Program Classification Using Gated Graph Attention Neural Network for Online Programming Service |
|
2021 |
lu2021codexglue |
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation |
|
2022 |
lu2022reacc |
ReACC: A Retrieval-Augmented Code Completion Framework |
|
2019 |
luan2019aroma |
Aroma: code recommendation via structural code search |
|
2014 |
maddison2014structured |
Structured Generative Models of Natural Source Code |
|
2021 |
mahmud2021code |
Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors |
|
2019 |
malik2019nl2type |
NL2Type: Inferring JavaScript Function Types from Natural Language Information |
|
2020 |
mammadli2020static |
Static Neural Compiler Optimization via Deep Reinforcement Learning |
|
2015 |
mangal2015user |
A User-Guided Approach to Program Analysis |
|
2017 |
markovtsev2017topic |
Topic modeling of public repositories at scale using names in source code |
|
2018 |
markovtsev2018public |
Public Git Archive: a Big Code dataset for all |
|
2019 |
markovtsev2019style |
STYLE-ANALYZER: fixing code style inconsistencies with interpretable unsupervised algorithms |
|
2022 |
mastropaolo2022using |
Using Deep Learning to Generate Complete Log Statements |
|
2020 |
mehrotra2020modeling |
Modeling Functional Similarity in Source Code with Graph-Based Siamese Networks |
|
2013 |
menon2013machine |
A Machine Learning Framework for Programming by Example |
|
2019 |
mesbah2019deepdelta |
DeepDelta: Learning to Repair Compilation Errors |
|
2021 |
mir2021manytypes4py |
ManyTypes4Py: A Benchmark Python Dataset for Machine Learning-based Type Inference |
|
2021 |
mir2021type4py |
Type4Py: Deep Similarity Learning-Based Type Inference for Python |
|
2021 |
monperrus2021megadiff |
Megadiff: A Dataset of 600k Java Source Code Changes Categorized by Diff Size |
|
2014 |
mou2014building |
Building Program Vector Representations for Deep Learning |
|
2016 |
mou2016convolutional |
Convolutional Neural Networks over Tree Structures for Programming Language Processing |
|
2013 |
movshovitz2013natural |
Natural Language Models for Predicting Programming Comments |
|
2015 |
movshovitz2015kb |
KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts |
|
2020 |
mukherjee2020searching |
Searching a Database of Source Codes Using Contextualized Code Search |
|
2021 |
mukherjee2021neural |
Neural Program Generation Modulo Static Analysis |
|
2018 |
murali2017bayesian |
Bayesian Sketch Learning for Program Synthesis |
|
2017 |
murali2017finding |
Finding Likely Errors with Bayesian Specifications |
|
2022 |
nadeem2022codedsi |
CodeDSI: Differentiable Code Search |
|
2022 |
naik2022probing |
Probing Semantic Grounding in Language Models of Code with Representational Similarity Analysis |
|
2020 |
nair2020funcgnn |
funcGNN: A Graph Neural Network Approach to Program Similarity |
|
2013 |
nguyen2013lexical |
Lexical Statistical Machine Translation for Language Migration |
|
2013 |
nguyen2013statistical |
A Statistical Semantic Language Model for Source Code |
|
2013 |
nguyen2013study |
A Study of Repetitiveness of Code Changes in Software Evolution |
|
2014 |
nguyen2014statistical |
Statistical Learning Approach for Mining API Usage Mappings for Code Migration |
|
2015 |
nguyen2015divide |
Divide-and-Conquer Approach for Multi-phase Statistical Migration for Source Code |
|
2015 |
nguyen2015graph |
Graph-based Statistical Language Model for Code |
|
2016 |
nguyen2016learning |
Learning API Usages from Bytecode: A Statistical Approach |
|
2016 |
nguyen2016mapping |
Mapping API Elements for Code Migration with Vector Representations |
|
2017 |
nguyen2017exploring |
Exploring API Embedding for API Usages and Applications |
|
2019 |
nguyen2019graph |
Graph-based Mining of In-the-Wild, Fine-grained, Semantic Code Change Patterns |
|
2020 |
nguyen2020suggesting |
Suggesting Natural Method Names to Check Name Consistencies |
|
2021 |
nie2021evaluation |
Impact of Evaluation Methodologies on Code Summarization |
|
2022 |
nijkamp2022conversational |
A Conversational Paradigm for Program Synthesis |
|
2021 |
nitin2021direct |
DIRECT : A Transformer-based Model for Decompiled Identifier Renaming |
|
2022 |
niu2022spt-code |
SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representations |
|
2021 |
nye2021program |
Program Synthesis with Large Language Models |
|
2021 |
nye2021show |
Show Your Work: Scratchpads for Intermediate Computation with Language Models |
|
2015 |
oda2015learning |
Learning to Generate Pseudo-code from Source Code using Statistical Machine Translation |
|
2015 |
oh2015learning |
Learning a Strategy for Adapting a Program Analysis via Bayesian Optimisation |
|
2013 |
omar2013structured |
Structured Statistical Syntax Tree Prediction |
|
2021 |
orlanski2021reading |
Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generation |
|
2018 |
ott2018deep |
A Deep Learning Approach to Identifying Source Code in Images and Video |
|
2020 |
pandi2020opttyper |
OptTyper: Probabilistic Type Inference by Optimising Logical and Natural Constraints |
|
2020 |
panthaplackel2020associating |
Associating Natural Language Comment and Source Code Entities |
|
2020 |
panthaplackel2020copy |
Copy that! Editing Sequences by Copying Spans |
|
2020 |
panthaplackel2020deep |
Deep Just-In-Time Inconsistency Detection Between Comments and Source Code |
|
2020 |
panthaplackel2020learning |
Learning to Update Natural Language Comments Based on Code Changes |
|
2021 |
panthaplackel2021learning |
Learning to Describe Solutions for Bug Reports Based on Developer Discussions |
|
2022 |
panthaplackel2022using |
Using Developer Discussions to Guide Fixing Bugs in Software |
|
2018 |
parvez2018building |
Building Language Models for Text with Named Entities |
|
2021 |
parvez2021retrieval |
Retrieval Augmented Code Generation and Summarization |
|
2022 |
pashakhanloo2022codetrek |
CodeTrek: Flexible Modeling of Code using an Extensible Relational Representation |
|
2022 |
patil2022exploring |
Exploring Dimensions of Generalizability and Few-shot Transfer for Text-to-SQL Semantic Parsing |
|
2016 |
patra2016learning |
Learning to Fuzz: Application-Independent Fuzz Testing with Probabilistic, Generative Models of Input Data |
|
2021 |
patra2021semantic |
Semantic Bug Seeding: A Learning-Based Approach for Creating Realistic Bugs |
|
2021 |
pearce2021empirical |
An Empirical Cybersecurity Evaluation of GitHub Copilot's Code Contributions |
|
2021 |
peng2021how |
How could Neural Networks understand Programs? |
|
2021 |
phan2021cotext |
CoTexT: Multi-task Learning with Code-Text Transformer |
|
2015 |
piech2015learning |
Learning Program Embeddings to Propagate Feedback on Student Code |
|
2022 |
poesia2022synchromesh |
Synchromesh: Reliable code generation from pre-trained language models |
|
2021 |
popov2021time |
Time-Efficient Code Completion Model for the R Programming Language |
|
2017 |
pradel2017deep |
Deep Learning to Find Bugs |
|
2019 |
pradel2019typewriter |
TypeWriter: Neural Type Prediction with Search-based Validation |
|
2020 |
pradel2020neural |
Neural Software Analysis |
|
2021 |
pravilov2021unsupervised |
Unsupervised Learning of General-Purpose Embeddings for Code Changes |
|
2015 |
proksch2015intelligent |
Intelligent Code Completion with Bayesian Networks |
|
2016 |
pu2016skp |
sk_p: a neural program corrector for MOOCs |
|
2021 |
puri2021project |
Project CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks |
|
2019 |
rabin2019testing |
Testing Neural Program Analyzers |
|
2020 |
rabin2020demystifying |
Towards Demystifying Dimensions of Source Code Embeddings |
|
2021 |
rabin2021generalizability |
On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformations |
|
2021 |
rabin2021understanding |
Understanding Neural Code Intelligence Through Program Simplification |
|
2022 |
rabin2022memorization |
Memorization and Generalization in Neural Code Intelligence Models |
|
2022 |
rabin2022understanding |
Syntax-Guided Program Reduction for Understanding Neural Code Intelligence Models |
|
2017 |
rabinovich2017abstract |
Abstract Syntax Networks for Code Generation and Semantic Parsing |
|
2018 |
raghothaman2018user |
User-guided program reasoning using Bayesian inference |
|
2019 |
rahman2019natural |
Natural Software Revisited |
|
2022 |
ramakrishnan2020backdoors |
Backdoors in Neural Models of Source Code |
|
2015 |
ray2015naturalness |
On the “Naturalness” of Buggy Code |
|
2014 |
raychev2014code |
Code Completion with Statistical Language Models |
|
2015 |
raychev2015predicting |
Predicting Program Properties from “Big Code” |
|
2016 |
raychev2016learning |
Learning Programs from Noisy Data |
|
2022 |
reid2022learning |
Learning to Model Editing Processes |
|
2020 |
ren2020codebleu |
CodeBLEU: a Method for Automatic Evaluation of Code Synthesis |
|
2017 |
richardson2017code2text |
The Code2Text Challenge: Text Generation in Source Code Libraries |
|
2017 |
richardson2017function |
Function Assistant: A Tool for NL Querying of APIs |
|
2017 |
richardson2017learning |
Learning Semantic Correspondences in Technical Documentation |
|
2018 |
richardson2018polyglot |
Polyglot Semantic Parsing in APIs |
|
2022 |
richter2022can |
Can we learn from developer mistakes? Learning to localize and repair real bugs from real bug fixes |
|
2021 |
roziere2021dobf |
DOBF: A Deobfuscation Pre-Training Objective for Programming Languages |
|
2021 |
roziere2021leveraging |
Leveraging Automated Unit Tests for Unsupervised Code Translation |
|
2018 |
russell2018automated |
Automated Vulnerability Detection in Source Code Using Deep Representation Learning |
|
2023 |
saberi2023model |
Model-Agnostic Syntactical Information for Pre-Trained Programming Language Models |
|
2022 |
sahu2022learning |
Learning to Answer Semantic Queries over Code |
|
2018 |
saini2018oreo |
Oreo: detection of clones in the twilight zone |
|
2018 |
santos2018syntax |
Syntax and Sensibility: Using language models to detect and correct syntax errors |
|
2015 |
saraiva2015products |
Products, Developers, and Milestones: How Should I Build My N-Gram Language Model |
|
2022 |
sarkar2022what |
What is it like to program with artificial intelligence? |
|
2019 |
schrouff2019inferring |
Inferring Javascript types using Graph Neural Networks |
|
2021 |
schuster2021you |
You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion |
|
2015 |
sharma2015nirmal |
NIRMAL: Automatic Identification of Software Relevant Tweets Leveraging Language Model |
|
2019 |
sharma2019feasibility |
On the Feasibility of Transfer-learning Code Smells using Deep Learning |
|
2022 |
sharma2022exploratory |
An Exploratory Study on Code Attention in BERT |
|
2022 |
sharma2022lamner |
LAMNER: Code Comment Generation Using Character Language Model and Named Entity Recognition |
|
2019 |
she2019neuzz |
NEUZZ: Efficient Fuzzing with Neural Program Smoothing |
|
2019 |
shi2019learning |
Learning Execution through Neural Code Fusion |
|
2022 |
shi2022cv4code |
CV4Code: Sourcecode Understanding via Visual Code Representations |
|
2019 |
shido2019automatic |
Automatic Source Code Summarization with Extended Tree-LSTM |
|
2018 |
shirani2018evaluation |
Evaluation of Type Inference with Textual Cues |
|
2020 |
shrivastava2020on-the-fly |
On-the-Fly Adaptation of Source Code Models using Meta-Learning |
|
2022 |
shrivastava2020repository |
Repository-Level Prompt Generation for Large Language Models of Code |
|
2020 |
shuai2020improving |
Improving Code Search with Co-Attentive Representation Learning |
|
2018 |
si2018learning |
Learning Loop Invariants for Program Verification |
|
2022 |
silavong2022senatus |
Senatus - A Fast and Accurate Code-to-Code Recommendation Engine |
|
2016 |
singh2016question |
Question Independent Grading using Machine Learning: The Case of Computer Program Grading |
|
2019 |
siow2019core |
CORE: Automating Review Recommendation for Code Changes |
|
2022 |
siow2022learning |
Learning Program Semantics with Code Representations: An Empirical Study |
|
2021 |
sivaraman2021mining |
Mining Idioms in the Wild |
|
2023 |
souza2023lexecutor |
LExecutor: Learning-Guided Execution |
|
2021 |
spirin2021psiminer |
PSIMiner: A Tool for Mining Rich Abstract Syntax Trees from Code |
|
2014 |
srikant2014system |
A system to grade computer programming skills using machine learning |
|
2019 |
sun2019grammar |
A Grammar-Based Structural CNN Decoder for Code Generation |
|
2020 |
sun2020pscs |
PSCS: A Path-based Neural Model for Semantic Code Search |
|
2019 |
svyatkovskiy2019pythia |
Pythia: AI-assisted Code Completion System |
|
2020 |
svyatkovskiy2020fast |
Fast and Memory-Efficient Neural Code Completion |
|
2020 |
svyatkovskiy2020intellicode |
IntelliCode Compose: Code Generation Using Transformer |
|
2022 |
szafraniec2022code |
Code Translation with Compiler Representations |
|
2020 |
tabassum2020code |
Code and Named Entity Recognition in StackOverflow |
|
2019 |
tarlow2019learning |
Learning to Fix Build Errors with Graph2Diff Neural Networks |
|
2019 |
theeten2019import2vec |
Import2vec - Learning Embeddings for Software Libraries |
|
2020 |
tian2020evaluating |
Evaluating Representation Learning of Code Changes for Predicting Patch Correctness in Program Repair |
|
2019 |
tomczak2019simulating |
Simulating Execution Time of Tensor Programs using Graph Neural Networks |
|
2019 |
tran2019recovering |
Recovering Variable Names for Minified Code with Usage Contexts |
|
2014 |
tu2014localness |
On the Localness of Software |
|
2018 |
tufano2018deep |
Deep Learning Similarities from Different Representations of Source Code |
|
2018 |
tufano2018empirical |
An Empirical Study on Learning Bug-Fixing Patches in the Wild via Neural Machine Translation |
|
2018 |
tufano2018learning |
Learning How to Mutate Source Code from Bug-Fixes |
|
2019 |
tufano2019learning |
On Learning Meaningful Code Changes via Neural Machine Translation |
|
2020 |
tufano2020generating |
Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformers |
|
2020 |
tufano2020unit |
Unit Test Case Generation with Transformers |
|
2022 |
vaithilingam2022expectation |
Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models |
|
2019 |
vasic2019neural |
Neural Program Repair by Jointly Learning to Localize and Repair |
|
2017 |
vasilescu2017recovering |
Recovering Clear, Natural Identifiers from Obfuscated JS Names |
|
2021 |
villmow2021contest |
ConTest: A Unit Test Completion Benchmark featuring Context |
|
2018 |
wan2018improving |
Improving Automatic Source Code Summarization via Deep Reinforcement Learning |
|
2019 |
wan2019multimodal |
Multi-Modal Attention Network Learning for Semantic Source Code Retrieval |
|
2020 |
wan2020naturalcc |
NaturalCC: A Toolkit to Naturalize the Source Code Corpus |
|
2022 |
wan2022what |
What Do They Capture? -- A Structural Analysis of Pre-Trained Language Models for Source Code |
|
2016 |
wang2016automatically |
Automatically Learning Semantic Features for Defect Prediction |
|
2016 |
wang2016bugram |
Bugram: bug detection with n-gram language models |
|
2016 |
wang2016neural |
Neural Code Completion |
|
2019 |
wang2019learning |
Learning Scalable and Precise Representation of Program Semantics |
|
2020 |
wang2020blended |
Blended, precise semantic program embeddings |
|
2020 |
wang2020cocogum |
CoCoGUM: Contextual Code Summarization with Multi-Relational GNN on UMLs |
|
2020 |
wang2020detecting |
Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Tree |
|
2020 |
wang2020learning |
Learning Semantic Program Embeddings with Graph Interval Neural Network |
|
2020 |
wang2020learning2 |
Learning to Represent Programs with Heterogeneous Graphs |
|
2020 |
wang2020modular |
Modular Tree Network for Source Code Representation Learning |
|
2020 |
wang2020trans |
TranS^3: A Transformer-based Framework for Unifying Code Summarization and Code Search |
|
2021 |
wang2021codet5 |
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation |
|
2021 |
wang2021syncobert |
SynCoBERT: Syntax-Guided Multi-Modal Contrastive Pre-Training for Code Representation |
|
2021 |
watson2021systematic |
A Systematic Literature Review on the Use of Deep Learning in Software Engineering Research |
|
2021 |
waunakh2019idbench |
IdBench: Evaluating Semantic Representations of Identifier Names in Source Code |
|
2019 |
wei2019code |
Code Generation as a Dual Task of Code Summarization |
|
2020 |
wei2020lambdanet |
LambdaNet: Probabilistic Type Inference using Graph Neural Networks |
|
2023 |
wei2023typet5 |
TypeT5: Seq2seq Type Inference using Static Analysis |
|
2015 |
white2015toward |
Toward Deep Learning Software Repositories |
|
2016 |
white2016deep |
Deep Learning Code Fragments for Code Clone Detection |
|
2017 |
white2017sorting |
Sorting and Transforming Program Repair Ingredients via Deep Learning Code Similarities |
|
2021 |
wong2021leveraging |
Leveraging Language to Learn Program Abstractions and Search Heuristics |
|
2021 |
wu2021prototransformer |
ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback |
|
2019 |
xu2019commit |
Commit Message Generation for Source Code Changes |
|
2019 |
xu2019method |
Method name suggestion with hierarchical attention networks |
|
2020 |
xu2020incorporating |
Incorporating External Knowledge through Pre-training for Natural Language to Code Generation |
|
2021 |
xu2021capturing |
Capturing Structural Locality in Non-parametric Language Models |
|
2022 |
xu2022systematic |
A Systematic Evaluation of Large Language Models of Code |
|
2016 |
yadid2016extracting |
Extracting Code from Programming Tutorial Videos |
|
2020 |
yan2020are |
Are the Code Snippets What We Are Searching for? A Benchmark and an Empirical Study on Code Search with Natural-Language Queries |
|
2017 |
yang2017language |
A Language Model for Statements of Software Code |
|
2020 |
yang2020survey |
A Survey on Deep Learning for Software Engineering |
|
2018 |
yao2018staqc |
StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow |
|
2019 |
yao2019coacor |
CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning |
|
2020 |
yasunaga2020graph |
Graph-based, Self-Supervised Program Repair from Diagnostic Feedback |
|
2020 |
ye2020leveraging |
Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning |
|
2020 |
ye2020misim |
MISIM: An End-to-End Neural Code Similarity System |
|
2021 |
ye2021neural |
Neural Program Repair with Execution-based Backpropagation |
|
2022 |
ye2022selfapr |
SelfAPR: Self-supervised Program Repair with Test Execution Diagnostics |
|
2019 |
yefet2019adversarial |
Adversarial Examples for Models of Code |
|
2017 |
yin2017syntactic |
A Syntactic Neural Model for General-Purpose Code Generation |
|
2018 |
yin2018mining |
Learning to Mine Aligned Code and Natural Language Pairs from Stack Overflow |
|
2019 |
yin2019learning |
Learning to Represent Edits |
|
2019 |
yonai2019mercem |
Mercem: Method Name Recommendation Based on Call Graph Embedding |
|
2017 |
yuan2017abridging |
Abridging Source Code |
|
2014 |
zaremba2014learning |
Learning to Execute |
|
2022 |
zeng2022extensive |
An Extensive Study on Pre-trained Models for Program Understanding and Generation |
|
2019 |
zhang2019learning |
Learning Uniform Semantic Features for Natural Language and Programming Language Globally, Locally and Sequentially |
|
2019 |
zhang2019novel |
A Novel Neural Source Code Representation based on Abstract Syntax Tree |
|
2020 |
zhang2020generating |
Generating Adversarial Examples for Holding Robustness of Source Code Processing Models |
|
2021 |
zhang2021bag |
Bag-of-Words Baselines for Semantic Code Search |
|
2021 |
zhang2021disentangled |
Disentangled Code Representation Learning for Multiple Programming Languages |
|
2022 |
zhang2022coditt5 |
CoditT5: Pretraining for Source Code and Natural Language Editing |
|
2023 |
zhang2023repocoder |
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation |
|
2018 |
zhao2018neural |
Neural-Augmented Static Analysis of Android Communication |
|
2019 |
zhao2019neural |
Neural Networks for Modeling Source Code Edits |
|
2018 |
zhong2018generating |
Generating Regular Expressions from Natural Language Specifications: Are We There Yet? |
|
2020 |
zhong2020semantic |
Semantic Scaffolds for Pseudocode-to-Code Generation |
|
2020 |
zhou2019devign |
Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks |
|
2021 |
zhou2021improving |
Improving Code Autocompletion with Transfer Learning |
|
2023 |
zhou2022codebertscore |
CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code |
|
2022 |
zhou2022docoder |
DocCoder: Generating Code by Retrieving and Reading Docs |
|
2020 |
zhu2020ocor |
OCoR: An Overlapping-Aware Code Retriever |
|
2021 |
zhu2921syntax |
A Syntax-Guided Edit Decoder for Neural Program Repair |
|
2022 |
ziegler2022productivity |
Productivity Assessment of Neural Code Completion |
|
2022 |
zlotchevski2022exploring |
Exploring and Evaluating Personalized Models for Code Generation |
|
2021 |
zugner2021language |
Language-Agnostic Representation Learning of Source Code from Structure and Context |