use arrow::{
    ipc::{reader::FileReader, writer::FileWriter},
    record_batch::RecordBatch,
};
use std::fs::File;

// Read an Arrow IPC file, hash its text column, and write the result to a new
// IPC file. The original gist also pulled in an `XXHash64` helper (presumably
// from an external xxHash crate); the rest of the body is cut off in the source.
fn hash_text_column(input_path: &str, output_path: &str) {
    let input_file = File::open(input_path).unwrap();
    let mut input_reader = FileReader::try_new(input_file, None).unwrap();
    let input_schema = input_reader.schema();
import time
import os
from pyspark.ml import Pipeline
from pyspark.ml.feature import RegexTokenizer, NGram, HashingTF, MinHashLSH
from pyspark.sql.functions import col
from spark_session_builder import build_spark_session

spark = build_spark_session("spark://cpu64-dy-c6i-16xlarge-1:7077", 32, 128)
db = spark.read.parquet("/fsx/shared/pilev2_parquet/StackExchange_ver4_non_local_dedupped/dataset.parquet").limit(1_000_000)  # Stage 0 & 1
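The gist stops before the pipeline itself is assembled. A minimal sketch of how the imported stages are usually chained for MinHash-based near-duplicate detection follows; the column name text, the n-gram size, the feature dimensionality, and the 0.3 distance threshold are assumptions for illustration, not values taken from the original script.

# Sketch only: tokenize the text column, build n-grams, hash them into sparse
# feature vectors, and fit MinHashLSH so near-duplicates can be found by a
# similarity self-join. Column names and parameters here are assumed.
tokenizer = RegexTokenizer(inputCol="text", outputCol="tokens", pattern=r"\W+")
ngrams = NGram(n=5, inputCol="tokens", outputCol="ngrams")
hashing_tf = HashingTF(inputCol="ngrams", outputCol="features", numFeatures=1 << 18)
minhash = MinHashLSH(inputCol="features", outputCol="hashes", numHashTables=5)

pipeline = Pipeline(stages=[tokenizer, ngrams, hashing_tf, minhash])
model = pipeline.fit(db)
features = model.transform(db)

# Candidate near-duplicate pairs with Jaccard distance below 0.3; in practice the
# result is further filtered to drop self-pairs and keep one row per duplicate group.
pairs = model.stages[-1].approxSimilarityJoin(features, features, 0.3, distCol="jaccard_dist")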
import boto3

# List every object under the codepile group1 prefix and build s3a:// URIs
# that Spark can read directly.
s3 = boto3.resource("s3")
my_bucket = s3.Bucket("s-eai-neox")
file_paths = []
for my_bucket_object in my_bucket.objects.filter(Prefix="data/codepile/group1/"):
    # print(my_bucket_object.key)
    file_paths.append(f"s3a://s-eai-neox/{my_bucket_object.key}")
print(len(file_paths))

from spark_session_builder import build_spark_session

file_paths = file_paths[100:200]  # process only this slice of the listed files
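The fragment ends after slicing the path list; the step that presumably follows is reading those parquet objects into a single Spark DataFrame. A hedged sketch, reusing the same spark_session_builder helper and assuming the cluster has s3a access configured for the bucket:

# Sketch only: the master URL and resource arguments mirror the other script in
# this set; the listed keys are assumed to be parquet files readable via s3a.
spark = build_spark_session("spark://cpu64-dy-c6i-16xlarge-1:7077", 32, 128)
df = spark.read.parquet(*file_paths)
print(df.count())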
base_model: NousResearch/Meta-Llama-3-8B
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: true
load_in_4bit: false
strict: false

datasets:
  - path: answerdotai/tiny_programs_haiku3_critiques
print("Hello") |