import java.util.zip.{Inflater, Deflater} // Zlib library
import java.nio.file.{Files, Paths}
import java.io.{File, FileOutputStream}

object Inf {
  def compress(inData: Array[Byte]): Array[Byte] = {
    val deflater: Deflater = new Deflater()
    deflater.setInput(inData)
    deflater.finish()
    val compressedData = new Array[Byte](inData.size * 2) // compressed data can be larger than original data
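
The preview above stops at the buffer allocation, so the rest of the original is not shown. As a rough sketch of how a zlib round trip with `java.util.zip` could look when the output size is not guessed up front, the following Java version drains the `Deflater` and `Inflater` in fixed-size chunks. The class name `ZlibSketch`, the 4096-byte chunk size, and the choice to wrap `DataFormatException` in `IllegalArgumentException` are assumptions made for illustration, not part of the original gist.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class; the name is not taken from the original gist.
final class ZlibSketch {

    // Compress with default zlib settings, growing the output as needed
    // instead of relying on a fixed "input size * 2" buffer.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096]; // assumed chunk size
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end(); // release native zlib resources
        return out.toByteArray();
    }

    // Inverse of compress(); rejects truncated or malformed streams.
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or dictionary-based zlib stream");
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not a valid zlib stream", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "hello hello hello hello".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println("round trip ok: " + Arrays.equals(original, restored));
    }
}
```

Draining in a loop means very small inputs, where the zlib header and checksum can outweigh a doubled input size, do not overflow the output buffer.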
final Configuration hadoopConf = sparkContext.hadoopConfiguration();
hadoopConf.set("fs." + CustomS3FileSystem.SCHEMA + ".impl",
    CustomS3FileSystem.class.getName());

public class CustomS3FileSystem extends NativeS3FileSystem {
    public static final String SCHEMA = "custom";

    @Override
    public FileStatus[] globStatus(final Path pathPattern, final PathFilter filter)
            throws IOException {
val storage = "hdfs://nameservice1/user/plutus/data/kmeans_prediction_par_"
val penInputs = (1 to 30).map { x =>
  val date = DateTime.now().minusDays(x).toString("yyyy-MM-dd")
  (date, storage + date)
}.filter { case (_, path) =>
  HdfsTools.checkIfFolderExists(new Path(path))
}
penInputs.foreach(println)
while IFS= read -r url;
do
  curl -o- "$url" | grep -oh -i '[A-Z0-9._%+-]\+@[A-Z0-9.-]\+\.[A-Z]\{2,4\}' > emails;
  email_found=$([[ $(wc -l < emails) -ge 1 ]] && echo "yes" || echo "no");
  emails=$(head -n3 emails | perl -00 -lpe 's/\n/,/g');
  domain=$(echo "$url" | awk -F'[/:]' '{print $4}');
  more_emails=$([[ $(wc -l < emails) -ge 3 ]] && echo "yes" || echo "no");
  echo "$domain, $email_found, $emails, $more_emails, $url";
done < urls