Shriphani Palakodety shriphani

[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO]
[INFO] Heritrix 3
[INFO] Heritrix 3: 'commons' subproject (utility classes)
[INFO] Heritrix 3: 'modules' subproject (reusable components)
[INFO] Heritrix 3: 'engine' subproject
[INFO] Heritrix 3 (distribution bundles)
[INFO]
shriphani / build_with_tests.txt
Last active August 29, 2015 13:57
Output of the commands mvn install and mvn -DskipTests install
[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO]
[INFO] Heritrix 3
[INFO] Heritrix 3: 'commons' subproject (utility classes)
[INFO] Heritrix 3: 'modules' subproject (reusable components)
[INFO] Heritrix 3: 'engine' subproject
[INFO] Heritrix 3 (distribution bundles)
[INFO]
package org.archive.modules.extractor;
import org.archive.modules.CrawlURI;
import org.htmlcleaner.CleanerProperties;
import org.htmlcleaner.DomSerializer;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
<?xml version="1.0" encoding="UTF-8"?>
<!--
HERITRIX 3 CRAWL JOB CONFIGURATION FILE
This is a relatively minimal configuration suitable for many crawls.
Commented-out beans and properties are provided as an example; values
shown in comments reflect the actual defaults which are in effect
if not otherwise specified. (To change from the default
behavior, uncomment AND alter the shown values.)
shriphani / build_with_tests.txt
Created March 17, 2014 01:03
Heritrix build status
[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO]
[INFO] Heritrix 3
[INFO] Heritrix 3: 'commons' subproject (utility classes)
[INFO] Heritrix 3: 'modules' subproject (reusable components)
[INFO] Heritrix 3: 'engine' subproject
[INFO] Heritrix 3 (distribution bundles)
[INFO]
clj-heritrix.core=> (print-table (:members (r/reflect h)))
| :name | :type | :declaring-class | :flags |
|------------------------------+--------------------------------------+------------------------------+----------------------------|
| PROPERTIES | java.lang.String | org.archive.crawler.Heritrix | #{:private :static :final} |
| useAdhocKeystore | | org.archive.crawler.Heritrix | #{:protected} |
| getComponent | | org.archive.crawler.Heritrix | #{:public} |
| instanceMain | | org.archive.crawler.Heritrix | #{:public} |
| options | | org.archive.crawler.Heritrix | #{:private :static} |
| org.archive.crawler.Heritrix |
;; gorilla-repl.fileformat = 1
;; **
;;; # Gorilla REPL
;;;
;;; Welcome to gorilla :-) Shift + enter evaluates code. Poke the question mark (top right) to learn more ...
;; **
;; @@
(+ 1 2)
<html><head><link href="http://fonts.googleapis.com/css?family=Arvo:400,700,400italic,700italic|Lora:400,700,400italic,700italic" rel="stylesheet" type="text/css" /><link href="http://yandex.st/highlightjs/8.0/styles/default.min.css" rel="stylesheet" type="text/css" /><script src="http://yandex.st/highlightjs/8.0/highlight.min.js"></script><style>
body {
/*padding-top: 40px;*/
}
div#contents {
margin-left: 10%;
margin-right: 10%;
}