Created
August 29, 2020 16:16
nokogiri support 2020-08-29
#!/usr/bin/env ruby
require 'nokogiri'
input = <<EOF
<title> The journey </title>
</head>
<body>
<h1> The <index class = "estimate"> Trip </index> </h1>
<p> Our scavenger hunt consists of several stages, starting with the <index class = "treasure"> crossing </index> & mdash; how do we actually get to those islands. The nice thing about an intellectual quest is that we can be in all kinds of places at the same time. If we ever want to revisit a previous episode, all we have to do is click there, and even though we haven't finished a particular stage yet, we can already look ahead to the next. We can also keep in touch with fellow travelers who are in completely different places in the world of thought.
<p> However, this can also easily confuse you. The solution of one problem leads to requirements and limitations that are imposed on subsequent answers, and those who have not found that first solution sometimes do not understand why certain later answers are not possible. With all our wandering around it is therefore important not to lose sight of <a href="../So/Place.htm"> the line of the trip </a>.
EOF
output_template = <<EOF
<! - saved from url = (0028) https://library.biep.org ->
<html lang = nl>
<head>
<meta HTTP-EQUIV = "Content-Type" CONTENT = "text / html; charset = windows-1252">
<script language = JavaScript src = "../../../ Sheet.js"> </script>
<link href = "../../../ Blad.css" rel = "stylesheet" type = "text / css">
</head>
<body>
</body>
</html>
EOF
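# Parse the page content as a fragment and the template as a full document.
fragment = Nokogiri::HTML::DocumentFragment.parse(input)
output = Nokogiri::HTML::Document.parse(output_template)
# Move the <title> out of the fragment and into the template's <head>.
title = fragment.at_css("title")
title.remove
output.at_css("head").add_child(title)
# Append what remains of the fragment to the template's <body>.
output.at_css("body").add_child(fragment)
puts output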
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html lang="nl">
# >> <head>
# >> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
# >> <script language="JavaScript" src="../../../%20Sheet.js"> </script>
# >> <link href="../../../%20Blad.css" rel="stylesheet" type="text / css">
# >> <title> The journey </title>
# >> </head>
# >> <body>
# >>
# >>
# >>
# >> <h1> The <index class="estimate"> Trip </index> </h1>
# >> <p> Our scavenger hunt consists of several stages, starting with the <index class="treasure"> crossing </index> & mdash; how do we actually get to those islands. The nice thing about an intellectual quest is that we can be in all kinds of places at the same time. If we ever want to revisit a previous episode, all we have to do is click there, and even though we haven't finished a particular stage yet, we can already look ahead to the next. We can also keep in touch with fellow travelers who are in completely different places in the world of thought.
# >> </p>
# >> <p> However, this can also easily confuse you. The solution of one problem leads to requirements and limitations that are imposed on subsequent answers, and those who have not found that first solution sometimes do not understand why certain later answers are not possible. With all our wandering around it is therefore important not to lose sight of <a href="../So/Place.htm"> the line of the trip </a>.
# >> </p>
# >> </body>
# >> </html>