Last active
September 15, 2024 13:36
Help for Telegram Instant View Template
# To remove only some text from the content
@remove: //p[self::p/strong[contains(text(), "Articles liés")]]
@remove: //iframe[contains(@src,"https://urlsomething")]
# To deal with the annoying "<img> is not supported in ..." error
@before_el(./..): //a/img
@before_el(./..): //p/img
# Same thing, but for "<iframe> is not supported in <p>"
<div>: //p[descendant::iframe]
# Cover and figure. Some of these variants work and others do not. The code tries to add a caption, but it does not succeed every time.
# Ideally, we should get something like this:
# <figure>
# <img src="xxx"/>
# <figcaption>caption</figcaption>
# </figure>
------------------------------------
<figure>: //img
<figcaption>: //img/p
cover: //figure
---------------
$bg_section: //div[has-class("image-container")]
<figure>: $bg_section
<figcaption>: //div[has-class("image-caption")]
cover: //figure
----------
$cover_img: //head/meta[@property="og:image"]
$content: $cover_img/@content
@set_attr(src, $content): $cover_img
<img>: $cover_img
@wrap(<figure>): $@
$mynewcoverphoto: $@
cover: $@
<figcaption>: //div[has-class("entry-media-credit")]
@append_to($mynewcoverphoto): $@
-----------------
# To extract images from a slider, display them and add them to the body
$slide: //a[has-class("rsImg")]
@set_attr(src, @href): $slide
<img>: $slide
@wrap(<figure>): $@
@append_to($body): $@
@append(<slideshow>): $body
@append_to(//slideshow): $body//figure
# To use the <anchor> tag in IV. In the example below, the span marks the position of the anchor.
@before(<anchor>, name, @id): $body//h2/span
# For dealing with a Facebook video or widget (code taken from the IV chat group)
# Facebook post
$fb_post: $body//div[has-class("fb-post")][@data-href]
@urlencode: $fb_post/@data-href
@set_attr(data-src, "https://www.facebook.com/plugins/post.php?href=", @data-href, "&show_text=", @data-show-text, "&width=640"): $fb_post
@before(<iframe>, src, @data-src, class, "fb-post"): $fb_post
@remove
# Facebook video
$fb_video: $body//div[has-class("fb-video")][@data-href]
@urlencode: $fb_video/@data-href
@set_attr(data-src, "https://www.facebook.com/plugins/video.php?href=", @data-href, "&show_text=", @data-show-text, "&width=640"): $fb_video
@before(<iframe>, src, @data-src, class, "fb-video"): $fb_video
@remove
# For Instagram
@before(<iframe>, \
src, ".//a[contains(@href, \"instagram.com/p/\")]/@href", \
class, "instagram" \
): $body//blockquote[has-class("instagram-media")]
@remove
# Using @combine to merge 2 tags, selected with XPath axes
@combine: //div[has-class("img-source")]/following::*[1]/self::div
# To match something whose value starts with a string (starts-with()) or contains one (contains()), for example in order to remove it
@remove: //img[starts-with(@src, "data:image/jpeg;base64")]
$dada: //p[contains(text(), "incantation")]
# A base template to save time
~version: "2.0"
# Base template for Actualite Houssenia Writing
# To begin, handle the main elements
$main: //article
site_name: //meta[@itemprop="name"]/@content
title: //h1[has-class("post-title")]
subtitle: //div[@id="Chapo"]
body: //div[has-class("entry-inner")]
@datetime(-1, "fr","dd/MM/yyyy"): //time[has-class("published")]/text()
published_date: $@
author: //span[has-class("fn")]
author_url: //a[@rel="author"]/@href
# Example for the <related> tag
# $related: //div[has-class("tl_blog_bottom_blog")]
# <h4>: $related//a[has-class("side_blog_header")]
# <related>: $related
# @append_to($body)
----- Variant of related ------
$related: //div[has-class("article__related")]
<related>: $related//a[has-class("article-card__link")]
<related>: $related
@append_to($body)
# To show multiple authors (code from user .vhs in the Instant View group chat). You must adapt it to the site. It uses $head and $meta, which are not defined in this snippet.
@append(<div>): $head
$authors_el: $@
$authors: $meta[@name="author"]
$authors?: $meta[@name="article:author"]
@append_to($authors_el): $authors
@before(", ")
@remove: ($@)[1]
@before(@content): $authors
@remove
author: $authors_el
----- A variant of multiple authors with @map -------
$single: //div[has-class("content-authors-group")]//a[@rel="author"]/span[@itemprop="name"]
@prepend(<authors>): //div[has-class("content-authors-group")]
@map($single) {
$author: $@
@append_to(//authors): $author/text()
@after("- ")
}
@replace("[\r\n]+",""): //authors
author: //authors/text()
----- Another variant for showing multiple authors --------
@prepend(<auteurs>): /html
@prepend_to(//auteurs): //div[has-class("article__inline-sidebar__mobile")]//p[has-class("article__body__author-name")]/a
@combine(", "): //auteurs/a/next-sibling::a
author: //auteurs/a
# To use a video inside a div (Vimeo in this case) and put it in the cover
@html_to_dom: //div[has-class("video__embed")]/@data-html
cover: $@/iframe
# To use the @json_to_xml function to extract a JSON value
@json_to_xml: //script[@data-type="application/ld+json"]
$data: $@//datePublished/text()
published_date: $data
# For using sibling axes and similar things
//h2[has-class("article-chapo")]/following-sibling::p
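# Hypothetical usage sketch (not from the original notes): the same axis used inside rules.
# The variable name $first_p is an illustrative assumption.
# [1] keeps only the nearest <p> that follows the chapo heading.
$first_p: //h2[has-class("article-chapo")]/following-sibling::p[1]
# preceding-sibling works the same way in the other direction.
@remove: //h2[has-class("article-chapo")]/preceding-sibling::p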
How long does Telegram take to review a template? I submitted mine more than a week ago and have had no response.
Congratulations on the code.
Hi Houssen.
Thanks for your code, it saved me.
In some cases empty paragraphs exist in the HTML:
<p> </p>
How can I remove these blank paragraphs?
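One possible approach to the empty-paragraph question, as a minimal untested sketch rather than a confirmed answer: select paragraphs that have no child elements and whose text is empty once whitespace is normalized, then pass them to @remove. It relies only on standard XPath 1.0 functions; the selector is an assumption and may need narrowing (for example to `$body//p`) depending on the site.

```
# Sketch: drop <p> elements with no child elements and only whitespace text.
# not(*) protects paragraphs that hold only an <img> or an <iframe>.
# Caveat: normalize-space() does not treat a non-breaking space as whitespace,
# so a <p> that contains only &nbsp; is not matched by this rule.
@remove: //p[not(*)][not(normalize-space())]
```

Placing the rule after the `body:` property has been set keeps it from touching paragraphs outside the article.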