- Type: API standard
- Language: URL, XML
- Difficulty of use: Low
- TaDiRAH:
  - Research Activities > 7_Dissemination > Sharing
  - Research Activities > 7_Dissemination > Publishing
  - Research Activities > 1_Capture > Discovering
- Short description: A CTS API makes it possible to cite passages of text using logical identifiers.
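To make the idea of a logical identifier concrete, here is a minimal sketch that splits a CTS URN into its components. The helper `parse_cts_urn` and the `CtsUrn` field names are hypothetical names chosen for illustration, not part of any library; the sketch also ignores ranges and exemplar-level identifiers.

```python
from typing import NamedTuple, Optional


class CtsUrn(NamedTuple):
    """Components of a CTS URN (hypothetical container, for illustration only)."""
    namespace: str
    textgroup: str
    work: Optional[str]
    version: Optional[str]
    passage: Optional[str]


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a URN of the form urn:cts:NAMESPACE:TEXTGROUP.WORK.VERSION:PASSAGE."""
    parts = urn.split(":")
    if len(parts) < 4 or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"Not a CTS URN: {urn!r}")
    namespace = parts[2]
    # The work component is a dot-separated hierarchy: textgroup, work, version.
    work_parts = parts[3].split(".")
    textgroup = work_parts[0]
    work = work_parts[1] if len(work_parts) > 1 else None
    version = work_parts[2] if len(work_parts) > 2 else None
    # The passage component is optional; a URN without it designates the whole text.
    passage = parts[4] if len(parts) > 4 and parts[4] else None
    return CtsUrn(namespace, textgroup, work, version, passage)


example = parse_cts_urn("urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.1")
print(example.version, example.passage)  # prints: perseus-grc2 1.1
```

The passage part (`1.1` here) follows the logical structure of the text rather than a page or file offset, which is what lets the same citation resolve against different editions.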
import json
from collections import defaultdict
import os
from MyCapytain.common.utils import xmlparser
# Logging related dependencies
import logging
import time
import math
# Multi Proc
from multiprocessing import Pool
This file has been truncated, but you can view the full file.
<GetValidReff xmlns="http://chs.harvard.edu/xmlns/cts">
  <request>
    <requestName>GetValidReff</requestName><requestUrn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2</requestUrn><requestLevel>2</requestLevel>
  </request>
  <reply>
    <reff>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.1</urn>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.2</urn>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.3</urn>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.4</urn>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.5</urn>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.6</urn>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.7</urn>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.8</urn>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.9</urn>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.10</urn>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.11</urn>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.12</urn>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.13<
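As a rough illustration of how such a GetValidReff reply could be consumed, the sketch below extracts the list of URNs with the standard library. The sample document is a shortened, well-formed stand-in for the truncated reply above, and `valid_references` is a hypothetical helper name; only the namespace and element names are taken from the reply itself.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace as declared on the GetValidReff root element.
CTS_NS = {"cts": "http://chs.harvard.edu/xmlns/cts"}

# Shortened stand-in for a real reply (assumption: a real reply lists many more URNs).
SAMPLE_REPLY = """<GetValidReff xmlns="http://chs.harvard.edu/xmlns/cts">
  <request>
    <requestName>GetValidReff</requestName>
    <requestUrn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2</requestUrn>
    <requestLevel>2</requestLevel>
  </request>
  <reply>
    <reff>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.1</urn>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.2</urn>
      <urn>urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.3</urn>
    </reff>
  </reply>
</GetValidReff>"""


def valid_references(reply_xml: str) -> List[str]:
    """Return every URN listed under reply/reff, in document order."""
    root = ET.fromstring(reply_xml)
    urns = root.findall("./cts:reply/cts:reff/cts:urn", CTS_NS)
    return [urn.text.strip() for urn in urns if urn.text]


references = valid_references(SAMPLE_REPLY)
# Keep only the passage component, i.e. what follows the last colon.
passages = [reference.rsplit(":", 1)[1] for reference in references]
print(passages)  # prints: ['1.1', '1.2', '1.3']
```

With `requestLevel` set to 2, each reference has two citation levels (for instance book and line), which is why every passage has the form `N.N`.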
import requests
import os
import shutil
from argparse import ArgumentParser
parser = ArgumentParser(description="Download Full Quality sets of pages from the BNF")
parser.add_argument("text", type=str, help="ID of the text. In http://gallica.bnf.fr/ark:/12148/btv1b53084829z/, this would be btv1b53084829z")
parser.add_argument("--start", type=int, default=1, help="Page to start from")
parser.add_argument("--end", type=int, default=None, help="Page to end at")
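For readers unfamiliar with `argparse`, the following sketch shows how the arguments declared above would be read. It rebuilds the same parser with shortened help strings and feeds it a hand-written argument list instead of the real command line; the download logic of the original script is not shown in the preview and is not reproduced here.

```python
from argparse import ArgumentParser

# Same three arguments as in the script above, with shortened help strings.
parser = ArgumentParser(description="Download Full Quality sets of pages from the BNF")
parser.add_argument("text", type=str, help="ID of the text")
parser.add_argument("--start", type=int, default=1, help="Page to start from")
parser.add_argument("--end", type=int, default=None, help="Page to end at")

# Passing a list simulates: script.py btv1b53084829z --start 3
args = parser.parse_args(["btv1b53084829z", "--start", "3"])
print(args.text, args.start, args.end)  # prints: btv1b53084829z 3 None
```

Because `--end` defaults to `None`, the script has to decide for itself where to stop when no end page is given.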
Sometimes, Perseus XML files contain two concurrent citation systems: one is used for passage matching on the web interface, while the other is marked up but not used. You may therefore be reading a secondary source that cites the text with the other scheme. Here is a simple XSL and an example output based on the Perseus digitized edition of Mayhoff's Pliny the Elder.
To build the concordance, I simply applied the XSL below to the XML file in Oxygen. The result is a simple CSV file listing all references and their equivalents.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:saxon="http://saxon.sf.net/"
    xmlns:my="foo.bar"
    exclude-result-prefixes="xs my saxon uuid"
    xpath-default-namespace="http://www.w3.org/1999/xhtml"
    version="2.0"
    xmlns:uuid="java:java.util.UUID">
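The stylesheet itself is truncated here, so the following is not a port of it but a Python analogue of the same idea, under loudly stated assumptions: the primary scheme is assumed to be encoded as nested `div` elements with an `n` attribute, and the secondary scheme as `milestone` elements with an `n` attribute. The sample document, the element layout, and the names `concordance` and `to_csv` are all hypothetical; real Perseus files are namespaced TEI and may be structured differently.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample (assumption: no namespace, divs plus milestones).
SAMPLE = """<body>
  <div type="book" n="1">
    <div type="chapter" n="1">
      <p><milestone unit="section" n="1"/>Lorem <milestone unit="section" n="2"/>ipsum</p>
    </div>
    <div type="chapter" n="2">
      <p><milestone unit="section" n="3"/>dolor</p>
    </div>
  </div>
</body>"""


def concordance(xml_string):
    """Pair each milestone (secondary scheme) with its enclosing div path (primary scheme)."""
    rows = []

    def walk(node, path):
        for child in node:
            if child.tag == "div":
                # Descend one citation level deeper in the primary scheme.
                walk(child, path + [child.get("n", "")])
            elif child.tag == "milestone":
                rows.append((".".join(path), child.get("n", "")))
            else:
                # Other elements (p, etc.) do not change the current reference.
                walk(child, path)

    walk(ET.fromstring(xml_string), [])
    return rows


def to_csv(rows):
    """Serialize the pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["primary", "secondary"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = concordance(SAMPLE)
print(to_csv(rows))
```

On the sample, this yields one CSV row per milestone, so a reference found in a secondary source under one scheme can be looked up under the other.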
from lxml import etree as ET
import re
def fix_xml(xml_string: str) -> str:
    """ Given an ill-formed XML string, try to fix it

    :param xml_string: XML that is faulty
    :return: XML that should not be faulty
    """
ENDPOEM
Carminis incompti lusus lecture procaces,
conueniens Latio pone supercilium.
non soror hoc habitat Phoebi, non uesta sacello,
<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="https://hipster-philology.github.io/protogenie/protogenie/schema.rng"
    schematypens="http://relaxng.org/ns/structure/1.0"?>
<config>
    <output column_marker="TAB">
        <header name="order">
            <key>token</key>
            <key>lemma</key>
            <key>pos</key>
            <key>Dis</key>
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="xs"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0"
    version="2.0">
    <xsl:output encoding="UTF-8" method="html" ></xsl:output>
    <xsl:variable name="textchunk" select="'ab'"/>
    <xsl:variable name="chunkTitle" select="'Priapea'"/>