@andreasvc
Created November 2, 2013 23:48
The TigerXML versions of the Tiger corpus contain a few cases of multiple <edge> elements for a single node. The following is a patch against version 2.2 of the Tiger corpus that removes these edges, so that the result matches the v2.1 export version of the corpus. Since the export version does not contain these edges as secondary edges, they are probably spurious.
--- tiger_release_aug07.corrected.16012013.xml 2013-01-16 16:35:23.000000000 +0100
+++ tiger_2.2a.xml 2013-11-03 00:02:12.890306125 +0100
@@ -3097934,7 +3097934,6 @@
<nt id="s46234_505" cat="PP">
<edge label="AC" idref="s46234_24" />
<edge label="NK" idref="s46234_25" />
- <edge label="CJ" idref="s46234_135" />
</nt>
<nt id="s46234_506" cat="PP">
<edge label="AC" idref="s46234_30" />
@@ -3097951,7 +3097950,6 @@
<edge label="CM" idref="s46234_46" />
<edge label="AC" idref="s46234_47" />
<edge label="NK" idref="s46234_48" />
- <edge label="CJ" idref="s46234_133" />
</nt>
<nt id="s46234_509" cat="PP">
<edge label="AC" idref="s46234_51" />
@@ -3097983,7 +3097981,6 @@
<edge label="AC" idref="s46234_74" />
<edge label="NK" idref="s46234_75" />
<edge label="NK" idref="s46234_76" />
- <edge label="CJ" idref="s46234_131" />
</nt>
<nt id="s46234_516" cat="NP">
<edge label="NK" idref="s46234_82" />
@@ -3376990,22 +3376987,16 @@
<nt id="s50224_500" cat="NP">
<edge label="NK" idref="s50224_1" />
<edge label="NK" idref="s50224_2" />
- <edge label="NK" idref="s50224_133" />
- <edge label="NK" idref="s50224_134" />
</nt>
<nt id="s50224_501" cat="CNP">
<edge label="CJ" idref="s50224_7" />
<edge label="CD" idref="s50224_8" />
<edge label="CJ" idref="s50224_9" />
- <edge label="AC" idref="s50224_135" />
- <edge label="NK" idref="s50224_136" />
- <edge label="NK" idref="s50224_137" />
</nt>
<nt id="s50224_502" cat="NP">
<edge label="NK" idref="s50224_22" />
<edge label="NK" idref="s50224_23" />
<edge label="NK" idref="s50224_24" />
- <edge label="NK" idref="s50224_138" />
</nt>
<nt id="s50224_503" cat="NP">
<edge label="NK" idref="s50224_26" />
@@ -3377045,14 +3377036,12 @@
<nt id="s50224_511" cat="NP">
<edge label="NK" idref="s50224_61" />
<edge label="NK" idref="s50224_62" />
- <edge label="NK" idref="s50224_142" />
</nt>
<nt id="s50224_512" cat="CNP">
<edge label="CJ" idref="s50224_64" />
<edge label="CJ" idref="s50224_66" />
<edge label="CJ" idref="s50224_68" />
<edge label="CJ" idref="s50224_70" />
- <edge label="NK" idref="s50224_141" />
</nt>
<nt id="s50224_513" cat="NP">
<edge label="NK" idref="s50224_76" />
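
The following is a minimal sketch, not part of the original patch, of how such edges might be located in a TigerXML file. It assumes that "multiple <edge> elements for a single node" means a node whose id is the idref of more than one <edge>, i.e. a node with more than one parent. The function names and the sample sentence are hypothetical; only the element and attribute names (s, nt, edge, id, idref) come from TigerXML.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def multiply_attached(sentence):
    """Return {child id: [parent nt ids]} for every node in one <s> element
    that is the target of more than one <edge>. <secedge> is ignored."""
    parents = defaultdict(list)
    for nt in sentence.iter('nt'):
        for edge in nt.findall('edge'):
            parents[edge.get('idref')].append(nt.get('id'))
    return {child: ids for child, ids in parents.items() if len(ids) > 1}


def scan(path):
    """Stream over a TigerXML file, yielding (sentence id, offending nodes)."""
    for _, elem in ET.iterparse(path, events=('end',)):
        if elem.tag == 's':
            found = multiply_attached(elem)
            if found:
                yield elem.get('id'), found
            elem.clear()  # drop the sentence's children to limit memory use


# Hypothetical two-token sentence in which terminal s1_2 has two parents.
SAMPLE = """
<s id="s1"><graph root="s1_501">
  <terminals>
    <t id="s1_1" word="a" /><t id="s1_2" word="b" />
  </terminals>
  <nonterminals>
    <nt id="s1_500" cat="NP">
      <edge label="NK" idref="s1_1" />
      <edge label="NK" idref="s1_2" />
    </nt>
    <nt id="s1_501" cat="S">
      <edge label="HD" idref="s1_500" />
      <edge label="NK" idref="s1_2" />
    </nt>
  </nonterminals>
</graph></s>
"""

if __name__ == '__main__':
    print(multiply_attached(ET.fromstring(SAMPLE)))
```

Running scan() on the unpatched and the patched corpus file and comparing the results would be one way to check which sentences the patch affects. Which of the reported parents is the spurious one still has to be decided by hand, for example by comparing with the export version.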