<record>
<header>
<identifier>oai:CiteSeerX.psu:10.1.1.1.1484</identifier>
<datestamp>2009-05-24</datestamp>
</header>
<metadata>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarc
<dc:title>Winner-Take-All Network Utilising Pseudoinverse Reconstruction Subnets Demonstrates Robustness on the Handprinted Character Recognition Problem</dc:title>
<dc:creator>J. Körmendy-rácz</dc:creator>
<dc:creator>S. Szabó</dc:creator>
<dc:creator>J. Lörincz</dc:creator>
<dc:creator>G. Antal</dc:creator>
<dc:creator>G. Kovács</dc:creator>
<dc:creator>A. Lörincz</dc:creator>
<dc:subject>Correspondence and offprint requests to</dc:subject>
<dc:subject>J. Kormendy-Rácz</dc:subject>
<dc:description>Wittmeyer’s pseudoinverse iterative algorithm is formulated as a dynamic connectionist Data Compression and Reconstruction (DCR) network, and subnets of this type are supplemented by the winner-take-all paradigm. The winner is selected upon the goodness-of-fit of
<dc:contributor>The Pennsylvania State University CiteSeerX Archives</dc:contributor>
<dc:publisher>Springer</dc:publisher>
<dc:date>2009-05-24</dc:date>
<dc:date>2007-11-19</dc:date>
<dc:date>1999</dc:date>
<dc:format>application/pdf</dc:format>
<dc:type>text</dc:type>
<dc:identifier>http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.1484</dc:identifier>
<dc:source>http://people.inf.elte.hu/lorincz/Files/publications/WTA_NCA.pdf</dc:source>
<dc:language>en</dc:language>
<dc:rights>Metadata may be used without restrictions as long as the oai identifier remains attached to it.</dc:rights>
</oai_dc:dc>
</metadata>
</record>
-
Program OAIHarvester2 DEMO: data downloader (the demo link).
execute:
$ java -classpath .:oaiharvester.jar:xerces.jar org.acme.oai.OAIReaderRawDump http://citeseerx.ist.psu.edu/oai2 -o citeseerx_alldata.xml
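The download step presumably follows the usual OAI-PMH ListRecords pattern: request a page, then keep following the resumptionToken until the server stops returning one. A minimal Python sketch of that loop, assuming the oai_dc metadata prefix (suggested by the sample record above). The function names are hypothetical, and this is not the OAIHarvester2 code.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_list_records_url(base_url, resumption_token=None, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: metadataPrefix is not repeated.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urllib.parse.urlencode(params)


def next_resumption_token(response_xml):
    """Return the resumptionToken of a response, or None when the list is complete."""
    root = ET.fromstring(response_xml)
    token = root.find(f".//{OAI_NS}resumptionToken")
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()


def harvest(base_url, out_path):
    """Append every raw ListRecords response to out_path (assumed raw-dump behaviour)."""
    token = None
    with open(out_path, "wb") as out:
        while True:
            with urllib.request.urlopen(build_list_records_url(base_url, token)) as resp:
                body = resp.read()
            out.write(body)
            token = next_resumption_token(body)
            if token is None:
                break
```

Calling `harvest("http://citeseerx.ist.psu.edu/oai2", "citeseerx_alldata.xml")` would mirror the command above, provided the endpoint is still reachable. A real harvester would also add retries and a pause between requests.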
-
Data 7.8GB citeseerx_alldata.xml: original raw data
-
Program extract_dc:descriptions.sh: extracts dc:description elements from citeseerx_alldata.xml
execute:
$ ./extract_dc:descriptions.sh citeseerx_alldata.xml > citeseerx_descriptions.txt
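The contents of extract_dc:descriptions.sh are not shown here. As a rough illustration of the step, the Python sketch below pulls the text out of `<dc:description>` elements line by line, which keeps memory use flat on a 7.8GB input. It assumes each description sits on a single line, as in the sample record above. The function name is hypothetical.

```python
import html
import re

# Matches a <dc:description> element that opens and closes on the same line.
DESCRIPTION_RE = re.compile(r"<dc:description>(.*?)</dc:description>")


def extract_descriptions(lines):
    """Yield the unescaped text of every single-line <dc:description> element."""
    for line in lines:
        for match in DESCRIPTION_RE.finditer(line):
            # Turn XML entities such as &amp; back into plain characters.
            yield html.unescape(match.group(1)).strip()
```

Feeding it `sys.stdin` and printing each result would give a filter that slots into the same place as the shell script. Descriptions that span several lines would need a streaming XML parser instead.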
-
Data 2.6GB citeseerx_descriptions.txt: extracted descriptions
-
Program line_tokenizer.py: sentence tokenizer
execute:
$ cat citeseerx_descriptions.txt | parallel -j 16 --keep-order --spreadstdin --block 20m ./line_tokenizer.py > citeseerx_descriptions_sents.txt
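line_tokenizer.py itself is not reproduced here. The Python sketch below shows one way such a filter could work: a simple regular-expression splitter that breaks a description into sentences, one per output line. The splitting rule and the function names are assumptions for illustration. The real script may use a proper sentence tokenizer.

```python
import re

# Split at whitespace that follows ., ! or ? and precedes an uppercase letter.
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def split_sentences(line):
    """Split one description line into a list of sentences."""
    return [s for s in SENTENCE_BOUNDARY.split(line.strip()) if s]


def tokenize_lines(lines):
    """Yield every sentence found in an iterable of description lines."""
    for line in lines:
        yield from split_sentences(line)
```

A per-line filter like this suits the `parallel` invocation above: `--spreadstdin` hands each job a block of whole lines, and `--keep-order` emits the results in input order. The rule will mis-split after abbreviations such as "Fig." or "et al.", so it is only a starting point.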
-
Data 2.6GB citeseerx_descriptions_sents.txt: sentences from descriptions
-
Program geniatagger
execute:
$ cat citeseerx_descriptions_sents.txt | parallel -j 16 --keep-order --spreadstdin --block 20m geniatagger > citeseerx_descriptions_sents_genia.txt
-
Data 9.4GB citeseerx_descriptions_sents_genia.txt: sentences tagged by geniatagger
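To work with the tagged file downstream, a small reader helps. The Python sketch below assumes the usual geniatagger output layout: one token per line with five tab-separated columns (word, base form, POS tag, chunk tag, NE tag) and a blank line between sentences. The `Token` type and the function name are hypothetical.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Token(NamedTuple):
    word: str
    base: str
    pos: str
    chunk: str
    ne: str


def read_tagged_sentences(lines: Iterable[str]) -> Iterator[List[Token]]:
    """Group tagged token lines into sentences, split at blank lines."""
    sentence: List[Token] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        fields = line.split("\t")
        if len(fields) != 5:
            continue  # skip lines that do not match the assumed five-column layout
        sentence.append(Token(*fields))
    if sentence:
        yield sentence
```

Passing an open file handle for citeseerx_descriptions_sents_genia.txt streams the 9.4GB file sentence by sentence without loading it into memory.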